The statement I changed was in the function genFileSinkPlan() and was on line 2571, not 2711.

Saurabh.

On Fri, Aug 14, 2009 at 2:31 PM, Saurabh Nanda <[email protected]> wrote:

> I changed the log statement, rebuilt Hive, and re-ran the insert query. I
> didn't find this log entry anywhere. Where exactly should I be looking for
> it?
>
> Saurabh.
>
> On Fri, Aug 14, 2009 at 1:04 PM, Saurabh Nanda <[email protected]> wrote:
>
>> I'm changing the LOG.debug statement to the following:
>>
>>     LOG.info("Created FileSink Plan for clause: " + dest + " dest_path: "
>>         + dest_path + " row schema: " + inputRR.toString()
>>         + ". HiveConf.ConfVars.COMPRESSRESULT="
>>         + conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT));
>>
>> Saurabh.
>>
>> On Fri, Aug 14, 2009 at 12:39 PM, Zheng Shao <[email protected]> wrote:
>>
>>> Great. We are one step closer to the root cause.
>>>
>>> Can you print out a log line here as well? This is the place where we
>>> fill in the compression option.
>>>
>>> SemanticAnalyzer.java:2711:
>>>
>>>     Operator output = putOpInsertMap(
>>>         OperatorFactory.getAndMakeChild(
>>>             new fileSinkDesc(queryTmpdir, table_desc,
>>>                 conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT),
>>>                 currentTableId),
>>>             fsRS, input), inputRR);
>>>
>>> Zheng
>>>
>>> On Thu, Aug 13, 2009 at 11:34 PM, Saurabh Nanda <[email protected]> wrote:
>>>
>>> > The query is being split into two map/reduce jobs. The first job
>>> > consists of 16 map tasks (and no reduce tasks).
>>> > The relevant log output is given below:
>>> >
>>> > 2009-08-14 11:29:38,245 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master-hadoop:8020/tmp/hive-ct-admin/1957063362/_tmp.10002/_tmp.attempt_200908131050_0218_m_000000_0
>>> > 2009-08-14 11:29:38,246 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration is:true
>>> > 2009-08-14 11:29:38,347 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
>>> > 2009-08-14 11:29:38,358 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 6 FS initialized
>>> > 2009-08-14 11:29:38,358 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 6 FS
>>> >
>>> > The second job consists of 16 map tasks and 3 reduce tasks. None of the
>>> > map tasks contain any log output from FileSinkOperator. The reduce
>>> > tasks contain the following relevant log output:
>>> >
>>> > 2009-08-14 11:38:13,553 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 3 FS
>>> > 2009-08-14 11:38:13,553 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 3 FS
>>> > 2009-08-14 11:38:13,604 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master-hadoop/tmp/hive-ct-admin/2045778473/_tmp.10000/_tmp.attempt_200908131050_0219_r_000000_0
>>> > 2009-08-14 11:38:13,605 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration is:false
>>> > 2009-08-14 11:38:43,128 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 3 FS initialized
>>> > 2009-08-14 11:38:43,128 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 3 FS
>>> >
>>> > You can see that compression is "on" for the first map/reduce job, but
>>> > "off" for the second one. Did I forget to set any configuration
>>> > parameter?
>>> >
>>> > Saurabh.
>>> > --
>>> > http://nandz.blogspot.com
>>> > http://foodieforlife.blogspot.com
>>>
>>> --
>>> Yours,
>>> Zheng

--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
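For what it's worth, the symptom (compression true in the first job's FileSinkOperator, false in the second) is exactly what you would see if the second job's file sink descriptor were built from a configuration where the flag was never set, so the boolean lookup falls back to its default. Here is a minimal, self-contained sketch of that defaulting behaviour, using java.util.Properties as a stand-in for HiveConf; the class name, helper, and property name below are only illustrative, not Hive's actual implementation:

```java
import java.util.Properties;

public class CompressFlagSketch {
    // Stand-in for a HiveConf-style getBoolVar: parse the property if
    // present, otherwise fall back to the supplied default value.
    static boolean getBoolVar(Properties conf, String name, boolean defaultVal) {
        String v = conf.getProperty(name);
        return (v == null) ? defaultVal : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        // First job: the flag was explicitly set in its configuration.
        Properties firstJob = new Properties();
        firstJob.setProperty("hive.exec.compress.output", "true");

        // Second job: the flag was never propagated, so the lookup
        // silently returns the default (false).
        Properties secondJob = new Properties();

        System.out.println("first job:  "
            + getBoolVar(firstJob, "hive.exec.compress.output", false));
        System.out.println("second job: "
            + getBoolVar(secondJob, "hive.exec.compress.output", false));
    }
}
```

If that is what is happening, the interesting question is why the plan for the second map/reduce job is generated from a conf object that lost the setting, which is what the extra log line in genFileSinkPlan() should reveal.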
