I'm changing the LOG.debug statement to the following --
LOG.info("Created FileSink Plan for clause: " + dest + " dest_path: "
    + dest_path + " row schema: " + inputRR.toString()
    + ". HiveConf.ConfVars.COMPRESSRESULT="
    + conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT));
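For anyone following along, here is a minimal, self-contained sketch of the same technique — reading the compression flag from the configuration at plan-creation time and embedding it in the log line. The class, method, and property lookup below are illustrative stand-ins, not Hive's actual HiveConf/LOG classes (hive.exec.compress.output is the property name behind COMPRESSRESULT):

```java
import java.util.Properties;

// Illustrative stand-in for the Hive code above: build the log line that
// records the compression flag at the moment the FileSink plan is created.
public class FileSinkLogSketch {

    static String planLogLine(Properties conf, String dest, String destPath) {
        // Stand-in for conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT)
        boolean compress = Boolean.parseBoolean(
                conf.getProperty("hive.exec.compress.output", "false"));
        return "Created FileSink Plan for clause: " + dest
                + " dest_path: " + destPath
                + " COMPRESSRESULT=" + compress;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("hive.exec.compress.output", "true");
        System.out.println(planLogLine(conf, "insclause-0", "/tmp/out"));
    }
}
```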
Saurabh.
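(Side note, in case it saves someone a step: the session-level properties behind these flags can be inspected and forced straight from the Hive CLI. Property names are as defined in HiveConf — hive.exec.compress.output for COMPRESSRESULT, and hive.exec.compress.intermediate for the between-job outputs — but verify against your Hive version.)

```sql
-- Show the current values:
set hive.exec.compress.output;
set hive.exec.compress.intermediate;

-- Force compression of the final job's output:
set hive.exec.compress.output=true;
```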
On Fri, Aug 14, 2009 at 12:39 PM, Zheng Shao <[email protected]> wrote:
> Great. We are one step closer to the root cause.
>
> Can you print out a log line here as well? This is the place that we
> fill in the compression option.
>
> SemanticAnalyzer.java:2711:
>     Operator output = putOpInsertMap(
>         OperatorFactory.getAndMakeChild(
>             new fileSinkDesc(queryTmpdir, table_desc,
>                 conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT),
>                 currentTableId),
>             fsRS, input), inputRR);
>
>
> Zheng
>
> On Thu, Aug 13, 2009 at 11:34 PM, Saurabh Nanda <[email protected]> wrote:
> > The query is being split into two map/reduce jobs. The first job consists
> > of 16 map tasks (no reduce tasks). The relevant log output is given below:
> >
> > 2009-08-14 11:29:38,245 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master-hadoop:8020/tmp/hive-ct-admin/1957063362/_tmp.10002/_tmp.attempt_200908131050_0218_m_000000_0
> > 2009-08-14 11:29:38,246 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration is:true
> > 2009-08-14 11:29:38,347 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
> > 2009-08-14 11:29:38,358 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 6 FS initialized
> > 2009-08-14 11:29:38,358 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 6 FS
> >
> > The second job consists of 16 map tasks & 3 reduce tasks. None of the map
> > tasks contain any log output from FileSinkOperator. The reduce tasks
> > contain the following relevant log output:
> >
> >
> > 2009-08-14 11:38:13,553 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 3 FS
> > 2009-08-14 11:38:13,553 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 3 FS
> > 2009-08-14 11:38:13,604 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master-hadoop/tmp/hive-ct-admin/2045778473/_tmp.10000/_tmp.attempt_200908131050_0219_r_000000_0
> > 2009-08-14 11:38:13,605 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration is:false
> > 2009-08-14 11:38:43,128 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 3 FS initialized
> > 2009-08-14 11:38:43,128 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 3 FS
> >
> > You can see that compression is "on" for the first map/reduce job, but
> > "off" for the second one. Did I forget to set any configuration
> > parameter?
> >
> > Saurabh.
> > --
> > http://nandz.blogspot.com
> > http://foodieforlife.blogspot.com
> >
>
>
>
> --
> Yours,
> Zheng
>
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com