Great. We are one step closer to the root cause.
Could you add a log line here as well? This is where we fill in the
compression option.
SemanticAnalyzer.java:2711:

    Operator output = putOpInsertMap(
        OperatorFactory.getAndMakeChild(
            new fileSinkDesc(queryTmpdir, table_desc,
                conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT), currentTableId),
            fsRS, input), inputRR);
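
For the log line, something along these lines should be enough. This is only a standalone sketch, not the actual patch: the class CompressionLogSketch, the helpers getBoolVar and describe, and the example path are all made up for illustration, and a Properties object stands in for HiveConf. I am also assuming that COMPRESSRESULT maps to the hive.exec.compress.output property. In SemanticAnalyzer itself, the equivalent would be to read the flag into a local variable, log it, and then pass it to the fileSinkDesc constructor.

```java
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical standalone sketch; none of these names exist in Hive.
class CompressionLogSketch {
    private static final Logger LOG =
        Logger.getLogger(CompressionLogSketch.class.getName());

    // Assumed property name behind HiveConf.ConfVars.COMPRESSRESULT.
    static final String COMPRESS_RESULT_KEY = "hive.exec.compress.output";

    // Stand-in for conf.getBoolVar(...): read a boolean setting,
    // falling back to a default when the key is not set.
    static boolean getBoolVar(Properties conf, String key, boolean defaultValue) {
        String value = conf.getProperty(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    // Build the message so the plan-time value can be compared with the
    // "Compression configuration is:" line that FileSinkOperator prints.
    static String describe(String queryTmpdir, boolean compressed) {
        return "Creating file sink for " + queryTmpdir
            + ", compression is: " + compressed;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(COMPRESS_RESULT_KEY, "true");

        // Read the flag once, log it, then hand the same value to the descriptor.
        boolean compressed = getBoolVar(conf, COMPRESS_RESULT_KEY, false);
        LOG.info(describe("/tmp/hive-example/_tmp.10000", compressed));
    }
}
```

Logging the value at plan time shows whether the flag is already false when the file sink descriptor is created, or whether it changes somewhere between the plan and the task.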
Zheng
On Thu, Aug 13, 2009 at 11:34 PM, Saurabh Nanda<[email protected]> wrote:
> The query is being split into two map/reduce jobs. The first job consists of
> 16 map tasks (no reduce job). The relevant log output is given below:
>
>
> 2009-08-14 11:29:38,245 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS
> hdfs://master-hadoop:8020/tmp/hive-ct-admin/1957063362/_tmp.10002/_tmp.attempt_200908131050_0218_m_000000_0
> 2009-08-14 11:29:38,246 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration
> is:true
>
> 2009-08-14 11:29:38,347 INFO org.apache.hadoop.io.compress.CodecPool: Got
> brand-new compressor
> 2009-08-14 11:29:38,358 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 6 FS initialized
> 2009-08-14 11:29:38,358 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 6 FS
>
>
> The second job consists of 16 map tasks & 3 reduce tasks. None of the map
> tasks contain any log output from FileSinkOperator. The reduce tasks contain
> the following relevant log output:
>
>
> 2009-08-14 11:38:13,553 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 3 FS
> 2009-08-14 11:38:13,553 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 3 FS
> 2009-08-14 11:38:13,604 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS
> hdfs://master-hadoop/tmp/hive-ct-admin/2045778473/_tmp.10000/_tmp.attempt_200908131050_0219_r_000000_0
>
> 2009-08-14 11:38:13,605 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration
> is:false
> 2009-08-14 11:38:43,128 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 3 FS initialized
>
> 2009-08-14 11:38:43,128 INFO
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 3 FS
>
> You can see that compression is "on" for the first map/reduce job, but
> "off" for the second one. Did I forget to set a configuration parameter?
>
> Saurabh.
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com
>
--
Yours,
Zheng