The statement I changed was in the function genFileSinkPlan() and was on line 2571, not 2711.

Saurabh.

On Fri, Aug 14, 2009 at 2:31 PM, Saurabh Nanda <[email protected]> wrote:

> I changed the log statement, rebuilt Hive, and re-ran the insert query. I
> didn't find this log entry anywhere. Where exactly should I be looking for
> it?
>
> Saurabh.
>
> On Fri, Aug 14, 2009 at 1:04 PM, Saurabh Nanda <[email protected]> wrote:
>
>> I'm changing the LOG.debug statement to the following:
>>
>>     LOG.info("Created FileSink Plan for clause: " + dest + " dest_path: "
>>         + dest_path + " row schema: " + inputRR.toString()
>>         + ". HiveConf.ConfVars.COMPRESSRESULT="
>>         + conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT));
>>
>> Saurabh.
>>
>> On Fri, Aug 14, 2009 at 12:39 PM, Zheng Shao <[email protected]> wrote:
>>
>>> Great. We are one step closer to the root cause.
>>>
>>> Can you print out a log line here as well? This is the place where we
>>> fill in the compression option.
>>>
>>> SemanticAnalyzer.java:2711:
>>>
>>>     Operator output = putOpInsertMap(
>>>         OperatorFactory.getAndMakeChild(
>>>             new fileSinkDesc(queryTmpdir, table_desc,
>>>                 conf.getBoolVar(HiveConf.ConfVars.COMPRESSRESULT),
>>>                 currentTableId),
>>>             fsRS, input), inputRR);
>>>
>>> Zheng
>>>
>>> On Thu, Aug 13, 2009 at 11:34 PM, Saurabh Nanda <[email protected]> wrote:
>>>
>>> > The query is being split into two map/reduce jobs. The first job
>>> > consists of 16 map tasks (and no reduce tasks).
>>> > The relevant log output is given below:
>>> >
>>> > 2009-08-14 11:29:38,245 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master-hadoop:8020/tmp/hive-ct-admin/1957063362/_tmp.10002/_tmp.attempt_200908131050_0218_m_000000_0
>>> > 2009-08-14 11:29:38,246 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration is:true
>>> > 2009-08-14 11:29:38,347 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
>>> > 2009-08-14 11:29:38,358 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 6 FS initialized
>>> > 2009-08-14 11:29:38,358 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 6 FS
>>> >
>>> > The second job consists of 16 map tasks and 3 reduce tasks. None of the
>>> > map tasks contain any log output from FileSinkOperator. The reduce
>>> > tasks contain the following relevant log output:
>>> >
>>> > 2009-08-14 11:38:13,553 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 3 FS
>>> > 2009-08-14 11:38:13,553 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 3 FS
>>> > 2009-08-14 11:38:13,604 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://master-hadoop/tmp/hive-ct-admin/2045778473/_tmp.10000/_tmp.attempt_200908131050_0219_r_000000_0
>>> > 2009-08-14 11:38:13,605 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Compression configuration is:false
>>> > 2009-08-14 11:38:43,128 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 3 FS initialized
>>> > 2009-08-14 11:38:43,128 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 3 FS
>>> >
>>> > You can see that compression is "on" for the first map/reduce job, but
>>> > "off" for the second one. Did I forget to set any configuration
>>> > parameter?
>>> >
>>> > Saurabh.
>>> > --
>>> > http://nandz.blogspot.com
>>> > http://foodieforlife.blogspot.com
>>>
>>> --
>>> Yours,
>>> Zheng

--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
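For what it's worth, the symptom (compression true in the first job's FileSinkOperator, false in the second) is exactly what you would see if the second job's file sink descriptor were built from a configuration where the flag was never set, so the boolean lookup falls back to its default. Here is a minimal, self-contained sketch of that defaulting behaviour, using java.util.Properties as a stand-in for HiveConf; the class name, helper, and property name below are only illustrative, not Hive's actual implementation:

```java
import java.util.Properties;

public class CompressFlagSketch {
    // Stand-in for a HiveConf-style getBoolVar: parse the property if
    // present, otherwise fall back to the supplied default value.
    static boolean getBoolVar(Properties conf, String name, boolean defaultVal) {
        String v = conf.getProperty(name);
        return (v == null) ? defaultVal : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        // First job: the flag was explicitly set in its configuration.
        Properties firstJob = new Properties();
        firstJob.setProperty("hive.exec.compress.output", "true");

        // Second job: the flag was never propagated, so the lookup
        // silently returns the default (false).
        Properties secondJob = new Properties();

        System.out.println("first job:  "
            + getBoolVar(firstJob, "hive.exec.compress.output", false));
        System.out.println("second job: "
            + getBoolVar(secondJob, "hive.exec.compress.output", false));
    }
}
```

If that is what is happening, the interesting question is why the plan for the second map/reduce job is generated from a conf object that lost the setting, which is what the extra log line in genFileSinkPlan() should reveal.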
