Re: Output compression not working on hive-trunk (r802989)

Zheng Shao Tue, 25 Aug 2009 01:23:39 -0700

Hi Saurabh,

I finally found the line of code responsible. See
https://issues.apache.org/jira/browse/HIVE-794 for details.
Can you help make a patch for that?


Zheng

On Tue, Aug 25, 2009 at 12:19 AM, Saurabh Nanda <[email protected]> wrote:

> Hi Zheng,
>
> Here's the plan for the second map-reduce job --
> http://pastebin.com/m59d5a84b
> I don't see compression anywhere.
>
> Saurabh.
>
>
> On Fri, Aug 21, 2009 at 11:30 AM, Zheng Shao <[email protected]> wrote:
>
>> Hi Saurabh,
>>
>> Sorry for the delay on this. We have been busy with production this week.
>>
>> I don't think there is much difference between CLI queries and JDBC queries.
>>
>> Yes, this is what I am talking about. Since your query has 2
>> map-reduce jobs, there will be two .xml files.
>> Can you show us the second one? Does the second one also contain
>> "<...>compressed<...>true<...>" in the section of FileSinkOperator?
>>
>> Zheng
>>
>> On Tue, Aug 18, 2009 at 3:21 AM, Saurabh Nanda <[email protected]> wrote:
>> > Is this what you're talking about -- http://pastebin.ca/1533627 ? Seems like
>> > compression is on.
>> >
>> > Is there any difference in how CLI queries and JDBC queries are treated?
>> >
>> > Saurabh.
>> >
>> > On Tue, Aug 18, 2009 at 11:19 AM, Zheng Shao <[email protected]> wrote:
>> >>
>> >> Hi Saurabh,
>> >>
>> >> So the compression flag is correct when the plan is generated.
>> >> When you run the query, you should see "plan = xxx.xml" in the log
>> >> file. Can you open that file (in HDFS) and see whether the compression
>> >> flag is on or not?
>> >>
>> >> Zheng
>> >>
>> >> On Mon, Aug 17, 2009 at 5:17 AM, Saurabh Nanda <[email protected]> wrote:
>> >> > Hey Zheng, any clues as to what the bug is? Or what I'm doing wrong? I can
>> >> > do some more digging and logging if required.
>> >> >
>> >> > Saurabh.
>> >> >
>> >> > On Mon, Aug 17, 2009 at 1:28 PM, Saurabh Nanda <[email protected]> wrote:
>> >> >>
>> >> >> Here's the log output:
>> >> >>
>> >> >> 2009-08-17 13:26:42,183 INFO  parse.SemanticAnalyzer
>> >> >> (SemanticAnalyzer.java:genFileSinkPlan(2575)) - Created FileSink Plan for
>> >> >> clause: insclause-0dest_path:
>> >> >> hdfs://master-hadoop/user/hive/warehouse/raw/dt=2009-04-07 row schema:
>> >> >> {(_col0,_col0: string)(_col1,_col1: string)(_col2,_col2:
>> >> >> string)(_col3,_col3: string)(_col4,_col4: string)(_col5,_col5:
>> >> >> string)(_col6,_col6: string)(_col7,_col7: string)(_col8,_col8:
>> >> >> string)(_col9,_col9: string)(_col10,_col10: int)} .
>> >> >> HiveConf.ConfVars.COMPRESSRESULT=true
>> >> >>
>> >> >> Is the SemanticAnalyzer run more than once in the lifetime of a job?
>> >> >> Should I be looking for another log entry like this one?
>> >> >>
>> >> >> Saurabh.
>> >> >>
>> >> >> On Mon, Aug 17, 2009 at 1:26 PM, Saurabh Nanda <[email protected]> wrote:
>> >> >>>
>> >> >>> Strange. The compression configuration log entry was also logged at
>> >> >>> INFO level, but I could see it in the task logs:
>> >> >>>
>> >> >>>       LOG.info("Compression configuration is:" + isCompressed);
>> >> >>>
>> >> >>> Saurabh.
>> >> >>>
>> >> >>> On Mon, Aug 17, 2009 at 12:56 PM, Zheng Shao <[email protected]> wrote:
>> >> >>>>
>> >> >>>> The default log level is WARN. Please change it to INFO.
>> >> >>>>
>> >> >>>> hive.root.logger=INFO,DRFA
>> >> >>>>
>> >> >>>> Of course you can also use LOG.warn() in your test code.
>> >> >>>>
>> >> >>>> Zheng
>> >> >>>>
>> >> >>> --
>> >> >>> http://nandz.blogspot.com
>> >> >>> http://foodieforlife.blogspot.com
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> http://nandz.blogspot.com
>> >> >> http://foodieforlife.blogspot.com
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > http://nandz.blogspot.com
>> >> > http://foodieforlife.blogspot.com
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Yours,
>> >> Zheng
>> >
>> >
>> >
>> > --
>> > http://nandz.blogspot.com
>> > http://foodieforlife.blogspot.com
>> >
>>
>>
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com
>



-- 
Yours,
Zheng
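
The plan-file check discussed in the thread (open each "plan = xxx.xml" file and see whether the FileSinkOperator section carries a "compressed ... true" flag) could be automated roughly as below. This is only a minimal sketch, not Hive code: the class name PlanCompressionCheck and the method fileSinkCompression are hypothetical, the exact layout of the plan XML is an assumption, and the scan is a plain-text heuristic that treats a missing flag as "not compressed".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Heuristic text scan of a Hive plan file. The plan layout is assumed, not verified. */
final class PlanCompressionCheck {
    // Text assumed to mark the start of each file sink section in the plan.
    private static final String OPERATOR = "FileSinkOperator";

    // The word "compressed", a short run of markup, then a boolean literal.
    private static final Pattern FLAG =
        Pattern.compile("compressed.{0,200}?\\b(true|false)\\b", Pattern.DOTALL);

    /**
     * Returns one entry per FileSinkOperator mention in the plan text.
     * An entry is true only if that section contains "compressed ... true";
     * a section with no flag at all is reported as false (an assumption).
     */
    static List<Boolean> fileSinkCompression(String planXml) {
        List<Boolean> result = new ArrayList<>();
        int start = planXml.indexOf(OPERATOR);
        while (start >= 0) {
            // A section runs from this mention up to the next one (or end of text).
            int next = planXml.indexOf(OPERATOR, start + OPERATOR.length());
            int end = (next >= 0) ? next : planXml.length();
            Matcher m = FLAG.matcher(planXml.substring(start, end));
            result.add(m.find() && "true".equals(m.group(1)));
            start = next;
        }
        return result;
    }
}
```

For a query with two map-reduce jobs, one would run this once per plan file, after copying each file out of HDFS and reading it into a string. A result containing false for the job that writes the final output would match the symptom reported here: compression enabled when the plan is generated, but absent from the second job's plan.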
