I can't find the previous log entry anywhere, the one written by
LOG.info("Writing to temp file: FS " + outPath); so where should I be
looking? Do I need to configure log4j differently for LOG.info output to
show up?
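
In the meantime, the file headers quoted below can be checked without any
logging. What follows is a rough sketch, not Hive or Hadoop code: it reads
the start of a local copy of a table file (fetched with something like
hadoop fs -get) and prints the compression fields. It assumes the
SequenceFile header layout used from version 5 onwards (the magic bytes SEQ,
a version byte, the key class name, the value class name, a compressed flag,
a block-compressed flag, then the codec class name when compressed) and class
names short enough for a single-byte length prefix. SeqHeaderCheck,
readShortString and describe are made-up names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class SeqHeaderCheck {
    // Reads a string whose length prefix fits in one byte (0..127).
    // Longer strings use a multi-byte prefix, which this sketch rejects.
    static String readShortString(ByteBuffer buf) {
        int len = buf.get();
        if (len < 0) {
            throw new IllegalArgumentException("multi-byte length prefix not handled");
        }
        byte[] bytes = new byte[len];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Summarizes the header fields; assumes the version 5+ layout described above.
    static String describe(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        if (buf.get() != 'S' || buf.get() != 'E' || buf.get() != 'Q') {
            throw new IllegalArgumentException("not a SequenceFile");
        }
        int version = buf.get();
        if (version < 5) {
            throw new IllegalArgumentException("header version " + version + " not handled");
        }
        String keyClass = readShortString(buf);
        String valueClass = readShortString(buf);
        boolean compressed = buf.get() != 0;
        boolean blockCompressed = buf.get() != 0;
        // The codec class name is only present when the compressed flag is set.
        String codec = compressed ? readShortString(buf) : "none";
        return "version=" + version
                + " key=" + keyClass
                + " value=" + valueClass
                + " compressed=" + compressed
                + " block=" + blockCompressed
                + " codec=" + codec;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SeqHeaderCheck <local copy of a table file>");
            return;
        }
        byte[] head = new byte[512];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.readNBytes(head, 0, head.length); // needs Java 9 or later
        }
        System.out.println(describe(Arrays.copyOf(head, n)));
    }
}
```

If the layout assumption holds, a file from raw_compressed should report the
GzipCodec class, and a file from raw should report compressed=false, which
would match the headers and the file sizes quoted below.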

Saurabh.

On Fri, Aug 14, 2009 at 10:54 AM, Saurabh Nanda <[email protected]> wrote:

> Files in table raw_compressed start with this header:
> SEQ|"org.apache.hadoop.io.BytesWritable|org.apache.hadoop.io.Text||'org.apache.hadoop.io.compress.GzipCodec
>
> Files in table raw start with this header:
> SEQ|"org.apache.hadoop.io.BytesWritable|org.apache.hadoop.io.Text
>
> File size for raw_compressed: 250 MB
> File size for raw: 2150 MB
>
> After "boolean isCompressed = conf.getCompressed();" should I put
> "LOG.info("Compression config is:" + isCompressed);" ?
>
> Saurabh.
>
>
> On Fri, Aug 14, 2009 at 9:51 AM, Zheng Shao <[email protected]> wrote:
>
>> What is the average file size in table raw?
>>
>> Can you put a log line in FileSinkOperator.java:107 ? That will tell
>> us whether compression is turned on or not.
>>
>> Zheng
>>
>> On Thu, Aug 13, 2009 at 9:06 PM, Saurabh Nanda <[email protected]> wrote:
>> >
>> >> hive.exec.compress.output=true is the correct option. Can you post the
>> >> "insert" command that you run which produced non-compressed results?
>> >> Is the output in TextFileFormat or SequenceFileFormat?
>> >
>> > Here's the query. raw_compressed is a SequenceFile table with raw lines.
>> > raw is a SequenceFile table with separate columns for each data field.
>> >
>> > from raw_compressed
>> >     insert overwrite table raw partition (dt='2009-04-02')
>> >     select transform(line) using 'parse_logs.rb' as ip_address, aid, uid,
>> >     ts, method, uri, response, referer, user_agent, cookies, ptime
>> >
>> > Saurabh.
>> > --
>> > http://nandz.blogspot.com
>> > http://foodieforlife.blogspot.com
>> >
>>
>>
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com
>



-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
