I can't find the previous log entry anywhere -- LOG.info("Writing to temp
file: FS " + outPath); -- where should I be looking? Should I configure
log4j differently for LOG.info to show up?

Saurabh.

On Fri, Aug 14, 2009 at 10:54 AM, Saurabh Nanda <[email protected]> wrote:

> Files in table raw_compressed start with this header:
> SEQ|"org.apache.hadoop.io.BytesWritable|org.apache.hadoop.io.Text||'org.apache.hadoop.io.compress.GzipCodec
>
> Files in table raw start with this header:
> SEQ|"org.apache.hadoop.io.BytesWritable|org.apache.hadoop.io.Text
>
> File size for raw_compressed: 250MB
> File size for raw: 2150 MB
>
> After "boolean isCompressed = conf.getCompressed();" should I put
> "LOG.info("Compression config is:" + isCompressed);" ?
>
> Saurabh.
>
> On Fri, Aug 14, 2009 at 9:51 AM, Zheng Shao <[email protected]> wrote:
>
>> What is the average file size in table raw?
>>
>> Can you put a log line in FileSinkOperator.java:107 ? That will tell
>> us whether compression is turned on or not.
>>
>> Zheng
>>
>> On Thu, Aug 13, 2009 at 9:06 PM, Saurabh Nanda <[email protected]> wrote:
>> >
>> >> hive.exec.compress.output=true is the correct option. Can you post the
>> >> "insert" command that you run which produced non-compressed results?
>> >> Is the output in TextFileFormat or SequenceFileFormat?
>> >
>> > Here's the query. raw_compressed is a SequenceFile table with raw
>> > lines. raw is a SequenceFile table with separate columns for each
>> > data field.
>> >
>> > from raw_compressed
>> > insert overwrite table raw partition (dt='2009-04-02')
>> > select transform(line) using 'parse_logs.rb' as ip_address, aid, uid,
>> > ts, method, uri, response, referer, user_agent, cookies, ptime
>> >
>> > Saurabh.
>> > --
>> > http://nandz.blogspot.com
>> > http://foodieforlife.blogspot.com
>>
>> --
>> Yours,
>> Zheng
>
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com

--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
