> hive.exec.compress.output=true is the correct option. Can you post the
> "insert" command that you run which produced non-compressed results?
> Is the output in TextFileFormat or SequenceFileFormat?
Here's the query. Both tables are SequenceFile tables: raw_compressed holds
the raw log lines, and raw has a separate column for each data field.
  from raw_compressed
  insert overwrite table raw partition (dt='2009-04-02')
  select transform(line) using 'parse_logs.rb'
    as ip_address, aid, uid, ts, method, uri, response, referer,
       user_agent, cookies, ptime
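
For reference, here is a sketch of the session settings this thread is about, as they might be issued before the query above. Only hive.exec.compress.output=true comes from the thread. The block compression type and the gzip codec are assumptions chosen for illustration, using the standard Hadoop property names.

```sql
-- From the thread: ask Hive to compress the final query output.
set hive.exec.compress.output=true;
-- Assumption: block-level compression for SequenceFile output.
set mapred.output.compression.type=BLOCK;
-- Assumption: gzip as the codec; another installed Hadoop codec could be used instead.
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
```

Whether these settings take effect for a transform query is the open question here, so this is a starting point for checking the configuration, not a confirmed fix.
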
Saurabh.
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com