Hi Saurabh,

hive.exec.compress.output=true is the correct option. Can you post the "insert" command you ran that produced the uncompressed results? Is the output stored as TextFile or SequenceFile?
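In case it helps while you dig that up, a minimal session that I would expect to produce gzip-compressed output looks roughly like this (the codec is just an example; the table and partition names are taken from your query below):

    set hive.exec.compress.output=true;
    set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

    from raw_compressed
    insert overwrite table raw partition (dt='2009-04-02')
    select transform(line) using 'parse_logs.rb' as ip_address, aid, uid,
           ts, method, uri, response, referer, user_agent, cookies, ptime;

With TextFile output you should then see part files with a .gz suffix in the partition directory, e.g. via hadoop fs -ls /user/hive/warehouse/raw/dt=2009-04-02 (assuming the default warehouse location). If the files come out with no compression suffix, the setting is not reaching the final job.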
Zheng

On Wed, Aug 12, 2009 at 10:52 PM, Saurabh Nanda <[email protected]> wrote:
> I've even tried setting "mapred.output.compress=true" in hadoop-site.xml
> and restarting the cluster, but in vain.
>
> How do I get compression to work in Hive-trunk? Is it something to do
> with the Hive query as well? Here's what I'm trying:
>
> from raw_compressed
> insert overwrite table raw partition (dt='2009-04-02')
> select transform(line) using 'parse_logs.rb' as ip_address, aid, uid,
> ts, method, uri, response, referer, user_agent, cookies, ptime
>
> Saurabh.
>
> On Thu, Aug 13, 2009 at 9:44 AM, Saurabh Nanda <[email protected]> wrote:
>>
>> I migrated from Hive-0.3.0 to Hive-trunk (r802989, compiled against
>> Hadoop 0.18.3) and copied over metastore_db & the conf directory. Output
>> compression used to work with my earlier Hive installation, but it seems
>> to have stopped working now. Are the configuration parameters different
>> from Hive-0.3?
>>
>> "set -v" on Hive-trunk shows the following relevant configuration
>> parameters:
>>
>> mapred.output.compress=false
>> hive.exec.compress.intermediate=false
>> hive.exec.compress.output=true
>> mapred.output.compression.type=BLOCK
>> mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
>> mapred.map.output.compression.codec=org.apache.hadoop.io.compress.DefaultCodec
>> io.seqfile.compress.blocksize=1000000
>> io.seqfile.lazydecompress=true
>> mapred.compress.map.output=false
>> io.compression.codecs=org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec
>>
>> What am I missing?
>>
>> Saurabh.
>> --
>> http://nandz.blogspot.com
>> http://foodieforlife.blogspot.com
>
>
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com

--
Yours,
Zheng
