Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by Evan:
http://wiki.apache.org/hadoop/CompressedStorage

------------------------------------------------------------------------------
  SET hive.exec.compress.output=TRUE;
  SET io.seqfile.compression.type=BLOCK; -- NONE/RECORD/BLOCK (see below)
  INSERT OVERWRITE TABLE raw_sequnce SELECT LINE FROM raw;
+ 
+ INSERT OVERWRITE TABLE raw_sequnce SELECT * FROM raw; -- The previous line did not work for me, but this does.
  }}}
  
  The value of io.seqfile.compression.type determines how the compression is performed. If you set it to RECORD, you will get as many output files as the number of map/reduce jobs. If you set it to BLOCK, you will get as many output files as there were input files. There is a tradeoff involved here -- a larger number of output files => more parallel map jobs => a lower compression ratio.
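  For readers seeing only this excerpt, a minimal end-to-end sketch of the steps being discussed might look like the following. Only the SET and INSERT lines come from the page itself; the CREATE TABLE statements and the column name (line) are assumptions added for illustration.
  
  {{{
  -- Source table holding the uncompressed text (assumed definition, not from the page).
  CREATE TABLE raw (line STRING)
    STORED AS TEXTFILE;
  
  -- Target table stored as a SequenceFile so the compressed output stays splittable
  -- (assumed definition, not from the page).
  CREATE TABLE raw_sequnce (line STRING)
    STORED AS SEQUENCEFILE;
  
  -- Compress the job output and choose block-level compression (from the page).
  SET hive.exec.compress.output=TRUE;
  SET io.seqfile.compression.type=BLOCK;  -- NONE/RECORD/BLOCK
  
  -- Copy the data; the compression settings apply to the files this INSERT writes.
  INSERT OVERWRITE TABLE raw_sequnce SELECT * FROM raw;
  }}}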
