Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/CompressedStorage" page has been changed by Peter Voss.
http://wiki.apache.org/hadoop/Hive/CompressedStorage?action=diff&rev1=5&rev2=6

--------------------------------------------------

  SET hive.exec.compress.output=true;
  SET io.seqfile.compression.type=BLOCK; -- NONE/RECORD/BLOCK (see below)
- INSERT OVERWRITE TABLE raw_sequnce SELECT LINE FROM raw;
+ INSERT OVERWRITE TABLE raw_sequence SELECT LINE FROM raw;

- INSERT OVERWRITE TABLE raw_sequnce SELECT * FROM raw; -- The previous line did not work for me, but this does.
+ INSERT OVERWRITE TABLE raw_sequence SELECT * FROM raw; -- The previous line did not work for me, but this does.
  }}}

  The value for io.seqfile.compression.type determines how the compression is performed. If you set it to RECORD you will get as many output files as the number of map/reduce jobs. If you set it to BLOCK, you will get as many output files as there were input files. There is a tradeoff involved here -- a large number of output files => more parallel map jobs => lower compression ratio.
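
As a rough sketch of how the statements in this diff might fit together end to end: the table definitions below are not part of this change and are assumptions made for illustration. In particular, the single STRING column named `line` is only inferred from the `SELECT LINE` statement, and `raw_sequence` is assumed to be stored as a SequenceFile, since io.seqfile.compression.type only affects that format.

```sql
-- Assumed schema (not shown in the diff): one STRING column named line.
CREATE TABLE raw (line STRING);

-- Assumed destination table, stored as a SequenceFile so that
-- io.seqfile.compression.type has an effect on its files.
CREATE TABLE raw_sequence (line STRING)
  STORED AS SEQUENCEFILE;

-- Compress the files written by the INSERT below.
SET hive.exec.compress.output=true;

-- BLOCK compresses batches of records together;
-- RECORD compresses each record's value on its own;
-- NONE leaves the records uncompressed.
SET io.seqfile.compression.type=BLOCK;

-- Rewrite the contents of raw into the compressed SequenceFile table.
INSERT OVERWRITE TABLE raw_sequence SELECT * FROM raw;
```

Rerunning the INSERT with the type set to RECORD or NONE and comparing the size and number of files under the table's directory is one way to observe the tradeoff described above on your own data.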
