Hi lulynn,

If you are using BigInsights, follow these steps to enable compression:

set output.compression.enabled true;
set output.compression.codec org.apache.hadoop.io.compress.GzipCodec;
data = LOAD '/bigdata/sample_data/NOAA_Weather_csv/2011/999999-53019-2011.csv.gz' using PigStorage(',') as (projectname:chararray);
STORE data INTO '/comCodecGzip';

Cheers,
Krishna

On Tue, Nov 18, 2014 at 2:04 PM, lulynn_2008 <lulynn_2...@163.com> wrote:

> BTW, cmx is "com.ibm.biginsights.compress.CmxCodec", the related jar is
> ibm-compression.jar.
>
> At 2014-11-18 15:49:53, "lulynn_2008" <lulynn_2...@163.com> wrote:
> > Hi All,
> > I am trying to use CMX as temp file compression codec, i.e
> > SET pig.tmpfilecompression true;
> > SET pig.tmpfilecompression.codec cmx;
> >
> > but following errors happened:
> > Caused by: java.io.IOException: Invalid temporary file compression codec
> > []. Expected compression codecs are gz and lzo
> >
> > from pig cookbook I found following line
> >
> > "pig.tmpfilecompression.codec - Specifies which compression codec to use.
> > Currently, Pig accepts "gz" and "lzo" as possible values. However, because
> > LZO is under GPL license (and disabled by default) you will need to
> > configure your cluster to use the LZO codec to take advantage of this
> > feature. For details, see
> > http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ. "
> >
> > Is there any workaround? or there are roadmaps for adding cmx as a
> > supported codec ? I was using pig 0.12.0.
> >
> > Thanks