[
https://issues.apache.org/jira/browse/PIG-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284131#comment-14284131
]
Daniel Dai commented on PIG-4341:
---------------------------------
I think we have reached the point where we should use a config file to track
the codec mapping rather than hard-coding it. We can add the following entries
to pig-default.properties:
pig.tmpfilecompression.seqfile.codecs=gz,gzip,cmx,lzo,snappy,bzip2
pig.tmpfilecompression.tfile.codecs=gz,gzip,lzo
pig.tmpfilecompression.compression.codec.gz.class=org.apache.hadoop.io.compress.GzipCodec
pig.tmpfilecompression.compression.codec.gzip.class=org.apache.hadoop.io.compress.GzipCodec
pig.tmpfilecompression.compression.codec.lzo.class=com.hadoop.compression.lzo.LzoCodec
pig.tmpfilecompression.compression.codec.snappy.class=org.xerial.snappy.SnappyCodec
pig.tmpfilecompression.compression.codec.bzip2.class=org.apache.hadoop.io.compress.BZip2Codec
pig.tmpfilecompression.compression.codec.cmx.class=com.ibm.biginsights.compress.CmxCodec
And refactor the Pig code to use these.
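A rough sketch of how the lookup could work once the mapping lives in properties. The class name TmpFileCodecConfig and both method names are hypothetical, not existing Pig code, and the property keys are the ones proposed above rather than keys Pig reads today:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical helper: resolves a codec name to its class name using the
// property keys proposed in this comment. Not part of Pig.
public class TmpFileCodecConfig {
    static final String PREFIX = "pig.tmpfilecompression.";

    // Reads the comma-separated codec list for a storage kind
    // ("seqfile" or "tfile") and normalizes the names to lower case.
    static List<String> supportedCodecs(Properties props, String storage) {
        String raw = props.getProperty(PREFIX + storage + ".codecs", "");
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Returns the configured codec class name, or throws if the codec is
    // not listed for the storage kind or has no class entry.
    static String codecClassName(Properties props, String storage, String codec)
            throws IOException {
        String key = codec.trim().toLowerCase(Locale.ROOT);
        List<String> supported = supportedCodecs(props, storage);
        if (!supported.contains(key)) {
            throw new IOException("Invalid temporary file compression codec ["
                    + codec + "]. Expected compression codecs for " + storage
                    + " are " + String.join(",", supported) + ".");
        }
        String className =
                props.getProperty(PREFIX + "compression.codec." + key + ".class");
        if (className == null) {
            throw new IOException("No codec class configured for [" + key + "]");
        }
        return className;
    }
}
```

With the entries above loaded into a Properties object, codecClassName(props, "seqfile", "cmx") would return the CMX class name, while codecClassName(props, "tfile", "cmx") would throw, since cmx is not in the tfile list. Adding a new codec then only needs a properties change.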
> Add CMX support to pig.tmpfilecompression.codec
> -----------------------------------------------
>
> Key: PIG-4341
> URL: https://issues.apache.org/jira/browse/PIG-4341
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.13.0
> Reporter: fang fang chen
> Assignee: fang fang chen
> Fix For: 0.15.0
>
> Attachments: PIG-4341.patch
>
>
> Pig supports compression (GZ, GZIP, LZO), but the latest Pig does not yet
> support the CMX codec. cmx is "com.ibm.biginsights.compress.CmxCodec". This
> information can also be found in the documentation for the latest release,
> pig-0.13.0: http://pig.apache.org/docs/r0.13.0/perf.html.
> I also tested the CMX codec with pig-0.13.0, using the following
> settings:
> SET pig.tmpfilecompression true;
> SET pig.tmpfilecompression.codec cmx;
> Error:
> Caused by: java.io.IOException: Invalid temporary file compression codec
> [cmx]. Expected compression codecs for org.apache.pig.impl.io.TFileStorage
> are GZ,GZIP,LZO.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)