[ https://issues.apache.org/jira/browse/PIG-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584612#comment-14584612 ]
Tomas Hudik commented on PIG-4599:
----------------------------------

[~knoguchi] - can you be more specific? The documentation (http://pig.apache.org/docs/r0.11.1/func.html#handling-compression) says:

"Support for compression is determined by the load/store function. PigStorage and TextLoader support gzip and bzip compression for both read (load) and write (store). BinStorage does not support compression."

Does your comment mean that tar is not supported, but gz/bz2 are? That would be strange, since files usually combine the two into one: tar.gz (tar and gzip combined).

> tar.gz compression doesn't produce correct output
> -------------------------------------------------
>
>                 Key: PIG-4599
>                 URL: https://issues.apache.org/jira/browse/PIG-4599
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12.1
>            Reporter: Tomas Hudik
>              Labels: compression, easytest
>
> I'm not completely sure whether this is the right place to put this issue:
> Pig is involved, but Pig leaves decompression of tar.gz to
> hadoop-common.
> How to reproduce the issue:
> # simple file (file1) with arbitrary text lines put into in1 in HDFS
> # same file (file1) compressed by tar -cvzf file1.tar.gz file put into in2 in
> HDFS
> # issue simple pig commands in pig:
> {quote}
> raw = load 'in1/' USING TextLoader AS (line: bytearray);
> dump raw;
> {quote}
> run for both (compressed and uncompressed file)
> # in case of the compressed version you will get a strange 1st line
> {quote}
> a0000644000570000001440000000002512534073736011260 0ustar loadhadoopusersa
> ...
> {quote}
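
A possible explanation for the strange first line, sketched in Python rather than Pig purely for illustration: tar is an archive format and gzip is the compression applied on top of it, so removing only the gzip layer leaves a tar stream that begins with a 512-byte header block (file name, mode, owner, size, the "ustar" magic). The sketch assumes that the loader strips only the gzip layer; it does not reproduce what hadoop-common actually does. The file name and contents are made up for the example.

```python
import gzip
import io
import tarfile


def make_tar_gz(name: str, payload: bytes) -> bytes:
    """Build an in-memory .tar.gz archive holding a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def gunzip_only(archive: bytes) -> bytes:
    """Remove only the gzip layer; the result is still a tar stream."""
    return gzip.decompress(archive)


def gunzip_and_untar(archive: bytes, name: str) -> bytes:
    """Remove both layers and return the contents of one archive member."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return tar.extractfile(name).read()


# Hypothetical file name and contents, chosen only for this example.
payload = b"first line\nsecond line\n"
archive = make_tar_gz("file1", payload)

stream = gunzip_only(archive)
header = stream[:512]           # tar header block precedes the data
magic = header[257:262]         # "ustar" magic sits at offset 257
data_after_header = stream[512:512 + len(payload)]

print(header[:5], magic)        # file name and magic, as seen in the garbage line
print(gunzip_and_untar(archive, "file1") == payload)
```

If that assumption holds, a line-oriented loader would read the header bytes as part of the first record, which matches the "ustar" text in the output above. Compressing the file with plain gzip (no tar step) would avoid the extra header.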