[ 
https://issues.apache.org/jira/browse/PIG-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791223#action_12791223
 ] 

Jeff Zhang commented on PIG-1110:
---------------------------------

Richard,

Regarding the second point, I know that TextInputFormat and TextOutputFormat do 
not support the .bz file extension internally. But controlling compression is 
not the responsibility of PigStorage(); it is still the responsibility of the 
OutputFormat. If you want to support .bz output, you have to add the 
following code to PigStorage():
{code}
if (location.endsWith(".bz")) {
    FileOutputFormat.setCompressOutput(job, true);
    FileOutputFormat.setOutputCompressorClass(job, BZipCodec.class);
}
{code}

So in the end it is still Hadoop's OutputFormat that controls the compression, 
not PigStorage(). And even if you add the above code to PigStorage(), it still 
won't work: you also have to add BZipCodec.class to Hadoop's classpath and set 
the CompressionCodec in the configuration.

In short, I do not think it makes sense to use the output folder name to 
determine the CompressionCodec.
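
As a rough illustration of the alternative, the sketch below picks the codec 
from explicit job settings and ignores the output location entirely. It is not 
Pig or Hadoop code: a plain Map stands in for Hadoop's Configuration, the class 
and method names are made up, org.example.BZipCodec is a hypothetical codec 
name, and the two property names are assumed to be the ones that the 
FileOutputFormat setters above write to.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch only: a plain Map stands in for Hadoop's job Configuration.
final class OutputCompressionSketch {
    // Assumed property names, modelled on Hadoop's old-style job settings.
    static final String COMPRESS_KEY = "mapred.output.compress";
    static final String CODEC_KEY = "mapred.output.compression.codec";

    /** Returns the configured codec class name, or empty if compression is off. */
    static Optional<String> codecFor(Map<String, String> conf, String location) {
        // The location is deliberately ignored: a directory named "out.bz"
        // does not switch compression on by itself.
        boolean compress =
            Boolean.parseBoolean(conf.getOrDefault(COMPRESS_KEY, "false"));
        if (!compress) {
            return Optional.empty();
        }
        return Optional.ofNullable(conf.get(CODEC_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // No settings yet, so no codec even though the path ends in ".bz".
        System.out.println(codecFor(conf, "/user/pig/out.bz").isPresent());

        conf.put(COMPRESS_KEY, "true");
        conf.put(CODEC_KEY, "org.example.BZipCodec"); // hypothetical class name
        // The codec now comes from the settings, whatever the path is called.
        System.out.println(codecFor(conf, "/user/pig/out").orElse("none"));
    }
}
```

With this shape, the user decides on compression through the job settings, and 
the name of the output folder carries no hidden meaning.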


> Handle compressed file formats -- Gz, BZip with the new proposal
> ----------------------------------------------------------------
>
>                 Key: PIG-1110
>                 URL: https://issues.apache.org/jira/browse/PIG-1110
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>         Attachments: PIG-1110.patch, PIG_1110_Jeff.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
