[
https://issues.apache.org/jira/browse/MAPREDUCE-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871993#action_12871993
]
Yuri Pradkin commented on MAPREDUCE-477:
----------------------------------------
Just tried this on our cluster:
echo "content1" | bzip2 - >foo.bz2
echo "content2" | bzip2 - >>foo.bz2
bzcat foo.bz2
{quote}
content1
content2
{quote}
hdfs -put foo.bz2 foo.bz2
hadoop jar .../hadoop-streaming.jar -input foo.bz2 -output foo -mapper /bin/cat -reducer /bin/cat
This completes after scheduling a ridiculous number of splits (98).
hdfs -getmerge foo foo
cat foo
{quote}
content1
content2
{quote}
mapreduce/common: trunk rev 897063
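
For anyone unfamiliar with what "concatenated .bz2" means at the byte level, here is a minimal sketch using Python's standard bz2 module. It is not Hadoop's BZip2Codec and makes no claim about how that codec works; the helper names first_stream_only and all_streams are hypothetical and exist only for this illustration. It shows that a reader which stops at the first end-of-stream marker would return only "content1", while a reader that restarts on the leftover bytes returns both lines, as bzcat does above.

```python
import bz2
from typing import Tuple


def first_stream_only(data: bytes) -> Tuple[bytes, bytes]:
    """Decode only the first bzip2 stream; return (decoded, leftover bytes)."""
    decomp = bz2.BZ2Decompressor()
    decoded = decomp.decompress(data)
    # After the end-of-stream marker, remaining input lands in unused_data.
    return decoded, decomp.unused_data


def all_streams(data: bytes) -> bytes:
    """Decode every concatenated stream by restarting on the leftover bytes."""
    chunks = []
    while data:
        decomp = bz2.BZ2Decompressor()
        chunks.append(decomp.decompress(data))
        data = decomp.unused_data
    return b"".join(chunks)


# Equivalent of: echo "content1" | bzip2 - >foo.bz2
#                echo "content2" | bzip2 - >>foo.bz2
concatenated = bz2.compress(b"content1\n") + bz2.compress(b"content2\n")

first, leftover = first_stream_only(concatenated)
everything = all_streams(concatenated)

print(first)
print(len(leftover) > 0)
print(everything)
```

The leftover bytes after the first stream are themselves a complete bzip2 stream, which is why restarting a fresh decompressor on them works. Python's own bz2.decompress also accepts multi-stream input (since Python 3.3).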
> Support for reading bzip2 compressed file created using concatenation of
> multiple .bz2 files
> ---------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-477
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-477
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Suhas Gogate
> Priority: Minor
>
> Bzip2Codec supported in Hadoop 0.19/0.20 should support reading bzip2
> compressed files created by concatenating multiple .bz2 files