Support starts in 0.21, yes. It will soon be backported and available in 1.1.0.
A patch against 1.0.0 that enables bzip2 splitting is available at
https://issues.apache.org/jira/browse/HADOOP-7823, if you feel up to patching
and rebuilding.
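
As a rough illustration of the version rule described in this thread, the
sketch below guesses whether a given Hadoop release splits bzip2 input. The
function names are hypothetical and the thresholds come only from the
versions mentioned here (0.21 for the 0.x line, 1.1.0 for the 1.x line). It
knows nothing about hand-patched builds or vendor distributions, so treat
the result as a hint rather than an authoritative check.

```python
def parse_version(version):
    """Turn a dotted version string such as '0.20.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def bzip2_splitting_expected(version):
    """Guess whether a stock Hadoop release splits bzip2-compressed input.

    Assumption taken from this thread only: support exists from 0.21 in the
    0.x line and from 1.1.0 in the 1.x line. Releases after 1.x are assumed
    to keep the feature. A manually patched 1.0.0 build is not detected.
    """
    v = parse_version(version)
    if v[0] == 0:
        # 0.x line: 0.20.x lacks the feature, 0.21 and later have it.
        return v >= (0, 21)
    # 1.x line and later: 1.0.x lacks the feature, 1.1 and later have it.
    return v >= (1, 1)


if __name__ == "__main__":
    for release in ("0.20.2", "0.21.0", "1.0.0", "1.1.0"):
        print(release, bzip2_splitting_expected(release))
```

On a release where this returns False, a bzip2 file would be expected to go
to a single mapper as a whole rather than being split.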

    - Tim.
________________________________________
From: Rohit Bakhshi [ro...@hortonworks.com]
Sent: Friday, February 24, 2012 7:53 AM
To: common-user@hadoop.apache.org
Subject: Re: BZip2 Splittable?

Daniel,

I just noticed your Hadoop version - 0.20.2.

The JIRA fix below is for Hadoop 0.21.0, which is a later version than yours,
so bzip2 splitting may not be supported on your cluster.

--
Rohit Bakhshi
www.hortonworks.com (http://www.hortonworks.com/)




On Friday, February 24, 2012 at 7:49 AM, Rohit Bakhshi wrote:

> Hi Daniel,
>
> The bzip2 compression codec allows for splittable files.
>
> According to this Hadoop JIRA improvement, splitting of bzip2 compressed 
> files in Hadoop jobs is supported:
> https://issues.apache.org/jira/browse/HADOOP-4012
>
> --
> Rohit Bakhshi
> www.hortonworks.com (http://www.hortonworks.com/)
>
>
>
>
> On Friday, February 24, 2012 at 7:43 AM, Daniel Baptista wrote:
>
> > Hi All,
> >
> > I have a cluster of 6 datanodes, all running Hadoop version 0.20.2
> > (r911707), that takes a series of bzip2-compressed text files as input.
> >
> > I have read conflicting articles on whether Hadoop can split these bzip2
> > files. Can anyone give me a definite answer?
> >
> > Thanks in advance, Dan.
>