Daniel, 

I just noticed your Hadoop version - 0.20.2.

The JIRA fix below went into Hadoop 0.21.0, which is a later release than 
yours, so bzip2 splitting may not be supported on your version of Hadoop. 

-- 
Rohit Bakhshi
www.hortonworks.com (http://www.hortonworks.com/)




On Friday, February 24, 2012 at 7:49 AM, Rohit Bakhshi wrote:

> Hi Daniel, 
> 
> The bzip2 compression codec produces files that can be split.
> 
> According to this Hadoop JIRA improvement, splitting of bzip2 compressed 
> files in Hadoop jobs is supported:
> https://issues.apache.org/jira/browse/HADOOP-4012
> 
> -- 
> Rohit Bakhshi
> www.hortonworks.com (http://www.hortonworks.com/)
> 
> 
> 
> 
> On Friday, February 24, 2012 at 7:43 AM, Daniel Baptista wrote:
> 
> > Hi All,
> > 
> > I have a cluster of 6 datanodes, all running Hadoop version 0.20.2 (r911707), 
> > that takes a series of bzip2-compressed text files as input.
> > 
> > I have read conflicting articles regarding whether or not Hadoop can split 
> > these bzip2 files. Can anyone give me a definite answer?
> > 
> > Thanks in advance, Dan. 
> 
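
For anyone wondering why bzip2 can be split at all: the format compresses 
data in independent blocks, and each block starts with a fixed 48-bit marker 
(0x314159265359). A reader dropped into the middle of a file can scan forward 
to the next marker and start decompressing there. The sketch below is only an 
illustration of that property of the file format using Python's standard bz2 
module. It is not Hadoop's reader, and it says nothing about whether a given 
Hadoop release actually does the splitting. The function name 
find_block_offsets and the sample data are made up for the example.

```python
import bz2

# 48-bit marker that opens every bzip2 block.
BLOCK_MAGIC = 0x314159265359
MASK_48 = (1 << 48) - 1


def find_block_offsets(data: bytes) -> list:
    """Return the bit offsets at which the block marker appears.

    Blocks are not byte-aligned, so the scan moves one bit at a time
    and keeps the most recent 48 bits in a sliding window.
    """
    offsets = []
    window = 0
    for byte_index, byte in enumerate(data):
        for bit in range(7, -1, -1):
            window = ((window << 1) | ((byte >> bit) & 1)) & MASK_48
            end_bit = byte_index * 8 + (7 - bit)  # index of the bit just read
            if window == BLOCK_MAGIC and end_bit >= 47:
                offsets.append(end_bit - 47)
    return offsets


# Hypothetical sample input: about 370 KB of text, compressed with the
# smallest block size (compresslevel=1) so that several blocks are produced.
raw = "".join(f"record {i}\n" for i in range(30000)).encode("ascii")
compressed = bz2.compress(raw, compresslevel=1)

offsets = find_block_offsets(compressed)
print("uncompressed bytes:", len(raw))
print("compressed bytes:  ", len(compressed))
print("block markers at bit offsets:", offsets)
```

The first marker sits at bit 32, directly after the 4-byte stream header. The 
later markers are the points where a split-aware reader could begin. A real 
reader also has to cope with the marker bit pattern turning up by chance 
inside compressed data, which this sketch ignores.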
