Daniel, I just noticed your Hadoop version - 0.20.2.
The JIRA fix below targets Hadoop 0.21.0, not 0.20.2, so bzip2 splitting may not be supported on your version of Hadoop.

--
Rohit Bakhshi
www.hortonworks.com (http://www.hortonworks.com/)

On Friday, February 24, 2012 at 7:49 AM, Rohit Bakhshi wrote:

> Hi Daniel,
>
> The bzip2 compression codec allows for splittable files.
>
> According to this Hadoop JIRA improvement, splitting of bzip2 compressed
> files in Hadoop jobs is supported:
> https://issues.apache.org/jira/browse/HADOOP-4012
>
> --
> Rohit Bakhshi
> www.hortonworks.com (http://www.hortonworks.com/)
>
> On Friday, February 24, 2012 at 7:43 AM, Daniel Baptista wrote:
>
> > Hi All,
> >
> > I have a cluster of 6 datanodes, all running Hadoop version 0.20.2, r911707,
> > that take a series of bzip2 compressed text files as input.
> >
> > I have read conflicting articles regarding whether or not Hadoop can split
> > these bzip2 files. Can anyone give me a definitive answer?
> >
> > Thanks in advance, Dan.
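
For background on why bzip2 files can be split at all: every compressed block in a bzip2 stream begins with the same 48-bit marker (0x314159265359), so a reader can start at an arbitrary offset and scan forward to the next block boundary. The sketch below only illustrates that property in plain Python. It is not Hadoop code, the helper name `find_block_offsets` and the sample data are made up for the example, and it says nothing about whether a particular Hadoop release actually splits on these markers.

```python
import bz2
import random

# 48-bit marker that starts every bzip2 compressed block.
BLOCK_MAGIC = bytes.fromhex("314159265359")


def find_block_offsets(data: bytes) -> list:
    """Return the bit offsets of every block marker found in `data`.

    Blocks are bit-aligned, not byte-aligned, so the data is shifted left
    by 0..7 bits and searched with an ordinary byte search after each shift.
    """
    total_bits = len(data) * 8
    mask = (1 << total_bits) - 1
    as_int = int.from_bytes(data, "big")
    offsets = []
    for shift in range(8):
        shifted = ((as_int << shift) & mask).to_bytes(len(data), "big")
        pos = shifted.find(BLOCK_MAGIC)
        while pos != -1:
            offsets.append(pos * 8 + shift)
            pos = shifted.find(BLOCK_MAGIC, pos + 1)
    return sorted(offsets)


# Hypothetical input: about 250 kB of incompressible bytes, compressed with
# the smallest block size (compresslevel=1) so the stream holds several blocks.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(250_000))
compressed = bz2.compress(payload, compresslevel=1)
offsets = find_block_offsets(compressed)
print(offsets)
```

Each offset printed is a point where a decompressor could, in principle, begin reading independently of the preceding data, which is the property a splittable input format relies on.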