Hi Rohit, thanks for the response. This is pretty much what I expected, and 
hopefully it adds weight to my other thoughts...

Could this mean that all my datanodes are being sent all of the data, or that 
only one datanode is executing the job?

Thanks again, Dan.

-----Original Message-----
From: Rohit Bakhshi [mailto:ro...@hortonworks.com] 
Sent: 24 February 2012 15:54
To: common-user@hadoop.apache.org
Subject: Re: BZip2 Splittable?

Daniel, 

I just noticed your Hadoop version - 0.20.2.

The JIRA fix below is for Hadoop 0.21.0, a later release than yours, so bzip2 
splitting may not be supported on your version of Hadoop. 
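
As a rough illustration of what this means for parallelism: when an input 
format cannot split a compressed file, each file is read start to finish by a 
single map task, so the number of map tasks is bounded by the number of input 
files rather than by their size. The sketch below is a simplified model, not 
Hadoop's actual split calculation; the function name, file sizes, and 64 MB 
split size are all assumptions chosen for illustration.

```python
import math

MB = 1024 * 1024


def estimate_map_tasks(file_sizes, split_size, splittable):
    """Rough estimate of map tasks for a list of input file sizes in bytes.

    Simplified model (an assumption, not Hadoop's exact algorithm).
    """
    if not splittable:
        # A non-splittable file is processed whole by one map task.
        return len(file_sizes)
    # A splittable file yields roughly one task per split-sized chunk.
    return sum(max(1, math.ceil(size / split_size)) for size in file_sizes)


# Hypothetical input: three 640 MB bzip2 files, 64 MB splits.
sizes = [640 * MB] * 3
print(estimate_map_tasks(sizes, 64 * MB, splittable=False))  # 3
print(estimate_map_tasks(sizes, 64 * MB, splittable=True))   # 30
```

Under this model, three large non-splittable files would keep at most three 
map tasks busy at once, regardless of how many datanodes are in the cluster.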

-- 
Rohit Bakhshi
www.hortonworks.com (http://www.hortonworks.com/)




On Friday, February 24, 2012 at 7:49 AM, Rohit Bakhshi wrote:

> Hi Daniel, 
> 
> The bzip2 compression codec allows for splittable files.
> 
> According to this Hadoop JIRA improvement, splitting of bzip2 compressed 
> files in Hadoop jobs is supported:
> https://issues.apache.org/jira/browse/HADOOP-4012
> 
> -- 
> Rohit Bakhshi
> www.hortonworks.com (http://www.hortonworks.com/)
> 
> 
> 
> 
> On Friday, February 24, 2012 at 7:43 AM, Daniel Baptista wrote:
> 
> > Hi All,
> > 
> > I have a cluster of 6 datanodes, all running hadoop version 0.20.2, r911707 
> > that take a series of bzip2 compressed text files as input.
> > 
> > I have read conflicting articles regarding whether or not hadoop can split 
> > these bzip2 files, can anyone give me a definite answer?
> > 
> > Thanks in advance, Dan. 
> 

