Hey Xavier,

The functionality you are looking for was added in 0.19 and later: http://issues.apache.org/jira/browse/HADOOP-3828. If you upgrade your cluster to CDH2, you should be good to go.
Regards,
Jeff

On Mon, Oct 19, 2009 at 10:58 AM, <[email protected]> wrote:

> Hi Everybody,
>
> I'm doing a project where I have to read a large set of compressed files
> (gz). I'm using Python and Streaming to achieve my goals. However, I have
> a problem: there are corrupt compressed files that are killing my
> map/reduce jobs.
>
> My environment is the following:
> Hadoop-0.18.3 (CDH1)
>
> Do you have any recommendations on how to manage this case?
> How can I catch that exception using Python so that my jobs don't fail?
> How can I identify these files using Python and move them to a corrupt
> file folder?
>
> I really appreciate any recommendations.
>
> Xavier
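For the pre-flight check Xavier asks about, one approach is to validate each .gz file before submitting the job and quarantine the unreadable ones. A minimal Python sketch of that idea (the directory layout and function names here are hypothetical, not from Hadoop or CDH):

```python
import gzip
import os
import shutil


def is_valid_gzip(path, chunk_size=1024 * 1024):
    """Return True if the file decompresses fully without error."""
    try:
        with gzip.open(path, "rb") as f:
            # Read to EOF in chunks; a truncated or corrupt
            # stream raises OSError/EOFError somewhere along the way.
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError):
        return False


def quarantine_corrupt(src_dir, corrupt_dir):
    """Move unreadable .gz files from src_dir into corrupt_dir.

    Returns the list of file names that were moved.
    """
    os.makedirs(corrupt_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        if not name.endswith(".gz"):
            continue
        path = os.path.join(src_dir, name)
        if not is_valid_gzip(path):
            shutil.move(path, os.path.join(corrupt_dir, name))
            moved.append(name)
    return moved
```

Note this only covers files on a local filesystem; for inputs already on HDFS you would have to pull each file down (or pipe it through `hadoop fs -cat`) before checking, since the corruption is only detected when the stream is actually decompressed.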
