Hey Xavier,

The functionality you are looking for was added in 0.19 and later: http://issues.apache.org/jira/browse/HADOOP-3828. If you upgrade your cluster to CDH2, you should be good to go.
Regards,
Jeff

On Mon, Oct 19, 2009 at 10:58 AM, <[email protected]> wrote:

> Hi Everybody,
>
> I'm doing a project where I have to read a large set of compressed files
> (gz). I'm using Python and Streaming to achieve my goals. However, I have
> a problem: there are corrupt compressed files that are killing my
> map/reduce jobs.
>
> My environment is the following:
> Hadoop-0.18.3 (CDH1)
>
> Do you have any recommendations on how to manage this case?
> How can I catch that exception using Python so that my jobs don't fail?
> How can I identify these files using Python and move them to a corrupt
> file folder?
>
> I really appreciate any recommendations.
>
> Xavier
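For the pre-flight check Xavier asks about, one approach is to validate each .gz file before submitting the job and quarantine the unreadable ones. A minimal Python sketch of that idea (the directory layout and function names here are hypothetical, not from Hadoop or CDH):

```python
import gzip
import os
import shutil


def is_valid_gzip(path, chunk_size=1024 * 1024):
    """Return True if the file decompresses fully without error."""
    try:
        with gzip.open(path, "rb") as f:
            # Read to EOF in chunks; a truncated or corrupt
            # stream raises OSError/EOFError somewhere along the way.
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError):
        return False


def quarantine_corrupt(src_dir, corrupt_dir):
    """Move unreadable .gz files from src_dir into corrupt_dir.

    Returns the list of file names that were moved.
    """
    os.makedirs(corrupt_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        if not name.endswith(".gz"):
            continue
        path = os.path.join(src_dir, name)
        if not is_valid_gzip(path):
            shutil.move(path, os.path.join(corrupt_dir, name))
            moved.append(name)
    return moved
```

Note this only covers files on a local filesystem; for inputs already on HDFS you would have to pull each file down (or pipe it through `hadoop fs -cat`) before checking, since the corruption is only detected when the stream is actually decompressed.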
