You can read the input as plain text and do the type conversion in the mapper. If a NumberFormatException is thrown, you can decide how to handle it, for example by incrementing a custom Counter to record the bad line, or by substituting a default value.
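
Here is a rough sketch of the idea in plain Java, without the Hadoop classes so it can be run on its own. The class name SafeFloatParsing, the methods parseOrNull and parseOrDefault, and the malformedRecords field are all made up for illustration; in a real job the parsing would live inside your map() method and the static counter would be replaced by a Hadoop Counter.

```java
public class SafeFloatParsing {

    // Hypothetical stand-in for a custom Hadoop Counter; in a real mapper
    // you would increment a job counter here instead of a static field.
    static long malformedRecords = 0;

    // Parse one text field; return null (and count it) if it is not a valid float.
    static Float parseOrNull(String field) {
        try {
            return Float.parseFloat(field.trim());
        } catch (NumberFormatException e) {
            // "3.4.5.6" ends up here instead of failing the whole task
            malformedRecords++;
            return null;
        }
    }

    // Alternative: fall back to a caller-supplied default for bad fields.
    static float parseOrDefault(String field, float defaultValue) {
        Float parsed = parseOrNull(field);
        return parsed != null ? parsed : defaultValue;
    }

    public static void main(String[] args) {
        String[] fields = {"3.4", "3.4.5.6", "7.25"};
        for (String field : fields) {
            Float value = parseOrNull(field);
            if (value == null) {
                continue; // skip the corrupted line
            }
            System.out.println("parsed " + value);
        }
        System.out.println("malformed records: " + malformedRecords);
    }
}
```

Because the exception is caught inside the mapper, the task does not fail on the corrupted lines, and the counter tells you afterwards how many lines were skipped.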

On Sat, Oct 16, 2010 at 5:02 AM, Boyu Zhang <[email protected]> wrote:
> Hi all,
>
> I am running a program with input 1 million lines of data, among the 1
> million, 5 or 6 lines data are corrupted. The way the are corrupted is: in
> the position which a float number is expected, like 3.4 , instead of a float
> number, something like this is there: 3.4.5.6 . So when the map runs, it
> throws a multiple point in num exception.
>
> My question is: the map tasks that have the exception are marked failure,
> how about the data processed by the same map before the exception, do they
> reach the reduce task? or they are treated like garbage? Thank you very much
> any help is appreciated.
>
> Boyu
>

--
Best Regards

Jeff Zhang
