Skipping Bad Records

Billy Pearson Mon, 13 Apr 2009 16:58:45 -0700

I am running a job that pulls data from HBase, but I am getting heap errors on some of the records because they are too large to fit in the heap of the task.

I enabled (or so I thought) the skip option in the site conf file, and I also added these options to my job conf:


   conf.setMaxMapAttempts(10);
   SkipBadRecords.setMapperMaxSkipRecords(conf, 1);
   SkipBadRecords.setAttemptsToStartSkipping(conf, 1);

From the MR docs it seems the task is split and run as two different tasks.
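To make the splitting concrete, here is a rough sketch of how a failing record range could be narrowed by repeated halving until it is no larger than the max-skip setting. This is only a plain-Java simulation of the idea as described in the MR docs, not Hadoop's actual implementation: the class name SkipRangeSketch, the method isolate, and the isBad predicate are all hypothetical, and it assumes exactly one bad record in the range.

```java
import java.util.function.LongPredicate;

// Hypothetical simulation of skip-range narrowing; not Hadoop code.
public class SkipRangeSketch {
    /** Number of simulated task attempts used by the last call to isolate(). */
    static int attempts = 0;

    /**
     * Narrows [start, start + length) by halving until it holds at most
     * maxSkip records. Assumes the range contains exactly one bad record.
     * Returns {start, length} of the range that would finally be skipped.
     */
    static long[] isolate(long start, long length, long maxSkip, LongPredicate isBad) {
        if (maxSkip < 1) {
            throw new IllegalArgumentException("maxSkip must be at least 1");
        }
        attempts = 0;
        while (length > maxSkip) {
            long half = length / 2;
            attempts++;
            // Simulated attempt: run only the first half and see whether it fails.
            boolean firstHalfFails = false;
            for (long i = start; i < start + half; i++) {
                if (isBad.test(i)) {
                    firstHalfFails = true;
                    break;
                }
            }
            if (firstHalfFails) {
                length = half;      // bad record is in the first half
            } else {
                start += half;      // bad record must be in the second half
                length -= half;
            }
        }
        return new long[] {start, length};
    }

    public static void main(String[] args) {
        // Record 637 stands in for the row that is too large for the heap.
        long[] r = isolate(0, 1000, 1, i -> i == 637);
        System.out.println("skip range starts at " + r[0] + ", length " + r[1]
                + ", simulated attempts " + attempts);
    }
}
```

Each pass through the loop stands for one failed or successful task attempt, which is why a max-skip value of 1 only makes sense together with a generous setMaxMapAttempts value.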

How would this be handled in HBase?

I read somewhere that someone is working on giving the scanners used by MR jobs the ability to run more than one task per region. Is this still pending, or is it done?

And do we have an open issue for supporting this Hadoop function in HBase?
