Hi Mingxi,

By dynamic counter, do you mean a custom counter, or is it a different kind of counter?
Also, I cannot do two passes, as I only get to know about errors in a record when I parse the line.

Thanks,
-JJ

On Mon, Nov 14, 2011 at 3:38 PM, Mingxi Wu <mingxi...@turn.com> wrote:

> You can do two passes of the data.
>
> The first map-reduce pass is sanity checking the data.
>
> The second map-reduce pass is to do the real work assuming the first pass
> accept the file.
>
> You can utilize the dynamic counter and define an enum type for error
> record categories.
>
> In the mapper, you parse each line, and use the result to update the
> counter.
>
> -Mingxi
>
> *From:* Mapred Learn [mailto:mapred.le...@gmail.com]
> *Sent:* Monday, November 14, 2011 3:06 PM
> *To:* mapreduce-user@hadoop.apache.org
> *Subject:* how to implement error thresholds in a map-reduce job ?
>
> Hi,
>
> I have a use case where I want to pass a threshold value to a map-reduce
> job. For eg: error records=10.
>
> I want map-reduce job to fail if total count of error_records in the job
> i.e. all mappers, is reached.
>
> How can I implement this considering that each mapper would be processing
> some part of the input data ?
>
> Thanks,
>
> -JJ
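
For reference, the idea quoted above (an enum of error categories, updated as each line is parsed, with the job failing once a threshold is reached) can be sketched in plain Java. This is only a single-process model, not Hadoop code: the class name, the two error categories, and the "two comma-separated integers" record format are all made up for illustration. In a real job the counts would live in Hadoop counters, roughly the role of `context.getCounter(...).increment(1)` in the mapper, and each mapper would only see the errors in its own split.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the enum-counter idea; no Hadoop classes are used.
class ErrorThresholdSketch {

    // Hypothetical error categories; the thread does not name any.
    enum RecordError { WRONG_FIELD_COUNT, NOT_A_NUMBER }

    // Outcome of a run: per-category counts and whether the threshold was reached.
    static final class Result {
        final Map<RecordError, Long> counts;
        final boolean failed;

        Result(Map<RecordError, Long> counts, boolean failed) {
            this.counts = counts;
            this.failed = failed;
        }
    }

    // Returns the error category for a line, or null if it parses cleanly.
    // Assumed record format: exactly two comma-separated integers.
    static RecordError check(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 2) {
            return RecordError.WRONG_FIELD_COUNT;
        }
        try {
            Long.parseLong(fields[0].trim());
            Long.parseLong(fields[1].trim());
        } catch (NumberFormatException e) {
            return RecordError.NOT_A_NUMBER;
        }
        return null;
    }

    // Parses each line once, updates the per-category counter, and stops
    // as soon as the total number of error records reaches maxErrors.
    static Result process(List<String> lines, long maxErrors) {
        Map<RecordError, Long> counts = new EnumMap<>(RecordError.class);
        long total = 0;
        for (String line : lines) {
            RecordError err = check(line);
            if (err == null) {
                continue; // good record: the real work would happen here
            }
            counts.merge(err, 1L, Long::sum);
            total++;
            if (total >= maxErrors) {
                return new Result(counts, true);
            }
        }
        return new Result(counts, false);
    }
}
```

Calling `ErrorThresholdSketch.process(lines, 10)` on a list of input lines returns the per-category counts and a `failed` flag that is set once ten bad records have been seen. Because the check happens while each line is parsed, a separate sanity-checking pass is not needed in this model.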