This may be a dumb question, but can you accomplish the same thing by
placing the following properties in mapred-site.xml? Or did I misunderstand
the fix?

<property>
  <name>mapred.skip.attempts.to.start.skipping</name>
  <value>2</value>
  <!-- default: 2 -->
  <description>
    The number of Task attempts AFTER which skip mode will be kicked off.
    When skip mode is kicked off, the task reports the range of records
    which it will process next to the TaskTracker, so that on failures the
    TT knows which ones are possibly the bad records. On further
    executions, those are skipped.
  </description>
</property>

<property>
  <name>mapred.skip.map.max.skip.records</name>
  <value>1</value>
  <!-- default: 0 -->
  <description>
    The number of acceptable skip records surrounding the bad record PER bad
    record in mapper. The number includes the bad record as well. To turn the
    feature of detection/skipping of bad records off, set the value to 0. The
    framework tries to narrow down the skipped range by retrying until this
    threshold is met OR all attempts get exhausted for this task. Set the
    value to Long.MAX_VALUE to indicate that the framework need not try to
    narrow down. Whatever records (depending on the application) get skipped
    are acceptable.
  </description>
</property>
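
If I read the second description correctly, the other extreme would look
something like the sketch below. I am assuming the value has to be written
out as the literal number, since a config file cannot refer to the Java
constant Long.MAX_VALUE by name; I have not tested this.

<property>
  <name>mapred.skip.map.max.skip.records</name>
  <!-- 9223372036854775807 is Long.MAX_VALUE (2^63 - 1); assumption: the
       framework then skips whatever range it reported without narrowing -->
  <value>9223372036854775807</value>
</property>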

Brad
