Hi,
I'm testing some of our jobs on 0.21 (built last week), which are still
written against the old API, and I ran across a weird issue. Simply put,
we don't want reducers, so we call job.setNumReduceTasks(0), but when the
job starts it still gets reduce tasks, and they all fail the same way:
2010-01-26 19:17:23,942 WARN org.apache.hadoop.mapred.Child: Exception running child :
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#5
        at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:119)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:358)
        at org.apache.hadoop.mapred.Child.main(Child.java:165)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 219
        at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.getMapsForHost(ShuffleScheduler.java:319)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:167)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:144)
See the full log here: http://pastebin.com/m7359f8ce
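
For reference, the driver is set up roughly like this. This is a simplified
sketch, not our actual code: the job name, paths, and IdentityMapper are
placeholders standing in for our real job, but the setNumReduceTasks(0) call
is exactly what we do.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class MapOnlyJob {
  public static void main(String[] args) throws Exception {
    // Old-API driver: configure everything through JobConf.
    JobConf job = new JobConf(MapOnlyJob.class);
    job.setJobName("map-only-test");            // placeholder name
    job.setMapperClass(IdentityMapper.class);   // stands in for our real mapper
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    job.setNumReduceTasks(0);                   // we expect a map-only job
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    JobClient.runJob(job);                      // yet the job comes up with 113 reducers
  }
}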
Relevant line from the JobTracker log:
2010-01-26 19:12:27,625 INFO org.apache.hadoop.mapred.JobInProgress: Job job_201001261911_0001 initialized successfully with 1001 map tasks and 113 reduce tasks.
In mapred-site.xml, mapred.reduce.tasks is set to 113, which matches the
number of reduce tasks the job got, so it looks like the configured value
is winning over the setNumReduceTasks(0) call from the driver.
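The entry in the site config is just the usual property:

<property>
  <name>mapred.reduce.tasks</name>
  <value>113</value>
</property>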
I searched the JIRAs but didn't find anything obviously relevant. Is
there something we are overlooking?
J-D