[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768460#comment-13768460
 ] 

Chris Douglas commented on MAPREDUCE-5506:
------------------------------------------

collect is already synchronized. The exception is probably due to
sharing the 'word' field across threads. If you use a ThreadLocal for
that field, do you still see the exception? -C
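
A minimal sketch of the ThreadLocal suggestion above, using only the JDK so
it runs without Hadoop on the classpath. The mapper source is not shown in
this thread, so everything here is an assumption for illustration: a
StringBuilder stands in for the 'word' field, and the class and method names
(ThreadLocalWordSketch, emit, countMismatches) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadLocalWordSketch {

    // Each thread lazily gets its own reusable buffer, so no two threads
    // ever mutate the same instance. StringBuilder is only a stand-in for
    // the mapper's 'word' object (an assumption, not the real field type).
    private static final ThreadLocal<StringBuilder> WORD =
        new ThreadLocal<StringBuilder>() {
            @Override
            protected StringBuilder initialValue() {
                return new StringBuilder();
            }
        };

    // Hypothetical stand-in for the body of map(): refill the calling
    // thread's buffer with one token and return an immutable copy of it.
    static String emit(String token) {
        StringBuilder word = WORD.get();
        word.setLength(0);
        word.append(token);
        return word.toString();
    }

    // Calls emit() concurrently from several threads and counts how many
    // results differ from the token that the calling thread passed in.
    static int countMismatches(int threads, final int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int t = 0; t < threads; t++) {
                final String token = "token-" + t;
                results.add(pool.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        int bad = 0;
                        for (int i = 0; i < iterations; i++) {
                            if (!token.equals(emit(token))) {
                                bad++;
                            }
                        }
                        return bad;
                    }
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because every thread works on its own buffer, the mismatch count should stay
at zero regardless of the thread count. Replacing the ThreadLocal with a
single shared static StringBuilder would reintroduce the kind of cross-thread
interference suspected in this issue.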


                
> Hadoop-1.1.1 occurs ArrayIndexOutOfBoundsException with MultithreadedMapRunner
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5506
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5506
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.1.1
>         Environment: RHEL 6.3 x86_64
>            Reporter: sam liu
>            Priority: Blocker
>
> After I set:
> - 'jobConf.setMapRunnerClass(MultithreadedMapRunner.class);' in the MR app
> - 'mapred.map.multithreadedrunner.threads = 2' in mapred-site.xml
> a simple MR app failed because its Map task hit the
> ArrayIndexOutOfBoundsException below (please ignore the line numbers in the
> trace, as I added some logging statements):
> java.lang.ArrayIndexOutOfBoundsException
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1331)
>         at java.io.DataOutputStream.write(DataOutputStream.java:101)
>         at org.apache.hadoop.io.Text.write(Text.java:282)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1060)
>         at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:591)
>         at study.hadoop.mapreduce.sample.WordCount$Map.map(WordCount.java:41)
>         at study.hadoop.mapreduce.sample.WordCount$Map.map(WordCount.java:1)
>         at org.apache.hadoop.mapred.lib.MultithreadedMapRunner$MapperInvokeRunable.run(MultithreadedMapRunner.java:231)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
>         at java.lang.Thread.run(Thread.java:738)
> The exception is thrown by the line 'System.arraycopy(b, off, kvbuffer,
> bufindex, len)' in MapTask.java#MapOutputBuffer#Buffer#write(). When it
> occurs, 'b.length=4' but 'len=9'.
> By the way, with 'mapred.map.multithreadedrunner.threads = 1' no exception
> occurs, so the issue appears to be caused by multiple threads.
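
The reported values can be reproduced in isolation. The sketch below, which
uses only the JDK, shows that System.arraycopy rejects a request to copy 9
bytes out of a 4-byte source array even when the destination has room. The
class and method names (ArrayCopySketch, copyFails) and the 64-byte
destination are hypothetical. That a second thread changed the source between
the length being read and the copy being made is an assumption about the
cause, not something confirmed in this thread.

```java
final class ArrayCopySketch {

    // Returns true if copying 'len' bytes out of 'src' is rejected.
    // The destination is deliberately large, so only the source bounds
    // can cause a failure (the 64-byte size is an arbitrary choice).
    static boolean copyFails(byte[] src, int len) {
        byte[] dest = new byte[64];
        try {
            System.arraycopy(src, 0, dest, 0, len);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass of this type.
            return true;
        }
    }
}
```

With the values from the report, copyFails(new byte[4], 9) returns true,
while copyFails(new byte[9], 9) returns false.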

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
