[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930420#comment-13930420 ]
Hudson commented on MAPREDUCE-5028:
-----------------------------------
SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1723 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1723/])
MAPREDUCE-5028. Fixed a bug in MapTask that was causing mappers to fail when a
large value of io.sort.mb is set. Contributed by Karthik Kambatla. (vinodkv:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1576170)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataInputBuffer.java
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/ReduceContextImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/LargeSorter.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestLargeSort.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/test/MapredTestDriver.java
> Maps fail when io.sort.mb is set to high value
> ----------------------------------------------
>
> Key: MAPREDUCE-5028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Critical
> Fix For: 1.2.0, 2.4.0
>
> Attachments: MR-5028_testapp.patch, mr-5028-1.patch, mr-5028-2.patch,
> mr-5028-3.patch, mr-5028-branch1.patch, mr-5028-branch1.patch,
> mr-5028-branch1.patch, mr-5028-trunk.patch, mr-5028-trunk.patch,
> mr-5028-trunk.patch, repro-mr-5028.patch
>
>
> Verified the problem exists on branch-1 with the following configuration:
> Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m,
> io.sort.mb=1280, dfs.block.size=2147483648
> Run teragen to generate 4 GB of data.
> Maps fail when you run wordcount with this configuration, with the following
> error:
> {noformat}
> java.io.IOException: Spill failed
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
> at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
> at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
> at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
> at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
> at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
> at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
> at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
> at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
> {noformat}
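
The message above does not state the root cause, so the following is only an illustrative sketch of why a sort buffer larger than 1 GiB can be fragile with 32-bit offsets. It assumes, hypothetically, that some code path adds two int offsets that both point into the buffer (for example, a start position plus an absolute end position mistaken for a length). The class name, method names, and sample offsets are invented for illustration and are not taken from the Hadoop sources.

```java
public class SortBufferOverflowSketch {

    // Convert a buffer size in MB (such as io.sort.mb) to bytes, using long
    // arithmetic so the conversion itself cannot overflow.
    static long megabytesToBytes(int mb) {
        return ((long) mb) << 20;
    }

    // Plain int addition: wraps around silently if the sum exceeds
    // Integer.MAX_VALUE.
    static int addUnchecked(int start, int length) {
        return start + length;
    }

    // Checked int addition: throws ArithmeticException on overflow instead of
    // returning a wrapped value.
    static int addChecked(int start, int length) {
        return Math.addExact(start, length);
    }

    public static void main(String[] args) {
        // io.sort.mb=1280 is the value from the report.
        long bufferBytes = megabytesToBytes(1280);
        System.out.println("buffer bytes: " + bufferBytes);
        System.out.println("fits in int: " + (bufferBytes <= Integer.MAX_VALUE));

        // Hypothetical offsets: a record that starts past the 1 GiB mark but
        // still inside the 1280 MB buffer.
        int start = 1_100_000_000;
        int end = start + 100;

        // Assumed mistake: treating the absolute end offset as a length.
        int wrapped = addUnchecked(start, end);
        System.out.println("start + end wrapped negative: " + (wrapped < 0));

        try {
            addChecked(start, end);
        } catch (ArithmeticException e) {
            System.out.println("checked addition rejected the sum");
        }
    }
}
```

The buffer size itself still fits in an int; it is only sums of two in-buffer offsets that can exceed Integer.MAX_VALUE once offsets pass 1 GiB. A reader that computes a wrapped (negative) end offset would see no bytes available, which would be consistent with an EOFException, though the message does not confirm this is what happened here.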