[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value

Vinod Kumar Vavilapalli (JIRA) Fri, 07 Mar 2014 20:12:06 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924705#comment-13924705 ]


Vinod Kumar Vavilapalli commented on MAPREDUCE-5028:
----------------------------------------------------

Looked at the patch. Thanks for working on it; this one has been a year in the 
making. Let's do the last push!

So the main bug fix is the int overflow stuff in MapTask.java, right?
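
For readers following along, here is a minimal sketch of the kind of int 
overflow in question. It is not the actual MapTask.java code: the class and 
method names are made up for illustration, and the offsets are arbitrary 
values chosen to be plausible for a sort buffer larger than 1 GB.

```java
// Hypothetical illustration only; none of these names exist in Hadoop.
final class SortBufferOverflowSketch {

    /** Buggy form: the sum is computed in int and wraps past Integer.MAX_VALUE. */
    static int endOffsetInt(int start, int length) {
        return start + length;
    }

    /** Safer form: widen to long before adding, so the sum cannot wrap. */
    static long endOffsetLong(int start, int length) {
        return (long) start + length;
    }

    public static void main(String[] args) {
        // Two offsets that are each valid ints but whose sum exceeds 2^31 - 1.
        int start = 1_200_000_000;
        int length = 1_000_000_000;

        System.out.println("int arithmetic:  " + endOffsetInt(start, length));
        System.out.println("long arithmetic: " + endOffsetLong(start, length));
    }
}
```

The int version returns a negative number, which is the sort of value that 
later shows up as a bogus length or an unexpected EOFException.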

Not caused by your patch, but minor suggestions for InMemoryReader.java
 - start and length can and should be marked final.
 - memDataIn can be private and final

Can we fix the javadoc of the getLength() API? Ideally I'd also rename the API, 
but I will leave that up to you. I don't have a problem reviewing a patch with 
that rename. Anyway, back to the javadoc for DataInputBuffer.getLength() and 
DataInputBuffer.Buffer.getLength(): we can say what [~chris.douglas] said above 
 - (from ByteArrayInputStream) "returns the index one greater than the last 
valid character in the input stream buffer."
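
To make the distinction concrete, here is a small sketch using plain 
java.io.ByteArrayInputStream rather than DataInputBuffer itself. The subclass 
and its accessor names are hypothetical; it only exposes the protected count 
and pos fields that the quoted javadoc describes.

```java
import java.io.ByteArrayInputStream;

// Hypothetical helper, not part of Hadoop: exposes ByteArrayInputStream's
// protected fields to show "end index" versus "number of bytes remaining".
final class EndIndexSketch extends ByteArrayInputStream {

    EndIndexSketch(byte[] buf, int offset, int length) {
        super(buf, offset, length);
    }

    /** Index one greater than the last valid byte (the 'count' field). */
    int endIndex() {
        return count;
    }

    /** Index of the next byte to be read (the 'pos' field). */
    int position() {
        return pos;
    }

    public static void main(String[] args) {
        byte[] backing = new byte[10];
        // A 3-byte window starting at offset 4 of a 10-byte array.
        EndIndexSketch in = new EndIndexSketch(backing, 4, 3);

        System.out.println("position  = " + in.position());  // 4
        System.out.println("end index = " + in.endIndex());  // 7
        System.out.println("remaining = " + in.available()); // 3
    }
}
```

A caller that treats the end index (7 here) as a byte count (3 here) reads 
past its window, which is why the javadoc wording matters.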

There seems to be one more related bug in the usage of DataInputBuffer.reset(). 
Can you also cross-verify Task.ValuesIterator.readNextKey() and readNextValue()?

Why not run this LargeSorter with large values for io.sort.mb as a unit test? 
I'm afraid that we may unknowingly add more such issues in future if we don't 
automatically run it as part of our test-suite. The problem will be playing 
with surefire's heap sizes, but that's about it. 

We should hop onto MAPREDUCE-5032 next as I am not sure if we are fixing 
everything here.

> Maps fail when io.sort.mb is set to high value
> ----------------------------------------------
>
>                 Key: MAPREDUCE-5028
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Critical
>             Fix For: 1.2.0
>
>         Attachments: MR-5028_testapp.patch, mr-5028-1.patch, mr-5028-2.patch, 
> mr-5028-branch1.patch, mr-5028-branch1.patch, mr-5028-branch1.patch, 
> mr-5028-trunk.patch, mr-5028-trunk.patch, mr-5028-trunk.patch, 
> repro-mr-5028.patch
>
>
> Verified the problem exists on branch-1 with the following configuration:
> Pseudo-dist mode: 2 maps/ 1 reduce, mapred.child.java.opts=-Xmx2048m, 
> io.sort.mb=1280, dfs.block.size=2147483648
> Run teragen to generate 4 GB data
> Maps fail when you run wordcount on this configuration with the following 
> error: 
> {noformat}
> java.io.IOException: Spill failed
>       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
>       at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>       at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>       at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
>       at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.EOFException
>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
>       at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
>       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>       at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
>       at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>       at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>       at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
>       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
>       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
>       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
