Thanks very much for your helpful response! I should give some more detail about this job. Essentially, it uses the Hadoop framework to sort a large amount of data. The mapper transforms each record into (sorting_key, record), where the sorting keys are effectively unique. The reducer is trivial: it outputs the record and discards the sorting key. The memory consumption of both the map and the reduce steps is therefore intended to be O(1).
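To make the shape of the job concrete, here's a minimal sketch. The class names and the key-extraction rule are placeholders for illustration, not the real code:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class SortJob {

  public static class SortMapper
      extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text record, Context ctx)
        throws IOException, InterruptedException {
      // Wrap each record with its (effectively unique) sorting key.
      // No state is kept across records, so map memory should be O(1).
      ctx.write(new Text(deriveSortingKey(record)), record);
    }

    // Placeholder: the real key is computed from domain-specific fields.
    private String deriveSortingKey(Text record) {
      return record.toString().split("\t", 2)[0];
    }
  }

  public static class SortReducer
      extends Reducer<Text, Text, NullWritable, Text> {
    @Override
    protected void reduce(Text sortingKey, Iterable<Text> records, Context ctx)
        throws IOException, InterruptedException {
      // Trivial reduce: stream each record out and discard the key.
      // Nothing is buffered here, so reduce memory should also be O(1).
      for (Text record : records) {
        ctx.write(NullWritable.get(), record);
      }
    }
  }
}

Since neither step accumulates state across records, I don't believe my own code is what's consuming the heap.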
However, due to the nature of the sorting, it's necessary that certain sets of records appear together in the sorted output. The partitioner (a HashPartitioner with a specially designed hash function) will therefore sometimes be forced to send a large number of records to a single reducer. This is undesirable, and it occurs only rarely, but it's not feasible to prevent it deterministically; you could say it creates a reliability engineering problem. (A rough sketch of the partitioning logic is at the bottom of this mail, below the quoted thread.)

My understanding of the configuration options you've linked to is that they're intended for performance tuning: even if the defaults are not optimal for a particular input, the shuffle should still succeed, albeit more slowly than it otherwise could. In particular, it seems like the ShuffleRamManager class (I believe ReduceTask.ReduceCopier.ShuffleRamManager) is intended to prevent my crash from occurring, by disallowing the in-memory shuffle from using up all of the JVM heap.

Is it possible that the continued occurrence of this OutOfMemoryError represents a bug in ShuffleRamManager, or in some other code that is intended to prevent this situation? Thanks so much for your time.

On Wed, Feb 20, 2013 at 5:41 PM, Hemanth Yamijala <[email protected]> wrote:
> There are a few tweaks in configuration that may help. Can you please look
> at
> http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Shuffle%2FReduce+Parameters
>
> Also, since you have mentioned the reducers are unbalanced, could you use a
> custom partitioner to balance out the outputs? Or just increase the number
> of reducers so the load is spread out.
>
> Thanks
> Hemanth
>
>
> On Wednesday, February 20, 2013, Shivaram Lingamneni wrote:
>>
>> I'm experiencing the following crash during reduce tasks:
>>
>> https://gist.github.com/slingamn/04ff3ff3412af23aa50d
>>
>> on Hadoop 1.0.3 (specifically, I'm using Amazon's EMR, AMI version
>> 2.2.1). The crash is triggered by especially unbalanced reducer
>> inputs, i.e., when one reducer receives too many records. (The reduce
>> task gets retried three times, but since the data is the same every
>> time, it crashes in the same place each time, and the job fails.)
>>
>> From the following links:
>>
>> https://issues.apache.org/jira/browse/MAPREDUCE-1182
>>
>> http://hadoop-common.472056.n3.nabble.com/Shuffle-In-Memory-OutOfMemoryError-td433197.html
>>
>> it seems as though Hadoop is supposed to prevent this from happening
>> by intelligently managing the amount of memory that is provided to the
>> shuffle. However, I don't know how ironclad this guarantee is.
>>
>> Can anyone advise me on how robust I can expect Hadoop to be to this
>> issue, in the face of highly unbalanced reducer inputs? Thanks very
>> much for your time.
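P.S. In case it's useful, here is a rough sketch of the partitioning logic I described above. The names (GroupingPartitioner, groupIdOf) and the colon-prefix convention are made up for illustration; the real job derives the group from domain-specific fields, and the same effect could equally be achieved by giving the key class a custom hashCode() under the stock HashPartitioner.

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// All records whose sorting keys share a group id must land on the same
// reducer, so the partition is computed from the group id alone. When a
// single group is very large, one reducer's shuffle input becomes very
// large, which is where the OutOfMemoryError shows up.
public class GroupingPartitioner extends Partitioner<Text, Text> {
  @Override
  public int getPartition(Text sortingKey, Text record, int numPartitions) {
    return (groupIdOf(sortingKey).hashCode() & Integer.MAX_VALUE)
        % numPartitions;
  }

  // Stand-in: pretend the group id is the portion of the key before ':'.
  private String groupIdOf(Text sortingKey) {
    return sortingKey.toString().split(":", 2)[0];
  }
}

The point is that the skew is inherent in the grouping requirement: a rebalancing partitioner or more reducers can't split a single large group without breaking the requirement that its records appear together.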
