Hey guys, I'm running into issues with a moderate-size EMR job on 12 m1.large nodes. Mappers and reducers fail at random.
The EMR defaults are 2 mappers / 2 reducers per node. I've tried running with mapred.child.opts set in the jobconf to -Xmx256m and -Xmx1024m. No difference. There are about 1,000 map tasks. Not very much data, maybe 50G at most? My job fails to complete.

Looking in syslog shows this:

java.io.IOException: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
    at org.apache.hadoop.util.Shell.run(Shell.java:134)
    at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1238)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:365)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:312)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 11 more

I would ask the EMR forums, but I think I may get faster feedback here :)

--
Bradford Stephens, Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com -- The intuitive, cloud-scale data solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science
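
P.S. In case it helps anyone reproduce this outside of Hadoop: judging from the trace, the failure happens when the task JVM tries to launch "bash" (from DF.getAvailable) to check free disk space, and the fork itself is refused with error=12. Below is a rough standalone sketch of that step, not the actual Hadoop code. The class name DfProbe, the helper dfCommand, and the exact df command line are my own assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class DfProbe {
    // Hypothetical helper: builds a bash command that runs df on a directory.
    // The exact command Hadoop's DF class uses is an assumption here.
    static String[] dfCommand(String dir) {
        return new String[] {"bash", "-c", "exec df -k '" + dir + "' 2>/dev/null"};
    }

    public static void main(String[] args) throws InterruptedException {
        String dir = args.length > 0 ? args[0] : "/tmp";

        // Report the heap ceiling so runs with different -Xmx values can be compared.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MB): " + maxHeapMb);

        try {
            // This is the same kind of call that fails in the trace:
            // ProcessBuilder.start() launching "bash" from inside the JVM.
            Process p = new ProcessBuilder(dfCommand(dir))
                    .redirectErrorStream(true)
                    .start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) {
                    System.out.println(line);
                }
            }
            System.out.println("exit code: " + p.waitFor());
        } catch (IOException e) {
            // On a node that is short on memory this is where I would expect
            // to see "error=12, Cannot allocate memory".
            System.out.println("could not run df: " + e.getMessage());
        }
    }
}
```

Running it on a worker node while the job is active, with the same heap setting as the tasks (for example: java -Xmx1024m DfProbe /tmp), should show whether the node can still fork a child process at that moment.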
