On Wed, Nov 11, 2009 at 11:36 AM, John Clarke <[email protected]> wrote:
> Hi,
>
> I've been running our app on EC2 using the small instances and it's been
> mostly fine. Very occasionally a task will die due to a heap out of memory
> exception. So far these failed tasks have successfully been restarted by
> Hadoop on other nodes and the job has run to completion.
>
> I want to know how to avoid those occasional out of memory problems.
>
> I tried increasing mapred.child.java.opts from -Xmx550m to -Xmx768m but
> this caused more and much quicker out of memory exceptions. Can someone help
> me understand why?
>
> I then reduced it to -Xmx400m and it is running ok so far.
>
> My application is a custom threaded MapRunnable app and I often have
> hundreds of threads operating at the same time.
>
> Cheers,
> John
>
John,

If you look at the description of mapred.child.java.opts:

  "Java opts for the task tracker child processes. The following symbol,
   if present, will be interpolated: @taskid@ is replaced by current TaskID."

Thus -Xmx400m serves as a limit on the maximum heap each child task JVM can
consume. Now you have to look at:

  mapred.tasktracker.map.tasks.maximum
  mapred.tasktracker.reduce.tasks.maximum

If you set these too high, each node will spawn more tasks than it can handle
memory-wise, and tasks will die. What you have to do here is look hard at how
much memory your machine has, how many map and reduce tasks can run on it at
once, and what else may be running on it. Then set -Xmx low enough that all of
the concurrent child JVMs fit in that budget. Many things affect this: for
example, if you raise tasktracker.http.threads, the task tracker will have
more threads and will probably consume more memory.

Edward
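To make that concrete, here is a sketch of the mapred-site.xml settings the
reply refers to. The values are illustrative guesses for a small,
memory-constrained node, not recommendations:

```xml
<!-- mapred-site.xml: illustrative values for a ~1.7 GB node -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>   <!-- concurrent map task JVMs per node -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>   <!-- concurrent reduce task JVMs per node -->
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx400m</value>   <!-- heap cap per child task JVM -->
</property>
```

The point is that the three settings have to be chosen together: the heap cap
times the number of slots must fit in the node's physical memory, alongside
the OS and the Hadoop daemons.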

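A back-of-envelope sketch of the arithmetic behind the advice above (all
figures are assumptions for illustration, not measured values): with two map
slots and one reduce slot, three JVMs at -Xmx768m would try to reserve
roughly 2.3 GB, more than an m1.small's ~1.7 GB, which would explain why
raising -Xmx made the failures come faster, while -Xmx400m fits.

```python
# Back-of-envelope heap budget for Hadoop child task JVMs.
# All figures below are illustrative assumptions, not measured values.

def max_heap_mb(total_ram_mb, overhead_mb, map_slots, reduce_slots):
    """Largest -Xmx (in MB) each child JVM can get without
    overcommitting the node's physical memory."""
    available = total_ram_mb - overhead_mb
    slots = map_slots + reduce_slots
    return available // slots

# EC2 m1.small: ~1700 MB RAM; reserve an assumed ~500 MB for the OS
# plus the DataNode and TaskTracker daemons.
print(max_heap_mb(1700, 500, 2, 1))  # 400
```

With these assumed numbers the safe per-task cap comes out to 400 MB, which
lines up with the -Xmx400m value that ran without failures.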