Are you using Hadoop Streaming? If so, the subprocesses created by a Hadoop Streaming job can take as much memory as they need. In that case the machine can run out of memory, and other processes (e.g. the TaskTracker) may not run properly or may even be killed by the OS.
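
If that turns out to be the cause, one workaround is to put an explicit cap on what each task (and anything it forks) may allocate. Below is a minimal sketch using the classic JobConf API; the class name and values are placeholders, and mapred.child.ulimit may not be available in every 0.17.x release, so check the documentation for your version.

    import org.apache.hadoop.mapred.JobConf;

    // Hypothetical driver class, used only for illustration.
    public class MemoryCapDriver {
        public static void main(String[] args) {
            JobConf conf = new JobConf(MemoryCapDriver.class);

            // Heap for each task JVM (placeholder value).
            conf.set("mapred.child.java.opts", "-Xmx1024m");

            // Virtual-memory ulimit in KB applied to the task process and
            // anything it forks (e.g. a streaming mapper/reducer).
            // Assumption: this property exists in your Hadoop release.
            conf.set("mapred.child.ulimit", "2097152"); // roughly 2 GB

            // ... set mapper/reducer classes, input/output paths, then
            // submit with JobClient.runJob(conf).
        }
    }

For a streaming job the same properties can be passed on the command line, e.g. -jobconf mapred.child.ulimit=2097152, assuming your release supports that property.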
/Taeho

On Fri, Aug 1, 2008 at 2:24 AM, Xavier Stevens <[EMAIL PROTECTED]> wrote:
> We're currently running jobs on machines with around 16GB of memory with
> 8 map tasks per machine. We used to run with max heap set to 2048m.
> Since we started using version 0.17.1 we've been getting a lot of these
> errors:
>
> task_200807251330_0042_m_000146_0: Caused by: java.io.IOException:
> java.io.IOException: Cannot allocate memory
> task_200807251330_0042_m_000146_0: at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
> task_200807251330_0042_m_000146_0: at java.lang.ProcessImpl.start(ProcessImpl.java:65)
> task_200807251330_0042_m_000146_0: at java.lang.ProcessBuilder.start(ProcessBuilder.java:451)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.util.Shell.run(Shell.java:134)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:296)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:734)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1600(MapTask.java:272)
> task_200807251330_0042_m_000146_0: at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:707)
>
> We haven't changed our heap sizes at all. Has anyone else experienced
> this? Is there a way around it other than reducing heap sizes
> excessively low? I've tried all the way down to 1024m max heap and I
> still get this error.
>
> -Xavier
