Which version of Hadoop are you running? Are you running on Linux?
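
While gathering that, it may help to list the ids the TaskTracker failed to signal so they can be compared with what is still running on the node (for example against `ps -eo pid,pgid,args`). The snippet below is only a rough sketch: `failed_kill_ids` is a hypothetical helper name, the sample lines are copied from the log quoted below, and reading the leading minus as a process-group id is an assumption based on the message text alone.

```python
import re

# Matches the ExitCodeException lines from the quoted log, e.g.
# "org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process"
KILL_FAILED = re.compile(r"ExitCodeException: kill -(\d+): No such process")


def failed_kill_ids(log_text):
    """Return the numeric ids from failed kill attempts, in log order.

    Hypothetical helper: whether the id is a pid or a process-group id
    is an assumption and should be checked against the node itself.
    """
    return [int(m.group(1)) for m in KILL_FAILED.finditer(log_text)]


# Two entries copied from the quoted TaskTracker log.
sample = (
    "2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: "
    "Error executing shell command\n"
    "org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process\n"
    "2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree: "
    "Error executing shell command\n"
    "org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process\n"
)

print(failed_kill_ids(sample))  # [12545, 11061]
```

If none of those ids show up on the node while the leftover java processes do, that would suggest the TaskTracker is signalling ids that no longer match the child JVMs.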
-Joey

On Thu, May 12, 2011 at 1:39 PM, Adi <adi.pan...@gmail.com> wrote:
> For one long-running job we are noticing that the mapper JVMs do not exit
> even after the mapper is done. Any suggestions on why this could be
> happening?
>
> The java processes get cleaned up if I do a hadoop job -kill <job_id>. They
> also get cleaned up if I run it in a smaller batch and the job gets done
> fairly quickly (say half an hour). For larger inputs the nodes eventually
> run out of memory because of these java processes, which the cluster thinks
> are gone but which haven't been cleaned up yet. I suspect the TaskTrackers
> are failing to kill the JVMs by themselves for some reason.
>
> The following exceptions can be seen in the Hadoop logs.
>
> 2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
> 2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
> 2011-05-12 13:52:09,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -11151: No such process
> 2011-05-12 13:52:12,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -25057: No such process
> 2011-05-12 13:52:13,306 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -19805: No such process
> 2011-05-12 13:52:14,996 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -11103: No such process
>
> 2011-05-12 15:51:41,105 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -17202: No such process
> 2011-05-12 15:51:43,481 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -15981: No such process
> 2011-05-12 15:51:45,916 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -17931: No such process
> 2011-05-12 15:52:06,328 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -14867: No such process
> 2011-05-12 15:52:34,503 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -29376: No such process
> 2011-05-12 15:52:38,607 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -32491: No such process
> 2011-05-12 15:52:39,292 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -31529: No such process
> 2011-05-12 15:52:46,547 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -15140: No such process
>
> Some other exceptions also seen in the logs may or may not be related to the
> above problem.
>
> 2011-05-12 16:01:20,534 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:01:48,869 INFO org.apache.hadoop.ipc.Server: IPC Server handler 80 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:01:53,922 INFO org.apache.hadoop.ipc.Server: IPC Server handler 59 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:01:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server handler 28 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:02:04,040 INFO org.apache.hadoop.ipc.Server: IPC Server handler 37 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:02:09,095 INFO org.apache.hadoop.ipc.Server: IPC Server handler 100 on 33465 caught: java.nio.channels.ClosedChannelException
>
> Thanks.
>
> -Adi

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434