Which version of Hadoop are you running?

Are you running on Linux?
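
It may also help to list which IDs the TaskTracker failed to signal, so they can be compared with the java processes still running on the node. A minimal sketch in Python, assuming the warnings have exactly the form quoted below; the function name failed_kill_ids is made up for illustration and is not part of Hadoop:

```python
import re

# Matches the "kill -<id>: No such process" text from the quoted warnings.
# Assumption: the number after "kill -" is the process (group) ID that was signalled.
KILL_RE = re.compile(r"kill -(\d+): No such process")


def failed_kill_ids(log_text):
    """Return the IDs from failed kill warnings, in log order, without duplicates."""
    seen = []
    for match in KILL_RE.finditer(log_text):
        pid = int(match.group(1))
        if pid not in seen:
            seen.append(pid)
    return seen


# Two lines copied from the log excerpt in the original message.
sample = """\
org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
"""
print(failed_kill_ids(sample))
```

The resulting list can then be checked by hand against the output of ps on the affected node to see whether the leftover JVMs are the same processes the TaskTracker tried to kill.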

-Joey

On Thu, May 12, 2011 at 1:39 PM, Adi <adi.pan...@gmail.com> wrote:
> For one long-running job we are noticing that the mapper JVMs do not exit
> even after the mapper is done. Any suggestions on why this could be
> happening?
> The java processes get cleaned up if I do a hadoop job -kill <job_id>. They
> also get cleaned up if I run the job in a smaller batch and it finishes
> fairly quickly (say, half an hour). For larger inputs the nodes
> eventually run out of memory because of these java processes, which the
> cluster thinks are gone but which haven't been cleaned up yet. I
> suspect the TaskTrackers are failing to kill the JVMs by themselves for
> some reason.
> The following exceptions can be seen in the Hadoop logs.
>
> 2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
> 2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
> 2011-05-12 13:52:09,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -11151: No such process
> 2011-05-12 13:52:12,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -25057: No such process
> 2011-05-12 13:52:13,306 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -19805: No such process
> 2011-05-12 13:52:14,996 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -11103: No such process
>
> 2011-05-12 15:51:41,105 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -17202: No such process
> 2011-05-12 15:51:43,481 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -15981: No such process
> 2011-05-12 15:51:45,916 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -17931: No such process
> 2011-05-12 15:52:06,328 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -14867: No such process
> 2011-05-12 15:52:34,503 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -29376: No such process
> 2011-05-12 15:52:38,607 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -32491: No such process
> 2011-05-12 15:52:39,292 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -31529: No such process
> 2011-05-12 15:52:46,547 WARN org.apache.hadoop.mapreduce.util.ProcessTree:
> Error executing shell command
> org.apache.hadoop.util.Shell$ExitCodeException: kill -15140: No such process
>
> Some other exceptions seen in the logs, which may or may not be related to
> the above problem:
> 2011-05-12 16:01:20,534 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:01:48,869 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 80 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:01:53,922 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 59 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:01:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 28 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:02:04,040 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 37 on 33465 caught: java.nio.channels.ClosedChannelException
> 2011-05-12 16:02:09,095 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 100 on 33465 caught: java.nio.channels.ClosedChannelException
>
> Thanks.
>
> -Adi
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
