> Which version of Hadoop are you running?

Hadoop 0.21.0 with some patches.

> Are you running on Linux?

Yes:
Linux 2.6.18-238.9.1.el5 #1 SMP  x86_64 x86_64 x86_64 GNU/Linux
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

I set up 0.21.0 on another Linux box and am not seeing this issue there:
Hadoop is reusing JVMs (as configured).
In the production cluster it is not reusing JVMs, and the nodes run out of
memory because mapper JVMs stay alive even after Hadoop considers them
finished.

The production node is a 64-bit OS/JVM. Is there any known issue or
workaround for enabling JVM reuse in 64-bit environments?

The test node is 32-bit:
Linux 2.6.18-194.32.1.el5.centos.plus #1 SMP i686 i686 i386 GNU/Linux
java version "1.6.0_17"
OpenJDK Runtime Environment (IcedTea6 1.7.5) (rhel-1.16.b17.el5-i386)
OpenJDK Server VM (build 14.0-b16, mixed mode)

Even if I can just get it to reuse JVMs, that would be great.
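
Since reuse works on one node and not the other, one quick check is whether both nodes actually carry the same reuse setting. Below is a minimal sketch, not a definitive tool: it scans a Hadoop-style configuration XML for the JVM reuse property. The key names are assumptions on my part (`mapred.job.reuse.jvm.num.tasks` is the classic name; `mapreduce.job.jvm.numtasks` is, to my understanding, the renamed key in 0.21), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed property names: the 0.21-style key first, then the classic key.
REUSE_KEYS = ("mapreduce.job.jvm.numtasks", "mapred.job.reuse.jvm.num.tasks")


def jvm_reuse_setting(config_xml):
    """Return (key, value) for the first JVM reuse property found, else None."""
    root = ET.fromstring(config_xml)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in REUSE_KEYS:
            found[name] = value
    for key in REUSE_KEYS:
        if key in found:
            return key, found[key]
    return None


def describe(setting):
    """Turn the raw setting into a short human-readable summary."""
    if setting is None:
        return "not set (default is 1 task per JVM, i.e. no reuse)"
    key, value = setting
    tasks = int(value)
    if tasks == -1:
        return f"{key}=-1: unlimited reuse within a job"
    if tasks == 1:
        return f"{key}=1: no reuse"
    return f"{key}={tasks}: up to {tasks} tasks per JVM"


# Illustrative configuration only, not taken from either cluster.
SAMPLE = """
<configuration>
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>-1</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(describe(jvm_reuse_setting(SAMPLE)))
```

To run it against a real file, read the file in binary mode and pass the bytes to `jvm_reuse_setting`, once per node, then compare the two summaries.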

-Adi





> -Joey
>
> On Thu, May 12, 2011 at 1:39 PM, Adi <adi.pan...@gmail.com> wrote:
> > For one long-running job we are noticing that the mapper JVMs do not exit
> > even after the mapper is done. Any suggestions on why this could be
> > happening?
> > The java processes get cleaned up if I do a hadoop job -kill <job_id>.
> > They also get cleaned up if I run the job in a smaller batch so that it
> > finishes fairly quickly (say, half an hour). For larger inputs the nodes
> > eventually run out of memory because of these java processes, which the
> > cluster thinks are gone but which have not been cleaned up yet. I
> > suspect the TaskTrackers are failing to kill the JVMs by themselves for
> > some reason.
> > The following exceptions can be seen in the Hadoop logs.
> >
> > 2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
> > 2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
> > 2011-05-12 13:52:09,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -11151: No such process
> > 2011-05-12 13:52:12,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -25057: No such process
> > 2011-05-12 13:52:13,306 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -19805: No such process
> > 2011-05-12 13:52:14,996 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -11103: No such process
> >
> > 2011-05-12 15:51:41,105 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -17202: No such process
> > 2011-05-12 15:51:43,481 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -15981: No such process
> > 2011-05-12 15:51:45,916 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -17931: No such process
> > 2011-05-12 15:52:06,328 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -14867: No such process
> > 2011-05-12 15:52:34,503 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -29376: No such process
> > 2011-05-12 15:52:38,607 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -32491: No such process
> > 2011-05-12 15:52:39,292 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -31529: No such process
> > 2011-05-12 15:52:46,547 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -15140: No such process
> >
> > Some other exceptions also appear in the logs; they may or may not be
> > related to the above problem.
> > 2011-05-12 16:01:20,534 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 33465 caught: java.nio.channels.ClosedChannelException
> > 2011-05-12 16:01:48,869 INFO org.apache.hadoop.ipc.Server: IPC Server handler 80 on 33465 caught: java.nio.channels.ClosedChannelException
> > 2011-05-12 16:01:53,922 INFO org.apache.hadoop.ipc.Server: IPC Server handler 59 on 33465 caught: java.nio.channels.ClosedChannelException
> > 2011-05-12 16:01:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server handler 28 on 33465 caught: java.nio.channels.ClosedChannelException
> > 2011-05-12 16:02:04,040 INFO org.apache.hadoop.ipc.Server: IPC Server handler 37 on 33465 caught: java.nio.channels.ClosedChannelException
> > 2011-05-12 16:02:09,095 INFO org.apache.hadoop.ipc.Server: IPC Server handler 100 on 33465 caught: java.nio.channels.ClosedChannelException
> >
> > Thanks.
> >
> > -Adi
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
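
The ProcessTree warnings quoted above each carry a number after `kill -`. My assumption (not confirmed in this thread) is that this is a negated process-group ID, i.e. the TaskTracker is signalling a whole task process group that no longer exists under that ID. A minimal sketch for pulling those IDs out of a log excerpt, even one mangled by mail quoting and line wrapping, so they can be compared with what is still running on the node; the function names are hypothetical.

```python
import re

# Matches the kill failure reported by Shell$ExitCodeException.
# \s+ tolerates the line wrap between "No such" and "process".
KILL_RE = re.compile(r"ExitCodeException: kill -(\d+): No such\s+process")


def unquote(line):
    """Strip leading mail-quote markers ('>' plus optional space)."""
    return re.sub(r"^(?:>\s?)+", "", line).strip()


def failed_kill_targets(text):
    """Return the numeric kill targets, in order of appearance."""
    # Join all lines first so wrapped log entries become contiguous again.
    joined = " ".join(unquote(line) for line in text.splitlines())
    return [int(target) for target in KILL_RE.findall(joined)]


# Two entries copied from the quoted log, in their wrapped form.
SAMPLE = """\
> > 2011-05-12 13:52:04,147 WARN
> org.apache.hadoop.mapreduce.util.ProcessTree:
> > Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such
> process
> > 2011-05-12 13:52:08,071 WARN
> org.apache.hadoop.mapreduce.util.ProcessTree:
> > Error executing shell command
> > org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such
> process
"""

if __name__ == "__main__":
    for target in failed_kill_targets(SAMPLE):
        print(target)
```

On the affected node, the resulting IDs could then be checked against the PID and PGID columns of `ps -eo pid,pgid,args` to see whether the lingering java processes belong to different process groups than the ones the TaskTracker tried to signal.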
