Jacob,

I have checked and confirmed that the NM is running on the node:

$ ps aux | grep java
...
yarn     25623  0.5  0.8 2366536 275488 ?      Sl   May17   7:04
/usr/java/jdk1.8.0_51/bin/java -Dproc_nodemanager
 ... org.apache.hadoop.yarn.server.nodemanager.NodeManager
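
To double-check from the RM's side as well, something like this (sketch
only; output omitted) should show the node's state as the RM sees it:

$ yarn node -list -all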

Thanks,
David

On Wed, May 18, 2016 at 7:08 AM, Jacob Maes <jacob.m...@gmail.com> wrote:

> Hey David,
>
> The only time I've seen orphaned containers is when the NM dies. If the NM
> isn't running, the RM has no means to kill the containers on a node. Can
> you verify that the NM was healthy at the time of the shutdown?
>
> If it wasn't healthy and/or it was restarted, one option that may help is
> NM Recovery:
>
> https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html
>
> With NM Recovery, the NM will resume control over containers that were
> running when the NM shut down. This option has virtually eliminated
> orphaned containers in our clusters.
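>
> Roughly the yarn-site.xml we use to enable it (a sketch; the recovery
> dir path and port below are placeholders, not our actual values):
>
>   <property>
>     <name>yarn.nodemanager.recovery.enabled</name>
>     <value>true</value>
>   </property>
>   <property>
>     <!-- local dir where the NM persists container state -->
>     <name>yarn.nodemanager.recovery.dir</name>
>     <value>/var/lib/hadoop-yarn/nm-recovery</value>
>   </property>
>   <property>
>     <!-- recovery needs a fixed NM port, not the default ephemeral one -->
>     <name>yarn.nodemanager.address</name>
>     <value>0.0.0.0:45454</value>
>   </property>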
>
> -Jake
>
> On Tue, May 17, 2016 at 11:54 PM, David Yu <david...@optimizely.com>
> wrote:
>
> > Samza version = 0.10.0
> > YARN version = Hadoop 2.6.0-cdh5.4.9
> >
> > We are experiencing issues when killing a Samza job:
> >
> > $ yarn application -kill application_1463512986427_0007
> >
> > Killing application application_1463512986427_0007
> >
> > 16/05/18 06:29:05 INFO impl.YarnClientImpl: Killed application
> > application_1463512986427_0007
> >
> > The RM shows that the job is killed. However, the Samza containers are
> > still left running.
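> >
> > To confirm on an affected node, we can grep for the application ID,
> > which should show up in the leftover container JVMs' command lines:
> >
> > $ ps aux | grep application_1463512986427_0007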
> >
> > Any idea why this is happening?
> >
> > Thanks,
> > David
> >
>
