Hey David,

The only time I've seen orphaned containers is when the NM dies. If the NM
isn't running, the RM has no means to kill the containers on a node. Can
you verify that the NM was healthy at the time of the shutdown?

If it wasn't healthy and/or it was restarted, one option that may help is
NM Recovery:
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html

With NM Recovery, the NM will resume control over containers that were
running when the NM shut down. This option has virtually eliminated
orphaned containers in our clusters.
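For reference, enabling it is just a few yarn-site.xml properties on each NM. A minimal sketch (property names are from the linked doc; the recovery dir path and port here are placeholders you'd adapt to your cluster):

```xml
<!-- Enable NM work-preserving restart so running containers survive an NM restart. -->
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>

<!-- Local directory where the NM persists container state for recovery.
     Path is an example; use a stable local disk on each node. -->
<property>
  <name>yarn.nodemanager.recovery.dir</name>
  <value>/var/lib/hadoop-yarn/nm-recovery</value>
</property>

<!-- The NM must come back on the same RPC port after restart, so pin it
     to a fixed port instead of the default ephemeral port. Port is an example. -->
<property>
  <name>yarn.nodemanager.address</name>
  <value>0.0.0.0:45454</value>
</property>
```

After setting these, restart each NM once to pick up the config; subsequent NM restarts should then reattach to running containers instead of orphaning them.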

-Jake

On Tue, May 17, 2016 at 11:54 PM, David Yu <david...@optimizely.com> wrote:

> Samza version = 0.10.0
> YARN version = Hadoop 2.6.0-cdh5.4.9
>
> We are experiencing issues when killing a Samza job:
>
> $ yarn application -kill application_1463512986427_0007
>
> Killing application application_1463512986427_0007
>
> 16/05/18 06:29:05 INFO impl.YarnClientImpl: Killed application
> application_1463512986427_0007
>
> RM shows that the job is killed. However, the Samza containers are still
> left running.
>
> Any idea why this is happening?
>
> Thanks,
> David
>
