[jira] [Commented] (SAMZA-1116) Yarn RM recovery causing duplicate containers

Jake Maes (JIRA) Thu, 02 Mar 2017 08:56:09 -0800

    [ 
https://issues.apache.org/jira/browse/SAMZA-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892600#comment-15892600
 ]


Jake Maes commented on SAMZA-1116:
----------------------------------

Ahh that makes sense. 

Thanks for the simple, reproducible test case. That'll help prove out 
SAMZA-871. I hope we'll be able to dig into that one soon!

> Yarn RM recovery causing duplicate containers
> ---------------------------------------------
>
>                 Key: SAMZA-1116
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1116
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.11
>            Reporter: Danil Serdyuchenko
>
> To replicate:
> # Make sure that Yarn RM recovery is enabled
> # Deploy a test job
> # Terminate Yarn RM
> # Wait until the AM of the job terminates with: 
> {code}
> 2017-02-02 13:08:04 RetryInvocationHandler [WARN] Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster over rm2. Not retrying because failovers (30) exceeded maximum allowed (30)
> {code}
> # Restart RM
> The job should get a new attempt but the old containers will not be 
> terminated, causing duplicate containers to run. 
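
One way to confirm the duplicates after the last step is to compare the attempt number embedded in each running container's ID. The sketch below is not part of the original report. It assumes the usual YARN container ID layout (container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>, optionally with an e<epoch>_ marker after an RM restart). The helper name stale_containers and the sample IDs are hypothetical.

```python
import re
from collections import defaultdict

# Assumed YARN container ID layout:
#   container_<clusterTs>_<appSeq>_<attempt>_<containerSeq>
# with an optional e<epoch>_ marker right after "container_".
CONTAINER_ID = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_(\d+)_(\d+)$")


def stale_containers(container_ids):
    """Return the IDs that belong to an attempt older than the newest
    attempt seen for the same application (hypothetical helper)."""
    by_app = defaultdict(list)
    for cid in container_ids:
        match = CONTAINER_ID.match(cid)
        if match is None:
            continue  # skip anything that is not a container ID
        cluster_ts, app_seq, attempt, _ = match.groups()
        by_app[(cluster_ts, app_seq)].append((int(attempt), cid))

    stale = []
    for entries in by_app.values():
        newest = max(attempt for attempt, _ in entries)
        stale.extend(cid for attempt, cid in entries if attempt < newest)
    return sorted(stale)


# Illustrative IDs only: one container left over from attempt 01,
# two containers from the new attempt 02.
running = [
    "container_1486040000000_0001_01_000002",
    "container_e01_1486040000000_0001_02_000001",
    "container_e01_1486040000000_0001_02_000002",
]
print(stale_containers(running))
```

Any ID this returns is a container from a previous attempt that is still running alongside the new attempt. How the list of running IDs is collected (for example from process listings on the NodeManager hosts) is left open here.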



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
