[
https://issues.apache.org/jira/browse/SAMZA-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892600#comment-15892600
]
Jake Maes commented on SAMZA-1116:
----------------------------------
Ahh that makes sense.
Thanks for the simple, reproducible test case. That'll help prove out
SAMZA-871. I hope we'll be able to dig into that one soon!
> Yarn RM recovery causing duplicate containers
> ---------------------------------------------
>
> Key: SAMZA-1116
> URL: https://issues.apache.org/jira/browse/SAMZA-1116
> Project: Samza
> Issue Type: Bug
> Affects Versions: 0.11
> Reporter: Danil Serdyuchenko
>
> To replicate:
> # Make sure that Yarn RM recovery is enabled
> # Deploy a test job
> # Terminate Yarn RM
> # Wait until the job's AM terminates with:
> {code}
> 2017-02-02 13:08:04 RetryInvocationHandler [WARN] Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster over rm2. Not retrying because failovers (30) exceeded maximum allowed (30)
> {code}
> # Restart RM
> The job should get a new attempt but the old containers will not be
> terminated, causing duplicate containers to run.
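
For the first replication step, a quick way to confirm that RM recovery is actually enabled is to inspect `yarn-site.xml`. The following is only a minimal sketch: it assumes recovery is controlled by the standard `yarn.resourcemanager.recovery.enabled` property, and the helper name `rm_recovery_enabled` and the inline sample config are hypothetical, not taken from the report.

```python
import xml.etree.ElementTree as ET

# Assumption: this is the property that switches RM recovery on in yarn-site.xml.
RECOVERY_PROPERTY = "yarn.resourcemanager.recovery.enabled"


def rm_recovery_enabled(yarn_site_xml: str) -> bool:
    """Return True if the given yarn-site.xml text enables RM recovery.

    If the property appears more than once, the last occurrence wins.
    If it is absent, the result is False.
    """
    root = ET.fromstring(yarn_site_xml)
    enabled = False
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip().lower()
        if name == RECOVERY_PROPERTY:
            enabled = value == "true"
    return enabled


# Hypothetical sample config used only for illustration.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(rm_recovery_enabled(SAMPLE))
```

In practice the XML text would be read from the cluster's own `yarn-site.xml`. The sketch ignores `final` markers and included files, so it is a sanity check rather than a faithful model of how Hadoop resolves configuration.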