Danil Serdyuchenko created SAMZA-1116:
-----------------------------------------

             Summary: Yarn RM recovery causing duplicate containers
                 Key: SAMZA-1116
                 URL: https://issues.apache.org/jira/browse/SAMZA-1116
             Project: Samza
          Issue Type: Bug
    Affects Versions: 0.11
            Reporter: Danil Serdyuchenko


To replicate:

# Make sure that Yarn RM recovery is enabled
# Deploy a test job
# Terminate Yarn RM
# Wait until the job's AM terminates with:
{code}
2017-02-02 13:08:04 RetryInvocationHandler [WARN] Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster over rm2. Not retrying because failovers (30) exceeded maximum allowed (30)
{code}
# Restart RM

After the RM comes back, the job should get a new attempt, but the containers 
from the old attempt are not terminated, so duplicate containers end up running.
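
One way to spot the leftover containers is to group the running container IDs by application and flag any that belong to an attempt older than the newest one seen. The sketch below is only an illustration and not part of Samza or YARN: the helper name {{stale_containers}} and the sample IDs are hypothetical, and it assumes the list of running container IDs has already been collected by some other means (for example from the NodeManager hosts). It relies on the standard YARN container ID layout, container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerNumber>.

```python
import re
from collections import defaultdict

# Standard YARN container ID layout; the "e<epoch>_" part is optional.
CONTAINER_ID = re.compile(
    r"^container_(?:e(?P<epoch>\d+)_)?"
    r"(?P<cluster_ts>\d+)_(?P<app>\d+)_(?P<attempt>\d+)_(?P<num>\d+)$"
)


def stale_containers(container_ids):
    """Return IDs of containers that belong to an older attempt than the
    newest attempt seen for the same application (hypothetical helper)."""
    by_app = defaultdict(list)
    for cid in container_ids:
        match = CONTAINER_ID.match(cid)
        if match is None:
            continue  # ignore anything that is not a container ID
        app_key = (match.group("cluster_ts"), match.group("app"))
        by_app[app_key].append((int(match.group("attempt")), cid))

    stale = []
    for entries in by_app.values():
        newest_attempt = max(attempt for attempt, _ in entries)
        stale.extend(cid for attempt, cid in entries if attempt < newest_attempt)
    return sorted(stale)


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    running = [
        "container_e02_1486040000000_0001_01_000002",
        "container_e03_1486040000000_0001_02_000002",
    ]
    for cid in stale_containers(running):
        print("leftover from an older attempt:", cid)
```

An empty result only means that no older-attempt containers appear in the list that was passed in; it does not prove that none are running elsewhere.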



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
