[ 
https://issues.apache.org/jira/browse/APEXCORE-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839032#comment-15839032
 ] 

Sanjay M Pujare commented on APEXCORE-624:
------------------------------------------

This typically happens when the RM does not return all the requested containers 
in one shot. The appMaster then has to remove the original unfulfilled requests 
and resubmit them. The removal does not decrement numRequestedContainers but 
the resubmission increments it, so the count ends up too high.
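
As a rough illustration, the drift can be reproduced with a small model of the 
bookkeeping. This is a hypothetical sketch, not the actual 
StreamingAppMasterService code: the class RequestCountSketch, its methods, and 
the request ids are invented for the example, and it assumes the count is 
decremented whenever a pending request is fulfilled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the request bookkeeping; names are illustrative only.
public class RequestCountSketch {
    private int numRequestedContainers = 0;
    private final List<String> pending = new ArrayList<>();
    private final boolean decrementOnRemove;

    RequestCountSketch(boolean decrementOnRemove) {
        this.decrementOnRemove = decrementOnRemove;
    }

    // Submitting a request always increments the count.
    void addRequest(String id) {
        pending.add(id);
        numRequestedContainers++;
    }

    // Withdrawing an unfulfilled request; the reported bug corresponds to
    // skipping the decrement here (decrementOnRemove == false).
    void removeRequest(String id) {
        if (pending.remove(id) && decrementOnRemove) {
            numRequestedContainers--;
        }
    }

    // Assumption: a fulfilled request decrements the count.
    void onAllocated(String id) {
        if (pending.remove(id)) {
            numRequestedContainers--;
        }
    }

    int outstanding() {
        return numRequestedContainers;
    }

    // Two containers requested, only one returned at first; the other request
    // is removed, resubmitted, and then fulfilled.
    static int simulate(boolean decrementOnRemove) {
        RequestCountSketch am = new RequestCountSketch(decrementOnRemove);
        am.addRequest("r1");
        am.addRequest("r2");
        am.onAllocated("r1");
        am.removeRequest("r2");
        am.addRequest("r2-retry");
        am.onAllocated("r2-retry");
        return am.outstanding();
    }

    public static void main(String[] args) {
        System.out.println("without decrement on remove: " + simulate(false));
        System.out.println("with decrement on remove:    " + simulate(true));
    }
}
```

In this model the count stays at 1 after every container has been allocated 
unless the removal also decrements it, which is the kind of leftover count 
that would keep a shutdown check waiting.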

> Shutdown does not work because of incorrect logic in the AppMaster
> ------------------------------------------------------------------
>
>                 Key: APEXCORE-624
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-624
>             Project: Apache Apex Core
>          Issue Type: Bug
>            Reporter: Sanjay M Pujare
>            Assignee: Sanjay M Pujare
>            Priority: Critical
>
> com.datatorrent.stram.StreamingAppMasterService.execute() calculates 
> numRequestedContainers incorrectly in some cases (e.g. RM container 
> allocation failure), which prevents an application from shutting down when 
> shutdown is requested externally. One example is when we ask the RM to remove 
> a previous container allocation request (where the count should be 
> decremented but is NOT) and add a new one (where the count should be and IS 
> incremented). Another example is the "alreadyAllocated" case, where we 
> release the container and still increment numRequestedContainers, which seems 
> wrong. This bug is showing up in multiple Apex deployments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)