[ 
https://issues.apache.org/jira/browse/APEXCORE-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15652496#comment-15652496
 ] 

Venkatesh Kottapalli commented on APEXCORE-471:
-----------------------------------------------

In the reported scenario, other jobs are using 177 of the cluster's 180 
containers. The Apex job needs 20 containers at launch and initially receives 
only the 3 that remain.

After that, the application master sends no further requests to the RM for 
the remaining 17 containers, so the job stays in a pending state forever, 
even after the other jobs in the cluster have completed and all containers 
are available again.
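
To make the expected behaviour concrete, here is a minimal sketch of the 
bookkeeping that would avoid the stall in the scenario above. It is not the 
Apex or YARN implementation: the class PendingContainerTracker, its methods, 
and the timeout value are all hypothetical names chosen for illustration. The 
idea is only that the application master remembers how many containers are 
still missing and asks for them again once a timeout passes without 
allocation.

```java
/**
 * Hypothetical helper (not part of Apex or YARN): tracks how many containers
 * are still missing and decides when the request should be sent again.
 */
class PendingContainerTracker {
    private final int required;             // containers the job needs in total
    private final long resubmitAfterMillis; // assumed timeout before re-requesting
    private int allocated = 0;              // containers granted so far
    private long lastRequestMillis;         // time the last request was sent

    PendingContainerTracker(int required, long resubmitAfterMillis, long nowMillis) {
        this.required = required;
        this.resubmitAfterMillis = resubmitAfterMillis;
        // The initial request for all containers is assumed to go out now.
        this.lastRequestMillis = nowMillis;
    }

    /** Record containers granted by the resource manager. */
    void onAllocated(int count) {
        allocated = Math.min(required, allocated + count);
    }

    /** Containers still missing. */
    int outstanding() {
        return required - allocated;
    }

    /**
     * Called on each heartbeat. Returns how many containers to request again,
     * or 0 if nothing is missing or the timeout has not yet elapsed.
     */
    int containersToResubmit(long nowMillis) {
        int missing = outstanding();
        if (missing == 0 || nowMillis - lastRequestMillis < resubmitAfterMillis) {
            return 0;
        }
        lastRequestMillis = nowMillis; // restart the timeout for the new request
        return missing;
    }
}
```

With required = 20 and 3 containers granted, outstanding() is 17, and 
containersToResubmit() keeps returning 17 once per timeout period until the 
remaining containers arrive.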

> Requests for container allocation are not resubmitted
> -----------------------------------------------------
>
>                 Key: APEXCORE-471
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-471
>             Project: Apache Apex Core
>          Issue Type: Bug
>    Affects Versions: 3.3.0, 3.4.0
>            Reporter: Vlad Rozov
>
> When a YARN cluster has a limited number of available resources, container 
> requests should be resubmitted. BlacklistBasedResourceRequestHandler does 
> not properly handle the case when resources are limited.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
