[ https://issues.apache.org/jira/browse/SPARK-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127088#comment-14127088 ]
Thomas Graves commented on SPARK-3456:
--------------------------------------
Note that this is only a problem on YARN alpha. In stable we use the
AMRMClient interface, which actually does an add (a new request is added to
those already outstanding instead of overriding them).
> YarnAllocator can lose container requests to RM
> -----------------------------------------------
>
> Key: SPARK-3456
> URL: https://issues.apache.org/jira/browse/SPARK-3456
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.2.0
> Reporter: Thomas Graves
> Priority: Critical
>
> I haven't actually tested this yet, but I believe that Spark on YARN can lose
> container requests to the RM. The reason is that we ask for the total number
> up front (say x) and then don't ask for any more unless some are missing; if
> we do ask again, we could erase the original request.
> For example:
> - ask for 3 containers
> - 1 is allocated
> - ask for 0 containers, since we asked for 3 originally (2 left)
> - the 1 allocated dies
> - we now ask for 1 since it's missing; this will override whatever is on the
> YARN side (in this case 2).
> Then we lose the 2 more that we need.
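
The sequence in the description can be replayed with a small toy model. This is only a sketch of the reasoning, not Spark or YARN code: ToyRM and replay are hypothetical names, and it assumes that an ask for 0 containers sends nothing to the RM. The "override" mode stands for the behaviour suspected on alpha, the "add" mode for the behaviour described for stable.

```python
class ToyRM:
    """Hypothetical stand-in for the RM's count of outstanding container requests."""

    def __init__(self, additive):
        self.additive = additive  # True: asks accumulate; False: each ask overrides
        self.pending = 0

    def ask(self, n):
        if n == 0:
            return  # assumption: asking for 0 sends no request at all
        if self.additive:
            self.pending += n
        else:
            self.pending = n  # replaces whatever was still outstanding

    def allocate(self, n=1):
        granted = min(n, self.pending)
        self.pending -= granted
        return granted


def replay(additive):
    """Replay the example and return how many containers we will end up with."""
    rm = ToyRM(additive)
    running = 0
    rm.ask(3)                  # ask for 3 containers
    running += rm.allocate(1)  # 1 is allocated, 2 left on the RM side
    rm.ask(0)                  # nothing missing from our point of view
    running -= 1               # the 1 allocated dies
    rm.ask(1)                  # ask for 1 since it is missing
    return running + rm.pending


if __name__ == "__main__":
    print("override:", replay(additive=False))  # prints "override: 1"
    print("add:", replay(additive=True))        # prints "add: 3"
```

With override semantics only 1 of the 3 wanted containers is still being requested, so 2 are lost; with add semantics all 3 remain outstanding.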