Yang Wang commented on FLINK-15036:

[~trohrmann] You are right. It should be handled in the main thread of 
{{YarnResourceManager}}. Otherwise, concurrency-related exceptions may occur. 
We could wrap all the code of {{onStartContainerError}} in {{runAsync}} as a 
quick fix.
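
A minimal sketch of the idea, in plain Java rather than actual Flink code. The class, its fields, and the local {{runAsync}} helper are hypothetical stand-ins: a single-threaded executor plays the role of the resource manager's main thread, and the callback only hands its work over to that thread instead of touching state directly.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a resource manager; this is not Flink's YarnResourceManager. */
class MainThreadCallbackSketch {

    // Single-threaded executor standing in for the endpoint's main thread (assumption).
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "rm-main-thread");
        t.setDaemon(true); // do not keep the JVM alive for this sketch
        return t;
    });

    // State that is only ever read or written on the main thread, so it needs no locks.
    private final Set<String> pendingContainers = new HashSet<>();
    private int newContainerRequests = 0;
    private String lastHandlerThread = null;

    // Local analogue of a runAsync call: schedule the action on the main thread.
    private void runAsync(Runnable action) {
        mainThread.execute(action);
    }

    // Runs a read-only query on the main thread and waits for its result.
    private <T> T query(Callable<T> callable) {
        try {
            return mainThread.submit(callable).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    void addPendingContainer(String containerId) {
        runAsync(() -> pendingContainers.add(containerId));
    }

    // May be invoked from any callback thread; the whole body is wrapped in runAsync.
    void onStartContainerError(String containerId, Throwable cause) {
        runAsync(() -> {
            lastHandlerThread = Thread.currentThread().getName();
            // Only request a replacement for containers we were actually waiting on.
            if (pendingContainers.remove(containerId)) {
                newContainerRequests++;
            }
        });
    }

    int getNewContainerRequests() {
        return query(() -> newContainerRequests);
    }

    String getLastHandlerThread() {
        return query(() -> lastHandlerThread);
    }
}
```

Because the executor processes tasks in submission order on one thread, the state changes triggered by the callback cannot interleave with other main-thread work, whichever thread invokes {{onStartContainerError}}.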

> Container startup error will be handled outside of the YarnResourceManager's 
> main thread
> -----------------------------------------------------------------------------------------
>                 Key: FLINK-15036
>                 URL: https://issues.apache.org/jira/browse/FLINK-15036
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.10.0, 1.8.3, 1.9.2
>            Reporter: Till Rohrmann
>            Priority: Critical
>             Fix For: 1.10.0, 1.8.3, 1.9.2
> With FLINK-13184, we replaced the {{NMClient}} with the {{NMClientAsync}}. As 
> part of this change, container startup errors are now handled by a callback 
> to {{NMClientAsync.CallbackHandler}}. The implementation of 
> {{NMClientAsync.CallbackHandler#onStartContainerError}} will be called by the 
> {{NMClientAsync}}. Since the implementation performs state-changing 
> operations, it needs to run inside the {{YarnResourceManager}} main thread.
