TisonKun commented on a change in pull request #10143: [FLINK-13184]Starting a
TaskExecutor blocks the YarnResourceManager's main thread
URL: https://github.com/apache/flink/pull/10143#discussion_r344592017
##########
File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
##########
@@ -407,15 +404,9 @@ private void startTaskExecutorInContainer(Container container) {
containerIdStr,
container.getNodeId().getHost());
- nodeManagerClient.startContainer(container, taskExecutorLaunchContext);
+ nodeManagerClient.startContainerAsync(container, taskExecutorLaunchContext);
} catch (Throwable t) {
- log.error("Could not start TaskManager in container {}.", container.getId(), t);
-
- // release the failed container
- workerNodeMap.remove(resourceId);
- resourceManagerClient.releaseAssignedContainer(container.getId());
- // and ask for a new one
- requestYarnContainerIfRequired();
+ onStartContainerError(container.getId(), t);
Review comment:
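The same concern applies here. How do we react to a container start failure?
startContainerAsync returns before the container has actually been started, so
I don't think this catch block handles the eventual result of the asynchronous
call.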
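
To illustrate the concern, here is a minimal, self-contained sketch. It does not use the Hadoop API: `StartCallback`, `startContainerAsync` and `launch` below are hypothetical stand-ins built on `CompletableFuture`, assumed only for illustration. The point is that a try/catch around an asynchronous call covers only the submission, so a failure that happens later has to be handled where the result is delivered.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CopyOnWriteArrayList;

public class AsyncStartSketch {

    /** Hypothetical callback interface; a stand-in, not the Hadoop one. */
    interface StartCallback {
        void onStarted(String containerId);
        void onStartError(String containerId, Throwable cause);
    }

    /** Hypothetical async client call: returns at once, reports the outcome via the callback. */
    static CompletableFuture<Void> startContainerAsync(
            String containerId, boolean shouldFail, StartCallback callback) {
        return CompletableFuture.runAsync(() -> {
            if (shouldFail) {
                throw new IllegalStateException("start rejected for " + containerId);
            }
        }).<Void>handle((ignored, error) -> {
            if (error == null) {
                callback.onStarted(containerId);
            } else {
                // Failures of the async task arrive wrapped in a CompletionException.
                Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                        ? error.getCause() : error;
                callback.onStartError(containerId, cause);
            }
            return null;
        });
    }

    /** Returns the events observed for one start attempt. */
    public static List<String> launch(String containerId, boolean shouldFail) {
        List<String> events = new CopyOnWriteArrayList<>();
        StartCallback callback = new StartCallback() {
            @Override public void onStarted(String id) {
                events.add("started:" + id);
            }
            @Override public void onStartError(String id, Throwable cause) {
                events.add("callback-error:" + id);
            }
        };
        CompletableFuture<Void> done;
        try {
            // Only errors thrown while submitting the request can land in this catch.
            done = startContainerAsync(containerId, shouldFail, callback);
        } catch (Throwable t) {
            events.add("caught-synchronously:" + containerId);
            return events;
        }
        done.join(); // wait only so that the sketch is deterministic
        return events;
    }

    public static void main(String[] args) {
        System.out.println(launch("container_1", true));
        System.out.println(launch("container_2", false));
    }
}
```

In this sketch the failing start is never seen by the catch block; it reaches only `onStartError`. If the same holds for the client used in this change, the release-and-re-request logic removed from the catch block would need to run from the asynchronous error callback instead.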