[
https://issues.apache.org/jira/browse/FLINK-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xintong Song reassigned FLINK-19568:
------------------------------------
Assignee: Xintong Song
> Offload creating TM launch contexts to the IO executor
> ------------------------------------------------------
>
> Key: FLINK-19568
> URL: https://issues.apache.org/jira/browse/FLINK-19568
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Reporter: Xintong Song
> Assignee: Xintong Song
> Priority: Major
> Fix For: 1.12.0
>
>
> Currently, for launching each TM container on Yarn, Flink creates a container
> launch context in RM's PRC main thread. This includes accessing file status
> from remote file systems, which may blocks the RM's main thread, especially
> when remote file system is slow. See [this
> thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/TM-heartbeat-timeout-due-to-ResourceManager-being-busy-td38626.html].
> The creating of TM context does not access nor change any RM's internal
> states. Therefore, I propose to offload the work to the IO executor. To be
> specific, I think the entire
> {{YarnResourceManagerDriver#createTaskExecutorLaunchContext}} can be invoked
> on the IO executor.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)