[ https://issues.apache.org/jira/browse/FLINK-23194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397256#comment-17397256 ]
Till Rohrmann commented on FLINK-23194:
---------------------------------------
So if I understood you correctly, then we can close this ticket as won't do,
right [~zlzhang0122]?
> Cache and reuse the ContainerLaunchContext and accelerate the progress of
> createTaskExecutorLaunchContext on yarn
> -----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-23194
> URL: https://issues.apache.org/jira/browse/FLINK-23194
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.13.1, 1.12.4
> Reporter: zlzhang0122
> Priority: Major
> Fix For: 1.14.0
>
>
> When TaskExecutors are started in containers on YARN, a
> ContainerLaunchContext is created n times, where n is the number of
> TaskManagers. When I examined this creation process, I found that most of the
> context is common to all TaskManagers and independent of any particular one;
> only the launchCommand differs. We can create the ContainerLaunchContext once
> and reuse it, so that only the launchCommand has to be created separately for
> each TaskManager. I therefore propose caching and reusing the
> ContainerLaunchContext object to speed up its creation.
> I think this has the following benefits:
> # It speeds up the creation of the ContainerLaunchContext and therefore the
> start of the TaskExecutors, especially when there are many TaskManagers.
> # It reduces the load on HDFS, etc.
> # It also makes container startup less exposed to sudden failures of HDFS,
> YARN, etc.
> We have implemented this in our production environment. So far we have seen
> no problems and a clear benefit. Please let me know if there is anything that
> I haven't considered.
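
For readers skimming the thread, the caching idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink's or the reporter's actual code: LaunchContext is a hypothetical stand-in for YARN's ContainerLaunchContext, CachingLaunchContextFactory is an invented name, and the resource path, environment key and launch commands are made up.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch only. LaunchContext is a hypothetical stand-in for YARN's
 * ContainerLaunchContext; all paths, keys and commands are made up.
 */
public class LaunchContextCacheSketch {

    /** Hypothetical, simplified launch context: shared parts plus one command. */
    static final class LaunchContext {
        final Map<String, String> localResources;
        final Map<String, String> environment;
        final List<String> commands;

        LaunchContext(Map<String, String> localResources,
                      Map<String, String> environment,
                      List<String> commands) {
            this.localResources = localResources;
            this.environment = environment;
            this.commands = commands;
        }
    }

    /** Builds the TaskManager-independent parts once and reuses them. */
    static final class CachingLaunchContextFactory {
        private Map<String, String> sharedResources; // null until first use
        private Map<String, String> sharedEnvironment;
        private int sharedBuilds; // how often the expensive part actually ran

        synchronized LaunchContext create(String launchCommand) {
            if (sharedResources == null) {
                // Assumption: in a real setup this is the step that talks to HDFS.
                sharedBuilds++;
                Map<String, String> resources = new LinkedHashMap<>();
                resources.put("flink-dist.jar", "hdfs:///hypothetical/path/flink-dist.jar");
                Map<String, String> env = new LinkedHashMap<>();
                env.put("HYPOTHETICAL_ENV_KEY", "value");
                sharedResources = Collections.unmodifiableMap(resources);
                sharedEnvironment = Collections.unmodifiableMap(env);
            }
            // Only the launch command is specific to one TaskManager.
            return new LaunchContext(
                    sharedResources, sharedEnvironment,
                    Collections.singletonList(launchCommand));
        }

        synchronized int sharedBuildCount() {
            return sharedBuilds;
        }
    }

    public static void main(String[] args) {
        CachingLaunchContextFactory factory = new CachingLaunchContextFactory();
        // Hypothetical launch commands for two TaskManagers.
        LaunchContext first = factory.create("start-taskmanager --id tm-1");
        LaunchContext second = factory.create("start-taskmanager --id tm-2");
        System.out.println(first.commands + " / " + second.commands);
        System.out.println("shared parts built " + factory.sharedBuildCount() + " time(s)");
    }
}
```

The shared maps are wrapped as unmodifiable so that handing the same instances to every context is safe in this sketch. Whether the real ContainerLaunchContext can be shared or copied this cheaply is not something the sketch settles.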
--
This message was sent by Atlassian Jira
(v8.3.4#803005)