[
https://issues.apache.org/jira/browse/FLINK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287613#comment-17287613
]
Xintong Song commented on FLINK-20332:
--------------------------------------
To add recovered workers to pending resources, we need to get the
{{TaskExecutorProcessSpec}} or {{WorkerResourceSpec}} of the recovered workers
before they register to RM. There are several options to do this.
# Parse from the TM starting command, which contains TM resource specifications
as dynamic properties. Starting command of recovered workers are only available
for the native k8s deployment.
# Attach (serialized) resource specifications to workers as metadata: k8s
annotations, yarn allocation tags (version 3.1+, at the price handling
different versions with reflection and yarn RM keeps the tags in memory).
# Maintaining worker resource specs in HA, at the price of more HA space
(linear to # of living TMs) and IO cost during TM allocation/release.
2) does not provide much extra benefit compared to 1), and 3) might be a bit
too heavy. Therefore, as a first step, I propose to do this improvement with
1), for native K8s deployment only.
- Yarn can be supported with 2) later, if needed. Recovering resource
specifications with different approach for Kubernetes / Yarn shouldn't be a
problem.
- ATM, we do not support TMs with different resources on Mesos. Thus, TM
resource specifications on Mesos should always be the default.
> Add workers recovered from previous attempt to pending resources
> ----------------------------------------------------------------
>
> Key: FLINK-20332
> URL: https://issues.apache.org/jira/browse/FLINK-20332
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Xintong Song
> Assignee: Xintong Song
> Priority: Major
>
> For active deployments (Native K8s/Yarn/Mesos), after a JM failover, workers
> from previous attempt should register to the new JM. Depending on the order
> that slot requests and TM registrations arrive at the RM, it could happen
> that RM allocates unnecessary new resources while there are recovered
> resources that can be reused.
> A potential improvement is to add recovered workers to pending resources, so
> that RM knows what resources are expected to be available soon and decide
> whether to allocate new resources accordingly.
> See also the discussion in FLINK-20249.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)