[ 
https://issues.apache.org/jira/browse/FLINK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287613#comment-17287613
 ] 

Xintong Song commented on FLINK-20332:
--------------------------------------

To add recovered workers to pending resources, we need to get the 
{{TaskExecutorProcessSpec}} or {{WorkerResourceSpec}} of the recovered workers 
before they register to RM. There are several options to do this.
# Parse from the TM starting command, which contains TM resource specifications 
as dynamic properties. Starting command of recovered workers are only available 
for the native k8s deployment.
# Attach (serialized) resource specifications to workers as metadata: k8s 
annotations, yarn allocation tags (version 3.1+, at the price handling 
different versions with reflection and yarn RM keeps the tags in memory).
# Maintaining worker resource specs in HA, at the price of more HA space 
(linear to # of living TMs) and IO cost during TM allocation/release.

2) does not provide much extra benefit compared to 1), and 3) might be a bit 
too heavy. Therefore, as a first step, I propose to do this improvement with 
1), for native K8s deployment only.
- Yarn can be supported with 2) later, if needed. Recovering resource 
specifications with different approach for Kubernetes / Yarn shouldn't be a 
problem.
- ATM, we do not support TMs with different resources on Mesos. Thus, TM 
resource specifications on Mesos should always be the default.

> Add workers recovered from previous attempt to pending resources
> ----------------------------------------------------------------
>
>                 Key: FLINK-20332
>                 URL: https://issues.apache.org/jira/browse/FLINK-20332
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>
> For active deployments (Native K8s/Yarn/Mesos), after a JM failover, workers 
> from previous attempt should register to the new JM. Depending on the order 
> that slot requests and TM registrations arrive at the RM, it could happen 
> that RM allocates unnecessary new resources while there are recovered 
> resources that can be reused.
> A potential improvement is to add recovered workers to pending resources, so 
> that RM knows what resources are expected to be available soon and decide 
> whether to allocate new resources accordingly.
> See also the discussion in FLINK-20249.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to