[
https://issues.apache.org/jira/browse/FLINK-18226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann updated FLINK-18226:
----------------------------------
Priority: Blocker (was: Critical)
> ResourceManager requests unnecessary new workers if previous workers are
> allocated but not registered.
> ------------------------------------------------------------------------------------------------------
>
> Key: FLINK-18226
> URL: https://issues.apache.org/jira/browse/FLINK-18226
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes, Deployment / YARN, Runtime /
> Coordination
> Affects Versions: 1.11.0
> Reporter: Xintong Song
> Assignee: Xintong Song
> Priority: Blocker
> Fix For: 1.11.0
>
>
> h2. Problem
> Currently, on Kubernetes & Yarn deployments, the ResourceManager compares the
> *pending workers requested from Kubernetes/Yarn* against the *pending workers
> required by SlotManager* to decide whether new workers should be requested in
> case of a worker failure (a simplified sketch of this bookkeeping follows the
> lists below).
> * {{KubernetesResourceManager#requestKubernetesPodIfRequired}}
> * {{YarnResourceManager#requestYarnContainerIfRequired}}
> The count of *pending workers requested from Kubernetes/Yarn* is decreased when
> a worker is allocated, *before the worker is actually started and registered*.
> * Decreased in {{ActiveResourceManager#notifyNewWorkerAllocated}}, which is
> called from
> * {{KubernetesResourceManager#onAdded}}
> * {{YarnResourceManager#onContainersOfResourceAllocated}}
> On the other hand, the count of *pending workers required by SlotManager* is
> derived from the number of pending slots inside the SlotManager, which is
> decreased only *when the new workers/slots are registered*.
> * {{SlotManagerImpl#registerSlot}}
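>
> A minimal, self-contained sketch of this bookkeeping (all field and method
> names, and the simplified logic, are illustrative only, not the actual Flink
> identifiers):
> {code:java}
> // Hedged sketch of the described bookkeeping; names and logic are
> // illustrative, not the real Flink implementation.
> public class PendingWorkerCheckSketch {
>
>     // Workers requested from Kubernetes/Yarn but not yet allocated.
>     // Decreased as soon as a worker is allocated (onAdded /
>     // onContainersOfResourceAllocated), before it starts and registers.
>     int pendingWorkersRequested;
>
>     // Workers still required, derived from the SlotManager's pending slots.
>     // Decreased only when a new worker's slots are registered
>     // (SlotManagerImpl#registerSlot).
>     int pendingWorkersRequiredBySlotManager;
>
>     // Total workers ever requested from the cluster, for illustration.
>     int workersRequestedFromCluster;
>
>     // Corresponds to requestKubernetesPodIfRequired /
>     // requestYarnContainerIfRequired: keep requesting new workers until the
>     // two pending counts match.
>     void requestWorkerIfRequired() {
>         while (pendingWorkersRequested < pendingWorkersRequiredBySlotManager) {
>             workersRequestedFromCluster++;
>             pendingWorkersRequested++;
>         }
>     }
> }
> {code}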
> Therefore, if a worker {{w1}} fails after another worker {{w2}} has been
> allocated but before {{w2}} is registered, the ResourceManager will request an
> unnecessary new worker for {{w2}}: {{w2}} no longer counts as
> pending-requested, but its slots still count as pending-required.
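>
> To make the race concrete, here is a hedged walk-through using the sketch
> above (the counter updates are illustrative only):
> {code:java}
> public class PendingWorkerRaceExample {
>     public static void main(String[] args) {
>         PendingWorkerCheckSketch rm = new PendingWorkerCheckSketch();
>
>         // Two workers w1 and w2 are requested for two pending slots.
>         rm.workersRequestedFromCluster = 2;
>         rm.pendingWorkersRequested = 2;
>         rm.pendingWorkersRequiredBySlotManager = 2;
>
>         // w1 and w2 are allocated (onAdded / onContainersOfResourceAllocated):
>         // the requested count drops to 0 although neither has registered yet.
>         rm.pendingWorkersRequested -= 2;
>
>         // w1 fails before w2 registers. The SlotManager still has two pending
>         // slots, but the requested count is 0, so two new workers are asked for.
>         rm.requestWorkerIfRequired();
>
>         // Prints 4, although only 3 requests were ever needed (w1, w2 and a
>         // replacement for w1): one worker is requested unnecessarily for w2.
>         System.out.println(rm.workersRequestedFromCluster);
>     }
> }
> {code}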
> h2. Impact
> Normally, the extra worker should be released soon after it is allocated. But
> in cases where the Kubernetes/Yarn cluster does not have enough resources, this
> can create more and more pending pods/containers.
> It's even more severe on Kubernetes, because
> {{KubernetesResourceManager#onAdded}} only indicates that the pod spec has been
> successfully added to the cluster; the pod may not actually have been allocated
> due to a lack of resources. Imagine there are {{N}} such pending pods: a
> failure of a single running pod then means requesting another {{N}} new pods.
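> As a hypothetical worked example: with {{N}} = 10 pods added but not yet
> scheduled, the pending-requested count is already 0 while the SlotManager still
> counts 10 required workers. A single failure of a running pod raises the
> required count to 11, so 11 new pods are requested, 10 of which are redundant;
> if those also remain pending, the next failure repeats the pattern with an even
> larger backlog.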
> In a session cluster, such pending pods could take a long time to be cleared,
> even after all jobs in the session cluster have terminated.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)