Yang Wang created FLINK-18228:
---------------------------------
Summary: Release pending pods/containers timely when pending slots
changed
Key: FLINK-18228
URL: https://issues.apache.org/jira/browse/FLINK-18228
Project: Flink
Issue Type: Improvement
Components: Deployment / Kubernetes, Deployment / YARN, Runtime /
Coordination
Affects Versions: 1.12.0
Reporter: Yang Wang
Currently, when we deploy a session cluster on YARN/K8s and submit a job to
the existing cluster, some pending pods/containers may be created because
there are not enough resources. Even if the job fails with a slot allocation
timeout or is canceled, the pending pods/containers remain. Only after they
have been allocated and launched can they be released via the TaskManager
idle timeout.

The way pending pods/containers are released could be improved. Once the
pending slots change in the {{SlotManager}}, it could notify the
{{ActiveResourceManager}} to take corresponding actions (e.g. release the
pending pods that are no longer needed). This would help a lot when the
cluster is small and does not have many available resources.
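
A rough sketch of the idea follows. It is not Flink's actual API: the
listener interface, both classes, and the fixed slots-per-worker parameter
are hypothetical stand-ins for the {{SlotManager}} and
{{ActiveResourceManager}}, and "releasing" a worker only records its name
instead of calling K8s or YARN.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PendingWorkerSketch {

    /** Hypothetical callback, invoked whenever the number of pending slots changes. */
    interface PendingSlotListener {
        void onPendingSlotsChanged(int pendingSlots);
    }

    /** Stand-in for the SlotManager: tracks slots that are requested but not yet fulfilled. */
    static final class SketchSlotManager {
        private final PendingSlotListener listener;
        private int pendingSlots;

        SketchSlotManager(PendingSlotListener listener) {
            this.listener = listener;
        }

        void requestSlots(int count) {
            pendingSlots += count;
            listener.onPendingSlotsChanged(pendingSlots);
        }

        /** Called e.g. when a job is canceled or its slot requests time out. */
        void cancelSlots(int count) {
            pendingSlots = Math.max(0, pendingSlots - count);
            listener.onPendingSlotsChanged(pendingSlots);
        }
    }

    /** Stand-in for the ActiveResourceManager: keeps pending workers in line with demand. */
    static final class SketchResourceManager implements PendingSlotListener {
        private final int slotsPerWorker;
        private final Deque<String> pendingWorkers = new ArrayDeque<>();
        private final List<String> releasedWorkers = new ArrayList<>();
        private int nextWorkerId;

        SketchResourceManager(int slotsPerWorker) {
            this.slotsPerWorker = slotsPerWorker;
        }

        @Override
        public void onPendingSlotsChanged(int pendingSlots) {
            // Round up: a partially used worker is still needed.
            int neededWorkers = (pendingSlots + slotsPerWorker - 1) / slotsPerWorker;
            while (pendingWorkers.size() < neededWorkers) {
                pendingWorkers.addLast("worker-" + nextWorkerId++);
            }
            // Release the most recently requested pending workers that are no longer needed.
            while (pendingWorkers.size() > neededWorkers) {
                releasedWorkers.add(pendingWorkers.removeLast());
            }
        }

        int pendingWorkerCount() {
            return pendingWorkers.size();
        }

        List<String> releasedWorkers() {
            return releasedWorkers;
        }
    }

    public static void main(String[] args) {
        SketchResourceManager resourceManager = new SketchResourceManager(2);
        SketchSlotManager slotManager = new SketchSlotManager(resourceManager);

        slotManager.requestSlots(5); // 5 pending slots need 3 workers
        slotManager.cancelSlots(3);  // 2 pending slots need only 1 worker

        System.out.println("pending workers: " + resourceManager.pendingWorkerCount());
        System.out.println("released workers: " + resourceManager.releasedWorkers());
    }
}
```

The point of the sketch is that the release is driven by the change in
pending slots itself, so surplus pending workers go away as soon as the
demand drops instead of waiting to be allocated, launched, and then hit the
idle timeout.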
--
This message was sent by Atlassian Jira
(v8.3.4#803005)