[
https://issues.apache.org/jira/browse/FLINK-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yi Tang updated FLINK-19171:
----------------------------
Description:
{code:java}
private void terminatedPodsInMainThread(List<KubernetesPod> pods) {
    getMainThreadExecutor().execute(() -> {
        for (KubernetesPod pod : pods) {
            if (pod.isTerminated()) {
                ...
            }
        }
    });
}
{code}
It looks like the RM only removes a pod from its ledger if the pod
"isTerminated",
although the pod is accounted for as soon as it is created.
However, checking "isTerminated" alone is not sufficient, e.g. when a Pending pod
is deleted manually.
After that, a new job that requires more resources cannot trigger the allocation of
a new pod.
Please let me know if I have misunderstood, thanks.
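To make the suspected leak concrete, here is a minimal, self-contained sketch of the scenario. It is only a simplified model and not Flink's actual resource manager code: the names PodLedgerSketch, PodEvent, Ledger and the releaseOnDelete flag are hypothetical and exist purely for illustration. It assumes the ledger counts a pod from creation and requests a new pod only when the tracked pods do not cover the demand.

```java
import java.util.ArrayList;
import java.util.List;

public class PodLedgerSketch {

    /** Hypothetical stand-in for a pod event; this is NOT Flink's KubernetesPod. */
    static final class PodEvent {
        final String name;
        final boolean terminated; // true only if the pod reached a terminal phase
        final boolean deleted;    // true if the pod object was removed from the cluster

        PodEvent(String name, boolean terminated, boolean deleted) {
            this.name = name;
            this.terminated = terminated;
            this.deleted = deleted;
        }
    }

    /** Hypothetical ledger of the pods the resource manager believes it still owns. */
    static final class Ledger {
        private final List<String> trackedPods = new ArrayList<>();
        private final boolean releaseOnDelete;

        Ledger(boolean releaseOnDelete) {
            this.releaseOnDelete = releaseOnDelete;
        }

        /** The pod is accounted for as soon as it is created. */
        void onPodCreated(String name) {
            trackedPods.add(name);
        }

        /** Drops the pod on termination, and on deletion only if releaseOnDelete is set. */
        void onPodEvent(PodEvent event) {
            if (event.terminated || (releaseOnDelete && event.deleted)) {
                trackedPods.remove(event.name);
            }
        }

        /** Assumption: a new pod is requested only if tracked pods do not cover the demand. */
        boolean needsNewPod(int requiredPods) {
            return trackedPods.size() < requiredPods;
        }
    }

    /** Replays the scenario from the report: a Pending pod is deleted manually. */
    static boolean newPodRequestedAfterManualDelete(boolean releaseOnDelete) {
        Ledger ledger = new Ledger(releaseOnDelete);
        ledger.onPodCreated("taskmanager-1");
        // The pod is deleted while Pending, so it never reports a terminated state.
        ledger.onPodEvent(new PodEvent("taskmanager-1", false, true));
        // A new job now needs one pod.
        return ledger.needsNewPod(1);
    }

    public static void main(String[] args) {
        System.out.println("terminated-only check requests a new pod: "
                + newPodRequestedAfterManualDelete(false));
        System.out.println("delete-aware check requests a new pod: "
                + newPodRequestedAfterManualDelete(true));
    }
}
```

In this model the terminated-only ledger keeps counting the deleted pod, so no new pod is requested, while the delete-aware variant frees the entry. Whether the real RM behaves this way is exactly the question raised in this issue.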
was:
{code:java}
private void terminatedPodsInMainThread(List<KubernetesPod> pods) {
    getMainThreadExecutor().execute(() -> {
        for (KubernetesPod pod : pods) {
            if (pod.isTerminated()) {
                ...
            }
        }
    });
}
{code}
It looks like the RM only removes a pod from its ledger if the pod
"isTerminated",
although the pod is accounted for as soon as it is created.
However, checking "isTerminated" alone is not sufficient, e.g. when a Pending pod
is deleted manually.
Please let me know if I have misunderstood, thanks.
> K8s Resource Manager may lead to resource leak after pod deleted
> ----------------------------------------------------------------
>
> Key: FLINK-19171
> URL: https://issues.apache.org/jira/browse/FLINK-19171
> Project: Flink
> Issue Type: Bug
> Reporter: Yi Tang
> Priority: Minor
>
> {code:java}
> private void terminatedPodsInMainThread(List<KubernetesPod> pods) {
>     getMainThreadExecutor().execute(() -> {
>         for (KubernetesPod pod : pods) {
>             if (pod.isTerminated()) {
>                 ...
>             }
>         }
>     });
> }
> {code}
> It looks like the RM only removes a pod from its ledger if the pod
> "isTerminated",
> although the pod is accounted for as soon as it is created.
> However, checking "isTerminated" alone is not sufficient, e.g. when a Pending
> pod is deleted manually.
> After that, a new job that requires more resources cannot trigger the allocation
> of a new pod.
>
> Please let me know if I have misunderstood, thanks.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)