[
https://issues.apache.org/jira/browse/YUNIKORN-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Bacsko updated YUNIKORN-1161:
-----------------------------------
Description:
If we create pods where the name of the task group does not match the
{{task-group-name}} annotation, then the real pods will not transition to the
Running state after the placeholder pods expire, if YuniKorn was restarted in
the meantime.
For example, extend the sleep batch job as follows:
{noformat}
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-sleep-job-9
spec:
  completions: 5
  parallelism: 5
  template:
    metadata:
      labels:
        app: sleep
        applicationId: "batch-sleep-job-9"
        queue: root.sandbox
      annotations:
        yunikorn.apache.org/task-group-name: sleep-groupxxx
        yunikorn.apache.org/task-groups: |-
          [{
            "name": "sleep-group",
            "minMember": 5,
            "minResource": {
              "cpu": "100m",
              "memory": "2000M"
            },
            "nodeSelector": {},
            "tolerations": []
          }]
    ...
{noformat}
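The inconsistency above can be detected mechanically: the {{task-group-name}} annotation must name one of the groups declared in the {{task-groups}} JSON. A minimal illustrative check (the {{task_group_name_matches}} helper is hypothetical, not part of YuniKorn):

```python
import json

def task_group_name_matches(annotations: dict) -> bool:
    """Return True if the pod's task-group-name annotation names one of
    the groups declared in the task-groups annotation."""
    group_name = annotations.get("yunikorn.apache.org/task-group-name")
    groups_json = annotations.get("yunikorn.apache.org/task-groups", "[]")
    declared = {g["name"] for g in json.loads(groups_json)}
    return group_name in declared

# Annotations from the job above: "sleep-groupxxx" is not a declared group.
annotations = {
    "yunikorn.apache.org/task-group-name": "sleep-groupxxx",
    "yunikorn.apache.org/task-groups": '[{"name": "sleep-group", "minMember": 5}]',
}
print(task_group_name_matches(annotations))  # False
```

This is the condition the reproduction below relies on; a matching name (e.g. {{sleep-group}}) would return True.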
Submit the job and restart YuniKorn while the placeholders are still running.
This results in "batch-sleep-job-9-nnnnn" pods that never transition to
{{Running}} and have to be terminated manually.
{noformat}
$ kubectl get pods -A | grep -E "(batch-sleep-job-9|yunikorn)"
default   batch-sleep-job-9-hgxxl                          0/1   Pending   0   20m
default   batch-sleep-job-9-j6twt                          0/1   Pending   0   20m
default   batch-sleep-job-9-l4jhm                          0/1   Pending   0   20m
default   batch-sleep-job-9-swlm4                          0/1   Pending   0   20m
default   batch-sleep-job-9-z6wqx                          0/1   Pending   0   20m
default   yunikorn-admission-controller-78c775cfd9-6pp8d   1/1   Running   4   3d22h
default   yunikorn-scheduler-77dd7c665b-f8kkn              2/2   Running   0   18m
{noformat}
Note that without a YuniKorn restart, the pods are deallocated and removed properly.
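For context, the failure mode can be thought of as placeholder-to-real-pod linking keyed by the task group name: when the annotation names a group for which no placeholders exist, the real pod has nothing to swap with and stays {{Pending}}. A simplified sketch of that idea (not YuniKorn's actual implementation; names are illustrative):

```python
# Simplified model: placeholders are indexed by task group name; a real
# pod is only scheduled by replacing a placeholder from its own group.
placeholders = {"sleep-group": ["tg-sleep-group-0", "tg-sleep-group-1"]}

def find_placeholder(task_group_name: str):
    """Return a placeholder to replace, or None if the group is unknown."""
    pool = placeholders.get(task_group_name, [])
    return pool.pop() if pool else None

print(find_placeholder("sleep-group"))     # a placeholder is released
print(find_placeholder("sleep-groupxxx"))  # None: the real pod stays Pending
```

With a matching name the lookup succeeds; with the mismatched {{sleep-groupxxx}} annotation it never does, which matches the behavior observed after restart.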
> Pods not linked to placeholders are stuck in Running state if YK is restarted
> -----------------------------------------------------------------------------
>
> Key: YUNIKORN-1161
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1161
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: shim - kubernetes
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Attachments:
> logs-from-yunikorn-scheduler-k8s-in-yunikorn-scheduler-after_restart_nomatchingtaskgroupname.txt,
>
> logs-from-yunikorn-scheduler-k8s-in-yunikorn-scheduler-before_restart_nomatchingtaskgroupname.txt,
> pods_nomatchingtaskgroupname.txt
--
This message was sent by Atlassian Jira
(v8.20.1#820001)