[
https://issues.apache.org/jira/browse/YUNIKORN-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Bacsko updated YUNIKORN-1161:
-----------------------------------
Description:
If we create pods where the name of the task group does not match the
{{task-group-name}} annotation, then the real pods will not transition to the
Running state after the placeholder pods expire, if YuniKorn was restarted in
the meantime.
For example, extend the sleep batch job as follows:
{noformat}
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-sleep-job-9
spec:
  completions: 5
  parallelism: 5
  template:
    metadata:
      labels:
        app: sleep
        applicationId: "batch-sleep-job-9"
        queue: root.sandbox
      annotations:
        yunikorn.apache.org/task-group-name: sleep-groupxxx
        yunikorn.apache.org/task-groups: |-
          [{
            "name": "sleep-group",
            "minMember": 5,
            "minResource": {
              "cpu": "100m",
              "memory": "2000M"
            },
            "nodeSelector": {},
            "tolerations": []
          }]
    ...
{noformat}
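The inconsistency above can be detected mechanically: the {{task-group-name}} annotation must name one of the groups declared in the {{task-groups}} JSON. A minimal illustrative check (the {{task_group_name_matches}} helper is hypothetical, not part of YuniKorn):

```python
import json

def task_group_name_matches(annotations: dict) -> bool:
    """Return True if the pod's task-group-name annotation names one of
    the groups declared in the task-groups annotation."""
    group_name = annotations.get("yunikorn.apache.org/task-group-name")
    groups_json = annotations.get("yunikorn.apache.org/task-groups", "[]")
    declared = {g["name"] for g in json.loads(groups_json)}
    return group_name in declared

# Annotations from the job above: "sleep-groupxxx" is not a declared group.
annotations = {
    "yunikorn.apache.org/task-group-name": "sleep-groupxxx",
    "yunikorn.apache.org/task-groups": '[{"name": "sleep-group", "minMember": 5}]',
}
print(task_group_name_matches(annotations))  # False
```

This is the condition the reproduction below relies on; a matching name (e.g. {{sleep-group}}) would return True.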
Submit the job and restart YuniKorn while the placeholders are still running.
This results in "batch-sleep-job-9-nnnnn" pods that never transition to
{{Running}} and have to be terminated manually.
{noformat}
$ kubectl get pods -A | grep -E "(batch-sleep-job-9|yunikorn)"
default   batch-sleep-job-9-hgxxl                          0/1   Pending   0   20m
default   batch-sleep-job-9-j6twt                          0/1   Pending   0   20m
default   batch-sleep-job-9-l4jhm                          0/1   Pending   0   20m
default   batch-sleep-job-9-swlm4                          0/1   Pending   0   20m
default   batch-sleep-job-9-z6wqx                          0/1   Pending   0   20m
default   yunikorn-admission-controller-78c775cfd9-6pp8d   1/1   Running   4   3d22h
default   yunikorn-scheduler-77dd7c665b-f8kkn              2/2   Running   0   18m
{noformat}
Note that without a YuniKorn restart, the pods are deallocated and removed properly.
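For context, the failure mode can be thought of as placeholder-to-real-pod linking keyed by the task group name: when the annotation names a group for which no placeholders exist, the real pod has nothing to swap with and stays {{Pending}}. A simplified sketch of that idea (not YuniKorn's actual implementation; names are illustrative):

```python
# Simplified model: placeholders are indexed by task group name; a real
# pod is only scheduled by replacing a placeholder from its own group.
placeholders = {"sleep-group": ["tg-sleep-group-0", "tg-sleep-group-1"]}

def find_placeholder(task_group_name: str):
    """Return a placeholder to replace, or None if the group is unknown."""
    pool = placeholders.get(task_group_name, [])
    return pool.pop() if pool else None

print(find_placeholder("sleep-group"))     # a placeholder is released
print(find_placeholder("sleep-groupxxx"))  # None: the real pod stays Pending
```

With a matching name the lookup succeeds; with the mismatched {{sleep-groupxxx}} annotation it never does, which matches the behavior observed after restart.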
> Pods not linked to placeholders are stuck in Running state if YK is restarted
> -----------------------------------------------------------------------------
>
> Key: YUNIKORN-1161
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1161
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: shim - kubernetes
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Attachments:
> logs-from-yunikorn-scheduler-k8s-in-yunikorn-scheduler-after_restart_nomatchingtaskgroupname.txt,
>
> logs-from-yunikorn-scheduler-k8s-in-yunikorn-scheduler-before_restart_nomatchingtaskgroupname.txt,
> pods_nomatchingtaskgroupname.txt
--
This message was sent by Atlassian Jira
(v8.20.1#820001)