[ 
https://issues.apache.org/jira/browse/YUNIKORN-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881484#comment-17881484
 ] 

shawn edited comment on YUNIKORN-2860 at 9/13/24 7:37 AM:
----------------------------------------------------------

[~wilfreds] Thank you. I found that YuniKorn breaks out of this situation
through the ph timeout, and what happens next depends on the schedulingPolicy
in use.
Another question: I submitted four apps (using the YAML files from the issue
description) with the hard schedulingPolicy on v1.3.0 and v1.5.2.
In 1.3.0, all apps end up pending, which seems to be a bug:
!image-2024-09-13-15-33-13-964.png!
!image-2024-09-13-15-33-19-380.png|width=744,height=164!

In 1.5.2, YuniKorn fails the timed-out apps and moves them to the
ResourceReservationTimeout status, which is expected:
!http://www.kdocs.cn/api/v3/office/copy/WXFSeiszR1UxODd0TTlJWUVGKzJCWE1VTkFUcjhsdFF3OGhJaXNQTWhtdTFlMEJhdlpxOWxZRkFKQnJJM0dVc1RpNXU3TVdXZ0NDYzRuWm5kWGFLbWIxblJVcTBwcnpzSnFlMmh5NlE2Ti95ajFMZVVjbHVTc0NBcm1oK1hHUlpvV2FKR1M2cHptWjNlMkVzZzRlREswcmFhU1dPNXA4SmYyeGN0Rk5GemVPK3ZGRWFIaVNCSnMxcGoxbmdtTkhPd3JuTXM5WVZiaS8xYWpOV3NVRFZKNklNeXgxWFNKb0RjQzZuZVdlcnRMUHhkaERhU3Fwb2I3Mm5STmFhVzdsRFRhTGxRUWlsa0wwPQ==/attach/object/JMAMALQ3ADACC?|width=753,height=219!

!image-2024-09-13-15-35-26-177.png|width=750,height=313!

Could anyone point me to the related improvements in v1.5.2? Thank you.
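
Note for readers reproducing the comparison: according to the YuniKorn gang scheduling documentation, the gang scheduling style and the placeholder timeout are set through the `yunikorn.apache.org/schedulingPolicyParameters` annotation on the pod template. The manifests quoted in this issue do not show that annotation, so the following is only a minimal sketch of how its value could be assembled and checked. The helper names and the 60-second timeout are hypothetical and do not come from the report.

```python
# Sketch: build and parse the value of the
# yunikorn.apache.org/schedulingPolicyParameters annotation.
# Helper names are illustrative; they are not part of YuniKorn.
ANNOTATION_KEY = "yunikorn.apache.org/schedulingPolicyParameters"
VALID_STYLES = ("Soft", "Hard")


def build_policy_parameters(style: str, placeholder_timeout_s: int) -> str:
    """Return the space-separated key=value string used as the annotation value."""
    if style not in VALID_STYLES:
        raise ValueError(f"gangSchedulingStyle must be one of {VALID_STYLES}")
    if placeholder_timeout_s <= 0:
        raise ValueError("placeholder timeout must be a positive number of seconds")
    return (
        f"placeholderTimeoutInSeconds={placeholder_timeout_s} "
        f"gangSchedulingStyle={style}"
    )


def parse_policy_parameters(value: str) -> dict:
    """Split an annotation value back into a {key: value} dict of strings."""
    return dict(item.split("=", 1) for item in value.split())


if __name__ == "__main__":
    # 60 seconds is a hypothetical timeout, not a value taken from the report.
    value = build_policy_parameters("Hard", 60)
    print(f'{ANNOTATION_KEY}: "{value}"')
    print(parse_policy_parameters(value))
```

The printed line can be pasted under `annotations:` in the Job template, next to the existing `task-groups` annotation.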


was (Author: JIRAUSER303988):
[~wilfreds] Thank you. I found that YuniKorn breaks out of this situation
through the pg timeout, and what happens next depends on the schedulingPolicy
in use.
Another question: I submitted four apps (using the YAML files from the issue
description) with the hard schedulingPolicy on v1.3.0 and v1.5.2.
In 1.3.0, all apps end up pending, which seems to be a bug:
!image-2024-09-13-15-33-13-964.png!
!image-2024-09-13-15-33-19-380.png|width=744,height=164!

In 1.5.2, YuniKorn fails the timed-out apps and moves them to the
ResourceReservationTimeout status, which is expected:
!http://www.kdocs.cn/api/v3/office/copy/WXFSeiszR1UxODd0TTlJWUVGKzJCWE1VTkFUcjhsdFF3OGhJaXNQTWhtdTFlMEJhdlpxOWxZRkFKQnJJM0dVc1RpNXU3TVdXZ0NDYzRuWm5kWGFLbWIxblJVcTBwcnpzSnFlMmh5NlE2Ti95ajFMZVVjbHVTc0NBcm1oK1hHUlpvV2FKR1M2cHptWjNlMkVzZzRlREswcmFhU1dPNXA4SmYyeGN0Rk5GemVPK3ZGRWFIaVNCSnMxcGoxbmdtTkhPd3JuTXM5WVZiaS8xYWpOV3NVRFZKNklNeXgxWFNKb0RjQzZuZVdlcnRMUHhkaERhU3Fwb2I3Mm5STmFhVzdsRFRhTGxRUWlsa0wwPQ==/attach/object/JMAMALQ3ADACC?|width=753,height=219!

!image-2024-09-13-15-35-26-177.png|width=750,height=313!

Could anyone point me to the related improvements in v1.5.2? Thank you.

> submit gang applications Simultaneously  may cause unexpected pending apps 
> ---------------------------------------------------------------------------
>
>                 Key: YUNIKORN-2860
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2860
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>    Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.5.1, 1.5.2
>            Reporter: shawn
>            Priority: Major
>         Attachments: image-2024-09-11-15-41-12-142.png, 
> image-2024-09-11-15-42-07-739.png, image-2024-09-13-15-33-13-964.png, 
> image-2024-09-13-15-33-19-380.png, image-2024-09-13-15-35-26-177.png, 
> state-dump.txt, yunikorn-scheduler.txt
>
>
>   
>   When I submit 4 gang apps to YuniKorn simultaneously, sometimes all 4 apps
> stay pending while two pgs are running, which is not expected.
>  It can be reproduced as follows:
>       1. Create the queue configuration:
> kubectl create configmap yunikorn-configs --from-file=queues.yaml -n yunikorn
>  * queues.yaml
> {code:java}
> partitions:
>   - name: default
>     queues:
>       - name: root
>         queues:
>           - name: my-dev
>             submitacl: "*"
>             resources:
>               guaranteed: { memory: 1G, vcore: 1 }
>               max: { memory: 2G, vcore: 2 }{code}
>          2. Simultaneously submit gang-scheduling-job-example1-4.yaml; these
> files differ only in name and applicationId:
> {code:java}
> apiVersion: batch/v1
> kind: Job
> metadata:
>   name: gang-scheduling-job-example1
> spec:
>   completions: 2
>   parallelism: 2
>   template:
>     metadata:
>       labels:
>         app: sleep
>         applicationId: "gang-scheduling-job-example1"
>         queue: root.my-dev
>       annotations:
>         yunikorn.apache.org/task-group-name: task-group-example-0
>         yunikorn.apache.org/task-groups: |-
>           [{
>               "name": "task-group-example-0",
>               "minMember": 2,
>               "minResource": {
>                 "cpu": "1",
>                 "memory": "1G"
>               },
>               "nodeSelector": {},
>               "tolerations": [],
>               "affinity": {}
>           }]
>     spec:
>       schedulerName: yunikorn
>       restartPolicy: Never
>       containers:
>         - name: sleep30
>           image: "nginx:latest"
>           command: ["sleep", "999999999"]
>           resources:
>             requests:
>               cpu: "1"
>               memory: "1G" {code}
> Finally, kubectl get pods -n default shows an unexpected result (not always
> reproducible):
> !image-2024-09-11-15-41-12-142.png!
>  
> The app state is as follows:
> !image-2024-09-11-15-42-07-739.png|width=754,height=280!
> The full state dump is in state-dump.txt, and the YuniKorn scheduler logs
> are in yunikorn-scheduler.txt.
>  
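
The resource arithmetic behind the reproduction steps can be sketched as follows. The numbers are copied from queues.yaml and the Job manifest quoted above: the queue max is 2 vcore / 2G, and each gang asks for minMember 2 with a minResource of 1 cpu / 1G. The interleaving at the end, where two different apps each hold one placeholder, is an assumption made to illustrate one way all apps could stay pending. It is not a confirmed diagnosis of this issue, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    vcore: int     # whole CPU cores
    memory_g: int  # memory in units of "G", as written in the YAML

    def times(self, n: int) -> "Resources":
        return Resources(self.vcore * n, self.memory_g * n)

    def fits_in(self, limit: "Resources") -> bool:
        return self.vcore <= limit.vcore and self.memory_g <= limit.memory_g


# Values copied from queues.yaml and the Job manifest in the report.
QUEUE_MAX = Resources(vcore=2, memory_g=2)     # my-dev queue "max"
MIN_RESOURCE = Resources(vcore=1, memory_g=1)  # task group "minResource"
MIN_MEMBER = 2                                 # task group "minMember"
APP_COUNT = 4                                  # apps submitted at once


def gang_demand() -> Resources:
    """Resources one app needs before its whole gang can start."""
    return MIN_RESOURCE.times(MIN_MEMBER)


def gangs_that_fit(queue_max: Resources, demand: Resources) -> int:
    """Number of complete gangs the queue max can hold at the same time."""
    return min(queue_max.vcore // demand.vcore,
               queue_max.memory_g // demand.memory_g)


def complete_gangs(placeholders_per_app: list) -> int:
    """Count the apps whose held placeholders reached minMember."""
    return sum(1 for held in placeholders_per_app if held >= MIN_MEMBER)


if __name__ == "__main__":
    demand = gang_demand()
    print("one gang needs:", demand)
    print("complete gangs that fit in the queue:", gangs_that_fit(QUEUE_MAX, demand))
    print("demand of all apps:", demand.times(APP_COUNT))

    # Assumed interleaving: two different apps each got one placeholder.
    interleaved = [1, 1, 0, 0]
    used = MIN_RESOURCE.times(sum(interleaved))
    print("queue is full:", used == QUEUE_MAX)
    print("complete gangs:", complete_gangs(interleaved))
```

Under these numbers the queue holds exactly one complete gang, so any split of the two available slots across different apps leaves every gang below minMember until a placeholder times out.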



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
