[
https://issues.apache.org/jira/browse/YUNIKORN-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
shawn updated YUNIKORN-2860:
----------------------------
Attachment: pods.png
> submit gang applications Simultaneously may cause unexpected pending apps
> ---------------------------------------------------------------------------
>
> Key: YUNIKORN-2860
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2860
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.5.1, 1.5.2
> Reporter: shawn
> Priority: Major
> Attachments: applications.png, pods.png, queues.png
>
>
>
> When I submit 4 gang apps to YuniKorn simultaneously, all 4 apps sometimes
> stay pending while two pgs are running, which is not expected.
> It can be reproduced as follows:
> 1. Create the queue configuration:
> kubectl create configmap yunikorn-configs --from-file=queues.yaml -n yunikorn
> * queues.yaml
> {code:java}
> partitions:
>   - name: default
>     queues:
>       - name: root
>         queues:
>           - name: my-dev
>             submitacl: "*"
>             resources:
>               guaranteed: { memory: 1G, vcore: 1 }
>               max: { memory: 2G, vcore: 2 }
> {code}
> 2. Submit gang-scheduling-job-example1.yaml through
> gang-scheduling-job-example4.yaml simultaneously. The four files differ only
> in name and applicationId.
> {code:java}
> apiVersion: batch/v1
> kind: Job
> metadata:
>   name: gang-scheduling-job-example1
> spec:
>   completions: 2
>   parallelism: 2
>   template:
>     metadata:
>       labels:
>         app: sleep
>         applicationId: "gang-scheduling-job-example1"
>         queue: root.my-dev
>       annotations:
>         yunikorn.apache.org/task-group-name: task-group-example-0
>         yunikorn.apache.org/task-groups: |-
>           [{
>             "name": "task-group-example-0",
>             "minMember": 2,
>             "minResource": {
>               "cpu": "1",
>               "memory": "1G"
>             },
>             "nodeSelector": {},
>             "tolerations": [],
>             "affinity": {}
>           }]
>     spec:
>       schedulerName: yunikorn
>       restartPolicy: Never
>       containers:
>         - name: sleep30
>           image: "nginx:latest"
>           command: ["sleep", "999999999"]
>           resources:
>             requests:
>               cpu: "1"
>               memory: "1G"
> {code}
> Finally, kubectl get pods -n default shows an unexpected result (not always
> reproducible):
> !http://www.kdocs.cn/api/v3/office/copy/dHZnb0t1QXY5SXVjY0llQW5BcHJabWRzcGNxYm1NMUMyVmo4Mk4yYnhrWFhkZlRCamV6L1h6bHNqOEtyanc3QmpKU04xMDY5WHBTcEhMT2FxbnFGSWU1dVFJMGh1V2x4SXNXRU1KU3dQY2xxSzE4dW5QbkZ3NE5hcWtMOWZPVEtnM2lFRGhLTWNLYUR0NzRFUmNmRHZ2QjNJeTU3NHoyZm96SjNYSWFhc0srbVl4a1hjclJTT1JZVnphaEplSmVibGxXZjgyU0NoNlBpSjV4N2dyc2dIdFFUK0ppbGVrS1VueWxWWEFMd2xqUGpFUUlYSVNqNmxZRjBLY3RwL2pUdHJPbHJ1c1hhNE1vPQ==/attach/object/HCBXYGQ3ABQGY?|width=666,height=331!
> The queues web UI shows the following:
> !http://www.kdocs.cn/api/v3/office/copy/dHZnb0t1QXY5SXVjY0llQW5BcHJabWRzcGNxYm1NMUMyVmo4Mk4yYnhrWFhkZlRCamV6L1h6bHNqOEtyanc3QmpKU04xMDY5WHBTcEhMT2FxbnFGSWU1dVFJMGh1V2x4SXNXRU1KU3dQY2xxSzE4dW5QbkZ3NE5hcWtMOWZPVEtnM2lFRGhLTWNLYUR0NzRFUmNmRHZ2QjNJeTU3NHoyZm96SjNYSWFhc0srbVl4a1hjclJTT1JZVnphaEplSmVibGxXZjgyU0NoNlBpSjV4N2dyc2dIdFFUK0ppbGVrS1VueWxWWEFMd2xqUGpFUUlYSVNqNmxZRjBLY3RwL2pUdHJPbHJ1c1hhNE1vPQ==/attach/object/JKIV4GQ3ACADC?|width=665,height=275!
> The app states are as follows:
> !http://www.kdocs.cn/api/v3/office/copy/dHZnb0t1QXY5SXVjY0llQW5BcHJabWRzcGNxYm1NMUMyVmo4Mk4yYnhrWFhkZlRCamV6L1h6bHNqOEtyanc3QmpKU04xMDY5WHBTcEhMT2FxbnFGSWU1dVFJMGh1V2x4SXNXRU1KU3dQY2xxSzE4dW5QbkZ3NE5hcWtMOWZPVEtnM2lFRGhLTWNLYUR0NzRFUmNmRHZ2QjNJeTU3NHoyZm96SjNYSWFhc0srbVl4a1hjclJTT1JZVnphaEplSmVibGxXZjgyU0NoNlBpSjV4N2dyc2dIdFFUK0ppbGVrS1VueWxWWEFMd2xqUGpFUUlYSVNqNmxZRjBLY3RwL2pUdHJPbHJ1c1hhNE1vPQ==/attach/object/7CIH2GQ3ABQEK?|width=660,height=245!
>
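
For anyone trying to reproduce this, the simultaneous submission in step 2 could be scripted roughly as below. This is only a sketch, not the reporter's actual procedure: the helper names (render_manifests, submit, submit_all) and the __NAME__ placeholder are made up for illustration, and it assumes kubectl is on the PATH and pointed at a cluster where YuniKorn is the scheduler. It renders the four manifests, which differ only in name and applicationId, and applies them from parallel threads so that the requests reach the API server at nearly the same time.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Job manifest from step 2; __NAME__ is a hypothetical placeholder that stands
# in for both the Job name and the applicationId label.
TEMPLATE = """\
apiVersion: batch/v1
kind: Job
metadata:
  name: __NAME__
spec:
  completions: 2
  parallelism: 2
  template:
    metadata:
      labels:
        app: sleep
        applicationId: "__NAME__"
        queue: root.my-dev
      annotations:
        yunikorn.apache.org/task-group-name: task-group-example-0
        yunikorn.apache.org/task-groups: |-
          [{
            "name": "task-group-example-0",
            "minMember": 2,
            "minResource": {"cpu": "1", "memory": "1G"},
            "nodeSelector": {},
            "tolerations": [],
            "affinity": {}
          }]
    spec:
      schedulerName: yunikorn
      restartPolicy: Never
      containers:
        - name: sleep30
          image: "nginx:latest"
          command: ["sleep", "999999999"]
          resources:
            requests:
              cpu: "1"
              memory: "1G"
"""


def render_manifests(count=4, prefix="gang-scheduling-job-example"):
    """Return {job name: manifest text} for jobs prefix1 .. prefix<count>."""
    names = [f"{prefix}{i}" for i in range(1, count + 1)]
    # str.replace is used instead of str.format because the template
    # contains literal JSON braces.
    return {name: TEMPLATE.replace("__NAME__", name) for name in names}


def submit(manifest, namespace="default"):
    """Pipe one manifest to `kubectl apply -f -` (reads the manifest from stdin)."""
    subprocess.run(
        ["kubectl", "apply", "-n", namespace, "-f", "-"],
        input=manifest, text=True, check=True,
    )


def submit_all(manifests):
    """Apply all manifests from separate threads, one thread per job."""
    with ThreadPoolExecutor(max_workers=len(manifests)) as pool:
        # list() forces evaluation so that any kubectl failure is raised here.
        list(pool.map(submit, manifests.values()))


# To run the reproduction (requires kubectl and cluster access):
# submit_all(render_manifests())
```

Since the problem is reported as not always reproducible, a single run of submit_all may not trigger it; deleting the jobs and repeating the submission a few times may be needed.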
--
This message was sent by Atlassian Jira
(v8.20.10#820010)