chenyulin0719 opened a new pull request, #871: URL: https://github.com/apache/yunikorn-k8shim/pull/871
### What is this PR for?
Support canonical queue and application ID labels on pods, and allow them to coexist with the existing metadata.
- yunikorn.apache.org/app-id (New, **Canonical Label**)
- yunikorn.apache.org/queue (New, **Canonical Label**)

YuniKorn will reject pods with conflicting metadata after version 1.7.0.
- Check metadata consistency before moving the task state from 'New' to 'Pending'. The pod metadata check runs in task.sanityCheckBeforeScheduling().
- Before 1.7.0, if the sanity check fails due to inconsistent metadata, log a warning message.
- After 1.7.0, if the sanity check fails due to inconsistent metadata, move the task from 'New' to 'Rejected' state and fail the pod with a reason.

The ApplicationID is fetched from the pod in the following order:
1. Label: constants.CanonicalLabelApplicationID (**New**)
2. Annotation: constants.AnnotationApplicationID
3. Label: constants.LabelApplicationID
4. Label: constants.SparkLabelAppID

The queue name is fetched from the pod in the following order:
1. Label: constants.CanonicalLabelQueueName (**New**)
2. Annotation: constants.AnnotationQueueName
3. Label: constants.LabelQueueName (**Previous: constants.LabelQueueName > constants.AnnotationQueueName**)
4. Default: constants.ApplicationDefaultQueue

### What type of PR is it?
* [X] - Feature

### Todos
- The admission controller should also fail the pod request if the metadata is inconsistent. Another Jira will be created once this PR is merged.
- Update the doc: https://yunikorn.apache.org/docs/next/user_guide/labels_and_annotations_in_yunikorn

### What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-2504

### How should this be tested?
Run the following simple sleep pods:
```
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: sleep
    yunikorn.apache.org/app-id: "application-sleep-0001"
    yunikorn.apache.org/queue: "root.sandbox"
  annotations:
    yunikorn.apache.org/queue: "root.sandbox-another"
  name: pod-with-inconsistent-queue
spec:
  schedulerName: yunikorn
  restartPolicy: Never
  containers:
    - name: sleep-6000s
      image: "alpine:latest"
      command: ["sleep", "6000"]
      resources:
        requests:
          cpu: "100m"
          memory: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: sleep
    yunikorn.apache.org/app-id: "application-sleep-0002"
  annotations:
    yunikorn.apache.org/app-id: "application-sleep-0002-another"
  name: pod-with-inconsistent-app-id
spec:
  schedulerName: yunikorn
  restartPolicy: Never
  containers:
    - name: sleep-6000s
      image: "alpine:latest"
      command: ["sleep", "6000"]
      resources:
        requests:
          cpu: "100m"
          memory: "500M"
```

Check the scheduler pod logs:
```
kubectl logs -l component=yunikorn-scheduler -n yunikorn --tail=200000 > yunikorn-scheduler-logs.txt
```

You will see warning logs like these:
```
2024-07-05T18:26:17.591Z WARN shim.cache.task cache/task.go:582 Task pod has conflicting metadata, the unbound task pod will be rejected after version 1.7.0 {"appID": "application-sleep-00002", "podName": "pod-with-inconsistent-queue", "error": "queue is not consistently set in pod's labels and annotations. [PodInconsistentMetadata]"}
2024-07-05T18:26:17.592Z WARN shim.cache.task cache/task.go:582 Task pod has conflicting metadata, the unbound task pod will be rejected after version 1.7.0 {"appID": "application-sleep-00001-annotation", "podName": "pod-with-inconsistent-app-id", "error": "application ID is not consistently set in pod's labels and annotations. [PodInconsistentMetadata]"}
```

### Screenshots (if appropriate)
Before 1.7.0, only a warning message is logged:
<img width="1106" alt="image" src="https://github.com/apache/yunikorn-k8shim/assets/26764036/4750c3d8-1ea5-46b6-a9be-7966a58e447c">

After 1.7.0, the screenshots below show the behavior without the admission controller:
(Another PR will be submitted after version 1.6.0 is released.)
(These are the original screenshots from the closed [PR](https://github.com/apache/yunikorn-k8shim/pull/860).)
<img width="960" alt="image" src="https://github.com/apache/yunikorn-k8shim/assets/26764036/22ad137f-245f-4ec8-8332-947eb4c521f4">
<img width="974" alt="image" src="https://github.com/apache/yunikorn-k8shim/assets/26764036/d28d80f2-89c0-4976-bc3e-d8bb45c169e1">
<img width="973" alt="image" src="https://github.com/apache/yunikorn-k8shim/assets/26764036/00ebf818-4115-4605-8fe8-7a3cf1a47a1f">

### Questions:
NA

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
