[
https://issues.apache.org/jira/browse/YUNIKORN-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wilfred Spiegelenburg resolved YUNIKORN-1165.
---------------------------------------------
Fix Version/s: 1.0.0
Resolution: Fixed
Changes committed. The code now behaves as described in the code comments.
> Yunikorn plugin inconsistent node assignment + scheduling
> ---------------------------------------------------------
>
> Key: YUNIKORN-1165
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1165
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: shim - kubernetes
> Reporter: Ronald Zhang
> Assignee: Craig Condit
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Although the plugin selects a specific node, the default scheduler may not
> schedule the pod on that node. In the example below, YuniKorn
> selected node host-05.shared-ek8s-dev-01.kcloud.cloudera.com. However, the
> pod was scheduled on host-01.shared-ek8s-dev-01.kcloud.cloudera.com.
> Internally, YuniKorn still incorrectly believes that the pod was scheduled
> on host-05.shared-ek8s-dev-01.kcloud.cloudera.com.
> {code:java}
> Events:
>   Type     Reason            Age   From      Message
>   ----     ------            ----  ----      -------
>   Warning  FailedScheduling  19s   yunikorn  0/10 nodes are available: 10 Pod is not ready for scheduling.
>   Warning  FailedScheduling  19s   yunikorn  0/10 nodes are available: 10 Pod is not ready for scheduling.
>   Normal   Scheduling        16s   yunikorn  dev/test-pod-1 is queued and waiting for allocation
>   Normal   QuotaApproved     16s   yunikorn  Pod dev/test-pod-1 is ready for scheduling on node host-05.shared-ek8s-dev-01.kcloud.cloudera.com
>   Normal   Scheduled         16s   yunikorn  Successfully assigned dev/test-pod-1 to host-01.shared-ek8s-dev-01.kcloud.cloudera.com
>   Normal   Pulled            15s   kubelet   Container image "docker-private.infra.cloudera.com/cloudera_base/alpine:latest" already present on machine
>   Normal   Created           15s   kubelet   Created container sleep-600s-1
>   Normal   Started           15s   kubelet   Started container sleep-600s-1
> {code}
> {code:java}
> {
>   "applicationID": "app-v1",
>   "usedResource": "[memory:50 vcore:50]",
>   "partition": "default",
>   "queueName": "root.dev",
>   "submissionTime": 1648676490424350638,
>   "allocations": [
>     {
>       "allocationKey": "d50084ea-2ae1-4057-8a9b-48cda1fad4c7",
>       "allocationTags": {
>         "kubernetes.io/label/app": "sleep",
>         "kubernetes.io/label/applicationId": "app-v1",
>         "kubernetes.io/label/component": "yunikorn-scheduler",
>         "kubernetes.io/label/queue": "root.default",
>         "kubernetes.io/meta/namespace": "dev",
>         "kubernetes.io/meta/podName": "test-pod-1"
>       },
>       "uuid": "f7e4b5d6-1e71-4d5b-bd1d-aa28067a3d4b",
>       "resource": "[memory:50 vcore:50]",
>       "priority": "0",
>       "queueName": "root.dev",
>       "nodeId": "host-05.shared-ek8s-dev-01.kcloud.cloudera.com",
>       "applicationId": "app-v1",
>       "partition": "default"
>     }
>   ],
>   "applicationState": "Starting"
> },
> {code}
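
For anyone trying to confirm the same symptom, the mismatch reported above can be checked mechanically by comparing the nodeId in YuniKorn's allocation record with the node the pod was actually bound to. The following is only a minimal sketch, not part of the fix: the helper name find_mismatch and the hard-coded ACTUAL_NODES mapping are hypothetical, and in a real cluster the bound node would be read from the pod's spec.nodeName rather than typed in by hand.

```python
import json

# Allocation record from the issue, trimmed to the fields this check needs.
ALLOCATION_JSON = """
{
  "allocationKey": "d50084ea-2ae1-4057-8a9b-48cda1fad4c7",
  "allocationTags": {
    "kubernetes.io/meta/namespace": "dev",
    "kubernetes.io/meta/podName": "test-pod-1"
  },
  "nodeId": "host-05.shared-ek8s-dev-01.kcloud.cloudera.com"
}
"""

# Assumption: node each pod is really bound to, copied here from the
# "Scheduled" event in the issue. Normally this comes from spec.nodeName.
ACTUAL_NODES = {
    ("dev", "test-pod-1"): "host-01.shared-ek8s-dev-01.kcloud.cloudera.com",
}


def find_mismatch(allocation, actual_nodes):
    """Hypothetical helper: return (pod, recorded_node, actual_node) when the
    allocation's nodeId differs from the node the pod is bound to, else None."""
    tags = allocation["allocationTags"]
    pod = (
        tags["kubernetes.io/meta/namespace"],
        tags["kubernetes.io/meta/podName"],
    )
    recorded = allocation["nodeId"]
    actual = actual_nodes.get(pod)
    # A pod that is not bound yet (no entry) is not treated as a mismatch.
    if actual is not None and actual != recorded:
        return pod, recorded, actual
    return None


mismatch = find_mismatch(json.loads(ALLOCATION_JSON), ACTUAL_NODES)
if mismatch:
    (namespace, name), recorded, actual = mismatch
    print(f"{namespace}/{name}: YuniKorn recorded {recorded}, pod is bound to {actual}")
```

With the data from this issue the check reports dev/test-pod-1 as inconsistent; once the recorded and bound nodes agree it returns None.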