Hi Patrick,

Thanks for reaching out. I have one question about how you limit things so that only two pods run at a time: are you running this on a single-node cluster and enforcing the limit at the node resource level?

Based on the use case you described, my guess is that the scheduler creates Reservations for B-1 and A-3, and when those reservations are unreserved, the FIFO ordering is not strictly honored.

Have you set any queue quota? Here is the doc about setting up the queue mapping with a quota: http://yunikorn.apache.org/docs/next/user_guide/resource_quota_management#namespace-to-queue-mapping. If we limit the running pods by quota, then I think we will see the expected behavior. The document about the application sorting policy is here: http://yunikorn.apache.org/docs/next/user_guide/sorting_policies#application-sorting .
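As a rough sketch of what that quota doc describes (the queue name and resource values below are illustrative, not taken from your cluster), a queues.yaml fragment in the yunikorn-configs ConfigMap could map namespaces to queues and cap one of them so that only two of your pods fit:

```yaml
# Illustrative queues.yaml fragment; see the linked
# resource_quota_management doc for the exact schema.
partitions:
  - name: default
    placementrules:
      - name: tag          # map each pod to a queue named after its namespace
        value: namespace
        create: true
    queues:
      - name: root
        queues:
          - name: sandbox  # hypothetical queue for your test namespace
            properties:
              application.sort.policy: fifo
            resources:
              max:
                # Values chosen so roughly two 1Gi / 2-core pods fit at once;
                # check the units (MB, cores vs. millicores) against the docs
                # for your YuniKorn version before using.
                memory: 2048
                vcore: 4
```

With a queue-level max like this, the limit is enforced by the scheduler's quota accounting rather than by node capacity, which should avoid the reservation path entirely.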
Weiwei

On Fri, Feb 5, 2021 at 1:04 PM Patrick, Alton (US) <[email protected]> wrote:

> Hi. I have just started using Yunikorn for scheduling with K8s. I have
> been doing some simple experiments to make sure I understand how it works.
> Most of them are working as expected, but there is one I don't understand.
>
> I used Helm to deploy Yunikorn and I did not modify anything, so I have
> the default setup: one queue per namespace, and I assume the application
> sort policy is the default of "FifoSortPolicy."
>
> I create four pods. All of them have the resource requests set the same (2
> cores, 1Gi mem), and the resource requests are such that only two can run
> at a time. The pods are created in this order, with a 1 second gap between
> each:
>
> 1. A-1, applicationId = A, sleeps for 10s
> 2. A-2, applicationId = A, sleeps for 5s
> 3. B-1, applicationId = B, sleeps for 5s
> 4. A-3, applicationId = A, sleeps for 5s
>
> What I expect to see is:
> * A-1 is scheduled
> * A-2 is scheduled
> * A-2 finishes
> * A-3 is scheduled (because A is the first application created, as long as
> there are pods in the queue for application A I understand that they should
> have priority over pods for application B)
>
> What I see instead is that after A-2 finishes, B-1 gets scheduled to run.
>
> Is this the expected behavior, and if so can someone explain what is wrong
> with my understanding?
>
> Additionally, in the logs for the scheduler pod, right after pod B-1 gets
> scheduled, I see the following messages repeated thousands of times very
> fast (over 2000 instances in about .25s according to timestamps in log). Is
> this normal?
>
> 2021-02-05T20:25:34.479Z DEBUG scheduler/scheduling_application.go:641
> skipping node for allocation: basic condition not satisfied {"node":
> "local-node", "allocationKey": "258d9947-e92b-4967-9758-08eee62f4d1b",
> "error": "pre alloc check: requested resource map[memory:1074 vcore:2000]
> is larger than currently available map[ephemeral-storage:50977832921
> hugepages-2Mi:0 memory:13250 pods:110 vcore:1600] resource on local-node"}
> 2021-02-05T20:25:34.479Z DEBUG scheduler/scheduling_node.go:271 requested
> resource is larger than currently available node resources {"nodeID":
> "local-node", "requested": "map[memory:1074 vcore:2000]", "available":
> "map[ephemeral-storage:50977832921 hugepages-2Mi:0 memory:13250 pods:110
> vcore:1600]"}
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
