Wilfred Spiegelenburg created YUNIKORN-1790:
-----------------------------------------------
Summary: TestTryAllocatePreemptNode: flaky test
Key: YUNIKORN-1790
URL: https://issues.apache.org/jira/browse/YUNIKORN-1790
Project: Apache YuniKorn
Issue Type: Improvement
Components: test - unit
Reporter: Wilfred Spiegelenburg
The TestTryAllocatePreemptNode fails sometimes with a different allocation
being preempted than the expected alloc1:
{code:java}
2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured
queue added to scheduler {"queueName": "root"}
2023-06-07T12:40:35.035+1000 INFO objects/queue.go:1190 updating root
queue max resources {"current max": "nil resource", "new max":
"map[first:40]"}
2023-06-07T12:40:35.035+1000 DEBUG objects/queue.go:850 new parent
queue inheriting template from parent queue {"child queue": "root.parent",
"parent queue": "root"}
2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured
queue added to scheduler {"queueName": "root.parent"}
2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources
setting ignored: cannot set zero max resources
2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:350 guaranteed
resources setting ignored: cannot set zero max resources
2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured
queue added to scheduler {"queueName": "root.unlimited"}
2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources
setting ignored: cannot set zero max resources2023-06-07T12:40:35.036+1000
INFO objects/queue.go:139 configured queue added to scheduler
{"queueName": "root.parent.child1"}
2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources
setting ignored: cannot set zero max resources
2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured
queue added to scheduler {"queueName": "root.parent.child2"}
2023-06-07T12:40:35.036+1000 INFO objects/application.go:184 Unknown
gang scheduling style, using soft style as default {"gang scheduling
style": ""}
2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:1780 Unknown
application {"applicationID": "app-0"}
2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-0", "source": "New",
"destination": "Accepted", "event": "runApplication"}
2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
added successfully to application {"appID": "app-0", "ask": "alloc0-0",
"placeholder": false, "pendingDelta": "map[first:11]"}
2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
added successfully to application {"appID": "app-0", "ask": "alloc0-1",
"placeholder": false, "pendingDelta": "map[first:11]"}
2023-06-07T12:40:35.037+1000 INFO objects/application.go:184 Unknown
gang scheduling style, using soft style as default {"gang scheduling
style": ""}
2023-06-07T12:40:35.037+1000 DEBUG objects/queue.go:1780 Unknown
application {"applicationID": "app-1"}
2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-1", "source": "New",
"destination": "Accepted", "event": "runApplication"}
2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
added successfully to application {"appID": "app-1", "ask": "alloc1",
"placeholder": false, "pendingDelta": "map[first:5]"}
2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
added successfully to application {"appID": "app-1", "ask": "alloc2",
"placeholder": false, "pendingDelta": "map[first:5]"}
2023-06-07T12:40:35.038+1000 INFO objects/application.go:184 Unknown
gang scheduling style, using soft style as default {"gang scheduling
style": ""}
2023-06-07T12:40:35.038+1000 DEBUG objects/queue.go:1780 Unknown
application {"applicationID": "app-2"}
2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-2", "source": "New",
"destination": "Accepted", "event": "runApplication"}
2023-06-07T12:40:35.038+1000 INFO objects/application.go:668 ask
added successfully to application {"appID": "app-2", "ask": "alloc3",
"placeholder": false, "pendingDelta": "map[first:5]"}
2023-06-07T12:40:35.038+1000 DEBUG objects/application.go:339
Application state timer initiated {"appID": "app-0", "state": "Starting",
"timeout": "5m0s"}
2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-0", "source": "Accepted",
"destination": "Starting", "event": "runApplication"}
2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:63 Increasing
resource usage {"user": "testuser", "queue path": "root.unlimited",
"application": "app-0", "resource": "map[first:11]"}
2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:45 Creating queue
tracker object for queue {"queue": "unlimited"}
2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:265 Group tracker
already exists and linking (reusing) the same with application
{"application": "app-0", "queue path": "root.unlimited", "user": "testuser",
"group": "testgroup"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:45 Creating queue
tracker object for queue {"queue": "unlimited"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.039+1000 DEBUG objects/application.go:388
Application state timer cleared {"appID": "app-0", "state": "Starting"}
2023-06-07T12:40:35.039+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-0", "source": "Starting",
"destination": "Running", "event": "runApplication"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/manager.go:63 Increasing
resource usage {"user": "testuser", "queue path": "root.unlimited",
"application": "app-0", "resource": "map[first:11]"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "unlimited", "application": "app-0",
"resource": "map[first:11]"}
2023-06-07T12:40:35.040+1000 DEBUG objects/application.go:339
Application state timer initiated {"appID": "app-1", "state": "Starting",
"timeout": "5m0s"}
2023-06-07T12:40:35.040+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-1", "source": "Accepted",
"destination": "Starting", "event": "runApplication"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/manager.go:63 Increasing
resource usage {"user": "testuser", "queue path": "root.parent.child1",
"application": "app-1", "resource": "map[first:5]"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.parent.child1", "application":
"app-1", "resource": "map[first:5]"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "parent.child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.parent.child1", "application":
"app-1", "resource": "map[first:5]"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "parent.child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:388
Application state timer cleared {"appID": "app-1", "state": "Starting"}
2023-06-07T12:40:35.041+1000 INFO objects/application_state.go:133
Application state transition {"appID": "app-1", "source": "Starting",
"destination": "Running", "event": "runApplication"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/manager.go:63 Increasing
resource usage {"user": "testuser", "queue path": "root.parent.child1",
"application": "app-1", "resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.parent.child1", "application":
"app-1", "resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "parent.child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "root.parent.child1", "application":
"app-1", "resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "parent.child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
resource usage {"queue path": "child1", "application": "app-1",
"resource": "map[first:5]"}
2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:1454 app
reservation check {"allocationKey": "alloc3", "createTime":
"2023-06-07T12:40:25.038+1000", "askAge": "10.003814459s", "reservationDelay":
"2s"}
2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app
reservation check {"allocationKey": "alloc3", "createTime":
"2023-06-07T12:40:25.038+1000", "askAge": "10.003914s", "reservationDelay":
"2s"}
2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1470 found
candidate node for app reservation {"appID": "app-2", "nodeID": "node1",
"allocationKey": "alloc3", "reservations": 0, "pendingRepeats": 1}
2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app
reservation check {"allocationKey": "alloc3", "createTime":
"2023-06-07T12:39:55.038+1000", "askAge": "40.004148584s", "reservationDelay":
"2s"}
2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app
reservation check {"allocationKey": "alloc3", "createTime":
"2023-06-07T12:39:55.038+1000", "askAge": "40.004227875s", "reservationDelay":
"2s"}
2023-06-07T12:40:35.042+1000 DEBUG objects/preemption.go:352 No RM
callback plugin registered, using first selected node for preemption
{"NodeID": "node2", "AllocationKey": "alloc3"}
2023-06-07T12:40:35.043+1000 ERROR objects/preemption.go:517 BUG:
Didn't find instance type in the nodeInstanceTypeMap {"nodeId": "node2"}
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).tryNodes
/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:517
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).TryPreemption
/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:537
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryPreemption
/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1327
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryAllocate
/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1050
github.com/apache/yunikorn-core/pkg/scheduler/objects.TestTryAllocatePreemptNode
/Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application_test.go:1874
testing.tRunner /usr/local/go/src/testing/testing.go:1576
2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:559
Preempting task {"applicationID": "app-1", "allocationKey": "alloc2", "nodeID":
"node2", "resources": "map[first:5]"}
2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:580
Reserving node for ask after preemption {"allocationKey": "alloc3", "nodeID":
"node2", "victimCount": 1}
--- FAIL: TestTryAllocatePreemptNode (0.06s)
application_test.go:1876: assertion failed: expression is false:
alloc1.IsPreempted(): alloc1 should have been preempted {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]