[
https://issues.apache.org/jira/browse/YUNIKORN-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated YUNIKORN-1790:
-------------------------------------
Labels: flaky-test pull-request-available (was: flaky-test)
> TestTryAllocatePreemptNode: flaky test
> --------------------------------------
>
> Key: YUNIKORN-1790
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1790
> Project: Apache YuniKorn
> Issue Type: Improvement
> Components: test - unit
> Reporter: Wilfred Spiegelenburg
> Assignee: Qi Zhu
> Priority: Major
> Labels: flaky-test, pull-request-available
>
> The TestTryAllocatePreemptNode fails sometimes with a different allocation
> being preempted than the expected alloc1:
> {code:java}
> 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured
> queue added to scheduler {"queueName": "root"}
> 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:1190 updating root
> queue max resources {"current max": "nil resource", "new max":
> "map[first:40]"}
> 2023-06-07T12:40:35.035+1000 DEBUG objects/queue.go:850 new parent
> queue inheriting template from parent queue {"child queue": "root.parent",
> "parent queue": "root"}
> 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured
> queue added to scheduler {"queueName": "root.parent"}
> 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources
> setting ignored: cannot set zero max resources
> 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:350 guaranteed
> resources setting ignored: cannot set zero max resources
> 2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured
> queue added to scheduler {"queueName": "root.unlimited"}
> 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources
> setting ignored: cannot set zero max resources2023-06-07T12:40:35.036+1000
> INFO objects/queue.go:139 configured queue added to scheduler
> {"queueName": "root.parent.child1"}
> 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources
> setting ignored: cannot set zero max resources
> 2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured
> queue added to scheduler {"queueName": "root.parent.child2"}
> 2023-06-07T12:40:35.036+1000 INFO objects/application.go:184 Unknown
> gang scheduling style, using soft style as default {"gang scheduling
> style": ""}
> 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:1780 Unknown
> application {"applicationID": "app-0"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-0", "source": "New",
> "destination": "Accepted", "event": "runApplication"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
> added successfully to application {"appID": "app-0", "ask": "alloc0-0",
> "placeholder": false, "pendingDelta": "map[first:11]"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
> added successfully to application {"appID": "app-0", "ask": "alloc0-1",
> "placeholder": false, "pendingDelta": "map[first:11]"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application.go:184 Unknown
> gang scheduling style, using soft style as default {"gang scheduling
> style": ""}
> 2023-06-07T12:40:35.037+1000 DEBUG objects/queue.go:1780 Unknown
> application {"applicationID": "app-1"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-1", "source": "New",
> "destination": "Accepted", "event": "runApplication"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
> added successfully to application {"appID": "app-1", "ask": "alloc1",
> "placeholder": false, "pendingDelta": "map[first:5]"}
> 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask
> added successfully to application {"appID": "app-1", "ask": "alloc2",
> "placeholder": false, "pendingDelta": "map[first:5]"}
> 2023-06-07T12:40:35.038+1000 INFO objects/application.go:184 Unknown
> gang scheduling style, using soft style as default {"gang scheduling
> style": ""}
> 2023-06-07T12:40:35.038+1000 DEBUG objects/queue.go:1780 Unknown
> application {"applicationID": "app-2"}
> 2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-2", "source": "New",
> "destination": "Accepted", "event": "runApplication"}
> 2023-06-07T12:40:35.038+1000 INFO objects/application.go:668 ask
> added successfully to application {"appID": "app-2", "ask": "alloc3",
> "placeholder": false, "pendingDelta": "map[first:5]"}
> 2023-06-07T12:40:35.038+1000 DEBUG objects/application.go:339
> Application state timer initiated {"appID": "app-0", "state":
> "Starting", "timeout": "5m0s"}
> 2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-0", "source": "Accepted",
> "destination": "Starting", "event": "runApplication"}
> 2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:63 Increasing
> resource usage {"user": "testuser", "queue path": "root.unlimited",
> "application": "app-0", "resource": "map[first:11]"}
> 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:45 Creating queue
> tracker object for queue {"queue": "unlimited"}
> 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:265 Group tracker
> already exists and linking (reusing) the same with application
> {"application": "app-0", "queue path": "root.unlimited", "user": "testuser",
> "group": "testgroup"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:45 Creating queue
> tracker object for queue {"queue": "unlimited"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000 DEBUG objects/application.go:388
> Application state timer cleared {"appID": "app-0", "state": "Starting"}
> 2023-06-07T12:40:35.039+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-0", "source": "Starting",
> "destination": "Running", "event": "runApplication"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/manager.go:63 Increasing
> resource usage {"user": "testuser", "queue path": "root.unlimited",
> "application": "app-0", "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "unlimited", "application": "app-0",
> "resource": "map[first:11]"}
> 2023-06-07T12:40:35.040+1000 DEBUG objects/application.go:339
> Application state timer initiated {"appID": "app-1", "state":
> "Starting", "timeout": "5m0s"}
> 2023-06-07T12:40:35.040+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-1", "source": "Accepted",
> "destination": "Starting", "event": "runApplication"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/manager.go:63 Increasing
> resource usage {"user": "testuser", "queue path": "root.parent.child1",
> "application": "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.parent.child1", "application":
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "parent.child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.parent.child1", "application":
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "parent.child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:388
> Application state timer cleared {"appID": "app-1", "state": "Starting"}
> 2023-06-07T12:40:35.041+1000 INFO objects/application_state.go:133
> Application state transition {"appID": "app-1", "source": "Starting",
> "destination": "Running", "event": "runApplication"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/manager.go:63 Increasing
> resource usage {"user": "testuser", "queue path": "root.parent.child1",
> "application": "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.parent.child1", "application":
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "parent.child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "root.parent.child1", "application":
> "app-1", "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "parent.child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing
> resource usage {"queue path": "child1", "application": "app-1",
> "resource": "map[first:5]"}
> 2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:1454 app
> reservation check {"allocationKey": "alloc3", "createTime":
> "2023-06-07T12:40:25.038+1000", "askAge": "10.003814459s",
> "reservationDelay": "2s"}
> 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app
> reservation check {"allocationKey": "alloc3", "createTime":
> "2023-06-07T12:40:25.038+1000", "askAge": "10.003914s", "reservationDelay":
> "2s"}
> 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1470 found
> candidate node for app reservation {"appID": "app-2", "nodeID":
> "node1", "allocationKey": "alloc3", "reservations": 0, "pendingRepeats": 1}
> 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app
> reservation check {"allocationKey": "alloc3", "createTime":
> "2023-06-07T12:39:55.038+1000", "askAge": "40.004148584s",
> "reservationDelay": "2s"}
> 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app
> reservation check {"allocationKey": "alloc3", "createTime":
> "2023-06-07T12:39:55.038+1000", "askAge": "40.004227875s",
> "reservationDelay": "2s"}
> 2023-06-07T12:40:35.042+1000 DEBUG objects/preemption.go:352 No RM
> callback plugin registered, using first selected node for preemption
> {"NodeID": "node2", "AllocationKey": "alloc3"}
> 2023-06-07T12:40:35.043+1000 ERROR objects/preemption.go:517 BUG:
> Didn't find instance type in the nodeInstanceTypeMap {"nodeId": "node2"}
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).tryNodes
>
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:517
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).TryPreemption
>
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:537
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryPreemption
>
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1327
> github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryAllocate
>
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1050
> github.com/apache/yunikorn-core/pkg/scheduler/objects.TestTryAllocatePreemptNode
>
> /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application_test.go:1874
> testing.tRunner /usr/local/go/src/testing/testing.go:1576
> 2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:559
> Preempting task {"applicationID": "app-1", "allocationKey": "alloc2",
> "nodeID": "node2", "resources": "map[first:5]"}
> 2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:580
> Reserving node for ask after preemption {"allocationKey": "alloc3", "nodeID":
> "node2", "victimCount": 1}
> --- FAIL: TestTryAllocatePreemptNode (0.06s)
> application_test.go:1876: assertion failed: expression is false:
> alloc1.IsPreempted(): alloc1 should have been preempted {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]