Peter Bacsko created YUNIKORN-1108:
--------------------------------------
Summary: Fix race conditions in dispatcher_test.go
Key: YUNIKORN-1108
URL: https://issues.apache.org/jira/browse/YUNIKORN-1108
Project: Apache YuniKorn
Issue Type: Bug
Components: shim - kubernetes, test - unit
Reporter: Peter Bacsko
Assignee: Peter Bacsko
A dispatcher test failed locally, and the Go race detector also reported a data race:
{noformat}
2022-03-08T16:19:15.795+0100 INFO log/logger.go:89 scheduler
configuration, pretty print {"configs": "{\n \"schedulerName\":
\"yunikorn\",\n \"clusterId\": \"my-kube-cluster\",\n \"clusterVersion\":
\"0.1\",\n \"policyGroup\": \"queues\",\n \"schedulingIntervalSecond\":
1000000000,\n \"absoluteKubeConfigFilePath\": \"\",\n \"loggingLevel\": 0,\n
\"logEncoding\": \"console\",\n \"logFilePath\": \"\",\n \"volumeBindTimeout\":
10000000000,\n \"testMode\": false,\n \"eventChannelCapacity\": 1048576,\n
\"dispatchTimeout\": 300000000000,\n \"kubeQPS\": 1000,\n \"kubeBurst\":
1000,\n \"predicates\": \"\",\n \"operatorPlugins\":
\"general,yunikorn-app\",\n \"enableConfigHotRefresh\": false,\n
\"disableGangScheduling\": false,\n \"userLabelKey\":
\"yunikorn.apache.org/username\"\n}"}
2022-03-08T16:19:15.796+0100 INFO dispatcher/dispatcher.go:80 Init
dispatcher {"EventChannelCapacity": 1048576, "AsyncDispatchLimit": 104857,
"DispatchTimeoutInSeconds": 300}
2022-03-08T16:19:15.797+0100 INFO dispatcher/dispatcher.go:183
starting the dispatcher
2022-03-08T16:19:15.798+0100 INFO dispatcher/dispatcher.go:179
dispatcher is draining out
--- FAIL: TestDispatcherStartStop (0.01s)
dispatcher_test.go:119: assertion failed: 1 (int) != 2 (int)
==================
WARNING: DATA RACE
Write at 0x00c0003d2140 by goroutine 37:
github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.TestEventWillNotBeLostWhenEventChannelIsFull()
/home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher_test.go:145
+0xda
testing.tRunner()
/snap/go/9028/src/testing/testing.go:1259 +0x22f
testing.(*T).Run·dwrap·21()
/snap/go/9028/src/testing/testing.go:1306 +0x47
Previous read at 0x00c0003d2140 by goroutine 35:
github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start.func1()
/home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:187
+0x6b
Goroutine 37 (running) created at:
testing.(*T).Run()
/snap/go/9028/src/testing/testing.go:1306 +0x726
testing.runTests.func1()
/snap/go/9028/src/testing/testing.go:1598 +0x99
testing.tRunner()
/snap/go/9028/src/testing/testing.go:1259 +0x22f
testing.runTests()
/snap/go/9028/src/testing/testing.go:1596 +0x7ca
testing.(*M).Run()
/snap/go/9028/src/testing/testing.go:1504 +0x9d1
main.main()
_testmain.go:99 +0x324
Goroutine 35 (running) created at:
github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start()
/home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:184
+0x84
github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.TestDispatcherStartStop()
/home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher_test.go:104
+0x189
testing.tRunner()
/snap/go/9028/src/testing/testing.go:1259 +0x22f
testing.(*T).Run·dwrap·21()
/snap/go/9028/src/testing/testing.go:1306 +0x47
==================
{noformat}
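The problem is that even after {{dispatcher.drain()}} returns and the event
channel is empty, an event may still be being processed. The trace above shows
the consequence: the dispatcher goroutine started in {{TestDispatcherStartStop}}
is still reading while {{TestEventWillNotBeLostWhenEventChannelIsFull}} writes.
The simplest solution seems to be calling {{Stop()}} right after
{{dispatcher.drain()}}.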