Peter Bacsko created YUNIKORN-1108:
--------------------------------------

             Summary: Fix race conditions in dispatcher_test.go
                 Key: YUNIKORN-1108
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1108
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: shim - kubernetes, test - unit
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko


A dispatcher test failed locally, and the Go race detector also reported a data race:

{noformat}
2022-03-08T16:19:15.795+0100    INFO    log/logger.go:89        scheduler configuration, pretty print   {"configs": "{\n \"schedulerName\": \"yunikorn\",\n \"clusterId\": \"my-kube-cluster\",\n \"clusterVersion\": \"0.1\",\n \"policyGroup\": \"queues\",\n \"schedulingIntervalSecond\": 1000000000,\n \"absoluteKubeConfigFilePath\": \"\",\n \"loggingLevel\": 0,\n \"logEncoding\": \"console\",\n \"logFilePath\": \"\",\n \"volumeBindTimeout\": 10000000000,\n \"testMode\": false,\n \"eventChannelCapacity\": 1048576,\n \"dispatchTimeout\": 300000000000,\n \"kubeQPS\": 1000,\n \"kubeBurst\": 1000,\n \"predicates\": \"\",\n \"operatorPlugins\": \"general,yunikorn-app\",\n \"enableConfigHotRefresh\": false,\n \"disableGangScheduling\": false,\n \"userLabelKey\": \"yunikorn.apache.org/username\"\n}"}
2022-03-08T16:19:15.796+0100    INFO    dispatcher/dispatcher.go:80     Init dispatcher {"EventChannelCapacity": 1048576, "AsyncDispatchLimit": 104857, "DispatchTimeoutInSeconds": 300}
2022-03-08T16:19:15.797+0100    INFO    dispatcher/dispatcher.go:183    starting the dispatcher
2022-03-08T16:19:15.798+0100    INFO    dispatcher/dispatcher.go:179    dispatcher is draining out
--- FAIL: TestDispatcherStartStop (0.01s)
    dispatcher_test.go:119: assertion failed: 1 (int) != 2 (int)
==================
WARNING: DATA RACE
Write at 0x00c0003d2140 by goroutine 37:
  github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.TestEventWillNotBeLostWhenEventChannelIsFull()
      /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher_test.go:145 +0xda
  testing.tRunner()
      /snap/go/9028/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /snap/go/9028/src/testing/testing.go:1306 +0x47

Previous read at 0x00c0003d2140 by goroutine 35:
  github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start.func1()
      /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:187 +0x6b

Goroutine 37 (running) created at:
  testing.(*T).Run()
      /snap/go/9028/src/testing/testing.go:1306 +0x726
  testing.runTests.func1()
      /snap/go/9028/src/testing/testing.go:1598 +0x99
  testing.tRunner()
      /snap/go/9028/src/testing/testing.go:1259 +0x22f
  testing.runTests()
      /snap/go/9028/src/testing/testing.go:1596 +0x7ca
  testing.(*M).Run()
      /snap/go/9028/src/testing/testing.go:1504 +0x9d1
  main.main()
      _testmain.go:99 +0x324

Goroutine 35 (running) created at:
  github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start()
      /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:184 +0x84
  github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.TestDispatcherStartStop()
      /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher_test.go:104 +0x189
  testing.tRunner()
      /snap/go/9028/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /snap/go/9028/src/testing/testing.go:1306 +0x47
==================
{noformat}

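The problem is that even after {{dispatcher.drain()}} is called and the event
channel is empty, the dispatcher goroutine might still be processing the last
event it took from the channel. In the trace above, the goroutine started by
{{TestDispatcherStartStop}} is still reading in {{Start.func1()}}
(dispatcher.go:187) when {{TestEventWillNotBeLostWhenEventChannelIsFull}}
writes to the same address (dispatcher_test.go:145).

The simplest solution seems to be calling {{Stop()}} right after
{{dispatcher.drain()}} in the tests.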



