[ 
https://issues.apache.org/jira/browse/YUNIKORN-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved YUNIKORN-1108.
---------------------------------------------
    Resolution: Fixed

Changes to re-implement the dispatcher handling in the tests have been 
committed. The change also cleans up the closure of the dispatcher so that 
it works in all cases.

Thank you [~pbacsko] 

> Fix race conditions in dispatcher_test.go
> -----------------------------------------
>
>                 Key: YUNIKORN-1108
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1108
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes, test - unit
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>
> A dispatcher test failed locally and there was also a data race observed:
> {noformat}
> 2022-03-08T16:19:15.795+0100  INFO    log/logger.go:89        scheduler 
> configuration, pretty print   {"configs": "{\n \"schedulerName\": 
> \"yunikorn\",\n \"clusterId\": \"my-kube-cluster\",\n \"clusterVersion\": 
> \"0.1\",\n \"policyGroup\": \"queues\",\n \"schedulingIntervalSecond\": 
> 1000000000,\n \"absoluteKubeConfigFilePath\": \"\",\n \"loggingLevel\": 0,\n 
> \"logEncoding\": \"console\",\n \"logFilePath\": \"\",\n 
> \"volumeBindTimeout\": 10000000000,\n \"testMode\": false,\n 
> \"eventChannelCapacity\": 1048576,\n \"dispatchTimeout\": 300000000000,\n 
> \"kubeQPS\": 1000,\n \"kubeBurst\": 1000,\n \"predicates\": \"\",\n 
> \"operatorPlugins\": \"general,yunikorn-app\",\n \"enableConfigHotRefresh\": 
> false,\n \"disableGangScheduling\": false,\n \"userLabelKey\": 
> \"yunikorn.apache.org/username\"\n}"}
> 2022-03-08T16:19:15.796+0100  INFO    dispatcher/dispatcher.go:80     Init 
> dispatcher {"EventChannelCapacity": 1048576, "AsyncDispatchLimit": 104857, 
> "DispatchTimeoutInSeconds": 300}
> 2022-03-08T16:19:15.797+0100  INFO    dispatcher/dispatcher.go:183    
> starting the dispatcher
> 2022-03-08T16:19:15.798+0100  INFO    dispatcher/dispatcher.go:179    
> dispatcher is draining out
> --- FAIL: TestDispatcherStartStop (0.01s)
>     dispatcher_test.go:119: assertion failed: 1 (int) != 2 (int)
> ==================
> WARNING: DATA RACE
> Write at 0x00c0003d2140 by goroutine 37:
>   
> github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.TestEventWillNotBeLostWhenEventChannelIsFull()
>       
> /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher_test.go:145
>  +0xda
>   testing.tRunner()
>       /snap/go/9028/src/testing/testing.go:1259 +0x22f
>   testing.(*T).Run·dwrap·21()
>       /snap/go/9028/src/testing/testing.go:1306 +0x47
> Previous read at 0x00c0003d2140 by goroutine 35:
>   github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start.func1()
>       
> /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:187
>  +0x6b
> Goroutine 37 (running) created at:
>   testing.(*T).Run()
>       /snap/go/9028/src/testing/testing.go:1306 +0x726
>   testing.runTests.func1()
>       /snap/go/9028/src/testing/testing.go:1598 +0x99
>   testing.tRunner()
>       /snap/go/9028/src/testing/testing.go:1259 +0x22f
>   testing.runTests()
>       /snap/go/9028/src/testing/testing.go:1596 +0x7ca
>   testing.(*M).Run()
>       /snap/go/9028/src/testing/testing.go:1504 +0x9d1
>   main.main()
>       _testmain.go:99 +0x324
> Goroutine 35 (running) created at:
>   github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start()
>       
> /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:184
>  +0x84
>   
> github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.TestDispatcherStartStop()
>       
> /home/bacskop/repos/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher_test.go:104
>  +0x189
>   testing.tRunner()
>       /snap/go/9028/src/testing/testing.go:1259 +0x22f
>   testing.(*T).Run·dwrap·21()
>       /snap/go/9028/src/testing/testing.go:1306 +0x47
> ==================
> {noformat}
> The problem is that even if we call {{dispatcher.drain()}} and the event 
> channel is empty, processing an event might still be in progress. 
> The simplest solution seems to be placing {{Stop()}} right after 
> {{dispatcher.drain()}}. 
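
The ordering described above can be illustrated with a minimal, self-contained Go sketch. It is not the YuniKorn dispatcher code: the type miniDispatcher and the functions start, drain, stopAndWait and runSketch are hypothetical names invented for this illustration. The sketch assumes a single consumer goroutine and a stop call that blocks until that goroutine has returned; whether the real Stop() behaves this way is not stated in the issue.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// miniDispatcher is a hypothetical stand-in for a dispatcher with one
// consumer goroutine. It is not the YuniKorn implementation.
type miniDispatcher struct {
	events  chan int
	stop    chan struct{}
	done    sync.WaitGroup
	handled int32 // accessed only through sync/atomic
}

func newMiniDispatcher(capacity int) *miniDispatcher {
	return &miniDispatcher{
		events: make(chan int, capacity),
		stop:   make(chan struct{}),
	}
}

// start launches the single consumer goroutine.
func (d *miniDispatcher) start() {
	d.done.Add(1)
	go func() {
		defer d.done.Done()
		for {
			select {
			case <-d.stop:
				return
			case <-d.events:
				// Simulate a slow handler: the channel can already be
				// empty while this event is still being processed.
				time.Sleep(5 * time.Millisecond)
				atomic.AddInt32(&d.handled, 1)
			}
		}
	}()
}

// drain returns once the event channel is empty. It does NOT guarantee
// that the last received event has finished processing.
func (d *miniDispatcher) drain() {
	for len(d.events) > 0 {
		time.Sleep(time.Millisecond)
	}
}

// stopAndWait signals the consumer to exit and blocks until it has
// returned, so any in-flight event is fully handled afterwards.
func (d *miniDispatcher) stopAndWait() {
	close(d.stop)
	d.done.Wait()
}

// runSketch sends n events, drains, stops, and reports how many events
// were handled. Reading the counter only after stopAndWait avoids
// observing a half-processed state.
func runSketch(n int) int32 {
	d := newMiniDispatcher(n)
	d.start()
	for i := 0; i < n; i++ {
		d.events <- i
	}
	d.drain()
	d.stopAndWait()
	return atomic.LoadInt32(&d.handled)
}

func main() {
	fmt.Println(runSketch(3))
}
```

If the assertion in a test ran directly after drain() without the blocking stop, the counter could still be one short, which matches the "1 (int) != 2 (int)" style of failure shown in the log. The same sketch can be checked with "go run -race" to confirm that no data race is reported under these assumptions.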



