Yu-Lin Chen created YUNIKORN-2576:
-------------------------------------
Summary: Data Race: Flaky tests in dispatcher_test.go
Key: YUNIKORN-2576
URL: https://issues.apache.org/jira/browse/YUNIKORN-2576
Project: Apache YuniKorn
Issue Type: Bug
Reporter: Yu-Lin Chen
Attachments: shim-race.txt
How to reproduce:
# In Shim, run 'go test ./pkg/... -race -count=10 > shim-race.txt'
{code:java}
WARNING: DATA RACE
Write at 0x0000035315e0 by goroutine 88:
github.com/apache/yunikorn-k8shim/pkg/dispatcher.initDispatcher()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:73
+0x2c4
github.com/apache/yunikorn-k8shim/pkg/dispatcher.createDispatcher()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:305
+0x2f
runtime.Goexit()
/usr/local/go/src/runtime/panic.go:626 +0x5d
testing.(*T).FailNow()
<autogenerated>:1 +0x31
gotest.tools/v3/assert.Equal()
/home/chenyulin0719/go/pkg/mod/gotest.tools/[email protected]/assert/assert.go:205
+0x1aa
github.com/apache/yunikorn-k8shim/pkg/dispatcher.TestDispatchTimeout()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:244
+0x2ba
testing.tRunner()
/usr/local/go/src/testing/testing.go:1689 +0x21e
testing.(*T).Run.gowrap1()
/usr/local/go/src/testing/testing.go:1742 +0x44Previous read at
0x0000035315e0 by goroutine 90:
github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch.func1()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:188
+0x2f5
github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch.gowrap1()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:197
+0x6eGoroutine 88 (running) created at:
testing.(*T).Run()
/usr/local/go/src/testing/testing.go:1742 +0x825
testing.runTests.func1()
/usr/local/go/src/testing/testing.go:2161 +0x85
testing.tRunner()
/usr/local/go/src/testing/testing.go:1689 +0x21e
testing.runTests()
/usr/local/go/src/testing/testing.go:2159 +0x8be
testing.(*M).Run()
/usr/local/go/src/testing/testing.go:2027 +0xf17
main.main()
_testmain.go:55 +0x2bdGoroutine 90 (running) created at:
github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:178
+0x391
github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).dispatch()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:164
+0xbb
github.com/apache/yunikorn-k8shim/pkg/dispatcher.Dispatch()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:142
+0x71
github.com/apache/yunikorn-k8shim/pkg/dispatcher.TestDispatchTimeout()
/home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:232
+0x244
testing.tRunner()
/usr/local/go/src/testing/testing.go:1689 +0x21e
testing.(*T).Run.gowrap1()
/usr/local/go/src/testing/testing.go:1742 +0x44
=================={code}
Root Cause:
* The [globla
vairables|https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L46-L51]
in dispatcher.go is not protected when running unit tests. Each unit test will
run initDispatcher() through
[createDispatcher()|https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatch_test.go#L305].
* Race occurs if any other unit tests read/write the global variables before
or after initDispatcher(). ex: TestDispatchTimeout()
[https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L188]
Solution to be discussed:
# Refactor dispatcher.go and encapsulates global variables to Dispatcher struct
, change Dispatcher.Start(), Dispatcher.Stop() to type method
# Implement Singleton in getDispatcher() and add a new function newDispatcher()
# Create a new Dispatcher for each unit test
The race issue only happens in unit test becasue the shared vairable was
protected by
once.Do(initDispatcher) in dispatcher.go :
[https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L131]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]