[
https://issues.apache.org/jira/browse/YUNIKORN-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Bacsko updated YUNIKORN-2576:
-----------------------------------
Target Version: 1.9.0 (was: 1.8.0)
> Data Race: Flaky tests in dispatcher_test.go
> --------------------------------------------
>
> Key: YUNIKORN-2576
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2576
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: test - unit
> Reporter: Yu-Lin Chen
> Priority: Major
> Attachments: shim-race.txt
>
>
> How to reproduce:
> # In Shim, run 'go test ./pkg/... -race -count=10 > shim-race.txt'
>
> {code:java}
> WARNING: DATA RACE
> Write at 0x0000035315e0 by goroutine 88:
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.initDispatcher()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:73
> +0x2c4
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.createDispatcher()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:305
> +0x2f
> runtime.Goexit()
> /usr/local/go/src/runtime/panic.go:626 +0x5d
> testing.(*T).FailNow()
> <autogenerated>:1 +0x31
> gotest.tools/v3/assert.Equal()
>
> /home/chenyulin0719/go/pkg/mod/gotest.tools/[email protected]/assert/assert.go:205
> +0x1aa
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.TestDispatchTimeout()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:244
> +0x2ba
> testing.tRunner()
> /usr/local/go/src/testing/testing.go:1689 +0x21e
> testing.(*T).Run.gowrap1()
> /usr/local/go/src/testing/testing.go:1742 +0x44
> Previous read at 0x0000035315e0 by goroutine 90:
>
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch.func1()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:188
> +0x2f5
>
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch.gowrap1()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:197
> +0x6eGoroutine 88 (running) created at:
> testing.(*T).Run()
> /usr/local/go/src/testing/testing.go:1742 +0x825
> testing.runTests.func1()
> /usr/local/go/src/testing/testing.go:2161 +0x85
> testing.tRunner()
> /usr/local/go/src/testing/testing.go:1689 +0x21e
> testing.runTests()
> /usr/local/go/src/testing/testing.go:2159 +0x8be
> testing.(*M).Run()
> /usr/local/go/src/testing/testing.go:2027 +0xf17
> main.main()
> _testmain.go:55 +0x2bdGoroutine 90 (running) created at:
>
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).asyncDispatch()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:178
> +0x391
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.(*Dispatcher).dispatch()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:164
> +0xbb
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.Dispatch()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:142
> +0x71
> github.com/apache/yunikorn-k8shim/pkg/dispatcher.TestDispatchTimeout()
>
> /home/chenyulin0719/yunikorn/yunikorn-k8shim/pkg/dispatcher/dispatch_test.go:232
> +0x244
> testing.tRunner()
> /usr/local/go/src/testing/testing.go:1689 +0x21e
> testing.(*T).Run.gowrap1()
> /usr/local/go/src/testing/testing.go:1742 +0x44
> =================={code}
> Root Cause:
> * The [globla
> vairables|https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L46-L51]
> in dispatcher.go is not protected when running unit tests. Each unit test
> will run initDispatcher() through
> [createDispatcher()|https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatch_test.go#L305].
> * Race occurs if any other unit tests read/write the global variables before
> or after initDispatcher(). ex: TestDispatchTimeout()
> [https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L188]
>
> Solution to be discussed:
> # Refactor dispatcher.go and encapsulates global variables to Dispatcher
> struct
> , change Dispatcher.Start(), Dispatcher.Stop() to type method
> # Implement Singleton in getDispatcher() and add a new function
> newDispatcher()
> # Create a new Dispatcher for each unit test
>
> The race issue only happens in unit test becasue the shared vairable was
> protected by
> once.Do(initDispatcher) in dispatcher.go :
> [https://github.com/chenyulin0719/yunikorn-k8shim/blob/64b204a2fb3b83fde9d86ea58f5f0d1e42187472/pkg/dispatcher/dispatcher.go#L131]
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]