qlong commented on PR #53914:
URL: https://github.com/apache/spark/pull/53914#issuecomment-3812290189
> @qlong I can reproduce the issue with our internal CI infra that stress
> test `ClientStreamingQuerySuite`, but it's not available on my local laptop.
> Please let me know if you have any idea to test this race condition and I can
> implement.
It is hard to manually reproduce race conditions. Your fix looks good, but I
think we should add a few tests in ExecutorSuite to prevent regressions in the
future.
1. test case that shows we can no longer acquire an evicted session
- create a new session state
- evict that session state
- acquiring that session state returns false (old code did not detect this)
2. test case that shows a new session for the same artifact state is created
successfully after eviction
- set cache size to 1
- s1_orig = obtainSession(JobArtifactState("session1")), release it
immediately so it can be cleaned up
- s2 = obtainSession(JobArtifactState("session2"))
- s1_new = obtainSession(JobArtifactState("session1"))
- s1_orig != s1_new, and !s1_new.evicted
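To make the two cases above concrete, here is a rough, self-contained sketch against a toy cache. `ToySession`, `ToySessionCache`, and the explicit `evict` method are hypothetical stand-ins I made up for illustration; they are not the real classes or methods in Executor, and a real test in the suite would go through the actual session cache instead.

```scala
import scala.collection.mutable

object EvictionChecksSketch {
  // Hypothetical stand-in for a session state; not Spark's real class.
  final class ToySession(val id: String) {
    private var refCount = 0
    private var isEvicted = false

    def evicted: Boolean = synchronized { isEvicted }

    // Acquiring fails once the session has been evicted.
    def acquire(): Boolean = synchronized {
      if (isEvicted) false else { refCount += 1; true }
    }

    def release(): Unit = synchronized { if (refCount > 0) refCount -= 1 }

    def markEvicted(): Unit = synchronized { isEvicted = true }
  }

  // Hypothetical bounded cache that evicts the oldest entry first.
  final class ToySessionCache(maxSize: Int) {
    private val sessions = mutable.LinkedHashMap.empty[String, ToySession]

    def obtainSession(id: String): ToySession = synchronized {
      sessions.get(id) match {
        case Some(existing) if existing.acquire() => existing
        case _ =>
          // Make room, then register and acquire a fresh session.
          while (sessions.nonEmpty && sessions.size >= maxSize) evict(sessions.head._1)
          val fresh = new ToySession(id)
          fresh.acquire()
          sessions.put(id, fresh)
          fresh
      }
    }

    def evict(id: String): Unit = synchronized {
      sessions.remove(id).foreach(_.markEvicted())
    }
  }

  // Case 1: an evicted session can no longer be acquired.
  def evictedSessionCannotBeAcquired(): Boolean = {
    val cache = new ToySessionCache(maxSize = 2)
    val session = cache.obtainSession("session1")
    session.release()
    cache.evict("session1")
    !session.acquire()
  }

  // Case 2: after eviction, the same key gets a fresh, unevicted session.
  def sameKeyGetsFreshSession(): Boolean = {
    val cache = new ToySessionCache(maxSize = 1)
    val s1Orig = cache.obtainSession("session1")
    s1Orig.release()
    cache.obtainSession("session2") // pushes session1 out of the cache
    val s1New = cache.obtainSession("session1")
    s1Orig.evicted && (s1New ne s1Orig) && !s1New.evicted
  }
}
```

Both helpers return true for the toy cache. In the actual suite these would presumably become two `test(...)` blocks with the same steps.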
We should probably also add a test case that simulates the race condition with
two threads, or add a stress test:
1) set cache size to 1
2) prepare 10 or more threads, some of which try to
obtainSession("session1") while others try to obtainSession("session2"). This
would cause random evictions
3) start all the prepared threads, and verify that every thread successfully
returns an unevicted session
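The stress test could be shaped roughly like the sketch below. Again, `ToySession` and `ToySessionCache` are hypothetical toy classes, not the real Executor code; the toy cache does lookup, eviction, and acquire under a single lock, which is only my assumption about what the fixed code should guarantee. The part worth copying is the harness: a start latch so all threads hit the cache at once, and a counter of threads that got a live session.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable

object SessionStressSketch {
  // Hypothetical stand-in for a session state; not Spark's real class.
  final class ToySession(val id: String) {
    private var refCount = 0
    private var isEvicted = false
    def acquire(): Boolean = synchronized {
      if (isEvicted) false else { refCount += 1; true }
    }
    def release(): Unit = synchronized { if (refCount > 0) refCount -= 1 }
    def markEvicted(): Unit = synchronized { isEvicted = true }
  }

  // Hypothetical bounded cache; lookup, eviction and acquire share one lock.
  final class ToySessionCache(maxSize: Int) {
    private val sessions = mutable.LinkedHashMap.empty[String, ToySession]

    // Returns Some(session) only if acquire() succeeded on a live session.
    def obtainSession(id: String): Option[ToySession] = synchronized {
      val session = sessions.get(id) match {
        case Some(existing) => existing
        case None =>
          while (sessions.nonEmpty && sessions.size >= maxSize) {
            val (oldId, old) = sessions.head
            sessions.remove(oldId)
            old.markEvicted()
          }
          val fresh = new ToySession(id)
          sessions.put(id, fresh)
          fresh
      }
      if (session.acquire()) Some(session) else None
    }
  }

  // Returns how many threads obtained a live session, or -1 on timeout.
  def runStress(numThreads: Int): Int = {
    val cache = new ToySessionCache(maxSize = 1)
    val pool = Executors.newFixedThreadPool(numThreads)
    val start = new CountDownLatch(1)
    val done = new CountDownLatch(numThreads)
    val liveSessions = new AtomicInteger(0)

    (0 until numThreads).foreach { i =>
      // Alternate keys so the threads keep evicting each other's session.
      val id = if (i % 2 == 0) "session1" else "session2"
      pool.execute(new Runnable {
        override def run(): Unit = {
          try {
            start.await() // all threads wait here, then race together
            cache.obtainSession(id).foreach { s =>
              liveSessions.incrementAndGet()
              s.release()
            }
          } finally done.countDown()
        }
      })
    }

    start.countDown()
    val finished = done.await(30, TimeUnit.SECONDS)
    pool.shutdownNow()
    if (finished) liveSessions.get() else -1
  }
}
```

With the toy cache, `SessionStressSketch.runStress(10)` returns 10. Against the real cache, the same harness run in a loop would be the regression check; whether "unevicted" is checked at acquire time or after the call returns is a design choice for the real test.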
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]