[
https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346130#comment-17346130
]
Andres de la Peña commented on CASSANDRA-16668:
-----------------------------------------------
Here are 10K runs of {{SEPExecutorTest}} with the patch, using the [CircleCI multiplexer|https://github.com/apache/cassandra/blob/trunk/doc/source/development/testing.rst#circleci]:
* [j8-j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/457/workflows/cb5b1d27-75d4-4b3a-814c-04454fb4f4ef/jobs/4017]
* [j8-j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/457/workflows/cb5b1d27-75d4-4b3a-814c-04454fb4f4ef/jobs/4015]
* [j11-j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/457/workflows/1b82f571-9dd8-4a98-892b-1c0e3704f2d9/jobs/4013]
It seems that {{changingMaxWorkersMeetsConcurrencyGoalsTest}} happily survives all three 10K runs, but there are some uncommon failures of {{shutdownTest}}:
* [j8-j8 runner 62 iteration 49|https://4017-85817267-gh.circle-artifacts.com/62/stdout/fails/049/testsome-org.apache.cassandra.concurrent.SEPExecutorTest.txt]
* [j8-j11 runner 73 iteration 9|https://4015-85817267-gh.circle-artifacts.com/73/stdout/fails/009/testsome-org.apache.cassandra.concurrent.SEPExecutorTest.txt]
* [j11-j11 runner 68 iteration 51|https://4013-85817267-gh.circle-artifacts.com/68/stdout/fails/051/testsome-org.apache.cassandra.concurrent.SEPExecutorTest.txt]
I'm not sure whether those failures are related to this patch or are an independent issue.
> Intermittent failure of
> SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race
> condition when shrinking maximum pool size to zero
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-16668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Other
> Reporter: Matt Fleming
> Assignee: Matt Fleming
> Priority: Normal
> Fix For: 4.0-rc
>
>
> A difficult-to-hit race condition exists in
> changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool
> size from 0 -> 4, which causes the test to fail like so:
> {{junit.framework.AssertionFailedError: Test tasks did not hit max concurrency goal expected:<true> but was:<false>}}
> {{    at org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)}}
> {{    at org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}
> I can hit this issue maybe 2 or 3 times for every 100 invocations of the unit
> test.
> The failure happens when tasks are still enqueued at the moment the maximum
> pool size is set to zero: if all of the SEPWorker threads enter the STOP state
> before the pool size is bumped back up to 4, then no SEPWorker threads are
> ever spun up to service the task queue, and the test fails with the error
> above.
> Why don't we spin up SEPWorker threads when enqueuing tasks? Because of the
> guard logic in addTask:
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we already have tasks
> on the queue), so we never call {{maybeStartSpinningWorker()}}.
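> To make the interleaving concrete, here is a minimal stand-alone sketch. It is
> not the real {{SEPExecutor}}: the class name and the three counters are
> hypothetical stand-ins for the worker pool, the task queue and the
> {{taskPermits}} guard, just to show how the queued work ends up stranded.
> {code:java}
> import java.util.concurrent.atomic.AtomicInteger;
>
> // Toy model of the race; counters stand in for SEPWorkers, the task queue and taskPermits.
> public class ShrinkToZeroRace
> {
>     static final AtomicInteger maxWorkers = new AtomicInteger(4);
>     static final AtomicInteger activeWorkers = new AtomicInteger(1); // one SEPWorker still around
>     static final AtomicInteger queuedTasks = new AtomicInteger(1);   // work already enqueued
>
>     // Mirrors the addTask guard: a worker is only started when the queue *was* empty.
>     static void addTask()
>     {
>         int tasksBefore = queuedTasks.getAndIncrement();
>         if (tasksBefore == 0)
>             activeWorkers.incrementAndGet();   // stand-in for maybeStartSpinningWorker()
>     }
>
>     public static void main(String[] args)
>     {
>         maxWorkers.set(0);      // the test shrinks the pool to zero with work still queued
>         activeWorkers.set(0);   // ...and every SEPWorker reaches the STOP state
>         maxWorkers.set(4);      // the pool is grown back to 4; nothing restarts a worker
>         addTask();              // the guard sees queuedTasks != 0, so no worker is started
>         System.out.println("workers=" + activeWorkers + ", queued=" + queuedTasks);
>         // prints "workers=0, queued=2": the queued tasks are stranded and the concurrency goal fails
>     }
> }
> {code}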
> A trick to make this issue much easier to hit is to insert a
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This
> guarantees that all SEPWorker threads have entered STOP before more work is
> enqueued.
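> To see why the sleep makes the ordering deterministic, here is a toy
> stand-alone illustration (again not the actual test; the thread and the
> counter are placeholders for an SEPWorker and the pool size): by the time the
> 500 ms pause ends, the worker has certainly observed the shrink and exited,
> and growing the pool afterwards does not bring it back.
> {code:java}
> import java.util.concurrent.atomic.AtomicInteger;
>
> public class SleepTrickDemo
> {
>     public static void main(String[] args) throws InterruptedException
>     {
>         AtomicInteger maxWorkers = new AtomicInteger(1);
>
>         // Placeholder for an SEPWorker: it runs until it sees the pool shrunk to zero.
>         Thread worker = new Thread(() -> {
>             while (maxWorkers.get() > 0)
>                 Thread.yield();
>             // falling out of the loop models the worker entering STOP
>         });
>         worker.start();
>
>         maxWorkers.set(0);    // shrink the pool to zero
>         Thread.sleep(500);    // the "trick": guarantees the worker has stopped by now
>         maxWorkers.set(4);    // grow the pool again; nothing restarts the worker
>
>         worker.join(1000);
>         System.out.println("worker alive after regrow: " + worker.isAlive());   // false
>     }
> }
> {code}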
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the
> number of work permits:
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01
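> Sketched on the same kind of toy model (a hypothetical paraphrase of the
> approach, not the linked commit): growing the pool is treated like enqueueing
> the first task, so any work stranded while the pool was at zero gets a worker
> started for it.
> {code:java}
> import java.util.concurrent.atomic.AtomicInteger;
>
> public class GrowPoolFixSketch
> {
>     final AtomicInteger maxWorkers = new AtomicInteger(0);
>     final AtomicInteger activeWorkers = new AtomicInteger(0);
>     final AtomicInteger queuedTasks = new AtomicInteger(2);   // work stranded while the pool was zero
>
>     void setMaxWorkers(int newMax)
>     {
>         int oldMax = maxWorkers.getAndSet(newMax);
>         // The gist of the fix: when the pool grows, (re)start a worker if there is
>         // already work waiting, instead of relying only on the addTask guard.
>         if (newMax > oldMax && queuedTasks.get() > 0 && activeWorkers.get() < newMax)
>             activeWorkers.incrementAndGet();                  // stand-in for maybeStartSpinningWorker()
>     }
>
>     public static void main(String[] args)
>     {
>         GrowPoolFixSketch pool = new GrowPoolFixSketch();
>         pool.setMaxWorkers(4);                                // 0 -> 4 with tasks still queued
>         System.out.println("workers=" + pool.activeWorkers);  // prints 1: the stranded work now gets serviced
>     }
> }
> {code}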