[
https://issues.apache.org/jira/browse/CASSANDRA-16685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17354901#comment-17354901
]
Berenguer Blasi commented on CASSANDRA-16685:
---------------------------------------------
The root problem seems to stem from the interaction between Java's
{{ThreadPoolExecutor}} and {{SynchronousQueue}}. The queue fails {{offer()}}
calls when no thread is already waiting to take an element. At the same time,
the thread pool can still be starting or cycling worker threads internally. If
you happen to submit a task during that window, you get a false rejection.
Other people have hit this (it is easy to find online), and it seems to be
internal Java behavior.
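A minimal sketch of that queue behavior in isolation, using only the JDK. The
class and method names are made up for illustration. {{offer()}} on a
{{SynchronousQueue}} only succeeds when a consumer is already waiting, so the
second method keeps retrying until the consumer thread has reached {{take()}}:
{code:java}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Cassandra or the JDK.
final class SynchronousQueueOfferDemo
{
    /** offer() with no thread waiting in take()/poll(): nobody to hand off to. */
    static boolean offerWithoutConsumer()
    {
        SynchronousQueue<String> queue = new SynchronousQueue<>();
        return queue.offer("task");
    }

    /** Retries offer() until the consumer is blocked in take(), or the timeout expires. */
    static boolean offerOnceConsumerIsWaiting(long timeoutMillis)
    {
        SynchronousQueue<String> queue = new SynchronousQueue<>();
        Thread consumer = new Thread(() -> {
            try
            {
                queue.take();
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
            }
        }, "demo-consumer");
        consumer.setDaemon(true);
        consumer.start();

        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        // A started thread is not necessarily blocked in take() yet, so early
        // offers may fail. That gap is the window described above.
        boolean accepted = queue.offer("task");
        while (!accepted && System.nanoTime() - deadline < 0)
        {
            Thread.yield();
            accepted = queue.offer("task");
        }
        if (!accepted)
            consumer.interrupt();
        return accepted;
    }

    public static void main(String[] args)
    {
        System.out.println("no consumer waiting: " + offerWithoutConsumer());
        System.out.println("consumer waiting:    " + offerOnceConsumerIsWaiting(5_000));
    }
}
{code}
A thread pool cannot retry like this: {{ThreadPoolExecutor.execute()}} calls
{{offer()}} once and, if it fails and no new worker can be added, rejects.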
Reproduction is easy with the {{RepeatableRunner}} on the single test method,
where it fails around 10% of the time. 500 runs is a good start, though you'll
have to rename the keyspace (KS) on each iteration. I am providing a pure Java
test, with no C* code involved, as proof that this is a generic issue and not
related to our codebase:
{code:java}
@Test
public void testTP() throws InterruptedException
{
    ExecutorService validationExecutor = null;
    try
    {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(1,
                                                             2,
                                                             1,
                                                             TimeUnit.HOURS,
                                                             new SynchronousQueue<>(),
                                                             new NamedThreadFactory("Repair-Task"));
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.AbortPolicy());
        executor.prestartAllCoreThreads();
        validationExecutor = executor;
        Condition blocked = new SimpleCondition();
        CountDownLatch completed = new CountDownLatch(2);
        //Thread.sleep(100);
        validationExecutor.submit(new Task(blocked, completed));
        validationExecutor.submit(new Task(blocked, completed));
        blocked.signalAll();
        completed.await(11, TimeUnit.SECONDS);
    }
    finally
    {
        validationExecutor.shutdownNow();
    }
}
{code}
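For comparison, a JDK-only variant of the same scenario might look like the
sketch below. This is not the test above: {{FalseRejectionRepro}} and its
methods are hypothetical names, and the helper classes are swapped for a plain
{{CountDownLatch}} and the default thread factory. How often the rejection
shows up depends on the machine and JVM, and it may well be zero.
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical repro harness, not part of Cassandra or the JDK.
final class FalseRejectionRepro
{
    /** Runs the scenario once; returns true if a submit was rejected. */
    static boolean submitRejected()
    {
        // Same shape as the pool above: 1 core thread, max 2, direct hand-off queue.
        ThreadPoolExecutor executor =
            new ThreadPoolExecutor(1, 2, 1, TimeUnit.HOURS, new SynchronousQueue<>());
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.AbortPolicy());
        executor.prestartAllCoreThreads();

        CountDownLatch blocked = new CountDownLatch(1);
        Runnable task = () -> {
            try
            {
                blocked.await(); // hold the worker until released
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
            }
        };
        try
        {
            // Two tasks for a pool that allows two threads: both should fit.
            executor.submit(task);
            executor.submit(task);
            return false;
        }
        catch (RejectedExecutionException e)
        {
            return true;
        }
        finally
        {
            blocked.countDown();
            executor.shutdownNow();
        }
    }

    /** Repeats the scenario and counts how many runs saw a rejection. */
    static int countRejections(int runs)
    {
        int rejected = 0;
        for (int i = 0; i < runs; i++)
            if (submitRejected())
                rejected++;
        return rejected;
    }

    public static void main(String[] args)
    {
        int runs = 500;
        System.out.println(countRejections(runs) + " rejections in " + runs + " runs");
    }
}
{code}
Each run uses a fresh pool, so no state carries over between iterations.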
> Flaky ActiveRepairServiceTest.testRejectWhenPoolFullStrategy
> ------------------------------------------------------------
>
> Key: CASSANDRA-16685
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16685
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: Berenguer Blasi
> Assignee: Berenguer Blasi
> Priority: Normal
> Fix For: 4.0-rc2, 4.0, 4.x
>
>
> Flaky
> [ActiveRepairServiceTest.testRejectWhenPoolFullStrategy|https://ci-cassandra.apache.org/job/Cassandra-4.0/50/testReport/junit/org.apache.cassandra.service/ActiveRepairServiceTest/testRejectWhenPoolFullStrategy_compression/]
> {noformat}
> Error Message
> Task java.util.concurrent.FutureTask@63553e9f[Not completed, task =
> java.util.concurrent.Executors$RunnableAdapter@52cb52bd[Wrapped task =
> org.apache.cassandra.service.ActiveRepairServiceTest$Task@1d1c37d5]] rejected
> from
> org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@218df7d6[Running,
> pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 0]
> Stacktrace
> java.util.concurrent.RejectedExecutionException: Task
> java.util.concurrent.FutureTask@63553e9f[Not completed, task =
> java.util.concurrent.Executors$RunnableAdapter@52cb52bd[Wrapped task =
> org.apache.cassandra.service.ActiveRepairServiceTest$Task@1d1c37d5]] rejected
> from
> org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor@218df7d6[Running,
> pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 0]
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
> at
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:176)
> at
> java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
> at
> org.apache.cassandra.service.ActiveRepairServiceTest.testRejectWhenPoolFullStrategy(ActiveRepairServiceTest.java:380)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Standard Output
> INFO [main] 2021-05-18 22:04:31,694 YamlConfigurationLoader.java:93 -
> Configuration location:
> file:////home/cassandra/cassandra/build/test/cassandra.compressed.yaml
> DEBUG [main] 2021-05-18 22:04:31,698 YamlConfigurationLoader.java:112 -
> Loading settings from
> file:////home/cassandra/cassandra/build/test/cassandra.compressed.yaml
> DEBUG [main] 2021-05-18 22:04:31,807 InternalLoggerFactory.java:63 - Using
> SLF4J as the default logging framework
> DEBUG [main] 2021-05-18 22:04:31,827 PlatformDependent0
> ...[truncated 95289 chars]...
> andra/build/test/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb_txn_flush_08e70270-b825-11eb-a393-871312b17b94.log
>
> DEBUG [MemtableFlushWriter:1] 2021-05-18 22:04:36,792
> ColumnFamilyStore.java:1197 - Flushed to
> [BigTableReader(path='/home/cassandra/cassandra/build/test/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-15-big-Data.db')]
> (1 sstables, 4.944KiB), biggest 4.944KiB, smallest 4.944KiB
> DEBUG [main] 2021-05-18 22:04:36,795 StorageService.java:1619 - NORMAL
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)