[
https://issues.apache.org/jira/browse/CASSANDRA-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769091#comment-17769091
]
Jacek Lewandowski commented on CASSANDRA-18361:
-----------------------------------------------
I can see no relation between CASSANDRA-18311 and this flaky test.
What I can see is that the test does something odd: it aims to fail the index
build, and it uses Byteman to do that with the following rule:
{noformat}
RULE fail during index building
CLASS org.apache.cassandra.db.compaction.CompactionManager
METHOD submitIndexBuild
AT ENTRY
# set flag to only run this rule once.
IF NOT flagged("done")
DO
flag("done");
throw new java.lang.RuntimeException("Index building failure")
ENDRULE
{noformat}
The rule executes on entry of this method:
{code:java}
Future<?> submitIndexBuild(final SecondaryIndexBuilder builder,
                           ActiveCompactionsTracker activeCompactions)
{
    Runnable runnable = new Runnable()
    {
        ...
    };
    return secondaryIndexExecutor.submitIfRunning(runnable, "index build");
}
{code}
As far as I can see, it is very unlikely that any exception could actually be
thrown at this point, except perhaps when the executor has already been shut
down. I suppose the original intention was to make the runnable fail during
execution, in which case, instead of throwing an exception at entry, we should
return a fixed failed future.
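To illustrate the difference between the two failure modes, here is a minimal, self-contained sketch. It uses plain java.util.concurrent types rather than Cassandra's own executor and future classes, and the class and method names (FailedFutureSketch, throwAtEntry, failedFuture, describe) are hypothetical, made up for this example only. It is not the actual test change, just a way to see where the exception surfaces in each case.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Supplier;

public class FailedFutureSketch
{
    // Stand-in for what the current rule does: the exception escapes
    // from the submitting call itself, so the caller never gets a future.
    static Future<?> throwAtEntry()
    {
        throw new RuntimeException("Index building failure");
    }

    // Stand-in for the suggested alternative: submission "succeeds" and
    // the failure is only visible through the returned future.
    static Future<?> failedFuture()
    {
        CompletableFuture<Void> future = new CompletableFuture<>();
        future.completeExceptionally(new RuntimeException("Index building failure"));
        return future;
    }

    // Reports where the failure shows up from the caller's point of view.
    static String describe(Supplier<Future<?>> submit)
    {
        Future<?> future;
        try
        {
            future = submit.get();
        }
        catch (RuntimeException e)
        {
            return "thrown on submit";
        }

        try
        {
            future.get();
            return "succeeded";
        }
        catch (ExecutionException e)
        {
            return "failed on get: " + e.getCause().getMessage();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args)
    {
        System.out.println(describe(FailedFutureSketch::throwAtEntry));
        System.out.println(describe(FailedFutureSketch::failedFuture));
    }
}
```

In the first case the caller has to handle an exception from the submit call itself, whereas in the second it sees the failure through the normal future-completion path, which is presumably what a failing index build would look like in practice.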
I've managed to reproduce it locally, and with ref debugging enabled, this is
the stack trace of the offending reference creation:
{noformat}
Thread[RMI TCP Connection(2)-127.0.0.1,5,RMI Runtime]
at java.base/java.lang.Thread.getStackTrace(Thread.java:1602)
at org.apache.cassandra.utils.concurrent.Ref$Debug.<init>(Ref.java:273)
at org.apache.cassandra.utils.concurrent.Ref$State.<init>(Ref.java:194)
at org.apache.cassandra.utils.concurrent.Ref.<init>(Ref.java:122)
at org.apache.cassandra.utils.concurrent.Ref.tryRef(Ref.java:159)
at org.apache.cassandra.utils.concurrent.Ref.ref(Ref.java:164)
at org.apache.cassandra.utils.concurrent.SharedCloseableImpl.<init>(SharedCloseableImpl.java:35)
at org.apache.cassandra.io.util.FileHandle.<init>(FileHandle.java:79)
at org.apache.cassandra.io.util.FileHandle.sharedCopy(FileHandle.java:123)
at org.apache.cassandra.io.sstable.format.big.BigTableKeyReader.create(BigTableKeyReader.java:77)
at org.apache.cassandra.io.sstable.format.big.BigTableReader.keyReader(BigTableReader.java:157)
at org.apache.cassandra.io.sstable.format.SSTableReader.keyIterator(SSTableReader.java:829)
at org.apache.cassandra.io.sstable.ReducingKeyIterator.<init>(ReducingKeyIterator.java:48)
at org.apache.cassandra.index.Index$CollatedViewIndexBuildingSupport.getIndexBuildTask(Index.java:195)
at org.apache.cassandra.index.SecondaryIndexManager.lambda$buildIndexesBlocking$7(SecondaryIndexManager.java:632)
at java.base/java.util.HashMap.forEach(HashMap.java:1337)
at org.apache.cassandra.index.SecondaryIndexManager.buildIndexesBlocking(SecondaryIndexManager.java:630)
at org.apache.cassandra.index.SecondaryIndexManager.rebuildIndexesBlocking(SecondaryIndexManager.java:423)
at org.apache.cassandra.db.ColumnFamilyStore.rebuildSecondaryIndex(ColumnFamilyStore.java:904)
at org.apache.cassandra.service.StorageService.rebuildSecondaryIndex(StorageService.java:6456)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809)
at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:827)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:359)
at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:562)
at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:796)
at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:677)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:676)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
{noformat}
I'll keep working on it...
> Test Failure:
> secondary_indexes_test.py::TestSecondaryIndexes::test_failing_manual_rebuild_index
> ------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-18361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18361
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: Andres de la Peña
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> The Python dtest
> {{secondary_indexes_test.py::TestSecondaryIndexes::test_failing_manual_rebuild_index}}
> is flaky, at least for trunk:
> * https://butler.cassandra.apache.org/#/ci/upstream/workflow/Cassandra-trunk/failure/secondary_indexes_test/TestSecondaryIndexes/test_failing_manual_rebuild_index
> * https://ci-cassandra.apache.org/job/Cassandra-trunk/1501/testReport/dtest.secondary_indexes_test/TestSecondaryIndexes/test_failing_manual_rebuild_index/
> {code}
> Error Message
> failed on teardown with "Unexpected error found in node logs (see stdout for
> full details). Errors: [[node1] 'ERROR [Reference-Reaper] 2023-03-23
> 00:23:43,597 Ref.java:237 - LEAK DETECTED: a reference (class
> org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db)
> to class
> org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db
> was not released before the reference was garbage collected']"
> Stacktrace
> Unexpected error found in node logs (see stdout for full details). Errors:
> [[node1] 'ERROR [Reference-Reaper] 2023-03-23 00:23:43,597 Ref.java:237 -
> LEAK DETECTED: a reference (class
> org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db)
> to class
> org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db
> was not released before the reference was garbage collected']
> {code}
> The failure can be reproduced in CircleCI:
> * https://app.circleci.com/pipelines/github/adelapena/cassandra/2732/workflows/829434ab-2d1a-4e1c-8c7f-42449fcfda22
> The CircleCI config I used to reproduce the test failure can be generated
> with:
> {code}
> .circleci/generate.sh -p \
> -e REPEATED_DTESTS_COUNT=200 \
> -e REPEATED_DTESTS=secondary_indexes_test.py::TestSecondaryIndexes::test_failing_manual_rebuild_index
> {code}