[ https://issues.apache.org/jira/browse/CASSANDRA-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769091#comment-17769091 ]

Jacek Lewandowski commented on CASSANDRA-18361:
-----------------------------------------------

I can see no relation between CASSANDRA-18311 and this flaky test.

What I can see is that the test does something odd: it aims to fail the index 
build and uses Byteman to do so, with the following rule:
{noformat}
RULE fail during index building
CLASS org.apache.cassandra.db.compaction.CompactionManager
METHOD submitIndexBuild
AT ENTRY
# set flag to only run this rule once.
IF NOT flagged("done")
DO
   flag("done");
   throw new java.lang.RuntimeException("Index building failure")
ENDRULE
{noformat}

The rule executes on entry of this method:
{code:java}
    Future<?> submitIndexBuild(final SecondaryIndexBuilder builder, ActiveCompactionsTracker activeCompactions)
    {
        Runnable runnable = new Runnable()
        {
           ...
        };

        return secondaryIndexExecutor.submitIfRunning(runnable, "index build");
    }
{code}

As far as I can see, it is very unlikely that any exception would ever be 
thrown at this point, except perhaps when the executor has already been shut 
down. I suppose the original intention was to make the runnable fail during 
its execution, in which case, instead of throwing an exception at entry, we 
should return a fixed failed future.
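
To illustrate the difference from the caller's point of view, here is a minimal 
standalone sketch. It is not Cassandra code: the class and both submit methods 
are hypothetical stand-ins, and it assumes Java 9+ for 
CompletableFuture.failedFuture. With a throw at entry the caller never gets a 
future to wait on; with a failed future the failure only surfaces on get().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class FailedFutureSketch
{
    // Hypothetical stand-in for a submit method with a fault injected AT ENTRY:
    // the exception escapes before any Future is handed back to the caller.
    static Future<?> submitThrowingAtEntry()
    {
        throw new RuntimeException("Index building failure");
    }

    // Hypothetical stand-in for a submit method that returns a fixed failed future:
    // the call itself succeeds, and the failure is reported when the future is awaited.
    static Future<?> submitReturningFailedFuture()
    {
        return CompletableFuture.<Void>failedFuture(new RuntimeException("Index building failure"));
    }

    // Calls one of the two variants and reports where the failure surfaced.
    static String describe(boolean throwAtEntry)
    {
        try
        {
            Future<?> future = throwAtEntry ? submitThrowingAtEntry() : submitReturningFailedFuture();
            future.get();
            return "completed normally";
        }
        catch (ExecutionException e)
        {
            // Future.get() wraps the task's failure in an ExecutionException.
            return "failed future: " + e.getCause().getMessage();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        catch (RuntimeException e)
        {
            // The submit call itself threw; no Future was ever returned.
            return "thrown at entry: " + e.getMessage();
        }
    }

    public static void main(String[] args)
    {
        System.out.println(describe(true));  // prints "thrown at entry: Index building failure"
        System.out.println(describe(false)); // prints "failed future: Index building failure"
    }
}
```

Any caller that expects to observe the failure through the returned future only 
takes that path in the second variant.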

I've managed to reproduce it locally. With ref debugging enabled, this is the 
stack trace of the offending reference creation:
{noformat}
  Thread[RMI TCP Connection(2)-127.0.0.1,5,RMI Runtime]
        at java.base/java.lang.Thread.getStackTrace(Thread.java:1602)
        at org.apache.cassandra.utils.concurrent.Ref$Debug.<init>(Ref.java:273)
        at org.apache.cassandra.utils.concurrent.Ref$State.<init>(Ref.java:194)
        at org.apache.cassandra.utils.concurrent.Ref.<init>(Ref.java:122)
        at org.apache.cassandra.utils.concurrent.Ref.tryRef(Ref.java:159)
        at org.apache.cassandra.utils.concurrent.Ref.ref(Ref.java:164)
        at org.apache.cassandra.utils.concurrent.SharedCloseableImpl.<init>(SharedCloseableImpl.java:35)
        at org.apache.cassandra.io.util.FileHandle.<init>(FileHandle.java:79)
        at org.apache.cassandra.io.util.FileHandle.sharedCopy(FileHandle.java:123)
        at org.apache.cassandra.io.sstable.format.big.BigTableKeyReader.create(BigTableKeyReader.java:77)
        at org.apache.cassandra.io.sstable.format.big.BigTableReader.keyReader(BigTableReader.java:157)
        at org.apache.cassandra.io.sstable.format.SSTableReader.keyIterator(SSTableReader.java:829)
        at org.apache.cassandra.io.sstable.ReducingKeyIterator.<init>(ReducingKeyIterator.java:48)
        at org.apache.cassandra.index.Index$CollatedViewIndexBuildingSupport.getIndexBuildTask(Index.java:195)
        at org.apache.cassandra.index.SecondaryIndexManager.lambda$buildIndexesBlocking$7(SecondaryIndexManager.java:632)
        at java.base/java.util.HashMap.forEach(HashMap.java:1337)
        at org.apache.cassandra.index.SecondaryIndexManager.buildIndexesBlocking(SecondaryIndexManager.java:630)
        at org.apache.cassandra.index.SecondaryIndexManager.rebuildIndexesBlocking(SecondaryIndexManager.java:423)
        at org.apache.cassandra.db.ColumnFamilyStore.rebuildSecondaryIndex(ColumnFamilyStore.java:904)
        at org.apache.cassandra.service.StorageService.rebuildSecondaryIndex(StorageService.java:6456)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
        at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
        at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
        at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
        at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809)
        at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
        at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
        at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
        at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:827)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:359)
        at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
        at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
        at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:562)
        at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:796)
        at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:677)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:676)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
{noformat}

I'll keep working on it...


> Test Failure: secondary_indexes_test.py::TestSecondaryIndexes::test_failing_manual_rebuild_index
> ------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18361
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18361
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Andres de la Peña
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>
> The Python dtest {{secondary_indexes_test.py::TestSecondaryIndexes::test_failing_manual_rebuild_index}} is flaky, at least for trunk:
> * https://butler.cassandra.apache.org/#/ci/upstream/workflow/Cassandra-trunk/failure/secondary_indexes_test/TestSecondaryIndexes/test_failing_manual_rebuild_index
> * https://ci-cassandra.apache.org/job/Cassandra-trunk/1501/testReport/dtest.secondary_indexes_test/TestSecondaryIndexes/test_failing_manual_rebuild_index/
> {code}
> Error Message
> failed on teardown with "Unexpected error found in node logs (see stdout for full details). Errors: [[node1] 'ERROR [Reference-Reaper] 2023-03-23 00:23:43,597 Ref.java:237 - LEAK DETECTED: a reference (class org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db) to class org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db was not released before the reference was garbage collected']"
> Stacktrace
> Unexpected error found in node logs (see stdout for full details). Errors: [[node1] 'ERROR [Reference-Reaper] 2023-03-23 00:23:43,597 Ref.java:237 - LEAK DETECTED: a reference (class org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db) to class org.apache.cassandra.io.util.FileHandle$Cleanup@967019010:/home/cassandra/cassandra/cassandra-dtest/tmp/dtest-hgjoy8rq/test/node1/data0/k/t-b7dae870c91011eda58f05bc40bfcaa1/nc-1-big-Index.db was not released before the reference was garbage collected']
> {code}
> The failure can be reproduced in CircleCI:
> * https://app.circleci.com/pipelines/github/adelapena/cassandra/2732/workflows/829434ab-2d1a-4e1c-8c7f-42449fcfda22
> The CircleCI config I used to reproduce the test failure can be generated with:
> {code}
> .circleci/generate.sh -p \
>   -e REPEATED_DTESTS_COUNT=200 \
>   -e REPEATED_DTESTS=secondary_indexes_test.py::TestSecondaryIndexes::test_failing_manual_rebuild_index
> {code}



