[ 
https://issues.apache.org/jira/browse/CASSANDRA-20084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Konstantinov updated CASSANDRA-20084:
--------------------------------------------
    Description: 
The issue is reported based on an analysis done in "Unexplained stuck memtable 
flush" mail thread from Cassandra User/Dev groups.

e2e observations: for a Cassandra node the memtable flush randomly getting 
stuck and later as a consequence all writes to memtables are stuck as well due 
to lack of memory in a memory pool.

Based on the analysis of thread dump and heap dump the following cause has been 
identified:
We have the following stacktrace for "read-hotness-tracker:1" thread
{code:java}
"read-hotness-tracker:1" daemon prio=5 tid=93 WAITING
    at jdk.internal.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:323)
    at 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:289)
       local variable: 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
    at 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:282)
       local variable: 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
    at 
org.apache.cassandra.utils.concurrent.Awaitable$Defaults.awaitUninterruptibly(Awaitable.java:186)
       local variable: 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
    at 
org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)
    at 
org.apache.cassandra.db.commitlog.AbstractCommitLogService.awaitSyncAt(AbstractCommitLogService.java:324)
       local variable: 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
       local variable: com.codahale.metrics.Timer$Context#2086795
    at 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService.maybeWaitForSync(PeriodicCommitLogService.java:42)
       local variable: 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
    at 
org.apache.cassandra.db.commitlog.AbstractCommitLogService.finishWriteFor(AbstractCommitLogService.java:284)
       local variable: 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
       local variable: org.apache.cassandra.db.commitlog.CommitLog#1
       local variable: org.apache.cassandra.io.util.DataOutputBuffer$1$1#61
       local variable: 
org.apache.cassandra.db.commitlog.CommitLogSegment$Allocation#1
    at 
org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:100)
    at 
org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:54)
       local variable: org.apache.cassandra.utils.concurrent.OpOrder$Group#8162
    at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:628)
       local variable: org.apache.cassandra.db.Keyspace#8
       local variable: org.apache.cassandra.db.Mutation#54491
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:512)
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:228)
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:248)
    at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:675)
    at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeLocally(ModificationStatement.java:666)
    at 
org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:447)
    at 
org.apache.cassandra.db.SystemKeyspace.persistSSTableReadMeter(SystemKeyspace.java:1488)
       local variable: java.lang.String#42806
       local variable: java.lang.String#47659
       local variable: org.apache.cassandra.io.sstable.UUIDBasedSSTableId#1389
       local variable: org.apache.cassandra.metrics.RestorableMeter#752
    at 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy$1.run(SSTableReader.java:2170)
{code}
where we are waiting in the await() method of 
WaitQueue$Standard$SignalWithListener#2086139 object.

another input from a heap dump: WaitQueue$Standard$SignalWithListener#2086139 
has the following field values based on the heap dump screenshots:
state = -1 (it means CANCELLED: 
org.apache.cassandra.utils.concurrent.WaitQueue.Standard#CANCELLED = -1)
thread = null

According to the code in WaitQueue.java such combination of values only 
possible if we invoked cancel() on this SignalWithListener object but we didn't 
do it in our code, it is in another branch of 
AbstractCommitLogService.awaitSyncAt

but let's check the await() logic more carefully:
{code:java}
public Signal await() throws InterruptedException
{
    while (!isSignalled())
    {
        checkInterrupted();
        LockSupport.park(); <--- we are here
    }
    checkAndClear();
    return this;
}
{code}
and inspect unassuming checkInterrupted() method
{code:java}
private void checkInterrupted() throws InterruptedException
{
    if (Thread.interrupted())
    {
        cancel();
        throw new InterruptedException();
    }
}
{code}
here we see that if the currently awaiting thread is interrupted then 
SignalWithListener cancels itself.
but we throw new InterruptedException() here, so the execution should fail with 
it.
Checking the thread stack trace and see that AbstractSignal.await() is invoked 
from 
org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)

how it looks inside:
{code:java}
public static <A extends Awaitable> A awaitUninterruptibly(A await)
{
    boolean interrupted = false;
    while (true)
    {
        try
        {
            await.await();
            break;
        }
        catch (InterruptedException e)
        {
            interrupted = true;
        }
    }
    if (interrupted)
        Thread.currentThread().interrupt();
    return await;
}
{code}
Here we have a loop, if the current thread is interrupted - we have 
InterruptedException thrown by await.await(), we catch it here and try 
await.await() again. But in the previous invocation of await.await() we have 
cancelled our Signal object, so the condition inside await() method:
isSignalled(): state == SIGNALLED is always false now plus if another thread 
will invoke notifyAll - nothing will happen, because as a part of cancellation 
our Signal object unregister itself from the queue and the notifying thread 
will not able to find it to notify.
I have reproduced the case with a simple unit test.

The last element of the puzzle - why is the thread interrupted? To understand 
it - let's check there "read-hotness-tracker:1" thread execution started. It is 
a background periodical runnable in 
org.apache.cassandra.io.sstable.format.SSTableReader.GlobalTidy
{code:java}
readMeterSyncFuture = new WeakReference<>(syncExecutor.scheduleAtFixedRate(new 
Runnable()
{
    public void run()
    {
        if (obsoletion == null)
        {
            meterSyncThrottle.acquire();
            SystemKeyspace.persistSSTableReadMeter(desc.ksname, desc.cfname, 
desc.id, readMeter);
        }
    }
}, 1, 5, TimeUnit.MINUTES));
{code}
let's check how we deal with readMeterSyncFuture ref and we see our 
interruption:
{code:java}
private void stopReadMeterPersistence()
{
    ScheduledFuture<?> readMeterSyncFutureLocal = readMeterSyncFuture.get();
    if (readMeterSyncFutureLocal != null)
    {
        readMeterSyncFutureLocal.cancel(true); <========= here we can interrupt 
the thread
        readMeterSyncFuture = NULL;
    }
}
{code}
stopReadMeterPersistence() is invoked by one of the following branches of logic 
when an SSTable is not in use anymore:
org.apache.cassandra.io.sstable.format.SSTableReader#markObsolete
org.apache.cassandra.db.lifecycle.Helpers#markObsolete
org.apache.cassandra.db.lifecycle.Tracker#dropSSTables(...)
org.apache.cassandra.db.lifecycle.LifecycleTransaction#doCommit
org.apache.cassandra.db.lifecycle.LifecycleTransaction#doAbort

So,the issue is caused by a combination of events with a right timing:
 - an SSTable which was become obsolete at the middle of 
persistSSTableReadMeter background job execution for this SSTable, so we 
interrupted the job
 - persistSSTableReadMeter was at that moment in the middle of write operation, 
at commit log append step
 - and the commit log append step was stuck due to a delay in commit log sync 
(lastSyncedAt < syncTime condition in AbstractCommitLogService), so we started 
to wait for it using await

  was:
The issue is reported based on an analysis done in "Unexplained stuck memtable 
flush
" mail thread from Cassandra User/Dev groups.

e2e observations: for a Cassandra node the memtable flush randomly getting 
stuck and later as a consequence all writes to memtables are stuck as well due 
to lack of memory in a memory pool.

Based on the analysis of thread dump and heap dump the following cause has been 
identified:
We have the following stacktrace for "read-hotness-tracker:1" thread
{code:java}
"read-hotness-tracker:1" daemon prio=5 tid=93 WAITING
    at jdk.internal.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:323)
    at 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:289)
       local variable: 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
    at 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:282)
       local variable: 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
    at 
org.apache.cassandra.utils.concurrent.Awaitable$Defaults.awaitUninterruptibly(Awaitable.java:186)
       local variable: 
org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
    at 
org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)
    at 
org.apache.cassandra.db.commitlog.AbstractCommitLogService.awaitSyncAt(AbstractCommitLogService.java:324)
       local variable: 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
       local variable: com.codahale.metrics.Timer$Context#2086795
    at 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService.maybeWaitForSync(PeriodicCommitLogService.java:42)
       local variable: 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
    at 
org.apache.cassandra.db.commitlog.AbstractCommitLogService.finishWriteFor(AbstractCommitLogService.java:284)
       local variable: 
org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
       local variable: org.apache.cassandra.db.commitlog.CommitLog#1
       local variable: org.apache.cassandra.io.util.DataOutputBuffer$1$1#61
       local variable: 
org.apache.cassandra.db.commitlog.CommitLogSegment$Allocation#1
    at 
org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:100)
    at 
org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:54)
       local variable: org.apache.cassandra.utils.concurrent.OpOrder$Group#8162
    at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:628)
       local variable: org.apache.cassandra.db.Keyspace#8
       local variable: org.apache.cassandra.db.Mutation#54491
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:512)
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:228)
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:248)
    at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:675)
    at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeLocally(ModificationStatement.java:666)
    at 
org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:447)
    at 
org.apache.cassandra.db.SystemKeyspace.persistSSTableReadMeter(SystemKeyspace.java:1488)
       local variable: java.lang.String#42806
       local variable: java.lang.String#47659
       local variable: org.apache.cassandra.io.sstable.UUIDBasedSSTableId#1389
       local variable: org.apache.cassandra.metrics.RestorableMeter#752
    at 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy$1.run(SSTableReader.java:2170)
{code}
where we are waiting in the await() method of 
WaitQueue$Standard$SignalWithListener#2086139 object.

another input from a heap dump: WaitQueue$Standard$SignalWithListener#2086139 
has the following field values based on the heap dump screenshots:
state = -1 (it means CANCELLED: 
org.apache.cassandra.utils.concurrent.WaitQueue.Standard#CANCELLED = -1)
thread = null

According to the code in WaitQueue.java such combination of values only 
possible if we invoked cancel() on this SignalWithListener object but we didn't 
do it in our code, it is in another branch of 
AbstractCommitLogService.awaitSyncAt

but let's check the await() logic more carefully:
{code:java}
public Signal await() throws InterruptedException
{
    while (!isSignalled())
    {
        checkInterrupted();
        LockSupport.park(); <--- we are here
    }
    checkAndClear();
    return this;
}
{code}
and inspect unassuming checkInterrupted() method
{code:java}
private void checkInterrupted() throws InterruptedException
{
    if (Thread.interrupted())
    {
        cancel();
        throw new InterruptedException();
    }
}
{code}
here we see that if the currently awaiting thread is interrupted then 
SignalWithListener cancels itself.
but we throw new InterruptedException() here, so the execution should fail with 
it.
Checking the thread stack trace and see that AbstractSignal.await() is invoked 
from 
org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)

how it looks inside:
{code:java}
public static <A extends Awaitable> A awaitUninterruptibly(A await)
{
    boolean interrupted = false;
    while (true)
    {
        try
        {
            await.await();
            break;
        }
        catch (InterruptedException e)
        {
            interrupted = true;
        }
    }
    if (interrupted)
        Thread.currentThread().interrupt();
    return await;
}
{code}
Here we have a loop, if the current thread is interrupted - we have 
InterruptedException thrown by await.await(), we catch it here and try 
await.await() again. But in the previous invocation of await.await() we have 
cancelled our Signal object, so the condition inside await() method:
isSignalled(): state == SIGNALLED is always false now plus if another thread 
will invoke notifyAll - nothing will happen, because as a part of cancellation 
our Signal object unregister itself from the queue and the notifying thread 
will not able to find it to notify.
I have reproduced the case with a simple unit test.

The last element of the puzzle - why is the thread interrupted? To understand 
it - let's check there "read-hotness-tracker:1" thread execution started. It is 
a background periodical runnable in 
org.apache.cassandra.io.sstable.format.SSTableReader.GlobalTidy
{code:java}
readMeterSyncFuture = new WeakReference<>(syncExecutor.scheduleAtFixedRate(new 
Runnable()
{
    public void run()
    {
        if (obsoletion == null)
        {
            meterSyncThrottle.acquire();
            SystemKeyspace.persistSSTableReadMeter(desc.ksname, desc.cfname, 
desc.id, readMeter);
        }
    }
}, 1, 5, TimeUnit.MINUTES));
{code}
let's check how we deal with readMeterSyncFuture ref and we see our 
interruption:
{code:java}
private void stopReadMeterPersistence()
{
    ScheduledFuture<?> readMeterSyncFutureLocal = readMeterSyncFuture.get();
    if (readMeterSyncFutureLocal != null)
    {
        readMeterSyncFutureLocal.cancel(true); <========= here we can interrupt 
the thread
        readMeterSyncFuture = NULL;
    }
}
{code}
stopReadMeterPersistence() is invoked by one of the following branches of logic 
when an SSTable is not in use anymore:
org.apache.cassandra.io.sstable.format.SSTableReader#markObsolete
org.apache.cassandra.db.lifecycle.Helpers#markObsolete
org.apache.cassandra.db.lifecycle.Tracker#dropSSTables(...)
org.apache.cassandra.db.lifecycle.LifecycleTransaction#doCommit
org.apache.cassandra.db.lifecycle.LifecycleTransaction#doAbort

So,the issue is caused by a combination of events with a right timing:
 - an SSTable which was become obsolete at the middle of 
persistSSTableReadMeter background job execution for this SSTable, so we 
interrupted the job
 - persistSSTableReadMeter was at that moment in the middle of write operation, 
at commit log append step
 - and the commit log append step was stuck due to a delay in commit log sync 
(lastSyncedAt < syncTime condition in AbstractCommitLogService), so we started 
to wait for it using await


> WaitQueue.Signal.awaitUninterruptibly() may stuck forever if the thread is 
> interrupted
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20084
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20084
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>         Attachments: cass_threads.txt.gz, heap_dump_vars_state.png
>
>
> The issue is reported based on an analysis done in "Unexplained stuck 
> memtable flush" mail thread from Cassandra User/Dev groups.
> e2e observations: for a Cassandra node the memtable flush randomly getting 
> stuck and later as a consequence all writes to memtables are stuck as well 
> due to lack of memory in a memory pool.
> Based on the analysis of thread dump and heap dump the following cause has 
> been identified:
> We have the following stacktrace for "read-hotness-tracker:1" thread
> {code:java}
> "read-hotness-tracker:1" daemon prio=5 tid=93 WAITING
>     at jdk.internal.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:323)
>     at 
> org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:289)
>        local variable: 
> org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
>     at 
> org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:282)
>        local variable: 
> org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
>     at 
> org.apache.cassandra.utils.concurrent.Awaitable$Defaults.awaitUninterruptibly(Awaitable.java:186)
>        local variable: 
> org.apache.cassandra.utils.concurrent.WaitQueue$Standard$SignalWithListener#2086139
>     at 
> org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)
>     at 
> org.apache.cassandra.db.commitlog.AbstractCommitLogService.awaitSyncAt(AbstractCommitLogService.java:324)
>        local variable: 
> org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
>        local variable: com.codahale.metrics.Timer$Context#2086795
>     at 
> org.apache.cassandra.db.commitlog.PeriodicCommitLogService.maybeWaitForSync(PeriodicCommitLogService.java:42)
>        local variable: 
> org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
>     at 
> org.apache.cassandra.db.commitlog.AbstractCommitLogService.finishWriteFor(AbstractCommitLogService.java:284)
>        local variable: 
> org.apache.cassandra.db.commitlog.PeriodicCommitLogService#1
>     at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
>        local variable: org.apache.cassandra.db.commitlog.CommitLog#1
>        local variable: org.apache.cassandra.io.util.DataOutputBuffer$1$1#61
>        local variable: 
> org.apache.cassandra.db.commitlog.CommitLogSegment$Allocation#1
>     at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:100)
>     at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:54)
>        local variable: 
> org.apache.cassandra.utils.concurrent.OpOrder$Group#8162
>     at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:628)
>        local variable: org.apache.cassandra.db.Keyspace#8
>        local variable: org.apache.cassandra.db.Mutation#54491
>     at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:512)
>     at org.apache.cassandra.db.Mutation.apply(Mutation.java:228)
>     at org.apache.cassandra.db.Mutation.apply(Mutation.java:248)
>     at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:675)
>     at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeLocally(ModificationStatement.java:666)
>     at 
> org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:447)
>     at 
> org.apache.cassandra.db.SystemKeyspace.persistSSTableReadMeter(SystemKeyspace.java:1488)
>        local variable: java.lang.String#42806
>        local variable: java.lang.String#47659
>        local variable: org.apache.cassandra.io.sstable.UUIDBasedSSTableId#1389
>        local variable: org.apache.cassandra.metrics.RestorableMeter#752
>     at 
> org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy$1.run(SSTableReader.java:2170)
> {code}
> where we are waiting in the await() method of 
> WaitQueue$Standard$SignalWithListener#2086139 object.
> another input from a heap dump: WaitQueue$Standard$SignalWithListener#2086139 
> has the following field values based on the heap dump screenshots:
> state = -1 (it means CANCELLED: 
> org.apache.cassandra.utils.concurrent.WaitQueue.Standard#CANCELLED = -1)
> thread = null
> According to the code in WaitQueue.java such combination of values only 
> possible if we invoked cancel() on this SignalWithListener object but we 
> didn't do it in our code, it is in another branch of 
> AbstractCommitLogService.awaitSyncAt
> but let's check the await() logic more carefully:
> {code:java}
> public Signal await() throws InterruptedException
> {
>     while (!isSignalled())
>     {
>         checkInterrupted();
>         LockSupport.park(); <--- we are here
>     }
>     checkAndClear();
>     return this;
> }
> {code}
> and inspect unassuming checkInterrupted() method
> {code:java}
> private void checkInterrupted() throws InterruptedException
> {
>     if (Thread.interrupted())
>     {
>         cancel();
>         throw new InterruptedException();
>     }
> }
> {code}
> here we see that if the currently awaiting thread is interrupted then 
> SignalWithListener cancels itself.
> but we throw new InterruptedException() here, so the execution should fail 
> with it.
> Checking the thread stack trace and see that AbstractSignal.await() is 
> invoked from 
> org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)
> how it looks inside:
> {code:java}
> public static <A extends Awaitable> A awaitUninterruptibly(A await)
> {
>     boolean interrupted = false;
>     while (true)
>     {
>         try
>         {
>             await.await();
>             break;
>         }
>         catch (InterruptedException e)
>         {
>             interrupted = true;
>         }
>     }
>     if (interrupted)
>         Thread.currentThread().interrupt();
>     return await;
> }
> {code}
> Here we have a loop, if the current thread is interrupted - we have 
> InterruptedException thrown by await.await(), we catch it here and try 
> await.await() again. But in the previous invocation of await.await() we have 
> cancelled our Signal object, so the condition inside await() method:
> isSignalled(): state == SIGNALLED is always false now plus if another thread 
> will invoke notifyAll - nothing will happen, because as a part of 
> cancellation our Signal object unregister itself from the queue and the 
> notifying thread will not able to find it to notify.
> I have reproduced the case with a simple unit test.
> The last element of the puzzle - why is the thread interrupted? To understand 
> it - let's check there "read-hotness-tracker:1" thread execution started. It 
> is a background periodical runnable in 
> org.apache.cassandra.io.sstable.format.SSTableReader.GlobalTidy
> {code:java}
> readMeterSyncFuture = new 
> WeakReference<>(syncExecutor.scheduleAtFixedRate(new Runnable()
> {
>     public void run()
>     {
>         if (obsoletion == null)
>         {
>             meterSyncThrottle.acquire();
>             SystemKeyspace.persistSSTableReadMeter(desc.ksname, desc.cfname, 
> desc.id, readMeter);
>         }
>     }
> }, 1, 5, TimeUnit.MINUTES));
> {code}
> let's check how we deal with readMeterSyncFuture ref and we see our 
> interruption:
> {code:java}
> private void stopReadMeterPersistence()
> {
>     ScheduledFuture<?> readMeterSyncFutureLocal = readMeterSyncFuture.get();
>     if (readMeterSyncFutureLocal != null)
>     {
>         readMeterSyncFutureLocal.cancel(true); <========= here we can 
> interrupt the thread
>         readMeterSyncFuture = NULL;
>     }
> }
> {code}
> stopReadMeterPersistence() is invoked by one of the following branches of 
> logic when an SSTable is not in use anymore:
> org.apache.cassandra.io.sstable.format.SSTableReader#markObsolete
> org.apache.cassandra.db.lifecycle.Helpers#markObsolete
> org.apache.cassandra.db.lifecycle.Tracker#dropSSTables(...)
> org.apache.cassandra.db.lifecycle.LifecycleTransaction#doCommit
> org.apache.cassandra.db.lifecycle.LifecycleTransaction#doAbort
> So,the issue is caused by a combination of events with a right timing:
>  - an SSTable which was become obsolete at the middle of 
> persistSSTableReadMeter background job execution for this SSTable, so we 
> interrupted the job
>  - persistSSTableReadMeter was at that moment in the middle of write 
> operation, at commit log append step
>  - and the commit log append step was stuck due to a delay in commit log sync 
> (lastSyncedAt < syncTime condition in AbstractCommitLogService), so we 
> started to wait for it using await



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to