[
https://issues.apache.org/jira/browse/CASSANDRA-18443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcus Eriksson updated CASSANDRA-18443:
----------------------------------------
Status: Ready to Commit (was: Review In Progress)
+1
> Deadlock updating sstable metadata if disk boundaries need reloading
> --------------------------------------------------------------------
>
> Key: CASSANDRA-18443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18443
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction, Local/Memtable, Local/SSTable
> Reporter: Jon Meredith
> Assignee: Jon Meredith
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0
>
>
> {{CompactionStrategyManager.handleNotification}} holds the read lock while
> processing notifications. When handling metadata changed notifications, an
> extra call is made to {{maybeReloadDiskBoundaries}}, which tries to grab the
> write lock. The read lock cannot be upgraded to the write lock, so the
> thread deadlocks.
> Partial stacktrace
> {code}
> at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
> - parking to wait for <0x00000005cc000078> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire
> at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock
> at org.apache.cassandra.db.compaction.CompactionStrategyManager.maybeReloadDiskBoundaries(CompactionStrategyManager.java:495)
> at org.apache.cassandra.db.compaction.CompactionStrategyManager.getCompactionStrategyFor(CompactionStrategyManager.java:343)
> at org.apache.cassandra.db.compaction.CompactionStrategyManager.handleMetadataChangedNotification(CompactionStrategyManager.java:796)
> at org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(CompactionStrategyManager.java:838)
> at org.apache.cassandra.db.lifecycle.Tracker.notifySSTableMetadataChanged(Tracker.java:482)
> at org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(CompactionStrategyManager.java:838)
> {code}
> Deadlocking with the read lock held blocks the SlabpoolCleaner while it is
> notifying ColumnFamilyStore, so memtables cannot be flushed and recycled.
> Any thread applying a mutation to the database (at least GossipStage and
> MutationStage) then stalls, causing the node to be considered down by peers
> and/or to back up with pending requests.
> All the cases investigated were during single sstable upleveling by
> {{org.apache.cassandra.db.compaction.SingleSSTableLCSTask}} added in
> CASSANDRA-12526.
> Other, less critical work was also affected: JMX calls to get estimated
> remaining compaction tasks, the index summary manager redistributing
> summaries, the StatusLogger trying to log dropped messages, and the
> ValidationManager.
> The workaround is to reboot the affected host.
> The fix is to just remove the redundant disk boundary reload check on that
> path.
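
The lock behaviour behind the deadlock described above can be reproduced in isolation. The following is a minimal sketch, not Cassandra code: the class and method names are hypothetical, and it assumes only that the lock involved is a java.util.concurrent.locks.ReentrantReadWriteLock, as the stack trace indicates. It uses a timed tryLock instead of lock() so that the demonstration returns rather than hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone demo; none of these names exist in Cassandra.
public class LockUpgradeSketch {

    // Holds the read lock, then asks for the write lock on the same thread.
    // ReentrantReadWriteLock does not support upgrading, so the timed attempt
    // fails; an untimed lock() call here would park the thread forever.
    static boolean writeAcquiredWhileHoldingRead(long timeoutMillis) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            boolean acquired = lock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Holds the read lock, then asks for the read lock again on the same
    // thread. Read locks are reentrant, so this succeeds immediately.
    static boolean readAcquiredWhileHoldingRead() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            boolean acquired = lock.readLock().tryLock();
            if (acquired) {
                lock.readLock().unlock();
            }
            return acquired;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("write while holding read: " + writeAcquiredWhileHoldingRead(100));
        System.out.println("read while holding read: " + readAcquiredWhileHoldingRead());
    }
}
```

The first call reports false and the second reports true, which is consistent with the fix described in the issue: a path that already holds the read lock should not attempt work that needs the write lock.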