[ 
https://issues.apache.org/jira/browse/CASSANDRA-13422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960746#comment-15960746
 ] 

Marcus Eriksson commented on CASSANDRA-13422:
---------------------------------------------

+1

> CompactionStrategyManager should take write not read lock when handling 
> add/remove notifications
> ------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13422
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13422
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 4.0, 3.11.x
>
>
> {{getNextBackgroundTask}} in various compaction strategies (definitely 
> {{LCS}}) rely on checking the result of {{DataTracker.getCompacting()}} to 
> avoid accessing data and metadata related to tables that have already head 
> their resources released.
> There is a race where this check is unreliable and will claim a table that 
> has its resources already released is not compacting resulting in use after 
> free.
> [{{LeveledCompactionStrategy.findDroppableSSTable}}|https://github.com/apache/cassandra/blob/c794d2bed7ca1d10e13c4da08a3d45f5c755c1d8/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java#L504]
>  for instance has this three part logical && condition where the first check 
> is against the compacting set before calling {{worthDroppingTombstones}} 
> which fails if the table has been released.
> The order of events is basically that CompactionStrategyManager acquires the 
> read lock in getNextBackgroundTask(), then proceeds eventually to 
> findDroppableSSTable and acquires a set of SSTables from the manifest. While 
> the manifest is thread safe it's not accessed atomically WRT to other 
> operations. Once it has acquire the set of tables it acquires the (not 
> atomically) the set of compacting SSTables and iterates checking the former 
> against the latter.
> Meanwhile other compaction threads are marking tables obsolete or compacted 
> and releasing their references. Doing this removes them from {{DataTracker}} 
> and publishes a notification to the strategies, but this notification only 
> requires the read lock. After the compaction thread has published the 
> notifications it eventually marks the table as not compacting in 
> {{DataTracker}} or removes it entirely.
> The race is then that the compaction thread generating a new background task 
> acquires the sstables from the manifest on the stack. Any table in that set 
> that was compacting at that time must remain compacting so that it can be 
> skipped. Another compaction thread finishes a compaction and is able to 
> remove the table from the manifest and then remove it from the compacting 
> set. The thread generating the background task then acquires the list of 
> compacting tables which doesn't include the table it is supposed to skip.
> The simple fix appears to be to require threads to acquire the write lock in 
> order to publish notifications of tables being added/removed from compaction 
> strategies. While holding the write lock it won't be possible for someone to 
> see a view of tables in the manifest where tables that are compacting aren't 
> compacting in the view



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to