[jira] [Commented] (CASSANDRA-16554) Race between secondary index building and active compactions tracking

Yifan Cai (Jira) Thu, 01 Apr 2021 14:28:15 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313447#comment-17313447
 ]


Yifan Cai commented on CASSANDRA-16554:
---------------------------------------

bq. Is there a ticket for the publication of RandomAccessReader by ScrubInfo 
and VerifyInfo?

I do not have created it. 

The PR (https://github.com/apache/cassandra/pull/946) fixes the lazy 
initialization. But the `CompactionInfo.Holder#getCompactionInfo` method still 
unsafely publish the internal state (e.g. RandomAccessReader) to the other 
threads, which I am not quite comfortable with. 

Here is an alternative patch that contains the necessary synchronization when 
access the RandomAccessReader in addition to the original patch. It also adds 
the lock for Scrubber and Verifier when closing the RandomAccessReader and 
creating compaction info objects. 
https://github.com/yifan-c/cassandra/tree/CASSANDRA-16554/trunk-2 
Please take a look if the alternative is preferred. 

> Race between secondary index building and active compactions tracking
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-16554
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16554
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/2i Index, Local/Other
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>             Fix For: 4.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a race condition between the secondary index build compaction task 
> and the active compactions tracking, especially when incremental repair is 
> running. 
> It could result into 2 exceptions. 
> {code:java}
> Caused by: java.util.NoSuchElementException
>       at 
> org.apache.cassandra.utils.AbstractIterator.next(AbstractIterator.java:64)
>       at 
> org.apache.cassandra.io.sstable.ReducingKeyIterator.next(ReducingKeyIterator.java:117)
>       at 
> org.apache.cassandra.index.internal.CollatedViewIndexBuilder.build(CollatedViewIndexBuilder.java:74)
>       at 
> org.apache.cassandra.db.compaction.CompactionManager$14.run(CompactionManager.java:1688)
> {code}
> {code:java}
> Caused by: java.io.EOFException
>       at 
> org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:180)
>       at 
> org.apache.cassandra.utils.vint.VIntCoding.readUnsignedVInt(VIntCoding.java:68)
>       at 
> org.apache.cassandra.io.util.RebufferingInputStream.readUnsignedVInt(RebufferingInputStream.java:243)
>       at 
> org.apache.cassandra.db.RowIndexEntry$Serializer.readPosition(RowIndexEntry.java:364)
>       at 
> org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:369)
>       at 
> org.apache.cassandra.io.sstable.KeyIterator.computeNext(KeyIterator.java:110)
> {code}
> In the first exception, the iterator returns true for the call of `hasNext`, 
> but the following call of `next` throws. 
> In the second exception, the file wrapper object returns false for the call 
> of `isEOF`, but the following call that reads from it throws EOFException. 
> The exceptions can be constantly reproduced by the test in the patch. 
> The root cause of the exception is from the thread-unsafe lazy initialization 
> found in `ReducingKeyIterator` and `KeyIterator`. When the `maybeInit` 
> methods of both classes are called from multiple threads, it is likely to 
> instantiate 2 instances and mess up the internal state. Those iterators might 
> not be considered being used in a multiple thread environment when being 
> added to the codebase initially. However, the `CollatedViewIndexBuilder` 
> contains the reference to those 2 iterator, and it, as a 
> `CompactionInfo.Holder`, is added to active compactions to be accessed from 
> other threads, e.g. by calling `ActiveCompactions#getCompactionsForSSTable`. 
> The implementation of `getCompactionInfo` in `CollatedViewIndexBuilder` 
> publishes the reference to the `ReducingKeyIterator` and transitively 
> `KeyIterator` to the other threads. Therefore, the other threads can possibly 
> race. 
> For the instance of NSEE, thread 1 calls the `hasNext` on the 
> `ReducingKeyIterator`, it initialize the internal merge iterator. The call 
> returns true. Right after, there is a thread 2 that calls 
> `ActiveCompactions#getCompactionsForSSTable` and it sees the instance of 
> `ReducingKeyIterator` is not initialized yet, so run `maybeInit` again. The 
> reference of `iter` is replaced with the second instance. Now, thread 1 calls 
> `next` using the new instance that does not has the correct state (expecting 
> HAS_NEXT). So it again calls `hasNext`, which fetches the next element from 
> the merged iterator. Those 2 `ReducingKeyIterator` share the same instances 
> of `KeyIterator`. If the key iterators are already at the end, calling 
> `hasNext` returns false and we get the NSEE. 
> Besides the explicit NSEE exception, with the above reasoning, it is also 
> possible to have the unexpected behavior that skips one item from the merge 
> iterator. It leads to no 2i is built for that missing partition. Such case is 
> totally hidden since no exception is ever thrown. 
> To fix the unexpected behaviors, we need to correctly lazy initialize the 
> iterators. 
> Looking at the implementations of `CompactionInfo.Holder`, `ScrubInfo` and 
> `VerifyInfo` also publishes the non-thread-safe `RandomAccessReader` when 
> creating the compaction info object. According to the code inspection, there 
> is a rare chance that a thread is calling `getFilePointer` from the file 
> object and another thread closes the file, which can produce a NPE. When 
> closing the file, the internal states, e.g. buffer is set to null. I did not 
> add the fix in this patch, as it is never spotted. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (CASSANDRA-16554) Race between secondary index building and active compactions tracking

Reply via email to