[jira] [Created] (CASSANDRA-16554) Race between secondary index building and active compactions tracking

Yifan Cai (Jira) Wed, 31 Mar 2021 16:25:07 -0700

Yifan Cai created CASSANDRA-16554:
-------------------------------------

             Summary: Race between secondary index building and active 
compactions tracking
                 Key: CASSANDRA-16554
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16554
             Project: Cassandra
          Issue Type: Bug
          Components: Feature/2i Index, Local/Other
            Reporter: Yifan Cai
            Assignee: Yifan Cai

There is a race condition between the secondary index build compaction task and
the active compactions tracking, especially when incremental repair is running.
It could result into 2 exceptions.

{code:java}
Caused by: java.util.NoSuchElementException
at
org.apache.cassandra.utils.AbstractIterator.next(AbstractIterator.java:64)
at
org.apache.cassandra.io.sstable.ReducingKeyIterator.next(ReducingKeyIterator.java:117)
at
org.apache.cassandra.index.internal.CollatedViewIndexBuilder.build(CollatedViewIndexBuilder.java:74)
at
org.apache.cassandra.db.compaction.CompactionManager$14.run(CompactionManager.java:1688)
{code}

{code:java}
Caused by: java.io.EOFException
at
org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:180)
at
org.apache.cassandra.utils.vint.VIntCoding.readUnsignedVInt(VIntCoding.java:68)
at
org.apache.cassandra.io.util.RebufferingInputStream.readUnsignedVInt(RebufferingInputStream.java:243)
at
org.apache.cassandra.db.RowIndexEntry$Serializer.readPosition(RowIndexEntry.java:364)
at
org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:369)
at
org.apache.cassandra.io.sstable.KeyIterator.computeNext(KeyIterator.java:110)
{code}

In the first exception, the iterator returns true for the call of `hasNext`,
but the following call of `next` throws.
In the second exception, the file wrapper object returns false for the call of
`isEOF`, but the following call that reads from it throws EOFException.
The exceptions can be constantly reproduced by the test in the patch.

The root cause of the exception is from the thread-unsafe lazy initialization
found in `ReducingKeyIterator` and `KeyIterator`. When the `maybeInit` methods
of both classes are called from multiple threads, it is likely to instantiate 2
instances and mess up the internal state. Those iterators might not be
considered being used in a multiple thread environment when being added to the
codebase initially. However, the `CollatedViewIndexBuilder` contains the
reference to those 2 iterator, and it, as a `CompactionInfo.Holder`, is added
to active compactions to be accessed from other threads, e.g. by calling
`ActiveCompactions#getCompactionsForSSTable`. The implementation of
`getCompactionInfo` in `CollatedViewIndexBuilder` publishes the reference to
the `ReducingKeyIterator` and transitively `KeyIterator` to the other threads.
Therefore, the other threads can possibly race.

For the instance of NSEE, thread 1 calls the `hasNext` on the
`ReducingKeyIterator`, it initialize the internal merge iterator. The call
returns true. Right after, there is a thread 2 that calls
`ActiveCompactions#getCompactionsForSSTable` and it sees the instance of
`ReducingKeyIterator` is not initialized yet, so run `maybeInit` again. The
reference of `iter` is replaced with the second instance. Now, thread 1 calls
`next` using the new instance that does not has the correct state (expecting
HAS_NEXT). So it again calls `hasNext`, which fetches the next element from the
merged iterator. Those 2 `ReducingKeyIterator` share the same instances of
`KeyIterator`. If the key iterators are already at the end, calling `hasNext`
returns false and we get the NSEE.

Besides the explicit NSEE exception, with the above reasoning, it is also
possible to have the unexpected behavior that skips one item from the merge
iterator. It leads to no 2i is built for that missing partition. Such case is
totally hidden since no exception is ever thrown.

To fix the unexpected behaviors, we need to correctly lazy initialize the
iterators.

Looking at the implementations of `CompactionInfo.Holder`, `ScrubInfo` and
`VerifyInfo` also publishes the non-thread-safe `RandomAccessReader` when
creating the compaction info object. According to the code inspection, there is
a rare chance that a thread is calling `getFilePointer` from the file object
and another thread closes the file, which can produce a NPE. When closing the
file, the internal states, e.g. buffer is set to null. I did not add the fix in
this patch, as it is never spotted.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Created] (CASSANDRA-16554) Race between secondary index building and active compactions tracking

Reply via email to