[
https://issues.apache.org/jira/browse/CASSANDRA-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821475#comment-17821475
]
Paulo Motta commented on CASSANDRA-17401:
-----------------------------------------
Thanks for the detailed reports and repro steps. I've taken a look and this
looks to me to be a legitimate race condition that can cause a re-prepare storm
under large concurrency and unlucky timing.
My understanding is that [these evict
statements|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L735]
are not required for the correctness of the upgrade compatibility logic and
can be safely removed. Would you have some cycles to confirm this [~ifesdjeen] ?
In addition to this, I think there's a pending issue from CASSANDRA-17248 that
can leak prepared statements between keyspaces during mixed upgrade mode. Since
these issues are in a related area I think it makes sense to address them
together (in separate commits) to ensure these changes are tested together.
I think the {{PreparedStatementCollisionTest}} suite from [this
commit|https://github.com/apache/cassandra/pull/1872/commits/758bc4a89d7ca9d0bfe27e6f41000484724261bc]
can help improve the validation coverage of this logic. That change looks
correct to me but may need some cleanup. We should probably keep the metric
changes out of this to keep the scope of this patch to a minimum.
After proper review and validation I think there's value in including these
fixes in the final 3.X releases to address these outstanding issues as users
will still do upgrade cycles as 5.x release approaches. This will make
resolution more laborious as we will need to provide patches for 3.x all the
way up to trunk + CI for all branches. What do you think [~brandon.williams]
[~stefan.miklosovic] ?
> Race condition in QueryProcessor causes just prepared statement not to be in
> the prepared statements cache
> ----------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-17401
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17401
> Project: Cassandra
> Issue Type: Bug
> Components: Messaging/Client
> Reporter: Ivan Senic
> Assignee: Jaydeepkumar Chovatia
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The changes in the
> [QueryProcessor#prepare|https://github.com/apache/cassandra/blame/cassandra-4.0.2/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L575-L638]
> method that were introduced in versions *4.0.2* and *3.11.12* can cause a
> race condition between two threads trying to concurrently prepare the same
> statement. This race condition can cause removing of a prepared statement
> from the cache, after one of the threads has received the result of the
> prepare and eventually uses MD5Digest to call
> [QueryProcessor#getPrepared|https://github.com/apache/cassandra/blame/cassandra-4.0.2/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L212-L215].
> The race condition looks like this:
> * Thread1 enters _prepare_ method and resolves _safeToReturnCached_ as false
> * Thread1 executes eviction of hashes
> * Thread2 enters _prepare_ method and resolves _safeToReturnCached_ as false
> * Thread1 prepares the statement and caches it
> * Thread1 returns the result of the prepare
> * Thread2 executes eviction of hashes
> * Thread1 tries to execute the prepared statement with the received
> MD5Digest, but statement is not in the cache as it was evicted by Thread2
> I tried to reproduce this by using a Java driver, but hitting this case from
> a client side is highly unlikely and I can not simulate the needed race
> condition. However, we can easily reproduce this in Stargate (details
> [here|https://github.com/stargate/stargate/pull/1647]), as it's closer to
> QueryProcessor.
> Reproducing this in a unit test is fairly easy. I am happy to showcase this
> if needed.
> Note that the issue can occur only when safeToReturnCached is resolved as
> false.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]