[ 
https://issues.apache.org/jira/browse/CASSANDRA-17401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046039#comment-18046039
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-17401:
---------------------------------------------------

> I am personally not a big fan of adding version-dependent logic, since it 
> complicates maintenance and testing substantially.

As of today, in the trunk, we already have 4.0.1 -> 4.0.2 version-specific 
[logic|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L803];
 please see the "else" portion.
{code:java}
            if (useNewPreparedStatementBehaviour)
            {
                if (cachedWithoutKeyspace.fullyQualified) // For fully 
qualified statements, we always skip keyspace to avoid digest switching
                    return createResultMessage(hashWithoutKeyspace, 
cachedWithoutKeyspace);                if (clientState.getRawKeyspace() != null 
&& !cachedWithKeyspace.fullyQualified) // For non-fully qualified statements, 
we always include keyspace to avoid ambiguity
                    return createResultMessage(hashWithKeyspace, 
cachedWithKeyspace);            }
            else // legacy caches, pre-CASSANDRA-15252 behaviour
            {
                return createResultMessage(hashWithKeyspace, 
cachedWithKeyspace);
            } {code}
This eviction logic became necessary for 4.0.1 -> 4.0.2, and is not needed 
afterwards. Therefore, the proposed PR on _trunk_ removes the "eviction" logic 
entirely.

And we can selectively decide till what version we want to land the proposed PR 
- I propose to land it on trunk and 5.0, maybe 4.1, but not land on 4.x.

 

>Another approach is to evict only if we see that one of the statements is 
>cached but not the other (i.e., \{{ (a == null && b != null) && (b == null && 
>a != null) }}), which is going to reduce the window of race possibility 
>substantially. 

I agree that this logic reduces the possibility of a race, but it does not 
eliminate it. The point I am trying to make is not to keep unnecessary 
code/logic that is only needed for the transition from version 4.0.1 to 4.0.2.

WDYT?

> Race condition in QueryProcessor causes just prepared statement not to be in 
> the prepared statements cache
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17401
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17401
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Ivan Senic
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in the 
> [QueryProcessor#prepare|https://github.com/apache/cassandra/blame/cassandra-4.0.2/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L575-L638]
>  method that were introduced in versions *4.0.2* and *3.11.12* can cause a 
> race condition between two threads trying to concurrently prepare the same 
> statement. This race condition can cause removing of a prepared statement 
> from the cache, after one of the threads has received the result of the 
> prepare and eventually uses MD5Digest to call 
> [QueryProcessor#getPrepared|https://github.com/apache/cassandra/blame/cassandra-4.0.2/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L212-L215].
> The race condition looks like this:
>  * Thread1 enters _prepare_ method and resolves _safeToReturnCached_ as false
>  * Thread1 executes eviction of hashes
>  * Thread2 enters _prepare_ method and resolves _safeToReturnCached_ as false
>  * Thread1 prepares the statement and caches it
>  * Thread1 returns the result of the prepare
>  * Thread2 executes eviction of hashes
>  * Thread1 tries to execute the prepared statement with the received 
> MD5Digest, but statement is not in the cache as it was evicted by Thread2
> I tried to reproduce this by using a Java driver, but hitting this case from 
> a client side is highly unlikely and I can not simulate the needed race 
> condition. However, we can easily reproduce this in Stargate (details 
> [here|https://github.com/stargate/stargate/pull/1647]), as it's closer to 
> QueryProcessor.
> Reproducing this in a unit test is fairly easy. I am happy to showcase this 
> if needed.
> Note that the issue can occur only when  safeToReturnCached is resolved as 
> false.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to