[ 
https://issues.apache.org/jira/browse/CASSANDRA-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985529#comment-13985529
 ] 

Jack Krupansky commented on CASSANDRA-7099:
-------------------------------------------

It may have been your mistake, but could C* or the driver have detected the 
difficulty and reported an error?

> Concurrent instances of same Prepared Statement seeing intermingled result 
> sets
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7099
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7099
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.0.7 with single node cluster
> Windows dual-core laptop
> DataStax Java driver 2.0.1
>            Reporter: Bill Mitchell
>
> I have a schema in which a wide row is partitioned into smaller rows.  (See 
> CASSANDRA-6826, CASSANDRA-6825 for more detail on this schema.)  In this 
> case, I randomly assigned the rows across the partitions based on the first 
> four hex digits of a hash value modulo the number of partitions.  
> Occasionally I need to retrieve the rows in order of insertion irrespective 
> of the partitioning.  Cassandra, of course, does not support this when paging 
> by fetch size is enabled, so I am issuing a query against each of the 
> partitions to obtain their rows in order, and merging the results:
> SELECT l, partition, cd, rd, ec, ea FROM sr WHERE s = ? AND l = ? AND partition = ? 
> ORDER BY cd ASC, ec ASC ALLOW FILTERING;
> These parallel queries are all instances of a single PreparedStatement.  
> What I saw was identical values from multiple queries, which by construction 
> should never happen, and after further investigation, discovered that rows 
> from partition 5 are being returned in the result set for the query against 
> another partition, e.g., 1.  This was so unbelievable that I added diagnostic 
> code in my test case to detect this:
> After reading 167 rows, returned partition 5 does not match query partition 4
> The merge logic works fine and delivers correct results when I use LIMIT to 
> avoid fetch size paging.  Even if there were a bug there, it is hard to see 
> how any client error explains ResultSet.one() returning a row whose values 
> don't match the constraints in that ResultSet's query.
> I'm not sure of the exact significance of 167, as I have configured the 
> queryFetchSize for the cluster to 1000, and in this merge logic I divide that 
> by the number of partitions, 7, so the fetchSize for each of these parallel 
> queries was set to 142.  I suspect this is being treated as a minimum 
> fetchSize, and the driver or server is rounding this up to fill a 
> transmission block.  When I prime the pump, issuing the query against each of 
> the partitions, the initial contents of the result sets are correct.  The 
> failure appears after we advance two of these queries to the next page.
> Although I had been experimenting with fetchMoreResults() for prefetching, I 
> disabled that to isolate this problem, so that is not a factor.   
> I have not yet tried preparing separate instances of the query, as I already 
> have common logic to cache and reuse already prepared statements.
> I have not proven that it is a server bug and not a Java driver bug, but at 
> first glance it was not obvious how the Java driver might associate the 
> responses with the wrong requests.  Were that happening, one would expect to 
> see the right overall collection of rows, just delivered to the wrong 
> queries, and not duplicates, which is what I saw.
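
For readers following the description above, here is a minimal sketch of the kind of merge and diagnostic check it describes. It is not the reporter's code. Plain Java iterators stand in for the per-partition driver result sets, and the names PartitionMerge, Row, Source, and merge are hypothetical. The single "order" field is an assumed stand-in for the (cd, ec) ordering columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class PartitionMerge {

    /** Hypothetical row: the partition it claims to belong to and an insertion-order key. */
    public static final class Row {
        public final int partition;
        public final long order;

        public Row(int partition, long order) {
            this.partition = partition;
            this.order = order;
        }
    }

    /** One per-partition stream, the row currently at its head, and a count of rows read. */
    private static final class Source {
        final int queryPartition;
        final Iterator<Row> rows;
        Row head;
        int read;

        Source(int queryPartition, Iterator<Row> rows) {
            this.queryPartition = queryPartition;
            this.rows = rows;
        }

        /** Loads the next row into head; returns false once the stream is exhausted. */
        boolean advance() {
            if (!rows.hasNext()) {
                head = null;
                return false;
            }
            head = rows.next();
            // Diagnostic: every row must carry the partition this stream was queried for.
            if (head.partition != queryPartition) {
                throw new IllegalStateException("After reading " + read
                        + " rows, returned partition " + head.partition
                        + " does not match query partition " + queryPartition);
            }
            read++;
            return true;
        }
    }

    /** K-way merge: the stream at index p is assumed to hold the rows for partition p, already sorted. */
    public static List<Row> merge(List<Iterator<Row>> perPartition) {
        PriorityQueue<Source> heap =
                new PriorityQueue<>(Comparator.comparingLong((Source s) -> s.head.order));
        for (int p = 0; p < perPartition.size(); p++) {
            Source s = new Source(p, perPartition.get(p));
            if (s.advance()) {
                heap.add(s);
            }
        }
        List<Row> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Source s = heap.poll();   // stream whose head row is earliest
            merged.add(s.head);
            if (s.advance()) {        // re-queue the stream with its next row
                heap.add(s);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Iterator<Row>> streams = Arrays.asList(
                Arrays.asList(new Row(0, 1), new Row(0, 4)).iterator(),
                Arrays.asList(new Row(1, 2), new Row(1, 3)).iterator());
        for (Row r : merge(streams)) {
            System.out.println("partition=" + r.partition + " order=" + r.order);
        }
    }
}
```

The check sits in advance() so that a mismatched row is caught as soon as it is read from its stream, before it can enter the merged output.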



