Hi Bhaskar,

Thanks for reporting that problem. It is a nice catch :-)

Could you open a JIRA ticket with all the information that you provided?

I will try to fix that problem.

Benjamin


On Wed, Nov 2, 2016 at 12:00 AM, Bhaskar Muppana <mgvbhas...@gmail.com>
wrote:

> Hi Guys,
>
> We are seeing paging/limit reads miss a small number of columns. It
> happens even on a single-DC cluster when both reads and writes use
> QUORUM. I have attached a ccm-based script which reproduces the
> problem.
>
> * Keyspace RF - 3
> * Table (id int, course text, marks int, primary key(id, course))
> * replicas for partition key 1 - r1, r2 and r3
> * insert (1, '1', 1), (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5)
>   - succeeded on all 3 replicas
> * insert (1, '6', 6) succeeded on r1 and r3, failed on r2
> * delete (1, '2'), (1, '3'), (1, '4'), (1, '5') succeeded on r1 and r2,
> failed on r3
> * insert (1, '7', 7) succeeded on r1 and r2, failed on r3
>
> Local data on the 3 nodes now looks like this:
>
> r1: (1, '1', 1), tombstone(2-5 records), (1, '6', 6), (1, '7', 7)
> r2: (1, '1', 1), tombstone(2-5 records), (1, '7', 7)
> r3: (1, '1', 1),  (1, '2', 2),  (1, '3', 3),  (1, '4', 4),  (1, '5',
> 5), (1, '6', 6)
>
> If we do a paging read with page_size 2, and it gets its data from r2
> and r3, then it returns only (1, '1', 1) and (1, '7', 7), skipping
> record 6. The same problem happens if the query does no paging but has
> its limit set to 2 records.
>
> Resolution code for reads works the same way for paging queries and
> normal queries. The coordinator shouldn't respond to the client with
> records/columns for which it didn't have complete visibility on all
> required replicas (in this case 2 replicas). In the case above, it
> sends record (1, '7', 7) back to the client, but its visibility on r3
> is limited up to (1, '2', 2), and it relies on r2's data alone to
> assume that (1, '6', 6) doesn't exist, which is wrong. At the end of
> the resolution, all it can say anything conclusive about is
> (1, '1', 1), which exists, and (1, '2', 2), which is deleted.
>
> Ideally we should have a different resolution implementation for
> paging/limit queries.
>
> We could reproduce this on 2.0.17, 2.1.16 and 3.0.9.
>
> It seems that in 3.0.9 we have a ShortReadProtection transformation on
> list queries. I assume that is there to protect against cases like the
> one above. But we can reproduce the issue in 3.0.9 as well.
>
> Thanks,
> Bhaskar
>
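
The r2/r3 read described in the quoted message can be walked through with a
small toy model. This is only a sketch, not Cassandra's resolver: the
timestamps, the dict layout and the helper names (replica_read, merge,
live_keys, conclusive_bound) are all assumptions made for illustration. It
shows the naive merge returning rows 1 and 7, a full merge returning rows 1
and 6, and the "complete visibility" bound from the message limiting the
conclusive answer to row 1.

```python
# Toy model: replica = {clustering key: (timestamp, value)}; value None = tombstone.
# Timestamps are assumed: inserts 1-5 at ts 1, insert 6 at ts 2,
# delete 2-5 at ts 3, insert 7 at ts 4.
r2 = {"1": (1, 1), "2": (3, None), "3": (3, None),
      "4": (3, None), "5": (3, None), "7": (4, 7)}
r3 = {"1": (1, 1), "2": (1, 2), "3": (1, 3),
      "4": (1, 4), "5": (1, 5), "6": (2, 6)}


def replica_read(replica, limit):
    """Return cells in clustering order until `limit` live rows are collected."""
    response, live = {}, 0
    for key in sorted(replica):
        response[key] = replica[key]
        if replica[key][1] is not None:
            live += 1
            if live == limit:
                break
    return response


def merge(responses):
    """Last-write-wins reconciliation of the cells seen in the responses."""
    merged = {}
    for response in responses:
        for key, cell in response.items():
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    return merged


def live_keys(merged, limit):
    """First `limit` clustering keys whose winning cell is not a tombstone."""
    return [k for k in sorted(merged) if merged[k][1] is not None][:limit]


def conclusive_bound(responses):
    """Smallest 'last key returned' across responses. Beyond it, at least one
    replica was not read. Assumes every replica stopped because of the limit."""
    return min(max(response) for response in responses)


limit = 2
responses = [replica_read(r2, limit), replica_read(r3, limit)]

naive = live_keys(merge(responses), limit)      # what the client gets today
complete = live_keys(merge([r2, r3]), limit)    # answer with full visibility
bound = conclusive_bound(responses)
safe = [k for k in naive if k <= bound]         # rows seen on both replicas

print(naive)     # ['1', '7']
print(complete)  # ['1', '6']
print(safe)      # ['1']
```

In this model the coordinator would have to read further on r3 (past key
'2') before it could say anything about rows 6 and 7.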
