Duncan Sands created CASSANDRA-6887:
---------------------------------------

             Summary: LOCAL_ONE read repair only does local repair, in spite of
                      global digest queries
                 Key: CASSANDRA-6887
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6887
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Cassandra 2.0.6, x86-64 ubuntu precise
            Reporter: Duncan Sands


I have a cluster spanning two data centres.  Almost all of the writing (and a 
lot of reading) is done in DC1.  DC2 is used for running the occasional 
analytics query.  Reads in both data centres use LOCAL_ONE.  Read repair 
settings are set to the defaults on all column families.

I had a long network outage between the data centres; it lasted longer than the 
hints window, so after it was over DC2 didn't have the latest information.  
Even after reading the data many, many times in DC2, the returned data was still
out of date: read repair was not correcting it.

I then investigated using cqlsh in DC2, with tracing on.

What I saw was:

  - with consistency ONE, after about 10 read requests a digest request would 
be sent to many nodes (spanning both data centres), and the data in DC2 would 
be repaired.

  - with consistency LOCAL_ONE, after about 10 read requests a digest request 
would be sent to many nodes (spanning both data centres), but the data in DC2 
would not be repaired.  This is in spite of digest requests being sent to DC1, 
as shown by the tracing.
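
The "about 10 read requests" figure in both cases is what one would expect if 
each read independently triggers a global digest round with a probability of 
roughly 0.1.  That probability is an assumption used here for illustration; it 
is not a value confirmed in this report.  Under that assumption the expected 
number of reads until the first trigger is 1/p.  A minimal sketch that checks 
this by simulation (all names are hypothetical):

```python
import random


def reads_until_trigger(p, rng):
    """Count reads until one triggers, each triggering independently with probability p."""
    count = 1
    while rng.random() >= p:
        count += 1
    return count


def mean_reads_until_trigger(p, trials=100_000, seed=0):
    """Average the count over many simulated runs, using a fixed seed."""
    rng = random.Random(seed)
    return sum(reads_until_trigger(p, rng) for _ in range(trials)) / trials


# ASSUMPTION: 0.1 is an illustrative per-read trigger probability.
ASSUMED_CHANCE = 0.1
EXPECTED_READS = 1 / ASSUMED_CHANCE  # mean of a geometric distribution

if __name__ == "__main__":
    print("analytic:", EXPECTED_READS)
    print("simulated:", mean_reads_until_trigger(ASSUMED_CHANCE))
```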

So it looks like digest requests are being sent to both data centres, but 
replies from outside the local data centre are ignored when using LOCAL_ONE.
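
To make the suspected behaviour concrete, here is a toy model of it.  This is 
not Cassandra's code: the Replica class, the read_repair function and the 
ignore_remote_replies flag are all hypothetical, and the model only encodes my 
guess that remote digest replies are dropped under LOCAL_ONE.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    dc: str
    timestamp: int  # write timestamp of the newest version this replica holds


def read_repair(replicas, local_dc, ignore_remote_replies):
    """Toy read repair: digests go to every replica, but only the replies that
    are 'considered' can reveal a mismatch or be repaired.

    Returns the names of the replicas that were brought up to date.
    """
    # HYPOTHESIS: under LOCAL_ONE, replies from other data centres are dropped.
    considered = [
        r for r in replicas
        if not ignore_remote_replies or r.dc == local_dc
    ]
    newest = max(r.timestamp for r in considered)
    repaired = []
    for r in considered:
        if r.timestamp < newest:
            r.timestamp = newest
            repaired.append(r.name)
    return repaired


def make_cluster():
    # DC1 holds the latest write (timestamp 2); DC2 missed it (timestamp 1).
    return [
        Replica("dc1-a", "DC1", 2),
        Replica("dc1-b", "DC1", 2),
        Replica("dc2-a", "DC2", 1),
        Replica("dc2-b", "DC2", 1),
    ]
```

With ignore_remote_replies=False (my reading of the ONE case) a read in DC2 
repairs both DC2 replicas.  With ignore_remote_replies=True (my reading of the 
LOCAL_ONE case) nothing is repaired, whether the read is issued in DC2 or in 
DC1, which matches what I observed.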

The same data is being queried all the time in DC1 with consistency LOCAL_ONE, 
but this didn't result in the data in DC2 being read repaired either.  This is 
a slightly different case to what I described above: in that case the local 
node was out of date and the remote node had the latest data, while here it is 
the other way round.

It could be argued that you don't want cross data centre read repair when using 
LOCAL_ONE.  But then why bother sending cross data centre digest requests?  And 
if only doing local read repair is how it is supposed to work then it would be 
good to document this somewhere.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
