On Sun, Nov 16, 2014 at 5:13 PM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:

> I have read that read repair is supposed to run in the background, but
> does the coordinator node need to wait for the response (along with other
> normal read tasks) before returning the entire result to the caller?
>

For the 10% of requests where read repair is triggered, the coordinator
will send a request to every replica.  (A data request to two replicas,
digest requests to the rest.)  Once enough replicas have replied to satisfy
the consistency level, the result will be returned to the client; if
there's a mismatch in the responses from the replicas, a blocking repair
will be performed before responding to the client.  Later, in the
background, the coordinator will check the remaining responses from
replicas to see if they match up.  If any of them do not, they will be
repaired in the background.
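
To make that flow concrete, here is a rough Python sketch. It is a toy
model, not Cassandra's code: the function name, its arguments, and the use
of plain integers as stand-ins for row versions are all made up for
illustration. It also ignores things like speculative retry and
dclocal_read_repair_chance.

```python
import random

def coordinator_read(replica_versions, required, read_repair_chance, rng=random):
    """Toy model of the read path described above (hypothetical helper).

    replica_versions: one integer per replica; a higher number means newer data.
    required: how many replies the consistency level needs.
    read_repair_chance: probability that this read triggers read repair.
    """
    total = len(replica_versions)
    triggered = rng.random() < read_repair_chance

    # With read repair triggered, every replica is contacted;
    # otherwise only as many as the consistency level requires.
    contacted = total if triggered else required

    # The client waits only for the first `required` replies.
    blocking_replies = replica_versions[:required]
    blocking_repair = len(set(blocking_replies)) > 1
    resolved = max(blocking_replies)  # newest version wins in this toy model

    # Remaining replies are compared later, off the client's critical path.
    late_replies = replica_versions[required:contacted]
    background_repair = any(v != resolved for v in late_replies)

    return {
        "contacted": contacted,
        "blocking_repair": blocking_repair,
        "background_repair": background_repair,
    }
```

For example, coordinator_read([2, 2, 1], required=2, read_repair_chance=1.0)
reports three replicas contacted, no blocking repair (the first two replies
agree), and a background repair for the stale third replica.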


>
> #
> How does a high rate of read repair impact performance? I read that
> it will impact throughput but not latency. How so?
>

That's correct, it should impact throughput but not necessarily latency.
Throughput is lower because more replicas have to do work, but latency is
unaffected (unless you're hitting capacity) because blocking repair only
happens under the same conditions that it normally does.
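
A back-of-the-envelope way to see the throughput cost is to estimate how
many replica requests an average read generates. This sketch is only an
approximation under simple assumptions (a single datacenter, no speculative
retry, and a made-up helper name), with a replication factor of 3 and a
consistency level of ONE chosen purely as an example.

```python
def expected_replica_requests(replication_factor, required, read_repair_chance):
    """Average number of replicas contacted per read (hypothetical helper).

    Without read repair, only `required` replicas are contacted.
    With read repair, all `replication_factor` replicas are contacted.
    """
    without_rr = (1.0 - read_repair_chance) * required
    with_rr = read_repair_chance * replication_factor
    return without_rr + with_rr

# Example values only: RF=3, consistency level ONE.
baseline = expected_replica_requests(3, 1, 0.0)  # read repair disabled
default = expected_replica_requests(3, 1, 0.1)   # 10% of reads trigger it
print(f"extra replica work: {(default / baseline - 1) * 100:.0f}%")
```

With those example numbers the cluster does roughly 20% more replica work
per read, even though each client still waits on only one reply.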


>
> #
> Is it safe to just set read_repair_chance = 0?
> (We are mostly talking to one DC; the other DC serves as a
> backup/emergency most of the time.)
>

Sure, it's safe enough.  People use read repair for different reasons.
Some would say that RR keeps their other datacenter's caches warm. Others
rely on it in place of normal repairs (which is not particularly safe, but
if your consistency requirements allow for it, it's fine).  If you're
running regular repairs anyway, it's safe to turn off read repair.


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
