[ https://issues.apache.org/jira/browse/CASSANDRA-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504539#comment-14504539 ]
Sylvain Lebresne commented on CASSANDRA-7168:
---------------------------------------------
bq. What makes you uncomfortable about relying on repair time?
The problem is, what if we screw up the repair time (even temporarily)? Or what
if for some reason an sstable gets deleted from one node without the user
realizing right away (you could argue that in that case we can already break CL
guarantees and that's true, but this would make it a lot worse since in
practice reading from all replicas does give us a reasonably good protection
against this)? The fact that we'll be reading only one node (for the repaired
data at least) makes it a lot easier imo to screw up consistency guarantees
than if we actually read the data on every node (even if just to send digests).
In a way, data/digest reads are a bit brute-force, but that's what makes them a
pretty reliable mechanism. Relying too heavily on the repair time feels
fragile in comparison, and being fragile when it comes to consistency
guarantees makes me uncomfortable.
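For reference, a minimal sketch of what a data/digest read does on the
coordinator (hypothetical Row/digest types, not Cassandra's actual read path):
every replica is contacted, one returns its full result, the others return a
digest of theirs, and any mismatch forces a fallback to full data reads. That
per-request cross-check is what catches a replica that is silently missing an
sstable.
{code:java}
import java.security.MessageDigest;
import java.util.List;

// Hypothetical sketch only; these types do not exist in Cassandra.
public class DigestReadSketch
{
    record Row(String key, String value) {}

    static byte[] digest(Row row) throws Exception
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(row.key().getBytes());
        md.update(row.value().getBytes());
        return md.digest();
    }

    // True when every digest-only replica agrees with the data replica; on a
    // mismatch the coordinator would retry with full data reads and reconcile.
    static boolean digestsMatch(Row dataResponse, List<byte[]> digestResponses) throws Exception
    {
        byte[] expected = digest(dataResponse);
        return digestResponses.stream().allMatch(d -> MessageDigest.isEqual(expected, d));
    }
}
{code}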
bq. What would make you more comfortable?
I'm not sure. I would probably like to see this added as an opt-in feature
first (ideally with some granularity, either per-query or per-table) so we can
slowly build some confidence that our handling of the repair time is solid and
that we have fail-safes around that mechanism for when things go badly.
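To illustrate the kind of granularity I mean, a minimal sketch, assuming a
hypothetical per-table flag with a per-query override (neither option exists
today, the names are made up):
{code:java}
// Hypothetical opt-in gate; the option names are for illustration only.
public class RepairedReadOptIn
{
    record TableOptions(boolean repairedReadOptimization) {}
    record QueryOptions(Boolean repairedReadOptimization) {} // null = use the table default

    static boolean useRepairedOptimization(TableOptions table, QueryOptions query)
    {
        // A per-query override lets operators try the optimization on a few
        // queries before switching it on for a whole table.
        if (query.repairedReadOptimization() != null)
            return query.repairedReadOptimization();
        return table.repairedReadOptimization();
    }
}
{code}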
> Add repair aware consistency levels
> -----------------------------------
>
> Key: CASSANDRA-7168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7168
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: T Jake Luciani
> Labels: performance
> Fix For: 3.1
>
>
> With CASSANDRA-5351 and CASSANDRA-2424 I think there is an opportunity to
> avoid a lot of extra disk I/O when running queries with higher consistency
> levels.
> Since repaired data is by definition consistent and we know which sstables
> are repaired, we can optimize the read path by having a REPAIRED_QUORUM which
> breaks reads into two phases:
>
> 1) Read from one replica the result from the repaired sstables.
> 2) Read from a quorum only the un-repaired data.
> For the node performing 1) we can pipeline the call so it's a single hop.
> In the long run (assuming data is repaired regularly) we will end up with
> much closer to CL.ONE performance while maintaining consistency.
> Some things to figure out:
> - If repairs fail on some nodes we can have a situation where we don't have
> a consistent repaired state across the replicas.
>
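For discussion, a minimal coordinator-side sketch of the two-phase read
described above (hypothetical Replica/ReadCommand/PartitionResult types, not
Cassandra's internals): one replica answers for the repaired sstables, a
quorum-sized set answers for the unrepaired ones, and the two results are
merged before responding. The repaired-data target can also be a member of the
quorum so both of its reads are pipelined into a single hop, as the
description suggests.
{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only; these types stand in for the real messaging layer.
public class RepairedQuorumReadSketch
{
    record ReadCommand(String keyspace, String table, String key) {}
    record PartitionResult(String rows) {}

    interface Replica
    {
        // Result computed from sstables marked repaired only.
        CompletableFuture<PartitionResult> readRepaired(ReadCommand cmd);
        // Result computed from sstables not (yet) marked repaired.
        CompletableFuture<PartitionResult> readUnrepaired(ReadCommand cmd);
    }

    // Phase 1: one replica answers for the repaired data.
    // Phase 2: a quorum-sized set of replicas answers for the unrepaired data.
    // The caller picks repairedTarget and quorumReplicas; repairedTarget may be
    // one of quorumReplicas so its two reads are pipelined in a single hop.
    static CompletableFuture<PartitionResult> repairedQuorumRead(ReadCommand cmd,
                                                                 Replica repairedTarget,
                                                                 List<Replica> quorumReplicas)
    {
        CompletableFuture<PartitionResult> repaired = repairedTarget.readRepaired(cmd);

        List<CompletableFuture<PartitionResult>> unrepaired =
            quorumReplicas.stream().map(r -> r.readUnrepaired(cmd)).toList();

        CompletableFuture<Void> quorumDone =
            CompletableFuture.allOf(unrepaired.toArray(CompletableFuture[]::new));

        return repaired.thenCombine(quorumDone, (repairedResult, ignored) ->
            merge(repairedResult, unrepaired.stream().map(CompletableFuture::join).toList()));
    }

    static PartitionResult merge(PartitionResult repaired, List<PartitionResult> unrepaired)
    {
        // A real implementation would reconcile by timestamp; concatenating is
        // enough to keep the sketch short.
        StringBuilder sb = new StringBuilder(repaired.rows());
        unrepaired.forEach(p -> sb.append(" | ").append(p.rows()));
        return new PartitionResult(sb.toString());
    }
}
{code}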