[jira] [Updated] (CASSANDRA-2890) Randomize (to some extend) the choice of the first replica for counter increment

Sylvain Lebresne (JIRA) Tue, 13 Sep 2011 05:03:39 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sylvain Lebresne updated CASSANDRA-2890:
----------------------------------------

    Attachment: 2890.patch

Ok, I hadn't understood the "the lowest value IP address is always chosen"
point, but as it turns out the DynamicSnitch falls back to the underlying
snitch's compareEndpoints() method when two endpoints have the same score
(which includes the case where they have no scores at all, for instance
because no reads have been done). And the SimpleSnitch compareEndpoints()
method happens to compare endpoints by IP address. I'm not sure this is a
really good choice, because:
  # it is not consistent with the sortEndpointByProximity sorting
  # it doesn't correspond to how the NetworkTopologySnitch compareEndpoints()
works when restricted to a single datacenter (it treats all nodes as equal).
So I think there may be something to change there, but in any case that is
not really relevant to this issue.
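
To make the tie-break concrete, here is a minimal sketch of the behaviour
described above. It is not Cassandra's actual code: the class TieBreakSketch,
the method sortByScore, the use of plain strings for endpoints and the
treatment of a missing score as 0.0 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the behaviour described in the comment;
// none of these names exist in Cassandra.
class TieBreakSketch {
    // Assumed fallback: compare endpoints by address. Plain string order
    // is a simplification of a real IP address comparison.
    static final Comparator<String> BY_ADDRESS = Comparator.naturalOrder();

    // Lower score sorts first. A missing score is treated as 0.0 (an
    // assumption), so endpoints without scores tie and fall back to
    // BY_ADDRESS.
    static List<String> sortByScore(List<String> replicas, Map<String, Double> scores) {
        List<String> sorted = new ArrayList<>(replicas);
        sorted.sort((a, b) -> {
            int byScore = Double.compare(scores.getOrDefault(a, 0.0),
                                         scores.getOrDefault(b, 0.0));
            return byScore != 0 ? byScore : BY_ADDRESS.compare(a, b);
        });
        return sorted;
    }
}
```

With an empty score map, every comparison is a tie, so the endpoint with the
lowest address always ends up first in the list, whatever the input order.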

On this issue, I agree with Stu that tracking the latency and using that is
probably the best solution (though that does presuppose the use of the
DynamicSnitch). However, this is not so simple, because the latency we have
(on the coordinator) is the latency of the whole counter write. That is, it
includes not only the read by the first replica, but also the local write
(not a big deal) and the latencies of the writes to the other replicas. Those
last ones depend on the consistency level, for instance. Also, a
TimeoutException does not necessarily mean that the first replica is to
blame; it could be that enough of the other replicas timed out (so that the
consistency level wasn't achieved).

There may be solutions to these problems, but I don't see any simple ones.
Now, as there are reports that this is a problem "in the wild", I propose we
go for the simple "randomize" solution for now and push this directly to the
0.8 series. Attaching a patch (against 0.8) for this. Then we can open another
ticket to improve on that.
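
For illustration only, the "randomize" idea could be sketched as below. This
is not the attached patch: the class RandomLeaderSketch, the method
pickLeader, the dcOf map from endpoint to data center name, and the fallback
to the first replica in snitch order when no replica is local are all
assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of choosing the first replica for a counter
// increment at random; names and fallback behaviour are assumptions.
class RandomLeaderSketch {
    // Picks a random replica among those in the coordinator's data center.
    // If none is local, returns the first replica in the given (snitch)
    // order.
    static String pickLeader(List<String> sortedReplicas,
                             Map<String, String> dcOf,
                             String localDc,
                             Random rng) {
        List<String> local = new ArrayList<>();
        for (String replica : sortedReplicas) {
            if (localDc.equals(dcOf.get(replica))) {
                local.add(replica);
            }
        }
        if (local.isEmpty()) {
            return sortedReplicas.get(0);
        }
        return local.get(rng.nextInt(local.size()));
    }
}
```

Compared with always taking the head of the snitch-sorted list, this spreads
the read done by the first replica over all local replicas, at the cost of
ignoring any latency information the snitch may have.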


> Randomize (to some extend) the choice of the first replica for counter 
> increment
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2890
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2890
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: counters
>             Fix For: 0.8.6
>
>         Attachments: 2890.patch
>
>
> Right now, we choose the first replica for a counter increment based solely 
> on what the snitch returns. If client requests are well balanced over the 
> cluster and the snitch is not misconfigured, this should not be a problem, 
> but this is probably too strong an assumption to make.
> The goal of this ticket is to change this to choose a random replica in the 
> current data center instead.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
