Joseph Lynch created CASSANDRA-14297:
----------------------------------------

             Summary: Optional startup delay for peers should wait for count 
rather than percentage
                 Key: CASSANDRA-14297
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14297
             Project: Cassandra
          Issue Type: Bug
          Components: Lifecycle
            Reporter: Joseph Lynch


As I commented in CASSANDRA-13993, the current wait-for-peers functionality is 
a great step in the right direction, but I don't think the current setting 
(70% of nodes in the cluster) is the right configuration option. First, 70% 
does not protect against errors: after waiting for 70% of the cluster you 
could still very easily get {{UnavailableException}} or 
{{ReadTimeoutException}}. Having even two nodes down in different racks of a 
Cassandra cluster makes these exceptions possible (and with the default 
{{num_tokens}} setting of 256 they are basically guaranteed). Second, the 
option is not easy for operators to set; the only value I could think of that 
would "just work" is 100%.
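
To make the arithmetic behind the first point concrete, here is a rough sketch in Python. The helper names are made up for illustration, and rounding the required count up is an assumption, not necessarily what the CASSANDRA-13993 patch does.

```python
def quorum(rf):
    """Replicas that must respond for a quorum request at replication factor rf."""
    return rf // 2 + 1


def replicas_allowed_down(rf):
    """Replicas of one token range that can be down while quorum is still reachable."""
    return rf - quorum(rf)


def nodes_allowed_down_by_percentage(cluster_size, percentage_up):
    """Nodes that may still be down once `percentage_up` percent of the cluster is up.

    Assumes the required count is rounded up; integer math avoids float rounding.
    """
    needed_up = -(-cluster_size * percentage_up // 100)  # ceiling division
    return cluster_size - needed_up
```

Under these assumptions a 12-node cluster at 70% may still have {{nodes_allowed_down_by_percentage(12, 70)}} = 3 nodes down, while RF=3 tolerates only {{replicas_allowed_down(3)}} = 1 down replica per range, so the gate can open while some range has lost quorum if two of those nodes replicate the same range.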

In that ticket I proposed that, instead of having `block_for_peers_percentage` 
default to 70%, we have `block_for_peers` as a count of nodes that are 
allowed to be down before the starting node makes itself available as a 
coordinator. We would still have the timeout to limit startup time and deal 
with really extreme situations (whole datacenters down, etc.).
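
The count-plus-timeout behavior could look roughly like the Python sketch below. It is illustrative only and not taken from the patch: {{wait_for_peers}}, {{is_up}}, and the polling approach are hypothetical names and choices, and the real implementation would presumably consult gossip state rather than poll a callback.

```python
import time


def wait_for_peers(peers, is_up, max_down, timeout_seconds,
                   poll_seconds=1.0, clock=time.monotonic, sleep=time.sleep):
    """Block until at most `max_down` peers are down, or the timeout expires.

    Returns True if the threshold was met, False if we gave up on the timeout.
    `is_up` is a caller-supplied callable (hypothetical), e.g. backed by gossip.
    """
    deadline = clock() + timeout_seconds
    while True:
        # Count peers currently considered down.
        down = sum(1 for peer in peers if not is_up(peer))
        if down <= max_down:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

With {{max_down=1}} the node starts coordinating as soon as no more than one peer is missing, and the timeout bounds the wait when many peers are down.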

I started working on a patch for this change [on 
github|https://github.com/jasobrown/cassandra/compare/13993...jolynch:13993], 
and am happy to finish it up with unit tests and such if someone can 
review/commit it (maybe [~aweisberg]?).

I think the short version of my proposal is we replace:
{noformat}
block_for_peers_percentage: <percentage needed up, defaults to 70%>
{noformat}

with either
{noformat}
block_for_peers: <number that can be down, defaults to 1>
{noformat}

or, if we want to do even better (imo) and let advanced operators fine-tune 
this behavior (while still having good defaults that work for almost 
everyone):
{noformat}
block_for_peers_local_dc: <number that can be down, defaults to 1>
block_for_peers_each_dc: <number that can be down, defaults to sys.maxint>
block_for_peers_all_dcs: <number that can be down, defaults to sys.maxint>
{noformat}

For example, if an operator knows that they must be available at 
{{LOCAL_QUORUM}} they would set {{block_for_peers_local_dc=1}}; if they use 
{{EACH_QUORUM}} they would set {{block_for_peers_each_dc=1}}; and if they use 
{{QUORUM}} (RF=3, dcs=2) they would set {{block_for_peers_all_dcs=2}}. 
Everything would of course have a timeout to prevent startup from taking 
too long.
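
As a rough illustration of how the three per-datacenter limits might combine, here is a Python sketch. The function name, the {{down_by_dc}} mapping, and the rule that all three limits must hold at once are assumptions for illustration; the proposal above does not spell out how the settings interact.

```python
import sys

UNLIMITED = sys.maxsize  # stand-in for the "sys.maxint" default in the proposal


def may_start_coordinating(down_by_dc, local_dc,
                           local_dc_limit=1,
                           each_dc_limit=UNLIMITED,
                           all_dcs_limit=UNLIMITED):
    """Return True when every configured down-node limit is satisfied.

    down_by_dc maps datacenter name -> number of peers currently seen as down.
    """
    # Local datacenter: only the starting node's own DC is counted.
    local_ok = down_by_dc.get(local_dc, 0) <= local_dc_limit
    # Each datacenter: the limit applies to every DC individually.
    each_ok = all(down <= each_dc_limit for down in down_by_dc.values())
    # All datacenters: the limit applies to the cluster-wide total.
    all_ok = sum(down_by_dc.values()) <= all_dcs_limit
    return local_ok and each_ok and all_ok
```

With the defaults only the local datacenter is checked, so {{may_start_coordinating({"dc1": 1, "dc2": 5}, "dc1")}} is true; an operator relying on {{QUORUM}} would additionally pass {{all_dcs_limit=2}}, per the example above.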




