Github user jolynch commented on a diff in the pull request:
https://github.com/apache/cassandra/pull/212#discussion_r225733894
--- Diff: src/java/org/apache/cassandra/config/Config.java ---
@@ -376,9 +376,31 @@
public String full_query_log_dir = null;
- // parameters to adjust how much to delay startup until a certain
amount of the cluster is connect to and marked alive
- public int block_for_peers_percentage = 70;
+ /**
+ * When a node first starts up it initially thinks all other peers are
DOWN, and then as the initial gossip
+ * broadcast messages come back nodes transition to UP. These options
configure how many nodes can remain in
+ * DOWN state before we make this node available as a coordinator, as
well as an overall timeout on this process
+ * to ensure that startup is not delayed too much.
+ *
+ * The defaults are tuned for LOCAL_ONE consistency levels with RF=3,
and have natural settings for other CLs:
+ *
+ * Consistency Level | local_dc | all_dcs
+ * --------------------------------------------------------
+ * LOCAL_ONE | default (2) | default (any)
+ * LOCAL_QUORUM | 1 | default (any)
+ * ONE | any | RF - 1
+ * QUORUM | any | (RF / 2) - 1
+ * ALL | default | 0
+ *
+ * A concrete example with QUORUM would be if you have 3 replicas in 2
datacenters, then you would set
+ * block_for_peers_all_dcs to (6 / 2) - 1 = 2 because that guarantees
that at most 2 hosts in all datacenters
+ * are down when you start taking client traffic, which should
satisfy QUORUM for all RF=6 QUORUM queries.
+ */
+ public int block_for_peers_local_dc = 2;
+ public int block_for_peers_all_dcs = Integer.MAX_VALUE;
--- End diff --
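The table in the Javadoc above reduces to simple arithmetic on the total replication factor. A minimal sketch of that arithmetic follows; `BlockForPeersTable` and its method names are hypothetical and not part of the patch, and only the formulas are taken from the table.

```java
// Hypothetical helper, not part of the patch: it only reproduces the
// arithmetic from the Javadoc table for block_for_peers_all_dcs.
final class BlockForPeersTable {
    /** ONE: per the table, up to RF - 1 hosts may remain down. */
    static int maxDownForOne(int totalRf) {
        return totalRf - 1;
    }

    /** QUORUM: the table's formula, (RF / 2) - 1, using integer division. */
    static int maxDownForQuorum(int totalRf) {
        return (totalRf / 2) - 1;
    }

    /** ALL: per the table, no host may remain down. */
    static int maxDownForAll() {
        return 0;
    }

    public static void main(String[] args) {
        // The Javadoc example: 3 replicas in each of 2 datacenters.
        int totalRf = 3 * 2;
        System.out.println("ONE    -> " + maxDownForOne(totalRf));
        System.out.println("QUORUM -> " + maxDownForQuorum(totalRf));
        System.out.println("ALL    -> " + maxDownForAll());
    }
}
```

With integer division the table's QUORUM formula is conservative for odd totals: a total RF of 3 yields 0, although a quorum of 2 out of 3 replicas would tolerate one replica being down.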
I agree, and 10s isn't that big a deal (imo) for users that don't want to
wait for remote DCs. I was trying to compromise between the old settings
(70% == could literally have a whole DC down with 3 DCs), which appear to be
motivated by a "don't block my startup" perspective, and what I personally
think, which is that we shouldn't say we're ready to coordinate until we're
actually ready to coordinate.
I am happy to put whatever default gets this merged ;-) We'll probably
internally be setting the local setting to `1` (so `LOCAL_QUORUM`) and the
remote to a really large number. But that's just our perspective...
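
As a rough illustration of how the two thresholds might combine, here is a minimal sketch. `StartupGate` and `readyToCoordinate` are hypothetical names, not Cassandra APIs. The `<=` comparison is an assumption based on the Javadoc wording ("how many nodes can remain in DOWN state"), and the overall timeout is left out.

```java
// Hypothetical sketch, not the patch's implementation: it assumes the node
// may coordinate once the number of DOWN peers is at or below each threshold.
final class StartupGate {
    static boolean readyToCoordinate(int downInLocalDc, int downInAllDcs,
                                     int blockForPeersLocalDc, int blockForPeersAllDcs) {
        // Both limits must hold; Integer.MAX_VALUE effectively disables a limit.
        return downInLocalDc <= blockForPeersLocalDc
            && downInAllDcs <= blockForPeersAllDcs;
    }

    public static void main(String[] args) {
        // Defaults from the diff: local_dc = 2, all_dcs = Integer.MAX_VALUE.
        System.out.println(readyToCoordinate(2, 5, 2, Integer.MAX_VALUE));
        // The LOCAL_QUORUM-style setup from the comment: local = 1, remote very large.
        System.out.println(readyToCoordinate(2, 5, 1, Integer.MAX_VALUE));
    }
}
```

Under these assumptions, tightening only the local threshold makes the node wait for more local peers while remote datacenters never hold up startup.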
---