Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/212#discussion_r225732037
  
    --- Diff: src/java/org/apache/cassandra/config/Config.java ---
    @@ -376,9 +376,31 @@
     
         public String full_query_log_dir = null;
     
    -    // parameters to adjust how much to delay startup until a certain 
amount of the cluster is connect to and marked alive
    -    public int block_for_peers_percentage = 70;
    +    /**
    +     * When a node first starts up it initially thinks all other peers are
DOWN, and then as the initial gossip
    +     * broadcast messages come back nodes transition to UP. These options
configure how many nodes can remain in
    +     * DOWN state before we make this node available as a coordinator, as 
well as an overall timeout on this process
    +     * to ensure that startup is not delayed too much.
    +     *
    +     * The defaults are tuned for LOCAL_ONE consistency levels with RF=3, 
and have natural settings for other CLs:
    +     *
    +     *     Consistency Level | local_dc     | all_dcs
    +     *     --------------------------------------------------------
    +     *     LOCAL_ONE         | default (2)  | default (any)
    +     *     LOCAL_QUORUM      | 1            | default (any)
    +     *     ONE               | any          | RF - 1
    +     *     QUORUM            | any          | (RF / 2) - 1
    +     *     ALL               | default      | 0
    +     *
    +     * A concrete example with QUORUM would be if you have 3 replicas in 2 
datacenters, then you would set
    +     * block_for_peers_all_dcs to (6 / 2) - 1 = 2 because that guarantees 
that at most 2 hosts in all datacenters
    +     * are down when you start taking client traffic, which should 
satisfy QUORUM for all RF=6 QUORUM queries.
    +     */
    +    public int block_for_peers_local_dc = 2;
    --- End diff --
    
    > If the goal is to not return unavailable shouldn't the default really be 
1 since LOCAL_QUORUM and QUORUM are more common? I feel like if this doesn't do 
what people want with the defaults it makes it a lot less useful?
    
    I thought it was most sensible to default to handling `LOCAL_ONE`, since 
that is what the drivers default to, although I can make it `1` if you prefer 
and say the defaults are for `LOCAL_QUORUM` (I don't have strong opinions, 
other than that we shouldn't be waiting on remote DCs by default).
    
    > All this does is add 10 seconds to startup which seems really minor. A 
part of me even wonders why fuss with any of this if we are just going to wait 
10 seconds? Just wait until everything is up and if it doesn't happen in 10 
seconds continue to do what we were going to do anyways?
    
    In practice you wait far less than 10 seconds (e.g. on the test 200 node 
4.0 cluster we handshake with the whole local DC in about ... 500ms). I 
personally think that most users would rather their database wait O(minutes) 
than throw errors in the general case, but from the earlier conversations in 
CASSANDRA-13993 it seemed like folks were hesitant to wait that long on 
startup, e.g. when running ccm clusters and the like.
    
    > The name also doesn't really make sense anymore since this is not the 
number we are blocking for it's the number we aren't blocking for.
    
    How about ... `startup_max_down_local_dc_peers` and 
`startup_max_down_peers`?
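
    To make the QUORUM row of the table concrete, here is a quick sketch of 
the arithmetic (a hypothetical helper for illustration, not part of this 
patch):

```java
// Hypothetical helper illustrating the javadoc's QUORUM arithmetic;
// not part of the patch itself.
public class StartupBlockMath {
    // Max peers that may still be DOWN while a QUORUM(rf) operation can
    // still find enough live replicas: rf - quorum, where quorum = rf/2 + 1.
    static int maxDownForQuorum(int rf) {
        int quorum = rf / 2 + 1; // e.g. RF=6 -> quorum of 4
        return rf - quorum;      // e.g. 6 - 4 = 2, matching (6 / 2) - 1
    }

    public static void main(String[] args) {
        // The RF=6 example from the javadoc above.
        System.out.println(maxDownForQuorum(6));
    }
}
```

    For even RF this reduces to the `(RF / 2) - 1` shown in the table.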
    
    


