[ 
https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569908#comment-17569908
 ] 

Sam Tunnicliffe commented on CASSANDRA-13851:
---------------------------------------------

bq. A node will not start unless it can contact at least one seed node 

This is not true. In fact, it was the very point of this JIRA to fix that 
regression, which was introduced in CASSANDRA-10134. If a node cannot contact 
any seeds during startup, it will expand the set of nodes contacted in the 
shadow round to include all known peers. So only if no peers at all can be 
reached will the node fail to start (and even then, only if the node is not a 
seed itself).

{code:java}
while (true)
{
    if (slept % 5000 == 0)
    { // CASSANDRA-8072, retry at the beginning and every 5 seconds
        logger.trace("Sending shadow round GOSSIP DIGEST SYN to seeds {}", seeds);

        for (InetAddressAndPort seed : seeds)
            MessagingService.instance().send(message, seed);

        // Send to any peers we already know about, but only if a seed didn't respond.
        if (includePeers)
        {
            logger.trace("Sending shadow round GOSSIP DIGEST SYN to known peers {}", peers);
            for (InetAddressAndPort peer : peers)
                MessagingService.instance().send(message, peer);
        }
        includePeers = true;
    }

    Thread.sleep(1000);
    if (!inShadowRound)
        break;

    slept += 1000;
    if (slept > shadowRoundDelay)
    {
        // if we got here no peers could be gossiped to. If we're a seed that's OK,
        // but otherwise we stop. See CASSANDRA-13851
        if (!isSeed)
            throw new RuntimeException("Unable to gossip with any peers");

        inShadowRound = false;
        break;
    }
}
{code}
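
To make the escalation easier to follow, here is a minimal, self-contained sketch of the same loop with the messaging and sleeping replaced by a virtual clock. It is not Cassandra code: the class ShadowRoundSketch, the Outcome enum, the run() method and the "reachable" set are all hypothetical names invented for illustration, and a "reply" is modelled simply as a contacted address being in the reachable set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of the shadow round retry loop. Not Cassandra code. */
final class ShadowRoundSketch
{
    enum Outcome { GOSSIPED, NO_RESPONSE_BUT_SEED }

    /**
     * Runs the loop against a virtual clock (no real sleeping or networking).
     * "reachable" is an assumption of this sketch: any contacted address in
     * this set is treated as having replied, which ends the shadow round.
     */
    static Outcome run(List<String> seeds, List<String> peers, Set<String> reachable,
                       boolean isSeed, int shadowRoundDelayMs)
    {
        boolean inShadowRound = true;
        boolean includePeers = false;
        int slept = 0;

        while (true)
        {
            if (slept % 5000 == 0)
            {
                // First attempt: seeds only. Every later attempt: seeds plus known peers.
                List<String> targets = new ArrayList<>(seeds);
                if (includePeers)
                    targets.addAll(peers);
                includePeers = true;

                for (String target : targets)
                    if (reachable.contains(target))
                        inShadowRound = false;
            }

            // Stands in for Thread.sleep(1000) followed by the inShadowRound check.
            if (!inShadowRound)
                return Outcome.GOSSIPED;

            slept += 1000;
            if (slept > shadowRoundDelayMs)
            {
                // Nobody replied in time: fatal for a non-seed, acceptable for a seed.
                if (!isSeed)
                    throw new RuntimeException("Unable to gossip with any peers");
                return Outcome.NO_RESPONSE_BUT_SEED;
            }
        }
    }
}
```

In this model, a non-seed node whose only seed is down but which has one reachable peer leaves the shadow round on the second attempt (virtual time 5000 ms), provided shadowRoundDelayMs is at least 5000. With nothing reachable, the same node throws, while a seed returns NO_RESPONSE_BUT_SEED and carries on.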


> Allow existing nodes to use all peers in shadow round
> -----------------------------------------------------
>
>                 Key: CASSANDRA-13851
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13851
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Startup and Shutdown
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Normal
>             Fix For: 3.11.3, 4.0-alpha1, 4.0
>
>
> In CASSANDRA-10134 we made collision checks necessary on every startup. A 
> side effect was introduced that then requires a node's seeds to be contacted 
> on every startup. Prior to this change, an existing node could start up 
> regardless of whether it could contact a seed node (because 
> checkForEndpointCollision() was only called for bootstrapping nodes). 
> Now, if a node's seeds are removed/deleted/fail, it will no longer be able to 
> start up until live seeds are configured (or it is itself made a seed), even 
> though it already knows about the rest of the ring. This is inconvenient for 
> operators and has the potential to cause some nasty surprises and increase 
> downtime.
> One solution would be to use all of a node's existing peers as seeds in the 
> shadow round. Not a Gossip guru though, so not sure of the implications.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
