[ https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569908#comment-17569908 ]
Sam Tunnicliffe commented on CASSANDRA-13851:
---------------------------------------------

bq. A node will not start unless it can contact at least one seed node

This is not true. In fact, it was the very point of this JIRA to fix that regression, which was introduced in CASSANDRA-10134. If a node cannot contact any seeds during startup, it will expand the set of nodes contacted in the shadow round to include all known peers. So only if no peers at all can be reached will the node fail to start (and even then, only if the node is not a seed itself).

{code:java}
while (true)
{
    if (slept % 5000 == 0)
    { // CASSANDRA-8072, retry at the beginning and every 5 seconds
        logger.trace("Sending shadow round GOSSIP DIGEST SYN to seeds {}", seeds);

        for (InetAddressAndPort seed : seeds)
            MessagingService.instance().send(message, seed);

        // Send to any peers we already know about, but only if a seed didn't respond.
        if (includePeers)
        {
            logger.trace("Sending shadow round GOSSIP DIGEST SYN to known peers {}", peers);
            for (InetAddressAndPort peer : peers)
                MessagingService.instance().send(message, peer);
        }
        includePeers = true;
    }

    Thread.sleep(1000);
    if (!inShadowRound)
        break;

    slept += 1000;
    if (slept > shadowRoundDelay)
    {
        // if we got here no peers could be gossiped to. If we're a seed that's OK, but otherwise we stop. See CASSANDRA-13851
        if (!isSeed)
            throw new RuntimeException("Unable to gossip with any peers");

        inShadowRound = false;
        break;
    }
}
{code}

> Allow existing nodes to use all peers in shadow round
> -----------------------------------------------------
>
>                 Key: CASSANDRA-13851
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13851
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Startup and Shutdown
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Normal
>             Fix For: 3.11.3, 4.0-alpha1, 4.0
>
>
> In CASSANDRA-10134 we made collision checks necessary on every startup.
> A side-effect was introduced that then requires a node's seeds to be contacted
> on every startup. Prior to this change, an existing node could start up
> regardless of whether it could contact a seed node (because
> checkForEndpointCollision() was only called for bootstrapping nodes).
> Now, if a node's seeds are removed, deleted, or fail, it will no longer be able
> to start up until live seeds are configured (or it is itself made a seed), even
> though it already knows about the rest of the ring. This is inconvenient for
> operators and has the potential to cause some nasty surprises and increase
> downtime.
> One solution would be to use all of a node's existing peers as seeds in the
> shadow round. Not a Gossip guru though, so not sure of the implications.
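
For anyone who wants to reason about the outcomes without reading the Gossiper code, the fallback described in the comment (seeds first, then known peers, and a failure only when a non-seed reaches nobody) can be modelled roughly as below. This is a simplified sketch, not the actual implementation: the class ShadowRoundSketch, the Outcome enum, the reachable set and the maxAttempts parameter are all hypothetical names, and a "reachable" node simply stands in for one that answers the shadow round SYN.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the shadow round fallback; not Cassandra's real code.
final class ShadowRoundSketch {
    enum Outcome { GOSSIPED_WITH_SEED, GOSSIPED_WITH_PEER, STARTED_AS_LONE_SEED, FAILED }

    // seeds:       configured seed addresses
    // peers:       peers the node already knows about from a previous run
    // reachable:   assumption standing in for "this node answers the SYN"
    // maxAttempts: assumption standing in for the retries that fit in shadowRoundDelay
    static Outcome shadowRound(List<String> seeds, List<String> peers,
                               Set<String> reachable, boolean isSeed, int maxAttempts) {
        boolean includePeers = false;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Every attempt contacts the seeds.
            for (String seed : seeds)
                if (reachable.contains(seed))
                    return Outcome.GOSSIPED_WITH_SEED;

            // From the second attempt on, known peers are contacted as well.
            if (includePeers)
                for (String peer : peers)
                    if (reachable.contains(peer))
                        return Outcome.GOSSIPED_WITH_PEER;

            includePeers = true;
        }
        // Nobody answered: acceptable for a seed, fatal for any other node.
        return isSeed ? Outcome.STARTED_AS_LONE_SEED : Outcome.FAILED;
    }

    public static void main(String[] args) {
        // The configured seed is down, but a previously known peer is up.
        Outcome outcome = shadowRound(List.of("seed1"), List.of("peer1", "peer2"),
                                      Set.of("peer2"), false, 3);
        System.out.println(outcome); // prints GOSSIPED_WITH_PEER
    }
}
```

In this model, the only path to FAILED is a non-seed node with no reachable seed and no reachable peer, which matches the behaviour described in the comment.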