[
https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767361#comment-17767361
]
Cameron Zemek commented on CASSANDRA-18845:
-------------------------------------------
{noformat}
Sep 21 03:01:42 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Waiting for gossip to settle...
Sep 21 03:01:48 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Gossip looks settled. epSize=108
Sep 21 03:01:49 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Gossip looks settled. epSize=108
Sep 21 03:01:50 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Gossip looks settled. epSize=108
Sep 21 03:02:00 ip-10-1-32-228 cassandra[52927]: INFO o.a.c.gms.GossipDigestAckVerbHandler Received a GossipDigestAckMessage from /15.223.140.86
Sep 21 03:02:00 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper Sending a EchoMessage to /44.229.153.229
...
Sep 21 03:03:40 ip-10-1-32-228 cassandra[52927]: INFO org.apache.cassandra.gms.Gossiper InetAddress /44.229.153.229 is now UP
{noformat}
Got a test run with an 18-second delay.
> Waiting for gossip to settle on live endpoints
> ----------------------------------------------
>
> Key: CASSANDRA-18845
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18845
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Cameron Zemek
> Priority: Normal
> Attachments: 18845-seperate.patch, delay.log, example.log,
> image-2023-09-14-11-16-23-020.png, stream.log, test1.log, test2.log, test3.log
>
>
> This is a follow-up to CASSANDRA-18543.
> Although that ticket added the ability to set
> cassandra.gossip_settle_min_wait_ms, tuning it is tedious and error-prone. On
> one node I just observed a 79-second gap between waiting for gossip and the
> first echo response indicating that a node is UP.
> The problem is that we do not want to start Native Transport until gossip
> settles; otherwise queries can fail consistency levels such as LOCAL_QUORUM
> because the node still thinks the replicas are in the DOWN state.
> Instead of having to set gossip_settle_min_wait_ms, I am proposing that
> (outside a single-node cluster) we wait for an UP message from another node
> before considering gossip settled. E.g.:
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> }
> {code}
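
To make the proposed condition concrete, here is a minimal, self-contained sketch of a settle loop that uses it. It is not the Gossiper implementation: the class name GossipSettleSketch, the method pollsUntilSettled, the constant SUCCESSES_REQUIRED (set to 3 here) and the idea of feeding the loop pre-recorded {endpoint count, live count} samples are all assumptions made for illustration. Only the comparison itself comes from the snippet above. The sketch also does not special-case a single-node cluster, so with this condition alone such a cluster would never count as settled.

```java
// Hypothetical sketch: names and constants are illustrative, not Cassandra's API.
final class GossipSettleSketch {
    // Number of consecutive stable polls required (assumed value).
    static final int SUCCESSES_REQUIRED = 3;

    /**
     * Replays a sequence of polls, each given as {endpointCount, liveCount},
     * where liveCount includes the local node. Returns the index of the poll
     * at which gossip is considered settled, or -1 if the samples run out first.
     */
    static int pollsUntilSettled(int[][] samples) {
        if (samples.length == 0) {
            return -1;
        }
        int epSize = samples[0][0];
        int liveSize = samples[0][1];
        int numOkay = 0;
        for (int i = 1; i < samples.length; i++) {
            int currentSize = samples[i][0];
            int currentLive = samples[i][1];
            // Proposed condition: both counts are stable AND at least one
            // other node has been marked UP (liveSize > 1).
            if (currentSize == epSize && currentLive == liveSize && liveSize > 1) {
                numOkay++;
            } else {
                numOkay = 0; // any change, or no live peer yet, resets the streak
            }
            epSize = currentSize;
            liveSize = currentLive;
            if (numOkay == SUCCESSES_REQUIRED) {
                return i;
            }
        }
        return -1;
    }
}
```

With samples in which the endpoint count stays at 108 but the live count stays at 1, pollsUntilSettled returns -1, whereas a check on the endpoint count alone would report the cluster as settled. Once a peer is marked UP and both counts hold steady for three further polls, it returns the index of that third stable poll.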
--
This message was sent by Atlassian Jira
(v8.20.10#820010)