[ https://issues.apache.org/jira/browse/CASSANDRA-18845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766949#comment-17766949 ]

Cameron Zemek commented on CASSANDRA-18845:
-------------------------------------------

Still running, but sharing the results so far:
{noformat}
$ pytest --count=500 --cassandra-dir=/home/grom/dev/cassandra transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_between_and_cleanup
/home/grom/dtest/lib/python3.10/site-packages/ccmlib/common.py:773: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  return LooseVersion(match.group(1))
====================================== test session starts =======================================
platform linux -- Python 3.10.12, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/grom/tmp/cassandra-dtest
configfile: pytest.ini
plugins: repeat-0.9.1, flaky-3.7.0, timeout-1.4.2
timeout: 900.0s
timeout method: signal
timeout func_only: False
collected 500 items

transient_replication_ring_test.py ....................................................... [ 11%]
................................................................{noformat}

> Waiting for gossip to settle on live endpoints
> ----------------------------------------------
>
>                 Key: CASSANDRA-18845
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18845
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Cameron Zemek
>            Priority: Normal
>         Attachments: delay.log, example.log,
>                      image-2023-09-14-11-16-23-020.png, test1.log, test2.log, test3.log
>
>
> This is a follow-up to CASSANDRA-18543.
> Although that ticket added the ability to set cassandra.gossip_settle_min_wait_ms,
> this is tedious and error-prone. On one node I just observed a 79-second gap
> between waiting for gossip and the first echo response indicating that a node is
> UP.
> The problem is that we do not want to start Native Transport until gossip
> settles; otherwise queries can fail at consistency levels such as LOCAL_QUORUM
> because the node thinks the replicas are still in the DOWN state.
> Instead of having to set gossip_settle_min_wait_ms, I propose that
> (outside a single-node cluster) we wait for an UP message from another node
> before considering gossip settled. E.g.:
> {code:java}
> if (currentSize == epSize && currentLive == liveSize && liveSize > 1)
> {
>     logger.debug("Gossip looks settled.");
>     numOkay++;
> }
> {code}
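
To illustrate the effect of the proposed `liveSize > 1` check, here is a minimal, self-contained sketch of a settle loop driven by a recorded sequence of gossip polls rather than a live cluster. It is not the Cassandra implementation: the class `GossipSettleSketch`, the `Snapshot` type, the `requireLivePeer` flag, and the choice to reset the counter on a failed poll are all assumptions made for illustration. Only the variable names `epSize`, `liveSize`, `currentSize`, `currentLive`, and `numOkay` come from the snippet in the ticket.

```java
import java.util.List;

public class GossipSettleSketch {
    /** Hypothetical record of one gossip poll: known endpoints and how many are live. */
    static final class Snapshot {
        final int endpoints;
        final int live;

        Snapshot(int endpoints, int live) {
            this.endpoints = endpoints;
            this.live = live;
        }
    }

    /**
     * Returns the index of the poll at which gossip is considered settled, or -1
     * if it never settles within the recorded polls.
     *
     * requireLivePeer = true applies the proposed "liveSize > 1" condition;
     * false models a check without it (e.g. for a single-node cluster).
     */
    static int pollsUntilSettled(List<Snapshot> polls, int successesRequired, boolean requireLivePeer) {
        if (polls.isEmpty()) {
            return -1;
        }
        int epSize = polls.get(0).endpoints;
        int liveSize = polls.get(0).live;
        int numOkay = 0;
        for (int i = 1; i < polls.size(); i++) {
            int currentSize = polls.get(i).endpoints;
            int currentLive = polls.get(i).live;
            boolean stable = currentSize == epSize && currentLive == liveSize;
            boolean peerSeenUp = !requireLivePeer || liveSize > 1;
            if (stable && peerSeenUp) {
                numOkay++;
            } else {
                numOkay = 0; // assumption: any failed poll restarts the count
            }
            epSize = currentSize;
            liveSize = currentLive;
            if (numOkay >= successesRequired) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three endpoints are known from the start, but peers are only marked UP at poll 4.
        List<Snapshot> polls = List.of(
            new Snapshot(3, 1), new Snapshot(3, 1), new Snapshot(3, 1), new Snapshot(3, 1),
            new Snapshot(3, 3), new Snapshot(3, 3), new Snapshot(3, 3), new Snapshot(3, 3));
        System.out.println(pollsUntilSettled(polls, 3, false)); // prints 3
        System.out.println(pollsUntilSettled(polls, 3, true));  // prints 7
    }
}
```

In this recorded sequence, the check without the extra condition reports "settled" at poll 3, while every peer is still considered DOWN. With the condition, it waits until the live count has risen above one and then stayed stable, which happens at poll 7.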