[ 
https://issues.apache.org/jira/browse/CASSANDRA-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Lightfoot updated CASSANDRA-21185:
--------------------------------------
    Description: 
Tests often fail because no seed node is up while non-seeds are trying to join 
the ring. A likely fix is to start the seed node first, ensure CMS 
initialization is complete, and then allow the other nodes in CCM to start in 
parallel.

Affects 5.1+ due to <=5.0 using [sequential 
startup|https://github.com/apache/cassandra-dtest/blob/trunk/bootstrap_test.py#L254C30-L254C32].

On further analysis it appears two separate clusters form due to the seed node 
not accepting messages during CMS initialization. Attached logs show the 
independent clusters resulting from 
_bootstrap_test.py::TestBootstrap::test_read_from_bootstrapped_node._

  was:
Tests often fail because no seed node is up while non-seeds are trying to join 
the ring. A likely fix is to start the seed node first, ensure CMS 
initialization is complete, and then allow the other nodes in CCM to start in 
parallel.

Affects 5.1+ due to <=5.0 using [sequential 
startup|https://github.com/apache/cassandra-dtest/blob/trunk/bootstrap_test.py#L254C30-L254C32].

On further analysis it appears split-brain is occurring leading to two separate 
clusters being formed due to the seed node not accepting messages during CMS 
initialization. Attached logs show the independent clusters resulting from 
_bootstrap_test.py::TestBootstrap::test_read_from_bootstrapped_node_


> Fix flaky DTest: bootstrap_test_*
> ---------------------------------
>
>                 Key: CASSANDRA-21185
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21185
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Sam Lightfoot
>            Assignee: Sam Lightfoot
>            Priority: Normal
>             Fix For: 5.1
>
>         Attachments: split_brain_logs.txt
>
>
> Tests often fail because no seed node is up while non-seeds are trying to 
> join the ring. A likely fix is to start the seed node first, ensure CMS 
> initialization is complete, and then allow the other nodes in CCM to start 
> in parallel.
> Affects 5.1+ due to <=5.0 using [sequential 
> startup|https://github.com/apache/cassandra-dtest/blob/trunk/bootstrap_test.py#L254C30-L254C32].
> On further analysis it appears two separate clusters form due to the seed 
> node not accepting messages during CMS initialization. Attached logs show the 
> independent clusters resulting from 
> _bootstrap_test.py::TestBootstrap::test_read_from_bootstrapped_node._
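
The startup ordering proposed in the description (seed first, then the 
remaining nodes in parallel) could be sketched roughly as below. This is only 
an illustration, not the actual patch: the helper name start_seed_first is 
hypothetical, the node objects are assumed to expose a CCM-style 
start(wait_for_binary_proto=...) method, and waiting for the native protocol 
port is assumed to be an adequate proxy for "CMS initialization is complete".

```python
from concurrent.futures import ThreadPoolExecutor


def start_seed_first(seed, others, max_workers=None):
    """Start the seed node, wait for it, then start the other nodes in parallel.

    Hypothetical helper: `seed` and every item in `others` only need a
    CCM-style start(wait_for_binary_proto=...) method.
    """
    # Assumption: once the seed accepts native-protocol connections, its CMS
    # initialization has finished and it will accept messages from joiners.
    seed.start(wait_for_binary_proto=True)

    others = list(others)
    if not others:
        return  # single-node cluster, nothing left to start

    # Start the non-seed nodes concurrently, one worker per node by default.
    with ThreadPoolExecutor(max_workers=max_workers or len(others)) as pool:
        futures = [
            pool.submit(node.start, wait_for_binary_proto=True)
            for node in others
        ]
        for future in futures:
            future.result()  # re-raise any startup failure in the caller
```

In a dtest this might be called with the first node of the cluster as the 
seed and the rest of the node list as `others`. Whether parallel joins are 
safe once the seed is up is exactly what the attached logs would need to 
confirm.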



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

