[ 
https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176483#comment-17176483
 ] 

Alexander Dejanovski commented on CASSANDRA-13701:
--------------------------------------------------

Quick update:

I was able to make the 
*bootstrap_test.py::TestBootstrap::test_simultaneous_bootstrap* pass with this 
branch.

The test assumes that the two starting nodes will see each other when they 
check for endpoint collisions. But if the nodes start at (roughly) the same 
time, both can perform the check before either of them is gossiping. Each 
then sees only node1 as part of the ring, which allows both to get tokens 
and start bootstrapping.

Since there's a 30s pause while waiting for gossip to settle, adding a 10s 
pause between the node2 and node3 startups allows us to "luckily" avoid the 
race condition.
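
To make the timing argument concrete, here is a toy model of the race in 
Python. It is only a sketch, not Cassandra or dtest code: the Node class, 
the simulate() helper and the 2s delay between a node's collision check and 
its becoming visible through gossip are all assumptions made up for 
illustration.

```python
from dataclasses import dataclass

# Hypothetical delay between a node's collision check and the moment other
# nodes can see it through gossip. The real value is not known here.
CHECK_TO_GOSSIP_DELAY_S = 2.0


@dataclass(frozen=True)
class Node:
    name: str
    start_s: float  # time at which the node performs its collision check

    @property
    def gossip_visible_s(self) -> float:
        # Time from which this node is visible to the others.
        return self.start_s + CHECK_TO_GOSSIP_DELAY_S


def peers_seen_at_check(node, others):
    """Names of the other joining nodes already visible when `node` checks."""
    return [o.name for o in others
            if o is not node and o.gossip_visible_s <= node.start_s]


def simulate(gap_s):
    """Start node2 at t=0 and node3 `gap_s` seconds later."""
    node2 = Node("node2", 0.0)
    node3 = Node("node3", gap_s)
    joining = [node2, node3]
    return {n.name: peers_seen_at_check(n, joining) for n in joining}


print(simulate(0.0))   # simultaneous start
print(simulate(10.0))  # 10s pause between node2 and node3
```

In this model, a simultaneous start leaves both nodes blind to each other at 
check time, while a 10s gap lets node3 see node2. The gap only helps as long 
as it exceeds the check-to-gossip delay, which is why the pause avoids the 
race rather than fixing it.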

The code is not bulletproof against this scenario, though.
I still wonder why this is only happening with the new token allocation 
algorithm. Furthermore, the tests are executed with num_tokens = 1, which 
makes picking a token fairly fast.
It seems that the orchestration differs between the random token allocation 
and the RF-based allocation, which makes the race condition more obvious.

I'll check the other failing tests tomorrow to see if we're dealing with the 
same problems. 

> Lower default num_tokens
> ------------------------
>
>                 Key: CASSANDRA-13701
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Chris Lohfink
>            Assignee: Alexander Dejanovski
>            Priority: Low
>             Fix For: 4.0-alpha
>
>
> For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not 
> necessary. It is very expensive for operations processes and scanning. It's 
> come up a lot, and it's pretty standard and known now within the community 
> to always reduce num_tokens. We should just lower the defaults.


