[
https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201025#comment-17201025
]
Paulo Motta commented on CASSANDRA-8494:
----------------------------------------
Dynamic virtual nodes (CASSANDRA-16141) will make it trivial to support
incremental bootstrap. The idea is similar to [~rustyrazorblade]'s suggestion in
[this
comment|https://issues.apache.org/jira/browse/CASSANDRA-8494?focusedCommentId=14264970&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14264970]:
a node will bootstrap one token at a time and announce to the cluster that the
token is ready to receive requests before bootstrapping the next token. The
pseudo-code is available
[here|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-incremental_bootstrap-py].
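For readers who do not follow the link, the loop described above can be
sketched roughly as follows. This is only an illustrative sketch, not the
contents of the gist and not Cassandra code: the names Node, stream_range and
announce are hypothetical placeholders, and streaming is stubbed out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a joining node; not a Cassandra class."""
    name: str
    pending_tokens: list
    owned_tokens: list = field(default_factory=list)


def stream_range(node, token):
    """Placeholder for streaming the data owned by `token` to `node`.

    A real implementation would block until streaming completes; this stub
    always reports success.
    """
    return True


def incremental_bootstrap(node, announce):
    """Bootstrap one token at a time, announcing each before the next starts."""
    while node.pending_tokens:
        token = node.pending_tokens.pop(0)
        if not stream_range(node, token):
            # Put the token back so a later retry can pick it up again.
            node.pending_tokens.insert(0, token)
            break
        node.owned_tokens.append(token)
        # Tell the cluster this token can now receive requests.
        announce(node.name, token)
    return node.owned_tokens


# Usage: record announcements in a list instead of gossiping them.
announced = []
node = Node("node4", pending_tokens=[10, 20, 30])
incremental_bootstrap(node, lambda name, token: announced.append((name, token)))
```

The point of the sketch is the ordering: a token is announced only after its
data has been streamed, and the next token is not started until then.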
> incremental bootstrap
> ---------------------
>
> Key: CASSANDRA-8494
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8494
> Project: Cassandra
> Issue Type: New Feature
> Components: Legacy/Streaming and Messaging
> Reporter: Jon Haddad
> Assignee: Yuki Morishita
> Priority: Low
> Labels: dense-storage
> Fix For: 4.x
>
>
> Current bootstrapping involves (to my knowledge) picking tokens and streaming
> data before the node is available for requests. This is problematic with
> "fat nodes", since as much as 20TB of data may have to be streamed over, which
> leaves a massive window of time before the machine can do anything useful.
> As a potential approach to mitigate the huge window of time before a node is
> available, I suggest modifying the bootstrap process to only acquire a single
> initial token before being marked UP. This would likely be a configuration
> parameter "incremental_bootstrap" or something similar.
> After the node is bootstrapped with this one token, it could go into UP
> state, and could then acquire additional tokens (one or a handful at a time),
> which would be streamed over while the node is active and serving requests.
> The benefit here is that with the default 256 tokens a node could become an
> active part of the cluster with less than 1% of its final data streamed over.
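The "less than 1%" figure in the description can be checked with a quick
calculation. The sketch below assumes data is spread evenly across tokens,
which the ticket does not state and which real clusters only approximate; the
function name is made up for illustration.

```python
def first_token_share(num_tokens):
    """Fraction of a node's final data owned by one token, assuming an
    even spread of data across tokens (an assumption, not a guarantee)."""
    return 1 / num_tokens


share = first_token_share(256)
# Using the 20TB example from the description, with 1 TB taken as 1000 GB.
first_token_gb = 20 * 1000 * share

print(f"{share:.2%} of final data, about {first_token_gb:.1f} GB of 20 TB")
```

Under that assumption a node with 256 tokens would need roughly 0.4% of its
final data before serving its first token.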