[
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16396587#comment-16396587
]
Kurt Greaves commented on CASSANDRA-5836:
-----------------------------------------
{quote}Hold on, how is this going to work at all? If the first node in new DC
is going to bootstrap (let's assume seeds are allowed to bootstrap) it will own
the whole token ring at first, so it will have to stream in all the data that
exists in the source DC, times the RF(s) of new DC. Even if the new node
doesn't die a horrible death in the process, you won't be able to add another
node to the cluster until this is finished. And even after that, adding the
next node to new DC will take ~50% of ownership from the first one, so you will
need to run cleanup on the first one in the end, etc. for the rest of the new
nodes.
It is totally unpractical to add new DC this way, so I firmly believe that
auto_bootstrap=false is here to stay for new DCs.
{quote}
That will only occur if you've added the new datacenter to the keyspace's
replication settings before adding the nodes in the new DC. For a while I was
under the impression that NTS keyspaces wouldn't bootstrap across DCs, but on
further testing they do. That's beside the point, however, as it's not really
standard practice to update your RF before you set up your new DC. You can, but
it's generally not a good idea for the reason you listed, unless you're working
with a really small dataset. Either way, it's for an expert to decide what they
want to do here.
Regardless, nothing says we need to change the [un]documented procedure for
adding a new DC because of this. It's perfectly acceptable to still use
{{auto_bootstrap: false}} for a new DC, even with my code changes. It would be
worth documenting the current behaviour, though: you'll stream *a lot* of data
if you update RF prior to bootstrapping a new DC.
bq. ... I have doubts that any of these checks can be made really bullet-proof.
I was saying you'd need that check if you changed all seeds to bootstrap.
Otherwise how would you tell whether you are the first node? Currently a seed
won't fail if you set {{auto_bootstrap: true}} and it's the first node in the
cluster; under what you're proposing, it would.
bq. Well, we have this for adding new DCs, so not really that silly. It also
doesn't have to be "we said so", for me the explanation is simple: the first
seed node will fail the bootstrap otherwise, because there is no other nodes to
bootstrap from yet.
I really don't think it's a good idea to change/further complicate the new
cluster startup process. It's not the same as adding a DC (and as I've said,
that procedure isn't necessarily correct). Many people will be relying on the
existing behaviour pretty heavily. Complicating it by saying "now your first
node will need auto_bootstrap: false" is not going to end well. How is it not
simpler that the first node just doesn't bootstrap, and all others do?
bq. Again, I have serious doubts about all this automatic corner case
detection. As I've said before I'm totally fine with making initial cluster set
up a little bit more involved, if that makes operations on the clusters in
production more reliable.
Cassandra is a complex beast. I've been in the startup code pretty heavily, and
I know first hand from working with hundreds of clusters that the
startup/bootstrapping code is a nightmare. All I'm proposing is that we change
the logic from "seeds don't bootstrap, regardless of configuration" to "the
first node in the cluster doesn't bootstrap; all other nodes respect the
auto_bootstrap setting". IMO this reduces the number of corner cases, because
you no longer have to think about replacing nodes, new DCs, replacing nodes as
seeds, new nodes as seeds, new nodes, or conflicting configs like
{{auto_bootstrap: true}} on a seed.
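
To make the difference between the two rules concrete, here is a minimal
sketch in Python. It is not the actual Cassandra startup code: the NodeState
type, its field names, and both function names are hypothetical, and details
such as whether a node has already completed bootstrap are deliberately left
out.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical view of a starting node; not a real Cassandra type."""
    is_seed: bool                  # node's own address is in its seed list
    auto_bootstrap: bool           # value of auto_bootstrap in its config
    cluster_has_other_nodes: bool  # assumption: some other node is reachable


def should_bootstrap_current(node: NodeState) -> bool:
    # Rule described in this thread as the existing behaviour:
    # seeds never bootstrap, whatever auto_bootstrap says.
    return node.auto_bootstrap and not node.is_seed


def should_bootstrap_proposed(node: NodeState) -> bool:
    # Proposed rule: only the first node in the cluster skips bootstrap;
    # every other node, seed or not, respects auto_bootstrap.
    if not node.cluster_has_other_nodes:
        return False
    return node.auto_bootstrap


if __name__ == "__main__":
    # A replacement node that is also listed as a seed.
    replacement_seed = NodeState(is_seed=True, auto_bootstrap=True,
                                 cluster_has_other_nodes=True)
    print(should_bootstrap_current(replacement_seed))
    print(should_bootstrap_proposed(replacement_seed))
```

Under these assumptions the very first node behaves the same under both rules,
a new DC started with {{auto_bootstrap: false}} is unaffected, and the only
case that changes is a seed joining an existing cluster with
{{auto_bootstrap: true}}.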
> Seed nodes should be able to bootstrap without manual intervention
> ------------------------------------------------------------------
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
> Issue Type: Bug
> Reporter: Bill Hathaway
> Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped. If a user
> wants to bootstrap a node configured as a seed (for example to replace a seed
> node via replace_token), they first need to remove the node's own IP from the
> seed list, and then start the bootstrap process. This seems like an
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a
> seed node to bootstrap without manual intervention when there are other seed
> nodes up in a ring.