[
https://issues.apache.org/jira/browse/CASSANDRA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107935#comment-13107935
]
Sylvain Lebresne commented on CASSANDRA-3219:
---------------------------------------------
bq. Which as I pointed out in chat is NOT a new problem, but it's one we should
address.

Agreed, but I suspect this is due (in the not-a-new-problem case) to races in
BootStrapper.getBootstrapSource()'s detection of already bootstrapping nodes. We
should fix that if possible, which the patch doesn't really do, since you will
still potentially have those races with auto. But note that this problem is
present in 0.8, and I think it is not a top priority because it's relatively rare
to actually start bootstrapping 2 nodes at the same time in real life. Again,
I'm not saying we shouldn't fix it, but it's ok to say that as long as it's not
worse than in 0.8, it can wait until after 1.0.0 to get fixed.
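
To illustrate why simultaneous starts can collide, here is a minimal sketch. It assumes, for illustration only, that a joining node picks the midpoint of the largest range in its current view of the ring. `balanced_token` and `RING_SIZE` are hypothetical names, not Cassandra code, and the real selection logic differs in detail.

```python
RING_SIZE = 2 ** 127  # token space size, assumed for this sketch


def balanced_token(ring):
    """Return the midpoint of the largest gap in a ring view (hypothetical helper)."""
    tokens = sorted(ring)
    best_start, best_width = tokens[0], 0
    for i, start in enumerate(tokens):
        end = tokens[(i + 1) % len(tokens)]
        # A single-node ring owns the whole token space.
        width = (end - start) % RING_SIZE or RING_SIZE
        if width > best_width:
            best_start, best_width = start, width
    return (best_start + best_width // 2) % RING_SIZE


# Two nodes start at the same time and see the same ring view,
# so the deterministic choice gives both of them the same token.
ring_view = [0]
token_a = balanced_token(ring_view)
token_b = balanced_token(ring_view)

# Had the second node seen the first one's token, it would have picked another.
token_b_sequential = balanced_token(ring_view + [token_a])
```

The collision comes from both nodes computing a deterministic answer from the same stale view, not from the midpoint rule itself.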

Now there is an actual new problem with 1.0.0. When you
start an initial cluster, i.e., when in 0.8 you would start nodes with
auto-bootstrap=false, you often do end up starting nodes simultaneously. That is
why older versions used a random token when auto-bootstrap was false. This
problem does need to be fixed for 1.0.0 because it is a serious regression.

However, my argument is that even though we now default to auto-bootstrap=true,
that doesn't mean there is no difference between setting up the initial
nodes of a cluster and the later bootstrapping of nodes to add capacity to an
existing cluster. Indeed, in 1.0.0 we decided to draw this line based on
whether a schema has been created or not (we call the bootstrap() method based
on that). Imho, this means that we have no bootstrap option and that "I have no
schema" is the old auto-bootstrap=false. So we should use a random token in that
case and a balanced one otherwise, the same way we do in 0.8.

So I'm saying that I would prefer we do that and postpone the fixing of
BootStrapper.getBootstrapSource(), rather than exposing (and making the default)
the random choice of tokens, which in my opinion is a bad idea.
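
A rough sketch of the rule I have in mind: no schema means an initial cluster, so pick a random token; otherwise pick a balanced one. All names here (`choose_token`, `has_schema`, `RING_SIZE`) are hypothetical, and the "balanced" choice is simplified to the midpoint of the largest gap, which is an assumption and not the actual bootstrap code.

```python
import random

RING_SIZE = 2 ** 127  # token space size, assumed for this sketch


def choose_token(has_schema, ring, rng=random):
    """Pick a token for a starting node (hypothetical helper).

    No schema (or an empty ring view): treat it as initial cluster setup
    and use a random token, so simultaneous starts almost never collide.
    Otherwise: split the largest gap in the ring view.
    """
    if not has_schema or not ring:
        return rng.randrange(RING_SIZE)

    tokens = sorted(ring)
    best_start, best_width = tokens[0], 0
    for i, start in enumerate(tokens):
        end = tokens[(i + 1) % len(tokens)]
        width = (end - start) % RING_SIZE or RING_SIZE
        if width > best_width:
            best_start, best_width = start, width
    return (best_start + best_width // 2) % RING_SIZE


# Initial cluster: two nodes starting together draw independent random tokens.
rng = random.Random(1)
first = choose_token(False, [0], rng)
second = choose_token(False, [0], rng)

# Existing cluster with a schema: the new node splits the largest range.
added = choose_token(True, [0])
```

The balanced branch can still race when two nodes are added to an existing cluster at once, which is the 0.8 behaviour and the part that can wait.
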
> Nodes started at the same time end up with the same token
> ---------------------------------------------------------
>
> Key: CASSANDRA-3219
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3219
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: T Jake Luciani
> Assignee: Jonathan Ellis
> Labels: bootstrap
> Fix For: 1.0.0
>
> Attachments: 3219.txt, 3219_v2.patch
>
>
> Since autobootstrap now defaults to on, when you start a cluster all at once
> (http://screenr.com/5G6) you can end up with nodes being assigned the same
> token.
> {code}
> INFO 17:34:55,688 Node /67.23.43.14 is now part of the cluster
> INFO 17:34:55,698 InetAddress /67.23.43.14 is now UP
> INFO 17:34:55,698 Nodes /67.23.43.14 and tjake2/67.23.43.15 have the same
> token 8823900603000512634329811229926543166. Ignoring /67.23.43.14
> INFO 17:34:55,698 Node /98.129.220.182 is now part of the cluster
> INFO 17:34:55,698 InetAddress /98.129.220.182 is now UP
> INFO 17:34:55,698 Nodes /98.129.220.182 and tjake2/67.23.43.15 have the same
> token 8823900603000512634329811229926543166. Ignoring /98.129.220.182
> {code}
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira