[ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105957#comment-16105957 ]
Jeremiah Jordan commented on CASSANDRA-13348: --------------------------------------------- I have also seen nodes have partial cluster state views if auto bootstrap was false, as in the auto bootstrap false case we do not wait for things to settle for as long, so in a large cluster or one with datacenters that have a lot of latency they come up before having a full view of things. So I could see this possibly happening in that case as well. The fix from CASSANDRA-13700 helps with that, as those conditions caused the bug there to cause even more trouble with gossip settling. > Duplicate tokens after bootstrap > -------------------------------- > > Key: CASSANDRA-13348 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13348 > Project: Cassandra > Issue Type: Bug > Reporter: Tom van der Woerdt > Assignee: Dikang Gu > Priority: Blocker > Fix For: 3.0.x > > > This one is a bit scary, and probably results in data loss. After a bootstrap > of a few new nodes into an existing cluster, two new nodes have chosen some > overlapping tokens. > In fact, of the 256 tokens chosen, 51 tokens were already in use on the other > node. > Node 1 log : > {noformat} > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 > StorageService.java:1160 - JOINING: waiting for ring information > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 > StorageService.java:1160 - JOINING: waiting for schema information to complete > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 > StorageService.java:1160 - JOINING: schema complete, ready to bootstrap > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 > StorageService.java:1160 - JOINING: waiting for pending range calculation > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 > StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 > StorageService.java:1160 - JOINING: getting bootstrap token > WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 > TokenAllocation.java:61 - Selected tokens [............, 2959334889475814712, > 3727103702384420083, 7183119311535804926, 6013900799616279548, > -1222135324851761575, 1645259890258332163, -1213352346686661387, > 7604192574911909354] > WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 > TokenAllocation.java:65 - Replicated node load in datacentre before > allocation max 1.00 min 1.00 stddev 0.0000 > WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 > TokenAllocation.java:66 - Replicated node load in datacentre after allocation > max 1.00 min 1.00 stddev 0.0000 > WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 > TokenAllocation.java:70 - Unexpected growth in standard deviation after > allocation. > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 > StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup > INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 > StorageService.java:1160 - JOINING: Starting to bootstrap... > {noformat} > Node 2 log: > {noformat} > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 > StorageService.java:971 - Joining ring by operator request > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 > StorageService.java:1160 - JOINING: waiting for ring information > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 > StorageService.java:1160 - JOINING: waiting for schema information to complete > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 > StorageService.java:1160 - JOINING: schema complete, ready to bootstrap > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 > StorageService.java:1160 - JOINING: waiting for pending range calculation > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 > StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 > StorageService.java:1160 - JOINING: getting bootstrap token > WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 > TokenAllocation.java:61 - Selected tokens [......, 2890709530010722764, > -2416006722819773829, -5820248611267569511, -5990139574852472056, > 1645259890258332163, 9135021011763659240, -5451286144622276797, > 7604192574911909354] > WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 > TokenAllocation.java:65 - Replicated node load in datacentre before > allocation max 1.02 min 0.98 stddev 0.0000 > WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 > TokenAllocation.java:66 - Replicated node load in datacentre after allocation > max 1.00 min 1.00 stddev 0.0000 > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 > StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup > INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 > StorageService.java:1160 - JOINING: Starting to bootstrap... > {noformat} > eg. 7604192574911909354 has been chosen by both. > The joins were eight days apart, so I don't think it's a race :) -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org