[ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931215#comment-15931215 ]
Tom van der Woerdt commented on CASSANDRA-13348:
------------------------------------------------
It seems that /at least/ four nodes were affected by this, all in the same DC.
I'm decommissioning and rejoining them all, in an attempt to get the cluster
healthy again.
> Duplicate tokens after bootstrap
> --------------------------------
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
> Issue Type: Bug
> Reporter: Tom van der Woerdt
> Priority: Critical
>
> This one is a bit scary, and probably results in data loss. After a bootstrap
> of a few new nodes into an existing cluster, two new nodes have chosen some
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other
> node.
> Node 1 log:
> {noformat}
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564
> TokenAllocation.java:61 - Selected tokens [............, 2959334889475814712,
> 3727103702384420083, 7183119311535804926, 6013900799616279548,
> -1222135324851761575, 1645259890258332163, -1213352346686661387,
> 7604192574911909354]
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729
> TokenAllocation.java:65 - Replicated node load in datacentre before
> allocation max 1.00 min 1.00 stddev 0.0000
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation
> max 1.00 min 1.00 stddev 0.0000
> WARN [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729
> TokenAllocation.java:70 - Unexpected growth in standard deviation after
> allocation.
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150
> StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937
> StorageService.java:971 - Joining ring by operator request
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630
> TokenAllocation.java:61 - Selected tokens [......, 2890709530010722764,
> -2416006722819773829, -5820248611267569511, -5990139574852472056,
> 1645259890258332163, 9135021011763659240, -5451286144622276797,
> 7604192574911909354]
> WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794
> TokenAllocation.java:65 - Replicated node load in datacentre before
> allocation max 1.02 min 0.98 stddev 0.0000
> WARN [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation
> max 1.00 min 1.00 stddev 0.0000
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149
> StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> For example, 7604192574911909354 has been chosen by both nodes.
> The joins were eight days apart, so I don't think it's a race :)
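
For anyone who wants to check their own logs for the same symptom, here is a rough sketch in Python. It is not part of Cassandra. The function names (parse_selected_tokens, find_duplicate_tokens) and the node labels are made up for illustration, and it assumes the "Selected tokens [...]" message looks like the excerpts quoted above. Only the tokens visible in those excerpts are used; the elided part of each list stays elided.

```python
import re
from collections import defaultdict

# Matches the token list of a "Selected tokens [...]" log message, even when
# the message has been wrapped across several lines.
_SELECTED = re.compile(r"Selected tokens \[(.*?)\]", re.DOTALL)


def parse_selected_tokens(log_text):
    """Return the tokens in the first 'Selected tokens [...]' message."""
    match = _SELECTED.search(log_text)
    if match is None:
        return []
    # Elided placeholders such as "......" contain no digits and are skipped.
    return [int(tok) for tok in re.findall(r"-?\d+", match.group(1))]


def find_duplicate_tokens(tokens_by_node):
    """Map each token claimed by more than one node to the sorted node names."""
    owners = defaultdict(set)
    for node, tokens in tokens_by_node.items():
        for token in tokens:
            owners[token].add(node)
    return {tok: sorted(nodes) for tok, nodes in owners.items() if len(nodes) > 1}


# Visible tail of the two token lists from the logs quoted above.
node1_log = """TokenAllocation.java:61 - Selected tokens [............, 2959334889475814712,
3727103702384420083, 7183119311535804926, 6013900799616279548,
-1222135324851761575, 1645259890258332163, -1213352346686661387,
7604192574911909354]"""

node2_log = """TokenAllocation.java:61 - Selected tokens [......, 2890709530010722764,
-2416006722819773829, -5820248611267569511, -5990139574852472056,
1645259890258332163, 9135021011763659240, -5451286144622276797,
7604192574911909354]"""

duplicates = find_duplicate_tokens({
    "node1": parse_selected_tokens(node1_log),  # hypothetical node label
    "node2": parse_selected_tokens(node2_log),  # hypothetical node label
})

for token in sorted(duplicates):
    print(token, duplicates[token])
```

On the visible excerpts alone this reports 1645259890258332163 and 7604192574911909354 as shared by both nodes. Passing one entry per node to find_duplicate_tokens extends the same check to more than two nodes.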
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)