[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2018-04-15 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438674#comment-16438674
 ] 

Tom van der Woerdt commented on CASSANDRA-13348:


I haven't seen this happen since. It's possible CASSANDRA-13700 was indeed 
related, but we also made a major change to how new nodes are deployed. 
Previously we started all nodes at the same time and then ran 'nodetool join' 
one at a time; now we start one node at a time and wait for it to join.
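
The "start one at a time and wait for it to join" procedure can be sketched as 
a small driver loop. This is only an illustration under stated assumptions: 
`start_node` and `has_joined` are hypothetical callables supplied by whatever 
deployment tooling is in use (they are not Cassandra or nodetool APIs), and the 
polling interval and timeout are arbitrary.

```python
import time
from typing import Callable, Iterable, List


def join_one_at_a_time(
    nodes: Iterable[str],
    start_node: Callable[[str], None],   # hypothetical: starts Cassandra on a host
    has_joined: Callable[[str], bool],   # hypothetical: True once the node has joined
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start each node only after the previous one has finished joining."""
    joined: List[str] = []
    for node in nodes:
        start_node(node)
        waited = 0.0
        # Block here so the next node never starts against a stale ring view.
        while not has_joined(node):
            if waited >= timeout_seconds:
                raise TimeoutError(f"{node} did not join within {timeout_seconds}s")
            sleep(poll_seconds)
            waited += poll_seconds
        joined.append(node)
    return joined
```

The point of the loop is that the next node is not started until the previous 
one is visible as joined, so no two new nodes pick tokens at the same moment.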

> Duplicate tokens after bootstrap
> 
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
> Issue Type: Bug
> Reporter: Tom van der Woerdt
> Assignee: Dikang Gu
> Priority: Blocker
> Fix For: 3.0.x
>
>
> This one is a bit scary, and probably results in data loss. After a bootstrap 
> of a few new nodes into an existing cluster, two new nodes have chosen some 
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other 
> node.
> Node 1 log :
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 
> TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 
> 3727103702384420083, 7183119311535804926, 6013900799616279548, 
> -1222135324851761575, 1645259890258332163, -1213352346686661387, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:70 - Unexpected growth in standard deviation after 
> allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 
> StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 
> TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, 
> -2416006722819773829, -5820248611267569511, -5990139574852472056, 
> 1645259890258332163, 9135021011763659240, -5451286144622276797, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
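
A quick way to confirm the kind of overlap described in the report is to 
compare the per-node token lists directly. The sketch below is an assumption 
about how one might do that, not part of Cassandra: `find_shared_tokens` is a 
hypothetical helper, and the sample data contains only the tokens visible in 
the truncated "Selected tokens" log lines above, not the full 256 per node.

```python
from typing import Dict, List


def find_shared_tokens(tokens_by_node: Dict[str, List[int]]) -> Dict[int, List[str]]:
    """Map every token claimed by more than one node to the nodes claiming it."""
    owners: Dict[int, List[str]] = {}
    for node, tokens in tokens_by_node.items():
        for token in set(tokens):  # ignore repeats within a single node's list
            owners.setdefault(token, []).append(node)
    return {token: nodes for token, nodes in owners.items() if len(nodes) > 1}


# Only the tokens that survive in the truncated log excerpts.
visible_tokens = {
    "node1": [
        2959334889475814712, 3727103702384420083, 7183119311535804926,
        6013900799616279548, -1222135324851761575, 1645259890258332163,
        -1213352346686661387, 7604192574911909354,
    ],
    "node2": [
        2890709530010722764, -2416006722819773829, -5820248611267569511,
        -5990139574852472056, 1645259890258332163, 9135021011763659240,
        -5451286144622276797, 7604192574911909354,
    ],
}

shared = find_shared_tokens(visible_tokens)
```

Even in these short excerpts, two tokens (1645259890258332163 and 
7604192574911909354) appear in both lists.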

[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2018-04-14 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438577#comment-16438577
 ] 

Dikang Gu commented on CASSANDRA-13348:

I'm unable to reproduce this particular issue in our test or production 
environments. I think we need strongly consistent membership (CASSANDRA-9667) 
to truly avoid race conditions during token allocation.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2018-04-13 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437217#comment-16437217
 ] 

Stefan Podkowinski commented on CASSANDRA-13348:


[~dikanggu], are you still working on this or should we close this ticket as 
"Unable to reproduce"?

[~tvdw], did the issue happen again in the meantime?


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-07-28 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105957#comment-16105957
 ] 

Jeremiah Jordan commented on CASSANDRA-13348:

I have also seen nodes end up with a partial view of cluster state when auto 
bootstrap is false. In that case we do not wait as long for things to settle, 
so in a large cluster, or one whose datacenters have a lot of latency between 
them, nodes come up before they have a full view of things. I could see this 
happening in that case as well. The fix from CASSANDRA-13700 helps with that, 
since under those conditions the bug there caused even more trouble with 
gossip settling.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-07-28 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105793#comment-16105793
 ] 

Jeff Jirsa commented on CASSANDRA-13348:


[~jjordan] - that would be easier to explain than what Tom's seeing. [~tvdw] 
suggests that he did one node on a given day and still saw it, which doesn't 
feel like a race at all.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-07-28 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105737#comment-16105737
 ] 

Jeremiah Jordan commented on CASSANDRA-13348:

I have definitely seen this happen when using the new token allocation 
algorithm and starting two nodes at the same time. Without some kind of LWT or 
similar being used for joining the cluster, I don't know that there is a way 
to prevent this, though maybe some random wiggle added to the tokens chosen by 
the algorithm would help. The issue is that if two nodes join and get the same 
"current state" of the cluster, they will pick the same optimal token 
locations.
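
To illustrate why an identical "current state" leads to identical choices, 
here is a toy allocator. It is deliberately NOT Cassandra's token allocation 
algorithm: it simply bisects the widest gap in the ring, and the names 
`pick_tokens` and `max_jitter` are made up for this sketch. Any allocator that 
is a pure function of the ring view behaves the same way, and a per-node 
random "wiggle" is one way to break the tie.

```python
import random
from typing import List, Optional

RING_MIN = -2**63   # smallest signed 64-bit token
RING_SIZE = 2**64   # number of positions on the ring


def pick_tokens(ring_view: List[int], count: int,
                rng: Optional[random.Random] = None,
                max_jitter: int = 0) -> List[int]:
    """Toy allocator: repeatedly bisect the widest gap in a non-empty ring view."""
    ring = sorted(ring_view)
    chosen: List[int] = []
    for _ in range(count):
        best_start, best_width = ring[0], -1
        for i, tok in enumerate(ring):
            nxt = ring[(i + 1) % len(ring)]
            # Gap width going clockwise, including the wraparound gap.
            width = (nxt - tok) % RING_SIZE or RING_SIZE
            if width > best_width:
                best_start, best_width = tok, width
        token = best_start + best_width // 2
        if rng is not None and max_jitter > 0:
            # Optional per-node "wiggle" so equal views need not give equal tokens.
            token += rng.randint(-max_jitter, max_jitter)
        # Wrap back into the signed 64-bit token range.
        token = (token - RING_MIN) % RING_SIZE + RING_MIN
        ring.append(token)
        ring.sort()
        chosen.append(token)
    return chosen


view = [-3000, 0, 3000]               # the same snapshot, seen by two joining nodes
node_a = pick_tokens(view, 4)
node_b = pick_tokens(view, 4)         # identical input, so identical tokens
jittered_a = pick_tokens(view, 4, random.Random(1), max_jitter=1000)
jittered_b = pick_tokens(view, 4, random.Random(2), max_jitter=1000)
```

Without jitter, `node_a` and `node_b` are the same list. Jitter only makes a 
collision unlikely; it does not rule one out the way a serialized join would.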


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-22 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019992#comment-16019992
 ] 

Dikang Gu commented on CASSANDRA-13348:

[~tvdw], this part of the code should already prevent duplicate tokens: 
https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/dht/tokenallocator/TokenAllocation.java#L75

We are also using the new token allocation algorithm across all our clusters 
and haven't seen any duplicate tokens so far.

If you can send me more logs, or the tokens for each node in your cluster, that 
would be helpful.

> Duplicate tokens after bootstrap
> 
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Assignee: Dikang Gu
>Priority: Blocker
> Fix For: 3.0.x
>
>
> This one is a bit scary, and probably results in data loss. After a bootstrap 
> of a few new nodes into an existing cluster, two new nodes have chosen some 
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other 
> node.
> Node 1 log :
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 
> TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 
> 3727103702384420083, 7183119311535804926, 6013900799616279548, 
> -1222135324851761575, 1645259890258332163, -1213352346686661387, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:70 - Unexpected growth in standard deviation after 
> allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 
> StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 
> StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 
> TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, 
> -2416006722819773829, -5820248611267569511, -5990139574852472056, 
> 1645259890258332163, 9135021011763659240, -5451286144622276797, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> 

[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-19 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018126#comment-16018126
 ] 

Tom van der Woerdt commented on CASSANDRA-13348:


Same DC.

> Duplicate tokens after bootstrap
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
> Issue Type: Bug
> Reporter: Tom van der Woerdt
> Assignee: Dikang Gu
> Priority: Blocker
> Fix For: 3.0.x
>
> This one is a bit scary, and probably results in data loss. After a bootstrap 
> of a few new nodes into an existing cluster, two new nodes have chosen some 
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other 
> node.
> Node 1 log :
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 
> TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 
> 3727103702384420083, 7183119311535804926, 6013900799616279548, 
> -1222135324851761575, 1645259890258332163, -1213352346686661387, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:70 - Unexpected growth in standard deviation after 
> allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 
> StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 
> StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 
> TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, 
> -2416006722819773829, -5820248611267569511, -5990139574852472056, 
> 1645259890258332163, 9135021011763659240, -5451286144622276797, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 
> StorageService.java:1160 - JOINING: sleeping 30000 ms for pending range setup
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> eg. 7604192574911909354 has been chosen by both.
> The joins were eight days 

[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-19 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018086#comment-16018086
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

[~tvdw], hmm, for the nodes with duplicated tokens, are they in the same DC or 
in different DCs?


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-17 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013731#comment-16013731
 ] 

Tom van der Woerdt commented on CASSANDRA-13348:


Sadly, I was able to reproduce this again yesterday.

It seems unlikely that this is related to gossip. I run all servers with 
"-Dcassandra.join_ring=false", and "nodetool join" is run later. In this case I 
joined a single new node into the cluster, which got the join command about 30 
minutes after gossip had settled. No other nodes joined on the same day. I 
accidentally lost the logs though :(

The node joined with 256 vnodes; about 130 of them were unique, the rest were 
duplicates.
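
A quick way to confirm overlaps like this is to group tokens by owner and 
report any token claimed by more than one node. The sketch below is only an 
illustration: the node names and the find_duplicate_tokens helper are 
hypothetical, and the token lists are just the ones visible in the truncated 
"Selected tokens" log lines of the issue description, not the full 256 per node.

```python
from collections import defaultdict


def find_duplicate_tokens(tokens_by_node):
    """Return {token: sorted list of nodes} for tokens claimed by more than one node."""
    owners = defaultdict(set)
    for node, tokens in tokens_by_node.items():
        for token in tokens:
            owners[token].add(node)
    return {token: sorted(nodes) for token, nodes in owners.items() if len(nodes) > 1}


# Tokens copied from the (truncated) log excerpts in the issue description.
# "node1" and "node2" are placeholder names, not real host names.
tokens_by_node = {
    "node1": [2959334889475814712, 3727103702384420083, 7183119311535804926,
              6013900799616279548, -1222135324851761575, 1645259890258332163,
              -1213352346686661387, 7604192574911909354],
    "node2": [2890709530010722764, -2416006722819773829, -5820248611267569511,
              -5990139574852472056, 1645259890258332163, 9135021011763659240,
              -5451286144622276797, 7604192574911909354],
}

duplicates = find_duplicate_tokens(tokens_by_node)
for token, nodes in sorted(duplicates.items()):
    print(token, "is claimed by", ", ".join(nodes))
```

With the full per-node token lists as input, the same helper would list every 
shared token.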


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-01 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991984#comment-15991984
 ] 

Yasuharu Goto commented on CASSANDRA-13348:
---

[~dikanggu] Thank you! Our cluster seems to be able to avoid this issue. I'm 
gonna upgrade to 3.0.13!


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-01 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991971#comment-15991971
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

[~Yasuharu], if you do not set *allocate_tokens_for_keyspace*, the new 
algorithm won't kick in, and all tokens will still be generated randomly.
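
For a rough sense of why purely random allocation is not expected to produce 
duplicates by chance, here is a back-of-the-envelope birthday-bound estimate. 
It assumes tokens are drawn uniformly from a 2^64 space (the size of the 
Murmur3Partitioner range) and uses a hypothetical cluster size; it is an 
approximation for illustration, not a description of how Cassandra itself 
picks tokens.

```python
import math

TOKEN_SPACE = 2 ** 64  # assumption: tokens drawn uniformly from a 64-bit range


def collision_probability(total_tokens, space=TOKEN_SPACE):
    """Birthday-bound approximation of the chance that at least two of
    total_tokens uniformly random tokens are equal."""
    pairs = total_tokens * (total_tokens - 1) / 2
    # 1 - exp(-pairs / space), computed with expm1 to keep precision for tiny values
    return -math.expm1(-pairs / space)


# Hypothetical cluster: 100 nodes with 256 vnodes each.
p = collision_probability(100 * 256)
print(f"{p:.3e}")
```

Under these assumptions the probability of even a single shared token is on 
the order of 1e-11, so dozens of duplicates on one node would not be explained 
by random chance alone.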


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-01 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990685#comment-15990685
 ] 

Yasuharu Goto commented on CASSANDRA-13348:
---

Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster 
from 2.1 to 3.0.14.
Our 3.0 clusters do not enable allocate_tokens_for_keyspace for now. By your 
theory, would this issue affect C* clusters with 
allocate_tokens_for_keyspace = null?

Thank you.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-04-26 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985990#comment-15985990
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

This is the patch we maintained internally 
(https://gist.github.com/DikangGu/cc59a846f6fb90120193fa08ef18fe1b). It 
increases the chance that gossip has already settled before new tokens are 
allocated. Feel free to apply it if you want; I'm still thinking about whether 
there is a more general way to do it. (I discussed this with [~jasobrown] 
offline.)


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-04-24 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981502#comment-15981502
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

[~spo...@gmail.com] sorry for the delay, I have a theory about the cause. I 
suspect the new token allocation algorithm needs to wait for gossip to settle 
completely, so that it has all token information, before allocating new 
tokens. If gossip has not settled completely, the algorithm may miss some 
existing tokens from other nodes and allocate duplicate tokens. I'm still 
looking for ways to confirm this and fix it cleanly (I have an internal patch 
that tells Cassandra instances to wait for a certain number of nodes before 
marking gossip as settled).
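
The theory above can be illustrated with a small sketch. This is not the 
internal patch mentioned in the comment and does not use Cassandra's gossip 
code; wait_for_ring_view, known_endpoints, and every parameter are 
hypothetical names chosen for illustration. The idea is only to hold off 
token allocation until the node sees at least an expected number of peers and 
that count has stopped changing for a few consecutive polls.

```python
import time
from typing import Callable, Collection


def wait_for_ring_view(
    known_endpoints: Callable[[], Collection[str]],  # hypothetical view of gossip state
    expected_nodes: int,
    stable_polls: int = 3,
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the peer view looks settled, False on timeout."""
    deadline = clock() + timeout
    last_count = -1
    unchanged = 0
    while True:
        count = len(known_endpoints())
        # Count how many consecutive polls reported the same number of peers.
        unchanged = unchanged + 1 if count == last_count else 1
        last_count = count
        if count >= expected_nodes and unchanged >= stable_polls:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

A caller would allocate tokens only when this returns True. A peer count is a 
weak proxy for "all token information has arrived", so this raises the odds of 
a complete view rather than guaranteeing one.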


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-04-24 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15980811#comment-15980811
 ] 

Stefan Podkowinski commented on CASSANDRA-13348:


Any clues as to what causes this, Dikang? Would it make sense for others to 
have a look as well, or do you already know what happened here?


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-04-07 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961479#comment-15961479
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

I was busy with something else this week; I will look deeper into this problem 
next week.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-03-27 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944078#comment-15944078
 ] 

Tom van der Woerdt commented on CASSANDRA-13348:


Murmur3 partitioner, yes. GossipingPropertyFileSnitch, if it matters.

I don't recall exactly how this cluster was built, but it was something like 
this:
 * Provision 5 nodes per DC, all but one with "-Dcassandra.join_ring=false". 
Keyspace with rf = dc1:2 dc2:2
 * "nodetool join" one at a time (random order)
 * Provision 30 nodes in dc1 -- all have "allocate_tokens_for_keyspace" set
 * "nodetool join" ~10
 * Decommission the first five, so we're now left with dc1:10
 * "nodetool join" the rest
 * Ditto for dc2, so we now have dc1:30 dc2:30

There's a lot of automation involved; a human might take a different route. I 
decommissioned the 10 initial nodes, which had non-ideal hardware, and they 
made way for 60 more powerful machines.

The last "nodetool join" batch produced two or three machines with bad tokens.
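
As a rough sketch of how automation like this could catch bad tokens early, 
the loop below joins nodes strictly one at a time and compares each new node's 
tokens with those already in the ring before moving on. The join and 
tokens_of callables are hypothetical hooks (for example, wrappers around 
"nodetool join" and whatever the automation uses to read a node's tokens); 
nothing here is taken from the actual deployment scripts. Note that it detects 
a collision after the fact, it does not prevent one.

```python
from typing import Callable, Dict, Iterable, List, Set


def join_one_at_a_time(
    nodes: Iterable[str],
    join: Callable[[str], None],           # hypothetical: triggers the join
    tokens_of: Callable[[str], Set[int]],  # hypothetical: reads a node's tokens
    ring: Dict[str, Set[int]],             # already-joined nodes -> tokens
) -> List[str]:
    """Join nodes serially and update `ring` in place.

    Raises RuntimeError at the first node whose tokens are already in use,
    so the remaining nodes are not joined on top of a bad ring.
    """
    joined = []
    for node in nodes:
        join(node)
        new_tokens = tokens_of(node)
        used = set().union(*ring.values())
        clash = new_tokens & used
        if clash:
            raise RuntimeError(
                f"{node} picked tokens already in use: {sorted(clash)}"
            )
        ring[node] = new_tokens
        joined.append(node)
    return joined
```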


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-03-27 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944062#comment-15944062
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

[~tvdw], okay, I will try to run a simulator. I assume you are using the 
Murmur3 partitioner?

> Duplicate tokens after bootstrap
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
> Issue Type: Bug
> Reporter: Tom van der Woerdt
> Priority: Blocker
> Fix For: 3.0.x
>
> This one is a bit scary, and probably results in data loss. After a bootstrap 
> of a few new nodes into an existing cluster, two new nodes have chosen some 
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other 
> node.
> Node 1 log :
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 
> TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 
> 3727103702384420083, 7183119311535804926, 6013900799616279548, 
> -1222135324851761575, 1645259890258332163, -1213352346686661387, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:70 - Unexpected growth in standard deviation after 
> allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 
> StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 
> TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, 
> -2416006722819773829, -5820248611267569511, -5990139574852472056, 
> 1645259890258332163, 9135021011763659240, -5451286144622276797, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> For example, 7604192574911909354 has been chosen by both.
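
The overlap reported above can be checked mechanically. The sketch below 
intersects the tokens that are visible in the two "Selected tokens" log lines; 
the logs elide most of the 256 tokens per node, so it finds only the shared 
tokens among the eight shown for each node, not all 51. The function name 
shared_tokens is an illustrative choice.

```python
def shared_tokens(a, b):
    """Return the tokens present in both collections, sorted ascending."""
    return sorted(set(a) & set(b))


# Tokens visible in the Node 1 log excerpt (the rest are elided in the log).
node1 = [
    2959334889475814712, 3727103702384420083, 7183119311535804926,
    6013900799616279548, -1222135324851761575, 1645259890258332163,
    -1213352346686661387, 7604192574911909354,
]

# Tokens visible in the Node 2 log excerpt.
node2 = [
    2890709530010722764, -2416006722819773829, -5820248611267569511,
    -5990139574852472056, 1645259890258332163, 9135021011763659240,
    -5451286144622276797, 7604192574911909354,
]

print(shared_tokens(node1, node2))
# → [1645259890258332163, 7604192574911909354]
```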

[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-03-27 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944058#comment-15944058
 ] 

Tom van der Woerdt commented on CASSANDRA-13348:


My replication factor was dc1:2 dc2:2, with ~30 nodes per DC. As for the logs, 
I don't have them anymore; they have been rotated away by now, though there 
wasn't anything interesting in them other than the excerpts I pasted.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-03-27 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944054#comment-15944054
 ] 

Dikang Gu commented on CASSANDRA-13348:
---

[~tvdw], what's your replication factor? Can you send me the full logs of 
node 1 and node 2?

> Duplicate tokens after bootstrap
> 
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Priority: Blocker
> Fix For: 3.0.x
>
>
> This one is a bit scary, and probably results in data loss. After a bootstrap 
> of a few new nodes into an existing cluster, two new nodes have chosen some 
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other 
> node.
> Node 1 log :
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 
> TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 
> 3727103702384420083, 7183119311535804926, 6013900799616279548, 
> -1222135324851761575, 1645259890258332163, -1213352346686661387, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:70 - Unexpected growth in standard deviation after 
> allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 
> StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 
> TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, 
> -2416006722819773829, -5820248611267569511, -5990139574852472056, 
> 1645259890258332163, 9135021011763659240, -5451286144622276797, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> eg. 7604192574911909354 has been chosen by 

[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-03-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932827#comment-15932827
 ] 

Paulo Motta commented on CASSANDRA-13348:
-

Regardless of the fix here, we should detect and prevent nodes from joining the 
ring with conflicting tokens (except when replacing an existing node). There's 
a similar check available in CASSANDRA-12485 for another, less likely edge case 
that allows a node to join the ring with conflicting tokens.
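
To illustrate the kind of check meant here, a minimal sketch follows. It is 
written in Python rather than Cassandra's Java, and it is not the 
CASSANDRA-12485 code: the function name `conflicting_tokens`, the dict-based 
ring, and the `replacing` parameter are all hypothetical. The token values are 
only the ones visible in the two "Selected tokens" lines quoted in this issue, 
not the full 256-token sets.

```python
def conflicting_tokens(candidate, ring, replacing=None):
    """Return {node: [tokens]} for candidate tokens already owned in the ring.

    candidate: iterable of tokens the joining node selected.
    ring:      dict mapping node name -> iterable of tokens it owns
               (hypothetical stand-in for real ring metadata).
    replacing: name of a node being replaced; its tokens may be reused.
    """
    wanted = set(candidate)
    conflicts = {}
    for node, owned in ring.items():
        if node == replacing:
            continue  # reusing the tokens of a replaced node is expected
        overlap = wanted & set(owned)
        if overlap:
            conflicts[node] = sorted(overlap)
    return conflicts


# Tokens visible in the node 1 log excerpt (partial list).
node1_tokens = [
    2959334889475814712, 3727103702384420083, 7183119311535804926,
    6013900799616279548, -1222135324851761575, 1645259890258332163,
    -1213352346686661387, 7604192574911909354,
]

# Tokens visible in the node 2 log excerpt (partial list).
node2_tokens = [
    2890709530010722764, -2416006722819773829, -5820248611267569511,
    -5990139574852472056, 1645259890258332163, 9135021011763659240,
    -5451286144622276797, 7604192574911909354,
]

print(conflicting_tokens(node2_tokens, {"node1": node1_tokens}))
# -> {'node1': [1645259890258332163, 7604192574911909354]}
```

With a check along these lines, node 2 would have been refused at join time 
instead of silently sharing tokens with node 1. Passing `replacing="node1"` 
returns an empty dict, which corresponds to the replace-node exception.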


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-03-18 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931215#comment-15931215
 ] 

Tom van der Woerdt commented on CASSANDRA-13348:


It seems that /at least/ four nodes were affected by this, all in the same DC. 
I'm decommissioning and rejoining them all, in an attempt to get the cluster 
healthy again.

> Duplicate tokens after bootstrap
> 
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Priority: Critical
>