[ 
https://issues.apache.org/jira/browse/CASSANDRA-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holmberg updated CASSANDRA-16408:
--------------------------------------
    Description: 
A new node added to an existing 4.0 cluster gets stuck in bootstrap/joining 
permanently, with no clear error.

Version: 4.0-beta4 on Java 8 (OpenJDK 1.8.0_275); the issue is also seen in 
4.0-beta3 but NOT in 3.11.x
Topology: 3 racks in a single DC using Ec2Snitch, 1 seed node per rack
Relevant cassandra.yaml settings: 
{code}
auto_bootstrap: true (implicit)
seeds contains the same 3 nodes on all nodes
num_tokens: 16
allocate_tokens_for_local_replication_factor: 3
server_encryption_options.internode_encryption: all
server_encryption_options.enabled: true
server_encryption_options.optional: false
server_encryption_options.require_client_auth: true
client_encryption_options.enabled: true
client_encryption_options.optional: false
client_encryption_options.require_client_auth: true
{code}
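
For readers matching these against a real cassandra.yaml: the dotted names above are keys nested under server_encryption_options and client_encryption_options. A minimal Python sketch of that mapping follows. The helper name `nest` and the `REPORTED` dict are illustrative assumptions, not part of Cassandra, and the seeds entry is left out because the report does not give addresses.

```python
def nest(flat):
    """Expand dotted keys such as 'a.b' into nested dicts (hypothetical helper)."""
    nested = {}
    for dotted, value in flat.items():
        node = nested
        *parents, leaf = dotted.split(".")
        for key in parents:
            # Create the intermediate mapping on first use, then descend into it.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


# Settings as listed in the report, keyed by their dotted names.
REPORTED = {
    "auto_bootstrap": True,
    "num_tokens": 16,
    "allocate_tokens_for_local_replication_factor": 3,
    "server_encryption_options.internode_encryption": "all",
    "server_encryption_options.enabled": True,
    "server_encryption_options.optional": False,
    "server_encryption_options.require_client_auth": True,
    "client_encryption_options.enabled": True,
    "client_encryption_options.optional": False,
    "client_encryption_options.require_client_auth": True,
}

SETTINGS = nest(REPORTED)
```

`SETTINGS["server_encryption_options"]` then holds the four internode options as one mapping, which is the shape they take in the YAML file.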

Scenario: 
* Bring up the 3 seed nodes to create a new cluster. 
* Add a user keyspace: create keyspace test with replication = { 'class': 
'NetworkTopologyStrategy', 'us-east-1-dc': 3 }; and insert some test data. 
* Wait at least 10 minutes after the initial 3 seed nodes come up (nodes will 
join if they are brought up at the same time as the seeds, but not if they are 
brought up later). 
* Start Cassandra on a fourth node. 

Cassandra begins to bootstrap but never finishes (I have left it running 
overnight); it neither exits nor logs any errors. nodetool status from any 
node shows the new node as UJ. nodetool netstats on the new node shows the 
file being received from the test keyspace at 100% received. The logs show 
bootstrap starting and streaming starting, then nothing further and no errors.
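
Because the node fails silently, the only visible symptom is the UJ (Up/Joining) state in nodetool status. A minimal Python sketch for spotting such nodes in that output is shown here. The function name `joining_nodes` is an illustrative assumption, and the sample text is made up (addresses, loads, host IDs, and the datacenter and rack names are placeholders, not taken from the affected cluster).

```python
import re

# A data row of `nodetool status` starts with a two-letter code:
# status (U=Up, D=Down) followed by state (N/L/J/M), then the node address.
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def joining_nodes(status_output):
    """Return the addresses of nodes reported as Up/Joining (UJ)."""
    joining = []
    for line in status_output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(1) == "UJ":
            joining.append(match.group(2))
    return joining


# Illustrative output only; every value below is a placeholder.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns  Host ID  Rack
UN  10.0.0.1  120.5 KiB  16      ?     aaaa     rack1
UN  10.0.0.2  118.2 KiB  16      ?     bbbb     rack2
UN  10.0.0.3  119.9 KiB  16      ?     cccc     rack3
UJ  10.0.0.4  75.1 KiB   16      ?     dddd     rack1
"""

STUCK = joining_nodes(SAMPLE)
```

On a live node the input could come from `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`; a node that is still listed after repeated checks over a long interval matches the behaviour described above.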

I have also tried this with allocate_tokens_for_local_replication_factor 
disabled and still have this issue. I have also tried it without any user 
keyspace/data, on a completely empty cluster, and still have this issue. The 
only ways I have found to bring up a decently sized cluster on 4.0 are to 
disable allocate_tokens_for_local_replication_factor (to avoid collisions as 
mentioned in other issues) and bring up all nodes at about the same time, or 
to use auto_bootstrap: false. I have no issue adding a new node in a similar 
fashion to a 3.11.x cluster.

> Unable to bootstrap/join new nodes to existing 4.0 cluster
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-16408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16408
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Membership, Consistency/Bootstrap and 
> Decommission
>            Reporter: J Hickey
>            Priority: Normal
>             Fix For: 4.0-beta
>
>


