I'm not sure offhand why that is happening, but could you try bootstrapping 
that node from scratch again, or try a different new node?

 

Kenneth Brotman

 

From: Léo FERLIN SUTTON [mailto:lfer...@mailjet.com.INVALID] 
Sent: Wednesday, February 06, 2019 9:15 AM
To: user@cassandra.apache.org
Subject: Bootstrap keeps failing

 

Hello !

 

I am having a recurrent problem when trying to bootstrap a few new nodes.

 

Some general info : 

*       I am running cassandra 3.0.17
*       We have about 30 nodes in our cluster
*       All healthy nodes have between 60% and 90% used disk space on /var/lib/cassandra

So I create a new node and let auto_bootstrap do its job. After a few days the 
bootstrapping node stops streaming new data but is still not a member of the 
cluster.

 

`nodetool status` says the node is still joining.

 

When this happens I run `nodetool bootstrap resume`. This usually ends in one 
of two ways:

1.      The node fills up to 100% disk space and crashes.
2.      The bootstrap resume finishes with errors.

When I look at `nodetool netstats -H` it looks like `bootstrap resume` does 
not resume but restarts a full transfer of all data from every node.

 

This is the output I get from `nodetool bootstrap resume`:

[2019-02-06 01:39:14,369] received file /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-225-big-Data.db (progress: 2113%)

[2019-02-06 01:39:16,821] received file /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-88-big-Data.db (progress: 2113%)

[2019-02-06 01:39:17,003] received file /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-89-big-Data.db (progress: 2113%)

[2019-02-06 01:39:17,032] session with /10.16.XX.YYY complete (progress: 2113%)

[2019-02-06 01:41:15,160] received file /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-220-big-Data.db (progress: 2113%)

[2019-02-06 01:42:02,864] received file /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-226-big-Data.db (progress: 2113%)

[2019-02-06 01:42:09,284] received file /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-227-big-Data.db (progress: 2113%)

[2019-02-06 01:42:10,522] received file /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-228-big-Data.db (progress: 2113%)

[2019-02-06 01:42:10,622] received file /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-229-big-Data.db (progress: 2113%)

[2019-02-06 01:42:11,925] received file /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-90-big-Data.db (progress: 2114%)

[2019-02-06 01:42:14,887] received file /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-91-big-Data.db (progress: 2114%)

[2019-02-06 01:42:14,980] session with /10.16.XX.ZZZ complete (progress: 2114%)

[2019-02-06 01:42:14,980] Stream failed

[2019-02-06 01:42:14,982] Error during bootstrap: Stream failed

[2019-02-06 01:42:14,982] Resume bootstrap complete

  

The bootstrap `progress` goes way over 100% and eventually fails.
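
For anyone wanting to condense output like the above before posting it, here is a minimal Python sketch. It assumes the text is the captured output of `nodetool bootstrap resume` in the format shown in this message; the function name `summarize_resume_log` and the returned keys are made up for illustration and are not part of Cassandra.

```python
import re

# Timestamp prefix as it appears in the pasted output, e.g. [2019-02-06 01:42:14,980]
TS_RE = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\]")
PROGRESS_RE = re.compile(r"\(progress: (\d+)%\)")


def summarize_resume_log(text):
    """Summarize pasted `nodetool bootstrap resume` output (hypothetical helper)."""
    # Collapse all whitespace so entries wrapped by a mail client are rejoined.
    flat = " ".join(text.split())
    # Each entry runs from one timestamp to the next.
    starts = [m.start() for m in TS_RE.finditer(flat)]
    ends = starts[1:] + [len(flat)]
    entries = [flat[a:b].strip() for a, b in zip(starts, ends)]
    progress = [int(m.group(1)) for e in entries for m in PROGRESS_RE.finditer(e)]
    return {
        "entries": len(entries),
        "files_received": sum("] received file" in e for e in entries),
        "sessions_completed": sum("] session with" in e for e in entries),
        "max_progress": max(progress, default=None),
        "failed": any("Stream failed" in e for e in entries),
    }
```

A `max_progress` above 100 together with `failed` set to `True` matches the symptom described here.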

 

 

Right now I have a node with this output from `nodetool status` : 

`UJ  10.16.XX.YYY  2.93 TB    256          ?                 5788f061-a3c0-46af-b712-ebeecd397bf7  c`

 

It is almost filled with data, yet if I look at `nodetool netstats` :

        Receiving 480 files, 325.39 GB total. Already received 5 files, 68.32 MB total
        Receiving 499 files, 328.96 GB total. Already received 1 files, 1.32 GB total
        Receiving 506 files, 345.33 GB total. Already received 6 files, 24.19 MB total
        Receiving 362 files, 206.73 GB total. Already received 7 files, 34 MB total
        Receiving 424 files, 281.25 GB total. Already received 1 files, 1.3 GB total
        Receiving 581 files, 349.26 GB total. Already received 8 files, 45.96 MB total
        Receiving 443 files, 337.26 GB total. Already received 6 files, 96.15 MB total
        Receiving 424 files, 275.23 GB total. Already received 5 files, 42.67 MB total

 

It is trying to pull all the data again.
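
To put a number on that, the per-session lines can be added up. The following is a rough Python sketch, assuming the `Receiving ... Already received ...` line format shown above and assuming the units are binary multiples (1 GB = 1024 MB); the helper name `summarize_netstats` is made up for illustration.

```python
import re

# Assumption: nodetool's human-readable units are binary multiples.
UNIT_BYTES = {
    "bytes": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}

LINE_RE = re.compile(
    r"Receiving (\d+) files, ([\d.]+) (\w+) total\. "
    r"Already received (\d+) files, ([\d.]+) (\w+) total"
)


def summarize_netstats(text):
    """Sum expected vs. received files and bytes over all streaming sessions."""
    # Collapse whitespace so lines wrapped by a mail client are rejoined.
    flat = " ".join(text.split())
    totals = {"files_total": 0, "files_done": 0, "bytes_total": 0.0, "bytes_done": 0.0}
    for m in LINE_RE.finditer(flat):
        n_total, size_total, unit_total, n_done, size_done, unit_done = m.groups()
        totals["files_total"] += int(n_total)
        totals["files_done"] += int(n_done)
        totals["bytes_total"] += float(size_total) * UNIT_BYTES[unit_total]
        totals["bytes_done"] += float(size_done) * UNIT_BYTES[unit_done]
    return totals
```

Fed the eight sessions above, this should come out at 3719 files and roughly 2.4 TB still expected, against 39 files and about 2.9 GB already received, on a node that `nodetool status` already reports at 2.93 TB.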

 

Am I missing something about the way `nodetool bootstrap resume` is supposed to 
be used ?

 

Regards,

 

Leo

 
