The time it takes to stream data off a node varies by network, cloud region, and other factors, so it's not unusual for it to take a while to finish.
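If you want a rough progress check while it runs, something like this from any live node should do it (the grep pattern is just a convenience; the exact output format varies a bit by version):

# nodetool removenode status
# nodetool netstats | grep -E "Receiving|Sending|Already"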
Just thought I'd mention that auto_bootstrap is true by default, so if you're not setting it, the node should bootstrap as long as it's not a seed node.

As for the rack issue, yes, it's a good idea to keep your racks in multiples of your RF. When performing token ownership calculations, Cassandra takes the rack designation into consideration and tries to ensure that multiple replicas of a row are not placed in the same rack. TBH, I'd build out two more nodes to get to 6 nodes across 3 racks (2 in each), just to ensure even distribution. Otherwise you might notice that the nodes sharing a rack consume disk at a different rate than the nodes which have a rack to themselves.

On Mon, Apr 3, 2023 at 8:57 AM David Tinker <david.tin...@gmail.com> wrote:

> Thanks. Hmm, the remove has been busy for hours but seems to be progressing.
>
> I have been running this on the nodes to monitor progress:
>
> # nodetool netstats | grep Already
>     Receiving 92 files, 843934103369 bytes total. Already received 82 files (89.13%), 590204687299 bytes total (69.93%)
>     Sending 84 files, 860198753783 bytes total. Already sent 56 files (66.67%), 307038785732 bytes total (35.69%)
>     Sending 78 files, 815573435637 bytes total. Already sent 56 files (71.79%), 313079823738 bytes total (38.39%)
>
> The percentages are ticking up.
>
> # nodetool ring | head -20
> Datacenter: dc1
> ==========
> Address          Rack   Status  State    Load       Owns    Token
>                                                             9189523899826545641
> xxx.xxx.xxx.24   rack4  Down    Leaving  26.62 GiB  79.95%  -9194674091837769168
> xxx.xxx.xxx.107  rack1  Up      Normal   2.68 TiB   73.25%  -9168781258594813088
> xxx.xxx.xxx.253  rack2  Up      Normal   2.63 TiB   73.92%  -9163037340977721917
> xxx.xxx.xxx.105  rack3  Up      Normal   2.68 TiB   72.88%  -9148860739730046229
>
> On Mon, Apr 3, 2023 at 3:46 PM Bowen Song via user <user@cassandra.apache.org> wrote:
>
>> Using nodetool removenode is strongly preferred in most circumstances; only resort to assassinate if you do not care about data consistency or you know there won't be any consistency issue (e.g. no new writes and nodetool cleanup has not been run).
>>
>> Since the size of the data on the new node is small, nodetool removenode should finish fairly quickly and bring your cluster back.
>>
>> Next time you are doing something like this, please test it out in a non-production environment and make sure everything works as expected before moving on to production.
>>
>> On 03/04/2023 06:28, David Tinker wrote:
>>
>> Should I use assassinate or removenode, given that there is some data on the node? Or will that be found on the other nodes? Sorry for all the questions but I really don't want to mess up.
>>
>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>
>>> That's what nodetool assassinate will do.
>>>
>>> On Sun, Apr 2, 2023 at 10:19 PM David Tinker <david.tin...@gmail.com> wrote:
>>>
>>>> Is it possible for me to remove the node from the cluster, i.e. to undo this mess and get the cluster operating again?
>>>>
>>>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>
>>>>> You can leave it in the seed list of the other nodes, just make sure it's not included in this node's own seed list. However, if you do decide to fix the issue with the racks, first assassinate this node (nodetool assassinate <ip>) and update the rack name before you restart.
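For reference, the seed list lives under seed_provider in cassandra.yaml, and if you're using GossipingPropertyFileSnitch the rack name comes from cassandra-rackdc.properties (adjust if you're on a different snitch). Roughly like this, using your existing three nodes as the seeds:

cassandra.yaml on the new node (leave the new node out of its own seed list):

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "xxx.xxx.xxx.107,xxx.xxx.xxx.253,xxx.xxx.xxx.105"

cassandra-rackdc.properties, only if you rename the rack before re-adding the node:

dc=dc1
rack=rack1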
>>>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>
>>>>>> It is also in the seeds list for the other nodes. Should I remove it from those, restart them one at a time, then restart it?
>>>>>>
>>>>>> /etc/cassandra # grep -i bootstrap *
>>>>>> doesn't show anything, so I don't think I have auto_bootstrap set to false.
>>>>>>
>>>>>> Thanks very much for the help.
>>>>>>
>>>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>>
>>>>>>> Just remove it from the seed list in the cassandra.yaml file and restart the node. Make sure that auto_bootstrap is set to true first though.
>>>>>>>
>>>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>
>>>>>>>> So likely because I made it a seed node when I added it to the cluster, it didn't do the bootstrap process. How can I recover from this?
>>>>>>>>
>>>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Yes, the replication factor is 3.
>>>>>>>>>
>>>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am still having issues getting data back from queries.
>>>>>>>>>
>>>>>>>>> I did make the new node a seed node.
>>>>>>>>>
>>>>>>>>> Re "rack4": I assumed that was just an indication as to the physical location of the server for redundancy. This one is separate from the others, so I used rack4.
>>>>>>>>>
>>>>>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I'm assuming that your replication factor is 3. If that's the case, did you intentionally put this node in rack 4? Typically, you want to add nodes in multiples of your replication factor in order to keep the "racks" balanced. In other words, this node should have been added to rack 1, 2 or 3.
>>>>>>>>>>
>>>>>>>>>> Having said that, you should be able to easily fix your problem by running a nodetool repair -pr on the new node.
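One note on the repair suggestion: -pr (primary range) only repairs the token ranges a node is the primary replica for, so to cover the whole ring it has to be run on every node in turn, along these lines (my_keyspace is a placeholder):

# run on each node, one at a time
# nodetool repair -pr my_keyspace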
>>>>>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All
>>>>>>>>>>>
>>>>>>>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and now many reads are not returning rows! What do I need to do to fix this? There weren't any errors in the logs or other problems that I could see. I expected the cluster to balance itself but this hasn't happened (yet?). The nodes are similar so I have num_tokens=256 for each. I am using the Murmur3Partitioner.
>>>>>>>>>>>
>>>>>>>>>>> # nodetool status
>>>>>>>>>>> Datacenter: dc1
>>>>>>>>>>> ===============
>>>>>>>>>>> Status=Up/Down
>>>>>>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>>>>>>> --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>>>>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256     72.9%             afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>>>>>>>> UN  xxx.xxx.xxx.253  2.6 TiB    256     73.9%             e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>>>>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256     80.0%             c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>>>>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256     73.2%             ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>>>>>>>
>>>>>>>>>>> # nodetool netstats
>>>>>>>>>>> Mode: NORMAL
>>>>>>>>>>> Not sending any streams.
>>>>>>>>>>> Read Repair Statistics:
>>>>>>>>>>> Attempted: 0
>>>>>>>>>>> Mismatch (Blocking): 0
>>>>>>>>>>> Mismatch (Background): 0
>>>>>>>>>>> Pool Name        Active  Pending  Completed  Dropped
>>>>>>>>>>> Large messages   n/a     0        71754      0
>>>>>>>>>>> Small messages   n/a     0        8398184    14
>>>>>>>>>>> Gossip messages  n/a     0        1303634    0
>>>>>>>>>>>
>>>>>>>>>>> # nodetool ring
>>>>>>>>>>> Datacenter: dc1
>>>>>>>>>>> ==========
>>>>>>>>>>> Address          Rack   Status  State   Load       Owns    Token
>>>>>>>>>>>                                                            9189523899826545641
>>>>>>>>>>> xxx.xxx.xxx.24   rack4  Up      Normal  93.82 KiB  79.95%  -9194674091837769168
>>>>>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9168781258594813088
>>>>>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9163037340977721917
>>>>>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9148860739730046229
>>>>>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9125240034139323535
>>>>>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9112518853051755414
>>>>>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9100516173422432134
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> This is causing a serious production issue. Please help if you can.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> David
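While you sort this out, it can also help to check which nodes actually hold the replicas for a known row. nodetool getendpoints prints the replica endpoints for a partition key (the keyspace, table and key below are placeholders), and you can then query each of those nodes with CONSISTENCY ONE in cqlsh to see which copies are missing:

# nodetool getendpoints my_keyspace my_table some_partition_key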