Ok. Have to psych myself up for the add-node task a bit. Didn't go well the first time round!
Tasks:

- Make sure the new node is not in the seeds list!
- Check cluster name, listen address, rpc address
- Give it its own rack in cassandra-rackdc.properties
- Delete cassandra-topology.properties if it exists
- Make sure no compactions are on the go
- rm -rf /var/lib/cassandra/*
- rm /data/cassandra/commitlog/* (this is on a different disk)
- systemctl start cassandra

And it should start streaming data from the other nodes and join the cluster (see the command sketch below). Anything else I have to watch out for? Tx.
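For what it's worth, here is the checklist above as a rough shell sketch. It assumes the config lives under /etc/cassandra (the path used elsewhere in this thread) and uses the data/commitlog paths from the list; treat it as a sketch to adapt, not a tested procedure.

    # paths assumed: /etc/cassandra for config, /var/lib/cassandra and
    # /data/cassandra/commitlog for data, as per the checklist above

    # confirm this node's own IP is NOT in its seeds list
    grep -A4 seed_provider /etc/cassandra/cassandra.yaml

    # check cluster name, listen address, rpc address
    grep -E 'cluster_name|listen_address|rpc_address' /etc/cassandra/cassandra.yaml

    # check the dc/rack assignment for this node
    cat /etc/cassandra/cassandra-rackdc.properties

    # delete the legacy topology file if it exists
    rm -f /etc/cassandra/cassandra-topology.properties

    # on the existing nodes: confirm no compactions are in flight
    nodetool compactionstats

    # wipe old state, start, then watch the node stream and join (UJ -> UN)
    rm -rf /var/lib/cassandra/*
    rm /data/cassandra/commitlog/*
    systemctl start cassandra
    nodetool status
    nodetool netstats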
On Tue, Apr 4, 2023 at 5:25 AM Jeff Jirsa <jji...@gmail.com> wrote:

> Because executing “removenode” streamed extra data from live nodes to the
> “gaining” replica
>
> Oversimplified (if you had one token per node)
>
> If you start with A B C
>
> Then add D
>
> D should bootstrap a range from each of A B and C, but at the end, some of
> the data that was A B C becomes B C D
>
> When you removenode, you tell B and C to send data back to A.
>
> A B and C will eventually compact that data away. Eventually.
>
> If you get around to adding D again, running “cleanup” when you’re done
> (successfully) will remove a lot of it.
>
>
> On Apr 3, 2023, at 8:14 PM, David Tinker <david.tin...@gmail.com> wrote:
>
> Looks like the remove has sorted things out. Thanks.
>
> One thing I am wondering about is why the nodes are carrying a lot more
> data? The loads were about 2.7T before, now 3.4T.
>
> # nodetool status
> Datacenter: dc1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address          Load      Tokens  Owns (effective)  Host ID                               Rack
> UN  xxx.xxx.xxx.105  3.4 TiB   256     100.0%            afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
> UN  xxx.xxx.xxx.253  3.34 TiB  256     100.0%            e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
> UN  xxx.xxx.xxx.107  3.44 TiB  256     100.0%            ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>
> On Mon, Apr 3, 2023 at 5:42 PM Bowen Song via user <user@cassandra.apache.org> wrote:
>
>> That's correct. nodetool removenode is strongly preferred when your node
>> is already down. If the node is still functional, use nodetool
>> decommission on the node instead.
>>
>> On 03/04/2023 16:32, Jeff Jirsa wrote:
>>
>> FWIW, `nodetool decommission` is strongly preferred. `nodetool
>> removenode` is designed to be run when a host is offline. Only decommission
>> is guaranteed to maintain consistency / correctness, and removenode
>> probably streams a lot more data around than decommission.
>>
>> On Mon, Apr 3, 2023 at 6:47 AM Bowen Song via user <user@cassandra.apache.org> wrote:
>>
>>> Using nodetool removenode is strongly preferred in most circumstances;
>>> only resort to assassinate if you do not care about data consistency or
>>> you know there won't be any consistency issue (e.g. no new writes and
>>> nodetool cleanup has not been run).
>>>
>>> Since the size of data on the new node is small, nodetool removenode
>>> should finish fairly quickly and bring your cluster back.
>>>
>>> Next time when you are doing something like this again, please test it
>>> out in a non-production environment and make sure everything works as
>>> expected before moving on to production.
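To make the two removal paths discussed above concrete, a minimal sketch. The host ID is the rack4 node from the nodetool status output further down the thread; substitute your own.

    # preferred: run ON the node being removed while it is still up;
    # it streams its data to the remaining replicas before leaving the ring
    nodetool decommission

    # if the node is already down: run from any live node, using its host ID
    # (ID below taken from the status output further down; replace with yours)
    nodetool removenode c4e8b4a0-f014-45e6-afb4-648aad4f8500
    nodetool removenode status

    # afterwards, on each remaining node, drop the data it no longer owns
    nodetool cleanup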
>>> On 03/04/2023 06:28, David Tinker wrote:
>>>
>>> Should I use assassinate or removenode? Given that there is some data on
>>> the node. Or will that be found on the other nodes? Sorry for all the
>>> questions but I really don't want to mess up.
>>>
>>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>
>>>> That's what nodetool assassinate will do.
>>>>
>>>> On Sun, Apr 2, 2023 at 10:19 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>
>>>>> Is it possible for me to remove the node from the cluster, i.e. to undo
>>>>> this mess and get the cluster operating again?
>>>>>
>>>>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>
>>>>>> You can leave it in the seed list of the other nodes, just make sure
>>>>>> it's not included in this node's seed list. However, if you do decide to
>>>>>> fix the issue with the racks, first assassinate this node (nodetool
>>>>>> assassinate <ip>) and update the rack name before you restart.
>>>>>>
>>>>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>
>>>>>>> It is also in the seeds list for the other nodes. Should I remove it
>>>>>>> from those, restart them one at a time, then restart it?
>>>>>>>
>>>>>>> /etc/cassandra # grep -i bootstrap *
>>>>>>> doesn't show anything, so I don't think I have auto_bootstrap false.
>>>>>>>
>>>>>>> Thanks very much for the help.
>>>>>>>
>>>>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Just remove it from the seed list in the cassandra.yaml file and
>>>>>>>> restart the node. Make sure that auto_bootstrap is set to true first
>>>>>>>> though.
>>>>>>>>
>>>>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> So likely because I made it a seed node when I added it to the
>>>>>>>>> cluster, it didn't do the bootstrap process. How can I recover this?
>>>>>>>>>
>>>>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, replication factor is 3.
>>>>>>>>>>
>>>>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and
>>>>>>>>>> am still having issues getting data back from queries.
>>>>>>>>>>
>>>>>>>>>> I did make the new node a seed node.
>>>>>>>>>>
>>>>>>>>>> Re "rack4": I assumed that was just an indication as to the
>>>>>>>>>> physical location of the server for redundancy. This one is
>>>>>>>>>> separate from the others so I used rack4.
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm assuming that your replication factor is 3. If that's the
>>>>>>>>>>> case, did you intentionally put this node in rack 4? Typically,
>>>>>>>>>>> you want to add nodes in multiples of your replication factor in
>>>>>>>>>>> order to keep the "racks" balanced. In other words, this node
>>>>>>>>>>> should have been added to rack 1, 2 or 3.
>>>>>>>>>>>
>>>>>>>>>>> Having said that, you should be able to easily fix your problem
>>>>>>>>>>> by running a nodetool repair -pr on the new node.
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker <david.tin...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi All
>>>>>>>>>>>>
>>>>>>>>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster
>>>>>>>>>>>> and now many reads are not returning rows! What do I need to do
>>>>>>>>>>>> to fix this? There weren't any errors in the logs or other
>>>>>>>>>>>> problems that I could see. I expected the cluster to balance
>>>>>>>>>>>> itself but this hasn't happened (yet?). The nodes are similar so
>>>>>>>>>>>> I have num_tokens=256 for each. I am using the Murmur3Partitioner.
>>>>>>>>>>>>
>>>>>>>>>>>> # nodetool status
>>>>>>>>>>>> Datacenter: dc1
>>>>>>>>>>>> ===============
>>>>>>>>>>>> Status=Up/Down
>>>>>>>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>>>>>>>> --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
>>>>>>>>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256     72.9%             afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>>>>>>>>> UN  xxx.xxx.xxx.253  2.6 TiB    256     73.9%             e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>>>>>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256     80.0%             c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>>>>>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256     73.2%             ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>>>>>>>>
>>>>>>>>>>>> # nodetool netstats
>>>>>>>>>>>> Mode: NORMAL
>>>>>>>>>>>> Not sending any streams.
>>>>>>>>>>>> Read Repair Statistics:
>>>>>>>>>>>> Attempted: 0
>>>>>>>>>>>> Mismatch (Blocking): 0
>>>>>>>>>>>> Mismatch (Background): 0
>>>>>>>>>>>> Pool Name        Active  Pending  Completed  Dropped
>>>>>>>>>>>> Large messages   n/a     0        71754      0
>>>>>>>>>>>> Small messages   n/a     0        8398184    14
>>>>>>>>>>>> Gossip messages  n/a     0        1303634    0
>>>>>>>>>>>>
>>>>>>>>>>>> # nodetool ring
>>>>>>>>>>>> Datacenter: dc1
>>>>>>>>>>>> ==========
>>>>>>>>>>>> Address          Rack   Status  State   Load       Owns    Token
>>>>>>>>>>>>                                                             9189523899826545641
>>>>>>>>>>>> xxx.xxx.xxx.24   rack4  Up      Normal  93.82 KiB  79.95%  -9194674091837769168
>>>>>>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9168781258594813088
>>>>>>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9163037340977721917
>>>>>>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9148860739730046229
>>>>>>>>>>>> xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9125240034139323535
>>>>>>>>>>>> xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9112518853051755414
>>>>>>>>>>>> xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9100516173422432134
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> This is causing a serious production issue. Please help if you can.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> David
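A few read-only checks tied to the two causes identified above (the new node being a seed, so it never bootstrapped, and the rack placement). This is a sketch that assumes cqlsh is available on the node and the /etc/cassandra path used earlier in the thread; nothing here changes cluster state.

    # replication strategy and factor per keyspace
    cqlsh -e "SELECT keyspace_name, replication FROM system_schema.keyspaces;"

    # dc/rack this node announces via cassandra-rackdc.properties
    grep -E '^(dc|rack)=' /etc/cassandra/cassandra-rackdc.properties

    # auto_bootstrap is usually absent (it defaults to true); note that a node
    # listed in its own seeds list skips bootstrap regardless, which is why the
    # new node sat at ~94 KiB
    grep -i bootstrap /etc/cassandra/cassandra.yaml

    # confirm whether the node is actually streaming anything
    nodetool netstats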