Using nodetool removenode is strongly preferred in most circumstances; only resort to assassinate if you do not care about data consistency, or you know there won't be any consistency issues (e.g. no new writes and nodetool cleanup has not been run).

Since the amount of data on the new node is small, nodetool removenode should finish fairly quickly and bring your cluster back.
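
For reference, a rough removenode sequence looks something like this (run from any live node; the host ID below is the one for xxx.xxx.xxx.24 from your nodetool status output further down):

    nodetool status                                            # confirm the host ID of the node to remove
    nodetool removenode c4e8b4a0-f014-45e6-afb4-648aad4f8500   # remove it; remaining replicas take over its ranges
    nodetool removenode status                                 # check progress
    nodetool removenode force                                  # last resort if it stalls; skips streaming, may leave ranges under-replicated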

Next time you do something like this, please test it in a non-production environment and make sure everything works as expected before moving on to production.


On 03/04/2023 06:28, David Tinker wrote:
Should I use assassinate or removenode? Given that there is some data on the node. Or will that be found on the other nodes? Sorry for all the questions but I really don't want to mess up.

On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz <crdiaz...@gmail.com> wrote:

    That's what nodetool assassinate will do.

    On Sun, Apr 2, 2023 at 10:19 PM David Tinker
    <david.tin...@gmail.com> wrote:

        Is it possible for me to remove the node from the cluster i.e.
        to undo this mess and get the cluster operating again?

        On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz
        <crdiaz...@gmail.com> wrote:

            You can leave it in the seed list of the other nodes; just
            make sure it's not included in this node's own seed list.
            However, if you do decide to fix the rack issue, first
            assassinate this node (nodetool assassinate <ip>) and update
            the rack name before you restart.
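
            As a rough sketch (assuming the cluster uses
            GossipingPropertyFileSnitch, which reads the rack from
            cassandra-rackdc.properties; adjust for whatever snitch you
            actually run):

                nodetool assassinate xxx.xxx.xxx.24    # forcibly remove the node from gossip

                # then on that node, before restarting, edit conf/cassandra-rackdc.properties:
                #   dc=dc1
                #   rack=rack1    # i.e. one of the existing racks rather than rack4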

            On Sun, Apr 2, 2023 at 10:06 PM David Tinker
            <david.tin...@gmail.com> wrote:

                It is also in the seeds list for the other nodes.
                Should I remove it from those, restart them one at a
                time, then restart it?

                /etc/cassandra # grep -i bootstrap *
                doesn't show anything so I don't think I have
                auto_bootstrap false.

                Thanks very much for the help.


                On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz
                <crdiaz...@gmail.com> wrote:

                    Just remove it from the seed list in the
                    cassandra.yaml file and restart the node.  Make
                    sure that auto_bootstrap is set to true first though.
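
                    Roughly, in that node's cassandra.yaml (the addresses
                    here are just the other three nodes from your status
                    output; adapt as needed):

                        seed_provider:
                            - class_name: org.apache.cassandra.locator.SimpleSeedProvider
                              parameters:
                                  - seeds: "xxx.xxx.xxx.105,xxx.xxx.xxx.253,xxx.xxx.xxx.107"

                        # auto_bootstrap defaults to true when the line is absent;
                        # set it explicitly if you want to be sure:
                        auto_bootstrap: true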

                    On Sun, Apr 2, 2023 at 9:59 PM David Tinker
                    <david.tin...@gmail.com> wrote:

                        So it likely skipped the bootstrap process because
                        I made it a seed node when I added it to the
                        cluster. How can I recover from this?

                        On Mon, Apr 3, 2023 at 6:41 AM David Tinker
                        <david.tin...@gmail.com> wrote:

                            Yes replication factor is 3.

                            I ran nodetool repair -pr on all the nodes
                            (one at a time) and am still having issues
                            getting data back from queries.

                            I did make the new node a seed node.

                            Re "rack4": I assumed that was just an
                            indication as to the physical location of
                            the server for redundancy. This one is
                            separate from the others so I used rack4.

                            On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz
                            <crdiaz...@gmail.com> wrote:

                                I'm assuming that your replication
                                factor is 3.  If that's the case, did
                                you intentionally put this node in
                                rack 4?  Typically, you want to add
                                nodes in multiples of your replication
                                factor in order to keep the "racks"
                                balanced.  In other words, this node
                                should have been added to rack 1, 2 or 3.

                                Having said that, you should be able
                                to easily fix your problem by running
                                a nodetool repair -pr on the new node.
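
                                For example, on the new node:

                                    nodetool repair -pr    # repair only the primary token ranges this node owns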

                                On Sun, Apr 2, 2023 at 8:16 PM David
                                Tinker <david.tin...@gmail.com> wrote:

                                    Hi All

                                    I recently added a node to my 3
                                    node Cassandra 4.0.5 cluster and
                                    now many reads are not returning
                                    rows! What do I need to do to fix
                                    this? There weren't any errors in
                                    the logs or other problems that I
                                    could see. I expected the cluster
                                    to balance itself but this hasn't
                                    happened (yet?). The nodes are
                                    similar so I have num_tokens=256
                                    for each. I am using the
                                    Murmur3Partitioner.
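
                                    (For context, the relevant
                                    cassandra.yaml settings on each node
                                    look roughly like this:)

                                        num_tokens: 256
                                        partitioner: org.apache.cassandra.dht.Murmur3Partitioner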

                                    # nodetool status
                                    Datacenter: dc1
                                    ===============
                                    Status=Up/Down
                                    |/ State=Normal/Leaving/Joining/Moving
                                    --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
                                    UN  xxx.xxx.xxx.105  2.65 TiB   256     72.9%             afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
                                    UN  xxx.xxx.xxx.253  2.6 TiB    256     73.9%             e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
                                    UN  xxx.xxx.xxx.24   93.82 KiB  256     80.0%             c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
                                    UN  xxx.xxx.xxx.107  2.65 TiB   256     73.2%             ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1

                                    # nodetool netstats
                                    Mode: NORMAL
                                    Not sending any streams.
                                    Read Repair Statistics:
                                    Attempted: 0
                                    Mismatch (Blocking): 0
                                    Mismatch (Background): 0
                                    Pool Name        Active  Pending  Completed  Dropped
                                    Large messages   n/a     0        71754      0
                                    Small messages   n/a     0        8398184    14
                                    Gossip messages  n/a     0        1303634    0

                                    # nodetool ring
                                    Datacenter: dc1
                                    ==========
                                    Address          Rack   Status  State   Load       Owns    Token
                                                                                                9189523899826545641
                                    xxx.xxx.xxx.24   rack4  Up      Normal  93.82 KiB  79.95%  -9194674091837769168
                                    xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9168781258594813088
                                    xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9163037340977721917
                                    xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9148860739730046229
                                    xxx.xxx.xxx.107  rack1  Up      Normal  2.65 TiB   73.25%  -9125240034139323535
                                    xxx.xxx.xxx.253  rack2  Up      Normal  2.6 TiB    73.92%  -9112518853051755414
                                    xxx.xxx.xxx.105  rack3  Up      Normal  2.65 TiB   72.88%  -9100516173422432134
                                    ...

                                    This is causing a serious
                                    production issue. Please help if
                                    you can.

                                    Thanks
                                    David

