[ 
https://issues.apache.org/jira/browse/CASSANDRA-20051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898620#comment-17898620
 ] 

Stefan Miklosovic edited comment on CASSANDRA-20051 at 11/15/24 1:22 PM:
-------------------------------------------------------------------------

[~jjirsa] [~jjordan]

I was reading the code in trunk and I hit this method (1). Check the comment - 
"Possibly gossip to a seed for facilitating partition healing". This left me 
thinking ... if we have a node with e.g. two perfectly healthy seeds and we go 
to reload seeds in such a way that we put there that node - hence seed list 
will be empty (with this patch in) - then what exactly have we achieved when we 
hit this method? We achieved that seeds.size() will be 0, so it will completely 
skip that logic in that method. That means that node will not participate in 
facilitating partition healing. 

"Facilitation of partition healing" is rather tricky to grasp fully but what I 
think it tries to say is that gossip messages about the cluster will be sent 
around the cluster faster / in more probable manner which will eventually make 
all nodes in the cluster know about the topology / schema etc sooner.

If we introduced this patch, then we would make the changes to not propagate 
(faster). So, I think that we would be actually worse with this patch than 
without it in this regard. I think that it would be better if we somehow 
propagated the fact that the seed list was not changed to a user. Currently, it 
logs this:

{code}
Updated seed node IP list, excluding the current node's IP ...
{code}

ReloadSeeds command is implemented like this:

{code}
        List<String> seedList = probe.reloadSeeds();
        if (seedList == null)
        {
            out.println("Failed to reload the seed node list.");
        }
        else if (seedList.isEmpty())
        {
            out.println("Seed node list does not contain any remote node IPs");
        }
        else
        {
            out.println("Updated seed node IP list, excluding the current 
node's IP: " + String.join(" ", seedList));
        }
{code}

seedList is not empty, so it jumps to the last "else", it just thinks that it 
changed the seeds, but it didn't. To be more explicit about the fact that seeds 
have not changed, we could do this:

{code}
        // get the current seeds
        List<String> originalSeeds = probe.getSeeds();
        // try to reload
        List<String> seedList = probe.reloadSeeds();
        if (seedList == null)
        {
            out.println("Failed to reload the seed node list.");
        }
        else if (seedList.isEmpty())
        {
            out.println("Seed node list does not contain any remote node IPs");
        }
        else
        {
            // this if would be added
            if (originalSeeds.equals(seedList))
            {
                out.println("Seeds were not reloaded. Most probably, the seed 
you wanted to set is equal to the node's address which is meaningless 
operation");
            }
            else
            {
                out.println("Updated seed node IP list, excluding the current 
node's IP: " + String.join(" ", seedList));
            }
        }
{code}

(1) 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L813-L839


was (Author: smiklosovic):
[~jjirsa] [~jjordan]

I was reading the code in trunk and I hit this method (1). Check the comment - 
"Possibly gossip to a seed for facilitating partition healing". This left me 
thinking ... if we have a node with e.g. two perfectly healthy seeds and we go 
to reload seeds in such a way that we put there that node - hence seed list 
will be empty (with this patch in) - then what exactly have we achieved when we 
hit this method? We achieved that seeds.size() will be 0, so it will completely 
skip that logic in that method. That means that such a not will not participate 
in facilitating partition healing. 

"Facilitation of partition healing" is rather tricky to grasp fully but what I 
think it tries to say is that gossip messages about the cluster will be sent 
around the cluster faster / in more probable manner which will eventually make 
all nodes in the cluster know about the topology / schema etc sooner.

If we introduced this patch, then we would make the changes to not propagate 
(faster). So, I think that we would be actually worse with this patch than 
without it in this regard. I think that it would be better if we somehow 
propagated the fact that the seed list was not changed to a user. Currently, it 
logs this:

{code}
Updated seed node IP list, excluding the current node's IP ...
{code}

ReloadSeeds command is implemented like this:

{code}
        List<String> seedList = probe.reloadSeeds();
        if (seedList == null)
        {
            out.println("Failed to reload the seed node list.");
        }
        else if (seedList.isEmpty())
        {
            out.println("Seed node list does not contain any remote node IPs");
        }
        else
        {
            out.println("Updated seed node IP list, excluding the current 
node's IP: " + String.join(" ", seedList));
        }
{code}

seedList is not empty, so it jumps to the last "else", it just thinks that it 
changed the seeds, but it didn't. To be more explicit about the fact that seeds 
have not changed, we could do this:

{code}
        // get the current seeds
        List<String> originalSeeds = probe.getSeeds();
        // try to reload
        List<String> seedList = probe.reloadSeeds();
        if (seedList == null)
        {
            out.println("Failed to reload the seed node list.");
        }
        else if (seedList.isEmpty())
        {
            out.println("Seed node list does not contain any remote node IPs");
        }
        else
        {
            // this if would be added
            if (originalSeeds.equals(seedList))
            {
                out.println("Seeds were not reloaded. Most probably, the seed 
you wanted to set is equal to the node's address which is meaningless 
operation");
            }
            else
            {
                out.println("Updated seed node IP list, excluding the current 
node's IP: " + String.join(" ", seedList));
            }
        }
{code}

(1) 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L813-L839

> nodetool reloadseeds does not reliably reload the seeds
> -------------------------------------------------------
>
>                 Key: CASSANDRA-20051
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20051
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Config
>            Reporter: Tibor Repasi
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> During re-deploying lots of Cassandra nodes I've observed that some nodes 
> does not reliably reload the seeds when {{nodetool reloadseeds}} command was 
> issued.
> After the seeds list was changed in the config:
> {code}
> $ grep seeds /etc/cassandra/cassandra.yaml
>  - seeds: 10.90.44.82
> $ nodetool getseeds
> Current list of seed node IPs, excluding the current node's IP: 
> /10.90.40.86:7000 /10.90.44.86:7000
> $ nodetool reloadseeds
> Updated seed node IP list, excluding the current node's IP: /10.90.40.86:7000 
> /10.90.44.86:7000
> {code}
> At this instance the following line was logged to debug.log:
> {code}
> DEBUG [RMI TCP Connection(103568)-127.0.0.1] 2024-11-04 14:04:27,638 
> YamlConfigurationLoader.java:124 - Loading settings from 
> file:/etc/cassandra/cassandra.yaml
> {code}
> However, getting the old list:
> {code}
> $ nodetool getseeds
> Current list of seed node IPs, excluding the current node's IP: 
> /10.90.40.86:7000 /10.90.44.86:7000
> {code}
> These nodes read the seed list only after Cassandra was restarted:
> {code}
> $ sudo systemctl restart cassandra.service
> $ nodetool getseeds
> Seed node list does not contain any remote node IPs
> {code}
> Note: this was observed on a seed node.
> Observed on Cassandra 4.1.7.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to