[ 
https://issues.apache.org/jira/browse/CASSANDRA-20051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896409#comment-17896409
 ] 

Jeff Jirsa edited comment on CASSANDRA-20051 at 11/7/24 5:13 PM:
-----------------------------------------------------------------

{quote}So after we apply that script, all nodes will have a seed set properly, 
but the one that will not set it to itself will contain some other node. Then 
we go to remove this node from the cluster, because we think no node still 
contains this seed. So we remove that node and we end up with a node pointing 
to a non-existent seed.
{quote}

-The node that's wrong has a seed pointing to itself, so if you remove it, you 
remove both the invalid reference and the source of the invalid reference? - 
Edit: I'm wrong here, I get your point. It's right, but it's also something 
that doesn't matter in real life? 

Also, in maybeGossipToSeed() 
(https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/gms/Gossiper.java#L982), 
a few things are true:

- If you have yourself as the only seed, you exit early anyway (no-op)

- If the seed isn't alive in failure detector (e.g. invalid IP), you skip it

- Otherwise you go into the probabilistic sending behavior


So IF you do this, you either exit early (a no-op) OR you skip over the invalid 
node anyway once the first FD round marks it as dead.
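
To make the three cases above concrete, here is a minimal Java sketch. It is 
not the Gossiper code; it only models the behaviour as described in this 
comment. The class name, the Decision enum, the decide() helper and the use of 
plain strings for addresses are all hypothetical, the addresses are 
illustrative, and the empty-seed-list case is an added assumption.

```java
import java.util.Set;

// Hypothetical model of the three cases described above; NOT Cassandra's Gossiper.
public class SeedGossipSketch {

    enum Decision { NO_OP, SKIP_DEAD_SEEDS, MAYBE_SEND }

    // self:  this node's own address (illustrative plain string)
    // seeds: the configured seed list
    // alive: addresses the failure detector currently considers alive
    static Decision decide(String self, Set<String> seeds, Set<String> alive) {
        // Assumption: with no seeds configured there is nothing to do.
        if (seeds.isEmpty()) {
            return Decision.NO_OP;
        }
        // Case 1: we are our own only seed, so exit early.
        if (seeds.size() == 1 && seeds.contains(self)) {
            return Decision.NO_OP;
        }
        // Case 2: no remote seed is alive (e.g. an invalid IP), so skip it.
        boolean anyRemoteSeedAlive = seeds.stream()
                .anyMatch(s -> !s.equals(self) && alive.contains(s));
        if (!anyRemoteSeedAlive) {
            return Decision.SKIP_DEAD_SEEDS;
        }
        // Case 3: fall through to the probabilistic sending behaviour
        // (the probability itself is not modelled here).
        return Decision.MAYBE_SEND;
    }

    public static void main(String[] args) {
        String self = "10.90.44.82";
        // Only seed is ourselves.
        System.out.println(decide(self, Set.of(self), Set.of()));
        // Stale seed that the failure detector never sees as alive.
        System.out.println(decide(self, Set.of("10.90.40.86"), Set.of()));
        // Live remote seed.
        System.out.println(decide(self, Set.of("10.90.40.86"), Set.of("10.90.40.86")));
    }
}
```

Under these assumptions a stale or non-existent seed only ever lands in the 
first two branches, which is why it has no practical effect on gossip.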

I don't object to the code, and it feels like a sufficiently dead horse. It's a 
semantically pointless fix, but if it feels good to you because it makes the 
user feel like it's doing something "right", commit it. 



> nodetool reloadseeds does not reliably reload the seeds
> -------------------------------------------------------
>
>                 Key: CASSANDRA-20051
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20051
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Config
>            Reporter: Tibor Repasi
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> While re-deploying lots of Cassandra nodes I've observed that some nodes 
> do not reliably reload the seeds when the {{nodetool reloadseeds}} command 
> is issued.
> After the seeds list was changed in the config:
> {code}
> $ grep seeds /etc/cassandra/cassandra.yaml
>  - seeds: 10.90.44.82
> $ nodetool getseeds
> Current list of seed node IPs, excluding the current node's IP: 
> /10.90.40.86:7000 /10.90.44.86:7000
> $ nodetool reloadseeds
> Updated seed node IP list, excluding the current node's IP: /10.90.40.86:7000 
> /10.90.44.86:7000
> {code}
> At this point the following line was logged to debug.log:
> {code}
> DEBUG [RMI TCP Connection(103568)-127.0.0.1] 2024-11-04 14:04:27,638 
> YamlConfigurationLoader.java:124 - Loading settings from 
> file:/etc/cassandra/cassandra.yaml
> {code}
> However, {{nodetool getseeds}} still returns the old list:
> {code}
> $ nodetool getseeds
> Current list of seed node IPs, excluding the current node's IP: 
> /10.90.40.86:7000 /10.90.44.86:7000
> {code}
> These nodes read the seed list only after Cassandra was restarted:
> {code}
> $ sudo systemctl restart cassandra.service
> $ nodetool getseeds
> Seed node list does not contain any remote node IPs
> {code}
> Note: this was observed on a seed node.
> Observed on Cassandra 4.1.7.


