[
https://issues.apache.org/jira/browse/CASSANDRA-20051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896409#comment-17896409
]
Jeff Jirsa edited comment on CASSANDRA-20051 at 11/7/24 5:13 PM:
-----------------------------------------------------------------
{quote}So after we apply that script, all nodes will have a seed set properly,
but the one which will not set it to itself will contain some other node. Then
we go to remove this node from the cluster, because we were thinking there is no
node which still contains this seed. So we remove that node and we end up with
a node pointing to a non-existing seed.
{quote}
-The node that's wrong has a seed pointing to itself, so if you remove it, you
remove both the invalid reference and the source of the invalid reference?-
Edit: I'm wrong here, I get your point. It's right, but it's also something
that doesn't matter in real life?
Also, in maybeGossipToSeed()
(https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/gms/Gossiper.java#L982),
a few things are true:
- If you have yourself as the only seed, you exit early anyway (no-op)
- If the seed isn't alive in the failure detector (e.g. invalid IP), you skip it
- Otherwise you go into the probabilistic sending behavior
So IF you do this, you either exit early (a no-op) OR you skip over the invalid
node anyway after the first FD round marks it as dead.
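To make the three cases above concrete, here is a minimal sketch of that decision as a standalone Java class. It is a simplified model of the behavior described in this comment, not the actual Gossiper code: the class name, the enum, the method signature and the string addresses are all hypothetical, and the real method works on Cassandra's own endpoint types and gossip state.

```java
import java.util.Set;

/**
 * Simplified model of the seed-gossip decision described above.
 * Hypothetical names throughout; this is NOT the Cassandra Gossiper source.
 */
final class SeedGossipSketch {

    /** The three outcomes listed in the comment. */
    enum Decision {
        NO_OP_SELF_IS_ONLY_SEED, // we are our own (only) seed: exit early
        SKIP_NO_LIVE_SEED,       // no remote seed is alive in the failure detector
        PROBABILISTIC_SEND       // a live remote seed exists: maybe gossip to it
    }

    /**
     * @param seeds     configured seed addresses (assumed to be plain strings here)
     * @param self      this node's own address
     * @param aliveInFd addresses the failure detector currently considers alive
     */
    static Decision decide(Set<String> seeds, String self, Set<String> aliveInFd) {
        // Case 1: the only seed is this node itself, so there is nothing to do.
        if (seeds.size() == 1 && seeds.contains(self)) {
            return Decision.NO_OP_SELF_IS_ONLY_SEED;
        }
        // Case 2: seeds that are dead or invalid (e.g. a removed node) are skipped.
        boolean anyLiveRemoteSeed = seeds.stream()
                .anyMatch(seed -> !seed.equals(self) && aliveInFd.contains(seed));
        if (!anyLiveRemoteSeed) {
            return Decision.SKIP_NO_LIVE_SEED;
        }
        // Case 3: otherwise fall through to the probabilistic sending behavior.
        return Decision.PROBABILISTIC_SEND;
    }
}
```

In this model, a node whose seed list still names a removed node lands in SKIP_NO_LIVE_SEED once the failure detector has marked that address as dead, which is the "skip over the invalid node" outcome.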
I don't object to the code, and it feels like a sufficiently dead horse. It's a
semantically pointless fix, but if it feels good to you because it makes the
user feel like it's doing something "right", commit it.
> nodetool reloadseeds does not reliably reload the seeds
> -------------------------------------------------------
>
> Key: CASSANDRA-20051
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20051
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Config
> Reporter: Tibor Repasi
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> While redeploying many Cassandra nodes, I observed that some nodes
> do not reliably reload the seeds when the {{nodetool reloadseeds}} command is
> issued.
> After the seeds list was changed in the config:
> {code}
> $ grep seeds /etc/cassandra/cassandra.yaml
> - seeds: 10.90.44.82
> $ nodetool getseeds
> Current list of seed node IPs, excluding the current node's IP:
> /10.90.40.86:7000 /10.90.44.86:7000
> $ nodetool reloadseeds
> Updated seed node IP list, excluding the current node's IP: /10.90.40.86:7000
> /10.90.44.86:7000
> {code}
> At that moment the following line was logged to debug.log:
> {code}
> DEBUG [RMI TCP Connection(103568)-127.0.0.1] 2024-11-04 14:04:27,638
> YamlConfigurationLoader.java:124 - Loading settings from
> file:/etc/cassandra/cassandra.yaml
> {code}
> However, {{nodetool getseeds}} still returned the old list:
> {code}
> $ nodetool getseeds
> Current list of seed node IPs, excluding the current node's IP:
> /10.90.40.86:7000 /10.90.44.86:7000
> {code}
> These nodes read the seed list only after Cassandra was restarted:
> {code}
> $ sudo systemctl restart cassandra.service
> $ nodetool getseeds
> Seed node list does not contain any remote node IPs
> {code}
> Note: this was observed on a seed node.
> Observed on Cassandra 4.1.7.