[
https://issues.apache.org/jira/browse/CASSANDRA-10231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734586#comment-14734586
]
Stefania commented on CASSANDRA-10231:
--------------------------------------
This is not going to be easy to reproduce with a dtest without injecting some
failure into the code. So far I was able to observe this interesting transition
by issuing repeated {{nodetool status}} commands during a decommission, but I
was lucky: I only saw it once out of several attempts:
{code}
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UL  127.0.0.1  57.39 KB  256     ?     1b91a92c-58b7-470f-82eb-f1e05fc50636  rack1
UN  127.0.0.2  90.56 KB  256     ?     4287fd68-e53d-4b9e-a48b-af374f9e69b3  rack1
UN  127.0.0.3  52.56 KB  256     ?     35a94edb-b38a-4bf3-8318-e14bb8a59eef  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UL  127.0.0.1  57.39 KB  256     ?     null                                  rack1
UN  127.0.0.2  90.56 KB  256     ?     4287fd68-e53d-4b9e-a48b-af374f9e69b3  rack1
UN  127.0.0.3  52.56 KB  256     ?     35a94edb-b38a-4bf3-8318-e14bb8a59eef  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  127.0.0.2  90.56 KB  256     ?     4287fd68-e53d-4b9e-a48b-af374f9e69b3  rack1
UN  127.0.0.3  52.56 KB  256     ?     35a94edb-b38a-4bf3-8318-e14bb8a59eef  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
{code}
Because of this observed transition, we know that at some point during the
decommission the host id must be null, which means it must be updated to null
in {{system.peers}}. My assumption was that if the node crashes while the host
id is null in {{system.peers}}, but before the entry is removed entirely, this
behavior might be observed. So I patched the C* code not to save the host id
in {{system.peers}}, and when I did I got this, which is close but not identical:
{code}
Final status from node 2
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns  Host ID                               Rack
UL  127.0.0.1  63.71 KB   256     ?     null                                  rack1
UN  127.0.0.2  102.39 KB  256     ?     c897de6b-9ec8-4fe2-9835-60bf812c0b22  rack1
{code}
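The crash hypothesis can be sketched as a toy two-step removal (the class, the map, and the method names below are illustrative, not Cassandra's actual API): if decommission first nulls the host id in the peers table and only then deletes the row, a crash between the two steps freezes exactly the intermediate state shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy model of a two-step removal from system.peers during decommission.
// A crash between step 1 and step 2 leaves a peer row whose host id is null,
// which nodetool status would render as "null".
public class PeersTableSketch
{
    // peer address -> host id; a present key with a null value models the
    // intermediate state observed in the nodetool output above
    public static final Map<String, UUID> peers = new HashMap<>();

    public static String statusOf(String peer)
    {
        if (!peers.containsKey(peer))
            return "(removed)";
        UUID hostId = peers.get(peer);
        return hostId == null ? "null" : hostId.toString();
    }

    public static void main(String[] args)
    {
        peers.put("127.0.0.1", UUID.fromString("1b91a92c-58b7-470f-82eb-f1e05fc50636"));

        // step 1: the host id is nulled out
        peers.put("127.0.0.1", null);
        System.out.println(statusOf("127.0.0.1")); // a crash here freezes this state

        // step 2: the whole entry is removed
        peers.remove("127.0.0.1");
        System.out.println(statusOf("127.0.0.1"));
    }
}
```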
I also saw this exception:
{code}
ERROR [GossipStage:1] 2015-09-08 18:10:22,590 CassandraDaemon.java:191 - Exception in thread Thread[GossipStage:1,5,main]
java.lang.NullPointerException: null
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[na:1.8.0_60]
	at org.apache.cassandra.hints.HintsCatalog.get(HintsCatalog.java:85) ~[main/:na]
	at org.apache.cassandra.hints.HintsService.excise(HintsService.java:267) ~[main/:na]
	at org.apache.cassandra.service.StorageService.excise(StorageService.java:2129) ~[main/:na]
	at org.apache.cassandra.service.StorageService.excise(StorageService.java:2141) ~[main/:na]
	at org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:2046) ~[main/:na]
	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1660) ~[main/:na]
	at org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1191) ~[main/:na]
	at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1173) ~[main/:na]
	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1130) ~[main/:na]
	at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na]
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[main/:na]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_60]
	at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_60]
{code}
Here is the [wip dtest|https://github.com/stef1927/cassandra-dtest/commits/10231];
it only works after changing the C* source code as follows:
{code}
stefi@lila:~/git/cstar/cassandra$ git diff
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index 2d9bbec..b84bcf5 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1701,7 +1701,7 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
                     MigrationManager.instance.scheduleSchemaPull(endpoint, epState);
                     break;
                 case HOST_ID:
-                    SystemKeyspace.updatePeerInfo(endpoint, "host_id", UUID.fromString(value.value));
+                    //SystemKeyspace.updatePeerInfo(endpoint, "host_id", UUID.fromString(value.value));
                     break;
                 case RPC_READY:
                     notifyRpcChange(endpoint, epState.isRpcReady());
@@ -1741,7 +1741,7 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
                     SystemKeyspace.updatePeerInfo(endpoint, "schema_version", UUID.fromString(entry.getValue().value));
                     break;
                 case HOST_ID:
-                    SystemKeyspace.updatePeerInfo(endpoint, "host_id", UUID.fromString(entry.getValue().value));
+                    //SystemKeyspace.updatePeerInfo(endpoint, "host_id", UUID.fromString(entry.getValue().value));
                     break;
             }
         }
{code}
This code in {{SS.initServer()}} is suspect:
{code}
if (Boolean.parseBoolean(System.getProperty("cassandra.load_ring_state", "true")))
{
    logger.info("Loading persisted ring state");
    Multimap<InetAddress, Token> loadedTokens = SystemKeyspace.loadTokens();
    Map<InetAddress, UUID> loadedHostIds = SystemKeyspace.loadHostIds();
    for (InetAddress ep : loadedTokens.keySet())
    {
        if (ep.equals(FBUtilities.getBroadcastAddress()))
        {
            // entry has been mistakenly added, delete it
            SystemKeyspace.removeEndpoint(ep);
        }
        else
        {
            if (loadedHostIds.containsKey(ep))
                tokenMetadata.updateHostId(loadedHostIds.get(ep), ep);
            Gossiper.instance.addSavedEndpoint(ep);
        }
    }
}
{code}
The endpoint is added as a saved endpoint even when there is no host id, which
might explain the problem, but I still need to investigate further.
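A minimal standalone sketch of that loading logic (plain maps and strings stand in for Cassandra's {{TokenMetadata}} and {{Gossiper}}, so all names here are illustrative): an endpoint that has saved tokens but no saved host id is still registered with gossip, so it ends up known to the node with no host id at all.

```java
import java.util.*;

// Standalone sketch of the suspect load_ring_state path: endpoints with saved
// tokens are always registered, but a host id is only recorded when
// system.peers still had one. An endpoint whose host id was nulled before the
// crash is therefore registered with no host id.
public class LoadRingStateSketch
{
    public static Set<String> registerSavedEndpoints(Map<String, List<Long>> loadedTokens,
                                                     Map<String, UUID> loadedHostIds,
                                                     Map<String, UUID> endpointToHostId)
    {
        Set<String> savedEndpoints = new HashSet<>();
        for (String ep : loadedTokens.keySet())
        {
            if (loadedHostIds.containsKey(ep))
                endpointToHostId.put(ep, loadedHostIds.get(ep)); // tokenMetadata.updateHostId(...)
            savedEndpoints.add(ep);                              // Gossiper.instance.addSavedEndpoint(ep)
        }
        return savedEndpoints;
    }

    public static void main(String[] args)
    {
        Map<String, List<Long>> loadedTokens = new HashMap<>();
        loadedTokens.put("127.0.0.1", Arrays.asList(1L, 2L)); // decommissioning node, host id already gone
        loadedTokens.put("127.0.0.2", Arrays.asList(3L, 4L));

        Map<String, UUID> loadedHostIds = new HashMap<>();
        loadedHostIds.put("127.0.0.2", UUID.randomUUID()); // no saved host id for 127.0.0.1

        Map<String, UUID> endpointToHostId = new HashMap<>();
        Set<String> saved = registerSavedEndpoints(loadedTokens, loadedHostIds, endpointToHostId);

        // 127.0.0.1 is known to gossip but has no host id -> "null" in nodetool status
        System.out.println(saved.contains("127.0.0.1") && !endpointToHostId.containsKey("127.0.0.1"));
    }
}
```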
> Null status entries on nodes that crash during decommission of a different
> node
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-10231
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10231
> Project: Cassandra
> Issue Type: Bug
> Reporter: Joel Knighton
> Assignee: Stefania
> Fix For: 3.0.x
>
>
> This issue is reproducible through a Jepsen test of materialized views that
> crashes and decommissions nodes throughout the test.
> In a 5 node cluster, if a node crashes at a certain point (unknown) during
> the decommission of a different node, it may start with a null entry for the
> decommissioned node like so:
> DN 10.0.0.5 ? 256 ? null rack1
> This entry does not get updated/cleared by gossip. This entry is removed upon
> a restart of the affected node.
> This issue is further detailed in ticket
> [10068|https://issues.apache.org/jira/browse/CASSANDRA-10068].
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)