[ https://issues.apache.org/jira/browse/CASSANDRA-10231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934733#comment-14934733 ]
Joel Knighton commented on CASSANDRA-10231:
-------------------------------------------
I've attached the logs for n1, n2, n3, n4, and n5. n1 is at 10.0.0.2, n2 is at
10.0.0.3, and so on.
The decommissioned node is n2. The node with the null status entry is n5. The
status output looks like this:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns  Host ID                               Rack
UN  10.0.0.2  480.16 KB  256     ?     7a7681f5-0a22-4ba2-89c4-17c84658a18f  rack1
?N  10.0.0.3  ?          256     ?     null                                  rack1
UN  10.0.0.4  495.24 KB  256     ?     ef529827-e178-49f8-ad3a-458198df5060  rack1
UN  10.0.0.5  374.78 KB  256     ?     ee63423d-1204-496e-b53d-d318472717ab  rack1
UN  10.0.0.6  456.69 KB  256     ?     d88d166b-ed03-4b48-a12e-ea849f680920  rack1
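
For anyone who wants to check for this condition automatically, here is a
minimal sketch of a parser that flags rows whose Host ID is "null" in status
output shaped like the above. It is not part of the Jepsen test or of
Cassandra; the names ROW and find_null_entries are hypothetical, and the row
layout is assumed from the sample output in this comment only.

```python
import re

# One node row of status output, as seen in the sample above, e.g.
# "UN  10.0.0.2  480.16 KB  256  ?  7a7681f5-...  rack1".
# Assumption: columns are whitespace-separated and Load is either "?"
# or a number followed by a unit.
ROW = re.compile(
    r"^(?P<status>[UD?])(?P<state>[NLJM?])\s+"
    r"(?P<address>\S+)\s+"
    r"(?P<load>\?|[\d.]+\s+\S+)\s+"
    r"(?P<tokens>\d+)\s+"
    r"(?P<owns>\S+)\s+"
    r"(?P<host_id>\S+)\s+"
    r"(?P<rack>\S+)\s*$"
)


def find_null_entries(status_output):
    """Return the addresses of rows whose Host ID column is 'null'."""
    suspects = []
    for line in status_output.splitlines():
        match = ROW.match(line.strip())
        # Header and legend lines do not match ROW and are skipped.
        if match and match.group("host_id") == "null":
            suspects.append(match.group("address"))
    return suspects
```

Feeding it the output above should return only the 10.0.0.3 row; lines that
do not look like node rows are ignored.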
As I mentioned last week, I'm tracking down an MV issue that causes the tests
to fail on 3.0 before they reach this point. To work around this, I applied
your patch to commit e5c14285404b1ba98d385c5e5ed069229a2f6004, which is the
commit in which I originally produced the issue.
Sorry for the delay.
> Null status entries on nodes that crash during decommission of a different
> node
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-10231
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10231
> Project: Cassandra
> Issue Type: Bug
> Reporter: Joel Knighton
> Assignee: Stefania
> Fix For: 3.0.0 rc2
>
> Attachments: n1.log, n2.log, n3.log, n4.log, n5.log
>
>
> This issue is reproducible through a Jepsen test of materialized views that
> crashes and decommissions nodes throughout the test.
> In a 5 node cluster, if a node crashes at a certain point (unknown) during
> the decommission of a different node, it may start with a null entry for the
> decommissioned node like so:
> DN 10.0.0.5 ? 256 ? null rack1
> This entry does not get updated/cleared by gossip. This entry is removed upon
> a restart of the affected node.
> This issue is further detailed in ticket
> [10068|https://issues.apache.org/jira/browse/CASSANDRA-10068].