[
https://issues.apache.org/jira/browse/CASSANDRA-11731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293014#comment-15293014
]
Sam Tunnicliffe commented on CASSANDRA-11731:
---------------------------------------------
I think there are a couple of issues regarding the coverage for CASSANDRA-11038.
Firstly, {{NEW_NODE}} notifications are not always delivered when they should
be; sometimes when a node is added, only the {{UP}} event is fired. I suspect
this might be caused by some raciness in setting the {{hostId}} for the new
node, which results in it not being null when checked in {{handleStateNormal}}.
In fact, when replacing there will always be an existing endpoint for the
{{hostId}}, so at a minimum we'd need to check that it didn't match
{{endpoint}}. In my patch for 11038 I used {{TokenMetadata::isMember}} for the
same purpose and that seems to give consistent behaviour.
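To illustrate the difference between the two checks, here is a minimal sketch. It is not the Cassandra code: {{RingSketch}} and all of its methods are hypothetical stand-ins, and it only assumes that a host ID can be recorded for an endpoint before that endpoint is marked as a ring member.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for ring state; NOT Cassandra's TokenMetadata.
final class RingSketch {
    private final Map<UUID, String> endpointByHostId = new HashMap<>();
    private final Set<String> members = new HashSet<>();

    // Assumption: the host ID may be recorded before the node reaches NORMAL.
    void recordHostId(UUID hostId, String endpoint) {
        endpointByHostId.put(hostId, endpoint);
    }

    // Called once the endpoint has been processed as a normal ring member.
    void markNormal(String endpoint) {
        members.add(endpoint);
    }

    // Host-ID-based check: "new" only if no endpoint is known for the host ID.
    // This depends on ordering: if the host ID was recorded first, it says "not new".
    boolean isNewByHostId(UUID hostId) {
        return endpointByHostId.get(hostId) == null;
    }

    // Membership-based check: "new" if the endpoint is not yet a ring member.
    boolean isNewByMembership(String endpoint) {
        return !members.contains(endpoint);
    }
}
```

If {{recordHostId}} runs before the check, {{isNewByHostId}} reports the node as not new, whereas {{isNewByMembership}} still reports it as new until {{markNormal}} is called. That ordering dependence is the kind of raciness suspected above.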
Second, my reading of CASSANDRA-8236 and the [native protocol
spec|https://github.com/apache/cassandra/blob/cassandra-3.5/doc/native_protocol_v4.spec#L760-L765]
suggests that {{NEW_NODE}} notifications *do* need to be delayed until the new
node is in an RPC ready state, as clients are permitted to interpret them as a
signal that the node is up and ready for connections. This is somewhat
redundant as it means that the new node and first up notifications will always
be delivered together, but we shouldn't arbitrarily change the behaviour that
CASSANDRA-8236 introduced (aside from fixing the 11038 bug).
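As a rough sketch of the delayed delivery described above (not the actual server code; {{DeferredNotifierSketch}}, its methods and the event strings are all hypothetical), the new-node event can be held back until the endpoint reports RPC readiness and then sent together with the first up event:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical notifier; illustrates deferring NEW_NODE until RPC ready.
final class DeferredNotifierSketch {
    private final Set<String> pendingNew = new HashSet<>();
    private final List<String> sent = new ArrayList<>();

    // The node has joined the ring, but clients may not be able to connect yet,
    // so the NEW_NODE event is only remembered, not sent.
    void onJoined(String endpoint) {
        pendingNew.add(endpoint);
    }

    // The node is now RPC ready: flush a pending NEW_NODE first, then send UP.
    void onRpcReady(String endpoint) {
        if (pendingNew.remove(endpoint)) {
            sent.add("NEW_NODE " + endpoint);
        }
        sent.add("UP " + endpoint);
    }

    // Returns a copy of the events sent so far, in order.
    List<String> sent() {
        return new ArrayList<>(sent);
    }
}
```

With this shape, a client never sees {{NEW_NODE}} for an endpoint it cannot yet connect to, at the cost of the new node and first up notifications always arriving together.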
I'm planning to move 11038 to review today; I was just checking whether
{{LatestEvent}} was still necessary before doing that, but what I have so far is
[here|https://github.com/beobal/cassandra/tree/11038-3.0].
> dtest failure in
> pushed_notifications_test.TestPushedNotifications.move_single_node_test
> ----------------------------------------------------------------------------------------
>
> Key: CASSANDRA-11731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11731
> Project: Cassandra
> Issue Type: Test
> Reporter: Russ Hatch
> Assignee: Philip Thompson
> Labels: dtest
>
> one recent failure (no vnode job)
> {noformat}
> 'MOVED_NODE' != u'NEW_NODE'
> {noformat}
> http://cassci.datastax.com/job/trunk_novnode_dtest/366/testReport/pushed_notifications_test/TestPushedNotifications/move_single_node_test
> Failed on CassCI build trunk_novnode_dtest #366