[ 
https://issues.apache.org/jira/browse/CASSANDRA-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109870#comment-15109870
 ] 

Stefania commented on CASSANDRA-10870:
--------------------------------------

The test checks that each time node 2 restarts, node 1 sends us exactly 3 
notifications in this order: DOWN, UP and NEW_NODE, with the correct IP 
address. The failures under JDK 8 and JDK 7 are different, and neither is 
limited to 2.1: they also happen on 2.2 or 3.0. I suspect 3.2+ is affected too.
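
To make the expectation concrete, here is a minimal sketch of the check described above. It is not the dtest code: the helper name {{is_expected_restart_sequence}} is hypothetical, and it assumes each notification is a dict shaped like the ones printed in the debug logs of this ticket.

```python
# Hypothetical helper, not part of the dtest suite. Assumes notifications are
# dicts such as {'change_type': 'DOWN', 'address': ('127.0.0.2', 9042)}.
EXPECTED_CHANGE_TYPES = ["DOWN", "UP", "NEW_NODE"]


def is_expected_restart_sequence(notifications, address):
    """Return True only for exactly DOWN, UP, NEW_NODE, all for `address`."""
    if len(notifications) != len(EXPECTED_CHANGE_TYPES):
        return False  # an extra or a missing notification
    for notification, expected_type in zip(notifications, EXPECTED_CHANGE_TYPES):
        if notification["change_type"] != expected_type:
            return False  # wrong order or wrong type
        if notification["address"][0] != address:
            return False  # notification is about a different node
    return True
```

A strict check like this fails on both a reordered sequence and a sequence with an extra notification, so the assertion alone does not tell the two failure modes apart.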

On JDK 8, example 
[here|http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/174/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/],
 the order of {{UP}} and {{NEW_NODE}} notifications is swapped.

{code}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
{code}

An example for 2.2 is 
[here|http://cassci.datastax.com/job/cassandra-2.2_dtest_jdk8/156/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/]:

{code}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
{code}

On JDK 7, example 
[here|http://cassci.datastax.com/job/cassandra-2.1_dtest/385/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/],
 we received an extra {{NEW_NODE}} notification:

{code}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
{code}

This also happened on 3.0, except that the duplicated notification is {{UP}}; 
example 
[here|http://cassci.datastax.com/job/cassandra-3.0_dtest/508/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/]:

{code}
dtest: DEBUG: Restarting second node...
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Waiting for notifications from 127.0.0.1
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
{code}
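
When triaging more CassCI runs, it may help to sort each failure into "same notifications, wrong order" versus "extra notifications". The sketch below is one way to do that; the {{classify}} function and its labels are hypothetical, and the input lists are just the change types copied from the four traces in this comment.

```python
from collections import Counter

# Order the test expects after each restart of node 2.
EXPECTED = ["DOWN", "UP", "NEW_NODE"]


def classify(change_types):
    """Label an observed sequence of change types (hypothetical labels)."""
    if change_types == EXPECTED:
        return "ok"
    if Counter(change_types) == Counter(EXPECTED):
        return "reordered"  # right notifications, wrong order
    extra = Counter(change_types) - Counter(EXPECTED)
    missing = Counter(EXPECTED) - Counter(change_types)
    if extra and not missing:
        return "extra: " + ", ".join(sorted(extra.elements()))
    return "other"


# Change types copied from the traces quoted in this comment.
observed = {
    "2.1 JDK 8": ["DOWN", "NEW_NODE", "UP"],
    "2.2 JDK 8": ["UP", "NEW_NODE", "DOWN"],
    "2.1 JDK 7": ["NEW_NODE", "DOWN", "UP", "NEW_NODE"],
    "3.0 JDK 7": ["UP", "DOWN", "UP", "NEW_NODE"],
}
labels = {run: classify(types) for run, types in observed.items()}
```

With these inputs, the two JDK 8 traces are labelled as reordered and the two JDK 7 traces as carrying one extra notification.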

I'd say there is a chance we are seeing leftover notifications from when 
the node started for the first time during cluster start-up. If this is the 
case, it might be enough to add a pause before creating the waiter or, better, 
to start only node1, then start node2 and wait for the 3 notifications, and 
only then enter the loop. If this does not fix it, then we really have an 
issue in production code and you can assign the ticket to me. I can also try 
to fix the test if you want me to; just assign the ticket to me in that 
case.
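
The idea of consuming the start-up notifications before the restart loop can be illustrated with a small, self-contained simulation. This is only a sketch of the principle, not a patch for the dtest: the queue stands in for the notification waiter, the {{drain}} and {{collect}} helpers are hypothetical, and the start-up events put on the queue are illustrative rather than taken from a real run.

```python
import queue


def drain(events):
    """Remove and return everything currently queued, without blocking."""
    stale = []
    while True:
        try:
            stale.append(events.get_nowait())
        except queue.Empty:
            return stale


def collect(events, count, timeout=1.0):
    """Wait for `count` events; raises queue.Empty if one does not arrive."""
    return [events.get(timeout=timeout) for _ in range(count)]


# Stand-in for the waiter's notification queue (assumption for this sketch).
events = queue.Queue()

# Illustrative leftovers from the first start of node 2 during cluster start-up.
for change_type in ("NEW_NODE", "UP"):
    events.put(change_type)

# Proposed ordering: consume the start-up notifications first ...
stale = drain(events)

# ... and only then restart node 2 and check the notifications for that restart.
for change_type in ("DOWN", "UP", "NEW_NODE"):
    events.put(change_type)
received = collect(events, 3)
```

Without the {{drain}} step, the first three events read in the loop would include the start-up leftovers, which would look like a reordered or duplicated sequence even if the server behaved correctly.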

> pushed_notifications_test.py:TestPushedNotifications.restart_node_test 
> flapping on C* 2.1
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10870
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>            Assignee: DS Test Eng
>             Fix For: 2.1.x
>
>
> This test flaps on CassCI on 2.1. [~aboudreault] Do I remember correctly that 
> you did some work on these tests in the past few months? If so, could you 
> have a look and see if there's some assumption the test makes that doesn't hold 
> for 2.1?
> Oddly, it fails frequently under JDK8:
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/lastCompletedBuild/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/
> but less frequently on JDK7:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
