[ 
https://issues.apache.org/jira/browse/CASSANDRA-15863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133115#comment-17133115
 ] 

Berenguer Blasi edited comment on CASSANDRA-15863 at 6/11/20, 10:22 AM:
------------------------------------------------------------------------

This ticket fixes a number of failures so here's some direction for reviewers:

*test_resume_failed_replace, 
test_restart_failed_replace_with_reset_resume_state & 
test_resume_failed_replace*

These tests fail waiting for 
[this|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/Server.java#L164]
 log trace. It is never reached because the tests make bootstrap fail, so the 
bootstrap state stays marked IN_PROGRESS. The daemon therefore never gets that far: we 
[exit|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L568]
 before reaching that point.

The solution is to replace the nodes 
[without|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR482]
 waiting for that log trace and to check the bootstrap status in an 
[alternative|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR485]
 way.
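
For reviewers who want the general idea without opening the diff: instead of blocking on a log line that never appears, the test can poll for the state it cares about. The following is only a minimal sketch of that pattern, not the code in the PR. {{wait_for_status}} and its {{get_status}} argument are hypothetical names, and how the bootstrap state is actually read from the node is deliberately left open.

```python
import time


def wait_for_status(get_status, expected, timeout=60.0, interval=1.0):
    """Poll get_status() until it returns `expected` or `timeout` seconds pass.

    get_status is a hypothetical zero-argument callable supplied by the test;
    how it obtains the bootstrap state from the node is an assumption left
    to the caller. Returns True if the expected status was seen, False on
    timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would call this with whatever status lookup it has available and assert on the result, so a node that never reaches the expected state produces a clear timeout failure rather than a hang on a missing log trace.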

 

*test_resume_failed_replace*

Once the above was fixed we would still never hit the resume complete 
[log|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR507].
 This is because {{StorageService#resumeBootstrap}} 
[here|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1625]
 would throw an exception when starting the daemon. That exception was being 
swallowed; now it is logged. I also had to add a native transport 
[init|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1623]
 to avoid that exception and let the daemon start correctly. I am worried about 
possible side effects of this extra native transport init, so somebody with broader 
knowledge of the codebase should chime in.

 

*test_replace_nonexistent_node, test_replace_first_boot, 
test_replace_shutdown_node & test_replace_stopped_node*

These turned out to be failures caused by log messages having changed 
across versions.



> Boostrap resume and TestReplaceAddress fixes
> --------------------------------------------
>
>                 Key: CASSANDRA-15863
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15863
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission, Test/dtest
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0-alpha
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has been 
> [broken|https://ci-cassandra.apache.org/job/Cassandra-trunk/159/testReport/dtest-large.replace_address_test/TestReplaceAddress/test_restart_failed_replace/history/]
>  for ages


