[
https://issues.apache.org/jira/browse/CASSANDRA-15863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133115#comment-17133115
]
Berenguer Blasi edited comment on CASSANDRA-15863 at 6/11/20, 10:22 AM:
------------------------------------------------------------------------
This ticket fixes a number of failures, so here is some guidance for reviewers:
*test_resume_failed_replace,
test_restart_failed_replace_with_reset_resume_state &
test_restart_failed_replace*
These tests fail waiting for
[this|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/Server.java#L164]
log trace. It is never reached because the tests make bootstrap fail, so the
bootstrap is marked IN_PROGRESS. The daemon therefore never gets that far: we
[exit|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L568]
before we reach that point.
The solution is to replace the nodes
[without|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR482]
waiting for that log trace and to check the bootstrap status in an
[alternative|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR485]
way.
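As a rough illustration of checking the status directly instead of waiting for a log line, here is a minimal polling sketch. It is not the code from the linked PR: {{wait_for_bootstrap_state}} and {{fetch_state}} are hypothetical names, and how the state is actually read from the node is left to the caller.

```python
import time


def wait_for_bootstrap_state(fetch_state, expected, timeout=60.0, interval=1.0):
    """Poll fetch_state() until it returns `expected` or `timeout` seconds pass.

    fetch_state is a hypothetical zero-argument callable supplied by the
    test; it should return the node's current bootstrap state as a string.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()
        if state == expected:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"bootstrap state is {state!r}, expected {expected!r}"
            )
        time.sleep(interval)
```

For the failed-replace case described above, a test would call this with {{expected="IN_PROGRESS"}} rather than waiting for the native transport start-up message, which is never logged.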
*test_resume_failed_replace*
Once the above was fixed, we would still never hit the resume complete
[log|https://github.com/apache/cassandra-dtest/pull/76/files#diff-271eb822afe096c34193f9071884d8beR507].
This is because {{StorageService#resumeBootstrap}}
[here|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1625]
would throw an exception when starting the daemon. That exception was being
swallowed; now it is logged. I also had to add a native transport
[init|https://github.com/apache/cassandra/pull/622/files#diff-b76a607445d53f18a98c9df14323c7ddR1623]
to avoid that exception and let the daemon start correctly. I am worried about
side effects of this extra native transport init, so somebody with broader
knowledge of the codebase should chime in.
*test_replace_nonexistent_node, test_replace_first_boot,
test_replace_shutdown_node & test_replace_stopped_node*
In the end, these turned out to be failures caused by log messages having
changed across versions.
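One way a test can cope with log wording that differs between versions is to select the expected pattern from a per-version table. The sketch below only illustrates that idea: the helper names are hypothetical and the patterns are placeholders, not actual Cassandra log lines or the change made in the PR.

```python
import re

# Hypothetical table of (minimum version, regex), newest first.
# The patterns are placeholders, not real Cassandra log messages.
PATTERNS = [
    ((4, 0), r"placeholder message, newer wording"),
    ((3, 0), r"placeholder message, older wording"),
]


def parse_version(text):
    """Return (major, minor) from strings such as '3.11.6' or '4.0-alpha4'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def pattern_for(version_text, table=PATTERNS):
    """Pick the first pattern whose minimum version is <= the given version."""
    version = parse_version(version_text)
    for minimum, pattern in table:
        if version >= minimum:
            return pattern
    raise LookupError(f"no log pattern known for version {version_text!r}")
```

The test then waits for {{pattern_for(version)}} instead of a single hard-coded message.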
> Bootstrap resume and TestReplaceAddress fixes
> --------------------------------------------
>
> Key: CASSANDRA-15863
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15863
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission, Test/dtest
> Reporter: Berenguer Blasi
> Assignee: Berenguer Blasi
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-alpha
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This has been
> [broken|https://ci-cassandra.apache.org/job/Cassandra-trunk/159/testReport/dtest-large.replace_address_test/TestReplaceAddress/test_restart_failed_replace/history/]
> for ages