[
https://issues.apache.org/jira/browse/CASSANDRA-16405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tomasz Lasica updated CASSANDRA-16405:
--------------------------------------
Description:
*Summary*
Some tests validate the situation in which a node (or cluster) does not start
correctly.
They rely on `TimeoutError` being raised, but `ccm` may also raise `NodeError`
before the timeout elapses.
*Why we need this change*
We can improve `ccm` to fail fast when the node being started terminates. This
would:
* make unexpected test failures surface sooner (without waiting 90 or 120s)
* shorten overall test duration, even when a timeout is given
ccm work (in progress): https://github.com/riptano/ccm/pull/724
*Proposed improvement*
Handle both `TimeoutError` and `NodeError` as an expected node start failure.
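A minimal sketch of what treating both exceptions as an expected failure could look like in a test helper. The exception classes here are local stand-ins for the `ccm` ones, and `expect_start_failure`, `slow_failure` and `fast_failure` are hypothetical names used only for illustration; they are not taken from the PR.

```python
class NodeError(Exception):
    """Stand-in for the error ccm raises when the node process terminates."""


class StartTimeoutError(Exception):
    """Stand-in for ccm's TimeoutError (renamed to avoid shadowing the builtin)."""


def expect_start_failure(start):
    """Call start() and return the exception if the node failed to start.

    Both a timeout and an early node termination count as the expected
    outcome; a successful start is reported as a test failure.
    """
    try:
        start()
    except (StartTimeoutError, NodeError) as exc:
        return exc
    raise AssertionError("node started, but a start failure was expected")


def slow_failure():
    # Simulates the current behaviour: the failure is only noticed at timeout.
    raise StartTimeoutError("node did not start within the timeout")


def fast_failure():
    # Simulates the fail-fast behaviour: the node process exits early.
    raise NodeError("node process terminated during startup")
```

With a helper of this shape, a test passes regardless of which of the two exceptions signals the failed start, so it keeps working before and after the `ccm` change.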
*PR*
[https://github.com/apache/cassandra-dtest/pull/113/files]
was:
*Summary*
Node start timeouts should be explicitly extended beyond the default 90s
(bootstrap with reset state, replace node tests), because the default 90s will
start to take effect after the ccm changes.
*Why we need this change*
There is a bug in [https://github.com/riptano/ccm]: the node.start() timeout
(or, more precisely, the node.wait_for_binary_proto() timeout) is in practice
600s. This is the time it waits for a certain log message:
[https://github.com/riptano/ccm/blob/484476494bda6d71f895826358722a7b1c47a3cf/ccmlib/node.py#L642|https://github.com/riptano/ccm/blob/cassandra-test/ccmlib/node.py#L642]
This bug will be fixed by: [https://github.com/riptano/ccm/pull/725]
*Proposed improvement*
Explicitly raise the node start timeout to 120s or 180s (depending on the
scenario) by using the existing `Node` API to provide the timeout as an int (in
seconds) instead of a bool.
Note that this is available only after [https://github.com/riptano/ccm/pull/725]
is merged, but it should not break test logic before then.
*PR*
[https://github.com/apache/cassandra-dtest/pull/113/files]
> Handle both TimeoutError and NodeError when expecting node start failure
> ------------------------------------------------------------------------
>
> Key: CASSANDRA-16405
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16405
> Project: Cassandra
> Issue Type: Improvement
> Components: Test/dtest/python
> Reporter: Tomasz Lasica
> Assignee: Tomasz Lasica
> Priority: Low
> Fix For: 2.2.20, 3.0.24, 3.11.10, 4.0-beta5, 4.0