[jira] [Updated] (CASSANDRA-16405) Handle both TimeoutError and NodeError when expecting node start failure

Tomasz Lasica (Jira) Mon, 25 Jan 2021 02:07:06 -0800


     [ 
https://issues.apache.org/jira/browse/CASSANDRA-16405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tomasz Lasica updated CASSANDRA-16405:
--------------------------------------
    Description: 
*Summary*

Some tests validate the situation in which a node (or cluster) does not start 
correctly.

They rely on `TimeoutError` being raised, but it is also possible for `ccm` to 
raise `NodeError` without waiting for the timeout to elapse.

*Why we need this change*

We can improve `ccm` to fail fast when the node being started terminates. This 
would:
 * make unexpected test failures surface sooner (no waiting 90 or 120s)
 * shorten overall test duration, even when a timeout is given

ccm work (in progress): https://github.com/riptano/ccm/pull/724

*Proposed improvement*

Treat both `TimeoutError` and `NodeError` as an expected node start failure.
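
A minimal sketch of the idea, not the code from the PR. The two exception classes below are local stand-ins for the `TimeoutError` and `NodeError` that `ccm` raises, and `assert_start_fails` is a hypothetical helper name used only for illustration:

```python
class StandInTimeoutError(Exception):
    """Stand-in for ccm's TimeoutError (assumption: raised when the wait times out)."""


class StandInNodeError(Exception):
    """Stand-in for ccm's NodeError (assumption: raised when the node process dies)."""


# Either exception means "the node did not start", which is what the test expects.
EXPECTED_START_FAILURES = (StandInTimeoutError, StandInNodeError)


def assert_start_fails(start_node):
    """Call start_node() and return the expected failure it raised.

    Raises AssertionError if the node starts successfully. Any other
    exception type propagates unchanged, so unrelated bugs are not hidden.
    """
    try:
        start_node()
    except EXPECTED_START_FAILURES as exc:
        return exc
    raise AssertionError("node started, but a start failure was expected")


def slow_failure():
    # Simulates today's behaviour: the failure is only noticed at the timeout.
    raise StandInTimeoutError("timed out waiting for the node to start")


def fast_failure():
    # Simulates a fail-fast ccm: the terminated process is reported at once.
    raise StandInNodeError("node process terminated during startup")


if __name__ == "__main__":
    for attempt in (slow_failure, fast_failure):
        print(type(assert_start_fails(attempt)).__name__)
```

Catching a tuple of exception types keeps the test valid both before and after the `ccm` change, since it does not depend on which of the two errors is raised.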

*PR*

[https://github.com/apache/cassandra-dtest/pull/113/files]

 

  was:
*Summary*

Node start timeouts should be explicitly extended beyond the default 90s 
(bootstrap with reset state, replace node tests) because the default 90s will 
start to take effect after the ccm changes.

*Why we need this change*

There is a bug in [https://github.com/riptano/ccm]: the node.start() timeout 
(or more precisely the node.wait_for_binary_proto() timeout) is in practice 
600s. This is the time spent waiting for a certain log message:

[https://github.com/riptano/ccm/blob/484476494bda6d71f895826358722a7b1c47a3cf/ccmlib/node.py#L642|https://github.com/riptano/ccm/blob/cassandra-test/ccmlib/node.py#L642]

This bug will be fixed by: [https://github.com/riptano/ccm/pull/725]

*Proposed improvement*

Explicitly raise the node start timeout to 120s or 180s (depending on the 
scenario) by using the existing `Node` API to provide the timeout as an int (in 
seconds) instead of a bool.

Note that this is available after [https://github.com/riptano/ccm/pull/725] is 
merged but should not break test logic before it is merged.

*PR*

[https://github.com/apache/cassandra-dtest/pull/113/files]

 


> Handle both TimeoutError and NodeError when expecting node start failure
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16405
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16405
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest/python
>            Reporter: Tomasz Lasica
>            Assignee: Tomasz Lasica
>            Priority: Low
>             Fix For: 2.2.20, 3.0.24, 3.11.10, 4.0-beta5, 4.0
>
>
> *Summary*
> Some tests validate the situation in which a node (or cluster) does not start 
> correctly.
> They rely on `TimeoutError` being raised, but it is also possible for `ccm` to 
> raise `NodeError` without waiting for the timeout to elapse.
> *Why we need this change*
> We can improve `ccm` to fail fast when the node being started terminates. This 
> would:
>  * make unexpected test failures surface sooner (no waiting 90 or 120s)
>  * shorten overall test duration, even when a timeout is given
> ccm work (in progress): https://github.com/riptano/ccm/pull/724
> *Proposed improvement*
> Treat both `TimeoutError` and `NodeError` as an expected node start failure.
> *PR*
> [https://github.com/apache/cassandra-dtest/pull/113/files]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
