[
https://issues.apache.org/jira/browse/CASSANDRA-7821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109626#comment-14109626
]
Russell Alexander Spitzer commented on CASSANDRA-7821:
------------------------------------------------------
Attached a patch which adds two simple options to C* Stress:
{code}
backoff_strategy = {CONSTANT,LINEAR,EXPONENTIAL}
CONSTANT : Wait a constant number of seconds, set by backoff_seconds
LINEAR : Wait retry_num * backoff_seconds seconds
EXPONENTIAL : Wait backoff_seconds * 2 ^ retry_num seconds
backoff_seconds = #
The number of seconds to be used as a coefficient in the above strategies
{code}
https://github.com/RussellSpitzer/cassandra/compare/RussellSpitzer:cassandra-2.1...CASSANDRA-7821
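For readers who do not want to open the diff, the three strategies can be sketched roughly as follows. This is an illustration only and is not taken from the patch: the enum name, the method name, and the assumption that retry_num starts at 1 are all hypothetical.

```java
// Hypothetical sketch of the three backoff strategies described above.
// None of these names come from the patch itself.
enum BackoffStrategy {
    CONSTANT, LINEAR, EXPONENTIAL;

    /**
     * Seconds to wait before retry number retryNum.
     * Assumption: retryNum starts at 1 for the first retry.
     */
    long delaySeconds(int retryNum, long backoffSeconds) {
        switch (this) {
            case CONSTANT:
                return backoffSeconds;
            case LINEAR:
                return retryNum * backoffSeconds;
            case EXPONENTIAL:
                // backoff_seconds * 2 ^ retry_num; not guarded against
                // overflow for very large retry counts.
                return backoffSeconds * (1L << retryNum);
            default:
                throw new AssertionError(this);
        }
    }
}
```

With backoff_seconds = 2, the third retry would wait 2, 6, and 16 seconds under CONSTANT, LINEAR, and EXPONENTIAL respectively.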
I also bumped the thread timeout up to 10 minutes, but ideally we would
pass through the maximum expected retry time instead.
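One possible way to derive that maximum, sketched here under assumptions rather than implemented in the patch: sum the per-retry delays over the configured number of retries. The class name, the method name, the maxRetries parameter, and retry numbering from 1 are all hypothetical, and the result covers backoff time only, not the time spent in the operations themselves.

```java
// Hypothetical helper; not part of the attached patch.
final class RetryBudget {
    /**
     * Total backoff in seconds across retries 1..maxRetries.
     * Assumption: retry numbering starts at 1.
     */
    static long maxTotalBackoffSeconds(String strategy, long backoffSeconds, int maxRetries) {
        long total = 0;
        for (int retryNum = 1; retryNum <= maxRetries; retryNum++) {
            switch (strategy) {
                case "CONSTANT":
                    total += backoffSeconds;
                    break;
                case "LINEAR":
                    total += retryNum * backoffSeconds;
                    break;
                case "EXPONENTIAL":
                    total += backoffSeconds * (1L << retryNum);
                    break;
                default:
                    throw new IllegalArgumentException("unknown strategy: " + strategy);
            }
        }
        return total;
    }
}
```

For example, EXPONENTIAL with backoff_seconds = 2 and 5 retries gives 124 seconds of backoff, which fits within the 10 minute timeout; a larger retry count or coefficient could exceed it.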
[~benedict] As usual, your feedback would be extremely welcome.
> Add Optional Backoff on Retry to Cassandra Stress
> -------------------------------------------------
>
> Key: CASSANDRA-7821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7821
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Russell Alexander Spitzer
> Assignee: Russell Alexander Spitzer
>
> Currently when stress is running against a cluster which occasionally has
> nodes marked as down, it will almost immediately stop. This occurs because
> the retry loop can execute extremely quickly if each execution terminates
> with a {{com.datastax.driver.core.exceptions.NoHostAvailableException}} or
> {{com.datastax.driver.core.exceptions.UnavailableException}}.
> With these exceptions, the operation will most likely be unable to succeed if the
> retries are performed as fast as possible. To get around this, we could add
> an optional delay on retries giving the cluster time to recover rather than
> terminating the stress run.
> We could make this configurable, with options such as:
> * Constant # Delays the same amount after each retry
> * Linear # Backs off by a set amount * the trial number
> * Exponential # Backs off by a set amount * 2 ^ trial number
> This may also require adjusting the "thread is stuck check" to make sure that
> the max retry timeout will not cause the thread to be terminated early.