[
https://issues.apache.org/jira/browse/CASSANDRA-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17442920#comment-17442920
]
David Capwell commented on CASSANDRA-17069:
-------------------------------------------
Here are my results so far with
org.apache.cassandra.distributed.test.NetstatsBootstrapWithEntireSSTablesCompressionStreamingTest.testWithStreamingEntireSSTablesWithoutCompressionWithoutThrottling
The stack trace is for the following line
{code}
final Future<AbstractNetstatsStreaming.NetstatResults> netstatsFuture =
    executorService.submit(new NetstatsCallable(cluster.get(1)));
final AbstractNetstatsStreaming.NetstatResults results =
    netstatsFuture.get(1, MINUTES); // timeout here
{code}
This future calls nodetool in a loop, sleeping between calls
(Thread.sleep(500)). It stops looping once it no longer sees Receiving/Sending
in the logs (i.e. streaming ran but is no longer running). After that point it
waits for the node to come up (2m timeout)...
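
For readers unfamiliar with the test, the polling behaviour described above can be sketched roughly as follows. This is not the actual NetstatsCallable code: the class name, the method name, and the Supplier standing in for the nodetool call are all hypothetical, and only the "loop, sleep, stop once Receiving/Sending was seen and is gone" shape comes from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; not the real NetstatsCallable.
final class NetstatsPollSketch {

    /**
     * Calls the supplied netstats source repeatedly, sleeping between calls.
     * Returns every captured output once streaming has been observed at least
     * once and is no longer visible in the latest output.
     */
    static List<String> pollUntilStreamingStops(Supplier<String> netstats, long sleepMillis) {
        List<String> captured = new ArrayList<>();
        boolean sawStreaming = false;
        while (true) {
            String output = netstats.get(); // stand-in for the nodetool call (assumption)
            captured.add(output);
            boolean streaming = output.contains("Receiving") || output.contains("Sending");
            if (streaming) {
                sawStreaming = true;
            } else if (sawStreaming) {
                return captured; // streaming ran but is no longer running
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while polling", e);
            }
        }
    }
}
```

Note that a loop of this shape has no upper bound of its own; the only limit is whatever timeout the caller puts on the future, which is where the test timed out.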
I do not believe this patch impacts this test (I ran it locally and this case
is hard to hit), but just in case I plan to patch the test (see a flake, fix a
flake) to wait longer for the node to come up (before it was effectively 3m)
and only then check for streaming (if we are not done streaming after node2 is
up... what happened?).
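
The reordering I have in mind looks roughly like the sketch below: wait for the node with a generous deadline first, then treat "still streaming" as a failure. The names here (awaitTrue, verify, the two BooleanSupplier checks) are made up for illustration and are not the test's real API; the actual patch may look quite different.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "wait for startup, then check streaming".
final class StartupThenStreamingCheckSketch {

    /** Polls the condition until it is true or the timeout elapses. */
    static boolean awaitTrue(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Fails if the node never comes up, or if it is up but streaming has not finished. */
    static void verify(BooleanSupplier nodeIsUp, BooleanSupplier stillStreaming,
                       long startupTimeoutMillis) {
        if (!awaitTrue(nodeIsUp, startupTimeoutMillis, 10L)) {
            throw new IllegalStateException("node did not come up in time");
        }
        if (stillStreaming.getAsBoolean()) {
            throw new IllegalStateException("node is up but streaming is still in progress");
        }
    }
}
```

Checking streaming only after startup means the two failure modes (slow startup versus streaming that never finished) produce different errors instead of one opaque timeout.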
> Refactor normal/preview/IR repair to standardize repair cleanup and error
> handling of failed RepairJobs
> -------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-17069
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17069
> Project: Cassandra
> Issue Type: Improvement
> Components: Consistency/Repair
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 4.x
>
>
> Right now we have 3 different implementations of repair: normal, preview, and
> incremental (IR); all 3 handle RepairJob failures differently and offer
> different state cleanup. To make sure that we handle errors and clean up
> state consistently, we should move these responsibilities out of the
> repair task itself and into common APIs, and move some logic into
> the repair coordination itself.
> This work relates to CASSANDRA-15399, as special handling for each task makes
> the work more complex.