[jira] [Commented] (CASSANDRA-16079) Improve dtest runtime

Paulo Motta (Jira) Sat, 03 Oct 2020 14:50:24 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206864#comment-17206864
 ]


Paulo Motta commented on CASSANDRA-16079:
-----------------------------------------

{quote}To cache bootstrapped ccm clusters it meant that those ccm clusters had 
to be bootstrapped, then stopped, then the directory copied, and then started 
up again for the test in question.
{quote}
Is it necessary to stop the cluster to cache it and then start it again? Can't 
we just flush and copy the directory without stopping the cluster? This might 
shave some restart time.
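
A minimal sketch of the flush-and-copy idea, assuming a hypothetical `flush_nodes` callback (for example, one that runs `nodetool flush` on every node) and illustrative directory paths; none of these names come from the patch. Whether a copy taken from live nodes is consistent enough to start from is exactly the open question, so this only shows the mechanics:

```python
import shutil
from pathlib import Path


def cache_running_cluster(cluster_dir, cache_dir, flush_nodes):
    """Flush, then copy the cluster directory without stopping any node.

    `flush_nodes` is a hypothetical callable supplied by the caller; it is
    expected to flush memtables to disk on every node before the copy.
    """
    flush_nodes()
    target = Path(cache_dir)
    # Replace any stale cache so the copy reflects the current cluster state.
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(cluster_dir, target)
    return target
```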

One optimization we can do is to *NOT* enforce +_wait_for_binary_proto_+ before 
starting the next node in [~adejanovski]'s 
[patch|https://github.com/apache/cassandra/pull/663/commits/db4f32fa6981d9fdddce2730438f4a057a566ca9],
 but only +_wait_other_notice_+, since this guarantees that the next node will 
take the previous one into account during token allocation without waiting for 
it to fully bootstrap (which takes at least 30s due to 
*_cassandra.ring_delay_ms=30000_*). This should restore some level of 
parallelism to the cluster bootstrap process.
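
To make the expected gain concrete, here is a rough timing model, not a measurement. It assumes each node needs `ring_delay_s` (30s, from the setting above) to finish bootstrapping, and that the other nodes notice a new node `notice_s` seconds after it starts; `notice_s` and both function names are hypothetical:

```python
RING_DELAY_S = 30.0  # from cassandra.ring_delay_ms=30000


def serial_startup_seconds(nodes, ring_delay_s=RING_DELAY_S):
    """Each node waits for the previous one to finish bootstrapping
    (the wait_for_binary_proto behaviour)."""
    return nodes * ring_delay_s


def overlapped_startup_seconds(nodes, notice_s, ring_delay_s=RING_DELAY_S):
    """Each node starts once the others have noticed the previous one
    (the wait_other_notice behaviour), so bootstraps overlap.

    Assumes notice_s <= ring_delay_s. The last node starts after
    (nodes - 1) notice intervals and then needs one full ring delay.
    """
    return (nodes - 1) * notice_s + ring_delay_s


# Three nodes, with an assumed 5s for the others to notice a new node.
print(serial_startup_seconds(3))           # 90.0
print(overlapped_startup_seconds(3, 5.0))  # 40.0
```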

If the above suggestion does not bring acceptable results, another point 
to consider is that by enabling 
_*allocate_tokens_for_local_replication_factor: 3*_ by default in 
CASSANDRA-13701, we are now exercising the new token allocation algorithm on 
all dtests, even the ones that are not specifically testing it.

While it's probably a good idea to run all dtests with the new token allocation 
algorithm to catch unintended edge cases, the downside is that this increases 
dtest execution time. Couldn't we keep using random allocation by default on 
dtests to be able to bootstrap in parallel, and have a new suite of dtests 
executed nightly and before release (similar to _novnode_) with the new token 
allocation enabled?
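
One way such a switch could look, as a sketch only: the helper name and the `use_token_allocation` flag are hypothetical, and how the resulting options would be wired into the dtest framework is not shown. Only the `allocate_tokens_for_local_replication_factor` setting comes from the discussion above:

```python
def dtest_cluster_options(use_token_allocation=False, replication_factor=3):
    """Build the extra cassandra.yaml options for a dtest cluster.

    With the flag off (the proposed default) nothing is added, so tokens are
    allocated randomly and nodes can bootstrap in parallel. With the flag on
    (the proposed nightly / pre-release suite) the new allocation algorithm
    is enabled.
    """
    options = {}
    if use_token_allocation:
        options["allocate_tokens_for_local_replication_factor"] = replication_factor
    return options
```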

> Improve dtest runtime
> ---------------------
>
>                 Key: CASSANDRA-16079
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16079
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CI
>            Reporter: Adam Holmberg
>            Priority: Normal
>             Fix For: 4.0-beta, 4.0-triage
>
>         Attachments: Screenshot 2020-09-19 at 12.32.21.png
>
>
> A recent ticket, CASSANDRA-13701, changed the way dtests run, resulting in a 
> [30% increase in run 
> time|https://www.mail-archive.com/[email protected]/msg15606.html]. 
> While that change was accepted, we wanted to spin out a ticket to optimize 
> dtests in an attempt to gain back some of that runtime.
> At this time we don't have concrete improvements in mind, so the first order 
> of business for this ticket will be to analyze the current state of things 
> and try to identify some valuable optimizations. Once the problems are 
> understood, we will break down subtasks to divide the work.
> Some areas to consider:
> * cluster reuse
> * C* startup optimizations
> * Tests that should be ported to in-JVM dtest or even unit tests



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
