[
https://issues.apache.org/jira/browse/CASSANDRA-21383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-21383:
--------------------------------------------
Description:
An epic for improvements in Cassandra tests execution time.
Ideas:
* Review startup/shutdown/before & after individual test logic for CQLTester
** Reduce number of flushes to disk
*** TCM triggered: UNSAFE_SYSTEM("cassandra.unsafesystem") is introduced in
Accord tests, we can try to use it for majority of other tests too
*** SystemKeyspace#setLocalHostId - skip flush for tests
*** dropping of tables after test execution causes a flush, we can try to skip
it
***
CommitLog.instance.forceRecycleAllSegments(Collections.singleton(metadata.id));
on drop, can we skip it for tests?
** Reduce cassandra.shutdown_announce_in_ms for tests
** Avoid non-needed SSL logic/libraries load - TEST_JVM_DTEST_DISABLE_SSL
option
** Do not inherit DisableSslContextFactory from AbstractSslContextFactory to
avoid Caffeine cache creation and other clinit/init logic
** Disable snapshots on table/keyspace drop (auto_snapshot: false) - CQLTester
drops keyspaces after a test run
** org.apache.cassandra.locator.SimpleSeedProvider#getSeeds - loads yaml
again, can we avoid it?
** Schema.instance.saveSystemKeyspace() - disable for tests?
** org.apache.cassandra.ServerTestUtils#cleanup - bulk removal, can we do it
for all directories together (one invocation of rm process instead of many)?
** JSON write:
org.apache.cassandra.db.commitlog.CommitLogDescriptor#constructParametersString
- we spend time here to load and init JSON ObjectMapper
** AccordService init - can we skip it for logic when Accord is not needed?
(introduce requireAccord method in CQLTester)
** avoid MBeans register if a test does not need it?
** CassandraXMLJUnitResultFormatter - do we really need hostname in XML report
to spend time to retrieve it?
** Disable com.datastax.driver.USE_NATIVE_CLOCK to avoid native JNR library
loading
** org.apache.cassandra.config.DefaultLoader#getProperties - can we move some
of the parsing of config classes info into compile phase instead of runtime?
* Test combinations
** Analyze current tests for different configurations & JVM versions, do we
really need a full cartesian join?
* Slowest tests assessment
** check tests for non-efficient awaiting patterns (like sleeping to wait for
events)
* JVM level
** disable C2 (TieredStopAtLevel=1)
** adjust GC settings (try serial GC)
** classpath/JAR indexing?
** CDS (Class Data Sharing)
** metaspace optimal size to avoid metaspace-triggered full GC
* Jenkins level: currently we create and schedule several thousand tasks
at the same time; the Jenkins Controller does not seem to cope well with that
and starts to lag (at least the UI), so we need to find a more efficient way to
manage the concurrency of test execution
* Revise the microbenchmark smoke run for long executions; some JMH tests
probably need a reduced set of options
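
To illustrate the "non-efficient awaiting patterns" item above, here is a minimal sketch of replacing a fixed sleep with a bounded poll. The class and method names (AwaitSketch, awaitCondition) are hypothetical and are not taken from the Cassandra code base; this only shows the general idea, not how any existing test is written.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Cassandra: waits for a condition instead of
// sleeping for a fixed, worst-case amount of time.
final class AwaitSketch
{
    /**
     * Polls the condition at a short interval and returns as soon as it holds.
     * Returns false if the timeout elapses or the thread is interrupted.
     */
    static boolean awaitCondition(BooleanSupplier condition, Duration timeout, Duration pollInterval)
    {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean())
        {
            // Overflow-safe comparison against the deadline
            if (System.nanoTime() - deadline >= 0)
                return false;
            try
            {
                Thread.sleep(pollInterval.toMillis());
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args)
    {
        CountDownLatch done = new CountDownLatch(1);
        new Thread(done::countDown).start(); // stands in for the event a test waits for

        boolean ok = awaitCondition(() -> done.getCount() == 0,
                                    Duration.ofSeconds(5),
                                    Duration.ofMillis(10));
        System.out.println("condition met: " + ok);
    }
}
```

With a fixed Thread.sleep the test always pays the full delay; with a bounded poll it pays only as long as the event actually takes, and the timeout is only hit on failure. When the test controls the event source directly, waiting on a CountDownLatch or a future avoids polling altogether.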
was:
An epic for improvements in Cassandra tests execution time.
Ideas:
* Review startup/shutdown/before & after individual test logic for CQLTester
** Reduce number of flushes to disk
*** TCM triggered: UNSAFE_SYSTEM("cassandra.unsafesystem") is introduced in
Accord tests, we can try to use it for majority of other tests too
*** SystemKeyspace#setLocalHostId - skip flush for tests
*** dropping of tables after test execution causes a flush, we can try to skip
it
***
CommitLog.instance.forceRecycleAllSegments(Collections.singleton(metadata.id));
on drop, can we skip it for tests?
** Reduce cassandra.shutdown_announce_in_ms for tests
** Avoid non-needed SSL logic/libraries load - TEST_JVM_DTEST_DISABLE_SSL
option
** Do not inherit DisableSslContextFactory from AbstractSslContextFactory to
avoid Caffeine cache creation and other clinit/init logic
** Disable snapshots on table/keyspace drop (auto_snapshot: false) - CQLTester
drops keyspaces after a test run
** org.apache.cassandra.locator.SimpleSeedProvider#getSeeds - loads yaml
again, can we avoid it?
** Schema.instance.saveSystemKeyspace() - disable for tests?
** org.apache.cassandra.ServerTestUtils#cleanup - bulk removal, can we do it
for all directories together (one invocation of rm process instead of many)?
** JSON write:
org.apache.cassandra.db.commitlog.CommitLogDescriptor#constructParametersString
- we spend time here to load and init JSON ObjectMapper
** AccordService init - can we skip it for logic when Accord is not needed?
(introduce something requireAccord method in CQLTester)
** avoid MBeans register if a test does not need it?
** CassandraXMLJUnitResultFormatter - do we really need hostname in XML report
to spend time to retrieve it?
** Disable com.datastax.driver.USE_NATIVE_CLOCK to avoid native JNR library
loading
** org.apache.cassandra.config.DefaultLoader#getProperties - can we move some
of the parsing of config classes info into compile phase instead of runtime?
* Test combinations
** Analyze current tests for different configurations & JVM versions, do we
really need a full cartesian join?
* Slowest tests assessment
** check tests for non-efficient awaiting patterns (like sleeping to wait for
events)
* JVM level
** disable C2 (TieredStopAtLevel=1)
** adjust GC settings (try serial GC)
** classpath/JAR indexing?
** CDS (Class Data Sharing)
** metaspace optimal size to avoid metaspace-triggered full GC
* Jenkins level: currently we create and schedule several thousand tasks
at the same time; the Jenkins Controller does not seem to cope well with that
and starts to lag (at least the UI), so we need to find a more efficient way to
manage the concurrency of test execution
* Revise the microbenchmark smoke run for long executions; some JMH tests
probably need a reduced set of options
> Speedup Cassandra tests execution
> ---------------------------------
>
> Key: CASSANDRA-21383
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21383
> Project: Apache Cassandra
> Issue Type: Epic
> Components: Test/dtest/java, Test/dtest/python, Test/unit
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
>
> An epic for improvements in Cassandra tests execution time.
> Ideas:
> * Review startup/shutdown/before & after individual test logic for CQLTester
> ** Reduce number of flushes to disk
> *** TCM triggered: UNSAFE_SYSTEM("cassandra.unsafesystem") is introduced in
> Accord tests, we can try to use it for majority of other tests too
> *** SystemKeyspace#setLocalHostId - skip flush for tests
> *** dropping of tables after test execution causes a flush, we can try to
> skip it
> ***
> CommitLog.instance.forceRecycleAllSegments(Collections.singleton(metadata.id));
> on drop, can we skip it for tests?
> ** Reduce cassandra.shutdown_announce_in_ms for tests
> ** Avoid non-needed SSL logic/libraries load - TEST_JVM_DTEST_DISABLE_SSL
> option
> ** Do not inherit DisableSslContextFactory from AbstractSslContextFactory to
> avoid Caffeine cache creation and other clinit/init logic
> ** Disable snapshots on table/keyspace drop (auto_snapshot: false) -
> CQLTester drops keyspaces after a test run
> ** org.apache.cassandra.locator.SimpleSeedProvider#getSeeds - loads yaml
> again, can we avoid it?
> ** Schema.instance.saveSystemKeyspace() - disable for tests?
> ** org.apache.cassandra.ServerTestUtils#cleanup - bulk removal, can we do it
> for all directories together (one invocation of rm process instead of
> many)?
> ** JSON write:
> org.apache.cassandra.db.commitlog.CommitLogDescriptor#constructParametersString
> - we spend time here to load and init JSON ObjectMapper
> ** AccordService init - can we skip it for logic when Accord is not needed?
> (introduce requireAccord method in CQLTester)
> ** avoid MBeans register if a test does not need it?
> ** CassandraXMLJUnitResultFormatter - do we really need hostname in XML
> report to spend time to retrieve it?
> ** Disable com.datastax.driver.USE_NATIVE_CLOCK to avoid native JNR library
> loading
> ** org.apache.cassandra.config.DefaultLoader#getProperties - can we move
> some of the parsing of config classes info into compile phase instead of
> runtime?
> * Test combinations
> ** Analyze current tests for different configurations & JVM versions, do we
> really need a full cartesian join?
> * Slowest tests assessment
> ** check tests for non-efficient awaiting patterns (like sleeping to wait
> for events)
> * JVM level
> ** disable C2 (TieredStopAtLevel=1)
> ** adjust GC settings (try serial GC)
> ** classpath/JAR indexing?
> ** CDS (Class Data Sharing)
> ** metaspace optimal size to avoid metaspace-triggered full GC
> * Jenkins level: currently we create and schedule several thousand tasks
> at the same time; the Jenkins Controller does not seem to cope well with
> that and starts to lag (at least the UI), so we need to find a more efficient
> way to manage the concurrency of test execution
> * Revise the microbenchmark smoke run for long executions; some JMH tests
> probably need a reduced set of options
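
As a rough illustration of the "one invocation of rm process instead of many" idea from the ServerTestUtils#cleanup item, the sketch below builds a single command line covering all directories. BulkCleanupSketch, buildCommand and deleteAll are hypothetical names, and this makes no claim about how ServerTestUtils#cleanup is currently implemented; it also assumes a POSIX rm is available on the PATH.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Cassandra code: remove many directories with one
// child process instead of spawning one process per directory.
final class BulkCleanupSketch
{
    /** Builds a single "rm -rf -- dir1 dir2 ..." command for all directories. */
    static List<String> buildCommand(List<Path> directories)
    {
        List<String> command = new ArrayList<>();
        command.add("rm");
        command.add("-rf");
        command.add("--"); // end of options, so unusual directory names are not parsed as flags
        for (Path dir : directories)
            command.add(dir.toAbsolutePath().toString());
        return command;
    }

    /** Runs the single rm process and returns its exit code (0 for an empty list). */
    static int deleteAll(List<Path> directories)
    {
        if (directories.isEmpty())
            return 0; // nothing to delete, do not spawn a process at all
        try
        {
            Process process = new ProcessBuilder(buildCommand(directories)).inheritIO().start();
            return process.waitFor();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return -1;
        }
    }
}
```

The saving, if any, comes from paying the process start-up cost once per cleanup rather than once per directory; whether that is measurable for the test suite would need to be profiled.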