[
https://issues.apache.org/jira/browse/CASSANDRA-21008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ling Mao updated CASSANDRA-21008:
---------------------------------
Description:
h4. Goals and Motivation
Random Test is designed to help with testing systems that rely on randomness,
such as those involving probabilistic behavior, concurrency, or randomized
algorithms. It enables automated, repeatable testing of code that behaves
differently on each run, ensuring more comprehensive test coverage.
Key benefits:
* Test Uncertainty: It provides tools for testing systems where inputs,
behaviors, or results vary across test runs.
* Reproducibility: It helps ensure tests can be reliably reproduced, even with
randomness involved.
* Stress Testing: It is ideal for identifying edge cases and bugs that might not
be easily found in deterministic tests.
* Better Test Coverage: Randomized tests provide more varied input combinations,
leading to more thorough testing.
In short, Random Test ensures the robustness and reliability of systems dealing
with randomness, making it essential for testing complex or non-deterministic
behavior.
h4. Approach
{code:java}
private static final String RANDOMIZED_CONTEXT_FQN =
"com.carrotsearch.randomizedtesting.RandomizedContext";
Class.forName(RANDOMIZED_CONTEXT_FQN); // sets isRandomTestMode = true{code}
We use this check to decide between Random Test Mode and Normal Mode: if the
class named by the FQN can be loaded, we activate the random test
infrastructure; otherwise we stay in Normal Mode.
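The detection step could be sketched as follows. This is a minimal illustration, not the ticket's actual code: the class name {{RandomTestModeDetector}} and the helper {{isClassPresent}} are hypothetical, and passing {{initialize = false}} to {{Class.forName}} is an assumption made here so that the probe does not run static initializers of the detected class.

```java
// Hypothetical sketch: names other than RANDOMIZED_CONTEXT_FQN and
// isRandomTestMode are illustrative and not taken from the patch.
final class RandomTestModeDetector {
    private static final String RANDOMIZED_CONTEXT_FQN =
        "com.carrotsearch.randomizedtesting.RandomizedContext";

    // Evaluated once, when this class is initialized.
    private static final boolean RANDOM_TEST_MODE =
        isClassPresent(RANDOMIZED_CONTEXT_FQN);

    private RandomTestModeDetector() {}

    /** Returns true if the named class can be loaded by this class's loader. */
    static boolean isClassPresent(String fqn) {
        try {
            // initialize = false: locate the class without running its
            // static initializers (an assumption of this sketch).
            Class.forName(fqn, false, RandomTestModeDetector.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** True when randomizedtesting is on the classpath. */
    static boolean isRandomTestMode() {
        return RANDOM_TEST_MODE;
    }
}
```

Caching the result in a static final field means the classpath is probed only once per JVM, which matters if the check sits on the hot path of every random call.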
{code:java}
// Randomness.java (pseudocode)
public static Random get() {
    if (isRandomTestMode()) {
        return <our ThreadLocal seedable Random implementation>;
    } else {
        return <Java's native Random implementation>;
    }
} {code}
We centralize all random operations in "Randomness.java" to make them
repeatable, unified, and aggregated (when in Random Test Mode, all randomness
becomes reproducible and deterministic). For example:
* Java Random APIs (SecureRandom, Collections.shuffle, UUID.randomUUID(),
Math.random(), etc.)
* Additional random utilities (randomUTF8, randomLocale, randomDate,
randomAsciiLetters, etc.)
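A minimal sketch of how such a central class might route the listed APIs through one seedable source is shown below. Everything here is an assumption for illustration: {{RandomnessSketch}}, {{enableRandomTestMode}}, {{shuffle}}, and {{randomUUID}} are hypothetical names, the mode is switched manually rather than detected from the classpath, and {{ThreadLocalRandom}} stands in for the "native" implementation used in Normal Mode.

```java
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of a centralized randomness source.
final class RandomnessSketch {
    private static volatile boolean randomTestMode = false;
    private static volatile ThreadLocal<Random> seeded =
        ThreadLocal.withInitial(() -> new Random(0L));

    private RandomnessSketch() {}

    /** Switches to Random Test Mode; each thread gets a Random built from the seed. */
    static void enableRandomTestMode(long seed) {
        // Simplification: every thread replays the same sequence for a given seed.
        seeded = ThreadLocal.withInitial(() -> new Random(seed));
        randomTestMode = true;
    }

    /** Single entry point for all random operations. */
    static Random get() {
        return randomTestMode ? seeded.get() : ThreadLocalRandom.current();
    }

    /** Stand-in for Collections.shuffle(list) that honours the seed. */
    static <T> void shuffle(List<T> list) {
        Collections.shuffle(list, get());
    }

    /** Stand-in for UUID.randomUUID(); version and variant bits are not set here. */
    static UUID randomUUID() {
        Random r = get();
        return new UUID(r.nextLong(), r.nextLong());
    }
}
```

With this shape, calling {{enableRandomTestMode}} twice with the same seed and repeating the same calls on the same thread yields the same values, which is the property needed to replay a failing run.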
We print the seed when a unit test fails:
{code:java}
java.lang.RuntimeException: java.lang.ExceptionInInitializerError
    at __randomizedtesting.SeedInfo.seed([FFB4961AF9935703:3AFD18F7B24FCA34]:0)
    at org.apache.cassandra.distributed.impl.Instance.lambda$startup$7(Instance.java:675)
    ... {code}
Then use the following command to reproduce the failure:
{code:java}
ant testsome -Dtest.name=org.apache.cassandra.db.marshal.MyRandomizedTest -Dtests.seed=FFB4961AF9935703 {code}
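To illustrate how a hex seed like the one above can be turned back into a value usable with {{java.util.Random}}, here is a small sketch. It assumes the seed is a 64-bit value printed as unsigned hex and that the part before the colon in the stack trace is the one passed via {{-Dtests.seed}}; the class {{SeedSketch}} and its methods are hypothetical, and the real parsing is whatever randomizedtesting or the patch implements.

```java
import java.util.Locale;
import java.util.OptionalLong;

// Hypothetical helper for reading and printing reproducible seeds.
final class SeedSketch {
    static final String SEED_PROPERTY = "tests.seed";

    private SeedSketch() {}

    /** Parses "MASTER" or "MASTER:METHOD" (hex, no brackets) and returns MASTER as a long. */
    static long parseMasterSeed(String value) {
        String master = value.split(":", 2)[0].trim();
        // Unsigned parse: seeds with the top bit set map to negative longs.
        return Long.parseUnsignedLong(master, 16);
    }

    /** Formats a seed as upper-case unsigned hex, as it appears in the failure output. */
    static String format(long seed) {
        return Long.toHexString(seed).toUpperCase(Locale.ROOT);
    }

    /** Reads the seed from -Dtests.seed, if the property is set. */
    static OptionalLong seedFromSystemProperty() {
        String value = System.getProperty(SEED_PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(parseMasterSeed(value));
    }
}
```

An unsigned parse is used because a seed such as {{FFB4961AF9935703}} does not fit in a signed 64-bit hex parse, so {{Long.parseLong}} would reject it.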
We can also print JVM system properties, memory, and CPU information for
additional debugging context.
h4. Proposed Changes
{code:java}
// Before
double randDbl = random.nextDouble();
// After
double randDbl = Randomness.get().nextDouble(); {code}
h4. Test Plan
We will refactor all random test suites. Existing test suites will verify these
changes. Additionally, we will add unit tests for Randomness.
h4. Rejected Alternatives: Why not use carrotsearch randomizedtesting directly?
Elasticsearch and Lucene have successfully used randomizedtesting (Apache-2.0
license) for over a decade ([7]), and it is already present in Cassandra.
However, we reject this approach for the following reasons:
# {*}Initialization timing issues{*}: Using Java reflection to call
{{RandomizedContext.current().getRandom()}} causes
{{java.lang.ExceptionInInitializerError}} due to class initialization timing at
startup.
# {*}Boundary limitations{*}: randomizedtesting makes random contexts
available only within the {{@BeforeClass}} and {{@AfterClass}} boundaries, so
static test class initializers cannot access them. We nevertheless need random
behavior during server startup.
# {*}Distributed JVM incompatibility{*}: randomizedtesting is not compatible
with distributed JVM tests. Our native Java approach works seamlessly with
distributed JVM tests, simulators, and QuickTheories.
*Existing test patterns:*
For unit tests:
{code:java}
public class UUIDTypeTest extends RandomTestSuite

@Listeners({ ReproduceInfoPrinter.class })
@ThreadLeakScope(ThreadLeakScope.Scope.NONE)
@SeedDecorators({ MixWithSuiteName.class })
@TestMethodProviders(value = { JUnit4MethodProvider.class })
public abstract class RandomTestSuite extends RandomizedTest{code}
For distributed JVM tests:
{code:java}
@RunWith(RandomizedRunner.class)
@Listeners({ ReproduceInfoPrinter.class })
@ThreadLeakScope(ThreadLeakScope.Scope.NONE)
@SeedDecorators({ MixWithSuiteName.class })
public class HintDataReappearingTest extends AbstractHintWindowTest{code}
*Limitations of existing approaches:*
# Some distributed JVM tests cannot obtain {{RandomizedContext}}
# Tests using {{@RunWith(Parameterized.class)}} or
{{@RunWith(BMUnitRunner.class)}} cannot also use
{{@RunWith(RandomizedRunner.class)}}, since only one {{@RunWith}} annotation is
allowed per test class
# Extending {{RandomTestSuite}} or using {{@RunWith(RandomizedRunner.class)}}
requires a base test class, causing extensive code changes that are difficult
to review
h4. Reference
[1] https://labs.carrotsearch.com/randomizedtesting-concept.html
[2] https://get.carrotsearch.com/presentations/dweiss-barcelona-testing-2011.pdf
[3] https://spinscale.de/posts/2020-04-22-testing-and-releasing-elasticsearch-and-the-elastic-stack.html
[4] https://github.com/randomizedtesting/randomizedtesting
[5] https://labs.carrotsearch.com/randomizedtesting.html
[6] https://github.com/randomizedtesting/randomizedtesting/?tab=Apache-2.0-1-ov-file#
[7] https://github.com/elastic/elasticsearch/blob/004fab6627b0ae65cf939e0d69a4db8dbd175798/server/src/main/java/org/elasticsearch/common/Randomness.java#L93
[8] https://www.youtube.com/watch?v=zD57QKzqdCw
> Optimizing Randomized Testing for Reproducibility
> -------------------------------------------------
>
> Key: CASSANDRA-21008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21008
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Test/unit
> Reporter: Ling Mao
> Priority: Normal
>