[
https://issues.apache.org/jira/browse/CASSANDRA-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539759#comment-17539759
]
Josh McKenzie commented on CASSANDRA-17291:
-------------------------------------------
Ok. So this was annoying but I think I got to the bottom of it.
[PR|https://github.com/apache/cassandra/pull/1638]
[JDK8 CI|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/232/workflows/6c742dd9-e1b3-4056-aac8-9f0b95ae4e84]
[JDK11 CI|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/232/workflows/5fcce28f-a258-485f-906e-45957c2b3b08]
The offending bit of code:
{code:java}
for (int numTokens = 1; numTokens <= 16; ++numTokens)
{
    for (int rf = 1; rf <= 5; ++rf)
    {
        int nodeCount = 32;
        for (int racks = 1; racks <= 10; ++racks)
        {
            int[] nodeToRack = makeRackCountArray(nodeCount, racks);
            for (IPartitioner partitioner : new IPartitioner[] { Murmur3Partitioner.instance, RandomPartitioner.instance })
            {
{code}
{{void testTokenGenerations()}} was timing out because of this pretty beastly
nesting of permutations, combined with all the logging output it produces.
Effectively we have to run 1600 combinations in 900 seconds, minus whatever
time the other tests consume, and with all the logging about tokens and
iteration it looks like we tipped over an edge.
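For a sense of where the 1600 comes from, here is a back-of-the-envelope sketch. The class and method names are hypothetical and not part of the patch; the inputs are just the loop bounds from the snippet above (16 token counts, 5 RFs, 10 rack counts, 2 partitioners) and the 900 second timeout.

```java
// Hypothetical helper, not part of the Cassandra code base: it only does the
// arithmetic behind the "1600 in 900 seconds" figure.
final class PermutationBudget
{
    // Iterations of the nested loops: token counts x RFs x rack counts x partitioners.
    static int combinations(int tokenValues, int rfValues, int rackValues, int partitioners)
    {
        return tokenValues * rfValues * rackValues * partitioners;
    }

    // Average wall-clock budget per combination if the whole timeout were available.
    static double secondsPerCombination(int timeoutSeconds, int combinations)
    {
        return (double) timeoutSeconds / combinations;
    }

    public static void main(String[] args)
    {
        int total = combinations(16, 5, 10, 2);
        System.out.println(total + " combinations");
        System.out.println(secondsPerCombination(900, total) + " s budget per combination");
    }
}
```

That works out to a bit over half a second per combination at best, before subtracting whatever the other tests in the class use.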
I pared the test down to the following (about a third of the combinations with
what I think is *logically* comparably useful coverage):
{code:java}
private final int[] racks = { 1, 2, 3, 5, 6, 9, 10 };
private final int[] rfs = { 1, 2, 3, 5 };
private final int[] tokens = { 1, 2, 3, 5, 6, 9, 10, 13, 15, 16 };
{code}
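As a rough check on the "about a third" figure, the sketch below just counts iterations over those arrays. The class name and the loop body are hypothetical, and it assumes the same two partitioners as before; how the test actually consumes the arrays is in the PR.

```java
// Hypothetical counting sketch, not the test itself: it walks the reduced
// arrays the same way the nested loops would and counts the combinations.
final class ReducedMatrix
{
    static final int[] RACKS = { 1, 2, 3, 5, 6, 9, 10 };
    static final int[] RFS = { 1, 2, 3, 5 };
    static final int[] TOKENS = { 1, 2, 3, 5, 6, 9, 10, 13, 15, 16 };
    // Assumption: Murmur3Partitioner and RandomPartitioner, as in the original loop.
    static final int PARTITIONERS = 2;

    static int countCombinations()
    {
        int count = 0;
        for (int numTokens : TOKENS)
            for (int rf : RFS)
                for (int racks : RACKS)
                    for (int p = 0; p < PARTITIONERS; p++)
                        count++; // placeholder for one token-generation run
        return count;
    }

    public static void main(String[] args)
    {
        int reduced = countCombinations();
        int original = 16 * 5 * 10 * 2;
        System.out.println(reduced + " of " + original + " combinations ("
                           + (reduced * 100 / original) + "%)");
    }
}
```

Under those assumptions that is 560 of the original 1600 combinations.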
I toyed with getting rid of the logging inside the test class but the lion's
share of what's spamming is in a variety of other classes. I also looked into
disabling logging in the {{SystemOutputImpl}} in
{{OfflineTokenAllocatorTestUtils}} but that ended up being more pain than it
was worth.
Rather than taking 15+ minutes and timing out on my laptop, the test now takes
about 20 seconds, which should be ok in our CI env. I also split this test
method out into its own file so it doesn't stomp on the other offline token
allocation tests and throw a red-herring timeout like this one did
(parallelization and method timeouts within a class are... not fun).
> Test Failure: unit test compression:
> testTokenGenerator_single_rack_or_single_rf
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-17291
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17291
> Project: Cassandra
> Issue Type: Bug
> Components: Test/unit
> Reporter: Josh McKenzie
> Assignee: Josh McKenzie
> Priority: Normal
> Fix For: 4.1-beta, 4.1.x, 4.x
>
>
> org.apache.cassandra.dht.tokenallocator.OfflineTokenAllocatorTest
> [https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/168/workflows/1d30a113-c14b-4cf4-a631-bedb9eb65762/jobs/1447]
> Looks like a pretty simple / straightforward timeout
> {code}
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time
> in the report does not reflect the time until the timeout.
> at java.util.Vector.forEach(Vector.java:1277)
> at java.util.Vector.forEach(Vector.java:1277)
> at java.util.Vector.forEach(Vector.java:1277)
> at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53)
> at java.util.Vector.forEach(Vector.java:1277)
> {code}
>