[ 
https://issues.apache.org/jira/browse/CASSANDRA-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539759#comment-17539759
 ] 

Josh McKenzie commented on CASSANDRA-17291:
-------------------------------------------

Ok. So this was annoying but I think I got to the bottom of it.

[PR|https://github.com/apache/cassandra/pull/1638]
[JDK8 CI|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/232/workflows/6c742dd9-e1b3-4056-aac8-9f0b95ae4e84]
[JDK11 CI|https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/232/workflows/5fcce28f-a258-485f-906e-45957c2b3b08]

The offending bit of code:
{code:java}
        for (int numTokens = 1; numTokens <= 16 ; ++numTokens)
        {
            for (int rf = 1; rf <=5; ++rf)
            {
                int nodeCount = 32;
                for (int racks = 1; racks <= 10; ++racks)
                {
                    int[] nodeToRack = makeRackCountArray(nodeCount, racks);
                    for (IPartitioner partitioner : new IPartitioner[] { Murmur3Partitioner.instance, RandomPartitioner.instance })
                    {
{code}
{{void testTokenGenerations()}} was timing out because of this pretty beastly 
nesting of permutations, combined with all the logging output it produces. 
Effectively we have to run 1600 combinations (16 token counts x 5 RFs x 10 rack 
counts x 2 partitioners) in 900 seconds, minus whatever time the other tests 
were consuming, and with all the logging about tokens and iteration it looks 
like we tipped over an edge.
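
To put rough numbers on that, here is a minimal sketch of the arithmetic. The 
class and method names are hypothetical and not part of the patch; the loop 
bounds and the 900 second figure are the ones quoted in this comment.

```java
// Hypothetical helper, not part of the Cassandra code base: it only reproduces
// the arithmetic behind the timeout described in this comment.
public class TokenTestBudget
{
    // Number of iterations of the innermost loop body, i.e. the product of the
    // sizes of the four nested loops.
    static int combinations(int tokenCounts, int rfs, int rackCounts, int partitioners)
    {
        return tokenCounts * rfs * rackCounts * partitioners;
    }

    // Average time available per combination if the whole timeout were spent
    // on this one test method (in practice other tests take a share too).
    static double secondsPerCombination(int timeoutSeconds, int combinations)
    {
        return (double) timeoutSeconds / combinations;
    }

    public static void main(String[] args)
    {
        // numTokens 1..16, rf 1..5, racks 1..10, two partitioners
        int total = combinations(16, 5, 10, 2);
        System.out.println(total + " combinations");
        System.out.println(secondsPerCombination(900, total) + " seconds each, at best");
    }
}
```

That leaves a bit over half a second per combination before anything else in 
the run takes its share, which is not much once token allocation and logging 
are counted.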

I pared the test down to the following (about a third of the combinations, with 
what I think is *logically* comparable coverage):
{code:java}
    private final int[] racks = { 1, 2, 3, 5, 6, 9, 10 };
    private final int[] rfs = { 1, 2, 3, 5 };
    private final int[] tokens = { 1, 2, 3, 5, 6, 9, 10, 13, 15, 16 };
{code}
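
As a sanity check on the "about a third" figure, here is a minimal sketch that 
counts the pared-down combinations. It assumes every value in the three arrays 
is still run against both partitioners and that nothing is skipped inside the 
loops; the class name and the {{countCombinations}} helper are hypothetical and 
not taken from the PR.

```java
// Hypothetical sketch, not the code from the PR: counts how many combinations
// the pared-down arrays produce, assuming both partitioners are still used.
public class ParedDownCombinations
{
    private static final int[] RACKS = { 1, 2, 3, 5, 6, 9, 10 };
    private static final int[] RFS = { 1, 2, 3, 5 };
    private static final int[] TOKENS = { 1, 2, 3, 5, 6, 9, 10, 13, 15, 16 };

    // Murmur3Partitioner and RandomPartitioner in the original loop
    private static final int PARTITIONERS = 2;

    // The combination count before the change: 16 * 5 * 10 * 2
    static final int ORIGINAL_COMBINATIONS = 1600;

    // Walks the same nesting order as the original loops and counts how many
    // times the innermost body would run.
    static int countCombinations()
    {
        int count = 0;
        for (int numTokens : TOKENS)
            for (int rf : RFS)
                for (int rackCount : RACKS)
                    count += PARTITIONERS;
        return count;
    }

    public static void main(String[] args)
    {
        int pared = countCombinations();
        double percent = 100.0 * pared / ORIGINAL_COMBINATIONS;
        // prints "560 of 1600 combinations (35%)"
        System.out.println(String.format("%d of %d combinations (%.0f%%)",
                                         pared, ORIGINAL_COMBINATIONS, percent));
    }
}
```

Under those assumptions the arrays give 7 x 4 x 10 x 2 = 560 combinations, 
which is 35% of the original 1600.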
I toyed with getting rid of the logging inside the test class, but the lion's 
share of the log spam comes from a variety of other classes. I also looked into 
disabling logging in the {{SystemOutputImpl}} in 
{{OfflineTokenAllocatorTestUtils}}, but that ended up being more pain than it 
was worth.

Rather than taking 15+ minutes and timing out on my laptop, the test now takes 
about 20 seconds, which should be ok in our CI env. I also split this test 
method out into its own file so it doesn't stomp on the other offline token 
allocation tests and throw a red herring timeout like this one did 
(parallelization and method timeouts within a class are... not fun).
 

> Test Failure: unit test compression: 
> testTokenGenerator_single_rack_or_single_rf
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17291
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17291
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Josh McKenzie
>            Assignee: Josh McKenzie
>            Priority: Normal
>             Fix For: 4.1-beta, 4.1.x, 4.x
>
>
> org.apache.cassandra.dht.tokenallocator.OfflineTokenAllocatorTest
> [https://app.circleci.com/pipelines/github/josh-mckenzie/cassandra/168/workflows/1d30a113-c14b-4cf4-a631-bedb9eb65762/jobs/1447]
> Looks like a pretty simple / straightforward timeout
>  {code}
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time 
> in the report does not reflect the time until the timeout.
>       at java.util.Vector.forEach(Vector.java:1277)
>       at java.util.Vector.forEach(Vector.java:1277)
>       at java.util.Vector.forEach(Vector.java:1277)
>       at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53)
>       at java.util.Vector.forEach(Vector.java:1277)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
