[ 
https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204176#comment-17204176
 ] 

Adam Holmberg commented on CASSANDRA-16089:
-------------------------------------------

The test fails occasionally due to random token generation for the second node. 
If the random token is too close to the node1 token, very little data is left 
on node1 after cleanup, and the disk balance variation goes up due to "noise".

It can be made to fail in this manner consistently by configuring an 
intentionally bad token. For example one unfortunate token selection:

{noformat}
Datacenter: datacenter1
==========
Address    Rack        Status State   Load            Owns                Token
                                                                                
                          9143583083429189474
127.0.0.1  rack1       Up     Normal  23.67 MiB       0.43%               
-9223372036854775808
127.0.0.2  rack1       Up     Normal  68.62 KiB       99.57%              
9143583083429189474
{noformat}

Leaves only kilobytes of data on each disk after cleanup and compaction.

The dtest change just makes for fixed token selection so we can avoid the noise 
in small files.
[patch|https://github.com/apache/cassandra-dtest/commit/32a31742bda41b09872b7820e9fb7fffda1addd9]
[ci|https://app.circleci.com/pipelines/github/aholmberg/cassandra?branch=CASSANDRA-16089]

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16089
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Caleb Rackliffe
>            Assignee: Adam Holmberg
>            Priority: Normal
>              Labels: dtest
>             Fix For: 4.0-beta
>
>
> See 
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables 
> (per directory) on the first node no longer fall within the 10% margin of 
> error. We don’t have any assertion in the test that they were balanced before 
> bootstrap, however.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to