[ https://issues.apache.org/jira/browse/CASSANDRA-20774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuqi Yan updated CASSANDRA-20774:
---------------------------------
    Fix Version/s: 4.1.x
                   5.0.x

> PaxosCleanup.isOutOfRange caused CPU util spikes in cluster on new node joining
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20774
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20774
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Yuqi Yan
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x
>
>         Attachments: image-2025-07-17-16-58-17-456.png
>
>
> Running Cassandra 4.1.3.
> After switching one of our instances to PaxosV2, we started to see that,
> whenever a new node attempted to join the ring, multiple nodes within the
> cluster showed spikes in CPU utilization.
> I collected a CPU profile on one of them and saw:
> !image-2025-07-17-16-58-17-456.png|width=1335,height=715!
>
> After switching to V2, bootstrapping a new node triggers
> `repairPaxosForTopologyChange`, which schedules `PaxosCleanup` table by
> table.
> It seems that in `isOutOfRange` we do an unnecessary calculation: we compute
> the token map for the whole cluster with `getAddressReplicas()` and then use
> only the local ranges.
>
> {code:java}
> localRanges = Range.normalize(keyspace.getReplicationStrategy()
>                                       .getAddressReplicas() // builds the map for the entire cluster
>                                       .get(FBUtilities.getBroadcastAddressAndPort())
>                                       .ranges());{code}
> One potential improvement here is to reuse
> `getAddressReplicas(FBUtilities.getBroadcastAddressAndPort())` so that we
> don't rebuild the whole map.
> We're using 16 vnodes. The instance has ~1K tables.
> Even so, a significant share of the load still comes from
> `calculateNaturalReplicas`.
> Is there any reason we always recalculate this map here instead of using
> the cached `EndpointsForRange`, as `getNaturalReplicas` does?
>
> The issue might not be present in trunk now that we have ClusterMetadata.
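
The difference between building the per-endpoint map for the whole cluster and collecting ranges for a single endpoint can be illustrated with a toy ring model. This is only a sketch under stated assumptions: `LocalRangeSketch`, `rangesByEndpoint`, `rangesFor`, `replicasFor` and `mapEntriesBuilt` are hypothetical names, not Cassandra classes or methods, and replica placement is simplified to "the next RF distinct endpoints clockwise". Note that both paths still compute replicas for every token, which matches the observation above that `calculateNaturalReplicas` remains a cost; the single-endpoint path only avoids materialising entries for every other node.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy ring model for illustration only; none of these names exist in Cassandra.
final class LocalRangeSketch {
    private final TreeMap<Long, String> tokenToEndpoint = new TreeMap<>();
    private final int replicationFactor;
    int mapEntriesBuilt = 0; // counts (endpoint, range) entries that were materialised

    LocalRangeSketch(Map<Long, String> ring, int replicationFactor) {
        this.tokenToEndpoint.putAll(ring);
        this.replicationFactor = replicationFactor;
    }

    // Simplified placement: the next RF distinct endpoints clockwise from the token.
    Set<String> replicasFor(long token) {
        List<String> clockwise = new ArrayList<>(tokenToEndpoint.tailMap(token, true).values());
        clockwise.addAll(tokenToEndpoint.headMap(token, false).values());
        Set<String> replicas = new LinkedHashSet<>();
        for (String endpoint : clockwise) {
            if (replicas.size() == replicationFactor) break;
            replicas.add(endpoint);
        }
        return replicas;
    }

    // The range (previousToken, token], wrapping around at the start of the ring.
    private String rangeEndingAt(long token) {
        Long previous = tokenToEndpoint.lowerKey(token);
        if (previous == null) previous = tokenToEndpoint.lastKey();
        return "(" + previous + ", " + token + "]";
    }

    // Whole-cluster variant: builds range lists for every endpoint.
    Map<String, List<String>> rangesByEndpoint() {
        Map<String, List<String>> result = new TreeMap<>();
        for (long token : tokenToEndpoint.keySet()) {
            for (String endpoint : replicasFor(token)) {
                result.computeIfAbsent(endpoint, e -> new ArrayList<>()).add(rangeEndingAt(token));
                mapEntriesBuilt++;
            }
        }
        return result;
    }

    // Single-endpoint variant: same replica calculation, but keeps only one endpoint's ranges.
    List<String> rangesFor(String endpoint) {
        List<String> result = new ArrayList<>();
        for (long token : tokenToEndpoint.keySet()) {
            if (replicasFor(token).contains(endpoint)) {
                result.add(rangeEndingAt(token));
                mapEntriesBuilt++;
            }
        }
        return result;
    }
}
```

In this model, `rangesByEndpoint().get(local)` and `rangesFor(local)` return the same ranges for the local endpoint, while `mapEntriesBuilt` grows with the whole cluster in the first case and only with the local node's ranges in the second. Whether the saving is meaningful in a real cluster depends on how much of the profile is map construction rather than replica calculation.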