[
https://issues.apache.org/jira/browse/CASSANDRA-20801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039715#comment-18039715
]
Yuqi Yan commented on CASSANDRA-20801:
--------------------------------------
Actually, we might not really need complicated scheduling or rate limiting here.
Something simple, like an option for the operator to schedule and wait for
completion by keyspace, would be sufficient for most cases (unless people create
thousands of tables within the same keyspace, which I don't think will be the
case). We do have some clusters with lots of keyspaces and tables, though, and
scheduling everything at once will definitely overload the message queue.
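
To make the per-keyspace idea concrete, here is a rough sketch of "schedule one
keyspace, wait for it to complete, then move on". It is only an illustration
under assumptions: PerKeyspacePaxosCleanup, run and the startCleanup callback
are hypothetical names, not existing Cassandra APIs, and the callback stands in
for whatever actually starts the cleanup sessions for a table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

/** Hypothetical sketch; none of these names exist in Cassandra. */
final class PerKeyspacePaxosCleanup {
    /**
     * Starts cleanup for one keyspace at a time and waits for all of its
     * tables to finish before scheduling the next keyspace.
     *
     * @param tablesByKeyspace keyspace name to table names; iteration order is the schedule order
     * @param startCleanup     assumed stand-in for starting the cleanup of one table
     * @return the keyspaces in the order they completed
     */
    static List<String> run(Map<String, List<String>> tablesByKeyspace,
                            BiFunction<String, String, CompletableFuture<Void>> startCleanup) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : tablesByKeyspace.entrySet()) {
            List<CompletableFuture<Void>> inFlight = new ArrayList<>();
            for (String table : entry.getValue())
                inFlight.add(startCleanup.apply(entry.getKey(), table));

            // Block until every table in this keyspace is done. join() rethrows
            // a failure as CompletionException, which stops the remaining keyspaces.
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture<?>[0])).join();
            completed.add(entry.getKey());
        }
        return completed;
    }
}
```

With this shape the number of sessions in flight is bounded by the largest
keyspace rather than by the whole schema, which is why it only breaks down if a
single keyspace holds thousands of tables.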
> Ratelimit repairPaxosForTopologyChange
> --------------------------------------
>
> Key: CASSANDRA-20801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20801
> Project: Apache Cassandra
> Issue Type: Bug
> Reporter: Yuqi Yan
> Assignee: Yuqi Yan
> Priority: Normal
>
> We should have some configurable rate limiting within
> `repairPaxosForTopologyChange`. Currently it starts all the
> PaxosCleanupSessions in one go
> ([https://github.com/apache/cassandra/blob/6fd1194578ca3149ce9cd9f14a8591df71c20753/src/java/org/apache/cassandra/service/ActiveRepairService.java#L1157]).
> The number of sessions created is related to (vnodes * node_count * tables), so
> this can get really heavy.
> Looking in the log I found something like:
> {code:java}
> /x.x.x.x:xxxx->/x.x.x.x:xxxx-SMALL_MESSAGES-a9be1f48 overloaded; dropping
> 0.073KiB message (queue: 132.000MiB local, 128.000MiB endpoint, 198.713MiB
> global){code}
> The dropped messages were most likely PAXOS2_CLEANUP_REQ messages.
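
For reference, the kind of configurable limit the description asks for could be
approximated by capping how many sessions are in flight at once. The sketch
below is an assumption-laden illustration, not the actual
ActiveRepairService code: BoundedSessionStarter, startAll and maxInFlight
are hypothetical names, and each Supplier stands in for starting one
PaxosCleanupSession.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names exist in Cassandra. */
final class BoundedSessionStarter {
    /**
     * Starts the given sessions in order, never allowing more than
     * maxInFlight of them to be incomplete at the same time.
     * The calling thread blocks while the limit is reached.
     */
    static <T> List<CompletableFuture<T>> startAll(List<Supplier<CompletableFuture<T>>> sessions,
                                                   int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<T>> started = new ArrayList<>();
        for (Supplier<CompletableFuture<T>> session : sessions) {
            permits.acquireUninterruptibly(); // wait for a free slot
            CompletableFuture<T> future;
            try {
                future = session.get();
            } catch (RuntimeException e) {
                permits.release(); // the session never started, so free its slot
                throw e;
            }
            // Free the slot when the session finishes, successfully or not.
            future.whenComplete((result, error) -> permits.release());
            started.add(future);
        }
        return started;
    }
}
```

Because the limit is a single integer, it could be exposed as a configuration
value. The trade-off is that the scheduling thread blocks while the limit is
reached, so a real implementation would likely chain the next start from the
completion callback instead.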
--
This message was sent by Atlassian Jira
(v8.20.10#820010)