thetumbled commented on code in PR #21011:
URL: https://github.com/apache/pulsar/pull/21011#discussion_r1301075667
##########
pip/pip-294.md:
##########
@@ -0,0 +1,56 @@
+# Background knowledge
+
+The load balance module in the Pulsar broker relies on zk to store and
synchronize metadata about load. Every broker uploads its `LocalBrokerData` to
zk, and the leader broker retrieves all `LocalBrokerData` from zk, generates
all `BundleData` from each `LocalBrokerData`, and updates all `BundleData` in zk.
+
+
+# Motivation
+
+As every bundle in the cluster corresponds to a zk node, it is common that
there are thousands of zk nodes in a cluster, which results in thousands of
read/update operations to zk. This puts a lot of pressure on zk.
+
+**As all load shedding algorithms pick bundles from top to bottom based on
throughput/msgRate, bundles with low throughput/msgRate are rarely selected
for shedding. So there is no need to update these bundleData to zk frequently.**
+
+
+# Goals
+
+Filter out bundles with low throughput/msgRate, and do not update these
bundles to zk frequently to reduce the pressure on zk.
+
+
+# High Level Design
+
+Filter out bundles with low throughput/msgRate when the leader updates
bundleData to zk.
+
+
+# Detailed Design
+
+## Design & Implementation Details
+Add throughput/msg-based bundle report control in the `ExtensibleLoadManager`
and `ModularLoadManagerImpl`.
+
+### Configuration
+
+add configuration:
+```
+ @FieldContext(
+ dynamic = true,
+ category = CATEGORY_LOAD_BALANCER,
+ doc = "minimum throughput in of bundle to be considered for
updating data in metadata store"
Review Comment:
TopK compares bundles and selects the top k using
`org.apache.pulsar.policies.data.loadbalancer.NamespaceBundleStats#compareTo`,
which compares on the following criteria, in order:
```
// compare 2 bundles in below aspects:
// 1. Inbound bandwidth
// 2. Outbound bandwidth
// 3. Total msgRate (both in and out)
// 4. Total topics and producers/consumers
// 5. Total cache size
public int compareTo(NamespaceBundleStats other) {
    int result = this.compareByBandwidthIn(other);
    if (result == 0) {
        result = this.compareByBandwidthOut(other);
    }
    if (result == 0) {
        result = this.compareByMsgRate(other);
    }
    if (result == 0) {
        result = this.compareByTopicConnections(other);
    }
    if (result == 0) {
        result = this.compareByCacheSize(other);
    }
    return result;
}
```
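To make the ordering concrete, here is a minimal, self-contained sketch of selecting the top k bundles with a chained `compareTo` like the one above. `BundleStats`, `topK`, and the sample values are hypothetical stand-ins, not Pulsar classes, and plain `Double.compare` is assumed in place of the real `compareBy*` helpers, which may behave differently (for example, by applying thresholds). Only the first three criteria are modeled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopKBundlesSketch {

    // Hypothetical, simplified stand-in for NamespaceBundleStats (not the Pulsar class).
    static final class BundleStats implements Comparable<BundleStats> {
        final String name;
        final double bandwidthIn;
        final double bandwidthOut;
        final double msgRate;

        BundleStats(String name, double bandwidthIn, double bandwidthOut, double msgRate) {
            this.name = name;
            this.bandwidthIn = bandwidthIn;
            this.bandwidthOut = bandwidthOut;
            this.msgRate = msgRate;
        }

        // Chained comparison: later criteria are only used to break ties.
        @Override
        public int compareTo(BundleStats other) {
            int result = Double.compare(bandwidthIn, other.bandwidthIn);
            if (result == 0) {
                result = Double.compare(bandwidthOut, other.bandwidthOut);
            }
            if (result == 0) {
                result = Double.compare(msgRate, other.msgRate);
            }
            return result;
        }
    }

    // Returns the names of the k largest bundles, largest first.
    static List<String> topK(List<BundleStats> bundles, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap by natural ordering: the smallest retained bundle sits at the head.
        PriorityQueue<BundleStats> heap = new PriorityQueue<>();
        for (BundleStats b : bundles) {
            heap.offer(b);
            if (heap.size() > k) {
                heap.poll(); // evict the current smallest
            }
        }
        List<BundleStats> selected = new ArrayList<>(heap);
        selected.sort(Comparator.reverseOrder());
        List<String> names = new ArrayList<>();
        for (BundleStats b : selected) {
            names.add(b.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<BundleStats> bundles = List.of(
                new BundleStats("a", 10, 5, 100),
                new BundleStats("b", 10, 7, 50),
                new BundleStats("c", 2, 90, 900),
                new BundleStats("d", 30, 1, 1));
        System.out.println(topK(bundles, 2)); // prints [d, b]
    }
}
```

Note that bundle `c` is dropped even though it has by far the highest outbound bandwidth and msgRate, because inbound bandwidth is compared first and the later criteria only break ties.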