heesung-sn commented on code in PR #21011:
URL: https://github.com/apache/pulsar/pull/21011#discussion_r1300959003


##########
pip/pip-294.md:
##########
@@ -0,0 +1,67 @@
+# Background knowledge
+
+There are mainly two `LoadManager` implementations in the Pulsar broker: `ExtensibleLoadManager` and `ModularLoadManagerImpl`. `ModularLoadManagerImpl` is the default load manager, and `ExtensibleLoadManager` is a new load manager proposed after version 3.0.0.
+
+## ModularLoadManagerImpl
+`ModularLoadManagerImpl` relies on ZooKeeper to store and synchronize load metadata, which puts great pressure on ZooKeeper and threatens the stability of the system. Every broker uploads its `LocalBrokerData` to ZooKeeper; the leader broker then retrieves all `LocalBrokerData` from ZooKeeper, generates the `BundleData` from each `LocalBrokerData`, and writes all `BundleData` back to ZooKeeper.
+
+## ExtensibleLoadManager
+`ExtensibleLoadManager` depends on system topics and table views to store and replicate load-balance metadata. Although it does not use ZooKeeper to store and synchronize load metadata, it still needs to limit the number of bundles reported, so `ExtensibleLoadManager` provides a `loadBalancerMaxNumberOfBundlesInBundleLoadReport` configuration that selects the top-k bundles.
+
+# Motivation
+
+## ModularLoadManagerImpl
+Since every bundle in the cluster corresponds to a ZooKeeper node, a cluster commonly has thousands of such nodes, which results in thousands of read/update operations against ZooKeeper and puts significant pressure on it.
+
+**Since all load shedding algorithms pick bundles from top to bottom based on throughput/msgRate, bundles with low throughput/msgRate are rarely selected for shedding, so there is no need to update their `BundleData` to ZooKeeper frequently.**
+
+## ExtensibleLoadManager
+Because the number of bundles and the throughput in the cluster change dynamically, users cannot easily pick a reasonable number of bundles. With a throughput-based threshold, however, we can be sure that bundles with throughput below 0.1M are irrelevant to load balancing no matter how the cluster changes.
+
+We can also add this throughput-based bundle report size control to the `ExtensibleLoadManager`, working together with `loadBalancerMaxNumberOfBundlesInBundleLoadReport`.

Review Comment:
   I can see many variations in selecting bundles in the load data.
   
   Maybe we can generalize this option by introducing `loadBalanceBundleLoadDataReportStrategy` with options such as:
   
   1. num of bundles
   2. % of the total number of bundles or local broker's number of bundles
   3. throughput-based
   4. others.
   
   IMO, top-k is more predictable than the number of producers and consumers and their throughputs, because the maximum number of bundles is relatively more static.
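
   To make the idea concrete, here is a minimal sketch of how such a strategy option could select bundles for the load report. The enum `BundleLoadDataReportStrategy` and the helper `selectBundles` are hypothetical names for illustration, not existing Pulsar APIs; only `TOP_K` (mirroring `loadBalancerMaxNumberOfBundlesInBundleLoadReport`) and the proposed throughput threshold are shown:

   ```java
   import java.util.Comparator;
   import java.util.List;
   import java.util.Map;
   import java.util.stream.Collectors;

   public class BundleReportSketch {
       // Hypothetical strategy options mirroring the list above.
       enum BundleLoadDataReportStrategy { TOP_K, THROUGHPUT_THRESHOLD }

       /**
        * Select which bundles to include in the load report.
        * bundleThroughput maps bundle name -> throughput (bytes/s).
        */
       static List<String> selectBundles(Map<String, Double> bundleThroughput,
                                         BundleLoadDataReportStrategy strategy,
                                         int maxBundles, double minThroughput) {
           switch (strategy) {
               case TOP_K:
                   // Top-k by throughput, as the existing
                   // loadBalancerMaxNumberOfBundlesInBundleLoadReport does.
                   return bundleThroughput.entrySet().stream()
                           .sorted(Map.Entry.<String, Double>comparingByValue(
                                   Comparator.reverseOrder()))
                           .limit(maxBundles)
                           .map(Map.Entry::getKey)
                           .collect(Collectors.toList());
               case THROUGHPUT_THRESHOLD:
                   // Drop bundles below the threshold, as this PIP proposes.
                   return bundleThroughput.entrySet().stream()
                           .filter(e -> e.getValue() >= minThroughput)
                           .map(Map.Entry::getKey)
                           .collect(Collectors.toList());
               default:
                   throw new IllegalArgumentException("unknown strategy: " + strategy);
           }
       }
   }
   ```

   The percentage-based variants (options 2 above) would just compute `maxBundles` from the total or local bundle count before falling through to the same top-k path.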



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
