hangc0276 commented on code in PR #21011:
URL: https://github.com/apache/pulsar/pull/21011#discussion_r1299887586
##########
pip/pip-294.md:
##########
@@ -0,0 +1,56 @@
+# Background knowledge
+
+Load balance module in Pulsar broker rely on zk to store and synchronize
metadata about load. Every broker will upload its `LocalBrokerData` to zk, and
leader broker will retrieve all `LocalBrokerData` from zk ,generate all
`BundleData` from each `LocalBrokerData`, and update all `BundleData` to zk.
+
+
+# Motivation
+
+As every bundle in the cluster corresponds to a zk node, it is common that
there are thousands of zk nodes in a cluster, which results into thousands of
read/update operations to zk. This will cause a lot of pressure on zk.
+
+**As All Load Shedding Algorithm pick bundles from top to bottom based on
throughput/msgRate, bundles with low throughput/msgRate are rarely be selected
for shedding. So there is no need to update these bundleData to zk frequently.**
Review Comment:
Can we also apply the TopK selection for `ModularLoadManagerImpl`?
##########
pip/pip-294.md:
##########
@@ -0,0 +1,67 @@
+# Background knowledge
+
+There are mainly two `LoadManager` implementation in Pulsar broker:
`ExtensibleLoadManager` and `ModularLoadManagerImpl`. `ModularLoadManagerImpl`
is the default load manager, and `ExtensibleLoadManager` is a new load manager
which is proposed after 3.0.0 version.
+
+## ModularLoadManagerImpl
+`ModularLoadManagerImpl` rely on zk to store and synchronize metadata about
load, which pose greate pressure on zk, threatening the stability of system.
Every broker will upload its `LocalBrokerData` to zk, and leader broker will
retrieve all `LocalBrokerData` from zk ,generate all `BundleData` from each
`LocalBrokerData`, and update all `BundleData` to zk.
+
+## ExtensibleLoadManager
+`ExtensibleLoadManager` depends on system topics and table views for load
balance metadata store and replication. Though not using zk to store and
synchronize metadata about load, it is still necessary to control the number of
bundles that need to be updated, for which there is a
`loadBalancerMaxNumberOfBundlesInBundleLoadReport` configuration in
`ExtensibleLoadManager` that select the top k bundles.
+
+# Motivation
+
+## ModularLoadManagerImpl
+As every bundle in the cluster corresponds to a zk node, it is common that
there are thousands of zk nodes in a cluster, which results into thousands of
read/update operations to zk. This will cause a lot of pressure on zk.
+
+**As All Load Shedding Algorithm pick bundles from top to bottom based on
throughput/msgRate, bundles with low throughput/msgRate are rarely be selected
for shedding. So there is no need to update these bundleData to zk frequently.**
+
+## ExtensibleLoadManager
+As the number of bundles and the throughput in the cluster changes
dynamically, users can't select a reasonable number of bundles easily, while
based on throughput, we can be sure that bundles with throughput less than 0.1M
is useless for load balance no matter how the cluster change.
+
+We can also add this throughput-based bundle report size control in the
ExtensibleLoadManager, working together with
`loadBalancerMaxNumberOfBundlesInBundleLoadReport`.
Review Comment:
From the motivation, I learned that this proposal's goal is to limit the
number of bundle reports written to the meta store or system topic because a
huge number of bundle reports will bring heavy pressure on the meta store and
system topic.
What I don't understand is why
`loadBalancerMaxNumberOfBundlesInBundleLoadReport` is not enough.
- We can limit the load report bundler number to 20 for each broker. If you
have 10 brokers, and the leader will get 200 bundles to decide if we need to
migrate any bundles. In the next round, each broker will report 20 bundles
report based on current bundle stats. In this way, we can limit the max number
of bundle reports written to the meta store or system topic
- For ExtensibleLoadManager, the number of bundle report issue has less
impact on the load balance because we use non-persistent topic to store the
bundle load report.
- Add throughput-based bundle report size control has the following issues.
- Users need to learn another configuration to understand the load-balance
behavior
- The value is hard to configure, especially when the Pulsar cluster has
low throughput in all brokers
- Our goal is to help the leader to decide which broker is overloaded and
which bundle needs to be unloaded. The TopK bundles play a key role to help the
leader to make the decision.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]