[pulsar-site] branch main updated: [feat][doc] Add docs for message dispatch throttling (#386)

junma Mon, 13 Mar 2023 03:55:41 -0700

This is an automated email from the ASF dual-hosted git repository.

junma pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/pulsar-site.git



The following commit(s) were added to refs/heads/main by this push:
     new 42648bf4275 [feat][doc] Add docs for message dispatch throttling (#386)
42648bf4275 is described below

commit 42648bf4275de6bfad8d788ce3b4234f68efe20f
Author: Jun Ma <[email protected]>
AuthorDate: Mon Mar 13 18:55:26 2023 +0800

    [feat][doc] Add docs for message dispatch throttling (#386)
    
    * Add draft content and image
    
    * Install and config for math equation
    
    * address review comments
    
    * remove unexpected check-in `package-lock.json`
    
    * Update concepts-throttling.md
    
    * address review comments
    
    * Update concepts-throttling.md
    
    * correct the single quote character
    
    * Update concepts-throttling.md
    
    * Update concepts-throttling.md
    
    * Update concepts-throttling.md
    
    * update REST API links and image meta description
    
    * update REST API links
    
    * apply doc updates to versions since 2.8.x
---
 docs/admin-api-namespaces.md                       |   4 +-
 docs/concepts-throttling.md                        | 167 +++++++++++++++++++++
 docusaurus.config.js                               |  15 +-
 package.json                                       |   3 +
 sidebars.json                                      |   1 +
 static/assets/throttling-dispatch.svg              |   1 +
 static/assets/throttling-limitation.svg            |   1 +
 .../version-2.10.x/admin-api-namespaces.md         |   4 +-
 .../version-2.10.x/concepts-throttling.md          | 167 +++++++++++++++++++++
 .../version-2.11.x/admin-api-namespaces.md         |   4 +-
 .../version-2.11.x/concepts-throttling.md          | 167 +++++++++++++++++++++
 .../version-2.8.x/admin-api-namespaces.md          |   4 +-
 .../version-2.8.x/concepts-throttling.md           | 167 +++++++++++++++++++++
 .../version-2.9.x/admin-api-namespaces.md          |   4 +-
 .../version-2.9.x/concepts-throttling.md           | 167 +++++++++++++++++++++
 versioned_sidebars/version-2.10.x-sidebars.json    |   4 +
 versioned_sidebars/version-2.11.x-sidebars.json    |   1 +
 versioned_sidebars/version-2.8.x-sidebars.json     |   4 +
 versioned_sidebars/version-2.9.x-sidebars.json     |   4 +
 19 files changed, 878 insertions(+), 11 deletions(-)
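
As a reading aid for the byte-based estimate described in the Limitations section of the added `concepts-throttling.md`, here is a minimal Python sketch of how the number of entries to read could be derived from the remaining byte quota. It is an illustration, not the broker's code: the function name `entries_to_read`, its parameters, and the floor rounding are assumptions. Only the division and the fallback order (average publish size, then average dispatch size, then a single entry) come from the documentation text.

```python
from typing import Optional


def entries_to_read(remaining_bytes: int,
                    avg_publish_size: Optional[float],
                    avg_dispatch_size: Optional[float]) -> int:
    """Estimate how many entries to read for the remaining byte quota.

    Hypothetical helper: the names and the rounding are assumptions.
    """
    if remaining_bytes <= 0:
        # No quota left in this throttling period.
        return 0
    # Prefer the average publish size; fall back to the average dispatch size.
    for avg_size in (avg_publish_size, avg_dispatch_size):
        if avg_size is not None and avg_size > 0:
            # entries = total byte size to read / average byte size per entry
            return int(remaining_bytes // avg_size)
    # Neither metric is available: read a single entry on the first attempt.
    return 1
```

Because the real entry sizes are only known after the read, the bytes actually delivered can exceed the quota even when this estimate is followed exactly.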

diff --git a/docs/admin-api-namespaces.md b/docs/admin-api-namespaces.md
index b408a90dcd2..50524d34f80 100644
--- a/docs/admin-api-namespaces.md
+++ b/docs/admin-api-namespaces.md
@@ -886,7 +886,7 @@ pulsar-admin namespaces set-subscription-dispatch-rate 
test-tenant/namespace1 \
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
@@ -928,7 +928,7 @@ Example output:
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
diff --git a/docs/concepts-throttling.md b/docs/concepts-throttling.md
new file mode 100644
index 00000000000..b0e89441044
--- /dev/null
+++ b/docs/concepts-throttling.md
@@ -0,0 +1,167 @@
+---
+id: concepts-throttling
+title: Message dispatch throttling
+sidebar_label: "Message throttling"
+---
+
+## Overview
+
+### What is message dispatch throttling?
+
+Large message payloads can cause memory usage spikes that lead to performance 
decreases. Pulsar adopts a rate-limit throttling mechanism for message 
dispatch, avoiding a traffic surge and improving message deliverability. You 
can set a threshold to limit the number of messages and the byte size of 
entries that can be delivered to clients, blocking subsequent deliveries when 
the traffic per unit of time exceeds the threshold.
+
+For example, if you set the dispatch rate limit to 10 messages per second, 
at most 10 messages can be delivered to the client per second.
+
+![Rate-limit dispatch throttling](/assets/throttling-dispatch.svg 'message 
throttling')
+
+### Why use it?
+
+Message dispatch throttling brings the following benefits:
+
+- **Limit broker's read request loads to BookKeeper**
+
+  Messages are persistently stored in the BookKeeper cluster. If a large 
number of read requests cannot be fulfilled using the cached data, the 
BookKeeper cluster may become too busy to respond, and the broker's I/O or CPU 
resources can be fully occupied. Using the message dispatch throttling feature 
can regulate the data flow to limit the broker’s read request loads to 
BookKeeper.
+
+- **Balance the allocation of broker's hardware resources at 
topic/subscription levels**
+
+  A broker instance serves multiple topics at one time. If a topic is 
overloaded with requests, it can occupy almost all of the I/O, CPU, and memory 
resources of the broker, so that other topics cannot be read. Using the message 
dispatch throttling feature can limit the allocation of the broker's hardware 
resources across topics.
+
+- **Limit the allocation of client's hardware resources at topic/subscription 
levels**
+
+  When there is a large backlog of messages to consume, clients may receive a 
large amount of data in a short period of time, which monopolizes their 
computing resources. Since the client has no mechanisms to proactively limit 
the consumption rate, using the message dispatch throttling feature can also 
regulate the allocation of the client's hardware resources.
+
+### How it works
+
+The process of message dispatch throttling can be divided into the following 
steps:
+1. The broker approximates the number of entries to read from the bookies by 
calculating the remaining quota. 
+2. The broker reads the messages from the bookies.
+3. The broker dispatches the messages to the client and updates the counter to 
decrease the quota. A scheduled task refreshes the quota when a throttling 
period ends.
+
+:::note
+
+- The quota cannot be decreased before step 3, because the broker doesn't know 
the actual number of messages per entry or the actual entry size until it reads 
the data.
+- Operations like `seek` or `redeliver` may deliver messages to a client 
multiple times. The broker counts them as different messages and updates the 
counter.
+
+:::
+
+## Concepts
+
+### Throttling levels
+
+The following table outlines the three levels at which you can throttle 
message dispatch.
+
+Level | Description
+:-----|:------------
+Per broker | All subscriptions in a single broker share the quota.
+Per topic | All subscriptions in the same topic share the quota.<br /><li>If 
it's a non-partitioned topic, the quota equals the maximum number of messages 
the topic can deliver per unit of time.</li><li>If a topic has multiple 
partitions, the quota refers to the maximum number of messages each partition 
can deliver per unit of time. In other words, the actual dispatch rate limit of 
a [partitioned topic](concepts-messaging.md#partitioned-topics) is N times the 
configured one (N is the num [...]
+Per subscription | <li>If it's a non-partitioned topic, the rate limit refers 
to the maximum number of messages a subscription can deliver to clients per 
unit of time.</li><li>If the subscribed topic has multiple partitions, the rate 
limit refers to the maximum number of messages the subscription can deliver per 
partition per unit of time. In other words, a subscription's actual dispatch 
rate limit for a [partitioned topic](concepts-messaging.md#partitioned-topics) 
is N times the configu [...]
+
+:::note
+
+The dispatch rate limits configured at multiple levels take effect 
simultaneously (logical AND).
+
+:::
+
+### Throttling approaches
+
+The following table outlines multiple approaches to configure the dispatch 
rate limits at different levels.
+
+Approach | Per cluster | Per topic | Per subscription
+:--------|:------------|:----------|:----------------
+Set [broker configurations](#throttling-configurations) or [dynamic broker 
configurations](admin-api-brokers.md#dynamic-broker-configuration) | 
<li>`dispatchThrottlingRateInMsg`</li><li>`dispatchThrottlingRateInByte`</li> | 
<li>`dispatchThrottlingRatePerTopicInMsg`</li><li>`dispatchThrottlingRatePerTopicInByte`</li><br
 />It applies to all topics in the cluster. | 
<li>`dispatchThrottlingRatePerSubscriptionInMsg`</li><li>`dispatchThrottlingRatePerSubscriptionInByte`</li><br
 />It applies to [...]
+Set namespace policies | N/A | Refer to [Configure dispatch throttling for 
topics](admin-api-namespaces.md#configure-dispatch-throttling-for-topics). | 
Refer to [Configure dispatch throttling for 
subscriptions](admin-api-namespaces.md#configure-dispatch-throttling-for-subscription).
+Set topic policies | N/A | Refer to [Set topic-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setDispatchRate).
 | Refer to [Set subscription-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setSubscriptionDispatchRate).<br
 />It applies to all subscriptions in a topic.
+
+:::note
+
+The dispatch rate limits configured through the above three approaches take 
effect in the following order of priority: "topic policies" > "namespace 
policies" > "broker configurations". For example, if you have configured the 
dispatch rate limit for a subscription using all three approaches, only the one 
configured through "topic policies" takes effect.
+
+:::
+
+### Throttling configurations
+
+The following table outlines the parameters that you can configure for message 
dispatch throttling in the `conf/broker.conf` file.
+
+Parameter | Description | Default value
+:---------|:------------|:-------------
+dispatchThrottlingRateInMsg | The total number of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInMsg` 
or `dispatchThrottlingRatePerSubscriptionInMsg`. | '-1', which means no limit.
+dispatchThrottlingRateInByte | The total byte size of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInByte` 
or `dispatchThrottlingRatePerSubscriptionInByte`. | '-1', which means no limit.
+ratePeriodInSecond | The period of time for dispatch throttling (in seconds). 
The counter is reset at the end of the period.<br />For example, to configure 
the rate limit to 10,000 messages per minute, set `ratePeriodInSecond` to `60` 
and `dispatchThrottlingRateInMsg` to `10000`. 
| 1 (second)
+preciseDispatcherFlowControl | Whether to apply a precise control on the 
dispatch throttling. By default, it's disabled, which means the broker 
approximates `the number of messages to read from bookies` using the minimum 
value between the remaining `consumer.receiverQueueSize` (defaults to 1000) and 
`dispatcherMaxReadBatchSize` (defaults to 100).<br /><br />When it's set to 
`true`, the broker approximates $$the \ number \ of \ entries \ to \ read \ 
from \ bookies$$ through the following  [...]
+dispatchThrottlingOnBatchMessageEnabled | Whether to count messages by entry 
(batch). By default, it's disabled.<br /><br />Note that setting it to `true` 
may lead to an inaccurate approximation of total message count but maximize 
Pulsar's throughput while keeping stable read requests to the bookies. For 
example, assume you've set the rate limit to `10/s`, if you set 
`dispatchThrottlingOnBatchMessageEnabled` to `true`, the broker only reads 10 
entries and delivers them to the client per  [...]
+dispatchThrottlingOnNonBacklogConsumerEnabled | Whether the dispatch 
throttling on non-backlog consumers is enabled. By default, it's enabled.<br 
/>When it is set to `false`:<br /><li>If all the consumers in one subscription 
have no backlog, the message dispatch throttling is turned off automatically 
even if `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` are 
configured.</li><li>If at least one consumer has a backlog, the throttling is 
turned on automatically.</li> | true
+
+:::note
+
+- You can use `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` 
simultaneously (logical AND).
+- Ensure that only one of `preciseDispatcherFlowControl` and 
`dispatchThrottlingOnBatchMessageEnabled` is enabled at a time, because they are 
mutually exclusive. Both parameters can be used to mitigate the over-delivery 
issues (see [Limitations](#limitations)). The difference between them is:
+  - When `preciseDispatcherFlowControl` is enabled, Pulsar considers the 
number of messages per entry. This parameter takes effect when the broker reads 
entries from the bookies.
+  - When `dispatchThrottlingOnBatchMessageEnabled` is enabled, Pulsar ignores 
the number of messages per entry. This parameter takes effect when the broker 
updates the counter after sending messages to the client.
+
+:::
+
+## Limitations
+
+With message dispatch throttling, messages may still be over-delivered per 
unit of time for the following reasons:
+
+1. **The broker may read more entries or bytes from the bookies than the 
throttling limit.**
+
+   a) **The byte size of messages delivered to the client may exceed the 
configured threshold.**
+   
+     When you set the dispatch rate limit in bytes/throttling-period 
(`dispatchThrottlingRateInByte`/`ratePeriodInSecond`), the broker calculates 
$$the \ number \ of \ entries \ to \ read \ from \ bookies$$ in one throttling 
period through the following equation:
+   
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ byte \ size \ to \ read} \over{The \ average \ byte \ size \ per \ entry}}
+     $$
+
+     By controlling $$the \ number \ of \ entries \ to \ read \ from \ 
bookies$$, the broker attempts to limit `the total byte size to read` below 
`the dispatch rate` within each throttling period. It reads messages from the 
bookies in the unit of `entry` and approximates the bytes of the next entry to 
read because it does not know the exact byte size of each entry before reading 
it.
+
+     The broker uses the following two metrics to get the average byte size 
per entry:
+
+      * Average publish size (`brk_ml_EntrySizeBuckets`): the average byte 
size per entry stored in the bookies when the broker receives a publish request.
+
+      * Average dispatch size (`entriesReadSize`/`entriesReadCount`): the 
average byte size per entry read from bookies, that is, the average byte size 
per entry sent to the client.
+      
+     The broker uses the average publish size in preference to the average 
dispatch size. If the average publish size is unavailable, it uses the 
average dispatch size. When neither metric is available, the broker reads 
only one entry on the first attempt.
+
+   b) **The number of messages delivered to the client may exceed the 
configured threshold.**
+     
+     When you set the dispatch rate limit in message-count/throttling-period 
(`dispatchThrottlingRateInMsg`/`ratePeriodInSecond`) and batching 
(`batch-send`) is enabled, the broker counts an entry as one message 
(regardless of the message count per entry) and calculates $$the \ number \ of 
\ entries \ to \ read \ from \ bookies$$ through the following equation:
+      
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ number \ of \ messages \ to \ read} \over{The \ average \ message \ count \ 
per \ entry} \ (=1)}
+     $$
+
+     Because a batched entry can contain multiple messages, the number of 
messages delivered to the client equals or exceeds the configured threshold.
+
+   **Workaround**
+
+   Configuring `preciseDispatcherFlowControl` or 
`dispatchThrottlingOnBatchMessageEnabled` can mitigate the over-delivery issue. 
For example, turning on `preciseDispatcherFlowControl` can mitigate the 
limitation by pre-decrementing the quota using the approximated average message 
count per entry. See [Throttling configurations](#throttling-configurations) 
for more details.
+
+2. **Concurrent throttling processes may not decrease the quota in a timely 
manner.**
+
+   As introduced in [How it works](#how-it-works), the dispatch throttling 
process is `1.get remaining quota` $$\to$$ `2.load data` $$\to$$ `3.decrease 
quota`. 
+   
+   When two processes "dispatch replay messages (process-R)" and "dispatch 
non-replay messages (process-N)" in the same subscription are executed 
concurrently, their throttling processes can be interwoven in this order: 
+
+     1) process-R: `1.get remaining quota`
+
+     2) process-R: `2.load data`
+
+     3) process-N: `1.get remaining quota`
+
+     4) process-N: `2.load data`
+
+     5) process-R: `3.decrease quota`
+
+     6) process-N: `3.decrease quota`
+
+   As a result, the total number of dispatched messages may exceed the quota.
+
+   :::note
+
+   When over-delivery happens and the delivered message count exceeds the 
quota in the current period, the quota for the next period is reduced 
accordingly. For example, if the rate limit is set to `10/s` and `11` messages 
are delivered to the client in the first period, then at most `9` messages can 
be delivered in the next period; if `30` messages are delivered in one period, 
the number of messages that can be delivered in each of the next two periods is 
`0`.
+
+   ![An example of over-delivery within a throttling 
period](/assets/throttling-limitation.svg)
+
+   :::
\ No newline at end of file
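
The quota carry-over described in the note at the end of `concepts-throttling.md` can be sketched in Python. This is a toy model under stated assumptions, not Pulsar's implementation: `refresh_quota` and `deliverable_per_period` are hypothetical names, and the refresh rule (add one period's limit, capped at the limit) is an assumption chosen because it reproduces the two examples in the note.

```python
from typing import List


def refresh_quota(remaining: int, limit: int) -> int:
    """Quota at the start of a new period (assumed rule): add one period's
    worth of quota, but never accumulate more than the limit."""
    return min(limit, remaining + limit)


def deliverable_per_period(limit: int, delivered_first: int, periods: int) -> List[int]:
    """Messages that can be delivered in each period after the first one,
    assuming no further over-delivery and a consumer that takes everything."""
    remaining = limit - delivered_first  # negative after over-delivery
    result = []
    for _ in range(periods):
        remaining = refresh_quota(remaining, limit)
        allowed = max(0, remaining)  # a negative quota means nothing is sent
        result.append(allowed)
        remaining -= allowed
    return result
```

With a limit of 10 per period, `deliverable_per_period(10, 11, 2)` yields `[9, 10]`, and `deliverable_per_period(10, 30, 3)` yields `[0, 0, 10]`, matching the note's examples of one reduced period and two empty periods respectively.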
diff --git a/docusaurus.config.js b/docusaurus.config.js
index 00a08293cfc..f472003a6b5 100644
--- a/docusaurus.config.js
+++ b/docusaurus.config.js
@@ -39,6 +39,8 @@ const lookupApiUrl = url + "/lookup-rest-api";
 const githubUrl = "https://github.com/apache/pulsar";
 const githubSiteUrl = "https://github.com/apache/pulsar-site";
 const baseUrl = "/";
+const math = require('remark-math');
+const katex = require('rehype-katex');
 
 const injectLinkParse = ([, prefix, , name, path]) => {
     if (prefix == "javadoc") {
@@ -336,7 +338,9 @@ module.exports = {
                             /{\@inject\:\s?endpoint\|([^}]+)}/,
                             injectLinkParseForEndpoint
                         ),
-                    ],
+                        math,
+                        ],
+                    rehypePlugins: [katex],
                     versions: versionsMap,
                     onlyIncludeVersions: buildVersions || ["current"],
                 },
@@ -395,4 +399,13 @@ module.exports = {
         "/js/matomo-agent.js",
     ],
     clientModules: [require.resolve('./matomoClientModule.ts')],
+    stylesheets: [
+        {
+          href: 
'https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.css',
+          type: 'text/css',
+          integrity:
+            
'sha384-odtC+0UGzzFL/6PNoE8rX/SPcQDXBJ+uRepguP4QkPCm2LBxH3FA3y+fKSiJ+AmM',
+          crossorigin: 'anonymous',
+        },
+      ],
 };
diff --git a/package.json b/package.json
index e8e94cb85cb..98a7c2eb56b 100644
--- a/package.json
+++ b/package.json
@@ -32,6 +32,7 @@
     "execa": "^6.1.0",
     "file-loader": "^6.2.0",
     "font-awesome": "^4.7.0",
+    "hast-util-is-element": "^1.1.0",
     "install": "^0.13.0",
     "jquery": "^3.1.1",
     "jquery.scrollto": "^2.1.2",
@@ -48,7 +49,9 @@
     "react-markdown": "^8.0.0",
     "react-md-file": "^2.0.0",
     "react-svg": "^14.1.13",
+    "rehype-katex": "^5.0.0",
     "remark-linkify-regex": "^1.0.0",
+    "remark-math": "^3.0.1",
     "replace-in-file": "^6.3.2",
     "semver": "^7.3.8",
     "sine-waves": "^0.3.0",
diff --git a/sidebars.json b/sidebars.json
index 17e48e6f928..6cfcd1f6ca2 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -31,6 +31,7 @@
         "concepts-multi-tenancy",
         "concepts-authentication",
         "concepts-topic-compaction",
+        "concepts-throttling",
         "concepts-proxy-sni-routing",
         "concepts-multiple-advertised-listeners"
       ]
diff --git a/static/assets/throttling-dispatch.svg 
b/static/assets/throttling-dispatch.svg
new file mode 100644
index 00000000000..c7000e0ffa2
--- /dev/null
+++ b/static/assets/throttling-dispatch.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:lucid="lucid" width="1551.05" 
height="280.13"><g transform="translate(1624.4147375328416 -949.6319444444443)" 
lucid:page-tab-id="0_0"><path d="M-1806.87 0H64v1322.84h-1870.87z" 
fill="#fff"/><path d="M-1583.9 1103.77a.5.5 0 0 1 .5-.5H-94.38a.5.5 0 0 1 
.5.5.5.5 0 0 1-.5.5h-1489.05a.5.5 0 0 1-.5-.5z" stroke="#188fff" 
fill="none"/><path d="M-1484.42 1072.27a.5.5 0 0 1 .5.5v76a.5.5 0 0 1-.5.5.5.5 
0 0 1-.5 [...]
\ No newline at end of file
diff --git a/static/assets/throttling-limitation.svg 
b/static/assets/throttling-limitation.svg
new file mode 100644
index 00000000000..a336699d247
--- /dev/null
+++ b/static/assets/throttling-limitation.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:lucid="lucid" width="1093.5" 
height="361.07"><g transform="translate(1604.4147375328084 -1255.365911541303)" 
lucid:page-tab-id="0_0"><path d="M-1806.87 1258.84H64v1322.83h-1870.87z" 
fill="#fff"/><path d="M-1583.9 1490.1a.5.5 0 0 1 .5-.5h1051.48a.5.5 0 0 1 
.5.5.5.5 0 0 1-.5.5h-1051.5a.5.5 0 0 1-.5-.5z" stroke="#4c535d" 
fill="none"/><path d="M-1485.4 1458.6a.5.5 0 0 1 .5.5v76a.5.5 0 0 1-.5.5.5.5 0 
0 1- [...]
\ No newline at end of file
diff --git a/versioned_docs/version-2.10.x/admin-api-namespaces.md 
b/versioned_docs/version-2.10.x/admin-api-namespaces.md
index 4295a0b2f0a..1c11679c273 100644
--- a/versioned_docs/version-2.10.x/admin-api-namespaces.md
+++ b/versioned_docs/version-2.10.x/admin-api-namespaces.md
@@ -966,7 +966,7 @@ $ pulsar-admin namespaces set-subscription-dispatch-rate 
test-tenant/ns1 \
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
@@ -1011,7 +1011,7 @@ $ pulsar-admin namespaces get-subscription-dispatch-rate 
test-tenant/ns1
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
diff --git a/versioned_docs/version-2.10.x/concepts-throttling.md 
b/versioned_docs/version-2.10.x/concepts-throttling.md
new file mode 100644
index 00000000000..b0e89441044
--- /dev/null
+++ b/versioned_docs/version-2.10.x/concepts-throttling.md
@@ -0,0 +1,167 @@
+---
+id: concepts-throttling
+title: Message dispatch throttling
+sidebar_label: "Message throttling"
+---
+
+## Overview
+
+### What is message dispatch throttling?
+
+Large message payloads can cause memory usage spikes that lead to performance 
decreases. Pulsar adopts a rate-limit throttling mechanism for message 
dispatch, avoiding a traffic surge and improving message deliverability. You 
can set a threshold to limit the number of messages and the byte size of 
entries that can be delivered to clients, blocking subsequent deliveries when 
the traffic per unit of time exceeds the threshold.
+
+For example, if you set the dispatch rate limit to 10 messages per second, 
at most 10 messages can be delivered to the client per second.
+
+![Rate-limit dispatch throttling](/assets/throttling-dispatch.svg 'message 
throttling')
+
+### Why use it?
+
+Message dispatch throttling brings the following benefits:
+
+- **Limit broker's read request loads to BookKeeper**
+
+  Messages are persistently stored in the BookKeeper cluster. If a large 
number of read requests cannot be fulfilled using the cached data, the 
BookKeeper cluster may become too busy to respond, and the broker's I/O or CPU 
resources can be fully occupied. Using the message dispatch throttling feature 
can regulate the data flow to limit the broker’s read request loads to 
BookKeeper.
+
+- **Balance the allocation of broker's hardware resources at 
topic/subscription levels**
+
+  A broker instance serves multiple topics at one time. If a topic is 
overloaded with requests, it can occupy almost all of the I/O, CPU, and memory 
resources of the broker, so that other topics cannot be read. Using the message 
dispatch throttling feature can limit the allocation of the broker's hardware 
resources across topics.
+
+- **Limit the allocation of client's hardware resources at topic/subscription 
levels**
+
+  When there is a large backlog of messages to consume, clients may receive a 
large amount of data in a short period of time, which monopolizes their 
computing resources. Since the client has no mechanisms to proactively limit 
the consumption rate, using the message dispatch throttling feature can also 
regulate the allocation of the client's hardware resources.
+
+### How it works
+
+The process of message dispatch throttling can be divided into the following 
steps:
+1. The broker approximates the number of entries to read from the bookies by 
calculating the remaining quota. 
+2. The broker reads the messages from the bookies.
+3. The broker dispatches the messages to the client and updates the counter to 
decrease the quota. A scheduled task refreshes the quota when a throttling 
period ends.
+
+:::note
+
+- The quota cannot be decreased before step 3, because the broker doesn't know 
the actual number of messages per entry or the actual entry size until it reads 
the data.
+- Operations like `seek` or `redeliver` may deliver messages to a client 
multiple times. The broker counts them as different messages and updates the 
counter.
+
+:::
+
+## Concepts
+
+### Throttling levels
+
+The following table outlines the three levels at which you can throttle 
message dispatch.
+
+Level | Description
+:-----|:------------
+Per broker | All subscriptions in a single broker share the quota.
+Per topic | All subscriptions in the same topic share the quota.<br /><li>If 
it's a non-partitioned topic, the quota equals the maximum number of messages 
the topic can deliver per unit of time.</li><li>If a topic has multiple 
partitions, the quota refers to the maximum number of messages each partition 
can deliver per unit of time. In other words, the actual dispatch rate limit of 
a [partitioned topic](concepts-messaging.md#partitioned-topics) is N times the 
configured one (N is the num [...]
+Per subscription | <li>If it's a non-partitioned topic, the rate limit refers 
to the maximum number of messages a subscription can deliver to clients per 
unit of time.</li><li>If the subscribed topic has multiple partitions, the rate 
limit refers to the maximum number of messages the subscription can deliver per 
partition per unit of time. In other words, a subscription's actual dispatch 
rate limit for a [partitioned topic](concepts-messaging.md#partitioned-topics) 
is N times the configu [...]
+
+:::note
+
+The dispatch rate limits configured at multiple levels take effect 
simultaneously (logical AND).
+
+:::
+
+### Throttling approaches
+
+The following table outlines multiple approaches to configure the dispatch 
rate limits at different levels.
+
+Approach | Per cluster | Per topic | Per subscription
+:--------|:------------|:----------|:----------------
+Set [broker configurations](#throttling-configurations) or [dynamic broker 
configurations](admin-api-brokers.md#dynamic-broker-configuration) | 
<li>`dispatchThrottlingRateInMsg`</li><li>`dispatchThrottlingRateInByte`</li> | 
<li>`dispatchThrottlingRatePerTopicInMsg`</li><li>`dispatchThrottlingRatePerTopicInByte`</li><br
 />It applies to all topics in the cluster. | 
<li>`dispatchThrottlingRatePerSubscriptionInMsg`</li><li>`dispatchThrottlingRatePerSubscriptionInByte`</li><br
 />It applies to [...]
+Set namespace policies | N/A | Refer to [Configure dispatch throttling for 
topics](admin-api-namespaces.md#configure-dispatch-throttling-for-topics). | 
Refer to [Configure dispatch throttling for 
subscriptions](admin-api-namespaces.md#configure-dispatch-throttling-for-subscription).
+Set topic policies | N/A | Refer to [Set topic-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setDispatchRate).
 | Refer to [Set subscription-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setSubscriptionDispatchRate).<br
 />It applies to all subscriptions in a topic.
+
+:::note
+
+The dispatch rate limits configured through the above three approaches take 
effect in the following order of priority: "topic policies" > "namespace 
policies" > "broker configurations". For example, if you have configured the 
dispatch rate limit for a subscription using all three approaches, only the one 
configured through "topic policies" takes effect.
+
+:::
+
+### Throttling configurations
+
+The following table outlines the parameters that you can configure for message 
dispatch throttling in the `conf/broker.conf` file.
+
+Parameter | Description | Default value
+:---------|:------------|:-------------
+dispatchThrottlingRateInMsg | The total number of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInMsg` 
or `dispatchThrottlingRatePerSubscriptionInMsg`. | '-1', which means no limit.
+dispatchThrottlingRateInByte | The total byte size of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInByte` 
or `dispatchThrottlingRatePerSubscriptionInByte`. | '-1', which means no limit.
+ratePeriodInSecond | The period of time for dispatch throttling (in seconds). 
The counter is reset at the end of the period.<br />For example, to configure 
the rate limit to 10,000 messages per minute, set `ratePeriodInSecond` to `60` 
and `dispatchThrottlingRateInMsg` to `10000`. 
| 1 (second)
+preciseDispatcherFlowControl | Whether to apply a precise control on the 
dispatch throttling. By default, it's disabled, which means the broker 
approximates `the number of messages to read from bookies` using the minimum 
value between the remaining `consumer.receiverQueueSize` (defaults to 1000) and 
`dispatcherMaxReadBatchSize` (defaults to 100).<br /><br />When it's set to 
`true`, the broker approximates $$the \ number \ of \ entries \ to \ read \ 
from \ bookies$$ through the following  [...]
+dispatchThrottlingOnBatchMessageEnabled | Whether to count messages by entry 
(batch). By default, it's disabled.<br /><br />Note that setting it to `true` 
may lead to an inaccurate approximation of total message count but maximize 
Pulsar's throughput while keeping stable read requests to the bookies. For 
example, assume you've set the rate limit to `10/s`, if you set 
`dispatchThrottlingOnBatchMessageEnabled` to `true`, the broker only reads 10 
entries and delivers them to the client per  [...]
+dispatchThrottlingOnNonBacklogConsumerEnabled | Whether the dispatch 
throttling on non-backlog consumers is enabled. By default, it's enabled.<br 
/>When it is set to `false`:<br /><li>If all the consumers in one subscription 
have no backlog, the message dispatch throttling is turned off automatically 
even if `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` are 
configured.</li><li>If at least one consumer has a backlog, the throttling is 
turned on automatically.</li> | true
+
+:::note
+
+- You can use `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` 
simultaneously (logical AND).
+- Enable only one of `preciseDispatcherFlowControl` and 
`dispatchThrottlingOnBatchMessageEnabled` at a time because they are mutually 
exclusive. Either parameter can mitigate the over-delivery issues (see 
[Limitations](#limitations)). They differ as follows:
+  - When `preciseDispatcherFlowControl` is enabled, Pulsar considers the 
number of messages per entry. This parameter takes effect when the broker reads 
entries from the bookies.
+  - When `dispatchThrottlingOnBatchMessageEnabled` is enabled, Pulsar ignores 
the number of messages per entry. This parameter takes effect when the broker 
updates the counter after sending messages to the client.
+
+:::
+
+## Limitations
+
+Message dispatch throttling may over-deliver messages per unit of time for 
the following reasons:
+
+1. **The broker may read more entries or bytes from the bookies than the 
throttling limit.**
+
+   a) **The byte size of messages delivered to the client may exceed the 
configured threshold.**
+   
+     When you set the dispatch rate limit in bytes/throttling-period 
(`dispatchThrottlingRateInByte`/`ratePeriodInSecond`), the broker calculates 
$$the \ number \ of \ entries \ to \ read \ from \ bookies$$ in one throttling 
period through the following equation:
+   
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ byte \ size \ to \ read} \over{The \ average \ byte \ size \ per \ entry}}
+     $$
+
+     By controlling $$the \ number \ of \ entries \ to \ read \ from \ 
bookies$$, the broker attempts to limit `the total byte size to read` below 
`the dispatch rate` within each throttling period. It reads messages from the 
bookies in the unit of `entry` and approximates the bytes of the next entry to 
read because it does not know the exact byte size of each entry before reading 
it.
+
+     The broker uses the following two metrics to get the average byte size 
per entry:
+
+      * Average publish size (`brk_ml_EntrySizeBuckets`): the average byte 
size per entry stored in the bookies when the broker receives a publish request.
+
+      * Average dispatch size (`entriesReadSize`/`entriesReadCount`): the 
average byte size per entry read from bookies, that is, the average byte size 
per entry sent to the client.
+      
+     The broker prefers the average publish size over the average dispatch 
size. If the average publish size is unavailable, it falls back to the average 
dispatch size. When neither metric is available, the broker reads only one 
entry on the first attempt.
+
+   **b) The number of messages delivered to the client may exceed the 
configured threshold.**
+     
+     When you set the dispatch rate limit in message-count/throttling-period 
(`dispatchThrottlingRateInMsg`/`ratePeriodInSecond`) and batching 
(`batch-send`) is enabled, the broker counts an entry as one message (regardless of 
the message count per entry) and calculates $$the \ number \ of \ entries \ to 
\ read \ from \ bookies$$ through the following equation:
+      
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ number \ of \ messages \ to \ read} \over{The \ average \ message \ count \ 
per \ entry} \ (=1)}
+     $$
+
+     Because each entry can contain multiple messages, the number of messages 
delivered to the client is always greater than or equal to the configured 
threshold.
+
+   **Workaround**
+
+   Configuring `preciseDispatcherFlowControl` or 
`dispatchThrottlingOnBatchMessageEnabled` can mitigate the over-delivery issue. 
For example, turning on `preciseDispatcherFlowControl` can mitigate the 
limitation by pre-decrementing the quota using the approximated average message 
count per entry. See [Throttling configurations](#throttling-configurations) 
for more details.
+
+2. **Concurrent throttling processes may not decrease the quota in a timely 
manner.**
+
+   As introduced in [How it works](#how-it-works), the dispatch throttling 
process is `1.get remaining quota` $$\to$$ `2.load data` $$\to$$ `3.decrease 
quota`. 
+   
+   When two processes in the same subscription, "dispatch replay messages 
(process-R)" and "dispatch non-replay messages (process-N)", run concurrently, 
their throttling steps can interleave in this order: 
+
+     1) process-R: `1.get remaining quota`
+
+     2) process-R: `2.load data`
+
+     3) process-N: `1.get remaining quota`
+
+     4) process-N: `2.load data`
+
+     5) process-R: `3.decrease quota`
+
+     6) process-N: `3.decrease quota`
+
+   As a result, the total number of dispatched messages may exceed the quota.
+
+   :::note
+
+   When over-delivery happens and the delivered message count exceeds the 
quota for the current period, the quota for the following periods is reduced 
accordingly. For example, if the rate limit is set to `10/s` and `11` messages 
are delivered to the client in one period, only up to `9` messages can be 
delivered in the next period; if `30` messages are delivered in one period, no 
messages can be delivered in the next two periods.
+
+   ![An example of over-delivery occurring within a throttling 
period](/assets/throttling-limitation.svg)
+
+   :::
\ No newline at end of file
diff --git a/versioned_docs/version-2.11.x/admin-api-namespaces.md 
b/versioned_docs/version-2.11.x/admin-api-namespaces.md
index efcc25b9a04..94a416597b0 100644
--- a/versioned_docs/version-2.11.x/admin-api-namespaces.md
+++ b/versioned_docs/version-2.11.x/admin-api-namespaces.md
@@ -886,7 +886,7 @@ pulsar-admin namespaces set-subscription-dispatch-rate 
test-tenant/namespace1 \
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
@@ -928,7 +928,7 @@ Example output:
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
diff --git a/versioned_docs/version-2.11.x/concepts-throttling.md 
b/versioned_docs/version-2.11.x/concepts-throttling.md
new file mode 100644
index 00000000000..b0e89441044
--- /dev/null
+++ b/versioned_docs/version-2.11.x/concepts-throttling.md
@@ -0,0 +1,167 @@
+---
+id: concepts-throttling
+title: Message dispatch throttling
+sidebar_label: "Message throttling"
+---
+
+## Overview
+
+### What is message dispatch throttling?
+
+Large message payloads can cause memory usage spikes that lead to performance 
decreases. Pulsar adopts a rate-limit throttling mechanism for message 
dispatch, avoiding a traffic surge and improving message deliverability. You 
can set a threshold to limit the number of messages and the byte size of 
entries that can be delivered to clients, blocking subsequent deliveries when 
the traffic per unit of time exceeds the threshold.
+
+For example, if you set the dispatch rate limit to 10 messages per second, 
at most 10 messages can be delivered to the client per second.
+
+![Rate-limit dispatch throttling](/assets/throttling-dispatch.svg 'message 
throttling')
+
+### Why use it?
+
+Message dispatch throttling brings the following benefits:
+
+- **Limit broker's read request loads to BookKeeper**
+
+  Messages are persistently stored in the BookKeeper cluster. If a large 
number of read requests cannot be fulfilled using the cached data, the 
BookKeeper cluster may become too busy to respond, and the broker's I/O or CPU 
resources can be fully occupied. Message dispatch throttling regulates the 
data flow and limits the broker's read request load on BookKeeper.
+
+- **Balance the allocation of broker's hardware resources at 
topic/subscription levels**
+
+  A broker instance serves multiple topics at the same time. If a topic is 
overloaded with requests, it can occupy almost all of the broker's I/O, CPU, 
and memory resources, which prevents other topics from being read. Message 
dispatch throttling balances the allocation of the broker's hardware resources 
across topics.
+
+- **Limit the allocation of client's hardware resources at topic/subscription 
levels**
+
+  When there is a large backlog of messages to consume, clients may receive a 
large amount of data in a short period of time, which monopolizes their 
computing resources. Since the client has no mechanisms to proactively limit 
the consumption rate, using the message dispatch throttling feature can also 
regulate the allocation of the client's hardware resources.
+
+### How it works
+
+The process of message dispatch throttling can be divided into the following 
steps:
+1. The broker approximates the number of entries to read from the bookies by 
calculating the remaining quota. 
+2. The broker reads the messages from the bookies.
+3. The broker dispatches the messages to the client and updates the counter to 
decrease the quota. A scheduled task refreshes the quota when a throttling 
period ends.
+
+:::note
+
+- The quota cannot be decreased before step 3, because the broker doesn't know 
the actual number of messages per entry or the actual entry size until it reads 
the data.
+- Operations like `seek` or `redeliver` may deliver messages to a client 
multiple times. The broker counts them as different messages and updates the 
counter.
+
+:::
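
The three steps above can be sketched as a small simulation. This is only an illustration under simplifying assumptions, not the broker's implementation: `DispatchQuota` and `dispatch_once` are hypothetical names, the quota is counted in messages only, and the refresh is modeled as a plain reset.

```python
from dataclasses import dataclass


@dataclass
class DispatchQuota:
    """Hypothetical per-period message quota (not a Pulsar class)."""
    limit: int      # messages allowed per throttling period
    used: int = 0   # messages counted in the current period

    def remaining(self) -> int:
        return max(self.limit - self.used, 0)

    def consume(self, messages: int) -> None:
        self.used += messages

    def reset(self) -> None:
        # Stands in for the scheduled task that refreshes the quota.
        # Simplified: any over-delivery from the previous period is ignored.
        self.used = 0


def dispatch_once(quota: DispatchQuota, backlog: list[int]) -> int:
    """Run steps 1-3 once; `backlog` holds the message count of each entry."""
    # Step 1: approximate how many entries to read from the remaining quota.
    entries_to_read = min(quota.remaining(), len(backlog))
    if entries_to_read == 0:
        return 0  # quota used up or nothing to read: wait for the next period
    # Step 2: read the entries; only now is the real message count known.
    entries = [backlog.pop(0) for _ in range(entries_to_read)]
    delivered = sum(entries)
    # Step 3: dispatch and decrease the quota by what was actually delivered.
    quota.consume(delivered)
    return delivered


quota = DispatchQuota(limit=10)
backlog = [1] * 25                      # 25 unbatched entries, one message each
first = dispatch_once(quota, backlog)   # quota available
second = dispatch_once(quota, backlog)  # quota already used up in this period
quota.reset()                           # a new throttling period begins
third = dispatch_once(quota, backlog)   # quota available again
```

With one message per entry, the first and third calls each deliver 10 messages and the second delivers none. If the entries were batches, step 2 would return more messages than step 1 estimated, which is the over-delivery case described in the Limitations section.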
+
+## Concepts
+
+### Throttling levels
+
+The following table outlines the three levels at which you can throttle 
message dispatch.
+
+Level | Description
+:-----|:------------
+Per broker | All subscriptions in a single broker share the quota.
+Per topic | All subscriptions in the same topic share the quota.<br /><li>If 
it's a non-partitioned topic, the quota equals the maximum number of messages 
the topic can deliver per unit of time.</li><li>If a topic has multiple 
partitions, the quota refers to the maximum number of messages each partition 
can deliver per unit of time. In other words, the actual dispatch rate limit of 
a [partitioned topic](concepts-messaging.md#partitioned-topics) is N times the 
configured one (N is the num [...]
+Per subscription | <li>If it's a non-partitioned topic, the rate limit refers 
to the maximum number of messages a subscription can deliver to clients per 
unit of time.</li><li>If the subscribed topic has multiple partitions, the rate 
limit refers to the maximum number of messages the subscription can deliver per 
partition per unit of time. In other words, a subscription's actual dispatch 
rate limit for a [partitioned topic](concepts-messaging.md#partitioned-topics) 
is N times the configu [...]
+
+:::note
+
+The dispatch rate limits configured at multiple levels take effect 
simultaneously (logical AND).
+
+:::
+
+### Throttling approaches
+
+The following table outlines multiple approaches to configure the dispatch 
rate limits at different levels.
+
+Approach | Per cluster | Per topic | Per subscription
+:--------|:------------|:----------|:----------------
+Set [broker configurations](#throttling-configurations) or [dynamic broker 
configurations](admin-api-brokers.md#dynamic-broker-configuration) | 
<li>`dispatchThrottlingRateInMsg`</li><li>`dispatchThrottlingRateInByte`</li> | 
<li>`dispatchThrottlingRatePerTopicInMsg`</li><li>`dispatchThrottlingRatePerTopicInByte`</li><br
 />It applies to all topics in the cluster. | 
<li>`dispatchThrottlingRatePerSubscriptionInMsg`</li><li>`dispatchThrottlingRatePerSubscriptionInByte`</li><br
 />It applies to [...]
+Set namespace policies | N/A | Refer to [Configure dispatch throttling for 
topics](admin-api-namespaces.md#configure-dispatch-throttling-for-topics). | 
Refer to [Configure dispatch throttling for 
subscriptions](admin-api-namespaces.md#configure-dispatch-throttling-for-subscription).
+Set topic policies | N/A | Refer to [Set topic-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setDispatchRate).
 | Refer to [Set subscription-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setSubscriptionDispatchRate).<br
 />It applies to all subscriptions in a topic.
+
+:::note
+
+The dispatch rate limits configured through the above three approaches take 
effect in the following order of priority: "topic policies" > "namespace 
policies" > "broker configurations". For example, if you have configured the 
dispatch rate limit for a subscription using all three approaches, only the 
one configured through "topic policies" takes effect.
+
+:::
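
The priority order in the note above can be illustrated with a short lookup. This is a sketch, not Pulsar code: `effective_rate` and the dictionary layout are hypothetical, and it assumes that an approach with no configured value simply falls through to the next one.

```python
from typing import Optional

# Highest priority first, as described in the note above.
PRIORITY = ("topic policies", "namespace policies", "broker configurations")


def effective_rate(configured: dict[str, Optional[int]]) -> Optional[int]:
    """Return the rate from the highest-priority approach that sets one."""
    for approach in PRIORITY:
        rate = configured.get(approach)
        if rate is not None:
            return rate
    return None  # nothing configured through any approach


all_three = {
    "topic policies": 100,
    "namespace policies": 500,
    "broker configurations": 1000,
}
winner = effective_rate(all_three)  # the topic-policy value takes effect
```

Removing the `"topic policies"` entry from `all_three` makes the namespace-level value the effective one under the same assumption.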
+
+### Throttling configurations
+
+The following table outlines the parameters that you can configure for message 
dispatch throttling in the `conf/broker.conf` file.
+
+Parameter | Description | Default value
+:---------|:------------|:-------------
+dispatchThrottlingRateInMsg | The total number of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInMsg` 
or `dispatchThrottlingRatePerSubscriptionInMsg`. | '-1', which means no limit.
+dispatchThrottlingRateInByte | The total byte size of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInByte` 
or `dispatchThrottlingRatePerSubscriptionInByte`. | '-1', which means no limit.
+ratePeriodInSecond | The length of the throttling period, in seconds. 
The counter is reset at the end of each period.<br />For example, to limit 
dispatch to 10,000 messages per minute, set `ratePeriodInSecond` to `60` and 
`dispatchThrottlingRateInMsg` to `10000`. 
| 1 (second)
+preciseDispatcherFlowControl | Whether to apply a precise control on the 
dispatch throttling. By default, it's disabled, which means the broker 
approximates `the number of messages to read from bookies` using the minimum 
value between the remaining `consumer.receiverQueueSize` (defaults to 1000) and 
`dispatcherMaxReadBatchSize` (defaults to 100).<br /><br />When it's set to 
`true`, the broker approximates $$the \ number \ of \ entries \ to \ read \ 
from \ bookies$$ through the following  [...]
+dispatchThrottlingOnBatchMessageEnabled | Whether to count messages by entry 
(batch). By default, it's disabled.<br /><br />Note that setting it to `true` 
may lead to an inaccurate approximation of total message count but maximize 
Pulsar's throughput while keeping stable read requests to the bookies. For 
example, assume you've set the rate limit to `10/s`, if you set 
`dispatchThrottlingOnBatchMessageEnabled` to `true`, the broker only reads 10 
entries and delivers them to the client per  [...]
+dispatchThrottlingOnNonBacklogConsumerEnabled | Whether the dispatch 
throttling on non-backlog consumers is enabled. By default, it's enabled.<br 
/>When it is set to `false`:<br /><li>If all the consumers in one subscription 
have no backlog, the message dispatch throttling is turned off automatically 
even if `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` are 
configured.</li><li>If at least one consumer has a backlog, the throttling is 
turned on automatically.</li> | true
+
+:::note
+
+- You can use `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` 
simultaneously (logical AND).
+- Enable only one of `preciseDispatcherFlowControl` and 
`dispatchThrottlingOnBatchMessageEnabled` at a time because they are mutually 
exclusive. Either parameter can mitigate the over-delivery issues (see 
[Limitations](#limitations)). They differ as follows:
+  - When `preciseDispatcherFlowControl` is enabled, Pulsar considers the 
number of messages per entry. This parameter takes effect when the broker reads 
entries from the bookies.
+  - When `dispatchThrottlingOnBatchMessageEnabled` is enabled, Pulsar ignores 
the number of messages per entry. This parameter takes effect when the broker 
updates the counter after sending messages to the client.
+
+:::
+
+## Limitations
+
+Message dispatch throttling may over-deliver messages per unit of time for 
the following reasons:
+
+1. **The broker may read more entries or bytes from the bookies than the 
throttling limit.**
+
+   a) **The byte size of messages delivered to the client may exceed the 
configured threshold.**
+   
+     When you set the dispatch rate limit in bytes/throttling-period 
(`dispatchThrottlingRateInByte`/`ratePeriodInSecond`), the broker calculates 
$$the \ number \ of \ entries \ to \ read \ from \ bookies$$ in one throttling 
period through the following equation:
+   
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ byte \ size \ to \ read} \over{The \ average \ byte \ size \ per \ entry}}
+     $$
+
+     By controlling $$the \ number \ of \ entries \ to \ read \ from \ 
bookies$$, the broker attempts to limit `the total byte size to read` below 
`the dispatch rate` within each throttling period. It reads messages from the 
bookies in the unit of `entry` and approximates the bytes of the next entry to 
read because it does not know the exact byte size of each entry before reading 
it.
+
+     The broker uses the following two metrics to get the average byte size 
per entry:
+
+      * Average publish size (`brk_ml_EntrySizeBuckets`): the average byte 
size per entry stored in the bookies when the broker receives a publish request.
+
+      * Average dispatch size (`entriesReadSize`/`entriesReadCount`): the 
average byte size per entry read from bookies, that is, the average byte size 
per entry sent to the client.
+      
+     The broker prefers the average publish size over the average dispatch 
size. If the average publish size is unavailable, it falls back to the average 
dispatch size. When neither metric is available, the broker reads only one 
entry on the first attempt.
+
+   **b) The number of messages delivered to the client may exceed the 
configured threshold.**
+     
+     When you set the dispatch rate limit in message-count/throttling-period 
(`dispatchThrottlingRateInMsg`/`ratePeriodInSecond`) and batching 
(`batch-send`) is enabled, the broker counts an entry as one message (regardless of 
the message count per entry) and calculates $$the \ number \ of \ entries \ to 
\ read \ from \ bookies$$ through the following equation:
+      
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ number \ of \ messages \ to \ read} \over{The \ average \ message \ count \ 
per \ entry} \ (=1)}
+     $$
+
+     Because each entry can contain multiple messages, the number of messages 
delivered to the client is always greater than or equal to the configured 
threshold.
+
+   **Workaround**
+
+   Configuring `preciseDispatcherFlowControl` or 
`dispatchThrottlingOnBatchMessageEnabled` can mitigate the over-delivery issue. 
For example, turning on `preciseDispatcherFlowControl` can mitigate the 
limitation by pre-decrementing the quota using the approximated average message 
count per entry. See [Throttling configurations](#throttling-configurations) 
for more details.
+
+2. **Concurrent throttling processes may not decrease the quota in a timely 
manner.**
+
+   As introduced in [How it works](#how-it-works), the dispatch throttling 
process is `1.get remaining quota` $$\to$$ `2.load data` $$\to$$ `3.decrease 
quota`. 
+   
+   When two processes in the same subscription, "dispatch replay messages 
(process-R)" and "dispatch non-replay messages (process-N)", run concurrently, 
their throttling steps can interleave in this order: 
+
+     1) process-R: `1.get remaining quota`
+
+     2) process-R: `2.load data`
+
+     3) process-N: `1.get remaining quota`
+
+     4) process-N: `2.load data`
+
+     5) process-R: `3.decrease quota`
+
+     6) process-N: `3.decrease quota`
+
+   As a result, the total number of dispatched messages may exceed the quota.
+
+   :::note
+
+   When over-delivery happens and the delivered message count exceeds the 
quota for the current period, the quota for the following periods is reduced 
accordingly. For example, if the rate limit is set to `10/s` and `11` messages 
are delivered to the client in one period, only up to `9` messages can be 
delivered in the next period; if `30` messages are delivered in one period, no 
messages can be delivered in the next two periods.
+
+   ![An example of over-delivery occurring within a throttling 
period](/assets/throttling-limitation.svg)
+
+   :::
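
The interleaving in limitation 2 and the quota reduction described in the note can be reproduced with a few lines of arithmetic. The sketch below is a simplified model, not the broker's code: `interleaved_dispatch` and `allowances_after` are hypothetical helpers, each process is assumed to load exactly the quota it saw, and the overrun is assumed to be carried forward until it is paid off.

```python
def interleaved_dispatch(limit: int) -> int:
    """Replay the six interleaved steps and return the messages dispatched."""
    used = 0
    remaining_r = limit - used   # 1) process-R: get remaining quota
    loaded_r = remaining_r       # 2) process-R: load data
    remaining_n = limit - used   # 3) process-N: get remaining quota (still full)
    loaded_n = remaining_n       # 4) process-N: load data
    used += loaded_r             # 5) process-R: decrease quota
    used += loaded_n             # 6) process-N: decrease quota
    return used


def allowances_after(limit: int, delivered: int, periods: int) -> list[int]:
    """Messages deliverable in each of the next `periods` throttling periods."""
    deficit = max(delivered - limit, 0)  # overrun carried into later periods
    allowances = []
    for _ in range(periods):
        allowances.append(max(limit - deficit, 0))
        deficit = max(deficit - limit, 0)  # each period pays off up to `limit`
    return allowances


over_delivered = interleaved_dispatch(10)         # both processes saw a full quota
after_eleven = allowances_after(10, 11, 2)        # the `11` example from the note
after_thirty = allowances_after(10, 30, 3)        # the `30` example from the note
```

Under this model, the two processes together dispatch twice the limit in one period, and the `30`-message example leaves no allowance for the next two periods before the full quota returns.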
\ No newline at end of file
diff --git a/versioned_docs/version-2.8.x/admin-api-namespaces.md 
b/versioned_docs/version-2.8.x/admin-api-namespaces.md
index be5940d1091..55165c643ab 100644
--- a/versioned_docs/version-2.8.x/admin-api-namespaces.md
+++ b/versioned_docs/version-2.8.x/admin-api-namespaces.md
@@ -1013,7 +1013,7 @@ $ pulsar-admin namespaces set-subscription-dispatch-rate 
test-tenant/ns1 \
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
@@ -1058,7 +1058,7 @@ $ pulsar-admin namespaces get-subscription-dispatch-rate 
test-tenant/ns1
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
diff --git a/versioned_docs/version-2.8.x/concepts-throttling.md 
b/versioned_docs/version-2.8.x/concepts-throttling.md
new file mode 100644
index 00000000000..b0e89441044
--- /dev/null
+++ b/versioned_docs/version-2.8.x/concepts-throttling.md
@@ -0,0 +1,167 @@
+---
+id: concepts-throttling
+title: Message dispatch throttling
+sidebar_label: "Message throttling"
+---
+
+## Overview
+
+### What is message dispatch throttling?
+
+Large message payloads can cause memory usage spikes that lead to performance 
decreases. Pulsar adopts a rate-limit throttling mechanism for message 
dispatch, avoiding a traffic surge and improving message deliverability. You 
can set a threshold to limit the number of messages and the byte size of 
entries that can be delivered to clients, blocking subsequent deliveries when 
the traffic per unit of time exceeds the threshold.
+
+For example, if you set the dispatch rate limit to 10 messages per second, 
at most 10 messages can be delivered to the client per second.
+
+![Rate-limit dispatch throttling](/assets/throttling-dispatch.svg 'message 
throttling')
+
+### Why use it?
+
+Message dispatch throttling brings the following benefits:
+
+- **Limit broker's read request loads to BookKeeper**
+
+  Messages are persistently stored in the BookKeeper cluster. If a large 
number of read requests cannot be fulfilled using the cached data, the 
BookKeeper cluster may become too busy to respond, and the broker's I/O or CPU 
resources can be fully occupied. Message dispatch throttling regulates the 
data flow and limits the broker's read request load on BookKeeper.
+
+- **Balance the allocation of broker's hardware resources at 
topic/subscription levels**
+
+  A broker instance serves multiple topics at the same time. If a topic is 
overloaded with requests, it can occupy almost all of the broker's I/O, CPU, 
and memory resources, which prevents other topics from being read. Message 
dispatch throttling balances the allocation of the broker's hardware resources 
across topics.
+
+- **Limit the allocation of client's hardware resources at topic/subscription 
levels**
+
+  When there is a large backlog of messages to consume, clients may receive a 
large amount of data in a short period of time, which monopolizes their 
computing resources. Since the client has no mechanisms to proactively limit 
the consumption rate, using the message dispatch throttling feature can also 
regulate the allocation of the client's hardware resources.
+
+### How it works
+
+The process of message dispatch throttling can be divided into the following 
steps:
+1. The broker approximates the number of entries to read from the bookies by 
calculating the remaining quota. 
+2. The broker reads the messages from the bookies.
+3. The broker dispatches the messages to the client and updates the counter to 
decrease the quota. A scheduled task refreshes the quota when a throttling 
period ends.
+
+:::note
+
+- The quota cannot be decreased before step 3, because the broker doesn't know 
the actual number of messages per entry or the actual entry size until it reads 
the data.
+- Operations like `seek` or `redeliver` may deliver messages to a client 
multiple times. The broker counts them as different messages and updates the 
counter.
+
+:::
+
+## Concepts
+
+### Throttling levels
+
+The following table outlines the three levels at which you can throttle 
message dispatch.
+
+Level | Description
+:-----|:------------
+Per broker | All subscriptions in a single broker share the quota.
+Per topic | All subscriptions in the same topic share the quota.<br /><li>If 
it's a non-partitioned topic, the quota equals the maximum number of messages 
the topic can deliver per unit of time.</li><li>If a topic has multiple 
partitions, the quota refers to the maximum number of messages each partition 
can deliver per unit of time. In other words, the actual dispatch rate limit of 
a [partitioned topic](concepts-messaging.md#partitioned-topics) is N times the 
configured one (N is the num [...]
+Per subscription | <li>If it's a non-partitioned topic, the rate limit refers 
to the maximum number of messages a subscription can deliver to clients per 
unit of time.</li><li>If the subscribed topic has multiple partitions, the rate 
limit refers to the maximum number of messages the subscription can deliver per 
partition per unit of time. In other words, a subscription's actual dispatch 
rate limit for a [partitioned topic](concepts-messaging.md#partitioned-topics) 
is N times the configu [...]
+
+:::note
+
+The dispatch rate limits configured at multiple levels take effect 
simultaneously (logical AND).
+
+:::
+
+### Throttling approaches
+
+The following table outlines multiple approaches to configure the dispatch 
rate limits at different levels.
+
+Approach | Per cluster | Per topic | Per subscription
+:--------|:------------|:----------|:----------------
+Set [broker configurations](#throttling-configurations) or [dynamic broker 
configurations](admin-api-brokers.md#dynamic-broker-configuration) | 
<li>`dispatchThrottlingRateInMsg`</li><li>`dispatchThrottlingRateInByte`</li> | 
<li>`dispatchThrottlingRatePerTopicInMsg`</li><li>`dispatchThrottlingRatePerTopicInByte`</li><br
 />It applies to all topics in the cluster. | 
<li>`dispatchThrottlingRatePerSubscriptionInMsg`</li><li>`dispatchThrottlingRatePerSubscriptionInByte`</li><br
 />It applies to [...]
+Set namespace policies | N/A | Refer to [Configure dispatch throttling for 
topics](admin-api-namespaces.md#configure-dispatch-throttling-for-topics). | 
Refer to [Configure dispatch throttling for 
subscriptions](admin-api-namespaces.md#configure-dispatch-throttling-for-subscription).
+Set topic policies | N/A | Refer to [Set topic-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setDispatchRate).
 | Refer to [Set subscription-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setSubscriptionDispatchRate).<br
 />It applies to all subscriptions in a topic.
+
+:::note
+
+The dispatch rate limits configured through the above three approaches take 
effect in the following order of priority: "topic policies" > "namespace 
policies" > "broker configurations". For example, if you have configured the 
dispatch rate limit for a subscription using all three approaches, only the 
one configured through "topic policies" takes effect.
+
+:::
+
+### Throttling configurations
+
+The following table outlines the parameters that you can configure for message 
dispatch throttling in the `conf/broker.conf` file.
+
+Parameter | Description | Default value
+:---------|:------------|:-------------
+dispatchThrottlingRateInMsg | The total number of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInMsg` 
or `dispatchThrottlingRatePerSubscriptionInMsg`. | '-1', which means no limit.
+dispatchThrottlingRateInByte | The total byte size of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInByte` 
or `dispatchThrottlingRatePerSubscriptionInByte`. | '-1', which means no limit.
+ratePeriodInSecond | The length of the throttling period, in seconds. 
The counter is reset at the end of each period.<br />For example, to limit 
dispatch to 10,000 messages per minute, set `ratePeriodInSecond` to `60` and 
`dispatchThrottlingRateInMsg` to `10000`. 
| 1 (second)
+preciseDispatcherFlowControl | Whether to apply a precise control on the 
dispatch throttling. By default, it's disabled, which means the broker 
approximates `the number of messages to read from bookies` using the minimum 
value between the remaining `consumer.receiverQueueSize` (defaults to 1000) and 
`dispatcherMaxReadBatchSize` (defaults to 100).<br /><br />When it's set to 
`true`, the broker approximates $$the \ number \ of \ entries \ to \ read \ 
from \ bookies$$ through the following  [...]
+dispatchThrottlingOnBatchMessageEnabled | Whether to count messages by entry 
(batch). By default, it's disabled.<br /><br />Note that setting it to `true` 
may lead to an inaccurate approximation of total message count but maximize 
Pulsar's throughput while keeping stable read requests to the bookies. For 
example, assume you've set the rate limit to `10/s`, if you set 
`dispatchThrottlingOnBatchMessageEnabled` to `true`, the broker only reads 10 
entries and delivers them to the client per  [...]
+dispatchThrottlingOnNonBacklogConsumerEnabled | Whether the dispatch 
throttling on non-backlog consumers is enabled. By default, it's enabled.<br 
/>When it is set to `false`:<br /><li>If all the consumers in one subscription 
have no backlog, the message dispatch throttling is turned off automatically 
even if `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` are 
configured.</li><li>If at least one consumer has a backlog, the throttling is 
turned on automatically.</li> | true
+
+:::note
+
+- You can use `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` 
simultaneously (logical AND).
+- Ensure that only one of `preciseDispatcherFlowControl` and 
`dispatchThrottlingOnBatchMessageEnabled` is enabled at a time because they are 
mutually exclusive. Both parameters can be used to mitigate the over-delivery 
issues (see [Limitations](#limitations)). The difference between them is:
+  - When `preciseDispatcherFlowControl` is enabled, Pulsar considers the 
number of messages per entry. This parameter takes effect when the broker reads 
entries from the bookies.
+  - When `dispatchThrottlingOnBatchMessageEnabled` is enabled, Pulsar ignores 
the number of messages per entry. This parameter takes effect when the broker 
updates the counter after sending messages to the client.
+
+:::
+
+## Limitations
+
+Message dispatch throttling may over-deliver messages per unit of time 
for the following reasons:
+
+1. **The broker may read more entries or bytes from the bookies than the 
throttling limit.**
+
+   a) **The byte size of messages delivered to the client may exceed the 
configured threshold.**
+   
+     When you set the dispatch rate limit in bytes/throttling-period 
(`dispatchThrottlingRateInByte`/`ratePeriodInSecond`), the broker calculates 
$$the \ number \ of \ entries \ to \ read \ from \ bookies$$ in one throttling 
period through the following equation:
+   
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ byte \ size \ to \ read} \over{The \ average \ byte \ size \ per \ entry}}
+     $$
+
+     By controlling $$the \ number \ of \ entries \ to \ read \ from \ 
bookies$$, the broker attempts to limit `the total byte size to read` below 
`the dispatch rate` within each throttling period. It reads messages from the 
bookies in the unit of `entry` and approximates the bytes of the next entry to 
read because it does not know the exact byte size of each entry before reading 
it.
+
+     The broker uses the following two metrics to get the average byte size 
per entry:
+
+      * Average publish size (`brk_ml_EntrySizeBuckets`): the average byte 
size per entry stored in the bookies when the broker receives a publish request.
+
+      * Average dispatch size (`entriesReadSize`/`entriesReadCount`): the 
average byte size per entry read from bookies, that is, the average byte size 
per entry sent to the client.
+      
+     The broker uses the average publish size in preference to the average 
dispatch size. If the average publish size is unavailable, it uses the 
average dispatch size. When neither metric is available, the broker 
reads only one entry at the first attempt.
+
+   b) **The number of messages delivered to the client may exceed the 
configured threshold.**
+     
+     When you set the dispatch rate limit in message-count/throttling-period 
(`dispatchThrottlingRateInMsg`/`ratePeriodInSecond`) and batching 
(`batch-send`) is enabled, the broker counts an entry as one message (regardless 
of the message count per entry) and calculates $$the \ number \ of \ entries \ to 
\ read \ from \ bookies$$ through the following equation:
+      
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ number \ of \ messages \ to \ read} \over{The \ average \ message \ count \ 
per \ entry} \ (=1)}
+     $$
+
+     Since a batched entry can contain multiple messages, the number of messages 
delivered to the client is always greater than or equal to the configured threshold.
+
+   **Workaround**
+
+   Configuring `preciseDispatcherFlowControl` or 
`dispatchThrottlingOnBatchMessageEnabled` can mitigate the over-delivery issue. 
For example, turning on `preciseDispatcherFlowControl` can mitigate the 
limitation by pre-decrementing the quota using the approximated average message 
count per entry. See [Throttling configurations](#throttling-configurations) 
for more details.
+
+2. **Concurrent throttling processes may not decrease the quota in a timely 
manner.**
+
+   As introduced in [How it works](#how-it-works), the dispatch throttling 
process is `1.get remaining quota` $$\to$$ `2.load data` $$\to$$ `3.decrease 
quota`. 
+   
+   When two processes "dispatch replay messages (process-R)" and "dispatch 
non-replay messages (process-N)" in the same subscription are executed 
concurrently, their throttling processes can be interwoven in this order: 
+
+     1) process-R: `1.get remaining quota`
+
+     2) process-R: `2.load data`
+
+     3) process-N: `1.get remaining quota`
+
+     4) process-N: `2.load data`
+
+     5) process-R: `3.decrease quota`
+
+     6) process-N: `3.decrease quota`
+
+   As a result, the total number of dispatched messages may exceed the quota.
+
+   :::note
+
+   When over-delivery happens and the delivered message count exceeds the 
quota in the current period, the quota for the next period is reduced 
accordingly. For example, if the rate limit is set to `10/s` and `11` messages 
have been delivered to the client in the first period, then only up to `9` 
messages can be delivered to the client in the next period; if `30` messages have 
been delivered in one period, the count of messages to deliver in the next 
two periods is `0`.
+
+   ![An example of over-delivery occurred within a throttling 
period](/assets/throttling-limitation.svg)
+
+   :::
\ No newline at end of file
diff --git a/versioned_docs/version-2.9.x/admin-api-namespaces.md 
b/versioned_docs/version-2.9.x/admin-api-namespaces.md
index b90c76326dd..ec35a87a4d1 100644
--- a/versioned_docs/version-2.9.x/admin-api-namespaces.md
+++ b/versioned_docs/version-2.9.x/admin-api-namespaces.md
@@ -966,7 +966,7 @@ $ pulsar-admin namespaces set-subscription-dispatch-rate 
test-tenant/ns1 \
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|POST|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/setSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
@@ -1011,7 +1011,7 @@ $ pulsar-admin namespaces get-subscription-dispatch-rate 
test-tenant/ns1
 </TabItem>
 <TabItem value="REST API">
 
-{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getDispatchRate?version=@pulsar:version_number@}
+{@inject: 
endpoint|GET|/admin/v2/namespaces/:tenant/:namespace/subscriptionDispatchRate|operation/getSubscriptionDispatchRate?version=@pulsar:version_number@}
 
 </TabItem>
 <TabItem value="Java">
diff --git a/versioned_docs/version-2.9.x/concepts-throttling.md 
b/versioned_docs/version-2.9.x/concepts-throttling.md
new file mode 100644
index 00000000000..b0e89441044
--- /dev/null
+++ b/versioned_docs/version-2.9.x/concepts-throttling.md
@@ -0,0 +1,167 @@
+---
+id: concepts-throttling
+title: Message dispatch throttling
+sidebar_label: "Message throttling"
+---
+
+## Overview
+
+### What is message dispatch throttling?
+
+Large message payloads can cause memory usage spikes that lead to performance 
decreases. Pulsar adopts a rate-limit throttling mechanism for message 
dispatch, avoiding a traffic surge and improving message deliverability. You 
can set a threshold to limit the number of messages and the byte size of 
entries that can be delivered to clients, blocking subsequent deliveries when 
the traffic per unit of time exceeds the threshold.
+
+For example, if you set the dispatch rate limit to 10 messages per 
second, at most 10 messages can be delivered to the client per second.
+
+![Rate-limit dispatch throttling](/assets/throttling-dispatch.svg 'message 
throttling')
+
+### Why use it?
+
+Message dispatch throttling brings the following benefits:
+
+- **Limit broker's read request loads to BookKeeper**
+
+  Messages are persistently stored in the BookKeeper cluster. If a large 
number of read requests cannot be fulfilled using the cached data, the 
BookKeeper cluster may become too busy to respond, and the broker's I/O or CPU 
resources can be fully occupied. Using the message dispatch throttling feature 
can regulate the data flow to limit the broker’s read request loads to 
BookKeeper.
+
+- **Balance the allocation of broker's hardware resources at 
topic/subscription levels**
+
+  A broker instance serves multiple topics at one time. If a topic is 
overloaded with requests, it will occupy almost all of the I/O, CPU, and memory 
resources of the broker, preventing other topics from being read. Using the message 
dispatch throttling feature can limit the allocation of broker’s hardware 
resources across topics.
+
+- **Limit the allocation of client's hardware resources at topic/subscription 
levels**
+
+  When there is a large backlog of messages to consume, clients may receive a 
large amount of data in a short period of time, which monopolizes their 
computing resources. Since the client has no mechanisms to proactively limit 
the consumption rate, using the message dispatch throttling feature can also 
regulate the allocation of the client's hardware resources.
+
+### How it works?
+
+The process of message dispatch throttling can be divided into the following 
steps:
+1. The broker approximates the number of entries to read from the bookies by 
calculating the remaining quota. 
+2. The broker reads the messages from the bookies.
+3. The broker dispatches the messages to the client and updates the counter to 
decrease the quota. A scheduled task refreshes the quota when a throttling 
period ends.
+
+:::note
+
+- The quota cannot be decreased before step 3, because the broker doesn't know 
the actual number of messages per entry or the actual entry size until it reads 
the data.
+- Operations like `seek` or `redeliver` may deliver messages to a client 
multiple times. The broker counts them as different messages and updates the 
counter.
+
+:::
+
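The three steps above can be sketched as a small per-period counter. This is a simplified illustration, not Pulsar's implementation: the names `DispatchQuota` and `dispatch_once` are hypothetical, each entry is assumed to hold exactly one message, and an in-memory list stands in for reads from the bookies.

```python
class DispatchQuota:
    """Hypothetical per-period quota counter (not a Pulsar class)."""

    def __init__(self, limit_per_period: int):
        self.limit = limit_per_period
        self.used = 0

    def remaining(self) -> int:
        # Step 1: the remaining quota bounds how much the broker reads.
        return max(self.limit - self.used, 0)

    def consume(self, delivered: int) -> None:
        # Step 3: the counter is updated only after the data is dispatched.
        self.used += delivered

    def reset(self) -> None:
        # Stands in for the scheduled task that refreshes the quota
        # when a throttling period ends.
        self.used = 0


def dispatch_once(quota: DispatchQuota, backlog: list) -> list:
    to_read = min(quota.remaining(), len(backlog))    # step 1: get remaining quota
    batch = [backlog.pop(0) for _ in range(to_read)]  # step 2: load data
    quota.consume(len(batch))                         # step 3: decrease quota
    return batch
```

With a limit of 10 and a backlog of 25, the first call dispatches 10 messages, and further calls return nothing until `reset()` runs at the end of the period.
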
+## Concepts
+
+### Throttling levels
+
+The following table outlines the three levels at which you can throttle message 
dispatch.
+
+Level | Description
+:-----|:------------
+Per broker | All subscriptions in a single broker share the quota.
+Per topic | All subscriptions in the same topic share the quota.<br /><li>If 
it's a non-partitioned topic, the quota equals the maximum number of messages 
the topic can deliver per unit of time.</li><li>If a topic has multiple 
partitions, the quota refers to the maximum number of messages each partition 
can deliver per unit of time. In other words, the actual dispatch rate limit of 
a [partitioned topic](concepts-messaging.md#partitioned-topics) is N times the 
configured one (N is the num [...]
+Per subscription | <li>If it's a non-partitioned topic, the rate limit refers 
to the maximum number of messages a subscription can deliver to clients per 
unit of time.</li><li>If the subscribed topic has multiple partitions, the rate 
limit refers to the maximum number of messages the subscription can deliver per 
partition per unit of time. In other words, a subscription's actual dispatch 
rate limit for a [partitioned topic](concepts-messaging.md#partitioned-topics) 
is N times the configu [...]
+
+:::note
+
+The dispatch rate limits configured at multiple levels take effect 
simultaneously (logical AND).
+
+:::
+
+### Throttling approaches
+
+The following table outlines multiple approaches to configure the dispatch 
rate limits at different levels.
+
+Approach | Per cluster | Per topic | Per subscription
+:--------|:------------|:----------|:----------------
+Set [broker configurations](#throttling-configurations) or [dynamic broker 
configurations](admin-api-brokers.md#dynamic-broker-configuration) | 
<li>`dispatchThrottlingRateInMsg`</li><li>`dispatchThrottlingRateInByte`</li> | 
<li>`dispatchThrottlingRatePerTopicInMsg`</li><li>`dispatchThrottlingRatePerTopicInByte`</li><br
 />It applies to all topics in the cluster. | 
<li>`dispatchThrottlingRatePerSubscriptionInMsg`</li><li>`dispatchThrottlingRatePerSubscriptionInByte`</li><br
 />It applies to [...]
+Set namespace policies | N/A | Refer to [Configure dispatch throttling for 
topics](admin-api-namespaces.md#configure-dispatch-throttling-for-topics). | 
Refer to [Configure dispatch throttling for 
subscriptions](admin-api-namespaces.md#configure-dispatch-throttling-for-subscription).
+Set topic policies | N/A | Refer to [Set topic-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setDispatchRate).
 | Refer to [Set subscription-level dispatch 
rate](pathname:///admin-rest-api/?version=@pulsar:version_number@/#operation/PersistentTopics_setSubscriptionDispatchRate).<br
 />It applies to all subscriptions in a topic.
+
+:::note
+
+The dispatch rate limits configured through the above three approaches take 
effect in the following order of priority: "topic policies" > "namespace policies" > 
"broker configurations". For example, if you have configured the dispatch rate 
limit for a subscription using all three approaches, only the one 
configured through "topic policies" takes effect.
+
+:::
+
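The priority rule in the note above can be illustrated with a short sketch. The function `effective_dispatch_rate` is hypothetical, and representing an unconfigured level as `None` is an assumption made for this example only; it does not reflect how the broker stores these settings.

```python
from typing import Optional


def effective_dispatch_rate(topic_policy: Optional[int],
                            namespace_policy: Optional[int],
                            broker_config: Optional[int]) -> Optional[int]:
    """Return the rate limit from the highest-priority level that is set."""
    # Priority: topic policies > namespace policies > broker configurations.
    for rate in (topic_policy, namespace_policy, broker_config):
        if rate is not None:
            return rate
    # Nothing is configured at any level.
    return None
```

For instance, if the three levels are set to 50, 200, and 1000 respectively, the sketch returns 50; if only the namespace and broker levels are set, it returns 200.
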
+### Throttling configurations
+
+The following table outlines the parameters that you can configure for message 
dispatch throttling in the `conf/broker.conf` file.
+
+Parameter | Description | Default value
+:---------|:------------|:-------------
+dispatchThrottlingRateInMsg | The total number of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInMsg` 
or `dispatchThrottlingRatePerSubscriptionInMsg`. | '-1', which means no limit.
+dispatchThrottlingRateInByte | The total byte size of messages that can be 
delivered per cluster per throttling period.<br /><br />To set the topic-level 
or the subscription-level one, configure `dispatchThrottlingRatePerTopicInByte` 
or `dispatchThrottlingRatePerSubscriptionInByte`. | '-1', which means no limit.
+ratePeriodInSecond | The period of time for dispatch throttling (in seconds). 
The counter is reset at the end of the period.<br />For example, to 
configure a rate limit of `10,000 messages per minute`, set 
`ratePeriodInSecond` to `60` and `dispatchThrottlingRateInMsg` to `10000`. 
| 1 (second)
+preciseDispatcherFlowControl | Whether to apply a precise control on the 
dispatch throttling. By default, it's disabled, which means the broker 
approximates `the number of messages to read from bookies` using the minimum 
value between the remaining `consumer.receiverQueueSize` (defaults to 1000) and 
`dispatcherMaxReadBatchSize` (defaults to 100).<br /><br />When it's set to 
`true`, the broker approximates $$the \ number \ of \ entries \ to \ read \ 
from \ bookies$$ through the following  [...]
+dispatchThrottlingOnBatchMessageEnabled | Whether to count messages by entry 
(batch). By default, it's disabled.<br /><br />Note that setting it to `true` 
may lead to an inaccurate approximation of total message count but maximize 
Pulsar's throughput while keeping stable read requests to the bookies. For 
example, assume you've set the rate limit to `10/s`, if you set 
`dispatchThrottlingOnBatchMessageEnabled` to `true`, the broker only reads 10 
entries and delivers them to the client per  [...]
+dispatchThrottlingOnNonBacklogConsumerEnabled | Whether the dispatch 
throttling on non-backlog consumers is enabled. By default, it's enabled.<br 
/>When it is set to `false`:<br /><li>If all the consumers in one subscription 
have no backlog, the message dispatch throttling is turned off automatically 
even if `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` are 
configured.</li><li>If at least one consumer has a backlog, the throttling is 
turned on automatically.</li> | true
+
+:::note
+
+- You can use `dispatchThrottlingRateInMsg` and `dispatchThrottlingRateInByte` 
simultaneously (logical AND).
+- Ensure that only one of `preciseDispatcherFlowControl` and 
`dispatchThrottlingOnBatchMessageEnabled` is enabled at a time because they are 
mutually exclusive. Both parameters can be used to mitigate the over-delivery 
issues (see [Limitations](#limitations)). The difference between them is:
+  - When `preciseDispatcherFlowControl` is enabled, Pulsar considers the 
number of messages per entry. This parameter takes effect when the broker reads 
entries from the bookies.
+  - When `dispatchThrottlingOnBatchMessageEnabled` is enabled, Pulsar ignores 
the number of messages per entry. This parameter takes effect when the broker 
updates the counter after sending messages to the client.
+
+:::
+
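The difference in how the counter is updated after dispatch can be illustrated with a short sketch. The function `quota_charge` and its arguments are hypothetical; the sketch models only the counting rule described in the note above (count per entry versus count per message), not the broker's read path.

```python
from typing import List


def quota_charge(messages_per_entry: List[int], count_by_entry: bool) -> int:
    """Return how much quota a dispatched batch of entries consumes.

    messages_per_entry lists the number of messages in each dispatched entry.
    count_by_entry mirrors dispatchThrottlingOnBatchMessageEnabled=true.
    """
    if count_by_entry:
        # Each entry (batch) counts as one, whatever it contains.
        return len(messages_per_entry)
    # Otherwise every message inside every entry is counted.
    return sum(messages_per_entry)
```

For 10 dispatched entries that each hold 5 messages, counting by entry charges 10 against the quota, while counting by message charges 50.
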
+## Limitations
+
+Message dispatch throttling may over-deliver messages per unit of time 
for the following reasons:
+
+1. **The broker may read more entries or bytes from the bookies than the 
throttling limit.**
+
+   a) **The byte size of messages delivered to the client may exceed the 
configured threshold.**
+   
+     When you set the dispatch rate limit in bytes/throttling-period 
(`dispatchThrottlingRateInByte`/`ratePeriodInSecond`), the broker calculates 
$$the \ number \ of \ entries \ to \ read \ from \ bookies$$ in one throttling 
period through the following equation:
+   
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ byte \ size \ to \ read} \over{The \ average \ byte \ size \ per \ entry}}
+     $$
+
+     By controlling $$the \ number \ of \ entries \ to \ read \ from \ 
bookies$$, the broker attempts to limit `the total byte size to read` below 
`the dispatch rate` within each throttling period. It reads messages from the 
bookies in the unit of `entry` and approximates the bytes of the next entry to 
read because it does not know the exact byte size of each entry before reading 
it.
+
+     The broker uses the following two metrics to get the average byte size 
per entry:
+
+      * Average publish size (`brk_ml_EntrySizeBuckets`): the average byte 
size per entry stored in the bookies when the broker receives a publish request.
+
+      * Average dispatch size (`entriesReadSize`/`entriesReadCount`): the 
average byte size per entry read from bookies, that is, the average byte size 
per entry sent to the client.
+      
+     The broker uses the average publish size in preference to the average 
dispatch size. If the average publish size is unavailable, it uses the 
average dispatch size. When neither metric is available, the broker 
reads only one entry at the first attempt.
+
+   b) **The number of messages delivered to the client may exceed the 
configured threshold.**
+     
+     When you set the dispatch rate limit in message-count/throttling-period 
(`dispatchThrottlingRateInMsg`/`ratePeriodInSecond`) and batching 
(`batch-send`) is enabled, the broker counts an entry as one message (regardless 
of the message count per entry) and calculates $$the \ number \ of \ entries \ to 
\ read \ from \ bookies$$ through the following equation:
+      
+     $$
+     The \ number \ of \ entries \ to \ read \ from \ bookies = {{The \ total 
\ number \ of \ messages \ to \ read} \over{The \ average \ message \ count \ 
per \ entry} \ (=1)}
+     $$
+
+     Since a batched entry can contain multiple messages, the number of messages 
delivered to the client is always greater than or equal to the configured threshold.
+
+   **Workaround**
+
+   Configuring `preciseDispatcherFlowControl` or 
`dispatchThrottlingOnBatchMessageEnabled` can mitigate the over-delivery issue. 
For example, turning on `preciseDispatcherFlowControl` can mitigate the 
limitation by pre-decrementing the quota using the approximated average message 
count per entry. See [Throttling configurations](#throttling-configurations) 
for more details.
+
+2. **Concurrent throttling processes may not decrease the quota in a timely 
manner.**
+
+   As introduced in [How it works](#how-it-works), the dispatch throttling 
process is `1.get remaining quota` $$\to$$ `2.load data` $$\to$$ `3.decrease 
quota`. 
+   
+   When two processes "dispatch replay messages (process-R)" and "dispatch 
non-replay messages (process-N)" in the same subscription are executed 
concurrently, their throttling processes can be interwoven in this order: 
+
+     1) process-R: `1.get remaining quota`
+
+     2) process-R: `2.load data`
+
+     3) process-N: `1.get remaining quota`
+
+     4) process-N: `2.load data`
+
+     5) process-R: `3.decrease quota`
+
+     6) process-N: `3.decrease quota`
+
+   As a result, the total number of dispatched messages may exceed the quota.
+
+   :::note
+
+   When over-delivery happens and the delivered message count exceeds the 
quota in the current period, the quota for the next period is reduced 
accordingly. For example, if the rate limit is set to `10/s` and `11` messages 
have been delivered to the client in the first period, then only up to `9` 
messages can be delivered to the client in the next period; if `30` messages have 
been delivered in one period, the count of messages to deliver in the next 
two periods is `0`.
+
+   ![An example of over-delivery occurred within a throttling 
period](/assets/throttling-limitation.svg)
+
+   :::
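
Two points from the limitations above can be sketched in code: the estimate of how many entries to read under a byte-based limit, and the way an over-delivery reduces the quota of the following periods. Both functions are hypothetical illustrations rather than broker code. Rounding down in the estimate, using `None` for an unavailable metric, and assuming no further over-delivery in later periods are assumptions of this sketch.

```python
from typing import List, Optional


def estimate_entries_to_read(bytes_to_read: int,
                             avg_publish_size: Optional[int],
                             avg_dispatch_size: Optional[int]) -> int:
    """Estimate how many entries fit into the byte quota left to read."""
    # Prefer the average publish size; fall back to the average dispatch size.
    avg_entry_size = avg_publish_size or avg_dispatch_size
    if not avg_entry_size:
        # Neither metric is available: read a single entry at the first attempt.
        return 1
    # Rounding down is an assumption of this sketch.
    return bytes_to_read // avg_entry_size


def quotas_after_overdelivery(limit: int, delivered: int, periods: int) -> List[int]:
    """Quota available in each following period after one period's delivery."""
    deficit = max(delivered - limit, 0)
    quotas = []
    for _ in range(periods):
        # The deficit is paid back out of the next periods' quota.
        quotas.append(max(limit - deficit, 0))
        deficit = max(deficit - limit, 0)
    return quotas
```

With a limit of `10` per period, `quotas_after_overdelivery(10, 11, 2)` yields a quota of 9 and then 10, and `quotas_after_overdelivery(10, 30, 3)` yields 0, 0, and then 10, which matches the example in the note above.
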
\ No newline at end of file
diff --git a/versioned_sidebars/version-2.10.x-sidebars.json 
b/versioned_sidebars/version-2.10.x-sidebars.json
index 68681d06f3f..3ff9b896718 100644
--- a/versioned_sidebars/version-2.10.x-sidebars.json
+++ b/versioned_sidebars/version-2.10.x-sidebars.json
@@ -62,6 +62,10 @@
           "type": "doc",
           "id": "version-2.10.x/concepts-topic-compaction"
         },
+        {
+          "type": "doc",
+          "id": "version-2.10.x/concepts-throttling"
+        },
         {
           "type": "doc",
           "id": "version-2.10.x/concepts-proxy-sni-routing"
diff --git a/versioned_sidebars/version-2.11.x-sidebars.json 
b/versioned_sidebars/version-2.11.x-sidebars.json
index 3225dcb6747..81ed56bac6a 100644
--- a/versioned_sidebars/version-2.11.x-sidebars.json
+++ b/versioned_sidebars/version-2.11.x-sidebars.json
@@ -30,6 +30,7 @@
         "concepts-multi-tenancy",
         "concepts-authentication",
         "concepts-topic-compaction",
+        "concepts-throttling",
         "concepts-proxy-sni-routing",
         "concepts-multiple-advertised-listeners"
       ]
diff --git a/versioned_sidebars/version-2.8.x-sidebars.json 
b/versioned_sidebars/version-2.8.x-sidebars.json
index 59e6d115f60..8435529c5bb 100644
--- a/versioned_sidebars/version-2.8.x-sidebars.json
+++ b/versioned_sidebars/version-2.8.x-sidebars.json
@@ -62,6 +62,10 @@
           "type": "doc",
           "id": "version-2.8.x/concepts-topic-compaction"
         },
+        {
+          "type": "doc",
+          "id": "version-2.8.x/concepts-throttling"
+        },
         {
           "type": "doc",
           "id": "version-2.8.x/concepts-proxy-sni-routing"
diff --git a/versioned_sidebars/version-2.9.x-sidebars.json 
b/versioned_sidebars/version-2.9.x-sidebars.json
index 5d8c59b1ec8..be11cffec4a 100644
--- a/versioned_sidebars/version-2.9.x-sidebars.json
+++ b/versioned_sidebars/version-2.9.x-sidebars.json
@@ -62,6 +62,10 @@
           "type": "doc",
           "id": "version-2.9.x/concepts-topic-compaction"
         },
+        {
+          "type": "doc",
+          "id": "version-2.9.x/concepts-throttling"
+        },
         {
           "type": "doc",
           "id": "version-2.9.x/concepts-proxy-sni-routing"
