This is an automated email from the ASF dual-hosted git repository.
liuyu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git
The following commit(s) were added to refs/heads/master by this push:
new d5d8923265b [feature][doc] Add docs for failure domain + anti-affinity
namespace (#16069)
d5d8923265b is described below
commit d5d8923265b3a4b9731892879b2f2b9dc48c0f98
Author: momo-jun <[email protected]>
AuthorDate: Mon Jun 20 20:11:00 2022 +0800
[feature][doc] Add docs for failure domain + anti-affinity namespace
(#16069)
---
site2/docs/administration-load-balance.md | 277 +++++++++++++--------
...-affinity-namespaces-across-failure-domains.svg | 1 +
site2/docs/concepts-architecture-overview.md | 2 +-
site2/docs/reference-terminology.md | 9 +
.../version-2.10.0/administration-load-balance.md | 276 ++++++++++++--------
.../version-2.10.0/reference-terminology.md | 8 +
.../version-2.8.0/administration-load-balance.md | 56 +++++
.../version-2.8.0/reference-terminology.md | 8 +
.../version-2.8.1/administration-load-balance.md | 56 +++++
.../version-2.8.1/reference-terminology.md | 8 +
.../version-2.8.2/administration-load-balance.md | 56 +++++
.../version-2.8.2/reference-terminology.md | 8 +
.../version-2.8.3/administration-load-balance.md | 56 +++++
.../version-2.8.3/reference-terminology.md | 8 +
.../version-2.9.0/administration-load-balance.md | 56 +++++
.../version-2.9.0/reference-terminology.md | 8 +
.../version-2.9.1/administration-load-balance.md | 56 +++++
.../version-2.9.1/reference-terminology.md | 8 +
.../version-2.9.2/administration-load-balance.md | 56 +++++
.../version-2.9.2/reference-terminology.md | 8 +
20 files changed, 806 insertions(+), 215 deletions(-)
diff --git a/site2/docs/administration-load-balance.md
b/site2/docs/administration-load-balance.md
index a9fd48348d9..f581c031cb4 100644
--- a/site2/docs/administration-load-balance.md
+++ b/site2/docs/administration-load-balance.md
@@ -1,110 +1,81 @@
---
id: administration-load-balance
-title: Pulsar load balance
+title: Load balance across brokers
sidebar_label: "Load balance"
---
-## Load balance across Pulsar brokers
-Pulsar is an horizontally scalable messaging system, so the traffic in a
logical cluster must be balanced across all the available Pulsar brokers as
evenly as possible, which is a core requirement.
+Pulsar is a horizontally scalable messaging system, so the traffic in a
logical cluster must be balanced across all the available Pulsar brokers as
evenly as possible, which is a core requirement.
-You can use multiple settings and tools to control the traffic distribution
which require a bit of context to understand how the traffic is managed in
Pulsar. Though, in most cases, the core requirement mentioned above is true out
of the box and you should not worry about it.
+You can use multiple settings and tools to control the traffic distribution,
but using them requires some context on how traffic is managed in Pulsar. In
most cases, though, the core requirement mentioned above is met out of the box
and you do not need to worry about it.
-## Pulsar load manager architecture
+The following sections introduce how the load-balanced assignments work across
Pulsar brokers and how you can leverage the framework to adjust them.
-The following part introduces the basic architecture of the Pulsar load
manager.
+## Dynamic assignments
-### Assign topics to brokers dynamically
+Topics are dynamically assigned to brokers based on the load conditions of all
brokers in the cluster. The assignment of topics to brokers is not done at the
topic level but at the **bundle** level (a higher level). Instead of individual
topic assignments, each broker takes ownership of a subset of the topics for a
namespace. This subset is called a bundle and effectively this subset is a
sharding mechanism.
-Topics are dynamically assigned to brokers based on the load conditions of all
brokers in the cluster.
+In other words, each namespace is an "administrative" unit and sharded into a
list of bundles, with each bundle comprising a portion of the overall hash
range of the namespace. Topics are assigned to a particular bundle by taking
the hash of the topic name and checking in which bundle the hash falls. Each
bundle is independent of the others and thus is independently assigned to
different brokers.
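The hash-range lookup described above can be sketched as follows. This is a minimal illustration, not Pulsar's implementation: the 32-bit hash space, the use of CRC32 as the hash function, and the names `bundle_boundaries` and `bundle_for_topic` are all assumptions made for the example.

```python
import bisect
import zlib

FULL_RANGE = 2**32  # assumed size of the namespace hash space


def bundle_boundaries(num_bundles):
    """Split [0, FULL_RANGE) into num_bundles contiguous, near-equal ranges."""
    return [i * FULL_RANGE // num_bundles for i in range(num_bundles + 1)]


def bundle_for_topic(topic_name, boundaries):
    """Return the index of the bundle whose hash range contains the topic."""
    # CRC32 is only a stand-in hash here; it yields a value in [0, 2**32).
    topic_hash = zlib.crc32(topic_name.encode("utf-8"))
    # Bundle i covers [boundaries[i], boundaries[i + 1]).
    return bisect.bisect_right(boundaries, topic_hash) - 1


boundaries = bundle_boundaries(4)
index = bundle_for_topic("persistent://my-tenant/my-namespace/my-topic", boundaries)
print(f"topic falls into bundle {index} of {len(boundaries) - 1}")
```

Because the lookup depends only on the topic name and the bundle boundaries, every broker resolves the same topic to the same bundle without tracking topics individually.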
-When a client starts using new topics that are not assigned to any broker, a
process is triggered to choose the best suited broker to acquire ownership of
these topics according to the load conditions.
+The benefit of the assignment granularity is to amortize the amount of
information that you need to keep track of. Based on CPU, memory, traffic load,
and other indexes, topics are assigned to a particular broker dynamically. For
example:
+* When a client starts using new topics that are not assigned to any broker, a
process is triggered to choose the best-suited broker to acquire ownership of
these topics according to the load conditions.
+* If the broker owning a topic becomes overloaded, the topic is reassigned to
a less-loaded broker.
+* If the broker owning a topic crashes, the topic is reassigned to another
active broker.
-In case of partitioned topics, different partitions are assigned to different
brokers. Here "topic" means either a non-partitioned topic or one partition of
a topic.
+:::tip
-The assignment is "dynamic" because the assignment changes quickly. For
example, if the broker owning the topic crashes, the topic is reassigned
immediately to another broker. Another scenario is that the broker owning the
topic becomes overloaded. In this case, the topic is reassigned to a less
loaded broker.
+For partitioned topics, different partitions are assigned to different
brokers. Here "topic" means either a non-partitioned topic or one partition of
a topic.
-The stateless nature of brokers makes the dynamic assignment possible, so you
can quickly expand or shrink the cluster based on usage.
+:::
-#### Assignment granularity
+## Create namespaces with assigned bundles
-The assignment of topics or partitions to brokers is not done at the topics or
partitions level, but done at the Bundle level (a higher level). The reason is
to amortize the amount of information that you need to keep track. Based on
CPU, memory, traffic load and other indexes, topics are assigned to a
particular broker dynamically.
+When you create a new namespace, a number of bundles are assigned to the
namespace. You can set this number in the `conf/broker.conf` file:
-Instead of individual topic or partition assignment, each broker takes
ownership of a subset of the topics for a namespace. This subset is called a
"*bundle*" and effectively this subset is a sharding mechanism.
+```conf
-The namespace is the "administrative" unit: many config knobs or operations
are done at the namespace level.
-
-For assignment, a namespaces is sharded into a list of "bundles", with each
bundle comprising a portion of overall hash range of the namespace.
-
-Topics are assigned to a particular bundle by taking the hash of the topic
name and checking in which bundle the hash falls into.
-
-Each bundle is independent of the others and thus is independently assigned to
different brokers.
-
-### Create namespaces and bundles
-
-When you create a new namespace, the new namespace sets to use the default
number of bundles. You can set this in `conf/broker.conf`:
-
-```properties
-
-# When a namespace is created without specifying the number of bundle, this
+# When a namespace is created without specifying the number of bundles, this
# value will be used as the default
defaultNumberOfNamespaceBundles=4
```
-You can either change the system default, or override it when you create a new
namespace:
+Alternatively, you can override the value when you create a new namespace
using [Pulsar admin](/tools/pulsar-admin/):
```shell
-$ bin/pulsar-admin namespaces create my-tenant/my-namespace --clusters us-west
--bundles 16
+bin/pulsar-admin namespaces create my-tenant/my-namespace --clusters us-west
--bundles 16
```
-With this command, you create a namespace with 16 initial bundles. Therefore
the topics for this namespaces can immediately be spread across up to 16
brokers.
+With the above command, you create a namespace with 16 initial bundles.
Therefore the topics for this namespace can immediately be spread across up to
16 brokers.
In general, if you know the expected traffic and number of topics in advance,
it is better to start with a reasonable number of bundles instead of waiting
for the system to auto-correct the distribution.
-On the same note, it is beneficial to start with more bundles than the number
of brokers, because of the hashing nature of the distribution of topics into
bundles. For example, for a namespace with 1000 topics, using something like 64
bundles achieves a good distribution of traffic across 16 brokers.
-
-### Unload topics and bundles
-
-You can "unload" a topic in Pulsar with admin operation. Unloading means to
close the topics, release ownership and reassign the topics to a new broker,
based on current load.
-
-When unloading happens, the client experiences a small latency blip, typically
in the order of tens of milliseconds, while the topic is reassigned.
-
-Unloading is the mechanism that the load-manager uses to perform the load
shedding, but you can also trigger the unloading manually, for example to
correct the assignments and redistribute traffic even before having any broker
overloaded.
+On the same note, it is beneficial to start with more bundles than the number
of brokers, due to the hashing nature of the distribution of topics into
bundles. For example, for a namespace with 1000 topics, using something like 64
bundles achieves a good distribution of traffic across 16 brokers.
-Unloading a topic has no effect on the assignment, but just closes and reopens
the particular topic:
-```shell
+## Split namespace bundles
-pulsar-admin topics unload persistent://tenant/namespace/topic
+Since the load for the topics in a bundle might change over time and
predicting the load might be hard, bundle split is designed to resolve these
challenges. The broker splits a bundle into two and the new smaller bundles can
be reassigned to different brokers.
-```
+Pulsar supports the following two bundle split algorithms:
+* `range_equally_divide`: split the bundle into two parts with the same hash
range size.
+* `topic_count_equally_divide`: split the bundle into two parts with the same
number of topics.
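The difference between the two algorithms can be sketched as follows. This is a simplified illustration under stated assumptions, not the broker's code: a bundle is modeled as a half-open hash range `[lower, upper)`, topic hashes are assumed distinct, and both function names are hypothetical.

```python
def range_equally_divide(lower, upper):
    """Return the split boundary that halves the hash range [lower, upper)."""
    return lower + (upper - lower) // 2


def topic_count_equally_divide(topic_hashes):
    """Return a split boundary that leaves half of the topics on each side.

    Topics with a hash below the boundary go to the first new bundle,
    the remaining topics go to the second one.
    """
    if len(topic_hashes) < 2:
        raise ValueError("need at least two topics to split a bundle")
    ordered = sorted(topic_hashes)
    return ordered[len(ordered) // 2]


# Four topics whose hashes are skewed towards the low end of [0, 1000).
hashes = [10, 900, 20, 30]
for name, boundary in [
    ("range_equally_divide", range_equally_divide(0, 1000)),
    ("topic_count_equally_divide", topic_count_equally_divide(hashes)),
]:
    left = [h for h in hashes if h < boundary]
    right = [h for h in hashes if h >= boundary]
    print(name, boundary, len(left), len(right))
```

With skewed hashes like these, splitting by range leaves three topics in one bundle and one in the other, while splitting by topic count leaves two in each.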
-To unload all topics for a namespace and trigger reassignments:
+To enable bundle split, you need to configure the following settings in the
`broker.conf` file, and set `defaultNamespaceBundleSplitAlgorithm` based on
your needs.
-```shell
+```conf
-pulsar-admin namespaces unload tenant/namespace
+loadBalancerAutoBundleSplitEnabled=true
+loadBalancerAutoUnloadSplitBundlesEnabled=true
+defaultNamespaceBundleSplitAlgorithm=range_equally_divide
```
-### Split namespace bundles
-
-Since the load for the topics in a bundle might change over time and
predicting the load might be hard, bundle split is designed to deal with these
issues. The broker splits a bundle into two and the new smaller bundles can be
reassigned to different brokers.
+You can configure more parameters for splitting thresholds. Any existing
bundle that exceeds any of the thresholds is a candidate to be split. By
default, the newly split bundles are immediately reassigned to other brokers,
to facilitate the traffic distribution.
-The splitting is based on some tunable thresholds. Any existing bundle that
exceeds any of the threshold is a candidate to be split. By default the newly
split bundles are also immediately offloaded to other brokers, to facilitate
the traffic distribution.
-
-You can split namespace bundles in two ways, by setting
`supportedNamespaceBundleSplitAlgorithms` to `range_equally_divide` or
`topic_count_equally_divide` in `broker.conf` file. The former splits the
bundle into two parts with the same hash range size; the latter splits the
bundle into two parts with the same number of topics. You can also configure
other parameters for namespace bundles.
-
-```properties
-
-# enable/disable namespace bundle auto split
-loadBalancerAutoBundleSplitEnabled=true
-
-# enable/disable automatic unloading of split bundles
-loadBalancerAutoUnloadSplitBundlesEnabled=true
+```conf
# maximum topics in a bundle, otherwise bundle split will be triggered
loadBalancerNamespaceBundleMaxTopics=1000
@@ -123,94 +94,184 @@ loadBalancerNamespaceMaximumBundles=128
```
-### Shed load automatically
+## Shed load automatically
-The support for automatic load shedding is available in the load manager of
Pulsar. This means that whenever the system recognizes a particular broker is
overloaded, the system forces some traffic to be reassigned to less loaded
brokers.
+The support for automatic load shedding is available in the load manager of
Pulsar. This means that whenever the system recognizes a particular broker is
overloaded, the system forces some traffic to be reassigned to less-loaded
brokers.
When a broker is identified as overloaded, the broker is forced to "unload" a
subset of the bundles, the ones with higher traffic, that make up the
overload percentage.
-For example, the default threshold is 85% and if a broker is over quota at 95%
CPU usage, then the broker unloads the percent difference plus a 5% margin:
`(95% - 85%) + 5% = 15%`.
+For example, the default threshold is 85% and if a broker is over quota at 95%
CPU usage, then the broker unloads the percent difference plus a 5% margin:
`(95% - 85%) + 5% = 15%`. Given the selection of bundles to unload is based on
traffic (as a proxy measure for CPU, network, and memory), the broker unloads
bundles for at least 15% of traffic.
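The arithmetic in this example can be written out as a small helper. It is only a sketch of the calculation described above; the function name and its parameters are hypothetical, and the defaults (85% threshold, 5% margin) are taken from the example.

```python
def traffic_share_to_unload(usage_pct, threshold_pct=85.0, margin_pct=5.0):
    """Return the percentage of traffic an overloaded broker should shed."""
    if usage_pct <= threshold_pct:
        # The broker is not over quota, so nothing is unloaded.
        return 0.0
    # Percent difference above the threshold plus the safety margin.
    return (usage_pct - threshold_pct) + margin_pct


print(traffic_share_to_unload(95.0))  # (95 - 85) + 5 = 15.0
print(traffic_share_to_unload(80.0))  # below the threshold: 0.0
```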
-Given the selection of bundles to offload is based on traffic (as a proxy
measure for cpu, network and memory), broker unloads bundles for at least 15%
of traffic.
+:::tip
-The automatic load shedding is enabled by default and you can disable the
automatic load shedding with this setting:
+* The automatic load shedding is enabled by default. To disable it, you can
set `loadBalancerSheddingEnabled` to `false`.
+* Besides the automatic load shedding, you can [manually unload
bundles](#unload-topics-and-bundles).
-```properties
-
-# Enable/disable automatic bundle unloading for load-shedding
-loadBalancerSheddingEnabled=true
-
-```
+:::
Additional settings that apply to shedding:
-```properties
+```conf
# Load shedding interval. Broker periodically checks whether some traffic
should be offloaded from
# some over-loaded broker to other under-loaded brokers
loadBalancerSheddingIntervalMinutes=1
-# Prevent the same topics to be shed and moved to other brokers more that once
within this timeframe
+# Prevent the same topics from being shed and moved to other brokers more than
once within this timeframe
loadBalancerSheddingGracePeriodMinutes=30
```
-Pulsar supports the following types of shedding strategies. From Pulsar 2.10,
the **default** shedding strategy is `ThresholdShedder`.
+Pulsar supports the following types of automatic load shedding strategies.
+* [ThresholdShedder](#thresholdshedder)
+* [OverloadShedder](#overloadshedder)
+* [UniformLoadShedder](#uniformloadshedder)
-> **Note**<br />
-> You need to restart brokers if the shedding strategy is [dynamically
updated](admin-api-brokers.md/#dynamic-broker-configuration).
+:::note
-##### ThresholdShedder
-This strategy tends to shed the bundles if any broker's usage is above the
configured threshold. It does this by first computing the average resource
usage per broker for the whole cluster. The resource usage for each broker is
calculated using the following method:
LocalBrokerData#getMaxResourceUsageWithWeight. The weights for each resource
are configurable. Historical observations are included in the running average
based on the broker's setting for loadBalancerHistoryResourcePercentag [...]
-`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.ThresholdShedder`
+* From Pulsar 2.10, the **default** shedding strategy is `ThresholdShedder`.
+* You need to restart brokers if the shedding strategy is [dynamically
updated](admin-api-brokers.md/#dynamic-broker-configuration).
+
+:::
+
+### ThresholdShedder
+This strategy tends to shed the bundles if any broker's usage is above the
configured threshold. It does this by first computing the average resource
usage per broker for the whole cluster. The resource usage for each broker is
calculated using the following method
`LocalBrokerData#getMaxResourceUsageWithWeight`. Historical observations are
included in the running average based on the broker's setting for
`loadBalancerHistoryResourcePercentage`. Once the average resource usage is
calcula [...]

-##### OverloadShedder
-This strategy will attempt to shed exactly one bundle on brokers which are
overloaded, that is, whose maximum system resource usage exceeds
loadBalancerBrokerOverloadedThresholdPercentage. To see which resources are
considered when determining the maximum system resource. A bundle is
recommended for unloading off that broker if and only if the following
conditions hold: The broker has at least two bundles assigned and the broker
has at least one bundle that has not been unloaded recently [...]
-`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder`
+To use the `ThresholdShedder` strategy, configure brokers with this value.
+`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.ThresholdShedder`
+
+You can configure the weights for each resource per broker in the
`conf/broker.conf` file.
+
+```conf
+
+# The BandWithIn usage weight when calculating new resource usage.
+loadBalancerBandwithInResourceWeight=1.0
+
+# The BandWithOut usage weight when calculating new resource usage.
+loadBalancerBandwithOutResourceWeight=1.0
+
+# The CPU usage weight when calculating new resource usage.
+loadBalancerCPUResourceWeight=1.0
+
+# The heap memory usage weight when calculating new resource usage.
+loadBalancerMemoryResourceWeight=1.0
+
+# The direct memory usage weight when calculating new resource usage.
+loadBalancerDirectMemoryResourceWeight=1.0
+
+```
+
+### OverloadShedder
+This strategy attempts to shed exactly one bundle on brokers which are
overloaded, that is, whose maximum system resource usage exceeds
[`loadBalancerBrokerOverloadedThresholdPercentage`](#broker-overload-thresholds).
To see which resources are considered when determining the maximum system
resource. A bundle is recommended for unloading off that broker if and only if
the following conditions hold: The broker has at least two bundles assigned and
the broker has at least one bundle that h [...]

-##### UniformLoadShedder
-This strategy tends to distribute load uniformly across all brokers. This
strategy checks load difference between broker with highest load and broker
with lowest load. If the difference is higher than configured thresholds
`loadBalancerMsgRateDifferenceShedderThreshold` and
`loadBalancerMsgThroughputMultiplierDifferenceShedderThreshold` then it finds
out bundles which can be unloaded to distribute traffic evenly across all
brokers. Configure broker with below value to use this strategy.
-`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.UniformLoadShedder`
+To use the `OverloadShedder` strategy, configure brokers with this value.
+`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder`
+
+#### Broker overload thresholds
+
+Whether a broker is overloaded is determined by thresholds on CPU, network,
and memory usage. Whenever any of those metrics reaches its threshold, the
system triggers the shedding (if enabled).
+
+:::note
+
+The overload threshold `loadBalancerBrokerOverloadedThresholdPercentage` only
applies to the [`OverloadShedder`](#overloadshedder) shedding strategy. By
default, it is set to 85%.
+
+:::
+
+Pulsar gathers the CPU, network, and memory usage stats from the system
metrics. In some cases of network utilization, the network interface speed that
Linux reports is not correct and needs to be manually overridden. This is the
case in AWS EC2 instances with 1Gbps NIC speed for which the OS reports 10Gbps
speed.
+
+Because of the incorrect max speed, the load manager might think the broker
has not reached the NIC capacity, while in fact the broker already uses all the
bandwidth and the traffic is slowed down.
+
+You can set `loadBalancerOverrideBrokerNicSpeedGbps` in the `conf/broker.conf`
file to correct the max NIC speed. When the value is empty, Pulsar uses the
value that the OS reports.
+
+### UniformLoadShedder
+This strategy tends to distribute load uniformly across all brokers. This
strategy checks the load difference between the broker with the highest load
and the broker with the lowest load. If the difference is higher than
configured thresholds `loadBalancerMsgRateDifferenceShedderThreshold` and
`loadBalancerMsgThroughputMultiplierDifferenceShedderThreshold` then it finds
out bundles that can be unloaded to distribute traffic evenly across all
brokers.

-#### Broker overload thresholds
+To use the `UniformLoadShedder` strategy, configure brokers with this value.
+`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.UniformLoadShedder`
+
+## Unload topics and bundles
+
+You can "unload" a topic in Pulsar through manual admin operations. Unloading
means closing topics, releasing ownership, and reassigning topics to a new
broker, based on the current load.
+
+When unloading happens, the client experiences a small latency blip, typically
in the order of tens of milliseconds, while the topic is reassigned.
+
+Unloading is the mechanism that the load manager uses to perform the load
shedding, but you can also trigger the unloading manually, for example, to
correct the assignments and redistribute traffic even before having any broker
overloaded.
+
+Unloading a topic has no effect on the assignment, but just closes and reopens
the particular topic:
-The determinations of when a broker is overloaded is based on threshold of
CPU, network and memory usage. Whenever either of those metrics reaches the
threshold, the system triggers the shedding (if enabled).
+```shell
-By default, overload threshold is set at 85%:
+pulsar-admin topics unload persistent://tenant/namespace/topic
-```properties
+```
-# Usage threshold to determine a broker as over-loaded
-loadBalancerBrokerOverloadedThresholdPercentage=85
+To unload all topics for a namespace and trigger reassignments:
+
+```shell
+
+pulsar-admin namespaces unload tenant/namespace
```
-Pulsar gathers the usage stats from the system metrics.
+## Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to release rollout or brokers
restart), it only disrupts namespaces owned by that specific failure domain and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
-In case of network utilization, in some cases the network interface speed that
Linux reports is not correct and needs to be manually overridden. This is the
case in AWS EC2 instances with 1Gbps NIC speed for which the OS reports 10Gbps
speed.
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
-Because of the incorrect max speed, the Pulsar load manager might think the
broker has not reached the NIC capacity, while in fact the broker already uses
all the bandwidth and the traffic is slowed down.
+
-You can use the following setting to correct the max NIC speed:
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
even-distributed assignment sequence illustrated in the above figure.
-```properties
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
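The assignment sequence in the table can be reproduced with a simple greedy sketch. This is an illustration under stated assumptions, not the load manager's actual algorithm: it picks the failure domain that currently owns the fewest namespaces of the group, then the least-used broker in that domain, and breaks ties by listing order. The function name `assign_namespaces` and the data layout are hypothetical.

```python
def assign_namespaces(namespaces, domains):
    """Greedily spread anti-affinity namespaces across domains and brokers.

    domains maps a failure domain name to its ordered list of brokers.
    """
    per_domain = {domain: 0 for domain in domains}
    per_broker = {b: 0 for brokers in domains.values() for b in brokers}
    assignment = {}
    for namespace in namespaces:
        # min() returns the first minimum, so ties follow listing order.
        domain = min(domains, key=lambda d: per_domain[d])
        broker = min(domains[domain], key=lambda b: per_broker[b])
        per_domain[domain] += 1
        per_broker[broker] += 1
        assignment[namespace] = f"{domain}:{broker}"
    return assignment


domains = {
    "Domain1": ["Broker1", "Broker2"],
    "Domain2": ["Broker3", "Broker4"],
}
namespaces = ["Namespace1", "Namespace2", "Namespace3", "Namespace4"]
for namespace, owner in assign_namespaces(namespaces, domains).items():
    print(namespace, "->", owner)
```

Run against the two domains from the figure, this yields the same selected brokers as the table: each domain ends up with two namespaces and each broker with one.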
+
+:::tip
-# Override the auto-detection of the network interfaces max speed.
-# This option is useful in some environments (eg: EC2 VMs) where the max speed
-# reported by Linux is not reflecting the real bandwidth available to the
broker.
-# Since the network usage is employed by the load manager to decide when a
broker
-# is overloaded, it is important to make sure the info is correct or override
it
-# with the right value here. The configured value can be a double (eg: 0.8)
and that
-# can be used to trigger load-shedding even before hitting on NIC limits.
-loadBalancerOverrideBrokerNicSpeedGbps=
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name
<domain-name> --broker-list <broker-list-comma-separated>
```
-When the value is empty, Pulsar uses the value that the OS reports.
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group
<group-name>
+
+```
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
diff --git
a/site2/docs/assets/anti-affinity-namespaces-across-failure-domains.svg
b/site2/docs/assets/anti-affinity-namespaces-across-failure-domains.svg
new file mode 100644
index 00000000000..6abb48d1420
--- /dev/null
+++ b/site2/docs/assets/anti-affinity-namespaces-across-failure-domains.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:lucid="lucid" width="1546"
height="741"><g transform="translate(-154 -115)" lucid:page-tab-id="0_0"><path
d="M0 0h1870.87v1322.83H0z" fill="#fff"/><path d="M402.75 179.18a6 6 0 0 1
6-6h440.93a6 6 0 0 1 6 6v317.6a6 6 0 0 1-6 6H408.75a6 6 0 0 1-6-6z"
fill="#fff"/><path d="M404.25 179.18c0 .83-.67 1.5-1.5
1.5s-1.5-.67-1.5-1.5.67-1.5 1.5-1.5 1.5.67 1.5 1.5zm5.06-5.92c0 .82-.66 1.5-1.5
1.5-.82 0-1.5-.68-1 [...]
\ No newline at end of file
diff --git a/site2/docs/concepts-architecture-overview.md
b/site2/docs/concepts-architecture-overview.md
index 19e84f93eaa..1384b3ed4bf 100644
--- a/site2/docs/concepts-architecture-overview.md
+++ b/site2/docs/concepts-architecture-overview.md
@@ -8,7 +8,7 @@ At the highest level, a Pulsar instance is composed of one or
more Pulsar cluste
In a Pulsar cluster:
-* One or more brokers handles and [load
balances](administration-load-balance.md#load-balance-across-pulsar-brokers)
incoming messages from producers, dispatches messages to consumers,
communicates with the Pulsar configuration store to handle various coordination
tasks, stores messages in BookKeeper instances (aka bookies), relies on a
cluster-specific ZooKeeper cluster for certain tasks, and more.
+* One or more brokers handles and [load
balances](administration-load-balance.md) incoming messages from producers,
dispatches messages to consumers, communicates with the Pulsar configuration
store to handle various coordination tasks, stores messages in BookKeeper
instances (aka bookies), relies on a cluster-specific ZooKeeper cluster for
certain tasks, and more.
* A BookKeeper cluster consisting of one or more bookies handles [persistent
storage](#persistent-storage) of messages.
* A ZooKeeper cluster specific to that cluster handles coordination tasks
between Pulsar clusters.
diff --git a/site2/docs/reference-terminology.md
b/site2/docs/reference-terminology.md
index ebc114d86f7..8a5e7914964 100644
--- a/site2/docs/reference-terminology.md
+++ b/site2/docs/reference-terminology.md
@@ -97,6 +97,15 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.10.0/administration-load-balance.md
b/site2/website/versioned_docs/version-2.10.0/administration-load-balance.md
index 49e76e52995..3bb295d25cb 100644
--- a/site2/website/versioned_docs/version-2.10.0/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.10.0/administration-load-balance.md
@@ -1,111 +1,82 @@
---
id: administration-load-balance
-title: Pulsar load balance
+title: Load balance across brokers
sidebar_label: "Load balance"
original_id: administration-load-balance
---
-## Load balance across Pulsar brokers
-Pulsar is an horizontally scalable messaging system, so the traffic in a
logical cluster must be balanced across all the available Pulsar brokers as
evenly as possible, which is a core requirement.
+Pulsar is a horizontally scalable messaging system, so the traffic in a
logical cluster must be balanced across all the available Pulsar brokers as
evenly as possible, which is a core requirement.
-You can use multiple settings and tools to control the traffic distribution
which require a bit of context to understand how the traffic is managed in
Pulsar. Though, in most cases, the core requirement mentioned above is true out
of the box and you should not worry about it.
+You can use multiple settings and tools to control the traffic distribution,
but using them requires some context on how traffic is managed in Pulsar. In
most cases, though, the core requirement mentioned above is met out of the box
and you do not need to worry about it.
-## Pulsar load manager architecture
+The following sections introduce how the load-balanced assignments work across
Pulsar brokers and how you can leverage the framework to adjust them.
-The following part introduces the basic architecture of the Pulsar load
manager.
+## Dynamic assignments
-### Assign topics to brokers dynamically
+Topics are dynamically assigned to brokers based on the load conditions of all
brokers in the cluster. The assignment of topics to brokers is not done at the
topic level but at the **bundle** level (a higher level). Instead of individual
topic assignments, each broker takes ownership of a subset of the topics for a
namespace. This subset is called a bundle and effectively this subset is a
sharding mechanism.
-Topics are dynamically assigned to brokers based on the load conditions of all
brokers in the cluster.
+In other words, each namespace is an "administrative" unit and sharded into a
list of bundles, with each bundle comprising a portion of the overall hash
range of the namespace. Topics are assigned to a particular bundle by taking
the hash of the topic name and checking in which bundle the hash falls. Each
bundle is independent of the others and thus is independently assigned to
different brokers.
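
The topic-to-bundle lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not Pulsar's implementation: it assumes a 32-bit hash range split into equal-width bundles and uses CRC32 as a stand-in for whatever hash function the broker actually applies. The helper names `bundle_boundaries` and `bundle_for_topic` are hypothetical.

```python
import zlib
from bisect import bisect_right

HASH_SPACE = 2**32  # assumed size of the namespace hash range


def bundle_boundaries(num_bundles: int) -> list[int]:
    """Split the hash range into num_bundles equal-width bundles."""
    return [i * HASH_SPACE // num_bundles for i in range(num_bundles + 1)]


def bundle_for_topic(topic: str, boundaries: list[int]) -> int:
    """Return the index of the bundle whose range contains the topic's hash."""
    # CRC32 is a stand-in hash (an assumption); it yields a value in [0, 2**32).
    topic_hash = zlib.crc32(topic.encode("utf-8"))
    # Bundle i covers [boundaries[i], boundaries[i + 1]).
    return bisect_right(boundaries, topic_hash) - 1


boundaries = bundle_boundaries(4)
index = bundle_for_topic("persistent://my-tenant/my-namespace/my-topic", boundaries)
```

Because the lookup depends only on the topic name and the bundle boundaries, every broker and client resolves a topic to the same bundle without tracking per-topic state.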
-When a client starts using new topics that are not assigned to any broker, a
process is triggered to choose the best suited broker to acquire ownership of
these topics according to the load conditions.
+The benefit of the assignment granularity is to amortize the amount of
information that you need to keep track of. Based on CPU, memory, traffic load,
and other indexes, topics are assigned to a particular broker dynamically. For
example:
+* When a client starts using new topics that are not assigned to any broker, a
process is triggered to choose the best-suited broker to acquire ownership of
these topics according to the load conditions.
+* If the broker owning a topic becomes overloaded, the topic is reassigned to
a less-loaded broker.
+* If the broker owning a topic crashes, the topic is reassigned to another
active broker.
-In case of partitioned topics, different partitions are assigned to different
brokers. Here "topic" means either a non-partitioned topic or one partition of
a topic.
+:::tip
-The assignment is "dynamic" because the assignment changes quickly. For
example, if the broker owning the topic crashes, the topic is reassigned
immediately to another broker. Another scenario is that the broker owning the
topic becomes overloaded. In this case, the topic is reassigned to a less
loaded broker.
+For partitioned topics, different partitions are assigned to different
brokers. Here "topic" means either a non-partitioned topic or one partition of
a topic.
-The stateless nature of brokers makes the dynamic assignment possible, so you
can quickly expand or shrink the cluster based on usage.
+:::
-#### Assignment granularity
+## Create namespaces with assigned bundles
-The assignment of topics or partitions to brokers is not done at the topics or
partitions level, but done at the Bundle level (a higher level). The reason is
to amortize the amount of information that you need to keep track. Based on
CPU, memory, traffic load and other indexes, topics are assigned to a
particular broker dynamically.
+When you create a new namespace, a number of bundles are assigned to the
namespace. You can set this number in the `conf/broker.conf` file:
-Instead of individual topic or partition assignment, each broker takes
ownership of a subset of the topics for a namespace. This subset is called a
"*bundle*" and effectively this subset is a sharding mechanism.
+```conf
-The namespace is the "administrative" unit: many config knobs or operations
are done at the namespace level.
-
-For assignment, a namespaces is sharded into a list of "bundles", with each
bundle comprising a portion of overall hash range of the namespace.
-
-Topics are assigned to a particular bundle by taking the hash of the topic
name and checking in which bundle the hash falls into.
-
-Each bundle is independent of the others and thus is independently assigned to
different brokers.
-
-### Create namespaces and bundles
-
-When you create a new namespace, the new namespace sets to use the default
number of bundles. You can set this in `conf/broker.conf`:
-
-```properties
-
-# When a namespace is created without specifying the number of bundle, this
+# When a namespace is created without specifying the number of bundles, this
# value will be used as the default
defaultNumberOfNamespaceBundles=4
```
-You can either change the system default, or override it when you create a new
namespace:
+Alternatively, you can override the value when you create a new namespace
using [Pulsar admin](/tools/pulsar-admin/):
```shell
-$ bin/pulsar-admin namespaces create my-tenant/my-namespace --clusters us-west
--bundles 16
+bin/pulsar-admin namespaces create my-tenant/my-namespace --clusters us-west
--bundles 16
```
-With this command, you create a namespace with 16 initial bundles. Therefore
the topics for this namespaces can immediately be spread across up to 16
brokers.
+With the above command, you create a namespace with 16 initial bundles.
Therefore the topics for this namespace can immediately be spread across up to
16 brokers.
In general, if you know the expected traffic and number of topics in advance,
you had better start with a reasonable number of bundles instead of waiting for
the system to auto-correct the distribution.
-On the same note, it is beneficial to start with more bundles than the number
of brokers, because of the hashing nature of the distribution of topics into
bundles. For example, for a namespace with 1000 topics, using something like 64
bundles achieves a good distribution of traffic across 16 brokers.
-
-### Unload topics and bundles
+On the same note, it is beneficial to start with more bundles than the number
of brokers, due to the hashing nature of the distribution of topics into
bundles. For example, for a namespace with 1000 topics, using something like 64
bundles achieves a good distribution of traffic across 16 brokers.
-You can "unload" a topic in Pulsar with admin operation. Unloading means to
close the topics, release ownership and reassign the topics to a new broker,
based on current load.
-When unloading happens, the client experiences a small latency blip, typically
in the order of tens of milliseconds, while the topic is reassigned.
+## Split namespace bundles
-Unloading is the mechanism that the load-manager uses to perform the load
shedding, but you can also trigger the unloading manually, for example to
correct the assignments and redistribute traffic even before having any broker
overloaded.
+Since the load for the topics in a bundle might change over time and
predicting the load might be hard, bundle split is designed to resolve these
challenges. The broker splits a bundle into two and the new smaller bundles can
be reassigned to different brokers.
-Unloading a topic has no effect on the assignment, but just closes and reopens
the particular topic:
+Pulsar supports the following two bundle split algorithms:
+* `range_equally_divide`: split the bundle into two parts with the same hash
range size.
+* `topic_count_equally_divide`: split the bundle into two parts with the same
number of topics.
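
The difference between the two algorithms can be illustrated with a short Python sketch. It is one plausible reading of the descriptions above, not the broker's code: a bundle is modeled as a half-open hash range `[low, high)`, topic hashes are assumed to be distinct, and both function names are hypothetical.

```python
def range_equally_divide(low: int, high: int) -> int:
    """Return the split point that halves the hash range [low, high)."""
    return low + (high - low) // 2


def topic_count_equally_divide(topic_hashes: list[int]) -> int:
    """Return a split point that leaves about half of the topics on each side.

    Topics whose hash is below the returned value fall in the left bundle,
    the rest fall in the right bundle (hashes are assumed distinct).
    """
    if len(topic_hashes) < 2:
        raise ValueError("need at least two topics to split by topic count")
    ordered = sorted(topic_hashes)
    return ordered[len(ordered) // 2]


# Four topics clustered at the low end of a bundle covering [0, 1000).
hashes = [10, 20, 30, 900]
by_range = range_equally_divide(0, 1000)      # 3 topics left, 1 right
by_count = topic_count_equally_divide(hashes)  # 2 topics left, 2 right
```

When topic hashes cluster in one part of the range, splitting by range can leave most topics in one half, while splitting by topic count balances the number of topics at the cost of unequal range sizes.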
-```shell
+To enable bundle split, you need to configure the following settings in the
`broker.conf` file, and set `defaultNamespaceBundleSplitAlgorithm` based on
your needs.
-pulsar-admin topics unload persistent://tenant/namespace/topic
+```conf
-```
-
-To unload all topics for a namespace and trigger reassignments:
-
-```shell
-
-pulsar-admin namespaces unload tenant/namespace
+loadBalancerAutoBundleSplitEnabled=true
+loadBalancerAutoUnloadSplitBundlesEnabled=true
+defaultNamespaceBundleSplitAlgorithm=range_equally_divide
```
-### Split namespace bundles
-
-Since the load for the topics in a bundle might change over time and
predicting the load might be hard, bundle split is designed to deal with these
issues. The broker splits a bundle into two and the new smaller bundles can be
reassigned to different brokers.
-
-The splitting is based on some tunable thresholds. Any existing bundle that
exceeds any of the threshold is a candidate to be split. By default the newly
split bundles are also immediately offloaded to other brokers, to facilitate
the traffic distribution.
+You can configure more parameters for splitting thresholds. Any existing
bundle that exceeds any of the thresholds is a candidate to be split. By
default, the newly split bundles are immediately reassigned to other brokers,
to facilitate the traffic distribution.
-You can split namespace bundles in two ways, by setting
`supportedNamespaceBundleSplitAlgorithms` to `range_equally_divide` or
`topic_count_equally_divide` in `broker.conf` file. The former splits the
bundle into two parts with the same hash range size; the latter splits the
bundle into two parts with the same number of topics. You can also configure
other parameters for namespace bundles.
-
-```properties
-
-# enable/disable namespace bundle auto split
-loadBalancerAutoBundleSplitEnabled=true
-
-# enable/disable automatic unloading of split bundles
-loadBalancerAutoUnloadSplitBundlesEnabled=true
+```conf
# maximum topics in a bundle, otherwise bundle split will be triggered
loadBalancerNamespaceBundleMaxTopics=1000
@@ -124,91 +95,184 @@ loadBalancerNamespaceMaximumBundles=128
```
-### Shed load automatically
+## Shed load automatically
-The support for automatic load shedding is available in the load manager of
Pulsar. This means that whenever the system recognizes a particular broker is
overloaded, the system forces some traffic to be reassigned to less loaded
brokers.
+The support for automatic load shedding is available in the load manager of
Pulsar. This means that whenever the system recognizes a particular broker is
overloaded, the system forces some traffic to be reassigned to less-loaded
brokers.
When a broker is identified as overloaded, the broker is forced to "unload" a
subset of its bundles, the ones with higher traffic, that make up the
overload percentage.
-For example, the default threshold is 85% and if a broker is over quota at 95%
CPU usage, then the broker unloads the percent difference plus a 5% margin:
`(95% - 85%) + 5% = 15%`.
-
-Given the selection of bundles to offload is based on traffic (as a proxy
measure for cpu, network and memory), broker unloads bundles for at least 15%
of traffic.
+For example, the default threshold is 85% and if a broker is over quota at 95%
CPU usage, then the broker unloads the percent difference plus a 5% margin:
`(95% - 85%) + 5% = 15%`. Given the selection of bundles to unload is based on
traffic (as a proxy measure for CPU, network, and memory), the broker unloads
bundles for at least 15% of traffic.
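
The arithmetic in this example can be sketched as follows. This is a simplified illustration under stated assumptions, not the load manager's code: the 85% threshold and 5% margin come from the text above, while the function names and the per-bundle traffic figures are hypothetical.

```python
def percent_to_unload(usage: float, threshold: float = 85.0, margin: float = 5.0) -> float:
    """Return the share of traffic (in percent) an overloaded broker should shed."""
    if usage <= threshold:
        return 0.0
    return (usage - threshold) + margin


def pick_bundles(bundle_traffic: dict[str, float], target_percent: float) -> list[str]:
    """Pick the busiest bundles until their combined share reaches the target."""
    total = sum(bundle_traffic.values())
    if total <= 0 or target_percent <= 0:
        return []
    picked: list[str] = []
    shed = 0.0
    # Visit bundles from highest to lowest traffic.
    for name, traffic in sorted(bundle_traffic.items(), key=lambda kv: kv[1], reverse=True):
        if shed >= target_percent:
            break
        picked.append(name)
        shed += 100.0 * traffic / total
    return picked


target = percent_to_unload(95.0)  # (95 - 85) + 5
to_unload = pick_bundles({"bundle-a": 50.0, "bundle-b": 30.0, "bundle-c": 20.0}, target)
```

Because whole bundles are unloaded, the traffic actually shed is usually larger than the target, which is why the text says "at least 15%".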
-The automatic load shedding is enabled by default and you can disable the
automatic load shedding with this setting:
+:::tip
-```properties
+* The automatic load shedding is enabled by default. To disable it, you can
set `loadBalancerSheddingEnabled` to `false`.
+* Besides the automatic load shedding, you can [manually unload
bundles](#unload-topics-and-bundles).
-# Enable/disable automatic bundle unloading for load-shedding
-loadBalancerSheddingEnabled=true
-
-```
+:::
Additional settings that apply to shedding:
-```properties
+```conf
# Load shedding interval. Broker periodically checks whether some traffic
should be offloaded from
# some over-loaded broker to other under-loaded brokers
loadBalancerSheddingIntervalMinutes=1
-# Prevent the same topics to be shed and moved to other brokers more that once
within this timeframe
+# Prevent the same topics to be shed and moved to other brokers more than once
within this timeframe
loadBalancerSheddingGracePeriodMinutes=30
```
-Pulsar supports the following types of shedding strategies. From Pulsar 2.10,
the **default** shedding strategy is `ThresholdShedder`.
+Pulsar supports the following types of automatic load shedding strategies.
+* [ThresholdShedder](#thresholdshedder)
+* [OverloadShedder](#overloadshedder)
+* [UniformLoadShedder](#uniformloadshedder)
-##### ThresholdShedder
-This strategy tends to shed the bundles if any broker's usage is above the
configured threshold. It does this by first computing the average resource
usage per broker for the whole cluster. The resource usage for each broker is
calculated using the following method:
LocalBrokerData#getMaxResourceUsageWithWeight. The weights for each resource
are configurable. Historical observations are included in the running average
based on the broker's setting for loadBalancerHistoryResourcePercentag [...]
-`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.ThresholdShedder`
+:::note
+
+* From Pulsar 2.10, the **default** shedding strategy is `ThresholdShedder`.
+* You need to restart brokers if the shedding strategy is [dynamically
updated](admin-api-brokers.md/#dynamic-broker-configuration).
+
+:::
+
+### ThresholdShedder
+This strategy tends to shed the bundles if any broker's usage is above the
configured threshold. It does this by first computing the average resource
usage per broker for the whole cluster. The resource usage for each broker is
calculated using the following method
`LocalBrokerData#getMaxResourceUsageWithWeight`. Historical observations are
included in the running average based on the broker's setting for
`loadBalancerHistoryResourcePercentage`. Once the average resource usage is
calcula [...]

-##### OverloadShedder
-This strategy will attempt to shed exactly one bundle on brokers which are
overloaded, that is, whose maximum system resource usage exceeds
loadBalancerBrokerOverloadedThresholdPercentage. To see which resources are
considered when determining the maximum system resource. A bundle is
recommended for unloading off that broker if and only if the following
conditions hold: The broker has at least two bundles assigned and the broker
has at least one bundle that has not been unloaded recently [...]
-`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder`
+To use the `ThresholdShedder` strategy, configure brokers with this value.
+`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.ThresholdShedder`
+
+You can configure the weights for each resource per broker in the
`conf/broker.conf` file.
+
+```conf
+
+# The BandWithIn usage weight when calculating new resource usage.
+loadBalancerBandwithInResourceWeight=1.0
+
+# The BandWithOut usage weight when calculating new resource usage.
+loadBalancerBandwithOutResourceWeight=1.0
+
+# The CPU usage weight when calculating new resource usage.
+loadBalancerCPUResourceWeight=1.0
+
+# The heap memory usage weight when calculating new resource usage.
+loadBalancerMemoryResourceWeight=1.0
+
+# The direct memory usage weight when calculating new resource usage.
+loadBalancerDirectMemoryResourceWeight=1.0
+
+```
+
+### OverloadShedder
+This strategy attempts to shed exactly one bundle on brokers which are
overloaded, that is, whose maximum system resource usage exceeds
[`loadBalancerBrokerOverloadedThresholdPercentage`](#broker-overload-thresholds).
To see which resources are considered when determining the maximum system
resource. A bundle is recommended for unloading off that broker if and only if
the following conditions hold: The broker has at least two bundles assigned and
the broker has at least one bundle that h [...]

-##### UniformLoadShedder
-This strategy tends to distribute load uniformly across all brokers. This
strategy checks laod difference between broker with highest load and broker
with lowest load. If the difference is higher than configured thresholds
`loadBalancerMsgRateDifferenceShedderThreshold` and
`loadBalancerMsgThroughputMultiplierDifferenceShedderThreshold` then it finds
out bundles which can be unloaded to distribute traffic evenly across all
brokers. Configure broker with below value to use this strategy.
-`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.UniformLoadShedder`
+To use the `OverloadShedder` strategy, configure brokers with this value.
+`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder`
+
+#### Broker overload thresholds
+
+The determination of when a broker is overloaded is based on the threshold of
CPU, network, and memory usage. Whenever any of those metrics reaches the
threshold, the system triggers the shedding (if enabled).
+
+:::note
+
+The overload threshold `loadBalancerBrokerOverloadedThresholdPercentage` only
applies to the [`OverloadShedder`](#overloadshedder) shedding strategy. By
default, it is set to 85%.
+
+:::
+
+Pulsar gathers the CPU, network, and memory usage stats from the system
metrics. In some cases of network utilization, the network interface speed that
Linux reports is not correct and needs to be manually overridden. This is the
case in AWS EC2 instances with 1Gbps NIC speed for which the OS reports 10Gbps
speed.
+
+Because of the incorrect max speed, the load manager might think the broker
has not reached the NIC capacity, while in fact the broker already uses all the
bandwidth and the traffic is slowed down.
+
+You can set `loadBalancerOverrideBrokerNicSpeedGbps` in the `conf/broker.conf`
file to correct the max NIC speed. When the value is empty, Pulsar uses the
value that the OS reports.
+
+### UniformLoadShedder
+This strategy tends to distribute load uniformly across all brokers. This
strategy checks the load difference between the broker with the highest load
and the broker with the lowest load. If the difference is higher than
configured thresholds `loadBalancerMsgRateDifferenceShedderThreshold` and
`loadBalancerMsgThroughputMultiplierDifferenceShedderThreshold` then it finds
out bundles that can be unloaded to distribute traffic evenly across all
brokers.

-#### Broker overload thresholds
+To use the `UniformLoadShedder` strategy, configure brokers with this value.
+`loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.UniformLoadShedder`
+
+## Unload topics and bundles
+
+You can "unload" a topic in Pulsar through manual admin operations. Unloading
means closing topics, releasing ownership, and reassigning topics to a new
broker, based on the current load.
+
+When unloading happens, the client experiences a small latency blip, typically
in the order of tens of milliseconds, while the topic is reassigned.
+
+Unloading is the mechanism that the load manager uses to perform the load
shedding, but you can also trigger the unloading manually, for example, to
correct the assignments and redistribute traffic even before having any broker
overloaded.
+
+Unloading a topic has no effect on the assignment, but just closes and reopens
the particular topic:
-The determinations of when a broker is overloaded is based on threshold of
CPU, network and memory usage. Whenever either of those metrics reaches the
threshold, the system triggers the shedding (if enabled).
+```shell
-By default, overload threshold is set at 85%:
+pulsar-admin topics unload persistent://tenant/namespace/topic
-```properties
+```
-# Usage threshold to determine a broker as over-loaded
-loadBalancerBrokerOverloadedThresholdPercentage=85
+To unload all topics for a namespace and trigger reassignments:
+
+```shell
+
+pulsar-admin namespaces unload tenant/namespace
```
-Pulsar gathers the usage stats from the system metrics.
+## Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to release rollout or brokers
restart), it only disrupts namespaces owned by that specific failure domain and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
-In case of network utilization, in some cases the network interface speed that
Linux reports is not correct and needs to be manually overridden. This is the
case in AWS EC2 instances with 1Gbps NIC speed for which the OS reports 10Gbps
speed.
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
-Because of the incorrect max speed, the Pulsar load manager might think the
broker has not reached the NIC capacity, while in fact the broker already uses
all the bandwidth and the traffic is slowed down.
+
-You can use the following setting to correct the max NIC speed:
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
even-distributed assignment sequence illustrated in the above figure.
-```properties
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
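
The assignment sequence in the table can be reproduced with a small Python simulation. It is a sketch of the even-distribution policy as described here, not the load manager's implementation: it assumes that candidates are the domains, then brokers, currently holding the fewest namespaces from the group, and that ties are broken by taking the first candidate (the real load manager may weigh broker load instead). The function name `assign_anti_affinity` is hypothetical.

```python
def assign_anti_affinity(namespaces: list[str], domains: dict[str, list[str]]) -> list[tuple[str, str, str]]:
    """Assign each namespace to a (domain, broker), spreading evenly over both."""
    per_domain = {domain: 0 for domain in domains}
    per_broker = {broker: 0 for brokers in domains.values() for broker in brokers}
    assignments = []
    for namespace in namespaces:
        # Candidate domains: those owning the fewest namespaces of the group.
        fewest = min(per_domain.values())
        candidate_domains = [d for d in domains if per_domain[d] == fewest]
        # Candidate brokers: least-used brokers inside the candidate domains.
        brokers = [b for d in candidate_domains for b in domains[d]]
        least = min(per_broker[b] for b in brokers)
        candidate_brokers = [b for b in brokers if per_broker[b] == least]
        broker = candidate_brokers[0]  # assumed tie-break: first candidate
        domain = next(d for d in candidate_domains if broker in domains[d])
        per_domain[domain] += 1
        per_broker[broker] += 1
        assignments.append((namespace, domain, broker))
    return assignments


result = assign_anti_affinity(
    ["Namespace1", "Namespace2", "Namespace3", "Namespace4"],
    {"Domain1": ["Broker1", "Broker2"], "Domain2": ["Broker3", "Broker4"]},
)
```

With two domains of two brokers each, the simulation selects Broker1, Broker3, Broker2, and Broker4 in that order, matching the "Selected broker" column.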
+
+:::tip
-# Override the auto-detection of the network interfaces max speed.
-# This option is useful in some environments (eg: EC2 VMs) where the max speed
-# reported by Linux is not reflecting the real bandwidth available to the
broker.
-# Since the network usage is employed by the load manager to decide when a
broker
-# is overloaded, it is important to make sure the info is correct or override
it
-# with the right value here. The configured value can be a double (eg: 0.8)
and that
-# can be used to trigger load-shedding even before hitting on NIC limits.
-loadBalancerOverrideBrokerNicSpeedGbps=
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name
<domain-name> --broker-list <broker-list-comma-separated>
```
-When the value is empty, Pulsar uses the value that the OS reports.
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group
<group-name>
+
+```
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
diff --git
a/site2/website/versioned_docs/version-2.10.0/reference-terminology.md
b/site2/website/versioned_docs/version-2.10.0/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.10.0/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.10.0/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.8.0/administration-load-balance.md
b/site2/website/versioned_docs/version-2.8.0/administration-load-balance.md
index 3efba601ed4..890ebf66162 100644
--- a/site2/website/versioned_docs/version-2.8.0/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.8.0/administration-load-balance.md
@@ -198,3 +198,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to release rollout or brokers
restart), it only disrupts namespaces owned by that specific failure domain and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
even-distributed assignment sequence illustrated in the above figure.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name
<domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group
<group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
diff --git
a/site2/website/versioned_docs/version-2.8.0/reference-terminology.md
b/site2/website/versioned_docs/version-2.8.0/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.8.0/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.8.0/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.8.1/administration-load-balance.md
b/site2/website/versioned_docs/version-2.8.1/administration-load-balance.md
index 3efba601ed4..134c39e0956 100644
--- a/site2/website/versioned_docs/version-2.8.1/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.8.1/administration-load-balance.md
@@ -198,3 +198,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to release rollout or brokers
restart), it only disrupts namespaces owned by that specific failure domain and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
even-distributed assignment sequence illustrated in the above figure.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
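
The assignment sequence in the table above can be reproduced with a small simulation. The following Python sketch is not Pulsar's load manager code: the function name `assign_anti_affinity`, the data layout, and the tie-breaking rule (pick the first candidate broker) are assumptions made only for illustration. The real load manager also weighs broker load when it selects among candidates.

```python
def assign_anti_affinity(namespaces, domains):
    """Simulate even distribution of one anti-affinity group.

    `domains` maps a failure domain name to its list of brokers.
    Returns one tuple per namespace:
    (namespace, domain candidates, broker candidates, selected "Domain:Broker").
    """
    broker_domain = {b: d for d, brokers in domains.items() for b in brokers}
    per_domain = {d: 0 for d in domains}        # group namespaces per domain
    per_broker = {b: 0 for b in broker_domain}  # group namespaces per broker
    steps = []
    for ns in namespaces:
        # Candidate domains: those owning the fewest namespaces of the group.
        fewest = min(per_domain.values())
        domain_candidates = [d for d in domains if per_domain[d] == fewest]
        # Candidate brokers: least-loaded brokers inside the candidate domains.
        pool = [b for d in domain_candidates for b in domains[d]]
        fewest_on_broker = min(per_broker[b] for b in pool)
        broker_candidates = [b for b in pool if per_broker[b] == fewest_on_broker]
        # Assumption: break ties by taking the first candidate.
        chosen = broker_candidates[0]
        per_broker[chosen] += 1
        per_domain[broker_domain[chosen]] += 1
        steps.append((ns, domain_candidates, broker_candidates,
                      f"{broker_domain[chosen]}:{chosen}"))
    return steps


if __name__ == "__main__":
    layout = {"Domain1": ["Broker1", "Broker2"], "Domain2": ["Broker3", "Broker4"]}
    group = ["Namespace1", "Namespace2", "Namespace3", "Namespace4"]
    for step in assign_anti_affinity(group, layout):
        print(step)
```

With the two-domain, four-broker layout from the figure, the sketch yields the same candidates and selections as the table: each domain ends up with two namespaces and each broker with one.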
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name <domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group <group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
\ No newline at end of file
diff --git
a/site2/website/versioned_docs/version-2.8.1/reference-terminology.md
b/site2/website/versioned_docs/version-2.8.1/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.8.1/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.8.1/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.8.2/administration-load-balance.md
b/site2/website/versioned_docs/version-2.8.2/administration-load-balance.md
index 3efba601ed4..134c39e0956 100644
--- a/site2/website/versioned_docs/version-2.8.2/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.8.2/administration-load-balance.md
@@ -198,3 +198,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to a release rollout or broker
restarts), it only disrupts the namespaces owned by that specific failure domain, and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
evenly distributed assignment sequence illustrated in the figure above.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name <domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group <group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
\ No newline at end of file
diff --git
a/site2/website/versioned_docs/version-2.8.2/reference-terminology.md
b/site2/website/versioned_docs/version-2.8.2/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.8.2/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.8.2/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.8.3/administration-load-balance.md
b/site2/website/versioned_docs/version-2.8.3/administration-load-balance.md
index 3efba601ed4..134c39e0956 100644
--- a/site2/website/versioned_docs/version-2.8.3/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.8.3/administration-load-balance.md
@@ -198,3 +198,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to a release rollout or broker
restarts), it only disrupts the namespaces owned by that specific failure domain, and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
evenly distributed assignment sequence illustrated in the figure above.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name <domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group <group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
\ No newline at end of file
diff --git
a/site2/website/versioned_docs/version-2.8.3/reference-terminology.md
b/site2/website/versioned_docs/version-2.8.3/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.8.3/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.8.3/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.9.0/administration-load-balance.md
b/site2/website/versioned_docs/version-2.9.0/administration-load-balance.md
index 2b2ac839581..788c84a5931 100644
--- a/site2/website/versioned_docs/version-2.9.0/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.9.0/administration-load-balance.md
@@ -192,3 +192,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to a release rollout or broker
restarts), it only disrupts the namespaces owned by that specific failure domain, and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
evenly distributed assignment sequence illustrated in the figure above.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name <domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group <group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
\ No newline at end of file
diff --git
a/site2/website/versioned_docs/version-2.9.0/reference-terminology.md
b/site2/website/versioned_docs/version-2.9.0/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.9.0/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.9.0/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.9.1/administration-load-balance.md
b/site2/website/versioned_docs/version-2.9.1/administration-load-balance.md
index 2b2ac839581..788c84a5931 100644
--- a/site2/website/versioned_docs/version-2.9.1/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.9.1/administration-load-balance.md
@@ -192,3 +192,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to a release rollout or broker
restarts), it only disrupts the namespaces owned by that specific failure domain, and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
evenly distributed assignment sequence illustrated in the figure above.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name <domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group <group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
\ No newline at end of file
diff --git
a/site2/website/versioned_docs/version-2.9.1/reference-terminology.md
b/site2/website/versioned_docs/version-2.9.1/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.9.1/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.9.1/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone
diff --git
a/site2/website/versioned_docs/version-2.9.2/administration-load-balance.md
b/site2/website/versioned_docs/version-2.9.2/administration-load-balance.md
index 2b2ac839581..788c84a5931 100644
--- a/site2/website/versioned_docs/version-2.9.2/administration-load-balance.md
+++ b/site2/website/versioned_docs/version-2.9.2/administration-load-balance.md
@@ -192,3 +192,59 @@ loadBalancerOverrideBrokerNicSpeedGbps=
When the value is empty, Pulsar uses the value that the OS reports.
+### Distribute anti-affinity namespaces across failure domains
+
+When your application has multiple namespaces and you want one of them
available all the time to avoid any downtime, you can group these namespaces
and distribute them across different [failure
domains](reference-terminology.md#failure-domain) and different brokers. Thus,
if one of the failure domains is down (due to a release rollout or broker
restarts), it only disrupts the namespaces owned by that specific failure domain, and
the rest of the namespaces owned by other domains remain available [...]
+
+Such a group of namespaces has anti-affinity to each other, that is, all the
namespaces in this group are [anti-affinity
namespaces](reference-terminology.md#anti-affinity-namespaces) and are
distributed to different failure domains in a load-balanced manner.
+
+As illustrated in the following figure, Pulsar has 2 failure domains (Domain1
and Domain2) and each domain has 2 brokers in it. You can create an
anti-affinity namespace group that has 4 namespaces in it, and all the 4
namespaces have anti-affinity to each other. The load manager tries to
distribute namespaces evenly across all the brokers in the same domain. Since
each domain has 2 brokers, every broker owns one namespace from this
anti-affinity namespace group, and you can see each dom [...]
+
+
+
+The load manager follows an even distribution policy across failure domains to
assign anti-affinity namespaces. The following table outlines the
evenly distributed assignment sequence illustrated in the figure above.
+
+| Assignment sequence | Namespace | Failure domain candidates | Broker candidates | Selected broker |
+|:---|:------------|:------------------|:------------------------------------|:-----------------|
+| 1 | Namespace1 | Domain1, Domain2 | Broker1, Broker2, Broker3, Broker4 | Domain1:Broker1 |
+| 2 | Namespace2 | Domain2 | Broker3, Broker4 | Domain2:Broker3 |
+| 3 | Namespace3 | Domain1, Domain2 | Broker2, Broker4 | Domain1:Broker2 |
+| 4 | Namespace4 | Domain2 | Broker4 | Domain2:Broker4 |
+
+:::tip
+
+* Each namespace belongs to only one anti-affinity group. If a namespace with
an existing anti-affinity assignment is assigned to another anti-affinity
group, the original assignment is dropped.
+
+* If there are more anti-affinity namespaces than failure domains, the load
manager distributes namespaces evenly across all the domains, and also every
domain distributes namespaces evenly across all the brokers under that domain.
+
+:::
+
+#### Create a failure domain and register brokers
+
+:::note
+
+One broker can only be registered to a single failure domain.
+
+:::
+
+To create a domain under a specific cluster and register brokers, run the
following command:
+
+```bash
+
+pulsar-admin clusters create-failure-domain <cluster-name> --domain-name <domain-name> --broker-list <broker-list-comma-separated>
+
+```
+
+You can also view, update, and delete domains under a specific cluster. For
more information, refer to [Pulsar admin doc](/tools/pulsar-admin/).
+
+#### Create an anti-affinity namespace group
+
+An anti-affinity group is created automatically when the first namespace is
assigned to the group. To assign a namespace to an anti-affinity group, run the
following command. It sets an anti-affinity group name for a namespace.
+
+```bash
+
+pulsar-admin namespaces set-anti-affinity-group <namespace> --group <group-name>
+
+```
+
+For more information about `anti-affinity-group` related commands, refer to
[Pulsar admin doc](/tools/pulsar-admin/).
\ No newline at end of file
diff --git
a/site2/website/versioned_docs/version-2.9.2/reference-terminology.md
b/site2/website/versioned_docs/version-2.9.2/reference-terminology.md
index d0e736860e2..e5099141c32 100644
--- a/site2/website/versioned_docs/version-2.9.2/reference-terminology.md
+++ b/site2/website/versioned_docs/version-2.9.2/reference-terminology.md
@@ -98,6 +98,14 @@ that have already been [acknowledged](#acknowledgement-ack).
The ability to isolate [namespaces](#namespace), specify quotas, and configure
authentication and authorization
on a per-[tenant](#tenant) basis.
+#### Failure Domain
+
+A logical domain under a Pulsar cluster. Each logical domain contains a
pre-configured list of brokers.
+
+#### Anti-affinity Namespaces
+
+A group of namespaces that have anti-affinity to each other.
+
### Architecture
#### Standalone