jojochuang commented on code in PR #8935:
URL: https://github.com/apache/ozone/pull/8935#discussion_r2274768042


##########
hadoop-hdds/docs/content/feature/listener-om.md:
##########
@@ -0,0 +1,102 @@
+---
+title: "Listener Ozone Manager"
+weight: 2
+menu:
+   main:
+      parent: Features
+summary: Read-only Ozone Manager to scale out read performance.
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+## Introduction
+
+The Listener Ozone Manager (OM) is a read-only, non-voting member of the OM 
High Availability (HA) group. It receives all log entries from the leader and 
stays up-to-date, but it does not participate in leader election or consensus 
votes. This allows Listener OMs to serve read requests from clients, which can 
significantly improve read performance and reduce the load on the voting OMs.
+
+## Why use Listener OMs?
+
+In a standard OM HA setup, all OMs are peers and participate in the Raft 
consensus protocol. This means that all OMs are involved in the write path, 
which can become a bottleneck for read-heavy workloads. By introducing Listener 
OMs, you can scale out your read performance by offloading read requests to 
these read-only OMs.
+
+## How it works
+
+A Listener OM is a regular OM that is configured to be a listener. When an OM 
is configured as a listener, it is added to the Ratis group as a listener. This 
means that it will receive all log entries from the leader, but it will not 
participate in the leader election or consensus votes.
+
+Clients can be configured to send read requests to Listener OMs. This allows 
read requests to be served by the Listener OMs, which reduces the load on the 
voting OMs and improves read performance.
+
+**Note:** An OM in a listener state cannot be transitioned into a leader or 
follower, whether automatically, or via the `ozone admin om transfer` command.
+
+## Best Practices
+
+*   **Recommended Cluster Topology:** For a production environment, it is 
recommended to have a minimum of 3 voting OMs (1 leader and 2 followers) for 
high availability. You can then add one or more Listener OMs to this cluster. 
It is not recommended to have a setup with less than 3 voting OMs, for example, 
1 leader + listener OMs, or 1 leader + 1 follower + listeners.
+*   **Deploy multiple Listener OMs for high availability:** To ensure that 
your read requests can still be served in the event of a Listener OM failure, 
it is recommended to deploy multiple Listener OMs.
+*   **Monitor the load on your Listener OMs:** It is important to monitor the 
load on your Listener OMs to ensure that they are not becoming a bottleneck.
+*   **Decommissioning a Listener OM:** A Listener OM can be decommissioned. 
The process is the same as decommissioning a regular OM. Once the Listener OM 
is decommissioned, it is removed from the OM HA Ring and does not receive Ratis 
transactions.
+
+## Configuration
+
+To configure a Listener OM, you need to perform the following steps:
+
+1.  **Configure `ozone-site.xml` for all OM roles:** Add the following 
properties to the `ozone-site.xml` file on all OMs in the cluster. This 
includes the existing voting OMs and the new Listener OMs.
+
+    ```xml
+    <property>
+      <name>ozone.om.service.ids</name>
+      <value>cluster1</value>
+    </property>
+    <property>
+      <name>ozone.om.nodes.cluster1</name>
+      <value>om1,om2,om3,om4,om5</value>
+    </property>
+    <property>
+      <name>ozone.om.address.cluster1.om1</name>
+      <value>host1</value>
+    </property>
+    <property>
+      <name>ozone.om.address.cluster1.om2</name>
+      <value>host2</value>
+    </property>
+    <property>
+      <name>ozone.om.address.cluster1.om3</name>
+      <value>host3</value>
+    </property>
+    <property>
+      <name>ozone.om.address.cluster1.om4</name>
+      <value>host4</value>
+    </property>
+    <property>
+      <name>ozone.om.address.cluster1.om5</name>
+      <value>host5</value>
+    </property>
+    <property>
+      <name>ozone.om.listener.nodes</name>
+      <value>om4,om5</value>
+    </property>
+    ```
+
+    In this example, `om1`, `om2`, and `om3` are the voting OMs, and `om4` and 
`om5` are the Listener OMs.
+
+2.  **Bootstrap the Listener OM:** Before a new Listener OM can be started, it 
needs to be bootstrapped. This is the same process as bootstrapping a regular 
OM. For more details, please refer to the [OM High Availability]({{< ref 
"../feature/OM-HA.md#om-bootstrap" >}}) documentation.
+
+3.  **Configure clients to use Listener OMs:** To configure clients to use 
Listener OMs, you need to add the same `ozone-site.xml` configuration to the 
client.

Review Comment:
   this feature does not dictate how clients uses it.  I think even after this 
feature is enabled, clients still go to the leader.
   
   Features like #7988 allows clients to read from listener. But until we add 
support for Ratis linearizable read from follower 
(https://jojochuang.github.io/ratis-site/docs/features/linearizable-read) it 
won't work in general cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to