This is an automated email from the ASF dual-hosted git repository.
weichiu pushed a commit to branch HDDS-9225-website-v2
in repository https://gitbox.apache.org/repos/asf/ozone-site.git
The following commit(s) were added to refs/heads/HDDS-9225-website-v2 by this
push:
new 33ef5cf4c HDDS-14396. [Website v2] [Docs] [Administrator Guide]
Production Deployment (#260)
33ef5cf4c is described below
commit 33ef5cf4c52470a99146c752d71a3fbfcc8aea68
Author: Bolin Lin <[email protected]>
AuthorDate: Tue Jan 20 12:15:31 2026 -0500
HDDS-14396. [Website v2] [Docs] [Administrator Guide] Production Deployment
(#260)
Co-authored-by: Wei-Chiu Chuang <[email protected]>
---
cspell.yaml | 9 +++
.../04-performance/01-placeholder.md | 72 +++++++++++++++++++++-
2 files changed, 78 insertions(+), 3 deletions(-)
diff --git a/cspell.yaml b/cspell.yaml
index c0571438c..a4e96ad9f 100644
--- a/cspell.yaml
+++ b/cspell.yaml
@@ -192,3 +192,12 @@ words:
- utilisation
- utilised
- hotspots
+# Infrastructure and hardware terms
+- Chrony
+- SAS
+- JBOD
+- Gbps
+- Hugepage
+- LVM
+- NVMe
+- ntpd
diff --git
a/docs/05-administrator-guide/02-configuration/04-performance/01-placeholder.md
b/docs/05-administrator-guide/02-configuration/04-performance/01-placeholder.md
index dff86927b..6f2c1e3d8 100644
---
a/docs/05-administrator-guide/02-configuration/04-performance/01-placeholder.md
+++
b/docs/05-administrator-guide/02-configuration/04-performance/01-placeholder.md
@@ -1,5 +1,71 @@
-# PLACEHOLDER
+---
+sidebar_label: Production Deployment
+---
-**TODO:** File a subtask under
[HDDS-9859](https://issues.apache.org/jira/browse/HDDS-9859) and complete this
page or section.
+# Production Deployment
-There will be multiple pages on performance under this section. Not sure what
is required yet.
+This document provides guidance on the requirements and best practices for a
production deployment of Apache Ozone.
+
+## Ozone Components
+
+A typical production Ozone cluster includes the following services:
+
+- **Ozone Manager (OM)**: Manages the namespace and metadata of the Ozone
cluster. A production cluster requires 3 OM instances for high availability.
+- **Storage Container Manager (SCM)**: Manages the data nodes and pipelines. A
production cluster requires 3 SCM instances for high availability.
+- **Datanode**: Stores the actual data in containers. A production cluster
requires at least 3 Datanodes.
+- **Recon**: A web-based UI for monitoring and managing the Ozone cluster. A
Recon server is strongly recommended, though not required.
+- **S3 Gateway (S3G)**: An S3-compatible gateway for accessing Ozone. Multiple
S3 Gateway instances are strongly recommended to load balance S3 traffic.
+- **HttpFS**: An WebHDFS-compatible API for accessing Ozone. This is an
optional component.
+
+## Requirements
+
+### System Requirements
+
+- **Hardware**: Bare metal machines are recommended for optimal performance.
Virtual machines or containers are not recommended for production deployments.
+- **Operating System**: Linux (recommended distributions: Red Hat 8/Rocky 8+,
Ubuntu, SUSE; supported architectures: x86/ARM).
+- **Java Development Kit (JDK)**: Version 8 or higher.
+- **Time Synchronization**: A time synchronization service such as Chrony or
ntpd must be enabled to prevent time drift.
+
+### Memory Requirements
+
+- **Ozone Manager (OM), Storage Container Manager (SCM), and Recon**:
Recommended heap size in large production clusters is 64GB.
+- **Datanode, S3 Gateway, and HttpFS**: Recommended heap size is 31GB.
+
+### Storage Requirements
+
+- **Ozone Manager (OM), Storage Container Manager (SCM), and Recon Metadata
Storage**: Use SAS SSD or NVMe SSD for metadata (RocksDB and Ratis) to ensure
optimal performance. It is recommended to use RAID 1 (disk mirroring) for the
metadata disks to protect against disk failures.
+- **Datanode Storage**:
+ - **Ratis Log**: Use SAS SSD or NVMe SSD for the Ratis log directory for low
latency writes.
+ - **Container Data**: Hard disks are acceptable for container data storage.
+ - **Disk Configuration**: It is recommended to use a JBOD (Just a Bunch Of
Disks) configuration instead of RAID. Ozone is a replicated distributed storage
system and handles data redundancy. Using RAID can decrease performance without
providing additional data protection benefits.
+- **Storage Type**: Use direct-attached storage. Do not use Network Attached
Storage (NAS) or Storage Area Network (SAN).
+
+### Network Requirements
+
+- **Network Bandwidth**: A minimum of 25Gbps network card bandwidth is
recommended.
+- **Network Topology**: A leaf-spine network topology with an oversubscription
ratio below 3:1 is recommended for predictable performance.
+
+### Security Requirements (Optional but Recommended)
+
+- **Kerberos**: A Kerberos environment, including a Key Distribution Center
(KDC), is recommended for enhanced security.
+
+## Recommended Configurations
+
+### Linux Kernel
+
+- **CPU Governor**: Set the CPU scaling driver to `performance` mode to
maximize performance.
+- **Transparent Hugepage**: Disable Transparent Hugepage to avoid performance
issues.
+- **SELinux**: Disable SELinux.
+- **Swappiness**: Set `vm.swappiness=1` to minimize swapping.
+
+### Local File System
+
+- **LVM**: Disable Logical Volume Manager (LVM) for data drives.
+- **File System**: Use `ext4` or `xfs` file systems.
+- **Mount Options**: Mount drives with the `noatime` option to reduce
unnecessary disk writes. For SSDs, also add the `discard` option.
+
+### Ozone Configuration
+
+- **Monitoring**: Install
[Prometheus](../../operations/observability/prometheus) and
[Grafana](../../operations/observability/grafana) for monitoring the Ozone
cluster. For audit logs, consider using a log ingestion framework such as the
ELK Stack (Elasticsearch, Logstash, and Kibana) with FileBeat, or other similar
frameworks. Alternatively, you can use Apache Ranger to manage audit logs.
+- **Pipeline Limits**: Increase the number of allowed write pipelines to
better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit`. See
the [Multi-Raft](./multi-raft) documentation for the formula to calculate
appropriate pipeline limits based on your metadata disk configuration.
+- **Heap Sizes**: Configure sufficient heap sizes for Ozone Manager (OM),
Storage Container Manager (SCM), Recon, Datanode, S3 Gateway (S3G), and HttpFS
services to ensure stability.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]