[
https://issues.apache.org/jira/browse/HDDS-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-14502:
----------------------------------
Labels: pull-request-available (was: )
> [Website v2] [Docs] [Administrator Guide] Replacing Ozone Manager Disks
> -----------------------------------------------------------------------
>
> Key: HDDS-14502
> URL: https://issues.apache.org/jira/browse/HDDS-14502
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Wei-Chiu Chuang
> Assignee: Gargi Jaiswal
> Priority: Major
> Labels: pull-request-available
>
> [https://ozone-site-v2.staged.apache.org/docs/administrator-guide/operations/disk-replacement/ozone-manager]
>
> If the disk containing the OM metadata directory (ozone.om.db.dirs) needs to
> be replaced for whatever reason, the OM metadata directory will need to be
> reconstructed by running ozone om --bootstrap (assuming OM HA is configured).
>
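> A rough sketch of that recovery step, assuming OM HA is configured and the
> stock ozone CLI (confirm the exact flags against the Ozone version in use):
>
>     # On the OM node whose disk was replaced, with an empty directory
>     # mounted at the path configured in ozone.om.db.dirs:
>     ozone om --bootstrap
>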
> —
> Gemini CLI suggests the following content, which looks quite reasonable to me:
>
> —
> Title: Replacing Ozone Manager Disks
> Audience: Cluster Administrators
> Prerequisites: Familiarity with Ozone cluster administration and Linux
> system administration.
> —
> 1. Overview
> Start with a brief introduction explaining the purpose of the document.
> * Purpose: This guide provides the steps required to safely replace a
> failed disk on an Ozone Manager (OM) node.
> * Impact of OM Disk Failure: The OM disk is critical as it stores the
> RocksDB database containing the entire object store namespace (volumes,
> buckets, keys) and block locations. A failure of this disk can lead to
> metadata loss if not handled correctly.
> * Crucial Distinction: HA vs. Non-HA: The recovery procedure depends
> entirely on whether your OM is a single, standalone instance or part of a
> High-Availability (HA) Ratis-based quorum. The HA procedure is significantly
> safer and results in no cluster downtime. Running a standalone OM is not
> recommended for production environments.
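>
> A quick way to tell which case applies, assuming the standard HA
> properties and that $OZONE_CONF_DIR points at the configuration directory
> (both are assumptions to adapt to your deployment):
>
>     # If these properties are set in ozone-site.xml, the OM is part of an
>     # HA quorum; if they are absent, it is a standalone instance.
>     grep -E -A1 'ozone.om.service.ids|ozone.om.nodes' \
>         "$OZONE_CONF_DIR/ozone-site.xml"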
> —
> 2. Pre-flight Checks
> Before starting, the administrator should:
> 1. Identify the Failed Disk: Use system tools (dmesg, smartctl, etc.) to
> confirm which disk has failed and its mount point.
> 2. Identify OM Directories: Check your ozone-site.xml to confirm which
> Ozone directories are on the failed disk (see the sketch after this list).
> The key properties are:
> * ozone.om.db.dirs: The primary OM metadata database.
> * ozone.om.ratis.storage.dir: Check this as well if you have configured it
> to be on a separate disk.
> 3. Prepare the Replacement Disk: Have a new, healthy disk physically
> installed, formatted, and mounted on the system at the same path as the
> failed disk. Ensure it has the correct ownership and permissions for the
> user that runs the OM process.
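>
> A minimal command sketch for these checks. Device names, mount paths, the
> ext4 filesystem, the ozone service user, and $OZONE_CONF_DIR are all
> placeholders to adapt to your environment:
>
>     # 1. Confirm which device failed and where it was mounted
>     dmesg | grep -iE 'error|fail'
>     smartctl -H /dev/sdX
>
>     # 2. Confirm which Ozone directories live on the failed mount point
>     grep -E -A1 'ozone.om.db.dirs|ozone.om.ratis.storage.dir' \
>         "$OZONE_CONF_DIR/ozone-site.xml"
>
>     # 3. Format and mount the replacement disk at the same path, then
>     #    give the OM service user ownership
>     mkfs.ext4 /dev/sdX1
>     mount /dev/sdX1 /data/ozone/om
>     chown -R ozone:ozone /data/ozone/om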
> —
> 3. Procedure for a Standalone (Non-HA) Ozone Manager
> This is a high-risk, manual disaster recovery process that will require
> cluster downtime.
> 1. STOP THE ENTIRE CLUSTER: Shut down all clients, DataNodes, SCM, and the
> Ozone Manager to prevent any further state changes.
> 2. Attempt Data Recovery: If the failed disk is still partially readable,
> make a best-effort attempt to copy the contents of the ozone.om.db.dirs
> directory to a safe, temporary location.
> 3. If Recovery Fails, Restore from Backup: If the OM database files are
> unrecoverable, you must restore from your most recent backup. This document
> does not cover the backup process itself, but it is the only path to
> recovery in this scenario.
> 4. Replace and Configure Disk: Physically replace the hardware and ensure
> the new, empty disk is mounted at the correct path defined in
> ozone.om.db.dirs.
> 5. Restore Metadata: Copy the recovered data (from Step 2) or the restored
> backup data (from Step 3) to the ozone.om.db.dirs path on the new disk.
> 6. Restart and Verify:
> * Start the SCM and Ozone Manager services.
> * Once the OM is running, start the DataNodes.
> * Run ozone sh volume list and other basic commands to verify that the
> namespace is intact and the cluster is operational.
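>
> A sketch of the restore, restart, and verification steps above, assuming
> the stock ozone launcher scripts and example paths (adjust for your
> deployment; many clusters use systemd or a management tool instead):
>
>     # Copy the recovered or restored OM metadata onto the new disk
>     # (paths are examples; use the value of ozone.om.db.dirs)
>     cp -a /backup/om.db /data/ozone/om/om.db
>     chown -R ozone:ozone /data/ozone/om
>
>     # Restart services in order: SCM, then OM, then DataNodes
>     ozone --daemon start scm
>     ozone --daemon start om
>     ozone --daemon start datanode    # on each DataNode
>
>     # Basic namespace sanity checks
>     ozone sh volume list
>     ozone sh bucket list /<volume>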
> —
> 4. Procedure for an HA (Ratis-based) Ozone Manager
> This procedure is much safer, leverages the built-in redundancy of the OM
> HA cluster, and does not require full cluster downtime.
> 1. STOP THE FAILED OM INSTANCE: On the node with the failed disk, stop
> only the Ozone Manager process. The other two OMs will continue operating,
> and one of them will act as leader, serving client requests.
> 2. Replace and Configure Disk: Physically replace the hardware. Mount the
> new, empty disk at the path defined in ozone.om.db.dirs and ensure it has the
> correct ownership and permissions.
> 3. RE-INITIALIZE THE OM: This is the key step. Since the local database is
> gone, the OM needs to be "reborn" by getting a complete copy of the latest
> state from the current OM leader.
> * Simply starting the OM process on the repaired node with an empty DB
> directory will trigger this process automatically. The OM process is
> designed to detect that it belongs to an existing Ratis ring but has no
> local state.
> 4. START THE OM AND MONITOR:
> * Start the Ozone Manager service on the repaired node.
> * Tail the OM's log file (.log and .out). You should see messages
> indicating that it is connecting to the OM HA ring and that a "snapshot"
> is being installed. This means the current OM leader is streaming the
> entire metadata database to this new follower.
> * This process can take some time, depending on the size of your metadata.
> 5. VERIFY: Once the snapshot installation is complete, the OM will finish
> starting, join the Ratis ring as a follower, and begin receiving live updates.
> * You can check the OM Web UI on any of the OM nodes. The list of peers
> should now show all three OMs as healthy.
> * The cluster is now back at full redundancy and the procedure is complete.
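>
> A sketch of the start, monitor, and verify steps above. The launcher
> script, log path, and admin subcommand are assumptions (getserviceroles
> and its flags vary by Ozone version), so confirm them for your deployment:
>
>     # On the repaired node: start the OM and watch for the snapshot install
>     ozone --daemon start om
>     tail -f /var/log/ozone/*om*.log      # log path is an example
>
>     # From any node: confirm the ring has a leader and healthy followers
>     ozone admin om getserviceroles --service-id=<om-service-id>
>
>     # Basic namespace sanity check against the restored quorum
>     ozone sh volume list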
> —
> 5. Additional Considerations
> * Disk Monitoring: This procedure highlights the importance of actively
> monitoring disk health (smartd, etc.) to replace disks proactively before a
> catastrophic failure (a quick manual check is sketched below).
> * Separating Ratis Logs: If you have configured ozone.om.ratis.storage.dir
> on a separate, dedicated disk (recommended for performance), a failure of
> that disk would follow the same HA recovery procedure. The OM would
> automatically rebuild its Ratis logs from the other members of the ring.
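>
> As an illustration of a proactive check (device names are placeholders;
> smartd or your monitoring stack is the usual way to automate this):
>
>     # Quick SMART health verdict and common wear indicators
>     smartctl -H /dev/sdX
>     smartctl -A /dev/sdX | grep -iE 'reallocated|pending|uncorrect'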
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]