Wei-Chiu Chuang created HDDS-14503:
--------------------------------------
Summary: [Website v2] [Docs] [Administrator Guide] Replacing
Storage Container Manager Disks
Key: HDDS-14503
URL: https://issues.apache.org/jira/browse/HDDS-14503
Project: Apache Ozone
Issue Type: Sub-task
Components: documentation
Reporter: Wei-Chiu Chuang
[https://ozone-site-v2.staged.apache.org/docs/administrator-guide/operations/disk-replacement/storage-container-manager]
If the disk containing the SCM metadata directory (ozone.scm.db.dirs) needs to
be replaced for any reason, the SCM metadata directory must be reconstructed by
running ozone scm --bootstrap (assuming SCM HA is configured).
---
The Gemini CLI suggests the following content write-up:
---
Title: Replacing Storage Container Manager (SCM) Disks
Audience: Cluster Administrators
Prerequisites: Familiarity with Ozone cluster administration, especially SCM
and its HA configuration.
---
1. Overview
* Purpose: This guide details the procedure for replacing a failed disk on
an SCM node.
* Impact of SCM Disk Failure: The SCM disk is critical, as it stores the
RocksDB database containing the state of the entire cluster's physical storage,
including:
* DataNode registration and heartbeat status.
* Pipeline information and states.
* Container locations and replica information.
* A failure of this disk without a proper recovery plan can render the
cluster unable to manage storage or allocate new blocks.
   * Crucial Distinction: HA vs. Non-HA: The procedure depends entirely on
whether your SCM is a single, standalone instance or part of a High-Availability
(HA) Ratis-based quorum. Running a standalone SCM is a single point of
failure and is not recommended for production environments. (A quick way to
check which mode you are running is sketched below.)
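
A minimal, hedged way to confirm which mode you are running. It assumes the
ozone CLI is on the PATH, that ozone-site.xml lives under /etc/hadoop/conf
(adjust for your deployment), and that your release includes the
ozone admin scm roles subcommand:

    # Check whether multiple SCM nodes are configured (HA) in ozone-site.xml;
    # the config file path here is an assumption.
    grep -A1 "ozone.scm.nodes" /etc/hadoop/conf/ozone-site.xml

    # In an HA cluster, list the SCM peers and their Ratis roles
    # (leader/follower).
    ozone admin scm roles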
---
2. Pre-flight Checks
1. Identify the Failed Disk: Use system tools (dmesg, smartctl, etc.) to
confirm which disk has failed and its mount point.
2. Identify SCM Directories: Check your ozone-site.xml to confirm which
Ozone directories are on the failed disk. The most important properties are:
* ozone.scm.db.dirs: The primary SCM metadata database.
* ozone.scm.ha.ratis.storage.dir: The location for SCM's internal HA
Ratis logs (in an HA setup).
   3. Prepare the Replacement Disk: Physically install a new, healthy disk.
Format it and mount it at the same path as the failed disk. Ensure it has the
correct ownership and permissions for the user that runs the SCM process. (A
command sketch for these checks follows this list.)
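
A command sketch for the pre-flight checks. The device (/dev/sdf), mount
point (/data/scm), config path (/etc/hadoop/conf), and service user (ozone)
are illustrative assumptions; substitute your own values:

    # 1. Confirm the failing device and its mount point
    dmesg | grep -i sdf
    smartctl -H /dev/sdf
    lsblk -f

    # 2. Confirm which SCM directories live on the failed disk
    grep -A1 -E "ozone.scm.db.dirs|ozone.scm.ha.ratis.storage.dir" \
        /etc/hadoop/conf/ozone-site.xml

    # 3. Prepare the replacement disk at the same mount point
    mkfs.xfs /dev/sdf                 # or your preferred filesystem
    mount /dev/sdf /data/scm
    chown -R ozone:ozone /data/scm    # user/group running the SCM process
    chmod 750 /data/scm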
---
3. Procedure for a Standalone (Non-HA) SCM
This is a critical disaster-recovery procedure that requires full cluster
downtime and a valid backup.
1. STOP THE ENTIRE CLUSTER: Shut down all clients, DataNodes, OMs, and the
SCM. Without a functional SCM, DataNodes cannot heartbeat and new block
allocations will fail.
2. Attempt Data Recovery: If possible, make a best-effort attempt to copy
the contents of the ozone.scm.db.dirs directory from the failing disk to a
safe, temporary location.
   3. If Recovery Fails, Restore from Backup: If the SCM database is
unrecoverable, you must restore it from your most recent backup. Without a
backup, you risk permanent data loss or a lengthy, complex, and potentially
incomplete state reconstruction from DataNode reports.
4. Replace and Configure Disk: Physically replace the hardware and ensure
the new, empty disk is mounted at the correct path defined in
ozone.scm.db.dirs.
5. Restore Metadata: Copy the recovered data (from Step 2) or the restored
backup data (from Step 3) to the ozone.scm.db.dirs path on the new disk.
6. Restart and Verify:
* Start the SCM service first.
* Once the SCM is fully initialized and running, start the OMs and then
the DataNodes.
      * Check the SCM Web UI to confirm that DataNodes are heartbeating and
that pipelines are healthy. Run client I/O tests to ensure the cluster is fully
operational. (A condensed command sketch of this procedure follows this list.)
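
A condensed, hedged sketch of the standalone recovery flow. The paths
(/data/scm/db, /backup/scm-db-salvage) are illustrative, and the
ozone --daemon start/stop wrappers stand in for whatever service manager
(systemd, init scripts) your deployment actually uses:

    # 1. Stop the whole cluster (clients, DataNodes, OMs, then the SCM)
    ozone --daemon stop datanode     # on every DataNode
    ozone --daemon stop om           # on every OM
    ozone --daemon stop scm          # on the SCM node

    # 2. Best-effort salvage of the old metadata, if the disk still responds
    cp -a /data/scm/db /backup/scm-db-salvage 2>/dev/null || true

    # 3. After replacing and mounting the new disk, restore the salvaged
    #    copy or the most recent backup into ozone.scm.db.dirs
    cp -a /backup/scm-db-salvage/. /data/scm/db/   # or your backup location
    chown -R ozone:ozone /data/scm/db

    # 4. Restart in order: SCM first, then OMs, then DataNodes
    ozone --daemon start scm
    ozone --daemon start om
    ozone --daemon start datanode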
---
4. Procedure for an HA (Ratis-based) SCM
This is the recommended production procedure. It leverages the HA quorum for
recovery, requires no cluster downtime, and is much safer.
1. STOP THE FAILED SCM INSTANCE: On the node with the failed disk, stop only
the SCM process. The other SCMs will continue to operate, and one of them
will remain the leader, managing the cluster.
2. Replace and Configure Disk: Physically replace the hardware. Mount the
new, empty disk at the path(s) defined in ozone.scm.db.dirs and
ozone.scm.ha.ratis.storage.dir. Ensure correct ownership and permissions.
3. RE-INITIALIZE THE SCM VIA BOOTSTRAP: The failed SCM has lost its state
and must rejoin the HA cluster by getting a full copy of the latest state from
the current leader. This is done using the scm --bootstrap command.
4. RUN BOOTSTRAP AND MONITOR:
* On the repaired node, execute the bootstrap command: bin/ozone scm
--bootstrap
      * This begins the recovery:
          1. The bootstrap command connects to the existing SCM HA ring and
registers the repaired node (obtaining the cluster ID and, in a secure
cluster, fresh certificates).
          2. Once the bootstrap completes, start the SCM daemon; it joins the
Ratis ring as a follower.
          3. The current leader creates a database checkpoint (a snapshot),
which is securely downloaded and installed locally on the new disk.
          4. The follower then replays any remaining Ratis log entries and
catches up with the leader.
      * Monitor the console output of the bootstrap command and the SCM's log
files (.log and .out). You should see messages related to downloading the
snapshot and joining the ring.
5. VERIFY:
* Once the bootstrap is complete and the daemon is running, the SCM is a
healthy follower in the quorum.
      * Check the SCM Web UI from any of the SCM nodes. The list of peers
should now show all SCMs as healthy, and the cluster is back at full
redundancy. (A condensed command sketch follows this list.)
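
A condensed, hedged sketch of the HA recovery on the repaired node. It reuses
the illustrative device, mount point, service user, and log location from the
earlier sketches, and assumes your release provides ozone admin scm roles for
verification:

    # 1. Stop only the SCM on the node with the failed disk
    ozone --daemon stop scm

    # 2. Mount the replacement disk and fix ownership (paths are examples)
    mount /dev/sdf /data/scm
    chown -R ozone:ozone /data/scm

    # 3. Re-initialize this SCM against the existing HA ring
    ozone scm --bootstrap

    # 4. Start the daemon and watch it rejoin the ring as a follower
    ozone --daemon start scm
    tail -f /var/log/ozone/ozone-scm-*.log   # log location is an assumption

    # 5. Verify quorum membership and roles
    ozone admin scm roles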
---
5. Additional Considerations
* Primordial SCM Node: In an HA setup, the first SCM started with scm --init
is the "primordial" node, which generates the cluster's unique ID. If the
primordial node's disk fails, the recovery procedure is the same (scm
--bootstrap). The cluster ID is preserved by the surviving SCMs and will be
replicated to the repaired node during the bootstrap process.
   * Backups are Still Essential: Even in a robust HA configuration,
maintaining regular, off-site backups of the SCM database is a critical best
practice for recovering from catastrophic multi-node failures or logical data
corruption (an illustrative backup sketch follows).
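
As an illustration of the backup point only: the sketch below copies the SCM
RocksDB directory to off-node storage while that SCM instance is stopped
(for example, briefly stopping one follower in an HA quorum so the files are
quiescent). The paths and the destination host are assumptions:

    # Briefly stop one (follower) SCM so the RocksDB files are not changing
    ozone --daemon stop scm

    # Copy the SCM metadata database off the node (destination illustrative)
    rsync -a /data/scm/db/ backup-host:/backups/scm-db-$(date +%Y%m%d)/

    # Restart the SCM; it will catch up with the quorum
    ozone --daemon start scm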