This is an automated email from the ASF dual-hosted git repository.
adoroszlai pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git
The following commit(s) were added to refs/heads/master by this push:
new 99328c24239 HDDS-15181. Robot test for snapshot defrag (#10200)
99328c24239 is described below
commit 99328c242391d6288e75cf3ff6606972e7f8faa5
Author: Arun Sarin <[email protected]>
AuthorDate: Mon May 11 22:29:58 2026 +0530
HDDS-15181. Robot test for snapshot defrag (#10200)
---
.../dist/src/main/compose/ozone/docker-config | 3 +
hadoop-ozone/dist/src/main/compose/ozone/test.sh | 5 +-
.../main/smoketest/snapshot/snapshot-defrag.robot | 172 +++++++++++++++++++++
3 files changed, 179 insertions(+), 1 deletion(-)
diff --git a/hadoop-ozone/dist/src/main/compose/ozone/docker-config
b/hadoop-ozone/dist/src/main/compose/ozone/docker-config
index ecca3a971c6..ef8430bfaec 100644
--- a/hadoop-ozone/dist/src/main/compose/ozone/docker-config
+++ b/hadoop-ozone/dist/src/main/compose/ozone/docker-config
@@ -67,3 +67,6 @@ no_proxy=om,scm,s3g,recon,kdc,localhost,127.0.0.1
# Explicitly enable filesystem snapshot feature for this Docker compose cluster
OZONE-SITE.XML_ozone.filesystem.snapshot.enabled=true
+
+# Periodic snapshot defrag for smoketest snapshot/snapshot-defrag.robot
(HDDS-15181)
+OZONE-SITE.XML_ozone.snapshot.defrag.service.interval=30s
diff --git a/hadoop-ozone/dist/src/main/compose/ozone/test.sh
b/hadoop-ozone/dist/src/main/compose/ozone/test.sh
index 653a0aaf766..800f1c41f28 100755
--- a/hadoop-ozone/dist/src/main/compose/ozone/test.sh
+++ b/hadoop-ozone/dist/src/main/compose/ozone/test.sh
@@ -55,4 +55,7 @@ execute_robot_test scm -v SCHEME:ofs -N ozonefs-obs
ozonefs/ozonefs-obs.robot
execute_robot_test s3g grpc/grpc-om-s3-metrics.robot
-execute_robot_test scm --exclude pre-finalized-snapshot-tests snapshot
+execute_robot_test scm --exclude om_filesystem --exclude
pre-finalized-snapshot-tests snapshot
+
+# snapshot-defrag.robot reads OmSnapshot local YAML under the OM data
directory; Robot must run in the om container.
+execute_robot_test om snapshot/snapshot-defrag.robot
diff --git
a/hadoop-ozone/dist/src/main/smoketest/snapshot/snapshot-defrag.robot
b/hadoop-ozone/dist/src/main/smoketest/snapshot/snapshot-defrag.robot
new file mode 100644
index 00000000000..0dd42a2e0b0
--- /dev/null
+++ b/hadoop-ozone/dist/src/main/smoketest/snapshot/snapshot-defrag.robot
@@ -0,0 +1,172 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+*** Settings ***
+Documentation Basic checks that snapshots still look correct while the
OM runs periodic
+... snapshot defrag in the background (Jira HDDS-15181 /
parent HDDS-13003).
+... Cluster setup: filesystem snapshots on; defrag interval in
compose/ozone
+... docker-config. The unsecure compose test.sh uses
start_docker_env, which starts
+... three datanodes by default; test.sh sets
OZONE_REPLICATION_FACTOR=3. This suite
+... should be run with execute_robot_test om: Robot runs
inside the OM container and
+... Snapshot Local YAML checks read paths under /data/metadata
on that OM host.
+Force Tags om_filesystem
+Library OperatingSystem
+Resource ../ozone-lib/shell.robot
+Resource snapshot-setup.robot
+Suite Setup Run Keywords Assert Snapshot Defrag Interval Is Positive
+... AND Detect Rocks Tools Available
+... AND Prepare Suite With Bucket And First Snapshot
+Test Timeout 20 minutes
+
+*** Variables ***
+${DEFRAG_POLL_TIMEOUT} 10 minutes
+${DEFRAG_POLL_INTERVAL} 5 seconds
+${DEFRAG_FALLBACK_SECONDS} 65
+
+*** Test Cases ***
+Read Snapshot Data Right After Create
+ [Documentation] You can read the snapshotted key from the .snapshot
path as soon as the snapshot exists.
+ Key Should Match Local File ${SNAP_KEY_PATH_ONE} /etc/hosts
+
+After Waiting Keys Still Match Through Snapshot And On Live Bucket
+ [Documentation] Add a new key on the live bucket, wait so defrag may
run, then confirm the snapshot
+ ... still has the old file and the live bucket has the new
one.
+ ${key_two} = snapshot-setup.Create key ${VOLUME}
${BUCKET} /etc/passwd
+ Set Suite Variable ${KEY_TWO} ${key_two}
+ Set Suite Variable ${LIVE_KEY_TWO_PATH}
/${VOLUME}/${BUCKET}/${key_two}
+ Wait Until Snapshot Local YAML Shows Defragged ${SNAPSHOT_ONE}
+ Key Should Match Local File ${SNAP_KEY_PATH_ONE} /etc/hosts
+ Key Should Match Local File ${LIVE_KEY_TWO_PATH} /etc/passwd
+
+Snapshot List Still Shows Active
+ [Documentation] ozone sh snapshot ls still lists this snapshot as
SNAPSHOT_ACTIVE.
+ ${result} = Execute ozone sh snapshot ls
/${VOLUME}/${BUCKET}
+ Should contain ${result} ${SNAPSHOT_ONE}
+ Should contain ${result} SNAPSHOT_ACTIVE
+
+Second Snapshot Sees All Keys So Far
+ [Documentation] Take another snapshot after adding a third key; older
snapshot still only has the first key;
+ ... newer snapshot can read all three keys.
+ ${key_three} = snapshot-setup.Create key ${VOLUME}
${BUCKET} /etc/group
+ Set Suite Variable ${KEY_THREE} ${key_three}
+ ${snapshot_two} = Create snapshot ${VOLUME}
${BUCKET}
+ Set Suite Variable ${SNAPSHOT_TWO} ${snapshot_two}
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_ONE}/${KEY_ONE}
/etc/hosts
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_ONE}
/etc/hosts
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_TWO}
/etc/passwd
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_THREE}
/etc/group
+
+Snapshot Diff Starts A New Job
+ [Documentation] Comparing the two snapshots prints the usual “new job”
and --get-report hint (like snapshot-sh.robot).
+ ${result} = Execute ozone sh snapshot diff
/${VOLUME}/${BUCKET} ${SNAPSHOT_ONE} ${SNAPSHOT_TWO}
+ Should contain ${result} Submitting a new job
+ Should contain ${result} --get-report option
+
+Snapshot Diff Json Report Lists Added Keys
+ [Documentation] Full JSON report finishes with DONE and lists the keys
that appeared after the first snapshot.
+ ${result} = Execute ozone sh snapshot diff --get-report
--json /${VOLUME}/${BUCKET} ${SNAPSHOT_ONE} ${SNAPSHOT_TWO}
+ Should contain echo '${result}' | jq '.jobStatus'
DONE
+ Should contain echo '${result}' | jq
'.snapshotDiffReport.volumeName' ${VOLUME}
+ Should contain echo '${result}' | jq
'.snapshotDiffReport.bucketName' ${BUCKET}
+ Should contain echo '${result}' | jq
'.snapshotDiffReport.fromSnapshot' ${SNAPSHOT_ONE}
+ Should contain echo '${result}' | jq
'.snapshotDiffReport.toSnapshot' ${SNAPSHOT_TWO}
+ Should contain echo '${result}' | jq
'.snapshotDiffReport.diffList | .[].sourcePath' ${KEY_TWO}
+ Should contain echo '${result}' | jq
'.snapshotDiffReport.diffList | .[].sourcePath' ${KEY_THREE}
+
+After More Defrag Time Snapshot Info And Reads Stay Consistent
+ [Documentation] Poll OmSnapshot local YAML until defrag is recorded
(version > 0, needsDefrag false),
+ ... then re-check ozone sh snapshot info and snapshot
reads. We do not rerun snapshot
+ ... diff --get-report here: a completed diff report is
served from cache for
+ ... ozone.om.snapshot.diff.job.report.persistent.time, so
that call would not retrigger work.
+ Wait Until Snapshot Local YAML Shows Defragged ${SNAPSHOT_ONE}
+ Wait Until Snapshot Local YAML Shows Defragged ${SNAPSHOT_TWO}
+ ${info_one} = Execute ozone sh snapshot info
/${VOLUME}/${BUCKET} ${SNAPSHOT_ONE}
+ Should contain echo '${info_one}' | jq
'.volumeName' ${VOLUME}
+ Should contain echo '${info_one}' | jq
'.bucketName' ${BUCKET}
+ Should contain echo '${info_one}' | jq '.name'
${SNAPSHOT_ONE}
+ Should contain echo '${info_one}' | jq
'.snapshotStatus' SNAPSHOT_ACTIVE
+ ${snap_id_one} = Execute echo '${info_one}' | jq -r
'.snapshotId'
+ Should contain echo '${info_one}' | jq -r
'.checkpointDir' ${snap_id_one}
+ ${info_two} = Execute ozone sh snapshot info
/${VOLUME}/${BUCKET} ${SNAPSHOT_TWO}
+ Should contain echo '${info_two}' | jq
'.volumeName' ${VOLUME}
+ Should contain echo '${info_two}' | jq
'.bucketName' ${BUCKET}
+ Should contain echo '${info_two}' | jq '.name'
${SNAPSHOT_TWO}
+ Should contain echo '${info_two}' | jq
'.snapshotStatus' SNAPSHOT_ACTIVE
+ ${snap_id_two} = Execute echo '${info_two}' | jq -r
'.snapshotId'
+ Should contain echo '${info_two}' | jq -r
'.checkpointDir' ${snap_id_two}
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_ONE}/${KEY_ONE}
/etc/hosts
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_ONE}
/etc/hosts
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_TWO}
/etc/passwd
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_THREE}
/etc/group
+
+Delete Older Snapshot Younger One Still Readable
+ [Documentation] Delete the first snapshot; it shows SNAPSHOT_DELETED;
read all keys through the second snapshot path.
+ ${output} = Execute ozone sh snapshot delete
/${VOLUME}/${BUCKET} ${SNAPSHOT_ONE}
+ Should not contain ${output}
Failed
+ ${output} = Execute ozone sh snapshot ls
/${VOLUME}/${BUCKET} | jq --arg n '${SNAPSHOT_ONE}' '[.[] | select(.name == $n)
| .snapshotStatus] | if length > 0 then .[] else "SNAPSHOT_DELETED" end'
+ Should contain ${output}
SNAPSHOT_DELETED
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_ONE}
/etc/hosts
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_TWO}
/etc/passwd
+ Key Should Match Local File
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${SNAPSHOT_TWO}/${KEY_THREE}
/etc/group
+
+*** Keywords ***
+Assert Snapshot Defrag Interval Is Positive
+ [Documentation] Default ozone.snapshot.defrag.service.interval is -1
(service off). This suite needs a
+ ... positive interval (compose/ozone docker-config).
+ ${ival} = Execute ozone getconf confKey
ozone.snapshot.defrag.service.interval
+ ${ival} = Strip String ${ival}
+ Should Not Contain ${ival} -1
+ Should Not Be Equal As Strings ${ival} ${EMPTY}
+
+Detect Rocks Tools Available
+ [Documentation] SnapshotDefragService needs rocks-tools JNI. CI Linux
dist embeds .so; a Mac-built jar may only
+ ... contain .dylib, so defrag never runs and YAML version
stays 0.
+ ${out} = Execute ozone debug checknative
+ ${ok} = Run Keyword And Return Status Should Match Regexp ${out}
(?m)^\\s*rocks-tools:\\s+true\\b
+ Set Suite Variable ${ROCKS_TOOLS_AVAILABLE} ${ok}
+
+Get Snapshot Local YAML Path
+ [Arguments] ${snapshot_name}
+ ${info} = Execute ozone sh snapshot info /${VOLUME}/${BUCKET}
${snapshot_name}
+ ${snapshot_id} = Execute echo '${info}' | jq -r '.snapshotId'
+ [Return]
/data/metadata/db.snapshots/checkpointState/om.db-${snapshot_id}.yaml
+
+Snapshot Local YAML Should Show Defragged
+ [Arguments] ${snapshot_name}
+ ${yaml} = Get Snapshot Local YAML Path ${snapshot_name}
+ Execute test -f '${yaml}'
+ ${version} = Execute awk '/^[[:space:]]*version:/ {print $2;
exit}' '${yaml}'
+ ${version} = Strip String ${version}
+ ${v} = Convert To Integer ${version}
+ Should Be True ${v} > 0
+ ${needs_defrag} = Execute awk '/^[[:space:]]*needsDefrag:/ {print $2;
exit}' '${yaml}'
+ ${needs_defrag} = Strip String ${needs_defrag}
+ Should Be Equal ${needs_defrag} false
+
+Wait Until Snapshot Local YAML Shows Defragged
+ [Arguments] ${snapshot_name}
+ Run Keyword Unless ${ROCKS_TOOLS_AVAILABLE} Log
+ ... rocks-tools JNI not loaded in OM; using ${DEFRAG_FALLBACK_SECONDS}s
wait instead of YAML defrag poll (build Linux dist with -Drocks_tools_native to
exercise reviewer YAML path).
+ Run Keyword Unless ${ROCKS_TOOLS_AVAILABLE} Sleep
${DEFRAG_FALLBACK_SECONDS}
+ Run Keyword If ${ROCKS_TOOLS_AVAILABLE} Wait Until Keyword Succeeds
${DEFRAG_POLL_TIMEOUT} ${DEFRAG_POLL_INTERVAL}
+ ... Snapshot Local YAML Should Show Defragged ${snapshot_name}
+
+Prepare Suite With Bucket And First Snapshot
+ Setup volume and bucket
+ ${key_one} = snapshot-setup.Create key ${VOLUME}
${BUCKET} /etc/hosts
+ Set Suite Variable ${KEY_ONE} ${key_one}
+ ${snapshot_one} = Create snapshot ${VOLUME}
${BUCKET}
+ Set Suite Variable ${SNAPSHOT_ONE} ${snapshot_one}
+ Set Suite Variable ${SNAP_KEY_PATH_ONE}
/${VOLUME}/${BUCKET}/${SNAPSHOT_INDICATOR}/${snapshot_one}/${key_one}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]