abh1sar commented on code in PR #13074:
URL: https://github.com/apache/cloudstack/pull/13074#discussion_r3324134863
##########
plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtTakeBackupCommandWrapper.java:
##########
@@ -94,21 +113,46 @@ public Answer execute(TakeBackupCommand command,
LibvirtComputingResource libvir
return answer;
}
+ // Strip out our incremental marker lines before parsing size, so the
legacy
+ // numeric-suffix parser keeps working.
+ String stdout = result.second().trim();
+ String bitmapCreated = null;
+ boolean incrementalFallback = false;
+ StringBuilder filtered = new StringBuilder();
+ for (String line : stdout.split("\n")) {
+ String trimmed = line.trim();
+ if (trimmed.startsWith("BITMAP_CREATED=")) {
Review Comment:
The wrapper is passing the new bitmap to nasbackup.sh.
So why read it again from the scipt. Can't we just used the passed value?
##########
core/src/main/java/org/apache/cloudstack/backup/RebaseBackupCommand.java:
##########
@@ -0,0 +1,73 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+package org.apache.cloudstack.backup;
+
+import com.cloud.agent.api.Command;
+import com.cloud.agent.api.LogLevel;
+
+/**
+ * Tells the KVM agent to rebase a NAS backup qcow2 onto a new backing parent.
Used by the
+ * NAS backup provider during chain repair when a middle incremental is being
deleted: the
+ * immediate child must absorb the soon-to-be-deleted parent's blocks and then
re-link to
+ * the grandparent. Both target and new-backing paths are NAS-mount-relative.
+ */
+public class RebaseBackupCommand extends Command {
Review Comment:
I think you mentioned this in a previous comment also. Can you please remove
theRebaseBackupCommand and its wrapper as it is not being used now.
##########
plugins/backup/nas/src/main/java/org/apache/cloudstack/backup/NASBackupChainKeys.java:
##########
@@ -0,0 +1,66 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+package org.apache.cloudstack.backup;
+
+/**
+ * Keys used by the NAS backup provider when storing incremental-chain metadata
+ * in the existing {@code backup_details} key/value table. Stored here (not on
+ * the {@code backups} table) so other providers do not need a schema change to
+ * support their own incremental implementations.
+ */
+public final class NASBackupChainKeys {
+
+ /** UUID of the parent backup (full or previous incremental). Empty for
full backups. */
+ public static final String PARENT_BACKUP_ID = "nas.parent_backup_id";
+
+ /** QEMU dirty-bitmap name created by this backup, used as the {@code
<incremental>} reference for the next one. */
+ public static final String BITMAP_NAME = "nas.bitmap_name";
+
+ /** Identifier shared by every backup in the same chain (the full anchors
a chain; its incrementals inherit the id). */
+ public static final String CHAIN_ID = "nas.chain_id";
+
+ /** Position within the chain: 0 for the full, 1 for the first
incremental, and so on. */
+ public static final String CHAIN_POSITION = "nas.chain_position";
+
+ /**
+ * In-memory chain-mode sentinels used by {@code ChainDecision.mode}. The
persisted
+ * full-vs-incremental backup type lives on the {@code backup.type} column
(set in
+ * {@code takeBackup}) — single source of truth. Not duplicated into
backup_details.
+ */
+ public static final String TYPE_FULL = "full";
+ public static final String TYPE_INCREMENTAL = "incremental";
Review Comment:
These two strings are not being used anywhere in the code. Can be removed
##########
scripts/vm/hypervisor/kvm/nasbackup.sh:
##########
@@ -191,24 +337,39 @@ backup_running_vm() {
virsh -c qemu:///system domblklist "$VM" --details 2>/dev/null | awk
'$2=="disk"{print $3, $4}'
)
- rm -f $dest/backup.xml
+ rm -f $dest/backup.xml $dest/checkpoint.xml
sync
# Print statistics
virsh -c qemu:///system domjobinfo $VM --completed
du -sb $dest | cut -f1
+ if [[ -n "$BITMAP_NEW" ]]; then
+ # Echo the bitmap name on its own line so the Java caller can capture it
for backup_details.
+ echo "BITMAP_CREATED=$BITMAP_NEW"
+ fi
umount $mount_point
rmdir $mount_point
}
backup_stopped_vm() {
+ # Stopped VMs cannot use libvirt's backup-begin (no QEMU process). Take a
full
+ # backup via qemu-img convert. If the caller asked for incremental, fall back
+ # to full and signal the fallback so the orchestrator can record it as a full
+ # in the chain.
+ if [[ "$MODE" == "incremental" ]]; then
+ # Emit on stdout so Script.executePipedCommands in
LibvirtTakeBackupCommandWrapper
+ # can parse it and record the backup as FULL.
+ echo "INCREMENTAL_FALLBACK=full (VM stopped — incremental requires running
VM)"
Review Comment:
We are always sending `mode=full` for stopped VMs. So non need to check
again in the script.
If we can avoid echoing INCREMENTAL_FALLBACK and BITMAP_CREATED from the
script it will reduce lots of complexity in parsing the output
The below code in `LibvirtTakeBackupCommandWrapper` can also be removed then
```
// Strip out our incremental marker lines before parsing size, so
the legacy
// numeric-suffix parser keeps working.
String stdout = result.second().trim();
String bitmapCreated = null;
boolean incrementalFallback = false;
StringBuilder filtered = new StringBuilder();
for (String line : stdout.split("\n")) {
String trimmed = line.trim();
if (trimmed.startsWith("BITMAP_CREATED=")) {
bitmapCreated =
trimmed.substring("BITMAP_CREATED=".length());
continue;
}
if (trimmed.startsWith("INCREMENTAL_FALLBACK=")) {
incrementalFallback = true;
continue;
}
if (filtered.length() > 0) {
filtered.append("\n");
}
filtered.append(line);
}
String numericOutput = filtered.toString().trim();
```
##########
scripts/vm/hypervisor/kvm/nasbackup.sh:
##########
@@ -113,20 +126,104 @@ backup_running_vm() {
mount_operation
mkdir -p "$dest" || { echo "Failed to create backup directory $dest"; exit
1; }
+ # Determine effective mode for this run.
+ # Legacy callers (no -M argument) get the original full-only behavior with
no checkpoint.
+ local effective_mode="${MODE:-legacy-full}"
+ local make_checkpoint=0
+ case "$effective_mode" in
+ incremental)
+ if [[ -z "$BITMAP_PARENT" || -z "$BITMAP_NEW" || -z "$PARENT_PATHS" ]];
then
+ echo "incremental mode requires --bitmap-parent, --bitmap-new, and
--parent-paths"
+ cleanup
+ exit 1
+ fi
+ make_checkpoint=1
+ ;;
+ full)
+ if [[ -z "$BITMAP_NEW" ]]; then
+ echo "full mode requires --bitmap-new (the bitmap to create for the
next incremental)"
+ cleanup
+ exit 1
+ fi
+ make_checkpoint=1
+ ;;
+ legacy-full)
+ make_checkpoint=0
+ ;;
+ *)
+ echo "Unknown mode: $effective_mode"
+ cleanup
+ exit 1
+ ;;
+ esac
+
+ # When incremental, verify the parent bitmap still exists on the running
domain.
+ # CloudStack rebuilds the libvirt domain XML on every VM start, so libvirt's
checkpoint
+ # registry is wiped — but the bitmap may still exist on the qcow2 itself (we
pre-seed
+ # one on stopped-VM backups, see backup_stopped_vm). If the parent is
missing from
+ # libvirt's view, recreate it. If it's missing entirely (qcow2 too), this
falls through
+ # to a fresh-create which captures all writes since — slightly larger but
correct.
+ if [[ "$effective_mode" == "incremental" ]]; then
+ if ! virsh -c qemu:///system checkpoint-list "$VM" --name 2>/dev/null |
grep -qx "$BITMAP_PARENT"; then
+ cat > $dest/recreate-checkpoint.xml <<XML
+<domaincheckpoint><name>$BITMAP_PARENT</name><disks>
+$(virsh -c qemu:///system domblklist "$VM" --details 2>/dev/null | awk
'$2=="disk"{printf "<disk name=\"%s\"/>\n", $3}')
+</disks></domaincheckpoint>
+XML
+ if ! virsh -c qemu:///system checkpoint-create "$VM" --xmlfile
$dest/recreate-checkpoint.xml > /dev/null 2>&1; then
+ # If a bitmap of the same name already lives on the qcow2 (pre-seeded
by an
+ # offline backup) libvirt 7.2+ should reuse it during
checkpoint-create. Older
+ # libvirt fails the create — clean up the orphan bitmap and retry as a
fresh.
+ local retried_ok=1
+ for disk_path in $(virsh -c qemu:///system domblklist "$VM" --details
2>/dev/null | awk '$2=="disk"{print $4}'); do
+ [[ -f "$disk_path" ]] && qemu-img bitmap --remove "$disk_path"
"$BITMAP_PARENT" 2>/dev/null || true
+ done
+ if ! virsh -c qemu:///system checkpoint-create "$VM" --xmlfile
$dest/recreate-checkpoint.xml > /dev/null 2>&1; then
+ retried_ok=0
+ fi
+ if [[ "$retried_ok" == "0" ]]; then
+ echo "Failed to recreate parent bitmap $BITMAP_PARENT for $VM"
+ cleanup
+ exit 1
+ fi
+ fi
+ rm -f $dest/recreate-checkpoint.xml
+ fi
+ fi
+
Review Comment:
While this is needed and historically we have put all of the logic for
taking backup in nasbackup.sh, a very big bash file is harder to maintain and
more error prone.
Can I suggest moving out this logic of redefining checkpoint to the Java
code (LibvirtTakeBackupCommandWrapper) before doing the actual backup?
Also, I think there are a few issues with the checkpoint.xml here.
I think you need a createdAt tag and the `disks` tag is not really required.
Can you check out `ensureFromCheckpointExists()` in
https://github.com/shapeblue/cloudstack/blob/integration-veeam-kvm/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtStartBackupCommandWrapper.java#L114
I have implemented the same thing for a different PR (Veeam-kvm integration)
here.
##########
core/src/main/java/org/apache/cloudstack/backup/RebaseBackupCommand.java:
##########
@@ -0,0 +1,73 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+package org.apache.cloudstack.backup;
+
+import com.cloud.agent.api.Command;
+import com.cloud.agent.api.LogLevel;
+
+/**
+ * Tells the KVM agent to rebase a NAS backup qcow2 onto a new backing parent.
Used by the
+ * NAS backup provider during chain repair when a middle incremental is being
deleted: the
+ * immediate child must absorb the soon-to-be-deleted parent's blocks and then
re-link to
+ * the grandparent. Both target and new-backing paths are NAS-mount-relative.
+ */
+public class RebaseBackupCommand extends Command {
Review Comment:
Also remove any occurrence of it in nasbackup.sh
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]