Lauta11 opened a new issue, #11893:
URL: https://github.com/apache/cloudstack/issues/11893
### problem
Snapshots of large volumes (larger than roughly 200 GB) fail to complete.
The snapshot command hits a timeout of 3600 seconds, which is not enough time
to copy the entire disk.

We tried changing several timeout-related settings in CloudStack (including
those for snapshots and asynchronous jobs), but none of them appears to extend
or otherwise affect this limit.

It is unclear whether this is a bug or a missing feature; either way, there
should be a way to increase the timeout for snapshot operations.

The relevant management server and host logs follow.
```
Management:
2025-10-21 02:01:57,863 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] can be executed.
2025-10-21 02:01:59,126 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Scheduling snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] at [2025-10-21 05:00:00 GMT].
2025-10-21 02:01:59,179 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Scheduled snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] as job [d0e089bb-d2cb-4e81-8fb4-5e6a23eee556].
2025-10-21 02:02:01,607 DEBUG [o.a.c.s.s.StorPoolSnapshotStrategy] (Work-Job-Executor-28:ctx-a246b76c job-33232/job-33233 ctx-6f646c29) (logid:d0e089bb) StorpoolSnapshotStrategy.canHandle: snapshot=server18063_ROOT-392_20251021050158, uuid=70decb5c-6f4d-405a-b7e1-f2eb2ace4de0, op=TAKE
2025-10-21 03:02:01,993 DEBUG [o.a.c.s.s.SnapshotServiceImpl] (Work-Job-Executor-28:ctx-a246b76c job-33232/job-33233 ctx-6f646c29) (logid:d0e089bb) create snapshot server18063_ROOT-392_20251021050158 failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:59, com.cloud.exception.OperationTimedoutException: Commands 545217029888601412 to Host 59 timed out after 3600
Host:
2025-10-21 03:06:26,176 WARN [utils.script.Script] (Script-10:null) (logid:) Interrupting script.
2025-10-21 03:06:26,176 WARN [utils.script.Script] (agentRequest-Handler-2:null) (logid:d0e089bb) Process [3934352] for command [qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename=/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/11806790-0ad0-4a4a-b389-2f7ab41b4e87 /mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687 ] timed out. Output is [].
2025-10-21 03:06:43,053 ERROR [kvm.storage.KVMStorageProcessor] (agentRequest-Handler-2:null) (logid:d0e089bb) Failed take snapshot for volume [volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]]], in VM [i-52-392-VM], due to [Failed to convert volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]] snapshot of volume [KVMPhysicalDisk {"dispName":null,"format":"qcow2","name":"11806790-0ad0-4a4a-b389-2f7ab41b4e87","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11\/11806790-0ad0-4a4a-b389-2f7ab41b4e87","pool":{"uuid":"4c524bab-25a3-3db5-a665-d37159b81f11","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11"},"size":122363461632,"virtualSize":161061273600,"vmName":null}] to [/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687] due to [timeout].].
com.cloud.utils.exception.CloudRuntimeException: Failed to convert volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]] snapshot of volume [KVMPhysicalDisk {"dispName":null,"format":"qcow2","name":"11806790-0ad0-4a4a-b389-2f7ab41b4e87","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11\/11806790-0ad0-4a4a-b389-2f7ab41b4e87","pool":{"uuid":"4c524bab-25a3-3db5-a665-d37159b81f11","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11"},"size":122363461632,"virtualSize":161061273600,"vmName":null}] to [/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687] due to [timeout].
    at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.validateConvertResult(KVMStorageProcessor.java:1915)
    at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.createSnapshot(KVMStorageProcessor.java:1790)
    at com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:140)
    at com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:66)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtStorageSubSystemCommandWrapper.execute(LibvirtStorageSubSystemCommandWrapper.java:36)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtStorageSubSystemCommandWrapper.execute(LibvirtStorageSubSystemCommandWrapper.java:30)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1930)
    at com.cloud.agent.Agent.processRequest(Agent.java:683)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1106)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
```
### versions
CloudStack 4.20 / KVM
OS: Ubuntu 24
### The steps to reproduce the bug
1. Start a snapshot of a volume larger than 200 GB.
2. Wait one hour.
3. The snapshot fails with the timeout error shown in the logs.
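
As a rough cross-check of the numbers in the logs, the sketch below computes the elapsed time between the relevant log lines and the sustained copy rate that would have been needed to finish within the timeout. The timestamps and the `size` value are copied from the log entries in this report; the function and variable names are illustrative only. It assumes the management server and host clocks are in sync and that `size` approximates the amount of data `qemu-img convert` has to copy.

```python
from datetime import datetime

# Timestamp layout used in the CloudStack log lines quoted in this report.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_TS_FORMAT) - datetime.strptime(start, LOG_TS_FORMAT)
    return delta.total_seconds()


def min_throughput_mib_s(size_bytes: int, timeout_s: float) -> float:
    """Return the sustained rate (MiB/s) needed to copy size_bytes within timeout_s."""
    return size_bytes / timeout_s / 2**20


# Values copied from the logs in this report.
strategy_selected = "2025-10-21 02:02:01,607"  # management: StorpoolSnapshotStrategy.canHandle
mgmt_timeout = "2025-10-21 03:02:01,993"       # management: OperationTimedoutException
host_interrupt = "2025-10-21 03:06:26,176"     # host: "Interrupting script."
allocated_bytes = 122363461632                 # "size" of the KVMPhysicalDisk
timeout_seconds = 3600                         # "timed out after 3600"

print(f"management wait: {seconds_between(strategy_selected, mgmt_timeout):.1f} s")
print(f"host interrupt lag: {seconds_between(mgmt_timeout, host_interrupt):.1f} s")
print(f"required rate: {min_throughput_mib_s(allocated_bytes, timeout_seconds):.1f} MiB/s")
```

With these inputs, the management server gave up about 3600 s after the job started, the host interrupted `qemu-img` roughly 264 s after that, and the copy would have needed about 32 MiB/s sustained to finish in time.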
### What to do about it?
Provide a configuration setting that allows this timeout to be changed, or fix
the bug if one of the existing settings is supposed to control it.