gw769 opened a new issue, #11905:
URL: https://github.com/apache/cloudstack/issues/11905

   Hi all,
   
   I’m encountering an intermittent issue when uploading templates to 
CloudStack 4.20.1.0 where Linstor is used as Primary Storage. Here’s the 
detailed breakdown:
   
   
   CloudStack Version: 4.20.1.0
   Primary Storage: Linstor
   - linstor-controller 1.32.3-1ppa1~noble1
   - linstor-satellite 1.32.3-1ppa1~noble1
   - DRBD_KERNEL_VERSION=9.2.14
   System: Ubuntu 24.04.3 LTS
   Hypervisor: KVM
   
   When uploading a template (e.g., an Ubuntu 22.04 qcow2 image), the process 
intermittently fails with NOT_DOWNLOADED in Primary Storage. In my tests, 2 out 
of 3 upload attempts fail, while 1 succeeds randomly.
   
   
   
   Two critical symptoms:
   1. qemu-img binary not found: The CloudStack agent fails to execute qemu-img 
convert because the binary is not found at /usr/local/sbin/qemu-img. I verified 
this by finding the agent PID and attaching strace to it:
   ps -ef | grep cloudstack-agent
   strace -f -s 256 -p 1318808 -o /tmp/agent_trace.log
   
   The trace shows the failed execve call for qemu-img convert:
      ```
   execve("/usr/local/sbin/qemu-img", ["qemu-img", "convert", "-n", 
"--target-is-zero", "-W", "-S", "1M", "-O", "raw", "-t", "none", "-U", 
"--image-opts", "driver=qcow2,file.filename=/mnt/.../xxx.qcow2", 
"/dev/drbd1265"], ...) = -1 ENOENT (No such file or directory)
      ```
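   
   A single ENOENT does not by itself prove that qemu-img is missing: a 
launcher that searches PATH tries each directory in turn, so a failed attempt 
under /usr/local/sbin may be followed by a successful one elsewhere. Whether 
that is what happens here is an assumption that the trace can confirm or rule 
out. The following is a minimal Python sketch, not part of CloudStack, that 
lists every execve of qemu-img in the strace log. The function names are 
hypothetical, and it only handles execve calls that strace logged on a single 
line (calls split into "unfinished"/"resumed" pairs are skipped).

```python
import re

# Matches one complete execve line from an `strace -f -o` log, for example:
#   1318900 execve("/usr/local/sbin/qemu-img", [...], ...) = -1 ENOENT (...)
EXECVE_RE = re.compile(
    r'execve\("(?P<path>[^"]+)".*\)\s+=\s+(?P<ret>-?\d+)(?:\s+(?P<errno>[A-Z]+))?'
)


def qemu_img_execs(lines):
    """Return (path, return_code, errno_name) for every execve of qemu-img."""
    results = []
    for line in lines:
        match = EXECVE_RE.search(line)
        if match and match.group("path").endswith("/qemu-img"):
            results.append(
                (match.group("path"), int(match.group("ret")), match.group("errno"))
            )
    return results


def summarize(lines):
    """Split the qemu-img execve attempts into succeeded paths and failures."""
    attempts = qemu_img_execs(lines)
    succeeded = [path for path, ret, _ in attempts if ret == 0]
    failed = [(path, err) for path, ret, err in attempts if ret != 0]
    return succeeded, failed
```

   Calling summarize(open("/tmp/agent_trace.log")) returns the paths where 
execve succeeded and the (path, errno) pairs where it failed. If the succeeded 
list is empty, qemu-img was not found anywhere on the agent's PATH. If it is 
not empty, the ENOENT above is only one step of the search and the failure lies 
elsewhere.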
   
   
   2. DRBD devices stay “Unused”: Linstor volume listings show every DRBD 
device as Unused (instead of one device being InUse):
   
      ```
      root@NODE76:~# linstor v l |grep  80bc7788-638e-4211-9b31-d76d037b50b3
      | cs-80bc7788-638e-4211-9b31-d76d037b50b3 | NODE158 | DfltDisklessStorPool |     0 |    1105 | /dev/drbd1105 |           | Unused | Diskless | Established(2) |
      | cs-80bc7788-638e-4211-9b31-d76d037b50b3 | NODE76  | x86_pool_zfs_ssd     |     0 |    1105 | /dev/drbd1105 | 20.00 GiB | Unused | UpToDate | Established(2) |
      | cs-80bc7788-638e-4211-9b31-d76d037b50b3 | NODE83  | x86_pool_zfs_ssd     |     0 |    1105 | /dev/drbd1105 | 20.00 GiB | Unused | UpToDate | Established(2) |
      ```
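   
   To compare the InUse column across repeated upload attempts without reading 
the table by eye, the listing can be parsed. This is a minimal Python sketch 
under stated assumptions: each row is on one line, and the column order matches 
the excerpt above (InUse is the eighth cell, State the ninth). Other Linstor 
versions may lay the table out differently, and the function names are 
hypothetical.

```python
def parse_volume_rows(text):
    """Parse pipe-delimited `linstor v l` rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip borders, headers, and anything that is not a CloudStack volume.
        if len(cells) < 9 or not cells[0].startswith("cs-"):
            continue
        rows.append(
            {
                "resource": cells[0],
                "node": cells[1],
                "in_use": cells[7],  # assumed column position
                "state": cells[8],   # assumed column position
            }
        )
    return rows


def nodes_in_use(text, resource):
    """Return the nodes on which the given resource is reported as InUse."""
    return [
        row["node"]
        for row in parse_volume_rows(text)
        if row["resource"] == resource and row["in_use"] == "InUse"
    ]
```

   An empty result from nodes_in_use for the template's resource while the 
upload is running corresponds to the all-Unused listing shown above.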
   
   
   Steps to Reproduce:
   1. Upload a qcow2 template (e.g., `ubuntu-22.04.4-amd64-base-UEFI.qcow2`) to 
CloudStack.
   2. Monitor the Primary Storage (Linstor) download status.
   3. Repeat the upload 3+ times and observe intermittent `NOT_DOWNLOADED` 
failures.
   
   
   Expected Behavior:
   The template should download to Primary Storage successfully every time, 
with `qemu-img convert` executing properly and DRBD devices showing `InUse`.
   
   
   Actual Behavior:
   - Intermittent failures (2 out of 3 attempts fail).
   - DRBD resources remain `Unused` during failed attempts.
   
   
   Questions:
   - What configuration or troubleshooting steps should I take to resolve this 
intermittent behavior?
   
   
   Any insights or guidance would be greatly appreciated!
   
   Thanks,
   
   <img width="1598" height="865" alt="Image" 
src="https://github.com/user-attachments/assets/12b26478-f480-4201-90ab-407f5fd1522b"
 />


