Erik, the original problem sounds to me rather like a qemu problem 😕 When using GFAPI with libvirtd, keep an eye on SELinux/AppArmor; this is sometimes really troublesome!
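A quick way to rule that out is to watch for denials while the VM starts (only a rough sketch; the exact commands and log locations depend on your distribution, these are what I would try on Ubuntu respectively RHEL/Rocky):

    # AppArmor (Ubuntu/SLES): look for DENIED messages against the qemu/libvirt profiles
    dmesg | grep -i 'apparmor="DENIED"'
    aa-status | grep -i -e qemu -e libvirt

    # SELinux (RHEL/Rocky): check for AVC denials around the time the VM starts
    ausearch -m avc -ts recent

    # As a test only: set  security_driver = "none"  in /etc/libvirt/qemu.conf,
    # restart libvirtd and retry. If GFAPI then works, it is a labelling/profile problem.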
Your config looks exactly like mine, with the exception that I don't use scsi. I host the plain images (raw & qcow) on the gluster volume like <target dev="vda" bus="virtio"/> (a fuller sketch of such a disk stanza follows below the quoted mails).

A.

On Monday, 14.10.2024 at 15:57 +0000, Jacobson, Erik wrote:
> First a heartfelt thanks for writing back.
>
> In a solution (not having this issue) we do use nfs-ganesha to host
> filesystem squashfs root FS objects to compute nodes. It is working
> great. We also have fuse-through-LIO.
>
> The solution here is 3 servers making up the cluster admin node.
>
> The XFS issue is only observed when we try to replace an existing one
> with another XFS on top, and only with RAW, and only inside the VM.
> So it isn’t like data is being corrupted. However, it’s hard to
> replace a filesystem with another like you would do if you re-install
> one of what may be several operating systems on that disk image.
>
> I am interested in your GFAPI information. I rebuilt RHEL9.4 qemu and
> changed the spec file to produce the needed gluster block package,
> and referred to the image file via the gluster protocol. My system
> got horrible scsi errors and sometimes didn’t even boot from a live
> environment. I repeated the same failure with sles15. I did this with
> a direct setup (not volumes/pools/etc).
>
> I could experiment with Ubuntu if needed, so that was a good data
> point.
>
> I am interested in your setup to see what I may have missed. If I
> simply made a mistake configuring GFAPI, that would be welcome news.
>
>  <devices>
>    <emulator>/usr/libexec/qemu-kvm</emulator>
>    <disk type='network' device='disk'>
>      <driver name='qemu' type='raw' cache='none'/>
>      <source protocol='gluster' name='adminvm/images/adminvm.img' index='2'>
>        <host name='localhost' port='24007'/>
>      </source>
>      <backingStore/>
>      <target dev='sdh' bus='scsi'/>
>      <alias name='scsi1-0-0-0'/>
>      <address type='drive' controller='1' bus='0' target='0' unit='0'/>
>    </disk>
>
> From: Gluster-users <gluster-users-boun...@gluster.org> on behalf of Andreas Schwibbe <a.schwi...@gmx.net>
> Date: Monday, October 14, 2024 at 4:34 AM
> To: gluster-users@gluster.org <gluster-users@gluster.org>
> Subject: Re: [Gluster-users] XFS corruption reported by QEMU virtual machine with image hosted on gluster
>
> Hey Erik,
>
> I am running a similar setup with no issues, with Ubuntu host systems
> on HPE DL380 Gen 10.
> I used to run libvirt/qemu via nfs-ganesha on top of gluster
> flawlessly.
> Recently I upgraded to the native GFAPI implementation, which is
> poorly documented, with snippets all over the internet.
>
> Although I cannot provide a direct solution for your issue, I suggest
> trying either nfs-ganesha as a replacement for the fuse mount, or GFAPI.
> Happy to share libvirt/GFAPI config hints to make it happen.
>
> Best
> A.
>
> On Sunday, 13.10.2024 at 21:59 +0000, Jacobson, Erik wrote:
> > Hello all! We are experiencing a strange problem with QEMU virtual
> > machines where the virtual machine image is hosted on a gluster
> > volume. Access is via fuse. (Our GFAPI attempt failed; it doesn’t seem
> > to work properly with current QEMU/distro/gluster.) We have the
> > volume tuned for ‘virt’.
> >
> > So we use qemu-img to create a raw image. You can use sparse or
> > falloc with equal results. We start a virtual machine (libvirt,
> > qemu-kvm) and libvirt/qemu points to the fuse mount with the QEMU
> > image file we created.
> > When we create partitions and filesystems – like you might do for
> > installing an operating system – all is well at first. This
> > includes a root XFS filesystem.
> >
> > When we try to re-make the XFS filesystem over the old one, it will
> > not mount and will report XFS corruption.
> > If you dig into xfs_repair, you can find a UUID mismatch between
> > the superblock and the log. The log always retains the UUID of the
> > original filesystem (the one we tried to replace). Running
> > xfs_repair doesn’t truly repair, it just reports more corruption.
> > Forcing xfs_db to remake the log doesn’t help either.
> >
> > We can duplicate this even with a QEMU raw image of 50 megabytes.
> > As far as we can tell, XFS is the only filesystem showing this
> > behavior, or at least the only one reporting a problem.
> >
> > If we take QEMU out of the picture and create partitions directly
> > on the QEMU raw image file, then use kpartx to create devices for
> > the partitions, and run a similar test – the gluster-hosted image
> > behaves as you would expect and there is no problem reported by
> > XFS. We can’t duplicate the problem outside of QEMU.
> >
> > We have observed the issue with Rocky 9.4 and SLES15 SP5
> > environments (including the matching QEMU versions). We have not
> > tested more distros yet.
> >
> > We observed the problem originally with Gluster 9.3. We reproduced
> > it with Gluster 9.6 and 10.5.
> >
> > If we switch from QEMU RAW to QCOW2, the problem disappears.
> >
> > The problem is not reproduced when we take gluster out of the
> > equation (meaning, pointing QEMU at a local disk image instead of
> > a gluster-hosted one – that works fine).
> >
> > The problem can be reproduced this way:
> > * Assume /adminvm/images is on a gluster sharded volume
> > * rm /adminvm/images/adminvm.img
> > * qemu-img create -f raw /adminvm/images/adminvm.img 50M
> >
> > Now start the virtual machine that refers to the above adminvm.img file:
> > * Boot up a rescue environment or a live mode or similar
> > * sgdisk --zap-all /dev/sda
> > * sgdisk --set-alignment=4096 --clear /dev/sda
> > * sgdisk --set-alignment=4096 --new=1:0:0 /dev/sda
> > * mkfs.xfs -L fs1 /dev/sda1
> > * mkdir -p /a
> > * mount /dev/sda1 /a
> > * umount /a
> > * # MAKE same FS again:
> > * mkfs.xfs -f -L fs1 /dev/sda1
> > * mount /dev/sda1 /a
> > * This will fail with kernel back traces and corruption reported
> > * xfs_repair will report the log vs superblock UUID mismatch I
> >   mentioned
> >
> > Here are the volume settings:
> >
> > # gluster volume info adminvm
> >
> > Volume Name: adminvm
> > Type: Replicate
> > Volume ID: de655913-aad9-4e17-bac4-ff0ad9c28223
> > Status: Started
> > Snapshot Count: 0
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: 172.23.254.181:/data/brick_adminvm_slot2
> > Brick2: 172.23.254.182:/data/brick_adminvm_slot2
> > Brick3: 172.23.254.183:/data/brick_adminvm_slot2
> > Options Reconfigured:
> > storage.owner-gid: 107
> > storage.owner-uid: 107
> > performance.io-thread-count: 32
> > network.frame-timeout: 10800
> > cluster.lookup-optimize: off
> > server.keepalive-count: 5
> > server.keepalive-interval: 2
> > server.keepalive-time: 10
> > server.tcp-user-timeout: 20
> > network.ping-timeout: 20
> > server.event-threads: 4
> > client.event-threads: 4
> > cluster.choose-local: off
> > user.cifs: off
> > features.shard: on
> > cluster.shd-wait-qlength: 10000
> > cluster.shd-max-threads: 8
> > cluster.locking-scheme: granular
> > cluster.data-self-heal-algorithm: full
> > cluster.server-quorum-type: server
> > cluster.quorum-type: auto
> > cluster.eager-lock: enable
> > performance.strict-o-direct: on
> > network.remote-dio: disable
> > performance.low-prio-threads: 32
> > performance.io-cache: off
> > performance.read-ahead: off
> > performance.quick-read: off
> > cluster.granular-entry-heal: enable
> > storage.fips-mode-rchecksum: on
> > transport.address-family: inet
> > nfs.disable: on
> > performance.client-io-threads: on
> >
> > Any help or ideas would be appreciated. Let us know if we have a
> > setting incorrect or have made an error.
> >
> > Thank you all!
> >
> > Erik
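As mentioned at the top: here is roughly what the GFAPI disk stanza looks like on my hosts, using virtio instead of scsi (only a sketch; I reuse Erik's volume name, image path and host as placeholders, and libvirt fills in the alias/address elements itself):

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='gluster' name='adminvm/images/adminvm.img'>
        <host name='localhost' port='24007'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>

Whether dropping the scsi controller changes the XFS behaviour you see I cannot promise, but it removes one layer from the comparison with my working setup.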
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users