[ovirt-users] Re: How can Gluster be a HCI default, when it's hardly ever working?
In this specific case I even used virgin hardware originally. Once I managed to kill the hosted-engine by downgrading the datacenter cluster to legacy, I re-installed all Gluster storage from the VDO level up. No traces of a file system should be left with LVM and XFS on top, even if I didn't actually null the SSD (does writing nulls to an SSD actually cost you an overwrite these days, or is that translated into a trim by the firmware?). There was no difference in terms of faults between the virgin hardware and the re-install, so stale Gluster extended file attributes etc. (your error theory, I believe) are not a factor. Choosing between the 'vmstore' and 'data' domains for the imports makes no difference; neither does full allocation over thin allocation.

But I didn't just see write errors from qemu-img; I also saw read errors, which had me concerned about some other source of corruption. That was another motivation to start with a fresh source, which meant a backup domain instead of an export domain or OVAs. The storage underneath the backup domain is NFS (POSIX has a 4k issue, and I'm not sure I want to try moving Glusters between farms just yet), which is easy to detach at the source and import at the target. If NFS is your default, oVirt can be so much easier, but in that more 'professional' domain we use vSphere and actual SAN storage. The attraction of oVirt for the lab use case depends critically on HCI and Gluster.

The VMs were fine running from the backup domain (which incidentally must have lost its backup attribute at the target, because otherwise it should have kept the VMs from launching...), but once I tried moving their disks to the Gluster storage, I got empty or unusable disks again, or errors while moving. The only way I found to transfer gluster to gluster was to use disk uploads, either via the GUI or via Python, but that results in fully allocated images and is very slow at 50 MB/s even with Python.
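On the point about uploads producing fully allocated images: a run of zero bytes written to a file is real, allocated data, while a hole (a region never written) is not, even though both read back as zeros. A small self-contained Python illustration of the difference (the exact block counts are filesystem-dependent, but the ordering holds on XFS and ext4):

```python
import os
import tempfile

def allocated_blocks_demo(size=1024 * 1024):
    """Create two files of identical size and identical (all-zero) content:
    one with the zeros physically written, one left as a hole.
    Return (st_blocks of the written file, st_blocks of the hole)."""
    with tempfile.TemporaryDirectory() as d:
        full = os.path.join(d, "full.img")
        sparse = os.path.join(d, "sparse.img")
        with open(full, "wb") as f:
            f.write(b"\0" * size)      # zeros actually written to disk
        with open(sparse, "wb") as f:
            f.truncate(size)           # hole: no data blocks allocated
        # st_blocks counts 512-byte units actually allocated on disk.
        return os.stat(full).st_blocks, os.stat(sparse).st_blocks

full_blocks, hole_blocks = allocated_blocks_demo()
print(full_blocks, hole_blocks)  # the written file allocates far more blocks
```

Both files hash and read identically, which is why a naive upload path that streams every byte ends up writing the zeros out as real data.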
BTW, sparsifying does nothing to those images, I guess because sectors full of nulls aren't actually the same as a logically unused sector. At least the VDO underneath should reduce some of the overhead.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/TTQE7YLN5JKABRGSNOFTV3FMMZNO2DRC/
[ovirt-users] Re: How can Gluster be a HCI default, when it's hardly ever working?
Are you reusing a Gluster volume, or have you created a fresh one?

Best Regards,
Strahil Nikolov

On Tuesday, 1 September 2020, 02:58:19 GMT+3, tho...@hoberg.net wrote:

I've just tried to verify what you said here. As a baseline I started with the 1nHCI Gluster setup. From four VMs, two legacy and two Q35 on the single-node Gluster, one survived the import, one failed silently with an empty disk, and two failed somewhere in the middle of qemu-img trying to write the image to the Gluster storage. For each of those two, this always happened at the same block number, a unique one per machine, not in random places, as if qemu-img reading and writing the very same image could not agree. That's two types of error and a 75% failure rate.

I created another domain, basically using an NFS automount export from one of the HCI nodes (a 4.3 node serving as 4.4 storage), and imported the very same VMs (source all 4.3) transported via a re-attached export domain to 4.4. Three of the four imports worked fine, with no error from qemu-img writing to NFS. All VMs had full disk images and launched, which verified that there is nothing wrong with the exports at least. But there was still one that failed with the same qemu-img error.

I then tried to move the disks from NFS to Gluster, which internally is also done via qemu-img, and I had those fail every time. Gluster or HCI seems a bit of a Russian roulette for migrations, and I am wondering how much better it is for normal operations. I'm still going to try moving via a backup domain (on NFS) and moving between that and Gluster, to see if it makes any difference. I really haven't done a lot of stress testing yet with oVirt, but this experience doesn't build confidence.
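Since the failures always hit the same block number per machine, it can help to pin down exactly where a copied image first diverges from its source. A minimal sketch of such a check in Python (a hypothetical helper, not part of qemu-img or any oVirt tooling): it streams two files in parallel and reports the byte offset of the first differing block, or None if the contents match.

```python
def first_divergence(path_a, path_b, block_size=64 * 1024):
    """Return the byte offset of the first block at which two files
    differ, or None if their contents are identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if a != b:
                return offset          # start of the first differing block
            if not a:
                return None            # both files exhausted, no difference
            offset += block_size
```

Running this against the source image and the copy on the destination domain would show whether the corruption really lands at a fixed, per-image offset, as the repeated qemu-img failures suggest.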
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XM6YYH5H455EPGA33MYDLHYY2J3N35UT/
https://lists.ovirt.org/archives/list/users@ovirt.org/message/73RTGJ3K66HSFARUCGAA2OIR22HCDTCB/