[ovirt-users] Unable to install OVirt 4.4.6 on IvyBridge
Hi, I'm attempting to build an oVirt node on a server. The host is RHEL 8.4, and I'm attempting to install oVirt 4.4.6 per the instructions at https://www.ovirt.org/documentation/installing_ovirt_as_a_self-hosted_engine_using_the_command_line/index.html. I'm receiving the following error:

```
[ INFO ] The host has been set in non_operational status, deployment errors:
         code 156: Host redacted.com moved to Non-Operational state as host CPU type is not supported in this cluster compatibility version or is not supported at all,
         code 9000: Failed to verify Power Management configuration for Host redacted.com.
[ INFO ] skipping: [localhost]
[ INFO ] You can now connect to https://redacted.com:6900/ovirt-engine/ and check the status of this host and eventually remediate it, please continue only when the host is listed as 'up'
```

I note that initially lm_sensors was unable to detect CPU temperatures, which was subsequently resolved (I don't recall how); however, the issue remains. The CPU is a Xeon E5-2630 v2 (Ivy Bridge). I cannot find any definitive CPU support catalog, so I'm unsure whether it is no longer supported. Within engine-logs-2021-11-13T12:22:10Z/log/ovirt-engine/engine.log there were occurrences of `IvyBridge-IBRS`. From what I can tell, the Spectre/Meltdown bugs have been mitigated:

```
egrep -e "model|cpu family|stepping|microcode" /proc/cpuinfo | sort | uniq
cpu family : 6
microcode  : 0x42e
model      : 62
model name : Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
stepping   : 4
```

The installation documents suggest Ivy Bridge is supported, but the evidence above suggests it may not be. Can anyone advise whether this processor is _actually_ supported, and if so, how I can remediate the issue? Thanks!
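[Editor's note] The `IvyBridge-IBRS` entries suggest the cluster is set to the "Secure Intel IvyBridge Family" CPU type, whose QEMU model requires the IBRS mitigation bits from updated microcode. A diagnostic sketch, not from the original thread; it assumes `vdsm-client` is installed on the host, and the `cpuModel`/`cpuFlags` key names are from memory:

```
# The IvyBridge-IBRS model requires the spec_ctrl CPU flag, which the kernel
# only exposes after a microcode update:
grep -o -w spec_ctrl /proc/cpuinfo | sort | uniq -c

# Ask vdsm which CPU model and flags it reports to the engine:
vdsm-client Host getCapabilities | grep -i -e cpuModel -e cpuFlags
```

If `spec_ctrl` is missing from the reported flags, the usual remediation is to switch the cluster CPU type from the Secure variant to the plain "Intel IvyBridge Family".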
[ovirt-users] Re: cloning a VM or creating a template speed is so so slow
On Thu, Nov 11, 2021 at 4:33 AM Pascal D wrote:
>
> I have been trying to figure out why cloning a VM and creating a template
> from ovirt is so slow. I am using ovirt 4.3.10 over NFS. My NFS server is
> running NFS 4 over RAID10 with SSD disks over a 10G network and 9000 MTU.
>
> Theoretically I should be writing a 50GB file in around 1m30s.
> A direct copy of an image from the SPM host to another image on the
> same host takes 6m34s.
> A clone from ovirt takes around 29m.
>
> So quite a big difference. Therefore I started investigating and found that
> ovirt launches a qemu-img process with no source and target cache. Thinking
> that could be the issue, I changed the cache mode to writeback and was able
> to run the exact command in 8m14s, over 3 times faster. I haven't yet tried
> other parameters like -o preallocation=metadata

-o preallocation=metadata may work for files, but we don't use it since it is not compatible with block storage (it requires allocating the entire volume upfront).

> but was wondering why no cache was selected and how to change it to use
> cache writeback

We don't use the host page cache. There are several issues:

- Reading stale data after another host changes an image on shared storage. This should probably not happen with NFS.
- Writing through the page cache pollutes it with data that is unlikely to be needed, since VMs also do not use the page cache (for other reasons). So the copy may evict memory that should be used by your VMs.
- The kernel likes to buffer huge amounts of data and then flush too much data at the same time. This causes delays in accessing storage during flushing, which may break sanlock leases, since sanlock must be able to access storage to update the storage leases.

We improved copy performance a few years ago using the -W option, which allows unordered (concurrent) writes. This can speed up copies to block storage (iSCSI/FC) up to 6 times [1]. When we tested this with NFS we did not see a big improvement, so we did not enable it there. It is also recommended to use -W only for raw preallocated disks, since on sparse files it may cause fragmentation.

You can try to change this in vdsm/storage/sd.py:

```
def recommends_unordered_writes(self, format):
    """
    Return True if unordered writes are recommended for copying an image
    using format to this storage domain.

    Unordered writes improve copy performance but are recommended only for
    preallocated devices and raw format.
    """
    return format == sc.RAW_FORMAT and not self.supportsSparseness
```

This allows -W only on raw preallocated disks, so it will not be used for raw-sparse (NFS thin) or qcow2-sparse (snapshots on NFS), or for qcow2 on block storage.

We use unordered writes for any disk in ovirt-imageio, and other tools like nbdcopy also always enable unordered writes, so maybe we should enable it in all cases.

To enable unordered writes for any volume, change the method to:

```
def recommends_unordered_writes(self, format):
    """
    Allow unordered writes for any storage in any format.
    """
    return True
```

If you want to enable this only for file storage (NFS, GlusterFS, LocalFS, POSIX), add this method in vdsm/storage/nfsSD.py instead:

```
class FileStorageDomainManifest(sd.StorageDomainManifest):
    ...

    def recommends_unordered_writes(self, format):
        """
        Override StorageDomainManifest to also allow unordered writes for
        qcow2 and raw sparse images.
        """
        return True
```

Please report how it works for you. If this gives good results, file a bug to enable the option.
I think we can enable this based on vdsm configuration, so it will be easy to disable the option if it causes trouble with some storage domain types or image formats.

> command launched by ovirt:
> /usr/bin/qemu-img convert -p -t none -T none -f qcow2
> /rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/21f438fb-0c0e-4bdc-abb3-64a7e033cff6/c256a972-4328-4833-984d-fa8e62f76be8
> -O qcow2 -o compat=1.1
> /rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/5a90515c-066d-43fb-9313-5c7742f68146/ed6dc60d-1d6f-48b6-aa6e-0e7fb1ad96b9

With the suggested change, this command will become:

```
/usr/bin/qemu-img convert -p -t none -T none -f qcow2 \
    /rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/21f438fb-0c0e-4bdc-abb3-64a7e033cff6/c256a972-4328-4833-984d-fa8e62f76be8 \
    -O qcow2 -o compat=1.1 -W \
    /rhev/data-center/mnt/nas1.bfit:_home_VMS/8e6bea49-9c62-4e31-a3c9-0be09c2fcdbf/images/5a90515c-066d-43fb-9313-5c7742f68146/ed6dc60d-1d6f-48b6-aa6e-0e7fb1ad96b9
```

You can test this in the shell, without modifying vdsm, to see how it affects performance (see the sketch after this message).

[1] https://bugzilla.redhat.com/1511891#c57

Nir
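[Editor's note] A minimal benchmark sketch of that shell test. This is not from the original thread: the paths are placeholders (point SRC at an existing large qcow2 on the NFS mount), and it assumes a scratch directory on the same storage domain:

```
#!/bin/sh
# Placeholder paths: SRC should be an existing large qcow2 test image on
# the NFS mount, DST a scratch location on the same storage.
SRC=/rhev/data-center/mnt/nas1.bfit:_home_VMS/scratch/test-image.qcow2
DST=/rhev/data-center/mnt/nas1.bfit:_home_VMS/scratch/copy.qcow2

# Baseline: the command vdsm runs today (no host page cache, ordered writes).
time qemu-img convert -p -t none -T none -f qcow2 "$SRC" -O qcow2 -o compat=1.1 "$DST"
rm -f "$DST"

# The same copy with -W (unordered/concurrent writes), as in the suggested change.
time qemu-img convert -p -t none -T none -f qcow2 "$SRC" -O qcow2 -o compat=1.1 -W "$DST"
rm -f "$DST"
```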
[ovirt-users] Re: cloning a VM or creating a template speed is so so slow
Does anyone from Red Hat have any feedback on this? A 3x speed gain in cloning or templating a VM makes a big difference, IMO.

On Wed, Nov 10, 2021, 6:41 PM Pascal D wrote:

> -o preallocation=metadata brings it down to 7m40s
[ovirt-users] Re: Upgraded to oVirt 4.4.9, still have vdsmd memory leak
On Wed, Nov 10, 2021 at 4:46 PM Chris Adams wrote:
>
> I have seen vdsmd leak memory for years (I've been running oVirt since
> version 3.5), but I have never been able to nail it down. I've upgraded a
> cluster to oVirt 4.4.9 (reloading the hosts with CentOS 8-stream), and I
> still see it happen. One host in the cluster, which has been up 8 days,
> has vdsmd with 4.3 GB resident memory. On a couple of other hosts, it's
> around half a gigabyte.

Can you share vdsm logs from the time vdsm started? We have these logs:

```
2021-11-14 15:16:32,956+0200 DEBUG (health) [health] Checking health (health:93)
2021-11-14 15:16:32,977+0200 DEBUG (health) [health] Collected 5001 objects (health:101)
2021-11-14 15:16:32,977+0200 DEBUG (health) [health] user=2.46%, sys=0.74%, rss=108068 kB (-376), threads=47 (health:126)
2021-11-14 15:16:32,977+0200 INFO (health) [health] LVM cache hit ratio: 97.64% (hits: 5431 misses: 131) (health:131)
```

They may provide useful info on the leak. You need to enable DEBUG logs for the root logger in /etc/vdsm/logger.conf:

```
[logger_root]
level=DEBUG
handlers=syslog,logthread
propagate=0
```

and restart the vdsmd service.

Nir
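[Editor's note] With DEBUG enabled, the periodic health-check lines above make the leak measurable over time. A small sketch, not from the original thread, assuming the default log location /var/log/vdsm/vdsm.log:

```
# Print the timestamp and resident set size from each health-check line,
# so the growth rate of the leak can be eyeballed or graphed:
grep 'DEBUG (health)' /var/log/vdsm/vdsm.log \
    | awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^rss=/) print $1, $2, $i }'
```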