[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
On Wed, Nov 17, 2021 at 2:34 PM Danilo de Paula wrote: > > > > On Wed., Nov. 17, 2021, 4:54 a.m. Yedidyah Bar David, wrote: >> >> On Wed, Nov 17, 2021 at 9:44 AM Sandro Bonazzola wrote: >>> >>> >>> >>> Il giorno mer 17 nov 2021 alle ore 03:12 Danilo de Paula >>> ha scritto: Since you're consuming the CentOS Stream 8 packages (I assume) and the CentOS Stream 8 is actually the opened development of the next RHEL minor release (8.6) [1], it makes a lot of sense to open BZs against those packages in RHEL-8.6. Especially since we won't fix those problems in CentOS Stream without fixing it in RHEL first. So, if you believe that this is a problem with the package itself (as it looks like), I strongly suggest opening a BZ against those packages in RHEL. >>> >>> >>> Didi can you please open a bug against RHEL 8 CentoStream version for >>> qemu-kvm component? >> >> >> Michal searched and found: >> >> https://gitlab.com/qemu-project/qemu/-/issues/641 >> >> And indeed it seems to be our case: >> >> [root@ost-he-basic-suite-master-host-0 ~]# ps uaxww | grep qemu | sed 's/ >> /\n/g' | grep ^pcie-root-port | wc >> 17 171160 >> >> Danilo - the gitlab page above mentions several patches linking to it >> already, some of which from the last few days. I didn't check them. Is there >> still value in opening a bug, or is the issue already sufficiently clear? > > > That's a good question. > See, we don't track gitlab issues, the upstream project does. > So, unless those patches are included in the qemu 6.2 upstream release > (should happen in a few days, I think rc1 is about to be out), they need to > be backported in RHEL (when rhel rebases). And we only do backports with RHEL > BZs. > > I see some commits being mentioned but the issue is still not closed. > So wait a few days and see. If it's not fixed by rc4, then I suggest open a > BZ for the backports. > > I expect the rebase to be concluded by the beginning of December btw. Thanks. 
Now filed this Stream bug:

https://bugzilla.redhat.com/show_bug.cgi?id=2024605

IMO waiting until the beginning of December is too late. We have already
received a few reports on the users list from people hitting this. If it's
too much work to fix or revert, perhaps consider excluding it in the
centos-release package, the repo, or wherever else works.

Thanks!
--
Didi
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/BZ63TLATE5QDFXFGWKAY4CM5GE2XL5FR/
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
On Wed., Nov. 17, 2021, 4:54 a.m. Yedidyah Bar David wrote:

> On Wed, Nov 17, 2021 at 9:44 AM Sandro Bonazzola wrote:
>
>> On Wed, Nov 17, 2021 at 3:12 AM Danilo de Paula <ddepa...@redhat.com> wrote:
>>
>>> Since you're consuming the CentOS Stream 8 packages (I assume) and
>>> CentOS Stream 8 is actually the open development of the next RHEL
>>> minor release (8.6) [1], it makes a lot of sense to open BZs against
>>> those packages in RHEL-8.6.
>>>
>>> Especially since we won't fix those problems in CentOS Stream without
>>> fixing them in RHEL first.
>>>
>>> So, if you believe that this is a problem with the package itself (as it
>>> looks like), I strongly suggest opening a BZ against those packages in RHEL.
>>
>> Didi, can you please open a bug against the RHEL 8 CentOS Stream version
>> of the qemu-kvm component?
>
> Michal searched and found:
>
> https://gitlab.com/qemu-project/qemu/-/issues/641
>
> And indeed it seems to be our case:
>
> [root@ost-he-basic-suite-master-host-0 ~]# ps uaxww | grep qemu | sed 's/ /\n/g' | grep ^pcie-root-port | wc
>      17      17    1160
>
> Danilo - the gitlab page above mentions several patches linking to it
> already, some of which are from the last few days. I didn't check them. Is
> there still value in opening a bug, or is the issue already sufficiently
> clear?

That's a good question.
See, we don't track gitlab issues, the upstream project does.
So, unless those patches are included in the qemu 6.2 upstream release
(should happen in a few days, I think rc1 is about to be out), they need to
be backported in RHEL (when RHEL rebases). And we only do backports with
RHEL BZs.

I see some commits being mentioned but the issue is still not closed.
So wait a few days and see. If it's not fixed by rc4, then I suggest opening
a BZ for the backports.

I expect the rebase to be concluded by the beginning of December btw.

> That said, I failed to verify this using isa-debugcon as mentioned there,
> and the various things the command uses (FDs, storage) are temporary -
> created on the fly by libvirt/vdsm - so I can't just copy the command and
> add a few options. If needed I guess it's possible to hack this using a
> vdsm hook or something.
>
> For the time being, is there a workaround/temporary-build/whatever other
> than downgrading to qemu-kvm-6.0.0 (which is considered deprecated, I guess
> - e.g. Sandro is going to remove it from the ovirt-release package (
> https://github.com/oVirt/ovirt-release/pull/2 ), following the thread
> "[CentOS-devel] Is advanced-virt (Virtualization SIG) not relevant anymore
> with latest CentOS Stream 8?")?
>
>>> [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL).
>>> CentOS Stream 9 is the other way around.
>>
>> Sadly no, from my experience it's still fix in RHEL first and then in
>> CentOS Stream, at least for systemd on CentOS Stream 9.
>> I would love to see fixes coming to Stream first.

I understand that. Please consider that the CentOS Stream 9 development
workflow change is a bit complex and some teams are still adapting.
And I can only talk about the virt packages. MRs and fixes are landing in
Stream first; the situations where they don't are highly exceptional.

> Best regards,
>
> --
> Didi

___
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/BRT3O26OI4724VBD4CHJ6PEYFZXFOE4U/
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
On Wed, Nov 17, 2021 at 9:44 AM Sandro Bonazzola wrote:
>
> On Wed, Nov 17, 2021 at 3:12 AM Danilo de Paula <ddepa...@redhat.com> wrote:
>
>> Since you're consuming the CentOS Stream 8 packages (I assume) and
>> CentOS Stream 8 is actually the open development of the next RHEL minor
>> release (8.6) [1], it makes a lot of sense to open BZs against those
>> packages in RHEL-8.6.
>>
>> Especially since we won't fix those problems in CentOS Stream without
>> fixing them in RHEL first.
>>
>> So, if you believe that this is a problem with the package itself (as it
>> looks like), I strongly suggest opening a BZ against those packages in RHEL.
>
> Didi, can you please open a bug against the RHEL 8 CentOS Stream version
> of the qemu-kvm component?

Michal searched and found:

https://gitlab.com/qemu-project/qemu/-/issues/641

And indeed it seems to be our case:

[root@ost-he-basic-suite-master-host-0 ~]# ps uaxww | grep qemu | sed 's/ /\n/g' | grep ^pcie-root-port | wc
     17      17    1160

Danilo - the gitlab page above mentions several patches linking to it
already, some of which are from the last few days. I didn't check them. Is
there still value in opening a bug, or is the issue already sufficiently
clear?

That said, I failed to verify this using isa-debugcon as mentioned there,
and the various things the command uses (FDs, storage) are temporary -
created on the fly by libvirt/vdsm - so I can't just copy the command and
add a few options. If needed I guess it's possible to hack this using a
vdsm hook or something.

For the time being, is there a workaround/temporary-build/whatever other
than downgrading to qemu-kvm-6.0.0 (which is considered deprecated, I guess
- e.g. Sandro is going to remove it from the ovirt-release package (
https://github.com/oVirt/ovirt-release/pull/2 ), following the thread
"[CentOS-devel] Is advanced-virt (Virtualization SIG) not relevant anymore
with latest CentOS Stream 8?")?

>> [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL).
>> CentOS Stream 9 is the other way around.
>
> Sadly no, from my experience it's still fix in RHEL first and then in
> CentOS Stream, at least for systemd on CentOS Stream 9.
> I would love to see fixes coming to Stream first.

Best regards,
--
Didi
___
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/VN4YUIYKJLIRPMFPWGZRXGU2J44QURIB/
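
The shell pipeline in this message counts the pcie-root-port devices on the running QEMU command line. The same check can be sketched in Python for a command line captured from a log. This is only an illustration: the function name `count_root_ports` and the shortened `sample` string are assumptions (the two device arguments are copied from the diff posted in this thread), and it assumes devices are passed in the `-device pcie-root-port,...` form shown there.

```python
import shlex


def count_root_ports(cmdline: str) -> int:
    """Count -device arguments whose device type is pcie-root-port."""
    args = shlex.split(cmdline)
    count = 0
    # Pair each argument with its successor so "-device" is matched with its value.
    for flag, value in zip(args, args[1:]):
        if flag == "-device" and value.split(",", 1)[0] == "pcie-root-port":
            count += 1
    return count


# Hypothetical, heavily shortened command line; the real one is much longer.
sample = (
    "/usr/libexec/qemu-kvm -S "
    "-device pcie-root-port,port=16,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 "
    "-device pcie-root-port,port=17,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1"
)

print(count_root_ports(sample))
```

For the shortened sample this prints 2. Run against the full command line of the failing VM, it should report the same number as the first column of the `wc` output above.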
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
Il giorno mer 17 nov 2021 alle ore 03:12 Danilo de Paula < ddepa...@redhat.com> ha scritto: > Since you're consuming the CentOS Stream 8 packages (I assume) and the > CentOS Stream 8 is actually the opened development of the next RHEL minor > release (8.6) [1], it makes > a lot of sense to open BZs against those packages in RHEL-8.6. > > Especially since we won't fix those problems in CentOS Stream without > fixing it in RHEL first. > > So, if you believe that this is a problem with the package itself (as it > looks like), I strongly suggest opening a BZ against those packages in RHEL. > Didi can you please open a bug against RHEL 8 CentoStream version for qemu-kvm component? > > [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL). > CentOS Stream 9 is the other way around. > Sadly no, from my experience it's still fix in rhel first and then in CentOS Stream, at least for systemd on CentOS Stream 9. I would love to see fixes coming to Stream first. > > On Tue, Nov 16, 2021 at 5:59 AM Yedidyah Bar David > wrote: > >> On Tue, Nov 16, 2021 at 12:42 PM Nir Soffer wrote: >> > >> > On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola >> wrote: >> >> >> >> +Eduardo Lima and +Danilo Cesar Lemes de Paula FYI >> >> >> >> Il giorno mar 16 nov 2021 alle ore 08:55 Yedidyah Bar David < >> d...@redhat.com> ha scritto: >> >>> >> >>> Hi all, >> >>> >> >>> For a few days now we have failures in CI of the he-basic suite. >> >>> >> >>> At one point the failure seemed to have been around >> >>> networking/routing/firewalling, but later it changed, and now the >> >>> deploy process fails while trying to first start the engine vm after >> >>> it's copied to the shared storage. >> >>> >> >>> I ran locally OST he-basic with current ost-images, reproduced the >> >>> issue, and managed to "fix" it by enabling >> >>> ovirt-master-centos-stream-advanced-virtualization-testing and >> >>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to >> >>> 15:6.0.0-33.el8s. 
>> >>> >> >>> Is this a known issue? >> >>> >> >>> How do we handle? Perhaps we should conflict with it somewhere until >> >>> we find and fix the root cause. >> >>> >> >>> Please note that the flow is: >> >>> >> >>> 1. Create a local VM from the appliance image >> > >> > >> > How do you create the vm? >> >> With virt-install: >> >> >> https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/bootstrap_local_vm/02_create_local_vm.yml#L96 >> >> > >> > Are you using libvirt? What is the VM XML used? >> > >> >>> >> >>> 2. Do stuff on this machine >> >>> 3. Shut it down >> >>> 4. Copy its disk to shared storage >> >>> 5. Start the machine from the shared storage >> > >> > >> > Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0? >> >> It seems so, yes. >> >> > >> >>> >> >>> >> >>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0 >> >>> (so the copying (using qemu-img) did work well) and the difference is >> >>> elsewhere. >> >>> >> >>> Following is the diff between the qemu commands of (1.) and (5.) (as >> >>> found in the respective logs). Any clue? 
>> >>> >> >>> --- localq 2021-11-16 08:48:01.230426260 +0100 >> >>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100 >> >>> @@ -1,54 +1,79 @@ >> >>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0, >> >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys >> >>> , 2021-11-09-20:38:08, ), qemu version: >> >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: >> >>> 4.18.0-348.el8.x86_64, hostname: >> >>> ost-he-basic-suite-master-host-0.lago.local >> >>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0, >> >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys >> >>> , 2021-11-09-20:38:08, ), qemu version: >> >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: >> >>> 4.18.0-348.el8.x86_64, hostname: >> >>> ost-he-basic-suite-master-host-0.lago.local >> >>> LC_ALL=C \ >> >>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ >> >>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \ >> >>> >> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share >> \ >> >>> >> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \ >> >>> >> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \ >> >>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \ >> >>> >> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \ >> >>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \ >> >>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \ >> >>> /usr/libexec/qemu-kvm \ >> >>> --name guest=HostedEngineLocal,debug-threads=on \ >> >>> +-name guest=HostedEngine,debug-threads=on \ >> >>> -S \ >> >>> --object >> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' >> >>> \ >> >>>
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
Since you're consuming the CentOS Stream 8 packages (I assume) and the CentOS Stream 8 is actually the opened development of the next RHEL minor release (8.6) [1], it makes a lot of sense to open BZs against those packages in RHEL-8.6. Especially since we won't fix those problems in CentOS Stream without fixing it in RHEL first. So, if you believe that this is a problem with the package itself (as it looks like), I strongly suggest opening a BZ against those packages in RHEL. [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL). CentOS Stream 9 is the other way around. On Tue, Nov 16, 2021 at 5:59 AM Yedidyah Bar David wrote: > On Tue, Nov 16, 2021 at 12:42 PM Nir Soffer wrote: > > > > On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola > wrote: > >> > >> +Eduardo Lima and +Danilo Cesar Lemes de Paula FYI > >> > >> Il giorno mar 16 nov 2021 alle ore 08:55 Yedidyah Bar David < > d...@redhat.com> ha scritto: > >>> > >>> Hi all, > >>> > >>> For a few days now we have failures in CI of the he-basic suite. > >>> > >>> At one point the failure seemed to have been around > >>> networking/routing/firewalling, but later it changed, and now the > >>> deploy process fails while trying to first start the engine vm after > >>> it's copied to the shared storage. > >>> > >>> I ran locally OST he-basic with current ost-images, reproduced the > >>> issue, and managed to "fix" it by enabling > >>> ovirt-master-centos-stream-advanced-virtualization-testing and > >>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to > >>> 15:6.0.0-33.el8s. > >>> > >>> Is this a known issue? > >>> > >>> How do we handle? Perhaps we should conflict with it somewhere until > >>> we find and fix the root cause. > >>> > >>> Please note that the flow is: > >>> > >>> 1. Create a local VM from the appliance image > > > > > > How do you create the vm? 
> > With virt-install: > > > https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/bootstrap_local_vm/02_create_local_vm.yml#L96 > > > > > Are you using libvirt? What is the VM XML used? > > > >>> > >>> 2. Do stuff on this machine > >>> 3. Shut it down > >>> 4. Copy its disk to shared storage > >>> 5. Start the machine from the shared storage > > > > > > Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0? > > It seems so, yes. > > > > >>> > >>> > >>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0 > >>> (so the copying (using qemu-img) did work well) and the difference is > >>> elsewhere. > >>> > >>> Following is the diff between the qemu commands of (1.) and (5.) (as > >>> found in the respective logs). Any clue? > >>> > >>> --- localq 2021-11-16 08:48:01.230426260 +0100 > >>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100 > >>> @@ -1,54 +1,79 @@ > >>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0, > >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys > >>> , 2021-11-09-20:38:08, ), qemu version: > >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: > >>> 4.18.0-348.el8.x86_64, hostname: > >>> ost-he-basic-suite-master-host-0.lago.local > >>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0, > >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys > >>> , 2021-11-09-20:38:08, ), qemu version: > >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: > >>> 4.18.0-348.el8.x86_64, hostname: > >>> ost-he-basic-suite-master-host-0.lago.local > >>> LC_ALL=C \ > >>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ > >>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \ > >>> > -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share > \ > >>> > -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \ > >>> > -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \ 
> >>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \ > >>> > +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \ > >>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \ > >>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \ > >>> /usr/libexec/qemu-kvm \ > >>> --name guest=HostedEngineLocal,debug-threads=on \ > >>> +-name guest=HostedEngine,debug-threads=on \ > >>> -S \ > >>> --object > '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' > >>> \ > >>> --machine > pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram > >>> \ > >>> --cpu > Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on > >>> \ > >>> --m 3171 \ > >>> --object > '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}'
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
On Tue, Nov 16, 2021 at 12:42 PM Nir Soffer wrote: > > On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola wrote: >> >> +Eduardo Lima and +Danilo Cesar Lemes de Paula FYI >> >> Il giorno mar 16 nov 2021 alle ore 08:55 Yedidyah Bar David >> ha scritto: >>> >>> Hi all, >>> >>> For a few days now we have failures in CI of the he-basic suite. >>> >>> At one point the failure seemed to have been around >>> networking/routing/firewalling, but later it changed, and now the >>> deploy process fails while trying to first start the engine vm after >>> it's copied to the shared storage. >>> >>> I ran locally OST he-basic with current ost-images, reproduced the >>> issue, and managed to "fix" it by enabling >>> ovirt-master-centos-stream-advanced-virtualization-testing and >>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to >>> 15:6.0.0-33.el8s. >>> >>> Is this a known issue? >>> >>> How do we handle? Perhaps we should conflict with it somewhere until >>> we find and fix the root cause. >>> >>> Please note that the flow is: >>> >>> 1. Create a local VM from the appliance image > > > How do you create the vm? With virt-install: https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/bootstrap_local_vm/02_create_local_vm.yml#L96 > > Are you using libvirt? What is the VM XML used? > >>> >>> 2. Do stuff on this machine >>> 3. Shut it down >>> 4. Copy its disk to shared storage >>> 5. Start the machine from the shared storage > > > Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0? It seems so, yes. > >>> >>> >>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0 >>> (so the copying (using qemu-img) did work well) and the difference is >>> elsewhere. >>> >>> Following is the diff between the qemu commands of (1.) and (5.) (as >>> found in the respective logs). Any clue? 
>>> >>> --- localq 2021-11-16 08:48:01.230426260 +0100 >>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100 >>> @@ -1,54 +1,79 @@ >>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0, >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys >>> , 2021-11-09-20:38:08, ), qemu version: >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: >>> 4.18.0-348.el8.x86_64, hostname: >>> ost-he-basic-suite-master-host-0.lago.local >>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0, >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys >>> , 2021-11-09-20:38:08, ), qemu version: >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: >>> 4.18.0-348.el8.x86_64, hostname: >>> ost-he-basic-suite-master-host-0.lago.local >>> LC_ALL=C \ >>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ >>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \ >>> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share >>> \ >>> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \ >>> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \ >>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \ >>> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \ >>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \ >>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \ >>> /usr/libexec/qemu-kvm \ >>> --name guest=HostedEngineLocal,debug-threads=on \ >>> +-name guest=HostedEngine,debug-threads=on \ >>> -S \ >>> --object >>> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' >>> \ >>> --machine >>> pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram >>> \ >>> --cpu >>> 
Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on >>> \ >>> --m 3171 \ >>> --object >>> '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}' \ >>> +-object >>> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-2-HostedEngine/master-key.aes"}' >>> \ >>> +-machine >>> pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,graphics=off \ >>> +-cpu Cascadelake-Server-noTSX,mpx=off \ >>> +-m size=3247104k,slots=16,maxmem=12988416k \ >>> -overcommit mem-lock=off \ >>> --smp 2,sockets=2,cores=1,threads=1 \ >>> --uuid 716b26d9-982b-4c51-ac05-646f28346007 \ >>> +-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1 \ >>> +-object '{"qom-type":"iothread","id":"iothread1"}' \ >>> +-object >>> '{"qom-type":"memory-backend-ram","id":"ram-node0","size":3325034496}' >>> \ >>> +-numa node,nodeid=0,cpus=0-31,memdev=ram-node0 \ >>> +-uuid a10f5518-1fc2-4aae-b7da-5d1d9875e753 \ >>> +-smbios >>>
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola wrote: > +Eduardo Lima and +Danilo Cesar Lemes de Paula > FYI > > Il giorno mar 16 nov 2021 alle ore 08:55 Yedidyah Bar David < > d...@redhat.com> ha scritto: > >> Hi all, >> >> For a few days now we have failures in CI of the he-basic suite. >> >> At one point the failure seemed to have been around >> networking/routing/firewalling, but later it changed, and now the >> deploy process fails while trying to first start the engine vm after >> it's copied to the shared storage. >> >> I ran locally OST he-basic with current ost-images, reproduced the >> issue, and managed to "fix" it by enabling >> ovirt-master-centos-stream-advanced-virtualization-testing and >> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to >> 15:6.0.0-33.el8s. >> >> Is this a known issue? >> >> How do we handle? Perhaps we should conflict with it somewhere until >> we find and fix the root cause. >> >> Please note that the flow is: >> >> 1. Create a local VM from the appliance image >> > How do you create the vm? Are you using libvirt? What is the VM XML used? > 2. Do stuff on this machine >> 3. Shut it down >> 4. Copy its disk to shared storage >> 5. Start the machine from the shared storage >> > Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0? > >> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0 >> (so the copying (using qemu-img) did work well) and the difference is >> elsewhere. >> >> Following is the diff between the qemu commands of (1.) and (5.) (as >> found in the respective logs). Any clue? 
>> >> --- localq 2021-11-16 08:48:01.230426260 +0100 >> +++ sharedq 2021-11-16 08:48:46.884937598 +0100 >> @@ -1,54 +1,79 @@ >> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0, >> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys >> , 2021-11-09-20:38:08, ), qemu version: >> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: >> 4.18.0-348.el8.x86_64, hostname: >> ost-he-basic-suite-master-host-0.lago.local >> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0, >> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys >> , 2021-11-09-20:38:08, ), qemu version: >> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: >> 4.18.0-348.el8.x86_64, hostname: >> ost-he-basic-suite-master-host-0.lago.local >> LC_ALL=C \ >> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ >> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \ >> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share >> \ >> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \ >> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config >> \ >> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \ >> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \ >> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \ >> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \ >> /usr/libexec/qemu-kvm \ >> --name guest=HostedEngineLocal,debug-threads=on \ >> +-name guest=HostedEngine,debug-threads=on \ >> -S \ >> --object >> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' >> \ >> --machine >> pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram >> \ >> --cpu >> 
Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on >> \ >> --m 3171 \ >> --object >> '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}' \ >> +-object >> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-2-HostedEngine/master-key.aes"}' >> \ >> +-machine >> pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,graphics=off \ >> +-cpu Cascadelake-Server-noTSX,mpx=off \ >> +-m size=3247104k,slots=16,maxmem=12988416k \ >> -overcommit mem-lock=off \ >> --smp 2,sockets=2,cores=1,threads=1 \ >> --uuid 716b26d9-982b-4c51-ac05-646f28346007 \ >> +-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1 \ >> +-object '{"qom-type":"iothread","id":"iothread1"}' \ >> +-object >> '{"qom-type":"memory-backend-ram","id":"ram-node0","size":3325034496}' >> \ >> +-numa node,nodeid=0,cpus=0-31,memdev=ram-node0 \ >> +-uuid a10f5518-1fc2-4aae-b7da-5d1d9875e753 \ >> +-smbios >> type=1,manufacturer=oVirt,product=RHEL,version=8.6-1.el8,serial=d2f36f31-bb29-4e1f-b52d-8fddb632953c,uuid=a10f5518-1fc2-4aae-b7da-5d1d9875e753,family=oVirt >> \ >> -no-user-config \ >> -nodefaults \ >> -chardev socket,id=charmonitor,fd=40,server=on,wait=off \ >> -mon chardev=charmonitor,id=monitor,mode=control \ >> --rtc base=utc \ >> +-rtc base=2021-11-14T15:29:08,driftfix=slew \ >> +-global
[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine
+Eduardo Lima and +Danilo Cesar Lemes de Paula FYI

On Tue, Nov 16, 2021 at 8:55 AM Yedidyah Bar David wrote:

> Hi all,
>
> For a few days now we have had failures in CI of the he-basic suite.
>
> At one point the failure seemed to be around
> networking/routing/firewalling, but later it changed, and now the
> deploy process fails while trying to first start the engine VM after
> it's copied to the shared storage.
>
> I ran OST he-basic locally with current ost-images, reproduced the
> issue, and managed to "fix" it by enabling
> ovirt-master-centos-stream-advanced-virtualization-testing and
> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to
> 15:6.0.0-33.el8s.
>
> Is this a known issue?
>
> How do we handle it? Perhaps we should conflict with it somewhere until
> we find and fix the root cause.
>
> Please note that the flow is:
>
> 1. Create a local VM from the appliance image
> 2. Do stuff on this machine
> 3. Shut it down
> 4. Copy its disk to shared storage
> 5. Start the machine from the shared storage
>
> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0
> (so the copying (using qemu-img) did work well) and the difference is
> elsewhere.
>
> Following is the diff between the qemu commands of (1.) and (5.) (as
> found in the respective logs). Any clue?
> > --- localq 2021-11-16 08:48:01.230426260 +0100 > +++ sharedq 2021-11-16 08:48:46.884937598 +0100 > @@ -1,54 +1,79 @@ > -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0, > package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys > , 2021-11-09-20:38:08, ), qemu version: > 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: > 4.18.0-348.el8.x86_64, hostname: > ost-he-basic-suite-master-host-0.lago.local > +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0, > package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys > , 2021-11-09-20:38:08, ), qemu version: > 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel: > 4.18.0-348.el8.x86_64, hostname: > ost-he-basic-suite-master-host-0.lago.local > LC_ALL=C \ > PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \ > -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \ > -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share > \ > -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \ > -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \ > +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \ > +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \ > +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \ > +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \ > /usr/libexec/qemu-kvm \ > --name guest=HostedEngineLocal,debug-threads=on \ > +-name guest=HostedEngine,debug-threads=on \ > -S \ > --object > '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' > \ > --machine > pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram > \ > --cpu > 
Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on > \ > --m 3171 \ > --object > '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}' \ > +-object > '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-2-HostedEngine/master-key.aes"}' > \ > +-machine > pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,graphics=off \ > +-cpu Cascadelake-Server-noTSX,mpx=off \ > +-m size=3247104k,slots=16,maxmem=12988416k \ > -overcommit mem-lock=off \ > --smp 2,sockets=2,cores=1,threads=1 \ > --uuid 716b26d9-982b-4c51-ac05-646f28346007 \ > +-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1 \ > +-object '{"qom-type":"iothread","id":"iothread1"}' \ > +-object > '{"qom-type":"memory-backend-ram","id":"ram-node0","size":3325034496}' > \ > +-numa node,nodeid=0,cpus=0-31,memdev=ram-node0 \ > +-uuid a10f5518-1fc2-4aae-b7da-5d1d9875e753 \ > +-smbios > type=1,manufacturer=oVirt,product=RHEL,version=8.6-1.el8,serial=d2f36f31-bb29-4e1f-b52d-8fddb632953c,uuid=a10f5518-1fc2-4aae-b7da-5d1d9875e753,family=oVirt > \ > -no-user-config \ > -nodefaults \ > -chardev socket,id=charmonitor,fd=40,server=on,wait=off \ > -mon chardev=charmonitor,id=monitor,mode=control \ > --rtc base=utc \ > +-rtc base=2021-11-14T15:29:08,driftfix=slew \ > +-global kvm-pit.lost_tick_policy=delay \ > +-no-hpet \ > -no-shutdown \ > -global ICH9-LPC.disable_s3=1 \ > -global ICH9-LPC.disable_s4=1 \ > --boot menu=off,strict=on \ > +-boot strict=on \ > -device > pcie-root-port,port=16,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 > \ > -device pcie-root-port,port=17,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 > \
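
The diff quoted in this thread compares the QEMU command line of the local bootstrap VM (`localq`) with that of the VM started from shared storage (`sharedq`). A comparable diff can be produced with Python's standard `difflib` module, as a minimal sketch. The helper name `diff_qemu_args` is an assumption, and the two argument lists hold only a few entries copied from the diff above, not the full command lines.

```python
import difflib


def diff_qemu_args(local_args, shared_args):
    """Return a unified diff (as a list of lines) between two QEMU argument lists."""
    return list(
        difflib.unified_diff(
            local_args,
            shared_args,
            fromfile="localq",
            tofile="sharedq",
            lineterm="",
        )
    )


# A handful of arguments taken from the thread; the real logs contain many more.
local_args = [
    "-name guest=HostedEngineLocal,debug-threads=on",
    "-S",
    "-smp 2,sockets=2,cores=1,threads=1",
    "-no-user-config",
]
shared_args = [
    "-name guest=HostedEngine,debug-threads=on",
    "-S",
    "-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1",
    "-no-user-config",
]

for line in diff_qemu_args(local_args, shared_args):
    print(line)
```

Keeping one argument per list entry, as libvirt does in its logs with one option per line, keeps the output readable: each changed option shows up as a single removed and added pair.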