[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-18 Thread Yedidyah Bar David
On Wed, Nov 17, 2021 at 2:34 PM Danilo de Paula  wrote:
>
>
>
> On Wed., Nov. 17, 2021, 4:54 a.m. Yedidyah Bar David,  wrote:
>>
>> On Wed, Nov 17, 2021 at 9:44 AM Sandro Bonazzola  wrote:
>>>
>>>
>>>
>>> On Wed, Nov 17, 2021 at 03:12, Danilo de Paula wrote:
>>>>
>>>> Since you're consuming the CentOS Stream 8 packages (I assume), and
>>>> CentOS Stream 8 is actually the open development of the next RHEL minor
>>>> release (8.6) [1], it makes a lot of sense to open BZs against those
>>>> packages in RHEL-8.6.
>>>>
>>>> Especially since we won't fix those problems in CentOS Stream without
>>>> fixing them in RHEL first.
>>>>
>>>> So, if you believe that this is a problem with the package itself (as it
>>>> looks like), I strongly suggest opening a BZ against those packages in
>>>> RHEL.
>>>
>>>
>>> Didi, can you please open a bug against the RHEL 8 CentOS Stream version
>>> for the qemu-kvm component?
>>
>>
>> Michal searched and found:
>>
>> https://gitlab.com/qemu-project/qemu/-/issues/641
>>
>> And indeed it seems to be our case:
>>
>> [root@ost-he-basic-suite-master-host-0 ~]# ps uaxww | grep qemu | sed 's/ /\n/g' | grep ^pcie-root-port | wc
>>      17      17    1160
>>
>> Danilo - the gitlab page above mentions several patches already linking to
>> it, some of them from the last few days. I didn't check them. Is there
>> still value in opening a bug, or is the issue already sufficiently clear?
>
>
> That's a good question.
> See, we don't track gitlab issues, the upstream project does.
> So, unless those patches are included in the qemu 6.2 upstream release
> (should happen in a few days, I think rc1 is about to be out), they need to
> be backported to RHEL (when RHEL rebases). And we only do backports with RHEL
> BZs.
>
> I see some commits being mentioned but the issue is still not closed.
> So wait a few days and see. If it's not fixed by rc4, then I suggest opening
> a BZ for the backports.
>
> I expect the rebase to be concluded by the beginning of December btw.

Thanks. Now filed this Stream bug:

https://bugzilla.redhat.com/show_bug.cgi?id=2024605

IMO waiting until the beginning of December is too late. We have already
received a few reports on the users list from people hitting this.
If it's too much work to fix or revert, perhaps consider excluding this
build via the centos-release package, the repo, or whatever works.

Thanks!
-- 
Didi
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/BZ63TLATE5QDFXFGWKAY4CM5GE2XL5FR/


[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-17 Thread Danilo de Paula
On Wed., Nov. 17, 2021, 4:54 a.m. Yedidyah Bar David, 
wrote:

> On Wed, Nov 17, 2021 at 9:44 AM Sandro Bonazzola 
> wrote:
>
>>
>>
>> On Wed, Nov 17, 2021 at 03:12, Danilo de Paula <ddepa...@redhat.com>
>> wrote:
>>
>>> Since you're consuming the CentOS Stream 8 packages (I assume), and
>>> CentOS Stream 8 is actually the open development of the next RHEL
>>> minor release (8.6) [1], it makes a lot of sense to open BZs against
>>> those packages in RHEL-8.6.
>>>
>>> Especially since we won't fix those problems in CentOS Stream without
>>> fixing them in RHEL first.
>>>
>>> So, if you believe that this is a problem with the package itself (as it
>>> looks like), I strongly suggest opening a BZ against those packages in RHEL.
>>>
>>
>> Didi, can you please open a bug against the RHEL 8 CentOS Stream version
>> for the qemu-kvm component?
>>
>
> Michal searched and found:
>
> https://gitlab.com/qemu-project/qemu/-/issues/641
>
> And indeed it seems to be our case:
>
> [root@ost-he-basic-suite-master-host-0 ~]# ps uaxww | grep qemu | sed 's/ /\n/g' | grep ^pcie-root-port | wc
>      17      17    1160
>
> Danilo - the gitlab page above mentions several patches already linking to
> it, some of them from the last few days. I didn't check them. Is
> there still value in opening a bug, or is the issue already sufficiently
> clear?
>

That's a good question.
See, we don't track gitlab issues, the upstream project does.
So, unless those patches are included in the qemu 6.2 upstream release
(should happen in a few days, I think rc1 is about to be out), they need to
be backported to RHEL (when RHEL rebases). And we only do backports with
RHEL BZs.

I see some commits being mentioned but the issue is still not closed.
So wait a few days and see. If it's not fixed by rc4, then I suggest opening
a BZ for the backports.

I expect the rebase to be concluded by the beginning of December btw.


> That said, I failed to verify this using isa-debugcon as mentioned there,
> and the various things the command uses (FDs, storage) are temporary -
> created on the fly by libvirt/vdsm - so I can't just copy the command and
> add a few options. If needed I guess it's possible to hack this using a
> vdsm hook or something.
>
> For the time being, is there a workaround/temporary-build/whatever other
> than downgrading to qemu-kvm-6.0.0 (which is considered deprecated, I guess
> - e.g. Sandro is going to remove it from ovirt-release package (
> https://github.com/oVirt/ovirt-release/pull/2 ), following the thread
> "[CentOS-devel] Is advanced-virt (Virtualization SIG) not relevant anymore
> with latest CentOS Stream 8?")?
>
>
>>
>>
>>
>>>
>>> [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL).
>>> CentOS Stream 9 is the other way around.
>>>
>>
>> Sadly no, from my experience fixes still land in RHEL first and then in
>> CentOS Stream, at least for systemd on CentOS Stream 9.
>> I would love to see fixes coming to Stream first.
>>
>
I understand that. Please consider that the CentOS Stream 9 development
workflow change is a bit complex and some teams are still adapting.

And I can only talk about the virt packages. There, MRs and fixes are landing
in Stream first; the situations where they don't are highly exceptional.


> Best regards,
>
> --
> Didi
>
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/BRT3O26OI4724VBD4CHJ6PEYFZXFOE4U/


[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-17 Thread Yedidyah Bar David
On Wed, Nov 17, 2021 at 9:44 AM Sandro Bonazzola 
wrote:

>
>
> On Wed, Nov 17, 2021 at 03:12, Danilo de Paula <ddepa...@redhat.com>
> wrote:
>
>> Since you're consuming the CentOS Stream 8 packages (I assume), and
>> CentOS Stream 8 is actually the open development of the next RHEL minor
>> release (8.6) [1], it makes a lot of sense to open BZs against those
>> packages in RHEL-8.6.
>>
>> Especially since we won't fix those problems in CentOS Stream without
>> fixing them in RHEL first.
>>
>> So, if you believe that this is a problem with the package itself (as it
>> looks like), I strongly suggest opening a BZ against those packages in RHEL.
>>
>
> Didi, can you please open a bug against the RHEL 8 CentOS Stream version
> for the qemu-kvm component?
>

Michal searched and found:

https://gitlab.com/qemu-project/qemu/-/issues/641

And indeed it seems to be our case:

[root@ost-he-basic-suite-master-host-0 ~]# ps uaxww | grep qemu | sed 's/ /\n/g' | grep ^pcie-root-port | wc
     17      17    1160
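
A similar count can be made without the shell pipeline. What follows is a minimal Python sketch, not something posted in this thread. It assumes the qemu command line is available as one string and that devices are passed in the comma-separated "-device type,key=value" form seen in the logs quoted in this thread. The count_devices helper and the shortened sample string are hypothetical. Unlike the pipeline above, it only counts tokens that directly follow a -device flag.

```python
import shlex
from collections import Counter


def count_devices(cmdline: str) -> Counter:
    """Count -device arguments in a qemu command line, keyed by device type."""
    args = shlex.split(cmdline)
    counts = Counter()
    # Walk over (flag, value) pairs so only real -device values are counted.
    for flag, value in zip(args, args[1:]):
        if flag == "-device":
            # The device type is the first comma-separated field.
            counts[value.split(",", 1)[0]] += 1
    return counts


# Hypothetical, shortened command line, for illustration only.
sample = (
    "/usr/libexec/qemu-kvm -S "
    "-device pcie-root-port,port=16,chassis=1,id=pci.1,bus=pcie.0,"
    "multifunction=on,addr=0x2 "
    "-device pcie-root-port,port=17,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1"
)

if __name__ == "__main__":
    print(count_devices(sample)["pcie-root-port"])
```

On a real host the string would have to come from the running process (for example from the libvirt domain log), which this sketch does not attempt.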

Danilo - the gitlab page above mentions several patches already linking to
it, some of them from the last few days. I didn't check them. Is
there still value in opening a bug, or is the issue already sufficiently
clear?

That said, I failed to verify this using isa-debugcon as mentioned there,
and the various things the command uses (FDs, storage) are temporary -
created on the fly by libvirt/vdsm - so I can't just copy the command and
add a few options. If needed I guess it's possible to hack this using a
vdsm hook or something.

For the time being, is there a workaround, temporary build, or anything else
other than downgrading to qemu-kvm-6.0.0? I guess 6.0.0 is considered
deprecated: e.g. Sandro is going to remove it from the ovirt-release package
( https://github.com/oVirt/ovirt-release/pull/2 ), following the thread
"[CentOS-devel] Is advanced-virt (Virtualization SIG) not relevant anymore
with latest CentOS Stream 8?".
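
Before deciding on a downgrade, it may help to check which build a given VM was actually started with. The sketch below is only an illustration: it parses the "qemu version:" field from a libvirt startup log line like the ones quoted in this thread. The AFFECTED constant encodes an assumption taken from this thread alone (6.1.0 fails, 6.0.0 works); the function names are hypothetical.

```python
import re

# Assumption from this thread only: 6.1.0 is the build reported as failing.
AFFECTED = (6, 1, 0)

_VERSION_RE = re.compile(r"qemu version: (\d+)\.(\d+)\.(\d+)")


def qemu_version(log_line: str):
    """Return the qemu version as a tuple of ints, or None if not found."""
    match = _VERSION_RE.search(log_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_affected(log_line: str) -> bool:
    """True if the log line reports the version this thread says is broken."""
    return qemu_version(log_line) == AFFECTED


# Fragment of the startup line quoted in this thread.
line = (
    "qemu version: 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, "
    "kernel: 4.18.0-348.el8.x86_64"
)

if __name__ == "__main__":
    print(qemu_version(line), is_affected(line))
```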


>
>
>
>>
>> [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL).
>> CentOS Stream 9 is the other way around.
>>
>
> Sadly no, from my experience fixes still land in RHEL first and then in
> CentOS Stream, at least for systemd on CentOS Stream 9.
> I would love to see fixes coming to Stream first.
>

Best regards,

-- 
Didi
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/VN4YUIYKJLIRPMFPWGZRXGU2J44QURIB/


[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-16 Thread Sandro Bonazzola
On Wed, Nov 17, 2021 at 03:12, Danilo de Paula <ddepa...@redhat.com>
wrote:

> Since you're consuming the CentOS Stream 8 packages (I assume), and
> CentOS Stream 8 is actually the open development of the next RHEL minor
> release (8.6) [1], it makes a lot of sense to open BZs against those
> packages in RHEL-8.6.
>
> Especially since we won't fix those problems in CentOS Stream without
> fixing them in RHEL first.
>
> So, if you believe that this is a problem with the package itself (as it
> looks like), I strongly suggest opening a BZ against those packages in RHEL.
>

Didi, can you please open a bug against the RHEL 8 CentOS Stream version
for the qemu-kvm component?



>
> [1] - This is only true for CentOS Stream 8 (which is a copy of RHEL).
> CentOS Stream 9 is the other way around.
>

Sadly no, from my experience fixes still land in RHEL first and then in
CentOS Stream, at least for systemd on CentOS Stream 9.
I would love to see fixes coming to Stream first.



>
> On Tue, Nov 16, 2021 at 5:59 AM Yedidyah Bar David 
> wrote:
>
>> On Tue, Nov 16, 2021 at 12:42 PM Nir Soffer  wrote:
>> >
>> > On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola 
>> wrote:
>> >>
>> >> +Eduardo Lima and +Danilo Cesar Lemes de Paula  FYI
>> >>
>> >> On Tue, Nov 16, 2021 at 08:55, Yedidyah Bar David <d...@redhat.com>
>> >> wrote:
>> >>>
>> >>> Hi all,
>> >>>
>> >>> For a few days now we have failures in CI of the he-basic suite.
>> >>>
>> >>> At one point the failure seemed to have been around
>> >>> networking/routing/firewalling, but later it changed, and now the
>> >>> deploy process fails while trying to first start the engine vm after
>> >>> it's copied to the shared storage.
>> >>>
>> >>> I ran locally OST he-basic with current ost-images, reproduced the
>> >>> issue, and managed to "fix" it by enabling
>> >>> ovirt-master-centos-stream-advanced-virtualization-testing and
>> >>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to
>> >>> 15:6.0.0-33.el8s.
>> >>>
>> >>> Is this a known issue?
>> >>>
>> >>> How do we handle? Perhaps we should conflict with it somewhere until
>> >>> we find and fix the root cause.
>> >>>
>> >>> Please note that the flow is:
>> >>>
>> >>> 1. Create a local VM from the appliance image
>> >
>> >
>> > How do you create the vm?
>>
>> With virt-install:
>>
>>
>> https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/bootstrap_local_vm/02_create_local_vm.yml#L96
>>
>> >
>> > Are you using libvirt? What is the VM XML used?
>> >
>> >>>
>> >>> 2. Do stuff on this machine
>> >>> 3. Shut it down
>> >>> 4. Copy its disk to shared storage
>> >>> 5. Start the machine from the shared storage
>> >
>> >
>> > Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0?
>>
>> It seems so, yes.
>>
>> >
>> >>>
>> >>>
>> >>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0
>> >>> (so the copying (using qemu-img) did work well) and the difference is
>> >>> elsewhere.
>> >>>
>> >>> Following is the diff between the qemu commands of (1.) and (5.) (as
>> >>> found in the respective logs). Any clue?
>> >>>
>> >>> --- localq  2021-11-16 08:48:01.230426260 +0100
>> >>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100
>> >>> @@ -1,54 +1,79 @@
>> >>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0,
>> >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
>> >>> , 2021-11-09-20:38:08, ), qemu version:
>> >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
>> >>> 4.18.0-348.el8.x86_64, hostname:
>> >>> ost-he-basic-suite-master-host-0.lago.local
>> >>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0,
>> >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
>> >>> , 2021-11-09-20:38:08, ), qemu version:
>> >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
>> >>> 4.18.0-348.el8.x86_64, hostname:
>> >>> ost-he-basic-suite-master-host-0.lago.local
>> >>>  LC_ALL=C \
>> >>>  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
>> >>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \
>> >>> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share \
>> >>> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \
>> >>> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \
>> >>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \
>> >>> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \
>> >>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \
>> >>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \
>> >>>  /usr/libexec/qemu-kvm \
>> >>> --name guest=HostedEngineLocal,debug-threads=on \
>> >>> +-name guest=HostedEngine,debug-threads=on \
>> >>>  -S \
>> >>> --object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' \
>> >>> 

[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-16 Thread Danilo de Paula
Since you're consuming the CentOS Stream 8 packages (I assume), and
CentOS Stream 8 is actually the open development of the next RHEL minor
release (8.6) [1], it makes a lot of sense to open BZs against those
packages in RHEL-8.6.

Especially since we won't fix those problems in CentOS Stream without
fixing them in RHEL first.

So, if you believe that this is a problem with the package itself (as it
looks like), I strongly suggest opening a BZ against those packages in RHEL.

[1] - This is only true for CentOS Stream 8 (which is a copy of RHEL).
CentOS Stream 9 is the other way around.

On Tue, Nov 16, 2021 at 5:59 AM Yedidyah Bar David  wrote:

> On Tue, Nov 16, 2021 at 12:42 PM Nir Soffer  wrote:
> >
> > On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola 
> wrote:
> >>
> >> +Eduardo Lima and +Danilo Cesar Lemes de Paula  FYI
> >>
> >> On Tue, Nov 16, 2021 at 08:55, Yedidyah Bar David <d...@redhat.com>
> >> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> For a few days now we have failures in CI of the he-basic suite.
> >>>
> >>> At one point the failure seemed to have been around
> >>> networking/routing/firewalling, but later it changed, and now the
> >>> deploy process fails while trying to first start the engine vm after
> >>> it's copied to the shared storage.
> >>>
> >>> I ran locally OST he-basic with current ost-images, reproduced the
> >>> issue, and managed to "fix" it by enabling
> >>> ovirt-master-centos-stream-advanced-virtualization-testing and
> >>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to
> >>> 15:6.0.0-33.el8s.
> >>>
> >>> Is this a known issue?
> >>>
> >>> How do we handle? Perhaps we should conflict with it somewhere until
> >>> we find and fix the root cause.
> >>>
> >>> Please note that the flow is:
> >>>
> >>> 1. Create a local VM from the appliance image
> >
> >
> > How do you create the vm?
>
> With virt-install:
>
>
> https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/bootstrap_local_vm/02_create_local_vm.yml#L96
>
> >
> > Are you using libvirt? What is the VM XML used?
> >
> >>>
> >>> 2. Do stuff on this machine
> >>> 3. Shut it down
> >>> 4. Copy its disk to shared storage
> >>> 5. Start the machine from the shared storage
> >
> >
> > Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0?
>
> It seems so, yes.
>
> >
> >>>
> >>>
> >>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0
> >>> (so the copying (using qemu-img) did work well) and the difference is
> >>> elsewhere.
> >>>
> >>> Following is the diff between the qemu commands of (1.) and (5.) (as
> >>> found in the respective logs). Any clue?
> >>>
> >>> --- localq  2021-11-16 08:48:01.230426260 +0100
> >>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100
> >>> @@ -1,54 +1,79 @@
> >>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0,
> >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
> >>> , 2021-11-09-20:38:08, ), qemu version:
> >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
> >>> 4.18.0-348.el8.x86_64, hostname:
> >>> ost-he-basic-suite-master-host-0.lago.local
> >>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0,
> >>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
> >>> , 2021-11-09-20:38:08, ), qemu version:
> >>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
> >>> 4.18.0-348.el8.x86_64, hostname:
> >>> ost-he-basic-suite-master-host-0.lago.local
> >>>  LC_ALL=C \
> >>>  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
> >>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \
> >>> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share \
> >>> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \
> >>> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \
> >>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \
> >>> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \
> >>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \
> >>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \
> >>>  /usr/libexec/qemu-kvm \
> >>> --name guest=HostedEngineLocal,debug-threads=on \
> >>> +-name guest=HostedEngine,debug-threads=on \
> >>>  -S \
> >>> --object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}' \
> >>> --machine pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
> >>> --cpu Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on \
> >>> --m 3171 \
> >>> --object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}'

[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-16 Thread Yedidyah Bar David
On Tue, Nov 16, 2021 at 12:42 PM Nir Soffer  wrote:
>
> On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola  wrote:
>>
>> +Eduardo Lima and +Danilo Cesar Lemes de Paula  FYI
>>
>> On Tue, Nov 16, 2021 at 08:55, Yedidyah Bar David wrote:
>>>
>>> Hi all,
>>>
>>> For a few days now we have failures in CI of the he-basic suite.
>>>
>>> At one point the failure seemed to have been around
>>> networking/routing/firewalling, but later it changed, and now the
>>> deploy process fails while trying to first start the engine vm after
>>> it's copied to the shared storage.
>>>
>>> I ran locally OST he-basic with current ost-images, reproduced the
>>> issue, and managed to "fix" it by enabling
>>> ovirt-master-centos-stream-advanced-virtualization-testing and
>>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to
>>> 15:6.0.0-33.el8s.
>>>
>>> Is this a known issue?
>>>
>>> How do we handle? Perhaps we should conflict with it somewhere until
>>> we find and fix the root cause.
>>>
>>> Please note that the flow is:
>>>
>>> 1. Create a local VM from the appliance image
>
>
> How do you create the vm?

With virt-install:

https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/bootstrap_local_vm/02_create_local_vm.yml#L96

>
> Are you using libvirt? What is the VM XML used?
>
>>>
>>> 2. Do stuff on this machine
>>> 3. Shut it down
>>> 4. Copy its disk to shared storage
>>> 5. Start the machine from the shared storage
>
>
> Just to be sure - step 1 - 4 works, but step 5 fails with qemu 6.1.0?

It seems so, yes.
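
The comparison of the two qemu command lines quoted further down (local VM in step 1 versus shared-storage VM in step 5) can be reproduced along the following lines. This is a minimal sketch using Python's standard difflib, assuming each command has already been split into one argument per line as in the libvirt logs. The diff_commands helper is hypothetical, and the two short lists only reuse a few arguments from the quoted diff for illustration.

```python
import difflib


def diff_commands(local_cmd, shared_cmd):
    """Return a unified diff of two qemu command lines given as line lists."""
    return list(
        difflib.unified_diff(
            local_cmd,
            shared_cmd,
            fromfile="localq",
            tofile="sharedq",
            lineterm="",  # inputs carry no trailing newlines
        )
    )


# A few arguments taken from the diff quoted in this thread.
local_cmd = [
    "/usr/libexec/qemu-kvm",
    "-name guest=HostedEngineLocal,debug-threads=on",
    "-S",
    "-smp 2,sockets=2,cores=1,threads=1",
]
shared_cmd = [
    "/usr/libexec/qemu-kvm",
    "-name guest=HostedEngine,debug-threads=on",
    "-S",
    "-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1",
]

if __name__ == "__main__":
    for row in diff_commands(local_cmd, shared_cmd):
        print(row)
```

Because every qemu option starts with a dash, removed lines in the output begin with a double dash, exactly as in the quoted diff.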

>
>>>
>>>
>>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0
>>> (so the copying (using qemu-img) did work well) and the difference is
>>> elsewhere.
>>>
>>> Following is the diff between the qemu commands of (1.) and (5.) (as
>>> found in the respective logs). Any clue?
>>>
>>> --- localq  2021-11-16 08:48:01.230426260 +0100
>>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100
>>> @@ -1,54 +1,79 @@
>>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0,
>>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
>>> , 2021-11-09-20:38:08, ), qemu version:
>>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
>>> 4.18.0-348.el8.x86_64, hostname:
>>> ost-he-basic-suite-master-host-0.lago.local
>>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0,
>>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
>>> , 2021-11-09-20:38:08, ), qemu version:
>>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
>>> 4.18.0-348.el8.x86_64, hostname:
>>> ost-he-basic-suite-master-host-0.lago.local
>>>  LC_ALL=C \
>>>  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
>>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \
>>> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share
>>>  \
>>> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \
>>> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \
>>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \
>>> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \
>>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \
>>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \
>>>  /usr/libexec/qemu-kvm \
>>> --name guest=HostedEngineLocal,debug-threads=on \
>>> +-name guest=HostedEngine,debug-threads=on \
>>>  -S \
>>> --object 
>>> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}'
>>> \
>>> --machine 
>>> pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram
>>> \
>>> --cpu 
>>> Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on
>>> \
>>> --m 3171 \
>>> --object 
>>> '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}' \
>>> +-object 
>>> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-2-HostedEngine/master-key.aes"}'
>>> \
>>> +-machine 
>>> pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,graphics=off \
>>> +-cpu Cascadelake-Server-noTSX,mpx=off \
>>> +-m size=3247104k,slots=16,maxmem=12988416k \
>>>  -overcommit mem-lock=off \
>>> --smp 2,sockets=2,cores=1,threads=1 \
>>> --uuid 716b26d9-982b-4c51-ac05-646f28346007 \
>>> +-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1 \
>>> +-object '{"qom-type":"iothread","id":"iothread1"}' \
>>> +-object 
>>> '{"qom-type":"memory-backend-ram","id":"ram-node0","size":3325034496}'
>>> \
>>> +-numa node,nodeid=0,cpus=0-31,memdev=ram-node0 \
>>> +-uuid a10f5518-1fc2-4aae-b7da-5d1d9875e753 \
>>> +-smbios 
>>> 

[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-16 Thread Nir Soffer
On Tue, Nov 16, 2021 at 12:28 PM Sandro Bonazzola 
wrote:

> +Eduardo Lima  and +Danilo Cesar Lemes de Paula
>   FYI
>
> On Tue, Nov 16, 2021 at 08:55, Yedidyah Bar David <d...@redhat.com>
> wrote:
>
>> Hi all,
>>
>> For a few days now we have failures in CI of the he-basic suite.
>>
>> At one point the failure seemed to have been around
>> networking/routing/firewalling, but later it changed, and now the
>> deploy process fails while trying to first start the engine vm after
>> it's copied to the shared storage.
>>
>> I ran locally OST he-basic with current ost-images, reproduced the
>> issue, and managed to "fix" it by enabling
>> ovirt-master-centos-stream-advanced-virtualization-testing and
>> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to
>> 15:6.0.0-33.el8s.
>>
>> Is this a known issue?
>>
>> How do we handle? Perhaps we should conflict with it somewhere until
>> we find and fix the root cause.
>>
>> Please note that the flow is:
>>
>> 1. Create a local VM from the appliance image
>>
>
How do you create the vm?

Are you using libvirt? What is the VM XML used?


>> 2. Do stuff on this machine
>> 3. Shut it down
>> 4. Copy its disk to shared storage
>> 5. Start the machine from the shared storage
>>
>
Just to be sure - steps 1-4 work, but step 5 fails with qemu 6.1.0?


>
>> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0
>> (so the copying (using qemu-img) did work well) and the difference is
>> elsewhere.
>>
>> Following is the diff between the qemu commands of (1.) and (5.) (as
>> found in the respective logs). Any clue?
>>
>> --- localq  2021-11-16 08:48:01.230426260 +0100
>> +++ sharedq 2021-11-16 08:48:46.884937598 +0100
>> @@ -1,54 +1,79 @@
>> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0,
>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
>> , 2021-11-09-20:38:08, ), qemu version:
>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
>> 4.18.0-348.el8.x86_64, hostname:
>> ost-he-basic-suite-master-host-0.lago.local
>> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0,
>> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
>> , 2021-11-09-20:38:08, ), qemu version:
>> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
>> 4.18.0-348.el8.x86_64, hostname:
>> ost-he-basic-suite-master-host-0.lago.local
>>  LC_ALL=C \
>>  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
>> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \
>> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share
>> \
>> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \
>> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config
>> \
>> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \
>> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \
>> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \
>> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \
>>  /usr/libexec/qemu-kvm \
>> --name guest=HostedEngineLocal,debug-threads=on \
>> +-name guest=HostedEngine,debug-threads=on \
>>  -S \
>> --object
>> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}'
>> \
>> --machine
>> pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram
>> \
>> --cpu
>> Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on
>> \
>> --m 3171 \
>> --object
>> '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}' \
>> +-object
>> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-2-HostedEngine/master-key.aes"}'
>> \
>> +-machine
>> pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,graphics=off \
>> +-cpu Cascadelake-Server-noTSX,mpx=off \
>> +-m size=3247104k,slots=16,maxmem=12988416k \
>>  -overcommit mem-lock=off \
>> --smp 2,sockets=2,cores=1,threads=1 \
>> --uuid 716b26d9-982b-4c51-ac05-646f28346007 \
>> +-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1 \
>> +-object '{"qom-type":"iothread","id":"iothread1"}' \
>> +-object
>> '{"qom-type":"memory-backend-ram","id":"ram-node0","size":3325034496}'
>> \
>> +-numa node,nodeid=0,cpus=0-31,memdev=ram-node0 \
>> +-uuid a10f5518-1fc2-4aae-b7da-5d1d9875e753 \
>> +-smbios
>> type=1,manufacturer=oVirt,product=RHEL,version=8.6-1.el8,serial=d2f36f31-bb29-4e1f-b52d-8fddb632953c,uuid=a10f5518-1fc2-4aae-b7da-5d1d9875e753,family=oVirt
>> \
>>  -no-user-config \
>>  -nodefaults \
>>  -chardev socket,id=charmonitor,fd=40,server=on,wait=off \
>>  -mon chardev=charmonitor,id=monitor,mode=control \
>> --rtc base=utc \
>> +-rtc base=2021-11-14T15:29:08,driftfix=slew \
>> +-global 

[ovirt-devel] Re: qemu-kvm 6.1.0 breaks hosted-engine

2021-11-16 Thread Sandro Bonazzola
+Eduardo Lima and +Danilo Cesar Lemes de Paula, FYI

On Tue, Nov 16, 2021 at 08:55, Yedidyah Bar David wrote:

> Hi all,
>
> For a few days now we have failures in CI of the he-basic suite.
>
> At one point the failure seemed to have been around
> networking/routing/firewalling, but later it changed, and now the
> deploy process fails while trying to first start the engine vm after
> it's copied to the shared storage.
>
> I ran locally OST he-basic with current ost-images, reproduced the
> issue, and managed to "fix" it by enabling
> ovirt-master-centos-stream-advanced-virtualization-testing and
> downgrading qemu-kvm-* from 6.1.0 (from AppStream) to
> 15:6.0.0-33.el8s.
>
> Is this a known issue?
>
> How do we handle this? Perhaps we should conflict with it somewhere until
> we find and fix the root cause.
>
> Please note that the flow is:
>
> 1. Create a local VM from the appliance image
> 2. Do stuff on this machine
> 3. Shut it down
> 4. Copy its disk to shared storage
> 5. Start the machine from the shared storage
>
> And that (1.) did work with 6.1.0, and also (5.) did work with 6.0.0
> (so the copying (using qemu-img) did work well) and the difference is
> elsewhere.
>
> Following is the diff between the qemu commands of (1.) and (5.) (as
> found in the respective logs). Any clue?
>
> --- localq  2021-11-16 08:48:01.230426260 +0100
> +++ sharedq 2021-11-16 08:48:46.884937598 +0100
> @@ -1,54 +1,79 @@
> -2021-11-14 15:09:56.430+: starting up libvirt version: 7.9.0,
> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
> , 2021-11-09-20:38:08, ), qemu version:
> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
> 4.18.0-348.el8.x86_64, hostname:
> ost-he-basic-suite-master-host-0.lago.local
> +2021-11-14 15:29:10.686+: starting up libvirt version: 7.9.0,
> package: 1.module_el8.6.0+983+a7505f3f (CentOS Buildsys
> , 2021-11-09-20:38:08, ), qemu version:
> 6.1.0qemu-kvm-6.1.0-4.module_el8.6.0+983+a7505f3f, kernel:
> 4.18.0-348.el8.x86_64, hostname:
> ost-he-basic-suite-master-host-0.lago.local
>  LC_ALL=C \
>  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
> -HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal \
> -XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.local/share
> \
> -XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.cache \
> -XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/.config \
> +HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine \
> +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.local/share \
> +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.cache \
> +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-HostedEngine/.config \
>  /usr/libexec/qemu-kvm \
> --name guest=HostedEngineLocal,debug-threads=on \
> +-name guest=HostedEngine,debug-threads=on \
>  -S \
> --object
> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-HostedEngineLocal/master-key.aes"}'
> \
> --machine
> pc-q35-rhel8.5.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram
> \
> --cpu
> Cascadelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvmclock=on
> \
> --m 3171 \
> --object
> '{"qom-type":"memory-backend-ram","id":"pc.ram","size":3325034496}' \
> +-object
> '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-2-HostedEngine/master-key.aes"}'
> \
> +-machine
> pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,graphics=off \
> +-cpu Cascadelake-Server-noTSX,mpx=off \
> +-m size=3247104k,slots=16,maxmem=12988416k \
>  -overcommit mem-lock=off \
> --smp 2,sockets=2,cores=1,threads=1 \
> --uuid 716b26d9-982b-4c51-ac05-646f28346007 \
> +-smp 2,maxcpus=32,sockets=16,dies=1,cores=2,threads=1 \
> +-object '{"qom-type":"iothread","id":"iothread1"}' \
> +-object
> '{"qom-type":"memory-backend-ram","id":"ram-node0","size":3325034496}'
> \
> +-numa node,nodeid=0,cpus=0-31,memdev=ram-node0 \
> +-uuid a10f5518-1fc2-4aae-b7da-5d1d9875e753 \
> +-smbios
> type=1,manufacturer=oVirt,product=RHEL,version=8.6-1.el8,serial=d2f36f31-bb29-4e1f-b52d-8fddb632953c,uuid=a10f5518-1fc2-4aae-b7da-5d1d9875e753,family=oVirt
> \
>  -no-user-config \
>  -nodefaults \
>  -chardev socket,id=charmonitor,fd=40,server=on,wait=off \
>  -mon chardev=charmonitor,id=monitor,mode=control \
> --rtc base=utc \
> +-rtc base=2021-11-14T15:29:08,driftfix=slew \
> +-global kvm-pit.lost_tick_policy=delay \
> +-no-hpet \
>  -no-shutdown \
>  -global ICH9-LPC.disable_s3=1 \
>  -global ICH9-LPC.disable_s4=1 \
> --boot menu=off,strict=on \
> +-boot strict=on \
>  -device
> pcie-root-port,port=16,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
> \
>  -device pcie-root-port,port=17,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1
> \