Re: issue when not using acpi indices in libvirt 7.4.0 and qemu 6.0.0

2021-06-25 Thread Riccardo Ravaioli
On Thu, 24 Jun 2021 at 04:11, Laine Stump wrote:

> [...]
>

Hi Laine,

Thank you so much for your analysis and thoughtful insights. As you noticed
straight away, there were indeed some minor differences in the two VM
definitions that I didn't see before posting. In the end, though, they did
not affect the interface naming.

Anyway, we found out that the problem actually lay in qemu 6.0.0, which was
promptly patched by its maintainers after we contacted them:
https://lists.nongnu.org/archive/html/qemu-stable/2021-06/msg00058.html

Thanks again!

Riccardo


Re: issue when not using acpi indices in libvirt 7.4.0 and qemu 6.0.0

2021-06-23 Thread Laine Stump




On 6/23/21 7:37 PM, Riccardo Ravaioli wrote:
On Wed, 23 Jun 2021 at 18:59, Daniel P. Berrangé wrote:


[...]
So your config here does NOT list any ACPI indexes


Exactly, I don't list any ACPI indices.

 > After upgrading to libvirt 7.4.0 and qemu 6.0.0, the XML snippet above
 > yielded:
 > - ens1 for the first virtio interface => OK
 > - rename4 for the second virtio interface => **KO**


(this is reminiscent of what would sometimes happen back in the "bad old 
days" of ethN NIC naming.)



 > - ens3 for the PCI passthrough interface  => OK


With the older libvirt + qemu, the guest (Debian) was setting the device 
names to ens1, ens2, and ens3 (through some sort of renaming, apparently 
by udev). The names that these interfaces would normally get (e.g. in 
Fedora or RHEL8) would be enp1s1, enp1s2, and enp1s3.


With the newer libvirt + qemu, the guest still has the names set by 
systemd (?) to enp1s1, enp1s2, and enp1s3.
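
As an illustration of where names like enp1s2 come from, here is a minimal Python sketch of the path-based scheme (the ID_NET_NAME_PATH value shown in the udevadm output elsewhere in this thread). The function name `path_based_name` is hypothetical, the PCI domain `0000` in the usage example is an assumption, and the sketch deliberately ignores multifunction devices and port suffixes. The slot-based ensN names come from a different source and are not covered here.

```python
import re

# Matches a PCI address of the form DDDD:BB:SS.F (all fields hexadecimal).
PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def path_based_name(pci_addr, prefix="en"):
    """Build a path-based interface name from a PCI address.

    Simplified: the domain is only included when it is non-zero, and the
    function number is only included when it is non-zero.
    """
    match = PCI_ADDR.match(pci_addr)
    if match is None:
        raise ValueError(f"not a PCI address: {pci_addr!r}")
    domain, bus, slot, func = (int(group, 16) for group in match.groups())

    name = prefix
    if domain != 0:
        name += f"P{domain}"
    # Bus and slot are written in decimal, not hexadecimal.
    name += f"p{bus}s{slot}"
    if func != 0:
        name += f"f{func}"
    return name


# Assumed full addresses for the three NICs on bus 1 of the guest.
for addr in ("0000:01:01.0", "0000:01:02.0", "0000:01:03.0"):
    print(addr, "->", path_based_name(addr))
```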



So from libvirt's POV, nothing should have changed upon upgrade,
as we wouldn't be setting any ACPI indexes by default.


Right. If ACPI indexes had been turned on, I would have expected the 
names to be, e.g., eno1, eno2, eno3. That would require explicitly 
adding the option to the qemu commandline, and it isn't there (see below).




Can you show the QEMU command line from /var/log/libvirt/qemu/$GUEST.log
both before and after the libvirt upgrade.


Sure, here it is before the upgrade: https://pastebin.com/ZzKd2uRJ 



-netdev tap,fd=50,id=hostnet0 \
-device virtio-net-pci,csum=off,netdev=hostnet0,id=net0,mac=52:54:00:aa:cc:05,bus=pci.1,addr=0x1 \

-netdev tap,fd=51,id=hostnet1 \
-device virtio-net-pci,csum=off,netdev=hostnet1,id=net1,mac=52:54:00:aa:bb:81,bus=pci.1,addr=0x2 \

[...]
-device vfio-pci,host=:0d:00.0,id=hostdev0,bus=pci.1,addr=0x3 \

And here after the upgrade: https://pastebin.com/EMu6Jgat 



-netdev tap,fd=55,id=hostnet0 \
-device virtio-net-pci,csum=off,netdev=hostnet0,id=net0,mac=52:54:00:aa:cc:a0,bus=pci.1,addr=0x1 \

-netdev tap,fd=56,id=hostnet1 \
-device virtio-net-pci,csum=off,netdev=hostnet1,id=net1,mac=52:54:00:aa:bb:a1,bus=pci.1,addr=0x2 \

[...]


So there is no change in the qemu commandline for the virtio-net 
devices, nor for the hostdev.
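
This kind of before/after comparison can be automated. The following is a rough Python sketch, not anything libvirt provides: the helper `device_args` is a hypothetical name, and it assumes the command line is in the backslash-continued form found in the libvirt log. It ignores the `mac` property so that only the device model, bus and address are compared, and it does not handle qemu's `,,` escaping inside values.

```python
import shlex


def device_args(cmdline, ignore=("mac",)):
    """Return the value of every -device option, minus ignored properties."""
    # Join backslash-continued lines, then split the way a shell would.
    tokens = shlex.split(cmdline.replace("\\\n", " "))
    devices = []
    for flag, value in zip(tokens, tokens[1:]):
        if flag != "-device":
            continue
        props = [p for p in value.split(",") if p.split("=", 1)[0] not in ignore]
        devices.append(",".join(props))
    return devices


# Excerpts of the two command lines quoted above (first NIC only).
BEFORE = r"""
-netdev tap,fd=50,id=hostnet0 \
-device virtio-net-pci,csum=off,netdev=hostnet0,id=net0,mac=52:54:00:aa:cc:05,bus=pci.1,addr=0x1 \
"""
AFTER = r"""
-netdev tap,fd=55,id=hostnet0 \
-device virtio-net-pci,csum=off,netdev=hostnet0,id=net0,mac=52:54:00:aa:cc:a0,bus=pci.1,addr=0x1 \
"""

if device_args(BEFORE) == device_args(AFTER):
    print("device definitions match (ignoring MAC addresses)")
else:
    print("device definitions differ")
```

Dropping `mac` from the `ignore` tuple makes the same comparison flag the MAC address change discussed below.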


(BTW, you say that your vfio-assigned device is SRIOV, but it isn't - 
it's a standard ethernet device - "Intel Corporation I210 Gigabit 
Network Connection" - this has no effect on the current conversation, 
just FYI).


Since the name of the devices hasn't changed to "enoBLAH", I think the 
whole ACPI index thing is a red herring - ACPI indexes aren't being set 
and the device names aren't being set based on the non-existent ACPI 
indexes. There is something else going on (seemingly tied to the device 
renaming that udev (?) is doing from enpXsY to ensZ).


I notice that you're apparently redefining this domain from scratch each 
time it is started.


1) The machinetype changes from pc-i440fx-5.2 to pc-i440fx-6.0, implying 
that each time the domain is started, it is being told to use the 
generic machinetype "pc", which is then canonicalized to "the newest 
pc-i440fx-based machinetype" before starting the guest.


2) The MAC address has been changed for the two virtio-net cards, but 
not to some random number as would happen if you were allowing libvirt 
to generate the addresses itself.


It's common for OSes to notice a new MAC address and attempt to give the 
interface a new name. Perhaps this is happening and whoever/whatever is 
doing that is screwing things up. Or it's possible there is some minor 
change in the machinetype from pc-i440fx-5.2 to pc-i440fx-6.0 that is 
causing this renaming to behave differently.


If you really need your guests to be stable, you shouldn't just use "pc" 
as the machinetype every time the guest is started, but instead save the 
canonicalized machinetype listed in the XML when you initially define 
the domain, and use that canonicalized machinetype on all future starts 
of the domain. Likewise, you should retain the exact MAC addresses that 
are used for all the NICs when the domain is originally defined and 
started for the first time, and use those exact same MAC addresses in 
subsequent starts. That way you are guaranteed (modulo any bugs) that 
the guest is presented with the exact same hardware each time it boots. 
If you use "virsh define" and "virsh start" (rather than "virsh create" 
- I can't be certain this is what you're doing, but there are clues 
indicating it might be the case) then all these details are 
automatically preserved for you within libvirt's persistent domain 
configuration.
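
For setups that generate the domain XML themselves, a small pre-flight check along these lines can catch both problems before the guest is started. This is only a sketch: `unpinned_settings` and the `GENERIC_MACHINES` set are hypothetical names, and the set of aliases is an assumption that is not exhaustive. It uses the standard libvirt domain XML layout (`<os><type machine=...>` and `<interface><mac address=.../>`).

```python
import xml.etree.ElementTree as ET

# Assumed list of generic aliases that get resolved to the newest
# versioned machine type; extend as needed for other architectures.
GENERIC_MACHINES = {"pc", "q35"}


def unpinned_settings(domain_xml):
    """Return a list of settings that could change between guest starts."""
    root = ET.fromstring(domain_xml)
    problems = []

    os_type = root.find("./os/type")
    machine = os_type.get("machine") if os_type is not None else None
    if machine is None or machine in GENERIC_MACHINES:
        problems.append(f"machine type not pinned: {machine!r}")

    for index, iface in enumerate(root.findall("./devices/interface")):
        mac = iface.find("mac")
        if mac is None or not mac.get("address"):
            problems.append(f"interface #{index} has no explicit MAC address")

    return problems


# Hypothetical domain definition, for illustration only.
EXAMPLE = """
<domain type="kvm">
  <os><type arch="x86_64" machine="pc">hvm</type></os>
  <devices>
    <interface type="network">
      <source network="default"/>
    </interface>
  </devices>
</domain>
"""

for problem in unpinned_settings(EXAMPLE):
    print(problem)
```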


One other comment - I don't remember the exact location, but I recall 
from a long long time ago that udev saves the information about names 
that it gives to NICs "somewhere". You may want to find and clear out 
that cache of info in the guest to get 

Re: issue when not using acpi indices in libvirt 7.4.0 and qemu 6.0.0

2021-06-23 Thread Riccardo Ravaioli
On Wed, 23 Jun 2021 at 18:59, Daniel P. Berrangé wrote:

> [...]
> So your config here does NOT list any ACPI indexes
>

Exactly, I don't list any ACPI indices.


> > After upgrading to libvirt 7.4.0 and qemu 6.0.0, the XML snippet above
> > yielded:
> > - ens1 for the first virtio interface => OK
> > - rename4 for the second virtio interface => **KO**
> > - ens3 for the PCI passthrough interface  => OK
>
> So from libvirt's POV, nothing should have changed upon upgrade,
> as we wouldn't be setting any ACPI indexes by default.
>
> Can you show the QEMU command line from /var/log/libvirt/qemu/$GUEST.log
> both before and after the libvirt upgrade.
>

Sure, here it is before the upgrade: https://pastebin.com/ZzKd2uRJ
And here after the upgrade: https://pastebin.com/EMu6Jgat
(there is a minor difference in the disks which shouldn't be related to
this issue)

Thanks!

Riccardo


Re: issue when not using acpi indices in libvirt 7.4.0 and qemu 6.0.0

2021-06-23 Thread Daniel P . Berrangé
On Wed, Jun 23, 2021 at 06:49:12PM +0200, Riccardo Ravaioli wrote:
> Hi everyone,
> 
> We have an issue with how network interfaces are presented in the VM with
> the latest libvirt 7.4.0 and qemu 6.0.0.
> 
> Previously, we were on libvirt 7.0.0 and qemu 5.2.0, and we used increasing
> virtual PCI addresses for any type of network interface (virtio, PCI
> passthrough, SRIOV) in order to decide the interface order inside the VM.
> For instance the following snippet yields ens1, ens2 and ens3 in a Debian
> Buster VM:
> 
> [XML snippet not preserved in the archive; only the type="pci" address
> attributes of the three devices survive]

So your config here does NOT list any ACPI indexes

> After upgrading to libvirt 7.4.0 and qemu 6.0.0, the XML snippet above
> yielded:
> - ens1 for the first virtio interface => OK
> - rename4 for the second virtio interface => **KO**
> - ens3 for the PCI passthrough interface  => OK

So from libvirt's POV, nothing should have changed upon upgrade,
as we wouldn't be setting any ACPI indexes by default.

Can you show the QEMU command line from /var/log/libvirt/qemu/$GUEST.log
both before and after the libvirt upgrade.


Regards,
Daniel



issue when not using acpi indices in libvirt 7.4.0 and qemu 6.0.0

2021-06-23 Thread Riccardo Ravaioli
Hi everyone,

We have an issue with how network interfaces are presented in the VM with
the latest libvirt 7.4.0 and qemu 6.0.0.

Previously, we were on libvirt 7.0.0 and qemu 5.2.0, and we used increasing
virtual PCI addresses for any type of network interface (virtio, PCI
passthrough, SRIOV) in order to decide the interface order inside the VM.
For instance the following snippet yields ens1, ens2 and ens3 in a Debian
Buster VM:

[XML snippet not preserved in the archive]

After upgrading to libvirt 7.4.0 and qemu 6.0.0, the XML snippet above
yielded:
- ens1 for the first virtio interface => OK
- rename4 for the second virtio interface => **KO**
- ens3 for the PCI passthrough interface  => OK

Argh! What happened to ens2? By running udevadm inside the VM, I see that
"rename4" is the result of a conflict between the ID_NET_NAME_SLOT of the
second and the third interface, both appearing as ID_NET_NAME_SLOT=ens3. In
theory rename4 should show ID_NET_NAME_SLOT=ens2. What happened?

#  udevadm info -q all /sys/class/net/rename4
P: /devices/pci:00/:00:03.0/:01:02.0/virtio4/net/rename4
L: 0
E: DEVPATH=/devices/pci:00/:00:03.0/:01:02.0/virtio4/net/rename4
E: INTERFACE=rename4
E: IFINDEX=4
E: SUBSYSTEM=net
E: USEC_INITIALIZED=94191911
E: ID_NET_NAMING_SCHEME=v240
E: ID_NET_NAME_MAC=enx525400aabba1
E: ID_NET_NAME_PATH=enp1s2
E: ID_NET_NAME_SLOT=ens3
E: ID_BUS=pci
E: ID_VENDOR_ID=0x1af4
E: ID_MODEL_ID=0x1000
E: ID_PCI_CLASS_FROM_DATABASE=Network controller
E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller
E: ID_VENDOR_FROM_DATABASE=Red Hat, Inc.
E: ID_MODEL_FROM_DATABASE=Virtio network device
E: ID_PATH=pci-:01:02.0
E: ID_PATH_TAG=pci-_01_02_0
E: ID_NET_DRIVER=virtio_net
E: ID_NET_LINK_FILE=/usr/lib/systemd/network/99-default.link
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/rename4
E: TAGS=:systemd:

#  udevadm info -q all /sys/class/net/ens3
P: /devices/pci:00/:00:03.0/:01:03.0/net/ens3
L: 0
E: DEVPATH=/devices/pci:00/:00:03.0/:01:03.0/net/ens3
E: INTERFACE=ens3
E: IFINDEX=2
E: SUBSYSTEM=net
E: USEC_INITIALIZED=3600940
E: ID_NET_NAMING_SCHEME=v240
E: ID_NET_NAME_MAC=enx00900b621235
E: ID_OUI_FROM_DATABASE=LANNER ELECTRONICS, INC.
E: ID_NET_NAME_PATH=enp1s3
E: ID_NET_NAME_SLOT=ens3
E: ID_BUS=pci
E: ID_VENDOR_ID=0x8086
E: ID_MODEL_ID=0x1533
E: ID_PCI_CLASS_FROM_DATABASE=Network controller
E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller
E: ID_VENDOR_FROM_DATABASE=Intel Corporation
E: ID_MODEL_FROM_DATABASE=I210 Gigabit Network Connection
E: ID_PATH=pci-:01:03.0
E: ID_PATH_TAG=pci-_01_03_0
E: ID_NET_DRIVER=igb
E: ID_NET_LINK_FILE=/usr/lib/systemd/network/99-default.link
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/ens3
E: TAGS=:systemd:
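
The conflict visible in the two dumps above can be spotted mechanically. Below is a minimal Python sketch that parses the `E:` property lines of `udevadm info -q all` output and reports any ID_NET_NAME_SLOT value claimed by more than one interface. The function names are hypothetical, and the input is assumed to be the text output captured from the guest, one string per interface.

```python
from collections import defaultdict


def parse_udev_props(output):
    """Collect the KEY=VALUE pairs from the 'E:' lines of udevadm output."""
    props = {}
    for line in output.splitlines():
        if line.startswith("E: ") and "=" in line:
            key, value = line[3:].split("=", 1)
            props[key] = value
    return props


def slot_name_conflicts(outputs):
    """Map each ID_NET_NAME_SLOT shared by several NICs to their names."""
    by_slot = defaultdict(list)
    for output in outputs:
        props = parse_udev_props(output)
        slot = props.get("ID_NET_NAME_SLOT")
        if slot is not None:
            by_slot[slot].append(props.get("INTERFACE", "?"))
    return {slot: names for slot, names in by_slot.items() if len(names) > 1}


# Excerpts of the two dumps shown above.
RENAME4 = """\
E: INTERFACE=rename4
E: ID_NET_NAME_PATH=enp1s2
E: ID_NET_NAME_SLOT=ens3
"""
ENS3 = """\
E: INTERFACE=ens3
E: ID_NET_NAME_PATH=enp1s3
E: ID_NET_NAME_SLOT=ens3
"""

print(slot_name_conflicts([RENAME4, ENS3]))
```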


Is there anything we can do in the XML definition of the VM to fix this?

The PCI tree from within the VM is the following, if it helps:
(with libvirt 7.0.0 and qemu 5.2.0 it was the same)

# lspci -tv
-[:00]-+-00.0  Intel Corporation 440FX - 82441FX PMC [Natoma]
       +-01.0  Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
       +-01.1  Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
       +-01.2  Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II]
       +-01.3  Intel Corporation 82371AB/EB/MB PIIX4 ACPI
       +-02.0  Cirrus Logic GD 5446
       +-03.0-[01]--+-01.0  Red Hat, Inc. Virtio network device
       |            +-02.0  Red Hat, Inc. Virtio network device
       |            \-03.0  Intel Corporation I210 Gigabit Network Connection
       +-04.0-[02]--
       +-05.0-[03]--
       +-06.0-[04]--
       +-07.0-[05]--
       +-08.0-[06]----01.0  Red Hat, Inc. Virtio block device
       +-09.0  Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
       +-0a.0  Red Hat, Inc. Virtio console
       +-0b.0  Red Hat, Inc. Virtio memory balloon
       \-0c.0  Red Hat, Inc. Virtio RNG


I see that a new feature in qemu and libvirt is to add ACPI indices in
order to have network interfaces appear as *onboard* and sort them through
this index as opposed to virtual PCI addresses. This is great. I see that
in this case, interfaces appear as eno1, eno2, etc.

However, for the sake of backward compatibility, is there a way to have the
previous behaviour where interfaces were called by their PCI slot number
(ens1, ens2, etc.)?

If I move to the new naming yielded by ACPI indices, I am mostly worried
about any possible change in interface names that might occur across VMs
running different OS's, with respect to what we had before with libvirt
7.0.0 and qemu 5.2.0.

Thanks!

Best,
Riccardo Ravaioli