RE: [PATCH] util: Add phys_port_name support on virPCIGetNetName

2020-12-17 Thread Adrian Chiris
>-Original Message-
>From: Laine Stump 
>Sent: Thursday, December 17, 2020 5:22 AM
>To: libvir-list@redhat.com
>Cc: Dmytro Linkin ; Moshe Levi ;
>Adrian Chiris 
>Subject: Re: [PATCH] util: Add phys_port_name support on
>virPCIGetNetName
>
>External email: Use caution opening links or attachments
>
>
>On 12/10/20 11:51 AM, Adrian Chiris wrote:
>
>> Hi,
>> Would be great to get a pair of eyes on this Patch, Thanks!
>
>
>I've looked at it several times and every time would just end up shaking my
>head wondering why there isn't one definitive symlink in the VF's sysfs for the
>netdev of the physical port. (Part of my lack of response from the last time is
>that I didn't know how to respond since I didn't understand why such
>roundabout logic should be needed to answer such a basic question, decided I
>should look at it again before responding, and then forgot about it :-()
>
>
>Anyway, this time I'm determined to not get up until I've got it figured out 
>(or
>at least understand exactly what I don't have figured out)...
>
>
>
>>> -Original Message-
>>> From: Dmytro Linkin 
>>> Sent: Tuesday, October 27, 2020 10:58 AM
>>> To: libvir-list@redhat.com
>>> Cc: Laine Stump ; Adrian Chiris
>>> ; Moshe Levi 
>>> Subject: Re: [PATCH] util: Add phys_port_name support on
>>> virPCIGetNetName
>>>
>>> On Mon, Sep 28, 2020 at 12:56:12PM +0300, Dmytro Linkin wrote:
>>>> On Tue, Sep 22, 2020 at 08:31:15AM -0400, Laine Stump wrote:
>>>>> On 8/28/20 6:53 AM, Dmytro Linkin wrote:
>>>>>> Current virPCIGetNetName() logic is to get net device name by
>>>>>> checking it's phys_port_id, if caller provide it, or by it's index
>>>>>> (eg, by it's position at sysfs net directory). This approach
>>>>>> worked fine up until linux kernel version 5.8, where NVIDIA
>>>>>> Mellanox driver implemented linking of VFs' representors to PCI
>>>>>> device in switchdev mode. This mean that device's sysfs net
>>>>>> directory will hold
>>> multiple net devices. Ex.:
>>>>>> $ ls '/sys/bus/pci/devices/:82:00.0/net'
>>>>>> ens1f0  eth0  eth1
>>>>>>
>>>>>> Most switch devices support phys_port_name instead of
>>>>>> phys_port_id, so
>>>>>> virPCIGetNetName() will try to get PF name by it's index - 0. The
>>>>>> problem here is that the PF nedev entry may not be the first.
>>>>>>
>>>>>> To fix that, for switch devices, we introduce a new logic to
>>>>>> select the PF uplink netdev according to the content of
>phys_port_name.
>>>>>> Extend
>>>>>> virPCIGetNetName() with physPortNameRegex variable to get proper
>>>>>> device by it's phys_port_name scheme, for ex., "p[0-9]+$" to get
>>>>>> PF, "pf[0-9]+vf[0-9]+$" to get VF or "p1$" to get exact net device.
>>>>>> So now
>>>>>> virPCIGetNetName() logic work in following sequence:
>>>>>>   - filter by phys_port_id, if it's provided,
>>>>>>   or
>>>>>>   - filter by phys_port_name, if it's regex provided,
>>>>>>   or
>>>>>>   - get net device by it's index (position) in sysfs net directory.
>>>>>> Also, make getting content of iface sysfs files more generic.
>>>>>>
>>>>>> Signed-off-by: Dmytro Linkin 
>>>>>> Reviewed-by: Adrian Chiris 
>>>>>
>>>>> [...]
>>>>>
>>>>>
>>>>>> +/* Represents format of PF's phys_port_name in switchdev mode:
>>>>>> + * 'p%u' or 'p%us%u'. New line checked since value is readed from
>>>>>> +sysfs
>>> file.
>>>>>> + */
>>>>>> +# define VIR_PF_PHYS_PORT_NAME_REGEX ((char
>>>>>> +*)"(p[0-9]+$)|(p[0-9]+s[0-9]+$)")
>>>>>> +
>>>>>
>>>>> I've come back to look at this patch several times since it was
>>>>> posted (sorry for the extreme delay in responding), but just can't
>>>>> figure out what it's doing with this regex and why the regex is
>>>>> necessary. Not having access to the hardware that it works with is
>>>>> a bit of a problem, but perhaps I could get a better idea if you
>>>>> gave a full example of sysfs contents? My concern wi

RE: [PATCH] util: Add phys_port_name support on virPCIGetNetName

2020-12-10 Thread Adrian Chiris
>-Original Message-
>From: Dmytro Linkin 
>Sent: Tuesday, October 27, 2020 10:58 AM
>To: libvir-list@redhat.com
>Cc: Laine Stump ; Adrian Chiris ;
>Moshe Levi 
>Subject: Re: [PATCH] util: Add phys_port_name support on
>virPCIGetNetName
>
>On Mon, Sep 28, 2020 at 12:56:12PM +0300, Dmytro Linkin wrote:
>> On Tue, Sep 22, 2020 at 08:31:15AM -0400, Laine Stump wrote:
>> > On 8/28/20 6:53 AM, Dmytro Linkin wrote:
>> > >Current virPCIGetNetName() logic is to get net device name by
>> > >checking it's phys_port_id, if caller provide it, or by it's index
>> > >(eg, by it's position at sysfs net directory). This approach worked
>> > >fine up until linux kernel version 5.8, where NVIDIA Mellanox
>> > >driver implemented linking of VFs' representors to PCI device in
>> > >switchdev mode. This mean that device's sysfs net directory will hold
>multiple net devices. Ex.:
>> > >
>> > >$ ls '/sys/bus/pci/devices/:82:00.0/net'
>> > >ens1f0  eth0  eth1
>> > >
>> > >Most switch devices support phys_port_name instead of phys_port_id,
>> > >so
>> > >virPCIGetNetName() will try to get PF name by it's index - 0. The
>> > >problem here is that the PF nedev entry may not be the first.
>> > >
>> > >To fix that, for switch devices, we introduce a new logic to select
>> > >the PF uplink netdev according to the content of phys_port_name.
>> > >Extend
>> > >virPCIGetNetName() with physPortNameRegex variable to get proper
>> > >device by it's phys_port_name scheme, for ex., "p[0-9]+$" to get
>> > >PF, "pf[0-9]+vf[0-9]+$" to get VF or "p1$" to get exact net device.
>> > >So now
>> > >virPCIGetNetName() logic work in following sequence:
>> > >  - filter by phys_port_id, if it's provided,
>> > >  or
>> > >  - filter by phys_port_name, if it's regex provided,
>> > >  or
>> > >  - get net device by it's index (position) in sysfs net directory.
>> > >Also, make getting content of iface sysfs files more generic.
>> > >
>> > >Signed-off-by: Dmytro Linkin 
>> > >Reviewed-by: Adrian Chiris 
>> >
>> >
>> > [...]
>> >
>> >
>> > >
>> > >+/* Represents format of PF's phys_port_name in switchdev mode:
>> > >+ * 'p%u' or 'p%us%u'. New line checked since value is readed from sysfs
>file.
>> > >+ */
>> > >+# define VIR_PF_PHYS_PORT_NAME_REGEX ((char
>> > >+*)"(p[0-9]+$)|(p[0-9]+s[0-9]+$)")
>> > >+
>> >
>> >
>> > I've come back to look at this patch several times since it was
>> > posted (sorry for the extreme delay in responding), but just can't
>> > figure out what it's doing with this regex and why the regex is
>> > necessary. Not having access to the hardware that it works with is a
>> > bit of a problem, but perhaps I could get a better idea if you gave
>> > a full example of sysfs contents? My concern with using a regex is
>> > that it might work just fine when using one method for net device
>> > naming, but break if that was changed. Also, it seems
>> > counterintuitive that it would be necessary to look for a device
>> > with a name matching a specific pattern; why isn't there simply a
>> > single symbolic link somewhere in the sysfs tree for the net device
>> > that just directly points at its physical port? That would be so
>> > much simpler and more reliable (or at least it would give the
>> > *perception* of being more reliable).
>> >
>>
>> I'll emphasize that regex is being used for phys_port_name, NOT net
>> device name (they are not the same). Anyway, I'll give an example.
>>
>> Lets look how virPCIGetNetName() currently work.
>> At first it try to read phys_port_id of every net device in net
>> folder, like:
>> $ cd '/sys/bus/pci/devices/:80:02.0/:82:00.0/'
>> $ cat net/ens1f0/phys_port_id
>>
>> Imagine we have single pci dual port nic (I hope named it right), eg.
>> net folder holds net devices for both ports, so read operation will be
>> successfull and function will return name of first or second port.
>> For single port or dual pci nics (or for drivers which didn't
>> implemented phys_port_id) read fails and function fallback to picking
>> up device by it's index, which not really fine (I'll describe it
>> later) but work 'cause net folder usualy contains only one net d