Re: [systemd-devel] Resolving systemd naming problems on multi-port PCI cards

2016-04-07 Thread Jordan Hargrave
On Thu, Apr 7, 2016 at 11:48 AM, Kay Sievers wrote:
> On Thu, Apr 7, 2016 at 6:08 PM, Jordan Hargrave wrote:
>> The current systemd naming scheme for Network cards has a problem
>> correctly naming multi-port NIC devices in a PCI slot.
>>
>> Systemd currently generates names of the form:
>>
>> enpAsBfCdD
>>
>> pA = PCI bus number
>> sB = PCI device number (confusingly called 'SLOT')
>
> Geographical addressing sometimes uses "slot" and sometimes "device". The
> kernel uses "slot":
> https://github.com/torvalds/linux/blob/master/arch/x86/pci/early.c
>
>> fC = PCI function number
>> [dD = NIC device port (sysfs dev_port)]
>>
>> eg. enp5s0f0 for a NIC at 05:00.0, dev_port = 0
>>
>> These names already aren't necessarily persistent if PCI bus topology
>> changes (Bus number changes due to adding cards across reboot, etc).
>
> Sure, geographical addressing is not expected to cover hardware
> reconfiguration or firmwares which just do "random" renumbering at
> reboot time.
>
>> --or--
>> ensBfCdD
>>
>> sB = _SUN slot
>> fC = PCI function number
>> [dD = NIC device port (sysfs dev_port)]
>>
>> eg. ens2f0d1 for a single-port NIC at 0?:00.0 in PCI slot 2, dev_port = 1
>>
>> The problem is the 2nd naming scheme cannot handle multi-port NICs.
>> Multi-port NICs often have one or more bridges before the PCI slot
>> number itself.
>>
>> eg. for my quad-port Intel NIC in PCI slot 2 the devices are actually:
>> 44:00.0
>> 44:00.1
>> 45:00.0
>> 45:00.1
>>
>> Using the 2nd naming scheme, the names generated are:
>> ens2f0
>> ens2f1
>> ens2f0
>> ens2f1
>>
>> Oops. Problem. There is a name collision. So depending on who gets
>> initialized first I'll see either:
>>
>> ens2f0
>> ens2f1
>> enp69s0f0
>> enp69s0f1
>>
>> or
>> enp68s0f0
>> enp68s0f1
>> ens2f0
>> ens2f1
>
> How does /sys/bus/pci/slots/ look in that case?
>

There are three entries:
/sys/bus/pci/slots/PCI1 : address = :41:00.0
/sys/bus/pci/slots/PCI2 : address = :42:00.0
/sys/bus/pci/slots/PCI3 : address = :04:00.0

Normally systemd won't discover "PCI2" for my multi-port card, as it only
looks for a slot whose address under /sys/bus/pci/slots/ matches the
device itself. So it checks :44:00.0, :44:00.1, etc., and none of those
match. For a single-port NIC in a PCI slot, it would match.

Here's the device tree of the devices that all live under :42:00.0:
/sys/devices/pci:40/:40:03.0/:42:00.0/:43:02.0,PCI2
/sys/devices/pci:40/:40:03.0/:42:00.0/:43:04.0,PCI2
/sys/devices/pci:40/:40:03.0/:42:00.0/:43:02.0/:44:00.0,PCI2  NIC Port 1
/sys/devices/pci:40/:40:03.0/:42:00.0/:43:02.0/:44:00.1,PCI2  NIC Port 2
/sys/devices/pci:40/:40:03.0/:42:00.0/:43:04.0/:45:00.0,PCI2  NIC Port 3
/sys/devices/pci:40/:40:03.0/:42:00.0/:43:04.0/:45:00.1,PCI2  NIC Port 4

I changed systemd to also search the parent devices for a match, but
that causes the naming conflict as now 4 devices match, with same
device and function numbers.
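
A minimal sketch of that parent search, written in Python rather than systemd's C. It assumes the slot addresses and sysfs paths exactly as they are printed above; `find_slot` and its `search_parents` flag are hypothetical names for illustration, not systemd functions.

```python
# Slot addresses exactly as printed in the /sys/bus/pci/slots/ listing above.
SLOT_ADDRESSES = {"PCI1": ":41:00.0", "PCI2": ":42:00.0", "PCI3": ":04:00.0"}

# sysfs paths of the four NIC ports, copied from the device tree above.
NIC_PORTS = [
    "/sys/devices/pci:40/:40:03.0/:42:00.0/:43:02.0/:44:00.0",
    "/sys/devices/pci:40/:40:03.0/:42:00.0/:43:02.0/:44:00.1",
    "/sys/devices/pci:40/:40:03.0/:42:00.0/:43:04.0/:45:00.0",
    "/sys/devices/pci:40/:40:03.0/:42:00.0/:43:04.0/:45:00.1",
]


def find_slot(sysfs_path, search_parents):
    """Return the slot label whose address matches the device.

    With search_parents=False only the device itself (the last path
    component) is compared. With search_parents=True every ancestor in
    the path is compared too, starting from the device and walking up.
    """
    components = sysfs_path.strip("/").split("/")
    candidates = reversed(components) if search_parents else components[-1:]
    for component in candidates:
        for label, address in SLOT_ADDRESSES.items():
            if component == address:
                return label
    return None
```

Without the parent search none of the four ports finds a slot; with it, all four resolve to PCI2, so the slot number alone no longer tells them apart.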

> When is the PCI hotplug driver loaded? Before or after the network card
> driver?
>

Slot files are created at PCI device enumeration, so before the network
driver loads.

>> There is a way to fix this by combining the two naming schemes, with a
>> bit of a hack.
>>
>> enpAsBfCdD
>>
>> pA = PCI bus # (no change)
>> sB = _SUN slot # (no change)
>> fC = This is what changes. Instead of C = function number (0..7) it is
>> the combined Device:Function number (device * 8 + function, 0..255)
>> dD = Device port (no change)
>>
>> On my system this generates new names:
>> enp4s0     at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f0
>> enp4s0d1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f0d1
>> enp4s0f1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f1
>> enp4s0f1d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f1d1
>> enp4s0f2   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f2
>> enp4s0f2d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f2d1
>> enp4s0f3   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f3
>> enp4s0f3d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f3d1
>> enp4s0f4   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f4
>> enp4s0f4d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f4d1
>> enp4s0f5   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f5
>> enp4s0f5d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f5d1
>> enp4s0f6   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f6
>> enp4s0f6d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f6d1
>> enp4s0f7   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f7
>> enp4s0f7d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f7d1
>> enp4s1     at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f8    (Device 1:0 => Function 8)
>> enp4s1d1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f8d1  (Device 1:0 => Function 8)
>>
>> enp68s0f0 at 

Re: [systemd-devel] Resolving systemd naming problems on multi-port PCI cards

2016-04-07 Thread Kay Sievers
On Thu, Apr 7, 2016 at 6:08 PM, Jordan Hargrave wrote:
> The current systemd naming scheme for Network cards has a problem
> correctly naming multi-port NIC devices in a PCI slot.
>
> Systemd currently generates names of the form:
>
> enpAsBfCdD
>
> pA = PCI bus number
> sB = PCI device number (confusingly called 'SLOT')

Geographical addressing sometimes uses "slot" and sometimes "device". The
kernel uses "slot":
https://github.com/torvalds/linux/blob/master/arch/x86/pci/early.c

> fC = PCI function number
> [dD = NIC device port (sysfs dev_port)]
>
> eg. enp5s0f0 for a NIC at 05:00.0, dev_port = 0
>
> These names already aren't necessarily persistent if PCI bus topology
> changes (Bus number changes due to adding cards across reboot, etc).

Sure, geographical addressing is not expected to cover hardware
reconfiguration or firmwares which just do "random" renumbering at
reboot time.

> --or--
> ensBfCdD
>
> sB = _SUN slot
> fC = PCI function number
> [dD = NIC device port (sysfs dev_port)]
>
> eg. ens2f0d1 for a single-port NIC at 0?:00.0 in PCI slot 2, dev_port = 1
>
> The problem is the 2nd naming scheme cannot handle multi-port NICs.
> Multi-port NICs often have one or more bridges before the PCI slot
> number itself.
>
> eg. for my quad-port Intel NIC in PCI slot 2 the devices are actually:
> 44:00.0
> 44:00.1
> 45:00.0
> 45:00.1
>
> Using the 2nd naming scheme, the names generated are:
> ens2f0
> ens2f1
> ens2f0
> ens2f1
>
> Oops. Problem. There is a name collision. So depending on who gets
> initialized first I'll see either:
>
> ens2f0
> ens2f1
> enp69s0f0
> enp69s0f1
>
> or
> enp68s0f0
> enp68s0f1
> ens2f0
> ens2f1

How does /sys/bus/pci/slots/ look in that case?

When is the PCI hotplug driver loaded? Before or after the network card driver?

> There is a way to fix this by combining the two naming schemes, with a
> bit of a hack.
>
> enpAsBfCdD
>
> pA = PCI bus # (no change)
> sB = _SUN slot # (no change)
> fC = This is what changes. Instead of C = function number (0..7) it is
> the combined Device:Function number (device * 8 + function, 0..255)
> dD = Device port (no change)
>
> On my system this generates new names:
> enp4s0     at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f0
> enp4s0d1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f0d1
> enp4s0f1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f1
> enp4s0f1d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f1d1
> enp4s0f2   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f2
> enp4s0f2d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f2d1
> enp4s0f3   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f3
> enp4s0f3d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f3d1
> enp4s0f4   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f4
> enp4s0f4d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f4d1
> enp4s0f5   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f5
> enp4s0f5d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f5d1
> enp4s0f6   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f6
> enp4s0f6d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f6d1
> enp4s0f7   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f7
> enp4s0f7d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f7d1
> enp4s1     at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f8    (Device 1:0 => Function 8)
> enp4s1d1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f8d1  (Device 1:0 => Function 8)
>
> enp68s0f0 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp68s2f0
> enp68s0f1 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp68s2f1
> enp69s0f0 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp69s2f0
> enp69s0f1 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp69s2f1
>
> This way it is always able to determine the physical PCI slot the device
> is in.
>
> This scheme still does have a limitation... the names may not be
> persistent if PCI topology changes due to the PCI bus number still
> being part of the name.

I don't think the two should be mixed. The point of the hotplug slots
was to be independent of the geography.

If what you describe can't be fixed, the slot numbering scheme should
just be turned off by default.

Kay
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] Resolving systemd naming problems on multi-port PCI cards

2016-04-07 Thread Jordan Hargrave
The current systemd naming scheme for Network cards has a problem
correctly naming multi-port NIC devices in a PCI slot.

Systemd currently generates names of the form:

enpAsBfCdD

pA = PCI bus number
sB = PCI device number (confusingly called 'SLOT')
fC = PCI function number
[dD = NIC device port (sysfs dev_port)]

eg. enp5s0f0 for a NIC at 05:00.0, dev_port = 0
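
As a rough illustration of the scheme just described (not systemd's actual code), the name can be derived from a bus:device.function address as below. The function names are hypothetical, and the rule that dD is appended only for a non-zero dev_port is an assumption taken from the bracketed [dD] and the enp5s0f0 example.

```python
def parse_pci_address(address):
    """Split a 'bus:device.function' string (hex fields) into three ints."""
    bus, rest = address.split(":")
    device, function = rest.split(".")
    return int(bus, 16), int(device, 16), int(function, 16)


def path_based_name(address, dev_port=0):
    """Build an enpAsBfCdD style name; bus and device are printed in decimal."""
    bus, device, function = parse_pci_address(address)
    name = f"enp{bus}s{device}f{function}"
    # Assumption: the optional dD suffix is only added for a non-zero dev_port.
    if dev_port > 0:
        name += f"d{dev_port}"
    return name


print(path_based_name("05:00.0"))  # enp5s0f0, the example given above
```

Note that the bus number appears in decimal in the name, so a device on bus 44 (hex) is named enp68...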

These names already aren't necessarily persistent if PCI bus topology
changes (Bus number changes due to adding cards across reboot, etc).

--or--
ensBfCdD

sB = _SUN slot
fC = PCI function number
[dD = NIC device port (sysfs dev_port)]

eg. ens2f0d1 for a single-port NIC at 0?:00.0 in PCI slot 2, dev_port = 1

The problem is the 2nd naming scheme cannot handle multi-port NICs.
Multi-port NICs often have one or more bridges before the PCI slot
number itself.

eg. for my quad-port Intel NIC in PCI slot 2 the devices are actually:
44:00.0
44:00.1
45:00.0
45:00.1

Using the 2nd naming scheme, the names generated are:
ens2f0
ens2f1
ens2f0
ens2f1

Oops. Problem. There is a name collision.  So depending on who gets
initialized first I'll see either:

ens2f0
ens2f1
enp69s0f0
enp69s0f1

or
enp68s0f0
enp68s0f1
ens2f0
ens2f1
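
The collision can be reproduced with a short sketch, assuming the slot-based name is built only from the _SUN slot, the PCI function and an optional dev_port as described above. `slot_based_name` is a hypothetical helper, not a systemd function.

```python
from collections import Counter


def slot_based_name(slot, function, dev_port=0):
    """Build an ensBfCdD style name from slot, function and optional dev_port."""
    name = f"ens{slot}f{function}"
    # Assumption: the optional dD suffix is only added for a non-zero dev_port.
    if dev_port > 0:
        name += f"d{dev_port}"
    return name


# The four ports of the quad-port NIC in PCI slot 2, as listed above.
ports = ["44:00.0", "44:00.1", "45:00.0", "45:00.1"]

# Only the function number (after the '.') contributes to the name;
# the bus numbers 44 and 45 that distinguish the ports are lost.
names = [slot_based_name(2, int(addr.rsplit(".", 1)[1], 16)) for addr in ports]
duplicates = sorted(name for name, count in Counter(names).items() if count > 1)

print(names)
print(duplicates)
```

Both ens2f0 and ens2f1 come out twice, which is why two of the four ports end up with the bus-based names instead.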

There is a way to fix this by combining the two naming schemes, with a
bit of a hack.

enpAsBfCdD

pA = PCI bus # (no change)
sB = _SUN slot # (no change)
fC = This is what changes. Instead of C = function number (0..7) it is
the combined Device:Function number (device * 8 + function, 0..255)
dD = Device port (no change)
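
A sketch of how such a combined name could be computed. The device * 8 + function encoding is inferred from the "Device 1:0 => Function 8" note in the listing; with a 5-bit device and a 3-bit function number the combined value can reach 255. `combined_name` is a hypothetical helper, not part of the actual patch.

```python
def combined_name(bus, slot, device, function, dev_port=0):
    """Build a name from bus, _SUN slot and a combined device/function number."""
    # Assumption: Device:Function is folded into one number as device * 8 + function.
    devfn = device * 8 + function
    name = f"enp{bus}s{slot}f{devfn}"
    # Assumption: the optional dD suffix is only added for a non-zero dev_port.
    if dev_port > 0:
        name += f"d{dev_port}"
    return name


# The quad-port NIC in slot 2: buses 0x44 and 0x45, device 0, functions 0 and 1.
quad_port = [
    combined_name(bus, 2, 0, function)
    for bus in (0x44, 0x45)
    for function in (0, 1)
]
print(quad_port)
```

All four names are distinct because the bus number is still part of the name, which is also why they can change when the PCI topology changes.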

On my system this generates new names:
enp4s0     at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f0
enp4s0d1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f0d1
enp4s0f1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f1
enp4s0f1d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f1d1
enp4s0f2   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f2
enp4s0f2d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f2d1
enp4s0f3   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f3
enp4s0f3d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f3d1
enp4s0f4   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f4
enp4s0f4d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f4d1
enp4s0f5   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f5
enp4s0f5d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f5d1
enp4s0f6   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f6
enp4s0f6d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f6d1
enp4s0f7   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f7
enp4s0f7d1 at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f7d1
enp4s1     at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f8    (Device 1:0 => Function 8)
enp4s1d1   at /sys/devices/pci:00/:00:03.0 SLOT 3 => enp3s4f8d1  (Device 1:0 => Function 8)

enp68s0f0 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp68s2f0
enp68s0f1 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp68s2f1
enp69s0f0 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp69s2f0
enp69s0f1 at /sys/devices/pci:40/:40:03.0 SLOT 2   => enp69s2f1

This way it is always able to determine the physical PCI slot the device is in.

This scheme still does have a limitation... the names may not be
persistent if PCI topology changes due to the PCI bus number still
being part of the name.