Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-16 Thread Haiyang Zhang



> -Original Message-
> From: Lennart Poettering 
> Sent: Monday, January 13, 2020 4:18 AM
> To: Haiyang Zhang 
> Cc: Stephen Hemminger ; systemd-
> de...@lists.freedesktop.org; Michael Kelley 
> Subject: Re: [systemd-devel] Better network naming on Hyper-V/Azure?
> 
> On Fr, 10.01.20 16:17, Haiyang Zhang (haiya...@microsoft.com) wrote:
> 
> > > My guess is that this is a lot like SR-IOV slot number that we can
> > > already use to name interfaces, right? If so, supporting things the
> > > same way sounds totally OK.
> >
> > Thanks for your explanation. We do want to use the ethN format, and want
> > the results to be the same between Async and sync probing.
> 
> Then deal with it in the kernel. Allocating from the same ethN
> namespace is always going to be racy if both kernel and userspace do
> it.
> 
> That's why the names userspace generally picks for stable Ethernet
> interfaces start with "en" followed by some stable suffix of a kind,
> under the assumption the kernel will not allocate from that namespace.
> 
> > @Stephen Hemminger Since systemd needs to avoid stepping into the kernel
> > ethN formatting, should we do the synthetic NIC naming inside kernel (netvsc
> > driver)?
> 
> If you have any other driver register network interfaces on your
> kernel then your whole enumeration will go wrong though. If you
> tightly control which drivers exist in your environment you might get
> away with taking ownership of the ethN namespace entirely from your
> own driver and manage it fully.

Thanks for your suggestions!
So my implementation will keep the naming in the kernel driver (netvsc).

1) The netvsc driver's probe_type will be set to PROBE_DEFAULT_STRATEGY,
so users can either continue with the current sync probing by default,
or use a module/kernel cmdline option to enable async probing if other
devices, such as DDA or SR-IOV/VF NICs, are configured to be named
in a different namespace (enP*, etc.) by systemd.

2) If the async-probing option is in use, the netvsc driver will pick the
name based on dev_num, which follows the VMBus offer sequence. Each NIC
gets the smallest available ethN name, which is the same result as with
the current sync probing.

3) My proposal is that async probing yields the same names as sync
probing. In case of hot add/remove, names may be reused. The names
may change once after a hot add/remove followed by a reboot, but they
will be stable across further reboots. This is the same behavior as
the current code (sync probing).
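
To illustrate points 2) and 3), here is a minimal Python sketch of the allocation idea, not the actual netvsc patch. The function name assign_eth_names and the "taken" parameter are hypothetical; the only assumption carried over from the proposal is that each NIC has a dev_num reflecting the VMBus offer sequence and receives the smallest ethN name that is still free.

```python
def assign_eth_names(dev_nums, taken=()):
    """Give each NIC the smallest free ethN name, walking NICs in dev_num order.

    dev_nums: iterable of offer-sequence numbers (hypothetical model of dev_num).
    taken:    names already claimed by other devices (e.g. a VF or DDA NIC).
    """
    used = set(taken)
    names = {}
    index = 0
    for dev_num in sorted(dev_nums):
        # Skip over names that some other device already holds.
        while f"eth{index}" in used:
            index += 1
        name = f"eth{index}"
        used.add(name)
        names[dev_num] = name
    return names


# The order in which probes happen to finish does not matter here,
# only the dev_num ordering does.
print(assign_eth_names([2, 0, 1]))
# If another device already holds eth0, the synthetic NICs shift up.
print(assign_eth_names([0, 1], taken={"eth0"}))
```

The second call shows the racy fallback discussed earlier in the thread: when a name is already taken, the result depends on what else got there first.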

Thanks,
- Haiyang


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-13 Thread Lennart Poettering
On Fr, 10.01.20 16:17, Haiyang Zhang (haiya...@microsoft.com) wrote:

> > My guess is that this is a lot like SR-IOV slot number that we can
> > already use to name interfaces, right? If so, supporting things the
> > same way sounds totally OK.
>
> Thanks for your explanation. We do want to use the ethN format, and want the
> results to be the same between Async and sync probing

Then deal with it in the kernel. Allocating from the same ethN
namespace is always going to be racy if both kernel and userspace do
it.

That's why the names userspace generally picks for stable Ethernet
interfaces start with "en" followed by some stable suffix of a kind,
under the assumption the kernel will not allocate from that namespace.

> @Stephen Hemminger Since systemd needs to avoid stepping into the kernel
> ethN formatting, should we do the synthetic NIC naming inside kernel (netvsc
> driver)?

If you have any other driver register network interfaces on your
kernel then your whole enumeration will go wrong though. If you
tightly control which drivers exist in your environment you might get
away with taking ownership of the ethN namespace entirely from your
own driver and manage it fully.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-10 Thread Haiyang Zhang



> -Original Message-
> From: Lennart Poettering 
> Sent: Friday, January 10, 2020 10:55 AM
> To: Haiyang Zhang 
> Cc: Stephen Hemminger ; systemd-
> de...@lists.freedesktop.org; Michael Kelley 
> Subject: Re: [systemd-devel] Better network naming on Hyper-V/Azure?
> 
> On Fr, 10.01.20 15:37, Haiyang Zhang (haiya...@microsoft.com) wrote:
> 
> > > Hyper-V offers netvsc devices (synthetic NICs) in the same sequence across
> > > reboots, so eth0 ... ethN names will associate to the same vNIC every time
> > > with Sync-probing currently.
> > >
> > > But if in the future, we enable Async-probing, the naming may not be
> > > persistent across reboots. In my patch set (not yet upstream), I added a
> > > new attribute (dev_num) in sysfs to keep track of the device channel offer
> > > sequence. So a user mode program can have the option to use this attribute
> > > to name NICs, and generate the same results for Async-probing as
> > > Sync-probing does.
> >
> > Lennart and other systemd developers:
> >
> > Could you also comment on my proposal above? It's to keep the naming
> > results of Async-probing same as that of sync-probing.
> 
> I am not sure I follow fully, but if you intend to assign an index to
> each interface that the VM supervisor sets and that we should use to
> name the interface, then that sounds great to me.
> 
> However do note that we generally avoid stepping into the naming
> namespace of the kernel. i.e. if your intention is to stabilize eth0,
> eth1, eth2 with that, we can't help you, that's generally racy since
> the kernel allocates other interfaces from that namespace too.
> 
> My guess is that this is a lot like SR-IOV slot number that we can
> already use to name interfaces, right? If so, supporting things the
> same way sounds totally OK.

Thanks for your explanation. We do want to use the ethN format, and want the
results to be the same between async and sync probing.

@Stephen Hemminger Since systemd needs to avoid stepping into the kernel's
ethN namespace, should we do the synthetic NIC naming inside the kernel
(netvsc driver)?

In case of a conflict with other names, such as those of DDA or VF NICs, it
will fall back to the first available ethN name. I know it's racy, but no
worse than the current situation: even with sync probing, the name may still
race with DDA or VF NICs. And this is already solvable by systemd, which uses
PCI slot naming and puts them into a different naming format.

Thanks,
- Haiyang



Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-10 Thread Lennart Poettering
On Fr, 10.01.20 15:37, Haiyang Zhang (haiya...@microsoft.com) wrote:

> > Hyper-V offers netvsc devices (synthetic NICs) in the same sequence across
> > reboots, so eth0 ... ethN names will associate to the same vNIC every time
> > with Sync-probing currently.
> >
> > But if in the future, we enable Async-probing, the naming may not be
> > persistent across reboots. In my patch set (not yet upstream), I added a
> > new attribute (dev_num) in sysfs to keep track of the device channel offer
> > sequence. So a user mode program can have the option to use this attribute
> > to name NICs, and generate the same results for Async-probing as
> > Sync-probing does.
>
> Lennart and other systemd developers:
>
> Could you also comment on my proposal above? It's to keep the naming results
> of Async-probing same as that of sync-probing.

I am not sure I follow fully, but if you intend to assign an index to
each interface that the VM supervisor sets and that we should use to
name the interface, then that sounds great to me.

However do note that we generally avoid stepping into the naming
namespace of the kernel. i.e. if your intention is to stabilize eth0,
eth1, eth2 with that, we can't help you, that's generally racy since
the kernel allocates other interfaces from that namespace too.

My guess is that this is a lot like SR-IOV slot number that we can
already use to name interfaces, right? If so, supporting things the
same way sounds totally OK.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-10 Thread Haiyang Zhang



> -Original Message-
> From: Haiyang Zhang
> Sent: Tuesday, January 7, 2020 11:01 AM
> To: Lennart Poettering ; Stephen Hemminger
> 
> Cc: systemd-devel@lists.freedesktop.org; Michael Kelley
> 
> Subject: RE: [systemd-devel] Better network naming on Hyper-V/Azure?

> Hyper-V offers netvsc devices (synthetic NICs) in the same sequence across
> reboots, so eth0 ... ethN names will associate to the same vNIC every time
> with Sync-probing currently.
> 
> But if in the future, we enable Async-probing, the naming may not be
> persistent across reboots. In my patch set (not yet upstream), I added a
> new attribute (dev_num) in sysfs to keep track of the device channel offer
> sequence. So a user mode program can have the option to use this attribute
> to name NICs, and generate the same results for Async-probing as
> Sync-probing does.

Lennart and other systemd developers:

Could you also comment on my proposal above? The goal is to keep the naming
results of async probing the same as those of sync probing.

Thanks,
- Haiyang



Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-08 Thread Stephen Hemminger
On Tue, 7 Jan 2020 19:11:19 +0100
Lennart Poettering  wrote:

> On Di, 07.01.20 09:53, Stephen Hemminger (step...@networkplumber.org) wrote:
> 
> > On Tue, 7 Jan 2020 15:01:25 +0100
> > Lennart Poettering  wrote:
> >  
> > > > On Mo, 06.01.20 15:36, Stephen Hemminger (step...@networkplumber.org) wrote:
> > > >
> > > > > About a year ago there was some discussion on having persistent network
> > > > > names on Hyper-V/Azure. Haiyang did some patches to add an attribute which
> > > > > could be used by udev to do this. But there is some reluctance because
> > > > > of how the channel id works.
> > > > >
> > > > > The motivation to provide network naming is to allow vmbus to change to
> > > > > parallel probing.
> > > > > Right now probing is serialized so naming is always in the same order.
> > > >
> > > > My question is what exactly does systemd/udev need to provide persistent
> > > > naming. The obvious ones are:
> > > >   1. Must be unique (although PCI slot isn't)  
> > >
> > > It's not unique per bus? huh?  
> >
> > If you look in sysfs code, there is already code to handle
> > the case where two devices (usually sub-function) share the same PCI
> > slot. It is handled by adding a suffix in the kernel.  
> 
> For multifunction devices we append an "fXYZ" suffix where XYZ is the
> function index. That should be sufficient, no?
> 
> Lennart

Yes, that works. I was thinking of the case where /sys/bus/pci/slots
can have entries 2 and 2-1 because of duplicates.
See make_slot_name() in the kernel's drivers/pci/slot.c.
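
As a rough illustration of how entries such as 2 and 2-1 can arise, here is a small Python sketch of the "append a -N suffix until the name is free" idea. It is only loosely modeled on what make_slot_name() appears to do and is not the kernel code; the function name make_unique_slot_name and the set standing in for the slots directory are hypothetical.

```python
def make_unique_slot_name(name, existing):
    """Return name, or name-1, name-2, ... if earlier candidates are taken.

    existing: set of slot names already registered (stand-in for the
    entries under /sys/bus/pci/slots).
    """
    candidate = name
    dup = 1
    while candidate in existing:
        candidate = f"{name}-{dup}"
        dup += 1
    return candidate


slots = set()
# Two devices report slot "2" before and after an unrelated slot "3".
for requested in ["2", "2", "3", "2"]:
    slots.add(make_unique_slot_name(requested, slots))
print(sorted(slots))
```

Which device ends up with the bare name and which gets the suffix depends on registration order, so the suffixed names are unique but not necessarily stable.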


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-07 Thread Lennart Poettering
On Di, 07.01.20 09:53, Stephen Hemminger (step...@networkplumber.org) wrote:

> On Tue, 7 Jan 2020 15:01:25 +0100
> Lennart Poettering  wrote:
>
> > On Mo, 06.01.20 15:36, Stephen Hemminger (step...@networkplumber.org) wrote:
> >
> > > About a year ago there was some discussion on having persistent network
> > > names on Hyper-V/Azure. Haiyang did some patches to add an attribute which
> > > could be used by udev to do this. But there is some reluctance because
> > > of how the channel id works.
> > >
> > > The motivation to provide network naming is to allow vmbus to change to
> > > parallel probing.
> > > Right now probing is serialized so naming is always in the same order.
> > >
> > > My question is what exactly does systemd/udev need to provide persistent
> > > naming. The obvious ones are:
> > >   1. Must be unique (although PCI slot isn't)
> >
> > It's not unique per bus? huh?
>
> If you look in sysfs code, there is already code to handle
> the case where two devices (usually sub-function) share the same PCI
> slot. It is handled by adding a suffix in the kernel.

For multifunction devices we append an "fXYZ" suffix where XYZ is the
function index. That should be sufficient, no?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-07 Thread Stephen Hemminger
On Tue, 7 Jan 2020 15:01:25 +0100
Lennart Poettering  wrote:

> On Mo, 06.01.20 15:36, Stephen Hemminger (step...@networkplumber.org) wrote:
> 
> > About a year ago there was some discussion on having persistent network
> > names on Hyper-V/Azure. Haiyang did some patches to add an attribute which
> > could be used by udev to do this. But there is some reluctance because
> > of how the channel id works.
> >
> > The motivation to provide network naming is to allow vmbus to change to
> > parallel probing.
> > Right now probing is serialized so naming is always in the same order.
> >
> > My question is what exactly does systemd/udev need to provide persistent
> > naming. The obvious ones are:
> >   1. Must be unique (although PCI slot isn't)  
> 
> It's not unique per bus? huh?

If you look in sysfs code, there is already code to handle
the case where two devices (usually sub-function) share the same PCI
slot. It is handled by adding a suffix in the kernel.


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-07 Thread Lennart Poettering
On Di, 07.01.20 16:01, Haiyang Zhang (haiya...@microsoft.com) wrote:

> > I have no idea what that means, what is Accelerated Networking?
>
> On Azure, "Accelerated Networking" means SRIOV / VF NICs.

There's nowadays already support for SR-IOV naming in place:

https://www.freedesktop.org/software/systemd/man/systemd.net-naming-scheme.html#ID_NET_NAME_SLOT=prefix%5BPdomain%5Dsslot%5Bffunction%5D%5Bnport_name%7Cddev_port%5D
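
For readers who do not want to decode the URL: the template it points at is prefix[Pdomain]sslot[ffunction][nport_name|ddev_port]. The Python sketch below shows how such a name is put together. It is an illustration under stated assumptions, not udev's implementation: the function slot_name is hypothetical, the port_name variant is left out, and the rules that the domain and dev_port parts are omitted when zero and that the function part appears only for multifunction devices are my reading of the naming scheme.

```python
def slot_name(slot, domain=0, function=None, dev_port=0, prefix="en"):
    """Compose a slot-based interface name from its parts.

    function: pass the PCI function index for multifunction devices,
    or None to omit the "f" part (assumption, see lead-in).
    """
    parts = [prefix]
    if domain:
        parts.append(f"P{domain}")      # PCI domain, only when non-zero
    parts.append(f"s{slot}")            # hotplug slot number
    if function is not None:
        parts.append(f"f{function}")    # function index
    if dev_port:
        parts.append(f"d{dev_port}")    # device port, only when non-zero
    return "".join(parts)


print(slot_name(3))
print(slot_name(3, function=1))
print(slot_name(1, domain=2))
```

None of these names can collide with eth0, eth1, ... which is the point of keeping userspace names out of the kernel's namespace.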

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-07 Thread Haiyang Zhang
(resending after subscribed to systemd-devel)

> -Original Message-
> From: Lennart Poettering 
> Sent: Tuesday, January 7, 2020 9:01 AM
> To: Stephen Hemminger 
> Cc: systemd-devel@lists.freedesktop.org; Haiyang Zhang
> 
> Subject: Re: [systemd-devel] Better network naming on Hyper-V/Azure?
> 
> On Mo, 06.01.20 15:36, Stephen Hemminger (step...@networkplumber.org)
> wrote:
> 
> > About a year ago there was some discussion on having persistent network
> > names on Hyper-V/Azure. Haiyang did some patches to add an attribute which
> > could be used by udev to do this. But there is some reluctance because
> > of how the channel id works.
> >
> > The motivation to provide network naming is to allow vmbus to change to
> > parallel probing.
> > Right now probing is serialized so naming is always in the same order.
> >
> > My question is what exactly does systemd/udev need to provide persistent
> > naming. The obvious ones are:
> >   1. Must be unique (although PCI slot isn't)
> 
> It's not unique per bus? huh?
> 
> >   2. Must be persistent across reboot.
> >   3. Must be stable if device is removed.
> 
> Yes, these three are the idea.
> 
> > There are more questions.
> >   1. Is there a particular ordering and non-reuse requirement.
> >  Obviously, names have to be 15 characters or less but what
> >   else.
> 
> No ordering or non-reuse requirements are made. I mean, the device
> path names are in particular defined so that they are stable even if you
> replace your PCI network card by a different one, hence in that case
> absolutely are reused by a different device, and that's intentional.
> 
> Yeah must fit in IFNAMSIZ. And probably shouldn't include "/", control
> chars, whitespace and some other weird chars that apps don't like. But
> that's the same for all network interfaces, whether managed by udev
> predictable naming or not...
> 
> >   2. How to handle the device associated with Accelerated Networking?
> >  Do you want to hide or rename the VF that is associated with the
> >  virtual device?
> 
> I have no idea what that means, what is Accelerated Networking?

On Azure, "Accelerated Networking" means SRIOV / VF NICs.

> 
> > There are a couple of other quirks:
> >   1. The current cloudinit and other startup applications require eth0 as
> >  the administrative and always there interface, hard wired into the
> >  code. How to handle that?
> 
> if you have multiple devices and want a specific one to be named eth0
> then this is inherently racy since we can't sensibly rename the device
> like that in userspace because we'd always race against the kernel's
> own naming regime.
> 
> Naming an interface "eth0" only really works if you have only one
> interface or if you don't care about the names at all. If you have
> multiple then pick different names outside of the ethX namespace the
> kernel allocates from.
> 
> In the case where you only have a single interface or don't care about
> the name then drop in a .link file that matches the interface
> generically and sets NamePolicy=kernel so that the kernel name is used
> as it is.
> 
> >   2. Hyper-V has the ability for host administrator to assign a name, but
> >  it is more of a free form string so it is used as default
> >  network description.
> 
> Current systemd git has support for assigning "alternative" ifnames to
> devices, using that new kernel feature. On kernels that support that
> we'll initialize the alternative ifnames to all names we could
> possibly come up with (i.e. so that an interface can always be referred
> to by its by-path, by-slot, by-mac name equally). Since the
> alternative ifnames are not IFNAMSIZ long (but 128 chars long) maybe
> they are suitable to use for these hyperv "free form string" if that
> makes sense given the charset restrictions.
> 
> >   3. Azure has names as part of the CLI for manipulating VM's but these
> >  are not currently exposed to guest. If this could happen would it help
> >  or hurt.
> 
> I mean, we are happy to make use of any names that make sense. Not
> sure why hyperv needs three different symbolic names for each
> interface, but if it is how it is, then we can totally expose them all
> ;-).

Hyper-V offers netvsc devices (synthetic NICs) in the same sequence across
reboots, so with the current sync probing the names eth0 ... ethN are
associated with the same vNIC every time.

But if we enable async probing in the future, the naming may not be persistent
across reboots. In my patch set (not yet upstream), I added a new attribute
(dev_num) in sysfs to keep track of the device channel offer sequence. A
user-mode program then has the option to use this attribute to name NICs, and
to generate the same results for async probing as sync probing does.

Thanks,
- Haiyang



Re: [systemd-devel] Better network naming on Hyper-V/Azure?

2020-01-07 Thread Lennart Poettering
On Mo, 06.01.20 15:36, Stephen Hemminger (step...@networkplumber.org) wrote:

> About a year ago there was some discussion on having persistent network names
> on Hyper-V/Azure. Haiyang did some patches to add an attribute which
> could be used by udev to do this. But there is some reluctance because
> of how the channel id works.
>
> The motivation to provide network naming is to allow vmbus to change to
> parallel probing.
> Right now probing is serialized so naming is always in the same order.
>
> My question is what exactly does systemd/udev need to provide persistent
> naming. The obvious ones are:
>   1. Must be unique (although PCI slot isn't)

It's not unique per bus? huh?

>   2. Must be persistent across reboot.
>   3. Must be stable if device is removed.

Yes, these three are the idea.

> There are more questions.
>   1. Is there a particular ordering and non-reuse requirement.
>  Obviously, names have to be 15 characters or less but what
>   else.

No ordering or non-reuse requirements are made. I mean, the device
path names are in particular defined so that they are stable even if you
replace your PCI network card by a different one, hence in that case
absolutely are reused by a different device, and that's intentional.

Yeah must fit in IFNAMSIZ. And probably shouldn't include "/", control
chars, whitespace and some other weird chars that apps don't like. But
that's the same for all network interfaces, whether managed by udev
predictable naming or not...
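
To make those constraints concrete, here is a small Python sketch of a name check. It is only an approximation for illustration: the function is_reasonable_ifname is hypothetical, IFNAMSIZ is taken as 16 bytes including the trailing NUL (hence the 15-character limit mentioned above), and rejecting ":" as well as "." and ".." is an assumption about what the kernel refuses, not something stated in this thread.

```python
IFNAMSIZ = 16  # size of the kernel's name buffer, including the trailing NUL


def is_reasonable_ifname(name):
    """Return True if name looks acceptable as a network interface name."""
    if not name or name in (".", ".."):
        return False
    # The name plus its NUL terminator must fit into IFNAMSIZ bytes.
    if len(name.encode("utf-8")) >= IFNAMSIZ:
        return False
    for ch in name:
        # Reject path separators, ":" (assumption), whitespace, control chars.
        if ch in "/:" or ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
    return True


for candidate in ["eth0", "enP2s1f0", "a" * 16, "eth/0", "eth 0"]:
    print(candidate, is_reasonable_ifname(candidate))
```

A free-form host-assigned description would typically fail this check, which is why the alternative ifnames discussed further down are a better fit for it.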

>   2. How to handle the device associated with Accelerated Networking?
>  Do you want to hide or rename the VF that is associated with the
>  virtual device?

I have no idea what that means, what is Accelerated Networking?

> There are a couple of other quirks:
>   1. The current cloudinit and other startup applications require eth0 as
>  the administrative and always there interface, hard wired into the
>  code. How to handle that?

if you have multiple devices and want a specific one to be named eth0
then this is inherently racy since we can't sensibly rename the device
like that in userspace because we'd always race against the kernel's
own naming regime.

Naming an interface "eth0" only really works if you have only one
interface or if you don't care about the names at all. If you have
multiple then pick different names outside of the ethX namespace the
kernel allocates from.

In the case where you only have a single interface or don't care about
the name then drop in a .link file that matches the interface
generically and sets NamePolicy=kernel so that the kernel name is used
as it is.
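
A minimal sketch of such a drop-in follows. The file name and the catch-all match are assumptions made for illustration; narrow the [Match] section if only some interfaces should keep their kernel names.

```ini
# /etc/systemd/network/10-keep-kernel-names.link  (file name is an assumption)
[Match]
# Catch-all match, the same pattern the stock 99-default.link uses.
OriginalName=*

[Link]
# No other naming policies are listed, so udev leaves the name alone.
NamePolicy=kernel
```

The first .link file that matches a device in lexical order is the one applied, so the file has to sort before 99-default.link for this to take effect.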

>   2. Hyper-V has the ability for host administrator to assign a name, but
>  it is more of a free form string so it is used as default
>  network description.

Current systemd git has support for assigning "alternative" ifnames to
devices, using that new kernel feature. On kernels that support that
we'll initialize the alternative ifnames to all names we could
possibly come up with (i.e. so that an interface can always be referred
to by its by-path, by-slot, by-mac name equally). Since the
alternative ifnames are not IFNAMSIZ long (but 128 chars long) maybe
they are suitable to use for these hyperv "free form string" if that
makes sense given the charset restrictions.

>   3. Azure has names as part of the CLI for manipulating VM's but these
>  are not currently exposed to guest. If this could happen would it help or
>  hurt.

I mean, we are happy to make use of any names that make sense. Not
sure why hyperv needs three different symbolic names for each
interface, but if it is how it is, then we can totally expose them all
;-).

Lennart

--
Lennart Poettering, Berlin