Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-04-19 Thread Liran Alon



> On 6 Mar 2019, at 23:42, si-wei liu  wrote:
> 
> 
> 
> On 3/6/2019 1:36 PM, Samudrala, Sridhar wrote:
>> 
>> On 3/6/2019 1:26 PM, si-wei liu wrote:
>>> 
>>> 
>>> On 3/6/2019 4:04 AM, Jiri Pirko wrote:
>>>>> --- a/net/core/failover.c
>>>>> +++ b/net/core/failover.c
>>>>> @@ -16,6 +16,11 @@
>>>>>
>>>>> static LIST_HEAD(failover_list);
>>>>> static DEFINE_SPINLOCK(failover_lock);
>>>>> +static bool slave_rename_ok = true;
>>>>> +
>>>>> +module_param(slave_rename_ok, bool, (S_IRUGO | S_IWUSR));
>>>>> +MODULE_PARM_DESC(slave_rename_ok,
>>>>> +  "If set allow renaming the slave when failover master is up");
>>>>
>>>> No module parameters please. If you need to set something do it using
>>>> rtnl_link_ops. Thanks.
>>>>
>>>
>>> I understand what you ask for, but without module parameters userspace 
>>> doesn't work. During boot (dracut) the virtio netdev gets enslaved earlier 
>>> than when userspace comes up, so failover has to determine the setting 
>>> during initialization/creation. This config is not dynamic, at least for 
>>> the life cycle of a particular failover link it shouldn't be changed. 
>>> Without module parameter, how does the userspace specify this value during 
>>> kernel initialization? 
>>> 
>> Can we enable this by default and not make it configurable via module 
>> parameter?
>> Is there any use case where someone expects rename to fail with failover 
>> slaves?
> Probably just cater for those application that assumes fixed name on UP 
> interface?
> 
> It's already the default for the configurable. I myself don't think that's a 
> big problem for failover users. So far there's not even QEMU support I think 
> everything can be changed. I don't feel strong to just fix it without 
> introducing configurable. But maybe Michael or others think it differently...
> 
> If no one objects, I don't feel strong to make it fixed behavior.
> 
> -Siwei
> 

I agree we should just remove the module parameter.

-Liran


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-06 Thread Samudrala, Sridhar


On 3/6/2019 1:26 PM, si-wei liu wrote:
>
> On 3/6/2019 4:04 AM, Jiri Pirko wrote:
>>> --- a/net/core/failover.c
>>> +++ b/net/core/failover.c
>>> @@ -16,6 +16,11 @@
>>>
>>> static LIST_HEAD(failover_list);
>>> static DEFINE_SPINLOCK(failover_lock);
>>> +static bool slave_rename_ok = true;
>>> +
>>> +module_param(slave_rename_ok, bool, (S_IRUGO | S_IWUSR));
>>> +MODULE_PARM_DESC(slave_rename_ok,
>>> +  "If set allow renaming the slave when failover master is up");
>>
>> No module parameters please. If you need to set something do it using
>> rtnl_link_ops. Thanks.
>
> I understand what you ask for, but without module parameters userspace
> doesn't work. During boot (dracut) the virtio netdev gets enslaved
> earlier than when userspace comes up, so failover has to determine the
> setting during initialization/creation. This config is not dynamic, at
> least for the life cycle of a particular failover link it shouldn't be
> changed. Without module parameter, how does the userspace specify this
> value during kernel initialization?

Can we enable this by default and not make it configurable via module
parameter?
Is there any use case where someone expects rename to fail with failover
slaves?
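
[Editor's sketch] The "enable by default" alternative discussed above can be pictured with a small userspace model. This is not the kernel code: the failover.c hunk of the patch is not visible in this archive, so the struct, the MODEL_ constants and both function names below are hypothetical stand-ins. The only thing taken from the thread is the idea that every failover slave carries the rename-allowed flag for as long as it is enslaved, with no module parameter involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's priv_flags bits (bit positions
 * copied from the patch, but this is a userspace model, not kernel code). */
#define MODEL_IFF_FAILOVER_SLAVE  (1u << 28)
#define MODEL_IFF_SLAVE_RENAME_OK (1u << 29)

struct model_netdev {
    unsigned int priv_flags;
};

/* Assumed behaviour: mark the device as a failover slave and allow
 * renaming while UP unconditionally, i.e. without consulting a tunable. */
static void model_slave_register(struct model_netdev *slave)
{
    assert(slave != NULL);
    slave->priv_flags |= MODEL_IFF_FAILOVER_SLAVE | MODEL_IFF_SLAVE_RENAME_OK;
}

/* Assumed behaviour: drop both bits again when the slave is released, so
 * the device falls back to the ordinary "no rename while UP" rule. */
static void model_slave_unregister(struct model_netdev *slave)
{
    assert(slave != NULL);
    slave->priv_flags &= ~(MODEL_IFF_FAILOVER_SLAVE | MODEL_IFF_SLAVE_RENAME_OK);
}
```

With this shape there is nothing for userspace to configure at boot, which sidesteps the dracut ordering problem raised earlier in the thread.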

Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-06 Thread Jiri Pirko
Tue, Mar 05, 2019 at 01:50:59AM CET, si-wei@oracle.com wrote:
>When a netdev appears through hot plug then gets enslaved by a failover
>master that is already up and running, the slave will be opened
>right away after getting enslaved. Today there's a race that userspace
>(udev) may fail to rename the slave if the kernel (net_failover)
>opens the slave earlier than when the userspace rename happens.
>Unlike bond or team, the primary slave of failover can't be renamed by
>userspace ahead of time, since the kernel initiated auto-enslavement is
>unable to, or rather, is never meant to be synchronized with the rename
>request from userspace.
>
>As the failover slave interfaces are not designed to be operated
>directly by userspace apps: IP configuration, filter rules with
>regard to network traffic passing and etc., should all be done on master
>interface. In general, userspace apps only care about the
>name of master interface, while slave names are less important as long
>as admin users can see reliable names that may carry
>other information describing the netdev. For e.g., they can infer that
>"ens3nsby" is a standby slave of "ens3", while for a
>name like "eth0" they can't tell which master it belongs to.
>
>Historically the name of IFF_UP interface can't be changed because
>there might be admin script or management software that is already
>relying on such behavior and assumes that the slave name can't be
>changed once UP. But failover is special: with the in-kernel
>auto-enslavement mechanism, the userspace expectation for device
>enumeration and bring-up order is already broken. Previously initramfs
>and various userspace config tools were modified to bypass failover
>slaves because of auto-enslavement and duplicate MAC address. Similarly,
>in case that users care about seeing reliable slave name, the new type
>of failover slaves needs to be taken care of specifically in userspace
>anyway.
>
>For that to work, now introduce a module-level tunable,
>"slave_rename_ok" that allows users to lift up the rename restriction on
>failover slave which is already UP. Although it's possible this change
>potentially break userspace component (most likely configuration scripts
>or management software) that assumes slave name can't be changed while
>UP, it's relatively a limited and controllable set among all userspace
>components, which can be fixed specifically to work with the new naming
>behavior of the failover slave. Userspace component interacting with
>slaves should be changed to operate on failover master instead, as the
>failover slave is dynamic in nature which may come and go at any point.
>The goal is to make the role of failover slaves less relevant, and
>all userspace should only deal with master in the long run. The default
>for the "slave_rename_ok" is set to true(1). If userspace doesn't have
>the right support in place meanwhile users don't care about reliable
>userspace naming, the value can be set to false(0).
>
>Signed-off-by: si-wei@oracle.com
>Reviewed-by: Liran Alon 
>---
> include/linux/netdevice.h |  3 +++
> net/core/dev.c            |  3 ++-
> net/core/failover.c       | 11 +++++++++--
> 3 files changed, 14 insertions(+), 3 deletions(-)
>
>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>index 857f8ab..6d9e4e0 100644
>--- a/include/linux/netdevice.h
>+++ b/include/linux/netdevice.h
>@@ -1487,6 +1487,7 @@ struct net_device_ops {
>  * @IFF_NO_RX_HANDLER: device doesn't support the rx_handler hook
>  * @IFF_FAILOVER: device is a failover master device
>  * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
>+ * @IFF_SLAVE_RENAME_OK: rename is allowed while slave device is running
>  */
> enum netdev_priv_flags {
>   IFF_802_1Q_VLAN = 1<<0,
>@@ -1518,6 +1519,7 @@ enum netdev_priv_flags {
>   IFF_NO_RX_HANDLER   = 1<<26,
>   IFF_FAILOVER        = 1<<27,
>   IFF_FAILOVER_SLAVE  = 1<<28,
>+  IFF_SLAVE_RENAME_OK = 1<<29,
> };
> 
> #define IFF_802_1Q_VLAN   IFF_802_1Q_VLAN
>@@ -1548,6 +1550,7 @@ enum netdev_priv_flags {
> #define IFF_NO_RX_HANDLER     IFF_NO_RX_HANDLER
> #define IFF_FAILOVER          IFF_FAILOVER
> #define IFF_FAILOVER_SLAVE    IFF_FAILOVER_SLAVE
>+#define IFF_SLAVE_RENAME_OK   IFF_SLAVE_RENAME_OK
> 
> /**
>  *struct net_device - The DEVICE structure.
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 722d50d..ae070de 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -1180,7 +1180,8 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   BUG_ON(!dev_net(dev));
> 
>   net = dev_net(dev);
>-  if (dev->flags & IFF_UP)
>+  if (dev->flags & IFF_UP &&
>+      !(dev->priv_flags & IFF_SLAVE_RENAME_OK))
>           return -EBUSY;
> 
>   write_seqcount_begin(&devnet_rename_seq);
>diff --git a/net/core/failover.c b/net/core/failover.c
>index 4a92a98..1fd8bbb 
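
[Editor's sketch] The net/core/dev.c hunk quoted above changes only the condition under which a rename is refused. A minimal userspace model of that gate follows. The struct, the MODEL_ names and model_change_name() are hypothetical, and the length check stands in for the kernel's real name validation, which is not reproduced here. Only the -EBUSY condition mirrors the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
#define MODEL_IFF_UP              (1u << 0)
#define MODEL_IFF_FAILOVER_SLAVE  (1u << 28)
#define MODEL_IFF_SLAVE_RENAME_OK (1u << 29)
#define MODEL_IFNAMSIZ            16

struct model_netdev {
    char name[MODEL_IFNAMSIZ];
    unsigned int flags;       /* user-visible flags such as UP */
    unsigned int priv_flags;  /* kernel-internal flags */
};

/* Returns 0 on success or a negative errno value, like the kernel helper. */
static int model_change_name(struct model_netdev *dev, const char *newname)
{
    assert(dev != NULL && newname != NULL);

    /* Same condition as the patched check: an UP device may only be
     * renamed when it carries the rename-allowed private flag. */
    if ((dev->flags & MODEL_IFF_UP) &&
        !(dev->priv_flags & MODEL_IFF_SLAVE_RENAME_OK))
        return -EBUSY;

    /* Simplified validation (assumption): only reject over-long names. */
    if (strlen(newname) >= MODEL_IFNAMSIZ)
        return -EINVAL;

    strcpy(dev->name, newname);
    return 0;
}
```

In this model an UP slave named "eth0" without the flag gets -EBUSY when udev asks for "ens3nsby", and the same request succeeds once the flag is set.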

Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-05 Thread Michael S. Tsirkin
On Tue, Mar 05, 2019 at 11:15:06PM -0800, si-wei liu wrote:
> 
> 
> On 3/5/2019 10:43 PM, Michael S. Tsirkin wrote:
> > On Tue, Mar 05, 2019 at 04:51:00PM -0800, si-wei liu wrote:
> > > 
> > > On 3/5/2019 4:36 PM, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 05, 2019 at 04:20:50PM -0800, si-wei liu wrote:
> > > > > On 3/5/2019 4:06 PM, Michael S. Tsirkin wrote:
> > > > > > On Tue, Mar 05, 2019 at 11:35:50AM -0800, si-wei liu wrote:
> > > > > > > On 3/5/2019 11:24 AM, Stephen Hemminger wrote:
> > > > > > > > On Tue, 5 Mar 2019 11:19:32 -0800
> > > > > > > > si-wei liu  wrote:
> > > > > > > > 
> > > > > > > > > > I have a vague idea: would it work to *not* set
> > > > > > > > > > IFF_UP on slave devices at all?
> > > > > > > > > Hmm, I ever thought about this option, and it appears this 
> > > > > > > > > solution is
> > > > > > > > > more invasive than required to convert existing scripts, 
> > > > > > > > > despite the
> > > > > > > > > controversy of introducing internal netdev state to 
> > > > > > > > > differentiate user
> > > > > > > > > visible state. Either we disallow slave to be brought up by 
> > > > > > > > > user, or to
> > > > > > > > > not set IFF_UP flag but instead use the internal one, could 
> > > > > > > > > end up with
> > > > > > > > > substantial behavioral change that breaks scripts. Consider 
> > > > > > > > > any admin
> > > > > > > > > script that does `ip link set dev ... up' successfully just 
> > > > > > > > > assumes the
> > > > > > > > > link is up and subsequent operation can be done as usual.
> > > > > > How would it work when carrier is off?
> > > > > > 
> > > > > > > While it *may*
> > > > > > > > > work for dracut (yet to be verified), I'm a bit concerned 
> > > > > > > > > that there are
> > > > > > > > > more scripts to be converted than those that don't follow 
> > > > > > > > > volatile
> > > > > > > > > failover slave names. It's technically doable, but may not 
> > > > > > > > > worth the
> > > > > > > > > effort (in terms of porting existing scripts/apps).
> > > > > > > > > 
> > > > > > > > > Thanks
> > > > > > > > > -Siwei
> > > > > > > > Won't work for most devices.  Many devices turn off PHY and 
> > > > > > > > link layer
> > > > > > > > if not IFF_UP
> > > > > > > True, that's what I said about introducing internal state for 
> > > > > > > those driver
> > > > > > > and other kernel component. Very invasive change indeed.
> > > > > > > 
> > > > > > > -Siwei
> > > > > > Well I did say it's vague.
> > > > > > How about hiding IFF_UP from dev_get_flags (and probably
> > > > > > __dev_change_flags)?
> > > > > > 
> > > > > Any different? This has small footprint for the kernel change for 
> > > > > sure,
> > > > > while the discrepancy is still there. Anyone who writes code for 
> > > > > IFF_UP will
> > > > > not notice IFF_FAILOVER_SLAVE.
> > > > > 
> > > > > Not to mention more userspace "fixup" work has to be done due to this
> > > > > change.
> > > > > 
> > > > > -Siwei
> > > > > 
> > > > > 
> > > > Point is it's ok since most userspace should just ignore slaves
> > > > - hopefully it will just ignore it since it already
> > > > ignores interfaces that are down.
> > > Admin script thought the interface could be brought up and do further
> > > operations without checking the UP flag.
> > These scripts then would be broken on any box with multiple interfaces
> > since not all of these would have carrier.
> Consider a script executing `ifconfig ... up' and once succeeds runs tcpdump
> or some other command relying on UP interface. It's quite common that those
> scripts don't check the UP flag but instead just rely on the well-known fact
> that the command exits with 0 meaning the interface should be UP. This
> change might well break scripts of that kind.

I am sorry I don't get it. Could you give an example
of a script that works now but would be broken?


> > 
> > 
> > > It doesn't look to be a reliable
> > > way of prohibit userspace from operating against slaves.
> > > 
> > > -Siwei
> > > 
> > > 
> > This does not mean we shouldn't make an effort to disable broken
> > configurations.
> > 
> > I am not arguing against your patch. Not at all. I see better
> > hiding of slaves as a separate enhancement.
> I understand, but my point is we should try to minimize unnecessary side
> impact to the current usage for whatever "hiding" effort we can make. It's
> hard to find a tradeoff sometimes.

Yes if some userspace made an assumption and it worked, we should keep
it working I think. I don't necessarily agree we should worry too much
about theoretical issues. In half a year since the feature got merged
it's unlikely there are millions of slightly different scripts using it.

> > 
> > 
> > Acked-by: Michael S. Tsirkin 
> > 
> > 
> Thank you.
> 
> -Siwei
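
[Editor's sketch] The exchange above turns on the difference between a script assuming success from `ifconfig ... up` and a script actually checking interface state, and on the earlier question about what happens when carrier is off. A tool that wants to check can read the flags with the standard SIOCGIFFLAGS ioctl. The helper names and the three-way enum below are made up for illustration, and treating IFF_RUNNING as "link is operational" is a simplification.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical classification used only in this sketch. */
enum link_state { LINK_DOWN, LINK_UP_NO_CARRIER, LINK_UP_RUNNING };

/* IFF_UP is the administrative state; IFF_RUNNING is set only when the
 * link is also operational, so UP without RUNNING means "no carrier". */
static enum link_state classify_flags(short flags)
{
    if (!(flags & IFF_UP))
        return LINK_DOWN;
    return (flags & IFF_RUNNING) ? LINK_UP_RUNNING : LINK_UP_NO_CARRIER;
}

/* Read the interface flags by name. Returns 0 on success, -1 on error. */
static int read_iface_flags(const char *ifname, short *flags)
{
    struct ifreq ifr;
    int fd, ret;

    assert(ifname != NULL && flags != NULL);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    ret = ioctl(fd, SIOCGIFFLAGS, &ifr);
    close(fd);
    if (ret < 0)
        return -1;

    *flags = ifr.ifr_flags;
    return 0;
}
```

A script that relies only on the exit status of the bring-up command never makes this distinction, which is the case si-wei is worried about. A script that classifies the flags first would already cope with an interface that is UP but has no carrier.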


Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-05 Thread Michael S. Tsirkin
On Tue, Mar 05, 2019 at 04:51:00PM -0800, si-wei liu wrote:
> 
> 
> On 3/5/2019 4:36 PM, Michael S. Tsirkin wrote:
> > On Tue, Mar 05, 2019 at 04:20:50PM -0800, si-wei liu wrote:
> > > 
> > > On 3/5/2019 4:06 PM, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 05, 2019 at 11:35:50AM -0800, si-wei liu wrote:
> > > > > On 3/5/2019 11:24 AM, Stephen Hemminger wrote:
> > > > > > On Tue, 5 Mar 2019 11:19:32 -0800
> > > > > > si-wei liu  wrote:
> > > > > > 
> > > > > > > > I have a vague idea: would it work to *not* set
> > > > > > > > IFF_UP on slave devices at all?
> > > > > > > Hmm, I ever thought about this option, and it appears this 
> > > > > > > solution is
> > > > > > > more invasive than required to convert existing scripts, despite 
> > > > > > > the
> > > > > > > controversy of introducing internal netdev state to differentiate 
> > > > > > > user
> > > > > > > visible state. Either we disallow slave to be brought up by user, 
> > > > > > > or to
> > > > > > > not set IFF_UP flag but instead use the internal one, could end 
> > > > > > > up with
> > > > > > > substantial behavioral change that breaks scripts. Consider any 
> > > > > > > admin
> > > > > > > script that does `ip link set dev ... up' successfully just 
> > > > > > > assumes the
> > > > > > > link is up and subsequent operation can be done as usual.
> > > > How would it work when carrier is off?
> > > > 
> > > > > While it *may*
> > > > > > > work for dracut (yet to be verified), I'm a bit concerned that 
> > > > > > > there are
> > > > > > > more scripts to be converted than those that don't follow volatile
> > > > > > > failover slave names. It's technically doable, but may not worth 
> > > > > > > the
> > > > > > > effort (in terms of porting existing scripts/apps).
> > > > > > > 
> > > > > > > Thanks
> > > > > > > -Siwei
> > > > > > Won't work for most devices.  Many devices turn off PHY and link 
> > > > > > layer
> > > > > > if not IFF_UP
> > > > > True, that's what I said about introducing internal state for those 
> > > > > driver
> > > > > and other kernel component. Very invasive change indeed.
> > > > > 
> > > > > -Siwei
> > > > Well I did say it's vague.
> > > > How about hiding IFF_UP from dev_get_flags (and probably
> > > > __dev_change_flags)?
> > > > 
> > > Any different? This has small footprint for the kernel change for sure,
> > > while the discrepancy is still there. Anyone who writes code for IFF_UP 
> > > will
> > > not notice IFF_FAILOVER_SLAVE.
> > > 
> > > Not to mention more userspace "fixup" work has to be done due to this
> > > change.
> > > 
> > > -Siwei
> > > 
> > > 
> > Point is it's ok since most userspace should just ignore slaves
> > - hopefully it will just ignore it since it already
> > ignores interfaces that are down.
> Admin script thought the interface could be brought up and do further
> operations without checking the UP flag.

These scripts then would be broken on any box with multiple interfaces
since not all of these would have carrier.


> It doesn't look to be a reliable
> way of prohibit userspace from operating against slaves.
> 
> -Siwei
> 
> 

This does not mean we shouldn't make an effort to disable broken
configurations.

I am not arguing against your patch. Not at all. I see better
hiding of slaves as a separate enhancement.


Acked-by: Michael S. Tsirkin 


-- 
MST


Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-05 Thread Michael S. Tsirkin
On Tue, Mar 05, 2019 at 04:20:50PM -0800, si-wei liu wrote:
> 
> 
> On 3/5/2019 4:06 PM, Michael S. Tsirkin wrote:
> > On Tue, Mar 05, 2019 at 11:35:50AM -0800, si-wei liu wrote:
> > > 
> > > On 3/5/2019 11:24 AM, Stephen Hemminger wrote:
> > > > On Tue, 5 Mar 2019 11:19:32 -0800
> > > > si-wei liu  wrote:
> > > > 
> > > > > > I have a vague idea: would it work to *not* set
> > > > > > IFF_UP on slave devices at all?
> > > > > Hmm, I ever thought about this option, and it appears this solution is
> > > > > more invasive than required to convert existing scripts, despite the
> > > > > controversy of introducing internal netdev state to differentiate user
> > > > > visible state. Either we disallow slave to be brought up by user, or 
> > > > > to
> > > > > not set IFF_UP flag but instead use the internal one, could end up 
> > > > > with
> > > > > substantial behavioral change that breaks scripts. Consider any admin
> > > > > script that does `ip link set dev ... up' successfully just assumes 
> > > > > the
> > > > > link is up and subsequent operation can be done as usual.
> > How would it work when carrier is off?
> > 
> > > While it *may*
> > > > > work for dracut (yet to be verified), I'm a bit concerned that there 
> > > > > are
> > > > > more scripts to be converted than those that don't follow volatile
> > > > > failover slave names. It's technically doable, but may not worth the
> > > > > effort (in terms of porting existing scripts/apps).
> > > > > 
> > > > > Thanks
> > > > > -Siwei
> > > > Won't work for most devices.  Many devices turn off PHY and link layer
> > > > if not IFF_UP
> > > True, that's what I said about introducing internal state for those driver
> > > and other kernel component. Very invasive change indeed.
> > > 
> > > -Siwei
> > Well I did say it's vague.
> > How about hiding IFF_UP from dev_get_flags (and probably
> > __dev_change_flags)?
> > 
> Any different? This has small footprint for the kernel change for sure,
> while the discrepancy is still there. Anyone who writes code for IFF_UP will
> not notice IFF_FAILOVER_SLAVE.
> 
> Not to mention more userspace "fixup" work has to be done due to this
> change.
> 
> -Siwei
> 
> 

Point is it's ok since most userspace should just ignore slaves
- hopefully it will just ignore it since it already
ignores interfaces that are down.

-- 
MST


Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-05 Thread Michael S. Tsirkin
On Tue, Mar 05, 2019 at 11:35:50AM -0800, si-wei liu wrote:
> 
> 
> On 3/5/2019 11:24 AM, Stephen Hemminger wrote:
> > On Tue, 5 Mar 2019 11:19:32 -0800
> > si-wei liu  wrote:
> > 
> > > > I have a vague idea: would it work to *not* set
> > > > IFF_UP on slave devices at all?
> > > Hmm, I ever thought about this option, and it appears this solution is
> > > more invasive than required to convert existing scripts, despite the
> > > controversy of introducing internal netdev state to differentiate user
> > > visible state. Either we disallow slave to be brought up by user, or to
> > > not set IFF_UP flag but instead use the internal one, could end up with
> > > substantial behavioral change that breaks scripts. Consider any admin
> > > script that does `ip link set dev ... up' successfully just assumes the
> > > link is up and subsequent operation can be done as usual.

How would it work when carrier is off?

> While it *may*
> > > work for dracut (yet to be verified), I'm a bit concerned that there are
> > > more scripts to be converted than those that don't follow volatile
> > > failover slave names. It's technically doable, but may not worth the
> > > effort (in terms of porting existing scripts/apps).
> > > 
> > > Thanks
> > > -Siwei
> > Won't work for most devices.  Many devices turn off PHY and link layer
> > if not IFF_UP
> True, that's what I said about introducing internal state for those driver
> and other kernel component. Very invasive change indeed.
> 
> -Siwei

Well I did say it's vague.
How about hiding IFF_UP from dev_get_flags (and probably
__dev_change_flags)?
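
[Editor's sketch] The suggestion above, hiding IFF_UP from what userspace is told rather than not setting it, can be modelled as a mask applied on the read path. No patch for this appears in the thread, so everything below (struct, MODEL_ constants, model_get_flags()) is a hypothetical illustration of the idea and not an implementation of dev_get_flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
#define MODEL_IFF_UP             (1u << 0)
#define MODEL_IFF_FAILOVER_SLAVE (1u << 28)

struct model_netdev {
    unsigned int flags;       /* real state, still used by drivers */
    unsigned int priv_flags;  /* kernel-internal flags */
};

/* Flags as they would be reported to userspace under this idea: the
 * device stays UP internally, but a failover slave is shown as down. */
static unsigned int model_get_flags(const struct model_netdev *dev)
{
    unsigned int flags;

    assert(dev != NULL);

    flags = dev->flags;
    if (dev->priv_flags & MODEL_IFF_FAILOVER_SLAVE)
        flags &= ~MODEL_IFF_UP;
    return flags;
}
```

Because the stored flags are untouched, drivers that power down the PHY when IFF_UP is clear would not be affected, which is the objection raised against not setting IFF_UP at all. The reported and internal state diverge, though, and that discrepancy is what si-wei objects to in the reply.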




Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-05 Thread Michael S. Tsirkin
On Tue, Mar 05, 2019 at 11:19:32AM -0800, si-wei liu wrote:
> 
> 
> On 3/4/2019 6:33 PM, Michael S. Tsirkin wrote:
> > On Mon, Mar 04, 2019 at 07:50:59PM -0500, Si-Wei Liu wrote:
> > > When a netdev appears through hot plug then gets enslaved by a failover
> > > master that is already up and running, the slave will be opened
> > > right away after getting enslaved. Today there's a race that userspace
> > > (udev) may fail to rename the slave if the kernel (net_failover)
> > > opens the slave earlier than when the userspace rename happens.
> > > Unlike bond or team, the primary slave of failover can't be renamed by
> > > userspace ahead of time, since the kernel initiated auto-enslavement is
> > > unable to, or rather, is never meant to be synchronized with the rename
> > > request from userspace.
> > > 
> > > As the failover slave interfaces are not designed to be operated
> > > directly by userspace apps: IP configuration, filter rules with
> > > regard to network traffic passing and etc., should all be done on master
> > > interface. In general, userspace apps only care about the
> > > name of master interface, while slave names are less important as long
> > > as admin users can see reliable names that may carry
> > > other information describing the netdev. For e.g., they can infer that
> > > "ens3nsby" is a standby slave of "ens3", while for a
> > > name like "eth0" they can't tell which master it belongs to.
> > > 
> > > Historically the name of IFF_UP interface can't be changed because
> > > there might be admin script or management software that is already
> > > relying on such behavior and assumes that the slave name can't be
> > > changed once UP. But failover is special: with the in-kernel
> > > auto-enslavement mechanism, the userspace expectation for device
> > > enumeration and bring-up order is already broken. Previously initramfs
> > > and various userspace config tools were modified to bypass failover
> > > slaves because of auto-enslavement and duplicate MAC address. Similarly,
> > > in case that users care about seeing reliable slave name, the new type
> > > of failover slaves needs to be taken care of specifically in userspace
> > > anyway.
> > > 
> > > For that to work, now introduce a module-level tunable,
> > > "slave_rename_ok" that allows users to lift up the rename restriction on
> > > failover slave which is already UP. Although it's possible this change
> > > potentially break userspace component (most likely configuration scripts
> > > or management software) that assumes slave name can't be changed while
> > > UP, it's relatively a limited and controllable set among all userspace
> > > components, which can be fixed specifically to work with the new naming
> > > behavior of the failover slave. Userspace component interacting with
> > > slaves should be changed to operate on failover master instead, as the
> > > failover slave is dynamic in nature which may come and go at any point.
> > > The goal is to make the role of failover slaves less relevant, and
> > > all userspace should only deal with master in the long run. The default
> > > for the "slave_rename_ok" is set to true(1). If userspace doesn't have
> > > the right support in place meanwhile users don't care about reliable
> > > userspace naming, the value can be set to false(0).
> > > 
> > > Signed-off-by: si-wei@oracle.com
> > > Reviewed-by: Liran Alon 
> > Not sure which of the versions I should reply to.
> Sorry for multiple copies sent. It's fine to reply to this one.
> 
> > 
> > I have a vague idea: would it work to *not* set
> > IFF_UP on slave devices at all?
> Hmm, I ever thought about this option, and it appears this solution is more
> invasive than required to convert existing scripts, despite the controversy
> of introducing internal netdev state to differentiate user visible state.
> Either we disallow slave to be brought up by user, or to not set IFF_UP flag
> but instead use the internal one, could end up with substantial behavioral
> change that breaks scripts. Consider any admin script that does `ip link set
> dev ... up' successfully just assumes the link is up and subsequent
> operation can be done as usual. While it *may* work for dracut (yet to be
> verified), I'm a bit concerned that there are more scripts to be converted
> than those that don't follow volatile failover slave names. It's technically
> doable, but may not worth the effort (in terms of porting existing
> scripts/apps).
> 
> Thanks
> -Siwei


Right. The advantage could be that we prevent all kinds of
misconfigurations, e.g. when one has a route on a slave.

> > 
> > Would this reduce the chances of existing scripts such as dracut being
> > confused?
> > 
> > And this leaves open the option for scripts to address
> > slaves by checking some custom attribute.
> > 
> > > ---
> > >   include/linux/netdevice.h |  3 +++
> > >   net/core/dev.c|  3 ++-
> > >   net/core/failover.c   | 11 +--
> > >   3 files changed, 14 ins

Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-05 Thread Stephen Hemminger
On Tue, 5 Mar 2019 11:19:32 -0800
si-wei liu  wrote:

> > I have a vague idea: would it work to *not* set
> > IFF_UP on slave devices at all?  
> Hmm, I ever thought about this option, and it appears this solution is 
> more invasive than required to convert existing scripts, despite the 
> controversy of introducing internal netdev state to differentiate user 
> visible state. Either we disallow slave to be brought up by user, or to 
> not set IFF_UP flag but instead use the internal one, could end up with 
> substantial behavioral change that breaks scripts. Consider any admin 
> script that does `ip link set dev ... up' successfully just assumes the 
> link is up and subsequent operation can be done as usual. While it *may* 
> work for dracut (yet to be verified), I'm a bit concerned that there are 
> more scripts to be converted than those that don't follow volatile 
> failover slave names. It's technically doable, but may not worth the 
> effort (in terms of porting existing scripts/apps).
> 
> Thanks
> -Siwei

Won't work for most devices.  Many devices turn off PHY and link layer
if not IFF_UP


Re: [RFC PATCH net-next] failover: allow name change on IFF_UP slave interfaces

2019-03-04 Thread Michael S. Tsirkin
On Mon, Mar 04, 2019 at 07:50:59PM -0500, Si-Wei Liu wrote:
> When a netdev appears through hot plug then gets enslaved by a failover
> master that is already up and running, the slave will be opened
> right away after getting enslaved. Today there's a race that userspace
> (udev) may fail to rename the slave if the kernel (net_failover)
> opens the slave earlier than when the userspace rename happens.
> Unlike bond or team, the primary slave of failover can't be renamed by
> userspace ahead of time, since the kernel initiated auto-enslavement is
> unable to, or rather, is never meant to be synchronized with the rename
> request from userspace.
> 
> As the failover slave interfaces are not designed to be operated
> directly by userspace apps: IP configuration, filter rules with
> regard to network traffic passing and etc., should all be done on master
> interface. In general, userspace apps only care about the
> name of master interface, while slave names are less important as long
> as admin users can see reliable names that may carry
> other information describing the netdev. For e.g., they can infer that
> "ens3nsby" is a standby slave of "ens3", while for a
> name like "eth0" they can't tell which master it belongs to.
> 
> Historically the name of IFF_UP interface can't be changed because
> there might be admin script or management software that is already
> relying on such behavior and assumes that the slave name can't be
> changed once UP. But failover is special: with the in-kernel
> auto-enslavement mechanism, the userspace expectation for device
> enumeration and bring-up order is already broken. Previously initramfs
> and various userspace config tools were modified to bypass failover
> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> in case that users care about seeing reliable slave name, the new type
> of failover slaves needs to be taken care of specifically in userspace
> anyway.
> 
> For that to work, now introduce a module-level tunable,
> "slave_rename_ok" that allows users to lift up the rename restriction on
> failover slave which is already UP. Although it's possible this change
> potentially break userspace component (most likely configuration scripts
> or management software) that assumes slave name can't be changed while
> UP, it's relatively a limited and controllable set among all userspace
> components, which can be fixed specifically to work with the new naming
> behavior of the failover slave. Userspace component interacting with
> slaves should be changed to operate on failover master instead, as the
> failover slave is dynamic in nature which may come and go at any point.
> The goal is to make the role of failover slaves less relevant, and
> all userspace should only deal with master in the long run. The default
> for the "slave_rename_ok" is set to true(1). If userspace doesn't have
> the right support in place meanwhile users don't care about reliable
> userspace naming, the value can be set to false(0).
> 
> Signed-off-by: si-wei@oracle.com
> Reviewed-by: Liran Alon 

Not sure which of the versions I should reply to.

I have a vague idea: would it work to *not* set
IFF_UP on slave devices at all?

Would this reduce the chances of existing scripts such as dracut being
confused?

And this leaves open the option for scripts to address
slaves by checking some custom attribute.

> ---
>  include/linux/netdevice.h |  3 +++
>  net/core/dev.c            |  3 ++-
>  net/core/failover.c       | 11 +++++++++--
>  3 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 857f8ab..6d9e4e0 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1487,6 +1487,7 @@ struct net_device_ops {
>   * @IFF_NO_RX_HANDLER: device doesn't support the rx_handler hook
>   * @IFF_FAILOVER: device is a failover master device
>   * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
> + * @IFF_SLAVE_RENAME_OK: rename is allowed while slave device is running
>   */
>  enum netdev_priv_flags {
>   IFF_802_1Q_VLAN = 1<<0,
> @@ -1518,6 +1519,7 @@ enum netdev_priv_flags {
>   IFF_NO_RX_HANDLER   = 1<<26,
>   IFF_FAILOVER        = 1<<27,
>   IFF_FAILOVER_SLAVE  = 1<<28,
> + IFF_SLAVE_RENAME_OK = 1<<29,
>  };
>  
>  #define IFF_802_1Q_VLAN  IFF_802_1Q_VLAN
> @@ -1548,6 +1550,7 @@ enum netdev_priv_flags {
>  #define IFF_NO_RX_HANDLER    IFF_NO_RX_HANDLER
>  #define IFF_FAILOVER         IFF_FAILOVER
>  #define IFF_FAILOVER_SLAVE   IFF_FAILOVER_SLAVE
> +#define IFF_SLAVE_RENAME_OK  IFF_SLAVE_RENAME_OK
>  
>  /**
>   *   struct net_device - The DEVICE structure.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 722d50d..ae070de 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1180,7 +1180,8 @@ int dev_change_name(struct net