Re: [PATCH v4] net/bonding: send arp in interval if no active slave
Nikolay Aleksandrov wrote: On 10/06/2015 09:53 PM, Jarod Wilson wrote: From: Uwe KoziolekWith some very finicky switch hardware, active backup bonding can get into a situation where we play ping-pong between interfaces, trying to get one to come up as the active slave. There seems to be an issue with the switch's arp replies either taking too long, or simply getting lost, so we wind up unable to get any interface up and active. Sometimes, the issue sorts itself out after a while, sometimes it doesn't. Testing with num_grat_arp has proven fruitless, but sending an additional arp on curr_arp_slave if we're still in the arp_interval timeslice in bond_ab_arp_probe(), has shown to produce 100% reliability in testing with this hardware combination. [jarod: manufacturing of changelog, addition of modparam gating] CC: Jay Vosburgh CC: Andy Gospodarek CC: Veaceslav Falico CC: netdev@vger.kernel.org Signed-off-by: Uwe Koziolek Signed-off-by: Jarod Wilson --- v2: add code comment as to why change is needed v3: fix wrapping of comments v4: [jarod] add module parameter gating of code addition Hi all, As Andy already stated I'm not a fan of such workarounds either but it's necessary sometimes so if this is going to be actually considered then a few things need to be fixed. Please make this a proper bonding option which can be changed at runtime and not only via a module parameter. Okay, I can give that a shot, however... Now, I saw that you've only tested with 500 ms, can't this be fixed by using a different interval ? This seems like a very specific problem to have a whole new option for. ...I'll wait until we've heard confirmation from Uwe that intervals other than 500ms don't fix things. A test with 5000 ms don't fix the problem. Tested with Cisco C3750, 4 bonds. I really want to say fix the switch but I know that's not an option. :-) Yeah, unfortunately not! A few minor nits below, drivers/net/bonding/bond_main.c | 24 include/net/bonding.h | 1 + 2 files changed, 25 insertions(+) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 90f2615..72ab512 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -95,6 +95,7 @@ static int miimon; static int updelay; static int downdelay; static int use_carrier= 1; +static int arp_slow_switch; static char *mode; static char *primary; static char *primary_reselect; @@ -133,6 +134,10 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, " module_param(use_carrier, int, 0); MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; " "0 for off, 1 for on (default)"); +module_param(arp_slow_switch, int, 0); +MODULE_PARM_DESC(arp_slow_switch, "Do extra arp checks for switches with arp " + "caches that are slow to update; " + "0 for off (default), 1 for on"); module_param(mode, charp, 0); MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, " "1 for active-backup, 2 for balance-xor, " @@ -2793,6 +2798,18 @@ static bool bond_ab_arp_probe(struct bonding *bond) return should_notify_rtnl; } +/* Sometimes the forwarding tables of the switches are not update ^ s/update/updated/ D'oh. Fixed locally. @@ -4280,6 +4297,12 @@ static int bond_check_params(struct bond_params *params) use_carrier = 1; } +if ((arp_slow_switch != 0) && (arp_slow_switch != 1)) { ^^ no need for the extra () Copy-pasta from use_carrier checks right above it. Never quite sure if I should stick with the same possibly sub-optimal formatting conventions already in the file, try to fix them while also fixing bugs, or just mix styles... +pr_warn("Warning: arp_slow_switch module parameter (%d), not of valid value (0/1), so it was set to 1\n", +arp_slow_switch); +arp_slow_switch = 1; ^^ please default to old behaviour in this case (0) Will do. Uwe Koziolek -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
Jay Vosburgh wrote: Jarod Wilsonwrote: Jarod Wilson wrote: ... As Andy already stated I'm not a fan of such workarounds either but it's necessary sometimes so if this is going to be actually considered then a few things need to be fixed. Please make this a proper bonding option which can be changed at runtime and not only via a module parameter. Is there any particular userspace tool that would need some updating, or is adding the sysfs knobs sufficient here? I think I've got all the sysfs stuff thrown together now, but still need to test. Most (all?) bonding options should be configurable via iproute (netlink) now. D'oh, of course. I've done the kernel-side netlink bits now too, and started looking at the iproute source. However... Now, I saw that you've only tested with 500 ms, can't this be fixed by using a different interval ? This seems like a very specific problem to have a whole new option for. ...I'll wait until we've heard confirmation from Uwe that intervals other than 500ms don't fix things. Okay, so I believe the "only tested with 500ms" was in reference to testing with Uwe's initial patch. I do have supporting evidence in a bugzilla report that shows upwards of 5000ms still experience the problem here. I did set up some switches and attempt to reproduce this yesterday; I daisy-chained three switches (two Cisco and an HP) together and connected the bonded interfaces to the "end" switches. I tried various ARP targets (the switch, hosts on various points of the switch) and varying arp_intervals and was unable to reproduce the problem. As I understand it, the working theory is something like this: - host with two bonded interfaces, A and B. For active-backup mode, the interfaces have been assigned the same MAC address. - switch has MAC for B in its forwarding table - bonding goes from down to up, and thinks all its slaves are down, and starts the "curr_arp_slave" search for an active arp_ip_target. In this case, it starts with A, and sends an ARP from A. As an aside, I'm not 100% clear on what exactly is going on in the "bonding goes from down to up" transition; this seems to be key in reproducing the issue. - switch sees source mac coming from port A, starts to update its forwarding table - meanwhile, switch forwards ARP request, and receives ARP reply, which it forwards to port B. Bonding drops this, as the slave is inactive. - switch finishes updating forwarding table, MAC is now assigned to port A. - bonding now tries sending on port B, and the cycle repeats. If this is what's taking place, then the arp_interval itself is irrelevant, the race is between the switch table update and the generation of the ARP reply. Also, presuming the above is what's going on, we could modify the ARP "curr_arp_slave" logic a bit to resolve this without requiring any magic knobs. I really like this idea. Still trying to grasp exactly how we get into this situation and what everything looks like as we hop through the various bond_ab_arp_* functions though. For example, we could change the "drop on inactive" logic to recognise the "curr_arp_slave" search and accept the unicast ARP reply, and perhaps make that receiving slave the next curr_arp_slave automatically. Nothing ever actually getting picked as curr_arp_slave does appear to be the problem, so that does sound like it could do the trick. I also wonder if the fail_over_mac option would affect this behavior, as it would cause the slaves to keep their MAC address for the duration, so the switch would not see the MAC move from port to port. Not sure if that's an option for the particular environment, but we could certainly ask Uwe to give it a try. Another thought would be to have the curr_arp_slave cycle through the slaves in random order, but that could create non-deterministic results even when things are working correctly. I'd say avoid this route if at all possible, would rather not make things less predictable. -- Jarod Wilson ja...@redhat.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
Jarod Wilson wrote: ... As Andy already stated I'm not a fan of such workarounds either but it's necessary sometimes so if this is going to be actually considered then a few things need to be fixed. Please make this a proper bonding option which can be changed at runtime and not only via a module parameter. Is there any particular userspace tool that would need some updating, or is adding the sysfs knobs sufficient here? I think I've got all the sysfs stuff thrown together now, but still need to test. Now, I saw that you've only tested with 500 ms, can't this be fixed by using a different interval ? This seems like a very specific problem to have a whole new option for. ...I'll wait until we've heard confirmation from Uwe that intervals other than 500ms don't fix things. Okay, so I believe the "only tested with 500ms" was in reference to testing with Uwe's initial patch. I do have supporting evidence in a bugzilla report that shows upwards of 5000ms still experience the problem here. -- Jarod Wilson ja...@redhat.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
On 10/09/2015 04:36 PM, Jarod Wilson wrote: > Jarod Wilson wrote: > ... >> As Andy already stated I'm not a fan of such workarounds either but it's >> necessary sometimes so if this is going to be actually considered then a >> few things need to be fixed. Please make this a proper bonding option >> which can be changed at runtime and not only via a module parameter. > > Is there any particular userspace tool that would need some updating, or is > adding the sysfs knobs sufficient here? I think I've got all the sysfs stuff > thrown together now, but still need to test. > I'd say adding netlink support at this point is more important, and it'd be nice if you can add support to iproute2 for the new attribute. Currently all bonding options have both netlink and sysfs support, so you can follow that, the others can correct me if I'm wrong here. One more thing please don't forget to update Documentation/networking/bonding.txt > >>> Now, I saw that you've only tested with 500 ms, can't this be fixed by >>> using >>> a different interval ? This seems like a very specific problem to have a >>> whole new option for. >> >> ...I'll wait until we've heard confirmation from Uwe that intervals >> other than 500ms don't fix things. > > Okay, so I believe the "only tested with 500ms" was in reference to testing > with Uwe's initial patch. I do have supporting evidence in a bugzilla report > that shows upwards of 5000ms still experience the problem here. _5 seconds_ are not enough to receive a reply, but sending it twice in a second fixes the issue ?! This sounds like the ARP request is not properly handled/received and there's no reply. Cheers, Nik -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
Jarod Wilsonwrote: >Jarod Wilson wrote: >... >> As Andy already stated I'm not a fan of such workarounds either but it's >> necessary sometimes so if this is going to be actually considered then a >> few things need to be fixed. Please make this a proper bonding option >> which can be changed at runtime and not only via a module parameter. > >Is there any particular userspace tool that would need some updating, or >is adding the sysfs knobs sufficient here? I think I've got all the sysfs >stuff thrown together now, but still need to test. Most (all?) bonding options should be configurable via iproute (netlink) now. > >>> Now, I saw that you've only tested with 500 ms, can't this be fixed by >>> using >>> a different interval ? This seems like a very specific problem to have a >>> whole new option for. >> >> ...I'll wait until we've heard confirmation from Uwe that intervals >> other than 500ms don't fix things. > >Okay, so I believe the "only tested with 500ms" was in reference to >testing with Uwe's initial patch. I do have supporting evidence in a >bugzilla report that shows upwards of 5000ms still experience the problem >here. I did set up some switches and attempt to reproduce this yesterday; I daisy-chained three switches (two Cisco and an HP) together and connected the bonded interfaces to the "end" switches. I tried various ARP targets (the switch, hosts on various points of the switch) and varying arp_intervals and was unable to reproduce the problem. As I understand it, the working theory is something like this: - host with two bonded interfaces, A and B. For active-backup mode, the interfaces have been assigned the same MAC address. - switch has MAC for B in its forwarding table - bonding goes from down to up, and thinks all its slaves are down, and starts the "curr_arp_slave" search for an active arp_ip_target. In this case, it starts with A, and sends an ARP from A. As an aside, I'm not 100% clear on what exactly is going on in the "bonding goes from down to up" transition; this seems to be key in reproducing the issue. - switch sees source mac coming from port A, starts to update its forwarding table - meanwhile, switch forwards ARP request, and receives ARP reply, which it forwards to port B. Bonding drops this, as the slave is inactive. - switch finishes updating forwarding table, MAC is now assigned to port A. - bonding now tries sending on port B, and the cycle repeats. If this is what's taking place, then the arp_interval itself is irrelevant, the race is between the switch table update and the generation of the ARP reply. Also, presuming the above is what's going on, we could modify the ARP "curr_arp_slave" logic a bit to resolve this without requiring any magic knobs. For example, we could change the "drop on inactive" logic to recognise the "curr_arp_slave" search and accept the unicast ARP reply, and perhaps make that receiving slave the next curr_arp_slave automatically. I also wonder if the fail_over_mac option would affect this behavior, as it would cause the slaves to keep their MAC address for the duration, so the switch would not see the MAC move from port to port. Another thought would be to have the curr_arp_slave cycle through the slaves in random order, but that could create non-deterministic results even when things are working correctly. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
On 10/06/2015 09:53 PM, Jarod Wilson wrote: > From: Uwe Koziolek> > With some very finicky switch hardware, active backup bonding can get into > a situation where we play ping-pong between interfaces, trying to get one > to come up as the active slave. There seems to be an issue with the > switch's arp replies either taking too long, or simply getting lost, so we > wind up unable to get any interface up and active. Sometimes, the issue > sorts itself out after a while, sometimes it doesn't. > > Testing with num_grat_arp has proven fruitless, but sending an additional > arp on curr_arp_slave if we're still in the arp_interval timeslice in > bond_ab_arp_probe(), has shown to produce 100% reliability in testing with > this hardware combination. > > [jarod: manufacturing of changelog, addition of modparam gating] > CC: Jay Vosburgh > CC: Andy Gospodarek > CC: Veaceslav Falico > CC: netdev@vger.kernel.org > Signed-off-by: Uwe Koziolek > Signed-off-by: Jarod Wilson > --- > v2: add code comment as to why change is needed > v3: fix wrapping of comments > v4: [jarod] add module parameter gating of code addition > Hi all, As Andy already stated I'm not a fan of such workarounds either but it's necessary sometimes so if this is going to be actually considered then a few things need to be fixed. Please make this a proper bonding option which can be changed at runtime and not only via a module parameter. Now, I saw that you've only tested with 500 ms, can't this be fixed by using a different interval ? This seems like a very specific problem to have a whole new option for. I really want to say fix the switch but I know that's not an option. :-) A few minor nits below, > drivers/net/bonding/bond_main.c | 24 > include/net/bonding.h | 1 + > 2 files changed, 25 insertions(+) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index 90f2615..72ab512 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -95,6 +95,7 @@ static int miimon; > static int updelay; > static int downdelay; > static int use_carrier = 1; > +static int arp_slow_switch; > static char *mode; > static char *primary; > static char *primary_reselect; > @@ -133,6 +134,10 @@ MODULE_PARM_DESC(downdelay, "Delay before considering > link down, " > module_param(use_carrier, int, 0); > MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in > miimon; " > "0 for off, 1 for on (default)"); > +module_param(arp_slow_switch, int, 0); > +MODULE_PARM_DESC(arp_slow_switch, "Do extra arp checks for switches with arp > " > + "caches that are slow to update; " > + "0 for off (default), 1 for on"); > module_param(mode, charp, 0); > MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, " > "1 for active-backup, 2 for balance-xor, " > @@ -2793,6 +2798,18 @@ static bool bond_ab_arp_probe(struct bonding *bond) > return should_notify_rtnl; > } > > + /* Sometimes the forwarding tables of the switches are not update ^ s/update/updated/ > + * fast enough, so the first arp response after a slave change is > + * received on the wrong slave. > + * > + * The arp requests will be retried 2 times on the same slave. > + */ > + if (arp_slow_switch && > + bond_time_in_interval(bond, curr_arp_slave->last_link_up, 2)) { > + bond_arp_send_all(bond, curr_arp_slave); > + return should_notify_rtnl; > + } > + > bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_LATER); > > bond_for_each_slave_rcu(bond, slave, iter) { > @@ -4280,6 +4297,12 @@ static int bond_check_params(struct bond_params > *params) > use_carrier = 1; > } > > + if ((arp_slow_switch != 0) && (arp_slow_switch != 1)) { ^^ no need for the extra () > + pr_warn("Warning: arp_slow_switch module parameter (%d), not of > valid value (0/1), so it was set to 1\n", > + arp_slow_switch); > + arp_slow_switch = 1; ^^ please default to old behaviour in this case (0) > + } > + > if (num_peer_notif < 0 || num_peer_notif > 255) { > pr_warn("Warning: num_grat_arp/num_unsol_na (%d) not in range > 0-255 so it was reset to 1\n", > num_peer_notif); > @@ -4516,6 +4539,7 @@ static int bond_check_params(struct bond_params *params) > params->updelay = updelay; > params->downdelay = downdelay; > params->use_carrier = use_carrier; > + params->arp_slow_switch = arp_slow_switch; > params->lacp_fast = lacp_fast; > params->primary[0] = 0; > params->primary_reselect =
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
Nikolay Aleksandrov wrote: On 10/06/2015 09:53 PM, Jarod Wilson wrote: From: Uwe KoziolekWith some very finicky switch hardware, active backup bonding can get into a situation where we play ping-pong between interfaces, trying to get one to come up as the active slave. There seems to be an issue with the switch's arp replies either taking too long, or simply getting lost, so we wind up unable to get any interface up and active. Sometimes, the issue sorts itself out after a while, sometimes it doesn't. Testing with num_grat_arp has proven fruitless, but sending an additional arp on curr_arp_slave if we're still in the arp_interval timeslice in bond_ab_arp_probe(), has shown to produce 100% reliability in testing with this hardware combination. [jarod: manufacturing of changelog, addition of modparam gating] CC: Jay Vosburgh CC: Andy Gospodarek CC: Veaceslav Falico CC: netdev@vger.kernel.org Signed-off-by: Uwe Koziolek Signed-off-by: Jarod Wilson --- v2: add code comment as to why change is needed v3: fix wrapping of comments v4: [jarod] add module parameter gating of code addition Hi all, As Andy already stated I'm not a fan of such workarounds either but it's necessary sometimes so if this is going to be actually considered then a few things need to be fixed. Please make this a proper bonding option which can be changed at runtime and not only via a module parameter. Okay, I can give that a shot, however... Now, I saw that you've only tested with 500 ms, can't this be fixed by using a different interval ? This seems like a very specific problem to have a whole new option for. ...I'll wait until we've heard confirmation from Uwe that intervals other than 500ms don't fix things. I really want to say fix the switch but I know that's not an option. :-) Yeah, unfortunately not! A few minor nits below, drivers/net/bonding/bond_main.c | 24 include/net/bonding.h | 1 + 2 files changed, 25 insertions(+) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 90f2615..72ab512 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -95,6 +95,7 @@ static int miimon; static int updelay; static int downdelay; static int use_carrier= 1; +static int arp_slow_switch; static char *mode; static char *primary; static char *primary_reselect; @@ -133,6 +134,10 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, " module_param(use_carrier, int, 0); MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; " "0 for off, 1 for on (default)"); +module_param(arp_slow_switch, int, 0); +MODULE_PARM_DESC(arp_slow_switch, "Do extra arp checks for switches with arp " + "caches that are slow to update; " + "0 for off (default), 1 for on"); module_param(mode, charp, 0); MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, " "1 for active-backup, 2 for balance-xor, " @@ -2793,6 +2798,18 @@ static bool bond_ab_arp_probe(struct bonding *bond) return should_notify_rtnl; } + /* Sometimes the forwarding tables of the switches are not update ^ s/update/updated/ D'oh. Fixed locally. @@ -4280,6 +4297,12 @@ static int bond_check_params(struct bond_params *params) use_carrier = 1; } + if ((arp_slow_switch != 0) && (arp_slow_switch != 1)) { ^^ no need for the extra () Copy-pasta from use_carrier checks right above it. Never quite sure if I should stick with the same possibly sub-optimal formatting conventions already in the file, try to fix them while also fixing bugs, or just mix styles... + pr_warn("Warning: arp_slow_switch module parameter (%d), not of valid value (0/1), so it was set to 1\n", + arp_slow_switch); + arp_slow_switch = 1; ^^ please default to old behaviour in this case (0) Will do. -- Jarod Wilson ja...@redhat.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4] net/bonding: send arp in interval if no active slave
From: Uwe KoziolekWith some very finicky switch hardware, active backup bonding can get into a situation where we play ping-pong between interfaces, trying to get one to come up as the active slave. There seems to be an issue with the switch's arp replies either taking too long, or simply getting lost, so we wind up unable to get any interface up and active. Sometimes, the issue sorts itself out after a while, sometimes it doesn't. Testing with num_grat_arp has proven fruitless, but sending an additional arp on curr_arp_slave if we're still in the arp_interval timeslice in bond_ab_arp_probe(), has shown to produce 100% reliability in testing with this hardware combination. [jarod: manufacturing of changelog, addition of modparam gating] CC: Jay Vosburgh CC: Andy Gospodarek CC: Veaceslav Falico CC: netdev@vger.kernel.org Signed-off-by: Uwe Koziolek Signed-off-by: Jarod Wilson --- v2: add code comment as to why change is needed v3: fix wrapping of comments v4: [jarod] add module parameter gating of code addition drivers/net/bonding/bond_main.c | 24 include/net/bonding.h | 1 + 2 files changed, 25 insertions(+) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 90f2615..72ab512 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -95,6 +95,7 @@ static int miimon; static int updelay; static int downdelay; static int use_carrier = 1; +static int arp_slow_switch; static char *mode; static char *primary; static char *primary_reselect; @@ -133,6 +134,10 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, " module_param(use_carrier, int, 0); MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; " "0 for off, 1 for on (default)"); +module_param(arp_slow_switch, int, 0); +MODULE_PARM_DESC(arp_slow_switch, "Do extra arp checks for switches with arp " + "caches that are slow to update; " + "0 for off (default), 1 for on"); module_param(mode, charp, 0); MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, " "1 for active-backup, 2 for balance-xor, " @@ -2793,6 +2798,18 @@ static bool bond_ab_arp_probe(struct bonding *bond) return should_notify_rtnl; } + /* Sometimes the forwarding tables of the switches are not update +* fast enough, so the first arp response after a slave change is +* received on the wrong slave. +* +* The arp requests will be retried 2 times on the same slave. +*/ + if (arp_slow_switch && + bond_time_in_interval(bond, curr_arp_slave->last_link_up, 2)) { + bond_arp_send_all(bond, curr_arp_slave); + return should_notify_rtnl; + } + bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_LATER); bond_for_each_slave_rcu(bond, slave, iter) { @@ -4280,6 +4297,12 @@ static int bond_check_params(struct bond_params *params) use_carrier = 1; } + if ((arp_slow_switch != 0) && (arp_slow_switch != 1)) { + pr_warn("Warning: arp_slow_switch module parameter (%d), not of valid value (0/1), so it was set to 1\n", + arp_slow_switch); + arp_slow_switch = 1; + } + if (num_peer_notif < 0 || num_peer_notif > 255) { pr_warn("Warning: num_grat_arp/num_unsol_na (%d) not in range 0-255 so it was reset to 1\n", num_peer_notif); @@ -4516,6 +4539,7 @@ static int bond_check_params(struct bond_params *params) params->updelay = updelay; params->downdelay = downdelay; params->use_carrier = use_carrier; + params->arp_slow_switch = arp_slow_switch; params->lacp_fast = lacp_fast; params->primary[0] = 0; params->primary_reselect = primary_reselect_value; diff --git a/include/net/bonding.h b/include/net/bonding.h index c1740a2..208d31c 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -120,6 +120,7 @@ struct bond_params { int arp_validate; int arp_all_targets; int use_carrier; + int arp_slow_switch; int fail_over_mac; int updelay; int downdelay; -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] net/bonding: send arp in interval if no active slave
Jarod Wilson wrote: From: Uwe KoziolekWith some very finicky switch hardware, active backup bonding can get into a situation where we play ping-pong between interfaces, trying to get one to come up as the active slave. There seems to be an issue with the switch's arp replies either taking too long, or simply getting lost, so we wind up unable to get any interface up and active. Sometimes, the issue sorts itself out after a while, sometimes it doesn't. Testing with num_grat_arp has proven fruitless, but sending an additional arp on curr_arp_slave if we're still in the arp_interval timeslice in bond_ab_arp_probe(), has shown to produce 100% reliability in testing with this hardware combination. [jarod: manufacturing of changelog, addition of modparam gating] CC: Jay Vosburgh CC: Andy Gospodarek CC: Veaceslav Falico CC: netdev@vger.kernel.org Signed-off-by: Uwe Koziolek Signed-off-by: Jarod Wilson --- v2: add code comment as to why change is needed v3: fix wrapping of comments v4: [jarod] add module parameter gating of code addition drivers/net/bonding/bond_main.c | 24 include/net/bonding.h | 1 + 2 files changed, 25 insertions(+) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 90f2615..72ab512 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -95,6 +95,7 @@ static int miimon; static int updelay; static int downdelay; static int use_carrier= 1; +static int arp_slow_switch; static char *mode; static char *primary; static char *primary_reselect; @@ -133,6 +134,10 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, " module_param(use_carrier, int, 0); MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; " "0 for off, 1 for on (default)"); +module_param(arp_slow_switch, int, 0); +MODULE_PARM_DESC(arp_slow_switch, "Do extra arp checks for switches with arp " + "caches that are slow to update; " + "0 for off (default), 1 for on"); module_param(mode, charp, 0); MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, " "1 for active-backup, 2 for balance-xor, " @@ -2793,6 +2798,18 @@ static bool bond_ab_arp_probe(struct bonding *bond) return should_notify_rtnl; } + /* Sometimes the forwarding tables of the switches are not update +* fast enough, so the first arp response after a slave change is +* received on the wrong slave. +* +* The arp requests will be retried 2 times on the same slave. +*/ + if (arp_slow_switch && This here should actually be bond->params.arp_slow_switch, but I'd like to hear first if a module parameter gating this change is even a remotely acceptable idea. It'd keep the logic identical in the default case though, and still allow for people like Uwe that need it to deploy the work-around. Though I'm slightly curious if this problem does NOT manifest by simply setting a larger arp_interval. Early on, I thought I'd heard that other intervals had been tried with the same results, but a comment in this thread suggested maybe only 500 had been tried. -- Jarod Wilson ja...@redhat.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html