Re: [PATCH] bonding: 3ad: update slave arr after initialize

2021-04-19 Thread Jay Vosburgh
>> return of ad_agg_selection_logic). >> >> I believe I understand the described problem, but I don't see >> how the patch fixes it. I suspect (but haven't tested) that the proper >> fix is to acquire mode_lock in bond_update_slave_arr while calling >>

Re: [PATCH] bonding: 3ad: update slave arr after initialize

2021-04-15 Thread Jay Vosburgh
remains false at return of ad_agg_selection_logic). I believe I understand the described problem, but I don't see how the patch fixes it. I suspect (but haven't tested) that the proper fix is to acquire mode_lock in bond_update_slave_arr while calling bond_3ad_get_active_agg_info to avoid conflict with the state machine. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net-next] bonding: Added -ENODEV interpret for slaves option

2021-03-13 Thread Jay Vosburgh
opt->name, val->string); >+ } >+ break; > default: > break; > } >-- >2.25.1 > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: question about bonding mode 4

2021-01-29 Thread Jay Vosburgh
: port->aggregator is still NULL, which causes problem. >> >> aggregator = __get_first_agg(port); >> ad_agg_selection_logic(aggregator, update_slave_arr); >> >> if (!port->aggregator->is_active) >> port->actor_oper_port_state &= ~LACP_STATE_SYNCHRONIZATION; Correct, if the "did not find a suitable aggregator" path is taken, port->aggregator is NULL and bad things happen in the above block. This is something that needs to be fixed, but I'm also concerned that there are other issues lurking, so I'd like to be able to reproduce this. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com
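A minimal user-space sketch of the failure mode described above, assuming simplified stand-ins for the bonding structures. The struct layouts, the helper name clear_sync_if_inactive, and the choice to treat a NULL aggregator like an inactive one are all assumptions made for illustration; this is not the fix that went into the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Synchronization bit of the LACP state octet (value assumed from the
 * standard's bit layout). */
#define LACP_STATE_SYNCHRONIZATION 0x08

/* Hypothetical, heavily reduced versions of the bonding structures. */
struct aggregator {
    bool is_active;
};

struct port {
    struct aggregator *aggregator;   /* NULL when no suitable aggregator was found */
    unsigned char actor_oper_port_state;
};

/* Check the pointer before dereferencing it, so the "did not find a
 * suitable aggregator" path cannot crash here. */
static void clear_sync_if_inactive(struct port *port)
{
    if (!port->aggregator || !port->aggregator->is_active)
        port->actor_oper_port_state &= ~LACP_STATE_SYNCHRONIZATION;
}
```

Whether a port without an aggregator should simply drop out of sync, or whether the selection logic should be prevented from returning in that state at all, is exactly the open question in the thread.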

Re: [PATCH net-next v2] bonding: add a vlan+mac tx hashing option

2021-01-14 Thread Jay Vosburgh
>> > + struct ethhdr *mac_hdr = (struct ethhdr *)skb_mac_header(skb); >> >> I don't see anything in the patch making sure the interface actually >> has a L2 header. Should we validate somehow the ifc is Ethernet? > >I don't think it's necessary. There doesn't appear to be any explicit >check for BOND_XMIT_POLICY_LAYER2 either. I believe we're guaranteed to >not have anything but an ethernet header here, as the only other type I'm >aware of being supported is Infiniband, but we limit that to active-backup >only, and xmit_hash_policy isn't valid for active-backup. This is correct, interfaces in a bond other than active-backup will all be ARPHRD_ETHER. I'm unaware of a way to get a packet in there without at least an Ethernet header. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [RFC PATCH net-next] bonding: add a vlan+srcmac tx hashing option

2021-01-12 Thread Jay Vosburgh
Jarod Wilson wrote: >On Thu, Jan 07, 2021 at 07:03:40PM -0500, Jarod Wilson wrote: >> On Fri, Dec 18, 2020 at 04:18:59PM -0800, Jay Vosburgh wrote: >> > Jarod Wilson wrote: >> > >> > >This comes from an end-user request, where they're running multiple VM

Re: [RFC PATCH net-next] bonding: add a vlan+srcmac tx hashing option

2020-12-18 Thread Jay Vosburgh
th, is >fleshing out Documentation/networking/bonding.rst. I'm sure you're aware, but any final submission will also need to include netlink and iproute2 support. >Cc: Jay Vosburgh >Cc: Veaceslav Falico >Cc: Andy Gospodarek >Cc: "David S. Miller" >Cc: J

Re: [PATCH net-next] bonding: correct rr balancing during link failure

2020-12-08 Thread Jay Vosburgh
} >> } >> >> @@ -4117,6 +4118,7 @@ static struct slave *bond_get_slave_by_id(struct >> bonding *bond, >> break; >> if (bond_slave_can_tx(slave)) >> return slave; >> +bond->rr_tx_counter++; >> } >> /* no slave that can tx has been found */ >> return NULL; > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net] bonding: reduce rtnl lock contention in mii monitor thread

2020-12-08 Thread Jay Vosburgh
active " : "backup ") : "", >+ bond->params.downdelay * >bond->params.miimon); >+ } >+ break; >+ >+ case BOND_LINK_DOWN: >+ slave_info(bond->dev, slave->dev, "link status down again after >%d ms\n", >+ (bond->params.updelay - slave->delay) * >+ bond->params.miimon); >+ break; >+ >+ case BOND_LINK_BACK: >+ if (slave->delay) { >+ slave_info(bond->dev, slave->dev, "link status up, >enabling it in %d ms\n", >+ bond->params.updelay * bond->params.miimon); >+ } >+ break; >+ } >+ > if (notify) { > bond_queue_slave_event(slave); > bond_lower_state_changed(slave); >-- >2.28.0 > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net v2] bonding: fix feature flag setting at init time

2020-12-02 Thread Jay Vosburgh
Jarod Wilson wrote: >On Wed, Dec 2, 2020 at 12:55 PM Jay Vosburgh >wrote: >> >> Jarod Wilson wrote: >> >> >Don't try to adjust XFRM support flags if the bond device isn't yet >> >registered. Bad things can currently happen when netdev_chan

Re: [PATCH net v2] bonding: fix feature flag setting at init time

2020-12-02 Thread Jay Vosburgh
sed on further testing and suggestions from ivecera > >Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load") >Reported-by: Ivan Vecera >Suggested-by: Ivan Vecera >Cc: Jay Vosburgh >Cc: Veaceslav Falico >Cc: Andy Gospodarek >Cc: "David S. Miller

Re: [PATCH 2/5] bonding: replace use of the term master where possible

2020-11-11 Thread Jay Vosburgh
Jarod Wilson wrote: >Simply refer to what was the bonding "master" as the "bond" or bonding >device, depending on context. However, do retain compat code for the >bonding_masters sysfs interface to avoid breaking userspace. > >Cc: Jay Vosburgh >Cc: Veace

Re: [PATCH net-next v4 0/5] bonding: rename bond components

2020-11-11 Thread Jay Vosburgh
rence their deprecated >nature, explain the name changes, add references to NetworkManager, >include more netlink/iproute2 examples and make note of netlink >being the preferred interface for userspace interaction with bonds. > >v4: documentation table of contents fixes > >Cc:

Re: [PATCH] net: bonding: alb disable balance for IPv6 multicast related mac

2020-10-28 Thread Jay Vosburgh
true if the address is a multicast for IPv6. >+ */ >+static inline bool is_ipv6_multicast_ether_addr(const u8 *addr) >+{ >+ return (addr[0] & addr[1]) == 0x33; >+} I don't think this does what is intended. It will return true for a MAC that starts with any two values whose bitwise AND is 0x33, e.g., 0x73 0x3b. For IPv6 multicast, the first two octets of the MAC must be exactly 0x33 0x33. -J >+ >+/** > * is_valid_ether_addr - Determine if the given Ethernet address is valid > * @addr: Pointer to a six-byte array containing the Ethernet address > * >-- >1.8.3.1 --- -Jay Vosburgh, jay.vosbu...@canonical.com
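The objection above can be checked in user space. The sketch below puts the posted test next to an exact comparison; the function names is_ipv6_mcast_and and is_ipv6_mcast_exact are made up for this illustration and are not kernel helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* The test as posted in the patch: ANDs the first two octets together,
 * so any pair whose bitwise AND is 0x33 passes. */
static bool is_ipv6_mcast_and(const uint8_t *addr)
{
    return (addr[0] & addr[1]) == 0x33;
}

/* Exact comparison: both leading octets must be 0x33 (33:33:xx:xx:xx:xx). */
static bool is_ipv6_mcast_exact(const uint8_t *addr)
{
    return addr[0] == 0x33 && addr[1] == 0x33;
}
```

With the example from the review, 0x73 & 0x3b is 0x33, so a MAC starting 73:3b passes the first test and fails the second, while a real 33:33 prefix passes both.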

Re: [PATCH] net: bonding: alb disable balance for IPv6 multicast related mac

2020-10-28 Thread Jay Vosburgh
LIU Yulong wrote: >Hi Jay, > > > >Thank you for your response and review. Please see my inline comments. I'm still reviewing your commentary, but to answer your final question regarding updating the patch, you'll need to repost the entire patch (with the new changes). This repost

Re: [PATCH net-next v2 6/6] bonding: make Kconfig toggle to disable legacy interfaces

2020-10-05 Thread Jay Vosburgh
piecemeal addition and removal from the existing UAPI. That makes for a much clearer flag day event for end users. By this I mean leave proc / sysfs as-is today, and then after a suitable deprecation period, remove it wholesale (rather than a compile time option). -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net-next 4/5] bonding: make Kconfig toggle to disable legacy interfaces

2020-09-24 Thread Jay Vosburgh
Jarod Wilson wrote: >On Tue, Sep 22, 2020 at 8:01 PM Stephen Hemminger > wrote: >> >> On Tue, 22 Sep 2020 16:47:07 -0700 >> Jay Vosburgh wrote: >> >> > Stephen Hemminger wrote: >> > >> > >On Tue, 22 Sep 2020 09:37:30 -0400 >> &

Re: [PATCH net-next 4/5] bonding: make Kconfig toggle to disable legacy interfaces

2020-09-22 Thread Jay Vosburgh
enable, and enabling will break the UAPI? >Then you might convince maintainers to update documentation as well. >Last I checked there were still references to ifenslave. Distros still include ifenslave, but it's now a shell script that uses sysfs. I see it used in scripts from time to time. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net-next 0/5] bonding: rename bond components

2020-09-22 Thread Jay Vosburgh
h this >set, unless the user consciously opts to do so via Kconfig. > >Jarod Wilson (5): > bonding: rename struct slave member link to link_state > bonding: rename slave to link where possible > bonding: rename master to aggregator where possible > bonding: make Kconfi

Re: [PATCH net] bonding: show saner speed for broadcast mode

2020-08-12 Thread Jay Vosburgh
" here? >Also, the type of the speed field is u32, not unsigned long, so adjust >that accordingly, as required to make min() function here without >complaining about mismatching types. > >Fixes: bb5b052f751b ("bond: add support to read speed and duplex via ethtool"

Re: [PATCH v2 1/2] PCI/ERR: Fix fatal error recovery for non-hotplug capable devices

2020-06-24 Thread Jay Vosburgh
Jay Vosburgh wrote: >sathyanarayanan.kuppusw...@linux.intel.com wrote: > >From: Kuppuswamy Sathyanarayanan >> >>Fatal (DPC) error recovery is currently broken for non-hotplug >>capable devices. With current implementation, after successful >>fatal error rec

Re: [PATCH net-next 3/4] bonding: support hardware encryption offload to slaves

2020-06-08 Thread Jay Vosburgh
as well as successful failover and recovery mid-netperf. > >v2: rebase on latest net-next and wrap with #ifdef CONFIG_XFRM_OFFLOAD >v3: add new CONFIG_BOND_XFRM_OFFLOAD option and fix shutdown path > >CC: Jay Vosburgh >CC: Veaceslav Falico >CC: Andy Gospodarek >CC: "Da

Re: [PATCH v2 1/2] PCI/ERR: Fix fatal error recovery for non-hotplug capable devices

2020-06-04 Thread Jay Vosburgh
r reset link operation which will also fix the above >mentioned issue. > >[original patch is from jay.vosbu...@canonical.com] >[original patch link >https://lore.kernel.org/linux-pci/12115.1588207324@famine/] >Fixes: 6d2c89441571 ("PCI/ERR: Update error status after reset_link()

Re: [PATCH] bonding: Fix reference count leak in bond_sysfs_slave_add.

2020-05-28 Thread Jay Vosburgh
a similar problem. > >Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") >Signed-off-by: Qiushi Wu Acked-by: Jay Vosburgh >--- > drivers/net/bonding/bond_sysfs_slave.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > >diff

Re: [PATCH v1 1/1] PCI/ERR: Handle fatal error recovery for non-hotplug capable devices

2020-05-12 Thread Jay Vosburgh
_recovery. After 6d2c89441571 report_slot_reset is not >invoked, and the device does not recover. > >[original patch is from jay.vosbu...@canonical.com] >[original patch link >https://lore.kernel.org/linux-pci/18609.1588812972@famine/] >Fixes: 6d2c89441571 ("PCI/ERR: Update erro

Re: [PATCH v3] bonding: force enable lacp port after link state recovery for 802.3ad

2019-09-19 Thread Jay Vosburgh
rs don't have correct speed/duplex >settings at the time they send NETDEV_UP notification ...". > >Anyway, I think the lacp status should be fixed correctly, >since link-monitoring (miimon) set SPEED/DUPLEX right here. I suspect this is going to be related to the concurrent discussion with Aleksei, and would like to see the instrumentation results from his tests before adding another change to attempt to resolve this. Also, what device are you using for your testing, and are you able to run the instrumentation patch that I provided to Aleksei and provide its results? -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] bonding: force enable lacp port after link state recovery for 802.3ad

2019-08-20 Thread Jay Vosburgh
ed by duplex, but by carrier state. Duplex does affect whether or not a port is permitted to aggregate, but that's entirely separate logic (the AD_PORT_LACP_ENABLED flag). Would it be better to call bond_3ad_handle_link_change() here, instead of manually testing duplex and setting is_enabled? -J > continue; > > case BOND_LINK_UP: >-- >1.8.3.1 > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [bonding][patch] Regarding a bonding lacp issue

2019-08-08 Thread Jay Vosburgh
artner_oper.port_state & AD_STATE_SYNCHRONIZATION) && >+ (port->partner_oper.port_state & AD_STATE_COLLECTING) && >!__check_agg_selection_timer(port)) { > if (port->aggregator->is_active) > port->sm_mux_state = > >-- >Thanks, >Felix --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] bonding: Add vlan tx offload to hw_enc_features

2019-08-06 Thread Jay Vosburgh
this be tagged with Fixes: 278339a42a1b ("bonding: propogate vlan_features to bonding master") as 30d8177e8ac7 was? If not, is there an appropriate commit id? Acked-by: Jay Vosburgh -J >--- > drivers/net/bonding/bond_main.c | 2 ++ > 1 file changed, 2 insert

Re: [PATCH net] bonding/802.3ad: fix slave link initialization transition states

2019-05-24 Thread Jay Vosburgh
Presuming that this is the case, I don't see that there's much else to be done here, and so: Acked-by: Jay Vosburgh >The simple fix is to instead set the slave link to BOND_LINK_DOWN again, >if the link has never been up (last_link_up == 0), so the link state >doesn't bounce f

Re: [PATCH] bonding: show full hw address in sysfs for slave entries

2019-03-29 Thread Jay Vosburgh
_count); > > static ssize_t perm_hwaddr_show(struct slave *slave, char *buf) > { >- return sprintf(buf, "%pM\n", slave->perm_hwaddr); >+ return sprintf(buf, "%*phC\n", >+ slave->dev->addr_len, >+

Re: [PATCH 1/2] net: bonding: fix restricted __be16 degrades to integer

2019-03-08 Thread Jay Vosburgh
continue; >@@ -3238,7 +3238,7 @@ static inline u32 bond_eth_hash(struct sk_buff *skb) > > ep = skb_header_pointer(skb, 0, sizeof(hdr_tmp), _tmp); > if (ep) >- return ep->h_dest[5] ^ ep->h_source[5] ^ ep->h_proto; >+ return e

Re: [PATCH 2/2] net: bonding: fix incorrect type in assignment

2019-03-07 Thread Jay Vosburgh
s, newval.value is initially be32, but stored in a u64. __bond_opt_set will call bond_opt_parse, which in turn will call bond_option_arp_ip_targets_set (via .set), and the change above would swap the newval.value back to host order (on little endian architectures for which cpu_to_be32 is not a no-op). Am I misunderstanding? Did you test this change on an x86 or other little endian system? -J --- -Jay Vosburgh, jay.vosbu...@canonical.com
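A rough user-space illustration of the concern, assuming the value really is already in network byte order when it is stored. htonl() stands in for cpu_to_be32(), and the three helpers are hypothetical names, not bonding functions.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Stand-in for newval.value: a network-order (be32) address kept in a u64. */
static uint64_t store_target(uint32_t be_addr)
{
    return be_addr;
}

/* Setter that applies one more conversion to a value that is already
 * big endian. On little endian hosts this swaps it back to host order. */
static uint32_t setter_with_extra_swap(uint64_t value)
{
    return htonl((uint32_t)value);
}

/* Setter that only narrows the stored value and leaves the byte order alone. */
static uint32_t setter_plain(uint64_t value)
{
    return (uint32_t)value;
}
```

On x86, setter_with_extra_swap(store_target(htonl(0xC0A80001))) gives 0xC0A80001 again, i.e. host order, which is why the question about testing on a little endian system matters. On big endian hosts both setters return the same thing, so the problem would go unnoticed there.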

Re: [PATCH] Convert BUG_ON to WARN_ON in bond_options.c

2017-06-21 Thread Jay Vosburgh
pointer null"); >return ERROR_CODE >} In general, yes, but in this case, the condition should be impossible to hit, so BUG_ON seems appropriate. If bond_slave_get_rtnl/rcu() returns NULL for an actual bonding slave, other code paths (bond_fill_slave_info, bond_handle_frame) will likely crash before getting to this one. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] Convert BUG_ON to WARN_ON in bond_options.c

2017-06-21 Thread Jay Vosburgh
general, yes, but in this case, the condition should be impossible to hit, so BUG_ON seems appropriate. If bond_slave_get_rtnl/rcu() returns NULL for an actual bonding slave, other code paths (bond_fill_slave_info, bond_handle_frame) will likely crash before getting to this one. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] Convert BUG_ON to WARN_ON in bond_options.c

2017-06-21 Thread Jay Vosburgh
as there doesn't appear to be any way to get into bond_option_active_slave_set for a slave prior to bond_enslave registering the rx_handler for that slave, as these operations are mutexed by RTNL. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] Convert BUG_ON to WARN_ON in bond_options.c

2017-06-21 Thread Jay Vosburgh
tion_active_slave_set for a slave prior to bond_enslave registering the rx_handler for that slave, as these operations are mutexed by RTNL. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] Convert BUG_ON to WARN_ON in bond_options.c

2017-06-21 Thread Jay Vosburgh
if (new_active == old_active) { > /* do nothing */ >-- >2.7.4 > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] Convert BUG_ON to WARN_ON in bond_options.c

2017-06-21 Thread Jay Vosburgh
WARN_ON(!new_active); This is a reasonable idea in principle, but will require additional changes to prevent dereferencing new_active if it is NULL (which would happen just below this point in the code). -J > if (new_active == old_active) { > /* do nothing */ >-- >2.7.4 > --- -Jay Vosburgh, jay.vosbu...@canonical.com
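To make the point concrete, here is a small user-space sketch of what "additional changes" could look like: warn, then bail out before the pointer is used. The warn_on() helper, the reduced struct slave, and set_active() are hypothetical stand-ins, not the kernel's WARN_ON or the real bond_option_active_slave_set.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced slave structure. */
struct slave {
    const char *name;
};

/* User-space stand-in for WARN_ON(): report the condition and return it,
 * so the caller can branch on the result. */
static int warn_on(int cond, const char *what)
{
    if (cond)
        fprintf(stderr, "WARNING: %s\n", what);
    return cond;
}

/* Unlike BUG_ON, a warning does not stop execution, so the function has to
 * return before new_active is dereferenced further down. */
static int set_active(const struct slave *old_active,
                      const struct slave *new_active,
                      const char **active_name)
{
    if (warn_on(new_active == NULL, "new_active is NULL"))
        return -EINVAL;

    if (new_active == old_active)
        return 0;                    /* do nothing */

    *active_name = new_active->name; /* safe: checked above */
    return 0;
}
```

Converting BUG_ON to WARN_ON without the early return would only move the crash a few lines further down.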

Re: [PATCH] Convert multiple netdev_info messages to netdev_dbg

2017-06-15 Thread Jay Vosburgh
bond->params.arp_interval = 0; >> /* set miimon to default value */ >> bond->params.miimon = BOND_DEFAULT_MIIMON; >> -netdev_info(bond->dev, "Setting MII monitoring interval to >> %d\n", >> +netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n", >> bond->params.miimon); > >etc... > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] net: bonding: use new api ethtool_{get|set}_link_ksettings

2016-10-26 Thread Jay Vosburgh
s = bond_ethtool_get_settings, > .get_link = ethtool_op_get_link, >+ .get_link_ksettings = bond_ethtool_get_link_ksettings, > }; > > static const struct net_device_ops bond_netdev_ops = { >-- >1.7.4.4 --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] net: bonding: use new api ethtool_{get|set}_link_ksettings

2016-10-26 Thread Jay Vosburgh
; } >@@ -4121,8 +4121,8 @@ static void bond_ethtool_get_drvinfo(struct net_device >*bond_dev, > > static const struct ethtool_ops bond_ethtool_ops = { > .get_drvinfo= bond_ethtool_get_drvinfo, >- .get_settings = bond_ethtool_get_settings, >

Re: [PATCH] bonding: Prevent deletion of a bond, or the last slave from a bond, with active usage.

2016-09-06 Thread Jay Vosburgh
>+ pr_info("%s is being deleted...\n", ifname); >+ unregister_netdevice(bond_dev); >+ } else { >+ pr_err("unable to delete non-existent %s\n", ifname); >+ res = -ENODEV; >+ } >+ rtnl_unlock(); > } else > goto err_no_cmd; > >@@ -139,7 +170,7 @@ static ssize_t bonding_store_bonds(struct class *cls, > return res; > > err_no_cmd: >- pr_err("no command found in bonding_masters - use +ifname or >-ifname\n"); >+ pr_err("no command found in bonding_masters - use +ifname or -ifname or >?-ifname\n"); > return -EPERM; > } > >-- >1.8.3.1 > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] bonding: Prevent deletion of a bond, or the last slave from a bond, with active usage.

2016-09-06 Thread Jay Vosburgh
>+ unregister_netdevice(bond_dev); >+ } else { >+ pr_err("unable to delete non-existent %s\n", ifname); >+ res = -ENODEV; >+ } >+ rtnl_unlock(); > } else > goto err_no_cmd; > >@@ -139,7 +170,7 @@ static ssize_t bonding_store_bonds(struct class *cls, > return res; > > err_no_cmd: >- pr_err("no command found in bonding_masters - use +ifname or >-ifname\n"); >+ pr_err("no command found in bonding_masters - use +ifname or -ifname or >?-ifname\n"); > return -EPERM; > } > >-- >1.8.3.1 > --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] bonding:Fix perfomance drop if one bonding device in configuration is repeatedly restarted

2016-07-14 Thread Jay Vosburgh
, > arp->mac_dst); Regardless of the description above, this test is ensuring that the code doesn't use the broadcast MAC address for the client. Changing it seems like a bad thing to do, as it would cause traffic for the client to be sent to the ethernet broadcast addres

Re: [PATCH] bonding:Fix perfomance drop if one bonding device in configuration is repeatedly restarted

2016-07-14 Thread Jay Vosburgh
description above, this test is ensuring that the code doesn't use the broadcast MAC address for the client. Changing it seems like a bad thing to do, as it would cause traffic for the client to be sent to the ethernet broadcast address. This might, in fact, make the described test run better, but it has nothing to do with rebalancing and would negatively impact other hosts on the network. -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [RFC PATCH net] net/core: don't increment rx_dropped on inactive slaves

2016-01-22 Thread Jay Vosburgh
"rx_drop_unforwardable" (or an equivalent but shorter name) counter that counts every time a packet is dropped due to is_skb_forwardable() returning false. __dev_forward_skb does this (and hits rx_dropped), as does the bridge (and does not count it). -J >CC: "David S. Mille

Re: [RFC PATCH net] net/core: don't increment rx_dropped on inactive slaves

2016-01-22 Thread Jay Vosburgh
C: "David S. Miller" <da...@davemloft.net> >CC: Eric Dumazet <eduma...@google.com> >CC: Jiri Pirko <j...@mellanox.com> >CC: Daniel Borkmann <dan...@iogearbox.net> >CC: Tom Herbert <t...@herbertland.com> >CC: Jay Vosburgh <j.vosbu...@gmail.

Re: [PATCH v4] net/bonding: send arp in interval if no active slave

2015-10-09 Thread Jay Vosburgh
the fail_over_mac option would affect this behavior, as it would cause the slaves to keep their MAC address for the duration, so the switch would not see the MAC move from port to port. Another thought would be to have the curr_arp_slave cycle through the slaves in random order, but that could cre

Re: [PATCH v4] net/bonding: send arp in interval if no active slave

2015-10-09 Thread Jay Vosburgh
I also wonder if the fail_over_mac option would affect this behavior, as it would cause the slaves to keep their MAC address for the duration, so the switch would not see the MAC move from port to port. Another thought would be to have the curr_arp_slave cycle through the slaves in r

Re: [PATCH] net/bonding: send arp in interval if no active slave

2015-09-03 Thread Jay Vosburgh
Uwe Koziolek wrote: >On Tue, Sep 01, 2015 at 05:41 PM +0200, Andy Gospodarek wrote: >> On Mon, Aug 17, 2015 at 10:51:27PM +0200, Uwe Koziolek wrote: >>> On Mon, Aug 17, 2015 at 09:14PM +0200, Jay Vosburgh wrote: >>>> Uwe Koziolek wrote: >>>> >

Re: [PATCH] net/bonding: send arp in interval if no active slave

2015-09-03 Thread Jay Vosburgh
Uwe Koziolek <uwe.kozio...@redknee.com> wrote: >On Tue, Sep 01, 2015 at 05:41 PM +0200, Andy Gospodarek wrote: >> On Mon, Aug 17, 2015 at 10:51:27PM +0200, Uwe Koziolek wrote: >>> On Mon, Aug 17, 2015 at 09:14PM +0200, Jay Vosburgh wrote: >>>> Uwe Kozio

Re: [PATCH] net/bonding: send arp in interval if no active slave

2015-08-17 Thread Jay Vosburgh
itional calls >of bond_ab_arp_probe. >Now the retries are not only for an up bond available, they are also >implemented for a down bond. Does this delay failover or bringup on switches that are not "problematic"? I.e., if arp_interval is, say, 1000 (1 second), will th

Re: [PATCH] net/bonding: send arp in interval if no active slave

2015-08-17 Thread Jay Vosburgh
no chance to solve the problem. The num_grat_arp is only used, if a different slave is going active. But in our case, the bonding slaves are not going into the state active for a longer time. [jarod: manufacturing of changelog] CC: Jay Vosburgh j.vosbu...@gmail.com CC: Veaceslav Falico vfal

Re: [PATCH v3] bonding: "primary_reselect" with "failure" is not working properly

2015-07-07 Thread Jay Vosburgh
e for failover/reselection and current >active slave is still up. > >Signed-off-by: Mazhar Rana >Signed-off-by: Jay Vosburgh I don't believe I posted a Signed-off-by, so you really shouldn't include one for me without it being explicitly stated. In any event, I'm good

Re: [PATCH v3] bonding: primary_reselect with failure is not working properly

2015-07-07 Thread Jay Vosburgh
. Signed-off-by: Mazhar Rana mazhar.r...@cyberoam.com Signed-off-by: Jay Vosburgh j.vosbu...@gmail.com I don't believe I posted a Signed-off-by, so you really shouldn't include one for me without it being explicitly stated. In any event, I'm good with the patch, so: Signed-off-by: Jay

Re: [PATCH v2] bonding: "primary_reselect" with "failure" is not working properly

2015-07-03 Thread Jay Vosburgh
GMAIL wrote: >Hi Jay, > >On Friday 03 July 2015 02:12 AM, Jay Vosburgh wrote: > >> [ added netdev to cc ] >> >> Mazhar Rana wrote: >> >>> When "primary_reselect" is set to "failure", primary interface should >>&g

Re: [PATCH v2] bonding: primary_reselect with failure is not working properly

2015-07-03 Thread Jay Vosburgh
GMAIL ranamazh...@gmail.com wrote: Hi Jay, On Friday 03 July 2015 02:12 AM, Jay Vosburgh wrote: [ added netdev to cc ] Mazhar Rana ranamazh...@gmail.com wrote: When primary_reselect is set to failure, primary interface should not become active until current active slave is up

Re: [PATCH v2] bonding: "primary_reselect" with "failure" is not working properly

2015-07-02 Thread Jay Vosburgh
struct slave *slave, *bestslave = NULL; struct list_head *iter; int mintime = bond->params.updelay; - primary = rtnl_dereference(bond->primary_slave); - if (primary && primary->link == BOND_LINK_UP && - bond_should_change_active(bond)) -

Re: [PATCH v2] bonding: primary_reselect with failure is not working properly

2015-07-02 Thread Jay Vosburgh
(bond, slave, iter) { if (slave->link == BOND_LINK_UP) --- -Jay Vosburgh, jay.vosbu...@canonical.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-27 Thread Jay Vosburgh
Paul E. McKenney wrote: >On Sat, Oct 25, 2014 at 11:18:27AM -0700, Paul E. McKenney wrote: >> On Sat, Oct 25, 2014 at 09:38:16AM -0700, Jay Vosburgh wrote: >> > Paul E. McKenney wrote: >> > >> > >On Fri, Oct 24, 2014 at 09:33:33PM -0700, Jay Vosburgh wrot

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-27 Thread Jay Vosburgh
Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Sat, Oct 25, 2014 at 11:18:27AM -0700, Paul E. McKenney wrote: On Sat, Oct 25, 2014 at 09:38:16AM -0700, Jay Vosburgh wrote: Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Fri, Oct 24, 2014 at 09:33:33PM -0700, Jay Vosburgh

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-25 Thread Jay Vosburgh
Paul E. McKenney wrote: >On Fri, Oct 24, 2014 at 05:20:48PM -0700, Jay Vosburgh wrote: >> Paul E. McKenney wrote: >> >> >On Fri, Oct 24, 2014 at 03:59:31PM -0700, Paul E. McKenney wrote: >> [...] >> >> Hmmm... It sure looks like we have some callbacks

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-25 Thread Jay Vosburgh
Paul E. McKenney wrote: >On Fri, Oct 24, 2014 at 09:33:33PM -0700, Jay Vosburgh wrote: >> Looking at the dmesg, the early boot messages seem to be >> confused as to how many CPUs there are, e.g., >> >> [0.00] SLUB: HWalign=64, Order=0-3, Mi

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-25 Thread Jay Vosburgh
Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Fri, Oct 24, 2014 at 09:33:33PM -0700, Jay Vosburgh wrote: Looking at the dmesg, the early boot messages seem to be confused as to how many CPUs there are, e.g., [0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-25 Thread Jay Vosburgh
Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Fri, Oct 24, 2014 at 05:20:48PM -0700, Jay Vosburgh wrote: Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Fri, Oct 24, 2014 at 03:59:31PM -0700, Paul E. McKenney wrote: [...] Hmmm... It sure looks like we have some callbacks

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
U_UP_CANCELED_FROZEN: >- for_each_rcu_flavor(rsp) >+ for_each_rcu_flavor(rsp) { > rcu_cleanup_dead_cpu(cpu, rsp); >+ do_nocb_deferred_wakeup(per_cpu_ptr(rsp->rda, cpu)); >+ } > break; > default: &

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
Paul E. McKenney wrote: >On Fri, Oct 24, 2014 at 03:02:04PM -0700, Jay Vosburgh wrote: >> Paul E. McKenney wrote: >> [...] >> I've got an ftrace capture from unmodified -net, it looks like >> this: >> >> ovs-vswitchd-902 [000] 471.77844

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
e3c0 l:88013fc0e3c0 n:88013fc8e3c0 ... [ 360.496469] 1: 88013fc8e3c0 l:88013fc0e3c0 n: (null) .G. [ 360.503407] 2: 88013fd0e3c0 l:88013fd0e3c0 n:88013fd8e3c0 ... [ 360.510346] 3: 88013fd8e3c0 l:88013fd0e3c0 n: (null) ... --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
_cpu(cpu) { >+ if (!rcu_is_nocb_cpu(cpu)) >+ continue; >+ rdp = per_cpu_ptr(rsp->rda, cpu); >+ pr_alert("%3d: %p l:%p n:%p %c%c%c\n", >+ cpu, >+ rdp,

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
Thanx, Paul > >> > >> > >> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h >> > index 29fb23f33c18..927c17b081c7 100644 >> > --- a/kernel/rcu/tree_plug

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
Paul E. McKenney wrote: >On Thu, Oct 23, 2014 at 09:48:34PM -0700, Jay Vosburgh wrote: >> Paul E. McKenney wrote: [...] >> >Either way, my patch assumed that 39953dfd4007 (rcu: Avoid misordering in >> >__call_rcu_nocb_enqueue()) would work and that 1772947bd012 (rcu

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Thu, Oct 23, 2014 at 09:48:34PM -0700, Jay Vosburgh wrote: Paul E. McKenney paul...@linux.vnet.ibm.com wrote: [...] Either way, my patch assumed that 39953dfd4007 (rcu: Avoid misordering in __call_rcu_nocb_enqueue()) would work

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
= rdp_old_leader; } --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
mode for an * arbitrarily long period of time with the scheduling-clock tick turned --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
: 88013fc8e3c0 l:88013fc0e3c0 n: (null) .G. [ 360.503407] 2: 88013fd0e3c0 l:88013fd0e3c0 n:88013fd8e3c0 ... [ 360.510346] 3: 88013fd8e3c0 l:88013fd0e3c0 n: (null) ... --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Fri, Oct 24, 2014 at 03:02:04PM -0700, Jay Vosburgh wrote: Paul E. McKenney paul...@linux.vnet.ibm.com wrote: [...] I've got an ftrace capture from unmodified -net, it looks like this: ovs-vswitchd-902 [000

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-24 Thread Jay Vosburgh
) { rcu_cleanup_dead_cpu(cpu, rsp); + do_nocb_deferred_wakeup(per_cpu_ptr(rsp->rda, cpu)); + } break; default: break; --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-23 Thread Jay Vosburgh
g) >and c847f14217d5 (rcu: Avoid misordering in nocb_leader_wait())?
Just a note to add that I am also reliably inducing what appears to be this issue on a current -net tree, when configuring openvswitch via script. I am available to test patches or bisect tomorrow (Friday) US time if needed.
The stack is as follows:
[ 1320.492020] INFO: task ovs-vswitchd:1303 blocked for more than 120 seconds.
[ 1320.498965] Not tainted 3.17.0-testola+ #1
[ 1320.503570] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1320.511374] ovs-vswitchd D 88013fc14600 0 1303 1302 0x0004
[ 1320.511378] 8801388d77d8 0002 880031144b00 8801388d7fd8
[ 1320.511382] 00014600 00014600 8800b092e400 880031144b00
[ 1320.511385] 8800b1126000 81c58ad0 81c58ad8 7fff
[ 1320.511389] Call Trace:
[ 1320.511396] [] schedule+0x29/0x70
[ 1320.511399] [] schedule_timeout+0x1dc/0x260
[ 1320.511404] [] ? check_preempt_curr+0x8d/0xa0
[ 1320.511407] [] ? ttwu_do_wakeup+0x1d/0xd0
[ 1320.511410] [] wait_for_completion+0xa6/0x160
[ 1320.511413] [] ? wake_up_state+0x20/0x20
[ 1320.511417] [] _rcu_barrier+0x157/0x200
[ 1320.511419] [] rcu_barrier+0x15/0x20
[ 1320.511423] [] netdev_run_todo+0x60/0x300
[ 1320.511427] [] rtnl_unlock+0xe/0x10
[ 1320.511435] [] internal_dev_destroy+0x55/0x80 [openvswitch]
[ 1320.511440] [] ovs_vport_del+0x32/0x40 [openvswitch]
[ 1320.511444] [] ovs_dp_detach_port+0x30/0x40 [openvswitch]
[ 1320.511448] [] ovs_vport_cmd_del+0xc5/0x110 [openvswitch]
[ 1320.511452] [] genl_family_rcv_msg+0x1a5/0x3c0
[ 1320.511455] [] ? genl_family_rcv_msg+0x3c0/0x3c0
[ 1320.511458] [] genl_rcv_msg+0x91/0xd0
[ 1320.511461] [] netlink_rcv_skb+0xc1/0xe0
[ 1320.511463] [] genl_rcv+0x2c/0x40
[ 1320.511466] [] netlink_unicast+0xf6/0x200
[ 1320.511468] [] netlink_sendmsg+0x31d/0x780
[ 1320.511472] [] ? netlink_rcv_wake+0x44/0x60
[ 1320.511475] [] ? netlink_recvmsg+0x1d3/0x3e0
[ 1320.511479] [] sock_sendmsg+0x93/0xd0
[ 1320.511484] [] ? apparmor_file_alloc_security+0x20/0x40
[ 1320.511487] [] ? verify_iovec+0x47/0xd0
[ 1320.511491] [] ___sys_sendmsg+0x399/0x3b0
[ 1320.511495] [] ? kernfs_seq_stop_active+0x32/0x40
[ 1320.511499] [] ? native_sched_clock+0x35/0x90
[ 1320.511502] [] ? native_sched_clock+0x35/0x90
[ 1320.511505] [] ? sched_clock+0x9/0x10
[ 1320.511509] [] ? acct_account_cputime+0x1c/0x20
[ 1320.511512] [] ? account_user_time+0x8b/0xa0
[ 1320.511516] [] ? __fget_light+0x25/0x70
[ 1320.511519] [] __sys_sendmsg+0x42/0x80
[ 1320.511521] [] SyS_sendmsg+0x12/0x20
[ 1320.511525] [] tracesys_phase2+0xd8/0xdd
-J
---
-Jay Vosburgh, jay.vosbu...@canonical.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Re: localed stuck in recent 3.18 git in copy_net_ns?

2014-10-23 Thread Jay Vosburgh
] [8161d372] __sys_sendmsg+0x42/0x80 [ 1320.511521] [8161d3c2] SyS_sendmsg+0x12/0x20 [ 1320.511525] [8173e6a4] tracesys_phase2+0xd8/0xdd -J --- -Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1

2014-03-31 Thread Jay Vosburgh
d has been administratively set down and then back up. This effect should not occur when slaves are added while the bond is up; it's something that only happens after a down/up bounce of the bond. That said, the patch itself looks fine to me. Signed-off-by: Jay Vosburgh -J >Sign

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1

2014-03-31 Thread Jay Vosburgh
down and then back up. This effect should not occur when slaves are added while the bond is up; it's something that only happens after a down/up bounce of the bond. That said, the patch itself looks fine to me. Signed-off-by: Jay Vosburgh j.vosbu...@gmail.com -J Signed-off

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 in tlb and alb mode.

2014-03-27 Thread Jay Vosburgh
slave is active, other slaves are inactive. The "inactive" setting for alb is special, and means to not pass broadcast or multicast, but let unicast through. -J --- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1 in tlb and alb mode.

2014-03-27 Thread Jay Vosburgh
or multicast, but let unicast through. -J --- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1.

2014-03-21 Thread Jay Vosburgh
zheng.li wrote: >On 2014-03-21 01:02, Jay Vosburgh wrote: >> Zheng Li wrote: >> >>> Except bond mode 1, in other bond modes, inactive slaves should keep >>> inactive flag to 1 to refuse to receive broadcast packets. Now, active >>> slave send broadcas

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1.

2014-03-21 Thread Jay Vosburgh
zheng.li zheng.x...@oracle.com wrote: On 2014-03-21 01:02, Jay Vosburgh wrote: Zheng Li zheng.x...@oracle.com wrote: Except bond mode 1, in other bond modes, inactive slaves should keep inactive flag to 1 to refuse to receive broadcast packets. Now, active slave send broadcast packets

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1.

2014-03-20 Thread Jay Vosburgh
> BOND_SLAVE_NOTIFY_NOW); > } > } >-- >1.7.6.5 > --- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: [PATCH] bonding: Inactive slaves should keep inactive flag's value to 1.

2014-03-20 Thread Jay Vosburgh
--- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: https://lkml.org/lkml/2013/2/1/531

2013-05-22 Thread Jay Vosburgh
lly go through Davem; I can check the patch and repost it to netdev against 3.4.46 if everybody is ok with that. -J --- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: https://lkml.org/lkml/2013/2/1/531

2013-05-22 Thread Jay Vosburgh
with that. -J --- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: BUG: scheduling while atomic: ifup-bonding/3711/0x00000002 -- V3.6.7

2012-12-07 Thread Jay Vosburgh
rectly >from disk from doing so easily...grrr. > >Jay Vosburgh wrote: >> The miimon functionality is used to check link state and notice >> when slaves lose carrier. >--- > If I am running 'rr' on 2 channels -- specifically for the purpose >of link speed aggrega

Re: BUG: scheduling while atomic: ifup-bonding/3711/0x00000002 -- V3.6.7

2012-12-07 Thread Jay Vosburgh
directly from disk from doing so easily...grrr. Jay Vosburgh wrote: The miimon functionality is used to check link state and notice when slaves lose carrier. --- If I am running 'rr' on 2 channels -- specifically for the purpose of link speed aggregation (getting 1 20Gb channel out of 2 10Gb

Re: [PATCH] bonding: rlb mode of bond should not alter ARP originating via bridge

2012-11-29 Thread Jay Vosburgh
would render peers unable to >communicate with the destinations beyond the bridge. > >Signed-off-by: Zheng Li >Cc: Jay Vosburgh >Cc: Andy Gospodarek >Cc: "David S. Miller" Signed-off-by: Jay Vosburgh >--- > drivers/net/bonding/bond_alb.c |6 ++ > drivers/net

Re: [PATCH] bonding: rlb mode of bond should not alter ARP originating via bridge

2012-11-29 Thread Jay Vosburgh
with the destinations beyond the bridge. Signed-off-by: Zheng Li zheng.x...@oracle.com Cc: Jay Vosburgh fu...@us.ibm.com Cc: Andy Gospodarek a...@greyhouse.net Cc: David S. Miller da...@davemloft.net Signed-off-by: Jay Vosburgh fu...@us.ibm.com --- drivers/net/bonding/bond_alb.c |6 ++ drivers

Re: BUG: scheduling while atomic: ifup-bonding/3711/0x00000002 -- V3.6.7

2012-11-28 Thread Jay Vosburgh
[] >ixgbe_get_copper_link_capabilities_generic+0x2c/0x60 >[ 59.801433] [] ? bond_mii_monitor+0x2ed/0x640 >[ 59.801441] [] ixgbe_get_settings+0x34/0x2b0 >[ 59.801446] [] __ethtool_get_settings+0x85/0x140 >[ 59.801450] [] bond_update_speed_duplex+0x23/0x60 >[ 59.801471] [

Re: BUG: scheduling while atomic: ifup-bonding/3711/0x00000002 -- V3.6.7

2012-11-28 Thread Jay Vosburgh
] ? flush_kthread_worker+0x160/0x160 [ 59.801536] [816892e0] ? gs_change+0xb/0xb [ 59.804986] bonding: bond0: link status definitely up for interface p2p2, 1 Mbps full duplex. --- -Jay Vosburgh, IBM Linux Technology Center, fu...@us.ibm.com

Re: [PATCH] bonding: rlb mode of bond should not alter ARP originating via bridge

2012-11-27 Thread Jay Vosburgh
the changelog), but the ARP request behavior change is new >> with this version. >> >> Since prior versions of the patch didn't cause this code to be >> skipped, is this change intentional? >> >> Did you check to see if the above logic is nece

Re: [PATCH] bonding: rlb mode of bond should not alter ARP originating via bridge

2012-11-27 Thread Jay Vosburgh
is necessary for ARP requests passing through via a bridge to prevent peers from stacking (in terms of load balance assignment) on the active slave due to bridged ARP traffic? -J Signed-off-by: Zheng Li zheng.x...@oracle.com Cc: Jay Vosburgh fu...@us.ibm.com Cc: Andy Gospodarek

Re: [PATCH] bonding: in balance-rr mode, set curr_active_slave only if it is up

2012-11-26 Thread Jay Vosburgh
link down for >ARP monitor", this was masked by slaves always starting in UP >state with ARP monitor (and MII monitor not relying on >curr_active_slave being NULL if there is no slave up). > >Signed-off-by: Michal Kubecek Signed-off-by: Jay Vosburgh > drivers/net/bonding
