On 07/04 , Roi Dayan wrote:
>
>
> On 2022-06-30 11:34 AM, Tao Liu wrote:
> > When netdev first opened by netdev_open(..., NULL, ...), netdev_class sets
> > to system by default, and auto_classified sets to true.
> >
> > If netdev reopens by netdev_open(..., "system", ...), auto_classified
> > should be cleared. This will be used in next patch to fix lag issue.
> >
> > Fixes: 8c2c225e481d ("netdev: Fix netdev_open() to track and recreate
> > classless interfaces")
> > Signed-off-by: Tao Liu <[email protected]>
> > ---
> > lib/netdev.c | 41 +++++++++++++++++++++++------------------
> > 1 file changed, 23 insertions(+), 18 deletions(-)
> >
> > diff --git a/lib/netdev.c b/lib/netdev.c
> > index 8305f6c..b97408b 100644
> > --- a/lib/netdev.c
> > +++ b/lib/netdev.c
> > @@ -387,25 +387,30 @@ netdev_open(const char *name, const char *type,
> > struct netdev **netdevp)
> > ovs_mutex_lock(&netdev_mutex);
> > netdev = shash_find_data(&netdev_shash, name);
> > - if (netdev &&
> > - type && type[0] && strcmp(type, netdev->netdev_class->type)) {
> > -
> > - if (netdev->auto_classified) {
> > - /* If this device was first created without a classification
> > type,
> > - * for example due to routing or tunneling code, and they keep
> > a
> > - * reference, a "classified" call to open will fail. In this
> > case
> > - * we remove the classless device, and re-add it below. We
> > remove
> > - * the netdev from the shash, and change the sequence, so
> > owners of
> > - * the old classless device can release/cleanup. */
> > - if (netdev->node) {
> > - shash_delete(&netdev_shash, netdev->node);
> > - netdev->node = NULL;
> > - netdev_change_seq_changed(netdev);
> > - }
> > + if (netdev && type && type[0]) {
> > + if (strcmp(type, netdev->netdev_class->type)) {
> > +
> > + if (netdev->auto_classified) {
> > + /* If this device was first created without a
> > classification
> > + * type, for example due to routing or tunneling code, and
> > they
> > + * keep a reference, a "classified" call to open will fail.
> > + * In this case we remove the classless device, and re-add
> > it
> > + * below. We remove the netdev from the shash, and change
> > the
> > + * sequence, so owners of the old classless device can
> > + * release/cleanup. */
> > + if (netdev->node) {
> > + shash_delete(&netdev_shash, netdev->node);
> > + netdev->node = NULL;
> > + netdev_change_seq_changed(netdev);
> > + }
> > - netdev = NULL;
> > - } else {
> > - error = EEXIST;
> > + netdev = NULL;
> > + } else {
> > + error = EEXIST;
> > + }
> > + } else if (netdev->auto_classified) {
> > + /* If netdev reopens of type "system", clear auto_classified */
> > + netdev->auto_classified = false;
> > }
> > }
>
> this fix you issue with lag?
Yes, it is.
> netdev_open() is not being called more than once?
This is my debug log:
2022-07-05T01:44:23.038Z|00377|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.038Z|00378|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.038Z|00379|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.038Z|00380|netdev_linux|INFO|lag: ifname=bond0,
if_index=27, sub=, nlmsg_type=16
2022-07-05T01:44:23.038Z|00381|netdev_linux|INFO|lag: ifname=net0, if_index=2,
sub=, nlmsg_type=16
2022-07-05T01:44:23.038Z|00382|netdev_linux|INFO|lag: ifname=net0, if_index=2,
sub=, nlmsg_type=16
2022-07-05T01:44:23.038Z|00383|netdev_linux|INFO|lag: ifname=net0, if_index=2,
sub=, nlmsg_type=16
2022-07-05T01:44:23.038Z|00384|netdev_linux|INFO|lag: ifname=net0, if_index=2,
sub=bond, nlmsg_type=16 ---> enslave net0
2022-07-05T01:44:23.038Z|00385|netdev_linux|INFO|lag: master_ifindex=27
2022-07-05T01:44:23.038Z|00386|netdev_linux|INFO|lag: master_name=bond0
2022-07-05T01:44:23.038Z|00387|netdev_linux|INFO|lag: auto_classified=1
2022-07-05T01:44:23.038Z|00388|netdev_linux|INFO|lag: ifname=bond0,
if_index=27, sub=, nlmsg_type=16
2022-07-05T01:44:23.038Z|00389|netdev_linux|INFO|lag: ifname=bond0,
if_index=27, sub=, nlmsg_type=16
2022-07-05T01:44:23.039Z|00390|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.039Z|00391|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.039Z|00392|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.046Z|00393|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.046Z|00394|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.046Z|00395|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.046Z|00396|netdev_linux|INFO|lag: ifname=net1, if_index=3,
sub=, nlmsg_type=16
2022-07-05T01:44:23.046Z|00397|netdev_linux|INFO|lag: ifname=net1, if_index=3,
sub=, nlmsg_type=16
2022-07-05T01:44:23.046Z|00398|netdev_linux|INFO|lag: ifname=net1, if_index=3,
sub=, nlmsg_type=16
2022-07-05T01:44:23.046Z|00399|netdev_linux|INFO|lag: ifname=net1, if_index=3,
sub=bond, nlmsg_type=16 ---> enslave net1
2022-07-05T01:44:23.046Z|00400|netdev_linux|INFO|lag: master_ifindex=27
2022-07-05T01:44:23.046Z|00401|netdev_linux|INFO|lag: master_name=bond0
2022-07-05T01:44:23.046Z|00402|netdev_linux|INFO|lag: auto_classified=1
2022-07-05T01:44:23.046Z|00403|netdev_linux|INFO|lag: ifname=bond0,
if_index=27, sub=, nlmsg_type=16
2022-07-05T01:44:23.046Z|00404|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.046Z|00405|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:23.047Z|00406|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:28.057Z|00407|netdev|INFO|open netdev bond0 of type system
2022-07-05T01:44:28.058Z|00408|netdev_offload_tc|INFO|added ingress qdisc to
bond0
2022-07-05T01:44:28.058Z|00409|netdev_offload|INFO|bond0: Assigned flow API
'linux_tc'.
2022-07-05T01:44:28.058Z|00410|netdev|INFO|open netdev bond0 of type system
2022-07-05T01:44:28.058Z|00411|bridge|INFO|bridge br0: added interface bond0 on
port 7 ---> add-port bond0
2022-07-05T01:44:28.058Z|00412|bridge|INFO|bridge br0: using datapath ID
000094292f70b6ad
2022-07-05T01:44:28.058Z|00413|netdev_linux|INFO|lag: ifname=bond0,
if_index=27, sub=openvswitch, nlmsg_type=16
2022-07-05T01:44:28.058Z|00414|netdev_linux|INFO|lag: ifname=net0, if_index=2,
sub=bond, nlmsg_type=16
2022-07-05T01:44:28.058Z|00415|netdev_linux|INFO|lag: master_ifindex=27
2022-07-05T01:44:28.058Z|00416|netdev_linux|INFO|lag: master_name=bond0
2022-07-05T01:44:28.058Z|00417|netdev_linux|INFO|lag: auto_classified=0
2022-07-05T01:44:28.059Z|00418|netdev_linux|INFO|lag: ifname=net1, if_index=3,
sub=bond, nlmsg_type=16
2022-07-05T01:44:28.059Z|00419|netdev_linux|INFO|lag: master_ifindex=27
2022-07-05T01:44:28.059Z|00420|netdev_linux|INFO|lag: master_name=bond0
2022-07-05T01:44:28.059Z|00421|netdev_linux|INFO|lag: auto_classified=0
2022-07-05T01:44:28.059Z|00422|netdev_linux|INFO|lag: ifname=bond0,
if_index=27, sub=openvswitch, nlmsg_type=16
2022-07-05T01:44:28.059Z|00423|netdev_linux|INFO|lag: ifname=br0, if_index=24,
sub=, nlmsg_type=16
2022-07-05T01:44:28.059Z|00424|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:28.059Z|00425|netdev|INFO|open netdev bond0 of type system
2022-07-05T01:44:28.060Z|00426|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:28.060Z|00427|netdev|INFO|open netdev bond0 of type system
2022-07-05T01:44:28.060Z|00428|netdev|INFO|open netdev bond0 of type
2022-07-05T01:44:28.060Z|00429|netdev|INFO|open netdev bond0 of type system
2022-07-05T01:44:28.061Z|00430|netdev|INFO|open netdev bond0 of type
The debug print:
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index b9e0c99..aad063a 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -674,14 +674,18 @@ netdev_linux_update_lag(struct rtnetlink_change *change)
uint32_t block_id;
int error = 0;
+ VLOG_INFO("lag: master_ifindex=%d", change->master_ifindex);
if (!if_indextoname(change->master_ifindex, master_name)) {
return;
}
+
+ VLOG_INFO("lag: master_name=%s", master_name);
master_netdev = netdev_from_name(master_name);
if (!master_netdev) {
return;
}
+ VLOG_INFO("lag: auto_classified=%d",
master_netdev->auto_classified);
/* If bond master is not attached to ovs, ingress block on * slaves
* shoud not be updated. */
if (!master_netdev->auto_classified &&
@@ -768,6 +772,7 @@ netdev_linux_run(const struct netdev_class *netdev_class
OVS_UNUSED)
ovs_mutex_unlock(&netdev->mutex);
}
+ VLOG_INFO("lag: ifname=%s, if_index=%d, sub=%s,
nlmsg_type=%d", change.ifname?:"", change.if_index, change.sub?:"",
change.nlmsg_type);
if (change.ifname &&
rtnetlink_type_is_rtnlgrp_link(change.nlmsg_type)) {
diff --git a/lib/netdev.c b/lib/netdev.c
index b97408b..6f602cb 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -384,6 +384,9 @@ netdev_open(const char *name, const char *type, struct
netdev **netdevp)
netdev_initialize();
+ if (!strcmp("bond0", name))
+ VLOG_INFO("open netdev %s of type %s", name, type?:"");
+
ovs_mutex_lock(&netdev_mutex);
netdev = shash_find_data(&netdev_shash, name);
>
> i'll take a little bit more this week and test a bit before acking.
>
OTOH, RTM_NEWLINK msg also sends more than once. The last msg is triggered
by add-port. Kernel calltrace:
rtnl_fill_ifinfo
rtmsg_ifinfo_build_skb
rtmsg_ifinfo_event.part.44
rtmsg_ifinfo
__dev_notify_flags
__dev_set_promiscuity
dev_set_promiscuity
bond_change_rx_flags
__dev_set_promiscuity
dev_set_promiscuity
ovs_netdev_link
ovs_vport_add
new_vport
ovs_vport_cmd_new
genl_family_rcv_msg_doit.isra.18
genl_rcv_msg
netlink_rcv_skb
genl_rcv
netlink_unicast
netlink_sendmsg
sock_sendmsg
____sys_sendmsg
___sys_sendmsg
__sys_sendmsg
do_syscall_64
entry_SYSCALL_64_after_hwframe
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev