Add population of a new peer_ifindex column in the interface status of (Linux) veth devices. This allows applications like OVN to programmatically discover veth pairs without requiring additional system calls or manual configuration.
The implementation uses the existing ethtool infrastructure: - Uses ETHTOOL_GDRVINFO to get n_stats from the veth driver - Dynamically allocates buffer based on n_stats - Queries ETHTOOL_GSTATS to retrieve veth statistics - Extracts peer_ifindex (first statistic for veth devices) - Stores it in the status column as "peer_ifindex" The value is cached to avoid performance degradation with an increase on the number of ports. This feature can enable automatic veth peer discovery for routing daemon integrations in OVN, eliminating the need for manual port-mapping configuration when using veth pairs for BGP/BFD routing protocols. Signed-off-by: Matteo Perin <[email protected]> --- This is the first part of a rework of a recent contribution I made in an effort to simplify the OVN routing-protocol-redirect option setup on an LRP when using veth pairs to connect routing daemons to OVN. This requires supporting the lookup of the veth peers bound to the LSP referred to. Since this added capability requires a call to ethtool gstats, it is better suited to be called directly from the OVS code (using the infrastructure already in-place) instead of shoehorning it inside the OVN code (as in the original patch). The veth peer value can be cached inside the netdev status information (for veth devices) and then retrieved by OVN. As pointed out on the original contribution discussion, some issues about stale veth values may arise, though. Even if the peer ifindex does not change if the veth is moved to a different namespace, it can happen that a veth is removed and recreated later with the same name, causing the OVS get_status to return a stale ifindex value. Unfortunately, I have not been able to find a satisfactory way of handling such situation (at least not is userspace). While in kernel space we can rely on RTM_NEWLINK/RTM_DELLINK to invalidate VALID_DRVINFO and retrigger fetching the peer ifindex, in userspace the only way to achieve the same seems to actively check if the ifindex has changed. Of course this is not really possible since making a query to ethtool or get_ifindex on every call to get_status will slow down the kernel when a huge number of ports is present. I do not really know how to tackle this situation in an elegant way, maybe a good compromise would be documenting this limitation (the need to remove and recreate veths seems not a very common occurrence to me, after all)? This will allow us to avoid moving too much code around for only a mere quality-of-life improvement, but I will really appreciate if someone has any insights or suggestions on how to tackle this issue. Best Regards, Matteo lib/netdev-linux-private.h | 1 + lib/netdev-linux.c | 29 +++++++++++++++++++++++++++++ tests/system-interface.at | 34 ++++++++++++++++++++++++++++++++++ vswitchd/vswitch.xml | 19 +++++++++++++++++++ 4 files changed, 83 insertions(+) diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h index cf4a021d3..7755f061d 100644 --- a/lib/netdev-linux-private.h +++ b/lib/netdev-linux-private.h @@ -97,6 +97,7 @@ struct netdev_linux { uint8_t current_duplex; /* Cached from ETHTOOL_GSET. */ struct ethtool_drvinfo drvinfo; /* Cached from ETHTOOL_GDRVINFO. */ + uint64_t peer_ifindex; /* Cached from ETHTOOL_GSTATS (veth only). */ struct tc *tc; /* For devices of class netdev_tap_class only. */ diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index f4d2685a2..f86943439 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -3728,12 +3728,36 @@ netdev_linux_get_status(const struct netdev *netdev_, struct smap *smap) COVERAGE_INC(netdev_get_ethtool); memset(&netdev->drvinfo, 0, sizeof netdev->drvinfo); + netdev->peer_ifindex = 0; error = netdev_linux_do_ethtool(netdev->up.name, cmd, ETHTOOL_GDRVINFO, "ETHTOOL_GDRVINFO"); if (!error) { netdev->cache_valid |= VALID_DRVINFO; + + /* For veth devices, also get the peer_ifindex via ETHTOOL_GSTATS */ + if (!strcmp(netdev->drvinfo.driver, "veth") && netdev->drvinfo.n_stats > 0) { + /* Dynamically allocate based on n_stats from ETHTOOL_GDRVINFO. */ + size_t n_stats = netdev->drvinfo.n_stats; + size_t req_size = sizeof(struct ethtool_stats) + sizeof(__u64) * n_stats; + + struct ethtool_stats *req = xzalloc(req_size); + req->cmd = ETHTOOL_GSTATS; + req->n_stats = n_stats; + + int peer_error = netdev_linux_do_ethtool(netdev->up.name, + (struct ethtool_cmd *)req, + ETHTOOL_GSTATS, + "ETHTOOL_GSTATS"); + + /* peer_ifindex is the first stat for veth devices */ + if (!peer_error && req->n_stats > 0 && req->data[0] > 0) { + netdev->peer_ifindex = req->data[0]; + } + + free(req); + } } } @@ -3741,6 +3765,11 @@ netdev_linux_get_status(const struct netdev *netdev_, struct smap *smap) smap_add(smap, "driver_name", netdev->drvinfo.driver); smap_add(smap, "driver_version", netdev->drvinfo.version); smap_add(smap, "firmware_version", netdev->drvinfo.fw_version); + + /* For veth devices, report the cached peer_ifindex. */ + if (netdev->peer_ifindex > 0) { + smap_add_format(smap, "peer_ifindex", "%"PRIu64, netdev->peer_ifindex); + } } ovs_mutex_unlock(&netdev->mutex); diff --git a/tests/system-interface.at b/tests/system-interface.at index 15c4a0e2e..8db97b106 100644 --- a/tests/system-interface.at +++ b/tests/system-interface.at @@ -221,3 +221,37 @@ half OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP + +dnl Test that peer_ifindex is correctly reported for veth devices. +AT_SETUP([interface - veth peer_ifindex status]) +AT_SKIP_IF([test $HAVE_ETHTOOL = "no"]) + +OVS_TRAFFIC_VSWITCHD_START() + +dnl Create a veth pair: ovs-veth0 <-> ovs-veth1 +AT_CHECK([ip link add ovs-veth0 type veth peer name ovs-veth1]) +on_exit 'ip link del ovs-veth0' + +dnl Add ovs-veth0 to br0 +AT_CHECK([ovs-vsctl add-port br0 ovs-veth0]) + +dnl Get the actual ifindex of ovs-veth1 (the peer) and verify it's reported +PEER_IFINDEX=$(ip -j link show ovs-veth1 | sed -n 's/.*"ifindex":\([[0-9]]*\).*/\1/p') + +dnl Verify OVS reports the correct peer_ifindex +OVS_WAIT_UNTIL([ + OVS_PEER1=$(ovs-vsctl get interface ovs-veth0 status:peer_ifindex 2>/dev/null | tr -d '"') + test "$OVS_PEER1" = "$PEER_IFINDEX" +]) + +dnl Clean up +AT_CHECK([ip link del ovs-veth0]) + +OVS_TRAFFIC_VSWITCHD_STOP(["dnl +/could not open network device ovs-veth0/d +/cannot get .*STP status on nonexistent port/d +/ethtool command .*on network device ovs-veth0 failed/d +/error receiving .*ovs-veth0/d +/ovs-veth0: removing policing failed/d"]) + +AT_CLEANUP diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 29ad9c13e..e7fe5ba86 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -3770,6 +3770,25 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \ The version string of the network adapter's firmware, if available. </column> + <column name="status" key="peer_ifindex"> + <p> + For veth devices on Linux, this column contains the interface index + of the peer veth device. This allows applications to + programmatically discover veth pairs without requiring additional + system calls or parsing network interface information. + </p> + <p> + This value is obtained via the <code>ETHTOOL_GSTATS</code> ioctl, + which queries the veth driver statistics. The peer interface name + can be determined from this index using standard system functions + like <code>if_indextoname()</code>. + </p> + <p> + Only present for veth devices (when <ref column="status" + key="driver_name"/> is "veth"). Not available on non-Linux systems. + </p> + </column> + <column name="status" key="source_ip"> The source IP address used for an IPv4/IPv6 tunnel end-point, such as <code>gre</code>. -- 2.43.0 _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
