Hi Ilya,

Thank you very much for the review and suggestions, I will send an update
addressing your comments shortly.
I reworked it as a patch series since it now requires some additional
changes to other parts of the code not necessarily only related to iface
status storage.

You were right that we are actually already subscribed to
RTM_NEWLINK/RTM_DELLINK events, thank you for pointing that out.
The reason I could not get the VALID_DRVINFO invalidation to trigger was
because of an issue on how netns ids are handled for system devices.
I have made a commit to try and address this issue. I took what is probably
the most naive approach, but it is the one that needed the least additional
code (we can discuss the details and alternatives in the relevant thread).

Best Regards,
Matteo

On Sat, 24 Jan 2026 at 00:23, Ilya Maximets <[email protected]> wrote:

> On 1/22/26 1:43 PM, Matteo Perin via dev wrote:
> > Add population of a new peer_ifindex column in the interface
> > status of (Linux) veth devices. This allows applications like
> > OVN to programmatically discover veth pairs without requiring
> > additional system calls or manual configuration.
> >
> > The implementation uses the existing ethtool infrastructure:
> > - Uses ETHTOOL_GDRVINFO to get n_stats from the veth driver
> > - Dynamically allocates buffer based on n_stats
> > - Queries ETHTOOL_GSTATS to retrieve veth statistics
> > - Extracts peer_ifindex (first statistic for veth devices)
> > - Stores it in the status column as "peer_ifindex"
> >
> > The value is cached to avoid performance degradation with an
> > increase on the number of ports.
> >
> > This feature can enable automatic veth peer discovery for routing
> > daemon integrations in OVN, eliminating the need for manual
> > port-mapping configuration when using veth pairs for BGP/BFD
> > routing protocols.
> >
> > Signed-off-by: Matteo Perin <[email protected]>
>
> Hi, Matteo.  Thanks for the patch!  See some comments below.
>
> > ---
> > This is the first part of a rework of a recent contribution I made in an
> effort
> > to simplify the OVN routing-protocol-redirect option setup on an LRP
> when using veth pairs to
> > connect routing daemons to OVN. This requires supporting the lookup of
> the veth peers bound to
> > the LSP referred to.
> >
> > Since this added capability requires a call to ethtool gstats, it is
> better suited to be called
> > directly from the OVS code (using the infrastructure already in-place)
> instead of shoehorning it
> > inside the OVN code (as in the original patch). The veth peer value can
> be cached inside the
> > netdev status information (for veth devices) and then retrieved by OVN.
> >
> > As pointed out on the original contribution discussion, some issues
> about stale veth values may
> > arise, though. Even if the peer ifindex does not change if the veth is
> moved to a different
> > namespace, it can happen that a veth is removed and recreated later with
> the same name, causing
> > the OVS get_status to return a stale ifindex value.
> >
> > Unfortunately, I have not been able to find a satisfactory way of
> handling such situation (at least
> > not is userspace). While in kernel space we can rely on
> RTM_NEWLINK/RTM_DELLINK to invalidate
> > VALID_DRVINFO and retrigger fetching the peer ifindex, in userspace the
> only way to achieve the same
> > seems to actively check if the ifindex has changed. Of course this is
> not really possible since making
> > a query to ethtool or get_ifindex on every call to get_status will slow
> down the kernel when a huge
> > number of ports is present.
>
> We're actually already subscribed to RTM_NEWLINK/RTM_DELLINK events.
> See netdev_linux_run()->netdev_linux_update() that parses the updates
> received from the kernel.  VALID_DRVINFO already gets invalidated there
> when the device is deleted.
>
> > I do not really know how to tackle this situation in an elegant way,
> maybe a good compromise would
> > be documenting this limitation (the need to remove and recreate veths
> seems not a very common
> > occurrence to me, after all)?
> >
> > This will allow us to avoid moving too much code around for only a mere
> quality-of-life improvement,
> > but I will really appreciate if someone has any insights or suggestions
> on how to tackle this issue.
> >
> > Best Regards,
> > Matteo
> >
> >  lib/netdev-linux-private.h |  1 +
> >  lib/netdev-linux.c         | 34 ++++++++++++++++++++++++++++++++++
> >  tests/system-interface.at  | 34 ++++++++++++++++++++++++++++++++++
> >  vswitchd/vswitch.xml       | 19 +++++++++++++++++++
> >  4 files changed, 88 insertions(+)
> >
> > diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
> > index cf4a021d3..c86a68510 100644
> > --- a/lib/netdev-linux-private.h
> > +++ b/lib/netdev-linux-private.h
> > @@ -97,6 +97,7 @@ struct netdev_linux {
> >      uint8_t current_duplex;          /* Cached from ETHTOOL_GSET. */
> >
> >      struct ethtool_drvinfo drvinfo;  /* Cached from ETHTOOL_GDRVINFO. */
> > +    uint64_t peer_ifindex;           /* Cached from ETHTOOL_GSTATS (if
> veth). */
>
> This line is a bit too long.
>
> >      struct tc *tc;
> >
> >      /* For devices of class netdev_tap_class only. */
> > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > index f4d2685a2..dac9cfb7d 100644
> > --- a/lib/netdev-linux.c
> > +++ b/lib/netdev-linux.c
> > @@ -3728,12 +3728,41 @@ netdev_linux_get_status(const struct netdev
> *netdev_, struct smap *smap)
> >
> >          COVERAGE_INC(netdev_get_ethtool);
> >          memset(&netdev->drvinfo, 0, sizeof netdev->drvinfo);
> > +        netdev->peer_ifindex = 0;
> >          error = netdev_linux_do_ethtool(netdev->up.name,
> >                                          cmd,
> >                                          ETHTOOL_GDRVINFO,
> >                                          "ETHTOOL_GDRVINFO");
> >          if (!error) {
> >              netdev->cache_valid |= VALID_DRVINFO;
>
> Might be good to move this to the end of the 'if' body.
>
> > +
> > +            /* For veth devices, also get the peer_ifindex
> > +             * via ETHTOOL_GSTATS */
>
> The 'via ETHTOOL_GSTATS' seems unnecessary in the comment.  It's clear
> from the code.  Also, all sentences should end with a period.
>
> > +            if (!strcmp(netdev->drvinfo.driver, "veth") &&
> > +                netdev->drvinfo.n_stats > 0) {
> > +                /* Dynamically allocate based on n_stats
> > +                 * from ETHTOOL_GDRVINFO. */
>
> This comment is also fairly obvious from the code.
>
> > +                size_t n_stats = netdev->drvinfo.n_stats;
>
> May move this out of this 'if' statement and use already in the condition.
>
> > +                size_t req_size =
> > +                    sizeof(struct ethtool_stats) + sizeof(__u64) *
> n_stats;
> > +
> > +                struct ethtool_stats *req = xzalloc(req_size);
>
> We should avoid sizeof(type) if possible.  In this case, if you just define
> the variable and then do the assignment, we can write it like:
>
>                    req = xzalloc(sizeof *req + n_stats * sizeof
> req->data[0]);
>
> > +                req->cmd = ETHTOOL_GSTATS;
> > +                req->n_stats = n_stats;
> > +
>
> There are trailing whitespaces here and in other places in the patch.
> You can run checkpatch script before sending to catch this kind of
> problems.
>
> > +                int peer_error = netdev_linux_do_ethtool(netdev->
> up.name,
>
> You may define this variable at the top of the block and name it a little
> shorter like 's_error', so the indentation is easier to do.
>
> > +                                                (struct ethtool_cmd *)
> req,
> > +                                                ETHTOOL_GSTATS,
> > +                                                "ETHTOOL_GSTATS");
> > +
> > +                /* peer_ifindex is the first stat for veth devices */
>
> May put the 'peer_ifindex' into single quotes.  Also, period at the end.
>
> At the same time, it's not guaranteed that the peer_ifindex will
> alwasy be the first in the array.  We actually need to request the
> string set for the names and find the correct one in the set.
>
> We already have a function to get the string set.  See the
> netdev_linux_read_definitions().  It is tailored for ETH_SS_FEATURES,
> but it should be easy enough to allow it to query any type of the
> string set and then use here with the ETH_SS_STATS.  Then we can look
> for the peer ifindex in the names and choose the correct value.
>
> > +                if (!peer_error &&
> > +                    req->n_stats > 0 && req->data[0] > 0) {
>
> This can be a single line.
>
> > +                    netdev->peer_ifindex = req->data[0];
> > +                }
> > +
> > +                free(req);
> > +            }
> >          }
> >      }
> >
> > @@ -3741,6 +3770,11 @@ netdev_linux_get_status(const struct netdev
> *netdev_, struct smap *smap)
> >          smap_add(smap, "driver_name", netdev->drvinfo.driver);
> >          smap_add(smap, "driver_version", netdev->drvinfo.version);
> >          smap_add(smap, "firmware_version", netdev->drvinfo.fw_version);
> > +
> > +        /* For veth devices, report the cached peer_ifindex. */
>
> Also quotes.
>
> > +        if (netdev->peer_ifindex > 0) {
> > +            smap_add_format(smap, "peer_ifindex", "%"PRIu64,
> netdev->peer_ifindex);
>
> This line is too long.
>
> > +        }
> >      }
> >      ovs_mutex_unlock(&netdev->mutex);
> >
> > diff --git a/tests/system-interface.at b/tests/system-interface.at
> > index 15c4a0e2e..8db97b106 100644
> > --- a/tests/system-interface.at
> > +++ b/tests/system-interface.at
> > @@ -221,3 +221,37 @@ half
> >
> >  OVS_TRAFFIC_VSWITCHD_STOP
> >  AT_CLEANUP
> > +
> > +dnl Test that peer_ifindex is correctly reported for veth devices.
> > +AT_SETUP([interface - veth peer_ifindex status])
> > +AT_SKIP_IF([test $HAVE_ETHTOOL = "no"])
> > +
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +dnl Create a veth pair: ovs-veth0 <-> ovs-veth1
>
> Period.   Same for all other comments here.
>
> > +AT_CHECK([ip link add ovs-veth0 type veth peer name ovs-veth1])
> > +on_exit 'ip link del ovs-veth0'
> > +
> > +dnl Add ovs-veth0 to br0
> > +AT_CHECK([ovs-vsctl add-port br0 ovs-veth0])
> > +
> > +dnl Get the actual ifindex of ovs-veth1 (the peer) and verify it's
> reported
> > +PEER_IFINDEX=$(ip -j link show ovs-veth1 | sed -n
> 's/.*"ifindex":\([[0-9]]*\).*/\1/p')
>
> I'm not sure if we can rely on the JSON output.  It's not a particularly
> recent thing, but still some systems may not support it.  Better if we
> get it from a standard output.
>
> > +
> > +dnl Verify OVS reports the correct peer_ifindex
> > +OVS_WAIT_UNTIL([
> > +    OVS_PEER1=$(ovs-vsctl get interface ovs-veth0 status:peer_ifindex
> 2>/dev/null | tr -d '"')
> > +    test "$OVS_PEER1" = "$PEER_IFINDEX"
> > +])
>
> This should be OVS_WAIT_UNTIL_EQUAL.
>
> > +
> > +dnl Clean up
> > +AT_CHECK([ip link del ovs-veth0])
>
> No need to do that, the on_exit hook will delete the port.
>
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP(["dnl
> > +/could not open network device ovs-veth0/d
> > +/cannot get .*STP status on nonexistent port/d
> > +/ethtool command .*on network device ovs-veth0 failed/d
> > +/error receiving .*ovs-veth0/d
> > +/ovs-veth0: removing policing failed/d"])
> > +
> > +AT_CLEANUP
> > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > index 29ad9c13e..e7fe5ba86 100644
> > --- a/vswitchd/vswitch.xml
> > +++ b/vswitchd/vswitch.xml
> > @@ -3770,6 +3770,25 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
> type=patch options:peer=p1 \
> >          The version string of the network adapter's firmware, if
> available.
> >        </column>
> >
> > +      <column name="status" key="peer_ifindex">
> > +        <p>
> > +          For veth devices on Linux, this column contains the interface
> index
> > +          of the peer veth device. This allows applications to
>
> Double spaces between sentences.  Here and below.
>
> > +          programmatically discover veth pairs without requiring
> additional
> > +          system calls or parsing network interface information.
> > +        </p>
> > +        <p>
> > +          This value is obtained via the <code>ETHTOOL_GSTATS</code>
> ioctl,
> > +          which queries the veth driver statistics. The peer interface
> name
> > +          can be determined from this index using standard system
> functions
> > +          like <code>if_indextoname()</code>.
>
> This paragraph seems unnecessary.
>
> > +        </p>
> > +        <p>
> > +          Only present for veth devices (when <ref column="status"
> > +          key="driver_name"/> is "veth"). Not available on non-Linux
> systems.
> > +        </p>
> > +      </column>
> > +
> >        <column name="status" key="source_ip">
> >          The source IP address used for an IPv4/IPv6 tunnel end-point,
> such as
> >          <code>gre</code>.
>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to