> On May 26, 2022, at 9:11 PM, Jon Kohler <[email protected]> wrote:
> 
> For netdev_linux_update_via_netlink(), hint to the kernel that
> we do not need it to gather netlink internal stats when we want
> to update the netlink flags, as those stats are not rendered
> within OVS.
> 
> Background:
> ovs-vswitchd can spend quite a bit of time blocked by the kernel
> during netlink calls, especially on systems with many cores. This
> time is dominated by the kernel-side internal stats gathering
> mechanism in netlink, specifically:
>  inet6_fill_link_af
>    inet6_fill_ifla6_attrs
>      __snmp6_fill_stats64
> 
> In Linux 4.4+, there exists a hint for netlink requests to not
> trigger the ipv6 stats gathering mechanism, which greatly reduces
> the amount of time that ovs-vswitchd is on CPU.
> 
> Testing and Results:
> Tested by booting 320 VMs and measuring OVS utilization with perf
> record, visualized as a flamegraph, using a patched version of
> OVS 2.14.2. Calls under bridge_run() seem to get hit the worst
> by this issue.
> 
> Before bridge_run() == 11.3% of samples
> After bridge_run() == 3.4% of samples
> 
> Note that there are at least two observed netlink calls under
> bridge_run that are still kernel stats heavy after this patch:
> 
> Call 1:
>  bridge_run -> netdev_run -> route_table_run -> route_table_reset ->
>    ovs_router_insert -> ovs_router_insert__ -> get_src_addr ->
>      netdev_get_addr_list -> netdev_linux_get_addr_list -> getifaddrs
> 
> Since the actual netlink call is coming from getifaddrs() in glibc,
> fixing it would likely involve either duplicating glibc code in the
> OVS source or patching glibc.
> 
> Call 2:
>  bridge_run -> iface_refresh_stats -> netdev_get_stats ->
>    netdev_linux_get_stats -> get_stats_via_netlink
> 
> This does use netlink-based stats; however, it isn't immediately
> clear whether just dropping the stats from inet6_fill_link_af would
> impact anything. Given that this call is more intermittent, it's
> of lesser concern.
> 
> Signed-off-by: Jon Kohler <[email protected]>
> Acked-by: Greg Smith <[email protected]>

Gentle bump

> ---
> lib/netdev-linux.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
> 
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index 2766b3f2bf..f0246d3b2b 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -247,6 +247,12 @@ enum {
>     VALID_NUMA_ID           = 1 << 8,
> };
> 
> +/* Linux 4.4 introduced the ability to skip the internal stats gathering
> + * that netlink does via an external filter mask that can be passed into
> + * a netlink request.
> + */
> +#define      RTEXT_FILTER_SKIP_STATS (1 << 3)
> +
> /* Use one for the packet buffer and another for the aux buffer to receive
>  * TSO packets. */
> #define IOV_STD_SIZE 1
> @@ -6418,6 +6424,9 @@ netdev_linux_update_via_netlink(struct netdev_linux *netdev)
>     if (netdev_linux_netnsid_is_remote(netdev)) {
>         nl_msg_put_u32(&request, IFLA_IF_NETNSID, netdev->netnsid);
>     }
> +
> +    nl_msg_put_u32(&request, IFLA_EXT_MASK, RTEXT_FILTER_SKIP_STATS);
> +
>     error = nl_transact(NETLINK_ROUTE, &request, &reply);
>     ofpbuf_uninit(&request);
>     if (error) {
> -- 
> 2.30.1 (Apple Git-130)
> 
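
For anyone unfamiliar with the hint described in the commit message, here
is a minimal standalone sketch of what the request looks like on the wire.
It does not use the OVS netlink helpers: the function name
build_getlink_skip_stats() is hypothetical and exists only for
illustration, and the layout assumes a plain RTM_GETLINK request with a
single IFLA_EXT_MASK attribute. The patch itself only adds the attribute
to the request that netdev_linux_update_via_netlink() already builds.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Older kernel headers (before Linux 4.4) do not define this flag. */
#ifndef RTEXT_FILTER_SKIP_STATS
#define RTEXT_FILTER_SKIP_STATS (1 << 3)
#endif

/* Hypothetical helper, not part of OVS: builds an RTM_GETLINK request for
 * 'ifindex' in 'buf', asking the kernel to skip stats gathering.
 * 'buf' must be suitably aligned for struct nlmsghdr.
 * Returns the message length, or 0 if 'size' is too small. */
static size_t
build_getlink_skip_stats(void *buf, size_t size, int ifindex)
{
    size_t total = NLMSG_SPACE(sizeof(struct ifinfomsg))
                   + RTA_SPACE(sizeof(uint32_t));
    if (size < total) {
        return 0;
    }
    memset(buf, 0, total);

    /* Netlink header followed by the ifinfomsg payload. */
    struct nlmsghdr *nlh = buf;
    nlh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    nlh->nlmsg_type = RTM_GETLINK;
    nlh->nlmsg_flags = NLM_F_REQUEST;

    struct ifinfomsg *ifi = NLMSG_DATA(nlh);
    ifi->ifi_family = AF_UNSPEC;
    ifi->ifi_index = ifindex;

    /* Append IFLA_EXT_MASK carrying the skip-stats hint. */
    struct rtattr *rta = (struct rtattr *) ((char *) nlh
                                            + NLMSG_ALIGN(nlh->nlmsg_len));
    uint32_t mask = RTEXT_FILTER_SKIP_STATS;
    rta->rta_type = IFLA_EXT_MASK;
    rta->rta_len = RTA_LENGTH(sizeof mask);
    memcpy(RTA_DATA(rta), &mask, sizeof mask);

    nlh->nlmsg_len = NLMSG_ALIGN(nlh->nlmsg_len) + RTA_ALIGN(rta->rta_len);
    return nlh->nlmsg_len;
}
```

The resulting buffer could then be sent on a socket opened with
socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE). Kernels that predate the flag
should simply ignore the unknown bit in the mask, though that is an
assumption worth verifying on the oldest kernel you care about.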

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
