On 12/22/25 4:23 PM, Adrian Moreno via dev wrote:
> @@ -206,12 +206,20 @@ rtnetlink_parse_cb(struct ofpbuf *buf, void *change)
>   *
>   * xxx Joins more multicast groups when needed.
>   *
> + * Callbacks might find that netdev-linux netdevs still hold outdated cached
> + * information. If the notification has to trigger some kind of reconfiguration
> + * that requires up-to-date netdev cache, it should do it asynchronously, for
> + * instance by setting a flag in the callback and acting on it during the
> + * normal "*_run()" operation.
> + *
> + * Notifications might come from any network namespace.
> + *
>   * Returns an initialized nln_notifier if successful, NULL otherwise. */
>  struct nln_notifier *
>  rtnetlink_notifier_create(rtnetlink_notify_func *cb, void *aux)
>  {
>      if (!nln) {
> -        nln = nln_create(NETLINK_ROUTE, false, rtnetlink_parse_cb,
> +        nln = nln_create(NETLINK_ROUTE, true, rtnetlink_parse_cb,
>                           &rtn_change);
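
For reference, the deferred handling that the quoted comment recommends could
be sketched roughly as below.  This is only an illustration under assumed
names: example_change, example_change_cb() and example_run() are hypothetical
and are not the actual OVS types or functions.  The callback only records that
work is pending, and the periodic run step acts on it later, when cached
netdev state can be refreshed first.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a change notification; not the OVS struct. */
struct example_change {
    int if_index;
};

/* Set by the callback, consumed by the run step. */
static atomic_bool reconfigure_pending = false;

/* Counts how many times the run step actually did the work. */
static int reconfigure_count = 0;

/* Notification callback: must not rely on cached netdev state, so it only
 * records that a reconfiguration is needed. */
static void
example_change_cb(const struct example_change *change, void *aux)
{
    (void) change;
    (void) aux;
    atomic_store(&reconfigure_pending, true);
}

/* Periodic "*_run()"-style step: clears the flag and performs the deferred
 * work.  Returns true if any work was done. */
static bool
example_run(void)
{
    if (atomic_exchange(&reconfigure_pending, false)) {
        /* A real implementation would refresh its caches and reconfigure
         * here; the sketch only counts the invocation. */
        reconfigure_count++;
        return true;
    }
    return false;
}
```

Several notifications arriving between two run steps collapse into a single
reconfiguration, which is the main point of deferring the work.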

Hi, Adrian.  Thanks for all the work on the RTNL contention issues!

One big thing I do not like about this set, though, is the change to start
monitoring all namespaces here and in netdev-linux.  I don't think we should
be doing that, as it can have a potentially significant performance impact.
For instance, we can have a BGP daemon running in a separate namespace that
creates a ton of route updates, and OVS will receive all of them now that
it is subscribed.  Even parsing all those updates and discarding them as
irrelevant eats a noticeable amount of CPU resources.  And people may run
multiple BGP daemons per node (which may sound unreasonable, but they do...),
spamming OVS with all the updates.  I'm afraid that it may even increase the
contention on the locks inside the kernel in such cases, as notifications
from many namespaces start to be forwarded into a single socket.

All in all, I think we need to find a more fine-grained solution here
instead of blindly subscribing to all namespaces.

One other thing we could do is deprecate the "support" for moving internal
ports to different namespaces.  This doesn't really work across OVS restarts
anyway, and it was half-broken, as the first patch of this set reveals.

I believe the main user for this was OpenStack, as they moved the tap
interface for the DHCP agent into a different namespace where this agent was
running.  But IIRC, the default behavior changed to use veth pairs quite
some time ago.  I don't think we ever claimed that this scenario was supported,
but there might still be some users that rely on it somewhat working.  It may
still somewhat work even if we don't listen for updates...

WDYT?

Best regards, Ilya Maximets.