On 3/3/26 6:38 PM, Ilya Maximets wrote:
> On 2/24/26 11:06 AM, Adrián Moreno wrote:
>> On Thu, Feb 19, 2026 at 05:22:20PM +0100, Adrián Moreno wrote:
>>> On Thu, Feb 19, 2026 at 02:51:18PM +0100, Ilya Maximets wrote:
>>>> On 12/22/25 4:23 PM, Adrian Moreno via dev wrote:
>>>>> @@ -206,12 +206,20 @@ rtnetlink_parse_cb(struct ofpbuf *buf, void *change)
>>>>>   *
>>>>>   * xxx Joins more multicast groups when needed.
>>>>>   *
>>>>> + * Callbacks might find that netdev-linux netdevs still hold outdated cached
>>>>> + * information. If the notification has to trigger some kind of reconfiguration
>>>>> + * that requires up-to-date netdev cache, it should do it asynchronously, for
>>>>> + * instance by setting a flag in the callback and acting on it during the
>>>>> + * normal "*_run()" operation.
>>>>> + *
>>>>> + * Notifications might come from any network namespace.
>>>>> + *
>>>>>   * Returns an initialized nln_notifier if successful, NULL otherwise. */
>>>>>  struct nln_notifier *
>>>>>  rtnetlink_notifier_create(rtnetlink_notify_func *cb, void *aux)
>>>>>  {
>>>>>      if (!nln) {
>>>>> -        nln = nln_create(NETLINK_ROUTE, false, rtnetlink_parse_cb,
>>>>> +        nln = nln_create(NETLINK_ROUTE, true, rtnetlink_parse_cb,
>>>>>                           &rtn_change);
>>>>
>>>> Hi, Adrian.  Thanks for all the work on the RTNL contention issues!
>>>>
>>>> One big thing I do not like about this set though is this change to start
>>>> monitoring all namespaces here and in netdev-linux.  I don't think we should
>>>> be doing that as this can have a potentially significant performance impact.
>>>> For instance, we can have a BGP daemon running in a separate namespace that
>>>> will create a ton of route updates and OVS will receive all of them now that
>>>> it is subscribed.  Even parsing all those updates and discarding as irrelevant
>>>> eats a noticeable amount of CPU resources.  And people may run multiple BGP
>>>> daemons per node (which may sound unreasonable, but they do...) spamming OVS
>>>> with all the updates.  I'm afraid that it may even increase the contention
>>>> on the locks inside the kernel in such cases as notifications from many
>>>> namespaces start to be forwarded into a single socket.
>>>>
>>>> All in all, I think, we need to find a more fine-grained solution here
>>>> instead of a blind subscription to all namespaces.
>>>
>>> Thanks for the feedback. It's truly something we should think about.
>>>
>>> IIUC, the current series only subscribes to all namespaces for RTNLGRP_LINK.
>>> Routes are being monitored by the specific nln notifier that
>>> route-table.c creates in route_table_init() which does not subscribe to
>>> route events in other namespaces.

The part of the code above that I replied to here is turning on monitoring of
all namespaces for NETLINK_ROUTE, unless I'm missing something.  We monitor
all namespaces for RTNLGRP_LINK today, so that's not a big problem.

>>>
>>> I need to double check the impact of having NETLINK_LISTEN_ALL_NSID set
>>> on a socket that is only subscribed to a "smaller" set of events, but
>>> AFAICS, OVS would only be notified of and parse link events in all
>>> namespaces. Links coming, going or changing "should" happen less often
>>> than routes being created in a multi-FRR setup.
>>>
>>>>
>>>> One other thing we could do is to deprecate the "support" for moving internal
>>>> ports to different namespaces.  This doesn't really work with OVS restarts
>>>> anyway.  And was half-broken as the first patch of this set reveals.
>>>>
>>>> I believe the main user for this was OpenStack, as they moved the tap
>>>> interface for the DHCP agent into a different namespace where this agent was
>>>> running.
> 
> A correction on this statement:  The functionality is actually explicitly
> not about tap interfaces.  For tap ports, the nsid is always set to local.
> The feature is designed for moving internal ports in the kernel datapath
> into a different namespace.  I have personally never seen this being used.
> And I'm not sure if that even works...  Need to try.

So I tried this and it does work.  We also get the link state updates from
the other namespace.  Though I still see no actual benefits of this kind
of setup.  The veth also seems to be faster out of the box.

> 
>>>> But IIRC, the default behavior has changed to use veth pairs quite
>>>> some time ago.  I don't think we claimed that this scenario was ever supported,
>>>> but there might be still some users that rely on it somewhat working.  It may
>>>> still somewhat work even if we don't listen for updates...
>>>>
>>>> WDYT?
>>>
>>> I cannot say if there are users depending on this. I did test it
>>> superficially as part of the work in this series and it seemed to work.
>>>
>>> If we do decide to deprecate this, I guess we should at least start
>>> warning first for a release or two and then remove it?
> 
> We typically wait for 1 release cycle, unless it's something big, then
> we try to time the removal with the first release after a new LTS.
> 
>>> In which case,
>>> maybe the work in this series should focus on protecting OVS from
>>> netlink notification storms and when/if we finally deprecate,
>>> we can simplify the entire infrastructure?
>>>
>>
>> Following this train of thought.
>>
>> We don't have a flag to enable this feature so it would be currently
>> difficult to detect its use in order to emit a deprecation warning.
>>
>> What if we:
>> - Allow users of rtnetlink to set NETLINK_LISTEN_ALL_NSID dynamically
>> - Not set it initially
>> - Whenever we detect a vport moving to another netns (in
>>   netdev_linux_netnsid_update__), we emit a deprecation warning and
>>   dynamically set NETLINK_LISTEN_ALL_NSID to true.
>> - While at it, we can keep a list of "interested namespaces" to monitor,
>>   pass it to rtnetlink and make it filter notifications from other
>>   netns before parsing events.
>>
>> This would optimize the common case and at the same time we warn users
>> the feature will be deprecated.
>> WDYT?
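
For illustration only, the bookkeeping part of the proposal quoted above
could look roughly like the sketch below.  This is not OVS code: the type
nsid_set, the helpers nsid_set_init(), nsid_set_add() and
nsid_filter_accepts(), the NSID_LOCAL value and the fixed NSID_SET_MAX size
are all hypothetical names invented for this example.  The listen_all flag
only models the point at which a caller would emit the deprecation warning
and enable NETLINK_LISTEN_ALL_NSID on the socket; no socket is touched here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker for "notification came from the local netns". */
#define NSID_LOCAL (-1)
/* Hypothetical upper bound on tracked remote namespaces. */
#define NSID_SET_MAX 16

/* Hypothetical set of remote namespace ids we care about. */
struct nsid_set {
    int ids[NSID_SET_MAX];
    size_t n;
    bool listen_all;    /* Would mirror NETLINK_LISTEN_ALL_NSID state. */
};

static void
nsid_set_init(struct nsid_set *s)
{
    s->n = 0;
    s->listen_all = false;   /* Not set initially, as proposed. */
}

static bool
nsid_set_contains(const struct nsid_set *s, int nsid)
{
    for (size_t i = 0; i < s->n; i++) {
        if (s->ids[i] == nsid) {
            return true;
        }
    }
    return false;
}

/* Records that a vport moved to remote namespace 'nsid'.  Sets
 * '*first_remote' to true only for the very first remote namespace, which
 * is where a caller would warn and turn on all-namespace listening.
 * Returns false if the set is full. */
static bool
nsid_set_add(struct nsid_set *s, int nsid, bool *first_remote)
{
    *first_remote = false;
    if (nsid_set_contains(s, nsid)) {
        return true;
    }
    if (s->n == NSID_SET_MAX) {
        return false;
    }
    s->ids[s->n++] = nsid;
    if (!s->listen_all) {
        s->listen_all = true;
        *first_remote = true;
    }
    return true;
}

/* Cheap pre-parse filter: local notifications always pass, remote ones
 * only if their namespace was registered. */
static bool
nsid_filter_accepts(const struct nsid_set *s, int nsid)
{
    return nsid == NSID_LOCAL || nsid_set_contains(s, nsid);
}
```

A caller would check nsid_filter_accepts() before parsing each message, so
that updates from uninteresting namespaces are dropped without the parsing
cost mentioned earlier in the thread.
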
> 
> Sounds interesting.  Not sure how much code that would be, but may be
> worth exploring.  Of course, in case the functionality is even working
> (the internal ports with the kernel datapath).
> 
> Best regards, Ilya Maximets.
