On 3/20/2025 2:17 AM, Nikolay Aleksandrov wrote:
On 3/19/25 00:42, Joseph Huang wrote:
Currently the bridge does not provide real-time feedback to user space
on whether or not an attempt to offload an mdb entry was successful.
This patch set adds support to notify user space about successful and
failed offload attempts, and the behavior is controlled by a new knob
mdb_notify_on_flag_change:
0 - the bridge will not notify user space about MDB flag change
1 - the bridge will notify user space about flag change if either
MDB_PG_FLAGS_OFFLOAD or MDB_PG_FLAGS_OFFLOAD_FAILED has changed
2 - the bridge will notify user space about flag change only if
MDB_PG_FLAGS_OFFLOAD_FAILED has changed
The default value is 0.
A break-down of the patches in the series:
Patch 1 adds an offload failed flag to indicate that the offload attempt
has failed. The flag is reflected in netlink mdb entry flags.
Patch 2 adds the knob mdb_notify_on_flag_change, and notifies user space
accordingly in br_switchdev_mdb_complete() when the result is known.
Patch 3 adds a netlink interface to manipulate the
mdb_notify_on_flag_change knob.
This patch set was inspired by the patch series "Add support for route
offload failure notifications" discussed here:
https://lore.kernel.org/all/20210207082258.3872086-1-ido...@idosch.org/
Joseph Huang (3):
net: bridge: mcast: Add offload failed mdb flag
net: bridge: mcast: Notify on offload flag change
net: bridge: Add notify on flag change netlink i/f
include/uapi/linux/if_bridge.h | 9 +++++----
include/uapi/linux/if_link.h | 14 ++++++++++++++
net/bridge/br_mdb.c | 30 +++++++++++++++++++++++++-----
net/bridge/br_multicast.c | 25 +++++++++++++++++++++++++
net/bridge/br_netlink.c | 21 +++++++++++++++++++++
net/bridge/br_private.h | 26 +++++++++++++++++++++-----
net/bridge/br_switchdev.c | 31 ++++++++++++++++++++++++++-----
7 files changed, 137 insertions(+), 19 deletions(-)
Hi,
Could you please share more about the motivation - why do you need this and
what will be using it?
Hi Nik,
The API for a user space application to join a multicast group is
write-only (and really best-effort only), meaning that after an
application calls setsockopt(), the application has no way to know
whether the operation actually succeeded or not. Normally for soft
bridges this is not an issue; however for switchdev-backed bridges, due
to limited hardware resources, the failure rate is meaningfully higher.
With this patch set, the user space application will now get a
notification about a failed attempt to join a multicast group. The user
space application then has the opportunity to mitigate the failure
[1][2].
Also why do you need an option with 3 different modes
instead of just an on/off switch for these notifications?
Thanks,
Nik
Some user space applications might be interested in both successful and
failed offload attempts (for example, an application might want to keep
an mdb database which is perfectly in sync with the hardware), while
other user space applications might only be interested in failed
attempts (so that they can retry the operation or choose a different
group, for example).
This knob is modeled after the fib_notify_on_flag_change knob from the
route offload failure notification series (see
https://lore.kernel.org/all/20210207082258.3872086-4-ido...@idosch.org/).
The rationale is that "Separate value (read: 2) is added for such
notifications because there are less of them, so they do not impact
performance and some users will find them more important."
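
To make the three modes concrete, the decision the knob implies can be
sketched in C as below. This is only an illustration, not the code from
the patches: sketch_should_notify() and the SKETCH_* names are
hypothetical, and the bit positions are placeholders standing in for the
real MDB_PG_FLAGS_* definitions in include/uapi/linux/if_bridge.h.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; the real
 * MDB_PG_FLAGS_* definitions live in include/uapi/linux/if_bridge.h. */
#define SKETCH_FLAG_OFFLOAD        (1u << 1)
#define SKETCH_FLAG_OFFLOAD_FAILED (1u << 5)

/* Hypothetical helper: decide whether a flag transition should be
 * reported to user space for a given mdb_notify_on_flag_change value. */
static bool sketch_should_notify(int mode, unsigned int old_flags,
				 unsigned int new_flags)
{
	unsigned int changed = old_flags ^ new_flags;

	switch (mode) {
	case 1: /* either offload flag changed */
		return (changed & (SKETCH_FLAG_OFFLOAD |
				   SKETCH_FLAG_OFFLOAD_FAILED)) != 0;
	case 2: /* only a change of the failure flag is reported */
		return (changed & SKETCH_FLAG_OFFLOAD_FAILED) != 0;
	default: /* 0: never notify on flag change */
		return false;
	}
}
```

With mode 2, a transition that only sets the offload flag produces no
notification, which is the lower-volume case the quoted rationale
refers to.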
Thanks,
Joseph
--
[1]
https://datatracker.ietf.org/doc/draft-ietf-pim-zeroconf-mcast-addr-alloc-ps/,
section 2, the last paragraph
[2]
https://datatracker.ietf.org/doc/draft-ietf-pim-ipv6-zeroconf-assignment/,
section 2.1, the first paragraph