Hi Vladimir,
Thank you for having a look at this!
On 06.03.2023 14:46, Vladimir Oltean wrote:
On Sat, Mar 04, 2023 at 01:52:32PM +0300, Arınç ÜNAL wrote:
On 4.03.2023 00:48, Jan Hoffmann wrote:
This series fixes multiple issues related to the L2 table and multicast
table. That includes an issue that causes corruption of the port mask
for unknown multicast forwarding, which can occur even when multicast
snooping is disabled.
With these patches, multicast snooping should be mostly working.
However, one important missing piece is forwarding of all multicast
traffic to multicast router ports (as specified in section 2.1.2-1 of
RFC4541). As far as I can see, this is a general issue that affects all
DSA switches, and cannot be fixed without changes to the DSA subsystem.
Do you plan to discuss this on the netdev mailing list with Vladimir?
Arınç
I searched for the patches and found them at
https://patchwork.ozlabs.org/project/openwrt/cover/[email protected]/
I guess that what should be discussed is the topic of switch ports
attached to multicast routers, yes?
DSA does not listen (and has never listened) to the switchdev
notifications emitted for bridge ports regarding multicast routers
(SWITCHDEV_ATTR_ID_PORT_MROUTER).
It has only listened for a while to the switchdev notifications for the
bridge itself as a multicast router (SWITCHDEV_ATTR_ID_BRIDGE_MROUTER),
and even that was done for a fairly strange reason and eventually got
reverted for breaking something - see commits 08cc83cc7fd8 and c73c57081b3d.
I personally don't have use cases for IP multicast routing / snooping,
so I would need some guidance regarding what is needed for things to
work smoothly. Also, to make sure that they keep working in the future,
one of the tests from tools/testing/selftests/net/forwarding/ which
exercises flooding towards multicast ports (if it exists) should be
symlinked to tools/testing/selftests/drivers/net/dsa/ and adapted until
it works. That's a pretty good way to get maintainers' attention on a
feature that they don't normally test.
It's not the first time I'm reading RFC4541, but due to a lack of any
practical applications surrounding me (and therefore also partial lack
of understanding), I keep forgetting what it says :)
Section 2.1.1. IGMP Forwarding Rules (for the control path) says
1) A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
how? I guess IGMP/MLD packets should reach the CPU via packet traps
(which cause skb->offload_fwd_mark to be unset), and from there, the
bridge software data path identifies the mrouter ports and forwards
control packets only there? What happens if the particular switch
hardware doesn't support IGMP/MLD packet identification and trapping?
I wasn't really thinking about this potential issue, as that already
works for switches that trap IGMP/MLD reports to the CPU. But multicast
snooping is definitely going to be broken for DSA drivers that implement
the MDB methods but don't trap IGMP/MLD to the CPU port.
Should the driver install a normal multicast forwarding rule for
all 224.0.0.X traffic (translated to MAC), and patch the tagging
protocol driver to set skb->offload_fwd_mark = 0 based on
eth_hdr(skb)->h_dest?
Doing something like that should work for IGMPv3/MLDv2, as reports have
the fixed destination addresses 224.0.0.22 and ff02::16. However, for
these protocol versions, it should also be okay to flood reports,
because clients don't do report suppression in that case.
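For context on the "224.0.0.X traffic (translated to MAC)" part: the IPv4-to-Ethernet mapping is the standard RFC 1112 one, where the low 23 bits of the group address are copied into the 01:00:5e:00:00:00 prefix. A minimal sketch (plain C, not kernel code; the packed-uint64 return is just for easy comparison):

```c
#include <stdint.h>

/* RFC 1112 mapping: the low 23 bits of an IPv4 multicast group are
 * copied into the 01:00:5e:00:00:00 OUI prefix. Returned as a packed
 * 48-bit value for easy comparison. */
static uint64_t ipv4_mcast_to_mac(uint32_t group)
{
    return 0x01005e000000ull | (group & 0x7fffffu);
}

/* True for the 224.0.0.0/24 link-local control block, which RFC 4541
 * says must be forwarded on all ports regardless of snooping state. */
static int ipv4_is_local_mcast(uint32_t group)
{
    return (group & 0xffffff00u) == 0xe0000000u;
}
```

Note that the 23-bit mapping means 32 IPv4 groups alias onto each MAC address, which is one reason snooping with MAC-addressed hardware tables is only ever an approximation of IP-level group membership.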
For IGMPv1/IGMPv2/MLDv1, this approach isn't practical, as reports are
sent to the group address itself, i.e. they are not limited to 224.0.0.X.
Detecting them requires looking further into the packets (there are a
few details on this in section 3 "IPv6 Considerations" of RFC4541).
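To illustrate what "looking further into the packets" means here, a minimal sketch (not kernel code) of classifying an IPv4 packet as an IGMP report by protocol number and IGMP type rather than by destination address, assuming an unfragmented packet and skipping checksum validation:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal sketch: classify an IPv4 packet as an IGMP membership
 * report by protocol and IGMP type, independent of the destination
 * group address. Assumes no fragmentation; no checksum validation. */
static int ipv4_is_igmp_report(const uint8_t *pkt, size_t len)
{
    size_t ihl;
    uint8_t type;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;  /* IP options make this > 20 */
    if (ihl < 20 || len < ihl + 1)
        return 0;
    if (pkt[9] != 2)                    /* IPPROTO_IGMP */
        return 0;
    type = pkt[ihl];
    return type == 0x12 ||              /* IGMPv1 membership report */
           type == 0x16 ||              /* IGMPv2 membership report */
           type == 0x22;                /* IGMPv3 membership report */
}
```

Hardware that can only match on destination MAC has no way to express this check, which is exactly why such switches can't trap v1/v2 reports selectively.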
So, I don't think there really is a good way to do multicast snooping on
hardware that doesn't support detecting and trapping IGMP/MLD reports
specifically. Of course, trapping all multicast to the CPU (as you
suggested below) should always be an option, but the additional CPU load
might be an issue.
Another possibility would be to just not support multicast snooping on
such devices and always flood multicast (i.e. behave like a driver
without port_mdb_add/port_mdb_del does right now). However, that raises
the question of what should happen when multicast snooping is enabled on
a bridge (which is the default), but the DSA switch does not support it.
I guess a clean solution would be to just not allow this in the first
place. If this is allowed, then making sure that any multicast
originating from the CPU is flooded to all switch ports should also
avoid breaking multicast (but having a bridge that performs multicast
snooping only partially, i.e. for the non-offloaded ports, is probably a
confusing design).
Then for the data path we have:
2.1.2. Data Forwarding Rules
1) Packets with a destination IP address outside 224.0.0.X which are
not IGMP should be forwarded according to group-based port
membership tables and must also be forwarded on router ports.
This is the main IGMP snooping functionality for the data path.
One approach that an implementation could take would be to
maintain separate membership and multicast router tables in
software and then "merge" these tables into a forwarding cache.
For my clarity, this means that *all* IP multicast packets must be
forwarded to the multicast router ports, be their addresses known
(bridge mdb entries exist for them) or unknown (flooded)?
Yes, that is how I understand this section as well.
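As a sketch, that reading of 2.1.2-1 boils down to a simple port-mask computation (illustrative bitmaps, not any driver's actual API): registered groups use their member ports, unregistered ones use the flood mask, and mrouter ports are OR-ed in unconditionally:

```c
#include <stdint.h>

/* RFC 4541 2.1.2-1 as a port-mask computation (illustrative only):
 * known groups go to their MDB member ports, unknown groups to the
 * flood mask, and multicast router ports are always added on top. */
static uint32_t mcast_dst_mask(int group_known, uint32_t mdb_ports,
                               uint32_t flood_ports, uint32_t mrouter_ports)
{
    uint32_t dst = group_known ? mdb_ports : flood_ports;
    return dst | mrouter_ports;
}
```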
What does the software bridge implementation do? Does it perform this
"merging" of tables for us? (for known MDB entries, does it also notify
them through switchdev on the mrouter ports?) Looking superficially at
the first-order callers of br_mdb_notify(), I don't get the impression
that the bridge has this logic?
Unfortunately, the software bridge implementation doesn't do this kind
of merging ahead of time, otherwise there wouldn't be any need to handle
this in DSA. It only takes place in br_multicast_flood, the function
that does the actual forwarding of registered multicast:
https://elixir.bootlin.com/linux/v6.2.3/source/net/bridge/br_forward.c#L277
Or am I completely off with my reading of RFC4541?
It's not obvious to me, after looking at the few implementations of
SWITCHDEV_ATTR_ID_PORT_MROUTER handlers in drivers, that this merging of
forwarding tables would be done anywhere within sight.
It looks to me like all three switchdev drivers with handlers for
SWITCHDEV_ATTR_ID_PORT_MROUTER actually add the mrouter ports to mdb
entries:
https://elixir.bootlin.com/linux/v6.2.3/source/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c#L1022
https://elixir.bootlin.com/linux/v6.2.3/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c#L2163
https://elixir.bootlin.com/linux/v6.2.3/source/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c#L105
If someone could explain to me what the expectations are (we don't seem
to have a lot of good documentation) and how the existing drivers work,
we can start discussing more concretely how the layering and API within
DSA should look.
I experimented a bit with the rtl83xx switch driver in OpenWrt by
passing through the switchdev events to the DSA driver:
https://github.com/janh/openwrt/commit/80b677fe04292570d7e304e184fd7d4ac8397a4f
The disadvantage with this approach is that the DSA driver has to do the
"merging" of mrouter ports and mdb entries. As a port can be a group
member and multicast router port at the same time, this means the driver
now has to keep track of all mdb entries, to be able to properly handle
when a port leaves a group or stops being a multicast router.
If the DSA subsystem could handle the "merging" instead and also call
port_mdb_add/port_mdb_del as appropriate for multicast router ports, the
individual drivers wouldn't have to deal with this particular issue at all.
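To make that bookkeeping burden concrete, here is a toy version of the state a driver ends up tracking when it has to do the merging itself (globals and names purely illustrative, not the rtl83xx code): the "pure" per-group member mask has to be kept around so the hardware mask can be recomputed whenever a port starts or stops being an mrouter:

```c
#include <stdint.h>

#define MAX_GROUPS 8

/* Toy driver-side shadow state (illustrative, not real driver code). */
static uint32_t member_mask[MAX_GROUPS]; /* ports joined via MDB  */
static uint32_t mrouter_mask;            /* current mrouter ports */

/* Mask that would be programmed into the hardware entry for a group. */
static uint32_t hw_mask(int group)
{
    return member_mask[group] | mrouter_mask;
}

static uint32_t mdb_add(int group, int port)
{
    member_mask[group] |= 1u << port;
    return hw_mask(group);
}

static uint32_t mdb_del(int group, int port)
{
    member_mask[group] &= ~(1u << port);
    return hw_mask(group);
}

/* Toggling an mrouter port touches only mrouter_mask, but every
 * hardware entry must then be rewritten from the shadow state. */
static uint32_t set_mrouter(int port, int enable)
{
    if (enable)
        mrouter_mask |= 1u << port;
    else
        mrouter_mask &= ~(1u << port);
    return mrouter_mask;
}
```

If DSA did the merging centrally, this shadow state (and the rewrite-all-entries step on every mrouter change) could live in one place instead of being duplicated in each driver.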
As a way to fix a bug quickly and get correct behavior, I guess there's
also the option of no longer processing multicast packets in hardware,
and configuring the switch to always send any multicast to the CPU port
only. As long as the tagger knows to leave skb->offload_fwd_mark unset,
the bridge driver should know how to deal with those packets in
software, and forward them only to whoever is interested. But the
drawback is that there is no forwarding acceleration involved. Maybe DSA
should have done that from the get-go for drivers which didn't care
about multicast in particular, instead of ending up in the current
situation, which appears to be slightly chaotic.
Thanks,
Jan
_______________________________________________
openwrt-devel mailing list
[email protected]
https://lists.openwrt.org/mailman/listinfo/openwrt-devel