Hi Vladimir,
Thank you for having a look at this!
On 06.03.2023 14:46, Vladimir Oltean wrote:
On Sat, Mar 04, 2023 at 01:52:32PM +0300, Arınç ÜNAL wrote:
On 4.03.2023 00:48, Jan Hoffmann wrote:
This series fixes multiple issues related to the L2 table and multicast
table. That includes an issue that causes corruption of the port mask
for unknown multicast forwarding, which can occur even when multicast
snooping is disabled.
With these patches, multicast snooping should be mostly working.
However, one important missing piece is forwarding of all multicast
traffic to multicast router ports (as specified in section 2.1.2-1 of
RFC4541). As far as I can see, this is a general issue that affects all
DSA switches, and cannot be fixed without changes to the DSA subsystem.
Do you plan to discuss this on the netdev mailing list with Vladimir?
Arınç
I searched for the patches and found them at
https://patchwork.ozlabs.org/project/openwrt/cover/[email protected]/
I guess that what should be discussed is the topic of switch ports
attached to multicast routers, yes?
DSA does not listen (and has never listened) to the switchdev
notifications emitted for bridge ports regarding multicast routers
(SWITCHDEV_ATTR_ID_PORT_MROUTER).
It has only listened for a while to the switchdev notifications for the
bridge itself as a multicast router (SWITCHDEV_ATTR_ID_BRIDGE_MROUTER),
and even that was done for a fairly strange reason and eventually got
reverted for breaking something - see commits 08cc83cc7fd8 and c73c57081b3d.
I personally don't have use cases for IP multicast routing / snooping,
so I would need some guidance regarding what is needed for things to
work smoothly. Also, to make sure that they keep working in the future,
one of the tests from tools/testing/selftests/net/forwarding/ which
exercises flooding towards multicast ports (if it exists) should be
symlinked to tools/testing/selftests/drivers/net/dsa/ and adapted until
it works. That's a pretty good way to get maintainers' attention on a
feature that they don't normally test.
It's not the first time I'm reading RFC4541, but due to a lack of any
practical applications surrounding me (and therefore also partial lack
of understanding), I keep forgetting what it says :)
Section 2.1.1. IGMP Forwarding Rules (for the control path) says
1) A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
how? I guess IGMP/MLD packets should reach the CPU via packet traps
(which cause skb->offload_fwd_mark to be unset), and from there, the
bridge software data path identifies the mrouter ports and forwards
control packets only there? What happens if the particular switch
hardware doesn't support IGMP/MLD packet identification and trapping?
I wasn't really thinking about this potential issue, as that already
works for switches that trap IGMP/MLD reports to the CPU. But multicast
snooping is definitely going to be broken for DSA drivers that implement
the MDB methods but don't trap IGMP/MLD to the CPU port.
Should the driver install a normal multicast forwarding rule for
all 224.0.0.X traffic (translated to MAC), and patch the tagging
protocol driver to set skb->offload_fwd_mark = 0 based on
eth_hdr(skb)->h_dest?
Doing something like that should work for IGMPv3/MLDv2, as reports have
the fixed destination addresses 224.0.0.22 and ff02::16. However, for
these protocol versions, it should also be okay to flood reports,
because clients don't do report suppression in that case.
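For context on the "224.0.0.X traffic (translated to MAC)" part: the IPv4-to-Ethernet mapping is the standard RFC 1112 one, where the low 23 bits of the group address are copied into the 01:00:5e:00:00:00 prefix. A minimal sketch (plain C, not kernel code; the packed-uint64 return is just for easy comparison):

```c
#include <stdint.h>

/* RFC 1112 mapping: the low 23 bits of an IPv4 multicast group are
 * copied into the 01:00:5e:00:00:00 OUI prefix. Returned as a packed
 * 48-bit value for easy comparison. */
static uint64_t ipv4_mcast_to_mac(uint32_t group)
{
    return 0x01005e000000ull | (group & 0x7fffffu);
}

/* True for the 224.0.0.0/24 link-local control block, which RFC 4541
 * says must be forwarded on all ports regardless of snooping state. */
static int ipv4_is_local_mcast(uint32_t group)
{
    return (group & 0xffffff00u) == 0xe0000000u;
}
```

Note that the 23-bit mapping means 32 IPv4 groups alias onto each MAC address, which is one reason snooping with MAC-addressed hardware tables is only ever an approximation of IP-level group membership.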
For IGMPv1/IGMPv2/MLDv1, this approach isn't practical, as reports are
sent to the group address itself, i.e. they are not limited to 224.0.0.X.
Detecting them requires looking further into the packets (there are a
few details on this in section 3 "IPv6 Considerations" of RFC4541).
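To illustrate what "looking further into the packets" means here, a minimal sketch (not kernel code) of classifying an IPv4 packet as an IGMP report by protocol number and IGMP type rather than by destination address, assuming an unfragmented packet and skipping checksum validation:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal sketch: classify an IPv4 packet as an IGMP membership
 * report by protocol and IGMP type, independent of the destination
 * group address. Assumes no fragmentation; no checksum validation. */
static int ipv4_is_igmp_report(const uint8_t *pkt, size_t len)
{
    size_t ihl;
    uint8_t type;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;  /* IP options make this > 20 */
    if (ihl < 20 || len < ihl + 1)
        return 0;
    if (pkt[9] != 2)                    /* IPPROTO_IGMP */
        return 0;
    type = pkt[ihl];
    return type == 0x12 ||              /* IGMPv1 membership report */
           type == 0x16 ||              /* IGMPv2 membership report */
           type == 0x22;                /* IGMPv3 membership report */
}
```

Hardware that can only match on destination MAC has no way to express this check, which is exactly why such switches can't trap v1/v2 reports selectively.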
So, I don't think there really is a good way to do multicast snooping on
hardware that doesn't support detecting and trapping IGMP/MLD reports
specifically. Of course, trapping all multicast to the CPU (as you
suggested below) should always be an option, but the additional CPU load
might be an issue.
Another possibility would be to just not support multicast snooping on
such devices and always flood multicast (i.e. behave like a driver
without port_mdb_add/port_mdb_del does right now). However, that raises
the question of what should happen when multicast snooping is enabled on
a bridge (which is the default), but the DSA switch does not support it.
I guess a clean solution would be to just not allow this in the first
place. If this is allowed, then making sure that any multicast
originating from the CPU is flooded to all switch ports should also
avoid breaking multicast (but having a bridge that performs multicast
snooping only partially, i.e. for the non-offloaded ports, is probably a
confusing design).
Then for the data path we have:
2.1.2. Data Forwarding Rules
1) Packets with a destination IP address outside 224.0.0.X which are
not IGMP should be forwarded according to group-based port
membership tables and must also be forwarded on router ports.
This is the main IGMP snooping functionality for the data path.
One approach that an implementation could take would be to
maintain separate membership and multicast router tables in
software and then "merge" these tables into a forwarding cache.
For my clarity, this means that *all* IP multicast packets must be
forwarded to the multicast router ports, be their addresses known
(bridge mdb entries exist for them) or unknown (flooded)?
Yes, that is how I understand this section as well.
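As a sketch, that reading of 2.1.2-1 boils down to a simple port-mask computation (illustrative bitmaps, not any driver's actual API): registered groups use their member ports, unregistered ones use the flood mask, and mrouter ports are OR-ed in unconditionally:

```c
#include <stdint.h>

/* RFC 4541 2.1.2-1 as a port-mask computation (illustrative only):
 * known groups go to their MDB member ports, unknown groups to the
 * flood mask, and multicast router ports are always added on top. */
static uint32_t mcast_dst_mask(int group_known, uint32_t mdb_ports,
                               uint32_t flood_ports, uint32_t mrouter_ports)
{
    uint32_t dst = group_known ? mdb_ports : flood_ports;
    return dst | mrouter_ports;
}
```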
What does the software bridge implementation do? Does it perform this
"merging" of tables for us? (for known MDB entries, does it also notify
them through switchdev on the mrouter ports?) Looking superficially at
the first-order callers of br_mdb_notify(), I don't get the impression
that the bridge has this logic?
Unfortunately, the software bridge implementation doesn't do this kind
of merging ahead of time, otherwise there wouldn't be any need to handle
this in DSA. It only takes place in br_multicast_flood, the function
that does the actual forwarding of registered multicast:
https://elixir.bootlin.com/linux/v6.2.3/source/net/bridge/br_forward.c#L277
Or am I completely off with my reading of RFC4541?
It's not obvious to me, after looking at the few implementations of
SWITCHDEV_ATTR_ID_PORT_MROUTER handlers in drivers, that this merging of
forwarding tables would be done anywhere within sight.
It looks to me like all three switchdev drivers with handlers for
SWITCHDEV_ATTR_ID_PORT_MROUTER actually add the mrouter ports to mdb
entries:
https://elixir.bootlin.com/linux/v6.2.3/source/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c#L1022
https://elixir.bootlin.com/linux/v6.2.3/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c#L2163
https://elixir.bootlin.com/linux/v6.2.3/source/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c#L105
If someone could explain to me what the expectations are (we don't seem
to have a lot of good documentation) and how the existing drivers work,
we can start discussing more concretely how the layering and API within
DSA should look.
I experimented a bit with the rtl83xx switch driver in OpenWrt by
passing through the switchdev events to the DSA driver:
https://github.com/janh/openwrt/commit/80b677fe04292570d7e304e184fd7d4ac8397a4f
The disadvantage with this approach is that the DSA driver has to do the
"merging" of mrouter ports and mdb entries. As a port can be a group
member and multicast router port at the same time, this means the driver
now has to keep track of all mdb entries, to be able to properly handle
when a port leaves a group or stops being a multicast router.
If the DSA subsystem could handle the "merging" instead and also call
port_mdb_add/port_mdb_del as appropriate for multicast router ports, the
individual drivers wouldn't have to deal with this particular issue at all.
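To make that bookkeeping burden concrete, here is a toy version of the state a driver ends up tracking when it has to do the merging itself (globals and names purely illustrative, not the rtl83xx code): the "pure" per-group member mask has to be kept around so the hardware mask can be recomputed whenever a port starts or stops being an mrouter:

```c
#include <stdint.h>

#define MAX_GROUPS 8

/* Toy driver-side shadow state (illustrative, not real driver code). */
static uint32_t member_mask[MAX_GROUPS]; /* ports joined via MDB  */
static uint32_t mrouter_mask;            /* current mrouter ports */

/* Mask that would be programmed into the hardware entry for a group. */
static uint32_t hw_mask(int group)
{
    return member_mask[group] | mrouter_mask;
}

static uint32_t mdb_add(int group, int port)
{
    member_mask[group] |= 1u << port;
    return hw_mask(group);
}

static uint32_t mdb_del(int group, int port)
{
    member_mask[group] &= ~(1u << port);
    return hw_mask(group);
}

/* Toggling an mrouter port touches only mrouter_mask, but every
 * hardware entry must then be rewritten from the shadow state. */
static uint32_t set_mrouter(int port, int enable)
{
    if (enable)
        mrouter_mask |= 1u << port;
    else
        mrouter_mask &= ~(1u << port);
    return mrouter_mask;
}
```

If DSA did the merging centrally, this shadow state (and the rewrite-all-entries step on every mrouter change) could live in one place instead of being duplicated in each driver.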
As a way to fix a bug quickly and get correct behavior, I guess there's
also the option of no longer processing multicast packets in hardware,
and configuring the switch to always send any multicast to the CPU port
only. As long as the tagger knows to leave skb->offload_fwd_mark unset,
the bridge driver should know how to deal with those packets in
software, and forward them only to whoever is interested. But the
drawback is that there is no forwarding acceleration involved. Maybe DSA
should have done that from the get-go for drivers which didn't care
about multicast in particular, instead of ending up in the current
situation, which appears to be slightly chaotic.
Thanks,
Jan
_______________________________________________
openwrt-devel mailing list
[email protected]
https://lists.openwrt.org/mailman/listinfo/openwrt-devel