Hi Russell,

On Wed, 20 Feb 2019 12:36:59 +0000, Russell King <[email protected]> wrote:
> Switches work by learning the MAC address for each attached station by
> monitoring traffic from each station.  When a station sends a packet,
> the switch records which port the MAC address is connected to.
> 
> With IPv4 networking, before communication commences with a neighbour,
> an ARP packet is broadcast to all stations asking for the MAC address
> corresponding to the IPv4 address.  The desired station responds with an ARP
> reply, and the ARP reply causes the switch to learn which port the
> station is connected to.
> 
> With IPv6 networking, the situation is rather different.  Instead of
> broadcasting ARP packets, a station multicasts a "neighbour
> solicitation".  This multicast needs to reach the intended
> station in order for the neighbour to be discovered.
> 
> Once a neighbour has been discovered, and entered into the sending
> station's neighbour cache, communication can restart at a point later
> without sending a new neighbour solicitation, even if the entry in
> the neighbour cache is marked as stale.  This can be after the MAC
> address has expired from the forwarding cache of the DSA switch -
> when that occurs, there is a long pause in communication.
> 
> Our DSA implementation for mv88e6xxx switches disables flooding of
> multicast and unicast frames for bridged ports.  As per the above
> description, this is fine for IPv4 networking, since the broadcasted
> ARP queries will be sent to and received by all stations on the same
> network.  However, this breaks IPv6 very badly - blocking neighbour
> solicitations and later causing connections to stall.
> 
> The defaults that the Linux bridge code expects from bridges are for
> unknown unicast and unknown multicast frames to be flooded to all ports
> on the bridge, which is at odds with the defaults adopted by our DSA
> implementation for mv88e6xxx switches.
> 
> This commit enables by default flooding of both unknown unicast and
> unknown multicast frames whenever a port is added to a bridge, and
> disables the flooding when a port leaves the bridge.  This means that
> mv88e6xxx DSA switches now behave as per the bridge(8) man page, and
> IPv6 works flawlessly through such a switch.
> 
> Signed-off-by: Russell King <[email protected]>
> ---
>  net/dsa/port.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index b84d010fb165..9e7aab13957e 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -105,6 +105,11 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
>       };
>       int err;
>  
> +     /* Set the flooding mode before joining */

Note that, as the comment just below states, the port has already joined
the bridge at this point, so "before joining" is inaccurate.

> +     err = dsa_port_bridge_flags(dp, BR_FLOOD | BR_MCAST_FLOOD, NULL);
> +     if (err)
> +             return err;
> +
>       /* Here the port is already bridged. Reflect the current configuration
>        * so that drivers can program their chips accordingly.
>        */
> @@ -113,8 +118,10 @@ int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
>       err = dsa_port_notify(dp, DSA_NOTIFIER_BRIDGE_JOIN, &info);
>  
>       /* The bridging is rolled back on error */
> -     if (err)
> +     if (err) {
> +             dsa_port_bridge_flags(dp, 0, NULL);
>               dp->bridge_dev = NULL;
> +     }
>  
>       return err;
>  }
> @@ -137,6 +144,9 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
>       if (err)
>               pr_err("DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE\n");
>  
> +     /* Port is leaving the bridge, disable flooding */
> +     dsa_port_bridge_flags(dp, BR_LEARNING, NULL);
> +
>       /* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
>        * so allow it to be in BR_STATE_FORWARDING to be kept functional
>        */


This makes it clear that the logic which sets the expected default flags
belongs in the bridge code itself, but that can be done later.
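
For illustration only, the join / rollback / leave ordering in this patch
can be modelled in plain userspace C. This is a minimal sketch, not kernel
code: struct sk_port, the sk_* helpers and the SK_* flag values are
hypothetical stand-ins for struct dsa_port, dsa_port_bridge_flags() and the
BR_* flags, and the fail_notify knob exists only to exercise the rollback
path.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag bits for this sketch only; they are NOT the kernel's
 * BR_LEARNING / BR_FLOOD / BR_MCAST_FLOOD values.
 */
#define SK_LEARNING    (1UL << 0)
#define SK_FLOOD       (1UL << 1)
#define SK_MCAST_FLOOD (1UL << 2)

/* Hypothetical model of a switch port */
struct sk_port {
	unsigned long flags;	/* flags currently programmed on the port */
	int bridged;		/* non-zero while the port is a bridge member */
	int fail_notify;	/* test knob: make the join notification fail */
};

/* Stand-in for dsa_port_bridge_flags(): program the port flags */
static int sk_set_flags(struct sk_port *p, unsigned long flags)
{
	assert(p != NULL);
	p->flags = flags;
	return 0;
}

/* Stand-in for the DSA_NOTIFIER_BRIDGE_JOIN notification */
static int sk_notify_join(struct sk_port *p)
{
	return p->fail_notify ? -1 : 0;
}

int sk_bridge_join(struct sk_port *p)
{
	int err;

	/* Enable unknown unicast and unknown multicast flooding first */
	err = sk_set_flags(p, SK_FLOOD | SK_MCAST_FLOOD);
	if (err)
		return err;

	p->bridged = 1;
	err = sk_notify_join(p);

	/* The bridging is rolled back on error, including the flags */
	if (err) {
		sk_set_flags(p, 0);
		p->bridged = 0;
	}

	return err;
}

void sk_bridge_leave(struct sk_port *p)
{
	p->bridged = 0;

	/* Port is leaving the bridge: only learning stays set */
	sk_set_flags(p, SK_LEARNING);
}
```

As in the patch, the leave path in this sketch leaves only the learning
flag set, which is what turns flooding back off.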


Reviewed-by: Vivien Didelot <[email protected]>
