Hi Patrice & authors,

Excellent work on coming up with this solution for L3 over L2 MC-LAG.  I am
curious about the use cases and problem this solution solves.

When I think of MC-LAG, I think of proprietary legacy implementations of
multi-chassis LAG such as Cisco vPC or Juniper MC-LAG. In contrast, modern
LAG uses the EVPN fabric for ARP/ND synchronization and does not require
ICCP or a proprietary link between the leafs.

My recommendation would be to not mention MC-LAG by itself and instead call
it ESI MC-LAG, which is the modern EVPN-fabric-based LAG used with MPLS or
VXLAN fabrics.

I have some questions regarding 1.1 in the problem statement. My
understanding is that BGP over ESI LAG is very common in modern VXLAN- or
MPLS-fabric-based DCs, where the host is eBGP peered to the RFC 9135
inter-subnet forwarding Distributed Anycast Gateway (DAG) IP, with a single
session hashed to the DF leaf and synchronized with the NDF leaf via the
EVPN fabric.

Here is how it works.

The following describes how an all-active multihomed host interacts with an
Ethernet VPN (EVPN) fabric using an anycast gateway and BGP. The mechanism
ensures redundancy, seamless failover, and load balancing for both L2 and
L3 traffic. Here is a breakdown of the process:

1. Anycast gateway and host peering

   - *Anycast IP:* A host is typically connected to two or more leaf
   switches via a Link Aggregation Group (LAG). All leaf switches connected to
   the same Ethernet Segment Identifier (ESI) share the same IP and MAC
   address, called the anycast gateway.
   - *eBGP peering:* The multihomed host establishes an External BGP
   (eBGP) peering session with the anycast gateway IP address. Since the IP
   is the same on both leaf switches, the host sees a single gateway.
   - *Designated Forwarder (DF) election:* EVPN uses a DF election
   algorithm to determine which leaf switch is the DF for a specific Ethernet
   segment. The DF is responsible for forwarding Broadcast, Unknown-unicast,
   and Multicast (BUM) traffic to the host. The other leaf is the non-DF
   (NDF).
   - *BGP session via DF:* The host's eBGP session will be established over
   the LAG member connected to the DF leaf. This is because the DF holds the
   active ARP/ND entry for the host.
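
To make the DF election step concrete, here is a rough Python sketch of the
default modulo-based election from RFC 7432 (service carving): the leafs on
the segment are ordered by IP address, and the leaf at ordinal (V mod N) is
the DF for VLAN V. The function name `elect_df` and the sample addresses are
hypothetical, and real deployments may use other algorithms (for example
those defined in RFC 8584), so treat this as an illustration only.

```python
import ipaddress


def elect_df(leaf_addresses, vlan_id):
    """Pick the DF for one VLAN on one Ethernet segment.

    Sketch of the RFC 7432 default procedure: sort the candidate leafs by
    IP address (ascending numeric value) and choose ordinal vlan_id mod N.
    """
    if not leaf_addresses:
        raise ValueError("at least one leaf must be attached to the segment")
    ordered = sorted(leaf_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    return ordered[vlan_id % len(ordered)]


# Hypothetical leaf pair sharing one ESI.
leafs = ["10.0.0.2", "10.0.0.1"]
df_vlan_100 = elect_df(leafs, 100)   # 100 mod 2 = 0, lowest address wins
df_vlan_101 = elect_df(leafs, 101)   # 101 mod 2 = 1, second address wins

# If the DF for VLAN 100 fails, the election reruns over the survivors.
df_after_failure = elect_df(["10.0.0.2"], 100)
```

With two leafs, even and odd VLANs end up with different DFs, which is how
the default algorithm spreads BUM forwarding across the pair.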

2. Seamless failover

   - *ARP/ND synchronization:* EVPN synchronizes the host's ARP (for IPv4)
   and ND (for IPv6) information across the fabric using EVPN Type-2 routes
   (MAC/IP Advertisement routes). This means the NDF leaf is also aware of the
   host's IP and MAC address.
   - *Fabric notification and failover:* If the DF leaf switch fails, the
   eBGP session drops. The NDF leaf, having already been synchronized with the
   host's reachability information, takes over as the new DF. This provides a
   seamless failover, as the host's BGP peering is quickly re-established with
   the new DF leaf using the same anycast gateway IP address.
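
As a toy model of the synchronization described above, the Python sketch
below treats a Type-2 route as a small record and copies the host binding
into every leaf attached to the same ESI. All class and function names
(`MacIpRoute`, `Leaf`, `learn_and_sync`) and the sample values are
assumptions made for illustration; no real BGP or EVPN stack is involved.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified stand-in for an EVPN Type-2 (MAC/IP Advertisement) route."""
    esi: str
    mac: str
    ip: str


@dataclass
class Leaf:
    name: str
    neighbors: dict = field(default_factory=dict)  # host IP -> host MAC

    def install(self, route):
        # Program the ARP/ND binding carried by the route.
        self.neighbors[route.ip] = route.mac


def learn_and_sync(learning_leaf, segment_leafs, route):
    """Install a locally learned binding, then 'advertise' it to the peers."""
    learning_leaf.install(route)
    for peer in segment_leafs:
        if peer is not learning_leaf:
            peer.install(route)


df = Leaf("leaf1")
ndf = Leaf("leaf2")
host = MacIpRoute(esi="esi-1", mac="00:11:22:33:44:55", ip="192.0.2.10")
learn_and_sync(df, [df, ndf], host)

# After leaf1 fails, leaf2 already holds the host binding.
surviving = [leaf for leaf in (df, ndf) if leaf.name != "leaf1"]
new_df = surviving[0]
```

The point of the sketch is only that the NDF does not have to relearn the
host after a failure; the binding is already installed when it takes over.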

3. Traffic flow management

   - *Load balancing for host-advertised subnets:* When the multihomed host
   advertises subnets via BGP into the EVPN fabric, the fabric sees the routes
   originating from both the DF and NDF leaf switches (with the same ESI).
   This allows the fabric to use Equal-Cost Multipath (ECMP) routing to load
   balance incoming traffic flows across both all-active links.
   - *EVPN procedures for loop prevention:*
      - *Split horizon:* This mechanism prevents a BUM packet from being
      forwarded back to the multihomed host it originated from. For VXLAN, this
      is typically done using the source IP address of the VTEP (the leaf
      switch) in the tunnel header to prevent the packet from looping back.
      - *Local bias:* With local bias, when a leaf switch receives BUM
      traffic from a remote VTEP that is also part of a shared Ethernet
      segment, it will not forward that traffic out of its local port for that
      segment. This is the main VXLAN-based mechanism for split horizon
      filtering.
      - *Backup path aliasing (anycast aliasing):* This is an optimization
      that helps remote leaf switches load balance traffic toward a multihomed
      site. It allows load balancing across all leaf switches attached to the
      same ESI, ensuring efficient use of all paths.
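
The traffic-flow points above can be sketched in a few lines of Python. The
first function hashes a flow tuple to pick one of the leafs advertising the
same ESI (a simplified view of ECMP); the second applies the local bias
check to BUM traffic arriving from a remote VTEP. The function names, the
flow tuple layout, and the addresses are all hypothetical, and actual
hardware hashing differs from this.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow to one next hop so that the same flow always takes the same path."""
    if not next_hops:
        raise ValueError("no next hops available")
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return sorted(next_hops)[index]


def local_bias_allows(source_vtep, esi, segment_members):
    """Return False when the sending VTEP is attached to the same segment.

    segment_members maps an ESI to the set of VTEP addresses attached to it.
    With local bias, the ingress leaf has already served its local port, so
    the receiving leaf must not forward the BUM frame onto the shared segment.
    """
    return source_vtep not in segment_members.get(esi, set())


leafs = ["10.0.0.1", "10.0.0.2"]
flow = ("198.51.100.7", "192.0.2.10", 6, 49152, 179)  # src, dst, proto, sport, dport
chosen = pick_next_hop(flow, leafs)

members = {"esi-1": {"10.0.0.1", "10.0.0.2"}}
from_peer = local_bias_allows("10.0.0.1", "esi-1", members)    # peer on same ESI
from_other = local_bias_allows("10.0.0.3", "esi-1", members)   # unrelated VTEP
```

A stable hash is used instead of Python's built-in `hash()` so that the
chosen path does not change between runs.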

Thanks

Gyan


On Fri, Sep 5, 2025 at 4:55 PM Patrice Brissette (pbrisset) <pbrisset=
[email protected]> wrote:

> Hi,
>
>
>
> We believe this draft is ready for WG adoption.
>
> How can we move it forward?
>
>
>
> Draft is here:
> https://datatracker.ietf.org/doc/draft-mackenzie-bess-evpn-l3mh-proto/
>
>
>
> Regards,
>
> Patrice Brissette
>
> Distinguished Engineer
>
> Cisco Systems
>
>
>
>
>
>
> _______________________________________________
> BESS mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>
