Re: evpn rebase to HEAD

Ondrej Zajicek Thu, 19 Feb 2026 09:08:59 -0800

On Wed, Feb 18, 2026 at 09:59:05PM +0100, Pim van Pelt wrote:
> Hoi,
> 
> Thanks for taking a look, Marina and Ondrej, I appreciate it!
> 
> On 18.02.2026 17:50, Ondrej Zajicek wrote:
> > As others noted, the relevant branch is 'oz-evpn', the older 'evpn'
> > branch fell victim to my needlessly strict adherence to "do not rebase
> > public branch" rule. The patches in 'oz-evpn' are not only rebased on
> > newer BIRD version, but also have fixes squashed in them, and there is
> > newer development. I just pushed there rebase to 2.18. Please look at
> > this branch first. Also note there are some minor changes to EVPN protocol
> > configuration syntax.
> I have ported my vppevpn protocol implementation to be based on oz-evpn, and
> the system is functional here also. Yaay!
> 
> I only had one small issue. In oz-evpn, the 'evpn' protocol will stay in
> 'startup' until the vxlan0 interface becomes ready. However, in my use case,
> vxlan is not performed by the kernel, but by VPP, so there is no 'vxlan0'
> interface. I need only 'vni' and 'router address' (and the remote VTEP) to
> construct the dataplane configuration. To allow the evpn protocol to
> transition to PS_UP, I decided to fire an event that announces the IMET if
> router_addr and VNI are set, and skips waiting for the interface.


Hmm, do you have a NULL interface in encap->tunnel_dev? Or some fake interface
created by if_get_by_name()? Or some dummy/irrelevant interface (loopback)?

The interface is there not just to get/check router_addr and VNI, but
primarily to construct next hops for routes in the bridge table:

evpn_receive_mac() / evpn_receive_imet():

  .nh.iface = encap->tunnel_dev,

These are necessary not just for the kernel dataplane (to specify the iface
implementing the tunnel), but also formally just to have a non-NULL nh.iface,
which we generally assume in BIRD for RTD_UNICAST nexthops. So how do
these routes look in your setup?

Note that the nexthops of VXLAN-tunneled routes in the bridge table are just
makeshift now, esp. the usage of nh.gw for encap-dst-ip and nh.label[0] for
encap-vni; these should get their own attributes (once we redesign
nexthops to have proper attributes).
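
To make the makeshift encoding concrete, here is a minimal sketch in plain C.
The struct and function names are hypothetical, simplified stand-ins and not
BIRD's real definitions; the gateway is IPv4-only for brevity. It only shows
the idea: the gw field carries the encap destination IP, label[0] carries
the VNI, and a tunnel device is required so that iface is never NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an interface; NOT BIRD's struct iface. */
struct sketch_iface {
  const char *name;
  struct sketch_iface *master;  /* bridge the tunnel device belongs to, if any */
};

/* Hypothetical stand-in for a next hop; NOT BIRD's struct nexthop. */
struct sketch_nexthop {
  uint32_t gw;                  /* reused for the encap destination IP (remote VTEP) */
  struct sketch_iface *iface;   /* tunnel device, expected to be non-NULL */
  uint32_t label[1];            /* label[0] reused for the encap VNI */
  unsigned labels;              /* number of valid entries in label[] */
};

/* Returns 0 on success, -1 without a tunnel device or with a VNI wider than 24 bits. */
int
sketch_fill_vxlan_nexthop(struct sketch_nexthop *nh, struct sketch_iface *tunnel_dev,
                          uint32_t vtep_ip, uint32_t vni)
{
  assert(nh != NULL);

  if (!tunnel_dev || vni > 0xffffff)
    return -1;

  memset(nh, 0, sizeof(*nh));
  nh->gw = vtep_ip;             /* encap-dst-ip */
  nh->iface = tunnel_dev;       /* keeps the non-NULL iface assumption */
  nh->label[0] = vni;           /* encap-vni */
  nh->labels = 1;
  return 0;
}
```

With a backend that has no tunnel device at all, a call with a NULL
tunnel_dev fails in this sketch, which is the case under discussion.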

I am often uncertain how much the BIRD representation of routes should match
the Linux API representation of routes (esp. for idiosyncratic details like
here, where the Linux API assumes nominal tunnel interfaces as next hop
interfaces for lightweight tunnels), but i usually try to keep them
consistent to limit the impedance mismatch. But it may cause
problems when other backends with different conventions are used, like
in your case.

Btw, i planned to explicitly configure the bridge device for the EVPN protocol
(it is now determined implicitly through tunnel_dev->master). The idea is that
as a VRF device (in Linux) defines an L3 VRF, a bridge device defines a MAC-VRF.
And as L3 protocols are associated with a specific L3 VRF, L2 protocols should
be associated with a specific MAC-VRF. Do you have a (kernel-level) bridge
device in your setup? (i do not mean using the BIRD bridge protocol).


> > > (3) Setting BGP Next Hop clears MPLS Labelstack, filters cannot set this.
> > > When the BGP Next Hop is changed by an export filter, we lose the MPLS
> > > labelstack. There is no way to add MPLS labelstack in filters (at least,
> > > that I could find), so we cannot use 'next hop address X' to determine the
> > > Type-2 MAC VxLAN endpoint. Note: IMET updates do not use the BGP Next Hop,
> > > but rather a PMSI attribute with the 'router address' already.
> > Resetting MPLS label when changing next hop is intentional, as MPLS labels
> > are (in general) specific to receiving routers.
> > 
> > There is gw_mpls (and undocumented/semantically broken gw_mpls_stack)
> > attribute that could be accessed in filters.
> > 
> > I am not sure what is your use case here to change it with filters, can
> > you describe it more? What about setting 'router address' in EVPN proto?
> With the oz-evpn branch as-is, setting 'router address' in evpn proto will:
> 1) copy that to the PMSI attribute: good
> 2) not do anything for MAC announcements; they will have BGP.next_hop set to
> the session address.
> 
> if the previous patch in (2) is accepted, then 'router address' will be used
> as BGP.next_hop, which will avoid the need to change it with filters with
> (3).

Oh, i see. You are right, this should work automatically for both IMET / PMSI
and MAC.

I do not like using regular/immediate next hops here in the EVPN table, as
it does not fit well semantically and requires a formal device. But it seems
to me that a reasonable alternative would be to just attach BGP_NEXT_HOP
by the EVPN protocol, similarly to how BGP_PMSI_TUNNEL is attached. Will do
that. Any comments?

Note that immediate next hops in the EVPN table for routes received through
BGP are there just as an artefact of the BGP_NEXT_HOP resolvability check;
they should not be there either.



> If neither patch is applied, the following config:
> 
> protocol evpn {
>   ...
>   encapsulation vxlan { router address 192.0.2.1; };
> }
> protocol bgp {
>   evpn { import all; export all; };
>   local 2001:db8::1 as 65512;
>   neighbor 2001:db8::2 as 65512;
> }
> 
> will yield IMET pointing at 192.0.2.1 but MAC pointing at 2001:db8::1. If I
> want MAC pointing at 192.0.2.1 also, I would either need (2, my preference)
> or a filter with (3).
> If there exists a device out there which has different addressing for IMET
> and MAC (note: I don't know of any, but perhaps they exist), then (3) would
> come in handy.

While i agree that it should work automatically by just setting router
address in protocol evpn, i think that this setup should work even
without patches:

 protocol evpn {
   ...
   encapsulation vxlan { router address 192.0.2.1; };
 }
 protocol bgp {
   evpn { import all; export all; next hop address 192.0.2.1; };
   local 2001:db8::1 as 65512;
   neighbor 2001:db8::2 as 65512;
 }

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: [email protected])
"To err is human -- to blame it on a computer is even more so."
