Hello BIRDs! And hello Ondřej.
As part of my thesis I've been experimenting with EVPN in BIRD on the oz-evpn branch,
and I ran into a segfault
when I add an Ethernet interface to a bridge. I am using a couple of Fedora 42 VMs
running in PNETLab,
but I hope virtualization is not a factor here.
I narrowed it down to this minimal bird.conf:
```
log stderr all;
log "/var/log/bird.log" all;
router id 10.10.10.3;
ipv4 table master4;
protocol device {
scan time 10;
}
```
The reproducer is essentially what the README of the
[cf-evpn-bgp-tags](https://gitlab.nic.cz/labs/bird-tools/-/tree/master/netlab/cf-evpn-bgp-tags)
test describes:
```
[root@debug ~]# ip link add name br0 type bridge vlan_filtering 1
[root@debug ~]# ip link add name vxlan0 type vxlan id 100 local 10.10.10.3 dstport 4789 nolearning
[root@debug ~]# ip link set vxlan0 master br0
[root@debug ~]# bridge link set dev vxlan0 learning off
[root@debug ~]# ip link set dev eth1 master br0
[root@debug ~]# bird -f
Segmentation fault (core dumped)
```
I spent some time debugging this. I am not too familiar with netlink, but I
hope my description will be useful :)
The segfault happens in "sysdep/linux/netlink.c" because
[this strcmp](https://gitlab.nic.cz/labs/bird/-/blob/oz-evpn/sysdep/linux/netlink.c?ref_type=heads#L1037)
on line 1037 gets NULL as its first argument "kind". That is not surprising,
since "kind" is only set when li[IFLA_INFO_KIND] is present. Looking at the
[array li](https://gitlab.nic.cz/labs/bird/-/blob/oz-evpn/sysdep/linux/netlink.c?ref_type=heads#L1030),
which has size 3, and at the "nl_parse_attrs" function,
I understand that you only collect the first three (?) IFLA_INFO_x
attributes, which are
IFLA_INFO_UNSPEC, IFLA_INFO_KIND, IFLA_INFO_DATA.
The first "a->rta_type" that I see for my eth1 interface has the value 4, which is
IFLA_INFO_SLAVE_KIND,
and reading its data I indeed get "bridge".
All the attributes I see for eth1 get skipped by the first
[continue](https://gitlab.nic.cz/labs/bird/-/blob/oz-evpn/sysdep/linux/netlink.c?ref_type=heads#L538),
so "nl_parse_attrs" does not fill in anything useful.
Is it possible that netlink is simply not sending the data you need?
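To illustrate the mechanism described above, here is a minimal standalone sketch. It is not BIRD code: `struct attr`, `put_attr`, `parse_attrs` and `kind_from_single_attr` are hypothetical stand-ins, and the table size of 3 and the "skip types that do not fit" rule are assumptions based on my reading of "nl_parse_attrs". Only the attribute numbers are taken from the IFLA_INFO_* enum in `<linux/if_link.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Attribute type numbers, mirroring the IFLA_INFO_* enum in <linux/if_link.h>. */
enum {
  INFO_UNSPEC = 0,
  INFO_KIND = 1,
  INFO_DATA = 2,
  INFO_XSTATS = 3,
  INFO_SLAVE_KIND = 4,
  INFO_SLAVE_DATA = 5
};

#define INFO_TABLE_SIZE 3   /* assumed size of the "li" array */

/* Simplified stand-in for struct rtattr: total length, then type. */
struct attr { uint16_t len; uint16_t type; };

/* Attributes are padded to 4-byte boundaries. */
size_t attr_align(size_t n) { return (n + 3) & ~(size_t)3; }

/* Append one string attribute to buf and advance *off past its padding. */
void put_attr(unsigned char *buf, size_t cap, size_t *off, uint16_t type, const char *str)
{
  size_t hdr = attr_align(sizeof(struct attr));
  size_t payload = strlen(str) + 1;
  struct attr a = { (uint16_t)(hdr + payload), type };

  assert(*off + attr_align(a.len) <= cap);
  memcpy(buf + *off, &a, sizeof(a));
  memcpy(buf + *off + hdr, str, payload);
  *off += attr_align(a.len);
}

/* Store a payload pointer per attribute type; types that do not fit are skipped. */
void parse_attrs(const unsigned char *buf, size_t len, const char **table, size_t size)
{
  size_t hdr = attr_align(sizeof(struct attr));
  size_t off = 0;

  for (size_t i = 0; i < size; i++)
    table[i] = NULL;

  while (off + hdr <= len)
  {
    struct attr a;
    memcpy(&a, buf + off, sizeof(a));
    if (a.len < hdr || off + a.len > len)
      break;
    if (a.type < size)          /* out-of-range types are silently ignored */
      table[a.type] = (const char *)(buf + off + hdr);
    off += attr_align(a.len);
  }
}

/* Parse a message carrying a single attribute and return table[INFO_KIND]. */
const char *kind_from_single_attr(uint16_t type, const char *value)
{
  static unsigned char buf[64];
  const char *table[INFO_TABLE_SIZE];
  size_t len = 0;

  memset(buf, 0, sizeof(buf));
  put_attr(buf, sizeof(buf), &len, type, value);
  parse_attrs(buf, len, table, INFO_TABLE_SIZE);
  return table[INFO_KIND];
}
```

With this sketch, `kind_from_single_attr(INFO_SLAVE_KIND, "bridge")` yields NULL because type 4 does not fit into a table of size 3, while `kind_from_single_attr(INFO_KIND, "bridge")` yields a pointer to "bridge". Passing the NULL from the first case to strcmp is the same kind of crash I see with eth1.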
The bug most likely comes from commit
[66edbe43](https://gitlab.nic.cz/labs/bird/-/commit/66edbe43ce1fec8465577637c359d0294f580055),
which is specific to the oz-evpn branch, so other non-EVPN branches are
unaffected.
I'm also attaching a patch that fixes the crash for me; with it applied, I can get EVPN up and
running.
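For illustration only, the NULL guard could also be factored into a small helper so that every comparison of "kind" is safe. The name `kind_is` is hypothetical and does not exist in BIRD; this is just a sketch of the idea, not a proposal for how the code should look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of BIRD: NULL-safe comparison of a link kind. */
bool kind_is(const char *kind, const char *name)
{
  assert(name != NULL);   /* the expected name is always a string literal */
  return kind != NULL && strcmp(kind, name) == 0;
}
```

A call such as `kind_is(kind, "bridge") && li[IFLA_INFO_DATA]` would then evaluate to false for an interface without IFLA_INFO_KIND instead of dereferencing NULL.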
Thank you for your time and I hope you'll have a great day!
Tomáš

From 533261f6b6c97c76b260e7eccdaf860ab288dce1 Mon Sep 17 00:00:00 2001
From: tomasmatus <[email protected]>
Date: Sun, 9 Nov 2025 23:29:52 +0100
Subject: [PATCH] netlink: fix segfault when device kind is NULL
---
sysdep/linux/netlink.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sysdep/linux/netlink.c b/sysdep/linux/netlink.c
index eeb9123e..278e7044 100644
--- a/sysdep/linux/netlink.c
+++ b/sysdep/linux/netlink.c
@@ -1034,7 +1034,7 @@ nl_parse_link(struct nlmsghdr *h, int scan)
if (li[IFLA_INFO_KIND])
kind = RTA_DATA(li[IFLA_INFO_KIND]);
- if (!strcmp(kind, "bridge") && li[IFLA_INFO_DATA])
+ if (kind && !strcmp(kind, "bridge") && li[IFLA_INFO_DATA])
{
ea_set_attr_u32(&f.attrs->eattrs, tmp_linpool, EA_IFACE_TYPE, 0, EAF_TYPE_INT, IF_TYPE_BRIDGE);
@@ -1048,7 +1048,7 @@ nl_parse_link(struct nlmsghdr *h, int scan)
ea_set_attr_u32(&f.attrs->eattrs, tmp_linpool, EA_IFACE_BRIDGE_VLAN_FILTERING, 0, EAF_TYPE_INT, vlan_filtering);
}
}
- else if (!strcmp(kind, "vxlan") && li[IFLA_INFO_DATA])
+ else if (kind && !strcmp(kind, "vxlan") && li[IFLA_INFO_DATA])
{
ea_set_attr_u32(&f.attrs->eattrs, tmp_linpool, EA_IFACE_TYPE, 0, EAF_TYPE_INT, IF_TYPE_VXLAN);
--
2.51.2