IPv6 routes not imported into Kernel

2023-11-15 Thread Robert Finze

Hello all,

I've decided to learn more about BGP and BIRD.
To do so, I'm currently building a small BGP setup with two VMs, each 
running BIRD as its routing daemon.
While the first server runs smoothly, the second one does not import 
IPv6 routes into the kernel. IPv4 routes work fine on both systems.


The BIRD config on both systems is nearly identical (only the IPs differ), 
and the systems are set up in a similar manner.


Both VMs run in ProxmoxCE 8.0.4
The system without issues is running:

BIRD 2.14
on
Linux version 5.4.0-165-generic (buildd@lcy02-amd64-078) (gcc version 
9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.2)) #182-Ubuntu SMP Mon Oct 2 
19:43:28 UTC 2023


The system with issues is running:

BIRD 2.14
Linux version 5.15.0-88-generic (buildd@lcy02-amd64-058) (gcc (Ubuntu 
11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) 
#98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023



The routes are correctly learned from upstream and exported to the 
kernel, but the kernel is not "learning" them.

The journal shows the following logs:

--
Nov 04 21:31:51 moon bird[850]: Netlink: Invalid argument
Nov 04 21:31:51 moon bird[850]: Netlink: Invalid argument
Nov 04 21:31:51 moon bird[850]: Netlink: Invalid argument
Nov 04 21:31:51 moon bird[850]: ...
Nov 04 21:31:51 moon bird[850]: Netlink: Invalid argument
Nov 04 21:31:51 moon bird[850]: ...
Nov 04 21:31:52 moon bird[850]: Netlink: Invalid argument
Nov 04 21:31:52 moon bird[850]: ...
Nov 04 21:31:53 moon bird[850]: Netlink: Invalid argument
Nov 04 21:31:54 moon bird[850]: ...
--

One example route from BIRD's point of view:
--
bird> show route for 2620:12f:f000::/44
Table master6:
2620:12f:f000::/44   unicast [upstream_1v6 2023-11-04 16:47:37] ! (100) [AS43i]
        via 2a0d:x:1 on ens20
                     unicast [upstream_2v6 2023-11-04 16:47:36] (100) [AS43i]
        via 2a0d:x:2 on ens20
--

Linux shows:
# ip -6 r get 2620:12f:f000::
RTNETLINK answers: Network is unreachable

Manually adding the routes to the kernel works:
# ip -6 r a 2620:12f:f000::/44 nexthop via 2a0d::1 nexthop via 2a0d::2 dev ens20



After creating a netlink monitor interface I was able to capture some 
netlink messages. They show that BIRD tries to create the routes but 
the kernel returns an error.

I can't tell from the error message what the issue is.


ip link add nl0 type nlmon
ip link set nl0 up

Netlink route
0000   00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00
0010   68 00 00 00 18 00 05 05 13 0d 66 02 00 00 00 00
0020   0a 28 00 00 fe 0c 00 01 00 00 00 00 14 00 01 00
0030   26 07 ff 00 0b 00 00 00 00 00 00 00 00 00 00 00
0040   08 00 06 00 20 00 00 00 14 00 07 00 2a 0e 39 40
0050   10 00 00 00 00 00 00 00 00 00 00 02 08 00 04 00
0060   02 00 00 00 14 00 05 00 2a 0e 39 40 de ad 00 00
0070   00 00 00 00 00 00 00 01

Netlink
0000   00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00
0010   7c 00 00 00 02 00 00 00 13 0d 66 02 7a 31 09 81
0020   ea ff ff ff 68 00 00 00 18 00 05 05 13 0d 66 02
0030   00 00 00 00 0a 28 00 00 fe 0c 00 01 00 00 00 00
0040   14 00 01 00 26 07 ff 00 0b 00 00 00 00 00 00 00
0050   00 00 00 00 08 00 06 00 20 00 00 00 14 00 07 00
0060   2a 0e 39 40 10 00 00 00 00 00 00 00 00 00 00 02
0070   08 00 04 00 02 00 00 00 14 00 05 00 2a 0e 39 40
0080   de ad 00 00 00 00 00 00 00 00 00 01

Happy to forward a pcap or any other information as well.


After upgrading to a newer kernel the behaviour is the same: 
"Netlink: Invalid argument" for IPv6 routes.


/proc/version
Linux version 6.2.0-36-generic (buildd@lcy02-amd64-050) 
(x86_64-linux-gnu-gcc-11 (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld 
(GNU Binutils for Ubuntu) 2.38) #37~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC 
Mon Oct  9 15:34:04 UTC 2



It would be great if anyone had any pointers as to what I am missing or 
what is going wrong here.



Best,
Robert


Re: IPv6 routes not imported into Kernel

2024-02-27 Thread Robert Finze

Hi Gerdriaan,

thanks a lot for your input!
I haven't had much time to continue on this until now.

Please see my replies inline:


On 01.01.24 19:15, Gerdriaan Mulder wrote:

Hi Robert,

On 15/11/2023 22:58, Robert Finze wrote:
The Bird config on both systems is nearly identical (only IPs differ) 
and also the systems are setup in a similar manner.


It would be good to have a dump of the configuration of the non-working 
system (redact sensitive information such as passwords etc, but leave 
other information intact).


I've attached the config.

The routes are correctly learned from upstream and exported to the 
kernel, but the kernel is not "learning" them.


Interesting. The following dumps you sent might further help debugging 
the problem.



Netlink route
0000   00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00
0010   68 00 00 00 18 00 05 05 13 0d 66 02 00 00 00 00
0020   0a 28 00 00 fe 0c 00 01 00 00 00 00 14 00 01 00
0030   26 07 ff 00 0b 00 00 00 00 00 00 00 00 00 00 00
0040   08 00 06 00 20 00 00 00 14 00 07 00 2a 0e 39 40
0050   10 00 00 00 00 00 00 00 00 00 00 02 08 00 04 00
0060   02 00 00 00 14 00 05 00 2a 0e 39 40 de ad 00 00
0070   00 00 00 00 00 00 00 01


This decodes to (Wireshark supports "Import from hexdump", as I found out):

Linux rtnetlink (route netlink) protocol
     Netlink message header (type: Add network route)
     Length: 104
     Message type: Add network route (24)
     Flags: 0x0505
     Flags: 0x0505
     Sequence: 40242451
     Port ID: 0
     Address family: AF_INET6 (10)
     Length of destination: 40
     Length of source: 0
     TOS filter: 0x00
     Routing table ID: 254
     Routing protocol: BIRD (0x0c)
     Route origin: global route (0x00)
     Route type: Gateway or direct route (0x01)
     Route flags: 0x
     Attribute: Route destination address
     Len: 20
     Type: 0x0001, Route destination address (1)
     Data: 2607ff000b00
     Attribute: RTA_PRIORITY
     Len: 8
     Type: 0x0006, RTA_PRIORITY (6)
     Data: 2000
     Attribute: RTA_PREFSRC
     Len: 20
     Type: 0x0007, RTA_PREFSRC (7)
     Data: 2a0e39401002
     Attribute: Output interface index: 2
     Len: 8
     Type: 0x0004, Output interface index (4)
     Output interface index: 2
     Attribute: Gateway of the route
     Len: 20
     Type: 0x0005, Gateway of the route (5)
     Data: 2a0e3940dead0001


0000   00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00
0010   7c 00 00 00 02 00 00 00 13 0d 66 02 7a 31 09 81
0020   ea ff ff ff 68 00 00 00 18 00 05 05 13 0d 66 02
0030   00 00 00 00 0a 28 00 00 fe 0c 00 01 00 00 00 00
0040   14 00 01 00 26 07 ff 00 0b 00 00 00 00 00 00 00
0050   00 00 00 00 08 00 06 00 20 00 00 00 14 00 07 00
0060   2a 0e 39 40 10 00 00 00 00 00 00 00 00 00 00 02
0070   08 00 04 00 02 00 00 00 14 00 05 00 2a 0e 39 40
0080   de ad 00 00 00 00 00 00 00 00 00 01


decodes as:

Netlink message
     Netlink message header (type: Error)
     Length: 124
     Message type: Error (0x0002)
     Flags: 0x
     Sequence: 40242451
     Port ID: 2164863354
     Error code: Invalid argument (-EINVAL) (-22)
     Netlink message header (type: 0x0018)
     Length: 104
     Message type: Protocol-specific (0x0018)
     Flags: 0x0505
     Flags: 0x0505
     Sequence: 40242451
     Port ID: 0
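
For reference, the error reply can be picked apart the same way. This is a small Python sketch, not something from the original posts. It assumes a little-endian host and uses only the first 36 bytes of the netlink payload (reply header, error code, echoed request header). The flag values are the ones from linux/netlink.h; ERROR_MSG_HEAD, NLM_FLAGS and decode_error are illustrative names.

```python
import errno
import struct

# Start of the netlink payload of the second dump (cooked-capture header omitted).
ERROR_MSG_HEAD = bytes.fromhex(
    "7c000000 0200 0000 130d6602 7a310981"   # nlmsghdr of the kernel's reply
    "eaffffff"                               # nlmsgerr.error, a negative errno
    "68000000 1800 0505 130d6602 00000000"   # header of the request being answered
)

# Flag bits for "new object" requests, from linux/netlink.h.
NLM_FLAGS = {0x001: "NLM_F_REQUEST", 0x004: "NLM_F_ACK",
             0x100: "NLM_F_REPLACE", 0x400: "NLM_F_CREATE"}


def decode_error(msg):
    """Decode an NLMSG_ERROR reply and the request header it echoes back."""
    length, msg_type, flags, seq, pid = struct.unpack_from("<IHHII", msg, 0)
    (error,) = struct.unpack_from("<i", msg, 16)
    req_len, req_type, req_flags, req_seq, req_pid = struct.unpack_from("<IHHII", msg, 20)
    return {
        "length": length,
        "type": msg_type,
        "seq": seq,
        "pid": pid,
        "error": error,
        "errno_name": errno.errorcode.get(-error, "unknown"),
        "request_type": req_type,
        "request_seq": req_seq,
        "request_flags": [name for bit, name in sorted(NLM_FLAGS.items())
                          if req_flags & bit],
    }


for key, value in decode_error(ERROR_MSG_HEAD).items():
    print(key, value)
```

The reply carries the same sequence number as the request, which is how it can be matched to the route addition, and the request flags 0x0505 break down into REQUEST, ACK, REPLACE and CREATE.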

The first message could probably be replicated by running:

ip -6 route add 2607:ff00:b::/40 via 2a0e:3940:dead::1 table 254 protocol bird scope global src 2a0e:3940:1000::2 dev 2


This returns:
RTNETLINK answers: No route to host

- where dev 2 indicates the network interface with index 2, this is 
probably ens20 in your setup?


It should be ens19. I'm currently not sure how to verify that.
"ip a" shows:

1: lo
2: ens18
3: ens19
4: ens20
5: dummy0
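
One way to verify the mapping: the output interface attribute in the netlink message carries the kernel's interface index, which is the number printed before each name in "ip a". Going by the listing above, index 2 would be ens18 rather than ens19 or ens20. A short Python sketch to check this on the affected host follows. It uses only the standard socket module; interface_name is an illustrative helper, not anything from BIRD or iproute2.

```python
import socket


def interface_name(index):
    """Return the interface name for a kernel interface index, or None."""
    try:
        return socket.if_indextoname(index)
    except OSError:
        # Raised when no interface with that index exists.
        return None


# Print every (index, name) pair known to the kernel.
for index, name in socket.if_nameindex():
    print(index, name)

# The index taken from the captured route addition.
print("oif 2 ->", interface_name(2))
```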


- table 254 is most likely the main table (see /etc/iproute2/rt_tables)


Correct, this is 'main'.

I'm unsure how to decode RTA_PRIORITY correctly here. Regardless, you 
could run this command on the non-working host. Perhaps `ip route` can 
tell you a bit more information. In a slightly modified case (I've 
replaced the `via ...` with a known gateway), I get: "Error: Invalid 
source address." (with: iproute2-6.5.0)


My current hunch is that `src 2a0e:3940:1000::2` is not a valid address 
on your system. A closer read on your earlier comment:


This ip is bound on the dummy0 interface:

5: dummy0:  mtu 1500 qdisc noqueue state DOWN group default qlen 1000

li

Re: IPv6 routes not imported into Kernel

2024-02-29 Thread Robert Finze

Hi Gerdriaan,

I've followed your advice and set up 2 VMs for testing.


On 28.02.24 12:02, Gerdriaan Mulder wrote:

Next I want to try a fresh 20.04 install and see what happens.


I would try a fresh install of Ubuntu 20.04 with the same kernel as the 
machine that currently works, indeed. If the problem goes away, it might 
be an issue between Ubuntu 20.04 and 22.04. If the problem persists, it 
might be some subtle configuration difference. I wouldn't yet upgrade to 
BIRD 3.0alpha because that changes too many variables in order to debug 
the problem.

A)
Ubuntu 24.04
Linux moon2 6.6.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 30 
10:27:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Bird 3.0alpha2
Bird 2.14
Bird 2.13

B)
Ubuntu 20.04
Linux moon3 5.4.0-172-generic #190-Ubuntu SMP Fri Feb 2 23:24:22 UTC 
2024 x86_64 x86_64 x86_64 GNU/Linux

Bird 2.14

C)
Ubuntu 20.04
Linux star 5.4.0-172-generic #190-Ubuntu SMP Fri Feb 2 23:24:22 UTC 2024 
x86_64 x86_64 x86_64 GNU/Linux

Bird 2.14

VM C is my current router which is working fine and from which I'm 
exporting one route towards A and B.


On VM A I've tested different BIRD versions, and all show the same 
behaviour. Before upgrading to 24.04 I ran the tests on 22.04, and the 
results were the same.


VM C exports one route towards A and B. BIRD accepts it on both, but on 
A it doesn't end up in the kernel. On B there are no issues and 
everything works as expected.
It seems that there is indeed a difference between 20.04 and 22.04 (and 
newer).

I'm a bit stuck here.

For now I'd be fine with running 20.04 on all routers, but eventually 
it'd be nice to upgrade.


Besides, in your initial post, you pasted a few routes from BIRD that 
were using protocols "upstream_1v6" and "upstream_2v6". They seem to be 
missing from the bird.conf you posted. The route addition in the netlink 
dump is different from the routes you showed in BIRD, which makes it 
more difficult to pinpoint the problem.


Apologies for this. Yes, there are 2 more upstreams configured, but they 
are shut down so that it's easier to troubleshoot.


I think it's a good idea to focus on getting just one route exported 
from BIRD to the kernel successfully. If it's possible in your setup, 
perhaps just configure 1 upstream, and only import 1 route from that 
upstream in BIRD, and export the same route through the kernel protocol.


Best regards,
Gerdriaan Mulder


Cheers,
Robert