Hi Nate,

Further to what Andrew suggested, here are a few more hints:
1) Make sure the netlink socket buffers are large enough by adding this to
your sysctl settings:
cat << EOF > /etc/sysctl.d/81-VPP-netlink.conf
# Increase netlink to 64M
net.core.rmem_default=67108864
net.core.wmem_default=67108864
net.core.rmem_max=67108864
net.core.wmem_max=67108864
EOF
sysctl -p /etc/sysctl.d/81-VPP-netlink.conf
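
To double-check that the values took effect, something like the sketch below
could be used. The function names are made up for illustration, and it assumes
the four keys are readable under /proc/sys/net/core on your system:

```shell
#!/bin/sh
# Convert MiB to bytes, e.g. 64 -> 67108864 (the 64M used above).
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Print every key whose live value is below the wanted size (in MiB).
# Returns 0 if all four are large enough, 1 if any is too small,
# 2 if a key could not be read.
check_netlink_buffers() {
  want=$(mib_to_bytes "$1")
  rc=0
  for key in rmem_default wmem_default rmem_max wmem_max; do
    have=$(cat "/proc/sys/net/core/$key") || return 2
    if [ "$have" -lt "$want" ]; then
      echo "net.core.$key is $have, expected at least $want"
      rc=1
    fi
  done
  return $rc
}
```

Running `check_netlink_buffers 64` after the sysctl step should print nothing
if all four values are at least 64M.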

2) Ensure there is enough memory by adding this to VPP's startup config:
memory {
  main-heap-size 2G
  main-heap-page-size default-hugepage
}
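
One thing to keep in mind: with default-hugepage the heap is, as far as I
understand it, backed by hugepages, so the hugepage pool has to be big enough
for it. A rough sketch of the arithmetic (helper names are hypothetical; on
x86_64 the default hugepage size is usually 2048 kB, but check your own
/proc/meminfo):

```shell
#!/bin/sh
# Number of hugepages needed to back a heap of $1 MiB,
# given a hugepage size of $2 kB. Rounds up.
hugepages_needed() {
  heap_kb=$(( $1 * 1024 ))
  echo $(( (heap_kb + $2 - 1) / $2 ))
}

# Default hugepage size in kB, as reported by the kernel.
default_hugepage_kb() {
  awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo
}
```

For example, `hugepages_needed 2048 2048` gives 1024, so a 2G heap on 2 MB
pages would by itself use up a vm.nr_hugepages of 1024. Anything else that
wants hugepages needs room on top of that.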

3) Many prefixes (like a full BGP routing table) will need more stats
memory, so increase that too in VPP's startup config:
statseg {
  size 128M
}

And in case you missed it, make sure to create the linux-cp devices in a
separate namespace by adding this to the startup config:
linux-cp {
  default netns dataplane
}
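
The namespace has to exist for this to work. Assuming nothing else on your
system creates it, a small idempotent helper along these lines (the function
name is just an illustration, and it needs root) can be run before starting
VPP:

```shell
#!/bin/sh
# Create the named network namespace only if it is not already present.
ensure_netns() {
  if ip netns list | awk -v ns="$1" '$1 == ns { found = 1 } END { exit !found }'; then
    echo "netns $1 already exists"
  else
    ip netns add "$1" && echo "netns $1 created"
  fi
}
```

Call it as `ensure_netns dataplane`. Commands can then be run inside the
namespace with `ip netns exec dataplane ...`.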

Then you should be able to consume the IPv4 and IPv6 DFZ in your router. I
have tested this extensively with FRR and Bird2, with good results so far.
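
To see how far the FIB has filled up, the per-prefix-length counts from the
"sh ip fib sum" output quoted below can be added up. This is only a sketch: the
function name is made up, and it assumes the output keeps the two-column
"Prefix length / Count" layout shown in Nate's mail:

```shell
#!/bin/sh
# Sum the Count column of "sh ip fib sum" output read from stdin.
# Only lines made of exactly two integers (prefix length, count) are used;
# headers and timestamps are skipped. If the input contains several
# tables, the result is the total across all of them.
sum_fib_counts() {
  awk 'NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { total += $2 }
       END { print total + 0 }'
}
```

Usage would be `vppctl sh ip fib sum | sum_fib_counts`.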

Regards,
Pim

On Thu, May 27, 2021 at 10:02 AM Andrew Yourtchenko <ayour...@gmail.com>
wrote:

> I would guess from your traceback you are running out of memory, so
> increasing the main heap size to something like 4x could help…
>
> --a
>
> On 27 May 2021, at 08:29, Nate Sales <n...@natesales.net> wrote:
>
> 
> Hello,
>
> I'm having some trouble with the linux-cp netlink plugin. After building
> it from the patch set (https://gerrit.fd.io/r/c/vpp/+/31122), it does
> correctly receive netlink messages and insert routes from the linux kernel
> table into the VPP FIB. When loading a large number of routes, however (full
> IPv4 table), VPP crashes after loading about 400k routes.
>
> It appears to be receiving a SIGABRT that terminates the VPP process:
>
> May 27 06:10:33 pdx1rtr1 vnet[2232]: received signal SIGABRT, PC
> 0x7fe9b99bdce1
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #0  0x00007fe9b9de1a7b 0x7fe9b9de1a7b
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #1  0x00007fe9b9d13140 0x7fe9b9d13140
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #2  0x00007fe9b99bdce1 gsignal + 0x141
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #3  0x00007fe9b99a7537 abort + 0x123
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #4  0x000055d43480a1f3 0x55d43480a1f3
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #5  0x00007fe9b9c9c8d5
> vec_resize_allocate_memory + 0x285
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #6  0x00007fe9b9d71feb
> vlib_validate_combined_counter + 0xdb
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #7  0x00007fe9ba4f1e55
> load_balance_create + 0x205
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #8  0x00007fe9ba4c639d
> fib_entry_src_mk_lb + 0x38d
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #9  0x00007fe9ba4c64a4
> fib_entry_src_action_install + 0x44
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #10 0x00007fe9ba4c681b
> fib_entry_src_action_activate + 0x17b
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #11 0x00007fe9ba4c3780
> fib_entry_create + 0x70
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #12 0x00007fe9ba4b9afc
> fib_table_entry_update + 0x29c
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #13 0x00007fe935fcedce 0x7fe935fcedce
> May 27 06:10:33 pdx1rtr1 vnet[2232]: #14 0x00007fe935fd2ab5 0x7fe935fd2ab5
> May 27 06:10:33 pdx1rtr1 systemd[1]: vpp.service: Main process exited,
> code=killed, status=6/ABRT
> May 27 06:10:33 pdx1rtr1 systemd[1]: vpp.service: Failed with result
> 'signal'.
> May 27 06:10:33 pdx1rtr1 systemd[1]: vpp.service: Consumed 12.505s CPU
> time.
> May 27 06:10:34 pdx1rtr1 systemd[1]: vpp.service: Scheduled restart job,
> restart counter is at 2.
> May 27 06:10:34 pdx1rtr1 systemd[1]: Stopped vector packet processing
> engine.
> May 27 06:10:34 pdx1rtr1 systemd[1]: vpp.service: Consumed 12.505s CPU
> time.
> May 27 06:10:34 pdx1rtr1 systemd[1]: Starting vector packet processing
> engine...
> May 27 06:10:34 pdx1rtr1 systemd[1]: Started vector packet processing
> engine.
>
> Here's what I'm working with:
>
> root@pdx1rtr1:~# uname -a
> Linux pdx1rtr1 5.10.0-7-amd64 #1 SMP Debian 5.10.38-1 (2021-05-20) x86_64
> GNU/Linux
> root@pdx1rtr1:~# vppctl show ver
> vpp v21.10-rc0~3-g3f3da0d27 built by nate on altair at 2021-05-27T01:21:58
> root@pdx1rtr1:~# bird --version
> BIRD version 2.0.7
>
> And some adjusted sysctl params:
>
> net.core.rmem_default = 67108864
> net.core.wmem_default = 67108864
> net.core.rmem_max = 67108864
> net.core.wmem_max = 67108864
> vm.nr_hugepages = 1024
> vm.max_map_count = 3096
> vm.hugetlb_shm_group = 0
> kernel.shmmax = 2147483648
>
> In case it's at all helpful, I ran a "sh ip fib sum" every second and
> restarted BIRD to observe when the routes start processing, and to get the
> last known fib state before the crash:
>
> Thu May 27 06:10:20 UTC 2021
> ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ]
> epoch:0 flags:none locks:[adjacency:1, default-route:1, lcp-rt:1, ]
>     Prefix length         Count
>                    0               1
>                    4               2
>                    8               3
>                    9               5
>                   10              29
>                   11              62
>                   12             169
>                   13             357
>                   14             702
>                   15            1140
>                   16            7110
>                   17            4710
>                   18            7763
>                   19           13814
>                   20           22146
>                   21           26557
>                   22           51780
>                   23           43914
>                   24          227173
>                   27               1
>                   32               6
> Thu May 27 06:10:21 UTC 2021
> clib_socket_init: connect (fd 3, '/run/vpp/cli.sock'): Connection refused
> Thu May 27 06:10:22 UTC 2021
> ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ]
> epoch:0 flags:none locks:[default-route:1, ]
>     Prefix length         Count
>                    0               1
>                    4               2
>                   32               2
>
>
> I'm new to VPP so let me know if there are other logs that would be useful
> too.
>
> Cheers,
> Nate
>

-- 
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/