Re: [vpp-dev] VPP crashes because of API segment exhaustion
Hi Alexander,

Quick reply. Nice bug report! Agreed that it looks like vl_api_clnt_process sleeps, probably because it hit a queue size of 0, but memclnt_queue_callback or the timeout (albeit 20s is a lot) should wake it up. So, given that QUEUE_SIGNAL_EVENT is set, the only thing that comes to mind is that maybe the vlib_process_signal_event context somehow gets corrupted.

Could you run a debug image and see if anything asserts? Is vlib_process_signal_event by chance called from a worker?

Regards,
Florin

> On Jan 24, 2023, at 7:59 AM, Alexander Chernavin via lists.fd.io wrote:
>
> Hello all,
>
> We are experiencing VPP crashes that occur a few days after the startup
> because of API segment exhaustion. Increasing API segment size to 256MB
> didn't stop the crashes from occurring.
>
> [..]
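A quick way to check Florin's last question on a debug image is a conditional breakpoint; a minimal sketch, assuming the thread index lives in the __os_thread_index TLS variable (as in vppinfra/os.h, which vlib_get_thread_index() reads):

    $ gdb -p $(pidof vpp)
    (gdb) break vlib_process_signal_event if __os_thread_index != 0
    (gdb) continue
    # if this breakpoint ever fires, the event is being signalled from a
    # worker thread rather than from main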
Re: [vpp-dev] VPP Linux-CP/Linux-NL : MPLS?
Hoi,

MPLS is not supported in Linux CP. It is a regularly requested feature, but not quite as straightforward. Contributions welcome!

groet,
Pim

On Tue, Jan 24, 2023 at 5:16 PM wrote:
> Hello,
>
> I'm trying to populate the MPLS FIB via the Linux-CP plugin. MPLS records
> are created via FRR and populated to the Linux kernel routing table (I use
> the default ns). Below one can see a "push" operation and a "swap"
> operation. MPLS table 0 was created in VPP by the "mpls table add 0"
> command, and MPLS was enabled on all the interfaces, both towards media
> and taps. Still, I do not see anything in the FIB. Should MPLS table sync
> work, or maybe I forgot to set up something in VPP?
>
> [..]
Re: [vpp-dev] VPP Linux-CP/Linux-NL : MPLS?
No, this is not currently supported. MPLS configuration is not synched from the host system using linux-nl. IP routes/addresses/neighbors and some interface attributes (admin state, MTU, MAC address) are synched.

-Matt

On Tue, Jan 24, 2023 at 10:16 AM wrote:
> Hello,
>
> I'm trying to populate the MPLS FIB via the Linux-CP plugin. MPLS records
> are created via FRR and populated to the Linux kernel routing table (I use
> the default ns). Still, I do not see anything in the FIB. Should MPLS table
> sync work, or maybe I forgot to set up something in VPP?
>
> [..]
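Until such sync exists, the kernel's MPLS routes have to be mirrored into VPP by hand. A hedged sketch of roughly equivalent static configuration in the VPP CLI, corresponding to the "swap" and "push" routes in the quoted setup (the exact mpls local-label / out-labels syntax should be double-checked against "mpls local-label ?" on your build):

    vpp# mpls local-label add 40050 eos via fd00:200::2 TenGigabitEthernet1c/0/1.1914 out-labels 41000
    vpp# ip route add fd00:100::4/128 via fd00:200::2 TenGigabitEthernet1c/0/1.1914 out-labels 4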
[vpp-dev] VPP Linux-CP/Linux-NL : MPLS?
Hello,

I'm trying to populate the MPLS FIB via the Linux-CP plugin. MPLS records are created via FRR and populated to the Linux kernel routing table (I use the default ns). Below one can see a "push" operation and a "swap" operation. MPLS table 0 was created in VPP by the "mpls table add 0" command, and MPLS was enabled on all the interfaces, both towards media and taps. Still, I do not see anything in the FIB. Should MPLS table sync work, or maybe I forgot to set up something in VPP?

root@tn3:/home/abramov# ip -f mpls route show
40050 as to 41000 via inet6 fd00:200::2 dev Ten0.1914 proto static
root@tn3:/home/abramov# ip -6 route show | grep 4
fd00:100::4 nhid 209 encap mpls 4 via fd00:200::2 dev Ten0.1914 proto static metric 20 pref medium
root@tn3:/home/abramov# vppctl

vpp# show mpls fib 0 40050
MPLS-VRF:0, fib_index:1 locks:[interface:4, CLI:1, ]
vpp# show ip6 fib
ipv6-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ] epoch:0 flags:none locks:[adjacency:1, default-route:1, lcp-rt:1, ]
::/0
  unicast-ip6-chain
  [@0]: dpo-load-balance: [proto:ip6 index:6 buckets:1 uRPF:5 to:[0:0]]
    [0] [@0]: dpo-drop ip6
fd00:100::4/128
  unicast-ip6-chain
  [@0]: dpo-load-balance: [proto:ip6 index:17 buckets:1 uRPF:17 to:[0:0]]
    [0] [@5]: ipv6 via fd00:200::2 TenGigabitEthernet1c/0/1.1914: mtu:9000 next:5 flags:[] 2af08d2cf6163cecef5f778f8100077a86dd
fd00:200::/64
  unicast-ip6-chain
  [@0]: dpo-load-balance: [proto:ip6 index:15 buckets:1 uRPF:14 to:[0:0]]
    [0] [@4]: ipv6-glean: [src:fd00:200::/64] TenGigabitEthernet1c/0/1.1914: mtu:9000 next:2 flags:[] 3cecef5f778f8100077a86dd
fd00:200::1/128
  unicast-ip6-chain
  [@0]: dpo-load-balance: [proto:ip6 index:16 buckets:1 uRPF:15 to:[10:848]]
    [0] [@20]: dpo-receive: fd00:200::1 on TenGigabitEthernet1c/0/1.1914
fd00:200::2/128
  unicast-ip6-chain
  [@0]: dpo-load-balance: [proto:ip6 index:18 buckets:1 uRPF:12 to:[0:0]]
    [0] [@5]: ipv6 via fd00:200::2 TenGigabitEthernet1c/0/1.1914: mtu:9000 next:5 flags:[] 2af08d2cf6163cecef5f778f8100077a86dd
fe80::/10
  unicast-ip6-chain
  [@0]: dpo-load-balance: [proto:ip6 index:7 buckets:1 uRPF:6 to:[8:544]]
    [0] [@14]: ip6-link-local
vpp# show mpls fib
MPLS-VRF:0, fib_index:1 locks:[interface:4, CLI:1, ]
ip4-explicit-null:neos/21 fib:1 index:30 locks:2
  special refs:1 entry-flags:exclusive, src-flags:added,contributing,active,
    path-list:[43] locks:2 flags:exclusive, uPRF-list:31 len:0 itfs:[]
      path:[53] pl-index:43 mpls weight=1 pref=0 exclusive:  oper-flags:resolved, cfg-flags:exclusive,
        [@0]: dst-address,unicast lookup in interface's mpls table

 forwarding:   mpls-neos-chain
  [@0]: dpo-load-balance: [proto:mpls index:33 buckets:1 uRPF:31 to:[0:0]]
    [0] [@4]: dst-address,unicast lookup in interface's mpls table
ip4-explicit-null:eos/21 fib:1 index:29 locks:2
  special refs:1 entry-flags:exclusive, src-flags:added,contributing,active,
    path-list:[42] locks:2 flags:exclusive, uPRF-list:30 len:0 itfs:[]
      path:[52] pl-index:42 mpls weight=1 pref=0 exclusive:  oper-flags:resolved, cfg-flags:exclusive,
        [@0]: dst-address,unicast lookup in interface's ip4 table

 forwarding:   mpls-eos-chain
  [@0]: dpo-load-balance: [proto:mpls index:32 buckets:1 uRPF:30 to:[0:0]]
    [0] [@3]: dst-address,unicast lookup in interface's ip4 table
router-alert:neos/21 fib:1 index:27 locks:2
  special refs:1 entry-flags:exclusive, src-flags:added,contributing,active,
    path-list:[40] locks:2 flags:exclusive, uPRF-list:28 len:0 itfs:[]
      path:[50] pl-index:40 mpls weight=1 pref=0 exclusive:  oper-flags:resolved, cfg-flags:exclusive,
        [@0]: dpo-punt

 forwarding:   mpls-neos-chain
  [@0]: dpo-load-balance: [proto:mpls index:30 buckets:1 uRPF:28 to:[0:0]]
    [0] [@2]: dpo-punt
router-alert:eos/21 fib:1 index:28 locks:2
  special refs:1 entry-flags:exclusive, src-flags:added,contributing,active,
    path-list:[41] locks:2 flags:exclusive, uPRF-list:29 len:0 itfs:[]
      path:[51] pl-index:41 mpls weight=1 pref=0 exclusive:  oper-flags:resolved, cfg-flags:exclusive,
        [@0]: dpo-punt

 forwarding:   mpls-eos-chain
  [@0]: dpo-load-balance: [proto:mpls index:31 buckets:1 uRPF:29 to:[0:0]]
    [0] [@2]: dpo-punt
ipv6-explicit-null:neos/21 fib:1 index:32 locks:2
  special refs:1 entry-flags:exclusive, src-flags:added,contributing,active,
    path-list:[45] locks:2 flags:exclusive, uPRF-list:33 len:0 itfs:[]
      path:[55] pl-index:45 mpls weight=1 pref=0 exclusive:  oper-flags:resolved, cfg-flags:exclusive,
        [@0]: dst-address,unicast lookup in interface's mpls table

 forwarding:   mpls-neos-chain
  [@0]: dpo-load-balance: [proto:mpls index:35 buckets:1 uRPF:33 to:[0:0]]
    [0] [@4]: dst-address,unicast lookup in interface's mpls table
ipv6-explicit-null:eos/21 fib:1 index:31 locks:2
  special refs:1 entry-flags:exclusive, src-flags:added,contributing,active,
    path-list:[44] locks:2 flags:exclusive, uPRF-list:32 len:0 itfs:[]
      path:[54] pl-index:44 mpls weight=1 pref=0 exclusive:  oper-flags:resolved, cfg-flags:exclusive,
        [@0]: dst-address,unicast lookup in interface's ip6 table

 forwarding:   mpls-eos-chain
  [@0]: dpo-load-balance: [proto:mpls
[vpp-dev] VPP crashes because of API segment exhaustion
Hello all,

We are experiencing VPP crashes that occur a few days after the startup because of API segment exhaustion. Increasing the API segment size to 256MB didn't stop the crashes from occurring.

Can you please take a look at the description below and tell us if you have seen similar issues or have any ideas what the cause may be?

Given:
- VPP 22.10
- 2 worker threads
- API segment size is 256MB
- ~893k IPv4 routes and ~160k IPv6 routes added

Backtrace:
> [..]
> #32660 0x55b02f606896 in os_panic () at /home/jenkins/tnsr-pkgs/work/vpp/src/vpp/vnet/main.c:414
> #32661 0x7fce3c0ec740 in clib_mem_heap_alloc_inline (heap=0x0, size=<optimized out>, align=8,
>     os_out_of_memory_on_failure=1) at /home/jenkins/tnsr-pkgs/work/vpp/src/vppinfra/mem_dlmalloc.c:613
> #32662 clib_mem_alloc (size=<optimized out>)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vppinfra/mem_dlmalloc.c:628
> #32663 0x7fce3dc4ee6f in vl_msg_api_alloc_internal (vlib_rp=0x130026000, nbytes=69, pool=0,
>     may_return_null=0) at /home/jenkins/tnsr-pkgs/work/vpp/src/vlibmemory/memory_shared.c:179
> #32664 0x7fce3dc592cd in vl_api_rpc_call_main_thread_inline (force_rpc=0 '\000',
>     fp=<optimized out>, data=<optimized out>, data_length=<optimized out>)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlibmemory/memclnt_api.c:617
> #32665 vl_api_rpc_call_main_thread (fp=0x7fce3c74de70, data=0x7fcc372bdc00 "& \001$ ", data_length=28)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlibmemory/memclnt_api.c:641
> #32666 0x7fce3cc7fe2d in icmp6_neighbor_solicitation_or_advertisement (vm=0x7fccc0864000,
>     frame=0x7fcccd7d2d40, is_solicitation=1, node=<optimized out>)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vnet/ip6-nd/ip6_nd.c:163
> #32667 icmp6_neighbor_solicitation (vm=0x7fccc0864000, node=0x7fccc09e3380, frame=0x7fcccd7d2d40)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vnet/ip6-nd/ip6_nd.c:322
> #32668 0x7fce3c1a2fe0 in dispatch_node (vm=0x7fccc0864000, node=0x7fce3dc74836,
>     type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fcccd7d2d40,
>     last_time_stamp=4014159654296481) at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:961
> #32669 dispatch_pending_node (vm=0x7fccc0864000, pending_frame_index=7,
>     last_time_stamp=4014159654296481) at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:1120
> #32670 vlib_main_or_worker_loop (vm=0x7fccc0864000, is_main=0)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:1589
> #32671 vlib_worker_loop (vm=vm@entry=0x7fccc0864000)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:1723
> #32672 0x7fce3c1f581a in vlib_worker_thread_fn (arg=0x7fccbdb11b40)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/threads.c:1579
> #32673 0x7fce3c1f02c1 in vlib_worker_thread_bootstrap_fn (arg=0x7fccbdb11b40)
>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/threads.c:418
> #32674 0x7fce3be3db43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> #32675 0x7fce3becfa00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

According to the backtrace, an IPv6 neighbor is being learned. Since the packet was received on a worker thread, the neighbor information is being passed to the main thread by making an RPC call (which works via the API). For this, an API message for the RPC call is being allocated from the API segment (as a client). But the allocation fails because there is no available memory.

If we inspect the API rings after the crash, it can be seen that they are all filled with VL_API_RPC_CALL messages. Also, it can be seen that there are a lot of pending RPC requests (vm->pending_rpc_requests has ~3.3M items). Thus, API segment exhaustion occurs because of a huge number of pending RPC messages.

RPC messages are processed in a process node called api-rx-from-ring (the function is called vl_api_clnt_process). And process nodes are handled in the main thread only.

The first hypothesis is that the main loop of the main thread pauses for such a long time that a huge number of pending RPC messages are accumulated by the worker threads (which keep running). But this doesn't seem to be confirmed if we inspect vm->loop_interval_start of all threads after the crash: vm->loop_interval_start of the worker threads would have been greater than vm->loop_interval_start of the main thread.

> (gdb) p vlib_global_main.vlib_mains[0]->loop_interval_start
> $117 = 197662.55595008997
> (gdb) p vlib_global_main.vlib_mains[1]->loop_interval_start
> $119 = 197659.82887979984
> (gdb) p vlib_global_main.vlib_mains[2]->loop_interval_start
> $121 = 197659.93944517447

The second hypothesis is that pending RPC messages stop being processed completely at some point and keep being accumulated while the memory permits. This seems to be confirmed if we inspect the process node after the crash. It can be seen that vm->main_loop_count is much bigger than the process node's main_loop_count_last_dispatch (the difference is ~50M iterations). Although, according to the flags, the node is waiting for an event.
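The second hypothesis can be checked from the core along the same lines as the gdb snippet above; a minimal sketch, where <idx> stands for the runtime index of the api-rx-from-ring process node (vlib_process_t and main_loop_count_last_dispatch as defined in vlib/node.h):

    (gdb) p vlib_global_main.vlib_mains[0]->main_loop_count
    (gdb) set $proc = vlib_global_main.vlib_mains[0]->node_main.processes[<idx>]
    (gdb) p $proc->node_runtime.main_loop_count_last_dispatch
    (gdb) p/x $proc->flags
    # a large gap between the two counters means the process has not been
    # dispatched for millions of main-loop iterations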
Re: [vpp-dev] VPP LCP: IS-IS does not work
That's not surprising. Could you also show me a trace? (trace add dpdk-input 10 and then show trace with an ISIS packet)

On Tue, 24 Jan 2023 at 16:25, wrote:
> Hi Stanislav,
>
> Unfortunately, your patch didn't help. VPP builds, but IS-IS packets still
> cannot be passed between the CP and the wire.
>
> Furthermore, it looks like the LCP lcp-auto-subint feature was broken:
>
> [..]

--
Best regards
Stanislav Zaikin
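For reference, the trace Stanislav is asking for can be collected with the standard packet tracer (assuming the NIC feeds the dpdk-input node, as in this setup):

    vpp# clear trace
    vpp# trace add dpdk-input 10
    (let an IS-IS hello arrive from the wire)
    vpp# show trace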
Re: [vpp-dev] VPP LCP: IS-IS does not work
Hi Stanislav,

Unfortunately, your patch didn't help. VPP builds, but IS-IS packets still cannot be passed between the CP and the wire.

Furthermore, it looks like the LCP lcp-auto-subint feature was broken:

root@tn3:/home/abramov/vpp# vppctl
    _______    _        _   _____  ___
 __/ __/ _ \  (_)__    | | / / _ \/ _ \
 _/ _// // / / / _ \   | |/ / ___/ ___/
 /_/ /____(_)_/\___/   |___/_/  /_/

vpp# show interface
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
TenGigabitEthernet1c/0/1          1     down         9000/0/0/0
local0                            0     down          0/0/0/0
vpp# set interface state TenGigabitEthernet1c/0/1 up
vpp# lcp create 1 host-if Ten0
vpp# show interface
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
TenGigabitEthernet1c/0/1          1      up          9000/0/0/0     rx packets                2451
                                                                    rx bytes                228627
                                                                    tx packets                   7
                                                                    tx bytes                   746
                                                                    drops                     2451
                                                                    ip4                          9
                                                                    ip6                          2
local0                            0     down          0/0/0/0
tap1                              2      up          9000/0/0/0     rx packets                   7
                                                                    rx bytes                   746
                                                                    ip6                          7
vpp# quit
root@tn3:/home/abramov/vpp# ip link set Ten0 up
root@tn3:/home/abramov/vpp# vppctl
vpp# lcp lcp
  lcp-auto-subint  lcp-sync
vpp# lcp lcp-auto-subint on
vpp# lcp lcp-sync on
vpp# show lcp
lcp default netns ''
lcp lcp-auto-subint on
lcp lcp-sync on
lcp del-static-on-link-down off
lcp del-dynamic-on-link-down off
itf-pair: [0] TenGigabitEthernet1c/0/1 tap1 Ten0 1248 type tap
vpp# quit
root@tn3:/home/abramov/vpp# ip link add Ten0.1914 link Ten0 type vlan id 1914
root@tn3:/home/abramov/vpp# ip link set Ten0.1914 up
root@tn3:/home/abramov/vpp# vppctl
vpp# show int
              Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
TenGigabitEthernet1c/0/1          1      up          9000/0/0/0     rx packets               16501
                                                                    rx bytes               1519839
                                                                    tx packets                   7
                                                                    tx bytes                   746
                                                                    drops                    16501
                                                                    ip4                         39
                                                                    ip6                          8
local0                            0     down          0/0/0/0
tap1                              2      up          9000/0/0/0     rx packets                  17
                                                                    rx bytes                 19710
                                                                    drops                       10
                                                                    ip6                          7
vpp# show node counters
   Count                  Node                          Reason                     Severity
      10             lldp-input           lldp packets received on disabled i        error
     516             dpdk-input           no error                                   error
      21             arp-disabled         ARP Disabled                               error
      74             osi-input            unknown osi protocol                       error
       5             snap-input           unknown oui/snap protocol                  error
      11             ethernet-input       unknown ethernet type                      error
   74127             ethernet-input       unknown vlan                               error
     145             ethernet-input       subinterface down                          error
vpp#
Re: [vpp-dev] way to report security vulnerabilities
Hi Laszlo,

> Could you please point me to some description of how to report a security
> vulnerability in VPP? This time I do not have any; this is just for
> documentation purposes.

The process is documented here: https://wiki.fd.io/view/TSC:Vulnerability_Management

Best
ben
[vpp-dev] way to report security vulnerabilities
Hello fellow developers,

Could you please point me to some description of how to report a security vulnerability in VPP? This time I do not have any; this is just for documentation purposes.

Regards/Laszlo Kiraly
[vpp-dev] ikev2 mobike
Hello guys,

Does IKEv2 support MOBIKE?

Thank you.
Re: [vpp-dev] help with review
Looks good to me. Merged.

Best regards,
Ole

> On 24 Jan 2023, at 08:55, Stanislav Zaikin wrote:
>
> Hello folks,
>
> Any help with review is much appreciated. Both patches have been waiting
> for quite a long time.
>
> - https://gerrit.fd.io/r/c/vpp/+/36721
> Short description: "autoendian" was broken for streaming message types like:
>
>   service {
>     rpc lcp_itf_pair_get returns lcp_itf_pair_get_reply
>       stream lcp_itf_pair_details;
>   };
>
> vppapigen_c.py isn't generating the boilerplate (endian handler, JSON
> handler, format handler, etc.) for such types, both for the VPP side and
> for VAPI. There's currently also no support for streaming services in
> VAPI/C++; I have a patch for that and will send it after this one is
> merged (if it is merged in the end).
>
> - https://gerrit.fd.io/r/c/vpp/+/36110
> Short description: there is a fast path in "ethernet-input" for whole
> frames with the ETH_INPUT_FRAME_F_SINGLE_SW_IF_IDX flag. The other input
> nodes have a snippet to allocate a frame per interface when receiving
> packets (at least in dpdk-input and memif-input). I thought it'd be useful
> to have this fast path for tap interfaces too. Theoretically it can even
> be measured with CSIT, but I didn't succeed at that :)
>
> --
> Best regards
> Stanislav Zaikin