Hoi Matt,
Thanks a lot for those tips - they helped me understand the problem better.
I'll answer your questions, but I think the problem is moot: I was using the
standard 16K buffers, and with 8 threads they run out almost immediately.
Raising the pool to 128K brings total consumption to the low 30K mark, where
it stays. No more issues seen after that.
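For reference, this is a sketch of the one-line startup.conf change that raises the pool (128000 matches the 128K pool mentioned above; adjust per NUMA domain as needed):

```
buffers {
  buffers-per-numa 128000
}
```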
- Issue occurs with IPv4-only, IPv6-only, and mixed traffic.
- Yes, other packets are being forwarded. The problem is present with no
traffic as well.
- No other control plane activity (no FRR or Bird running), just connected
routes (the 10 I mentioned above: 5 on a plain phy and 5 on a BondEthernet)
- The problem accelerated greatly when sending traffic to BondEthernet0's
LIP; with 20Gbit of traffic on a 2x10G LAG, buffer use shot up immediately
to 16K and flatlined:
Pool Name            Index NUMA  Size  Data Size  Total  Avail  Cached   Used
default-numa-0           0    0  2496       2048  16800      0     163  16637
Roger on the suggestion for the patch - I've run an 8-threaded VPP with
128K buffers through about 4 seconds of fping / ARP cache flushes while
iperf'ing against it, and it's still fine:
Pool Name            Index NUMA  Size  Data Size   Total  Avail  Cached   Used
default-numa-0           0    0  2496       2048  128520  97603    1422  29495
Patch sent in https://gerrit.fd.io/r/c/vpp/+/33606
groet,
Pim
On Wed, Aug 25, 2021 at 5:44 PM Matthew Smith wrote:
> Hi Pim,
>
> Responses are inline...
>
> On Tue, Aug 24, 2021 at 4:47 AM Pim van Pelt wrote:
>
>> Hoi,
>>
>> I've noticed that when a linuxcp enabled VPP 21.06 with multiple threads
>> receives many ARP requests, eventually it crashes in lcp_arp_phy_node in
>> lcp_node.c:675 and :775 because we do a vlib_buffer_copy() which returns
>> NULL, after which we try to dereference the result. How to repro:
>> 1) create a few interfaces/subints and give them IP addresses in Linux
>> and VPP. I made 5 phy subints and 5 subints on a bondethernet.
>> 2) rapidly fping the Linux CP and at the same time continuously flush the
>> neighbor cache on the Linux namespace:
>> On the vpp machine in 'dataplane' namespace:
>> while :; do ip nei flush all; done
>> On a Linux machine connected to VPP:
>> while :; do fping -c 1 -B 1 -p 10 10.1.1.2 10.1.2.2 10.1.3.2
>> 10.1.4.2 10.1.5.2 10.0.1.2 10.0.2.2 10.0.3.2 10.0.4.2 10.0.5.2
>> 2001:db8:1:1::2 2001:db8:1:2::2 2001:db8:1:3::2 2001:db8:1:4::2
>> 2001:db8:1:5::2 2001:db8:0:1::2 2001:db8:0:2::2 2001:db8:0:3::2
>> 2001:db8:0:4::2 2001:db8:0:5::2; done
>>
>> VPP will now be seeing lots of ARP traffic to and from the host. After a
>> while, c0 or c1 from lcp_node.c:675 and lcp_node.c:775 will be NULL and
>> cause a crash.
>> I temporarily worked around this by simply adding:
>>
>> @@ -675,6 +675,10 @@ VLIB_NODE_FN (lcp_arp_phy_node)
>>
>> c0 = vlib_buffer_copy (vm, b0);
>>
>> vlib_buffer_advance (b0, len0);
>>
>>
>>
>> + // pim(2021-08-24) -- address SIGSEGV when copy
>> returns NULL
>>
>> + if (!c0)
>>
>> + continue;
>>
>> +
>>
>> /* Send to the host */
>>
>> vnet_buffer (c0)->sw_if_index[VLIB_TX] =
>>
>> lip0->lip_host_sw_if_index;
>>
>> but I'm not very comfortable in this part of VPP, and I'm sure there's a
>> better way to catch the buffer copy failing?
>>
>
> No, checking whether the return value is null is the correct way to detect
> failure.
>
>
>
>> I haven't quite understood this code yet, but shouldn't we free c0 and c1
>> in these functions?
>>
>
> No, c0 and c1 are enqueued to another node (interface-output). The buffers
> are freed after being transmitted or dropped by subsequent nodes. Freeing
> them in this node while also enqueuing them would result in problems.
>
>
>
>> It seems that when I'm doing my rapid ping/arp/flush exercise above, VPP
>> is slowly consuming more memory (as seen by show memory main-heap; all 4
>> threads are growing monotonically by a few hundred kB per minute of runtime).
>>
>
> I made a quick attempt to reproduce the issue and was unsuccessful. Though
> I did not use a bond interface or subinterfaces, just physical interfaces.
>
> How many buffers are being allocated (vppctl show buffers)? Does the issue
> occur if you only send IPv4 packets instead of both IPv4 and IPv6? Are
> other packets besides the ICMP and ARP being forwarded while you're running
> this test? Is there any other control plane activity occurring during the
> test (E.g. BGP adding routes)?
>
>
>
>> If somebody could help me take a look, I'd appreciate it.
>>
>
> It would be better to make your patch like this:
>
>   if (c0)
>     {
>       /* Send to the host */
>       vnet_buffer (c0)->sw_if_index[VLIB_TX] =
>         lip0->lip_host_sw_if_index;
>       reply_copies[n_copies++] =