Hi Pim,

Responses are inline...

On Tue, Aug 24, 2021 at 4:47 AM Pim van Pelt <p...@ipng.nl> wrote:

> Hoi,
>
> I've noticed that when a linuxcp enabled VPP 21.06 with multiple threads
> receives many ARP requests, eventually it crashes in lcp_arp_phy_node in
> lcp_node.c:675 and :775 because we do a vlib_buffer_copy() which returns
> NULL, after which we try to dereference the result. How to repro:
> 1) create a few interfaces/subints and give them IP addresses in Linux and
> VPP. I made 5 phy subints and 5 subints on a bondethernet.
> 2) rapidly fping the Linux CP and at the same time continuously flush the
> neighbor cache on the Linux namespace:
> On the vpp machine in 'dataplane' namespace:
>   while :; do ip nei flush all; done
> On a Linux machine connected to VPP:
>   while :; do fping -c 10000 -B 1 -p 10 10.1.1.2 10.1.2.2 10.1.3.2
> 10.1.4.2 10.1.5.2 10.0.1.2 10.0.2.2 10.0.3.2 10.0.4.2 10.0.5.2
> 2001:db8:1:1::2 2001:db8:1:2::2 2001:db8:1:3::2 2001:db8:1:4::2
> 2001:db8:1:5::2 2001:db8:0:1::2 2001:db8:0:2::2 2001:db8:0:3::2
> 2001:db8:0:4::2 2001:db8:0:5::2; done
>
> VPP will now be seeing lots of ARP traffic to and from the host. After a
> while, c0 or c1 from lcp_node.c:675 and lcp_node.c:775 will be NULL and
> cause a crash.
> I temporarily worked around this by simply adding:
>
> @@ -675,6 +675,10 @@ VLIB_NODE_FN (lcp_arp_phy_node)
>                   c0 = vlib_buffer_copy (vm, b0);
>                   vlib_buffer_advance (b0, len0);
>
> +                 // pim(2021-08-24) -- address SIGSEGV when copy returns NULL
> +                 if (!c0)
> +                   continue;
> +
>                   /* Send to the host */
>                   vnet_buffer (c0)->sw_if_index[VLIB_TX] =
>                     lip0->lip_host_sw_if_index;
>
> but I'm not very comfortable in this part of VPP, and I'm sure there's a
> better way to catch the buffer copy failing?
>

No, checking whether the return value is NULL is the correct way to detect
that vlib_buffer_copy() failed.



> I haven't quite understood this code yet, but shouldn't we free c0 and c1
> in these functions?
>

No, c0 and c1 are enqueued to another node (interface-output). The buffers
are freed after being transmitted or dropped by subsequent nodes. Freeing
them in this node while also enqueuing them would be a use-after-free: the
pool could hand the same buffers out again while the downstream nodes still
hold references to them.
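
To make the ownership model concrete, here is a schematic sketch using
generic vlib calls (not the actual lcp_node.c code; 'drop_locally', 'bi0',
and 'next0' are placeholder names):

  if (drop_locally)
    {
      /* This node is the buffer's last owner:
         return it to the pool. */
      vlib_buffer_free_one (vm, bi0);
    }
  else
    {
      /* Hand ownership to the next node; a later node frees the
         buffer after transmit or drop. Freeing bi0 here as well
         would let the pool reallocate it while interface-output
         still references it. */
      vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next,
                                       n_left_to_next, bi0, next0);
    }

Each buffer takes exactly one of the two paths; doing both is the bug.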



> It seems that when I'm doing my rapid ping/arp/flush exercise above, VPP
> is slowly consuming more memory (as seen by show memory main-heap; all 4
> threads are monotonically growing by a few hundred kB per minute of runtime).
>

I made a quick attempt to reproduce the issue and was unsuccessful, though
I did not use a bond interface or subinterfaces, just physical interfaces.

How many buffers are being allocated (vppctl show buffers)? Does the issue
occur if you only send IPv4 packets instead of both IPv4 and IPv6? Are
other packets besides the ICMP and ARP traffic being forwarded while you're
running this test? Is there any other control plane activity occurring
during the test (e.g., BGP adding routes)?
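
For example, you could sample both counters while the test runs, in the
same shell-loop style as your repro (the 10-second interval is arbitrary):

  while :; do vppctl show buffers; vppctl show memory main-heap; sleep 10; done

If 'show buffers' stays flat while main-heap keeps growing, the growth is
somewhere in the heap rather than in the buffer pools.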



> If somebody could help me take a look, I'd appreciate it.
>

It would be better to make your patch like this:

                  if (c0)
                    {
                      /* Send to the host */
                      vnet_buffer (c0)->sw_if_index[VLIB_TX] =
                        lip0->lip_host_sw_if_index;
                      reply_copies[n_copies++] =
                        vlib_get_buffer_index (vm, c0);
                    }

When you do the opposite ('if (!c0) continue;'), you skip the call to
vlib_validate_buffer_enqueue_x2() at the end of the loop body, which is
what enqueues the original buffers to the next node. Those buffers then
leak, which worsens the very buffer exhaustion that made vlib_buffer_copy()
fail in the first place.
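
Schematically, the loop body looks like this (heavily simplified; the real
lcp_arp_phy_node loop processes two buffers per iteration and does more
per-buffer work):

  while (n_left_from >= 2)
    {
      /* ... load b0/b1, pick next0/next1, build reply_copies ... */

      c0 = vlib_buffer_copy (vm, b0); /* may return NULL under buffer
                                         exhaustion */

      /* A 'continue' above this point would skip the enqueue below,
         so the original buffers bi0/bi1 would never reach the next
         node and never be freed. */
      vlib_validate_buffer_enqueue_x2 (vm, node, next_index, to_next,
                                       n_left_to_next, bi0, bi1,
                                       next0, next1);
    }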

Thanks,
-Matt