On 20 Aug 2019, at 12:10, Ilya Maximets wrote:
On 14.08.2019 19:16, William Tu wrote:
On Wed, Aug 14, 2019 at 7:58 AM William Tu <u9012...@gmail.com> wrote:
On Wed, Aug 14, 2019 at 5:09 AM Eelco Chaudron <echau...@redhat.com> wrote:
On 8 Aug 2019, at 17:38, Ilya Maximets wrote:
<SNIP>
I see a rather high number of afxdp_cq_skip, which should to my knowledge never happen?
I tried to investigate this previously, but didn't find anything suspicious.
To my knowledge, this should never happen either.
However, I only looked at the code without actually running it, because I had no HW available for testing.
While investigating and stress-testing virtual ports I found a few issues with missing locking inside the kernel, so I don't trust the kernel part of the XDP implementation. I suspect there are other bugs in kernel/libbpf that can only be reproduced with driver mode.
This never happens for virtual ports in SKB mode, so I never saw this coverage counter being non-zero.
Did some quick debugging, as something else has come up that needs my attention :)
Once I'm in the faulty state and send a single packet, causing afxdp_complete_tx() to be called, it tells me 2048 descriptors are ready, which is XSK_RING_PROD__DEFAULT_NUM_DESCS. So I guess there might be some ring management bug. Maybe consumer and producer are equal, meaning 0 buffers, but it returns the max? I did not look at the kernel code, so this is just a wild guess :)
(gdb) p tx_done
$3 = 2048
(gdb) p umem->cq
$4 = {cached_prod = 3830466864, cached_cons = 3578066899, mask = 2047, size = 2048, producer = 0x7f08486b8000, consumer = 0x7f08486b8040, ring = 0x7f08486b8080}
Thanks for debugging!
xsk_ring_cons__peek() just returns the difference between cached_prod and cached_cons, but these values are too far apart:
3830466864 - 3578066899 = 252399965
Since this value is larger than the requested number, it returns the requested number (2048).
So, the ring is broken, or at least its 'cached' part is. It would be good to look at the *consumer and *producer values to verify the state of the actual ring.
I’ll try to find some more time next week to debug further.
William, I noticed your email in xdp-newbies where you mention this problem of getting the wrong pointers. Did you ever follow up, or do further troubleshooting on the above?
Yes, I posted here:
https://www.spinics.net/lists/xdp-newbies/msg00956.html
"Question/Bug about AF_XDP idx_cq from xsk_ring_cons__peek?"
At that time I was planning to reproduce the problem using the xdpsock sample code from the kernel, but it turned out that my reproduction code was not correct, so it was not able to show the case we hit here in OVS. Then I ported more of the OVS code logic into xdpsock, but the problem did not show up. As a result, I worked around it by marking the addr as "*addr == UINT64_MAX".
I will debug again this week once I get my testbed back.
Just to refresh my memory, the original issue is that when calling:
tx_done = xsk_ring_cons__peek(&umem->cq, CONS_NUM_DESCS, &idx_cq);
xsk_ring_cons__release(&umem->cq, tx_done);
I expect there to be 'tx_done' elems on the CQ to recycle back to the memory pool. However, when I inspect these elems, I find some that had already been reported complete the last time I called xsk_ring_cons__peek. In other words, some elems show up on the CQ twice, and this causes an overflow of the mempool. Thus, I mark the elems on the CQ as UINT64_MAX to indicate that we have already seen them.
William, Eelco, which HW NIC are you using? Which kernel driver?
I’m using the below on the latest bpf-next driver:
01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
//Eelco
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev