On 20 Nov 2019, at 21:17, William Tu wrote:
On Tue, Nov 19, 2019 at 10:51:01AM -0800, William Tu wrote:
Hi Eelco,
Thanks for your testing.
On Tue, Nov 12, 2019 at 11:25:55AM +0100, Eelco Chaudron wrote:
See one remark below. However, when I did a quick test with a program
that would not load, it went into a retry loop:
2019-11-12T10:13:21.658Z|01609|netdev_afxdp|INFO|eno1: Removing xdp program.
2019-11-12T10:13:21.658Z|01610|netdev_afxdp|INFO|Removed program ID: 0, fd: 0
2019-11-12T10:13:21.658Z|01611|netdev_afxdp|INFO|eno1: Setting XDP mode to DRV.
2019-11-12T10:13:22.224Z|01612|netdev_afxdp|INFO|eno1: Load custom XDP program at /root/xdp_pass_kern.o.
2019-11-12T10:13:22.274Z|01613|netdev_afxdp|ERR|xsk_socket__create failed (Resource temporarily unavailable) mode: DRV use-need-wakeup: true qid: 0
I couldn't reproduce this issue.
Also, looking at xsk_socket__create(), I couldn't figure out how it
returns EAGAIN (Resource temporarily unavailable) in your case.
2019-11-12T10:13:22.300Z|01614|netdev_afxdp|ERR|Failed to create AF_XDP socket on queue 0.
2019-11-12T10:13:22.300Z|01615|netdev_afxdp|INFO|eno1: Removing xdp program.
2019-11-12T10:13:22.658Z|00047|ovs_rcu(urcu2)|WARN|blocked 1000 ms waiting for main to quiesce
2019-11-12T10:13:22.735Z|01616|netdev_afxdp|INFO|Removed program ID: 320, fd: 0
2019-11-12T10:13:22.735Z|01617|netdev_afxdp|ERR|AF_XDP device eno1 reconfig failed.
2019-11-12T10:13:22.735Z|01618|dpif_netdev|ERR|Failed to set interface eno1 new configuration
2019-11-12T10:13:22.735Z|01619|netdev_afxdp|INFO|FIXME: Device tapVM always use numa id 0.
2019-11-12T10:13:22.735Z|01620|dpif_netdev|INFO|Core 1 on numa node 0 assigned port 'tapVM' rx queue 0 (measured processing cycles 0).
2019-11-12T10:13:22.735Z|01621|dpif|WARN|netdev@ovs-netdev: failed to add eno1 as port: Invalid argument
2019-11-12T10:13:22.735Z|01622|netdev_afxdp|INFO|eno1: Removing xdp program.
2019-11-12T10:13:22.736Z|01623|netdev_afxdp|INFO|Removed program ID: 0, fd: 0
2019-11-12T10:13:22.736Z|01624|timeval|WARN|Unreasonably long 1079ms poll interval (7ms user, 80ms system)
2019-11-12T10:13:22.736Z|01625|timeval|WARN|faults: 16387 minor, 0 major
2019-11-12T10:13:22.736Z|01626|timeval|WARN|context switches: 327 voluntary, 337 involuntary
2019-11-12T10:13:22.738Z|01627|netdev_afxdp|INFO|eno1: Removing xdp program.
2019-11-12T10:13:22.738Z|01628|netdev_afxdp|INFO|Removed program ID: 0, fd: 0
2019-11-12T10:13:22.739Z|01629|netdev_afxdp|INFO|eno1: Setting XDP mode to DRV.
2019-11-12T10:13:23.312Z|01630|netdev_afxdp|INFO|eno1: Load custom XDP program at /root/xdp_pass_kern.o.
2019-11-12T10:13:23.363Z|01631|netdev_afxdp|ERR|xsk_socket__create failed (Resource temporarily unavailable) mode: DRV use-need-wakeup: true qid: 0
2019-11-12T10:13:23.392Z|01632|netdev_afxdp|ERR|Failed to create AF_XDP socket on queue 0.
2019-11-12T10:13:23.392Z|01633|netdev_afxdp|INFO|eno1: Removing xdp program.
2019-11-12T10:13:23.738Z|00048|ovs_rcu(urcu2)|WARN|blocked 1000 ms waiting for main to quiesce
2019-11-12T10:13:23.823Z|01634|netdev_afxdp|INFO|Removed program ID: 324, fd: 0
2019-11-12T10:13:23.823Z|01635|netdev_afxdp|ERR|AF_XDP device eno1 reconfig failed.
2019-11-12T10:13:23.823Z|01636|dpif_netdev|ERR|Failed to set interface eno1 new configuration
2019-11-12T10:13:23.823Z|01637|netdev_afxdp|INFO|FIXME: Device tapVM always use numa id 0.
2019-11-12T10:13:23.823Z|01638|dpif_netdev|INFO|Core 1 on numa node 0 assigned port 'tapVM' rx queue 0 (measured processing cycles 0).
2019-11-12T10:13:23.823Z|01639|dpif|WARN|netdev@ovs-netdev: failed to add eno1 as port: Invalid argument
In addition, during this loop it is not freeing its resources:
$ bpftool prog list && bpftool map
4: xdp tag 80b55d8a76303785
loaded_at 2019-11-12T05:11:15-0500 uid 0
xlated 136B jited 96B memlock 4096B map_ids 4
12: xdp name xdp_prog_simple tag 3b185187f1855c4c gpl
loaded_at 2019-11-12T05:11:58-0500 uid 0
xlated 16B jited 35B memlock 4096B
16: xdp name xdp_prog_simple tag 3b185187f1855c4c gpl
loaded_at 2019-11-12T05:11:59-0500 uid 0
xlated 16B jited 35B memlock 4096B
20: xdp name xdp_prog_simple tag 3b185187f1855c4c gpl
loaded_at 2019-11-12T05:12:00-0500 uid 0
xlated 16B jited 35B memlock 4096B
…
…
120: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
122: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
124: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
126: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
128: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
130: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
132: (null) name xsks_map flags 0x0
key 4B value 4B max_entries 32 memlock 4096B
And based on the above, the XDP program and map are already
loaded, so xsk_socket__create() has already reached the end:
if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
        err = xsk_setup_xdp_prog(xsk);
        if (err)
                goto out_mmap_tx;
So I assume something is wrong in xsk_setup_xdp_prog()?
I suspect bpf_get_link_xdp_id() returns EAGAIN?
I will keep debugging, but if you have time, could you help
find out why EAGAIN is returned? Thanks!
--William
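One way to narrow down which step reports EAGAIN is to log the return value of each call next to its errno text. The sketch below assumes the calls being traced follow the common "0 on success, negative errno on failure" convention; the helper names (describe_ret, ret_is_eagain, log_step) are hypothetical and are not part of OVS or libbpf.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an OVS or libbpf function): map a return value
 * that follows the "0 on success, -errno on failure" convention to a
 * human-readable string. */
static const char *
describe_ret(int ret)
{
    if (ret >= 0) {
        return "success";
    }
    return strerror(-ret);
}

/* Hypothetical helper: true if the return value encodes EAGAIN. */
static int
ret_is_eagain(int ret)
{
    return ret == -EAGAIN;
}

/* Hypothetical helper: print one line per traced step so the first step
 * that fails with EAGAIN stands out in the output. */
static void
log_step(const char *step, int ret)
{
    fprintf(stderr, "%s: ret=%d (%s)%s\n", step, ret, describe_ret(ret),
            ret_is_eagain(ret) ? " <-- EAGAIN" : "");
}
```

For example, calling log_step("xsk_setup_xdp_prog", ret) after each suspect call would show which step first returns -EAGAIN, provided that call really does return a negative errno value.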
Thanks for your help. I've found the root cause.
On the error-handling path, prog_fd and map_fd are not
saved, so xsk_destroy_all() does not clean them up properly.
Fixed by the diff below; I will send out a new version.
William
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index e079cd33bc9a..e9b8a43bb7c0 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -484,6 +484,8 @@ xsk_configure_all(struct netdev *netdev)
VLOG_INFO("%s: Load custom XDP program at %s.",
netdev_get_name(netdev), dev->xdpobj);
+ dev->prog_fd = prog_fd;
+ dev->map_fd = map_fd;
}
VLOG_DBG("%s: configure queue %d mode %s use-need-wakeup %s.",
@@ -502,8 +504,6 @@ xsk_configure_all(struct netdev *netdev)
atomic_init(&xsk_info->tx_dropped, 0);
xsk_info->outstanding_tx = 0;
xsk_info->available_rx = PROD_NUM_DESCS;
- dev->prog_fd = prog_fd;
- dev->map_fd = map_fd;
}
n_txq = netdev_n_txq(netdev);
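
The ordering issue addressed by the diff can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions: every name in it (fake_dev, fake_open, fake_destroy_all, fake_configure_all, fake_leak_after_failure) is hypothetical and merely mimics the shape of the real code, with a counter standing in for kernel resources. The point is that a cleanup routine can only release handles that were already recorded on the device when the failure happened.

```c
#include <stdbool.h>

/* All names here are hypothetical stand-ins, not OVS or libbpf symbols. */

static int open_handles;        /* Number of simulated handles still open. */

struct fake_dev {
    int prog_fd;                /* 0 means "nothing recorded". */
    int map_fd;
};

/* Simulate acquiring a resource; always returns a positive handle. */
static int
fake_open(void)
{
    open_handles++;
    return 100 + open_handles;
}

/* Simulate releasing a resource; unrecorded (0) handles are ignored. */
static void
fake_close(int fd)
{
    if (fd > 0) {
        open_handles--;
    }
}

/* Cleanup releases only what the device has recorded. */
static void
fake_destroy_all(struct fake_dev *dev)
{
    fake_close(dev->prog_fd);
    fake_close(dev->map_fd);
    dev->prog_fd = 0;
    dev->map_fd = 0;
}

/* Load a "program", then configure a queue that may fail. */
static int
fake_configure_all(struct fake_dev *dev, bool save_early, bool queue_fails)
{
    int prog_fd = fake_open();
    int map_fd = fake_open();

    if (save_early) {
        /* Fixed ordering: record the handles right after loading. */
        dev->prog_fd = prog_fd;
        dev->map_fd = map_fd;
    }

    if (queue_fails) {
        /* Error path: cleanup sees only the recorded handles. */
        fake_destroy_all(dev);
        return -1;
    }

    if (!save_early) {
        /* Old ordering: handles recorded only after queue setup succeeds. */
        dev->prog_fd = prog_fd;
        dev->map_fd = map_fd;
    }
    return 0;
}

/* Return how many handles remain open after a failed configure. */
static int
fake_leak_after_failure(bool save_early)
{
    struct fake_dev dev = { 0, 0 };

    open_handles = 0;
    fake_configure_all(&dev, save_early, true);
    return open_handles;
}
```

In this simulation, fake_leak_after_failure(false) leaves two handles open after the failed queue setup, while fake_leak_after_failure(true) leaves none. That is consistent with the growing list of programs and maps in the bpftool output above, where each retry leaves another program and map behind.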
Cool, thanks!
//Eelco