On 5/14/25 9:02 PM, Ilya Maximets wrote:
On 5/12/25 4:23 PM, Daniel Borkmann wrote:
On 5/12/25 2:03 PM, Ilya Maximets wrote:
On 5/9/25 4:05 PM, Daniel Borkmann wrote:
On 5/9/25 12:53 AM, Ilya Maximets wrote:
On 5/8/25 2:34 PM, Daniel Borkmann wrote:
Extend inhibit=on setting with the option to specify a pinned XSK map
path along with a starting index (default 0) to push the created XSK
sockets into. Example usage:

     # ./build/qemu-system-x86_64 [...] \
       -netdev af-xdp,ifname=eth0,id=net0,mode=native,queues=2,inhibit=on,map-path=/sys/fs/bpf/foo,map-start-index=2 \
       -device virtio-net-pci,netdev=net0 [...]

This is useful for the case where an existing XDP program with an XSK map
is already present on the AF_XDP-supported phys device but the XSK map is
not yet populated. QEMU will then push the XSK sockets into the specified map.
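
Conceptually, the insertion boils down to looking up the pinned map and
writing each AF_XDP socket fd into its slot. A minimal libbpf-based sketch
(illustrative only, not the actual patch code; pin_path/start_idx are
placeholder names):

    #include <errno.h>
    #include <bpf/bpf.h>     /* bpf_obj_get(), bpf_map_update_elem() */
    #include <xdp/xsk.h>     /* xsk_socket__fd() */

    static int add_xsk_to_pinned_map(const char *pin_path, /* e.g. map-path */
                                     int start_idx,        /* e.g. map-start-index */
                                     int queue, struct xsk_socket *xsk)
    {
        int key = start_idx + queue;
        int map_fd, sock_fd;

        map_fd = bpf_obj_get(pin_path);     /* fd of the pinned XSK map */
        if (map_fd < 0)
            return -errno;

        sock_fd = xsk_socket__fd(xsk);      /* AF_XDP socket created earlier */
        return bpf_map_update_elem(map_fd, &key, &sock_fd, 0 /* BPF_ANY */);
    }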

Thanks for the patch!

Could you please explain the use case a little more?  Is this patch
aiming to improve usability?  Do you have a specific use case in mind?

The use case we have is basically that the phys NIC already has an XDP program
attached which redirects into an xsk map (e.g. installed from a separate
control plane), the xsk map got pinned into bpf fs during that process, and
now qemu is launched: it creates the xsk sockets and then places them into
the map by retrieving the map fd from the pinned bpf fs file.

OK.  That's what I thought.  Would be good to expand the commit message
a bit explaining the use case.

Ack, I already adjusted locally. Planning to send v2 ~today with your feedback
incorporated. Much appreciated!

The main idea behind 'inhibit' is that qemu doesn't need to have a lot
of privileges to use the pre-loaded program and the pre-created sockets.
But creating the sockets and setting them into a map doesn't allow us to
run without privileges, IIUC.  May be worth mentioning at least in the
commit message.

Yes, privileges for above use case are still needed. Will clarify in the
description.

OK.

Also, isn't map-start-index the same thing as start-queue?  Do we need
both of them?

I'd say yes, given that it does not have to be an exact mapping from queue
index to map slot. The default is 0 though, and I expect this to be the most
used scenario.

I'm still not sure about this.  For example, libxdp treats queue id as a map
index.  And this value is actually not being used much in libxdp when the
program load is inhibited.  I see that with a custom XDP program the indexes
inside the map may not directly correspond to queues in the device, and, in
fact, may have no relation to the actual queues in the device at all.

Right, that's correct.

However, we're still calling them "queues" in the QEMU interface (as in the
"queues" parameter of the net/af-xdp device), and QEMU will just treat every
slot in the BPF map as a separate queue, since this BPF map is essentially the
network device that QEMU is working with; it doesn't actually know what's
behind it.

So, I think, it should be appropriate to simplify the interface and
just use existing start-queue configuration knob for this.

What do you think?

I was thinking of an example like the below (plainly taken from the XDP example
programs at github.com/xdp-project/bpf-examples).

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    #define MAX_SOCKS 16   /* e.g.; defined elsewhere in the original example */

    /* XSK map; slots are filled from user space with AF_XDP socket fds. */
    struct {
        __uint(type, BPF_MAP_TYPE_XSKMAP);
        __uint(max_entries, MAX_SOCKS);
        __uint(key_size, sizeof(int));
        __uint(value_size, sizeof(int));
    } xsks_map SEC(".maps");

    int num_socks = 0;            /* set from user space after socket creation */
    static unsigned int rr;

    SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
    {
        /* Round-robin across sockets (num_socks assumed to be a power of two);
         * the map index has no relation to the receiving NIC queue. */
        rr = (rr + 1) & (num_socks - 1);
        return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
    }

If we'd just reuse the start-queue configuration knob for this, then it wouldn't
work. So I think having the flexibility of where to place the sockets in the map
would make sense. But I can also drop that part if you think it does not warrant
the extra knob and align to start-queue, but then the map always needs to be of
the same size as the combined NIC queues.
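
For illustration (hypothetical values, reusing the option names from this
patch plus the existing start-queue knob), the map slots backing the qemu
queues could start at a different offset than the NIC queues:

    -netdev af-xdp,ifname=eth0,id=net0,queues=2,inhibit=on,start-queue=0,map-path=/sys/fs/bpf/foo,map-start-index=4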

I'm a little confused here.  The 'start-queue' is not used for anything
important, AFAICT, in case of inhibit=on.  So, why would re-using it instead
of adding a new config option reduce the number of available use cases?

Hm, maybe I'm missing something, but we use inhibit=on and do /not/ pass
sock-fds as a parameter and instead fully rely on qemu to create all related
af_xdp sockets. So the start-queue /is/ relevant for the underlying NIC queue
selection as we pass the queue_id to xsk_socket__create().
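
Roughly like this (illustrative sketch against libxdp's xsk.h, not the actual
qemu code; the flag and default-descriptor macro names are from that header):

    #include <xdp/xsk.h>

    /* Create the AF_XDP socket for qemu queue i: the queue_id argument selects
     * the underlying NIC queue (start-queue + i), which is independent of the
     * slot the socket later occupies in the XSK map (map-start-index + i). */
    static int create_queue_socket(const char *ifname, int start_queue, int i,
                                   struct xsk_umem *umem,
                                   struct xsk_ring_cons *rx,
                                   struct xsk_ring_prod *tx,
                                   struct xsk_socket **xsk)
    {
        struct xsk_socket_config cfg = {
            .rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
            .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
            /* inhibit=on: don't let libxdp load/attach its own XDP program */
            .libxdp_flags = XSK_LIBXDP_FLAGS__INHIBIT_PROG_LOAD,
        };

        return xsk_socket__create(xsk, ifname, start_queue + i, umem, rx, tx, &cfg);
    }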

Thanks,
Daniel
