On 5/12/25 4:23 PM, Daniel Borkmann wrote:
> On 5/12/25 2:03 PM, Ilya Maximets wrote:
>> On 5/9/25 4:05 PM, Daniel Borkmann wrote:
>>> On 5/9/25 12:53 AM, Ilya Maximets wrote:
>>>> On 5/8/25 2:34 PM, Daniel Borkmann wrote:
>>>>> Extend inhibit=on setting with the option to specify a pinned XSK map
>>>>> path along with a starting index (default 0) to push the created XSK
>>>>> sockets into. Example usage:
>>>>>
>>>>>   # ./build/qemu-system-x86_64 [...] \
>>>>>     -netdev af-xdp,ifname=eth0,id=net0,mode=native,queues=2,inhibit=on,map-path=/sys/fs/bpf/foo,map-start-index=2 \
>>>>>     -device virtio-net-pci,netdev=net0 [...]
>>>>>
>>>>> This is useful for the case where an existing XDP program with an XSK map
>>>>> is present on the AF_XDP supported phys device and the XSK map is not yet
>>>>> populated. Qemu will then push the XSK sockets into the specified map.
>>>>
>>>> Thanks for the patch!
>>>>
>>>> Could you, please, explain the use case a little more? Is this patch
>>>> aiming to improve usability? Do you have a specific use case in mind?
>>>
>>> The use case we have is basically that the phys NIC has an XDP program
>>> already attached which redirects into the xsk map (e.g. installed from a
>>> separate control plane), the xsk map got pinned during that process into
>>> the bpf fs, and now qemu is launched; it creates the xsk sockets and then
>>> places them into the map by gathering the map fd from the pinned bpf fs
>>> file.
>>
>> OK. That's what I thought. Would be good to expand the commit message
>> a bit explaining the use case.
>
> Ack, I already adjusted locally. Planning to send v2 ~today with your
> feedback incorporated. Much appreciated!
>
>>>> The main idea behind 'inhibit' is that the qemu doesn't need to have a lot
>>>> of privileges to use the pre-loaded program and the pre-created sockets.
>>>> But creating the sockets and setting them into a map doesn't allow us to
>>>> run without privileges, IIUC. May be worth mentioning at least in the
>>>> commit message.
>>>
>>> Yes, privileges for the above use case are still needed. Will clarify in
>>> the description.
>>
>> OK.
>>
>>>> Also, isn't map-start-index the same thing as start-queue? Do we need
>>>> both of them?
>>>
>>> I'd say yes, given it does not have to be an exact mapping wrt queue index
>>> to map slot. The default is 0 though, and I expect this to be the most
>>> used scenario.
>>
>> I'm still not sure about this. For example, libxdp treats queue id as a map
>> index. And this value is actually not being used much in libxdp when the
>> program load is inhibited. I see that with a custom XDP program the indexes
>> inside the map may not directly correspond to queues in the device, and, in
>> fact, may have no relation to the actual queues in the device at all.
>
> Right, that's correct.
>
>> However, we're still calling them "queues" from the QEMU interface (as in
>> the "queues" parameter of the net/af-xdp device), and QEMU will just treat
>> every slot in the BPF map as a separate queue, as this BPF map is
>> essentially the network device that QEMU is working with; it doesn't
>> actually know what's behind it.
>>
>> So, I think, it should be appropriate to simplify the interface and
>> just use the existing start-queue configuration knob for this.
>>
>> What do you think?
>
> I was thinking of an example like the below (plainly taken from the XDP
> example programs at github.com/xdp-project/bpf-examples).
>
> struct {
>     __uint(type, BPF_MAP_TYPE_XSKMAP);
>     __uint(max_entries, MAX_SOCKS);
>     __uint(key_size, sizeof(int));
>     __uint(value_size, sizeof(int));
> } xsks_map SEC(".maps");
>
> int num_socks = 0;
> static unsigned int rr;
>
> SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
> {
>     rr = (rr + 1) & (num_socks - 1);
>     return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
> }
>
> If we'd just reuse the start-queue configuration knob for this, then it
> wouldn't work. So I think having the flexibility of where to place the
> sockets in the map would make sense. But I can also drop that part if you
> think it does not warrant the extra knob and align to start-queue; then the
> map always needs to be of the same size as the combined NIC queues.
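For context, the mechanism being discussed — getting the map fd from the
pinned bpf fs path and writing the created XSK socket fds into the map
starting at some configurable index — would presumably boil down to
something like the libbpf-based sketch below (map_pin_path, map_start_index
and xsk_fds are illustrative names here, not the actual patch code):

  /* Rough sketch, assuming libbpf. */
  #include <bpf/bpf.h>

  static int push_xsks_into_pinned_map(const char *map_pin_path,
                                       int map_start_index,
                                       const int *xsk_fds, int n_socks)
  {
      /* fd of the XSKMAP pinned by the external control plane. */
      int map_fd = bpf_obj_get(map_pin_path);
      int i;

      if (map_fd < 0)
          return -1;

      for (i = 0; i < n_socks; i++) {
          /* Map slot to fill; not necessarily the device queue id. */
          int key = map_start_index + i;

          if (bpf_map_update_elem(map_fd, &key, &xsk_fds[i], 0))
              return -1;
      }
      return 0;
  }

The open question is which option should supply the starting value for
'key' above: the new map-start-index knob or the existing start-queue one.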
I'm a little confused here. The 'start-queue' is not used for anything
important, AFAICT, in case of inhibit=on. So why would re-using it instead
of adding a new config option reduce the number of available use cases?

Best regards, Ilya Maximets.