Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-04-12 Thread Björn Töpel
2018-04-11 20:43 GMT+02:00 Alexei Starovoitov :
> On 4/11/18 5:17 AM, Björn Töpel wrote:
>>
>>
>> In the current RFC you are required to create both an Rx and Tx
>> queue to bind the socket, which is just weird for your "Rx on one
>> device, Tx to another" scenario. I'll fix that in the next RFC.
>
> I would defer on adding new features until the key functionality
> lands.  imo it's in good shape and I would submit it without RFC tag
> as soon as net-next reopens.

Yes, makes sense. We're doing some ptr_ring-like vs head/tail
measurements, and depending on the result we'll send out a proper
patch when net-next is open again.

What tree should we target -- bpf-next or net-next?


Thanks!
Björn


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-04-11 Thread Alexei Starovoitov

On 4/11/18 5:17 AM, Björn Töpel wrote:


In the current RFC you are required to create both an Rx and Tx queue
to bind the socket, which is just weird for your "Rx on one device, Tx
to another" scenario. I'll fix that in the next RFC.


I would defer on adding new features until the key functionality lands.
imo it's in good shape and I would submit it without RFC tag as soon as
net-next reopens.



Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-04-11 Thread Björn Töpel
2018-04-10 16:14 GMT+02:00 William Tu :
> On Mon, Apr 9, 2018 at 11:47 PM, Björn Töpel  wrote:

[...]

>>>
>>
>> So you've set up two identical UMEMs? Then you can just forward the
>> incoming Rx descriptor to the other netdev's Tx queue. Note that you
>> only need to copy the descriptor, not the actual frame data.
>>
>
> Thanks!
> I will give it a try, I guess you're saying I can do below:
>
> int sfd1; // for device1
> int sfd2; // for device2
> ...
> // create 2 umem
> umem1 = calloc(1, sizeof(*umem1));
> umem2 = calloc(1, sizeof(*umem2));
>
> // allocate 1 shared buffer, 1 xdp_umem_reg
> posix_memalign(&bufs, ...)
> mr.addr = (__u64)bufs; // shared for umem1,2
> ...
>
> // umem reg the same mr
> setsockopt(sfd1, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr))
> setsockopt(sfd2, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr))
>
> // setup fill, completion, mmap for sfd1 and sfd2
> ...
>
> Since both devices can put frame data in 'bufs', I only need to copy
> the descs between umem1 and umem2. Do I understand correctly?
>

Yup, spot on! umem1 and umem2 have the same layout/index "address
space", so you can just forward the descriptors and never touch the
data.

In the current RFC you are required to create both an Rx and Tx queue
to bind the socket, which is just weird for your "Rx on one device, Tx
to another" scenario. I'll fix that in the next RFC.


Björn

> Regards,
> William


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-04-10 Thread William Tu
On Mon, Apr 9, 2018 at 11:47 PM, Björn Töpel  wrote:
> 2018-04-09 23:51 GMT+02:00 William Tu :
>> On Tue, Mar 27, 2018 at 9:59 AM, Björn Töpel  wrote:
>>> From: Björn Töpel 
>>>
>>> This RFC introduces a new address family called AF_XDP that is
>>> optimized for high performance packet processing and, in upcoming
>>> patch sets, zero-copy semantics. In this v2 version, we have removed
>>> all zero-copy related code in order to make it smaller, simpler and
>>> hopefully more review friendly. This RFC only supports copy-mode for
>>> the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for RX
>>> using the XDP_DRV path. Zero-copy support requires XDP and driver
>>> changes that Jesper Dangaard Brouer is working on. Some of his work is
>>> already on the mailing list for review. We will publish our zero-copy
>>> support for RX and TX on top of his patch sets at a later point in
>>> time.
>>>
>>> An AF_XDP socket (XSK) is created with the normal socket()
>>> syscall. Associated with each XSK are two queues: the RX queue and the
>>> TX queue. A socket can receive packets on the RX queue and it can send
>>> packets on the TX queue. These queues are registered and sized with
>>> the setsockopts XDP_RX_QUEUE and XDP_TX_QUEUE, respectively. It is
>>> mandatory to have at least one of these queues for each socket. In
>>> contrast to AF_PACKET V2/V3 these descriptor queues are separated from
>>> packet buffers. An RX or TX descriptor points to a data buffer in a
>>> memory area called a UMEM. RX and TX can share the same UMEM so that a
>>> packet does not have to be copied between RX and TX. Moreover, if a
>>> packet needs to be kept for a while due to a possible retransmit, the
>>> descriptor that points to that packet can be changed to point to
>>> another and reused right away. This again avoids copying data.
>>>
>>> This new dedicated packet buffer area is called a UMEM. It consists of
>>> a number of equally sized frames and each frame has a unique frame
>>> id. A descriptor in one of the queues references a frame by
>>> referencing its frame id. The user space allocates memory for this
>>> UMEM using whatever means it feels is most appropriate (malloc, mmap,
>>> huge pages, etc). This memory area is then registered with the kernel
>>> using the new setsockopt XDP_UMEM_REG. The UMEM also has two queues:
>>> the FILL queue and the COMPLETION queue. The fill queue is used by the
>>> application to send down frame ids for the kernel to fill in with RX
>>> packet data. References to these frames will then appear in the RX
>>> queue of the XSK once they have been received. The completion queue,
>>> on the other hand, contains frame ids that the kernel has transmitted
>>> completely and can now be used again by user space, for either TX or
>>> RX. Thus, the frame ids appearing in the completion queue are ids that
>>> were previously transmitted using the TX queue. In summary, the RX and
>>> FILL queues are used for the RX path and the TX and COMPLETION queues
>>> are used for the TX path.
>>>
>> Can we register a UMEM to multiple device's queue?
>>
>
> No, one UMEM, one netdev queue in this RFC. That being said, there's
> nothing stopping a user from creating an additional UMEM, say UMEM',
> pointing to the same memory as UMEM, but bound to another
> netdev/queue. Note that the user space application has to make sure
> that the buffer handling is sane (user/kernel frame ownership).
>
> We used to allow sharing a UMEM between unrelated sockets, but after
> the introduction of the UMEM queues (fill/completion) that's no longer
> the case. For the zero-copy scenario, having to manage multiple
> DMA mappings per UMEM was a bit of a mess, so we went for the simpler
> (current) solution with one UMEM per netdev/queue.
>
>> So far the l2fwd sample code is sending/receiving from the same
>> queue. I'm thinking about forwarding packets from one device to another.
>> Now I'm copying packets from one device's RX desc to another device's TX
>> completion queue. But this introduces one extra copy.
>>
>
> So you've set up two identical UMEMs? Then you can just forward the
> incoming Rx descriptor to the other netdev's Tx queue. Note that you
> only need to copy the descriptor, not the actual frame data.
>

Thanks!
I will give it a try, I guess you're saying I can do below:

int sfd1; // for device1
int sfd2; // for device2
...
// create 2 umem
umem1 = calloc(1, sizeof(*umem1));
umem2 = calloc(1, sizeof(*umem2));

// allocate 1 shared buffer, 1 xdp_umem_reg
posix_memalign(&bufs, ...)
mr.addr = (__u64)bufs; // shared for umem1,2
...

// umem reg the same mr
setsockopt(sfd1, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr))
setsockopt(sfd2, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr))

// setup fill, completion, mmap for sfd1 and sfd2
...

Since both devices can put frame data in 'bufs', I only need to copy
the descs between umem1 and umem2. Do I understand correctly?

Regards,
William


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-04-09 Thread Björn Töpel
2018-04-09 23:51 GMT+02:00 William Tu :
> On Tue, Mar 27, 2018 at 9:59 AM, Björn Töpel  wrote:
>> From: Björn Töpel 
>>
>> This RFC introduces a new address family called AF_XDP that is
>> optimized for high performance packet processing and, in upcoming
>> patch sets, zero-copy semantics. In this v2 version, we have removed
>> all zero-copy related code in order to make it smaller, simpler and
>> hopefully more review friendly. This RFC only supports copy-mode for
>> the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for RX
>> using the XDP_DRV path. Zero-copy support requires XDP and driver
>> changes that Jesper Dangaard Brouer is working on. Some of his work is
>> already on the mailing list for review. We will publish our zero-copy
>> support for RX and TX on top of his patch sets at a later point in
>> time.
>>
>> An AF_XDP socket (XSK) is created with the normal socket()
>> syscall. Associated with each XSK are two queues: the RX queue and the
>> TX queue. A socket can receive packets on the RX queue and it can send
>> packets on the TX queue. These queues are registered and sized with
>> the setsockopts XDP_RX_QUEUE and XDP_TX_QUEUE, respectively. It is
>> mandatory to have at least one of these queues for each socket. In
>> contrast to AF_PACKET V2/V3 these descriptor queues are separated from
>> packet buffers. An RX or TX descriptor points to a data buffer in a
>> memory area called a UMEM. RX and TX can share the same UMEM so that a
>> packet does not have to be copied between RX and TX. Moreover, if a
>> packet needs to be kept for a while due to a possible retransmit, the
>> descriptor that points to that packet can be changed to point to
>> another and reused right away. This again avoids copying data.
>>
>> This new dedicated packet buffer area is called a UMEM. It consists of
>> a number of equally sized frames and each frame has a unique frame
>> id. A descriptor in one of the queues references a frame by
>> referencing its frame id. The user space allocates memory for this
>> UMEM using whatever means it feels is most appropriate (malloc, mmap,
>> huge pages, etc). This memory area is then registered with the kernel
>> using the new setsockopt XDP_UMEM_REG. The UMEM also has two queues:
>> the FILL queue and the COMPLETION queue. The fill queue is used by the
>> application to send down frame ids for the kernel to fill in with RX
>> packet data. References to these frames will then appear in the RX
>> queue of the XSK once they have been received. The completion queue,
>> on the other hand, contains frame ids that the kernel has transmitted
>> completely and can now be used again by user space, for either TX or
>> RX. Thus, the frame ids appearing in the completion queue are ids that
>> were previously transmitted using the TX queue. In summary, the RX and
>> FILL queues are used for the RX path and the TX and COMPLETION queues
>> are used for the TX path.
>>
> Can we register a UMEM to multiple device's queue?
>

No, one UMEM, one netdev queue in this RFC. That being said, there's
nothing stopping a user from creating an additional UMEM, say UMEM',
pointing to the same memory as UMEM, but bound to another
netdev/queue. Note that the user space application has to make sure
that the buffer handling is sane (user/kernel frame ownership).

We used to allow sharing a UMEM between unrelated sockets, but after
the introduction of the UMEM queues (fill/completion) that's no longer
the case. For the zero-copy scenario, having to manage multiple
DMA mappings per UMEM was a bit of a mess, so we went for the simpler
(current) solution with one UMEM per netdev/queue.

> So far the l2fwd sample code is sending/receiving from the same
> queue. I'm thinking about forwarding packets from one device to another.
> Now I'm copying packets from one device's RX desc to another device's TX
> completion queue. But this introduces one extra copy.
>

So you've set up two identical UMEMs? Then you can just forward the
incoming Rx descriptor to the other netdev's Tx queue. Note that you
only need to copy the descriptor, not the actual frame data.

> One way I can do is to call bpf_redirect helper function, but sometimes
> I still need to process the packet in userspace.
>
> I like this work!
> Thanks a lot.

Happy to hear that, and thanks a bunch for trying it out. Keep that
feedback coming!


Björn

> William


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-04-09 Thread William Tu
On Tue, Mar 27, 2018 at 9:59 AM, Björn Töpel  wrote:
> From: Björn Töpel 
>
> This RFC introduces a new address family called AF_XDP that is
> optimized for high performance packet processing and, in upcoming
> patch sets, zero-copy semantics. In this v2 version, we have removed
> all zero-copy related code in order to make it smaller, simpler and
> hopefully more review friendly. This RFC only supports copy-mode for
> the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for RX
> using the XDP_DRV path. Zero-copy support requires XDP and driver
> changes that Jesper Dangaard Brouer is working on. Some of his work is
> already on the mailing list for review. We will publish our zero-copy
> support for RX and TX on top of his patch sets at a later point in
> time.
>
> An AF_XDP socket (XSK) is created with the normal socket()
> syscall. Associated with each XSK are two queues: the RX queue and the
> TX queue. A socket can receive packets on the RX queue and it can send
> packets on the TX queue. These queues are registered and sized with
> the setsockopts XDP_RX_QUEUE and XDP_TX_QUEUE, respectively. It is
> mandatory to have at least one of these queues for each socket. In
> contrast to AF_PACKET V2/V3 these descriptor queues are separated from
> packet buffers. An RX or TX descriptor points to a data buffer in a
> memory area called a UMEM. RX and TX can share the same UMEM so that a
> packet does not have to be copied between RX and TX. Moreover, if a
> packet needs to be kept for a while due to a possible retransmit, the
> descriptor that points to that packet can be changed to point to
> another and reused right away. This again avoids copying data.
>
> This new dedicated packet buffer area is called a UMEM. It consists of
> a number of equally sized frames and each frame has a unique frame
> id. A descriptor in one of the queues references a frame by
> referencing its frame id. The user space allocates memory for this
> UMEM using whatever means it feels is most appropriate (malloc, mmap,
> huge pages, etc). This memory area is then registered with the kernel
> using the new setsockopt XDP_UMEM_REG. The UMEM also has two queues:
> the FILL queue and the COMPLETION queue. The fill queue is used by the
> application to send down frame ids for the kernel to fill in with RX
> packet data. References to these frames will then appear in the RX
> queue of the XSK once they have been received. The completion queue,
> on the other hand, contains frame ids that the kernel has transmitted
> completely and can now be used again by user space, for either TX or
> RX. Thus, the frame ids appearing in the completion queue are ids that
> were previously transmitted using the TX queue. In summary, the RX and
> FILL queues are used for the RX path and the TX and COMPLETION queues
> are used for the TX path.
>
Can we register a UMEM to multiple device's queue?

So far the l2fwd sample code is sending/receiving from the same
queue. I'm thinking about forwarding packets from one device to another.
Now I'm copying packets from one device's RX desc to another device's TX
completion queue. But this introduces one extra copy.

One way I can do is to call bpf_redirect helper function, but sometimes
I still need to process the packet in userspace.

I like this work!
Thanks a lot.
William


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-03-29 Thread Jesper Dangaard Brouer


On Thu, 29 Mar 2018 08:16:23 +0200 Björn Töpel  wrote:

> 2018-03-28 23:18 GMT+02:00 Eric Leblond :
> > Hello,
> >
> > On Tue, 2018-03-27 at 18:59 +0200, Björn Töpel wrote:  
> >> From: Björn Töpel 
> >>
> >>  
> >> optimized for high performance packet processing and, in upcoming
> >> patch sets, zero-copy semantics. In this v2 version, we have removed
> >> all zero-copy related code in order to make it smaller, simpler and
> >> hopefully more review friendly. This RFC only supports copy-mode for
> >> the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for
> >> RX
> >>  
> >
> > ...  
> >>
> >> How are packets then distributed between these two XSKs? We have
> >> introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in
> >> full). The user-space application can place an XSK at an arbitrary
> >> place in this map. The XDP program can then redirect a packet to a
> >> specific index in this map and at this point XDP validates that the
> >> XSK in that map was indeed bound to that device and queue number. If
> >> not, the packet is dropped. If the map is empty at that index, the
> >> packet is also dropped. This also means that it is currently
> >> mandatory
> >> to have an XDP program loaded (and one XSK in the XSKMAP) to be able
> >> to get any traffic to user space through the XSK.  
> >
> > If I get it correctly, this feature will have to be used to bind
> > multiple sockets to a single queue, and the eBPF filter will be
> > responsible for the load balancing. Am I correct?
> >  
> 
> Exactly! The XDP program executing for a certain Rx queue will
> distribute the packets to the socket(s) in the xskmap.

It is important to understand that we (want/need to) maintain a Single
Producer Single Consumer (SPSC) scenario here, for performance reasons.

This _is_ maintained in this patch set, AFAIK. But as the API user, you
have to understand that the responsibility for keeping this aligned is
yours! If you don't, the frames are silently dropped.
  The BPF programmer MUST select the correct XSKMAP index, such that
ctx->rx_queue_index matches the queue_id registered in the xdp_sock
(and the bound ifindex also matches).


Björn, Magnus and I have discussed other API options, e.g. one where
the XSKMAP index _is_ the rx_queue_index, and the BPF programmer is
not allowed to select another index. We settled on the API in the
patch set, where the BPF programmer gets more freedom and can select
an invalid index, causing packets to be dropped.

An advantage of this API is that we allow one RX-queue to multiplex
into many xdp_socks (all bound to this same RX-queue and ifindex).
This still maintains a Single Producer, as the RX-queue simply has a
Single Producer relationship with each xdp_sock.


I imagine that Suricata/Eric wants to capture all the RX-queues on
the net_device. For this to happen, he needs to create an xdp_sock per
RX-queue, and either have a side BPF map that assists in the XSKMAP
lookup, or simply populate the XSKMAP to correspond to the
rx_queue_index.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-03-28 Thread Björn Töpel
2018-03-28 23:18 GMT+02:00 Eric Leblond :
> Hello,
>
> On Tue, 2018-03-27 at 18:59 +0200, Björn Töpel wrote:
>> From: Björn Töpel 
>>
>>
>> optimized for high performance packet processing and, in upcoming
>> patch sets, zero-copy semantics. In this v2 version, we have removed
>> all zero-copy related code in order to make it smaller, simpler and
>> hopefully more review friendly. This RFC only supports copy-mode for
>> the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for
>> RX
>>
>
> ...
>>
>> How are packets then distributed between these two XSKs? We have
>> introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in
>> full). The user-space application can place an XSK at an arbitrary
>> place in this map. The XDP program can then redirect a packet to a
>> specific index in this map and at this point XDP validates that the
>> XSK in that map was indeed bound to that device and queue number. If
>> not, the packet is dropped. If the map is empty at that index, the
>> packet is also dropped. This also means that it is currently
>> mandatory
>> to have an XDP program loaded (and one XSK in the XSKMAP) to be able
>> to get any traffic to user space through the XSK.
>
> If I get it correctly, this feature will have to be used to bind
> multiple sockets to a single queue, and the eBPF filter will be
> responsible for the load balancing. Am I correct?
>

Exactly! The XDP program executing for a certain Rx queue will
distribute the packets to the socket(s) in the xskmap.

>> AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If
>> the
>> driver does not have support for XDP, or XDP_SKB is explicitly chosen
> ...
>
> Thanks a lot for this work, I'm gonna try to implement this in
> Suricata.
>

Thanks for trying it out! All input is very much appreciated
(clunkiness of API, crashes...)!


Björn

> Best regards,
> --
> Eric Leblond


Re: [RFC PATCH v2 00/14] Introducing AF_XDP support

2018-03-28 Thread Eric Leblond
Hello,

On Tue, 2018-03-27 at 18:59 +0200, Björn Töpel wrote:
> From: Björn Töpel 
> 
> 
> optimized for high performance packet processing and, in upcoming
> patch sets, zero-copy semantics. In this v2 version, we have removed
> all zero-copy related code in order to make it smaller, simpler and
> hopefully more review friendly. This RFC only supports copy-mode for
> the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for
> RX
> 

...
> 
> How are packets then distributed between these two XSKs? We have
> introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in
> full). The user-space application can place an XSK at an arbitrary
> place in this map. The XDP program can then redirect a packet to a
> specific index in this map and at this point XDP validates that the
> XSK in that map was indeed bound to that device and queue number. If
> not, the packet is dropped. If the map is empty at that index, the
> packet is also dropped. This also means that it is currently
> mandatory
> to have an XDP program loaded (and one XSK in the XSKMAP) to be able
> to get any traffic to user space through the XSK.

If I get it correctly, this feature will have to be used to bind
multiple sockets to a single queue, and the eBPF filter will be
responsible for the load balancing. Am I correct?

> AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If
> the
> driver does not have support for XDP, or XDP_SKB is explicitly chosen
...

Thanks a lot for this work, I'm gonna try to implement this in
Suricata.

Best regards,
--
Eric Leblond


[RFC PATCH v2 00/14] Introducing AF_XDP support

2018-03-27 Thread Björn Töpel
From: Björn Töpel 

This RFC introduces a new address family called AF_XDP that is
optimized for high performance packet processing and, in upcoming
patch sets, zero-copy semantics. In this v2 version, we have removed
all zero-copy related code in order to make it smaller, simpler and
hopefully more review friendly. This RFC only supports copy-mode for
the generic XDP path (XDP_SKB) for both RX and TX and copy-mode for RX
using the XDP_DRV path. Zero-copy support requires XDP and driver
changes that Jesper Dangaard Brouer is working on. Some of his work is
already on the mailing list for review. We will publish our zero-copy
support for RX and TX on top of his patch sets at a later point in
time.

An AF_XDP socket (XSK) is created with the normal socket()
syscall. Associated with each XSK are two queues: the RX queue and the
TX queue. A socket can receive packets on the RX queue and it can send
packets on the TX queue. These queues are registered and sized with
the setsockopts XDP_RX_QUEUE and XDP_TX_QUEUE, respectively. It is
mandatory to have at least one of these queues for each socket. In
contrast to AF_PACKET V2/V3 these descriptor queues are separated from
packet buffers. An RX or TX descriptor points to a data buffer in a
memory area called a UMEM. RX and TX can share the same UMEM so that a
packet does not have to be copied between RX and TX. Moreover, if a
packet needs to be kept for a while due to a possible retransmit, the
descriptor that points to that packet can be changed to point to
another and reused right away. This again avoids copying data.

This new dedicated packet buffer area is called a UMEM. It consists of
a number of equally sized frames and each frame has a unique frame
id. A descriptor in one of the queues references a frame by
referencing its frame id. The user space allocates memory for this
UMEM using whatever means it feels is most appropriate (malloc, mmap,
huge pages, etc). This memory area is then registered with the kernel
using the new setsockopt XDP_UMEM_REG. The UMEM also has two queues:
the FILL queue and the COMPLETION queue. The fill queue is used by the
application to send down frame ids for the kernel to fill in with RX
packet data. References to these frames will then appear in the RX
queue of the XSK once they have been received. The completion queue,
on the other hand, contains frame ids that the kernel has transmitted
completely and can now be used again by user space, for either TX or
RX. Thus, the frame ids appearing in the completion queue are ids that
were previously transmitted using the TX queue. In summary, the RX and
FILL queues are used for the RX path and the TX and COMPLETION queues
are used for the TX path.

The socket is then finally bound with a bind() call to a device and a
specific queue id on that device, and it is not until bind is
completed that traffic starts to flow. Note that in this RFC, all
packet data is copied out to user-space.

A new feature in this RFC is that the UMEM can be shared between
processes, if desired. If a process wants to do this, it simply skips
the registration of the UMEM and its corresponding two queues, sets a
flag in the bind call and submits the XSK of the process it would like
to share UMEM with as well as its own newly created XSK socket. The
new process will then receive frame id references in its own RX queue
that point to this shared UMEM. Note that since the queue structures
are single-consumer / single-producer (for performance reasons), the
new process has to create its own socket with associated RX and TX
queues, since it cannot share this with the other process. This is
also the reason that there is only one set of FILL and COMPLETION
queues per UMEM. It is the responsibility of a single process to
handle the UMEM. If multiple-producer / multiple-consumer queues are
implemented in the future, this requirement could be relaxed.

How are packets then distributed between these two XSKs? We have
introduced a new BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in
full). The user-space application can place an XSK at an arbitrary
place in this map. The XDP program can then redirect a packet to a
specific index in this map and at this point XDP validates that the
XSK in that map was indeed bound to that device and queue number. If
not, the packet is dropped. If the map is empty at that index, the
packet is also dropped. This also means that it is currently mandatory
to have an XDP program loaded (and one XSK in the XSKMAP) to be able
to get any traffic to user space through the XSK.

AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the
driver does not have support for XDP, or XDP_SKB is explicitly chosen
when loading the XDP program, XDP_SKB mode is employed that uses SKBs
together with the generic XDP support and copies out the data to user
space. A fallback mode that works for any network device. On the other
hand, if the driver has support for XDP, it will be used by the AF_XDP
code to provide