Add the initial skeleton for the rtap poll mode driver, a virtual
ethernet device that uses Linux io_uring for packet I/O with kernel
TAP devices.

This patch includes:
  - MAINTAINERS entry
  - Driver documentation (doc/guides/nics/rtap.rst)
  - Feature matrix (doc/guides/nics/features/rtap.ini)
  - Release notes update
  - Meson build integration with liburing dependency
  - Header file with shared data structures and declarations
  - Stub probe/remove handlers that register the vdev driver
  - Minimal dev_ops with only dev_close implemented

The driver registers as net_rtap and is Linux-only.
It requires the liburing library version 2.0 or later.
Earlier versions have known security and build issues.
The library is available in all currently supported distributions
(Debian 12+, Ubuntu 22.04+, RHEL 9+, Fedora 35+).

Signed-off-by: Stephen Hemminger <[email protected]>
---
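Note for reviewers (not part of the commit message): the iface= check
done by rtap_parse_iface() can be tried outside of DPDK. The following
is a minimal sketch under assumptions: check_iface_name() is a
hypothetical standalone name, and snprintf() stands in for strlcpy()
so that it also builds on C libraries that lack strlcpy().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for strnlen() and IFNAMSIZ under strict -std= modes */
#endif
#include <errno.h>
#include <net/if.h>	/* IFNAMSIZ */
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical standalone version of the iface= validation.
 * Rejects NULL, the empty string, and any name that does not fit
 * in IFNAMSIZ bytes including the terminating NUL.
 */
int
check_iface_name(const char *value, char name[IFNAMSIZ])
{
	if (value == NULL || value[0] == '\0' ||
	    strnlen(value, IFNAMSIZ) == IFNAMSIZ)
		return -EINVAL;

	/* assumption: snprintf used here instead of strlcpy */
	snprintf(name, IFNAMSIZ, "%s", value);
	return 0;
}
```

With IFNAMSIZ being 16 on Linux, a 15-character name is accepted and a
16-character name is rejected, since it would leave no room for the
terminating NUL.
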
 MAINTAINERS                            |   7 +
 doc/guides/nics/features/rtap.ini      |  13 ++
 doc/guides/nics/index.rst              |   1 +
 doc/guides/nics/rtap.rst               | 101 +++++++++++++++
 doc/guides/rel_notes/release_26_03.rst |   6 +
 drivers/net/meson.build                |   1 +
 drivers/net/rtap/meson.build           |  26 ++++
 drivers/net/rtap/rtap.h                |  69 ++++++++++
 drivers/net/rtap/rtap_ethdev.c         | 172 +++++++++++++++++++++++++
 9 files changed, 396 insertions(+)
 create mode 100644 doc/guides/nics/features/rtap.ini
 create mode 100644 doc/guides/nics/rtap.rst
 create mode 100644 drivers/net/rtap/meson.build
 create mode 100644 drivers/net/rtap/rtap.h
 create mode 100644 drivers/net/rtap/rtap_ethdev.c
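
The Limitations section of rtap.rst mentions that io_uring can be
turned off through the kernel.io_uring_disabled sysctl. When debugging
a setup where the driver fails to start, a check along these lines may
help. It is only a sketch: both function names are hypothetical and are
not part of this patch. The value meanings follow the kernel sysctl
documentation (0: enabled, 1: creation restricted to privileged
processes or members of kernel.io_uring_group, 2: disabled for all).

```c
#include <stdio.h>

/*
 * Hypothetical helper: read an integer sysctl value from procfs.
 * Returns the value read, or -1 if the file is absent or unreadable
 * (for example on kernels that do not provide this sysctl).
 */
int
read_io_uring_disabled(const char *path)
{
	FILE *f = fopen(path, "r");
	int val = -1;

	if (f == NULL)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/* Hypothetical helper: map the sysctl value to a short description. */
const char *
io_uring_state_name(int val)
{
	switch (val) {
	case 0:
		return "enabled";
	case 1:
		return "restricted";
	case 2:
		return "disabled";
	default:
		return "unknown";
	}
}
```

A caller would typically pass "/proc/sys/kernel/io_uring_disabled" as
the path and treat -1 as "sysctl not present".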

diff --git a/MAINTAINERS b/MAINTAINERS
index 5683b87e4a..3d0877fdc7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1135,6 +1135,13 @@ F: doc/guides/nics/pcap_ring.rst
 F: app/test/test_pmd_ring.c
 F: app/test/test_pmd_ring_perf.c
 
+Rtap PMD - EXPERIMENTAL
+M: Stephen Hemminger <[email protected]>
+F: drivers/net/rtap/
+F: app/test/test_pmd_rtap.c
+F: doc/guides/nics/rtap.rst
+F: doc/guides/nics/features/rtap.ini
+
 Null Networking PMD
 M: Tetsuya Mukawa <[email protected]>
 F: drivers/net/null/
diff --git a/doc/guides/nics/features/rtap.ini b/doc/guides/nics/features/rtap.ini
new file mode 100644
index 0000000000..ed7c638029
--- /dev/null
+++ b/doc/guides/nics/features/rtap.ini
@@ -0,0 +1,13 @@
+;
+; Supported features of the 'rtap' driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Linux                = Y
+ARMv7                = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index b00ed998c5..274575fe70 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -65,6 +65,7 @@ Network Interface Controller Drivers
     qede
     r8169
     rnp
+    rtap
     sfc_efx
     softnic
     tap
diff --git a/doc/guides/nics/rtap.rst b/doc/guides/nics/rtap.rst
new file mode 100644
index 0000000000..1c1cb8dd58
--- /dev/null
+++ b/doc/guides/nics/rtap.rst
@@ -0,0 +1,101 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+
+RTAP Poll Mode Driver
+=======================
+
+The RTAP Poll Mode Driver (PMD) is similar to the TAP PMD. It is a
+virtual device that uses Linux io_uring for efficient packet I/O with
+the Linux kernel.
+It is useful when writing DPDK applications that need to support interaction
+with the Linux TCP/IP stack for control plane or tunneling.
+
+The RTAP PMD creates a kernel network device that can be
+managed by standard tools such as ``ip`` and ``ethtool`` commands.
+
+From a DPDK application, the RTAP device looks like a DPDK ethdev.
+It supports the standard DPDK APIs to query for information, statistics,
+and send/receive packets.
+
+Features
+--------
+
+- Uses io_uring for asynchronous packet I/O via read/write and readv/writev
+- TX offloads: multi-segment, UDP checksum, TCP checksum, TCP segmentation (TSO)
+- RX offloads: UDP checksum, TCP checksum, TCP LRO, scatter
+- Virtio net header support for offload negotiation with the kernel
+- Multi-queue support (up to 128 queues)
+- Multi-process support (secondary processes receive queue fds from primary)
+- Link state change notification via netlink
+- Rx interrupt support for power-aware applications (eventfd per queue)
+- Promiscuous and allmulticast mode
+- MAC address configuration
+- MTU update
+- Link up/down control
+- Basic and per-queue statistics
+
+Requirements
+------------
+
+- **liburing >= 2.0**.  Earlier versions have known security and build issues.
+
+- The kernel must support ``IORING_ASYNC_CANCEL_ALL`` (upstream since 5.19).
+  The meson build checks for this symbol and will not build the driver
+  if the installed kernel headers do not provide it.  Because enterprise
+  distributions backport features independently of version numbers,
+  the driver avoids hard-coding a kernel version check.
+
+Known working distributions:
+
+- Debian 12 (Bookworm) or later
+- Ubuntu 24.04 (Noble) or later (22.04 with HWE kernel)
+- Fedora 37 or later
+- SUSE Linux Enterprise 15 SP6 or later / openSUSE Tumbleweed
+
+RHEL 9 ships io_uring only as a Technology Preview (disabled by default)
+and is not supported.
+
+For more info on io_uring, please see:
+
+- `io_uring on Wikipedia <https://en.wikipedia.org/wiki/Io_uring>`_
+- `liburing on GitHub <https://github.com/axboe/liburing>`_
+
+
+Arguments
+---------
+
+RTAP devices are created with the ``--vdev=net_rtap0`` command line option.
+Multiple devices can be created by repeating the option with different device names
+(``net_rtap1``, ``net_rtap2``, etc.).
+
+By default, the Linux interfaces are named ``rtap0``, ``rtap1``, etc.
+The interface name can be set with the ``iface`` argument, for example::
+
+   --vdev=net_rtap0,iface=io0 --vdev=net_rtap1,iface=io1 ...
+
+The PMD inherits the MAC address assigned by the kernel, which is
+a locally administered random Ethernet address.
+
+Normally, when the DPDK application exits, the RTAP device is removed.
+This behavior can be overridden with the ``persist`` flag, which
+causes the kernel network interface to survive application exit. Example::
+
+  --vdev=net_rtap0,iface=io0,persist ...
+
+
+Limitations
+-----------
+
+- The kernel must have io_uring support with ``IORING_ASYNC_CANCEL_ALL``
+  (upstream since 5.19, but may be backported by distributions).
+  io_uring support may also be disabled in some environments or by security policies
+  (for example, Docker disables io_uring in its default seccomp profile,
+  and RHEL 9 disables it via ``kernel.io_uring_disabled`` sysctl).
+
+- Since the RTAP device uses a file descriptor to talk to the kernel,
+  the same number of queues must be specified for receive and transmit.
+
+- The maximum number of queues is 128.
+
+- No flow support. Receive queue selection for incoming packets is determined
+  by the Linux kernel. See kernel documentation for more info:
+  https://www.kernel.org/doc/html/latest/networking/scaling.html
diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst
index 031eaa657e..db5c61a15c 100644
--- a/doc/guides/rel_notes/release_26_03.rst
+++ b/doc/guides/rel_notes/release_26_03.rst
@@ -63,6 +63,12 @@ New Features
 
   * Added support for pre and post VF reset callbacks.
 
+* **Added rtap virtual ethernet driver.**
+
+  Added a new experimental virtual device driver that uses Linux io_uring
+  for packet injection into the kernel network stack.
+  It requires Linux kernel 5.19 or later and liburing 2.0 or later.
+
 
 Removed Items
 -------------
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index c7dae4ad27..ef1ee68385 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -56,6 +56,7 @@ drivers = [
         'r8169',
         'ring',
         'rnp',
+        'rtap',
         'sfc',
         'softnic',
         'tap',
diff --git a/drivers/net/rtap/meson.build b/drivers/net/rtap/meson.build
new file mode 100644
index 0000000000..7bd7806ef3
--- /dev/null
+++ b/drivers/net/rtap/meson.build
@@ -0,0 +1,26 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2026 Stephen Hemminger
+
+if not is_linux
+    build = false
+    reason = 'only supported on Linux'
+endif
+
+liburing = dependency('liburing', version: '>= 2.0', required: false)
+if not liburing.found()
+    build = false
+    reason = 'missing dependency, "liburing"'
+endif
+
+if build and not cc.has_header_symbol('linux/io_uring.h', 'IORING_ASYNC_CANCEL_ALL')
+    build = false
+    reason = 'kernel headers missing IORING_ASYNC_CANCEL_ALL (need kernel >= 5.19 headers)'
+endif
+
+sources = files(
+        'rtap_ethdev.c',
+)
+
+ext_deps += liburing
+
+require_iova_in_mbuf = false
diff --git a/drivers/net/rtap/rtap.h b/drivers/net/rtap/rtap.h
new file mode 100644
index 0000000000..507ab000f3
--- /dev/null
+++ b/drivers/net/rtap/rtap.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2026 Stephen Hemminger
+ */
+
+#ifndef _RTAP_H_
+#define _RTAP_H_
+
+#include <assert.h>
+#include <unistd.h>
+#include <net/if.h>
+#include <liburing.h>
+#include <linux/virtio_net.h>
+
+#include <ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_log.h>
+
+
+extern int rtap_logtype;
+#define RTE_LOGTYPE_RTAP rtap_logtype
+#define PMD_LOG(level, ...) \
+       RTE_LOG_LINE_PREFIX(level, RTAP, "%s(): ", __func__, __VA_ARGS__)
+
+#define PMD_LOG_ERRNO(level, fmt, ...) \
+       RTE_LOG_LINE(level, RTAP, "%s(): " fmt ": %s", __func__, ## __VA_ARGS__, strerror(errno))
+
+#ifdef RTE_ETHDEV_DEBUG_RX
+#define PMD_RX_LOG(level, ...) \
+       RTE_LOG_LINE_PREFIX(level, RTAP, "%s() rx: ", __func__, __VA_ARGS__)
+#else
+#define PMD_RX_LOG(...) do { } while (0)
+#endif
+
+#ifdef RTE_ETHDEV_DEBUG_TX
+#define PMD_TX_LOG(level, ...) \
+       RTE_LOG_LINE_PREFIX(level, RTAP, "%s() tx: ", __func__, __VA_ARGS__)
+#else
+#define PMD_TX_LOG(...) do { } while (0)
+#endif
+
+struct __rte_cache_aligned rtap_rx_queue {
+       struct rte_mempool *mb_pool;    /* rx buffer pool */
+       struct io_uring io_ring;        /* queue of posted reads */
+       uint16_t port_id;
+       uint16_t queue_id;
+
+       uint64_t rx_packets;
+       uint64_t rx_bytes;
+       uint64_t rx_errors;
+};
+
+struct __rte_cache_aligned rtap_tx_queue {
+       struct io_uring io_ring;        /* queue of posted writes */
+       uint16_t port_id;
+       uint16_t queue_id;
+       uint16_t free_thresh;
+
+       uint64_t tx_packets;
+       uint64_t tx_bytes;
+       uint64_t tx_errors;
+};
+
+struct rtap_pmd {
+       int keep_fd;                    /* keep alive file descriptor */
+       char ifname[IFNAMSIZ];          /* name assigned by kernel */
+       struct rte_ether_addr eth_addr; /* address assigned by kernel */
+};
+
+#endif /* _RTAP_H_ */
diff --git a/drivers/net/rtap/rtap_ethdev.c b/drivers/net/rtap/rtap_ethdev.c
new file mode 100644
index 0000000000..ee5b5bad1b
--- /dev/null
+++ b/drivers/net/rtap/rtap_ethdev.c
@@ -0,0 +1,172 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2026 Stephen Hemminger
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <net/if.h>
+#include <linux/if.h>
+#include <linux/if_arp.h>
+#include <linux/if_tun.h>
+#include <linux/virtio_net.h>
+
+#include <bus_vdev_driver.h>
+#include <ethdev_driver.h>
+#include <ethdev_vdev.h>
+#include <rte_common.h>
+#include <rte_dev.h>
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_ether.h>
+#include <rte_kvargs.h>
+#include <rte_log.h>
+
+#include "rtap.h"
+
+#define RTAP_DEFAULT_IFNAME    "rtap%d"
+
+#define RTAP_IFACE_ARG         "iface"
+#define RTAP_PERSIST_ARG       "persist"
+
+static const char * const valid_arguments[] = {
+       RTAP_IFACE_ARG,
+       RTAP_PERSIST_ARG,
+       NULL
+};
+
+static int
+rtap_dev_close(struct rte_eth_dev *dev)
+{
+       struct rtap_pmd *pmd = dev->data->dev_private;
+
+       PMD_LOG(INFO, "Closing %s", pmd->ifname);
+
+       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+               /* mac_addrs must not be freed alone because part of dev_private */
+               dev->data->mac_addrs = NULL;
+
+               if (pmd->keep_fd != -1) {
+                       PMD_LOG(DEBUG, "Closing keep_fd %d", pmd->keep_fd);
+                       close(pmd->keep_fd);
+                       pmd->keep_fd = -1;
+               }
+       }
+
+       free(dev->process_private);
+       dev->process_private = NULL;
+
+       return 0;
+}
+
+static const struct eth_dev_ops rtap_ops = {
+       .dev_close              = rtap_dev_close,
+};
+
+static int
+rtap_parse_iface(const char *key __rte_unused, const char *value, void *extra_args)
+{
+       char *name = extra_args;
+
+       /* must not be null string */
+       if (value == NULL || value[0] == '\0' || strnlen(value, IFNAMSIZ) == IFNAMSIZ)
+               return -EINVAL;
+
+       strlcpy(name, value, IFNAMSIZ);
+       return 0;
+}
+
+static int
+rtap_probe(struct rte_vdev_device *vdev)
+{
+       const char *name = rte_vdev_device_name(vdev);
+       const char *params = rte_vdev_device_args(vdev);
+       struct rte_kvargs *kvlist = NULL;
+       struct rte_eth_dev *eth_dev = NULL;
+       int *fds = NULL;
+       char tap_name[IFNAMSIZ] = RTAP_DEFAULT_IFNAME;
+       uint8_t persist = 0;
+       int ret;
+
+       PMD_LOG(INFO, "Initializing %s", name);
+
+       if (params != NULL) {
+               kvlist = rte_kvargs_parse(params, valid_arguments);
+               if (kvlist == NULL)
+                       return -1;
+
+               if (rte_kvargs_count(kvlist, RTAP_IFACE_ARG) == 1) {
+                       ret = rte_kvargs_process_opt(kvlist, RTAP_IFACE_ARG,
+                                                    &rtap_parse_iface, tap_name);
+                       if (ret < 0)
+                               goto error;
+               }
+
+               if (rte_kvargs_count(kvlist, RTAP_PERSIST_ARG) == 1)
+                       persist = 1;
+       }
+
+       /* Per-queue tap fd's (for primary process) */
+       fds = calloc(RTE_MAX_QUEUES_PER_PORT, sizeof(int));
+       if (fds == NULL) {
+               PMD_LOG(ERR, "Unable to allocate fd array");
+               goto error;
+       }
+       for (unsigned int i = 0; i < RTE_MAX_QUEUES_PER_PORT; i++)
+               fds[i] = -1;
+
+       eth_dev = rte_eth_vdev_allocate(vdev, sizeof(struct rtap_pmd));
+       if (eth_dev == NULL) {
+               PMD_LOG(ERR, "%s Unable to allocate device struct", tap_name);
+               goto error;
+       }
+
+       eth_dev->dev_ops = &rtap_ops;
+       eth_dev->process_private = fds;
+       eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
+
+       RTE_SET_USED(persist); /* used in later patches */
+
+       rte_eth_dev_probing_finish(eth_dev);
+       rte_kvargs_free(kvlist);
+       return 0;
+
+error:
+       if (eth_dev != NULL) {
+               eth_dev->process_private = NULL;
+               rte_eth_dev_release_port(eth_dev);
+       }
+       free(fds);
+       rte_kvargs_free(kvlist);
+       return -1;
+}
+
+static int
+rtap_remove(struct rte_vdev_device *dev)
+{
+       struct rte_eth_dev *eth_dev;
+
+       eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+       if (eth_dev == NULL)
+               return 0;
+
+       rtap_dev_close(eth_dev);
+       rte_eth_dev_release_port(eth_dev);
+       return 0;
+}
+
+static struct rte_vdev_driver pmd_rtap_drv = {
+       .probe = rtap_probe,
+       .remove = rtap_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_rtap, pmd_rtap_drv);
+RTE_PMD_REGISTER_ALIAS(net_rtap, eth_rtap);
+RTE_PMD_REGISTER_PARAM_STRING(net_rtap,
+       RTAP_IFACE_ARG "=<string> "
+       RTAP_PERSIST_ARG);
+RTE_LOG_REGISTER_DEFAULT(rtap_logtype, NOTICE);
-- 
2.51.0
