On 6/26/2026 2:47 AM, Raghavendra Ningoji wrote:
> Add the skeleton of a new dmadev poll-mode driver for the AMD AE4DMA
> hardware DMA engine, providing only PCI probe/remove and per-queue
> hardware initialisation. An AE4DMA engine exposes 16 hardware command
> queues, each with a 32-entry descriptor ring; the PMD maps each
> hardware channel to its own dmadev with a single virtual channel,
> so a PCI function appears as 16 dmadevs named "<pci-bdf>-ch0" ..
> "<pci-bdf>-ch15".
> 
> This patch only registers the PCI driver, allocates the dmadev
> objects, reserves the per-queue descriptor rings and programs the
> hardware queue base addresses. Control and data path operations are
> added in subsequent patches.
> 
> Signed-off-by: Raghavendra Ningoji <[email protected]>
> ---
>  .mailmap                               |   1 +
>  MAINTAINERS                            |   5 +
>  doc/guides/dmadevs/ae4dma.rst          |  53 ++++++
>  doc/guides/dmadevs/index.rst           |   1 +
>  doc/guides/rel_notes/release_26_07.rst |   7 +
>  drivers/dma/ae4dma/ae4dma_dmadev.c     | 220 +++++++++++++++++++++++++
>  drivers/dma/ae4dma/ae4dma_hw_defs.h    | 154 +++++++++++++++++
>  drivers/dma/ae4dma/ae4dma_internal.h   |  97 +++++++++++
>  drivers/dma/ae4dma/meson.build         |   7 +
>  drivers/dma/meson.build                |   1 +
>  usertools/dpdk-devbind.py              |   5 +-
>  11 files changed, 550 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/dmadevs/ae4dma.rst
>  create mode 100644 drivers/dma/ae4dma/ae4dma_dmadev.c
>  create mode 100644 drivers/dma/ae4dma/ae4dma_hw_defs.h
>  create mode 100644 drivers/dma/ae4dma/ae4dma_internal.h
>  create mode 100644 drivers/dma/ae4dma/meson.build
> 
> diff --git a/.mailmap b/.mailmap
> index 89ba6ffccc..71a62564fa 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1329,6 +1329,7 @@ Radu Bulie <[email protected]>
>  Radu Nicolau <[email protected]>
>  Rafael Ávila de Espíndola <[email protected]>
>  Rafal Kozik <[email protected]>
> +Raghavendra Ningoji <[email protected]>
>  Ragothaman Jayaraman <[email protected]>
>  Rahul Bhansali <[email protected]>
>  Rahul Gupta <[email protected]>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9143d028bc..2e27af49f4 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1361,6 +1361,11 @@ F: doc/guides/compressdevs/features/zsda.ini
>  DMAdev Drivers
>  --------------
>  
> +AMD AE4DMA
> +M: Bhagyada Modali <[email protected]>
> +F: drivers/dma/ae4dma/
> +F: doc/guides/dmadevs/ae4dma.rst
> +
>  Intel IDXD - EXPERIMENTAL
>  M: Bruce Richardson <[email protected]>
>  M: Kevin Laatz <[email protected]>
> diff --git a/doc/guides/dmadevs/ae4dma.rst b/doc/guides/dmadevs/ae4dma.rst
> new file mode 100644
> index 0000000000..a85c1d92ca
> --- /dev/null
> +++ b/doc/guides/dmadevs/ae4dma.rst
> @@ -0,0 +1,53 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2025 Advanced Micro Devices, Inc.

2025 -> 2026?

> +
> +.. include:: <isonum.txt>
> +
> +AMD AE4DMA DMA Device Driver
> +============================
> +
> +The ``ae4dma`` dmadev driver is a poll-mode driver (PMD) for the
> +AMD AE4DMA hardware DMA engine. The engine exposes 16 independent
> +hardware command queues, each with a ring of 32 descriptors. The PMD
> +maps each hardware command queue to a separate DPDK dmadev with a
> +single virtual channel, so a single PCI function appears as 16 dmadevs
> +named ``<pci-bdf>-ch0`` through ``<pci-bdf>-ch15``.
> +
> +The driver supports memory-to-memory copy operations only.
> +
> +Hardware Requirements
> +---------------------
> +
> +The ``dpdk-devbind.py`` script can be used to list AE4DMA devices on
> +the system::
> +
> +   dpdk-devbind.py --status-dev dma
> +
> +AE4DMA devices appear with vendor ID ``0x1022`` and device ID
> +``0x149b``.
> +
> +Compilation
> +-----------
> +
> +The driver is built as part of the standard DPDK build on x86 platforms
> +using ``meson`` and ``ninja``; no extra configuration is required.
> +
> +Device Setup
> +------------
> +
> +The AE4DMA device must be bound to a DPDK-compatible kernel module such
> +as ``vfio-pci`` before it can be used::
> +
> +   dpdk-devbind.py -b vfio-pci <pci-bdf>
> +
> +Initialization
> +~~~~~~~~~~~~~~
> +
> +On probe the PMD performs the following steps for each PCI function:
> +
> +* Reads BAR0 and programs the common configuration register with the
> +  number of hardware queues to enable (16).
> +* For each hardware queue it allocates a 32-entry descriptor ring in
> +  IOVA-contiguous memory, programs the queue base address and ring
> +  depth into the per-queue registers, and enables the queue.
> +* Interrupts are masked; completion is polled by the application.
> diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst
> index 56beb1733f..97399590f6 100644
> --- a/doc/guides/dmadevs/index.rst
> +++ b/doc/guides/dmadevs/index.rst
> @@ -11,6 +11,7 @@ an application through DMA API.
>     :maxdepth: 1
>     :numbered:
>  
> +   ae4dma
>     cnxk
>     dpaa
>     dpaa2
> diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
> index f012d47a4b..9a78a7ef62 100644
> --- a/doc/guides/rel_notes/release_26_07.rst
> +++ b/doc/guides/rel_notes/release_26_07.rst
> @@ -63,6 +63,13 @@ New Features
>      ``rte_eal_init`` and the application is responsible for probing each 
> device,
>    * ``--auto-probing`` enables the initial bus probing, which is the current 
> default behavior.
>  
> +* **Added AMD AE4DMA DMA PMD.**
> +
> +  Added a new ``dma/ae4dma`` driver for the AMD AE4DMA hardware DMA engine.
> +  Each PCI function exposes 16 hardware command queues; the PMD registers one
> +  dmadev per channel with a single virtual channel and supports
> +  memory-to-memory copy operations.
> +
>  
>  Removed Items
>  -------------
> diff --git a/drivers/dma/ae4dma/ae4dma_dmadev.c b/drivers/dma/ae4dma/ae4dma_dmadev.c
> new file mode 100644
> index 0000000000..3d82f86906
> --- /dev/null
> +++ b/drivers/dma/ae4dma/ae4dma_dmadev.c
> @@ -0,0 +1,220 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Advanced Micro Devices, Inc. All rights reserved.
> + */
> +
> +#include <errno.h>
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <string.h>
> +
> +#include <rte_bus_pci.h>
> +#include <bus_pci_driver.h>
> +#include <rte_dmadev_pmd.h>
> +#include <rte_malloc.h>
> +
> +#include "ae4dma_internal.h"
> +
> +/*
> + * One dmadev per AE4DMA hardware channel; each dmadev has exactly one
> + * virtual channel. The HW's per-queue register block must be densely
> + * packed right after the engine-common config register at BAR0+0; the
> + * build-time check below catches an accidental layout change.
> + */
> +static_assert(sizeof(struct ae4dma_hwq_regs) == 32,
> +             "ae4dma_hwq_regs stride changed; per-queue offset math will break");
> +
> +RTE_LOG_REGISTER_DEFAULT(ae4dma_pmd_logtype, INFO);
> +
> +#define AE4DMA_PMD_NAME dmadev_ae4dma
> +
> +static const struct rte_memzone *
> +ae4dma_queue_dma_zone_reserve(const char *queue_name,
> +             uint32_t queue_size, int socket_id)
> +{
> +     const struct rte_memzone *mz;
> +
> +     mz = rte_memzone_lookup(queue_name);
> +     if (mz != NULL) {
> +             if (((size_t)queue_size <= mz->len) &&
> +                             ((socket_id == SOCKET_ID_ANY) ||
> +                              (socket_id == mz->socket_id))) {
> +                     AE4DMA_PMD_INFO("reuse memzone already "
> +                                     "allocated for %s", queue_name);
> +                     return mz;
> +             }
> +             AE4DMA_PMD_ERR("Incompatible memzone already "
> +                             "allocated %s, size %u, socket %d. "
> +                             "Requested size %u, socket %u",
> +                             queue_name, (uint32_t)mz->len,
> +                             mz->socket_id, queue_size, socket_id);
> +             return NULL;
> +     }
> +     return rte_memzone_reserve_aligned(queue_name, queue_size,
> +                     socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size);

There is no need to look up and reuse an existing memzone here. This resource
could also be set up in the vchan_setup op, but since your dmadev has at most
32 descriptors and only one vchan per dmadev, I think it is OK to set it up in
probe.

> +}
> +
> +static int
> +ae4dma_add_queue(struct ae4dma_dmadev *dev, struct rte_pci_device *pci,
> +             uint8_t qn, const char *pci_name)
> +{
> +     uint32_t dma_addr_lo, dma_addr_hi;
> +     struct ae4dma_cmd_queue *cmd_q;
> +     const struct rte_memzone *q_mz;
> +
> +     dev->io_regs = pci->mem_resource[AE4DMA_PCIE_BAR].addr;
> +
> +     cmd_q = &dev->cmd_q;
> +     cmd_q->id = qn;
> +     cmd_q->qidx = 0;
> +     cmd_q->qsize = AE4DMA_QUEUE_SIZE(AE4DMA_QUEUE_DESC_SIZE);
> +     cmd_q->hwq_regs = (volatile struct ae4dma_hwq_regs *)dev->io_regs + (qn + 1);
> +
> +     /*
> +      * Memzone name must be globally unique. Embed PCI BDF so multiple
> +      * PCI functions probed concurrently don't collide.
> +      */
> +     snprintf(cmd_q->memz_name, sizeof(cmd_q->memz_name),
> +                     "ae4dma_%s_q%u", pci_name, (unsigned int)qn);
> +
> +     q_mz = ae4dma_queue_dma_zone_reserve(cmd_q->memz_name,
> +                     cmd_q->qsize, rte_socket_id());
> +     if (q_mz == NULL) {
> +             AE4DMA_PMD_ERR("memzone reserve failed for %s", cmd_q->memz_name);
> +             return -ENOMEM;
> +     }
> +
> +     cmd_q->mz = q_mz;
> +     cmd_q->qbase_addr = q_mz->addr;
> +     cmd_q->qbase_desc = q_mz->addr;
> +     cmd_q->qbase_phys_addr = q_mz->iova;
> +
> +     AE4DMA_WRITE_REG(&cmd_q->hwq_regs->max_idx, AE4DMA_DESCRIPTORS_PER_CMDQ);
> +     AE4DMA_WRITE_REG(&cmd_q->hwq_regs->control_reg.control_raw,
> +                     AE4DMA_CMD_QUEUE_ENABLE);
> +     AE4DMA_WRITE_REG(&cmd_q->hwq_regs->intr_status_reg.intr_status_raw,
> +                     AE4DMA_DISABLE_INTR);
> +     cmd_q->next_write = AE4DMA_READ_REG(&cmd_q->hwq_regs->write_idx);
> +     cmd_q->next_read = AE4DMA_READ_REG(&cmd_q->hwq_regs->read_idx);
> +     cmd_q->ring_buff_count = 0;
> +
> +     dma_addr_lo = lower_32_bits(cmd_q->qbase_phys_addr);
> +     AE4DMA_WRITE_REG(&cmd_q->hwq_regs->qbase_lo, dma_addr_lo);
> +     dma_addr_hi = upper_32_bits(cmd_q->qbase_phys_addr);
> +     AE4DMA_WRITE_REG(&cmd_q->hwq_regs->qbase_hi, dma_addr_hi);
> +
> +     return 0;
> +}
> +
> +static void
> +ae4dma_channel_dev_name(char *out, size_t outlen, const char *pci_name,
> +             unsigned int ch)
> +{
> +     snprintf(out, outlen, "%s-ch%u", pci_name, ch);
> +}
> +
> +static int
> +ae4dma_dmadev_create(const char *name, struct rte_pci_device *dev, uint8_t qn)
> +{
> +     struct rte_dma_dev *dmadev;
> +     struct ae4dma_dmadev *ae4dma;
> +     char hwq_dev_name[RTE_DEV_NAME_MAX_LEN];

Please declare local variables in descending order of line length, with the
longest declarations first. It is recommended to apply this across the entire
driver.
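
For illustration only, a minimal sketch of the ordering I mean. The helper
name channel_name_len() and the DEV_NAME_MAX_LEN value are made up for this
example; they are not part of the patch or of DPDK:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for RTE_DEV_NAME_MAX_LEN; the value is for the sketch only. */
#define DEV_NAME_MAX_LEN 64

/*
 * Hypothetical helper: formats "<pci-bdf>-ch<N>" into a local buffer and
 * returns the formatted length, or -1 if the name would be truncated.
 */
static int
channel_name_len(const char *pci_name, unsigned int ch)
{
	/* Longest declaration first, shortest last. */
	char hwq_dev_name[DEV_NAME_MAX_LEN];
	size_t cap = sizeof(hwq_dev_name);
	int len;

	assert(pci_name != NULL);
	len = snprintf(hwq_dev_name, cap, "%s-ch%u", pci_name, ch);
	if (len < 0 || (size_t)len >= cap)
		return -1;
	return len;
}
```

Applied to ae4dma_dmadev_create(), that would mean hwq_dev_name first, then
the ae4dma pointer, then the dmadev pointer.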

> +
> +     memset(hwq_dev_name, 0, sizeof(hwq_dev_name));

Why not initialise it at the declaration instead:
char hwq_dev_name[RTE_DEV_NAME_MAX_LEN] = {0};
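
A small standalone sketch of why the two forms are interchangeable here: in
C, an array with a partial initialiser has its remaining elements
zero-initialised. NAME_LEN and the function name are placeholders for this
example, not DPDK definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for RTE_DEV_NAME_MAX_LEN. */
#define NAME_LEN 64

/* Returns 1 if a buffer cleared by "= {0}" matches one cleared by memset(). */
static int
zero_init_matches_memset(void)
{
	char by_initializer[NAME_LEN] = {0};
	char by_memset[NAME_LEN];

	memset(by_memset, 0, sizeof(by_memset));
	assert(sizeof(by_initializer) == sizeof(by_memset));
	return memcmp(by_initializer, by_memset, NAME_LEN) == 0;
}
```

The initialiser form also keeps the declaration and the clearing in one
place, so the separate memset() line can be dropped.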

> +     ae4dma_channel_dev_name(hwq_dev_name, sizeof(hwq_dev_name), name, qn);
> +
> +     dmadev = rte_dma_pmd_allocate(hwq_dev_name, dev->device.numa_node,
> +                     sizeof(struct ae4dma_dmadev));
> +     if (dmadev == NULL) {
> +             AE4DMA_PMD_ERR("Unable to allocate dma device");
> +             return -ENOMEM;
> +     }
> +     dmadev->device = &dev->device;
> +     dmadev->fp_obj->dev_private = dmadev->data->dev_private;
> +
> +     ae4dma = dmadev->data->dev_private;
> +
> +     if (ae4dma_add_queue(ae4dma, dev, qn, name) != 0)
> +             goto init_error;
> +     return 0;
> +
> +init_error:
> +     AE4DMA_PMD_ERR("failed");

Why not add more information to the message, e.g. "Probe failed!"?
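
As a rough sketch of the kind of message I have in mind, with the device
name and an error code included. format_create_error() and the message
wording are only an illustration, not an existing helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: builds an error message that names the failing
 * device and carries the error code. Returns the snprintf() result.
 */
static int
format_create_error(char *out, size_t outlen, const char *dev_name, int err)
{
	assert(out != NULL && dev_name != NULL);
	return snprintf(out, outlen, "%s: queue setup failed (err %d)",
			dev_name, err);
}
```

In the driver the same format string and arguments could presumably be
passed straight to AE4DMA_PMD_ERR() rather than to a separate helper.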

> +     rte_dma_pmd_release(hwq_dev_name);
> +     return -ENOMEM;
> +}
> +
> +static int
> +ae4dma_dmadev_probe(struct rte_pci_driver *drv __rte_unused,
> +             struct rte_pci_device *dev)
> +{
> +     char name[32];
> +     char chname[RTE_DEV_NAME_MAX_LEN];
> +     void *mmio_base;
> +     uint32_t q_per_eng;
> +     int ret = 0;
> +     uint8_t i;
> +
> +     rte_pci_device_name(&dev->addr, name, sizeof(name));
> +     AE4DMA_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
> +
> +     mmio_base = dev->mem_resource[AE4DMA_PCIE_BAR].addr;
> +     if (mmio_base == NULL) {
> +             AE4DMA_PMD_ERR("%s: BAR%d not mapped", name, AE4DMA_PCIE_BAR);
> +             return -ENODEV;
> +     }
> +
> +     /* Program the per-engine HW queue count once. */
> +     AE4DMA_WRITE_REG_OFFSET(mmio_base, AE4DMA_COMMON_CONFIG_OFFSET,
> +                     AE4DMA_MAX_HW_QUEUES);
> +     q_per_eng = AE4DMA_READ_REG_OFFSET(mmio_base, AE4DMA_COMMON_CONFIG_OFFSET);
> +     AE4DMA_PMD_INFO("%s: AE4DMA queues per engine = %u", name, q_per_eng);
> +
> +     for (i = 0; i < AE4DMA_MAX_HW_QUEUES; i++) {
> +             ret = ae4dma_dmadev_create(name, dev, i);
> +             if (ret != 0) {
> +                     AE4DMA_PMD_ERR("%s create dmadev %u failed!", name, i);
> +                     while (i > 0) {
> +                             i--;
> +                             ae4dma_channel_dev_name(chname, sizeof(chname), name, i);
> +                             rte_dma_pmd_release(chname);
> +                     }
> +                     break;
> +             }
> +     }
> +     return ret;
> +}
> +
> +static int
> +ae4dma_dmadev_remove(struct rte_pci_device *dev)
> +{
> +     char name[32];
> +     char chname[RTE_DEV_NAME_MAX_LEN];
> +     unsigned int i;
> +
> +     rte_pci_device_name(&dev->addr, name, sizeof(name));
> +
> +     AE4DMA_PMD_INFO("Closing %s on NUMA node %d",
> +                     name, dev->device.numa_node);
> +
> +     for (i = 0; i < AE4DMA_MAX_HW_QUEUES; i++) {
> +             ae4dma_channel_dev_name(chname, sizeof(chname), name, i);
> +             rte_dma_pmd_release(chname);
> +     }
> +     return 0;
> +}
> +
> +static const struct rte_pci_id pci_id_ae4dma_map[] = {
> +     { RTE_PCI_DEVICE(AMD_VENDOR_ID, AE4DMA_DEVICE_ID) },
> +     { .vendor_id = 0, /* sentinel */ },
> +};
> +
> +static struct rte_pci_driver ae4dma_pmd_drv = {
> +     .id_table = pci_id_ae4dma_map,
> +     .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
> +     .probe = ae4dma_dmadev_probe,
> +     .remove = ae4dma_dmadev_remove,
> +};
> +
> +RTE_PMD_REGISTER_PCI(AE4DMA_PMD_NAME, ae4dma_pmd_drv);
> +RTE_PMD_REGISTER_PCI_TABLE(AE4DMA_PMD_NAME, pci_id_ae4dma_map);
> +RTE_PMD_REGISTER_KMOD_DEP(AE4DMA_PMD_NAME, "* igb_uio | uio_pci_generic | vfio-pci");
> diff --git a/drivers/dma/ae4dma/ae4dma_hw_defs.h b/drivers/dma/ae4dma/ae4dma_hw_defs.h
> new file mode 100644
> index 0000000000..e7798be09b
> --- /dev/null
> +++ b/drivers/dma/ae4dma/ae4dma_hw_defs.h
> @@ -0,0 +1,154 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Advanced Micro Devices, Inc. All rights reserved.
> + */
> +
> +#ifndef __AE4DMA_HW_DEFS_H__
> +#define __AE4DMA_HW_DEFS_H__
> +
> +#include <stdint.h>
> +
> +#include <rte_bus_pci.h>
> +#include <rte_byteorder.h>
> +#include <rte_io.h>
> +#include <rte_pci.h>
> +#include <rte_memzone.h>

Some of these include files are not needed in this header file.

> +
> +#define AE4DMA_BIT(nr)                       (1UL << (nr))
> +
> +/* ae4dma device details */
> +#define AMD_VENDOR_ID        0x1022
> +#define AE4DMA_DEVICE_ID     0x149b
> +#define AE4DMA_PCIE_BAR 0
> +
> +/*
> + * An AE4DMA engine has 16 DMA queues. Each queue supports 32 descriptors.
> + */
> +#define AE4DMA_MAX_HW_QUEUES        16
> +#define AE4DMA_QUEUE_START_INDEX    0
> +#define AE4DMA_CMD_QUEUE_ENABLE              0x1
> +#define AE4DMA_CMD_QUEUE_DISABLE     0x0
> +
> +/* Common to all queues */
> +#define AE4DMA_COMMON_CONFIG_OFFSET 0x00
> +
> +#define AE4DMA_DISABLE_INTR 0x01
> +
> +/* Descriptor status */
> +enum ae4dma_dma_status {
> +     AE4DMA_DMA_DESC_SUBMITTED = 0,
> +     AE4DMA_DMA_DESC_VALIDATED = 1,
> +     AE4DMA_DMA_DESC_PROCESSED = 2,
> +     AE4DMA_DMA_DESC_COMPLETED = 3,
> +     AE4DMA_DMA_DESC_ERROR = 4,
> +};
> +
> +/* Descriptor error-code */
> +enum ae4dma_dma_err {
> +     AE4DMA_DMA_ERR_NO_ERR = 0,
> +     AE4DMA_DMA_ERR_INV_HEADER = 1,
> +     AE4DMA_DMA_ERR_INV_STATUS = 2,
> +     AE4DMA_DMA_ERR_INV_LEN = 3,
> +     AE4DMA_DMA_ERR_INV_SRC = 4,
> +     AE4DMA_DMA_ERR_INV_DST = 5,
> +     AE4DMA_DMA_ERR_INV_ALIGN = 6,
> +     AE4DMA_DMA_ERR_UNKNOWN = 7,
> +};
> +
> +/* HW Queue status */
> +enum ae4dma_hwqueue_status {
> +     AE4DMA_HWQUEUE_EMPTY = 0,
> +     AE4DMA_HWQUEUE_FULL = 1,
> +     AE4DMA_HWQUEUE_NOT_EMPTY = 4,
> +};
> +/*
> + * descriptor for AE4DMA commands
> + * 8 32-bit words:
> + * word 0: source memory type; destination memory type ; control bits
> + * word 1: desc_id; error code; status
> + * word 2: length
> + * word 3: reserved
> + * word 4: upper 32 bits of source pointer
> + * word 5: low 32 bits of source pointer
> + * word 6: upper 32 bits of destination pointer
> + * word 7: low 32 bits of destination pointer
> + */
> +
> +/* AE4DMA Descriptor - DWORD0 - Controls bits: Reserved for future use */
> +#define AE4DMA_DWORD0_STOP_ON_COMPLETION     AE4DMA_BIT(0)
> +#define AE4DMA_DWORD0_INTERRUPT_ON_COMPLETION        AE4DMA_BIT(1)
> +#define AE4DMA_DWORD0_START_OF_MESSAGE               AE4DMA_BIT(3)
> +#define AE4DMA_DWORD0_END_OF_MESSAGE         AE4DMA_BIT(4)
> +#define AE4DMA_DWORD0_DESTINATION_MEMORY_TYPE        RTE_GENMASK64(5, 4)
> +#define AE4DMA_DWORD0_SOURCE_MEMEORY_TYPE    RTE_GENMASK64(7, 6)
> +
> +#define AE4DMA_DWORD0_DESTINATION_MEMORY_TYPE_MEMORY    (0x0)
> +#define AE4DMA_DWORD0_DESTINATION_MEMORY_TYPE_IOMEMORY  (1<<4)
> +#define AE4DMA_DWORD0_SOURCE_MEMEORY_TYPE_MEMORY    (0x0)
> +#define AE4DMA_DWORD0_SOURCE_MEMEORY_TYPE_IOMEMORY  (1<<6)
> +
> +struct ae4dma_desc_dword0 {
> +     uint8_t byte0;
> +     uint8_t byte1;
> +     uint16_t timestamp;
> +};
> +
> +struct ae4dma_desc_dword1 {
> +     uint8_t status;
> +     uint8_t err_code;
> +     uint16_t desc_id;
> +};
> +
> +struct ae4dma_desc {
> +     struct ae4dma_desc_dword0 dw0;
> +     struct ae4dma_desc_dword1 dw1;
> +     uint32_t length;
> +     uint32_t reserved;
> +     uint32_t src_lo;
> +     uint32_t src_hi;
> +     uint32_t dst_lo;
> +     uint32_t dst_hi;
> +};
> +
> +/*
> + * Registers for each queue :4 bytes length
> + * Effective address : offset + reg
> + */
> +struct ae4dma_hwq_regs {
> +     union {
> +             uint32_t control_raw;
> +             struct {
> +                     uint32_t queue_enable: 1;
> +                     uint32_t reserved_internal: 31;
> +             } control;
> +     } control_reg;
> +
> +     union {
> +             uint32_t status_raw;
> +             struct {
> +                     uint32_t reserved0: 1;
> +                     /* 0–empty, 1–full, 2–stopped, 3–error , 4–Not Empty */
> +                     uint32_t queue_status: 2;
> +                     uint32_t reserved1: 21;
> +                     uint32_t interrupt_type: 4;
> +                     uint32_t reserved2: 4;
> +             } status;
> +     } status_reg;
> +
> +     uint32_t max_idx;
> +     uint32_t read_idx;
> +     uint32_t write_idx;
> +
> +     union {
> +             uint32_t intr_status_raw;
> +             struct {
> +                     uint32_t intr_status: 1;
> +                     uint32_t reserved: 31;
> +             } intr_status;
> +     } intr_status_reg;
> +
> +     uint32_t qbase_lo;
> +     uint32_t qbase_hi;
> +
> +};
> +
> +#endif /* AE4DMA_HW_DEFS_H */
> diff --git a/drivers/dma/ae4dma/ae4dma_internal.h b/drivers/dma/ae4dma/ae4dma_internal.h
> new file mode 100644
> index 0000000000..7f149c97b5
> --- /dev/null
> +++ b/drivers/dma/ae4dma/ae4dma_internal.h
> @@ -0,0 +1,97 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Advanced Micro Devices, Inc. All rights reserved.
> + */
> +
> +#ifndef _AE4DMA_INTERNAL_H_
> +#define _AE4DMA_INTERNAL_H_
> +
> +#include <stdint.h>
> +
> +#include "ae4dma_hw_defs.h"
> +
> +/* Return bits 32-63 of a 64-bit number. */
> +#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))
> +
> +/* Return bits 0-31 of a 64-bit number. */
> +#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffff))
> +
> +/* Hardware ring depth (slots per queue); must be power of two. */
> +#define AE4DMA_DESCRIPTORS_PER_CMDQ  32
> +#define AE4DMA_QUEUE_DESC_SIZE               sizeof(struct ae4dma_desc)
> +#define AE4DMA_QUEUE_SIZE(n)         (AE4DMA_DESCRIPTORS_PER_CMDQ * (n))
> +

two blank lines

> +
> +/* AE4DMA registers Write/Read */
> +static inline void ae4dma_pci_reg_write(void *base, int offset,
> +             uint32_t value)
> +{
> +     volatile void *reg_addr = ((uint8_t *)base + offset);
> +
> +     rte_write32((rte_cpu_to_le_32(value)), reg_addr);
> +}
> +
> +static inline uint32_t ae4dma_pci_reg_read(void *base, int offset)
> +{
> +     volatile void *reg_addr = ((uint8_t *)base + offset);
> +
> +     return rte_le_to_cpu_32(rte_read32(reg_addr));
> +}
> +
> +#define AE4DMA_READ_REG_OFFSET(hw_addr, reg_offset) \
> +     ae4dma_pci_reg_read(hw_addr, reg_offset)
> +
> +#define AE4DMA_WRITE_REG_OFFSET(hw_addr, reg_offset, value) \
> +     ae4dma_pci_reg_write(hw_addr, reg_offset, value)
> +
> +

two blank lines

> +#define AE4DMA_READ_REG(hw_addr) \
> +     ae4dma_pci_reg_read((void *)(uintptr_t)(hw_addr), 0)
> +
> +#define AE4DMA_WRITE_REG(hw_addr, value) \
> +     ae4dma_pci_reg_write((void *)(uintptr_t)(hw_addr), 0, value)
> +
> +/* A structure describing an AE4DMA command queue. */
> +struct __rte_cache_aligned ae4dma_cmd_queue {
> +     char memz_name[RTE_MEMZONE_NAMESIZE];
> +     const struct rte_memzone *mz;
> +     volatile struct ae4dma_hwq_regs *hwq_regs;
> +
> +     struct rte_dma_vchan_conf qcfg;
> +     struct rte_dma_stats stats;
> +     /* Queue address */
> +     struct ae4dma_desc *qbase_desc;
> +     void *qbase_addr;
> +     rte_iova_t qbase_phys_addr;
> +     enum ae4dma_dma_err status[AE4DMA_DESCRIPTORS_PER_CMDQ];
> +     /* Queue identifier */
> +     uint64_t id;    /* queue id */
> +     uint64_t qidx;  /* queue index */
> +     uint64_t qsize; /* queue size */
> +     uint32_t ring_buff_count;
> +     uint16_t next_read;
> +     uint16_t next_write;
> +     uint16_t last_write; /* Used to compute submitted count. */
> +};
> +
> +/*
> + * One dmadev per AE4DMA hardware channel: probe creates AE4DMA_MAX_HW_QUEUES
> + * dmadevs per PCI function, each owning a single HW command queue.
> + */
> +struct ae4dma_dmadev {
> +     void *io_regs;
> +     struct ae4dma_cmd_queue cmd_q; /* single HW queue owned by this dmadev */
> +};
> +
> +

two blank lines

> +extern int ae4dma_pmd_logtype;
> +#define RTE_LOGTYPE_AE4DMA_PMD ae4dma_pmd_logtype
> +
> +#define AE4DMA_PMD_LOG(level, ...) \
> +     RTE_LOG_LINE_PREFIX(level, AE4DMA_PMD, "%s(): ", __func__, __VA_ARGS__)
> +
> +#define AE4DMA_PMD_DEBUG(...)  AE4DMA_PMD_LOG(DEBUG, __VA_ARGS__)
> +#define AE4DMA_PMD_INFO(...)   AE4DMA_PMD_LOG(INFO, __VA_ARGS__)
> +#define AE4DMA_PMD_ERR(...)    AE4DMA_PMD_LOG(ERR, __VA_ARGS__)
> +#define AE4DMA_PMD_WARN(...)   AE4DMA_PMD_LOG(WARNING, __VA_ARGS__)
> +
> +#endif /* _AE4DMA_INTERNAL_H_ */
> diff --git a/drivers/dma/ae4dma/meson.build b/drivers/dma/ae4dma/meson.build
> new file mode 100644
> index 0000000000..e48ab0d561
> --- /dev/null
> +++ b/drivers/dma/ae4dma/meson.build
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2024 Advanced Micro Devices, Inc. All rights reserved.

2024 -> 2026

Does this driver also support BSD or Windows? If not, please add the
following lines:
if not is_linux
    build = false
    reason = 'only supported on Linux'
    subdir_done()
endif

> +
> +build = dpdk_conf.has('RTE_ARCH_X86')
> +reason = 'only supported on x86'
> +sources = files('ae4dma_dmadev.c')
> +deps += ['bus_pci', 'dmadev']
> diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build
> index e0d94db967..c230ac5a06 100644
> --- a/drivers/dma/meson.build
> +++ b/drivers/dma/meson.build
> @@ -2,6 +2,7 @@
>  # Copyright 2021 HiSilicon Limited
>  
>  drivers = [
> +        'ae4dma',
>          'cnxk',
>          'dpaa',
>          'dpaa2',
> diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py
> index 93f2383dff..7d09f155de 100755
> --- a/usertools/dpdk-devbind.py
> +++ b/usertools/dpdk-devbind.py
> @@ -86,6 +86,9 @@
>  cn9k_ree = {'Class': '08', 'Vendor': '177d', 'Device': 'a0f4',
>              'SVendor': None, 'SDevice': None}
>  
> +amd_ae4dma = {'Class': '08', 'Vendor': '1022', 'Device': '149b',
> +              'SVendor': None, 'SDevice': None}
> +
>  virtio_blk = {'Class': '01', 'Vendor': "1af4", 'Device': '1001,1042',
>                'SVendor': None, 'SDevice': None}
>  
> @@ -95,7 +98,7 @@
>  network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class]
>  baseband_devices = [acceleration_class]
>  crypto_devices = [encryption_class, intel_processor_class]
> -dma_devices = [cnxk_dma, hisilicon_dma,
> +dma_devices = [amd_ae4dma, cnxk_dma, hisilicon_dma,
>                 intel_idxd_gnrd, intel_idxd_dmr, intel_idxd_spr,
>                 intel_ioat_bdw, intel_ioat_icx, intel_ioat_skx,
>                 odm_dma]
