On 2/24/2026 3:33 AM, Bjorn Andersson wrote:
> On Tue, Feb 24, 2026 at 12:38:54AM +0530, Ekansh Gupta wrote:
>> This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
>> a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs.
>> The driver provides a standardized interface for offloading computational
>> tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP,
>> CDSP, SDSP, GDSP).
>>
>> The QDA driver is designed as an alternative for the FastRPC driver
>> in drivers/misc/, offering improved resource management, better integration
>> with standard kernel subsystems, and alignment with the Linux kernel's
>> Compute Accelerators framework.
>>
> If I understand correctly, this is just the same FastRPC protocol but
> in the accel framework, and hence with a new userspace ABI?
>
> I don't fancy the name "QDA" as an acronym for "FastRPC Accel".
>
> I would much prefer to see this living in drivers/accel/fastrpc and be
> named some variation of "fastrpc" (e.g. fastrpc_accel). (Driver name can
> be "fastrpc" as the other one apparently is named "qcom,fastrpc").
I plan to stick with QDA, since future plans may have the driver use some
mechanism other than FastRPC (signalling).
>
>> User-space staging branch
>> ============
>> https://github.com/qualcomm/fastrpc/tree/accel/staging
>>
>> Key Features
>> ============
>>
>> * Standard DRM accelerator interface via /dev/accel/accelN
>> * GEM-based buffer management with DMA-BUF import/export support
>> * IOMMU-based memory isolation using per-process context banks
>> * FastRPC protocol implementation for DSP communication
>> * RPMsg transport layer for reliable message passing
>> * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
>> * Comprehensive IOCTL interface for DSP operations
>>
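For illustration, a minimal userspace sketch of reaching the /dev/accel/accelN
node mentioned in the feature list. The helper names accel_node_path() and
accel_open() are hypothetical and are not part of the series or its userspace
library; only the node path format comes from the cover letter.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "/dev/accel/accel<minor>" into buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes.
 * (Hypothetical helper, for illustration only.)
 */
int accel_node_path(unsigned int minor, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/dev/accel/accel%u", minor);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Open the accel node read/write.
 * Returns a file descriptor, or -1 (errno is set by open(2) when the
 * open itself fails, e.g. when no such device exists).
 */
int accel_open(unsigned int minor)
{
	char path[64];

	if (accel_node_path(minor, path, sizeof(path)) < 0)
		return -1;
	return open(path, O_RDWR);
}
```

A caller would pass the returned descriptor to the driver's IOCTLs; which
minor number belongs to which DSP domain is not stated in the cover letter, so
it is left to the caller here.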
>> High-Level Architecture Differences with Existing FastRPC Driver
>> =================================================================
>>
>> The QDA driver represents a significant architectural departure from the
>> existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
>> limitations while maintaining protocol compatibility:
>>
>> 1. DRM Accelerator Framework Integration
>>   - FastRPC: Custom character device (/dev/fastrpc-*)
>>   - QDA: Standard DRM accel device (/dev/accel/accelN)
>>   - Benefit: Leverages established DRM infrastructure for device
>>     management.
>>
>> 2. Memory Management
>>   - FastRPC: Custom memory allocator with ION/DMA-BUF integration
>>   - QDA: Native GEM objects with full PRIME support
>>   - Benefit: Seamless buffer sharing using standard DRM mechanisms
>>
>> 3. IOMMU Context Bank Management
>>   - FastRPC: Direct IOMMU domain manipulation, limited isolation
>>   - QDA: Custom compute bus (qda_cb_bus_type) with proper device model
>>   - Benefit: Each CB device is a proper struct device with IOMMU group
>>     support, enabling better isolation and resource tracking.
>>   - https://lore.kernel.org/all/[email protected]/
>>
>> 4. Memory Manager Architecture
>>   - FastRPC: Monolithic allocator
>>   - QDA: Pluggable memory manager with backend abstraction
>>   - Benefit: Currently uses DMA-coherent backend, easily extensible for
>>     future memory types (e.g., carveout, CMA)
>>
>> 5. Transport Layer
>>   - FastRPC: Direct RPMsg integration in core driver
>>   - QDA: Abstracted transport layer (qda_rpmsg.c)
>>   - Benefit: Clean separation of concerns, easier to add alternative
>>     transports if needed
>>
>> 8. Code Organization
>>   - FastRPC: ~3000 lines in single file
>>   - QDA: Modular design across multiple files (~4600 lines total)
> "Now 50% more LOC and you need 6 tabs open in your IDE!"
>
> Might be better, but in itself it provides no immediate value.
I added this as a point because I think separating/abstracting sensible parts
into different files might improve readability and maintainability. But if
that doesn't make sense, then I can remove this point.

https://lore.kernel.org/all/[email protected]/
>
>>     * qda_drv.c: Core driver and DRM integration
>>     * qda_gem.c: GEM object management
>>     * qda_memory_manager.c: Memory and IOMMU management
>>     * qda_fastrpc.c: FastRPC protocol implementation
>>     * qda_rpmsg.c: Transport layer
>>     * qda_cb.c: Context bank device management
>>   - Benefit: Better maintainability, clearer separation of concerns
>>
>> 9. UAPI Design
>>   - FastRPC: Custom IOCTL interface
>>   - QDA: DRM-style IOCTLs with proper versioning support
>>   - Benefit: Follows DRM conventions, easier userspace integration
>>
>> 10. Documentation
>>   - FastRPC: Minimal in-tree documentation
>>   - QDA: Comprehensive documentation in Documentation/accel/qda/
>>   - Benefit: Better developer experience, clearer API contracts
>>
>> 11. Buffer Reference Mechanism
>>   - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping
>>     in both kernel and DSP
>>   - QDA: Uses GEM handles for kernel-side management, providing better
>>     integration with DRM subsystem
>>   - Benefit: Leverages DRM GEM infrastructure for reference counting,
>>     lifetime management, and integration with other DRM components
>>
> This is all good, but what is the plan regarding /dev/fastrpc-*?
>
> The idea here clearly is to provide an alternative implementation, and
> they seem to bind to the same toplevel compatible - so you can only
> compile one into your kernel at any point in time.
>
> So if I understand correctly, at some point in time we need to say
> CONFIG_DRM_ACCEL_QDA=m and CONFIG_QCOM_FASTRPC=n, which will break all
> existing user space applications? That's not acceptable.
>
>
> Would it be possible to have a final driver that is implemented as a
> accel, but provides wrappers for the legacy misc and ioctl interface to
> the applications?
As per the discussions on the other thread, I believe a compat driver would be
the way to go for this. When I send the actual driver changes, I can include
the compat driver in the patches as well.

I'm assuming a compat driver will live in the same QDA directory and will
translate misc/fastrpc calls to accel/qda calls if QDA is enabled.
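
As a rough sketch of the translation idea above, a compat layer could be
table-driven: each legacy request is looked up and mapped to its accel
counterpart, and anything unknown is rejected. All command names and values
below are hypothetical placeholders, not the real FastRPC or QDA UAPI numbers,
and a real implementation would also have to convert the argument structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical legacy (misc/fastrpc-style) commands. */
enum legacy_cmd {
	LEGACY_INVOKE = 1,
	LEGACY_ATTACH,
	LEGACY_MMAP,
	LEGACY_MUNMAP,
};

/* Hypothetical accel/qda-style commands. */
enum qda_cmd {
	QDA_CMD_INVOKE = 100,
	QDA_CMD_ATTACH,
	QDA_CMD_MAP,
	QDA_CMD_UNMAP,
};

struct compat_entry {
	int legacy;
	int qda;
};

static const struct compat_entry compat_table[] = {
	{ LEGACY_INVOKE, QDA_CMD_INVOKE },
	{ LEGACY_ATTACH, QDA_CMD_ATTACH },
	{ LEGACY_MMAP,   QDA_CMD_MAP },
	{ LEGACY_MUNMAP, QDA_CMD_UNMAP },
};

/*
 * Translate a legacy command into its QDA equivalent.
 * Returns 0 and stores the result in *qda_out, or -ENOTTY when the
 * legacy command has no mapping.
 */
int compat_translate(int legacy, int *qda_out)
{
	size_t i;

	assert(qda_out != NULL);
	for (i = 0; i < sizeof(compat_table) / sizeof(compat_table[0]); i++) {
		if (compat_table[i].legacy == legacy) {
			*qda_out = compat_table[i].qda;
			return 0;
		}
	}
	return -ENOTTY;
}
```

Returning -ENOTTY for unmapped requests mirrors the usual convention for
unsupported ioctls, so legacy applications would see a familiar error rather
than a silent no-op.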
>
> Regards,
> Bjorn
>
>> Key Technical Improvements
>> ===========================
>>
>> * Proper device model: CB devices are real struct device instances on a
>>   custom bus, enabling proper IOMMU group management and power management
>>   integration
>>
>> * Reference-counted IOMMU devices: Multiple file descriptors from the same
>>   process share a single IOMMU device, reducing overhead
>>
>> * GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference
>>   counting, eliminating many resource leak scenarios
>>
>> * Modular memory backends: The memory manager supports pluggable backends,
>>   currently implementing DMA-coherent allocations with SID-prefixed
>>   addresses for DSP firmware
>>
>> * Context-based invocation tracking: XArray-based context management with
>>   proper synchronization and cleanup
>>
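The "reference-counted IOMMU devices" point above can be illustrated with a
small userspace model: the first open from a process claims a slot, later
opens from the same process only bump a counter, and the slot is released when
the count drops to zero. The struct and function names are hypothetical and do
not come from the patches; in-kernel code would more likely build on struct
kref with proper locking than on a bare counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SHARED_DEVS 8

/* One shared IOMMU device per process; refcount == 0 marks a free slot. */
struct shared_iommu_dev {
	int pid;
	unsigned int refcount;
};

static struct shared_iommu_dev shared_devs[MAX_SHARED_DEVS];

/*
 * Return the device already bound to pid with its refcount raised, or
 * claim a free slot for it. Returns NULL when every slot is in use.
 */
struct shared_iommu_dev *shared_iommu_dev_get(int pid)
{
	struct shared_iommu_dev *free_slot = NULL;
	size_t i;

	for (i = 0; i < MAX_SHARED_DEVS; i++) {
		if (shared_devs[i].refcount && shared_devs[i].pid == pid) {
			shared_devs[i].refcount++;
			return &shared_devs[i];
		}
		if (!shared_devs[i].refcount && !free_slot)
			free_slot = &shared_devs[i];
	}
	if (!free_slot)
		return NULL;
	free_slot->pid = pid;
	free_slot->refcount = 1;
	return free_slot;
}

/* Drop one reference; the slot becomes reusable once the count hits 0. */
void shared_iommu_dev_put(struct shared_iommu_dev *dev)
{
	assert(dev != NULL && dev->refcount > 0);
	dev->refcount--;
}
```

With this model, two file descriptors opened by the same process resolve to
the same entry, which is the overhead reduction the cover letter describes.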
>> Patch Series Organization
>> ==========================
>>
>> Patches 1-2:   Driver skeleton and documentation
>> Patches 3-6:   RPMsg transport and IOMMU/CB infrastructure
>> Patches 7-9:   DRM device registration and basic IOCTL
>> Patches 10-12: GEM buffer management and PRIME support
>> Patches 13-17: FastRPC protocol implementation (attach, invoke, create,
>>                map/unmap)
>> Patch 18:      MAINTAINERS entry
>>
>> Open Items
>> ===========
>>
>> The following items are identified as open items:
>>
>> 1. Privilege Level Management
>>   - Currently, daemon processes and user processes have the same access
>>     level as both use the same accel device node. This needs to be
>>     addressed as daemons attach to privileged DSP PDs and require
>>     higher privilege levels for system-level operations
>>   - Seeking guidance on the best approach: separate device nodes,
>>     capability-based checks, or DRM master/authentication mechanisms
>>
>> 2. UAPI Compatibility Layer
>>   - Add UAPI compat layer to facilitate migration of client applications
>>     from existing FastRPC UAPI to the new QDA accel driver UAPI,
>>     ensuring smooth transition for existing userspace code
>>   - Seeking guidance on implementation approach: in-kernel translation
>>     layer, userspace wrapper library, or hybrid solution
>>
>> 3. Documentation Improvements
>>   - Add detailed IOCTL usage examples
>>   - Document DSP firmware interface requirements
>>   - Create migration guide from existing FastRPC
>>
>> 4. Per-Domain Memory Allocation
>>   - Develop new userspace API to support memory allocation on a per
>>     domain basis, enabling domain-specific memory management and
>>     optimization
>>
>> 5. Audio and Sensors PD Support
>>   - The current patch series does not handle Audio PD and Sensors PD
>>     functionalities. These specialized protection domains require
>>     additional support for real-time constraints and power management
>>
>> Interface Compatibility
>> ========================
>>
>> The QDA driver maintains compatibility with existing FastRPC infrastructure:
>>
>> * Device Tree Bindings: The driver uses the same device tree bindings as
>>   the existing FastRPC driver, ensuring no changes are required to device
>>   tree sources. The "qcom,fastrpc" compatible string and child node
>>   structure remain unchanged.
>>
>> * Userspace Interface: While the driver provides a new DRM-based UAPI,
>>   the underlying FastRPC protocol and DSP firmware interface remain
>>   compatible. This ensures that DSP firmware and libraries continue to
>>   work without modification.
>>
>> * Migration Path: The modular design allows for gradual migration, where
>>   both drivers can coexist during the transition period. Applications can
>>   be migrated incrementally to the new UAPI with the help of the planned
>>   compatibility layer.
>>
>> References
>> ==========
>>
>> Previous discussions on this migration:
>> - https://lkml.org/lkml/2024/6/24/479
>> - https://lkml.org/lkml/2024/6/21/1252
>>
>> Testing
>> =======
>>
>> The driver has been tested on Qualcomm platforms with:
>> - Basic FastRPC attach/release operations
>> - DSP process creation and initialization
>> - Memory mapping/unmapping operations
>> - Dynamic invocation with various buffer types
>> - GEM buffer allocation and mmap
>> - PRIME buffer import from other subsystems
>>
>> Signed-off-by: Ekansh Gupta <[email protected]>
>> ---
>> Ekansh Gupta (18):
>>       accel/qda: Add Qualcomm QDA DSP accelerator driver docs
>>       accel/qda: Add Qualcomm DSP accelerator driver skeleton
>>       accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
>>       accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
>>       accel/qda: Create compute CB devices on QDA compute bus
>>       accel/qda: Add memory manager for CB devices
>>       accel/qda: Add DRM accel device registration for QDA driver
>>       accel/qda: Add per-file DRM context and open/close handling
>>       accel/qda: Add QUERY IOCTL and basic QDA UAPI header
>>       accel/qda: Add DMA-backed GEM objects and memory manager integration
>>       accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
>>       accel/qda: Add PRIME dma-buf import support
>>       accel/qda: Add initial FastRPC attach and release support
>>       accel/qda: Add FastRPC dynamic invocation support
>>       accel/qda: Add FastRPC DSP process creation support
>>       accel/qda: Add FastRPC-based DSP memory mapping support
>>       accel/qda: Add FastRPC-based DSP memory unmapping support
>>       MAINTAINERS: Add MAINTAINERS entry for QDA driver
>>
>>  Documentation/accel/index.rst          |    1 +
>>  Documentation/accel/qda/index.rst      |   14 +
>>  Documentation/accel/qda/qda.rst        |  129 ++++
>>  MAINTAINERS                            |    9 +
>>  arch/arm64/configs/defconfig           |    2 +
>>  drivers/accel/Kconfig                  |    1 +
>>  drivers/accel/Makefile                 |    2 +
>>  drivers/accel/qda/Kconfig              |   35 ++
>>  drivers/accel/qda/Makefile             |   19 +
>>  drivers/accel/qda/qda_cb.c             |  182 ++++++
>>  drivers/accel/qda/qda_cb.h             |   26 +
>>  drivers/accel/qda/qda_compute_bus.c    |   23 +
>>  drivers/accel/qda/qda_drv.c            |  375 ++++++++++++
>>  drivers/accel/qda/qda_drv.h            |  171 ++++++
>>  drivers/accel/qda/qda_fastrpc.c        | 1002 ++++++++++++++++++++++++++++++++
>>  drivers/accel/qda/qda_fastrpc.h        |  433 ++++++++++++++
>>  drivers/accel/qda/qda_gem.c            |  211 +++++++
>>  drivers/accel/qda/qda_gem.h            |  103 ++++
>>  drivers/accel/qda/qda_ioctl.c          |  271 +++++++++
>>  drivers/accel/qda/qda_ioctl.h          |  118 ++++
>>  drivers/accel/qda/qda_memory_dma.c     |   91 +++
>>  drivers/accel/qda/qda_memory_dma.h     |   46 ++
>>  drivers/accel/qda/qda_memory_manager.c |  382 ++++++++++++
>>  drivers/accel/qda/qda_memory_manager.h |  148 +++++
>>  drivers/accel/qda/qda_prime.c          |  194 +++++++
>>  drivers/accel/qda/qda_prime.h          |   43 ++
>>  drivers/accel/qda/qda_rpmsg.c          |  327 +++++++++++
>>  drivers/accel/qda/qda_rpmsg.h          |   57 ++
>>  drivers/iommu/iommu.c                  |    4 +
>>  include/linux/qda_compute_bus.h        |   22 +
>>  include/uapi/drm/qda_accel.h           |  224 +++++++
>>  31 files changed, 4665 insertions(+)
>> ---
>> base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
>> change-id: 20260223-qda-firstpost-4ab05249e2cc
>>
>> Best regards,
>> -- 
>> Ekansh Gupta <[email protected]>
>>
>>
