@Fengchengwen, @Bruce, @Kevin: kindly review. Since this is a library change, we have to merge it before rc1.
Thanks,
Pavan.

>>> Hi Bruce,
>>>
>>> >On Sat, May 24, 2025 at 02:43:10PM +0530, <[email protected]> wrote:
>>> >> From: Pavan Nikhilesh <[email protected]>
>>> >>
>>> >> Introduce DMA enqueue/dequeue operations to the DMA device library.
>>> >>
>>> >> Add configuration flags to rte_dma_config instead of boolean for
>>> >> individual features.
>>> >>
>>> >> The enqueue/dequeue operations allow applications to communicate with the
>>> >> DMA device using the rte_dma_op structure, providing a more flexible and
>>> >> efficient way to manage DMA operations.
>>> >>
>>> >
>>> >While I have no really strong objections to this addition to the dmadev
>>> >API, I'd appreciate if you could explain WHY or how this method of working
>>> >is more efficient in your usecase? When designing the dmadev APIs
>>> >originally, we looked at using both an enqueue-type API as well as the
>>> >implemented individual-op-based APIs. IIRC at that time testing showed that
>>> >using the single ops directly was faster than using the enqueue APIs, so
>>> >I'm wondering what exactly has changed, or is different about your usecase?
>>> >
>>>
>>> Here is an example where we see enqueue/dequeue ops to be useful,
>>> especially when integrating with the Graph library.
>>>
>>> We had to write an entire wrapper[1] for tracking sges with the current
>>> implementation, making our nodes[2] very complex.
>>>
>>
>>Can you explain a bit more here. Why do you need the wrapper rather than
>>just tracking in a circular ring all the copies offloaded? How does having
>>an enqueue API make this better?
>
>This is what we already do in our wrapper.
>We found it to be unnecessary overhead, since the driver already does this
>internally and we can leverage the existing functionality.
>This also reduces the memory footprint, as in the case below we use a lot
>of VCHANS.
>
>Instead of checking for completions and maintaining the circular ring, we
>can spend those cycles doing other things in the application.
>
>>Can you perhaps give a trivial example
>>showing the difference it makes here? The examples you give below are
>>rather long to understand quickly.
>>
>
>The example below is a graph-based application that currently uses the
>wrapper implementation, which we want to replace with enq/deq ops to
>reduce overhead.
>
>Also, the ops descriptor already exists for the eventdev subsystem; we are
>just importing it into the DMA device library and reusing it.
>
>>Thanks,
>>/Bruce
>>
>>> [1]<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_MarvellEmbeddedProcessors_dao_blob_dao-2Ddevel_lib_common_dao-5Fdma.h&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=E3SgYMjtKCMVsB-fmvgGV3o-g_fjLhk5Pupi9ijohpc&m=dXtUywAGV8Rir_dtqGP5J-tvRAxN9zQjmM96PeDo6Ke6QybID8eLdPbVwWzlgZFy&s=QryV2vh2_mWEz5yS37615Xb1F6B-gQZHM1uZ3badxoU&e=>
>>> [2]<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_MarvellEmbeddedProcessors_dao_blob_3f364261de91e355699bd9af20d60ea6459f7d67_lib_virtio-5Fnet_virtio-5Fnet-5Fdeq-5Fext.c-23L51&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=E3SgYMjtKCMVsB-fmvgGV3o-g_fjLhk5Pupi9ijohpc&m=dXtUywAGV8Rir_dtqGP5J-tvRAxN9zQjmM96PeDo6Ke6QybID8eLdPbVwWzlgZFy&s=Bl2X7g7xXg_XrWvVIjPhMuIZuy3PG7tOM-Eje9i2ITA&e=>
>>>
>>> >/Bruce
>>>
>>> Thanks,
>>> Pavan.
>>>
>

