On Thu, Jun 04, 2026 at 04:08:06PM +0900, Koichiro Den wrote:
> On Fri, Jan 09, 2026 at 03:13:24PM -0500, Frank Li wrote:
> > Patch depend on
> > https://lore.kernel.org/imx/[email protected]/T/#t
> >
> > Only test eDMA, have not tested HDMA.
>
> Hi Frank,
>
> I expect this series may be revisited in the near future, since the first
> dependency series reached v7 and looks close to landing.
>
> With the latest versions of the two dependencies:
> - [PATCH v7 0/9] dmaengine: Add new API to combine configuration and
>   descriptor preparation
>   https://lore.kernel.org/dmaengine/[email protected]/
> - [PATCH v2 00/11] dmaengine: dw-edma: flatten desc structions and simple
>   code
>   https://lore.kernel.org/dmaengine/[email protected]/
>
> I tested this RFT series with the HDMA engine on a SpacemiT K3.
> The test results are below, using the same format as your results:
>
> Baseline, before applying the three series (v7 + v2 + this RFT)
>
> Rnd read , 4KB, QD=1 , 1 job : IOPS=8567, BW=33.5MiB/s (35.1MB/s)
> Rnd read , 4KB, QD=32, 1 job : IOPS=55.5k, BW=217MiB/s (227MB/s)
> Rnd read , 4KB, QD=32, 4 jobs: IOPS=83.0k, BW=324MiB/s (340MB/s)
> Rnd read , 128KB, QD=1 , 1 job : IOPS=3817, BW=477MiB/s (500MB/s)
> Rnd read , 128KB, QD=32, 1 job : IOPS=10.8k, BW=1346MiB/s (1411MB/s)
> Rnd read , 128KB, QD=32, 4 jobs: IOPS=11.2k, BW=1403MiB/s (1471MB/s)
> Rnd read , 512KB, QD=1 , 1 job : IOPS=1515, BW=758MiB/s (794MB/s)
> Rnd read , 512KB, QD=32, 1 job : IOPS=2795, BW=1399MiB/s (1467MB/s)
> Rnd read , 512KB, QD=32, 4 jobs: IOPS=2795, BW=1404MiB/s (1472MB/s)
> Rnd write, 4KB, QD=1 , 1 job : IOPS=9035, BW=35.3MiB/s (37.0MB/s)
> Rnd write, 4KB, QD=32, 1 job : IOPS=38.3k, BW=149MiB/s (157MB/s)
> Rnd write, 4KB, QD=32, 4 jobs: IOPS=41.8k, BW=163MiB/s (171MB/s)
> Rnd write, 128KB, QD=1 , 1 job : IOPS=3969, BW=496MiB/s (520MB/s)
> Rnd write, 128KB, QD=32, 1 job : IOPS=8260, BW=1033MiB/s (1083MB/s)
> Rnd write, 128KB, QD=32, 4 jobs: IOPS=8295, BW=1038MiB/s (1089MB/s)
> Seq read , 128KB, QD=1 , 1 job : IOPS=4609, BW=576MiB/s (604MB/s)
> Seq read , 128KB, QD=32, 1 job : IOPS=10.8k, BW=1345MiB/s (1410MB/s)
> Seq read , 512KB, QD=1 , 1 job : IOPS=1524, BW=762MiB/s (799MB/s)
> Seq read , 512KB, QD=32, 1 job : IOPS=2799, BW=1401MiB/s (1469MB/s)
> Seq read , 1MB, QD=32, 1 job : IOPS=1401, BW=1404MiB/s (1472MB/s)
> Seq write, 128KB, QD=1 , 1 job : IOPS=3722, BW=465MiB/s (488MB/s)
> Seq write, 128KB, QD=32, 1 job : IOPS=8246, BW=1031MiB/s (1081MB/s)
> Seq write, 512KB, QD=1 , 1 job : IOPS=1283, BW=642MiB/s (673MB/s)
> Seq write, 512KB, QD=32, 1 job : IOPS=2072, BW=1038MiB/s (1088MB/s)
> Seq write, 1MB, QD=32, 1 job : IOPS=1037, BW=1040MiB/s (1091MB/s)
> Rnd rdwr , 4K..1MB, QD=8 , 4 jobs: IOPS=1540, BW=768MiB/s (805MB/s)
>                                    IOPS=1549, BW=768MiB/s (805MB/s)
>
> After your three series (v7 + v2 + this)
>
> Rnd read , 4KB, QD=1 , 1 job : IOPS=7216, BW=28.2MiB/s (29.6MB/s)
> Rnd read , 4KB, QD=32, 1 job : IOPS=61.1k, BW=239MiB/s (250MB/s)
> Rnd read , 4KB, QD=32, 4 jobs: IOPS=75.3k, BW=294MiB/s (309MB/s)
> Rnd read , 128KB, QD=1 , 1 job : IOPS=4711, BW=589MiB/s (618MB/s)
> Rnd read , 128KB, QD=32, 1 job : IOPS=10.8k, BW=1354MiB/s (1420MB/s)
> Rnd read , 128KB, QD=32, 4 jobs: IOPS=11.2k, BW=1403MiB/s (1471MB/s)
> Rnd read , 512KB, QD=1 , 1 job : IOPS=1497, BW=749MiB/s (785MB/s)
> Rnd read , 512KB, QD=32, 1 job : IOPS=2802, BW=1403MiB/s (1471MB/s)
> Rnd read , 512KB, QD=32, 4 jobs: IOPS=2798, BW=1405MiB/s (1474MB/s)
> Rnd write, 4KB, QD=1 , 1 job : IOPS=7411, BW=29.0MiB/s (30.4MB/s)
> Rnd write, 4KB, QD=32, 1 job : IOPS=39.3k, BW=153MiB/s (161MB/s)
> Rnd write, 4KB, QD=32, 4 jobs: IOPS=42.9k, BW=167MiB/s (176MB/s)
> Rnd write, 128KB, QD=1 , 1 job : IOPS=3736, BW=467MiB/s (490MB/s)
> Rnd write, 128KB, QD=32, 1 job : IOPS=8302, BW=1038MiB/s (1089MB/s)
> Rnd write, 128KB, QD=32, 4 jobs: IOPS=8314, BW=1041MiB/s (1091MB/s)
> Seq read , 128KB, QD=1 , 1 job : IOPS=4092, BW=512MiB/s (536MB/s)
> Seq read , 128KB, QD=32, 1 job : IOPS=10.8k, BW=1354MiB/s (1420MB/s)
> Seq read , 512KB, QD=1 , 1 job : IOPS=1474, BW=737MiB/s (773MB/s)
> Seq read , 512KB, QD=32, 1 job : IOPS=2794, BW=1399MiB/s (1467MB/s)
> Seq read , 1MB, QD=32, 1 job : IOPS=1401, BW=1404MiB/s (1472MB/s)
> Seq write, 128KB, QD=1 , 1 job : IOPS=4135, BW=517MiB/s (542MB/s)
> Seq write, 128KB, QD=32, 1 job : IOPS=8307, BW=1039MiB/s (1089MB/s)
> Seq write, 512KB, QD=1 , 1 job : IOPS=1259, BW=630MiB/s (660MB/s)
> Seq write, 512KB, QD=32, 1 job : IOPS=2073, BW=1038MiB/s (1089MB/s)
> Seq write, 1MB, QD=32, 1 job : IOPS=1034, BW=1038MiB/s (1088MB/s)
> Rnd rdwr , 4K..1MB, QD=8 , 4 jobs: IOPS=1531, BW=763MiB/s (801MB/s)
>                                    IOPS=1540, BW=765MiB/s (802MB/s)
>
> On this HDMA setup, I did not observe a clear performance difference from
> applying the three series alone. Still, I like the overall direction.
>
>
> P.S.
> Separately, as a follow-up experiment, I also prototyped an extra series on
> top of your three series that allows us to make use of HDMA watermark
> interrupts. With that series, in particular for the high queue-depth cases,
> the results improved noticeably on this platform. I haven't posted that
> series yet though.

Thanks for testing it. I am monitoring the dependency patch sets above.

Frank

>
> After your three series (v7 + v2 + this) + use of HDMA watermark interrupts
>
> Rnd read , 4KB, QD=1 , 1 job : IOPS=8016, BW=31.3MiB/s (32.8MB/s)
> Rnd read , 4KB, QD=32, 1 job : IOPS=63.4k, BW=248MiB/s (260MB/s)
> Rnd read , 4KB, QD=32, 4 jobs: IOPS=92.7k, BW=362MiB/s (380MB/s)
> Rnd read , 128KB, QD=1 , 1 job : IOPS=3530, BW=441MiB/s (463MB/s)
> Rnd read , 128KB, QD=32, 1 job : IOPS=12.0k, BW=1500MiB/s (1573MB/s)
> Rnd read , 128KB, QD=32, 4 jobs: IOPS=12.4k, BW=1555MiB/s (1631MB/s)
> Rnd read , 512KB, QD=1 , 1 job : IOPS=1541, BW=771MiB/s (808MB/s)
> Rnd read , 512KB, QD=32, 1 job : IOPS=3116, BW=1560MiB/s (1636MB/s)
> Rnd read , 512KB, QD=32, 4 jobs: IOPS=3099, BW=1556MiB/s (1632MB/s)
> Rnd write, 4KB, QD=1 , 1 job : IOPS=8748, BW=34.2MiB/s (35.8MB/s)
> Rnd write, 4KB, QD=32, 1 job : IOPS=57.6k, BW=225MiB/s (236MB/s)
> Rnd write, 4KB, QD=32, 4 jobs: IOPS=80.3k, BW=314MiB/s (329MB/s)
> Rnd write, 128KB, QD=1 , 1 job : IOPS=3878, BW=485MiB/s (508MB/s)
> Rnd write, 128KB, QD=32, 1 job : IOPS=9798, BW=1225MiB/s (1285MB/s)
> Rnd write, 128KB, QD=32, 4 jobs: IOPS=9970, BW=1248MiB/s (1308MB/s)
> Seq read , 128KB, QD=1 , 1 job : IOPS=4516, BW=565MiB/s (592MB/s)
> Seq read , 128KB, QD=32, 1 job : IOPS=12.0k, BW=1497MiB/s (1570MB/s)
> Seq read , 512KB, QD=1 , 1 job : IOPS=1571, BW=786MiB/s (824MB/s)
> Seq read , 512KB, QD=32, 1 job : IOPS=3073, BW=1538MiB/s (1613MB/s)
> Seq read , 1MB, QD=32, 1 job : IOPS=1573, BW=1576MiB/s (1653MB/s)
> Seq write, 128KB, QD=1 , 1 job : IOPS=3977, BW=497MiB/s (521MB/s)
> Seq write, 128KB, QD=32, 1 job : IOPS=9806, BW=1226MiB/s (1286MB/s)
> Seq write, 512KB, QD=1 , 1 job : IOPS=1404, BW=702MiB/s (736MB/s)
> Seq write, 512KB, QD=32, 1 job : IOPS=2496, BW=1250MiB/s (1310MB/s)
> Seq write, 1MB, QD=32, 1 job : IOPS=1252, BW=1256MiB/s (1317MB/s)
> Rnd rdwr , 4K..1MB, QD=8 , 4 jobs: IOPS=1682, BW=836MiB/s (877MB/s)
>                                    IOPS=1688, BW=838MiB/s (879MB/s)
>
> Best regards,
> Koichiro
>
> > Corn case have not tested, such as pause/resume transfer.
> >
> > Before
> >
> > Rnd read, 4KB, QD=1, 1 job : IOPS=6780, BW=26.5MiB/s (27.8MB/s)
> > Rnd read, 4KB, QD=32, 1 job : IOPS=28.6k, BW=112MiB/s (117MB/s)
> > Rnd read, 4KB, QD=32, 4 jobs: IOPS=33.4k, BW=130MiB/s (137MB/s)
> > Rnd read, 128KB, QD=1, 1 job : IOPS=1188, BW=149MiB/s (156MB/s)
> > Rnd read, 128KB, QD=32, 1 job : IOPS=1440, BW=180MiB/s (189MB/s)
> > Rnd read, 128KB, QD=32, 4 jobs: IOPS=1282, BW=160MiB/s (168MB/s)
> > Rnd read, 512KB, QD=1, 1 job : IOPS=254, BW=127MiB/s (134MB/s)
> > Rnd read, 512KB, QD=32, 1 job : IOPS=354, BW=177MiB/s (186MB/s)
> > Rnd read, 512KB, QD=32, 4 jobs: IOPS=388, BW=194MiB/s (204MB/s)
> > Rnd write, 4KB, QD=1, 1 job : IOPS=6282, BW=24.5MiB/s (25.7MB/s)
> > Rnd write, 4KB, QD=32, 1 job : IOPS=24.9k, BW=97.5MiB/s (102MB/s)
> > Rnd write, 4KB, QD=32, 4 jobs: IOPS=27.4k, BW=107MiB/s (112MB/s)
> > Rnd write, 128KB, QD=1, 1 job : IOPS=1098, BW=137MiB/s (144MB/s)
> > Rnd write, 128KB, QD=32, 1 job : IOPS=1195, BW=149MiB/s (157MB/s)
> > Rnd write, 128KB, QD=32, 4 jobs: IOPS=1120, BW=140MiB/s (147MB/s)
> > Seq read, 128KB, QD=1, 1 job : IOPS=936, BW=117MiB/s (123MB/s)
> > Seq read, 128KB, QD=32, 1 job : IOPS=1218, BW=152MiB/s (160MB/s)
> > Seq read, 512KB, QD=1, 1 job : IOPS=301, BW=151MiB/s (158MB/s)
> > Seq read, 512KB, QD=32, 1 job : IOPS=360, BW=180MiB/s (189MB/s)
> > Seq read, 1MB, QD=32, 1 job : IOPS=193, BW=194MiB/s (203MB/s)
> > Seq write, 128KB, QD=1, 1 job : IOPS=796, BW=99.5MiB/s (104MB/s)
> > Seq write, 128KB, QD=32, 1 job : IOPS=1019, BW=127MiB/s (134MB/s)
> > Seq write, 512KB, QD=1, 1 job : IOPS=213, BW=107MiB/s (112MB/s)
> > Seq write, 512KB, QD=32, 1 job : IOPS=273, BW=137MiB/s (143MB/s)
> > Seq write, 1MB, QD=32, 1 job : IOPS=168, BW=168MiB/s (177MB/s)
> > Rnd rdwr, 4K..1MB, QD=8, 4 jobs: IOPS=255, BW=128MiB/s (134MB/s)
> >                                  IOPS=266, BW=135MiB/s (141MB/s)
> >
> > After
> >
> > Rnd read, 4KB, QD=1, 1 job : IOPS=6148, BW=24.0MiB/s (25.2MB/s)
> > Rnd read, 4KB, QD=32, 1 job : IOPS=29.4k, BW=115MiB/s (121MB/s)
> > Rnd read, 4KB, QD=32, 4 jobs: IOPS=38.8k, BW=151MiB/s (159MB/s)
> > Rnd read, 128KB, QD=1, 1 job : IOPS=859, BW=107MiB/s (113MB/s)
> > Rnd read, 128KB, QD=32, 1 job : IOPS=1504, BW=188MiB/s (197MB/s)
> > Rnd read, 128KB, QD=32, 4 jobs: IOPS=1531, BW=191MiB/s (201MB/s)
> > Rnd read, 512KB, QD=1, 1 job : IOPS=238, BW=119MiB/s (125MB/s)
> > Rnd read, 512KB, QD=32, 1 job : IOPS=390, BW=195MiB/s (205MB/s)
> > Rnd read, 512KB, QD=32, 4 jobs: IOPS=404, BW=202MiB/s (212MB/s)
> > Rnd write, 4KB, QD=1, 1 job : IOPS=5801, BW=22.7MiB/s (23.8MB/s)
> > Rnd write, 4KB, QD=32, 1 job : IOPS=24.7k, BW=96.6MiB/s (101MB/s)
> > Rnd write, 4KB, QD=32, 4 jobs: IOPS=32.7k, BW=128MiB/s (134MB/s)
> > Rnd write, 128KB, QD=1, 1 job : IOPS=744, BW=93.1MiB/s (97.6MB/s)
> > Rnd write, 128KB, QD=32, 1 job : IOPS=1278, BW=160MiB/s (168MB/s)
> > Rnd write, 128KB, QD=32, 4 jobs: IOPS=1278, BW=160MiB/s (168MB/s)
> > Seq read, 128KB, QD=1, 1 job : IOPS=853, BW=107MiB/s (112MB/s)
> > Seq read, 128KB, QD=32, 1 job : IOPS=1511, BW=189MiB/s (198MB/s)
> > Seq read, 512KB, QD=1, 1 job : IOPS=240, BW=120MiB/s (126MB/s)
> > Seq read, 512KB, QD=32, 1 job : IOPS=386, BW=193MiB/s (203MB/s)
> > Seq read, 1MB, QD=32, 1 job : IOPS=200, BW=201MiB/s (211MB/s)
> > Seq write, 128KB, QD=1, 1 job : IOPS=749, BW=93.7MiB/s (98.3MB/s)
> > Seq write, 128KB, QD=32, 1 job : IOPS=1266, BW=158MiB/s (166MB/s)
> > Seq write, 512KB, QD=1, 1 job : IOPS=198, BW=99.0MiB/s (104MB/s)
> > Seq write, 512KB, QD=32, 1 job : IOPS=352, BW=176MiB/s (185MB/s)
> > Seq write, 1MB, QD=32, 1 job : IOPS=184, BW=184MiB/s (193MB/s)
> > Rnd rdwr, 4K..1MB, QD=8, 4 jobs: IOPS=287, BW=145MiB/s (152MB/s)
> >                                  IOPS=299, BW=149MiB/s (156MB/s)
> >
> > Signed-off-by: Frank Li <[email protected]>
> > ---
> > Frank Li (5):
> >       dmaengine: dw-edma: Add dw_edma_core_ll_cur_idx() to get completed
> >         link entry pos
> >       dmaengine: dw-edma: Move dw_hdma_set_callback_result() up
> >       dmaengine: dw-edma: Make DMA link list work as a circular buffer
> >       dmaengine: dw-edma: Dynamitc append new request during dmaengine
> >         running
> >       dmaengine: dw-edma: Add trace support
> >
> >  drivers/dma/dw-edma/Makefile          |   3 +
> >  drivers/dma/dw-edma/dw-edma-core.c    | 215 ++++++++++++++++++++++++----------
> >  drivers/dma/dw-edma/dw-edma-core.h    |  42 ++++++-
> >  drivers/dma/dw-edma/dw-edma-trace.c   |   4 +
> >  drivers/dma/dw-edma/dw-edma-trace.h   | 150 ++++++++++++++++++++++++
> >  drivers/dma/dw-edma/dw-edma-v0-core.c |  39 +++++-
> >  drivers/dma/dw-edma/dw-hdma-v0-core.c |  17 +++
> >  7 files changed, 409 insertions(+), 61 deletions(-)
> > ---
> > base-commit: 020f6d8442f35105660a29d0d236d3f8650c8142
> > change-id: 20251212-edma_dymatic-a57843ff0dfe
> >
> > Best regards,
> > --
> > Frank Li <[email protected]>
> >

