cjj66619 opened a new pull request, #18878:
URL: https://github.com/apache/nuttx/pull/18878

   … limit
   
   The STM32 DMA NDTR/CNDTR transfer-count register is 16 bits wide on all 
STM32 series that the in-tree stm32 driver supports (F0/F1/F2/F3/F4/F7/G4/H7/ 
L0/L1/L4, both DMA IPv1 with CNDTR and IPv2 with SxNDTR).  Until now 
stm32_spi_exchange()'s DMA path forwarded the caller's full nwords to 
stm32_dmasetup(), so a single SPI_EXCHANGE() of 65536 or more words silently 
programmed NDTR to (nwords & 0xffff).  When the truncated count happened to be 
zero - the typical case for one-flash-erase-block (64 KiB) or one-FAT-cluster 
reads - the DMA stream completed instantly without raising the 
transfer-complete interrupt the SPI driver waits on, so the caller deadlocked 
forever in spi_dmarxwait().
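   
   The truncation is easy to reproduce outside the driver.  The snippet below 
is only an illustration: ndtr_truncated() is a hypothetical helper, not a 
function in the stm32 driver, and it simply applies the same 16-bit mask the 
register applies in hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the driver): the value a 16-bit
 * NDTR/CNDTR register ends up holding when the full word count is
 * written to it without any range check.
 */

static uint16_t ndtr_truncated(size_t nwords)
{
  return (uint16_t)(nwords & 0xffff);
}
```

   With this helper, 65535 words stay 65535, 65536 words become 0 (the hang 
case), and 70000 words become 4464 (a short transfer rather than a hang).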
   
   Fix it by walking the request in chunks of at most 65535 words inside the 
existing DMA branch of spi_exchange().  Each chunk goes through the exact same 
spi_dmarxsetup()/spi_dmatxsetup()/spi_dma{rx,tx}start()/spi_dma{rx,tx}wait() 
sequence as before, with txp/rxp advanced by (chunk * word_size) between 
iterations.  word_size is derived from priv->nbits so 16-bit-per-word transfers 
are accounted for correctly.
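   
   The shape of the loop can be sketched roughly as follows.  This is a 
standalone approximation, not the patch itself: exchange_chunked(), 
chunk_xfer_t, DMA_MAX_WORDS and count_words() are names invented for the 
example, and the callback stands in for one full setup/start/wait sequence.  
The sketch also assumes a NULL buffer is simply left unadvanced.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest count a 16-bit NDTR/CNDTR register can hold (assumed name) */

#define DMA_MAX_WORDS 65535u

/* Hypothetical callback standing in for one DMA setup/start/wait pass */

typedef void (*chunk_xfer_t)(const void *tx, void *rx, size_t nwords,
                             void *arg);

/* Example callback: adds each chunk's word count to a size_t total */

static void count_words(const void *tx, void *rx, size_t nwords, void *arg)
{
  (void)tx;
  (void)rx;
  *(size_t *)arg += nwords;
}

/* Walk the request in chunks of at most DMA_MAX_WORDS words and return
 * the number of chunks issued.
 */

static size_t exchange_chunked(const void *txbuffer, void *rxbuffer,
                               size_t nwords, int nbits,
                               chunk_xfer_t xfer, void *arg)
{
  const uint8_t *txp = txbuffer;
  uint8_t *rxp = rxbuffer;
  size_t word_size = (nbits > 8) ? 2 : 1;  /* bytes per SPI word */
  size_t nchunks = 0;

  while (nwords > 0)
    {
      size_t chunk = (nwords > DMA_MAX_WORDS) ? DMA_MAX_WORDS : nwords;

      xfer(txp, rxp, chunk, arg);

      /* Advance in bytes, so 16-bit words move the pointers twice as far */

      if (txp != NULL)
        {
          txp += chunk * word_size;
        }

      if (rxp != NULL)
        {
          rxp += chunk * word_size;
        }

      nwords -= chunk;
      nchunks++;
    }

  return nchunks;
}
```

   Called with count_words and 65536 words, the sketch issues two chunks 
(65535 words, then 1 word) and the callback sees all 65536 words; a request 
of 65535 words or fewer takes exactly one pass, as before.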
   
   CONFIG_SPI_TRIGGER semantics are preserved for the first chunk only: a 
deferred trigger is meaningful for "one user-visible exchange = one hardware 
trigger", which is exactly the unit a chunked exchange still represents.  
Subsequent chunks must start unconditionally; otherwise the caller would have 
to re-arm the trigger between chunks and that contract was never part of the 
SPI_TRIGGER API.  The new fall-through keeps every existing CONFIG_SPI_TRIGGER 
user behaving the same way for any single transfer that fits within one 
descriptor (i.e. all of them today).
   
   A few opportunistic cleanups went along with the rewrite:
     * When nwords is clamped to priv->buflen, rescale it so the DMA descriptor 
always matches the byte count that was actually copied (a latent bug masked by 
the fact that buflen-bound transfers never exceeded one descriptor).
     * Promote the spiinfo() format specifier from %d to %zu to match the 
size_t parameter.
     * Fix two adjacent comment typos ("a in driver buffers" -> "an in-driver 
buffer", "indicated" -> "indicate").
   
   Test platform: STM32F407VGT6 (Alientek Explorer M144Z-M4) + Winbond W25Q128 
NOR over SPI1, NuttX nsh + LittleFS, dev-ai-contest-2026 manifest of openvela 
(NuttX 12.x baseline).
   
   Measured 1 MiB sequential read on /dev/w25 (apps/examples/w25bench):
     * baseline (POLL @ SCK 10.5 MHz, no DMA):     0.85 MB/s, 1.21 s
     * + SCK 42 MHz (CONFIG_W25_SPIFREQUENCY):     1.58 MB/s, 0.65 s
     * + DMA, no chunking:                         hangs at 64 KiB
     * + this patch (chunked DMA):                 5.12 MB/s, 0.20 s
   
   Sequential read efficiency improves from 65% of bus capacity in the polled 
baseline (0.85 of 1.31 MB/s at SCK 10.5 MHz) to 97.5% with chunked DMA (5.12 of 
5.25 MB/s at SCK 42 MHz, the /2 prescaler), where BW_busline = SCK / 8.  The 
chunk boundary overhead is approximately one extra spi_dma{rx,tx}setup() pair 
per 65535 words, which costs a few microseconds and is negligible next to the 
DMA transfer itself for any practical buffer size; for 1 MiB / 16 chunks the 
measured overhead is below 
0.05% of total latency.
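   
   The efficiency figures follow directly from BW_busline = SCK / 8.  A small 
check of that arithmetic, using a hypothetical bus_efficiency() helper and the 
throughput numbers measured above:

```c
/* Hypothetical helper: fraction of the raw SPI bus bandwidth achieved.
 * One byte needs 8 SCK cycles, so the bus limit in MB/s is SCK / 8 / 1e6.
 */

static double bus_efficiency(double measured_mbps, double sck_hz)
{
  double bus_mbps = sck_hz / 8.0 / 1e6;

  return measured_mbps / bus_mbps;
}
```

   bus_efficiency(0.85, 10.5e6) is about 0.65 for the polled baseline and 
bus_efficiency(5.12, 42e6) is about 0.975 for chunked DMA.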
   
   ## Summary
   
   Chunk SPI DMA exchange to honor the 16-bit NDTR/CNDTR limit. Fixes silent 
count truncation when SPI_EXCHANGE() is called with nwords ≥ 65536, and the 
resulting deadlock when nwords is a multiple of 65536. Tested on STM32F407 + 
W25Q128 NOR: 1 MiB read 1.21 s → 0.20 s, 0.85 → 5.12 MB/s.
   
   ## Impact
   
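   YES: affects every STM32 SPI DMA caller (SPI flash, SD/MMC SPI mode, I2S 
over SPI). Without this patch, a single SPI_EXCHANGE() of 65536 or more words 
has its count silently truncated, and it deadlocks when the truncated count is 
zero. With this patch, such transfers complete correctly, and behavior is 
identical for transfers below 65536 words.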
   
   ## Testing
   
   Tested on STM32F407VGT6 (custom board) + W25Q128 NOR + LittleFS, plus a 
cross-board sanity build of stm32f4discovery:nsh with CONFIG_STM32_SPI1_DMA=y. 
tools/checkpatch.sh -f and tools/nxstyle are clean. Detailed numbers and 
reproduction steps are in the commit body.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

