DMA isn't always the best answer. It's sometimes best to just leave the data in the PL and have the processor access it directly.
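For illustration only, and not taken from any existing demo: a minimal Python sketch of the processor reading words straight out of a memory-mapped PL region. It assumes the region is reachable through a device file such as /dev/mem. The helper name `read_words` and any address passed to it are hypothetical, since the real base address depends on the design's address map.

```python
import mmap
import os
import struct

PAGE = mmap.PAGESIZE


def read_words(path, phys_addr, n_words):
    """Read n_words little-endian 32-bit words starting at phys_addr.

    `path` would typically be /dev/mem on the target (assumption);
    any file large enough to cover the range works for testing.
    """
    # mmap offsets must be page-aligned, so map from the enclosing page
    # and skip forward to the requested address inside the mapping.
    base = phys_addr & ~(PAGE - 1)
    skew = phys_addr - base
    length = skew + 4 * n_words

    fd = os.open(path, os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, length, mmap.MAP_SHARED, mmap.PROT_READ,
                       offset=base) as region:
            # Only the words asked for are touched; nothing is copied
            # into a separate buffer in PS memory first.
            return list(struct.unpack_from(f"<{n_words}I", region, skew))
    finally:
        os.close(fd)
```

On a target this might be called as `read_words("/dev/mem", 0xA0000000, 4)`, where `0xA0000000` is a placeholder address, not a real one from any design, and root privileges would normally be needed.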
If the processor reads the data directly, the data is accessed just once, and only the data you need is accessed. If you transfer via DMA, the data is read once from the PL by the DMA, written once to PS memory by the DMA, and then read again by the processor to do its processing. Also, the DMA may have to transfer more data than the current processing actually needs, since it may need to account for contingencies. So although DMA access *might* be faster, it definitely accesses the data more times. It won't always be worth it. DMA also adds a layer of software complexity.

As an example of the non-DMA solution, the demo I released for the RFSoC4x2 pulls the data directly from the PL into the ARM without doing any DMA.

Regards,
Ross

On Thu, Oct 5, 2023, 3:06 PM Jack Hickish <jackhick...@gmail.com> wrote:

> This seems like a fun "discuss at the workshop" topic!
> I have a couple of applications where I think this functionality would be
> useful, so I'd definitely be interested in helping out.
>
> From a toolflow side I think getting the automated instantiation of the
> DMA IP should be relatively straightforward. Handling what the CPU does to
> interact with the core, and/or how you might interact with the core
> remotely over a network I'm less sure about.
>
> Cheers
> Jack
>
> On Thu, 5 Oct 2023 at 12:08, Matthew Schiller <mschi...@nrao.edu> wrote:
>
>> The right way to do what you describe is with the AXI DMA block, but as
>> you point out that has a software interface to configure the transfer. The
>> main data would flow over an AXI4 “full” interface that supports burst
>> transactions (but the Xilinx-provided DMA block already does that), and the
>> configuration of the DMA block comes from software over AXI4-Lite. There
>> are two approaches (which should be supported by either using the correct
>> DMA block or the correct settings on the DMA block). A standard DMA block
>> can be used if fixed addresses in memory can be allocated.
>> This would mean that the Linux kernel is told to use only ½ of the PS
>> memory, for example. Software can still access the upper half, for example
>> through /dev/mem reads, but the upper memory disappears from Linux for
>> normal applications. Alternatively, though more complicated, a
>> “scatter-gather” DMA can be implemented. A scatter-gather DMA uses a
>> software driver/server that will “malloc” memory in the normal software way
>> and then provide pointers to that memory to the scatter-gather DMA. Because
>> of the way virtual memory works, this is not as trivial as it sounds and
>> requires several steps, as the FPGA needs the physical, not virtual,
>> address, and must respect the fact that memory is allocated in virtual
>> memory in “pages” and not necessarily contiguously.
>>
>> sgDMA is better in many systems, though, because Linux can still access all
>> the memory, so if you aren’t recording data, for example, more complicated
>> software applications can run.
>>
>> I don’t believe this has been done yet in CASPER, but it is possible,
>> since these are standard Xilinx-provided blocks. We just need to get the
>> block instantiated properly in sysgen to accept an AXI streaming data
>> stream from your DSP algorithm or the ADCs, and then on the ARM processor
>> we need appropriate software/drivers to allocate memory and configure the
>> DMA.
>>
>> I think I heard a rumor that it was planned, but hasn’t been tackled yet.
>>
>> With AXI DMA, you can probably get to around 20 Gbit/sec (in theory
>> probably as high as 40, depending on what speed the DDR4 trains to) or
>> better transfer performance to the PL. Not that the little ARM on these
>> FPGAs can do much with that speed of data, but for recording a snippet of
>> data or something like that, it allows some fairly significant sample
>> rates of I/Q data, for example (at 8-bit I/Q that’s >1 GSPS!).
>> If instead you did the register approach you mentioned, I would expect
>> rates around 100 Mbit/sec to be possible, and to achieve that the processor
>> in the ARM will be going nuts, because AXI4-Lite tends to require the
>> processor to spin (DMA frees the processor for other stuff, while polling
>> registers takes time to accomplish).
>>
>> FWIW: ngVLA plans to create functionality like this in “pure” HDL, and
>> given the current effort to use more VHDL/Verilog blocks in CASPER, ngVLA’s
>> work may be useful in the future. I hope to make progress on ngVLA’s
>> approach later this calendar year. But ngVLA is on Intel FPGAs, so a porting
>> process would still be required to get that into CASPER.
>>
>> *From:* casper@lists.berkeley.edu <casper@lists.berkeley.edu> *On Behalf Of* Ken Semanov
>> *Sent:* Thursday, October 5, 2023 4:11 AM
>> *To:* casper@lists.berkeley.edu
>> *Subject:* [casper] PL data to PS DDR4 (AXI) {External}
>>
>> Is there an obvious way to migrate data from the PL into memory that is
>> mapped into the address space of the PS? Ideally I would use
>> axi_interconnect as shown
>> https://casper-toolflow.readthedocs.io/en/latest/axi4lite_documentation.html
>>
>> A possible approach is to instantiate axi_dma within the PL, and the PL
>> acts as the master during transfers. But the axi_dma exposes an AXI4-Lite
>> slave port to the PS so that the PS configures and starts the transfers.
>> The receiving raw device would be the memory controller of the PS DDR4.
>> (Presumably the data is accessed later by software via the DMA engine).
>>
>> Another approach would be to expose a single register, and perform this
>> slowly word-by-word (without streaming or bursting.)
>>
>> Is this plausible in CASPER, or are steep changes required?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "casper@lists.berkeley.edu" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to casper+unsubscr...@lists.berkeley.edu.
>> To view this discussion on the web visit
>> https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/0424800a-035f-447f-92ed-07402b9d0239n%40lists.berkeley.edu
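
The sample-rate figures quoted in the thread follow from dividing link throughput by the number of bits per complex sample. Below is a small sketch of that arithmetic, assuming 8-bit I plus 8-bit Q (16 bits per sample) and ignoring any protocol or memory overhead. `max_sample_rate` is a hypothetical helper name, and the throughput numbers are the rough estimates from the thread, not measurements.

```python
def max_sample_rate(link_bits_per_sec, bits_per_component, components=2):
    """Upper bound on complex samples per second for a given link rate.

    components=2 means one I and one Q value per sample (assumption).
    """
    bits_per_sample = bits_per_component * components
    return link_bits_per_sec / bits_per_sample


if __name__ == "__main__":
    # Rough throughput estimates quoted in the thread, in bits per second.
    estimates = {
        "AXI DMA (typical)": 20e9,
        "AXI DMA (theoretical)": 40e9,
        "AXI4-Lite register polling": 100e6,
    }
    for name, rate in estimates.items():
        msps = max_sample_rate(rate, bits_per_component=8) / 1e6
        print(f"{name}: {msps:g} MSPS at 8-bit I/Q")
```

At 20 Gbit/sec this gives 1.25 GSPS, consistent with the ">1 GSPS" remark, while the 100 Mbit/sec register path works out to about 6.25 MSPS.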