On 2020-05-15 22:33, Song Bao Hua wrote:
Subject: Re: Constantly map and unmap of streaming DMA buffers with
IOMMU backend might cause serious performance problem

On Fri, May 15, 2020 at 01:10:21PM +0100, Robin Murphy wrote:
Meanwhile, for the safety of buffers, lower-layer drivers need to make
certain the buffers have already been unmapped in the IOMMU before those
buffers go back to buddy for other users.

That sounds like it would only have benefit in a very small set of specific
circumstances, and would be very difficult to generalise to buffers that
are mapped via dma_map_page() or dma_map_single(). Furthermore, a
high-level API that affects a low-level driver's interpretation of
mid-layer API calls without the mid-layer's knowledge sounds like a hideous
abomination of anti-design. If a mid-layer API lends itself to inefficiency
at the lower level, it would seem a lot cleaner and more robust to extend
*that* API for stateful buffer reuse. Failing that, it might possibly be
appropriate to approach this at the driver level - many of the cleverer
network drivers already implement buffer pools to recycle mapped SKBs
internally, couldn't the "zip driver" simply try doing something like that
for itself?

Exactly.  If your upper consumer of the DMA API keeps reusing the same
pages, just map them once and use dma_sync_* to transfer ownership as
needed.
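
To make the map-once suggestion concrete, here is a small userspace model
(not kernel code, and only a sketch). The mock_* helpers are made-up
stand-ins for dma_map_single(), dma_unmap_single(),
dma_sync_single_for_cpu() and dma_sync_single_for_device(); they only count
calls and track which side owns the buffer, on the assumption that map and
unmap are the expensive steps behind an IOMMU.

```c
/*
 * Userspace model, not kernel code. The mock_* helpers are hypothetical
 * stand-ins for dma_map_single(), dma_unmap_single(),
 * dma_sync_single_for_cpu() and dma_sync_single_for_device(); they only
 * count calls and track who owns the buffer.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum owner { OWNER_CPU, OWNER_DEVICE };

struct mock_mapping {
	void *buf;
	size_t len;
	int mapped;
	enum owner owner;
};

struct dma_stats {
	unsigned long maps, unmaps, syncs;
};

static void mock_map(struct mock_mapping *m, void *buf, size_t len,
		     struct dma_stats *st)
{
	assert(!m->mapped);
	m->buf = buf;
	m->len = len;
	m->mapped = 1;
	m->owner = OWNER_DEVICE;	/* mapping hands the buffer to the device */
	st->maps++;			/* stands for the IOMMU mapping work */
}

static void mock_unmap(struct mock_mapping *m, struct dma_stats *st)
{
	assert(m->mapped);
	m->mapped = 0;
	m->owner = OWNER_CPU;
	st->unmaps++;			/* stands for the IOMMU unmapping work */
}

static void mock_sync_for_cpu(struct mock_mapping *m, struct dma_stats *st)
{
	assert(m->mapped);
	m->owner = OWNER_CPU;
	st->syncs++;
}

static void mock_sync_for_device(struct mock_mapping *m, struct dma_stats *st)
{
	assert(m->mapped);
	m->owner = OWNER_DEVICE;
	st->syncs++;
}

/* Pretend device access; legal only while the device owns the buffer. */
static void mock_device_op(struct mock_mapping *m)
{
	assert(m->mapped && m->owner == OWNER_DEVICE);
}

/* Pattern under discussion: map and unmap around every operation. */
struct dma_stats run_map_per_op(unsigned int nr_ops)
{
	static char page[4096];
	struct mock_mapping m = { 0 };
	struct dma_stats st = { 0 };

	for (unsigned int i = 0; i < nr_ops; i++) {
		memset(page, (int)i, sizeof(page));	/* CPU fills the buffer */
		mock_map(&m, page, sizeof(page), &st);
		mock_device_op(&m);
		mock_unmap(&m, &st);
	}
	return st;
}

/* Suggested pattern: map once, pass ownership back and forth with syncs. */
struct dma_stats run_map_once(unsigned int nr_ops)
{
	static char page[4096];
	struct mock_mapping m = { 0 };
	struct dma_stats st = { 0 };

	mock_map(&m, page, sizeof(page), &st);
	for (unsigned int i = 0; i < nr_ops; i++) {
		mock_sync_for_cpu(&m, &st);
		memset(page, (int)i, sizeof(page));	/* CPU fills the buffer */
		mock_sync_for_device(&m, &st);
		mock_device_op(&m);
	}
	mock_unmap(&m, &st);
	return st;
}
```

In this model, 1000 operations cost 1000 maps and 1000 unmaps in the
per-operation variant, against one map, one unmap and 2000 syncs in the
map-once variant. It says nothing about real timings.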

The problem is that the lower-layer drivers don't know whether the upper
consumer keeps reusing the same pages. They are running in different
software layers.
For example, the consumer is here in mm/zswap.c:
static int zswap_frontswap_store(unsigned type, pgoff_t offset,
                                struct page *page)
{
        ...
        /* compress */
        dst = get_cpu_var(zswap_dstmem);
        ...
        ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
        ...
}

But the lower-layer driver is in drivers/crypto/...

Meanwhile, the lower-layer driver can't cache the buffer addresses
coming from consumers to detect whether the upper layer is reusing the same
page, because the same page might come from different users, or from
different stages of the same user with different permissions.

Indeed the driver can't cache arbitrary pointers, but if typical buffers
are small enough it can copy the data into its own already-mapped page,
dma_sync it, and perform the DMA operation from there. That might even be
more or less what your first suggestion was, but I'm still not quite sure.
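
A rough userspace sketch of that copy-into-a-pre-mapped-page idea follows.
Everything in it (struct bounce_dev, bounce_init(), bounce_process(), the
byte-inverting fake device) is hypothetical and exists only for
illustration; the comments mark where a real driver would presumably call
dma_map_single() and the dma_sync_single_for_*() helpers.

```c
/*
 * Userspace model, not kernel code. struct bounce_dev and the bounce_*()
 * functions are hypothetical; comments mark where a real driver would
 * call dma_map_single() and dma_sync_single_for_*().
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 4096

struct bounce_dev {
	unsigned char bounce[BOUNCE_SIZE];	/* driver-owned, mapped once */
	unsigned long maps;			/* mappings created so far */
	unsigned long copies;			/* CPU copies performed */
};

/* Probe time: real code would dma_map_single() the bounce page here. */
void bounce_init(struct bounce_dev *dev)
{
	memset(dev, 0, sizeof(*dev));
	dev->maps = 1;
}

/* Stand-in for the hardware: inverts every byte in place. */
static void fake_device_transform(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (unsigned char)~buf[i];
}

/*
 * Process src into dst without creating a new mapping.
 * Returns 0 on success, -1 if the request does not fit the bounce page,
 * in which case a real driver would fall back to map/unmap per request.
 */
int bounce_process(struct bounce_dev *dev, const void *src, void *dst,
		   size_t len)
{
	assert(dev && src && dst);

	if (len > BOUNCE_SIZE)
		return -1;

	/* real code: dma_sync_single_for_cpu() before the CPU writes */
	memcpy(dev->bounce, src, len);
	dev->copies++;
	/* real code: dma_sync_single_for_device() before starting DMA */
	fake_device_transform(dev->bounce, len);
	/* real code: dma_sync_single_for_cpu() before the CPU reads back */
	memcpy(dst, dev->bounce, len);
	dev->copies++;
	return 0;
}
```

The trade-off in this sketch is two CPU copies per request in exchange for
never touching the mapping again; whether that wins depends on buffer size
and on how costly map/unmap is on the platform in question.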

For example, consumer A uses the buffer as a destination, then returns it to
buddy, but consumer B gets the same buffer and uses it as a source.

Another possibility is that consumer A uses the buffer and returns it to
buddy; some time later it allocates a buffer again, and gets the same
buffer from buddy as before.

For the safety of the buffer, the lower-layer driver must guarantee the
buffer is unmapped when the buffer returns to buddy.

I think only the upper-layer consumer knows if it is reusing the buffer.

Right, and if reusing buffers is common in crypto callers, then there's an
argument for "set up reusable buffer", "process updated buffer" and "clean
up buffer" operations to be added to the crypto API itself, such that the
underlying drivers can then optimise for DMA usage in a robust and obvious
way if they want to (or just implement the setup and teardown as no-ops and
still do a full map/unmap in each "process" call if they don't).
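
As a sketch of what such a three-operation interface could look like, here
is a userspace model with two toy drivers behind it. struct reuse_ops,
struct reuse_ctx and all function names are invented for illustration and
are not taken from the crypto API; the counters merely stand for map, unmap
and sync calls.

```c
/*
 * Userspace model, not kernel code. struct reuse_ops and everything below
 * are hypothetical names invented for illustration; they are not taken
 * from the crypto API.
 */
#include <assert.h>
#include <stddef.h>

struct reuse_ctx {
	void *buf;
	size_t len;
	int mapped;			/* is a long-lived mapping in place? */
	unsigned long maps, unmaps, syncs;
};

struct reuse_ops {
	int  (*setup)(struct reuse_ctx *ctx, void *buf, size_t len);
	int  (*process)(struct reuse_ctx *ctx);
	void (*cleanup)(struct reuse_ctx *ctx);
};

/* Driver A: optimises for reuse, maps in setup and only syncs in process. */
static int opt_setup(struct reuse_ctx *ctx, void *buf, size_t len)
{
	ctx->buf = buf;
	ctx->len = len;
	ctx->mapped = 1;
	ctx->maps++;
	return 0;
}

static int opt_process(struct reuse_ctx *ctx)
{
	assert(ctx->mapped);
	ctx->syncs += 2;		/* for_device before, for_cpu after */
	return 0;
}

static void opt_cleanup(struct reuse_ctx *ctx)
{
	ctx->mapped = 0;
	ctx->unmaps++;
}

/* Driver B: setup/cleanup do no DMA work, process maps and unmaps. */
static int simple_setup(struct reuse_ctx *ctx, void *buf, size_t len)
{
	ctx->buf = buf;
	ctx->len = len;
	return 0;
}

static int simple_process(struct reuse_ctx *ctx)
{
	ctx->maps++;
	ctx->unmaps++;
	return 0;
}

static void simple_cleanup(struct reuse_ctx *ctx)
{
	(void)ctx;
}

const struct reuse_ops optimised_ops = {
	.setup = opt_setup,
	.process = opt_process,
	.cleanup = opt_cleanup,
};

const struct reuse_ops simple_ops = {
	.setup = simple_setup,
	.process = simple_process,
	.cleanup = simple_cleanup,
};

/* Caller side: the same code works against either driver. */
struct reuse_ctx run_caller(const struct reuse_ops *ops, unsigned int nr_ops)
{
	static char page[4096];
	struct reuse_ctx ctx = { 0 };

	if (ops->setup(&ctx, page, sizeof(page)))
		return ctx;
	for (unsigned int i = 0; i < nr_ops; i++)
		ops->process(&ctx);
	ops->cleanup(&ctx);
	return ctx;
}
```

The point of the model is that the caller states its reuse intent
explicitly, so neither driver has to guess: for 100 operations the first
driver records one map, one unmap and 200 syncs, while the second records
100 maps and 100 unmaps.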

Robin.
