Hi Joerg,

On 11/08/15 10:37, Joerg Roedel wrote:
On Fri, Aug 07, 2015 at 02:38:39PM +0100, Robin Murphy wrote:
Indeed, DMA_DEBUG will check that a driver is making DMA API calls
to the arch code in the right way; this is a different check, to
catch things like the arch code passing the wrong domain into this
layer, or someone else having messed directly with the domain via
the IOMMU API. If the iommu_unmap doesn't match the IOVA region we
looked up, that means the IOMMU page tables have somehow become
inconsistent with the IOVA allocator, so we are in an unrecoverable
situation where we can no longer be sure what devices have access
to. That's bad.

Sure, but the BUG_ON would also trigger on things like a double-free,
which is bad to handle as a BUG_ON. A WARN_ON for this is sufficient.

Oh dear, it gets even better than that; in the case of a simple double-unmap where the IOVA is already free, we wouldn't even get as far as that check because we'd die calling iova_size(NULL). How on Earth did I get to v5 without spotting that? :(

Anyway, on reflection I think you're probably right; I've clearly been working on this for long enough to start falling into the "my thing is obviously more important than all the other things" trap.

AFAIK, yes (this is just a slight tidyup of the existing code that
32-bit Exynos/Tegra/Rockchip/etc. devices are already using) - the
display guys want increasingly massive contiguous allocations for
framebuffers, layers, etc., so having IOMMU magic deal with that
saves CMA for non-IOMMU devices that really need it.

Makes sense, I thougt about something similar for x86 too to avoid the
high-order allocations we currently do. I guess the buffer will later be
mapped into the vmalloc space for the CPU?

Indeed - for non-coherent devices we have to remap all allocations (IOMMU or not) anyway in order to get a non-cacheable CPU mapping of the buffer, so having non-contiguous pages is no bother; for coherent devices we can just do the same thing but keep the vmalloc mapping cacheable. There's also the DMA_ATTR_NO_KERNEL_MAPPING case (e.g. GPU just wants a big buffer to render into and read back out again) where we wouldn't need a CPU address at all, although on arm64 vmalloc space is cheap enough that we've no plans to implement that at the moment.

Robin.

_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to