On 02/08/17 18:05, Christian König wrote:
> Am 02.08.2017 um 18:43 schrieb Jerome Glisse:
>> On Wed, Aug 02, 2017 at 10:26:40AM +0200, Christian König wrote:
>>> [SNIP]
>> So to summarize you are saying you do not trust the value you get from
>> pci_map_page() ?
> 
> Well, what we don't trust is that we actually get this value correctly
> into our page tables.
> 
>> If not then I stress again that you have all the information you need
>> inside the amdgpu driver. You can take the same scheme I propose to
>> dump ttm.dma_address[] and compare against content of GPU page table.
> 
> Yes, exactly. But then again we have the mapping page to dma-address
> (because that is what drivers usually need), but what we need for
> debugging is a map with the info dma-address to page.
> 
> I mean we can obviously build the reverse table in the driver ourselves,
> but that is just a waste of memory if you ask me because the IOMMU driver
> should have that info in its tables anyway.
> 
> In addition to that we have at least some plans to get away from BOs and
> towards HMM. So it would be nice to have a central debugging feature to
> access the iommu map.

Note that the underlying issue here doesn't have anything specifically
to do with IOMMUs - yes, the DMA API ops implemented by IOMMU drivers
provide the most obvious breakage of the assumption that DMA address ==
physical address, but there are also various other cases (bus offsets,
SWIOTLB, etc.) for which it doesn't always hold. If the amdgpu-specific
path comes to a dead end, any acceptably general solution would probably
have to be implemented at the DMA API level.
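
To make the point above concrete, here is a deliberately simplified
userspace model (not kernel code) of how a bus offset or a bounce buffer
can make the address the device sees differ from the physical address.
Every name in it (toy_dev, toy_map, the TOY_* kinds) is hypothetical and
made up for illustration; the real DMA API implementations are
considerably more involved.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_phys_t; /* CPU physical address (toy type) */
typedef uint64_t toy_dma_t;  /* address as seen by the device (toy type) */

/* Hypothetical translation kinds, for illustration only. */
enum toy_xlate { TOY_DIRECT, TOY_BUS_OFFSET, TOY_BOUNCE };

struct toy_dev {
	enum toy_xlate kind;
	uint64_t bus_offset;    /* TOY_BUS_OFFSET: phys minus this = bus addr */
	toy_phys_t bounce_base; /* TOY_BOUNCE: where the bounce buffer lives */
	uint64_t dma_mask;      /* highest address the device can reach */
};

/* Return the address the device would be given for 'phys'. */
static toy_dma_t toy_map(const struct toy_dev *dev, toy_phys_t phys)
{
	switch (dev->kind) {
	case TOY_BUS_OFFSET:
		/* Assumption of this model: RAM starts at or above the offset. */
		assert(phys >= dev->bus_offset);
		return phys - dev->bus_offset;
	case TOY_BOUNCE:
		/* Unreachable memory is redirected to a reachable bounce buffer. */
		if (phys > dev->dma_mask)
			return dev->bounce_base;
		return phys;
	case TOY_DIRECT:
	default:
		/* The only case where DMA address == physical address. */
		return phys;
	}
}
```

In this model only TOY_DIRECT preserves the identity; the other two
kinds return a different value without any IOMMU being involved, which
is why a driver-side "DMA address == physical address" check cannot be
relied upon in general.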

> I'm actually surprised that the IOMMU subsystem doesn't already have
> something like that somewhere. I mean how do you guys validate that
> what's written into the IOMMU tables is actually correct?

In terms of DMA API implementation, experience says that you can usually
rely on everything going to hell pretty much instantly if it isn't ;)

The IOMMU API itself has tracepoints as mentioned at the top of the
thread, but everyone other than arm/arm64 implements their IOMMU-based
DMA ops directly without going via that abstraction. Adding DMA API
tracepoints would be a possibility, but would be awkward to do
generically because it's all static inlines in a header which ends up
getting pulled into the middle of other trace event headers (yes, I
tried...)

> Isn't there some way (debugfs/sysfs/tools?) to dump them?

FWIW I'm not aware of anything that does exactly what you want (other
than some trick patches I keep around that only work for arm64), but it
does come to mind that it ought to be fairly straightforward to give
dma-debug the ability to dump out the information it already captures.
It wouldn't be viable for deployment in non-development kernels, but
it's an idea - for production kernels, the simple answer is probably
"boot with iommu=off|pt when GPU debugging" anyway.
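
As a rough sketch of the kind of dump and reverse lookup meant above,
the following is a standalone userspace model, not the actual dma-debug
code. The struct toy_dma_entry layout and the toy_find()/toy_dump()
helpers are hypothetical names invented here; they only assume that each
live mapping is recorded with its device address, size, pfn and offset.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified record of one live DMA mapping. */
struct toy_dma_entry {
	uint64_t dev_addr;  /* address handed to the device */
	uint64_t size;      /* length of the mapping in bytes */
	unsigned long pfn;  /* page frame backing the start of the mapping */
	size_t offset;      /* offset of the mapping within that page */
};

/* Reverse lookup: return the entry covering dev_addr, or NULL if none. */
static const struct toy_dma_entry *
toy_find(const struct toy_dma_entry *tbl, size_t n, uint64_t dev_addr)
{
	for (size_t i = 0; i < n; i++) {
		if (dev_addr >= tbl[i].dev_addr &&
		    dev_addr - tbl[i].dev_addr < tbl[i].size)
			return &tbl[i];
	}
	return NULL;
}

/* Print one line per recorded mapping; returns the number of lines. */
static int toy_dump(FILE *out, const struct toy_dma_entry *tbl, size_t n)
{
	int lines = 0;

	assert(out != NULL);
	for (size_t i = 0; i < n; i++) {
		fprintf(out, "dma 0x%016" PRIx64 " size 0x%" PRIx64
			" -> pfn 0x%lx offset 0x%zx\n",
			tbl[i].dev_addr, tbl[i].size, tbl[i].pfn, tbl[i].offset);
		lines++;
	}
	return lines;
}
```

A driver debugging its page tables could then compare each GPU page
table entry against toy_find() on the same address. The linear scan is
only for brevity; a real implementation would presumably reuse whatever
lookup structure already holds the entries.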

Robin.