On 02/08/17 18:05, Christian König wrote:
> Am 02.08.2017 um 18:43 schrieb Jerome Glisse:
>> On Wed, Aug 02, 2017 at 10:26:40AM +0200, Christian König wrote:
>>> [SNIP]
>> So to summarize you are saying you do not trust the value you get from
>> pci_map_page() ?
>
> Well, what we don't trust is that we actually get this value correctly
> into our page tables.
>
>> If not then i stress again that you have all the informations you need
>> inside the amdgpu driver. You can take the same scheme i propose to
>> dump ttm.dma_address[] and compare against content of GPU page table.
>
> Yes, exactly. But then again we have the mapping page to dma-address
> (because that is what drivers usually need), but what we need for
> debugging is a map with the info dma-address to page.
>
> I mean we can obviously build the reverse table in the driver ourself,
> but that is just a waste of memory if you ask me cause the IOMMU driver
> should have that info in it's tables anyway.
>
> Additional to that we have at least some plans to get away from BOs and
> towards HMM. So it would be nice to have a central debugging feature to
> access the iommu map.

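For illustration only, the driver-side reverse table discussed above
(dma-address to page, built from a forward table such as
ttm.dma_address[]) could look roughly like the userspace sketch below.
It is not amdgpu or kernel code: the dma_addr typedef, struct rev_entry,
rev_build() and rev_lookup() are hypothetical names, and it assumes one
page-sized mapping per forward-table entry. It also shows the memory
cost being objected to, namely one extra entry per mapped page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr;

/* One reverse-map entry: a DMA address and the index of the page it maps. */
struct rev_entry {
    dma_addr addr;
    size_t page_index;
};

/* qsort comparator: order entries by ascending DMA address. */
static int rev_cmp(const void *a, const void *b)
{
    const struct rev_entry *x = a;
    const struct rev_entry *y = b;

    if (x->addr < y->addr)
        return -1;
    if (x->addr > y->addr)
        return 1;
    return 0;
}

/* Fill out[0..n) from the forward table and sort it by DMA address. */
static void rev_build(const dma_addr *dma_address, size_t n,
                      struct rev_entry *out)
{
    if (n == 0)
        return;
    assert(dma_address != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        out[i].addr = dma_address[i];
        out[i].page_index = i;
    }
    qsort(out, n, sizeof(*out), rev_cmp);
}

/*
 * Return the index of the page whose page_size-byte DMA range contains
 * addr, or -1 if no mapping covers it.
 */
static long rev_lookup(const struct rev_entry *rev, size_t n,
                       dma_addr addr, dma_addr page_size)
{
    size_t lo = 0, hi = n;

    /* Binary search for the first entry with entry.addr > addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (rev[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return -1;                      /* addr is below every mapping */

    /* The candidate is the last entry with entry.addr <= addr. */
    const struct rev_entry *e = &rev[lo - 1];

    if (addr - e->addr >= page_size)
        return -1;                      /* addr falls in a gap */
    return (long)e->page_index;
}
```

A debugging pass would then walk the GPU page table, call rev_lookup()
on each address found there, and flag any entry that resolves to -1 or
to an unexpected page index.
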
Note that the underlying issue here doesn't have anything specifically to
do with IOMMUs - yes, the DMA API ops implemented by IOMMU drivers
provide the most obvious breakage of the assumption that DMA address ==
physical address, but there are also various other cases (bus offsets,
SWIOTLB, etc.) for which it doesn't always hold. If the amdgpu-specific
path comes to a dead end, any acceptably general solution would probably
have to be implemented at the DMA API level.

> I'm actually surprised that the IOMMU subsystem doesn't already have
> something like that somewhere. I mean how do you guys validate that
> what's written into the IOMMU tables is actually correct?

In terms of DMA API implementation, experience says that you can usually
rely on everything going to hell pretty much instantly if it isn't ;)

The IOMMU API itself has tracepoints as mentioned at the top of the
thread, but everyone other than arm/arm64 implements their IOMMU-based
DMA ops directly without going via that abstraction. Adding DMA API
tracepoints would be a possibility, but would be awkward to do
generically because it's all static inlines in a header which ends up
getting pulled into the middle of other trace event headers (yes, I
tried...)

> Isn't there some way (debugfs/sysfs/tools?) to dump them?

FWIW I'm not aware of anything that does exactly what you want (other
than some trick patches I keep around that only work for arm64), but it
does come to mind that it ought to be fairly straightforward to give
dma-debug the ability to dump out the information it already captures.
It wouldn't be viable for deployment in non-development kernels, but
it's an idea - for production kernels, the simple answer is probably
"boot with iommu=off|pt when GPU debugging" anyway.

Robin.
