Re: [PATCH 1/9] mm: Introduce new vm_insert_range and vm_insert_range_buggy API
On Fri, Jan 11, 2019 at 8:33 PM Souptick Joarder wrote:
> Previously drivers had their own way of mapping a range of kernel
> pages/memory into a user vma, done by invoking vm_insert_page()
> within a loop.
>
> As this pattern is common across different drivers, it can be
> generalized by creating new functions and using them across the
> drivers.
>
> vm_insert_range() is the API which could be used to map kernel
> memory/pages in drivers which have considered vm_pgoff.
>
> vm_insert_range_buggy() is the API which could be used to map a
> range of kernel memory/pages in drivers which have not considered
> vm_pgoff. vm_pgoff is passed as 0 by default for those drivers.
>
> We _could_ then later "fix" these drivers which are using
> vm_insert_range_buggy() to behave according to the normal vm_pgoff
> offsetting simply by removing the _buggy suffix on the function
> name, and if that causes regressions, it gives us an easy way to
> revert.
>
> Signed-off-by: Souptick Joarder
> Suggested-by: Russell King
> Suggested-by: Matthew Wilcox

Any comment on these APIs ?
> ---
>  include/linux/mm.h |  4 +++
>  mm/memory.c        | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/nommu.c         | 14 ++++++++++
>  3 files changed, 99 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5411de9..9d1dff6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2514,6 +2514,10 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>  int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
>  			unsigned long pfn, unsigned long size, pgprot_t);
>  int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
> +int vm_insert_range(struct vm_area_struct *vma, struct page **pages,
> +			unsigned long num);
> +int vm_insert_range_buggy(struct vm_area_struct *vma, struct page **pages,
> +			unsigned long num);
>  vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  			unsigned long pfn);
>  vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 4ad2d29..00e66df 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1520,6 +1520,87 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_page);
>
> +/**
> + * __vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @pages: pointer to array of source kernel pages
> + * @num: number of pages in page array
> + * @offset: user's requested vm_pgoff
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma.
> + *
> + * If we fail to insert any page into the vma, the function will return
> + * immediately leaving any previously inserted pages present. Callers
> + * from the mmap handler may immediately return the error as their caller
> + * will destroy the vma, removing any successfully inserted pages. Other
> + * callers should make their own arrangements for calling unmap_region().
> + *
> + * Context: Process context.
> + * Return: 0 on success and error code otherwise.
> + */
> +static int __vm_insert_range(struct vm_area_struct *vma, struct page **pages,
> +			unsigned long num, unsigned long offset)
> +{
> +	unsigned long count = vma_pages(vma);
> +	unsigned long uaddr = vma->vm_start;
> +	int ret, i;
> +
> +	/* Fail if the user requested offset is beyond the end of the object */
> +	if (offset > num)
> +		return -ENXIO;
> +
> +	/* Fail if the user requested size exceeds available object size */
> +	if (count > num - offset)
> +		return -ENXIO;
> +
> +	for (i = 0; i < count; i++) {
> +		ret = vm_insert_page(vma, uaddr, pages[offset + i]);
> +		if (ret < 0)
> +			return ret;
> +		uaddr += PAGE_SIZE;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * vm_insert_range - insert range of kernel pages starts with non zero offset
> + * @vma: user vma to map to
> + * @pages: pointer to array of source kernel pages
> + * @num: number of pages in page array
> + *
> + * Maps an object consisting of `num' `pages', catering for the user's
> + * requested vm_pgoff
> + *
> + * Context: Process context. Called by mmap handlers.
> + * Return: 0 on success and error code otherwise.
> + */
> +int vm_insert_range(struct vm_area_struct *vma, struct page **pages,
> +			unsigned long num)
> +{
> +	return __vm_insert_range(vma, pages, num, vma->vm_pgoff);
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> +/**
> + * vm_insert_range_buggy - insert range of kernel pages starts with zero offset
> + * @vma: user vma to map to
> + * @pages: pointer to array of source kernel pages
>
Re: [PATCH 2/2] iommu/arm-smmu: Add support for non-coherent page table mappings
On Mon, Jan 21, 2019 at 11:35:30AM +0530, Vivek Gautam wrote:
> On Sun, Jan 20, 2019 at 5:31 AM Will Deacon wrote:
> > On Thu, Jan 17, 2019 at 02:57:18PM +0530, Vivek Gautam wrote:
> > > Adding a device tree option for arm smmu to enable non-cacheable
> > > memory for page tables.
> > > We already enable a smmu feature for coherent walk based on
> > > whether the smmu device is dma-coherent or not. Have an option
> > > to enable non-cacheable page table memory to force set it for
> > > particular smmu devices.
> >
> > Hmm, I must be missing something here. What is the difference between this
> > new property, and simply omitting dma-coherent on the SMMU?
>
> So, this is what I understood from the email thread for Last level
> cache support -
> Robin pointed to the fact that we may need to add support for setting
> non-cacheable mappings in the TCR.
> Currently, we don't do that for SMMUs that omit dma-coherent.
> We rely on the interconnect to handle the configuration set in the TCR,
> and let the interconnect ignore the cacheability if it can't support it.

I think that's a bug. With that fixed, can you get what you want by
omitting "dma-coherent"?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] arm64/xen: fix xen-swiotlb cache flushing
On Sat, 19 Jan 2019, Christoph Hellwig wrote:
> [full quote deleted, please take a little more care when quoting]
>
> On Fri, Jan 18, 2019 at 04:44:23PM -0800, Stefano Stabellini wrote:
> > >  #ifdef CONFIG_XEN
> > > -	if (xen_initial_domain()) {
> > > -		dev->archdata.dev_dma_ops = dev->dma_ops;
> > > +	if (xen_initial_domain())
> > >  		dev->dma_ops = xen_dma_ops;
> > > -	}
> > >  #endif
> > >  }
> >
> > This is an optional suggestion, but it would be nice to add a check on
> > dev->dma_ops being unset here, something like:
> >
> > #ifdef CONFIG_XEN
> > 	if (xen_initial_domain()) {
> > 		if (dev->dma_ops != NULL)
> > 			warning/error
> > 		dev->dma_ops = xen_dma_ops;
> > 	}
> >
> > Does it make sense?
>
> Well, no such check existed before, so this probably should be a
> separate patch if we care enough. I have a series for 5.1 pending
> that moves the IOMMU handling to the common code, which will make
> the ops assignment a lot cleaner, and I guess I could fold such
> a check in that. Doing it now will just create churn as it would
> have to get reworked anyway.

Fine by me, thank you!

> > Reviewed-by: Stefano Stabellini
>
> Where should we pick this up? I could pick it up through the dma-mapping
> tree given that is where the problem is introduced, but the Xen or arm64
> trees would also fit.

I am happy for you to carry it in the dma-mapping tree, especially if
you have other fixes to send. Otherwise, let me know.
Re: [git pull] IOMMU Fixes for Linux v5.0-rc1
The pull request you sent on Mon, 21 Jan 2019 10:45:42 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fixes-v5.0-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/52e60b754438f34d23348698534e9ca63cd751d7

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
Re: [PATCH] iommu/vt-d: Implement dma_[un]map_resource()
On Sat, Jan 19, 2019 at 11:46:22AM -0700, Logan Gunthorpe wrote:
> > Instead of having two tiny wrappers I'd just revert
> > 964f2311a6862f1fbcc044d0828ad90030928b7f if we need to pass a real
> > physical address now.
>
> Ok, I can resubmit this with that cleanup. Should I do it in two commits
> (1 revert + 1 new) or is one commit that just restores the original
> helper fine?

I think a single commit is fine.
Re: [PATCH v1] iommu/s390: Declare s390 iommu reserved regions
On 21/01/2019 11:51, Pierre Morel wrote:
> On 18/01/2019 14:51, Jean-Philippe Brucker wrote:
>> Hi Pierre,
>>
>> On 18/01/2019 13:29, Pierre Morel wrote:
>>> On 17/01/2019 14:02, Robin Murphy wrote:
>>>> On 15/01/2019 17:37, Pierre Morel wrote:
>>>>> The s390 iommu can only allow DMA transactions between the zPCI device
>>>>> entries start_dma and end_dma.
>
> ...
>
>>> I already posted a patch retrieving the geometry through
>>> VFIO_IOMMU_GET_INFO using a specific capability for the geometry [1],
>>> and AFAIU, Alex did not agree with this.
>>
>> On arm we also need to report the IOMMU geometry to userspace (max IOVA
>> size in particular). Shameer has been working on a solution [2] that
>> presents a unified view of both geometry and reserved regions into the
>> VFIO_IOMMU_GET_INFO call, and I think we should go with that. If I
>> understand correctly it's currently blocked on the RMRR problem and
>> we're waiting for Jacob or Ashok to take a look at it, as Kevin pinged
>> them on thread [1]?
>>
>> [2] https://lkml.org/lkml/2018/4/18/293
>>
>> Thanks,
>> Jean
>
> Hi Jean,
>
> I hoped that this proposition went in the same direction, based on the
> following assumptions:
>
> - The goal of get_resv_regions is defined in iommu.h as:
>   * @get_resv_regions: Request list of reserved regions for a device
>
> - An iommu reserved region is a region which should not be mapped.
>   Isn't that exactly what happens outside the aperture?
>   Shouldn't it be reported as an iommu reserved region?
>
> - If we use VFIO and want to get all reserved regions, we will have the
>   VFIO_IOMMU_GET_INFO call provided by Shameer, and it can get all the
>   reserved regions that depend on the iommu driver itself at once by
>   calling the get_resv_regions callback, instead of having to merge
>   them with the aperture.
>
> - If there are other reserved regions, depending on the system
>   configuration and not on the IOMMU itself, the VFIO_IOMMU_GET_INFO
>   call will have to merge them with the regions gotten from the iommu
>   driver.

I guess that would only happen if VFIO wanted to reserve IOVA regions
for its own use. But I don't see how that relates to the aperture.

> - If we do not use QEMU nor VFIO at all, AFAIU, the standard way to
>   retrieve the reserved regions associated with a device is to call the
>   get_resv_regions callback from the associated iommu.
>
> Please tell me where I am wrong.

Those are good points in my opinion (assuming the new reserved regions
for the aperture are done in the core API).

To be reliable though, userspace can't get the aperture from the sysfs
reserved_regions file; it will have to get it from VFIO. If a new
application expects the aperture to be described in reserved_regions but
is running under an old kernel, it will just assume a 64-bit aperture by
mistake. Unless we introduce a new IOMMU_RESV_* type to distinguish the
aperture?

Another thing to note is that we're currently adding nested support to
VFIO for the Arm and x86 archs. It requires providing low-level and
sometimes arch-specific information about the IOMMU to userspace,
information that is easier to describe using sysfs (e.g.
/sys/class/iommu//*) than a VFIO ioctl. Among that information is the
input address size of the IOMMU, so we'll end up with redundant
information in sysfs on the x86 and Arm sides, but that's probably
harmless.

Thanks,
Jean
Re: [PATCH] ARM: dma-mapping: Clear DMA ops on teardown
On Mon, Jan 21, 2019 at 02:52:16PM +0000, Robin Murphy wrote:
> Installing the appropriate non-IOMMU DMA ops in arm_iommu_detach_device()
> serves the case where IOMMU-aware drivers choose to control their own
> mapping but still make DMA API calls, however it also affects the case
> when the arch code itself tears down the mapping upon driver unbinding,
> where the ops now get left in place and can inhibit arch_setup_dma_ops()
> on subsequent re-probe attempts.
>
> Fix the latter case by making sure that arch_teardown_dma_ops() cleans
> up whenever the ops were automatically installed by its counterpart.
>
> Reported-by: Tobias Jakobi
> Reported-by: Marek Szyprowski
> Fixes: 1874619a7df4 ("ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()")
> Tested-by: Tobias Jakobi
> Signed-off-by: Robin Murphy
> ---
>
> Sorry for the delay - there was a giant email infrastructure cock-up just
> at the point I wanted to go back through my archive and double-check the
> discussion around the original commit...
>
> Robin.
>
>  arch/arm/mm/dma-mapping.c | 2 ++
>  1 file changed, 2 insertions(+)

I had also tested your draft on Tegra last week and this looks identical,
so:

Tested-by: Thierry Reding
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On 21/01/2019 14:24, Ard Biesheuvel wrote:
> On Mon, 21 Jan 2019 at 14:56, Robin Murphy wrote:
>> On 21/01/2019 13:36, Ard Biesheuvel wrote:
>>> On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
>>>> On 21/01/2019 10:50, Ard Biesheuvel wrote:
>>>>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
>>>>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
>>>>>>>> Qualcomm SoCs have an additional level of cache called as
>>>>>>>> System cache, aka. Last level cache (LLC). This cache sits right
>>>>>>>> before the DDR, and is tightly coupled with the memory controller.
>>>>>>>> The clients using this cache request their slices from this
>>>>>>>> system cache, make it active, and can then start using it.
>>>>>>>> For these clients with smmu, to start using the system cache for
>>>>>>>> buffers and, related page tables [1], memory attributes need to be
>>>>>>>> set accordingly. This series adds the required support.
>>>>>>>
>>>>>>> Does this actually improve performance on reads from a device? The
>>>>>>> non-cache coherent DMA routines perform an unconditional D-cache
>>>>>>> invalidate by VA to the PoC before reading from the buffers filled by
>>>>>>> the device, and I would expect the PoC to be defined as lying beyond
>>>>>>> the LLC to still guarantee the architected behavior.
>>>>>>
>>>>>> We have seen performance improvements when running Manhattan
>>>>>> GFXBench benchmarks.
>>>>>
>>>>> Ah ok, that makes sense, since in that case, the data flow is mostly
>>>>> to the device, not from the device.
>>>>>
>>>>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
>>>>>> Last level cache (LLC) lies beyond the point of coherency.
>>>>>> Non-cache coherent buffers will not be cached to system cache also, and
>>>>>> no additional software cache maintenance ops are required for system
>>>>>> cache. Pratik can add more if I am missing something.
>>>>>>
>>>>>> To take care of the memory attributes from DMA APIs side, we can add a
>>>>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>>>>>
>>>>> So does the device use the correct inner non-cacheable, outer
>>>>> writeback cacheable attributes if the SMMU is in pass-through?
>>>>>
>>>>> We have been looking into another use case where the fact that the
>>>>> SMMU overrides memory attributes is causing issues (WC mappings used
>>>>> by the radeon and amdgpu driver). So if the SMMU would honour the
>>>>> existing attributes, would you still need the SMMU changes?
>>>>
>>>> Even if we could force a stage 2 mapping with the weakest pagetable
>>>> attributes (such that combining would work), there would still need to
>>>> be a way to set the TCR attributes appropriately if this behaviour is
>>>> wanted for the SMMU's own table walks as well.
>>>
>>> Isn't that just a matter of implementing support for SMMUs that lack
>>> the 'dma-coherent' attribute?
>>
>> Not quite - in general they need INC-ONC attributes in case there
>> actually is something in the architectural outer-cacheable domain.
>
> But is it a problem to use INC-ONC attributes for the SMMU PTW on this
> chip? AIUI, the reason for the SMMU changes is to avoid the performance
> hit of snooping, which is more expensive than cache maintenance of SMMU
> page tables. So are you saying the by-VA cache maintenance is not
> relayed to this system cache, resulting in page table updates to be
> invisible to masters using INC-ONC attributes?

I only have a relatively vague impression of how this Qcom interconnect
actually behaves, but AIUI the outer attribute has no correctness impact
(it's effectively mismatched between CPU and devices already), only some
degree of latency improvement which is effectively the opposite of
no-snoop, in allowing certain non-coherent device traffic to still
allocate in the LLC. I'm assuming that if that latency matters for the
device accesses themselves, it might also matter for the associated
table walks depending on the TLB miss rate.

Robin.
[PATCH] ARM: dma-mapping: Clear DMA ops on teardown
Installing the appropriate non-IOMMU DMA ops in arm_iommu_detach_device()
serves the case where IOMMU-aware drivers choose to control their own
mapping but still make DMA API calls, however it also affects the case
when the arch code itself tears down the mapping upon driver unbinding,
where the ops now get left in place and can inhibit arch_setup_dma_ops()
on subsequent re-probe attempts.

Fix the latter case by making sure that arch_teardown_dma_ops() cleans
up whenever the ops were automatically installed by its counterpart.

Reported-by: Tobias Jakobi
Reported-by: Marek Szyprowski
Fixes: 1874619a7df4 ("ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()")
Tested-by: Tobias Jakobi
Signed-off-by: Robin Murphy
---

Sorry for the delay - there was a giant email infrastructure cock-up just
at the point I wanted to go back through my archive and double-check the
discussion around the original commit...

Robin.

 arch/arm/mm/dma-mapping.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index f1e2922e447c..1e3e08a1c456 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -2390,4 +2390,6 @@ void arch_teardown_dma_ops(struct device *dev)
 		return;
 
 	arm_teardown_iommu_dma_ops(dev);
+	/* Let arch_setup_dma_ops() start again from scratch upon re-probe */
+	set_dma_ops(dev, NULL);
 }
-- 
2.20.1.dirty
Re: use generic DMA mapping code in powerpc V4
Hello Christoph,

Thanks for your reply. I successfully compiled a kernel (uImage) for the
X5000 from your Git branch 'powerpc-dma.6-debug' (both patches) today.
It detects the SATA hard disk drive and boots without any problems. I
will test the first patch on its own in the next days.

Thanks for your help,
Christian

On 19 January 2019 at 3:04PM, Christoph Hellwig wrote:
> On Sat, Jan 19, 2019 at 02:02:22PM +0100, Christoph Hellwig wrote:
>> Interesting. This suggests it is related to the use of ZONE_DMA by the
>> FSL SoCs that your board uses. Let me investigate this a bit more.
>
> As a hack to check that theory I've pushed a new commit to the
> powerpc-dma.6-debug branch to use the old powerpc GFP_DMA selection
> with the new dma direct code:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/5c532d07c2f3c3972104de505d06b8d85f403f06
>
> And another one that drops the addressability checks that powerpc
> never had:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/18e7629b38465ca98f8e7eed639123a13ac3b669
>
> Can you first test with both patches, and then just with the first in
> case that worked?
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On Mon, 21 Jan 2019 at 14:56, Robin Murphy wrote:
>
> On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
> >>
> >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
> >>>>>
> >>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> >>>>>>
> >>>>>> Qualcomm SoCs have an additional level of cache called as
> >>>>>> System cache, aka. Last level cache (LLC). This cache sits right
> >>>>>> before the DDR, and is tightly coupled with the memory controller.
> >>>>>> The clients using this cache request their slices from this
> >>>>>> system cache, make it active, and can then start using it.
> >>>>>> For these clients with smmu, to start using the system cache for
> >>>>>> buffers and, related page tables [1], memory attributes need to be
> >>>>>> set accordingly. This series adds the required support.
> >>>>>
> >>>>> Does this actually improve performance on reads from a device? The
> >>>>> non-cache coherent DMA routines perform an unconditional D-cache
> >>>>> invalidate by VA to the PoC before reading from the buffers filled by
> >>>>> the device, and I would expect the PoC to be defined as lying beyond
> >>>>> the LLC to still guarantee the architected behavior.
> >>>>
> >>>> We have seen performance improvements when running Manhattan
> >>>> GFXBench benchmarks.
> >>>
> >>> Ah ok, that makes sense, since in that case, the data flow is mostly
> >>> to the device, not from the device.
> >>>
> >>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
> >>>> Last level cache (LLC) lies beyond the point of coherency.
> >>>> Non-cache coherent buffers will not be cached to system cache also, and
> >>>> no additional software cache maintenance ops are required for system
> >>>> cache. Pratik can add more if I am missing something.
> >>>>
> >>>> To take care of the memory attributes from DMA APIs side, we can add a
> >>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
> >>>
> >>> So does the device use the correct inner non-cacheable, outer
> >>> writeback cacheable attributes if the SMMU is in pass-through?
> >>>
> >>> We have been looking into another use case where the fact that the
> >>> SMMU overrides memory attributes is causing issues (WC mappings used
> >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> >>> existing attributes, would you still need the SMMU changes?
> >>
> >> Even if we could force a stage 2 mapping with the weakest pagetable
> >> attributes (such that combining would work), there would still need to
> >> be a way to set the TCR attributes appropriately if this behaviour is
> >> wanted for the SMMU's own table walks as well.
> >
> > Isn't that just a matter of implementing support for SMMUs that lack
> > the 'dma-coherent' attribute?
>
> Not quite - in general they need INC-ONC attributes in case there
> actually is something in the architectural outer-cacheable domain.

But is it a problem to use INC-ONC attributes for the SMMU PTW on this
chip? AIUI, the reason for the SMMU changes is to avoid the performance
hit of snooping, which is more expensive than cache maintenance of SMMU
page tables. So are you saying the by-VA cache maintenance is not
relayed to this system cache, resulting in page table updates to be
invisible to masters using INC-ONC attributes?

> The case of the outer cacheability being not that but a hint to control
> non-CPU traffic through some not-quite-transparent cache behind the PoC
> definitely stays wrapped up in qcom-specific magic ;)

I'm not surprised ...
Re: [PATCH] dmaengine: fsl-edma: dma map slave device address
Hi Angelo,

On 18.01.2019 23:50, Angelo Dureghello wrote:
> Hi Laurentiu,
>
> On Fri, Jan 18, 2019 at 12:06:23PM +0200, Laurentiu Tudor wrote:
>> This mapping needs to be created in order for slave dma transfers
>> to work on systems with SMMU. The implementation mostly mimics the
>> one in pl330 dma driver, authored by Robin Murphy.
>>
>> Signed-off-by: Laurentiu Tudor
>> Suggested-by: Robin Murphy
>> ---
>> Original approach was to add the missing mappings in the i2c client driver,
>> see here for discussion:
>> https://patchwork.ozlabs.org/patch/1026013/
>>
>>  drivers/dma/fsl-edma-common.c | 66 +++++++++++++++++++++++++++++++++----
>>  drivers/dma/fsl-edma-common.h |  4 +++
>>  drivers/dma/fsl-edma.c        |  1 +
>>  drivers/dma/mcf-edma.c        |  1 +
>>  4 files changed, 68 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
>> index 8876c4c1bb2c..0e95ee24b6d4 100644
>> --- a/drivers/dma/fsl-edma-common.c
>> +++ b/drivers/dma/fsl-edma-common.c
>> @@ -6,6 +6,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #include "fsl-edma-common.h"
>>
>> @@ -173,12 +174,62 @@ int fsl_edma_resume(struct dma_chan *chan)
>>  }
>>  EXPORT_SYMBOL_GPL(fsl_edma_resume);
>>
>> +static void fsl_edma_unprep_slave_dma(struct fsl_edma_chan *fsl_chan)
>> +{
>> +	if (fsl_chan->dma_dir != DMA_NONE)
>> +		dma_unmap_resource(fsl_chan->vchan.chan.device->dev,
>> +				   fsl_chan->dma_dev_addr,
>> +				   fsl_chan->dma_dev_size,
>> +				   fsl_chan->dma_dir, 0);
>> +	fsl_chan->dma_dir = DMA_NONE;
>> +}
>> +
>> +static bool fsl_edma_prep_slave_dma(struct fsl_edma_chan *fsl_chan,
>> +				    enum dma_transfer_direction dir)
>> +{
>> +	struct device *dev = fsl_chan->vchan.chan.device->dev;
>> +	enum dma_data_direction dma_dir;
>> +	phys_addr_t addr = 0;
>> +	u32 size = 0;
>> +
>> +	switch (dir) {
>> +	case DMA_MEM_TO_DEV:
>> +		dma_dir = DMA_FROM_DEVICE;
>> +		addr = fsl_chan->cfg.dst_addr;
>> +		size = fsl_chan->cfg.dst_maxburst;
>> +		break;
>> +	case DMA_DEV_TO_MEM:
>> +		dma_dir = DMA_TO_DEVICE;
>> +		addr = fsl_chan->cfg.src_addr;
>> +		size = fsl_chan->cfg.src_maxburst;
>> +		break;
>> +	default:
>> +		dma_dir = DMA_NONE;
>> +		break;
>> +	}
>> +
>> +	/* Already mapped for this config? */
>> +	if (fsl_chan->dma_dir == dma_dir)
>> +		return true;
>> +
>> +	fsl_edma_unprep_slave_dma(fsl_chan);
>> +
>> +	fsl_chan->dma_dev_addr = dma_map_resource(dev, addr, size, dma_dir, 0);
>> +	if (dma_mapping_error(dev, fsl_chan->dma_dev_addr))
>> +		return false;
>> +	fsl_chan->dma_dev_size = size;
>> +	fsl_chan->dma_dir = dma_dir;
>> +
>> +	return true;
>> +}
>> +
>>  int fsl_edma_slave_config(struct dma_chan *chan,
>>  				 struct dma_slave_config *cfg)
>>  {
>>  	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
>>
>>  	memcpy(&fsl_chan->cfg, cfg, sizeof(*cfg));
>> +	fsl_edma_unprep_slave_dma(fsl_chan);
>>
>>  	return 0;
>>  }
>> @@ -378,6 +429,9 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
>>  	if (!is_slave_direction(direction))
>>  		return NULL;
>>
>> +	if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
>> +		return NULL;
>> +
>>  	sg_len = buf_len / period_len;
>>  	fsl_desc = fsl_edma_alloc_desc(fsl_chan, sg_len);
>>  	if (!fsl_desc)
>> @@ -409,11 +463,11 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
>>
>>  	if (direction == DMA_MEM_TO_DEV) {
>>  		src_addr = dma_buf_next;
>> -		dst_addr = fsl_chan->cfg.dst_addr;
>> +		dst_addr = fsl_chan->dma_dev_addr;
>>  		soff = fsl_chan->cfg.dst_addr_width;
>>  		doff = 0;
>>  	} else {
>> -		src_addr = fsl_chan->cfg.src_addr;
>> +		src_addr = fsl_chan->dma_dev_addr;
>>  		dst_addr = dma_buf_next;
>>  		soff = 0;
>>  		doff = fsl_chan->cfg.src_addr_width;
>>  	}
>>
>> @@ -444,6 +498,9 @@ struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
>>  	if (!is_slave_direction(direction))
>>  		return NULL;
>>
>> +	if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
>> +		return NULL;
>> +
>>  	fsl_desc = fsl_edma_alloc_desc(fsl_chan, sg_len);
>>  	if (!fsl_desc)
>>  		return NULL;
>> @@ -468,11 +525,11 @@
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On 21/01/2019 13:36, Ard Biesheuvel wrote:
> On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
>> On 21/01/2019 10:50, Ard Biesheuvel wrote:
>>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
>>>> Hi,
>>>>
>>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
>>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
>>>>>> Qualcomm SoCs have an additional level of cache called as
>>>>>> System cache, aka. Last level cache (LLC). This cache sits right
>>>>>> before the DDR, and is tightly coupled with the memory controller.
>>>>>> The clients using this cache request their slices from this
>>>>>> system cache, make it active, and can then start using it.
>>>>>> For these clients with smmu, to start using the system cache for
>>>>>> buffers and, related page tables [1], memory attributes need to be
>>>>>> set accordingly. This series adds the required support.
>>>>>
>>>>> Does this actually improve performance on reads from a device? The
>>>>> non-cache coherent DMA routines perform an unconditional D-cache
>>>>> invalidate by VA to the PoC before reading from the buffers filled by
>>>>> the device, and I would expect the PoC to be defined as lying beyond
>>>>> the LLC to still guarantee the architected behavior.
>>>>
>>>> We have seen performance improvements when running Manhattan
>>>> GFXBench benchmarks.
>>>
>>> Ah ok, that makes sense, since in that case, the data flow is mostly
>>> to the device, not from the device.
>>>
>>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
>>>> Last level cache (LLC) lies beyond the point of coherency.
>>>> Non-cache coherent buffers will not be cached to system cache also, and
>>>> no additional software cache maintenance ops are required for system
>>>> cache. Pratik can add more if I am missing something.
>>>>
>>>> To take care of the memory attributes from DMA APIs side, we can add a
>>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>>>
>>> So does the device use the correct inner non-cacheable, outer
>>> writeback cacheable attributes if the SMMU is in pass-through?
>>>
>>> We have been looking into another use case where the fact that the
>>> SMMU overrides memory attributes is causing issues (WC mappings used
>>> by the radeon and amdgpu driver). So if the SMMU would honour the
>>> existing attributes, would you still need the SMMU changes?
>>
>> Even if we could force a stage 2 mapping with the weakest pagetable
>> attributes (such that combining would work), there would still need to
>> be a way to set the TCR attributes appropriately if this behaviour is
>> wanted for the SMMU's own table walks as well.
>
> Isn't that just a matter of implementing support for SMMUs that lack
> the 'dma-coherent' attribute?

Not quite - in general they need INC-ONC attributes in case there
actually is something in the architectural outer-cacheable domain. The
case of the outer cacheability being not that but a hint to control
non-CPU traffic through some not-quite-transparent cache behind the PoC
definitely stays wrapped up in qcom-specific magic ;)

Robin.
Re: [PATCH 1/3] iommu/arm-smmu: Move to bitmap for arm_smmu_domain attributes
On 21/01/2019 05:53, Vivek Gautam wrote:
> A number of arm_smmu_domain's attributes can be assigned based on
> the iommu domain's attributes. These local attributes are better
> managed by a bitmap. So remove the boolean flags and move to a
> 32-bit bitmap, and enable each bit separately.
>
> Signed-off-by: Vivek Gautam
> ---
>  drivers/iommu/arm-smmu.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 7ebbcf1b2eb3..52b300dfc096 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -257,10 +257,11 @@ struct arm_smmu_domain {
>  	const struct iommu_gather_ops	*tlb_ops;
>  	struct arm_smmu_cfg		cfg;
>  	enum arm_smmu_domain_stage	stage;
> -	bool				non_strict;
>  	struct mutex			init_mutex; /* Protects smmu pointer */
>  	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
>  	struct iommu_domain		domain;
> +#define ARM_SMMU_DOMAIN_ATTR_NON_STRICT	BIT(0)
> +	unsigned int			attr;
>  };
>
>  struct arm_smmu_option_prop {
> @@ -901,7 +902,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  	if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
>  		pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
>
> -	if (smmu_domain->non_strict)
> +	if (smmu_domain->attr & ARM_SMMU_DOMAIN_ATTR_NON_STRICT)
>  		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>
>  	/* Non coherent page table mappings only for Stage-1 */
> @@ -1598,7 +1599,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>  	case IOMMU_DOMAIN_DMA:
>  		switch (attr) {
>  		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			*(int *)data = smmu_domain->non_strict;
> +			*(int *)data = !!(smmu_domain->attr &
> +					  ARM_SMMU_DOMAIN_ATTR_NON_STRICT);
>  			return 0;
>  		default:
>  			return -ENODEV;
> @@ -1638,7 +1640,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>  	case IOMMU_DOMAIN_DMA:
>  		switch (attr) {
>  		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> -			smmu_domain->non_strict = *(int *)data;
> +			smmu_domain->attr |= ARM_SMMU_DOMAIN_ATTR_NON_STRICT;

But what if *data == 0?

Robin.

>  			break;
>  		default:
>  			ret = -ENODEV;
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
>
> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
> >>
> >> Hi,
> >>
> >> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
> >>>
> >>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> >>>>
> >>>> Qualcomm SoCs have an additional level of cache called the System
> >>>> cache, aka. Last level cache (LLC). This cache sits right before the
> >>>> DDR, and is tightly coupled with the memory controller. The clients
> >>>> using this cache request their slices from this system cache, make
> >>>> it active, and can then start using it. For these clients with SMMU,
> >>>> to start using the system cache for buffers and related page tables
> >>>> [1], memory attributes need to be set accordingly. This series adds
> >>>> the required support.
> >>>
> >>> Does this actually improve performance on reads from a device? The
> >>> non-cache coherent DMA routines perform an unconditional D-cache
> >>> invalidate by VA to the PoC before reading from the buffers filled by
> >>> the device, and I would expect the PoC to be defined as lying beyond
> >>> the LLC to still guarantee the architected behavior.
> >>
> >> We have seen performance improvements when running Manhattan
> >> GFXBench benchmarks.
> >
> > Ah ok, that makes sense, since in that case, the data flow is mostly
> > to the device, not from the device.
> >
> >> As for the PoC, from my knowledge on sdm845 the system cache, aka
> >> Last level cache (LLC), lies beyond the point of coherency.
> >> Non-cache coherent buffers will not be cached in the system cache
> >> either, and no additional software cache maintenance ops are required
> >> for the system cache. Pratik can add more if I am missing something.
> >>
> >> To take care of the memory attributes from the DMA API side, we can
> >> add a DMA_ATTR definition to take care of any dma non-coherent API
> >> calls.
> >
> > So does the device use the correct inner non-cacheable, outer
> > writeback cacheable attributes if the SMMU is in pass-through?
> >
> > We have been looking into another use case where the fact that the
> > SMMU overrides memory attributes is causing issues (WC mappings used
> > by the radeon and amdgpu driver). So if the SMMU would honour the
> > existing attributes, would you still need the SMMU changes?
>
> Even if we could force a stage 2 mapping with the weakest pagetable
> attributes (such that combining would work), there would still need to
> be a way to set the TCR attributes appropriately if this behaviour is
> wanted for the SMMU's own table walks as well.

Isn't that just a matter of implementing support for SMMUs that lack the
'dma-coherent' attribute?
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On 21/01/2019 10:50, Ard Biesheuvel wrote:
> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
>> Hi,
>>
>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
>>>> Qualcomm SoCs have an additional level of cache called the System
>>>> cache, aka. Last level cache (LLC). This cache sits right before the
>>>> DDR, and is tightly coupled with the memory controller. The clients
>>>> using this cache request their slices from this system cache, make
>>>> it active, and can then start using it. For these clients with SMMU,
>>>> to start using the system cache for buffers and related page tables
>>>> [1], memory attributes need to be set accordingly. This series adds
>>>> the required support.
>>>
>>> Does this actually improve performance on reads from a device? The
>>> non-cache coherent DMA routines perform an unconditional D-cache
>>> invalidate by VA to the PoC before reading from the buffers filled by
>>> the device, and I would expect the PoC to be defined as lying beyond
>>> the LLC to still guarantee the architected behavior.
>>
>> We have seen performance improvements when running Manhattan
>> GFXBench benchmarks.
>
> Ah ok, that makes sense, since in that case, the data flow is mostly
> to the device, not from the device.
>
>> As for the PoC, from my knowledge on sdm845 the system cache, aka
>> Last level cache (LLC), lies beyond the point of coherency.
>> Non-cache coherent buffers will not be cached in the system cache
>> either, and no additional software cache maintenance ops are required
>> for the system cache. Pratik can add more if I am missing something.
>>
>> To take care of the memory attributes from the DMA API side, we can
>> add a DMA_ATTR definition to take care of any dma non-coherent API
>> calls.
>
> So does the device use the correct inner non-cacheable, outer
> writeback cacheable attributes if the SMMU is in pass-through?
>
> We have been looking into another use case where the fact that the
> SMMU overrides memory attributes is causing issues (WC mappings used
> by the radeon and amdgpu driver). So if the SMMU would honour the
> existing attributes, would you still need the SMMU changes?

Even if we could force a stage 2 mapping with the weakest pagetable
attributes (such that combining would work), there would still need to
be a way to set the TCR attributes appropriately if this behaviour is
wanted for the SMMU's own table walks as well.

Robin.
Re: [PATCH 1/2] iommu/io-pgtable-arm: Add support for non-coherent page tables
On 17/01/2019 09:27, Vivek Gautam wrote:
> From Robin's comment [1] about touching TCR configurations -
>
> "TBH if we're going to touch the TCR attributes at all then we should
> probably correct that sloppiness first - there's an occasional argument
> for using non-cacheable pagetables even on a coherent SMMU if reducing
> snoop traffic/latency on walks outweighs the cost of cache maintenance
> on PTE updates, but anyone thinking they can get that by overriding
> dma-coherent silently gets the worst of both worlds thanks to this
> current TCR value."
>
> We have the IO_PGTABLE_QUIRK_NO_DMA quirk present, but we don't force
> anybody _not_ using a dma-coherent SMMU to have non-cacheable page
> table mappings. Having another quirk flag can help in having
> non-cacheable memory for page tables once and for all.
>
> [1] https://lore.kernel.org/patchwork/patch/1020906/
>
> Signed-off-by: Vivek Gautam
> ---
>  drivers/iommu/io-pgtable-arm.c | 17 -
>  drivers/iommu/io-pgtable.h     |  6 ++
>  2 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 237cacd4a62b..c76919c30f1a 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -780,7 +780,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>  	struct arm_lpae_io_pgtable *data;
>
>  	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
> -			    IO_PGTABLE_QUIRK_NON_STRICT))
> +			    IO_PGTABLE_QUIRK_NON_STRICT |
> +			    IO_PGTABLE_QUIRK_NON_COHERENT))
>  		return NULL;
>
>  	data = arm_lpae_alloc_pgtable(cfg);
> @@ -788,9 +789,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>  		return NULL;
>
>  	/* TCR */
> -	reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> -	      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> -	      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> +	reg = ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT;
> +
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT)
> +		reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
> +		       ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT;
> +	else
> +		reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT |
> +		       ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
>
>  	switch (ARM_LPAE_GRANULE(data)) {
>  	case SZ_4K:
> @@ -873,7 +879,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
>
>  	/* The NS quirk doesn't apply at stage 2 */
>  	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
> -			    IO_PGTABLE_QUIRK_NON_STRICT))
> +			    IO_PGTABLE_QUIRK_NON_STRICT |
> +			    IO_PGTABLE_QUIRK_NON_COHERENT))
>  		return NULL;
>
>  	data = arm_lpae_alloc_pgtable(cfg);
> diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> index 47d5ae559329..46604cf7b017 100644
> --- a/drivers/iommu/io-pgtable.h
> +++ b/drivers/iommu/io-pgtable.h
> @@ -75,6 +75,11 @@ struct io_pgtable_cfg {
>   * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
>   *	on unmap, for DMA domains using the flush queue mechanism for
>   *	delayed invalidation.
> + *
> + * IO_PGTABLE_QUIRK_NON_COHERENT: Enforce non-cacheable mappings for
> + *	pagetables even on a coherent SMMU for cases where reducing
> + *	snoop traffic/latency on walks outweighs the cost of cache
> + *	maintenance on PTE updates.

Hmm, we can't actually "enforce" anything with this as-is - all we're
doing is setting the attributes that the IOMMU will use for pagetable
walks, and that has no impact on how the CPU actually writes PTEs to
memory. In particular, in the case of a hardware-coherent IOMMU which is
described as such, even if we make the dma_map/sync calls they still
won't do anything since they 'know' that the IOMMU is coherent. Thus if
we then set up a non-cacheable TCR we would have no proper means of
making pagetables correctly visible to the walker. Aw crap, this is
turning out to be a microcosm of the PCIe no-snoop mess... :(

To start with, at least, what we want is to set a non-cacheable TCR if
the IOMMU is *not* coherent (as far as Linux is concerned - that
includes the firmware-lying-about-the-hardware situation I was alluding
to before), but even that isn't necessarily as straightforward as it
seems. AFAICS, if QUIRK_NO_DMA is set then we definitely have to use a
cacheable TCR; we can't strictly rely on the inverse being true, but in
practice we *might* get away with it since we already disallow most
cases in which the DMA API calls would actually do anything for a
known-coherent IOMMU device.

Robin.

>   */
>  #define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
>  #define IO_PGTABLE_QUIRK_NO_PERMS
Re: [PATCH v1] iommu/s390: Declare s390 iommu reserved regions
On 18/01/2019 13:29, Pierre Morel wrote:
> On 17/01/2019 14:02, Robin Murphy wrote:
>> On 15/01/2019 17:37, Pierre Morel wrote:
>>> The s390 iommu can only allow DMA transactions between the zPCI
>>> device entries start_dma and end_dma. Let's declare the regions
>>> before start_dma and after end_dma as reserved regions using the
>>> appropriate callback in iommu_ops. The reserved region may later be
>>> retrieved from sysfs or from the vfio iommu internal interface.
>>
>> For this particular case, I think the best solution is to give VFIO
>> the ability to directly interrogate the domain geometry (which s390
>> appears to set correctly already). The idea of reserved regions was
>> really for 'unexpected' holes inside the usable address space - using
>> them to also describe places that are entirely outside that address
>> space rather confuses things IMO.
>>
>> Furthermore, even if we *did* end up going down the route of actively
>> reserving addresses beyond the usable aperture, it doesn't seem
>> sensible for individual drivers to do it themselves when the core API
>> already describes the relevant information generically.
>>
>> Robin.
>
> Robin,
>
> I already posted a patch retrieving the geometry through
> VFIO_IOMMU_GET_INFO using a specific capability for the geometry [1],
> and AFAIU, Alex did not agree with this.
>
> What is different in what you propose?

I didn't mean to imply that aperture and reserved regions are mutually
exclusive, just that they are conceptually distinct things, i.e. there
is a fundamental difference between "address which could in theory be
mapped but wouldn't work as expected" and "address which is physically
impossible to map at all". Admittedly I hadn't closely followed all of
the previous discussions, and Alex has a fair point - for VFIO users who
will mostly care about checking whether two address maps are compatible,
it probably is more useful to just describe a single list of usable
regions, rather than the absolute bounds plus a list of unusable holes
within them.

That still doesn't give us any need to conflate things throughout the
kernel internals, though - the typical usage there is to size an IOVA
allocator or page table based on the aperture, then carve out any
necessary reservations. In that context, having to be aware of and
handle 'impossible' reservations outside the aperture just invites bugs
and adds complexity that would be better avoided.

Robin.

> @Alex: I was hoping that this patch goes in your direction.
> What do you think?
>
> Thanks,
> Pierre
>
> [1]: https://lore.kernel.org/patchwork/patch/1030369/
>
> This seems to me related with the work Shameer has started on
> vfio_iommu_type1 so I add Alex and Shameer to the CC list.
>
> Pierre Morel (1):
>   iommu/s390: Declare s390 iommu reserved regions
>
>  drivers/iommu/s390-iommu.c | 29 +
>  1 file changed, 29 insertions(+)
Re: [PATCH v1] iommu/s390: Declare s390 iommu reserved regions
On 18/01/2019 14:51, Jean-Philippe Brucker wrote:
> Hi Pierre,
>
> On 18/01/2019 13:29, Pierre Morel wrote:
>> On 17/01/2019 14:02, Robin Murphy wrote:
>>> On 15/01/2019 17:37, Pierre Morel wrote:
>>>> The s390 iommu can only allow DMA transactions between the zPCI
>>>> device entries start_dma and end_dma.
> ...
>> I already posted a patch retrieving the geometry through
>> VFIO_IOMMU_GET_INFO using a specific capability for the geometry [1],
>> and AFAIU, Alex did not agree with this.
>
> On arm we also need to report the IOMMU geometry to userspace (max IOVA
> size in particular). Shameer has been working on a solution [2] that
> presents a unified view of both geometry and reserved regions in the
> VFIO_IOMMU_GET_INFO call, and I think we should go with that. If I
> understand correctly it's currently blocked on the RMRR problem and
> we're waiting for Jacob or Ashok to take a look at it, as Kevin pinged
> them on thread [1]?
>
> [2] https://lkml.org/lkml/2018/4/18/293
>
> Thanks,
> Jean

Hi Jean,

I hoped that this proposition went in the same direction, based on the
following assumptions:

- The goal of get_resv_regions is defined in iommu.h as:
  * @get_resv_regions: Request list of reserved regions for a device

- An IOMMU reserved region is a region which should not be mapped.
  Isn't that exactly what happens outside the aperture? Shouldn't it be
  reported by the IOMMU reserved regions?

- If we use VFIO and want to get all reserved regions, we will have the
  VFIO_IOMMU_GET_INFO call provided by Shameer, and it can get all
  reserved regions depending on the IOMMU driver itself at once by
  calling the get_resv_regions callback, instead of having to merge
  them with the aperture.

- If there are other reserved regions, depending on the system
  configuration and not on the IOMMU itself, the VFIO_IOMMU_GET_INFO
  call will have to merge them with the regions obtained from the IOMMU
  driver.

- If we do not use QEMU nor VFIO at all, AFAIU, the standard way to
  retrieve the reserved regions associated with a device is to call the
  get_resv_regions callback of the associated IOMMU.

Please tell me where I am wrong.

Regards,
Pierre

>> What is different in what you propose?
>>
>> @Alex: I was hoping that this patch goes in your direction.
>> What do you think?
>>
>> Thanks,
>> Pierre
>>
>> [1]: https://lore.kernel.org/patchwork/patch/1030369/
>>
>> This seems to me related with the work Shameer has started on
>> vfio_iommu_type1 so I add Alex and Shameer to the CC list.
>>
>> Pierre Morel (1):
>>   iommu/s390: Declare s390 iommu reserved regions
>>
>>  drivers/iommu/s390-iommu.c | 29 +
>>  1 file changed, 29 insertions(+)

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
Re: [PATCH v7 0/7] Add virtio-iommu driver
Hi,

On 18/01/2019 15:51, Michael S. Tsirkin wrote:
> On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
>> Implement the virtio-iommu driver, following specification v0.9 [1].
>>
>> This is a simple rebase onto Linux v5.0-rc2. We now use the
>> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
>> dev->iommu_fwspec, but there aren't any functional changes from v6 [2].
>>
>> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
>> on Arm, and enable device assignment to guest userspace. In this
>> use-case the mappings are static, and don't require optimal
>> performance, so this series tries to keep things simple. However there
>> is plenty more to do for features and optimizations, and having this
>> base in v5.1 would be good. Given that most of the changes are to
>> drivers/iommu, I believe the driver and future changes should go via
>> the IOMMU tree.
>>
>> You can find the Linux driver and kvmtool device on the v0.9.2 branches
>> [3], module and x86 support on virtio-iommu/devel. Also tested with
>> Eric's QEMU device [4]. Please note that the series depends on Robin's
>> probe-deferral fix [5], which will hopefully land in v5.0.
>>
>> [1] Virtio-iommu specification v0.9, sources and pdf
>>     git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>>     http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
>>
>> [2] [PATCH v6 0/7] Add virtio-iommu driver
>>     https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
>>
>> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>>     git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
>>
>> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>>     https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
>>
>> [5] [PATCH] iommu/of: Fix probe-deferral
>>     https://www.spinics.net/lists/arm-kernel/msg698371.html
>
> Thanks for the work!
> So really my only issue with this is that there's no way for the IOMMU
> to describe the devices that it covers.
>
> As a result that is then done in a platform-specific way.
>
> And this means that for example it does not solve the problem that e.g.
> some power people have in that their platform simply does not have a way
> to specify which devices are covered by the IOMMU.

Isn't power using device tree? I haven't looked much at power because I
was told a while ago that they already paravirtualize their IOMMU and
don't need virtio-iommu, except perhaps for some legacy platforms. Or
something along those lines. But I would certainly be interested in
enabling the IOMMU for more architectures.

As for the enumeration problem, I still don't think we can get much
better than DT and ACPI as solutions (and IMO they are necessary to make
this device portable). But I believe that getting DT and ACPI support is
just a one-off inconvenience. That is, once the required bindings are
accepted, any future extension can then be done at the virtio level with
feature bits and probe requests, without having to update ACPI or DT.

Thanks,
Jean

> Solving that problem would make me much more excited about this device.
>
> On the other hand I can see that while there have been some
> developments most of the code has been stable for quite a while now.
>
> So what I am trying to do right about now, is making a small module
> that loads early and pokes at the IOMMU sufficiently to get the data
> about which devices use the IOMMU out of it using standard virtio
> config space. IIUC it's claimed to be impossible without messy changes
> to the boot sequence.
>
> If I succeed at least on some platforms I'll ask that this design is
> worked into this device, minimizing info that goes through DT/ACPI. If
> I see I can't make it in time to meet the next merge window, I plan
> merging the existing patches using DT (barring surprises).
>
> As I only have a very small amount of time to spend on this attempt, if
> someone else wants to try doing that in parallel, that would be great!
>
>> Jean-Philippe Brucker (7):
>>   dt-bindings: virtio-mmio: Add IOMMU description
>>   dt-bindings: virtio: Add virtio-pci-iommu node
>>   of: Allow the iommu-map property to omit untranslated devices
>>   PCI: OF: Initialize dev->fwnode appropriately
>>   iommu: Add virtio-iommu driver
>>   iommu/virtio: Add probe request
>>   iommu/virtio: Add event queue
>>
>>  .../devicetree/bindings/virtio/iommu.txt |   66 +
>>  .../devicetree/bindings/virtio/mmio.txt  |   30 +
>>  MAINTAINERS                              |    7 +
>>  drivers/iommu/Kconfig                    |   11 +
>>  drivers/iommu/Makefile                   |    1 +
>>  drivers/iommu/virtio-iommu.c             | 1158 +
>>  drivers/of/base.c                        |   10 +-
>>  drivers/pci/of.c                         |    7 +
>>  include/uapi/linux/virtio_ids.h          |    1 +
>>  include/uapi/linux/virtio_iommu.h        |  161 +++
>>
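The "one-off inconvenience" Jean describes is the DT description of which devices the virtio-iommu translates. A rough sketch of what the proposed virtio-pci-iommu binding looks like (node names, the `reg` encoding, and the `iommu-map` ranges here are my reading of the v0.9 proposal and may differ from the final binding):

```dts
/* A PCI host bridge whose devices are translated by a virtio-iommu
 * that is itself a PCI function in the same domain (assumed 00:01.0). */
pcie@10000000 {
	compatible = "pci-host-ecam-generic";

	/* The IOMMU programming interface sits at devfn 0x8 (00:01.0) */
	iommu0: iommu@0008 {
		compatible = "virtio,pci-iommu";
		reg = <0x00000800 0 0 0 0>;
		#iommu-cells = <1>;
	};

	/* Every function except the IOMMU itself is translated by it */
	iommu-map = <0x0 &iommu0 0x0 0x8>,
		    <0x9 &iommu0 0x9 0xfff7>;
};
```

This is exactly the enumeration data MST would prefer to pull out of virtio config space instead; the gap at RID 0x8 is why the series also allows `iommu-map` to omit untranslated devices.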
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
>
> Hi,
>
> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
> >
> > On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> > >
> > > Qualcomm SoCs have an additional level of cache called as
> > > System cache, aka. Last level cache (LLC). This cache sits right
> > > before the DDR, and is tightly coupled with the memory controller.
> > > The clients using this cache request their slices from this
> > > system cache, make it active, and can then start using it.
> > > For these clients with smmu, to start using the system cache for
> > > buffers and, related page tables [1], memory attributes need to be
> > > set accordingly. This series add the required support.
> > >
> >
> > Does this actually improve performance on reads from a device? The
> > non-cache coherent DMA routines perform an unconditional D-cache
> > invalidate by VA to the PoC before reading from the buffers filled by
> > the device, and I would expect the PoC to be defined as lying beyond
> > the LLC to still guarantee the architected behavior.
>
> We have seen performance improvements when running Manhattan
> GFXBench benchmarks.
>

Ah ok, that makes sense, since in that case, the data flow is mostly
to the device, not from the device.

> As for the PoC, from my knowledge on sdm845 the system cache, aka
> Last level cache (LLC) lies beyond the point of coherency.
> Non-cache coherent buffers will not be cached to system cache also, and
> no additional software cache maintenance ops are required for system
> cache. Pratik can add more if I am missing something.
>
> To take care of the memory attributes from DMA APIs side, we can add a
> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>

So does the device use the correct inner non-cacheable, outer
writeback cacheable attributes if the SMMU is in pass-through?

We have been looking into another use case where the fact that the
SMMU overrides memory attributes is causing issues (WC mappings used
by the radeon and amdgpu driver). So if the SMMU would honour the
existing attributes, would you still need the SMMU changes?
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
Hi,

On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
>
> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> >
> > Qualcomm SoCs have an additional level of cache called the System
> > cache, aka. Last level cache (LLC). This cache sits right before the
> > DDR, and is tightly coupled with the memory controller. The clients
> > using this cache request their slices from this system cache, make
> > it active, and can then start using it. For these clients with SMMU,
> > to start using the system cache for buffers and related page tables
> > [1], memory attributes need to be set accordingly. This series adds
> > the required support.
> >
>
> Does this actually improve performance on reads from a device? The
> non-cache coherent DMA routines perform an unconditional D-cache
> invalidate by VA to the PoC before reading from the buffers filled by
> the device, and I would expect the PoC to be defined as lying beyond
> the LLC to still guarantee the architected behavior.

We have seen performance improvements when running Manhattan
GFXBench benchmarks.

As for the PoC, from my knowledge on sdm845 the system cache, aka
Last level cache (LLC), lies beyond the point of coherency.
Non-cache coherent buffers will not be cached in the system cache
either, and no additional software cache maintenance ops are required
for the system cache. Pratik can add more if I am missing something.

To take care of the memory attributes from the DMA API side, we can add
a DMA_ATTR definition to take care of any dma non-coherent API calls.

Regards
Vivek

> > This change is a realisation of the following changes from downstream
> > msm-4.9:
> >   iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT [2]
> >   iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT [3]
> >
> > Changes since v2:
> >  - Split the patches into io-pgtable-arm driver and arm-smmu driver.
> >  - Converted smmu domain attributes to a bitmap, so multiple
> >    attributes can be managed easily.
> >  - With addition of non-coherent page table mapping support [4], this
> >    patch series now aligns with the understanding of upgrading the
> >    non-coherent devices to use some level of outer cache.
> >  - Updated the macros and comments to reflect the use of
> >    QCOM_SYS_CACHE.
> >  - QCOM_SYS_CACHE can still be used at stage 2, so that doesn't depend
> >    on stage-1 mapping.
> >  - Added change to disable the attribute from
> >    arm_smmu_domain_set_attr() when needed.
> >  - Removed the page protection controls for QCOM_SYS_CACHE at the DMA
> >    API level.
> >
> > Goes on top of the non-coherent page tables support patch series [4]
> >
> > [1] https://patchwork.kernel.org/patch/10302791/
> > [2] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9=bf762276796e79ca90014992f4d9da5593fa7d51
> > [3] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> > [4] https://lore.kernel.org/patchwork/cover/1032938/
> >
> > Vivek Gautam (3):
> >   iommu/arm-smmu: Move to bitmap for arm_smmu_domain attributes
> >   iommu/io-pgtable-arm: Add support to use system cache
> >   iommu/arm-smmu: Add support to use system cache
> >
> >  drivers/iommu/arm-smmu.c       | 28
> >  drivers/iommu/io-pgtable-arm.c | 15 +--
> >  drivers/iommu/io-pgtable.h     |  4
> >  include/linux/iommu.h          |  2 ++
> >  4 files changed, 43 insertions(+), 6 deletions(-)
> >
> > --
> > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
> > member of Code Aurora Forum, hosted by The Linux Foundation

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
[git pull] IOMMU Fixes for Linux v5.0-rc1
Hi Linus,

The following changes since commit bfeffd155283772bbe78c6a05dec7c0128ee500c:

  Linux 5.0-rc1 (2019-01-06 17:08:20 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fixes-v5.0-rc3

for you to fetch changes up to e8e683ae9a736407a20135df7809090a446db707:

  iommu/of: Fix probe-deferral (2019-01-11 12:28:24 +0100)

IOMMU Fix for Linux v5.0-rc3

One fix only for now:

	- Fix probe deferral in iommu/of code (broke with recent changes
	  to iommu_ops->add_device invocation)

Robin Murphy (1):
      iommu/of: Fix probe-deferral

 drivers/iommu/of_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Please pull.

Thanks,

	Joerg