Re: [PATCH 1/9] mm: Introduce new vm_insert_range and vm_insert_range_buggy API

2019-01-21 Thread Souptick Joarder
On Fri, Jan 11, 2019 at 8:33 PM Souptick Joarder  wrote:
>
> Previously, drivers had their own ways of mapping a range of
> kernel pages/memory into a user vma, typically by invoking
> vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating new functions and using them across
> the drivers.
>
> vm_insert_range() is the API to be used to map a range of
> kernel memory/pages in drivers that take vm_pgoff into account.
>
> vm_insert_range_buggy() is the API to be used to map a range of
> kernel memory/pages in drivers that do not take vm_pgoff into
> account; vm_pgoff is passed as 0 by default for those drivers.
>
> We _could_ then at a later date "fix" these drivers which are using
> vm_insert_range_buggy() to behave according to the normal vm_pgoff
> offsetting simply by removing the _buggy suffix on the function
> name, and if that causes regressions, it gives us an easy way to revert.
>
> Signed-off-by: Souptick Joarder 
> Suggested-by: Russell King 
> Suggested-by: Matthew Wilcox 

Any comments on these APIs?

> ---
>  include/linux/mm.h |  4 ++++
>  mm/memory.c        | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/nommu.c         | 14 ++++++++++++++
>  3 files changed, 99 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5411de9..9d1dff6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2514,6 +2514,10 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>  int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
> unsigned long pfn, unsigned long size, pgprot_t);
> int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
> +int vm_insert_range(struct vm_area_struct *vma, struct page **pages,
> +   unsigned long num);
> +int vm_insert_range_buggy(struct vm_area_struct *vma, struct page **pages,
> +   unsigned long num);
>  vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
> unsigned long pfn);
> vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 4ad2d29..00e66df 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1520,6 +1520,87 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_page);
>
> +/**
> + * __vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @pages: pointer to array of source kernel pages
> + * @num: number of pages in page array
> + * @offset: user's requested vm_pgoff
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma.
> + *
> + * If we fail to insert any page into the vma, the function will return
> + * immediately leaving any previously inserted pages present.  Callers
> + * from the mmap handler may immediately return the error as their caller
> + * will destroy the vma, removing any successfully inserted pages. Other
> + * callers should make their own arrangements for calling unmap_region().
> + *
> + * Context: Process context.
> + * Return: 0 on success and error code otherwise.
> + */
> +static int __vm_insert_range(struct vm_area_struct *vma, struct page **pages,
> +   unsigned long num, unsigned long offset)
> +{
> +   unsigned long count = vma_pages(vma);
> +   unsigned long uaddr = vma->vm_start;
> +   int ret, i;
> +
> +   /* Fail if the user requested offset is beyond the end of the object */
> +   if (offset > num)
> +   return -ENXIO;
> +
> +   /* Fail if the user requested size exceeds available object size */
> +   if (count > num - offset)
> +   return -ENXIO;
> +
> +   for (i = 0; i < count; i++) {
> +   ret = vm_insert_page(vma, uaddr, pages[offset + i]);
> +   if (ret < 0)
> +   return ret;
> +   uaddr += PAGE_SIZE;
> +   }
> +
> +   return 0;
> +}
> +
> +/**
> + * vm_insert_range - insert range of kernel pages starting at a non-zero offset
> + * @vma: user vma to map to
> + * @pages: pointer to array of source kernel pages
> + * @num: number of pages in page array
> + *
> + * Maps an object consisting of `num' pages, catering for the user's
> + * requested vm_pgoff.
> + *
> + * Context: Process context. Called by mmap handlers.
> + * Return: 0 on success and error code otherwise.
> + */
> +int vm_insert_range(struct vm_area_struct *vma, struct page **pages,
> +   unsigned long num)
> +{
> +   return __vm_insert_range(vma, pages, num, vma->vm_pgoff);
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> +/**
> + * vm_insert_range_buggy - insert range of kernel pages starting with zero offset
> + * @vma: user vma to map to
> + * @pages: pointer to array of source kernel pages
> 

Re: [PATCH 2/2] iommu/arm-smmu: Add support for non-coherent page table mappings

2019-01-21 Thread Will Deacon
On Mon, Jan 21, 2019 at 11:35:30AM +0530, Vivek Gautam wrote:
> On Sun, Jan 20, 2019 at 5:31 AM Will Deacon  wrote:
> > On Thu, Jan 17, 2019 at 02:57:18PM +0530, Vivek Gautam wrote:
> > > Adding a device tree option for arm smmu to enable non-cacheable
> > > memory for page tables.
> > > We already enable a smmu feature for coherent walk based on
> > > whether the smmu device is dma-coherent or not. Have an option
> > > to enable non-cacheable page table memory to force set it for
> > > particular smmu devices.
> >
> > Hmm, I must be missing something here. What is the difference between this
> > new property, and simply omitting dma-coherent on the SMMU?
> 
> So, this is what I understood from the email thread for Last level
> cache support: Robin pointed out that we may need to add support
> for setting non-cacheable mappings in the TCR.
> Currently, we don't do that for SMMUs that omit dma-coherent.
> We rely on the interconnect to handle the configuration set in the
> TCR, and let the interconnect ignore the cacheability if it can't
> support it.

I think that's a bug. With that fixed, can you get what you want by omitting
"dma-coherent"?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] arm64/xen: fix xen-swiotlb cache flushing

2019-01-21 Thread Stefano Stabellini
On Sat, 19 Jan 2019, Christoph Hellwig wrote:
> [full quote deleted, please take a little more care when quoting]
> 
> On Fri, Jan 18, 2019 at 04:44:23PM -0800, Stefano Stabellini wrote:
> > >  #ifdef CONFIG_XEN
> > > - if (xen_initial_domain()) {
> > > - dev->archdata.dev_dma_ops = dev->dma_ops;
> > > + if (xen_initial_domain())
> > >   dev->dma_ops = xen_dma_ops;
> > > - }
> > >  #endif
> > >  }
> > 
> > This is an optional suggestion, but it would be nice to add a check on
> > dev->dma_ops being unset here, something like:
> > 
> >   #ifdef CONFIG_XEN
> >   if (xen_initial_domain()) {
> >   if (dev->dma_ops != NULL)
> >   warning/error
> >   dev->dma_ops = xen_dma_ops;
> >   }
> > 
> > Does it make sense?
> 
> Well, no such check existed before, so this probably should be a
> separate patch if we care enough.  I have a series for 5.1 pending
> that moves the IOMMU handling to the common code, which will make
> the ops assignment a lot cleaner, and I guess I could fold such
> a check into that.  Doing it now would just create churn as it would
> have to get reworked anyway.

Fine by me, thank you!


> > Reviewed-by: Stefano Stabellini 
> 
> Where should we pick this up?  I could pick it up through the dma-mapping
> tree given that is where the problem is introduced, but the Xen or arm64
> trees would also fit.

I am happy for you to carry it in the dma-mapping tree, especially if
you have other fixes to send. Otherwise, let me know.


Re: [git pull] IOMMU Fixes for Linux v5.0-rc1

2019-01-21 Thread pr-tracker-bot
The pull request you sent on Mon, 21 Jan 2019 10:45:42 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
> tags/iommu-fixes-v5.0-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/52e60b754438f34d23348698534e9ca63cd751d7

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


Re: [PATCH] iommu/vt-d: Implement dma_[un]map_resource()

2019-01-21 Thread Christoph Hellwig
On Sat, Jan 19, 2019 at 11:46:22AM -0700, Logan Gunthorpe wrote:
> > Instead of having two tiny wrappers I'd just revert
> > 964f2311a6862f1fbcc044d0828ad90030928b7f if we need to pass a real
> > physical address now.
> 
> Ok, I can resubmit this with that cleanup. Should I do it in two commits
> (1 revert + 1 new) or is one commit that just restores the original
> helper fine?

I think a single commit is fine.


Re: [PATCH v1] iommu/s390: Declare s390 iommu reserved regions

2019-01-21 Thread Jean-Philippe Brucker
On 21/01/2019 11:51, Pierre Morel wrote:
> On 18/01/2019 14:51, Jean-Philippe Brucker wrote:
>> Hi Pierre,
>>
>> On 18/01/2019 13:29, Pierre Morel wrote:
>>> On 17/01/2019 14:02, Robin Murphy wrote:
>>>> On 15/01/2019 17:37, Pierre Morel wrote:
>>>>> The s390 iommu can only allow DMA transactions between the zPCI device
>>>>> entries start_dma and end_dma.
>
> 
> ...
> 
>>>
>>> I already posted a patch retrieving the geometry through
>>> VFIO_IOMMU_GET_INFO using a specific capability for the geometry [1],
>>> and AFAIU, Alex did not agree with this.
>>
>> On arm we also need to report the IOMMU geometry to userspace (max IOVA
>> size in particular). Shameer has been working on a solution [2] that
>> presents a unified view of both geometry and reserved regions into the
>> VFIO_IOMMU_GET_INFO call, and I think we should go with that. If I
>> understand correctly it's currently blocked on the RMRR problem and
>> we're waiting for Jacob or Ashok to take a look at it, as Kevin pinged
>> them on thread [1]?
>>
>> [2] https://lkml.org/lkml/2018/4/18/293
>>
>> Thanks,
>> Jean
>>
> 
> Hi Jean,
> 
> I hoped that this proposition went in the same direction, based on the
> following assumptions:
> 
> 
> - The goal of get_resv_regions is defined in iommu.h as:
> -
> * @get_resv_regions: Request list of reserved regions for a device
> -
> 
> - An IOMMU reserved region is a region which should not be mapped.
> Isn't that exactly what happens outside the aperture?
> Shouldn't it be reported by the IOMMU reserved regions?
> 
> - If we use VFIO and want to get all reserved regions, we will have the
> VFIO_IOMMU_GET_INFO call provided by Shameer, and it can get all reserved
> regions depending on the iommu driver itself at once by calling the
> get_resv_regions callback, instead of having to merge them with the
> aperture.
> 
> - If there are other reserved regions, depending on the system
> configuration and not on the IOMMU itself, the VFIO_IOMMU_GET_INFO call
> will have to merge them with the regions obtained from the iommu driver.

I guess that would only happen if VFIO wanted to reserve IOVA regions
for its own use. But I don't see how that relates to the aperture.

> - If we do not use QEMU nor VFIO at all, AFAIU, the standard way to
> retrieve the reserved regions associated with a device is to call the
> get_resv_regions callback from the associated iommu.
> 
> Please tell me where I am wrong.

Those are good points in my opinion (assuming the new reserved regions
for the aperture are done in the core API).

To be reliable though, userspace can't get the aperture from sysfs
reserved_regions, it will have to get it from VFIO. If a new application
expects the aperture to be described in reserved_regions but is running
under an old kernel, it will just assume a 64-bit aperture by mistake.
Unless we introduce a new IOMMU_RESV_* type to distinguish the aperture?

Another thing to note is that we're currently adding nested support to
VFIO for Arm and x86 archs. It requires providing low-level and
sometimes arch-specific information about the IOMMU to userspace,
information that is easier to describe using sysfs (e.g.
/sys/class/iommu//*) than a VFIO ioctl. Among them is the input
address size of the IOMMU, so we'll end up with redundant information in
sysfs on x86 and Arm sides, but that's probably harmless.
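As a sketch of what the sysfs interface mentioned above exposes, each line of a group's `reserved_regions` file is documented as `start end type`; a small shell example parsing one such line (the sample values below are illustrative, not from this thread):

```shell
# Parse one line in the "start end type" format used by
# /sys/kernel/iommu_groups/<n>/reserved_regions (sample data below).
line="0x00000000fee00000 0x00000000feefffff msi"

# Split the line into its three fields using positional parameters
set -- $line
start=$1
end=$2
type=$3

echo "type=$type start=$start end=$end"
```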

Thanks,
Jean


Re: [PATCH] ARM: dma-mapping: Clear DMA ops on teardown

2019-01-21 Thread Thierry Reding
On Mon, Jan 21, 2019 at 02:52:16PM +, Robin Murphy wrote:
> Installing the appropriate non-IOMMU DMA ops in arm_iommu_detach_device()
> serves the case where IOMMU-aware drivers choose to control their own
> mapping but still make DMA API calls, however it also affects the case
> when the arch code itself tears down the mapping upon driver unbinding,
> where the ops now get left in place and can inhibit arch_setup_dma_ops()
> on subsequent re-probe attempts.
> 
> Fix the latter case by making sure that arch_teardown_dma_ops() cleans
> up whenever the ops were automatically installed by its counterpart.
> 
> Reported-by: Tobias Jakobi 
> Reported-by: Marek Szyprowski 
> Fixes: 1874619a7df4 "ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()"
> Tested-by: Tobias Jakobi 
> Signed-off-by: Robin Murphy 
> ---
> 
> Sorry for the delay - there was a giant email infrastructure cock-up just
> at the point I wanted to go back through my archive and double-check the
> discussion around the original commit...
> 
> Robin.
> 
>  arch/arm/mm/dma-mapping.c | 2 ++
>  1 file changed, 2 insertions(+)

I had also tested your draft on Tegra last week and this looks
identical, so:

Tested-by: Thierry Reding 



Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Robin Murphy

On 21/01/2019 14:24, Ard Biesheuvel wrote:
> On Mon, 21 Jan 2019 at 14:56, Robin Murphy  wrote:
>> On 21/01/2019 13:36, Ard Biesheuvel wrote:
>>> On Mon, 21 Jan 2019 at 14:25, Robin Murphy  wrote:
>>>> On 21/01/2019 10:50, Ard Biesheuvel wrote:
>>>>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam  wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel  wrote:
>>>>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  wrote:
>>>>>>>> Qualcomm SoCs have an additional level of cache called as
>>>>>>>> System cache, aka. Last level cache (LLC). This cache sits right
>>>>>>>> before the DDR, and is tightly coupled with the memory controller.
>>>>>>>> The clients using this cache request their slices from this
>>>>>>>> system cache, make it active, and can then start using it.
>>>>>>>> For these clients with smmu, to start using the system cache for
>>>>>>>> buffers and, related page tables [1], memory attributes need to be
>>>>>>>> set accordingly. This series adds the required support.
>>>>>>>
>>>>>>> Does this actually improve performance on reads from a device? The
>>>>>>> non-cache coherent DMA routines perform an unconditional D-cache
>>>>>>> invalidate by VA to the PoC before reading from the buffers filled by
>>>>>>> the device, and I would expect the PoC to be defined as lying beyond
>>>>>>> the LLC to still guarantee the architected behavior.
>>>>>>
>>>>>> We have seen performance improvements when running Manhattan
>>>>>> GFXBench benchmarks.
>>>>>
>>>>> Ah ok, that makes sense, since in that case, the data flow is mostly
>>>>> to the device, not from the device.
>>>>>
>>>>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
>>>>>> Last level cache (LLC) lies beyond the point of coherency.
>>>>>> Non-cache coherent buffers will not be cached to system cache also, and
>>>>>> no additional software cache maintenance ops are required for system cache.
>>>>>> Pratik can add more if I am missing something.
>>>>>>
>>>>>> To take care of the memory attributes from DMA APIs side, we can add a
>>>>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>>>>>
>>>>> So does the device use the correct inner non-cacheable, outer
>>>>> writeback cacheable attributes if the SMMU is in pass-through?
>>>>>
>>>>> We have been looking into another use case where the fact that the
>>>>> SMMU overrides memory attributes is causing issues (WC mappings used
>>>>> by the radeon and amdgpu driver). So if the SMMU would honour the
>>>>> existing attributes, would you still need the SMMU changes?
>>>>
>>>> Even if we could force a stage 2 mapping with the weakest pagetable
>>>> attributes (such that combining would work), there would still need to
>>>> be a way to set the TCR attributes appropriately if this behaviour is
>>>> wanted for the SMMU's own table walks as well.
>>>
>>> Isn't that just a matter of implementing support for SMMUs that lack
>>> the 'dma-coherent' attribute?
>>
>> Not quite - in general they need INC-ONC attributes in case there
>> actually is something in the architectural outer-cacheable domain.
>
> But is it a problem to use INC-ONC attributes for the SMMU PTW on this
> chip? AIUI, the reason for the SMMU changes is to avoid the
> performance hit of snooping, which is more expensive than cache
> maintenance of SMMU page tables. So are you saying the by-VA cache
> maintenance is not relayed to this system cache, resulting in page
> table updates to be invisible to masters using INC-ONC attributes?

I only have a relatively vague impression of how this Qcom interconnect
actually behaves, but AIUI the outer attribute has no correctness impact
(it's effectively mismatched between CPU and devices already), only some
degree of latency improvement which is effectively the opposite of
no-snoop, in allowing certain non-coherent device traffic to still
allocate in the LLC. I'm assuming that if that latency matters for the
device accesses themselves, it might also matter for the associated
table walks depending on the TLB miss rate.

Robin.


[PATCH] ARM: dma-mapping: Clear DMA ops on teardown

2019-01-21 Thread Robin Murphy
Installing the appropriate non-IOMMU DMA ops in arm_iommu_detach_device()
serves the case where IOMMU-aware drivers choose to control their own
mapping but still make DMA API calls, however it also affects the case
when the arch code itself tears down the mapping upon driver unbinding,
where the ops now get left in place and can inhibit arch_setup_dma_ops()
on subsequent re-probe attempts.

Fix the latter case by making sure that arch_teardown_dma_ops() cleans
up whenever the ops were automatically installed by its counterpart.

Reported-by: Tobias Jakobi 
Reported-by: Marek Szyprowski 
Fixes: 1874619a7df4 "ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()"
Tested-by: Tobias Jakobi 
Signed-off-by: Robin Murphy 
---

Sorry for the delay - there was a giant email infrastructure cock-up just
at the point I wanted to go back through my archive and double-check the
discussion around the original commit...

Robin.

 arch/arm/mm/dma-mapping.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index f1e2922e447c..1e3e08a1c456 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -2390,4 +2390,6 @@ void arch_teardown_dma_ops(struct device *dev)
return;
 
arm_teardown_iommu_dma_ops(dev);
+   /* Let arch_setup_dma_ops() start again from scratch upon re-probe */
+   set_dma_ops(dev, NULL);
 }
-- 
2.20.1.dirty



Re: use generic DMA mapping code in powerpc V4

2019-01-21 Thread Christian Zigotzky

Hello Christoph,

Thanks for your reply. I successfully compiled a kernel (uImage) for the 
X5000 from your Git 'powerpc-dma.6-debug' (both patches) today.


It detects the SATA hard disk drive and boots without any problems. I 
will test the first patch on its own in the next few days.


Thanks for your help,

Christian


On 19 January 2019 at 3:04PM, Christoph Hellwig wrote:
> On Sat, Jan 19, 2019 at 02:02:22PM +0100, Christoph Hellwig wrote:
>> Interesting.  This suggests it is related to the use of ZONE_DMA by
>> the FSL SoCs that your board uses.  Let me investigate this a bit more.
>
> As a hack to check that theory I've pushed a new commit to the
> powerpc-dma.6-debug branch to use the old powerpc GFP_DMA selection
> with the new dma direct code:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/5c532d07c2f3c3972104de505d06b8d85f403f06
>
> And another one that drops the addressability checks that powerpc
> never had:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/18e7629b38465ca98f8e7eed639123a13ac3b669
>
> Can you first test with both patches, and then just with the first
> in case that worked?






Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Ard Biesheuvel
On Mon, 21 Jan 2019 at 14:56, Robin Murphy  wrote:
>
> On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 14:25, Robin Murphy  wrote:
> >>
> >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam  wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel  wrote:
> >>>>>
> >>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  wrote:
> >>>>>>
> >>>>>> Qualcomm SoCs have an additional level of cache called as
> >>>>>> System cache, aka. Last level cache (LLC). This cache sits right
> >>>>>> before the DDR, and is tightly coupled with the memory controller.
> >>>>>> The clients using this cache request their slices from this
> >>>>>> system cache, make it active, and can then start using it.
> >>>>>> For these clients with smmu, to start using the system cache for
> >>>>>> buffers and, related page tables [1], memory attributes need to be
> >>>>>> set accordingly. This series adds the required support.
> >>>>>
> >>>>> Does this actually improve performance on reads from a device? The
> >>>>> non-cache coherent DMA routines perform an unconditional D-cache
> >>>>> invalidate by VA to the PoC before reading from the buffers filled by
> >>>>> the device, and I would expect the PoC to be defined as lying beyond
> >>>>> the LLC to still guarantee the architected behavior.
> >>>>
> >>>> We have seen performance improvements when running Manhattan
> >>>> GFXBench benchmarks.
> >>>
> >>> Ah ok, that makes sense, since in that case, the data flow is mostly
> >>> to the device, not from the device.
> >>>
> >>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
> >>>> Last level cache (LLC) lies beyond the point of coherency.
> >>>> Non-cache coherent buffers will not be cached to system cache also, and
> >>>> no additional software cache maintenance ops are required for system cache.
> >>>> Pratik can add more if I am missing something.
> >>>>
> >>>> To take care of the memory attributes from DMA APIs side, we can add a
> >>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
> >>>>
> >>> So does the device use the correct inner non-cacheable, outer
> >>> writeback cacheable attributes if the SMMU is in pass-through?
> >>>
> >>> We have been looking into another use case where the fact that the
> >>> SMMU overrides memory attributes is causing issues (WC mappings used
> >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> >>> existing attributes, would you still need the SMMU changes?
> >>
> >> Even if we could force a stage 2 mapping with the weakest pagetable
> >> attributes (such that combining would work), there would still need to
> >> be a way to set the TCR attributes appropriately if this behaviour is
> >> wanted for the SMMU's own table walks as well.
> >>
> >
> > Isn't that just a matter of implementing support for SMMUs that lack
> > the 'dma-coherent' attribute?
>
> Not quite - in general they need INC-ONC attributes in case there
> actually is something in the architectural outer-cacheable domain.

But is it a problem to use INC-ONC attributes for the SMMU PTW on this
chip? AIUI, the reason for the SMMU changes is to avoid the
performance hit of snooping, which is more expensive than cache
maintenance of SMMU page tables. So are you saying the by-VA cache
maintenance is not relayed to this system cache, resulting in page
table updates to be invisible to masters using INC-ONC attributes?

> The
> case of the outer cacheablility being not that but a hint to control
> non-CPU traffic through some not-quite-transparent cache behind the PoC
> definitely stays wrapped up in qcom-specific magic ;)
>

I'm not surprised ...


Re: [PATCH] dmaengine: fsl-edma: dma map slave device address

2019-01-21 Thread Laurentiu Tudor
Hi Angelo,

On 18.01.2019 23:50, Angelo Dureghello wrote:
> Hi Laurentiu,
> 
> On Fri, Jan 18, 2019 at 12:06:23PM +0200, Laurentiu Tudor wrote:
>> This mapping needs to be created in order for slave dma transfers
>> to work on systems with SMMU. The implementation mostly mimics the
>> one in pl330 dma driver, authored by Robin Murphy.
>>
>> Signed-off-by: Laurentiu Tudor 
>> Suggested-by: Robin Murphy 
>> ---
> >> Original approach was to add the missing mappings in the i2c client driver,
> >> see here for discussion: https://patchwork.ozlabs.org/patch/1026013/
>>
> >>   drivers/dma/fsl-edma-common.c | 66 ++++++++++++++++++++++++++++++++----
> >>   drivers/dma/fsl-edma-common.h |  4 +++
> >>   drivers/dma/fsl-edma.c        |  1 +
> >>   drivers/dma/mcf-edma.c        |  1 +
> >>   4 files changed, 68 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
>> index 8876c4c1bb2c..0e95ee24b6d4 100644
>> --- a/drivers/dma/fsl-edma-common.c
>> +++ b/drivers/dma/fsl-edma-common.c
>> @@ -6,6 +6,7 @@
>>   #include 
>>   #include 
>>   #include 
> >> +#include <linux/dma-mapping.h>
>>   
>>   #include "fsl-edma-common.h"
>>   
>> @@ -173,12 +174,62 @@ int fsl_edma_resume(struct dma_chan *chan)
>>   }
>>   EXPORT_SYMBOL_GPL(fsl_edma_resume);
>>   
>> +static void fsl_edma_unprep_slave_dma(struct fsl_edma_chan *fsl_chan)
>> +{
>> +if (fsl_chan->dma_dir != DMA_NONE)
>> +dma_unmap_resource(fsl_chan->vchan.chan.device->dev,
>> +   fsl_chan->dma_dev_addr,
>> +   fsl_chan->dma_dev_size,
>> +   fsl_chan->dma_dir, 0);
>> +fsl_chan->dma_dir = DMA_NONE;
>> +}
>> +
>> +static bool fsl_edma_prep_slave_dma(struct fsl_edma_chan *fsl_chan,
>> +enum dma_transfer_direction dir)
>> +{
>> +struct device *dev = fsl_chan->vchan.chan.device->dev;
>> +enum dma_data_direction dma_dir;
>> +phys_addr_t addr = 0;
>> +u32 size = 0;
>> +
>> +switch (dir) {
>> +case DMA_MEM_TO_DEV:
>> +dma_dir = DMA_FROM_DEVICE;
>> +addr = fsl_chan->cfg.dst_addr;
>> +size = fsl_chan->cfg.dst_maxburst;
>> +break;
>> +case DMA_DEV_TO_MEM:
>> +dma_dir = DMA_TO_DEVICE;
>> +addr = fsl_chan->cfg.src_addr;
>> +size = fsl_chan->cfg.src_maxburst;
>> +break;
>> +default:
>> +dma_dir = DMA_NONE;
>> +break;
>> +}
>> +
>> +/* Already mapped for this config? */
>> +if (fsl_chan->dma_dir == dma_dir)
>> +return true;
>> +
>> +fsl_edma_unprep_slave_dma(fsl_chan);
>> +
>> +fsl_chan->dma_dev_addr = dma_map_resource(dev, addr, size, dma_dir, 0);
>> +if (dma_mapping_error(dev, fsl_chan->dma_dev_addr))
>> +return false;
>> +fsl_chan->dma_dev_size = size;
>> +fsl_chan->dma_dir = dma_dir;
>> +
>> +return true;
>> +}
>> +
>>   int fsl_edma_slave_config(struct dma_chan *chan,
>>   struct dma_slave_config *cfg)
>>   {
>>  struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
>>   
> >>   memcpy(&fsl_chan->cfg, cfg, sizeof(*cfg));
>> +fsl_edma_unprep_slave_dma(fsl_chan);
>>   
>>  return 0;
>>   }
>> @@ -378,6 +429,9 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
>>  if (!is_slave_direction(direction))
>>  return NULL;
>>   
>> +if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
>> +return NULL;
>> +
>>  sg_len = buf_len / period_len;
>>  fsl_desc = fsl_edma_alloc_desc(fsl_chan, sg_len);
>>  if (!fsl_desc)
>> @@ -409,11 +463,11 @@ struct dma_async_tx_descriptor 
>> *fsl_edma_prep_dma_cyclic(
>>   
>>  if (direction == DMA_MEM_TO_DEV) {
>>  src_addr = dma_buf_next;
>> -dst_addr = fsl_chan->cfg.dst_addr;
>> +dst_addr = fsl_chan->dma_dev_addr;
>>  soff = fsl_chan->cfg.dst_addr_width;
>>  doff = 0;
>>  } else {
>> -src_addr = fsl_chan->cfg.src_addr;
>> +src_addr = fsl_chan->dma_dev_addr;
>>  dst_addr = dma_buf_next;
>>  soff = 0;
>>  doff = fsl_chan->cfg.src_addr_width;
>> @@ -444,6 +498,9 @@ struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
>>  if (!is_slave_direction(direction))
>>  return NULL;
>>   
>> +if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
>> +return NULL;
>> +
>>  fsl_desc = fsl_edma_alloc_desc(fsl_chan, sg_len);
>>  if (!fsl_desc)
>>  return NULL;
>> @@ -468,11 +525,11 @@ 

Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Robin Murphy

On 21/01/2019 13:36, Ard Biesheuvel wrote:
> On Mon, 21 Jan 2019 at 14:25, Robin Murphy  wrote:
>> On 21/01/2019 10:50, Ard Biesheuvel wrote:
>>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam  wrote:
>>>> Hi,
>>>>
>>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel  wrote:
>>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  wrote:
>>>>>> Qualcomm SoCs have an additional level of cache called as
>>>>>> System cache, aka. Last level cache (LLC). This cache sits right
>>>>>> before the DDR, and is tightly coupled with the memory controller.
>>>>>> The clients using this cache request their slices from this
>>>>>> system cache, make it active, and can then start using it.
>>>>>> For these clients with smmu, to start using the system cache for
>>>>>> buffers and, related page tables [1], memory attributes need to be
>>>>>> set accordingly. This series adds the required support.
>>>>>
>>>>> Does this actually improve performance on reads from a device? The
>>>>> non-cache coherent DMA routines perform an unconditional D-cache
>>>>> invalidate by VA to the PoC before reading from the buffers filled by
>>>>> the device, and I would expect the PoC to be defined as lying beyond
>>>>> the LLC to still guarantee the architected behavior.
>>>>
>>>> We have seen performance improvements when running Manhattan
>>>> GFXBench benchmarks.
>>>
>>> Ah ok, that makes sense, since in that case, the data flow is mostly
>>> to the device, not from the device.
>>>
>>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
>>>> Last level cache (LLC) lies beyond the point of coherency.
>>>> Non-cache coherent buffers will not be cached to system cache also, and
>>>> no additional software cache maintenance ops are required for system cache.
>>>> Pratik can add more if I am missing something.
>>>>
>>>> To take care of the memory attributes from DMA APIs side, we can add a
>>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>>>
>>> So does the device use the correct inner non-cacheable, outer
>>> writeback cacheable attributes if the SMMU is in pass-through?
>>>
>>> We have been looking into another use case where the fact that the
>>> SMMU overrides memory attributes is causing issues (WC mappings used
>>> by the radeon and amdgpu driver). So if the SMMU would honour the
>>> existing attributes, would you still need the SMMU changes?
>>
>> Even if we could force a stage 2 mapping with the weakest pagetable
>> attributes (such that combining would work), there would still need to
>> be a way to set the TCR attributes appropriately if this behaviour is
>> wanted for the SMMU's own table walks as well.
>
> Isn't that just a matter of implementing support for SMMUs that lack
> the 'dma-coherent' attribute?

Not quite - in general they need INC-ONC attributes in case there
actually is something in the architectural outer-cacheable domain. The
case of the outer cacheability being not that but a hint to control
non-CPU traffic through some not-quite-transparent cache behind the PoC
definitely stays wrapped up in qcom-specific magic ;)

Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] iommu/arm-smmu: Move to bitmap for arm_smmu_domain attributes

2019-01-21 Thread Robin Murphy

On 21/01/2019 05:53, Vivek Gautam wrote:

A number of arm_smmu_domain's attributes can be assigned based
on the iommu domain's attributes. These local attributes are
better managed by a bitmap.
So remove the boolean flags and move to a 32-bit bitmap, and enable
each bit separately.

Signed-off-by: Vivek Gautam 
---
  drivers/iommu/arm-smmu.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7ebbcf1b2eb3..52b300dfc096 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -257,10 +257,11 @@ struct arm_smmu_domain {
const struct iommu_gather_ops   *tlb_ops;
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage  stage;
-   boolnon_strict;
struct mutexinit_mutex; /* Protects smmu pointer */
	spinlock_t  cb_lock; /* Serialises ATS1* ops and TLB syncs */
struct iommu_domain domain;
+#define ARM_SMMU_DOMAIN_ATTR_NON_STRICTBIT(0)
+   unsigned intattr;
  };
  
  struct arm_smmu_option_prop {

@@ -901,7 +902,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
  
-	if (smmu_domain->non_strict)

+   if (smmu_domain->attr & ARM_SMMU_DOMAIN_ATTR_NON_STRICT)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
  
  	/* Non coherent page table mappings only for Stage-1 */

@@ -1598,7 +1599,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   *(int *)data = smmu_domain->non_strict;
+   *(int *)data = !!(smmu_domain->attr &
+ ARM_SMMU_DOMAIN_ATTR_NON_STRICT);
return 0;
default:
return -ENODEV;
@@ -1638,7 +1640,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   smmu_domain->non_strict = *(int *)data;
+   smmu_domain->attr |= ARM_SMMU_DOMAIN_ATTR_NON_STRICT;


But what if *data == 0?

Robin.


break;
default:
ret = -ENODEV;




Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Ard Biesheuvel
On Mon, 21 Jan 2019 at 14:25, Robin Murphy  wrote:
>
> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 11:17, Vivek Gautam  
> > wrote:
> >>
> >> Hi,
> >>
> >>
> >> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
> >>  wrote:
> >>>
> >>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  
> >>> wrote:
> 
>  Qualcomm SoCs have an additional level of cache called the
>  System cache, aka. Last level cache (LLC). This cache sits right
>  before the DDR, and is tightly coupled with the memory controller.
>  The clients using this cache request their slices from this
>  system cache, make it active, and can then start using it.
>  For these clients with an SMMU, to start using the system cache for
>  buffers and related page tables [1], memory attributes need to be
>  set accordingly. This series adds the required support.
> 
> >>>
> >>> Does this actually improve performance on reads from a device? The
> >>> non-cache coherent DMA routines perform an unconditional D-cache
> >>> invalidate by VA to the PoC before reading from the buffers filled by
> >>> the device, and I would expect the PoC to be defined as lying beyond
> >>> the LLC to still guarantee the architected behavior.
> >>
> >> We have seen performance improvements when running Manhattan
> >> GFXBench benchmarks.
> >>
> >
> > Ah ok, that makes sense, since in that case, the data flow is mostly
> > to the device, not from the device.
> >
> >> As for the PoC, from my knowledge, on sdm845 the system cache, aka
> >> Last level cache (LLC), lies beyond the point of coherency.
> >> Non-cache-coherent buffers will also not be cached in the system cache, and
> >> no additional software cache maintenance ops are required for the system cache.
> >> Pratik can add more if I am missing something.
> >>
> >> To take care of the memory attributes from DMA APIs side, we can add a
> >> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
> >>
> >
> > So does the device use the correct inner non-cacheable, outer
> > writeback cacheable attributes if the SMMU is in pass-through?
> >
> > We have been looking into another use case where the fact that the
> > SMMU overrides memory attributes is causing issues (WC mappings used
> > by the radeon and amdgpu driver). So if the SMMU would honour the
> > existing attributes, would you still need the SMMU changes?
>
> Even if we could force a stage 2 mapping with the weakest pagetable
> attributes (such that combining would work), there would still need to
> be a way to set the TCR attributes appropriately if this behaviour is
> wanted for the SMMU's own table walks as well.
>

Isn't that just a matter of implementing support for SMMUs that lack
the 'dma-coherent' attribute?


Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Robin Murphy

On 21/01/2019 10:50, Ard Biesheuvel wrote:

On Mon, 21 Jan 2019 at 11:17, Vivek Gautam  wrote:


Hi,


On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
 wrote:


On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  wrote:


Qualcomm SoCs have an additional level of cache called the
System cache, aka. Last level cache (LLC). This cache sits right
before the DDR, and is tightly coupled with the memory controller.
The clients using this cache request their slices from this
system cache, make it active, and can then start using it.
For these clients with an SMMU, to start using the system cache for
buffers and related page tables [1], memory attributes need to be
set accordingly. This series adds the required support.



Does this actually improve performance on reads from a device? The
non-cache coherent DMA routines perform an unconditional D-cache
invalidate by VA to the PoC before reading from the buffers filled by
the device, and I would expect the PoC to be defined as lying beyond
the LLC to still guarantee the architected behavior.


We have seen performance improvements when running Manhattan
GFXBench benchmarks.



Ah ok, that makes sense, since in that case, the data flow is mostly
to the device, not from the device.


As for the PoC, from my knowledge, on sdm845 the system cache, aka
Last level cache (LLC), lies beyond the point of coherency.
Non-cache-coherent buffers will also not be cached in the system cache, and
no additional software cache maintenance ops are required for the system cache.
Pratik can add more if I am missing something.

To take care of the memory attributes from DMA APIs side, we can add a
DMA_ATTR definition to take care of any dma non-coherent APIs calls.



So does the device use the correct inner non-cacheable, outer
writeback cacheable attributes if the SMMU is in pass-through?

We have been looking into another use case where the fact that the
SMMU overrides memory attributes is causing issues (WC mappings used
by the radeon and amdgpu driver). So if the SMMU would honour the
existing attributes, would you still need the SMMU changes?


Even if we could force a stage 2 mapping with the weakest pagetable 
attributes (such that combining would work), there would still need to 
be a way to set the TCR attributes appropriately if this behaviour is 
wanted for the SMMU's own table walks as well.


Robin.


Re: [PATCH 1/2] iommu/io-pgtable-arm: Add support for non-coherent page tables

2019-01-21 Thread Robin Murphy

On 17/01/2019 09:27, Vivek Gautam wrote:

 From Robin's comment [1] about touching TCR configurations -

"TBH if we're going to touch the TCR attributes at all then we should
probably correct that sloppiness first - there's an occasional argument
for using non-cacheable pagetables even on a coherent SMMU if reducing
snoop traffic/latency on walks outweighs the cost of cache maintenance
on PTE updates, but anyone thinking they can get that by overriding
dma-coherent silently gets the worst of both worlds thanks to this
current TCR value."

We have the IO_PGTABLE_QUIRK_NO_DMA quirk present, but we don't force
anybody _not_ using a dma-coherent smmu to have non-cacheable page table
mappings.
Having another quirk flag can help us get non-cacheable memory for
page tables once and for all.

[1] https://lore.kernel.org/patchwork/patch/1020906/

Signed-off-by: Vivek Gautam 
---
  drivers/iommu/io-pgtable-arm.c | 17 -
  drivers/iommu/io-pgtable.h |  6 ++
  2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 237cacd4a62b..c76919c30f1a 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -780,7 +780,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
struct arm_lpae_io_pgtable *data;
  
  	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |

-   IO_PGTABLE_QUIRK_NON_STRICT))
+   IO_PGTABLE_QUIRK_NON_STRICT |
+   IO_PGTABLE_QUIRK_NON_COHERENT))
return NULL;
  
  	data = arm_lpae_alloc_pgtable(cfg);

@@ -788,9 +789,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
return NULL;
  
  	/* TCR */

-   reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
- (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
- (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+   reg = ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT;
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT)
+   reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
+  ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT;
+   else
+   reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT |
+  ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
  
  	switch (ARM_LPAE_GRANULE(data)) {

case SZ_4K:
@@ -873,7 +879,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
  
  	/* The NS quirk doesn't apply at stage 2 */

if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
-   IO_PGTABLE_QUIRK_NON_STRICT))
+   IO_PGTABLE_QUIRK_NON_STRICT |
+   IO_PGTABLE_QUIRK_NON_COHERENT))
return NULL;
  
  	data = arm_lpae_alloc_pgtable(cfg);

diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 47d5ae559329..46604cf7b017 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -75,6 +75,11 @@ struct io_pgtable_cfg {
 * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
 *  on unmap, for DMA domains using the flush queue mechanism for
 *  delayed invalidation.
+*
+* IO_PGTABLE_QUIRK_NON_COHERENT: Enforce non-cacheable mappings for
+*  pagetables even on a coherent SMMU for cases where reducing
+*  snoop traffic/latency on walks outweighs the cost of cache
+*  maintenance on PTE updates.


Hmm, we can't actually "enforce" anything with this as-is - all we're 
doing is setting the attributes that the IOMMU will use for pagetable 
walks, and that has no impact on how the CPU actually writes PTEs to 
memory. In particular, in the case of a hardware-coherent IOMMU which is 
described as such, even if we make the dma_map/sync calls they still 
won't do anything since they 'know' that the IOMMU is coherent. Thus if 
we then set up a non-cacheable TCR we would have no proper means of 
making pagetables correctly visible to the walker.


Aw crap, this is turning out to be a microcosm of the PCIe no-snoop 
mess... :(


To start with, at least, what we want is to set a non-cacheable TCR if 
the IOMMU is *not* coherent (as far as Linux is concerned - that 
includes the firmware-lying-about-the-hardware situation I was alluding 
to before), but even that isn't necessarily as straightforward as it 
seems. AFAICS, if QUIRK_NO_DMA is set then we definitely have to use a 
cacheable TCR; we can't strictly rely on the inverse being true, but in 
practice we *might* get away with it since we already disallow most 
cases in which the DMA API calls would actually do anything for a 
known-coherent IOMMU device.


Robin.


 */
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS 

Re: [PATCH v1] iommu/s390: Declare s390 iommu reserved regions

2019-01-21 Thread Robin Murphy

On 18/01/2019 13:29, Pierre Morel wrote:

On 17/01/2019 14:02, Robin Murphy wrote:

On 15/01/2019 17:37, Pierre Morel wrote:

The s390 iommu can only allow DMA transactions between the addresses
given by the zPCI device entries start_dma and end_dma.

Let's declare the regions before start_dma and after end_dma as
reserved regions using the appropriate callback in iommu_ops.

The reserved regions may later be retrieved from sysfs or from
the vfio iommu internal interface.


For this particular case, I think the best solution is to give VFIO 
the ability to directly interrogate the domain geometry (which s390 
appears to set correctly already). The idea of reserved regions was 
really for 'unexpected' holes inside the usable address space - using 
them to also describe places that are entirely outside that address 
space rather confuses things IMO.


Furthermore, even if we *did* end up going down the route of actively 
reserving addresses beyond the usable aperture, it doesn't seem 
sensible for individual drivers to do it themselves when the core API 
already describes the relevant information generically.


Robin.


Robin,

I already posted a patch retrieving the geometry through 
VFIO_IOMMU_GET_INFO using a specific capability for the geometry [1], 
and AFAIU, Alex did not agree with this.


What is different in what you propose?


I didn't mean to imply that aperture and reserved regions are mutually 
exclusive, just that they are conceptually distinct things, i.e. there 
is a fundamental difference between "address which could in theory be 
mapped but wouldn't work as expected" and "address which is physically 
impossible to map at all".


Admittedly I hadn't closely followed all of the previous discussions, 
and Alex has a fair point - for VFIO users who will mostly care about 
checking whether two address maps are compatible, it probably is more 
useful to just describe a single list of usable regions, rather than the 
absolute bounds plus a list of unusable holes within them. That still 
doesn't give us any need to conflate things throughout the kernel 
internals, though - the typical usage there is to size an IOVA allocator 
or page table based on the aperture, then carve out any necessary 
reservations. In that context, having to be aware of and handle 
'impossible' reservations outside the aperture just invites bugs and 
adds complexity that would be better avoided.


Robin.



@Alex: I was hoping that this patch goes in your direction. What do you 
think?


Thanks,
Pierre

[1]: https://lore.kernel.org/patchwork/patch/1030369/





This seems to me related to the work Shameer has started on
vfio_iommu_type1, so I add Alex and Shameer to the CC list.


Pierre Morel (1):
   iommu/s390: Declare s390 iommu reserved regions

  drivers/iommu/s390-iommu.c | 29 +
  1 file changed, 29 insertions(+)








Re: [PATCH v1] iommu/s390: Declare s390 iommu reserved regions

2019-01-21 Thread Pierre Morel

On 18/01/2019 14:51, Jean-Philippe Brucker wrote:

Hi Pierre,

On 18/01/2019 13:29, Pierre Morel wrote:

On 17/01/2019 14:02, Robin Murphy wrote:

On 15/01/2019 17:37, Pierre Morel wrote:

The s390 iommu can only allow DMA transactions between the zPCI device
entries start_dma and end_dma.



...



I already posted a patch retrieving the geometry through
VFIO_IOMMU_GET_INFO using a specific capability for the geometry [1],
and AFAIU, Alex did not agree with this.


On arm we also need to report the IOMMU geometry to userspace (max IOVA
size in particular). Shameer has been working on a solution [2] that
presents a unified view of both geometry and reserved regions into the
VFIO_IOMMU_GET_INFO call, and I think we should go with that. If I
understand correctly it's currently blocked on the RMRR problem and
we're waiting for Jacob or Ashok to take a look at it, as Kevin pinged
them on thread [1]?

[2] https://lkml.org/lkml/2018/4/18/293

Thanks,
Jean



Hi Jean,

I hoped that this proposition went in the same direction, based on the
following assumptions:



- The goal of get_resv_regions is defined in iommu.h as:
  * @get_resv_regions: Request list of reserved regions for a device

- An iommu reserved region is a region which should not be mapped.
Isn't that exactly what happens outside the aperture?
Shouldn't it be reported as an iommu reserved region?

- If we use VFIO and want all reserved regions, we will have the
VFIO_IOMMU_GET_INFO call provided by Shameer, and it can get all reserved
regions that depend on the iommu driver itself at once, by calling the
get_resv_regions callback, instead of having to merge them with the
aperture.


- If there are other reserved regions, depending on the system
configuration and not on the IOMMU itself, the VFIO_IOMMU_GET_INFO call
will have to merge them with the regions obtained from the iommu driver.


- If we do not use QEMU nor VFIO at all, AFAIU, the standard way to
retrieve the reserved regions associated with a device is to call the
get_resv_regions callback from the associated iommu.


Please tell me where I am wrong.

Regards,
Pierre




What is different in what you propose?

@Alex: I was hoping that this patch goes in your direction. What do you
think?

Thanks,
Pierre

[1]: https://lore.kernel.org/patchwork/patch/1030369/





This seems to me related to the work Shameer has started on
vfio_iommu_type1, so I add Alex and Shameer to the CC list.


Pierre Morel (1):
    iommu/s390: Declare s390 iommu reserved regions

   drivers/iommu/s390-iommu.c | 29 +
   1 file changed, 29 insertions(+)











--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


Re: [PATCH v7 0/7] Add virtio-iommu driver

2019-01-21 Thread Jean-Philippe Brucker
Hi,

On 18/01/2019 15:51, Michael S. Tsirkin wrote:
> 
> On Tue, Jan 15, 2019 at 12:19:52PM +, Jean-Philippe Brucker wrote:
>> Implement the virtio-iommu driver, following specification v0.9 [1].
>>
>> This is a simple rebase onto Linux v5.0-rc2. We now use the
>> dev_iommu_fwspec_get() helper introduced in v5.0 instead of accessing
>> dev->iommu_fwspec, but there aren't any functional change from v6 [2].
>>
>> Our current goal for virtio-iommu is to get a paravirtual IOMMU working
>> on Arm, and enable device assignment to guest userspace. In this
>> use-case the mappings are static, and don't require optimal performance,
>> so this series tries to keep things simple. However there is plenty more
>> to do for features and optimizations, and having this base in v5.1 would
>> be good. Given that most of the changes are to drivers/iommu, I believe
>> the driver and future changes should go via the IOMMU tree.
>>
>> You can find Linux driver and kvmtool device on v0.9.2 branches [3],
>> module and x86 support on virtio-iommu/devel. Also tested with Eric's
>> QEMU device [4]. Please note that the series depends on Robin's
>> probe-deferral fix [5], which will hopefully land in v5.0.
>>
>> [1] Virtio-iommu specification v0.9, sources and pdf
>> git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
>> http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
>>
>> [2] [PATCH v6 0/7] Add virtio-iommu driver
>> 
>> https://lists.linuxfoundation.org/pipermail/iommu/2018-December/032127.html
>>
>> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.2
>> git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9.2
>>
>> [4] [RFC v9 00/17] VIRTIO-IOMMU device
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html
>>
>> [5] [PATCH] iommu/of: Fix probe-deferral
>> https://www.spinics.net/lists/arm-kernel/msg698371.html
> 
> Thanks for the work!
> So really my only issue with this is that there's no
> way for the IOMMU to describe the devices that it
> covers.
> 
> As a result that is then done in a platform-specific way.
> 
> And this means that for example it does not solve the problem that e.g.
> some power people have in that their platform simply does not have a way
> to specify which devices are covered by the IOMMU.

Isn't power using device tree? I haven't looked much at power because I
was told a while ago that they already paravirtualize their IOMMU and
don't need virtio-iommu, except perhaps for some legacy platforms. Or
something along those lines. But I would certainly be interested in
enabling the IOMMU for more architectures.

As for the enumeration problem, I still don't think we can get much
better than DT and ACPI as solutions (and IMO they are necessary to make
this device portable). But I believe that getting DT and ACPI support is
just a one-off inconvenience. That is, once the required bindings are
accepted, any future extension can then be done at the virtio level with
feature bits and probe requests, without having to update ACPI or DT.

Thanks,
Jean

> Solving that problem would make me much more excited about
> this device.
> 
> On the other hand I can see that while there have been some
> developments most of the code has been stable for quite a while now.
> 
> So what I am trying to do right about now, is making a small module that
> loads early and pokes at the IOMMU sufficiently to get the data about
> which devices use the IOMMU out of it using standard virtio config
> space.  IIUC it's claimed to be impossible without messy changes to the
> boot sequence.
> 
> If I succeed at least on some platforms I'll ask that this design is
> worked into this device, minimizing info that goes through DT/ACPI.  If
> I see I can't make it in time to meet the next merge window, I plan
> merging the existing patches using DT (barring surprises).
> 
> As I only have a very small amount of time to spend on this attempt, If
> someone else wants to try doing that in parallel, that would be great!
> 
> 
>> Jean-Philippe Brucker (7):
>>   dt-bindings: virtio-mmio: Add IOMMU description
>>   dt-bindings: virtio: Add virtio-pci-iommu node
>>   of: Allow the iommu-map property to omit untranslated devices
>>   PCI: OF: Initialize dev->fwnode appropriately
>>   iommu: Add virtio-iommu driver
>>   iommu/virtio: Add probe request
>>   iommu/virtio: Add event queue
>>
>>  .../devicetree/bindings/virtio/iommu.txt  |   66 +
>>  .../devicetree/bindings/virtio/mmio.txt   |   30 +
>>  MAINTAINERS   |7 +
>>  drivers/iommu/Kconfig |   11 +
>>  drivers/iommu/Makefile|1 +
>>  drivers/iommu/virtio-iommu.c  | 1158 +
>>  drivers/of/base.c |   10 +-
>>  drivers/pci/of.c  |7 +
>>  include/uapi/linux/virtio_ids.h   |1 +
>>  include/uapi/linux/virtio_iommu.h |  161 +++
>>  

Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Ard Biesheuvel
On Mon, 21 Jan 2019 at 11:17, Vivek Gautam  wrote:
>
> Hi,
>
>
> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
>  wrote:
> >
> > On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  
> > wrote:
> > >
> > > Qualcomm SoCs have an additional level of cache called the
> > > System cache, aka. Last level cache (LLC). This cache sits right
> > > before the DDR, and is tightly coupled with the memory controller.
> > > The clients using this cache request their slices from this
> > > system cache, make it active, and can then start using it.
> > > For these clients with an SMMU, to start using the system cache for
> > > buffers and related page tables [1], memory attributes need to be
> > > set accordingly. This series adds the required support.
> > >
> >
> > Does this actually improve performance on reads from a device? The
> > non-cache coherent DMA routines perform an unconditional D-cache
> > invalidate by VA to the PoC before reading from the buffers filled by
> > the device, and I would expect the PoC to be defined as lying beyond
> > the LLC to still guarantee the architected behavior.
>
> We have seen performance improvements when running Manhattan
> GFXBench benchmarks.
>

Ah ok, that makes sense, since in that case, the data flow is mostly
to the device, not from the device.

> As for the PoC, from my knowledge, on sdm845 the system cache, aka
> Last level cache (LLC), lies beyond the point of coherency.
> Non-cache-coherent buffers will also not be cached in the system cache, and
> no additional software cache maintenance ops are required for the system cache.
> Pratik can add more if I am missing something.
>
> To take care of the memory attributes from DMA APIs side, we can add a
> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
>

So does the device use the correct inner non-cacheable, outer
writeback cacheable attributes if the SMMU is in pass-through?

We have been looking into another use case where the fact that the
SMMU overrides memory attributes is causing issues (WC mappings used
by the radeon and amdgpu driver). So if the SMMU would honour the
existing attributes, would you still need the SMMU changes?


Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Vivek Gautam
Hi,


On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
 wrote:
>
> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  
> wrote:
> >
> > Qualcomm SoCs have an additional level of cache called the
> > System cache, aka. Last level cache (LLC). This cache sits right
> > before the DDR, and is tightly coupled with the memory controller.
> > The clients using this cache request their slices from this
> > system cache, make it active, and can then start using it.
> > For these clients with an SMMU, to start using the system cache for
> > buffers and related page tables [1], memory attributes need to be
> > set accordingly. This series adds the required support.
> >
>
> Does this actually improve performance on reads from a device? The
> non-cache coherent DMA routines perform an unconditional D-cache
> invalidate by VA to the PoC before reading from the buffers filled by
> the device, and I would expect the PoC to be defined as lying beyond
> the LLC to still guarantee the architected behavior.

We have seen performance improvements when running Manhattan
GFXBench benchmarks.

As for the PoC, from my knowledge, on sdm845 the system cache, aka
Last level cache (LLC), lies beyond the point of coherency.
Non-cache-coherent buffers will also not be cached in the system cache, and
no additional software cache maintenance ops are required for the system cache.
Pratik can add more if I am missing something.

To take care of the memory attributes from the DMA APIs' side, we can add a
DMA_ATTR definition covering any dma non-coherent API calls.

Regards
Vivek
>
>
>
> > This change is a realisation of the following changes from downstream msm-4.9:
> > iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
> > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]
> >
> > Changes since v2:
> >  - Split the patches into io-pgtable-arm driver and arm-smmu driver.
> >  - Converted smmu domain attributes to a bitmap, so multiple attributes
> >can be managed easily.
> >  - With addition of non-coherent page table mapping support [4], this
> >patch series now aligns with the understanding of upgrading the
> >non-coherent devices to use some level of outer cache.
> >  - Updated the macros and comments to reflect the use of QCOM_SYS_CACHE.
> >  - QCOM_SYS_CACHE can still be used at stage 2, so that doesn't depend on
> >stage-1 mapping.
> >  - Added change to disable the attribute from arm_smmu_domain_set_attr()
> >when needed.
> >  - Removed the page protection controls for QCOM_SYS_CACHE at the DMA API
> >level.
> >
> > Goes on top of the non-coherent page tables support patch series [4]
> >
> > [1] https://patchwork.kernel.org/patch/10302791/
> > [2] 
> > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9=bf762276796e79ca90014992f4d9da5593fa7d51
> > [3] 
> > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> > [4] https://lore.kernel.org/patchwork/cover/1032938/
> >
> > Vivek Gautam (3):
> >   iommu/arm-smmu: Move to bitmap for arm_smmu_domain attributes
> >   iommu/io-pgtable-arm: Add support to use system cache
> >   iommu/arm-smmu: Add support to use system cache
> >
> >  drivers/iommu/arm-smmu.c   | 28 
> >  drivers/iommu/io-pgtable-arm.c | 15 +--
> >  drivers/iommu/io-pgtable.h |  4 
> >  include/linux/iommu.h  |  2 ++
> >  4 files changed, 43 insertions(+), 6 deletions(-)
> >
> > --
> > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> > of Code Aurora Forum, hosted by The Linux Foundation
> >
> >
> > ___
> > linux-arm-kernel mailing list
> > linux-arm-ker...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[git pull] IOMMU Fixes for Linux v5.0-rc1

2019-01-21 Thread Joerg Roedel
Hi Linus,

The following changes since commit bfeffd155283772bbe78c6a05dec7c0128ee500c:

  Linux 5.0-rc1 (2019-01-06 17:08:20 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
tags/iommu-fixes-v5.0-rc3

for you to fetch changes up to e8e683ae9a736407a20135df7809090a446db707:

  iommu/of: Fix probe-deferral (2019-01-11 12:28:24 +0100)


IOMMU Fix for Linux v5.0-rc3

One fix only for now:

- Fix probe deferral in iommu/of code (broke with recent changes
  to iommu_ops->add_device invocation)


Robin Murphy (1):
  iommu/of: Fix probe-deferral

 drivers/iommu/of_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Please pull.

Thanks,

Joerg