Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-05-29 Thread Jerry Snitselaar

On Tue Apr 14 20, Joerg Roedel wrote:

Hi,

here is the second version of this patch-set. The first version with
some more introductory text can be found here:

https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/

Changes v1->v2:

* Rebased to v5.7-rc1

* Re-wrote the arm-smmu changes as suggested by Robin Murphy

* Re-worked the Exynos patches to hopefully not break the
  driver anymore

* Fixed a missing mutex_unlock() reported by Marek Szyprowski,
  thanks for that.

There is also a git-branch available with these patches applied:


https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2

Please review.

Thanks,

Joerg

Joerg Roedel (32):
 iommu: Move default domain allocation to separate function
 iommu/amd: Implement iommu_ops->def_domain_type call-back
 iommu/vt-d: Wire up iommu_ops->def_domain_type
 iommu/amd: Remove dma_mask check from check_device()
 iommu/amd: Return -ENODEV in add_device when device is not handled by
   IOMMU
 iommu: Add probe_device() and remove_device() call-backs
 iommu: Move default domain allocation to iommu_probe_device()
 iommu: Keep a list of allocated groups in __iommu_probe_device()
 iommu: Move new probe_device path to separate function
 iommu: Split off default domain allocation from group assignment
 iommu: Move iommu_group_create_direct_mappings() out of
   iommu_group_add_device()
 iommu: Export bus_iommu_probe() and make is safe for re-probing
 iommu/amd: Remove dev_data->passthrough
 iommu/amd: Convert to probe/release_device() call-backs
 iommu/vt-d: Convert to probe/release_device() call-backs
 iommu/arm-smmu: Convert to probe/release_device() call-backs
 iommu/pamu: Convert to probe/release_device() call-backs
 iommu/s390: Convert to probe/release_device() call-backs
 iommu/virtio: Convert to probe/release_device() call-backs
 iommu/msm: Convert to probe/release_device() call-backs
 iommu/mediatek: Convert to probe/release_device() call-backs
 iommu/mediatek-v1 Convert to probe/release_device() call-backs
 iommu/qcom: Convert to probe/release_device() call-backs
 iommu/rockchip: Convert to probe/release_device() call-backs
 iommu/tegra: Convert to probe/release_device() call-backs
 iommu/renesas: Convert to probe/release_device() call-backs
 iommu/omap: Remove orphan_dev tracking
 iommu/omap: Convert to probe/release_device() call-backs
 iommu/exynos: Use first SYSMMU in controllers list for IOMMU core
 iommu/exynos: Convert to probe/release_device() call-backs
 iommu: Remove add_device()/remove_device() code-paths
 iommu: Unexport iommu_group_get_for_dev()

Sai Praneeth Prakhya (1):
 iommu: Add def_domain_type() callback in iommu_ops

drivers/iommu/amd_iommu.c   |  97 
drivers/iommu/amd_iommu_types.h |   1 -
drivers/iommu/arm-smmu-v3.c |  38 +--
drivers/iommu/arm-smmu.c|  39 ++--
drivers/iommu/exynos-iommu.c|  24 +-
drivers/iommu/fsl_pamu_domain.c |  22 +-
drivers/iommu/intel-iommu.c |  68 +-
drivers/iommu/iommu.c   | 393 +---
drivers/iommu/ipmmu-vmsa.c  |  60 ++---
drivers/iommu/msm_iommu.c   |  34 +--
drivers/iommu/mtk_iommu.c   |  24 +-
drivers/iommu/mtk_iommu_v1.c|  50 ++--
drivers/iommu/omap-iommu.c  |  99 ++--
drivers/iommu/qcom_iommu.c  |  24 +-
drivers/iommu/rockchip-iommu.c  |  26 +--
drivers/iommu/s390-iommu.c  |  22 +-
drivers/iommu/tegra-gart.c  |  24 +-
drivers/iommu/tegra-smmu.c  |  31 +--
drivers/iommu/virtio-iommu.c|  41 +---
include/linux/iommu.h   |  21 +-
20 files changed, 533 insertions(+), 605 deletions(-)

--
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu



Hi Joerg,

With this patchset, I have an epyc system where if I boot with
iommu=nopt and force a dump I will see some io page faults for a nic
on the system. The vmcore is harvested and the system reboots. I
haven't reproduced it on other systems yet, but without the patchset I
don't see the io page faults during the kdump.

Regards,
Jerry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-05-29 Thread Logan Gunthorpe



On 2020-05-29 3:11 p.m., Marek Szyprowski wrote:
> Patches are pending:
> https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/

Cool, nice! Though, I still don't think that fixes the issue in
i915_scatterlist.h given it still ignores sg_dma_len() and strictly
relies on sg_next()/sg_is_last() to stop iterating -- and I suspect this
is the bug that got in Tom's way.

>> However, as Robin pointed out, there are other ugly tricks like stopping
>> iterating through the SGL when sg_dma_len() is zero. For example, the
>> AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does
>> this trick and thus likely isn't buggy (otherwise, I'd expect someone to
>> have complained by now seeing AMD has already switched to IOMMU-DMA.
> 
> I'm not sure that this is a trick. Stopping at zero sg_dma_len() was 
> somewhere documented.

Well whatever you want to call it, it is ugly to have some drivers doing
one thing with the returned value and others assuming there's an extra
zero at the end. It just causes confusion for people reading/copying the
code. It would be better if they are all consistent. However, I concede
stopping at zero should not be broken, presently.

Logan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-05-29 Thread Marek Szyprowski
Hi Logan,

On 29.05.2020 21:05, Logan Gunthorpe wrote:
> On 2020-05-29 6:45 a.m., Christoph Hellwig wrote:
>> On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote:
 This issue is most likely in the i915 driver and is most likely caused by 
 the driver not respecting the return value of the dma_map_ops::map_sg 
 function. You can see the driver ignoring the return value here:
 https://protect2.fireeye.com/url?k=ca25a34b-97f7b813-ca242804-0cc47a31c8b4-0ecdffc9f56851e1=1=https%3A%2F%2Fgithub.com%2Ftorvalds%2Flinux%2Fblob%2F7e0165b2f1a912a06e381e91f0f4e495f4ac3736%2Fdrivers%2Fgpu%2Fdrm%2Fi915%2Fgem%2Fi915_gem_dmabuf.c%23L51

 Previously this didn’t cause issues because the intel map_sg always 
 returned the same number of elements as the input scatter gather list but 
 with the change to this dma-iommu api this is no longer the case. I wasn’t 
 able to track the bug down to a specific line of code unfortunately.
>> Mark did a big audit into the map_sg API abuse and initially had
>> some i915 patches, but then gave up on them with this comment:
>>
>> "The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix
>>   it fully. The driver creatively uses sg_table->orig_nents to store the
>>   size of the allocate scatterlist and ignores the number of the entries
>>   returned by dma_map_sg function. In this patchset I only fixed the
>>   sg_table objects exported by dmabuf related functions. I hope that I
>>   didn't break anything there."
>>
>> it would be really nice if the i915 maintainers could help with sorting
>> that API abuse out.
>>
> I agree completely that the API abuse should be sorted out, but I think
> that's much larger than just the i915 driver. Pretty much every dma-buf
> map_dma_buf implementation I looked at ignores the returned nents of
> sg_attrs. This sucks, but I don't think it's the bug Tom ran into. See:
>
> amdgpu_dma_buf_map
> armada_gem_prime_map_dma_buf
> drm_gem_map_dma_buf
> i915_gem_map_dma_buf
> tegra_gem_prime_map_dma_buf
>
> So this should probably be addressed by the whole GPU community.

Patches are pending:
https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/

> However, as Robin pointed out, there are other ugly tricks like stopping
> iterating through the SGL when sg_dma_len() is zero. For example, the
> AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does
> this trick and thus likely isn't buggy (otherwise, I'd expect someone to
> have complained by now seeing AMD has already switched to IOMMU-DMA.

I'm not sure that this is a trick. Stopping at zero sg_dma_len() was 
somewhere documented.

> As I tried to point out in my previous email, i915 does not do this
> trick. In fact, it completely ignores sg_dma_len() and is thus
> completely broken. See i915_scatterlist.h and the __sgt_iter() function.
> So it doesn't sound to me like Mark's fix would address the issue at
> all. Per my previous email, I'd guess that it can be fixed simply by
> adjusting the __sgt_iter() function to do something more sensible.
> (Better yet, if possible, ditch __sgt_iter() and use the common DRM
> function that AMD uses).
>
> This would at least allow us to make progress with Tom's IOMMU-DMA patch
> set and once that gets in, it will be harder for other drivers to make
> the same mistake.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [git pull] IOMMU Fixes for Linux v5.7-rc7

2020-05-29 Thread pr-tracker-bot
The pull request you sent on Fri, 29 May 2020 20:58:41 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
> tags/iommu-fixes-v5.7-rc7

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b58f2140ea8605ee6ea0530d9c0cb5d049f9c7ca

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/amd: Fix event counter availability check

2020-05-29 Thread Alexander Monakov
The driver performs an extra check if the IOMMU's capabilities advertise
presence of performance counters: it verifies that counters are writable
by writing a hard-coded value to a counter and testing that reading that
counter gives back the same value.

Unfortunately it does so quite early, even before pci_enable_device is
called for the IOMMU, i.e. when accessing its MMIO space is not
guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test:
the driver assumes the counters are not writable, and disables the
functionality.

Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves
the issue. This is the earliest point in amd_iommu_init_pci where the
call succeeds on my laptop.

Signed-off-by: Alexander Monakov 
Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Cc: iommu@lists.linux-foundation.org
---

PS. I'm seeing another hiccup with IOMMU probing on my system:
pci :00:00.2: can't derive routing for PCI INT A
pci :00:00.2: PCI INT A: not connected

Hopefully I can figure it out, but I'd appreciate hints.

 drivers/iommu/amd_iommu_init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5b81fd16f5fa..1b7ec6b6a282 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
amd_iommu_np_cache = true;
 
-   init_iommu_perf_ctr(iommu);
-
if (is_rd890_iommu(iommu->dev)) {
int i, j;
 
@@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void)
 
init_device_table_dma();
 
-   for_each_iommu(iommu)
+   for_each_iommu(iommu) {
iommu_flush_all_caches(iommu);
+   init_iommu_perf_ctr(iommu);
+   }
 
if (!ret)
print_iommu_info();

base-commit: 75caf310d16cc5e2f851c048cd597f5437013368
-- 
2.26.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-05-29 Thread Logan Gunthorpe


On 2020-05-29 6:45 a.m., Christoph Hellwig wrote:
> On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote:
>>> This issue is most likely in the i915 driver and is most likely caused by 
>>> the driver not respecting the return value of the dma_map_ops::map_sg 
>>> function. You can see the driver ignoring the return value here:
>>> https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51
>>>
>>> Previously this didn’t cause issues because the intel map_sg always 
>>> returned the same number of elements as the input scatter gather list but 
>>> with the change to this dma-iommu api this is no longer the case. I wasn’t 
>>> able to track the bug down to a specific line of code unfortunately.  
> 
> Mark did a big audit into the map_sg API abuse and initially had
> some i915 patches, but then gave up on them with this comment:
> 
> "The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix
>  it fully. The driver creatively uses sg_table->orig_nents to store the
>  size of the allocate scatterlist and ignores the number of the entries
>  returned by dma_map_sg function. In this patchset I only fixed the
>  sg_table objects exported by dmabuf related functions. I hope that I
>  didn't break anything there."
> 
> it would be really nice if the i915 maintainers could help with sorting
> that API abuse out.
> 

I agree completely that the API abuse should be sorted out, but I think
that's much larger than just the i915 driver. Pretty much every dma-buf
map_dma_buf implementation I looked at ignores the returned nents of
sg_attrs. This sucks, but I don't think it's the bug Tom ran into. See:

amdgpu_dma_buf_map
armada_gem_prime_map_dma_buf
drm_gem_map_dma_buf
i915_gem_map_dma_buf
tegra_gem_prime_map_dma_buf

So this should probably be addressed by the whole GPU community.

However, as Robin pointed out, there are other ugly tricks like stopping
iterating through the SGL when sg_dma_len() is zero. For example, the
AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does
this trick and thus likely isn't buggy (otherwise, I'd expect someone to
have complained by now seeing AMD has already switched to IOMMU-DMA.

As I tried to point out in my previous email, i915 does not do this
trick. In fact, it completely ignores sg_dma_len() and is thus
completely broken. See i915_scatterlist.h and the __sgt_iter() function.
So it doesn't sound to me like Mark's fix would address the issue at
all. Per my previous email, I'd guess that it can be fixed simply by
adjusting the __sgt_iter() function to do something more sensible.
(Better yet, if possible, ditch __sgt_iter() and use the common DRM
function that AMD uses).

This would at least allow us to make progress with Tom's IOMMU-DMA patch
set and once that gets in, it will be harder for other drivers to make
the same mistake.

Logan


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[git pull] IOMMU Fixes for Linux v5.7-rc7

2020-05-29 Thread Joerg Roedel
Hi Linus,

The following changes since commit 9cb1fd0efd195590b828b9b865421ad345a4a145:

  Linux 5.7-rc7 (2020-05-24 15:32:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
tags/iommu-fixes-v5.7-rc7

for you to fetch changes up to 7cc31613734c4870ae32f5265d576ef296621343:

  iommu: Fix reference count leak in iommu_group_alloc. (2020-05-29 15:27:50 
+0200)


IOMMU Fixes for Linux v5.7-rc7

Including:

- Two compile test fixes for issues introduced during the
  5.7-rc1 merge window.

- A fix for a reference count leak in an error path of
  iommu_group_alloc().


Krzysztof Kozlowski (2):
  ia64: Hide the archdata.iommu field behind generic IOMMU_API
  x86: Hide the archdata.iommu field behind generic IOMMU_API

Qiushi Wu (1):
  iommu: Fix reference count leak in iommu_group_alloc.

 arch/ia64/include/asm/device.h | 2 +-
 arch/x86/include/asm/device.h  | 2 +-
 drivers/iommu/iommu.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

Please pull.

Thanks,

Joerg


signature.asc
Description: Digital signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 00/14] PCI: brcmstb: enable PCIe for STB chips

2020-05-29 Thread Jim Quinlan via iommu
On Fri, May 29, 2020 at 1:49 PM Rob Herring  wrote:
>
> On Tue, May 26, 2020 at 03:12:39PM -0400, Jim Quinlan wrote:
> > v2:
> > Commit: "device core: Add ability to handle multiple dma offsets"
> >   o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
> >   o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
> >   o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
> >   o dev->dma_pfn_map => dev->dma_pfn_offset_map
> >   o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
> >   o In device.h: s/const void */const struct dma_pfn_offset_region */
> >   o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
> > guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
> >   o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
> > dev->dma_pfn_offset_map is copied as well.
> >   o Merged two of the DMA commits into one (Christoph).
> >
> > Commit "arm: dma-mapping: Invoke dma offset func if needed":
> >   o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET
> >
> > Other commits' changes:
> >   o Removed need for carrying of_id var in priv (Nicolas)
> >   o Commit message rewordings (Bjorn)
> >   o Commit log messages filled to 75 chars (Bjorn)
> >   o devm_reset_control_get_shared())
> > => devm_reset_control_get_optional_shared (Philipp)
> >   o Add call to reset_control_assert() in PCIe remove routines (Philipp)
> >
> > v1:
> > This patchset expands the usefulness of the Broadcom Settop Box PCIe
> > controller by building upon the PCIe driver used currently by the
> > Raspbery Pi.  Other forms of this patchset were submitted by me years
> > ago and not accepted; the major sticking point was the code required
> > for the DMA remapping needed for the PCIe driver to work [1].
> >
> > There have been many changes to the DMA and OF subsystems since that
> > time, making a cleaner and less intrusive patchset possible.  This
> > patchset implements a generalization of "dev->dma_pfn_offset", except
> > that instead of a single scalar offset it provides for multiple
> > offsets via a function which depends upon the "dma-ranges" property of
> > the PCIe host controller.  This is required for proper functionality
> > of the BrcmSTB PCIe controller and possibly some other devices.
>
> If you can enable the h/w support without the multiple offset support,
> then I'd split up this series. The latter part might take a bit more
> time.
>
> Rob
Unfortunately, the STB PCIe  controller depends on the multiple PFN
offset functionality.
Thanks,
Jim
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets

2020-05-29 Thread Jim Quinlan via iommu
On Fri, May 29, 2020 at 1:35 PM Rob Herring  wrote:
>
> On Wed, May 27, 2020 at 9:43 AM Jim Quinlan  
> wrote:
> >
> > Hi Nicolas,
> >
> > On Wed, May 27, 2020 at 11:00 AM Nicolas Saenz Julienne
> >  wrote:
> > >
> > > Hi Jim,
> > > one thing comes to mind, there is a small test suite in 
> > > drivers/of/unittest.c
> > > (specifically of_unittest_pci_dma_ranges()) you could extend it to 
> > > include your
> > > use cases.
> > Sure, will check out.
> > >
> > > On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote:
> > > > The new field in struct device 'dma_pfn_offset_map' is used to 
> > > > facilitate
> > > > the use of multiple pfn offsets between cpu addrs and dma addrs.  It is
> > > > similar to 'dma_pfn_offset' except that the offset chosen depends on the
> > > > cpu or dma address involved.
> > > >
> > > > Signed-off-by: Jim Quinlan 
> > > > ---
> > > >  drivers/of/address.c| 65 +++--
> > > >  drivers/usb/core/message.c  |  3 ++
> > > >  drivers/usb/core/usb.c  |  3 ++
> > > >  include/linux/device.h  | 10 +-
> > > >  include/linux/dma-direct.h  | 10 --
> > > >  include/linux/dma-mapping.h | 46 ++
> > > >  kernel/dma/Kconfig  | 13 
> > > >  7 files changed, 144 insertions(+), 6 deletions(-)
> > > >
> > >
> > > [...]
> > >
> > > > @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct
> > > > device_node *np, u64 *dma_addr,
> > > >   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > > >range.bus_addr, range.cpu_addr, range.size);
> > > >
> > > > + num_ranges++;
> > > >   if (dma_offset && range.cpu_addr - range.bus_addr != 
> > > > dma_offset)
> > > > {
> > > > - pr_warn("Can't handle multiple dma-ranges with 
> > > > different
> > > > offsets on node(%pOF)\n", node);
> > > > - /* Don't error out as we'd break some existing 
> > > > DTs */
> > > > - continue;
> > > > + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) {
> > > > + pr_warn("Can't handle multiple dma-ranges 
> > > > with
> > > > different offsets on node(%pOF)\n", node);
> > > > + pr_warn("Perhaps set 
> > > > DMA_PFN_OFFSET_MAP=y?\n");
> > > > + /*
> > > > +  * Don't error out as we'd break some 
> > > > existing
> > > > +  * DTs that are using configs w/o
> > > > +  * CONFIG_DMA_PFN_OFFSET_MAP set.
> > > > +  */
> > > > + continue;
> > >
> > > dev->bus_dma_limit is set in of_dma_configure(), this function's caller, 
> > > based
> > > on dma_start's value (set after this continue). So you'd be effectively 
> > > setting
> > > the dev->bus_dma_limit to whatever we get from the first dma-range.
> > I'm not seeing that at all.  On the  evaluation of each dma-range,
> > dma_start and dma_end are re-evaluated to be the lowest and highest
> > bus values of the  dma-ranges seen so far.  After all dma-ranges are
> > examined,  dev->bus_dma_limit being set to the highest.  In fact, the
> > current code -- ie before my commits -- already does this for multiple
> > dma-ranges as long as the cpu-bus offset is the same in the
> > dma-ranges.
> > >
> > > This can be troublesome depending on how the dma-ranges are setup, for 
> > > example
> > > if the first dma-range doesn't include the CMA area, in arm64 generally 
> > > set as
> > > high as possible in ZONE_DMA32, that would render it useless for
> > > dma/{direct/swiotlb}. Again depending on the bus_dma_limit value, if 
> > > smaller
> > > than ZONE_DMA you'd be unable to allocate any DMA memory.
> > >
> > > IMO, a solution to this calls for a revamp of dma-direct's dma_capable(): 
> > > match
> > > the target DMA memory area with each dma-range we have to see if it fits.
> > >
> > > > + }
> > > > + dma_multi_pfn_offset = true;
> > > >   }
> > > >   dma_offset = range.cpu_addr - range.bus_addr;
> > > >
> > > > @@ -991,6 +1043,13 @@ int of_dma_get_range(struct device *dev, struct
> > > > device_node *np, u64 *dma_addr,
> > > >   dma_end = range.bus_addr + range.size;
> > > >   }
> > > >
> > > > + if (dma_multi_pfn_offset) {
> > > > + dma_offset = 0;
> > > > + ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
> > > > + if (ret)
> > > > + return ret;
> > > > + }
> > > > +
> > > >   if (dma_start >= dma_end) {
> > > >   ret = -EINVAL;
> > > >   pr_debug("Invalid DMA ranges configuration on 
> > > > node(%pOF)\n",
> > > > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > > > index 6197938dcc2d..aaa3e58f5eb4 100644
> > > > --- 

Re: [PATCH v2 00/14] PCI: brcmstb: enable PCIe for STB chips

2020-05-29 Thread Rob Herring
On Tue, May 26, 2020 at 03:12:39PM -0400, Jim Quinlan wrote:
> v2:
> Commit: "device core: Add ability to handle multiple dma offsets"
>   o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
>   o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
>   o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
>   o dev->dma_pfn_map => dev->dma_pfn_offset_map
>   o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
>   o In device.h: s/const void */const struct dma_pfn_offset_region */
>   o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
> guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
>   o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
> dev->dma_pfn_offset_map is copied as well.
>   o Merged two of the DMA commits into one (Christoph).
> 
> Commit "arm: dma-mapping: Invoke dma offset func if needed":
>   o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET
> 
> Other commits' changes:
>   o Removed need for carrying of_id var in priv (Nicolas)
>   o Commit message rewordings (Bjorn)
>   o Commit log messages filled to 75 chars (Bjorn)
>   o devm_reset_control_get_shared())
> => devm_reset_control_get_optional_shared (Philipp)
>   o Add call to reset_control_assert() in PCIe remove routines (Philipp)
> 
> v1:
> This patchset expands the usefulness of the Broadcom Settop Box PCIe
> controller by building upon the PCIe driver used currently by the
> Raspbery Pi.  Other forms of this patchset were submitted by me years
> ago and not accepted; the major sticking point was the code required
> for the DMA remapping needed for the PCIe driver to work [1].
> 
> There have been many changes to the DMA and OF subsystems since that
> time, making a cleaner and less intrusive patchset possible.  This
> patchset implements a generalization of "dev->dma_pfn_offset", except
> that instead of a single scalar offset it provides for multiple
> offsets via a function which depends upon the "dma-ranges" property of
> the PCIe host controller.  This is required for proper functionality
> of the BrcmSTB PCIe controller and possibly some other devices.

If you can enable the h/w support without the multiple offset support, 
then I'd split up this series. The latter part might take a bit more 
time.

Rob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets

2020-05-29 Thread Rob Herring
On Wed, May 27, 2020 at 9:43 AM Jim Quinlan  wrote:
>
> Hi Nicolas,
>
> On Wed, May 27, 2020 at 11:00 AM Nicolas Saenz Julienne
>  wrote:
> >
> > Hi Jim,
> > one thing comes to mind, there is a small test suite in 
> > drivers/of/unittest.c
> > (specifically of_unittest_pci_dma_ranges()) you could extend it to include 
> > your
> > use cases.
> Sure, will check out.
> >
> > On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote:
> > > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > > the use of multiple pfn offsets between cpu addrs and dma addrs.  It is
> > > similar to 'dma_pfn_offset' except that the offset chosen depends on the
> > > cpu or dma address involved.
> > >
> > > Signed-off-by: Jim Quinlan 
> > > ---
> > >  drivers/of/address.c| 65 +++--
> > >  drivers/usb/core/message.c  |  3 ++
> > >  drivers/usb/core/usb.c  |  3 ++
> > >  include/linux/device.h  | 10 +-
> > >  include/linux/dma-direct.h  | 10 --
> > >  include/linux/dma-mapping.h | 46 ++
> > >  kernel/dma/Kconfig  | 13 
> > >  7 files changed, 144 insertions(+), 6 deletions(-)
> > >
> >
> > [...]
> >
> > > @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct
> > > device_node *np, u64 *dma_addr,
> > >   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > >range.bus_addr, range.cpu_addr, range.size);
> > >
> > > + num_ranges++;
> > >   if (dma_offset && range.cpu_addr - range.bus_addr != 
> > > dma_offset)
> > > {
> > > - pr_warn("Can't handle multiple dma-ranges with 
> > > different
> > > offsets on node(%pOF)\n", node);
> > > - /* Don't error out as we'd break some existing DTs 
> > > */
> > > - continue;
> > > + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) {
> > > + pr_warn("Can't handle multiple dma-ranges 
> > > with
> > > different offsets on node(%pOF)\n", node);
> > > + pr_warn("Perhaps set 
> > > DMA_PFN_OFFSET_MAP=y?\n");
> > > + /*
> > > +  * Don't error out as we'd break some 
> > > existing
> > > +  * DTs that are using configs w/o
> > > +  * CONFIG_DMA_PFN_OFFSET_MAP set.
> > > +  */
> > > + continue;
> >
> > dev->bus_dma_limit is set in of_dma_configure(), this function's caller, 
> > based
> > on dma_start's value (set after this continue). So you'd be effectively 
> > setting
> > the dev->bus_dma_limit to whatever we get from the first dma-range.
> I'm not seeing that at all.  On the  evaluation of each dma-range,
> dma_start and dma_end are re-evaluated to be the lowest and highest
> bus values of the  dma-ranges seen so far.  After all dma-ranges are
> examined,  dev->bus_dma_limit being set to the highest.  In fact, the
> current code -- ie before my commits -- already does this for multiple
> dma-ranges as long as the cpu-bus offset is the same in the
> dma-ranges.
> >
> > This can be troublesome depending on how the dma-ranges are setup, for 
> > example
> > if the first dma-range doesn't include the CMA area, in arm64 generally set 
> > as
> > high as possible in ZONE_DMA32, that would render it useless for
> > dma/{direct/swiotlb}. Again depending on the bus_dma_limit value, if smaller
> > than ZONE_DMA you'd be unable to allocate any DMA memory.
> >
> > IMO, a solution to this calls for a revamp of dma-direct's dma_capable(): 
> > match
> > the target DMA memory area with each dma-range we have to see if it fits.
> >
> > > + }
> > > + dma_multi_pfn_offset = true;
> > >   }
> > >   dma_offset = range.cpu_addr - range.bus_addr;
> > >
> > > @@ -991,6 +1043,13 @@ int of_dma_get_range(struct device *dev, struct
> > > device_node *np, u64 *dma_addr,
> > >   dma_end = range.bus_addr + range.size;
> > >   }
> > >
> > > + if (dma_multi_pfn_offset) {
> > > + dma_offset = 0;
> > > + ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
> > > + if (ret)
> > > + return ret;
> > > + }
> > > +
> > >   if (dma_start >= dma_end) {
> > >   ret = -EINVAL;
> > >   pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
> > > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > > index 6197938dcc2d..aaa3e58f5eb4 100644
> > > --- a/drivers/usb/core/message.c
> > > +++ b/drivers/usb/core/message.c
> > > @@ -1960,6 +1960,9 @@ int usb_set_configuration(struct usb_device *dev, 
> > > int
> > > configuration)
> > >*/
> > >   intf->dev.dma_mask = dev->dev.dma_mask;
> > >   

Re: Re: [PATCH] PCI: Relax ACS requirement for Intel RCiEP devices.

2020-05-29 Thread Darrel Goeddel

On 5/28/20 4:38 PM, Alex Williamson wrote:

On Thu, 28 May 2020 13:57:42 -0700
Ashok Raj  wrote:


All Intel platforms guarantee that all root complex implementations
must send transactions up to IOMMU for address translations. Hence for
RCiEP devices that are Vendor ID Intel, can claim exception for lack of
ACS support.


3.16 Root-Complex Peer to Peer Considerations
When DMA remapping is enabled, peer-to-peer requests through the
Root-Complex must be handled
as follows:
• The input address in the request is translated (through first-level,
   second-level or nested translation) to a host physical address (HPA).
   The address decoding for peer addresses must be done only on the
   translated HPA. Hardware implementations are free to further limit
   peer-to-peer accesses to specific host physical address regions
   (or to completely disallow peer-forwarding of translated requests).
• Since address translation changes the contents (address field) of
   the PCI Express Transaction Layer Packet (TLP), for PCI Express
   peer-to-peer requests with ECRC, the Root-Complex hardware must use
   the new ECRC (re-computed with the translated address) if it
   decides to forward the TLP as a peer request.
• Root-ports, and multi-function root-complex integrated endpoints, may
   support additional peerto-peer control features by supporting PCI Express
   Access Control Services (ACS) capability. Refer to ACS capability in
   PCI Express specifications for details.

Since Linux didn't give special treatment to allow this exception, certain
RCiEP MFD devices are getting grouped in a single iommu group. This
doesn't permit a single device to be assigned to a guest for instance.

In one vendor system: Device 14.x were grouped in a single IOMMU group.

/sys/kernel/iommu_groups/5/devices/:00:14.0
/sys/kernel/iommu_groups/5/devices/:00:14.2
/sys/kernel/iommu_groups/5/devices/:00:14.3

After the patch:
/sys/kernel/iommu_groups/5/devices/:00:14.0
/sys/kernel/iommu_groups/5/devices/:00:14.2
/sys/kernel/iommu_groups/6/devices/:00:14.3 <<< new group

14.0 and 14.2 are integrated devices, but legacy end points.
Whereas 14.3 was a PCIe compliant RCiEP.

00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00

This permits assigning this device to a guest VM.

Fixes: f096c061f552 ("iommu: Rework iommu_group_get_for_pci_dev()")


I don't really understand this Fixes tag.  This seems like a feature,
not a fix.  If you want it in stable releases as a feature, request it
via Cc: sta...@vger.kernel.org.  I'd drop that tag, that's my nit.
Otherwise:

Reviewed-by: Alex Williamson 


I have tested this patch with 5.6.14 as well as a slightly modified
version (without pci_acs_ctrl_enabled()) in a 3.10 enterprise linux
kernel.

Tested-by: Darrel Goeddel 


Signed-off-by: Ashok Raj 
To: Joerg Roedel 
To: Bjorn Helgaas 
Cc: linux-ker...@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: Lu Baolu 
Cc: Alex Williamson 
Cc: Darrel Goeddel 
Cc: Mark Scott ,
Cc: Romil Sharma 
Cc: Ashok Raj 
---
v2: Moved functionality from iommu to pci quirks - Alex Williamson

  drivers/pci/quirks.c | 15 +++
  1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 28c9a2409c50..63373ca0a3fe 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4682,6 +4682,20 @@ static int pci_quirk_mf_endpoint_acs(struct pci_dev 
*dev, u16 acs_flags)
PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_DT);
  }
  
+static int pci_quirk_rciep_acs(struct pci_dev *dev, u16 acs_flags)

+{
+   /*
+* RCiEP's are required to allow p2p only on translated addresses.
+* Refer to Intel VT-d specification Section 3.16 Root-Complex Peer
+* to Peer Considerations
+*/
+   if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_END)
+   return -ENOTTY;
+
+   return pci_acs_ctrl_enabled(acs_flags,
+   PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF);
+}
+
  static int pci_quirk_brcm_acs(struct pci_dev *dev, u16 acs_flags)
  {
/*
@@ -4764,6 +4778,7 @@ static const struct pci_dev_acs_enabled {
/* I219 */
{ PCI_VENDOR_ID_INTEL, 0x15b7, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x15b8, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_rciep_acs },
/* QCOM QDF2xxx root ports */
{ PCI_VENDOR_ID_QCOM, 0x0400, pci_quirk_qcom_rp_acs },
{ PCI_VENDOR_ID_QCOM, 0x0401, pci_quirk_qcom_rp_acs },




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 01/10] iommu/amd: Move AMD IOMMU driver to a subdirectory

2020-05-29 Thread Joerg Roedel
On Wed, May 27, 2020 at 01:53:04PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> The driver consists of five C files and three header files by now.
> Move them to their own subdirectory to not clutter to iommu top-level
> directory with them.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  MAINTAINERS  | 2 +-
>  drivers/iommu/Makefile   | 6 +++---
>  drivers/iommu/{ => amd}/amd_iommu.h  | 0
>  drivers/iommu/{ => amd}/amd_iommu_proto.h| 0
>  drivers/iommu/{ => amd}/amd_iommu_types.h| 0
>  drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} | 0
>  drivers/iommu/{amd_iommu_init.c => amd/init.c}   | 2 +-
>  drivers/iommu/{amd_iommu.c => amd/iommu.c}   | 2 +-
>  drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} | 0
>  drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c}   | 0
>  10 files changed, 6 insertions(+), 6 deletions(-)
>  rename drivers/iommu/{ => amd}/amd_iommu.h (100%)
>  rename drivers/iommu/{ => amd}/amd_iommu_proto.h (100%)
>  rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%)
>  rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%)
>  rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%)
>  rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%)
>  rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%)
>  rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%)

Okay, so there were a lot of conflicts creating my next branch after this
patch-set was applied, so I skipped this patch to make resolving them a
bit easier.

I will send out this patch again separatly together with a patch doing
the same for VT-d and merge it directly into the next branch I will send
to Linus.


Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/vt-d: Fix compile warning

2020-05-29 Thread Jacob Pan
On Fri, 29 May 2020 15:15:45 +0200
Joerg Roedel  wrote:

> Applied, thanks.
> 
> On Thu, May 28, 2020 at 11:03:51AM -0700, Jacob Pan wrote:
> > Make intel_svm_unbind_mm() a static function.
> > 
> > Fixes: 064a57d7ddfc ("iommu/vt-d: Replace intel SVM APIs with
> > generic SVA APIs")  
> 
> Please make sure the fixes tags (or any other tags) are not
> line-wrapped in future patch submissions.
> 
Got it, thanks.

> Thanks,
> 
>   Joerg

[Jacob Pan]
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/10] iommu/amd: Updates and Cleanups

2020-05-29 Thread Suravee Suthikulpanit

Joerg,

On 5/27/2020 6:53 PM, Joerg Roedel wrote:

Hi,

here is a collection of patches that clean up a few things
in the AMD IOMMU driver. Foremost, it moves all related
files of the driver into a separate subdirectory.

But the patches also remove usage of dev->archdata.iommu and
clean up dev_data handling and domain allocation.

Patches are runtime-tested, including device-assignment.

Please review.

Regards,

Joerg

Joerg Roedel (10):
   iommu/amd: Move AMD IOMMU driver to a subdirectory
   iommu/amd: Unexport get_dev_data()
   iommu/amd: Let free_pagetable() not rely on domain->pt_root
   iommu/amd: Allocate page-table in protection_domain_init()
   iommu/amd: Free page-table in protection_domain_free()
   iommu/amd: Consolidate domain allocation/freeing
   iommu/amd: Remove PD_DMA_OPS_MASK
   iommu/amd: Merge private header files
   iommu/amd: Store dev_data as device iommu private data
   iommu/amd: Remove redundant devid checks

  MAINTAINERS   |   2 +-
  drivers/iommu/Makefile|   6 +-
  .../{amd_iommu_proto.h => amd/amd_iommu.h}|  20 +-
  drivers/iommu/{ => amd}/amd_iommu_types.h |   0
  .../{amd_iommu_debugfs.c => amd/debugfs.c}|   5 +-
  .../iommu/{amd_iommu_init.c => amd/init.c}|   6 +-
  drivers/iommu/{amd_iommu.c => amd/iommu.c}| 265 ++
  .../iommu/{amd_iommu_v2.c => amd/iommu_v2.c}  |  14 +-
  .../{amd_iommu_quirks.c => amd/quirks.c}  |   0
  drivers/iommu/amd_iommu.h |  14 -
  10 files changed, 117 insertions(+), 215 deletions(-)
  rename drivers/iommu/{amd_iommu_proto.h => amd/amd_iommu.h} (88%)
  rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%)
  rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (89%)
  rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%)
  rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (95%)
  rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (98%)
  rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%)
  delete mode 100644 drivers/iommu/amd_iommu.h



Thank you for cleaning up.

Reviewed-by: Suravee Suthikulpanit 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: Fix reference count leak in iommu_group_alloc.

2020-05-29 Thread Joerg Roedel
On Wed, May 27, 2020 at 04:00:19PM -0500, wu000...@umn.edu wrote:
> From: Qiushi Wu 
> 
> kobject_init_and_add() takes reference even when it fails.
> Thus, when kobject_init_and_add() returns an error,
> kobject_put() must be called to properly clean up the kobject.
> 
> Fixes: d72e31c93746 ("iommu: IOMMU Groups")
> Signed-off-by: Qiushi Wu 
> ---
>  drivers/iommu/iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/vt-d: Fix compile warning

2020-05-29 Thread Joerg Roedel
Applied, thanks.

On Thu, May 28, 2020 at 11:03:51AM -0700, Jacob Pan wrote:
> Make intel_svm_unbind_mm() a static function.
> 
> Fixes: 064a57d7ddfc ("iommu/vt-d: Replace intel SVM APIs with generic
> SVA APIs")

Please make sure the fixes tags (or any other tags) are not line-wrapped
in future patch submissions.

Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 0/3] iommu/vt-d: real DMA sub-device info allocation

2020-05-29 Thread Joerg Roedel
On Wed, May 27, 2020 at 10:56:14AM -0600, Jon Derrick wrote:
> Jon Derrick (3):
>   iommu/vt-d: Only clear real DMA device's context entries
>   iommu/vt-d: Allocate domain info for real DMA sub-devices
>   iommu/vt-d: Remove real DMA lookup in find_domain
> 
>  drivers/iommu/intel-iommu.c | 31 +++
>  include/linux/intel-iommu.h |  1 +
>  2 files changed, 24 insertions(+), 8 deletions(-)

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/10] iommu/amd: Updates and Cleanups

2020-05-29 Thread Joerg Roedel
On Fri, May 29, 2020 at 07:15:13PM +0700, Suravee Suthikulpanit wrote:
> Thank you for cleaning up.
> 
> Reviewed-by: Suravee Suthikulpanit 

Thanks for you review, Suravee. Patches are now applied.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] iommu: Remove iommu_sva_ops::mm_exit()

2020-05-29 Thread Joerg Roedel
On Thu, Apr 23, 2020 at 02:53:27PM +0200, Jean-Philippe Brucker wrote:
> The IOMMU SVA API currently requires device drivers to implement an
> mm_exit() callback, which stops device jobs that do DMA. This function
> is called in the release() MMU notifier, when an address space that is
> shared with a device exits.
> 
> It has been noted several time during discussions about SVA that
> cancelling DMA jobs can be slow and complex, and doing it in the
> release() notifier might cause synchronization issues. Device drivers
> must in any case call unbind() to remove their bond, after stopping DMA
> from a more favorable context (release of a file descriptor).
> 
> Patch 1 removes the mm_exit() callback from the uacce module, and patch
> 2 removes it from the IOMMU API. Since v1 [1] I fixed the uacce unbind
> reported by Zhangfei and added details in the commit message of patch 2.
> 
> Jean-Philippe Brucker (2):
>   uacce: Remove mm_exit() op
>   iommu: Remove iommu_sva_ops::mm_exit()

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-05-29 Thread Christoph Hellwig
On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote:
> > This issue is most likely in the i915 driver and is most likely caused by 
> > the driver not respecting the return value of the dma_map_ops::map_sg 
> > function. You can see the driver ignoring the return value here:
> > https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51
> > 
> > Previously this didn’t cause issues because the intel map_sg always 
> > returned the same number of elements as the input scatter gather list but 
> > with the change to this dma-iommu api this is no longer the case. I wasn’t 
> > able to track the bug down to a specific line of code unfortunately.  

Mark did a big audit into the map_sg API abuse and initially had
some i915 patches, but then gave up on them with this comment:

"The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix
 it fully. The driver creatively uses sg_table->orig_nents to store the
 size of the allocate scatterlist and ignores the number of the entries
 returned by dma_map_sg function. In this patchset I only fixed the
 sg_table objects exported by dmabuf related functions. I hope that I
 didn't break anything there."

it would be really nice if the i915 maintainers could help with sorting
that API abuse out.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 04/24] iommu: Add a page fault handler

2020-05-29 Thread Xiang Zheng
Hi,

On 2020/5/20 1:54, Jean-Philippe Brucker wrote:
> Some systems allow devices to handle I/O Page Faults in the core mm. For
> example systems implementing the PCIe PRI extension or Arm SMMU stall
> model. Infrastructure for reporting these recoverable page faults was
> added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device
> fault report API"). Add a page fault handler for host SVA.
> 
> IOMMU driver can now instantiate several fault workqueues and link them
> to IOPF-capable devices. Drivers can choose between a single global
> workqueue, one per IOMMU device, one per low-level fault queue, one per
> domain, etc.
> 
> When it receives a fault event, supposedly in an IRQ handler, the IOMMU
> driver reports the fault using iommu_report_device_fault(), which calls
> the registered handler. The page fault handler then calls the mm fault
> handler, and reports either success or failure with iommu_page_response().
> When the handler succeeded, the IOMMU retries the access.
> 
> The iopf_param pointer could be embedded into iommu_fault_param. But
> putting iopf_param into the iommu_param structure allows us not to care
> about ordering between calls to iopf_queue_add_device() and
> iommu_register_device_fault_handler().
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
> v6->v7: Fix leak in iopf_queue_discard_partial()
> ---
>  drivers/iommu/Kconfig  |   4 +
>  drivers/iommu/Makefile |   1 +
>  include/linux/iommu.h  |  51 +
>  drivers/iommu/io-pgfault.c | 459 +
>  4 files changed, 515 insertions(+)
>  create mode 100644 drivers/iommu/io-pgfault.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index d9fa5b410015..15e9dc4e503c 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -107,6 +107,10 @@ config IOMMU_SVA
>   bool
>   select IOASID
>  
> +config IOMMU_PAGE_FAULT
> + bool
> + select IOMMU_SVA
> +
>  config FSL_PAMU
>   bool "Freescale IOMMU support"
>   depends on PCI
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 40c800dd4e3e..bf5cb4ee8409 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -4,6 +4,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>  obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o
>  obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
> +obj-$(CONFIG_IOMMU_PAGE_FAULT) += io-pgfault.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b62525747bd9..a462157c855b 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -46,6 +46,7 @@ struct iommu_domain;
>  struct notifier_block;
>  struct iommu_sva;
>  struct iommu_fault_event;
> +struct iopf_queue;
>  
>  /* iommu fault flags */
>  #define IOMMU_FAULT_READ 0x0
> @@ -347,6 +348,7 @@ struct iommu_fault_param {
>   * struct dev_iommu - Collection of per-device IOMMU data
>   *
>   * @fault_param: IOMMU detected device fault reporting data
> + * @iopf_param:   I/O Page Fault queue and data
>   * @fwspec:   IOMMU fwspec data
>   * @priv: IOMMU Driver private data
>   *
> @@ -356,6 +358,7 @@ struct iommu_fault_param {
>  struct dev_iommu {
>   struct mutex lock;
>   struct iommu_fault_param*fault_param;
> + struct iopf_device_param*iopf_param;
>   struct iommu_fwspec *fwspec;
>   void*priv;
>  };

[...]

>  #endif /* __LINUX_IOMMU_H */
> diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
> new file mode 100644
> index ..1f61c1bc05da
> --- /dev/null
> +++ b/drivers/iommu/io-pgfault.c
> @@ -0,0 +1,459 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Handle device page faults
> + *
> + * Copyright (C) 2020 ARM Ltd.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "iommu-sva.h"
> +
> +/**
> + * struct iopf_queue - IO Page Fault queue
> + * @wq: the fault workqueue
> + * @devices: devices attached to this queue
> + * @lock: protects the device list
> + */
> +struct iopf_queue {
> + struct workqueue_struct *wq;
> + struct list_headdevices;
> + struct mutexlock;
> +};
> +
> +/**
> + * struct iopf_device_param - IO Page Fault data attached to a device
> + * @dev: the device that owns this param
> + * @queue: IOPF queue
> + * @queue_list: index into queue->devices
> + * @partial: faults that are part of a Page Request Group for which the last
> + *   request hasn't been submitted yet.
> + */
> +struct iopf_device_param {
> + struct device   *dev;
> + struct iopf_queue   *queue;
> + struct list_headqueue_list;
> + struct list_headpartial;
> +};
> 

RE: [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group

2020-05-29 Thread Prakhya, Sai Praneeth
Hi Baolu,

> -Original Message-
> From: Lu Baolu 
> Sent: Thursday, May 28, 2020 7:43 PM
> To: Prakhya, Sai Praneeth ;
> iommu@lists.linux-foundation.org
> Cc: baolu...@linux.intel.com; Christoph Hellwig ; Joerg Roedel
> ; Raj, Ashok ; Will Deacon
> ; Mehta, Sohil ; Robin
> Murphy ; Jacob Pan 
> Subject: Re: [PATCH V3 1/3] iommu: Add support to change default domain of
> an iommu_group
> 
> Hi Sai,
> 
> On 5/29/20 3:23 AM, Sai Praneeth Prakhya wrote:
> > Presently, the default domain of an iommu_group is allocated during
> > boot time (i.e. when a device is being added to a group) and it cannot
> > be
>   
> 
> This is inaccurate as Joerg's code has deferred default domain allocation and
> attaching after group allocation. I'd suggest to remove this.

Ok.. makes sense. I will remove it. I think it should have been like below to 
accurately describe Joerg's changes, is that correct?

"Presently, the default domain of an iommu_group is allocated during
boot time (i.e. during device probe and after the device is added to a group) 
and it cannot
be"
 
> > changed later. So, the device would typically be either in identity
> > (also known as pass_through) mode (controlled by "iommu=pt" kernel
> > command line
>
> 
> There are other kernel parameters to put device in pass_through mode.
> I'd suggest to remove this.

Ok.. makes sense. I will remove it.
I think, you were talking about ARM parameters (iommu.passthrough).. am I right?

> > argument) or the device would be in DMA mode as long as the machine is
> > up and running. There is no way to change the default domain type
> > dynamically i.e. after booting, a device cannot switch between
> > identity mode and DMA mode.
> >
> > But, assume a use case wherein the user trusts the device and believes
> > that the OS is secure enough and hence wants *only* this device to
> > bypass IOMMU (so that it could be high performing) whereas all the
> > other devices to go through IOMMU (so that the system is protected).
> > Presently, this use case is not supported. It will be helpful if there
> > is some way to change the default domain of a B:D.F dynamically. Hence, add
> such support.
>^
> 
> Currently default domain is per iommu_group, we have no per device default
> domain yet. Probably, "default domain of an iommu group"?

Makes sense. I will change it.

> > A privileged user could request the kernel to change the default
> > domain type of a iommu_group by writing to
> > "/sys/kernel/iommu_groups//type" file. Presently, only three
> > values are supported 1. identity: all the DMA transactions from the
> > device in this group are
> >   *not* translated by the iommu 2. DMA: all the DMA
> > transactions from the device in this group are
> >  translated by the iommu
> > 3. auto: change to the type the device was booted with
> >
> > Note:
> > 1. Default domain of an iommu group with two or more devices cannot be
> > changed.
> > 2. The device in the iommu group shouldn't be bound to any driver.
> > 3. The device shouldn't be assigned to user for direct access.
> >
> > Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for
> > more information.
> >
> > Cc: Christoph Hellwig 
> > Cc: Joerg Roedel 
> > Cc: Ashok Raj 
> > Cc: Will Deacon 
> > Cc: Lu Baolu 
> > Cc: Sohil Mehta 
> > Cc: Robin Murphy 
> > Cc: Jacob Pan 
> > Signed-off-by: Sai Praneeth Prakhya 
> > ---
> >   drivers/iommu/iommu.c | 211
> +-
> >   1 file changed, 210 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index
> > a4c2f122eb8b..2b6cca799055 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -92,6 +92,8 @@ static void __iommu_detach_group(struct
> iommu_domain *domain,
> >   static int iommu_create_device_direct_mappings(struct iommu_group
> *group,
> >struct device *dev);
> >   static struct iommu_group *iommu_group_get_for_dev(struct device
> > *dev);
> > +static ssize_t iommu_group_store_type(struct iommu_group *group,
> > + const char *buf, size_t count);
> >
> >   #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)
>   \
> >   struct iommu_group_attribute iommu_group_attr_##_name =   \
> > @@ -524,7 +526,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO,
> iommu_group_show_name, NULL);
> >   static IOMMU_GROUP_ATTR(reserved_regions, 0444,
> > iommu_group_show_resv_regions, NULL);
> >
> > -static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
> > +static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
> > +   iommu_group_store_type);
> >
> >   static void iommu_group_release(struct kobject *kobj)
> >   {
> > @@ -2847,3 +2850,209 @@ int iommu_sva_get_pasid(struct iommu_sva
>