Re: use generic DMA mapping code in powerpc V4

2018-12-03 Thread Christian Zigotzky

Hi All,

Could you please test Christoph's kernel on your PASEMI and NXP boards? 
Download:


'git clone git://git.infradead.org/users/hch/misc.git -b powerpc-dma.5 a'

Thanks,
Christian
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 0/8] vfio/mdev: IOMMU aware mediated device

2018-12-03 Thread Xu Zaibo

Hi,


Is this solution trying to support general user-space processes that 
work directly on devices?


Yes, it is.



Okay. But I have another question. As the author of a crypto driver, 
could I call 'mdev_register_device'?
In other words, is 'mdev_register_device' suitable for crypto 
drivers?


+cc: Herbert Xu

Thanks,
Zaibo




On 2018/11/5 15:34, Lu Baolu wrote:

Hi,

The Mediated Device (mdev) is a framework for fine-grained physical
device sharing across isolated domains. Currently the mdev framework
is designed to be independent of platform IOMMU support. As a
result, DMA isolation relies on the mdev parent device in a
vendor-specific way.

There are several cases where a mediated device could be protected
and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
[1] introduces a new translation mode called 'scalable mode', which
enables PASID-granular translations. The vt-d scalable mode is the
key ingredient for Scalable I/O Virtualization [2] [3], which allows
sharing a device at the smallest possible granularity (ADI - Assignable
Device Interface).

A mediated device backed by an ADI could be protected and isolated
by the IOMMU since 1) the parent device supports tagging a unique
PASID to all DMA traffic out of the mediated device; and 2) the DMA
translation unit (IOMMU) supports PASID-granular translation.
We can apply IOMMU protection and isolation to such devices
just as we do with an assignable PCI device.

In order to distinguish the IOMMU-capable mediated devices from those
which still need to rely on parent devices, this patch set adds two
new members in struct mdev_device.

* iommu_device
   - This, if set, indicates that the mediated device could
 be fully isolated and protected by the IOMMU by attaching
 an iommu domain to this device. If empty, it indicates
 using vendor-defined isolation.

* iommu_domain
   - This is a placeholder for an iommu domain. A domain
 could be stored here for later use once it has been
 attached to the iommu_device of this mdev.

The following helpers are added to set and get the above iommu device
and iommu domain pointers in the mdev core implementation.

* mdev_set/get_iommu_device(dev, iommu_device)
   - Set or get the iommu device which represents this mdev
 in the IOMMU's device scope. A driver doesn't need to set
 the iommu device if it uses vendor-defined isolation.

* mdev_set/get_iommu_domain(domain)
   - An iommu domain which has been attached to the iommu
 device in order to protect and isolate the mediated
 device will be kept in the mdev data structure and
 can be retrieved later.

The mdev parent device driver can opt in to having the mdev fully
isolated and protected by the IOMMU when the mdev is being
created, by invoking mdev_set_iommu_device() in its @create() callback.
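
As a rough sketch (not buildable code; mdev_set_iommu_device() is the
helper described above, while the callback shape and all other names
here are illustrative assumptions):

```c
/* Sketch of a parent driver opting in from its @create() callback.
 * mdev_set_iommu_device() is the helper described above; everything
 * else is an illustrative assumption.
 */
static int my_parent_create(struct kobject *kobj, struct mdev_device *mdev)
{
	struct device *parent = mdev_parent_dev(mdev);

	/* ... allocate vendor-specific state for this mdev ... */

	/*
	 * Opt in: this mdev can be fully isolated by the IOMMU, with
	 * the parent device representing it in the IOMMU's device scope.
	 */
	return mdev_set_iommu_device(mdev_dev(mdev), parent);
}
```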

In vfio_iommu_type1_attach_group(), a domain allocated through
iommu_domain_alloc() will be attached to the mdev iommu device if
an iommu device has been set. Otherwise, the dummy external domain
will be used and all DMA isolation and protection is routed to the
parent driver as a result.

On the IOMMU side, a basic requirement is to allow attaching multiple
domains to a PCI device, provided the device advertises the capability
and the IOMMU hardware supports finer-granularity translations than
the normal PCI Source ID based translation.

As a result, a PCI device can work in two modes: normal mode
and auxiliary mode. In normal mode, a PCI device is isolated
at Source ID granularity; the device itself can be assigned
to a user application by attaching a single domain
to it. In auxiliary mode, a PCI device can be isolated at a
finer granularity, so subsets of the device can be assigned
to different user-level applications by attaching a different domain
to each subset.

The device driver can switch between the above two modes with the
following interfaces:

* iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY)
   - Represents the ability of supporting multiple domains
 per device.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE)
   - Enable the multiple domains capability for the device
 referenced by @dev.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE)
   - Disable the multiple domains capability for the device
 referenced by @dev.

* iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID)
   - Return ID used for finer-granularity DMA translation.

* iommu_attach_device_aux(domain, dev)
   - Attach a domain to the device in the auxiliary mode.

* iommu_detach_device_aux(domain, dev)
   - Detach the aux domain from device.
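
Taken together, a driver's use of the proposed auxiliary mode could look
roughly like the sketch below (built only on the interfaces proposed
above, so it is not buildable against mainline; the exact signatures of
the attr helpers are assumptions, and error handling is trimmed):

```c
/* Sketch: put a device into auxiliary mode and attach a domain for
 * one subset of the device, using the interfaces proposed above.
 */
static int attach_subset(struct device *dev, struct iommu_domain **out)
{
	struct iommu_domain *domain;
	int pasid, ret;

	/* Does this device/IOMMU pair support multiple domains? */
	if (!iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY))
		return -ENODEV;

	/* Switch the device into auxiliary mode */
	ret = iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE);
	if (ret)
		return ret;

	/* Attach a domain for one subset of the device */
	domain = iommu_domain_alloc(dev->bus);
	ret = iommu_attach_device_aux(domain, dev);
	if (ret) {
		iommu_domain_free(domain);
		iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE);
		return ret;
	}

	/* Retrieve the ID (e.g. a PASID) tagging this subset's DMA */
	iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID, &pasid);

	*out = domain;
	return 0;
}
```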

For ease of discussion, we sometimes say "a domain in auxiliary
mode" or simply "an auxiliary domain" when a domain is attached
to a device for finer granularity translations. But we need to
keep in mind that this doesn't mean there is a different domain
type: the same domain could be bound to one device for Source ID
based translation, and bound to another device for finer granularity
translation at the same time.

Re: [PATCH v2 1/9] mm: Introduce new vm_insert_range API

2018-12-03 Thread Souptick Joarder
On Mon, Dec 3, 2018 at 11:52 AM Mike Rapoport  wrote:
>
> On Mon, Dec 03, 2018 at 09:51:45AM +0530, Souptick Joarder wrote:
> > Hi Mike,
> >
> > On Sun, Dec 2, 2018 at 4:43 PM Mike Rapoport  wrote:
> > >
> > > On Sun, Dec 02, 2018 at 11:49:44AM +0530, Souptick Joarder wrote:
> > > > Previously, drivers had their own ways of mapping a range of
> > > > kernel pages/memory into a user vma, typically by invoking
> > > > vm_insert_page() within a loop.
> > > >
> > > > As this pattern is common across different drivers, it can
> > > > be generalized by creating a new function and using it across
> > > > the drivers.
> > > >
> > > > vm_insert_range is the new API which will be used to map a
> > > > range of kernel memory/pages to user vma.
> > > >
> > > > This API is tested by Heiko for Rockchip drm driver, on rk3188,
> > > > rk3288, rk3328 and rk3399 with graphics.
> > > >
> > > > Signed-off-by: Souptick Joarder 
> > > > Reviewed-by: Matthew Wilcox 
> > > > Tested-by: Heiko Stuebner 
> > > > ---
> > > >  include/linux/mm_types.h |  3 +++
> > > >  mm/memory.c  | 38 ++
> > > >  mm/nommu.c   |  7 +++
> > > >  3 files changed, 48 insertions(+)
> > > >
> > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > > > index 5ed8f62..15ae24f 100644
> > > > --- a/include/linux/mm_types.h
> > > > +++ b/include/linux/mm_types.h
> > > > @@ -523,6 +523,9 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, 
> > > > struct mm_struct *mm,
> > > >  extern void tlb_finish_mmu(struct mmu_gather *tlb,
> > > >   unsigned long start, unsigned long end);
> > > >
> > > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > > + struct page **pages, unsigned long page_count);
> > > > +
> > >
> > > This seem to belong to include/linux/mm.h, near vm_insert_page()
> >
> > Ok, I will change it. Apart from this change, does it look good?
>
> With this change you can add
>
> Reviewed-by: Mike Rapoport 

Thanks Mike.

>
> > >
> > > >  static inline void init_tlb_flush_pending(struct mm_struct *mm)
> > > >  {
> > > >   atomic_set(&mm->tlb_flush_pending, 0);
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 15c417e..84ea46c 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct 
> > > > *vma, unsigned long addr,
> > > >  }
> > > >
> > > >  /**
> > > > + * vm_insert_range - insert range of kernel pages into user vma
> > > > + * @vma: user vma to map to
> > > > + * @addr: target user address of this page
> > > > + * @pages: pointer to array of source kernel pages
> > > > + * @page_count: number of pages to insert into user vma
> > > > + *
> > > > + * This allows drivers to insert range of kernel pages they've 
> > > > allocated
> > > > + * into a user vma. This is a generic function which drivers can use
> > > > + * rather than using their own way of mapping range of kernel pages 
> > > > into
> > > > + * user vma.
> > > > + *
> > > > + * If we fail to insert any page into the vma, the function will return
> > > > + * immediately leaving any previously-inserted pages present.  Callers
> > > > + * from the mmap handler may immediately return the error as their 
> > > > caller
> > > > + * will destroy the vma, removing any successfully-inserted pages. 
> > > > Other
> > > > + * callers should make their own arrangements for calling 
> > > > unmap_region().
> > > > + *
> > > > + * Context: Process context. Called by mmap handlers.
> > > > + * Return: 0 on success and error code otherwise
> > > > + */
> > > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > > + struct page **pages, unsigned long page_count)
> > > > +{
> > > > + unsigned long uaddr = addr;
> > > > + int ret = 0, i;
> > > > +
> > > > + for (i = 0; i < page_count; i++) {
> > > > + ret = vm_insert_page(vma, uaddr, pages[i]);
> > > > + if (ret < 0)
> > > > + return ret;
> > > > + uaddr += PAGE_SIZE;
> > > > + }
> > > > +
> > > > + return ret;
> > > > +}
> > > > +EXPORT_SYMBOL(vm_insert_range);
> > > > +
> > > > +/**
> > > >   * vm_insert_page - insert single page into user vma
> > > >   * @vma: user vma to map to
> > > >   * @addr: target user address of this page
> > > > diff --git a/mm/nommu.c b/mm/nommu.c
> > > > index 749276b..d6ef5c7 100644
> > > > --- a/mm/nommu.c
> > > > +++ b/mm/nommu.c
> > > > @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, 
> > > > unsigned long addr,
> > > >  }
> > > >  EXPORT_SYMBOL(vm_insert_page);
> > > >
> > > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > > + struct page **pages, unsigned long page_count)
> > > > +{
> > > > + return -EINVAL;
> > > > +}
> > > > +EXPORT_SYMBOL(vm_insert_range);
> > > > +
> > > >  /*
> > 

Re: [PATCH] of/device: add blacklist for iommu dma_ops

2018-12-03 Thread Vivek Gautam
On Mon, Dec 3, 2018 at 7:56 PM Rob Clark  wrote:
>
> On Mon, Dec 3, 2018 at 7:45 AM Robin Murphy  wrote:
> >
> > Hi Rob,
> >
> > On 01/12/2018 16:53, Rob Clark wrote:
> > > This solves a problem we see with drm/msm, caused by getting
> > > iommu_dma_ops while we attach our own domain and manage it directly at
> > > the iommu API level:
> > >
> > >[0038] user address but active_mm is swapper
> > >Internal error: Oops: 9605 [#1] PREEMPT SMP
> > >Modules linked in:
> > >CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90
> > >Hardware name: xxx (DT)
> > >Workqueue: events deferred_probe_work_func
> > >pstate: 80c9 (Nzcv daif +PAN +UAO)
> > >pc : iommu_dma_map_sg+0x7c/0x2c8
> > >lr : iommu_dma_map_sg+0x40/0x2c8
> > >sp : ff80095eb4f0
> > >x29: ff80095eb4f0 x28: 
> > >x27: ffc0f9431578 x26: 
> > >x25:  x24: 0003
> > >x23: 0001 x22: ffc0fa9ac010
> > >x21:  x20: ffc0fab40980
> > >x19: ffc0fab40980 x18: 0003
> > >x17: 01c4 x16: 0007
> > >x15: 000e x14: 
> > >x13:  x12: 0028
> > >x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> > >x9 :  x8 : ffc0fab409a0
> > >x7 :  x6 : 0002
> > >x5 : 0001 x4 : 
> > >x3 : 0001 x2 : 0002
> > >x1 : ffc0f9431578 x0 : 
> > >Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
> > >Call trace:
> > > iommu_dma_map_sg+0x7c/0x2c8
> > > __iommu_map_sg_attrs+0x70/0x84
> > > get_pages+0x170/0x1e8
> > > msm_gem_get_iova+0x8c/0x128
> > > _msm_gem_kernel_new+0x6c/0xc8
> > > msm_gem_kernel_new+0x4c/0x58
> > > dsi_tx_buf_alloc_6g+0x4c/0x8c
> > > msm_dsi_host_modeset_init+0xc8/0x108
> > > msm_dsi_modeset_init+0x54/0x18c
> > > _dpu_kms_drm_obj_init+0x430/0x474
> > > dpu_kms_hw_init+0x5f8/0x6b4
> > > msm_drm_bind+0x360/0x6c8
> > > try_to_bring_up_master.part.7+0x28/0x70
> > > component_master_add_with_match+0xe8/0x124
> > > msm_pdev_probe+0x294/0x2b4
> > > platform_drv_probe+0x58/0xa4
> > > really_probe+0x150/0x294
> > > driver_probe_device+0xac/0xe8
> > > __device_attach_driver+0xa4/0xb4
> > > bus_for_each_drv+0x98/0xc8
> > > __device_attach+0xac/0x12c
> > > device_initial_probe+0x24/0x30
> > > bus_probe_device+0x38/0x98
> > > deferred_probe_work_func+0x78/0xa4
> > > process_one_work+0x24c/0x3dc
> > > worker_thread+0x280/0x360
> > > kthread+0x134/0x13c
> > > ret_from_fork+0x10/0x18
> > >Code: d284 91000725 6b17039f 5400048a (f9401f40)
> > >---[ end trace f22dda57f3648e2c ]---
> > >Kernel panic - not syncing: Fatal exception
> > >SMP: stopping secondary CPUs
> > >Kernel Offset: disabled
> > >CPU features: 0x0,22802a18
> > >Memory Limit: none
> > >
> > > The problem is that when drm/msm does its own iommu_attach_device(),
> > > the domain returned by iommu_get_domain_for_dev() is drm/msm's
> > > domain, and it doesn't have domain->iova_cookie.
> >
> > Does this crash still happen with 4.20-rc? Because as of 6af588fed391 it
> > really shouldn't.
> >
> > > We kind of avoided this problem prior to sdm845/dpu because the iommu
> > > was attached to the mdp node in dt, which is a child of the toplevel
> > > mdss node (which corresponds to the dev passed in dma_map_sg()).  But
> > > with sdm845, now the iommu is attached at the mdss level so we hit the
> > > iommu_dma_ops in dma_map_sg().
> > >
> > > But auto allocating/attaching a domain before the driver is probed was
> > > already a blocking problem for enabling per-context pagetables for the
> > > GPU.  This problem is also now solved with this patch.
> >
> > s/solved/worked around/
> >
> > If you want a guarantee of actually getting a specific hardware context
> > allocated for a given domain, there needs to be code in the IOMMU driver
> > to understand and honour that. Implicitly depending on whatever happens
> > to fall out of current driver behaviour on current systems is not a real
> > solution.
> >
> > > Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in 
> > > of_dma_configure
> >
> > That's rather misleading, since the crash described above depends on at
> > least two other major changes which came long after that commit.
> >
> > It's not that I don't understand exactly what you want here - just that
> > this commit message isn't a coherent justification for that ;)
> >
> > > Tested-by: Douglas Anderson 
> > > Signed-off-by: Rob Clark 
> > > ---
> > > This is an alternative/replacement for [1].  What it lacks in elegance
> > > it makes up for in practicality ;-)
> > >
> > > [1] https://patchwork.freedesktop.org/patch

[PATCH v19 4/5] dt-bindings: arm-smmu: Add bindings for qcom,smmu-v2

2018-12-03 Thread Vivek Gautam
Add bindings doc for Qcom's smmu-v2 implementation.

Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Rob Herring 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 .../devicetree/bindings/iommu/arm,smmu.txt | 39 ++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 8a6ffce12af5..a6504b37cc21 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -17,10 +17,16 @@ conditions.
 "arm,mmu-401"
 "arm,mmu-500"
 "cavium,smmu-v2"
+"qcom,smmu-v2"
 
   depending on the particular implementation and/or the
   version of the architecture implemented.
 
+  Qcom SoCs must contain, as below, SoC-specific compatibles
+  along with "qcom,smmu-v2":
+  "qcom,msm8996-smmu-v2", "qcom,smmu-v2",
+  "qcom,sdm845-smmu-v2", "qcom,smmu-v2".
+
 - reg   : Base address and size of the SMMU.
 
 - #global-interrupts : The number of global interrupts exposed by the
@@ -71,6 +77,22 @@ conditions.
   or using stream matching with #iommu-cells = <2>, and
   may be ignored if present in such cases.
 
+- clock-names:List of the names of clocks input to the device. The
+  required list depends on particular implementation and
+  is as follows:
+  - for "qcom,smmu-v2":
+- "bus": clock required for downstream bus access and
+ for the smmu ptw,
+- "iface": clock required to access smmu's registers
+   through the TCU's programming interface.
+  - unspecified for other implementations.
+
+- clocks: Specifiers for all clocks listed in the clock-names property,
+  as per generic clock bindings.
+
+- power-domains:  Specifiers for power domains required to be powered on for
+  the SMMU to operate, as per generic power domain bindings.
+
 ** Deprecated properties:
 
 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
@@ -137,3 +159,20 @@ conditions.
 iommu-map = <0 &smmu3 0 0x400>;
 ...
 };
+
+   /* Qcom's arm,smmu-v2 implementation */
+   smmu4: iommu@d0 {
+   compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
+   reg = <0xd0 0x1>;
+
+   #global-interrupts = <1>;
+   interrupts = ,
+,
+;
+   #iommu-cells = <1>;
+   power-domains = <&mmcc MDSS_GDSC>;
+
+   clocks = <&mmcc SMMU_MDP_AXI_CLK>,
+<&mmcc SMMU_MDP_AHB_CLK>;
+   clock-names = "bus", "iface";
+   };
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH v4 0/8] vfio/mdev: IOMMU aware mediated device

2018-12-03 Thread Lu Baolu

Hi,

On 12/4/18 11:46 AM, Xu Zaibo wrote:

Hi,

Is this solution trying to support general user-space processes that 
work directly on devices?


Yes, it is.



Thanks,
Zaibo


Best regards,
Lu Baolu




[PATCH v19 3/5] iommu/arm-smmu: Add the device_link between masters and smmu

2018-12-03 Thread Vivek Gautam
From: Sricharan R 

Finally, add the device link between the master device and the
smmu, so that the smmu gets runtime enabled/disabled only when the
master needs it. This is done from the add_device callback, which
gets called once when the master is added to the smmu.

Signed-off-by: Sricharan R 
Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1917d214c4d9..b6b11642b3a9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1500,6 +1500,9 @@ static int arm_smmu_add_device(struct device *dev)
 
iommu_device_link(&smmu->iommu, dev);
 
+   device_link_add(dev, smmu->dev,
+   DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
return 0;
 
 out_cfg_free:
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v19 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant

2018-12-03 Thread Vivek Gautam
qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
clock and power requirements.
On msm8996, multiple cores, viz. mdss, video, etc. use this
smmu. On sdm845, this smmu is used with gpu.
Add bindings for the same.

Signed-off-by: Vivek Gautam 
Reviewed-by: Rob Herring 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b6b11642b3a9..ba18d89d4732 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -120,6 +120,7 @@ enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
CAVIUM_SMMUV2,
+   QCOM_SMMUV2,
 };
 
 struct arm_smmu_s2cr {
@@ -2030,6 +2031,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, 
GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);
 
 static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
@@ -2038,6 +2040,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+   { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v19 2/5] iommu/arm-smmu: Invoke pm_runtime across the driver

2018-12-03 Thread Vivek Gautam
From: Sricharan R 

Enable pm-runtime on devices that implement a pm domain. Then
add pm runtime hooks to several iommu_ops to power cycle the
smmu device for explicit TLB invalidation requests,
register space accesses, etc.
We need these hooks when the smmu, linked to its master through
device links, has to be powered up without the master device
being in context.

Signed-off-by: Sricharan R 
[vivek: Cleanup pm runtime calls]
Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 108 ++-
 1 file changed, 98 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 602b67d4f2d6..1917d214c4d9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -270,6 +270,20 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
{ 0, NULL},
 };
 
+static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+{
+   if (pm_runtime_enabled(smmu->dev))
+   return pm_runtime_get_sync(smmu->dev);
+
+   return 0;
+}
+
+static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+{
+   if (pm_runtime_enabled(smmu->dev))
+   pm_runtime_put(smmu->dev);
+}
+
 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
return container_of(dom, struct arm_smmu_domain, domain);
@@ -929,11 +943,15 @@ static void arm_smmu_destroy_domain_context(struct 
iommu_domain *domain)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-   int irq;
+   int ret, irq;
 
if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
return;
 
+   ret = arm_smmu_rpm_get(smmu);
+   if (ret < 0)
+   return;
+
/*
 * Disable the context bank and free the page tables before freeing
 * it.
@@ -948,6 +966,8 @@ static void arm_smmu_destroy_domain_context(struct 
iommu_domain *domain)
 
free_io_pgtable_ops(smmu_domain->pgtbl_ops);
__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
+
+   arm_smmu_rpm_put(smmu);
 }
 
 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
@@ -1229,10 +1249,15 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
return -ENODEV;
 
smmu = fwspec_smmu(fwspec);
+
+   ret = arm_smmu_rpm_get(smmu);
+   if (ret < 0)
+   return ret;
+
/* Ensure that the domain is finalised */
ret = arm_smmu_init_domain_context(domain, smmu);
if (ret < 0)
-   return ret;
+   goto rpm_put;
 
/*
 * Sanity check the domain. We don't support domains across
@@ -1242,49 +1267,74 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
dev_err(dev,
"cannot attach to SMMU %s whilst already attached to 
domain on SMMU %s\n",
dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-   return -EINVAL;
+   ret = -EINVAL;
+   goto rpm_put;
}
 
/* Looks ok, so add the device to the domain */
-   return arm_smmu_domain_add_master(smmu_domain, fwspec);
+   ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
+
+rpm_put:
+   arm_smmu_rpm_put(smmu);
+   return ret;
 }
 
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
 {
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+   struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+   int ret;
 
if (!ops)
return -ENODEV;
 
-   return ops->map(ops, iova, paddr, size, prot);
+   arm_smmu_rpm_get(smmu);
+   ret = ops->map(ops, iova, paddr, size, prot);
+   arm_smmu_rpm_put(smmu);
+
+   return ret;
 }
 
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 size_t size)
 {
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+   struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+   size_t ret;
 
if (!ops)
return 0;
 
-   return ops->unmap(ops, iova, size);
+   arm_smmu_rpm_get(smmu);
+   ret = ops->unmap(ops, iova, size);
+   arm_smmu_rpm_put(smmu);
+
+   return ret;
 }
 
 static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
 {
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
 
-   if (smmu_domain->tlb_ops)
+   if (smmu_domain->tlb_ops) {
+   arm_smmu_rpm_get(smmu);
smmu_domain-

[PATCH v19 0/5] iommu/arm-smmu: Add runtime pm/sleep support

2018-12-03 Thread Vivek Gautam
Changes since v18:
 - Addressing Stephen's comment [5]:
 Replaced the entire clock bulk data filling and handling with
 devm_clk_bulk_get_all().

Changes since v17:
 - Addressing Will's comment to embed Thor's change [2] for pulling
   clocks information from device tree. This is done by squashing
   Thor's change [2] in v17's 1/5 patch [3].
 - Another minor change is addition of runtime pm hooks to
   arm_smmu_iova_to_phys_hard().

Previous version of this patch series is @ [1].
Also refer to [4] for change logs for previous versions.

[1] https://lore.kernel.org/patchwork/cover/1017699/
[2] https://lore.kernel.org/patchwork/patch/996143/
[3] https://lore.kernel.org/patchwork/patch/1013167/
[4] https://lore.kernel.org/patchwork/cover/979429/
[5] https://lore.kernel.org/patchwork/patch/1017700/

Sricharan R (3):
  iommu/arm-smmu: Add pm_runtime/sleep ops
  iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device
  iommu/arm-smmu: Add the device_link between masters and smmu

Vivek Gautam (2):
  dt-bindings: arm-smmu: Add bindings for qcom,smmu-v2
  iommu/arm-smmu: Add support for qcom,smmu-v2 variant

 .../devicetree/bindings/iommu/arm,smmu.txt |  39 +
 drivers/iommu/arm-smmu.c   | 170 +++--
 2 files changed, 197 insertions(+), 12 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v19 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-12-03 Thread Vivek Gautam
From: Sricharan R 

The smmu needs to be functional only when the respective
masters using it are active. The device_link feature
helps track such functional dependencies, so that the
iommu gets powered when the master device enables itself
via pm_runtime. By adapting the smmu driver for
runtime pm, the above dependency can be addressed.

This patch adds the pm runtime/sleep callbacks to the
driver and the corresponding bulk clock handling for all
the clocks needed by smmu.

Also, while enabling runtime pm, add a pm sleep suspend
callback that pushes the device to a low power state by
turning the clocks off during system sleep, and a
corresponding clock enable path in the resume callback.

Signed-off-by: Sricharan R 
Signed-off-by: Archit Taneja 
[Thor: Rework to get clocks from device tree]
Signed-off-by: Thor Thayer 
[vivek: rework for clock and pm ops]
Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
Tested-by: Thor Thayer 
---

Changes since v18:
 - Replaced the entire clock bulk data filling and handling with
   devm_clk_bulk_get_all().

 drivers/iommu/arm-smmu.c | 58 +---
 1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5a28ae892504..602b67d4f2d6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -206,6 +207,8 @@ struct arm_smmu_device {
u32 num_global_irqs;
u32 num_context_irqs;
unsigned int*irqs;
+   struct clk_bulk_data*clks;
+   int num_clks;
 
u32 cavium_id_base; /* Specific to Cavium */
 
@@ -1947,7 +1950,7 @@ struct arm_smmu_match_data {
 };
 
 #define ARM_SMMU_MATCH_DATA(name, ver, imp)\
-static struct arm_smmu_match_data name = { .version = ver, .model = imp }
+static const struct arm_smmu_match_data name = { .version = ver, .model = imp }
 
 ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
@@ -2150,6 +2153,17 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
smmu->irqs[i] = irq;
}
 
+   err = devm_clk_bulk_get_all(dev, &smmu->clks);
+   if (err < 0) {
+   dev_err(dev, "failed to get clocks %d\n", err);
+   return err;
+   }
+   smmu->num_clks = err;
+
+   err = clk_bulk_prepare_enable(smmu->num_clks, smmu->clks);
+   if (err)
+   return err;
+
err = arm_smmu_device_cfg_probe(smmu);
if (err)
return err;
@@ -2236,6 +2250,9 @@ static int arm_smmu_device_remove(struct platform_device 
*pdev)
 
/* Turn the thing off */
writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+
+   clk_bulk_disable_unprepare(smmu->num_clks, smmu->clks);
+
return 0;
 }
 
@@ -2244,15 +2261,50 @@ static void arm_smmu_device_shutdown(struct 
platform_device *pdev)
arm_smmu_device_remove(pdev);
 }
 
-static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
 {
struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+   int ret;
+
+   ret = clk_bulk_enable(smmu->num_clks, smmu->clks);
+   if (ret)
+   return ret;
 
arm_smmu_device_reset(smmu);
+
return 0;
 }
 
-static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
+{
+   struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+   clk_bulk_disable(smmu->num_clks, smmu->clks);
+
+   return 0;
+}
+
+static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
+{
+   if (pm_runtime_suspended(dev))
+   return 0;
+
+   return arm_smmu_runtime_resume(dev);
+}
+
+static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
+{
+   if (pm_runtime_suspended(dev))
+   return 0;
+
+   return arm_smmu_runtime_suspend(dev);
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+   SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend, arm_smmu_pm_resume)
+   SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+  arm_smmu_runtime_resume, NULL)
+};
 
 static struct platform_driver arm_smmu_driver = {
.driver = {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Lu Baolu

Hi,

On 12/4/18 1:23 AM, Liu, Yi L wrote:

Hi Joerg,


From: Joerg Roedel [mailto:j...@8bytes.org]
Sent: Monday, December 3, 2018 5:49 AM
To: Lu Baolu 
Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor
support

On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:

-
-   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,

0);

+   /*
+* Need two pages to accommodate 256 descriptors of 256 bits each
+* if the remapping hardware supports scalable mode translation.
+*/
+   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
+!!ecap_smts(iommu->ecap));



Same here, does the allocation really need GFP_ATOMIC?


I'll still leave this one to Baolu.


The existing code uses GFP_ATOMIC; this patch only changes the size of
the allocated desc_page.

I don't think we really need GFP_ATOMIC here (and also in some other
places). I will clean them up in a separate patch.






  struct q_inval {
raw_spinlock_t  q_lock;
-   struct qi_desc  *desc;  /* invalidation queue */
+   void*desc;  /* invalidation queue */
int *desc_status;   /* desc status */
int free_head;  /* first free entry */
int free_tail;  /* last free entry */


Why do you switch the pointer to void* ?


In this patch, there is some code like the code below. It calculates
the destination address of memcpy with qi->desc. If it were still a
struct qi_desc pointer, the calculated address would be wrong.

+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);

The change of the calculation method is to support both 128-bit and
256-bit invalidation descriptors in this unified code logic.

Also, the conversation between Baolu and me may help.

https://lore.kernel.org/patchwork/patch/1006756/


Yes. We need to support different descriptor size.

Best regards,
Lu Baolu


Re: [PATCH v5 02/12] iommu/vt-d: Manage scalable mode PASID tables

2018-12-03 Thread Lu Baolu

Hi,

On 12/4/18 1:23 AM, Liu, Yi L wrote:

Hi Joerg,


From: Joerg Roedel [mailto:j...@8bytes.org]
Sent: Monday, December 3, 2018 5:44 AM
To: Lu Baolu 
Subject: Re: [PATCH v5 02/12] iommu/vt-d: Manage scalable mode PASID tables

Hi Baolu,

On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote:

@@ -2482,12 +2482,13 @@ static struct dmar_domain

*dmar_insert_one_dev_info(struct intel_iommu *iommu,

if (dev)
dev->archdata.iommu = info;

-   if (dev && dev_is_pci(dev) && info->pasid_supported) {
+   /* PASID table is mandatory for a PCI device in scalable mode. */
+   if (dev && dev_is_pci(dev) && sm_supported(iommu)) {


This will also allocate a PASID table if the device does not support
PASIDs, right? Will the table not be used in that case or will the
device just use the fallback PASID? Isn't it better in that case to have
no PASID table?


We need to allocate the PASID table in scalable mode for the reason below:
In VT-d scalable mode, all address translation is done at PASID granularity.
For requests-with-PASID, the address translation is subjected to the
PASID entry specified by the PASID value in the DMA request. However, for
requests-without-PASID, there is no PASID in the DMA request. To fulfil
the translation logic, we introduced the RID2PASID field in the sm-context-entry
in the VT-d 3.0 spec, so that such DMA requests are subjected to the PASID
entry specified by the PASID value in the RID2PASID field of the sm-context-entry.

So for a device without PASID support, we still need at least one PASID
entry so that its DMA requests (without PASID) can be translated. Thus a PASID
table is needed for such devices.




@@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev)
return -ENOMEM;
INIT_LIST_HEAD(&pasid_table->dev);

-   size = sizeof(struct pasid_entry);
-   count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id);
-   order = get_order(size * count);
+   if (info->pasid_supported)
+   max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)),
+ intel_pasid_max_id);
+
+   size = max_pasid >> (PASID_PDE_SHIFT - 3);
+   order = size ? get_order(size) : 0;
pages = alloc_pages_node(info->iommu->node,
-GFP_ATOMIC | __GFP_ZERO,
-order);
+GFP_ATOMIC | __GFP_ZERO, order);


This is a simple data structure allocation path, does it need
GFP_ATOMIC?




This function is called in a non-sleepable (atomic) context.

spin_lock(&lock)
[...]
if (pasid_table_is_necessary)
allocate_pasid_table(dev)
[...]
spin_unlock(&lock)

We can move it out of the lock range.

How about

if (pasid_table_is_necessary)
pasid_table = allocate_pasid_table(dev)

spin_lock(&lock)
[...]
if (pasid_table_is_necessary)
set_up_pasid_table(pasid_table)
[...]
spin_unlock(&lock)

?

Best regards,
Lu Baolu


Re: [PATCH v4 0/8] vfio/mdev: IOMMU aware mediated device

2018-12-03 Thread Xu Zaibo

Hi,

Is this solution trying to support general user space processes who are 
directly working on devices?


Thanks,
Zaibo


On 2018/11/5 15:34, Lu Baolu wrote:

Hi,

The Mediated Device (mdev) framework enables fine-grained physical
device sharing across isolated domains. Currently the mdev framework
is designed to be independent of platform IOMMU support. As a
result, DMA isolation relies on the mdev parent device in a
vendor-specific way.

There are several cases where a mediated device could be protected
and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
[1] introduces a new translation mode called 'scalable mode', which
enables PASID-granular translations. The vt-d scalable mode is the
key ingredient for Scalable I/O Virtualization [2] [3] which allows
sharing a device in minimal possible granularity (ADI - Assignable
Device Interface).

A mediated device backed by an ADI could be protected and isolated
by the IOMMU since 1) the parent device supports tagging a unique
PASID to all DMA traffic out of the mediated device; and 2) the DMA
translation unit (IOMMU) supports PASID-granular translation.
We can apply IOMMU protection and isolation to this kind of device
just as we do with an assignable PCI device.

In order to distinguish the IOMMU-capable mediated devices from those
which still need to rely on parent devices, this patch set adds two
new members in struct mdev_device.

* iommu_device
   - This, if set, indicates that the mediated device could
 be fully isolated and protected by IOMMU via attaching
 an iommu domain to this device. If empty, it indicates
 using vendor defined isolation.

* iommu_domain
   - This is a place holder for an iommu domain. A domain
 could be stored here for later use once it has been
 attached to the iommu_device of this mdev.

Below helpers are added to set and get above iommu device
and iommu domain pointers in mdev core implementation.

* mdev_set/get_iommu_device(dev, iommu_device)
   - Set or get the iommu device which represents this mdev
 in IOMMU's device scope. Drivers don't need to set the
 iommu device if it uses vendor defined isolation.

* mdev_set/get_iommu_domain(domain)
   - A iommu domain which has been attached to the iommu
 device in order to protect and isolate the mediated
 device will be kept in the mdev data structure and
 could be retrieved later.

The mdev parent device driver could opt-in that the mdev could be
fully isolated and protected by the IOMMU when the mdev is being
created by invoking mdev_set_iommu_device() in its @create().

In the vfio_iommu_type1_attach_group(), a domain allocated through
iommu_domain_alloc() will be attached to the mdev iommu device if
an iommu device has been set. Otherwise, the dummy external domain
will be used and all the DMA isolation and protection are routed to
parent driver as the result.

On IOMMU side, a basic requirement is allowing to attach multiple
domains to a PCI device if the device advertises the capability
and the IOMMU hardware supports finer granularity translations than
the normal PCI Source ID based translation.

As a result, a PCI device could work in two modes: normal mode
and auxiliary mode. In the normal mode, a PCI device could be
isolated at Source ID granularity; the PCI device itself could
be assigned to a user application by attaching a single domain
to it. In the auxiliary mode, a PCI device could be isolated at
a finer granularity, hence subsets of the device could be assigned
to different user level applications by attaching a different domain
to each subset.

The device driver is able to switch between above two modes with
below interfaces:

* iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY)
   - Represents the ability of supporting multiple domains
 per device.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE)
   - Enable the multiple domains capability for the device
 referenced by @dev.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE)
   - Disable the multiple domains capability for the device
 referenced by @dev.

* iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID)
   - Return ID used for finer-granularity DMA translation.

* iommu_attach_device_aux(domain, dev)
   - Attach a domain to the device in the auxiliary mode.

* iommu_detach_device_aux(domain, dev)
   - Detach the aux domain from device.

For ease of discussion, we sometimes say "a domain in
auxiliary mode", or simply "an auxiliary domain", when a domain is
attached to a device for finer-granularity translations. But keep
in mind that this doesn't mean there is a different domain
type. The same domain could be bound to a device for Source ID based
translation, and bound to another device for finer-granularity
translation at the same time.

This patch series extends both IOMMU and vfio components to support
mdev device passing through when it could be isolated and protected
by 

[PATCH v3 5/6] arm64: defconfig: Enable ARM_SMMU_TEGRA

2018-12-03 Thread Krishna Reddy
Enable CONFIG_ARM_SMMU_TEGRA, which is used
by the Tegra194 SoC.

Signed-off-by: Krishna Reddy 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 5c2b1f6..e5ea13b 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -625,6 +625,7 @@ CONFIG_QCOM_APCS_IPC=y
 CONFIG_ROCKCHIP_IOMMU=y
 CONFIG_TEGRA_IOMMU_SMMU=y
 CONFIG_ARM_SMMU=y
+CONFIG_ARM_SMMU_TEGRA=y
 CONFIG_ARM_SMMU_V3=y
 CONFIG_QCOM_IOMMU=y
 CONFIG_REMOTEPROC=m
-- 
2.1.4



[PATCH v3 2/6] iommu/arm-smmu: Add support to program multiple ARM SMMU's identically

2018-12-03 Thread Krishna Reddy
Add support to program multiple ARM SMMUs identically as one SMMU device.
Tegra194 uses two ARM SMMUs as one SMMU device, and both ARM SMMUs need
to be programmed identically.

Signed-off-by: Krishna Reddy 
---
 drivers/iommu/lib-arm-smmu.c | 191 ---
 1 file changed, 144 insertions(+), 47 deletions(-)

diff --git a/drivers/iommu/lib-arm-smmu.c b/drivers/iommu/lib-arm-smmu.c
index 6aba5db..7036763 100644
--- a/drivers/iommu/lib-arm-smmu.c
+++ b/drivers/iommu/lib-arm-smmu.c
@@ -69,9 +69,9 @@
  * therefore this actually makes more sense than it might first appear.
  */
 #ifdef CONFIG_64BIT
-#define smmu_write_atomic_lq   writeq_relaxed
+#define smmu_write_atomic_lq   writeq_all_relaxed
 #else
-#define smmu_write_atomic_lq   writel_relaxed
+#define smmu_write_atomic_lq   writel_all_relaxed
 #endif
 
 /* Translation context bank */
@@ -135,6 +135,48 @@ struct arm_smmu_domain {
struct iommu_domain domain;
 };
 
+#define to_smmu_intance(smmu, inst, addr) \
+   (addr - smmu->bases[0] + smmu->bases[inst])
+
+static void writel_all(struct arm_smmu_device *smmu,
+  u32 value, void __iomem *addr)
+{
+   int i;
+
+   writel(value, addr);
+   for (i = 1; i < smmu->num_smmus; i++) {
+   void __iomem *reg_addr = to_smmu_intance(smmu, i, addr);
+
+   writel(value, reg_addr);
+   }
+}
+
+static void writel_all_relaxed(struct arm_smmu_device *smmu,
+  u32 value, void __iomem *addr)
+{
+   int i;
+
+   writel_relaxed(value, addr);
+   for (i = 1; i < smmu->num_smmus; i++) {
+   void __iomem *reg_addr = to_smmu_intance(smmu, i, addr);
+
+   writel_relaxed(value, reg_addr);
+   }
+}
+
+static void writeq_all_relaxed(struct arm_smmu_device *smmu,
+  u64 value, void __iomem *addr)
+{
+   int i;
+
+   writeq_relaxed(value, addr);
+   for (i = 1; i < smmu->num_smmus; i++) {
+   void __iomem *reg_addr = to_smmu_intance(smmu, i, addr);
+
+   writeq_relaxed(value, reg_addr);
+   }
+}
+
 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
return container_of(dom, struct arm_smmu_domain, domain);
@@ -179,25 +221,37 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device 
*smmu,
 
 static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu)
 {
-   void __iomem *base = ARM_SMMU_GR0(smmu);
+   int i;
unsigned long flags;
 
spin_lock_irqsave(&smmu->global_sync_lock, flags);
-   __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
-   base + ARM_SMMU_GR0_sTLBGSTATUS);
+   for (i = 0; i < smmu->num_smmus; i++) {
+   void __iomem *base = ARM_SMMU_GR0(smmu);
+
+   if (i > 0)
+   base = to_smmu_intance(smmu, i, base);
+   __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
+   base + ARM_SMMU_GR0_sTLBGSTATUS);
+   }
spin_unlock_irqrestore(&smmu->global_sync_lock, flags);
 }
 
 static void arm_smmu_tlb_sync_context(void *cookie)
 {
+   int i;
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
-   void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
unsigned long flags;
 
spin_lock_irqsave(&smmu_domain->cb_lock, flags);
-   __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
-   base + ARM_SMMU_CB_TLBSTATUS);
+   for (i = 0; i < smmu->num_smmus; i++) {
+   void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+
+   if (i > 0)
+   base = to_smmu_intance(smmu, i, base);
+   __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
+   base + ARM_SMMU_CB_TLBSTATUS);
+   }
spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
 }
 
@@ -212,13 +266,14 @@ static void arm_smmu_tlb_inv_context_s1(void *cookie)
 {
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-   void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   void __iomem *base = ARM_SMMU_CB(smmu, cfg->cbndx);
 
/*
 * NOTE: this is not a relaxed write; it needs to guarantee that PTEs
 * cleared by the current CPU are visible to the SMMU before the TLBI.
 */
-   writel(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
+   writel_all(smmu, cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
arm_smmu_tlb_sync_context(cookie);
 }
 
@@ -229,7 +284,7 @@ static void arm_smmu_tlb_inv_context_s2(void *cookie)
void __iomem *base = ARM_SMMU_GR0(smmu);
 
/* NOTE: see above */
-   writel(smmu_domain

[PATCH v3 6/6] arm64: tegra: Add SMMU nodes to Tegra194 device tree

2018-12-03 Thread Krishna Reddy
Add SMMU nodes and dma-ranges to the Tegra194 device tree.
Tegra194 has three ARM SMMU instances. Two of them are used
together, with IOVA accesses interleaved across them. The third
is used as a regular ARM SMMU.

Signed-off-by: Krishna Reddy 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 148 +++
 1 file changed, 148 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index f274562..9a3e08c 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -12,6 +12,7 @@
interrupt-parent = <&gic>;
#address-cells = <2>;
#size-cells = <2>;
+   dma-ranges = <0x0 0x0 0x0 0x0 0x8 0x0>;
 
/* control backbone */
cbb {
@@ -957,4 +958,151 @@
(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
interrupt-parent = <&gic>;
};
+
+   dualsmmu: iommu@1200 {
+   compatible = "tegra194,arm,mmu-500";
+   reg = <0 0x1200 0 0x80>,
+ <0 0x1100 0 0x80>;
+   interrupts = ,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+;
+   stream-match-mask = <0x7f80>;
+   #global-interrupts = <1>;
+   #iommu-cells = <1>;
+   };
+
+   smmu: iommu@1000 {
+   compatible = "arm,mmu-500";
+   reg = <0 0x1000 0 0x80>;
+   interrupts = ,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+

[PATCH v3 1/6] iommu/arm-smmu: create library for ARM SMMU

2018-12-03 Thread Krishna Reddy
Create library routines to share the ARM SMMU programming
and common IOMMU API implementation between ARM SMMU v1 and v2
based architecture implementations.

Signed-off-by: Krishna Reddy 
---
 drivers/iommu/Makefile   |1 +
 drivers/iommu/lib-arm-smmu.c | 1671 ++
 drivers/iommu/lib-arm-smmu.h |  161 
 3 files changed, 1833 insertions(+)
 create mode 100644 drivers/iommu/lib-arm-smmu.c
 create mode 100644 drivers/iommu/lib-arm-smmu.h

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index a158a68..ea87cae 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
+obj-$(CONFIG_ARM_SMMU) += lib-arm-smmu.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
diff --git a/drivers/iommu/lib-arm-smmu.c b/drivers/iommu/lib-arm-smmu.c
new file mode 100644
index 000..6aba5db
--- /dev/null
+++ b/drivers/iommu/lib-arm-smmu.c
@@ -0,0 +1,1671 @@
+/*
+ * Copyright (c) 2018, NVIDIA Corporation
+ * Author: Krishna Reddy 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Library for ARM architected v1 and v2 SMMU implementations.
+ * This library is created by reusing the code from arm-smmu.c which is
+ *   authored by Will Deacon.
+ */
+
+#define pr_fmt(fmt) "lib-arm-smmu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+#include "lib-arm-smmu.h"
+
+#define ARM_MMU500_ACTLR_CPRE  (1 << 1)
+
+#define ARM_MMU500_ACR_CACHE_LOCK  (1 << 26)
+#define ARM_MMU500_ACR_S2CRB_TLBEN (1 << 10)
+#define ARM_MMU500_ACR_SMTNMB_TLBEN(1 << 8)
+
+#define TLB_LOOP_TIMEOUT   1000000 /* 1s! */
+#define TLB_SPIN_COUNT 10
+
+/* SMMU global address space */
+#define ARM_SMMU_GR0(smmu) ((smmu)->base)
+#define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift))
+
+/*
+ * SMMU global address space with conditional offset to access secure
+ * aliases of non-secure registers (e.g. nsCR0: 0x400, nsGFSR: 0x448,
+ * nsGFSYNR0: 0x450)
+ */
+#define ARM_SMMU_GR0_NS(smmu)  \
+   ((smmu)->base + \
+   ((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS)   \
+   ? 0x400 : 0))
+
+/*
+ * Some 64-bit registers only make sense to write atomically, but in such
+ * cases all the data relevant to AArch32 formats lies within the lower word,
+ * therefore this actually makes more sense than it might first appear.
+ */
+#ifdef CONFIG_64BIT
+#define smmu_write_atomic_lq   writeq_relaxed
+#else
+#define smmu_write_atomic_lq   writel_relaxed
+#endif
+
+/* Translation context bank */
+#define ARM_SMMU_CB(smmu, n)   ((smmu)->cb_base + ((n) << (smmu)->pgshift))
+
+#define MSI_IOVA_BASE  0x8000000
+#define MSI_IOVA_LENGTH    0x100000
+
+#define s2cr_init_val (struct arm_smmu_s2cr){ \
+   .type = smmu->disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS, \
+}
+
+struct arm_smmu_master_cfg {
+   struct arm_smmu_device  *smmu;
+   s16 smendx[];
+};
+#define INVALID_SMENDX -1
+#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv)
+#define fwspec_smmu(fw)  (__fwspec_cfg(fw)->smmu)
+#define fwspec_smendx(fw, i) \
+   (i >= fw->num_ids ? INVALID_SMENDX : __fwspec_cfg(fw)->smendx[i])
+#define for_each_cfg_sme(fw, i, idx) \
+   for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i)
+
+enum arm_smmu_context_fmt {
+   ARM_SMMU_CTX_FMT_NONE,
+   ARM_SMMU_CTX_FMT_AARCH64,
+   ARM_SMMU_CTX_FMT_AARCH32_L,
+   ARM_SMMU_CTX_FMT_AARCH32_S,
+};
+
+struct arm_smmu_cfg {
+   u8  cbndx;
+   u8  irptndx;
+   union {
+   u16 asid;
+   u16 vmid;
+   };
+   u32 cbar;
+   enum arm_smmu_context_fmt   fmt;
+};
+#define INVALID_IRPTNDX0xff
+
+enum arm_smmu_domain

[PATCH v3 3/6] iommu/arm-smmu: update arm-smmu.c to use ARM SMMU library

2018-12-03 Thread Krishna Reddy
Update the ARM SMMU driver to use the ARM SMMU library routines.

Signed-off-by: Krishna Reddy 
---
 drivers/iommu/arm-smmu.c | 1709 +-
 1 file changed, 11 insertions(+), 1698 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5a28ae8..453159b 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1,5 +1,5 @@
 /*
- * IOMMU API for ARM architected SMMU implementations.
+ * ARM SMMU driver for ARM architected V1 and V2 SMMU implementations.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -39,7 +39,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -56,49 +55,7 @@
 
 #include "io-pgtable.h"
 #include "arm-smmu-regs.h"
-
-#define ARM_MMU500_ACTLR_CPRE  (1 << 1)
-
-#define ARM_MMU500_ACR_CACHE_LOCK  (1 << 26)
-#define ARM_MMU500_ACR_S2CRB_TLBEN (1 << 10)
-#define ARM_MMU500_ACR_SMTNMB_TLBEN(1 << 8)
-
-#define TLB_LOOP_TIMEOUT   1000000 /* 1s! */
-#define TLB_SPIN_COUNT 10
-
-/* Maximum number of context banks per SMMU */
-#define ARM_SMMU_MAX_CBS   128
-
-/* SMMU global address space */
-#define ARM_SMMU_GR0(smmu) ((smmu)->base)
-#define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift))
-
-/*
- * SMMU global address space with conditional offset to access secure
- * aliases of non-secure registers (e.g. nsCR0: 0x400, nsGFSR: 0x448,
- * nsGFSYNR0: 0x450)
- */
-#define ARM_SMMU_GR0_NS(smmu)  \
-   ((smmu)->base + \
-   ((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS)   \
-   ? 0x400 : 0))
-
-/*
- * Some 64-bit registers only make sense to write atomically, but in such
- * cases all the data relevant to AArch32 formats lies within the lower word,
- * therefore this actually makes more sense than it might first appear.
- */
-#ifdef CONFIG_64BIT
-#define smmu_write_atomic_lq   writeq_relaxed
-#else
-#define smmu_write_atomic_lq   writel_relaxed
-#endif
-
-/* Translation context bank */
-#define ARM_SMMU_CB(smmu, n)   ((smmu)->cb_base + ((n) << (smmu)->pgshift))
-
-#define MSI_IOVA_BASE  0x8000000
-#define MSI_IOVA_LENGTH    0x100000
+#include "lib-arm-smmu.h"
 
 static int force_stage;
 module_param(force_stage, int, S_IRUGO);
@@ -109,150 +66,6 @@ module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
 
-enum arm_smmu_arch_version {
-   ARM_SMMU_V1,
-   ARM_SMMU_V1_64K,
-   ARM_SMMU_V2,
-};
-
-enum arm_smmu_implementation {
-   GENERIC_SMMU,
-   ARM_MMU500,
-   CAVIUM_SMMUV2,
-};
-
-struct arm_smmu_s2cr {
-   struct iommu_group  *group;
-   int count;
-   enum arm_smmu_s2cr_type type;
-   enum arm_smmu_s2cr_privcfg  privcfg;
-   u8  cbndx;
-};
-
-#define s2cr_init_val (struct arm_smmu_s2cr){  \
-   .type = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS,\
-}
-
-struct arm_smmu_smr {
-   u16 mask;
-   u16 id;
-   boolvalid;
-};
-
-struct arm_smmu_cb {
-   u64 ttbr[2];
-   u32 tcr[2];
-   u32 mair[2];
-   struct arm_smmu_cfg *cfg;
-};
-
-struct arm_smmu_master_cfg {
-   struct arm_smmu_device  *smmu;
-   s16 smendx[];
-};
-#define INVALID_SMENDX -1
-#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv)
-#define fwspec_smmu(fw)  (__fwspec_cfg(fw)->smmu)
-#define fwspec_smendx(fw, i) \
-   (i >= fw->num_ids ? INVALID_SMENDX : __fwspec_cfg(fw)->smendx[i])
-#define for_each_cfg_sme(fw, i, idx) \
-   for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i)
-
-struct arm_smmu_device {
-   struct device   *dev;
-
-   void __iomem*base;
-   void __iomem*cb_base;
-   unsigned long   pgshift;
-
-#define ARM_SMMU_FEAT_COHERENT_WALK(1 << 0)
-#define ARM_SMMU_FEAT_STREAM_MATCH (1 << 1)
-#define ARM_SMMU_FEAT_TRANS_S1 (1 << 2)
-#define ARM_SMMU_FEAT_TRANS_S2 (1 << 3)
-#define ARM_SMMU_FEAT_TRANS_NESTED (1 << 4)
-#define ARM_SMMU_FEAT_TRANS_OPS(1 << 5)
-#define ARM_SMMU_FEAT_VMID16   (1 << 6)
-#define ARM_SMMU_FEAT_FMT_A

[PATCH v3 4/6] iommu/tegra194_smmu: Add Tegra194 SMMU driver

2018-12-03 Thread Krishna Reddy
The Tegra194 SMMU driver supports the dual ARM SMMU configuration
necessary for the Tegra194 SoC.
IOVA accesses from HW devices are transparently interleaved
across the two ARM SMMU devices.

Signed-off-by: Krishna Reddy 
---
 drivers/iommu/Kconfig |  11 ++
 drivers/iommu/Makefile|   1 +
 drivers/iommu/tegra194-smmu.c | 394 ++
 3 files changed, 406 insertions(+)
 create mode 100644 drivers/iommu/tegra194-smmu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d9a2571..2dc08c7 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -357,6 +357,17 @@ config ARM_SMMU
  Say Y here if your SoC includes an IOMMU device implementing
  the ARM SMMU architecture.
 
+config ARM_SMMU_TEGRA
+   bool "Dual ARM SMMU(MMU-500) support on Tegra"
+   depends on ARM64 && MMU && ARCH_TEGRA
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   help
+ Support for implementing two ARM SMMU instances as a single
+ SMMU device on Tegra194. IOVA accesses from devices are interleaved
+ across these two ARM SMMU instances. The number of ARM SMMU instances
+ used in the SMMU device is not visible to HW devices.
+
 config ARM_SMMU_V3
bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
depends on ARM64
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index ea87cae..f1587bb 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
 obj-$(CONFIG_ARM_SMMU) += lib-arm-smmu.o
+obj-$(CONFIG_ARM_SMMU_TEGRA) += tegra194-smmu.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
diff --git a/drivers/iommu/tegra194-smmu.c b/drivers/iommu/tegra194-smmu.c
new file mode 100644
index 000..7b21670
--- /dev/null
+++ b/drivers/iommu/tegra194-smmu.c
@@ -0,0 +1,394 @@
+/*
+ * Copyright (c) 2018, NVIDIA Corporation
+ * Author: Krishna Reddy 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#define pr_fmt(fmt) "tegra194-smmu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+#include "lib-arm-smmu.h"
+
+static int force_stage;
+module_param(force_stage, int, S_IRUGO);
+MODULE_PARM_DESC(force_stage,
+   "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation.");
+static bool disable_bypass;
+module_param(disable_bypass, bool, S_IRUGO);
+MODULE_PARM_DESC(disable_bypass,
+   "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
+
+static struct platform_driver tegra194_smmu_driver;
+static struct iommu_ops tegra194_smmu_ops;
+
+static struct iommu_domain *tegra194_smmu_domain_alloc(unsigned int type)
+{
+   return arm_smmu_domain_alloc_common(type, true);
+}
+
+static int tegra194_smmu_attach_dev(struct iommu_domain *domain,
+   struct device *dev)
+{
+   return arm_smmu_attach_dev_common(domain, dev, &tegra194_smmu_ops);
+}
+
+static int tegra194_smmu_match_node(struct device *dev, void *data)
+{
+   return dev->fwnode == data;
+}
+
+static
+struct arm_smmu_device *tegra194_smmu_get_by_fwnode(struct fwnode_handle 
*fwnode)
+{
+   struct device *dev = driver_find_device(&tegra194_smmu_driver.driver,
+   NULL, fwnode, tegra194_smmu_match_node);
+   put_device(dev);
+   return dev ? dev_get_drvdata(dev) : NULL;
+}
+
+static int tegra194_smmu_add_device(struct device *dev)
+{
+   struct arm_smmu_device *smmu;
+   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+
+   if (fwspec && fwspec->ops == &tegra194_smmu_ops)
+   smmu = tegra194_smmu_get_by_fwnode(fwspec->iommu_fwnode);
+   else
+   return -ENODEV;
+
+   return arm_smmu_add_device_common(dev, smmu);
+}
+
+static void tegra194_smmu_remov

[PATCH v3 0/5] Add Tegra194 Dual ARM SMMU driver

2018-12-03 Thread Krishna Reddy
NVIDIA's Xavier (Tegra194) SOC has two ARM SMMU (MMU-500) instances, which
are used as one SMMU device in HW. The IOVA accesses from HW devices are
interleaved across these two SMMU instances and need to be programmed
identically.

The existing ARM SMMU driver can't be used in its current form to program
the two SMMU instances identically, but most of its code can be shared
between the ARM SMMU driver and the Tegra194 SMMU driver. To allow sharing
the code, a library was created based on the current ARM SMMU driver, and
support was added to program multiple ARM SMMU instances identically. The
current ARM SMMU driver and the Tegra194 SMMU driver were updated to use
the library functions.

Please review the patches and provide feedback.

Changes in v2:
* Added CONFIG_ARM_SMMU_TEGRA to protect Tegra194 SMMU driver compilation
* Enabled CONFIG_ARM_SMMU_TEGRA in defconfig
* Added SMMU nodes in Tegra194 device tree

Changes in v3:
* Created library for ARM SMMU based on arm-smmu.c
* Added support to program multiple ARM SMMU instances identically
* Updated arm-smmu.c/tegra194-smmu.c to use ARM SMMU library functions

Krishna Reddy (6):
  iommu/arm-smmu: create library for ARM SMMU
  iommu/arm-smmu: Add support to program multiple ARM SMMU's identically
  iommu/arm-smmu: update arm-smmu.c to use ARM SMMU library
  iommu/tegra194_smmu: Add Tegra194 SMMU driver
  arm64: defconfig: Enable ARM_SMMU_TEGRA
  arm64: tegra: Add SMMU nodes to Tegra194 device tree

 arch/arm64/boot/dts/nvidia/tegra194.dtsi |  148 +++
 arch/arm64/configs/defconfig |1 +
 drivers/iommu/Kconfig|   11 +
 drivers/iommu/Makefile   |2 +
 drivers/iommu/arm-smmu.c | 1709 +
 drivers/iommu/lib-arm-smmu.c | 1768 ++
 drivers/iommu/lib-arm-smmu.h |  161 +++
 drivers/iommu/tegra194-smmu.c|  394 +++
 8 files changed, 2496 insertions(+), 1698 deletions(-)
 create mode 100644 drivers/iommu/lib-arm-smmu.c
 create mode 100644 drivers/iommu/lib-arm-smmu.h
 create mode 100644 drivers/iommu/tegra194-smmu.c

-- 
2.1.4



Re: [PATCH v3 1/4] PCI / ACPI: Identify untrusted PCI devices

2018-12-03 Thread Bjorn Helgaas
On Thu, Nov 29, 2018 at 06:51:50PM +0300, Mika Westerberg wrote:
> A malicious PCI device may use DMA to attack the system. An external
> Thunderbolt port is a convenient point to attach such a device. The OS
> may use IOMMU to defend against DMA attacks.
> 
> Recent BIOSes with Thunderbolt ports mark these externally facing root
> ports with this ACPI _DSD [1]:

I'm not 100% comfortable with the "Recent BIOSes" wording because that
suggests that we can rely on the fact that *all* BIOSes newer than
some date X mark these ports.

Since this _DSD usage is Microsoft-specific and not required by either
PCIe or ACPI specs, we can't rely on it.  A BIOS that doesn't
implement it may not be Windows-certified, but it's perfectly
spec-compliant otherwise and we have to keep in mind the possibility
that ports without this _DSD may still be externally visible and may
still be attack vectors.

>   Name (_DSD, Package () {
>   ToUUID ("efcc06cc-73ac-4bc3-bff0-76143807c389"),
>   Package () {
>   Package () {"ExternalFacingPort", 1},
> Package () {"UID", 0 }
>   }
>   })
> 
> If we find such a root port, mark it and all its children as untrusted.
> The rest of the OS may use this information to enable DMA protection
> against malicious devices. For instance the device may be put behind an
> IOMMU to keep it from accessing memory outside of what the driver has
> allocated for it.
> 
> While at it, add a comment on top of prp_guids array explaining the
> possible caveat resulting when these GUIDs are treated equivalent.
> 
> [1] 
> https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
> 
> Signed-off-by: Mika Westerberg 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/acpi/property.c | 11 +++
>  drivers/pci/pci-acpi.c  | 19 +++
>  drivers/pci/probe.c | 15 +++
>  include/linux/pci.h |  8 
>  4 files changed, 53 insertions(+)
> 
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 8c7c4583b52d..77abe0ec4043 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -24,6 +24,14 @@ static int acpi_data_get_property_array(const struct 
> acpi_device_data *data,
>   acpi_object_type type,
>   const union acpi_object **obj);
>  
> +/*
> + * The GUIDs here are made equivalent to each other in order to avoid extra
> + * complexity in the properties handling code, with the caveat that the
> + * kernel will accept certain combinations of GUID and properties that are
> + * not defined without a warning. For instance if any of the properties
> + * from different GUID appear in a property list of another, it will be
> + * accepted by the kernel. Firmware validation tools should catch these.
> + */
>  static const guid_t prp_guids[] = {
>   /* ACPI _DSD device properties GUID: 
> daffd814-6eba-4d8c-8a91-bc9bbf4aa301 */
>   GUID_INIT(0xdaffd814, 0x6eba, 0x4d8c,
> @@ -31,6 +39,9 @@ static const guid_t prp_guids[] = {
>   /* Hotplug in D3 GUID: 6211e2c0-58a3-4af3-90e1-927a4e0c55a4 */
>   GUID_INIT(0x6211e2c0, 0x58a3, 0x4af3,
> 0x90, 0xe1, 0x92, 0x7a, 0x4e, 0x0c, 0x55, 0xa4),
> + /* External facing port GUID: efcc06cc-73ac-4bc3-bff0-76143807c389 */
> + GUID_INIT(0xefcc06cc, 0x73ac, 0x4bc3,
> +   0xbf, 0xf0, 0x76, 0x14, 0x38, 0x07, 0xc3, 0x89),
>  };
>  
>  static const guid_t ads_guid =
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 921db6f80340..e1949f7efd9c 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -789,6 +789,24 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
>   ACPI_FREE(obj);
>  }
>  
> +static void pci_acpi_set_untrusted(struct pci_dev *dev)
> +{
> + u8 val;
> +
> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
> + return;
> + if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val))
> + return;
> +
> + /*
> +  * These root ports expose PCIe (including DMA) outside of the
> +  * system so make sure we treat them and everything behind as
> +  * untrusted.
> +  */
> + if (val)
> + dev->untrusted = 1;
> +}
> +
>  static void pci_acpi_setup(struct device *dev)
>  {
>   struct pci_dev *pci_dev = to_pci_dev(dev);
> @@ -798,6 +816,7 @@ static void pci_acpi_setup(struct device *dev)
>   return;
>  
>   pci_acpi_optimize_delay(pci_dev, adev->handle);
> + pci_acpi_set_untrusted(pci_dev);
>  
>   pci_acpi_add_pm_notifier(adev, pci_dev);
>   if (!adev->wakeup.flags.valid)
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index b1c05b5054a0..257b9f6f2ebb 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1378,6 +1378,19 @@ static void set_pcie_thunderbolt(struct pci_dev *dev)
>   }
>  }
>  
> +static void set_pcie_untr

Re: [PATCH v18 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-12-03 Thread Stephen Boyd
Quoting Vivek Gautam (2018-12-02 22:43:38)
> On Fri, Nov 30, 2018 at 11:45 PM Will Deacon  wrote:
> >
> > On Thu, Nov 29, 2018 at 08:25:20PM +0530, Vivek Gautam wrote:
> > > clk_bulk_get_all() seems like going only the OF way.
> > > Is there another way here to have something common between ACPI
> > > and OF, and then do the clk_bulk_get?
> >
> > I'd say just go with clk_bulk_get_all() and if somebody really wants to
> > mess with the SMMU clocks on a system booted via ACPI, then it's their
> > problem to solve. My understanding is that the design of IORT makes this
> > next to impossible to solve anyway, because a static table is used and
> > therefore we're unable to run whatever ASL methods need to be invoked to
> > mess with the clocks.
> 
> Sure then. I will respin this patch-series.
> 

Right. The idea is to add non-OF support to clk_bulk_get_all() if/when
we get the requirement. Sounds like we can keep waiting a little longer
for that to happen.



Re: [PATCH V2] mm: Replace all open encodings for NUMA_NO_NODE

2018-12-03 Thread Doug Ledford
On Mon, 2018-11-26 at 17:56 +0530, Anshuman Khandual wrote:
> At present there are multiple places where invalid node number is encoded
> as -1. Even though implicitly understood it is always better to have macros
> in there. Replace these open encodings for an invalid node number with the
> global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like
> 'invalid node' from various places redirecting them to a common definition.
> 
> Signed-off-by: Anshuman Khandual 
> ---
> Changes in V2:
> 
> - Added inclusion of 'numa.h' header at various places per Andrew
> - Updated 'dev_to_node' to use NUMA_NO_NODE instead per Vinod
> 
> Changes in V1: (https://lkml.org/lkml/2018/11/23/485)
> 
> - Dropped OCFS2 changes per Joseph
> - Dropped media/video drivers changes per Hans
> 
> RFC - https://patchwork.kernel.org/patch/10678035/
> 
> Build tested this with multiple cross compiler options like alpha, sparc,
> arm64, x86, powerpc, powerpc64le etc with their default config which might
> not have compiled tested all driver related changes. I will appreciate
> folks giving this a test in their respective build environment.
> 
> All these places for replacement were found by running the following grep
> patterns on the entire kernel code. Please let me know if this might have
> missed some instances. This might also have replaced some false positives.
> I will appreciate suggestions, inputs and review.
> 
> 1. git grep "nid == -1"
> 2. git grep "node == -1"
> 3. git grep "nid = -1"
> 4. git grep "node = -1"
> 
>  drivers/infiniband/hw/hfi1/affinity.c |  3 ++-
>  drivers/infiniband/hw/hfi1/init.c |  3 ++-

For the drivers/infiniband changes,

Acked-by: Doug Ledford 

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH 3/4] dma-debug: Dynamically expand the dma_debug_entry pool

2018-12-03 Thread John Garry

On 03/12/2018 17:28, Robin Murphy wrote:

Certain drivers such as large multi-queue network adapters can use pools
of mapped DMA buffers larger than the default dma_debug_entry pool of
65536 entries, with the result that merely probing such a device can
cause DMA debug to disable itself during boot unless explicitly given an
appropriate "dma_debug_entries=..." option.

Developers trying to debug some other driver on such a system may not be
immediately aware of this, and at worst it can hide bugs if they fail to
realise that dma-debug has already disabled itself unexpectedly by the
time the code of interest gets to run. Even once they do realise, it can
be a bit of a pain to empirically determine a suitable number of
preallocated entries to configure without massively over-allocating.

There's really no need for such a static limit, though, since we can
quite easily expand the pool at runtime in those rare cases that the
preallocated entries are insufficient, which is arguably the least
surprising and most useful behaviour.


Hi Robin,

Do you have an idea on shrinking the pool again when the culprit driver 
is removed, i.e. we have so many unused debug entries now available?


Thanks,
John



Signed-off-by: Robin Murphy 
---
 kernel/dma/debug.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index de5db800dbfc..46cc075aec99 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -47,6 +47,9 @@
 #ifndef PREALLOC_DMA_DEBUG_ENTRIES
 #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
 #endif
+/* If the pool runs out, try this many times to allocate this many new entries */
+#define DMA_DEBUG_DYNAMIC_ENTRIES 256
+#define DMA_DEBUG_DYNAMIC_RETRIES 2

 enum {
dma_debug_single,
@@ -702,12 +705,21 @@ static struct dma_debug_entry *dma_entry_alloc(void)
 {
struct dma_debug_entry *entry;
unsigned long flags;
+   int retry_count;

-   spin_lock_irqsave(&free_entries_lock, flags);
+   for (retry_count = 0; ; retry_count++) {
+   spin_lock_irqsave(&free_entries_lock, flags);
+
+   if (num_free_entries > 0)
+   break;

-   if (list_empty(&free_entries)) {
-   global_disable = true;
spin_unlock_irqrestore(&free_entries_lock, flags);
+
+   if (retry_count < DMA_DEBUG_DYNAMIC_RETRIES &&
+   !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES))
+   continue;
+
+   global_disable = true;
pr_err("debugging out of memory - disabling\n");
return NULL;
}






Re: [PATCH] dma-debug: Kconfig for PREALLOC_DMA_DEBUG_ENTRIES

2018-12-03 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 11:56:11AM +, John Garry wrote:
> On 01/12/2018 16:36, Christoph Hellwig wrote:
>> On Fri, Nov 30, 2018 at 07:39:50PM +, Robin Murphy wrote:
>>> I was assuming the point was to also add something like
>>>
>>> default 131072 if HNS_ENET
>>>
>>> so that DMA debug doesn't require too much thought from the user. If they
>>> still have to notice the overflow message and empirically figure out a
>>> value that does work, rebuilding the kernel each time is far less
>>> convenient than simply adding "dma_debug_entries=..." to their kernel
>>> command line and rebooting, which they can do today. If they do already
>>> know up-front that the default will need overriding and what the
>>> appropriate value is, then the command line still seems seems just as
>>> convenient.
>>
>> I'm not so fond of random drivers changing the defaults.  My idea
>> was rather to have the config option so that the defconfig files for
>> the Hisilicon SOCs with this hardware could select a larger number
>> without making a total mess of the kernel configuration.
>>
>> If we really have to we could do different defaults, but I'd still
>> much rather do this on a arch/platform basis than specific drivers.
>
> As I understand, some drivers could even use much more than this (131072), 
> to the point that I can't imagine we would want to set an arch 
> default to support them. For this HNS_ENET case, it is arm64 specific so it 
> would be an arch defconfig.

But I'm not sure we could always do the right thing for everyone.

I think we might be better off trying to just dynamically allocate
entries when we run out of them instead of coming up with a perfect
number.


Re: [PATCH 0/4] dma-debug: implement dynamic entry allocation

2018-12-03 Thread Christoph Hellwig
On Mon, Dec 03, 2018 at 05:28:05PM +, Robin Murphy wrote:
> The HNS_ENET discussion got me thinking, why not just make DMA debug
> cleverer so that (in terms of basic functionality at least) we don't
> need to worry about driver- or arch-specific configuration at all?
> 
> Patches #2 and #3 are the real meat here - #1 is just some preparatory
> cleanup motivated by moving printks around, and could be split out if
> desired; I kept #4 separate as a possible nice-to-have depending on
> what people think.

Hah, I just thought of this and suggested it a few seconds ago.
So obviously I think it is a good idea, but let me look at the details.


[PATCH 1/4] dma-debug: Use pr_fmt()

2018-12-03 Thread Robin Murphy
Use pr_fmt() to generate the "DMA-API: " prefix consistently. This
results in it being added to a couple of pr_*() messages which were
missing it before, and for the err_printk() calls moves it to the actual
start of the message instead of somewhere in the middle.

Signed-off-by: Robin Murphy 
---

I chose not to refactor the existing split strings for minimal churn here.

 kernel/dma/debug.c | 74 --
 1 file changed, 38 insertions(+), 36 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 231ca4628062..91b84140e4a5 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -17,6 +17,8 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */
 
+#define pr_fmt(fmt)"DMA-API: " fmt
+
 #include 
 #include 
 #include 
@@ -234,7 +236,7 @@ static bool driver_filter(struct device *dev)
error_count += 1;   \
if (driver_filter(dev) &&   \
(show_all_errors || show_num_errors > 0)) { \
-   WARN(1, "%s %s: " format,   \
+   WARN(1, pr_fmt("%s %s: ") format,   \
 dev ? dev_driver_string(dev) : "NULL", \
 dev ? dev_name(dev) : "NULL", ## arg); \
dump_entry_trace(entry);\
@@ -519,7 +521,7 @@ static void active_cacheline_inc_overlap(phys_addr_t cln)
 * prematurely.
 */
WARN_ONCE(overlap > ACTIVE_CACHELINE_MAX_OVERLAP,
- "DMA-API: exceeded %d overlapping mappings of cacheline %pa\n",
+ pr_fmt("exceeded %d overlapping mappings of cacheline %pa\n"),
  ACTIVE_CACHELINE_MAX_OVERLAP, &cln);
 }
 
@@ -614,7 +616,7 @@ void debug_dma_assert_idle(struct page *page)
 
cln = to_cacheline_number(entry);
err_printk(entry->dev, entry,
-  "DMA-API: cpu touching an active dma mapped cacheline [cln=%pa]\n",
+  "cpu touching an active dma mapped cacheline [cln=%pa]\n",
   &cln);
 }
 
@@ -634,7 +636,7 @@ static void add_dma_entry(struct dma_debug_entry *entry)
 
rc = active_cacheline_insert(entry);
if (rc == -ENOMEM) {
-   pr_err("DMA-API: cacheline tracking ENOMEM, dma-debug disabled\n");
+   pr_err("cacheline tracking ENOMEM, dma-debug disabled\n");
global_disable = true;
}
 
@@ -673,7 +675,7 @@ static struct dma_debug_entry *dma_entry_alloc(void)
if (list_empty(&free_entries)) {
global_disable = true;
spin_unlock_irqrestore(&free_entries_lock, flags);
-   pr_err("DMA-API: debugging out of memory - disabling\n");
+   pr_err("debugging out of memory - disabling\n");
return NULL;
}
 
@@ -777,7 +779,7 @@ static int prealloc_memory(u32 num_entries)
num_free_entries = num_entries;
min_free_entries = num_entries;
 
-   pr_info("DMA-API: preallocated %d debug entries\n", num_entries);
+   pr_info("preallocated %d debug entries\n", num_entries);
 
return 0;
 
@@ -850,7 +852,7 @@ static ssize_t filter_write(struct file *file, const char __user *userbuf,
 * switched off.
 */
if (current_driver_name[0])
-   pr_info("DMA-API: switching off dma-debug driver filter\n");
+   pr_info("switching off dma-debug driver filter\n");
current_driver_name[0] = 0;
current_driver = NULL;
goto out_unlock;
@@ -868,7 +870,7 @@ static ssize_t filter_write(struct file *file, const char __user *userbuf,
current_driver_name[i] = 0;
current_driver = NULL;
 
-   pr_info("DMA-API: enable driver filter for driver [%s]\n",
+   pr_info("enable driver filter for driver [%s]\n",
current_driver_name);
 
 out_unlock:
@@ -887,7 +889,7 @@ static int dma_debug_fs_init(void)
 {
dma_debug_dent = debugfs_create_dir("dma-api", NULL);
if (!dma_debug_dent) {
-   pr_err("DMA-API: can not create debugfs directory\n");
+   pr_err("can not create debugfs directory\n");
return -ENOMEM;
}
 
@@ -973,7 +975,7 @@ static int dma_debug_device_change(struct notifier_block 
*nb, unsigned long acti
count = device_dma_allocations(dev, &entry);
if (count == 0)
break;
-   err_printk(dev, entry, "DMA-API: device driver has pending "
+   err_printk(dev, entry, "device driver has pending "
"DMA allocations while released from device "
"[count=%d]\n"
"One of leaked entries details: "
@@ -1

[PATCH 0/4] dma-debug: implement dynamic entry allocation

2018-12-03 Thread Robin Murphy
The HNS_ENET discussion got me thinking, why not just make DMA debug
cleverer so that (in terms of basic functionality at least) we don't
need to worry about driver- or arch-specific configuration at all?

Patches #2 and #3 are the real meat here - #1 is just some preparatory
cleanup motivated by moving printks around, and could be split out if
desired; I kept #4 separate as a possible nice-to-have depending on
what people think.

Robin.


Robin Murphy (4):
  dma-debug: Use pr_fmt()
  dma-debug: Refactor dma_debug_entry allocation
  dma-debug: Dynamically expand the dma_debug_entry pool
  dma-debug: Make leak-like behaviour apparent

 kernel/dma/debug.c | 198 -
 1 file changed, 105 insertions(+), 93 deletions(-)

-- 
2.19.1.dirty



[PATCH 2/4] dma-debug: Refactor dma_debug_entry allocation

2018-12-03 Thread Robin Murphy
Make prealloc_memory() a little more general and robust so that it
serves for runtime reallocations too. The first thing we can do with
that is clean up dma_debug_resize_entries() quite a bit.

Signed-off-by: Robin Murphy 
---
 kernel/dma/debug.c | 95 +++---
 1 file changed, 40 insertions(+), 55 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 91b84140e4a5..de5db800dbfc 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -645,6 +645,39 @@ static void add_dma_entry(struct dma_debug_entry *entry)
 */
 }
 
+static int prealloc_memory(u32 num_entries)
+{
+   struct dma_debug_entry *entry, *next_entry;
+   unsigned long flags;
+   LIST_HEAD(tmp);
+   int i;
+
+   for (i = 0; i < num_entries; ++i) {
+   entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+   if (!entry)
+   goto out_err;
+
+   list_add_tail(&entry->list, &tmp);
+   }
+
+   spin_lock_irqsave(&free_entries_lock, flags);
+   list_splice(&tmp, &free_entries);
+   num_free_entries += num_entries;
+   nr_total_entries += num_entries;
+   spin_unlock_irqrestore(&free_entries_lock, flags);
+
+   return 0;
+
+out_err:
+
+   list_for_each_entry_safe(entry, next_entry, &tmp, list) {
+   list_del(&entry->list);
+   kfree(entry);
+   }
+
+   return -ENOMEM;
+}
+
 static struct dma_debug_entry *__dma_entry_alloc(void)
 {
struct dma_debug_entry *entry;
@@ -714,44 +747,25 @@ int dma_debug_resize_entries(u32 num_entries)
int i, delta, ret = 0;
unsigned long flags;
struct dma_debug_entry *entry;
-   LIST_HEAD(tmp);
-
-   spin_lock_irqsave(&free_entries_lock, flags);
 
if (nr_total_entries < num_entries) {
delta = num_entries - nr_total_entries;
 
-   spin_unlock_irqrestore(&free_entries_lock, flags);
-
-   for (i = 0; i < delta; i++) {
-   entry = kzalloc(sizeof(*entry), GFP_KERNEL);
-   if (!entry)
-   break;
-
-   list_add_tail(&entry->list, &tmp);
-   }
-
-   spin_lock_irqsave(&free_entries_lock, flags);
-
-   list_splice(&tmp, &free_entries);
-   nr_total_entries += i;
-   num_free_entries += i;
+   ret = prealloc_memory(delta);
} else {
delta = nr_total_entries - num_entries;
 
+   spin_lock_irqsave(&free_entries_lock, flags);
for (i = 0; i < delta && !list_empty(&free_entries); i++) {
entry = __dma_entry_alloc();
kfree(entry);
}
+   spin_unlock_irqrestore(&free_entries_lock, flags);
 
nr_total_entries -= i;
+   if (nr_total_entries != num_entries)
+   ret = -EBUSY;
}
-
-   if (nr_total_entries != num_entries)
-   ret = 1;
-
-   spin_unlock_irqrestore(&free_entries_lock, flags);
-
return ret;
 }
 
@@ -763,36 +777,6 @@ int dma_debug_resize_entries(u32 num_entries)
  *   2. Preallocate a given number of dma_debug_entry structs
  */
 
-static int prealloc_memory(u32 num_entries)
-{
-   struct dma_debug_entry *entry, *next_entry;
-   int i;
-
-   for (i = 0; i < num_entries; ++i) {
-   entry = kzalloc(sizeof(*entry), GFP_KERNEL);
-   if (!entry)
-   goto out_err;
-
-   list_add_tail(&entry->list, &free_entries);
-   }
-
-   num_free_entries = num_entries;
-   min_free_entries = num_entries;
-
-   pr_info("preallocated %d debug entries\n", num_entries);
-
-   return 0;
-
-out_err:
-
-   list_for_each_entry_safe(entry, next_entry, &free_entries, list) {
-   list_del(&entry->list);
-   kfree(entry);
-   }
-
-   return -ENOMEM;
-}
-
 static ssize_t filter_read(struct file *file, char __user *user_buf,
   size_t count, loff_t *ppos)
 {
@@ -1038,7 +1022,8 @@ static int dma_debug_init(void)
return 0;
}
 
-   nr_total_entries = num_free_entries;
+   min_free_entries = num_free_entries;
+   pr_info("preallocated %d debug entries\n", nr_total_entries);
 
dma_debug_initialized = true;
 
-- 
2.19.1.dirty



[PATCH 3/4] dma-debug: Dynamically expand the dma_debug_entry pool

2018-12-03 Thread Robin Murphy
Certain drivers such as large multi-queue network adapters can use pools
of mapped DMA buffers larger than the default dma_debug_entry pool of
65536 entries, with the result that merely probing such a device can
cause DMA debug to disable itself during boot unless explicitly given an
appropriate "dma_debug_entries=..." option.

Developers trying to debug some other driver on such a system may not be
immediately aware of this, and at worst it can hide bugs if they fail to
realise that dma-debug has already disabled itself unexpectedly by the
time the code of interest gets to run. Even once they do realise, it can
be a bit of a pain to empirically determine a suitable number of
preallocated entries to configure without massively over-allocating.

There's really no need for such a static limit, though, since we can
quite easily expand the pool at runtime in those rare cases that the
preallocated entries are insufficient, which is arguably the least
surprising and most useful behaviour.

Signed-off-by: Robin Murphy 
---
 kernel/dma/debug.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index de5db800dbfc..46cc075aec99 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -47,6 +47,9 @@
 #ifndef PREALLOC_DMA_DEBUG_ENTRIES
 #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
 #endif
+/* If the pool runs out, try this many times to allocate this many new entries */
+#define DMA_DEBUG_DYNAMIC_ENTRIES 256
+#define DMA_DEBUG_DYNAMIC_RETRIES 2
 
 enum {
dma_debug_single,
@@ -702,12 +705,21 @@ static struct dma_debug_entry *dma_entry_alloc(void)
 {
struct dma_debug_entry *entry;
unsigned long flags;
+   int retry_count;
 
-   spin_lock_irqsave(&free_entries_lock, flags);
+   for (retry_count = 0; ; retry_count++) {
+   spin_lock_irqsave(&free_entries_lock, flags);
+
+   if (num_free_entries > 0)
+   break;
 
-   if (list_empty(&free_entries)) {
-   global_disable = true;
spin_unlock_irqrestore(&free_entries_lock, flags);
+
+   if (retry_count < DMA_DEBUG_DYNAMIC_RETRIES &&
+   !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES))
+   continue;
+
+   global_disable = true;
pr_err("debugging out of memory - disabling\n");
return NULL;
}
-- 
2.19.1.dirty



[RFC 4/4] dma-debug: Make leak-like behaviour apparent

2018-12-03 Thread Robin Murphy
Now that we can dynamically allocate DMA debug entries to cope with
drivers maintaining excessively large numbers of live mappings, a driver
which *does* actually have a bug leaking mappings (and is not unloaded)
will no longer trigger the "DMA-API: debugging out of memory - disabling"
message until it gets to actual kernel OOM conditions, which means it
could go unnoticed for a while. To that end, let's inform the user each
time the pool has grown to a multiple of its initial size, which should
make it apparent that they either have a leak or might want to increase
the preallocation size.

Signed-off-by: Robin Murphy 
---

Tagging this one as RFC since people might think it's silly.

 kernel/dma/debug.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 46cc075aec99..c4759dab0f8c 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -696,6 +696,17 @@ static struct dma_debug_entry *__dma_entry_alloc(void)
return entry;
 }
 
+void __dma_entry_alloc_check_leak(void)
+{
+   u32 tmp = nr_total_entries % nr_prealloc_entries;
+
+   /* Shout each time we tick over some multiple of the initial pool */
+   if (tmp < DMA_DEBUG_DYNAMIC_ENTRIES) {
+   pr_info("dma_debug_entry pool grown to %u00%% - possible mapping leak?\n",
+   (nr_total_entries / nr_prealloc_entries));
+   }
+}
+
 /* struct dma_entry allocator
  *
  * The next two functions implement the allocator for
@@ -716,8 +727,10 @@ static struct dma_debug_entry *dma_entry_alloc(void)
spin_unlock_irqrestore(&free_entries_lock, flags);
 
if (retry_count < DMA_DEBUG_DYNAMIC_RETRIES &&
-   !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES))
+   !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES)) {
+   __dma_entry_alloc_check_leak();
continue;
+   }
 
global_disable = true;
pr_err("debugging out of memory - disabling\n");
-- 
2.19.1.dirty



Re: remove the ->mapping_error method from dma_map_ops V3

2018-12-03 Thread Christoph Hellwig
Does anyone but Linus and Russell have comments on this series?

I'd like to pull it in fairly quickly as I have a fair amount of
work on top of it that I'd like to get into 4.21 as well.


RE: [PATCH v5 02/12] iommu/vt-d: Manage scalalble mode PASID tables

2018-12-03 Thread Liu, Yi L
Hi Joerg,

> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Monday, December 3, 2018 5:44 AM
> To: Lu Baolu 
> Subject: Re: [PATCH v5 02/12] iommu/vt-d: Manage scalalble mode PASID tables
> 
> Hi Baolu,
> 
> On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote:
> > @@ -2482,12 +2482,13 @@ static struct dmar_domain
> *dmar_insert_one_dev_info(struct intel_iommu *iommu,
> > if (dev)
> > dev->archdata.iommu = info;
> >
> > -   if (dev && dev_is_pci(dev) && info->pasid_supported) {
> > +   /* PASID table is mandatory for a PCI device in scalable mode. */
> > +   if (dev && dev_is_pci(dev) && sm_supported(iommu)) {
> 
> This will also allocate a PASID table if the device does not support
> PASIDs, right? Will the table not be used in that case or will the
> device just use the fallback PASID? Isn't it better in that case to have
> no PASID table?

We need to allocate the PASID table in scalable mode. The reason is as
below: in VT-d scalable mode, all address translation is done at PASID
granularity. For requests-with-PASID, the address translation is subjected
to the PASID entry specified by the PASID value in the DMA request.
However, for requests-without-PASID, there is no PASID in the DMA request.
To complete the translation logic, we've introduced the RID2PASID field in
the sm-context-entry in the VT-d 3.0 spec, so that such DMA requests are
subjected to the PASID entry specified by the PASID value in the RID2PASID
field of the sm-context-entry.

So even for a device without PASID support, we need at least one PASID
entry so that its DMA requests (without PASID) can be translated. Thus a
PASID table is needed for such devices.

> 
> > @@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev)
> > return -ENOMEM;
> > INIT_LIST_HEAD(&pasid_table->dev);
> >
> > -   size = sizeof(struct pasid_entry);
> > -   count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id);
> > -   order = get_order(size * count);
> > +   if (info->pasid_supported)
> > +   max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)),
> > + intel_pasid_max_id);
> > +
> > +   size = max_pasid >> (PASID_PDE_SHIFT - 3);
> > +   order = size ? get_order(size) : 0;
> > pages = alloc_pages_node(info->iommu->node,
> > -GFP_ATOMIC | __GFP_ZERO,
> > -order);
> > +GFP_ATOMIC | __GFP_ZERO, order);
> 
> This is a simple data structure allocation path, does it need
> GFP_ATOMIC?

will leave it to Baolu.

> 
>   Joerg

Thanks,
Yi Liu


RE: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Liu, Yi L
Hi Joerg,

> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Monday, December 3, 2018 5:49 AM
> To: Lu Baolu 
> Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor
> support
> 
> On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:
> > -
> > -   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> 0);
> > +   /*
> > +* Need two pages to accommodate 256 descriptors of 256 bits each
> > +* if the remapping hardware supports scalable mode translation.
> > +*/
> > +   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> > +!!ecap_smts(iommu->ecap));
> 
> 
> Same here, does the allocation really need GFP_ATOMIC?

still leave to Baolu.

> 
> >  struct q_inval {
> > raw_spinlock_t  q_lock;
> > -   struct qi_desc  *desc;  /* invalidation queue */
> > +   void*desc;  /* invalidation queue */
> > int *desc_status;   /* desc status */
> > int free_head;  /* first free entry */
> > int free_tail;  /* last free entry */
> 
> Why do you switch the pointer to void* ?

In this patch, there is some code like the snippet below. It computes an
address within the invalidation queue from qi->desc. If qi->desc were
still a struct qi_desc pointer, the pointer arithmetic would be wrong.

+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);

The calculation method was changed so that this unified code logic can
support both 128-bit and 256-bit invalidation descriptors.

Also, the conversation between Baolu and me may help.

https://lore.kernel.org/patchwork/patch/1006756/

> 
>   Joerg

Thanks,
Yi Liu


Re: [PATCH v4 6/8] vfio/mdev: Add iommu place holders in mdev_device

2018-12-03 Thread Kirti Wankhede



On 11/21/2018 2:15 PM, Christoph Hellwig wrote:
> On Wed, Nov 21, 2018 at 02:22:08AM +0530, Kirti Wankhede wrote:
>> It is about how mdev framework can be used by existing drivers. These
>> symbols doesn't use any other exported symbols.
> 
> That is an unfortunate accident of history, but doesn't extend to new
> ones.  It also is another indicator those drivers probably are derived
> works of the Linux kernel and might be in legal trouble one way or
> another.
> 

These symbols just associate the iommu properties of a physical
device with a mdev device; they don't include low-level information.

Thanks, 
Kirti


Re: [PATCH] of/device: add blacklist for iommu dma_ops

2018-12-03 Thread Rob Clark
On Mon, Dec 3, 2018 at 7:45 AM Robin Murphy  wrote:
>
> Hi Rob,
>
> On 01/12/2018 16:53, Rob Clark wrote:
> > This solves a problem we see with drm/msm, caused by getting
> > iommu_dma_ops while we attach our own domain and manage it directly at
> > the iommu API level:
> >
> >[0038] user address but active_mm is swapper
> >Internal error: Oops: 9605 [#1] PREEMPT SMP
> >Modules linked in:
> >CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90
> >Hardware name: xxx (DT)
> >Workqueue: events deferred_probe_work_func
> >pstate: 80c9 (Nzcv daif +PAN +UAO)
> >pc : iommu_dma_map_sg+0x7c/0x2c8
> >lr : iommu_dma_map_sg+0x40/0x2c8
> >sp : ff80095eb4f0
> >x29: ff80095eb4f0 x28: 
> >x27: ffc0f9431578 x26: 
> >x25:  x24: 0003
> >x23: 0001 x22: ffc0fa9ac010
> >x21:  x20: ffc0fab40980
> >x19: ffc0fab40980 x18: 0003
> >x17: 01c4 x16: 0007
> >x15: 000e x14: 
> >x13:  x12: 0028
> >x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> >x9 :  x8 : ffc0fab409a0
> >x7 :  x6 : 0002
> >x5 : 0001 x4 : 
> >x3 : 0001 x2 : 0002
> >x1 : ffc0f9431578 x0 : 
> >Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
> >Call trace:
> > iommu_dma_map_sg+0x7c/0x2c8
> > __iommu_map_sg_attrs+0x70/0x84
> > get_pages+0x170/0x1e8
> > msm_gem_get_iova+0x8c/0x128
> > _msm_gem_kernel_new+0x6c/0xc8
> > msm_gem_kernel_new+0x4c/0x58
> > dsi_tx_buf_alloc_6g+0x4c/0x8c
> > msm_dsi_host_modeset_init+0xc8/0x108
> > msm_dsi_modeset_init+0x54/0x18c
> > _dpu_kms_drm_obj_init+0x430/0x474
> > dpu_kms_hw_init+0x5f8/0x6b4
> > msm_drm_bind+0x360/0x6c8
> > try_to_bring_up_master.part.7+0x28/0x70
> > component_master_add_with_match+0xe8/0x124
> > msm_pdev_probe+0x294/0x2b4
> > platform_drv_probe+0x58/0xa4
> > really_probe+0x150/0x294
> > driver_probe_device+0xac/0xe8
> > __device_attach_driver+0xa4/0xb4
> > bus_for_each_drv+0x98/0xc8
> > __device_attach+0xac/0x12c
> > device_initial_probe+0x24/0x30
> > bus_probe_device+0x38/0x98
> > deferred_probe_work_func+0x78/0xa4
> > process_one_work+0x24c/0x3dc
> > worker_thread+0x280/0x360
> > kthread+0x134/0x13c
> > ret_from_fork+0x10/0x18
> >Code: d284 91000725 6b17039f 5400048a (f9401f40)
> >---[ end trace f22dda57f3648e2c ]---
> >Kernel panic - not syncing: Fatal exception
> >SMP: stopping secondary CPUs
> >Kernel Offset: disabled
> >CPU features: 0x0,22802a18
> >Memory Limit: none
> >
> > The problem is that when drm/msm does it's own iommu_attach_device(),
> > now the domain returned by iommu_get_domain_for_dev() is drm/msm's
> > domain, and it doesn't have domain->iova_cookie.
>
> Does this crash still happen with 4.20-rc? Because as of 6af588fed391 it
> really shouldn't.

for this hw, I'm still on 4.19, although that does look like it would
avoid the issue.

> > We kind of avoided this problem prior to sdm845/dpu because the iommu
> > was attached to the mdp node in dt, which is a child of the toplevel
> > mdss node (which corresponds to the dev passed in dma_map_sg()).  But
> > with sdm845, now the iommu is attached at the mdss level so we hit the
> > iommu_dma_ops in dma_map_sg().
> >
> > But auto allocating/attaching a domain before the driver is probed was
> > already a blocking problem for enabling per-context pagetables for the
> > GPU.  This problem is also now solved with this patch.
>
> s/solved/worked around/
>
> If you want a guarantee of actually getting a specific hardware context
> allocated for a given domain, there needs to be code in the IOMMU driver
> to understand and honour that. Implicitly depending on whatever happens
> to fall out of current driver behaviour on current systems is not a real
> solution.

ok, fair.. but I'll settle for "works" in the absence of better options..

At some level, it would be nice to be able to optionally specify a
context-bank in the iommu bindings.  But not sure how to make that fit
w/ cb allocated per domain.  And I assume I'm the only one who has
this problem?
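One conceivable shape for such a binding (purely hypothetical; no such
specifier exists today, and it's unclear how it would interact with
context banks being allocated per domain):

```dts
gpu@5000000 {
	compatible = "qcom,adreno";
	/* hypothetical: second cell pins a fixed SMMU context bank */
	iommus = <&adreno_smmu 0x0 2>;
};
```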

> > Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in 
> > of_dma_configure
>
> That's rather misleading, since the crash described above depends on at
> least two other major changes which came long after that commit.

Fair, when I realized it was the difference in where the iommu
attaches between dpu1 (sdm845) and everything coming before, I should
have removed the tag.

BR,
-R

> It's not that I don't understand exactly what you want here - just that
> this commit message isn't a coherent justification for that ;)

Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Joerg Roedel
On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:
> -
> - desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, 0);
> + /*
> +  * Need two pages to accommodate 256 descriptors of 256 bits each
> +  * if the remapping hardware supports scalable mode translation.
> +  */
> + desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> +  !!ecap_smts(iommu->ecap));


Same here, does the allocation really need GFP_ATOMIC?

>  struct q_inval {
>   raw_spinlock_t  q_lock;
> - struct qi_desc  *desc;  /* invalidation queue */
> + void*desc;  /* invalidation queue */
>   int *desc_status;   /* desc status */
>   int free_head;  /* first free entry */
>   int free_tail;  /* last free entry */

Why do you switch the pointer to void* ?


Joerg


Re: [PATCH v5 02/12] iommu/vt-d: Manage scalable mode PASID tables

2018-12-03 Thread Joerg Roedel
Hi Baolu,

On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote:
> @@ -2482,12 +2482,13 @@ static struct dmar_domain 
> *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>   if (dev)
>   dev->archdata.iommu = info;
>  
> - if (dev && dev_is_pci(dev) && info->pasid_supported) {
> + /* PASID table is mandatory for a PCI device in scalable mode. */
> + if (dev && dev_is_pci(dev) && sm_supported(iommu)) {

This will also allocate a PASID table if the device does not support
PASIDs, right? Will the table not be used in that case or will the
device just use the fallback PASID? Isn't it better in that case to have
no PASID table?

> @@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev)
>   return -ENOMEM;
>   INIT_LIST_HEAD(&pasid_table->dev);
>  
> - size = sizeof(struct pasid_entry);
> - count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id);
> - order = get_order(size * count);
> + if (info->pasid_supported)
> + max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)),
> +   intel_pasid_max_id);
> +
> + size = max_pasid >> (PASID_PDE_SHIFT - 3);
> + order = size ? get_order(size) : 0;
>   pages = alloc_pages_node(info->iommu->node,
> -  GFP_ATOMIC | __GFP_ZERO,
> -  order);
> +  GFP_ATOMIC | __GFP_ZERO, order);

This is a simple data structure allocation path, does it need
GFP_ATOMIC?



Joerg



Re: [PATCH v2 0/9] iommu: clean up/remove modular stuff from non-modules.

2018-12-03 Thread Joerg Roedel
On Sat, Dec 01, 2018 at 02:19:08PM -0500, Paul Gortmaker wrote:
> Paul Gortmaker (9):
>   iommu: audit and remove any unnecessary uses of module.h
>   iommu/rockchip: Make it explicitly non-modular
>   iommu/msm: Make it explicitly non-modular
>   iommu/mediatek: Make it explicitly non-modular
>   iommu/ipmmu-vmsa: Make it explicitly non-modular
>   iommu/qcom: Make it explicitly non-modular
>   iommu/tegra: Make it explicitly non-modular
>   iommu/arm-smmu: Make arm-smmu explicitly non-modular
>   iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular

Applied all, thanks Paul.


Re: [PATCH v5 0/7] Add virtio-iommu driver

2018-12-03 Thread Auger Eric
Hi Michael,

On 11/27/18 6:16 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 27, 2018 at 06:09:25PM +0100, Auger Eric wrote:
>> Hi Michael,
>>
>> On 11/27/18 5:53 PM, Michael S. Tsirkin wrote:
>>> On Thu, Nov 22, 2018 at 07:37:54PM +, Jean-Philippe Brucker wrote:
 Implement the virtio-iommu driver, following specification v0.9 [1].

 Since v4 [2] I fixed the issues reported by Eric, and added Reviewed-by
 from Eric and Rob. Thanks!

 I changed the specification to fix one inconsistency discussed in v4.
 That the device fills the probe buffer with zeroes is now a "SHOULD"
 instead of a "MAY", since it's the only way for the driver to know if
 the device wrote the status. Existing devices already do this. In
 addition the device now needs to fill the three padding bytes at the
 tail with zeroes.

 You can find Linux driver and kvmtool device on branches
 virtio-iommu/v0.9 [3]. I also lightly tested with Eric's latest QEMU
 device [4].
>>>
>>> I tried to get this to work on my x86 box but without
>>> success. Any hints? Does this have to do with the IORT table?
>>> I think we really should just reserve our own table ID
>>> and avoid the pain of trying to add things to the IORT spec.
>>> I'm reluctant to merge lots of code that I can't easily test.
>>> Again, if we found a way to push more configuration into
>>> virtio config space the problem space would be smaller.
>>
>> You can at least test it with QEMU ARM virt in TCG mode.
> 
> It's slow enough that I generally just focus on KVM.
> 
>> Then I have
>> worked on the IORT integration in PC/Q35 but this is not yet functional.
>> I need to debug what's wrong on guest ACPI probing. I plan to work on
>> this this week.
>>
>> Thanks
>>
>> Eric
> 
> Sounds good. Did you need to make changes to IORT?  I don't remember. If
> yes it would be very easy to just have a virtio specific ACPI table -
> I am assuming ARM guys will be just as hostile to virt changes
> to IORT as they were to minor changes to ARM vIOMMU.

I eventually succeeded in prototyping the IORT integration on x86. Please
find below the branches to use for testing:

- QEMU: https://github.com/eauger/qemu.git v3.1.0-rc3-virtio-iommu-v0.9-x86
  On top of "[RFC v9 00/17] VIRTIO-IOMMU device", I added the IORT build
for pc machine
- LINUX GUEST: https://github.com/eauger/linux.git
virtio-iommu-v0.9-iort-x86
  Jean's virtio-iommu/devel branch + 2 hacks (to be discussed separately)
  Make sure the CONFIG_VIRTIO_IOMMU config is set

Used qemu cmd line featuring -device virtio-iommu-pci,addr=0xa:

./x86_64-softmmu/qemu-system-x86_64 -M
q35,accel=kvm,usb=off,sata=off,dump-guest-core=off -cpu
Haswell,-hle,-rtm -smp 4,sockets=4,cores=1,threads=1 -m 4G -display none
--enable-kvm -serial tcp:localhost:,server -device
virtio-iommu-pci,addr=0xa -trace
events=/home/augere/UPSTREAM/qemu/hw-virtio-iommu -qmp
unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -rtc
base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -realtime
mlock=off -no-hpet -no-shutdown -device
virtio-blk-pci,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,iommu_platform,disable-modern=off,disable-legacy=on
-drive
file=/home/augere/VM/IMAGES/vm0.qcow2,format=qcow2,if=none,id=drv0
-device
virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on
-netdev
tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown
-kernel
/home/augere/VM/BOOT/vmlinuz-4.20.0-rc3-guest-defconfig-virtio-iommu-0.9-x86+
-initrd
/home/augere/VM/BOOT/initrd.img-4.20.0-rc3-guest-defconfig-virtio-iommu-0.9-x86+
-append 'root=/dev/mapper/rhel_dhcp156--238-root ro crashkernel=auto
rd.lvm.lv=rhel_dhcp156-238/root rd.lvm.lv=rhel_dhcp156-238/swap rw
ignore_loglevel console=tty0 console=ttyS0,115200n8 serial acpi=force'
-net none -d guest_errors

I set sata=off, otherwise there are early untranslated transactions that
prevent the guest from booting.

Please let me know if you encounter any issue reproducing the
environment on your side.

Thanks

Eric


> 
> 
>>>
 [1] Virtio-iommu specification v0.9, sources, pdf and diff from v0.8
 git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
 http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
 
 http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.8-v0.9.pdf

 [2] [PATCH v4 0/7] Add virtio-iommu driver
 
 https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031074.html

 [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9
 git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9

 [4] [RFC v9 00/17] VIRTIO-IOMMU device
 https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html

 Jean-Philippe Brucker (7):
   dt-bindings: virtio-mmio: Add IOMMU description
   dt-bindings: virtio: Add virtio-pci-iommu node
   of: Allow the iom

Re: [PATCH] irq_remapping: Remove unused header files

2018-12-03 Thread Joerg Roedel
On Fri, Nov 30, 2018 at 09:35:21AM -0500, Yangtao Li wrote:
> seq_file.h does not need to be included, so remove it.
> 
> Signed-off-by: Yangtao Li 
> ---
>  drivers/iommu/irq_remapping.c | 1 -
>  1 file changed, 1 deletion(-)

Applied, thanks.


Re: [PATCH 0/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist()

2018-12-03 Thread j...@8bytes.org
On Wed, Nov 28, 2018 at 09:23:35AM +, Yoshihiro Shimoda wrote:
> Yoshihiro Shimoda (2):
>   iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC
> revisions
>   iommu/ipmmu-vmsa: add an array of slave devices whitelist

Applied, thanks.


Re: [PATCH] of/device: add blacklist for iommu dma_ops

2018-12-03 Thread Robin Murphy

Hi Rob,

On 01/12/2018 16:53, Rob Clark wrote:

This solves a problem we see with drm/msm, caused by getting
iommu_dma_ops while we attach our own domain and manage it directly at
the iommu API level:

   [0038] user address but active_mm is swapper
   Internal error: Oops: 9605 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90
   Hardware name: xxx (DT)
   Workqueue: events deferred_probe_work_func
   pstate: 80c9 (Nzcv daif +PAN +UAO)
   pc : iommu_dma_map_sg+0x7c/0x2c8
   lr : iommu_dma_map_sg+0x40/0x2c8
   sp : ff80095eb4f0
   x29: ff80095eb4f0 x28: 
   x27: ffc0f9431578 x26: 
   x25:  x24: 0003
   x23: 0001 x22: ffc0fa9ac010
   x21:  x20: ffc0fab40980
   x19: ffc0fab40980 x18: 0003
   x17: 01c4 x16: 0007
   x15: 000e x14: 
   x13:  x12: 0028
   x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
   x9 :  x8 : ffc0fab409a0
   x7 :  x6 : 0002
   x5 : 0001 x4 : 
   x3 : 0001 x2 : 0002
   x1 : ffc0f9431578 x0 : 
   Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
   Call trace:
iommu_dma_map_sg+0x7c/0x2c8
__iommu_map_sg_attrs+0x70/0x84
get_pages+0x170/0x1e8
msm_gem_get_iova+0x8c/0x128
_msm_gem_kernel_new+0x6c/0xc8
msm_gem_kernel_new+0x4c/0x58
dsi_tx_buf_alloc_6g+0x4c/0x8c
msm_dsi_host_modeset_init+0xc8/0x108
msm_dsi_modeset_init+0x54/0x18c
_dpu_kms_drm_obj_init+0x430/0x474
dpu_kms_hw_init+0x5f8/0x6b4
msm_drm_bind+0x360/0x6c8
try_to_bring_up_master.part.7+0x28/0x70
component_master_add_with_match+0xe8/0x124
msm_pdev_probe+0x294/0x2b4
platform_drv_probe+0x58/0xa4
really_probe+0x150/0x294
driver_probe_device+0xac/0xe8
__device_attach_driver+0xa4/0xb4
bus_for_each_drv+0x98/0xc8
__device_attach+0xac/0x12c
device_initial_probe+0x24/0x30
bus_probe_device+0x38/0x98
deferred_probe_work_func+0x78/0xa4
process_one_work+0x24c/0x3dc
worker_thread+0x280/0x360
kthread+0x134/0x13c
ret_from_fork+0x10/0x18
   Code: d284 91000725 6b17039f 5400048a (f9401f40)
   ---[ end trace f22dda57f3648e2c ]---
   Kernel panic - not syncing: Fatal exception
   SMP: stopping secondary CPUs
   Kernel Offset: disabled
   CPU features: 0x0,22802a18
   Memory Limit: none

The problem is that when drm/msm does it's own iommu_attach_device(),
now the domain returned by iommu_get_domain_for_dev() is drm/msm's
domain, and it doesn't have domain->iova_cookie.


Does this crash still happen with 4.20-rc? Because as of 6af588fed391 it 
really shouldn't.



We kind of avoided this problem prior to sdm845/dpu because the iommu
was attached to the mdp node in dt, which is a child of the toplevel
mdss node (which corresponds to the dev passed in dma_map_sg()).  But
with sdm845, now the iommu is attached at the mdss level so we hit the
iommu_dma_ops in dma_map_sg().

But auto allocating/attaching a domain before the driver is probed was
already a blocking problem for enabling per-context pagetables for the
GPU.  This problem is also now solved with this patch.


s/solved/worked around/

If you want a guarantee of actually getting a specific hardware context 
allocated for a given domain, there needs to be code in the IOMMU driver 
to understand and honour that. Implicitly depending on whatever happens 
to fall out of current driver behaviour on current systems is not a real 
solution.



Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in of_dma_configure


That's rather misleading, since the crash described above depends on at 
least two other major changes which came long after that commit.


It's not that I don't understand exactly what you want here - just that 
this commit message isn't a coherent justification for that ;)



Tested-by: Douglas Anderson 
Signed-off-by: Rob Clark 
---
This is an alternative/replacement for [1].  What it lacks in elegance
it makes up for in practicality ;-)

[1] https://patchwork.freedesktop.org/patch/264930/

  drivers/of/device.c | 22 ++
  1 file changed, 22 insertions(+)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index 5957cd4fa262..15ffee00fb22 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev)
return device_add(&ofdev->dev);
  }
  
+static const struct of_device_id iommu_blacklist[] = {

+   { .compatible = "qcom,mdp4" },
+   { .compatible = "qcom,mdss" },
+   { .compatible = "qcom,sdm845-mdss" },
+   { .compatible = "qcom,adreno" },
+   {}
+};
+
  /**
   * of_dma_configure - Setup DMA configuration
   * @dev:  Device to apply DMA configurati

Re: [PATCH] dma-debug: Kconfig for PREALLOC_DMA_DEBUG_ENTRIES

2018-12-03 Thread John Garry

On 01/12/2018 16:36, Christoph Hellwig wrote:

On Fri, Nov 30, 2018 at 07:39:50PM +, Robin Murphy wrote:

I was assuming the point was to also add something like

default 131072 if HNS_ENET

so that DMA debug doesn't require too much thought from the user. If they
still have to notice the overflow message and empirically figure out a
value that does work, rebuilding the kernel each time is far less
convenient than simply adding "dma_debug_entries=..." to their kernel
command line and rebooting, which they can do today. If they do already
know up-front that the default will need overriding and what the
appropriate value is, then the command line still seems seems just as
convenient.


I'm not so fond of random drivers changing the defaults.  My idea
was rather to have the config option so that the defconfig files for
the Hisilicon SOCs with this hardware could select a larger number
without making a total mess of the kernel configuration.

If we really have to we could do different defaults, but I'd still
much rather do this on a arch/platform basis than specific drivers.


As I understand it, some drivers could even use much more than this
(131072), to the point where I can't imagine that we would want to set an
arch default to support them. For this HNS_ENET case, it is arm64
specific, so it would be an arch defconfig.
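A defconfig-selectable option along the lines Christoph suggests might
look like this (a sketch only; the exact name, default, and help text are
assumptions):

```kconfig
config PREALLOC_DMA_DEBUG_ENTRIES
	int "Preallocated DMA-API debugging entries"
	depends on DMA_API_DEBUG
	default 65536
	help
	  Number of dma_debug_entry structures preallocated at boot.
	  Platforms whose drivers create very many mappings (e.g. via an
	  arm64 SoC defconfig) can raise this without touching driver
	  code; dma_debug_entries= on the command line still overrides.
```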


Thanks,
John




Re: PCI passthrough with multiple devices in same iommu group

2018-12-03 Thread gokul cg
Thanks for the info.

Yes, we need to do this as our architecture manages everything from one
super-privileged guest.
The rest of the guests don't have access to these devices.

Let me check with the hardware vendor to see if we can segregate the
SMBus device into a separate iommu group.

-Gokul
On Fri, Nov 30, 2018 at 9:35 PM Alex Williamson wrote:

> On Fri, 30 Nov 2018 12:04:26 +0530
> gokul cg  wrote:
>
> > Thanks for info
> >
> > See inline
> > > On Wed, Nov 28, 2018 at 8:56 PM Alex Williamson
> > > <alex.william...@redhat.com> wrote:
> >
> > > On Wed, 28 Nov 2018 20:21:02 +0530
> > > gokul cg  wrote:
> > >
> > > > Hi Folks,
> > > >
> > > > Please excuse me, I am just writing to you as I could see you had
> > > > made changes regarding iommu and I thought you could help me here.
> > > >
> > > > We are testing virtualization with QEMU-2.7.0 + Linux-4.8+, and we
> > > > are facing an issue while configuring passthrough PCI devices in
> > > > QEMU to pass them to the guest OS.
> > > > We are using the following QEMU arguments to configure a PCI device
> > > > as a passthrough device to the guest, which was working with the
> > > > Linux 3.10 distro (hypervisor: QEMU 1.7.2, Linux: 3.10+).
> > > >
> > > > /usr/bin/qemu-system-x86_64 -name guest_1 -S -machine
> > > > pc-i440fx-1.7,accel=kvm,usb=off -m 49152\
> > > >  -device kvm-pci-assign,host=:00:1f.3 -device
> > > > kvm-pci-assign,host=:09:0e.0  ..
> > >
> > > Legacy KVM device assignment (aka kvm-pci-assign) has been deprecated
> > > for some time now, you'll find it doesn't even exist in newer kernels
> > > and QEMU.
> > >
> > > Understood .
> >
> > > > Here is the error message that we get when we try to configure PCI
> > > > devices as passthrough using the kvm pci assignment method in Linux
> > > > 4.8+ (hypervisor: QEMU 2.7.0, Linux: 4.8+).
> > > >
> > > > which is shown below.
> > > >
> > > > ---log---
> > > >
> > > > From QEMU:
> > > > qemu-system-x86_64: -device kvm-pci-assign,host=:00:1f.3: Failed to
> > > > assign device "(null)": Invalid argument
> > > >
> > > > From dmesg:
> > > > pci-stub :00:1f.3: kvm assign device failed ret -22
> > > >
> > > > ---end
> > > >
> > > > Info about devices (CPU:  Intel(R) Xeon(R) CPU E5-2608L):
> > > >
> > > > root@shining-node:~# lspci -k -s 00:1f.3
> > > > 00:1f.3 SMBus: Intel Corporation C610/X99 series chipset SMBus
> Controller
> > > > (rev 05)
> > > > Subsystem: Intel Corporation C610/X99 series chipset SMBus
> > > > Controller
> > > > Kernel driver in use: pci-stub
> > > > Kernel modules: i2c_i801
> > >
> > > Why are you trying to assign an SMBus controller to a VM?
> > >
> > We want the guest OS to read out EEPROM and sensor devices and manage
> > our chassis. Our I2C devices are connected to the Intel SMBus controller.
>
> It seems fundamentally unsafe to allow a user driver (where QEMU is
> just a userspace driver exposing the device into a VM) to manage the
> host chassis.  Additionally since this set of multifunction devices do
> not expose access control services, we must assume that untranslated
> DMA between the functions is possible and thus group them together.  It
> would be up to the hardware vendor whether the devices are in fact DMA
> isolated to provide an ACS quirk to split this grouping.  Otherwise
> you'll need to provide that isolation as a downstream assumption.
>
> > > root@shining-node:~#
> > > > root@shining-node:~# lspci -k -s  09:0e.0
> > > > 09:0e.0 Network controller: Broadcom Limited Device b680 (rev 12)
> > > > root@shining-node:~#
> > > >
> > > > From a web search I could see that it is because there are multiple
> > > > devices in the same iommu_group that the passthrough device belongs
> > > > to. When I check the iommu_group, I have multiple devices in the same
> > > > group, but all of them are Intel devices under the Wellsburg PCH.
> > >
> > > Nope, kvm-pci-assign doesn't make use of IOMMU groups, more likely just
> > > some state of disrepair as it's deprecated and replaced by vfio-pci,
> > > which does use iommu groups.  So iommu groups will be a problem, but
> > > not in the configuration you're attempting to use above.
> > >
> > The error I pasted above ("pci-stub :00:1f.3: kvm assign device
> > failed ret -22") occurs because iommu_attach_device() fails due to the
> > following check:
> > 1094 int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
> > 1095 {
> > 1096         struct iommu_group *group;
> > 1097         int ret;
> > 1098
> > 1099         group = iommu_group_get(dev);
> > 1100         /* FIXME: Remove this when groups a mandatory for iommu drivers */
> > 1101         if (group == NULL)
> > 1102                 return __iommu_attach_device(domain, dev);
> > 1103
>
> Ah yes, I forget about this since legacy KVM device assignment has been
> deprecated for so long.
>
> > > > root@shining-node:~# ls
> > > > /sys/bus/pci/devices/\:00\:1f.3/iommu_group/devices/
> > > > :00:1f.0  :00:1f

Re: [PATCH] of/device: add blacklist for iommu dma_ops

2018-12-03 Thread Marek Szyprowski
Hi Tomasz,

On 2018-12-03 01:10, Tomasz Figa wrote:
> On Sat, Dec 1, 2018 at 8:54 AM Rob Clark  wrote:
>> This solves a problem we see with drm/msm, caused by getting
>> iommu_dma_ops while we attach our own domain and manage it directly at
>> the iommu API level:
>>
>>   [0038] user address but active_mm is swapper
>>   Internal error: Oops: 9605 [#1] PREEMPT SMP
>>   Modules linked in:
>>   CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90
>>   Hardware name: xxx (DT)
>>   Workqueue: events deferred_probe_work_func
>>   pstate: 80c9 (Nzcv daif +PAN +UAO)
>>   pc : iommu_dma_map_sg+0x7c/0x2c8
>>   lr : iommu_dma_map_sg+0x40/0x2c8
>>   sp : ff80095eb4f0
>>   x29: ff80095eb4f0 x28: 
>>   x27: ffc0f9431578 x26: 
>>   x25:  x24: 0003
>>   x23: 0001 x22: ffc0fa9ac010
>>   x21:  x20: ffc0fab40980
>>   x19: ffc0fab40980 x18: 0003
>>   x17: 01c4 x16: 0007
>>   x15: 000e x14: 
>>   x13:  x12: 0028
>>   x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
>>   x9 :  x8 : ffc0fab409a0
>>   x7 :  x6 : 0002
>>   x5 : 0001 x4 : 
>>   x3 : 0001 x2 : 0002
>>   x1 : ffc0f9431578 x0 : 
>>   Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
>>   Call trace:
>>iommu_dma_map_sg+0x7c/0x2c8
>>__iommu_map_sg_attrs+0x70/0x84
>>get_pages+0x170/0x1e8
>>msm_gem_get_iova+0x8c/0x128
>>_msm_gem_kernel_new+0x6c/0xc8
>>msm_gem_kernel_new+0x4c/0x58
>>dsi_tx_buf_alloc_6g+0x4c/0x8c
>>msm_dsi_host_modeset_init+0xc8/0x108
>>msm_dsi_modeset_init+0x54/0x18c
>>_dpu_kms_drm_obj_init+0x430/0x474
>>dpu_kms_hw_init+0x5f8/0x6b4
>>msm_drm_bind+0x360/0x6c8
>>try_to_bring_up_master.part.7+0x28/0x70
>>component_master_add_with_match+0xe8/0x124
>>msm_pdev_probe+0x294/0x2b4
>>platform_drv_probe+0x58/0xa4
>>really_probe+0x150/0x294
>>driver_probe_device+0xac/0xe8
>>__device_attach_driver+0xa4/0xb4
>>bus_for_each_drv+0x98/0xc8
>>__device_attach+0xac/0x12c
>>device_initial_probe+0x24/0x30
>>bus_probe_device+0x38/0x98
>>deferred_probe_work_func+0x78/0xa4
>>process_one_work+0x24c/0x3dc
>>worker_thread+0x280/0x360
>>kthread+0x134/0x13c
>>ret_from_fork+0x10/0x18
>>   Code: d284 91000725 6b17039f 5400048a (f9401f40)
>>   ---[ end trace f22dda57f3648e2c ]---
>>   Kernel panic - not syncing: Fatal exception
>>   SMP: stopping secondary CPUs
>>   Kernel Offset: disabled
>>   CPU features: 0x0,22802a18
>>   Memory Limit: none
>>
>> The problem is that when drm/msm does its own iommu_attach_device(),
>> now the domain returned by iommu_get_domain_for_dev() is drm/msm's
>> domain, and it doesn't have domain->iova_cookie.
>>
>> We kind of avoided this problem prior to sdm845/dpu because the iommu
>> was attached to the mdp node in dt, which is a child of the toplevel
>> mdss node (which corresponds to the dev passed in dma_map_sg()).  But
>> with sdm845, now the iommu is attached at the mdss level so we hit the
>> iommu_dma_ops in dma_map_sg().
>>
>> But auto allocating/attaching a domain before the driver is probed was
>> already a blocking problem for enabling per-context pagetables for the
>> GPU.  This problem is also now solved with this patch.
>>
>> Fixes: 97890ba9289c ("dma-mapping: detect and configure IOMMU in
>> of_dma_configure")
>> Tested-by: Douglas Anderson 
>> Signed-off-by: Rob Clark 
>> ---
>> This is an alternative/replacement for [1].  What it lacks in elegance
>> it makes up for in practicality ;-)
>>
>> [1] https://patchwork.freedesktop.org/patch/264930/
>>
>>  drivers/of/device.c | 22 ++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/of/device.c b/drivers/of/device.c
>> index 5957cd4fa262..15ffee00fb22 100644
>> --- a/drivers/of/device.c
>> +++ b/drivers/of/device.c
>> @@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev)
>> return device_add(&ofdev->dev);
>>  }
>>
>> +static const struct of_device_id iommu_blacklist[] = {
>> +   { .compatible = "qcom,mdp4" },
>> +   { .compatible = "qcom,mdss" },
>> +   { .compatible = "qcom,sdm845-mdss" },
>> +   { .compatible = "qcom,adreno" },
>> +   {}
>> +};
>> +
>>  /**
>>   * of_dma_configure - Setup DMA configuration
>>   * @dev:   Device to apply DMA configuration
>> @@ -164,6 +172,20 @@ int of_dma_configure(struct device *dev, struct 
>> device_node *np, bool force_dma)
>> dev_dbg(dev, "device is%sbehind an iommu\n",
>> iommu ? " " : " not ");
>>
>> +   /*
>> +* There is at least one case where the driver wants to directly
>> +* manage the IOMMU, but if we end up with iommu dma_ops, that
>> +* int

[PATCH v2] kernel/dma: Fix panic caused by passing swiotlb to command line

2018-12-03 Thread zhe.he
From: He Zhe 

setup_io_tlb_npages does not check input argument before passing it
to isdigit. The argument would be a NULL pointer if "swiotlb", without
its value, is set in command line and thus causes the following panic.

PANIC: early exception 0xe3 IP 10:bb9b8e9f error 0 cr2 0x0
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 
4.19.0-rc3-yocto-standard+ #9
[0.00] RIP: 0010:setup_io_tlb_npages+0xf/0x95
...
[0.00] Call Trace:
[0.00]  do_early_param+0x57/0x8e
[0.00]  parse_args+0x208/0x320
[0.00]  ? rdinit_setup+0x30/0x30
[0.00]  parse_early_options+0x29/0x2d
[0.00]  ? rdinit_setup+0x30/0x30
[0.00]  parse_early_param+0x36/0x4d
[0.00]  setup_arch+0x336/0x99e
[0.00]  start_kernel+0x6f/0x4e6
[0.00]  x86_64_start_reservations+0x24/0x26
[0.00]  x86_64_start_kernel+0x6f/0x72
[0.00]  secondary_startup_64+0xa4/0xb0

This patch adds a check to prevent the panic and, when no value is given,
falls back to a 4MB default.

Signed-off-by: He Zhe 
Cc: sta...@vger.kernel.org
Cc: konrad.w...@oracle.com
Cc: h...@lst.de
Cc: m.szyprow...@samsung.com
Cc: robin.mur...@arm.com

Signed-off-by: He Zhe 
---
v2: Set swiotlb for 4MB by default

 kernel/dma/swiotlb.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e..0e18cd4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -58,6 +58,9 @@
  */
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 
+#define IO_TLB_DEFAULT_MB 4
+#define IO_TLB_DEFAULT_SLABS ((IO_TLB_DEFAULT_MB<<20) >> IO_TLB_SHIFT)
+
 enum swiotlb_force swiotlb_force;
 
 /*
@@ -103,6 +106,14 @@ static int late_alloc;
 static int __init
 setup_io_tlb_npages(char *str)
 {
+   if (!str) {
+   pr_err("No value provided for swiotlb, "
+  "set it to %d slabs for %dMB by default.\n",
+  IO_TLB_DEFAULT_SLABS, IO_TLB_DEFAULT_MB);
+   io_tlb_nslabs = IO_TLB_DEFAULT_SLABS;
+   return 0;
+   }
+
if (isdigit(*str)) {
io_tlb_nslabs = simple_strtoul(str, &str, 0);
/* avoid tail segment of size < IO_TLB_SEGSIZE */
-- 
2.7.4



Re: [PATCH] kernel/dma: Fix panic caused by passing swiotlb to command line

2018-12-03 Thread He Zhe


On 2018/12/2 20:30, Konrad Rzeszutek Wilk wrote:
> On Tue, Oct 30, 2018 at 07:06:10PM +0800, He Zhe wrote:
>>
>> On 2018/10/23 19:14, He Zhe wrote:
>>> On 2018/10/23 03:29, Konrad Rzeszutek Wilk wrote:
 On Sat, Sep 22, 2018 at 08:56:58PM +0800, He Zhe wrote:
> May I have your input?
 Alternatively would it make more sense for it to assume some default
 value?
>>> Maybe, but the original code has no default value and I have no idea
>>> what default value is proper here.
>> Can anyone give some suggestions? Though I'd prefer to do nothing when
>> no option is provided.
> One provided the parameter for a reason. I would just do a small one, say
> 4MB.

OK. Thanks, I'll send v2.

Zhe

>> Thanks,
>> Zhe
>>
>>> Zhe
>>>
> Thanks,
> Zhe
>
> On 2018年09月17日 11:27, zhe...@windriver.com wrote:
>> From: He Zhe 
>>
>> setup_io_tlb_npages does not check input argument before passing it
>> to isdigit. The argument would be a NULL pointer if "swiotlb", without
>> its value, is set in command line and thus causes the following panic.
>>
>> PANIC: early exception 0xe3 IP 10:bb9b8e9f error 0 cr2 0x0
>> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 
>> 4.19.0-rc3-yocto-standard+ #9
>> [0.00] RIP: 0010:setup_io_tlb_npages+0xf/0x95
>> ...
>> [0.00] Call Trace:
>> [0.00]  do_early_param+0x57/0x8e
>> [0.00]  parse_args+0x208/0x320
>> [0.00]  ? rdinit_setup+0x30/0x30
>> [0.00]  parse_early_options+0x29/0x2d
>> [0.00]  ? rdinit_setup+0x30/0x30
>> [0.00]  parse_early_param+0x36/0x4d
>> [0.00]  setup_arch+0x336/0x99e
>> [0.00]  start_kernel+0x6f/0x4e6
>> [0.00]  x86_64_start_reservations+0x24/0x26
>> [0.00]  x86_64_start_kernel+0x6f/0x72
>> [0.00]  secondary_startup_64+0xa4/0xb0
>>
>> This patch adds a check to prevent the panic.
>>
>> Signed-off-by: He Zhe 
>> Cc: sta...@vger.kernel.org
>> Cc: konrad.w...@oracle.com
>> Cc: h...@lst.de
>> Cc: m.szyprow...@samsung.com
>> Cc: robin.mur...@arm.com
>> ---
>>  kernel/dma/swiotlb.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>> index 4f8a6db..46fc34e 100644
>> --- a/kernel/dma/swiotlb.c
>> +++ b/kernel/dma/swiotlb.c
>> @@ -109,6 +109,11 @@ static int late_alloc;
>>  static int __init
>>  setup_io_tlb_npages(char *str)
>>  {
>> +if (!str) {
>> +pr_err("Config string not provided\n");
>> +return -EINVAL;
>> +}
>> +
>>  if (isdigit(*str)) {
>>  io_tlb_nslabs = simple_strtoul(str, &str, 0);
>>  /* avoid tail segment of size < IO_TLB_SEGSIZE */


Re: [PATCH v3 06/15] iommu/mediatek: Add mt8183 IOMMU support

2018-12-03 Thread Yong Wu
Hi Matthias,

 Thanks very much for your review.

On Mon, 2018-12-03 at 00:56 +0100, Matthias Brugger wrote:
> 
> On 17/11/2018 03:35, Yong Wu wrote:
> > The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> > the ARM Short-descriptor like mt8173, and most of the HW registers
> > are the same.
> > 
> > Here list main changes in mt8183:
> > 1) mt8183 has only one M4U HW like mt8173.
> 
> That's a change?

I guess I should use "difference" rather than "changes".
(mt8173 and mt8183 have only one M4U HW, while mt2712 has two.)

> 
> > 2) mt8183 doesn't have its own "bclk" clock; the M4U uses the EMI
> > clock, which has already been enabled before the kernel starts.
> > 3) mt8183 can support DRAM over 4GB, but it doesn't call this "4GB
> > mode".
> > 4) mt8183's pgtable base register (0x0) extends bit[1:0], which
> > represent bit[33:32] of the physical address of the pgtable base. But
> > the standard ttbr0[1] means the S bit, which is enabled by default;
> > hence, we add a mask.
> > 5) mt8183 HW has GALS modules, so the SMI should add "gals" clock
> > support.
> > 6) the larb-id in smi-common has been remapped. This means the
> > larb-id reported in mtk_iommu_isr is not the real larb-id; here
> > is the remapping relationship for mt8183:
> 
> This whole list is an indicator that you will need several enhancements to the
> existing code before you can add mt8183 support.
> Please do so in independent patches (e.g. gals modules, remap logic)

OK. I will separate them into 2 new preparing patches.

> Regarding making bclk optional, I think it would be better to set it to
> NULL for mt8183; otherwise the driver will work with a broken device tree
> on systems that actually need the bclk. The same holds for gals0 and gals1.

OK. From your comment below, I will add a "has_bclk" flag to the platform
data and change the code like this:

 if (data->plat_data->has_bclk) {
	data->bclk = devm_clk_get(dev, "bclk");
	if (IS_ERR(data->bclk))
		return PTR_ERR(data->bclk);
 } else {
	data->bclk = NULL;
 }

> 
> >                M4U
> >                 |
> >         -----------------
> >         |   SMI common   |
> >         -0-7-5-6-1-2-3-4-   <- Id remapped
> >          | | | | | | | |
> >      larb0 larb1 IPU0 IPU1 larb4 larb5 larb6 CCU
> >      disp  vdec  img  cam  venc  img   cam
> > As above, larb0 connects with id 0 in smi-common,
> >           larb1 connects with id 7 in smi-common,
> >           ...
> > For example, if the larb-id reported in mtk_iommu_isr is 7,
> > it is actually larb1 (vdec).
> > 
> > Signed-off-by: Yong Wu 
> > ---
> >  drivers/iommu/mtk_iommu.c | 38 --
> >  drivers/iommu/mtk_iommu.h |  5 +
> >  drivers/memory/mtk-smi.c  | 47 
> > +++
> >  3 files changed, 80 insertions(+), 10 deletions(-)
> > 
> 
> [...]
> 
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index a243047..e5fd8ee 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -39,11 +39,16 @@ enum mtk_iommu_plat {
> > M4U_MT2701,
> > M4U_MT2712,
> > M4U_MT8173,
> > +   M4U_MT8183,
> >  };
> >  
> >  struct mtk_iommu_plat_data {
> > enum mtk_iommu_plat m4u_plat;
> > bool has_4gb_mode;
> > +
> > +   /* The larb-id may be remapped in the smi-common. */
> > +   bool larbid_remap_enable;
> > +   unsigned int larbid_remapped[MTK_LARB_NR_MAX];
> 
> Please add new features to the driver as single patches. Don't add this
> kind of thing in the patch where you enable a new SoC.

OK.

> 
> >  };
> >  
> >  struct mtk_iommu_domain;
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index e37e54b..979153b 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -59,6 +59,7 @@ struct mtk_smi_larb_gen {
> >  struct mtk_smi {
> > struct device   *dev;
> > struct clk  *clk_apb, *clk_smi;
> > +   struct clk  *clk_gals0, *clk_gals1;
> > struct clk  *clk_async; /*only needed by mt2701*/
> > void __iomem*smi_ao_base;
> >  };
> > @@ -93,8 +94,20 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
> > if (ret)
> > goto err_disable_apb;
> >  
> > +   ret = clk_prepare_enable(smi->clk_gals0);
> > +   if (ret)
> > +   goto err_disable_smi;
> > +
> > +   ret = clk_prepare_enable(smi->clk_gals1);
> > +   if (ret)
> > +   goto err_disable_gals0;
> > +
> >  return 0;
> > 
> > +err_disable_gals0:
> > +   clk_disable_unprepare(smi->clk_gals0);
> > +err_disable_smi:
> > +   clk_disable_unprepare(smi->clk_smi);
> >  err_disable_apb:
> > clk_disable_unprepare(smi->clk_apb);
> >  err_put_pm:
> > @@ -104,6 +117,8 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
> >  
> >  static vo

Re: [PATCH v3 13/15] memory: mtk-smi: Get rid of need_larbid

2018-12-03 Thread Yong Wu
On Mon, 2018-12-03 at 00:04 +0100, Matthias Brugger wrote:
> 
> On 17/11/2018 03:35, Yong Wu wrote:
> > The "mediatek,larb-id" has already been parsed in MTK IOMMU driver.
> > It's no need to parse it again in SMI driver. Only clean some codes.
> > This patch is fit for all the current mt2701, mt2712, mt7623, mt8173
> > and mt8183.
> 
> I'm trying to understand why we need the mediatek,larb-id at all. From what I
> understand as long as the mediatek larbs described in the iommu are ordered
> (larb0, larb1, larb2, etc) we actually get the same value as defined by
> mediatek,larb-id. At least this holds for all present implementations.
> 
> On the other hand, I don't see where the mtk_iommu_v1 driver actually
> parses the larb-id. Can you please help me understand:
> 
> 1) why we need the larb-id at all

Actually, only mt2712, which has two M4U HWs, needs "mediatek,larb-id".

These are the larbs in the m4u0/1 dtsi nodes of mt2712:
m4u0 { mediatek,larbs = <&larb0 &larb1 &larb2 &larb3 &larb6>;}
m4u1 { mediatek,larbs = <&larb4 &larb5 &larb7>;}

The ids don't increase one by one, so the M4U has to get the larbid with
the help of "mediatek,larb-id".

(The m4u/smi dtsi patch for mt2712 will be sent with some other modules,
maybe this week.)


> 2) how this will work if we do not parse the larb-id for v1 iommu at all

As you said above, and as I also wrote in the "mediatek,larbs" description
of the dt-binding, the larbid "must sort according to the local arbiter
index". All the M4Us except mt2712 can ignore "mediatek,larb-id"; the v1
iommu should also be OK.

I'm not sure whether we should change [1]. If we only reserve it for
mt2712, we should also change the dtsi files of mt2701 and mt7623, or
keep it as is.

[1]
https://elixir.bootlin.com/linux/v4.20-rc1/source/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt#L20

> 
> Thanks a lot,
> Matthias
> 
> > 

[snip]

