Reparenting a platform device

2012-04-05 Thread Thierry Reding
Hi,

I have a device tree where I have a GART device and a DRM device which uses
the GART. The GART is implemented by an IOMMU driver (tegra-gart) and
requires the user device to be a child of the GART device (it explicitly
checks for this when the user device is attached).

I've tried two alternatives to achieve this: create the GART device in the
user driver's .probe() function and explicitly set the DRM device's parent
to the resulting platform device like so:

gart = platform_device_alloc(...);
...
pdev-dev.parent = gart-dev;

The alternative is to use the device tree to look up the GART device node and
resolve it to the corresponding struct device:

gart_node = of_parse_phandle(drm-dev-of_node, gart-parent, 0);
gart = bus_find_device(drm-dev-bus, NULL, gart_node, match_of_node);

Where match_of_node matches each struct device's .of_node field to the
gart_node.

Both of these variants seem to work, and the DRM device can be properly
attached to the GART device. However, after the DRM driver's .probe() exits,
I get the following error:

[   25.579261] [ cut here ]
[   25.583895] WARNING: at 
/home/thierry.reding/src/kernel/linux-ipmp.git/kernel/mutex-debug.c:78 
debug_mutex_unlock+0x114/0x128()
[   25.595352] Modules linked in: tegra_drm(+) cfbfillrect cfbimgblt 
cfbcopyarea drm_kms_helper drm fb pwm_tegra pwm_bl backlight
[   25.606821] [c0013c54] (unwind_backtrace+0x0/0xf8) from 
[c0027bcc] (warn_slowpath_common+0x4c/0x64)
[   25.616207] [c0027bcc] (warn_slowpath_common+0x4c/0x64) from 
[c0027c00] (warn_slowpath_null+0x1c/0x24)
[   25.625853] [c0027c00] (warn_slowpath_null+0x1c/0x24) from 
[c00667fc] (debug_mutex_unlock+0x114/0x128)
[   25.635508] [c00667fc] (debug_mutex_unlock+0x114/0x128) from 
[c03bee2c] (__mutex_unlock_slowpath+0x84/0x140)
[   25.645680] [c03bee2c] (__mutex_unlock_slowpath+0x84/0x140) from 
[c02149b4] (__driver_attach+0x78/0x90)
[   25.655412] [c02149b4] (__driver_attach+0x78/0x90) from 
[c0213128] (bus_for_each_dev+0x50/0x7c)
[   25.664449] [c0213128] (bus_for_each_dev+0x50/0x7c) from 
[c0214038] (bus_add_driver+0x174/0x234)
[   25.673572] [c0214038] (bus_add_driver+0x174/0x234) from 
[c0214e7c] (driver_register+0x78/0x12c)
[   25.682696] [c0214e7c] (driver_register+0x78/0x12c) from 
[c0008620] (do_one_initcall+0x34/0x174)
[   25.691826] [c0008620] (do_one_initcall+0x34/0x174) from 
[c006dab0] (sys_init_module+0xc20/0x18e0)
[   25.701131] [c006dab0] (sys_init_module+0xc20/0x18e0) from 
[c000df00] (ret_fast_syscall+0x0/0x30)
[   25.710336] ---[ end trace f64968a4a9d9fe8b ]---

I may be misinterpreting things, but this looks to be caused by the
reparenting. So the question is whether what I'm trying to do is valid or
if there is some other way to properly make the GART device the parent of
the DRM device.

Thanks,
Thierry


pgpWLTIkYmdQH.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Reparenting a platform device

2012-04-05 Thread Thierry Reding
* Thierry Reding wrote:
 * Stephen Warren wrote:
  On 04/05/2012 02:42 AM, Thierry Reding wrote:
   Hi,
   
   I have a device tree where I have a GART device and a DRM device which 
   uses
   the GART. The GART is implemented by an IOMMU driver (tegra-gart) and
   requires the user device to be a child of the GART device (it explicitly
   checks for this when the user device is attached).
  
  Isn't this wrong?
  
  I would expect the device parent/child relationship to reflect the
  CPU-initiated register access bus topology.
  
  A device's interaction with an IOMMU is an aspect of a device's
  initiating accesses itself, not CPU-initiated register accesses.
 
 Actually I have no idea why this was made a requirement. Maybe Hiroshi can
 comment on this. The driver really only needs this to basically obtain a
 pointer to itself. The MSM I/O MMU implementation does something similar,
 though, and goes on to register actual child devices (they are instantiated
 in arch/arm/mach-msm/devices-iommu.c). Each of those devices is then assigned
 a specific memory area it seems.
[...]
 I don't think having more than one device using the IOMMU will work properly
 anyway in the context of the Tegra GART driver because there is not means to
 allocate specific regions of the GART aperture to individual devices. So
 really the one and only client actually needs to manage the allocations from
 the GART aperture.

I've been thinking about this some more and it occurred to me that maybe it
would be best to add an additional layer between the GART and the clients
which could manage the allocations. Such a virtual device could actually be
registered as a child of the GART device and allow individual drivers to
request mappings from it so that there is a single instance handing them out
and keep them consistent.

How does that sound?

Thierry


pgpqlS0SBSpSZ.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Reparenting a platform device

2012-04-06 Thread Thierry Reding
* Stephen Warren wrote:
 I must admit I'm not at all familiar with the IOMMU APIs, but isn't the
 IOMMU driver/subsystem itself what is managing all the allocations and
 handing them out to clients? And client drivers do things like asking
 for N pages of memory mapped into their aperture? If that is true, I'm
 not sure what the purpose is of the intermediate device you propose.
 Sorry for being somewhat clueless.

That was my impression too at first. But it seems like all the IOMMU
subsystem does is map individual pages. So you pass it a physical address of
a page along with the IO virtual address to map it to. That's at least the
way it is handled for Tegra 2 GART (and from a quick look also by the Tegra 3
SMMU). For the SMMU the situation may be different because it may not have a
fixed aperture that needs to contain the IO virtual addresses, though I must
admit I haven't looked at Tegra 3 in enough detail to judge this.

So this intermediate device would be purely an allocator for the GART
aperture and handle the actual mapping via the IOMMU. This would probably be
really simple and is in fact now done in the DRM driver. The nice thing if
this would be a separate device is that it would be easy to make it a child
of the IOMMU and wouldn't be as counter-intuitive as making the DRM device
the IOMMU's child.

Thierry


pgpuEzI7BRvlC.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Reparenting a platform device

2012-04-07 Thread Thierry Reding
* Grant Likely wrote:
 On Thu, 5 Apr 2012 10:42:58 +0200, Thierry Reding 
 thierry.red...@avionic-design.de wrote:
  Hi,
  
  I have a device tree where I have a GART device and a DRM device which uses
  the GART. The GART is implemented by an IOMMU driver (tegra-gart) and
  requires the user device to be a child of the GART device (it explicitly
  checks for this when the user device is attached).
  
  I've tried two alternatives to achieve this: create the GART device in the
  user driver's .probe() function and explicitly set the DRM device's parent
  to the resulting platform device like so:
  
  gart = platform_device_alloc(...);
  ...
  pdev-dev.parent = gart-dev;
 
 Yeah, don't *ever* try to do this.  The device hierarchy is a complex
 data structure which must never be directly manipulated.
 
  
  The alternative is to use the device tree to look up the GART device node 
  and
  resolve it to the corresponding struct device:
  
  gart_node = of_parse_phandle(drm-dev-of_node, gart-parent, 0);
  gart = bus_find_device(drm-dev-bus, NULL, gart_node, match_of_node);
  
  Where match_of_node matches each struct device's .of_node field to the
  gart_node.
  
  Both of these variants seem to work, and the DRM device can be properly
  attached to the GART device. However, after the DRM driver's .probe() exits,
  I get the following error:
 
 I don't understand what you're trying to describe here as the 2nd
 option.
 
 Regardless, reparenting should not ben the solution at all.  What does
 the device tree that you envision look like for this?  What devices
 are created, and what drivers bind to them?

The reason why I need to reparent at all is because the IOMMU driver requires
the user to be a child of the IOMMU device. If you look at the driver in
drivers/iommu/tegra-gart.c you'll see that it references dev-parent in
several places, most notably in the gart_iommu_attach_dev() function. So
there's really only two options that I can see: 1) create a virtual device
that is a child of the GART and is in charge of the actual allocations from
the GART and have the DRM driver use that interface or 2) change the GART
driver's behaviour in a way that the parent/child relationship is no longer a
requirement.

1) has the advantage of providing a central allocation manager for the GART
and will allow to register multiple clients with the GART. 2) does not have
that advantage.

Another alternative may be to allow only a single device to attach to the
GART that doesn't have to be a child of the GART. That way the DRM could take
care of GART aperture allocations, which seems to be the most logical place
to do that anyway.

Thierry


pgpt675R8iKev.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Reparenting a platform device

2012-04-08 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/06/2012 01:20 AM, Thierry Reding wrote:
  * Stephen Warren wrote:
  I must admit I'm not at all familiar with the IOMMU APIs, but isn't the
  IOMMU driver/subsystem itself what is managing all the allocations and
  handing them out to clients? And client drivers do things like asking
  for N pages of memory mapped into their aperture? If that is true, I'm
  not sure what the purpose is of the intermediate device you propose.
  Sorry for being somewhat clueless.
  
  That was my impression too at first. But it seems like all the IOMMU
  subsystem does is map individual pages. So you pass it a physical address of
  a page along with the IO virtual address to map it to. That's at least the
  way it is handled for Tegra 2 GART (and from a quick look also by the Tegra 
  3
  SMMU). For the SMMU the situation may be different because it may not have a
  fixed aperture that needs to contain the IO virtual addresses, though I must
  admit I haven't looked at Tegra 3 in enough detail to judge this.
  
  So this intermediate device would be purely an allocator for the GART
  aperture and handle the actual mapping via the IOMMU. This would probably be
  really simple and is in fact now done in the DRM driver. The nice thing if
  this would be a separate device is that it would be easy to make it a child
  of the IOMMU and wouldn't be as counter-intuitive as making the DRM device
  the IOMMU's child.
 
 OK, I see.
 
 I'll defer to Hiroshi here; he's far more familiar with all this than I am.
 
 My only comment is that I'd be surprised to see any kind of memory
 allocator represented as a driver (as opposed to a utility module) since
 it's not really representing an actual hardware block.

I agree. In that case I don't see any other solution than to remove the
requirement of the parent/child relationship from the GART driver. It seems
like the SMMU driver doesn't have such a requirement so it should be fine.

Thierry


pgpolgZKcDkNj.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[RFC 3/4] drm: fixed: Add dfixed_frac() macro

2012-04-11 Thread Thierry Reding
This commit is taken from the Chromium tree and was originally written
by Robert Morell.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 include/drm/drm_fixed.h |1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/drm_fixed.h b/include/drm/drm_fixed.h
index 4a08a66..0ead502 100644
--- a/include/drm/drm_fixed.h
+++ b/include/drm/drm_fixed.h
@@ -37,6 +37,7 @@ typedef union dfixed {
 #define dfixed_init(A) { .full = dfixed_const((A)) }
 #define dfixed_init_half(A) { .full = dfixed_const_half((A)) }
 #define dfixed_trunc(A) ((A).full  12)
+#define dfixed_frac(A) ((A).full  ((1  12) - 1))
 
 static inline u32 dfixed_floor(fixed20_12 A)
 {
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC 2/4] iommu: tegra/gart: Add device tree support

2012-04-11 Thread Thierry Reding
This commit adds device tree support for the GART hardware available on
NVIDIA Tegra 20 SoCs.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 arch/arm/boot/dts/tegra20.dtsi |6 ++
 arch/arm/mach-tegra/board-dt-tegra20.c |1 +
 drivers/iommu/tegra-gart.c |   10 ++
 3 files changed, 17 insertions(+)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 55b28fd..cf3ff41 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -212,5 +212,11 @@
interrupts =  0 97 0x04 ;
phy_type = utmi;
};
+
+   gart: gart@7000f000 {
+   compatible = nvidia,tegra20-gart;
+   reg =  0x7000f000 0x0100/* controller registers */
+   0x5800 0x0200 ; /* GART aperture */
+   };
 };
 
diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
b/arch/arm/mach-tegra/board-dt-tegra20.c
index 57745e6..bffba1b 100644
--- a/arch/arm/mach-tegra/board-dt-tegra20.c
+++ b/arch/arm/mach-tegra/board-dt-tegra20.c
@@ -66,6 +66,7 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata = {
OF_DEV_AUXDATA(nvidia,tegra20-ehci, TEGRA_USB3_BASE, tegra-ehci.2,
   tegra_ehci3_pdata),
OF_DEV_AUXDATA(nvidia,tegra20-pwm, TEGRA_PWFM_BASE, tegra-pwm, 
NULL),
+   OF_DEV_AUXDATA(nvidia,tegra20-gart, TEGRA_MC_BASE, tegra-gart, 
NULL),
{}
 };
 
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index f6bc1e6..4a571b7 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -29,6 +29,7 @@
 #include linux/device.h
 #include linux/io.h
 #include linux/iommu.h
+#include linux/of.h
 
 #include asm/cacheflush.h
 
@@ -422,6 +423,13 @@ const struct dev_pm_ops tegra_gart_pm_ops = {
.resume = tegra_gart_resume,
 };
 
+#ifdef CONFIG_OF
+static struct of_device_id tegra_gart_of_match[] __devinitdata = {
+   { .compatible = nvidia,tegra20-gart, },
+   { },
+};
+#endif
+
 static struct platform_driver tegra_gart_driver = {
.probe  = tegra_gart_probe,
.remove = tegra_gart_remove,
@@ -429,6 +437,7 @@ static struct platform_driver tegra_gart_driver = {
.owner  = THIS_MODULE,
.name   = tegra-gart,
.pm = tegra_gart_pm_ops,
+   .of_match_table = of_match_ptr(tegra_gart_of_match),
},
 };
 
@@ -448,4 +457,5 @@ module_exit(tegra_gart_exit);
 
 MODULE_DESCRIPTION(IOMMU API for GART in Tegra20);
 MODULE_AUTHOR(Hiroshi DOYU hd...@nvidia.com);
+MODULE_ALIAS(platform:tegra-gart);
 MODULE_LICENSE(GPL v2);
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC 1/4] iommu: tegra/gart: use correct gart_device

2012-04-11 Thread Thierry Reding
From: Vandana Salve vsa...@nvidia.com

Pass the correct gart device pointer.

Reviewed-by: Vandana Salve vsa...@nvidia.com
Tested-by: Vandana Salve vsa...@nvidia.com
Reviewed-by: Hiroshi Doyu hd...@nvidia.com
Reviewed-by: Bharat Nihalani bnihal...@nvidia.com
Signed-off-by: Hiroshi DOYU hd...@nvidia.com
---
 drivers/iommu/tegra-gart.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 779306e..f6bc1e6 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -158,7 +158,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
struct gart_client *client, *c;
int err = 0;
 
-   gart = dev_get_drvdata(dev-parent);
+   gart = gart_handle;
if (!gart)
return -EINVAL;
domain-priv = gart;
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC 0/4] Add NVIDIA Tegra DRM support

2012-04-11 Thread Thierry Reding
* Hiroshi Doyu wrote:
 From: Thierry Reding thierry.red...@avionic-design.de
[...]
  Thierry Reding (3):
iommu: tegra/gart: Add device tree support
drm: fixed: Add dfixed_frac() macro
drm: Add NVIDIA Tegra support
  
  Vandana Salve (1):
iommu: tegra/gart: use correct gart_device
 
 I guess that the following 2 patches can be merged now, independet of
 the rest of drm patches.
 
   [RFC 1/4] iommu: tegra/gart: use correct gart_device
   [RFC 2/4] iommu: tegra/gart: Add device tree support

Yes, I can probably post those separately so they can be applied
independently. Should they go in via the IOMMU tree or the Tegra
tree?

Thierry


pgpT7xc0RiJOr.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 0/4] Add NVIDIA Tegra DRM support

2012-04-11 Thread Thierry Reding
* Alan Cox wrote:
 On Wed, 11 Apr 2012 14:10:26 +0200
 Thierry Reding thierry.red...@avionic-design.de wrote:
 
  This series adds a basic DRM driver for NVIDIA Tegra 2 processors. It
  currently only supports the RGB output and I've successfully tested it
  against the fbcon kernel module and the xf86-video-modesetting driver.
  The code uses the Tegra's IOMMU/GART to remap non-contiguous memory.
  This means that currently video memory is limited to 32 MB, the size of
  the GART aperture.
 
 You should only need continguous memory with GEM for the framebuffer /
 console bits via /dev/fb. In theory the fb layer can be taught to hanndle
 non linear framebuffers but nobody has yet done so. (Now there's a GSOC
 project ... ;))
 
 What we do on GMA500 is to allocate the kernel framebuffer from linearly
 mapped memory but the normal GEM objects from anywhere as the GEM mapping
 into userspace will deal with presenting it to user space as a virtually
 linear buffer.

That's actually what the driver does as well. It uses the shmfs-backed memory
provided by GEM and only maps it through the GART to provide a linear view
for the display controller which cannot do scatter-gather I/O. Neither the
kernel nor the user-space mapping go through the GART.

You are right that for anything but the framebuffer objects it probably
doesn't matter. It's a little hard to tell because I don't have any
documentation for either the 2D nor 3D engines. Eventually a better option
might be to use stolen memory instead of remapping it through the GART.

Thierry


pgpn0gy1X7Mqj.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-11 Thread Thierry Reding
* Daniel Vetter wrote:
 On Wed, Apr 11, 2012 at 04:11:08PM +0200, Thierry Reding wrote:
  * Daniel Vetter wrote:
   On Wed, Apr 11, 2012 at 03:23:26PM +0200, Thierry Reding wrote:
* Daniel Vetter wrote:
 On Wed, Apr 11, 2012 at 02:10:30PM +0200, Thierry Reding wrote:
  This commit adds a very basic DRM driver for NVIDIA Tegra SoCs. It
  currently has rudimentary GEM support and can run a console on the
  framebuffer as well as X using the xf86-video-modesetting driver.
  Only the RGB output is supported. Quite a lot of things still need
  to be worked out and there is a lot of room for cleanup.
 
 Indeed, after a quick look there are tons of functions that are just 
 stubs
 ;-) One thing I wonder though is why you directly use the iommu api 
 and
 not wrap it up into dma_map? Is arm infrastructure just not there yet 
 or
 do you plan to tightly integrate the tegra drm with the iommu (e.g. 
 for
 process space switching or similarly funky stuff)?

I'm not sure I know what you are referring to. Looking for all users of
iommu_map() doesn't turn up anything related to dma_map. Can you point 
me in
the right direction?
   
   Well, you use the iommu api to map/unmap memory into the iommu for tegra,
   whereas usually device drivers just use the dma api to do that. The usual
   interface is dma_map_sg/dma_unmap_sg, but there are quite a few variants
   around. I'm just wondering why this you've choosen this.
  
  I don't think this works on ARM. Maybe I'm not seeing the whole picture but
  judging by a quick look through the kernel tree there aren't any users that
  map DMA memory through an IOMMU.
  
  Maybe your question is answered by my reply to Alan's comment. The mapping
  is actually done to get a linear view for the display controller which
  doesn't support SG transfers. The kernel and user-space already have virtual
  linear buffers.
 
 Hm, in that case it looks like your iommu works more like the gtt on intel
 chips and less like the iommu on intel platforms (which we access through
 the dma_map api).

Yes, it's very much like the GTT on Intel chips. In fact I've been using the
gma500 driver as a source for inspiration. Wikipedia confirms that GTT and
GART are synonymous.

 I wonder whether that will end up in some layering fun together with
 dma_buf, which conceptually is at the same level as the dma api, which
 usually uses an underlying iommu exposed with the iommu api you're using.

That's odd. The only users of the IOMMU API that I can find in the kernel
tree are in drivers/remoteproc and drivers/media/video/omap3isp. And omap3isp
doesn't do any actual mapping at a quick glance. Can you point me to where
this is hooked up with the Intel IOMMU?

Thierry


pgp6tBMalmVXB.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-11 Thread Thierry Reding
* Alan Cox wrote:
  Maybe your question is answered by my reply to Alan's comment. The mapping
  is actually done to get a linear view for the display controller which
  doesn't support SG transfers. The kernel and user-space already have virtual
  linear buffers.
 
 The framebuffer currently needs a physically contiguous map for the
 console devices. Well you could vmap them but that is pretty hideous on a
 32bit platform with 32bit 1080p display plugged into it!

Heh, vmap() is exactly what I do. =) Would you mind explaining why exactly it
is hideous?

I'll have to investigate what an appropriate alternative would look like.

Thierry


pgp4S1plaNGPQ.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-12 Thread Thierry Reding
* Arnd Bergmann wrote:
 On Wednesday 11 April 2012, Thierry Reding wrote:
  Daniel Vetter wrote:
   Well, you use the iommu api to map/unmap memory into the iommu for tegra,
   whereas usually device drivers just use the dma api to do that. The usual
   interface is dma_map_sg/dma_unmap_sg, but there are quite a few variants
   around. I'm just wondering why this you've choosen this.
  
  I don't think this works on ARM. Maybe I'm not seeing the whole picture but
  judging by a quick look through the kernel tree there aren't any users that
  map DMA memory through an IOMMU.
 
 dma_map_sg is certainly the right interface to use, and Marek Szyprowski has
 patches to make that work on ARM, hopefully going into v3.5, so you could
 use those.

I've looked at Marek's patches but I don't think they'll work for Tegra 2 or
Tegra 3. The corresponding iommu_map() functions only set one PTE, regardless
of the number of bytes passed to them. However, the Tegra TRM indicates that
mapping needs to be done on a per-page basis so contiguous regions cannot be
combined. I suppose the IOMMU driver would have to be fixed to program more
than a single page in that case.

Also this doesn't yet solve the vmap() problem that is needed for the kernel
virtual mapping. I did try using dma_alloc_writecombine(), but that only
works for chunks of 2 MB or smaller, unless I use init_consistent_dma_size()
during board setup, which isn't provided for in a DT setup. I couldn't find
a better alternative, but I admit I'm not very familiar with all the VM APIs.
Do you have any suggestions on how to solve this? Otherwise I'll try and dig
in some more.

Thierry


pgpN1I5gnKtdt.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 3/4] drm: fixed: Add dfixed_frac() macro

2012-04-12 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/11/2012 06:10 AM, Thierry Reding wrote:
  This commit is taken from the Chromium tree and was originally written
  by Robert Morell.
 
 Maybe just cherry-pick it from there? That way, the git authorship will
 show up as Robert.

I can do that. Though I'll have to remove the Chromium-specific tags.

Thierry


pgpernz1zQbKU.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 2/4] iommu: tegra/gart: Add device tree support

2012-04-12 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/11/2012 06:10 AM, Thierry Reding wrote:
  This commit adds device tree support for the GART hardware available on
  NVIDIA Tegra 20 SoCs.
  
  Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
  ---
   arch/arm/boot/dts/tegra20.dtsi |6 ++
   arch/arm/mach-tegra/board-dt-tegra20.c |1 +
   drivers/iommu/tegra-gart.c |   10 ++
   3 files changed, 17 insertions(+)
 
 I think I'd prefer at least the tegra20.dtsi change split out into a
 separate patch so I can queue it in a dt topic branch in the Tegra repo.
 
 It might be a good idea to split this into two on an arch/arm vs.
 drivers/iommu boundary. Looking at Olof's branches for 3.4, that split
 is what happened there.
 
 Finally, there should be a binding documentation file in
 Documentation/devicetree/bindings in order to specify the number of reg
 property entries needed, and their meaning, since there's more than 1
 (even though you did comment it nicely in the .dtsi file)

Okay, I'll do that.

Thierry


pgpd4sliiEaEi.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-12 Thread Thierry Reding
* Sascha Hauer wrote:
 You might want to have a look at the sdrm patches I recently posted to
 dri-devel and arm Linux Kernel. Among other things they allow to
 register crtcs/connectors/encoders seperately so that each of them can
 have its own representation in the devicetree. I haven't looked into
 devicetree support for DRM, but with or without devicetree the problem
 that we do not have a single PCI card for registering all DRM components
 is the same.

I'll do that. One interesting use-case that's been on my mind for some time
is if it would be possible to provide a CRTC via DRM that isn't part of the
SoC or DRM device but which can display a framebuffer prepared by the DRM
framework.

In other words I would like to use the Tegra hardware to render content into
a framebuffer (using potentially the 3D engine or HW accelerated video
decoding blocks) but display that framebuffer with a CRTC registered by a
different driver (perhaps provided by a PCIe or USB device).

I think such a setup would be possible if the CRTC registration can be
decoupled from the DRM driver. Perhaps sdrm even supports that already?

Thierry


pgpv190ioPDy1.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-12 Thread Thierry Reding
* Marek Szyprowski wrote:
[...]
 We already have dma_map_page() and dma_map_single() which are very similar. 
 Maybe adding dma_map_pages() won't be such a bad idea? 
 
 If not maybe we should provide some kind of helper functions which converts 
 page array to scatterlist and then maps them.

drm_prime_pages_to_sg() seems to do exactly that.

Thierry


pgpPDKFjI5TYg.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-12 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/12/2012 12:50 AM, Thierry Reding wrote:
  drm {
  compatible = nvidia,tegra20-drm;
 
 I'm don't think having an explicit drm node is the right approach; drm
 is after all a SW term and the DT should be describing HW. Having some
 kind of top-level node almost certainly makes sense, but naming it
 something related to tegra display than drm would be appropriate.

In this case there really isn't a HW device that can be represented. But in
the end it's still the DRM driver that needs to bind to the device. However
the other graphics devices (MPE, VI/CSI, EPP, GR2D and GR3D) probably need
to be bound against.

Would it be possible for someone at NVIDIA to provide some more details about
what those other devices are? GR2D and GR3D seem obvious, MPE might be video
decoding, VI/CSI video input and camera interface? As to EPP I have no idea.

Maybe one solution would be to have a top-level DRM device with a register
map from 0x5400 to 0x547f, which the TRM designates as host
registers. Then subnodes could be used for the subdevices.

  lvds {
  compatible = ...;
  dc = disp1;
  };
 
 Aren't the outputs separate HW blocks too, such that they would have
 their own compatible/reg properties and their own drivers, and be
 outside the top-level drm/display node?

The RGB output is programmed via the display controller registers. For HDMI,
TVO and DSI there are indeed separate sets of registers in addition to the
display controller's. So perhaps for those more nodes would be required:

hdmi : hdmi@5428 {
compatible = nvidia,tegra20-hdmi;
reg = 0x5428 0x0004;
};

And hook that up with the HDMI output node of the DRM node:

drm {
hdmi {
compatible = ...;
connector = hdmi;
dc = disp2;
};
};

Maybe with this setup we no longer need the compatible property since it
will already be inherent in the connector property. There will have to be
special handling for the RGB output, which could be the default if the
connector property is missing.

 I believe the mapping between the output this node represents and the
 display controller (dc above) that it uses is not static; the
 connectivity should be set up at runtime, and possibly dynamically
 alterable via xrandr or equivalent.

I think the mapping is always static for a given board. There is no way to
switch an HDMI port to LVDS at runtime. But maybe I misunderstand what you're
saying.

  Instead, the active platform data should probably be stored in a
  tegra_drm struct that's stored in the dev's private data.
  tegra_drm_probe() might then look more like:
 
  struct tegra_drm *tdev;
 
  tdev = devm_kzalloc();
  tdev-pdata = pdev-dev.platform_data;
  if (!tdev-pdata)
  tdev-pdata = tegra_drm_parse_dt();
  if (!tdev-pdata)
  return -EINVAL;
 
  dev_set_drvdata(dev, tdev);
 
  This is safe, since probe() will never assume that dev_get_drvdata()
  might contain something valid before probe() sets it.
  
  I prefer my approach over storing the data in an extra field because the
  device platform_data field is where everybody would expect it. Furthermore
  this wouldn't be relevant if we decided not to support non-DT setups.
 
 Drivers are expected to use pre-existing platform data, if provided.
 This might happen in order to work around bugs in device tree content.

Okay I see. I'll have to store it in a separate field in the private
structure then.

Thierry


pgpCkt2Jf8sQa.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-13 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/12/2012 11:44 AM, Thierry Reding wrote:
[...]
 And given that, I don't think we should name the node after some
 OS-specific software concept. Device tree is intended to model hardware.
[...]
  Maybe one solution would be to have a top-level DRM device with a register
  map from 0x5400 to 0x547f, which the TRM designates as host
  registers. Then subnodes could be used for the subdevices.
 
 Ah yes, just what I was thinking above:-)

I came up with the following:

/* host1x */
host1x : host1x@5000 {
reg = 0x5000 0x00024000;
interrupts = 0 64 0x04   /* cop syncpt */
  0 65 0x04   /* mpcore syncpt */
  0 66 0x04   /* cop general */
  0 67 0x04; /* mpcore general */
};

/* graphics host */
graphics@5400 {
compatible = nvidia,tegra20-graphics;

#address-cells = 1;
#size-cells = 1;
ranges = 0 0x5400 0x0800;

host1x = host1x;

/* video-encoding/decoding */
mpe@5404 {
reg = 0x5404 0x0004;
interrupts = 0 68 0x04;
};

/* video input */
vi@5408 {
reg = 0x5408 0x0004;
interrupts = 0 69 0x04;
};

/* EPP */
epp@540c {
reg = 0x540c 0x0004;
interrupts = 0 70 0x04;
}

/* ISP */
isp@5410 {
reg = 0x5410 0x0004;
interrupts = 0 71 0x04;
};

/* 2D engine */
gr2d@5414 {
reg = 0x5414 0x0004;
interrupts = 0 72 0x04;
};

/* 3D engine */
gr3d@5418 {
reg = 0x5418 0x0004;
};

/* display controllers */
disp1 : dc@5420 {
compatible = nvidia,tegra20-dc;
reg = 0x5420 0x0004;
interrupts = 0 73 0x04;
};

disp2 : dc@5424 {
compatible = nvidia,tegra20-dc;
reg = 0x5424 0x0004;
interrupts = 0 74 0x04;
};

/* outputs */
lvds : rgb {
compatible = nvidia,tegra20-rgb;
};

hdmi : hdmi@5428 {
compatible = nvidia,tegra20-hdmi;
reg = 0x5428 0x0004;
interrupts = 0 75 0x04;
};

tvo : tvo@542c {
compatible = nvidia,tegra20-tvo;
reg = 0x542c 0x0004;
interrupts = 0 76 0x04;
};

dsi : dsi@5430 {
compatible = nvidia,tegra20-dsi;
reg = 0x5430 0x0004;
};

display-controllers = disp1 disp2;
outputs = lvds hdmi tvo dsi;

/* initial configuration */
configuration {
lvds {
display-controller = disp1;
output = lvds;
};

hdmi {
display-controller = disp2;
output = hdmi;
};
};
};

I added an additional node for the initial configuration so that the driver
knows which mapping to setup at boot. What I don't quite see yet is where to
attach EDID data or pass the phandle to the I2C controller for DDC/EDID
probing. The initial configuration is certainly not the right place. Perhaps
the outputs property should be made a node instead:

outputs {
lvds_out {
output = lvds;
edid = edid;
};

hdmi_out {
output = hdmi;
ddc = i2c2;
};
};

But then outputs should probably become something like connectors
instead and the initial configuration refers to the _out phandles.

Thierry


pgpV7A9U5ob2r.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/1] iommu/tegra: smmu: Print device name correctly

2012-04-13 Thread Thierry Reding
* Hiroshi Doyu wrote:
 From: Hiroshi DOYU hd...@nvidia.com
 
 Print an attached device name correctly.
 
 Signed-off-by: Hiroshi DOYU hd...@nvidia.com

Reviewed-by: Thierry Reding thierry.red...@avionic-design.de


pgplzIzg1ZxMI.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/1] iommu/tegra: Add device tree support for SMMU

2012-04-13 Thread Thierry Reding
* Hiroshi Doyu wrote:
 From: Hiroshi DOYU hd...@nvidia.com
 
 Add device tree support for Tegra30 IOMMU(SMMU).
 
 Signed-off-by: Hiroshi DOYU hd...@nvidia.com
 ---
  drivers/iommu/tegra-smmu.c |   10 ++
  1 files changed, 10 insertions(+), 0 deletions(-)

I would expect the binding documentation to be included with this patch
instead of the previous one. Stephen?

Thierry


pgpS5PjkA8aJB.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 3/3] ARM: dt: tegra20: Add GART device

2012-04-13 Thread Thierry Reding
This commit adds the device node required to probe NVIDIA Tegra 20 GART
hardware from the device tree.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 arch/arm/boot/dts/tegra20.dtsi |6 ++
 arch/arm/mach-tegra/board-dt-tegra20.c |1 +
 2 files changed, 7 insertions(+)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 55b28fd..cf3ff41 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -212,5 +212,11 @@
interrupts =  0 97 0x04 ;
phy_type = utmi;
};
+
+   gart: gart@7000f000 {
+   compatible = nvidia,tegra20-gart;
+   reg =  0x7000f000 0x0100/* controller registers */
+   0x5800 0x0200 ; /* GART aperture */
+   };
 };
 
diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
b/arch/arm/mach-tegra/board-dt-tegra20.c
index 57745e6..bffba1b 100644
--- a/arch/arm/mach-tegra/board-dt-tegra20.c
+++ b/arch/arm/mach-tegra/board-dt-tegra20.c
@@ -66,6 +66,7 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata = {
OF_DEV_AUXDATA(nvidia,tegra20-ehci, TEGRA_USB3_BASE, tegra-ehci.2,
   tegra_ehci3_pdata),
OF_DEV_AUXDATA(nvidia,tegra20-pwm, TEGRA_PWFM_BASE, tegra-pwm, 
NULL),
+   OF_DEV_AUXDATA(nvidia,tegra20-gart, TEGRA_MC_BASE, tegra-gart, 
NULL),
{}
 };
 
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/3] iommu: tegra/gart: use correct gart_device

2012-04-13 Thread Thierry Reding
From: Vandana Salve vsa...@nvidia.com

Pass the correct gart device pointer.

Reviewed-by: Vandana Salve vsa...@nvidia.com
Tested-by: Vandana Salve vsa...@nvidia.com
Reviewed-by: Hiroshi Doyu hd...@nvidia.com
Reviewed-by: Bharat Nihalani bnihal...@nvidia.com
Signed-off-by: Hiroshi DOYU hd...@nvidia.com
---
 drivers/iommu/tegra-gart.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 779306e..f6bc1e6 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -158,7 +158,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
struct gart_client *client, *c;
int err = 0;
 
-   gart = dev_get_drvdata(dev-parent);
+   gart = gart_handle;
if (!gart)
return -EINVAL;
domain-priv = gart;
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/3] iommu: tegra/gart: Add device tree support

2012-04-13 Thread Thierry Reding
This commit adds device tree support for the GART hardware available on
NVIDIA Tegra 20 SoCs.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 .../devicetree/bindings/iommu/nvidia,tegra20-gart.txt|   14 ++
 drivers/iommu/tegra-gart.c   |   11 +++
 2 files changed, 25 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt 
b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
new file mode 100644
index 000..2d87b91
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
@@ -0,0 +1,14 @@
+NVIDIA Tegra 20 GART
+
+Required properties:
+- compatible: nvidia,tegra20-gart
+- reg: Two pairs of cells specifying the physical address and size of
+  the memory controller registers and the GART aperture respectively.
+
+Example:
+
+   gart: gart@7000f000 {
+   compatible = nvidia,tegra20-gart;
+   reg =  0x7000f000 0x0100/* controller registers */
+   0x5800 0x0200 ; /* GART aperture */
+   };
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index f6bc1e6..40533bb 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -29,6 +29,7 @@
 #include linux/device.h
 #include linux/io.h
 #include linux/iommu.h
+#include linux/of.h
 
 #include asm/cacheflush.h
 
@@ -422,6 +423,14 @@ const struct dev_pm_ops tegra_gart_pm_ops = {
.resume = tegra_gart_resume,
 };
 
+#ifdef CONFIG_OF
+static struct of_device_id tegra_gart_of_match[] __devinitdata = {
+   { .compatible = nvidia,tegra20-gart, },
+   { },
+};
+MODULE_DEVICE_TABLE(of, tegra_gart_of_match);
+#endif
+
 static struct platform_driver tegra_gart_driver = {
.probe  = tegra_gart_probe,
.remove = tegra_gart_remove,
@@ -429,6 +438,7 @@ static struct platform_driver tegra_gart_driver = {
.owner  = THIS_MODULE,
.name   = tegra-gart,
.pm = tegra_gart_pm_ops,
+   .of_match_table = of_match_ptr(tegra_gart_of_match),
},
 };
 
@@ -448,4 +458,5 @@ module_exit(tegra_gart_exit);
 
 MODULE_DESCRIPTION(IOMMU API for GART in Tegra20);
 MODULE_AUTHOR(Hiroshi DOYU hd...@nvidia.com);
+MODULE_ALIAS(platform:tegra-gart);
 MODULE_LICENSE(GPL v2);
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/3] ARM: dt: tegra20: Add GART device

2012-04-14 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/13/2012 07:08 AM, Thierry Reding wrote:
  This commit adds the device node required to probe NVIDIA Tegra 20 GART
  hardware from the device tree.
 
  diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
  b/arch/arm/mach-tegra/board-dt-tegra20.c
 
  +   OF_DEV_AUXDATA(nvidia,tegra20-gart, TEGRA_MC_BASE, tegra-gart, 
  NULL),
 
 That's only needed if lookup tables (e.g. for clocks) require that the
 device have a specific name. I don't believe the GART is mentioned in
 any such lookup tables, so I think you can drop that change.

I'll remove that hunk.

Thierry


pgpL3dIsf3hnM.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v2] ARM: dt: tegra20: Add GART device

2012-04-16 Thread Thierry Reding
This commit adds the device node required to probe NVIDIA Tegra 20 GART
hardware from the device tree.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
Changes in v2:
- drop unneeded of_dev_auxdata entry

 arch/arm/boot/dts/tegra20.dtsi |6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 55b28fd..cf3ff41 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -212,5 +212,11 @@
interrupts =  0 97 0x04 ;
phy_type = utmi;
};
+
+   gart: gart@7000f000 {
+   compatible = nvidia,tegra20-gart;
+   reg =  0x7000f000 0x0100/* controller registers */
+   0x5800 0x0200 ; /* GART aperture */
+   };
 };
 
-- 
1.7.10

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-16 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/15/2012 02:39 AM, Thierry Reding wrote:
  I think I like the former better. The way I understand it the children of 
  the
  graphics node will have to be registered explicitly by the DRM driver 
  because
  of_platform_populate() doesn't work recursively. That would ensure that the
  DRM driver can setup the CRTC and output registries before the other devices
  call back into the DRM to register themselves.
 
 Yes, with the first way, the DRM driver will have to call
 of_platform_populate() recursively to make this work.
 
 The thing here is that the device tree should model hardware, not be
 designed purely to match the device registration needs of the DRM
 driver. I'm not sure that it's correct to model all those devices as
 children of the top-level graphics object; I /think/ all the devices are
 flat on a single bus, and hence not children of each-other. That all
 said, I guess having the nodes as children isn't too far off how the HW
 is designed (even if the register accesses aren't on a child bus, the
 modules at least logically are grouped together in an umbrella
 situation), so I wouldn't push back on the first option above that you
 prefer.

After trying to implement this I'm not so sure anymore that this is the best
approach. I think I'll have to play around with this some more to see what
fits best.

  Boards should still be able to boot and display a console on the standard
  output even if the user doesn't provide a video= option. Shouldn't there be 
  a
  way for a board DTS to specify what the default (or even allowed) 
  connections
  are?
 
 Why wouldn't the default be to light up all outputs that have an
 attached display, or an algorithm something like:
 
 * If internal LCD is present, use that
 * Else, if HDMI display plugged in, use that
 ...

That sounds doable.

  Evaluation hardware like the Harmony might have LVDS, HDMI and VGA 
  connectors
  to provide for a wide range of use cases. The Plutux for instance has only 
  an
  HDMI connector and the Medcom has only LVDS. For the Medcom it would be 
  quite
  confusing for people to suddenly see an HDMI-1 connector pop up f.e. in
  xrandr. It would be equally useless for the Plutux to show up as supporting
  an LVDS or VGA connector.
 
 So the device tree for those devices would disable (or not include) the
 connectors that were not present on the board.

Okay, makes sense.

  Has there been any discussion as to how EDID data would best be represented
  in DT? Should it just be a binary blob or rather some textual 
  representation?
 
 I think a binary blob makes sense - that's the exact same format it'd
 have if read over the DDC I2C bus.

DTC has /incbin/ for that. Is arch/arm/boot/dts still the correct place for
EDID blobs? I could add tegra-medcom.edid if that's okay.

Thierry


pgpHj7Wa0zx4D.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 4/4] drm: Add NVIDIA Tegra support

2012-04-16 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/16/2012 12:48 PM, Thierry Reding wrote:
  * Stephen Warren wrote:
 ...
  Has there been any discussion as to how EDID data would best be 
  represented
  in DT? Should it just be a binary blob or rather some textual 
  representation?
 
  I think a binary blob makes sense - that's the exact same format it'd
  have if read over the DDC I2C bus.
  
  DTC has /incbin/ for that. Is arch/arm/boot/dts still the correct place for
  EDID blobs? I could add tegra-medcom.edid if that's okay.
 
 As far as I know, yes.
 
 Perhaps we'll want to start putting stuff in SoC-specific
 sub-directories given the number of files we'll end up with here
 (irrespective of EDID etc.), but I haven't seen any move towards that yet.

Yes, especially as more machines are moving to DT that directory will soon
become quite cluttered. I suppose a tegra subdirectory wouldn't hurt.

I've been looking about for tools to generate EDID data but didn't find
anything useful. Does anyone know of any tool that's more convenient than
manually filling a struct edid and writing that to a file?

Thierry


pgpF139gJ4WaT.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/1] arm/dts: Tegra30: Add device tree support for SMMU

2012-04-18 Thread Thierry Reding
* Hiroshi Doyu wrote:
 diff --git a/drivers/of/of_dma.c b/drivers/of/of_dma.c
 new file mode 100644
 index 000..1db1ccd
 --- /dev/null
 +++ b/drivers/of/of_dma.c
 @@ -0,0 +1,35 @@
 +/*
 + * Stealed from:

Stolen from

 + *   arch/microblaze/kernel/prom_parse.c
 + *   arch/powerpc/kernel/prom_parse.c
 + */
 +
 +#include linux/of_address.h
 +
 +void of_parse_dma_window(struct device_node *dn, const void *dma_window_prop,
 + unsigned long *busno, unsigned long *phys, unsigned long *size)
 +{
 + const u32 *dma_window;

Should be __be32.

 + u32 cells;
 + const unsigned char *prop;
 +
 + dma_window = dma_window_prop;
 +
 + /* busno is always one cell */
 + if (busno)
 + *busno = *(dma_window++);

This needs endianness conversion:

*busno = be32_to_cpup(dma_window++);

 +
 + prop = of_get_property(dn, #dma-address-cells, NULL);
 + if (!prop)
 + prop = of_get_property(dn, #address-cells, NULL);
 +
 + cells = prop ? *(u32 *)prop : of_n_addr_cells(dn);

Same here.

 + *phys = of_read_number(dma_window, cells);
 +
 + dma_window += cells;
 +
 + prop = of_get_property(dn, #dma-size-cells, NULL);
 + cells = prop ? *(u32 *)prop : of_n_size_cells(dn);

And here.

Thierry


pgpHTGRLdfLIW.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/1] dt: Add general DMA window parser

2012-04-18 Thread Thierry Reding
* Hiroshi Doyu wrote:
 diff --git a/drivers/of/of_dma.c b/drivers/of/of_dma.c
 new file mode 100644
 index 000..45c9e88
 --- /dev/null
 +++ b/drivers/of/of_dma.c
 @@ -0,0 +1,35 @@
 +/*
 + * Stolen from:
 + *   arch/microblaze/kernel/prom_parse.c
 + *   arch/powerpc/kernel/prom_parse.c
 + */
 +
 +#include linux/of_address.h
 +
 +void of_parse_dma_window(struct device_node *dn, const void *dma_window_prop,
 + unsigned long *busno, unsigned long *phys, unsigned long *size)
 +{
 + const __be32 *dma_window;
 + u32 cells;
 + const unsigned char *prop;

There's no need for this to be const unsigned char *, const void * will do
just as well.

 +
 + dma_window = dma_window_prop;
 +
 + /* busno is always one cell */
 + if (busno)
 + *busno = be32_to_cpup(dma_window++);
 +
 + prop = of_get_property(dn, #dma-address-cells, NULL);
 + if (!prop)
 + prop = of_get_property(dn, #address-cells, NULL);
 +
 + cells = prop ? be32_to_cpup(prop) : of_n_addr_cells(dn);
 + *phys = of_read_number(dma_window, cells);

This should probably fail gracefully if phys == NULL, similar to what you do
for busno.

 +
 + dma_window += cells;
 +
 + prop = of_get_property(dn, #dma-size-cells, NULL);
 + cells = prop ? be32_to_cpup(prop) : of_n_size_cells(dn);
 + *size = of_read_number(dma_window, cells);

Same here.

 +}
 +

And you might want to add a EXPORT_SYMBOL(of_parse_dma_window) here so the
function can be used from modules.

Sorry for having you go through another round. I should have looked more
closely before.

Thierry


pgpteip6Gokw2.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 0/4] Add NVIDIA Tegra DRM support

2012-04-19 Thread Thierry Reding
* Dave Airlie wrote:
 On Thu, Apr 19, 2012 at 6:35 PM, Thierry Reding
 thierry.red...@avionic-design.de wrote:
  Before posting the next round of patches I wanted to clarify whether we need
  to take the Tegra driver through staging. Lucas brought this up referring to
  previous experience with Nouveau where Linus used to complain that userspace
  ABI was broken for non-staging drivers. I don't know how relevant that is
  for Tegra. We could also keep it in drivers/gpu/drm and only add userspace
  interfaces that we are sure are not going to change. Currently there isn't
  isn't anything that could be easily broken as only some of the standard DRM
  interfaces are supported anyway.
 
  Alternatively we could keep the driver in a separate tree until it becomes
  mature enough.
 
  Any thoughts?
 
 It's probably okay to avoid staging if it doesn't add any userspace ioctls.
 
 A KMS driver that just supports the dumb ioctls so -modesetting works,
 would be the first thing to aim for I suppose, like how the exynos
 guys did it.
 
 Adding userspace interfaces is where you'll get into ABI guarantees
 etc, and these are normally required only for the accel engines.

I think this is what Lucas was concerned about. The plan is to look at how
much can be used from the Nouveau code and make it work on the Tegra. So
would it be possible to get a basic dumb KMS driver into mainline
(non-staging) and phase in acceleration later on, with ABI guarantees? I
guess development can go on in separate trees until the ABI is stable and can
subsequently be ported to the mainline driver.

Is that an acceptable approach?

Thierry


pgptDMAdOxRkD.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC v2 2/5] tps6586x: Add device tree support

2012-04-25 Thread Thierry Reding
* Thierry Reding wrote:
 * Mark Brown wrote:
  On Wed, Apr 25, 2012 at 11:44:59AM +0200, Thierry Reding wrote:
   This commit adds device tree support for the TPS6586x regulator.
   
   Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
  
  This looks basically good from a quick scan through but the pattern of
  looking up regulator nodes by name is very common so should be factored
  out - I made a similar comment in response to a recent patch from
  Rhyland Klein and earlier today he posted a patch regulator: add
  generic of node parsing for regulators which does just that.  Can you
  please redo this on top of his code?  I'll probably apply it later
  today, though I didn't properly read the code yet.
 
 I'll take a look.

After taking a closer look I don't think Rhyland's patch is very useful for
this driver. I need to lookup the platform ID by regulator name anyway so
using the new code is actually more work and requires a second table that
lists the regulator names only.

Thierry


pgpsHJKEBbZlk.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC v2 3/5] i2c: Add of_i2c_get_adapter() function

2012-04-25 Thread Thierry Reding
* Stephen Warren wrote:
 On 04/25/2012 03:45 AM, Thierry Reding wrote:
  This function resolves an OF device node to an I2C adapter registered
  with the I2C core.
 
 I think this is doing the same thing as a patch I posted recently:
 http://www.spinics.net/lists/linux-i2c/msg07808.html

I wasn't aware of that patch.

 What's the advantage of one way over the other?

Both are fine I think. Your version is shorter but mine may be a little
faster since it doesn't iterate over all devices on the bus. Since the
lookup doesn't happen very frequently this isn't much of an issue.

Thierry


pgpohWML5B7JG.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC v2 5/5] drm: Add NVIDIA Tegra support

2012-05-21 Thread Thierry Reding
* Terje Bergström wrote:
 On 21.05.2012 14:05, Thierry Reding wrote:
 
  I agree. It's really just a simple-bus kind of bus. Except that it'll need
  another compatible value to make the host1x driver bind to it. But we should
  be able to make it work without creating a completely new bus. I guess the
  ranges property could specify 0x5400 as the base address and child
  nodes could use offsets relative to that. Maybe even that is overkill,
  though.
 
 
 Hi,
 
 In the phase where we are just getting display up, yes, the bus is a
 really simple bus. For that, all you need to know about host1x is that
 it needs to be turned on whenever you talk to the display. Having a bus
 seems overkill.
 
 At a latest stage, we're going to need support for host1x channels and
 related infrastructure. In that phase, we need to find good homes for
 channel allocation, job tracking, push buffer management, and other code
 common to all host1x drivers. In the current nvhost driver, it's the job
 of the bus_type and related utilities, and that seems like a good enough
 fit.
 
 I have a bit of hard time understanding the significance of a bus_type
 in the Linux driver model.
 
 On one hand, bus_type allows grouping a set of devices and drivers
 according to the APIs they implement. In nvhost, we have struct
 nvhost_device and struct nvhost_driver, and each host1x client driver
 needs to use them. It's natural to put them in the same bus_type. As a
 bonus, we can assign further common tasks to the bus_type.
 
 On the other hand, implementing a bus_type introduces code almost
 duplicate to platform bus. This is the part that I have hard time with.
 
 The device tree should reflect the target state, as it's going to be an
 API between kernel and boot loader. I believe it'd be best to introduce
 host1x as a bus and put all host1x clients as nodes inside host1x.

I agree that reflecting the hardware hierarchy within the device tree makes
sense. I'll need to add some code to tear down child devices in the cleanup
path, but that shouldn't be too difficult.

However, adding a whole bus_type implementation would pretty much duplicate
the platform bus code. I guess host1x could just provide struct host1x_client
a set of corresponding APIs to create them, request channels, etc.

Thierry


pgp8r3NhkNQ9L.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-26 Thread Thierry Reding
On Tue, Jun 26, 2012 at 04:02:24PM +0300, Hiroshi Doyu wrote:
 Hi Thierry,
 
 On Tue, 26 Jun 2012 12:55:13 +0200
 Thierry Reding thierry.red...@avionic-design.de wrote:
 
  * PGP Signed by an unknown key
  
  Hi,
  
  while I haven't got much time to work on the actual code right now, I
  think it might still be useful if we could get the device tree binding
  to a point where everybody is happy with it. That'll also save me some
  time once I get to writing the code because I won't have to redo it over
  again. =)
  
  So here's the current proposal:
  
  host1x {
  compatible = nvidia,tegra20-host1x, simple-bus;
  reg = 0x5000 0x00024000;
  interrupts = 0 64 0x04   /* cop syncpt */
0 65 0x04   /* mpcore syncpt */
0 66 0x04   /* cop general */
0 67 0x04; /* mpcore general */
  
  #address-cells = 1;
  #size-cells = 1;
  
  ranges = 0x5400 0x5400 0x0400;
  
  status = disabled;
  
  gart = gart;
 ...
  output and hooks up a static EDID block with the LVDS output. There is
  also a carveout property which might be a better replacement for the
  crippled GART on Tegra20. Alternatively the CMA might work just as
  well instead.
  
  The Plutux can be described like this:
  
  host1x {
  carveout = 0x0e00 0x0200;
 
 As discussed in the following ML thread previously, the necessary info
 related to the gart would be got from the standard IOMMU API(or
 something above layers, DMABUF or TTM?). So I don't think that we need
 to refer to gart and carveout here in the end.
 
 http://lists.linuxfoundation.org/pipermail/iommu/2012-June/004266.html

Yes, if IOMMU or some layer above can provide the same information, then
that is certainly better than explicitly referencing it in the DT.

I'm not sure I understand how information about the carveout would be
obtained from the IOMMU API, though.

Thierry


pgp4C7nOVrAkT.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-26 Thread Thierry Reding
On Tue, Jun 26, 2012 at 08:48:18PM -0600, Stephen Warren wrote:
 On 06/26/2012 08:32 PM, Mark Zhang wrote:
  On 06/26/2012 07:46 PM, Mark Zhang wrote:
  On Tue, 26 Jun 2012 12:55:13 +0200
  Thierry Reding thierry.red...@avionic-design.de wrote:
  ...
  I'm not sure I understand how information about the carveout would be
  obtained from the IOMMU API, though.
 
  I think that can be similar with current gart implementation. Define 
  carveout as:
 
  carveout {
  compatible = nvidia,tegra20-carveout;
  size = 0x1000;
  };
 
  Then create a file such like tegra-carveout.c to get these definitions 
  and
  register itself as platform device's iommu instance.
 
  The carveout isn't a HW object, so it doesn't seem appropriate to define a 
  DT
  node to represent it.
  
  Yes. But I think it's better to export the size of carveout as a 
  configurable item.
  So we need to define this somewhere. How about define carveout as a 
  property of gart?
 
 There already exists a way of preventing Linux from using certain chunks
 of memory; the /memreserve/ syntax. From a brief look at the dtc source,
 it looks like /memreserve/ entries can have labels, which implies that a
 property in the GART node could refer to the /memreserve/ entry by
 phandle in order to know what memory regions to use.

That doesn't work, unfortunately. The /memreserve/ label isn't even
stored in the DTB. Even DTC throws an error when you try to reference
the /memreserve/ by label.

Thierry


pgp37tDCQ3gok.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-27 Thread Thierry Reding
On Wed, Jun 27, 2012 at 03:59:07PM +0300, Hiroshi Doyu wrote:
 On Wed, 27 Jun 2012 07:14:18 +0200
 Thierry Reding thierry.red...@avionic-design.de wrote:
 
  * PGP Signed by an unknown key
  
  On Tue, Jun 26, 2012 at 08:48:18PM -0600, Stephen Warren wrote:
   On 06/26/2012 08:32 PM, Mark Zhang wrote:
On 06/26/2012 07:46 PM, Mark Zhang wrote:
On Tue, 26 Jun 2012 12:55:13 +0200
Thierry Reding thierry.red...@avionic-design.de wrote:
...
I'm not sure I understand how information about the carveout would be
obtained from the IOMMU API, though.
   
I think that can be similar with current gart implementation. Define 
carveout as:
   
carveout {
compatible = nvidia,tegra20-carveout;
size = 0x1000;
};
   
Then create a file such like tegra-carveout.c to get these 
definitions and
register itself as platform device's iommu instance.
   
The carveout isn't a HW object, so it doesn't seem appropriate to 
define a DT
node to represent it.

Yes. But I think it's better to export the size of carveout as a 
configurable item.
So we need to define this somewhere. How about define carveout as a 
property of gart?
   
   There already exists a way of preventing Linux from using certain chunks
   of memory; the /memreserve/ syntax. From a brief look at the dtc source,
   it looks like /memreserve/ entries can have labels, which implies that a
   property in the GART node could refer to the /memreserve/ entry by
   phandle in order to know what memory regions to use.
  
  Wasn't the whole point of using a carveout supposed to be a replacement
  for the GART?
 
 Mostly agree. IIUC, we use both carveout/gart allocated buffers in
 android/tegra2.
 
 As such I'd think the carveout should rather be a property
  of the host1x device.
 
 Rather than introducing a new property, how about using
 coherent_pool=??M in the kernel command line if necessary? I think
 that this carveout size depends on the system usage/load.

I was hoping that we could get away with using the CMA and perhaps
initialize it based on device tree content. I agree that the carveout
size depends on the use-case, but I still think it makes sense to
specify it on a per-board basis.

Thierry


pgpyKezJuxBdM.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-27 Thread Thierry Reding
On Wed, Jun 27, 2012 at 05:29:14PM +0300, Hiroshi Doyu wrote:
 On Wed, 27 Jun 2012 16:08:10 +0200
 Thierry Reding thierry.red...@avionic-design.de wrote:
 
  * PGP Signed by an unknown key
  
  On Wed, Jun 27, 2012 at 03:59:07PM +0300, Hiroshi Doyu wrote:
   On Wed, 27 Jun 2012 07:14:18 +0200
   Thierry Reding thierry.red...@avionic-design.de wrote:
   
 Old Signed by an unknown key

On Tue, Jun 26, 2012 at 08:48:18PM -0600, Stephen Warren wrote:
 On 06/26/2012 08:32 PM, Mark Zhang wrote:
  On 06/26/2012 07:46 PM, Mark Zhang wrote:
  On Tue, 26 Jun 2012 12:55:13 +0200
  Thierry Reding thierry.red...@avionic-design.de wrote:
  ...
  I'm not sure I understand how information about the carveout 
  would be
  obtained from the IOMMU API, though.
 
  I think that can be similar with current gart implementation. 
  Define carveout as:
 
  carveout {
  compatible = nvidia,tegra20-carveout;
  size = 0x1000;
  };
 
  Then create a file such like tegra-carveout.c to get these 
  definitions and
  register itself as platform device's iommu instance.
 
  The carveout isn't a HW object, so it doesn't seem appropriate to 
  define a DT
  node to represent it.
  
  Yes. But I think it's better to export the size of carveout as a 
  configurable item.
  So we need to define this somewhere. How about define carveout as a 
  property of gart?
 
 There already exists a way of preventing Linux from using certain 
 chunks
 of memory; the /memreserve/ syntax. From a brief look at the dtc 
 source,
 it looks like /memreserve/ entries can have labels, which implies 
 that a
 property in the GART node could refer to the /memreserve/ entry by
 phandle in order to know what memory regions to use.

Wasn't the whole point of using a carveout supposed to be a replacement
for the GART?
   
   Mostly agree. IIUC, we use both carveout/gart allocated buffers in
   android/tegra2.
   
   As such I'd think the carveout should rather be a property
of the host1x device.
   
   Rather than introducing a new property, how about using
   coherent_pool=??M in the kernel command line if necessary? I think
   that this carveout size depends on the system usage/load.
  
  I was hoping that we could get away with using the CMA and perhaps
  initialize it based on device tree content. I agree that the carveout
  size depends on the use-case, but I still think it makes sense to
  specify it on a per-board basis.
 
 DRM driver doesn't know if it uses CMA or not, because DRM only uses
 DMA API.

So how is the DRM supposed to allocate buffers? Does it call the
dma_alloc_from_contiguous() function to do that? I can see how it is
used by arm_dma_ops but how does it end up in the driver?

 I think that coherent_pool can be used only when the amount of
 contiguous memory is short in your system. Otherwise even unnecessary.
 
 Could you explain a bit more why you want carveout size on per-board basis?

In the ideal case I would want to not have a carveout size at all.
However there may be situations where you need to make sure some driver
can allocate a given amount of memory. Having to specify this using a
kernel command-line parameter is cumbersome because it may require
changes to the bootloader or whatever. So if you know that a particular
board always needs 128 MiB of carveout, then it makes sense to specify
it on a per-board basis.

Thierry


pgpPmtvxpBAia.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-28 Thread Thierry Reding
On Wed, Jun 27, 2012 at 12:02:46PM -0600, Stephen Warren wrote:
 On 06/27/2012 06:44 AM, Hiroshi Doyu wrote:
 ...
  I think that there are 2 cases:
  
(1) discontiguous memory with IOMMU
(2) contiguous memory without IOMMU(called carveout in general?)
 ...
  For (2), although memory is mostly anonymous one, we may need to know
  how much to allocate, where we only need size. This size is not from
  h/w feature, but it depends on the system load/usage. So I think that
  this size can be passed from kernel command line? For exmaple, we can
  specify how much contiguous memory is necessary with putting
  coherent_pool=??M in the kernel command line as below:
  
  coherent_pool=nn[KMG]   [ARM,KNL]
  Sets the size of memory pool for coherent, atomic dma
  allocations.
 
 I guess if that's the standard way of initializing CMA, then that's fine.
 
 It'd be nice if there was a way to specify that from the DT too; that
 way the user/distro/bootloader constructing the kernel command-line
 wouldn't have to remember to add random (potentially
 Tegra-/board-specific) extra arguments onto the command-line; the Tegra
 command-line in the upstream kernel is quite clean right now, especially
 compare to the enormous number of options we require downstream:-(

Looking at Documentation/kernel-parameters.txt it seems the canonical
way to initialize CMA is using the cma kernel command-line parameter.
For device tree we could extend the chosen node to include something
like carveout = 0x0400;. Or contiguous-memory or whatever.

Thierry


pgp4XkmOA7IC3.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-28 Thread Thierry Reding
On Thu, Jun 28, 2012 at 11:33:56AM -0600, Stephen Warren wrote:
 On 06/28/2012 11:19 AM, Lucas Stach wrote:
 ...
  CMA is just a way of providing large contiguous address space blocks in
  a dynamic fashion. ...
  
  TTM though solves more advanced matters, like buffer synchronisation
  between 3D and 2D block of hardware ...
  
  IMHO the best solution would be to use CMA as a flexible replacement of
  the static carveout area and put TTM on top of this ...
 
 Ah right, thanks for the explanation. That makes sense to me now.

Okay. I think that resolves the last open issue. I'll try to get some
more work done on the DT and corresponding code soonish.

For those who don't know yet I've requested the creation of a project
on freedesktop.org for Tegra graphics drivers[0]. I plan to have the DRM
code hosted there once the project has been approved. Furthermore if we
ever get to write a corresponding X driver it can be hosted there as
well. We should also use the wiki for coordination once things get
started.

Thierry

[0]: https://bugs.freedesktop.org/show_bug.cgi?id=51505


pgpeLVQc37fb4.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Tegra DRM device tree bindings

2012-06-30 Thread Thierry Reding
On Fri, Jun 29, 2012 at 04:20:31PM +0300, Terje Bergström wrote:
 On 28.06.2012 20:19, Lucas Stach wrote:
  TTM though solves more advanced matters, like buffer synchronisation
  between 3D and 2D block of hardware or syncing buffer access between GPU
  and CPU.
  One of the most interesting things of TTM is the ability to purge the
  GPU DMA buffers to scattered sysmem or even swap them out, if they are
  not currently used by the GPU. It then makes sure to move them in the
  contig space again when the GPU really needs them and fix up the GPU
  command stream with the new buffer address.
 
 We preferably should choose dma_buf as a common interface towards
 buffers. That way whatever we choose as the memory manager, all dma_buf
 aware drivers will be able to use buffers allocated by other drivers.
 
 We probably need to accommodate multiple memory managers to take care of
 legacy and new drivers. If V4L2 and DRM projects all move to dma_buf, we
 have the possibility to do zero-copy video without forcing everybody to
 use the same memory manager.

I agree. Supporting DMA BUF also doesn't seem very difficult.

 As I understand, TTM is good for platforms where we have a separate
 frame buffer memory, as is the case with most of the graphics cards. In
 Tegra, graphics and CPU occupy the same memory, so I'm not sure if we
 require the level of functionality that TTM provides. I guess the level
 of functionality and the complexity that it brings is one reason why TTM
 hasn't really caught on in the ARM world.
 
 The synchronization primitives attached to TTM are slightly confusing.
 At the bottom level, it's operations which need to be synchronized
 between each other. That's the API level that we should to export from
 kernel to user space. It's then up to libdrm level (or whatever is doing
 the rendering in user space) to decide which operations it wants to have
 completed before a buffer can be reused/read/passed on to the next stage.
 
 Anyway, if we hide the memory manager behind dma_buf, we're free to muck
 around with multiple of them and see what works best.

Exactly. Other subthreads echo this as well. Using CMA seems the easiest
and most flexible for now but still covers everything we need. If it
turns out that it isn't suited for more advanced stuff once we start
supporting 3D then we can still opt for something like TTM.

Thierry


pgpy134uZ99uM.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 1/2] iommu: Include linux/types.h

2012-07-25 Thread Thierry Reding
The linux/iommu.h header uses types defined in linux/types.h but doesn't
include it.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a71df92..9cbcc6a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -20,6 +20,7 @@
 #define __LINUX_IOMMU_H
 
 #include linux/errno.h
+#include linux/types.h
 
 #define IOMMU_READ (1)
 #define IOMMU_WRITE(2)
-- 
1.7.11.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] iommu: Include linux/notifier.h

2012-07-25 Thread Thierry Reding
The linux/iommu.h header uses types defined in linux/notifier.h but
doesn't include it.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9cbcc6a..4f64020 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -20,6 +20,7 @@
 #define __LINUX_IOMMU_H
 
 #include linux/errno.h
+#include linux/notifier.h
 #include linux/types.h
 
 #define IOMMU_READ (1)
-- 
1.7.11.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] ARM: tegra: bus_notifier registers IOMMU devices(was: How to specify IOMMU'able devices in DT)

2012-11-29 Thread Thierry Reding
On Wed, Nov 28, 2012 at 02:48:32PM +0100, Hiroshi Doyu wrote:
[...]
 From: Hiroshi Doyu hd...@nvidia.com
 Date: Wed, 28 Nov 2012 14:47:04 +0200
 Subject: [PATCH 1/1] ARM: tegra: bus_notifier registers IOMMU devices
 
 platform_bus notifier registers IOMMU devices if dma-window is
 specified.
 
 Its format is:
   dma-window = start size;
 ex)
   dma-window = 0x12345000 0x8000;
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
  arch/arm/mach-tegra/board-dt-tegra30.c |   40 
 
  1 file changed, 40 insertions(+)
 
 diff --git a/arch/arm/mach-tegra/board-dt-tegra30.c 
 b/arch/arm/mach-tegra/board-dt-tegra30.c
 index a2b6cf1..570d718 100644
 --- a/arch/arm/mach-tegra/board-dt-tegra30.c
 +++ b/arch/arm/mach-tegra/board-dt-tegra30.c
 @@ -30,9 +30,11 @@
  #include linux/of_fdt.h
  #include linux/of_irq.h
  #include linux/of_platform.h
 +#include linux/of_iommu.h
  
  #include asm/mach/arch.h
  #include asm/hardware/gic.h
 +#include asm/dma-iommu.h
  
  #include board.h
  #include clock.h
 @@ -86,10 +88,48 @@ static __initdata struct tegra_clk_init_table 
 tegra_dt_clk_init_table[] = {
   { NULL, NULL,   0,  0},
  };
  
 +#ifdef CONFIG_ARM_DMA_USE_IOMMU
 +static int tegra_iommu_device_notifier(struct notifier_block *nb,
 +unsigned long event, void *_dev)
 +{
 + struct dma_iommu_mapping *map = NULL;
 + struct device *dev = _dev;
 + dma_addr_t base;
 + size_t size;
 + int err;
 +
 + switch (event) {
 + case BUS_NOTIFY_ADD_DEVICE:
 + err = of_get_dma_window(dev-of_node, NULL, 0, NULL, base,
 + size);
 + if (!err)
 + map = arm_iommu_create_mapping(platform_bus_type,
 +base, size, 0);
 + if (IS_ERR_OR_NULL(map))
 + break;
 + if (arm_iommu_attach_device(dev, map))
 + dev_err(dev, Failed to attach %s\n, dev_name(dev));
 + dev_dbg(dev, Attached %s to map %p\n, dev_name(dev), map);
 + break;
 + }
 + return NOTIFY_DONE;
 +}
 +#else
 +#define tegra_iommu_device_notifier NULL
 +#endif
 +
 +static struct notifier_block tegra_iommu_device_nb = {
 + .notifier_call = tegra_iommu_device_notifier,
 +};

You don't need this extra protection since you use IS_ENABLED below and
these are all static variables. The whole point of IS_ENABLED is to
allow full compile coverage while leaving it up to the compiler to
eliminate dead code.

Thierry


pgpo3hiq1gK94.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 13/33] iommu: Convert to devm_ioremap_resource()

2013-01-21 Thread Thierry Reding
Convert all uses of devm_request_and_ioremap() to the newly introduced
devm_ioremap_resource() which provides more consistent error handling.

devm_ioremap_resource() provides its own error messages so all explicit
error messages can be removed from the failure code paths.

Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
Cc: Joerg Roedel j...@8bytes.org
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/tegra-smmu.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index fc17889..f08dbcd 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -19,6 +19,7 @@
 
 #define pr_fmt(fmt)%s():  fmt, __func__
 
+#include linux/err.h
 #include linux/module.h
 #include linux/platform_device.h
 #include linux/spinlock.h
@@ -1176,9 +1177,9 @@ static int tegra_smmu_probe(struct platform_device *pdev)
res = platform_get_resource(pdev, IORESOURCE_MEM, i);
if (!res)
return -ENODEV;
-   smmu-regs[i] = devm_request_and_ioremap(pdev-dev, res);
-   if (!smmu-regs[i])
-   return -EBUSY;
+   smmu-regs[i] = devm_ioremap_resource(pdev-dev, res);
+   if (IS_ERR(smmu-regs[i]))
+   return PTR_ERR(smmu-regs[i]);
}
 
err = of_get_dma_window(dev-of_node, NULL, 0, NULL, base, size);
-- 
1.8.1.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 03/23] ARM: dt: tegra30: iommu: Add nvidia,memory-clients

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:06PM +0300, Hiroshi Doyu wrote:
 Add nvidia,memory-clients to identify which swgroup ID a device
 belongs to.

Why not call the property nvidia,swgid instead? That seems a lot more
intuitive.

Thierry


pgpsrF4R4_pOe.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 01/23] ARM: tegra: Create a DT header defining swgroups ID

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:04PM +0300, Hiroshi Doyu wrote:
[...]
 +#define SWGID_AFI 0
 +#define SWGID_AVPC 1
 +#define SWGID_DC 2
 +#define SWGID_DCB 3
 +#define SWGID_EPP 4
 +#define SWGID_G2 5
 +#define SWGID_HC 6
 +#define SWGID_HDA 7
 +#define SWGID_ISP 8
 +#define SWGID_ISP2 SWGID_ISP
 +/* UNUSED: 9 */
 +/* UNUSED: 10 */
 +#define SWGID_MPE 11
 +#define SWGID_MSENC SWGID_MPE
 +#define SWGID_NV 12
 +#define SWGID_NV2 13
 +#define SWGID_PPCS 14
 +#define SWGID_SATA2 15
 +#define SWGID_SATA 16
 +#define SWGID_VDE 17
 +#define SWGID_VI 18
 +#define SWGID_VIC 19
 +#define SWGID_XUSB_HOST 20
 +#define SWGID_XUSB_DEV 21
 +#define SWGID_A9AVP 22
 +#define SWGID_TSEC 23
 +#define SWGID_PPCS1 24
 +/* UNUSED: 25 */
 +/* UNUSED: 26 */
 +/* UNUSED: 27 */
 +/* UNUSED: 28 */
 +/* UNUSED: 29 */
 +/* UNUSED: 30 */
 +/* UNUSED: 31 */
 +
 +/* Reserved: 32-63 */
 +
 +#define SWGID(x) (1ULL  SWGID_##x)

I'm not entirely sure where to find these mappings in the TRM. I see
that there's a list of the groups in 15.10.11, but where do the numbers
come from? And why are some of the names aliased? If it's for
readability only maybe we could add some more for SWGID_HC -
SWGID_HOST1X and perhaps SWGID_NV - SWGID_GR3D.

Thierry


pgpA6aXEqUo3Z.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 18/23] iommu/tegra: smmu: Workaround PCIe IOMMU'able

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 01:09:06PM +0200, Hiroshi Doyu wrote:
 Thierry Reding thierry.red...@gmail.com wrote @ Wed, 26 Jun 2013 13:06:27 
 +0200:
 
  * PGP Signed by an unknown key
  
  On Wed, Jun 26, 2013 at 12:28:21PM +0300, Hiroshi Doyu wrote:
   Make PCIe work as it is. IOMMU support can be implemented later.
   
   Signed-off-by: Hiroshi Doyu hd...@nvidia.com
   ---
drivers/iommu/tegra-smmu.c | 3 +++
1 file changed, 3 insertions(+)
  
  Can you provide more information about what the problem is here? Why is
  PCIe not working when mapped through the IOMMU?
 
 I haven't had a code to register PCI device as IOMMU'able as
 ops-add_device() does for platform_devices. I'll add this comment.

Okay, that should be solved then when we merge the PCIe driver. I hope
that can happen soon.

Thierry


pgpQK0fHHO2lL.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 19/23] iommu/tegra: smmu: Unfied driver for Tegra SoCs

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:22PM +0300, Hiroshi Doyu wrote:
 Support multiple generation of Tegra SoCs with this unified
 SMMU driver. Necessary info is expected to be passed from DT.
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
  drivers/iommu/tegra-smmu.c | 80 
 ++
  1 file changed, 3 insertions(+), 77 deletions(-)
 
 diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
 index 6e82df3..ae71721 100644
 --- a/drivers/iommu/tegra-smmu.c
 +++ b/drivers/iommu/tegra-smmu.c
 @@ -1,5 +1,5 @@
  /*
 - * IOMMU API for SMMU in Tegra30
 + * IOMMU API for SMMU in Tegra SoC

Maybe Tegra30 and later SoCs given that Tegra20 has no compatible
IOMMU?

 -MODULE_DESCRIPTION(IOMMU API for SMMU in Tegra30);
 +MODULE_DESCRIPTION(IOMMU API for SMMU in Tegra SoC);

Same here.

Thierry


pgpNuabtdnCd4.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 11/23] iommu/tegra: smmu: Add of_mach_id for T114

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:14PM +0300, Hiroshi Doyu wrote:
 Add of_mach_id for T114

I don't think of_mach_id is the right word. Why not use a subject such
as iommu/tegra: smmu: Add Tegra114 support. And instead of duplicating
the subject in the commit message, perhaps you can say instead that for
Tegra114 the SMMU is the same as for Tegra30 (which I assume here given
that no code changes are required).

And we really need the nvidia,tegra114-smmu.txt binding documentation.
Either that or update the nvidia,tegra30-smmu.txt with Tegra114 specific
information.

Thierry


pgpwLFUy6zrru.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 18/23] iommu/tegra: smmu: Workaround PCIe IOMMU'able

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:21PM +0300, Hiroshi Doyu wrote:
 Make PCIe work as it is. IOMMU support can be implemented later.
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
  drivers/iommu/tegra-smmu.c | 3 +++
  1 file changed, 3 insertions(+)

Can you provide more information about what the problem is here? Why is
PCIe not working when mapped through the IOMMU?

Thierry


pgpoKtEYKA_Y6.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 03/23] ARM: dt: tegra30: iommu: Add nvidia,memory-clients

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:06PM +0300, Hiroshi Doyu wrote:
[...]
 diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt 
 b/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt
[...]
 @@ -23,3 +24,13 @@ Example:
   nvidia,swgroups = 0x 0x000779ff;
   nvidia,ahb = ahb;
   };
 +
 + host1x {
 + compatible = nvidia,tegra30-host1x, simple-bus;
 + nvidia,memory-clients = SWGID_HC;

And this could use the SWGID(HC) to match up with how GPIOs are
referenced in the DTS files. Though I see that the clocks don't use a
parameterized version either, so things are inconsistent anyway. But if
SWGID() isn't used then maybe it shouldn't be provided by the header
file in the first place.

Oh, one other thing: both GPIO and CAR use the TEGRA_ prefix, perhaps
this should use it as well?

 diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi
[...]
 index 14ec3f9..3fcee3f 100644
 --- a/arch/arm/boot/dts/tegra30.dtsi
 +++ b/arch/arm/boot/dts/tegra30.dtsi
 @@ -1,5 +1,6 @@
  #include dt-bindings/clock/tegra30-car.h
  #include dt-bindings/gpio/tegra-gpio.h
 +#include dt-bindings/iommu/tegra-swgid.h
  #include dt-bindings/interrupt-controller/arm-gic.h

Nit: these includes seem to be ordered alphabetically; if so then iommu
should go below interrupt-controller.

 @@ -286,6 +300,7 @@
   interrupts = GIC_SPI 91 IRQ_TYPE_LEVEL_HIGH;
   nvidia,dma-request-selector = apbdma 20;
   clocks = tegra_car TEGRA30_CLK_UARTE;
 + nvidia,memory-clients = 14;

SWGID_PPCS?

Thierry


pgpi4eij0DQCo.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 17/23] iommu/tegra: smmu: Use swgroups instead of pdata

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:20PM +0300, Hiroshi Doyu wrote:
 Instead of using platfrom data, DT passes nvidia,swgroups.
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
  drivers/iommu/tegra-smmu.c | 24 +++-
  1 file changed, 7 insertions(+), 17 deletions(-)

Could this patch not be squashed into the previous one? Moving from
using the swgroups from DT and replacing platform data seems to be the
same logical change.

Thierry


pgpBxuPrlqf7k.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 03/23] ARM: dt: tegra30: iommu: Add nvidia,memory-clients

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:18:17PM +0200, Thierry Reding wrote:
 On Wed, Jun 26, 2013 at 12:28:06PM +0300, Hiroshi Doyu wrote:
 [...]
  diff --git 
  a/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt 
  b/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt
 [...]
  @@ -23,3 +24,13 @@ Example:
  nvidia,swgroups = 0x 0x000779ff;
  nvidia,ahb = ahb;
  };
  +
  +   host1x {
  +   compatible = nvidia,tegra30-host1x, simple-bus;
  +   nvidia,memory-clients = SWGID_HC;
 
 And this could use the SWGID(HC) to match up with how GPIOs are
 referenced in the DTS files.

Scratch that. SWGID() yields a mask for the corresponding swgroup so
it's not the same thing.

Thierry


pgp3NHzNB8807.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 21/23] iommu/tegra: smmu: Get swgroup ID from DT

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:24PM +0300, Hiroshi Doyu wrote:
 Get swgroup ID from DT. nvidia,swgroups indicates which swgroup IDs
 a device belongs to.
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
  arch/arm/boot/dts/tegra30.dtsi |  1 -
  drivers/iommu/tegra-smmu.c | 20 +++-
  2 files changed, 19 insertions(+), 2 deletions(-)
 
 diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi
 index 7c480f2..a116737 100644
 --- a/arch/arm/boot/dts/tegra30.dtsi
 +++ b/arch/arm/boot/dts/tegra30.dtsi
 @@ -1,6 +1,5 @@
  #include dt-bindings/clock/tegra30-car.h
  #include dt-bindings/gpio/tegra-gpio.h
 -#include dt-bindings/iommu/tegra-swgid.h
  #include dt-bindings/interrupt-controller/arm-gic.h
  
  #include skeleton.dtsi
 diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
 index 50eb843..96dbef3 100644
 --- a/drivers/iommu/tegra-smmu.c
 +++ b/drivers/iommu/tegra-smmu.c
 @@ -314,6 +314,24 @@ static inline void smmu_write(struct smmu_device *smmu, 
 u32 val, size_t offs)
  
  #define smmu_client_hwgrp(c) (c-as-smmu-swgroups)
  
 +static u64 tegra_smmu_of_get_swgids(struct device *dev)
 +{
 + size_t bytes;
 + const char *propname = nvidia,memory-clients;
 + const __be32 *prop;
 + int i;
 + u64 swgids = 0;
 +
 + prop = of_get_property(dev-of_node, propname, bytes);
 + if (!prop || !bytes)
 + return 0;
 +
 + for (i = 0; i  bytes / sizeof(u32); i++, prop++)
 + swgids |= 1ULL  be32_to_cpup(prop);
 +
 + return swgids;
 +}
 +
  static int __smmu_client_set_hwgrp(struct smmu_client *c,
  u64 map, int on)
  {
 @@ -725,7 +743,7 @@ static int smmu_iommu_attach_dev(struct iommu_domain 
 *domain,
   return -ENOMEM;
   client-dev = dev;
   client-as = as;
 - map = smmu-swgroups;
 + map = tegra_smmu_of_get_swgids(dev);
   if (!map)
   return -EINVAL;

Shouldn't this be part of an earlier patch ([PATCH 17/23] iommu/tegra:
smmu: Use swgroups instead of pdata)? What's the reason for the split?

Thierry


pgppPXAsbVDak.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 02/23] ARM: dt: tegra30: iommu: Add nvidia,swgroups

2013-06-27 Thread Thierry Reding
On Wed, Jun 26, 2013 at 12:28:05PM +0300, Hiroshi Doyu wrote:
 This is a bitmap that indicates which HardWare Accelerators(HWA) are
 supported on Tegra30 SoC.
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
  Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt | 6 +-
  arch/arm/boot/dts/tegra30.dtsi  | 1 +
  2 files changed, 6 insertions(+), 1 deletion(-)
 
 diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt 
 b/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt
 index 89fb543..6be51f6 100644
 --- a/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt
 +++ b/Documentation/devicetree/bindings/iommu/nvidia,tegra30-smmu.txt
 @@ -8,14 +8,18 @@ Required properties:
  - nvidia,#asids : # of ASIDs
  - dma-window : IOVA start address and length.
  - nvidia,ahb : phandle to the ahb bus connected to SMMU.
 +- nvidia,swgroups: A bit map of supported HardWare Accelerators(HWA).

Nit: you use the spelling bitmap in the subject but bit map here.
Both are correct but perhaps we should stick to one.

 +  Each bit represents one sgroup. The assignments may be found in header

Nit: swgroup

 - smmu {
 + iommu {

This change isn't really related, but given that the tegra30.dtsi
already uses it I guess we can keep it as part of this patch.

   compatible = nvidia,tegra30-smmu;
   reg = 0x7000f010 0x02c
  0x7000f1f0 0x010
  0x7000f228 0x05c;
   nvidia,#asids = 4;/* # of ASIDs */
   dma-window = 0 0x4000;/* IOVA start  length */
 + nvidia,swgroups = 0x 0x000779ff;

Perhaps this should be a symbolic name too? Perhaps something like
TEGRA30_SWGID_ALL?

Thierry


pgpiIQXSQ7tV2.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 01/22] [HACK] of: dev_node has struct device pointer

2013-07-16 Thread Thierry Reding
On Tue, Jul 16, 2013 at 04:57:03PM -0600, Stephen Warren wrote:
 On 07/05/2013 04:44 AM, Hiroshi Doyu wrote:
  To prevent of_platform_populate() from trying to populate duplicate
  devices if a device has been already populated.
 
 You need to send drivers/of patches to the DT maintainer and
 devicetree-discuss mailing list.
 
  Signed-off-by: Hiroshi Doyu hd...@nvidia.com
  ---
  Need to take care of early_platform_devices, or alternative solution.
 
 I assume that's a TODO item... Are you planning on fleshing out this
 patch to address that issue?

There was some more discussion about it, which can be found in the
following thread:


https://lists.ozlabs.org/pipermail/devicetree-discuss/2013-June/036934.html

That's one of the last messages and it should have most of the
background. If not you may have to walk up the thread for more context.
In a nutshell I raised the question as to why we can't simply call
of_platform_populate() earlier and side-step all the workarounds that
have found their way into the kernel to side-step the issue of their not
being a struct device associated with the struct device_node.

If of_platform_populate() can be made to run early, then we can restore
a whole lot of consistency in how early drivers on an OF kernel work.

Thierry


signature.asc
Description: Digital signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 06/22] ARM: dt: tegra114: iommu: Fix IOMMU register address

2013-07-16 Thread Thierry Reding
On Tue, Jul 16, 2013 at 05:18:29PM -0600, Stephen Warren wrote:
 On 07/05/2013 04:44 AM, Hiroshi Doyu wrote:
  Fix IOMMU register address.
 
 Oh dear, how serious is this?
 
 That incorrect node got added in v3.9 - a long time ago. Does the SMMU
 driver touch registers during probe() or under any other condition right
 now? If so, can that hang the CPU or end up trashing some other HW
 module's registers?
 
 I'm trying to determine whether we need to urgently send this patch as a
 fix Cc: stable.

Judging by the TRM there's nothing in the range from 0x7000ec00 to
0x7000f7ff, so hopefully we got lucky.

Thierry


signature.asc
Description: Digital signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] iommu/tegra: Print phys_addr_t using %pa

2013-09-17 Thread Thierry Reding
When enabling LPAE on ARM, phys_addr_t becomes 64 bits wide and printing
a variable of that type using a simple %x format specifier causes the
compiler to complain. Change the format specifier to %pa, which is used
specifically for variables of type phys_addr_t.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
Olof has been sending patches to fix up similar issues, but I couldn't
find one that fixes these warnings. If Olof sent a patch for this and I
missed it, please ignore.

 drivers/iommu/tegra-gart.c | 6 +++---
 drivers/iommu/tegra-smmu.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 108c0e9..8993999 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -252,7 +252,7 @@ static int gart_iommu_map(struct iommu_domain *domain, 
unsigned long iova,
spin_lock_irqsave(gart-pte_lock, flags);
pfn = __phys_to_pfn(pa);
if (!pfn_valid(pfn)) {
-   dev_err(gart-dev, Invalid page: %08x\n, pa);
+   dev_err(gart-dev, Invalid page: %pa\n, pa);
spin_unlock_irqrestore(gart-pte_lock, flags);
return -EINVAL;
}
@@ -295,8 +295,8 @@ static phys_addr_t gart_iommu_iova_to_phys(struct 
iommu_domain *domain,
 
pa = (pte  GART_PAGE_MASK);
if (!pfn_valid(__phys_to_pfn(pa))) {
-   dev_err(gart-dev, No entry for %08llx:%08x\n,
-(unsigned long long)iova, pa);
+   dev_err(gart-dev, No entry for %08llx:%pa\n,
+(unsigned long long)iova, pa);
gart_dump_table(gart);
return -EINVAL;
}
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index e066560..34374b3 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -731,7 +731,7 @@ static int smmu_iommu_map(struct iommu_domain *domain, 
unsigned long iova,
unsigned long pfn = __phys_to_pfn(pa);
unsigned long flags;
 
-   dev_dbg(as-smmu-dev, [%d] %08lx:%08x\n, as-asid, iova, pa);
+   dev_dbg(as-smmu-dev, [%d] %08lx:%pa\n, as-asid, iova, pa);
 
if (!pfn_valid(pfn))
return -ENOMEM;
-- 
1.8.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv3 01/19] [HACK] of: dev_node has struct device pointer

2013-10-25 Thread Thierry Reding
On Fri, Oct 25, 2013 at 01:10:38AM +0100, Grant Likely wrote:
 On Thu, 24 Oct 2013 11:21:15 +0200, Hiroshi Doyu hd...@nvidia.com wrote:
  Hi Grant,
  
  Grant Likely grant.lik...@linaro.org wrote @ Thu, 24 Oct 2013 10:55:31 
  +0200:
  
diff --git a/include/linux/of.h b/include/linux/of.h
index f95aee3..638a88a 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -60,6 +60,7 @@ struct device_node {
struct  kref kref;
unsigned long _flags;
void*data;
+   struct device *dev; /* Set only after populated */
   
   Is this being used merely to indicate that a device has been processed
   by of_platform_device_create()? Or do you intend to dereference this
   pointer? I've avoided putting the struct device in to the device_node
   structure up to this point simply becuase there aren't any good clues
   for what /kind/ of device it actually points to. I worry that bad
   assumptions will get made when other subsystems try to use the
   same pointer. ie. if one subsystem creates its own device and sets this
   pointer, and then of_platform_device_create() comes along behind, sees
   that it is already created, and then returns a platform_device pointer
   *for something that isn't a struct platform_device*. This is very bad.
   
   Instead of using a pointer to the struct device, would a flag be
   sufficient for your purposes? Would it be fine to return NULL if the
   device has already been created?
  
  Yes, a flag would be enough for this purpose.
  
  This patch is a part of HACK to control device instanciation order. We
  have an IOMMU device(platform) which needs to be instanciated earlier
  than other (platform)devices so that IOMMU driver would configure them
  as IOMMU'able device.
 
 Ideally the drivers depending on the IOMMU would return -EPROBE_DEFER if
 the IOMMU driver isn't set up so that you don't need to play games with
 probe order. Creating certain platform devices early is a really ugly
 and fragile solution.
 
 Besides, probe order of device drivers is far more about link order of
 the kernel than it is about when of_platform_device_create() is called.
 Fiddling with the initcall level on the IOMMU driver (while not
 recommended) may very well have the effect you desire.

This is actually the other problem that I'm aware of that could benefit
from [interrupt resolution at probe time]. My idea was that once we had
a way within the driver core to resolve interrupt references at probe
time it could be used for potentially many other resources as well. Some
of the resources like GPIOs and regulators are obviously not something
that the core can or should be requesting, but mostly resources that you
don't actually need to control after probing (such as interrupts) would
be a good fit because otherwise people would write the same boilerplate
over and over again.

IOMMUs seem to me to be in that same category. As far as I can tell, an
IOMMU driver registers the IOMMU for a given bus, upon which every
device can simply be used (mostly transparently) with that IOMMU. While
I haven't figured out how exactly, I'm pretty sure we can take advantage
of the resolution of resources at probe time within the core to both
keep drivers from having to do anything in particular and at the same
time have code to determine if the IOMMU driver hasn't been probed yet
and return -EPROBE_DEFER appropriately.

I suspect that there will be enough differences between the various
IOMMU implementations that we won't be able to have a unified binding
(especially for how to associate devices with a particular virtual
address space), but perhaps that could be solved with something like an
.of_xlate() that IOMMU drivers implement, much like we've done with most
other subsystems too.

The binding for Tegra's IOMMU currently only uses the HW IDs (or a mask)
to put a device into a given address space, but I think that could be
easily extended to something like:

memory-clients = iommu MASK;

Or similar. If other information is required, we could encode that into
a multi-cell specifier. Perhaps we could even leave away the phandle
since typically there will only be a single IOMMU in the system?

Does that sound reasonable? Hiroshi is much more familiar with IOMMUs,
so I'd like to get his opinion on the above as well.

Grant, I wonder if it might be more useful if I concentrate on solving
this problem first, before tackling the interrupt reference resolution.
I expect this to be able to use the same core functionality, so it could
be used as a less invasive means of introducing some pieces we can later
reuse. Also this only adds new functionality and doesn't touch anything
that currently works, so there's less (i.e. no) risk of breaking any
existing functionality.

Thierry


pgpk4Jwi_jXnq.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org

Re: [PATCHv3 01/19] [HACK] of: dev_node has struct device pointer

2013-10-25 Thread Thierry Reding
On Fri, Oct 25, 2013 at 10:25:49AM +0200, Hiroshi Doyu wrote:
 Thierry Reding thierry.red...@gmail.com wrote @ Fri, 25 Oct 2013 09:56:55 
 +0200:
 
This patch is a part of HACK to control device instanciation order. We
have an IOMMU device(platform) which needs to be instanciated earlier
than other (platform)devices so that IOMMU driver would configure them
as IOMMU'able device.
   
   Ideally the drivers depending on the IOMMU would return -EPROBE_DEFER if
   the IOMMU driver isn't set up so that you don't need to play games with
   probe order. Creating certain platform devices early is a really ugly
   and fragile solution.
   
   Besides, probe order of device drivers is far more about link order of
   the kernel than it is about when of_platform_device_create() is called.
   Fiddling with the initcall level on the IOMMU driver (while not
   recommended) may very well have the effect you desire.
  
  This is actually the other problem that I'm aware of that could benefit
  from [interrupt resolution at probe time]. My idea was that once we had
  a way within the driver core to resolve interrupt references at probe
  time it could be used for potentially many other resources as well. Some
  of the resources like GPIOs and regulators are obviously not something
  that the core can or should be requesting, but mostly resources that you
  don't actually need to control after probing (such as interrupts) would
  be a good fit because otherwise people would write the same boilerplate
  over and over again.
  
  IOMMUs seem to me to be in that same category. As far as I can tell, an
  IOMMU driver registers the IOMMU for a given bus, upon which every
  device can simply be used (mostly transparently) with that IOMMU. While
  I haven't figured out how exactly, I'm pretty sure we can take advantage
  of the resolution of resources at probe time within the core to both
  keep drivers from having to do anything in particular and at the same
  time have code to determine if the IOMMU driver hasn't been probed yet
  and return -EPROBE_DEFER appropriately.
 
 Can you explain the above a bit more?
 
 Originally I thought that what Grant suggested would work ok with this
 patch.

I think the objection to these patches is that they special case the
instantiation of some devices. It's not a proper solution because it
implies various things. For example merely instantiating the IOMMU
device earlier than others is only going to work *if* the driver is
actually probed before the drivers of other devices. If you want to
build the driver as a module for example, probe order becomes entirely
non-deterministic.

So what Grant was saying is that you could possibly make it work by
forcing the driver to be loaded earlier using explicit initcall
ordering. But he also said that's not recommended because it's not a
proper solution and therefore not guaranteed to always work.

Explicit initcall ordering used to be heavily used in the past, but
there have been many efforts to move away from it. One of the solutions
introduced to help with that is deferred probing, which essentially adds
a new error code (EPROBE_DEFER) which a driver's .probe() can return to
cause it to be probed at a later point again, after other drivers have
been probed.

How this works is basically that a driver's .probe() requests whatever
resources it needs (GPIOs, clocks, regulators, ...). If any of those
resources isn't there yet (presumably because the driver providing it
hasn't been probed yet), it can return -EPROBE_DEFER to signal that not
all of its dependencies are available yet.

Instead of handling such dependencies implicitly by making sure all
resource providers are probed earlier than any of their consumers, the
dependencies are handled more explicitly, which turns out to simplify
things a lot. There's some additional work required in the core, but if
done consistently no driver needs to care about the dependencies and it
no longer matters where the resources come from. The problem is reduced
to essentially this:

while (!resource_available())
load_more_drivers();

So my proposed solution for the IOMMU case is to treat it the same as
any other resources. Perhaps resource isn't the right word, but at the
core the issue is the same. A device requires the services of an IOMMU
so that it can be put into the correct address space. If the IOMMU is
not available yet it cannot do that, so we simply return -EPROBE_DEFER
and cause the probe to be retried later.

Eventually the IOMMU driver will be probed and register the IOMMU. When
the earlier driver is probed again, it will be able to successfully
request to be put into the proper address space and continue with the
initialization.

Does that answer your question?

Thierry


pgpjWMaXofFQW.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo

Re: [PATCHv3 01/19] [HACK] of: dev_node has struct device pointer

2013-10-25 Thread Thierry Reding
On Fri, Oct 25, 2013 at 11:49:05AM +0200, Hiroshi Doyu wrote:
 Thierry Reding thierry.red...@gmail.com wrote @ Fri, 25 Oct 2013 11:11:05 
 +0200:
 
This is actually the other problem that I'm aware of that could benefit
from [interrupt resolution at probe time]. My idea was that once we had
a way within the driver core to resolve interrupt references at probe
time it could be used for potentially many other resources as well. Some
of the resources like GPIOs and regulators are obviously not something
that the core can or should be requesting, but mostly resources that you
don't actually need to control after probing (such as interrupts) would
be a good fit because otherwise people would write the same boilerplate
over and over again.

IOMMUs seem to me to be in that same category. As far as I can tell, an
IOMMU driver registers the IOMMU for a given bus, upon which every
device can simply be used (mostly transparently) with that IOMMU. While
I haven't figured out how exactly, I'm pretty sure we can take advantage
of the resolution of resources at probe time within the core to both
keep drivers from having to do anything in particular and at the same
time have code to determine if the IOMMU driver hasn't been probed yet
and return -EPROBE_DEFER appropriately.
   
   Can you explain the above a bit more?
   
   Originally I thought that what Grant suggested would work ok with this
   patch.
  
  I think the objection to these patches is that they special case the
  instantiation of some devices. It's not a proper solution because it
  implies various things. For example merely instantiating the IOMMU
  device earlier than others is only going to work *if* the driver is
  actually probed before the drivers of other devices. If you want to
  build the driver as a module for example, probe order becomes entirely
  non-deterministic.
 
 I understand the above limitation. What I thought for the solution is
 that I can make use of iommu_bus even before IOMMU is
 instanciated. iommu_bus has its notifier which calls
 iommu_ops()-add_device(). This could return -EPROBE_DEFER when IOMMU
 isn't ready. Only the problem is the current bus_notifier doesn't
 have the ability to return error. I'll see if it can be modified. Even
 with this, at least IOMMU *driver* needs to be init'ed enough earlier
 to prevent devices from being registered without IOMMU.

The way notifiers work is that they run completely hidden from whatever
triggers them. For instance you register the IOMMU bus notifier from the
IOMMU driver (by calling bus_set_iommu()). That registers a function to
be called when some event happens on that bus. When a device's driver is
probed successfully, the driver core will notify the bus, which causes
the IOMMU callback to be run.

Some of this code runs before the driver has successfully been probed,
so I imagine it would be possible to use it to abort probing. But that's
not possible at least with the current code.

  Instead of handling such dependencies implicitly by making sure all
  resource providers are probed earlier than any of their consumers, the
  dependencies are handled more explicitly, which turns out to simplify
  things a lot. There's some additional work required in the core, but if
  done consistently no driver needs to care about the dependencies and it
  no longer matters where the resources come from. The problem is reduced
  to essentially this:
  
  while (!resource_available())
  load_more_drivers();
  
  So my proposed solution for the IOMMU case is to treat it the same as
  any other resources. Perhaps resource isn't the right word, but at the
  core the issue is the same. A device requires the services of an IOMMU
  so that it can be put into the correct address space. If the IOMMU is
  not available yet it cannot do that, so we simply return -EPROBE_DEFER
  and cause the probe to be retried later.
 
 This looks somewhat similar to the above iommu_bus notifier.
 
 Is there any way to implement the same mechanism rather than using
 bus?

Yes, I think it should be possible to get this to work without using the
bus notifier at all. I can try to code something up but wanted to wait
for feedback from Grant first.

Thierry


pgpZj6__WN6WL.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv3 01/19] [HACK] of: dev_node has struct device pointer

2013-10-25 Thread Thierry Reding
On Fri, Oct 25, 2013 at 02:20:51PM +0100, Will Deacon wrote:
 On Fri, Oct 25, 2013 at 09:22:02AM +0100, Hiroshi Doyu wrote:
  Thierry Reding thierry.red...@gmail.com wrote @ Fri, 25 Oct 2013 09:56:55 
  +0200:
  
   I suspect that there will be enough differences between the various
   IOMMU implementations that we won't be able to have a unified binding
   (especially for how to associate devices with a particular virtual
   address space), but perhaps that could be solved with something like an
   .of_xlate() that IOMMU drivers implement, much like we've done with most
   other subsystems too.
   
   The binding for Tegra's IOMMU currently only uses the HW IDs (or a mask)
   to put a device into a given address space, but I think that could be
   easily extended to something like:
   
 memory-clients = iommu MASK;
   
   Or similar. If other information is required, we could encode that into
   a multi-cell specifier. Perhaps we could even leave away the phandle
   since typically there will only be a single IOMMU in the system?
   
   Does that sound reasonable? Hiroshi is much more familiar with IOMMUs,
   so I'd like to get his opinion on the above as well.
  
  I think that the above may be possible, but I'd like to listen from
  other IOMMU SOC maintainers.
  
  A brief explanation for memory-clients:
  
  In tegra, every H/W has its own memory-client ID, and it can be
  associated to any address spaces. The above memory-cilents is used
  to indicate which ID a device has in DT. If the other SOC IOMMUs need
  this kind of memory-clients, this would be standarized. Any comment?
  
  At least arm-smmu seems to have the following. multiple IOMMUs can be
  handled with this.
  
  
  - smmu-parent   : When multiple SMMUs are chained together, this
property can be used to provide a phandle to the
parent SMMU (that is the next SMMU on the path going
from the mmu-masters towards memory) node for this
SMMU.
 
 This property is for the case when IOMMUs are connected in series, which is
 fairly horrendous. However, it is extremely common to have multiple IOMMU
 instances within an Soc (Exynos SoCs have one IOMMU per device iirc), so
 that certainly needs to be handled.

Okay, in that case perhaps some sort of generic binding should mandate
the phandle, even for systems that only have a single one. I suspect a
generic variant of smmu-parent (iommu-parent?) would also be useful to
describe setups with cascaded IOMMUs.

Thierry


pgptdRIq__75C.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv3 01/19] [HACK] of: dev_node has struct device pointer

2013-11-01 Thread Thierry Reding
On Thu, Oct 31, 2013 at 11:53:22AM -0600, Stephen Warren wrote:
 On 10/31/2013 10:46 AM, Hiroshi Doyu wrote:
  Stephen Warren swar...@wwwdotorg.org wrote @ Thu, 31 Oct 2013 17:35:24 
  +0100:
  
  ...
  ... and for the driver to explicitly parse that property, and wait
  until the driver for iommu_phandle is ready. Exactly the same as any
  other resource/service dependency.
 
  That will solve all the problems.
 
  The only downside is that every driver needs to contain code to parse
  that property. However, I think that's just one function call; the
  actual implementation of that function can be unified somewhere inside
  core code in drivers/iommu/.
 
  Yes, but only missing part now is that, we could do this with
  bus_notifier, but the current bus_notifier doesn't have the feature
  to return error(-EPROBE_DEFER). This could be modified so that
  bus_notifier could return (-EPROBE_DEFER) to postpone
  probing. Alternatively this could be done in some core probe code as
  well as Thierry pointed out.
 
  [1] In the reply of [PATCHv3 14/19] iommu/tegra: smmu: Get 
  nvidia,memory-clients from DT
 
  I think this should be done explicitly in drivers. It's much
  simpler,
  
  Yes, we need to insert some IOMMU specific code in driver?
  
  and doesn't encode any knowledge of driver-specific bindings into some
  common bus notifier code.
  
  I think that we cannot do that in drivers. We want to use drivers as
  they are without any modifications indicating its dependency on
  IOMMU because most of drivers are existing ones which nvidia doesn't
  implement. We want to set up any IOMMU related thingy implicitly,
  hided from driver. That's why we need to do this in bus_notifier or
  driver core code.
 
 We're talking about memory-mapped on-SoC devices here, that generally
 only exist inside Tegra SoCs.
 
 Even ignoring that (i.e. expanding the argument to arbitrary modules),
 having drivers that perform bus-master transactions call a function
 of_iommu_attach() or similar, which does nothing if the device isn't
 behind an IOMMU but otherwise does whatever is required, seems like it
 isn't much of an imposition.
 
 If this turns out to be something that lots of devices benefit from, we
 could do the same thing that pinctrl does for hogs, and make the
 function call in the driver core. But note that's still part of the
 probing flow (just implemented in the driver core) rather than bus
 notifier based, hence keeps the same simplicity.

That's exactly what I was proposing in the first place. I did the same
thing in my patches for late interrupt reference resolution. The reason
why I think it makes sense to put it into the core is that it's
something that likely most devices will have to do anyway. And it also
is completely transparent to drivers, right? If they aren't attached to
an IOMMU then they just aren't but will continue to work properly. And
as you said, having it in the core makes drivers simpler, while at the
same time keeping the explicitness of having a function call (rather
than a somewhat obfuscated bus notifier) and the flexibility of deferred
probing.

Thierry


pgpcXOmbnOMS8.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv5 2/9] driver/core: populate devices in order for IOMMUs

2013-11-19 Thread Thierry Reding
On Tue, Nov 19, 2013 at 11:33:06AM +0200, Hiroshi Doyu wrote:
 IOMMU devices on the bus need to be poplulated first, then iommu
 master devices are done later.
 
 With CONFIG_OF_IOMMU, iommus= DT binding would be used to identify
 whether a device can be an iommu msater or not. If a device can, we'll
 defer to populate that device till an iommu device is populated. Once
 an iommu device is populated, dev-bus-iommu_ops is set in the
 bus. Then, those defered iommu master devices are populated and
 configured for IOMMU with help of the already populated iommu device
 via iommu_ops-add_device(). Multiple IOMMUs can be listed on this
 iommus binding so that a device can have multiple IOMMUs attached.
 
 Signed-off-by: Hiroshi Doyu hd...@nvidia.com
 ---
 v5:
 Use iommus= binding instread of arm,smmu's #stream-id-cells.
 
 v4:
 This is newly added, and the successor of the following RFC:
   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
 ---
  drivers/base/dd.c|  5 +
  drivers/iommu/of_iommu.c | 22 ++
  include/linux/of_iommu.h |  7 +++
  3 files changed, 34 insertions(+)
 
 diff --git a/drivers/base/dd.c b/drivers/base/dd.c
 index 35fa368..6e892d4 100644
 --- a/drivers/base/dd.c
 +++ b/drivers/base/dd.c
 @@ -25,6 +25,7 @@
  #include linux/async.h
  #include linux/pm_runtime.h
  #include linux/pinctrl/devinfo.h
 +#include linux/of_iommu.h
  
  #include base.h
  #include power/power.h
 @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct 
 device_driver *drv)
  
   dev-driver = drv;
  
 + ret = of_iommu_attach(dev);
 + if (ret)
 + goto probe_failed;
 +
   /* If using pinctrl, bind pins now before probing */
   ret = pinctrl_bind_pins(dev);
   if (ret)
 diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
 index ee249bc..4aef2b2 100644
 --- a/drivers/iommu/of_iommu.c
 +++ b/drivers/iommu/of_iommu.c
 @@ -20,6 +20,8 @@
  #include linux/export.h
  #include linux/limits.h
  #include linux/of.h
 +#include linux/device.h
 +#include linux/iommu.h
  
  /**
   * of_get_dma_window - Parse *dma-window property and returns 0 if found.
 @@ -88,3 +90,23 @@ int of_get_dma_window(struct device_node *dn, const char 
 *prefix, int index,
   return 0;
  }
  EXPORT_SYMBOL_GPL(of_get_dma_window);
 +
 +int of_iommu_attach(struct device *dev)
 +{
 + int i;
 + struct of_phandle_args args;
 + struct iommu_ops *ops = dev-bus-iommu_ops;
 +
 + of_property_for_each_phandle_with_args(dev-of_node, iommus,
 +#iommu-cells, i, args) {
 + pr_debug(%s(i=%d) ops=%p %s\n,
 +  __func__, i, ops, dev_name(dev));
 +
 + if (!ops)
 + return -EPROBE_DEFER;
 + }
 +
 + if (i  ops-add_device)
 + return ops-add_device(dev);
 + return 0;
 +}

I don't think this does what it's supposed to do. As far as I can tell
there's no way the above loop won't run to parse all phandles and their
arguments unless the DT is actually wrong.

From earlier discussions I thought the goal was to actually defer this
until all nodes referred to by the iommus property were actually
registered. The above only checks that the phandles can be resolved to
valid struct device_node:s. That doesn't mean that an actual IOMMU has
been registered for it, only that the devices have been created.

I think within that loop you need to look up the IOMMU corresponding to
the struct device_node in args.np. If no match is found, then return
-EPROBE_DEFER.

If you really only rely on dev-bus-iommu_ops to be present, then there
is no need to go through the loop in the first place, since you have
access to it immediately through the struct device that's passed into
the function.

Furthermore, relying on dev-bus-iommu_ops will prevent multiple IOMMUs
from being used at all, since only one IOMMU can register iommu_ops with
the bus, right? So I think what we really need here is a way to resolve
the IOMMU using a phandle and return the associated struct iommu_ops.

I also have some trouble understanding how the current IOMMU framework
is supposed to work together with multiple IOMMUs for one device. The
.add_device() callback seems to be missing crucial information to help
decide whether the device to be added is actually one that it covers.
Also with an of_iommu_attach() function, doesn't that become more or
less redundant?

Thierry


pgpb9fbbb2XGw.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv5 2/9] driver/core: populate devices in order for IOMMUs

2013-11-20 Thread Thierry Reding
On Wed, Nov 20, 2013 at 04:17:08AM +0100, Hiroshi Doyu wrote:
 Stephen Warren swar...@wwwdotorg.org wrote @ Tue, 19 Nov 2013 22:22:47 
 +0100:
 
  On 11/19/2013 05:03 AM, Hiroshi Doyu wrote:
   Hi Thierry,
   
   Thierry Reding thierry.red...@gmail.com wrote @ Tue, 19 Nov 2013 
   11:25:07 +0100:
   
   From earlier discussions I thought the goal was to actually defer this
   until all nodes referred to by the iommus property were actually
   registered. The above only checks that the phandles can be resolved to
   valid struct device_node:s. That doesn't mean that an actual IOMMU has
   been registered for it, only that the devices have been created.
   
   Currently bus-iommu_ops is set at the end of tegra_smmu_probe(). So
   if bus-iommu_ops is set, it means that an iommu instance is
   populated at that time.
  
  Yes, but that's the register bus, upon which the device is a client, not
  the bus upon which the device is a bus master. They aren't necessarily
  the same.
  
  There's no getting around the fact that, as Thierry said, you need to
  search for a registered IOMMU device for each phandle, and defer probe
  if any aren't registered yet.
  
  If we do that, then you shouldn't need to look at the value of
  dev-bus-iommu_ops at all; if all IOMMUs in the list were registered,
  then iommu_ops must have been set when (one of them) was registered, and
  if not, then it possibly wasn't, so defer probe.
  
  That way, this code won't have to change if the core IOMMU code gets
  extended to support multiple IOMMUs, devices mastering transactions onto
  buses other than their register bus, etc.
 
 Does the above mean the following?
 
 int of_iommu_attach(struct device *dev)
 {
   int i;
   struct of_phandle_args args;
 
   of_property_for_each_phandle_with_args(dev-of_node, iommus,
  #iommu-cells, i, args)
   if (!args-np-dev-driver)
   return -EPROBE_DEFER;
   return 0;
 }

Not quite. The above would only check that a driver was bound to the
device. But if that device isn't an IOMMU then this doesn't help you.

The standard way to solve this issue is to add the IOMMU to a global
list upon registration. Typically subsystems have some way to do that
already, but it seems like IOMMU doesn't. It looks like that's one of
the side-effects of the assumption that there will always only be a
single IOMMU (per bus).

There's also no base object that IOMMU drivers implement, which is the
way it's usually done in other subsystems. The absence of that makes it
more difficult. I suspect the easiest way to do that would be to add a
new type, something like this:

struct iommu {
struct list_head list;
struct device *dev;
};

Then modify the driver to do something like this (tegra-smmu.c):

struct smmu_device {
struct iommu iommu;

...
};

...

static int tegra_smmu_probe(struct platform_device *pdev)
{
struct smmu_device *smmu;

...

smmu-iommu.dev = pdev-dev;

iommu_add(smmu-iommu);

...
}

static int tegra_smmu_remove(struct platform_device *pdev)
{
struct smmu_device *smmu;

...

iommu_del(smmu-iommu);
}

The implementation of iommu_add() and iommu_del() could be as simple as:

static DEFINE_MUTEX(iommus_lock);
static LIST_HEAD(iommus_list);

void iommu_add(struct iommu *iommu)
{
INIT_LIST_HEAD(iommu-list);
mutex_lock(iommus_lock);
list_add_tail(iommu-list, iommus_list);
mutex_unlock(iommus_lock);
}

void iommu_del(struct iommu *iommu)
{
INIT_LIST_HEAD(iommu-list);
mutex_lock(iommus_lock);
list_del(iommu-list);
mutex_unlock(iommus_lock);
}

And then you can implement a lookup function to match an IOMMU to the
phandle, like so:

struct iommu *of_find_iommu_by_node(struct device_node *np)
{
struct iommu *iommu;

mutex_lock(iommus_lock);

list_for_each_entry(iommu, iommus_list, list) {
if (iommu-dev-of_node == np) {
mutex_unlock(iommus_lock);
return iommu;
}
}

mutex_unlock(iommus_lock);
return NULL;
}

With that you can use of_find_iommu_by_node() in the loop to check
whether an IOMMU has really been registered.

Of course that all is somewhat intrusive and Joerg might have some
objections.

Thierry


pgp58AiOvsUeK.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux

Re: Report from 2013 ARM kernel summit

2013-11-20 Thread Thierry Reding
On Wed, Nov 20, 2013 at 10:31:11AM +, Will Deacon wrote:
 On Tue, Nov 19, 2013 at 08:45:02PM +, Rob Herring wrote:
  On 11/19/2013 11:35 AM, Will Deacon wrote:
   Adding Andreas and Rob for input on potential binding additions to the 
   SMMU.
  
  The above proposal would be an incompatible change. However, I think we
  could still deal with a change in this binding at this stage.
  
  One way approach to handle this without changing the binding would be to
  scan the DT for all iommu's up front and create a list of all nodes and
  their iommu parent. The fact that the hierarchy is described in a way
  that doesn't fit Linux well is really a Linux implementation detail.
  
  If changing the binding, a simple approach would be to allow
  'smmu-parent' to be a bus and/or device property and not just for
  chained iommu's. This could be a global or bus property that is
  inherited. Like interrupt-parent, you would have to deal with the parent
  being itself. Also, perhaps iommu-parent would be a better name. In any
  case, I'd like to see this all be a generic iommu binding.
 
 I like that idea. I've recently been toying with removing the chained IOMMU
 support, since I don't think anybody is using it who is interested in
 mainline. However, making it more general sounds like a better idea.
 
 One potential issue is that I think the nvidia guys want to describe masters
 that master via multiple SMMUs (which I believe was the motivation for
 moving the stream-ids out into the master nodes, rather than keeping them in
 the SMMU). Again, that's not something we can easily add to the arm-smmu,
 because the incoming stream-ids are a property of the SMMU node.

If I remember correctly, one of the reasons for the proposal was also
that the interrupt-parent property turned out to be insufficient for
some use-cases, which lead to Grant's proposal of the new interrupts-
extended property. Since that comparison has already been drawn, I think
we can agree that both are used in similar ways. Therefore we should
consider what we've learned from interrupt-parent when designing this
generic IOMMU binding to avoid having to introduce iommu-extended at
some point.

 So the question is: do we actually need to describe masters that master
 through multiple SMMUs as a single node in the devicetree?

I would think so, yes. The alternative would be to have several nodes
that describe the same device, and that conflicts on a different level.

Perhaps it could be done by having separate sub-nodes that each use a
different IOMMU, but that sounds like a much grosser solution. That
pretty much boils down to interrupt-parent/interrupt-map.

Thierry


pgp7ADJfupd2Y.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv7 04/12] driver/core: populate devices in order for IOMMUs

2013-12-14 Thread Thierry Reding
On Thu, Dec 12, 2013 at 06:14:02PM -0800, Greg KH wrote:
 On Thu, Dec 12, 2013 at 11:39:20AM +, Grant Likely wrote:
  On Thu, 12 Dec 2013 09:57:05 +0200, Hiroshi Doyu hd...@nvidia.com wrote:
   IOMMU devices on the bus need to be poplulated first, then iommu
   master devices are done later.
   
   With CONFIG_OF_IOMMU, iommus= DT binding would be used to identify
   whether a device can be an iommu msater or not. If a device can, we'll
   defer to populate that device till an iommu device is populated. Then,
   those deferred iommu master devices are populated and configured with
   help of the already populated iommu device.
   
   Signed-off-by: Hiroshi Doyu hd...@nvidia.com
   Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
   ---
   This is related to the following discussion:
 [RFC PATCH] Documentation: devicetree: add description for generic bus 
   properties
 
   http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/215042.html
   
   v6:
   Spinned off only driver core part from:
 [PATCHv5 2/9] driver/core: populate devices in order for IOMMUs
   
   v5:
   Use iommus= binding instread of arm,smmu's #stream-id-cells.
   
   v4:
   This is newly added, and the successor of the following RFC:
 [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
 
   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
   
   Signed-off-by: Hiroshi Doyu hd...@nvidia.com
   ---
drivers/base/dd.c | 5 +
1 file changed, 5 insertions(+)
   
   diff --git a/drivers/base/dd.c b/drivers/base/dd.c
   index 0605176..0605f52 100644
   --- a/drivers/base/dd.c
   +++ b/drivers/base/dd.c
   @@ -25,6 +25,7 @@
#include linux/async.h
#include linux/pm_runtime.h
#include linux/pinctrl/devinfo.h
   +#include linux/of_iommu.h

#include base.h
#include power/power.h
   @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct 
   device_driver *drv)

 dev-driver = drv;

   + ret = of_iommu_attach(dev);
   + if (ret)
   + goto probe_failed;
   +
  
  As discussed before, I really don't think hooking in to dd.c is the
  right thing to do here, and certainly not as a device tree specific
  function. ACPI or PCI described devices may have the same constraints
  and those won't have DT descriptions.
 
 I agree, this shouldn't be in the driver core.

Okay, so what would be an alternative? Grant's objection makes sense and
we could easily just wrap the call to of_iommu_attach() within a generic
iommu_attach() that could decide at runtime which exact implementation
to call, depending on whether the device is DT, ACPI, PCI or whatnot.

If we don't want something like that in the core either, then the only
other alternative would be to call this from each driver. However given
the desire to handle IOMMUs completely transparently for device drivers
that would be missing the point.

Perhaps moving this into platform_drv_probe() would be more acceptable?
That's still somewhat core, but maybe suburban enough.

Thierry


pgp8sTFsF2kiM.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv7 06/12] ARM: tegra: create a DT header defining SWGROUP ID

2013-12-20 Thread Thierry Reding
On Wed, Dec 18, 2013 at 09:27:29AM -0700, Stephen Warren wrote:
 On 12/18/2013 01:02 AM, Mark Zhang wrote:
  On 12/12/2013 03:57 PM, Hiroshi Doyu wrote:
  Create a header file to define the swgroup IDs used by the IOMMU(SMMU)
  binding. swgroup is a group of H/W clients which a Tegra SoC
  supports. This unique ID can be used to calculate MC_SMMU_swgroup
  name_ASID_0 register offset and MC_swgroup name_HOTRESET_*_0
  register bit. This will allow the same header to be used by both
  device tree files, and drivers implementing this binding, which
  guarantees that the two stay in sync. This also makes device trees
  more readable by using names instead of magic numbers. For HOTRESET
  bit shifting we need another conversion table, which will come later.
 
  diff --git a/include/dt-bindings/memory/tegra-swgroup.h 
  b/include/dt-bindings/memory/tegra-swgroup.h
 
  +#define TEGRA_SWGROUP_MPE 11  /* 0x264 */
  +#define TEGRA_SWGROUP_MSENC   SWGROUP_MPE
  
  Need to change this to:
  
  #define TEGRA_SWGROUP_MSENC 11
  
  The reason is that, this makes TEGRA_SWGROUP_BIT doesn't work. So if I
  write TEGRA_SWGROUP_CELLS(MSENC) in dt, that causes a dt compiling error.
 
 I guess it's because TEGRA_SWGROUP_BIT needs to expand its argument
 twice, which can be done.
 
 That all said, just defining all the names directly to constants is
 probably the most direct fix.

Erm... isn't this simply a typo, where:

#define TEGRA_SWGROUP_MSENC SWGROUP_MPE

should simply be

#define TEGRA_SWGROUP_MSENC TEGRA_SWGROUP_MPE

?

That certainly works for me.

Thierry


pgpqkfeJTj7P1.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Thierry Reding
On Sun, Apr 27, 2014 at 08:23:06PM +0200, Arnd Bergmann wrote:
 On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
  +- mmu-masters: A phandle to device nodes representing the master for which
  +   the System MMU can provide a translation. Any additional 
  values
  +  after the phandle will be ignored because a System MMU never
  +  have two or more masters. #stream-id-cells specified in the
  +  master's node will be also ignored.
  +  If more than one phandle is specified, only the first phandle
  +  will be treated.
 
 This seems completely backwards: Why would you list the masters for an IOMMU
 in the IOMMU node?
 
 The master should have a standard property pointing to the IOMMU instead.
 
 We don't have a generic binding for IOMMUs yet it seems, but the time is
 overdue to make one.
 
 Consider this NAKed until there is a generic binding for IOMMUs that all
 relevant developers have agreed to.

I'd like to take this opportunity and revive one of the hibernating
patch sets that we have for Tegra. The last effort to get things merged
was back in January I think. I haven't bothered to look up the reference
since it's probably good to start from scratch anyway.

The latest version of the binding that was under discussion back then I
think looked something like this:

device@... {
iommus = iommu [spec][, other_iommu [other_spec]...];
};

And possibly with a iommu-names property to go along with that. The idea
being that a device can be a master on possibly multiple IOMMUs. Using
the above it would also be possible to have one device be multiple
masters on the same IOMMU.

On Tegra the specifier would be used to encode a memory controller's
client ID. One discussion point back at the time was to encode the ID as
a bitmask to allow more than a single master per entry. Another solution
which I think is a little cleaner and more generic, would be to use one
entry per master and use a single cell to encode the client ID. Devices
with multiple clients to the same IOMMU could then use multiple entries
referencing the same IOMMU.

I've added Hiroshi Doyu on Cc since he knows the Tegra IOMMU best.
Hiroshi, can you summarize exactly what the proposed bindings were. If
my memory serves me well they were mostly along the lines of what Arnd
proposes here, and perhaps they are something that can also be used for
Exynos.

Will Deacon (I think) had some comments on the earlier discussion as
well, so I've added him on Cc for visibility. Sorry if I'm confusing you
with someone else, Will. In that case perhaps you know who to include in
the discussion from the ARM side.

Also adding Stephen Warren for visibility.

Thierry


pgpVulkkC27jm.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Thierry Reding
On Mon, Apr 28, 2014 at 12:56:03PM +0200, Arnd Bergmann wrote:
 On Monday 28 April 2014 12:39:20 Thierry Reding wrote:
  On Sun, Apr 27, 2014 at 08:23:06PM +0200, Arnd Bergmann wrote:
   On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
+- mmu-masters: A phandle to device nodes representing the master for 
which
+   the System MMU can provide a translation. Any 
additional values
+  after the phandle will be ignored because a System MMU 
never
+  have two or more masters. #stream-id-cells specified 
in the
+  master's node will be also ignored.
+  If more than one phandle is specified, only the first 
phandle
+  will be treated.
   
   This seems completely backwards: Why would you list the masters for an 
   IOMMU
   in the IOMMU node?
   
   The master should have a standard property pointing to the IOMMU instead.
   
   We don't have a generic binding for IOMMUs yet it seems, but the time is
   overdue to make one.
   
   Consider this NAKed until there is a generic binding for IOMMUs that all
   relevant developers have agreed to.
  
  I'd like to take this opportunity and revive one of the hibernating
  patch sets that we have for Tegra. The last effort to get things merged
  was back in January I think. I haven't bothered to look up the reference
  since it's probably good to start from scratch anyway.
  
  The latest version of the binding that was under discussion back then I
  think looked something like this:
  
  device@... {
  iommus = iommu [spec][, other_iommu [other_spec]...];
  };
  
  And possibly with a iommu-names property to go along with that. The idea
  being that a device can be a master on possibly multiple IOMMUs. Using
  the above it would also be possible to have one device be multiple
  masters on the same IOMMU.
 
 Yes, that seems reasonable. Just one question: How would you represent a
 device that has multiple masters, with at least one connected to an IOMMU
 and another one connected to memory directly, without going to the IOMMU?

Heh, I don't think I've ever thought about that use-case. I guess I was
always assuming that in the absence of an IOMMU the device would simply
access memory directly. From what I can tell that's how Tegra works at
least. If the IOMMU is not enabled for a given client, that client will
access physical memory untranslated.

I suppose if that really must be represented then a global dummy IOMMU
could be introduced to help with these cases.

  On Tegra the specifier would be used to encode a memory controller's
  client ID. One discussion point back at the time was to encode the ID as
  a bitmask to allow more than a single master per entry. Another solution
  which I think is a little cleaner and more generic, would be to use one
  entry per master and use a single cell to encode the client ID. Devices
  with multiple clients to the same IOMMU could then use multiple entries
  referencing the same IOMMU.
 
 I'm not completely following here. Are you talking about the generic
 binding, or the part that is tegra specific for the specifier?
 
 My first impression is that the generic binding should just allow an
 arbitrary specifier with a variable #iommu-cells, and leave the format
 up to the IOMMU driver.

Yes, I was getting ahead of myself. The idea was to have #iommu-cells
and allow the specifier to be IOMMU-specific. On Tegra that would
translate to the memory controller client ID, on other devices I suspect
something similar might exist, but for the generic binding it should be
completely opaque and hence irrelevant.

Really just like any of the other bindings that have foos and #foo-cells
properties.

 A lot of drivers probably only support one
 master, so they can just set #iommu-cells=0, others might require
 IDs that do not fit into one cell.

You mean #iommu-cells = 1 for devices that only require one master?
There still has to be one cell to specify which master. Unless perhaps
if they can be arbitrarily assigned. I guess even if there's a fixed
mapping that applies to one SoC generation, it might be good to still
employ a specifier and have the mapping in DT for flexibility.

Thierry


pgpiS3Z2lrnfJ.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Thierry Reding
On Mon, Apr 28, 2014 at 02:05:30PM +0200, Arnd Bergmann wrote:
 On Monday 28 April 2014 13:18:03 Thierry Reding wrote:
  On Mon, Apr 28, 2014 at 12:56:03PM +0200, Arnd Bergmann wrote:
   On Monday 28 April 2014 12:39:20 Thierry Reding wrote:
And possibly with a iommu-names property to go along with that. The idea
being that a device can be a master on possibly multiple IOMMUs. Using
the above it would also be possible to have one device be multiple
masters on the same IOMMU.
   
   Yes, that seems reasonable. Just one question: How would you represent a
   device that has multiple masters, with at least one connected to an IOMMU
   and another one connected to memory directly, without going to the IOMMU?
  
  Heh, I don't think I've ever thought about that use-case. I guess I was
  always assuming that in the absence of an IOMMU the device would simply
  access memory directly. From what I can tell that's how Tegra works at
  least. If the IOMMU is not enabled for a given client, that client will
  access physical memory untranslated.
  
  I suppose if that really must be represented then a global dummy IOMMU
  could be introduced to help with these cases.
 
 It's actually not too uncommon: you can have e.g. the lower 2GB mapped
 directly from the device address space into the host memory, but have
 an iommu that translates accesses from some range in the upper 2GB of
 the 32-bit address space into full 64-bit addresses.
 
 This use case makes no sense if you use the IOMMU for isolation
 or virtualization, but it gives better performance for lowmem access
 when the only reason to have the IOMMU is to map highmem addresses.

Thinking about this some more, isn't the non-IOMMU master something we
can completely ignore in the DT? Or at least it shouldn't be handled by
the IOMMU bindings because, well, it's not an IOMMU to begin with.

Perhaps it's something that should be described using dma-ranges?

   A lot of drivers probably only support one
   master, so they can just set #iommu-cells=0, others might require
   IDs that do not fit into one cell.
  
  You mean #iommu-cells = 1 for devices that only require one master?
 
 I meant an IOMMU device that acts as the slave for exactly one device,
 even if that device has multiple master ports.

Okay, makes sense. I guess depending on the nature of the IOMMU it might
make sense not to expose it as an IOMMU at all. For example if it lives
completely within the register space of its master device. In that case
it could be directly programmed from the device's driver.

  There still has to be one cell to specify which master. Unless perhaps
  if they can be arbitrarily assigned. I guess even if there's a fixed
  mapping that applies to one SoC generation, it might be good to still
  employ a specifier and have the mapping in DT for flexibility.
 
 let me clarify by example:
 
   iommu@1 {
   compatible = some,simple-iommu;
   reg = 1;
   #iommu-cells = 0; /* supports only one master */
   };
 
   iommu@2 {
   compatible = some,other-iommu;
   reg = 3;
   #iommu-cells = 1; /* contains master ID */
   };
 
   iommu@3 {
   compatible = some,windowed-iommu;
   reg = 2;
   #iommu-cells = 2; /* contains dma-window */
   };
 
   device@4 {
   compatible = some,ethernet;
   iommus = /iommu@1;
   };
 
   device@5 {
   compatible = some,dmaengine;
   iommus = /iommu@2 0x4000 0x100,
/iommu@3 0x101;
   };
 
 The device at address 4 has a one-one relationship with iommu@1, so there
 is no need for any data. device@5 has two master ports. One is connected to
 an IOMMU that has a per-device aperture, device@5 can only issue transfers
 to the 256MB area at 0x4000, and the IOMMU will have to put entries for
 this device into that address. The second master port is connected to
 iommu@3, which uses a master ID that gets passed along with each transfer,
 so that needs to be put into the IOTLBs.

The above sounds reasonable to me with the exception of the DMA window
specifier. Isn't that precisely the information that we currently
describe using the dma-ranges property?

 A variation would be to not use #iommu-cells at all, but provide a
 #address-cells / #size-cells pair in the IOMMU, and have a translation
 as we do for dma-ranges. This is probably most flexible.

I'm not sure I follow. Wouldn't that require masters to be children of
the IOMMU DT nodes for that to work out? Also how would that work for
cases where more data than the address ranges (such as the master ID) is
needed to operate the IOMMU?

 One completely open question that I just noticed is how the kernel should
 deal with the case of multiple IOMMUs attached to one master: the
 data structures we have assume that we know exactly how to do DMA by
 setting the per-device

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-15 Thread Thierry Reding
On Mon, Apr 28, 2014 at 02:05:30PM +0200, Arnd Bergmann wrote:
[...]
 let me clarify by example:
 
   iommu@1 {
   compatible = some,simple-iommu;
   reg = 1;
   #iommu-cells = 0; /* supports only one master */
   };
 
   iommu@2 {
   compatible = some,other-iommu;
   reg = 3;
   #iommu-cells = 1; /* contains master ID */
   };
 
   iommu@3 {
   compatible = some,windowed-iommu;
   reg = 2;
   #iommu-cells = 2; /* contains dma-window */
   };
 
   device@4 {
   compatible = some,ethernet;
   iommus = /iommu@1;
   };
 
   device@5 {
   compatible = some,dmaengine;
   iommus = /iommu@2 0x4000 0x100,
/iommu@3 0x101;
   };
 
 The device at address 4 has a one-one relationship with iommu@1, so there
 is no need for any data. device@5 has two master ports. One is connected to
 an IOMMU that has a per-device aperture, device@5 can only issue transfers
 to the 256MB area at 0x4000, and the IOMMU will have to put entries for
 this device into that address. The second master port is connected to
 iommu@3, which uses a master ID that gets passed along with each transfer,
 so that needs to be put into the IOTLBs.

iommu@3 and the second port of device@5 seem to match what we need for
Tegra (and as I understand also Exynos). Can we settle on this for now
so that Hiroshi and Cho can go update their drivers for this binding?

 A variation would be to not use #iommu-cells at all, but provide a
 #address-cells / #size-cells pair in the IOMMU, and have a translation
 as we do for dma-ranges. This is probably most flexible.

The remainder of this discussion seems to indicate that #iommu-cells and
dma-ranges don't have to be mutually exclusive. For some IOMMUs it might
make sense to use both.

In fact perhaps we should require every IOMMU user to also specify a
dma-ranges property, even if for some cases the range would be simply
the complete physical address space. Perhaps in analogy to the ranges
property an empty dma-ranges property could be taken to mean all of the
physical address space.

I'm aware that this doesn't cover any of the more exotic cases out
there, but the fact is that we have real devices out there that ship
with some variations of these simple IOMMUs and I don't think we're
doing ourselves a favour by blocking support for these to be added on
the hope of merging the perfect solution that covers all use-cases.
Patches for Tegra have already been around for close to half a year.

Thierry


pgpMGzmObBmZW.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-16 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

https://lkml.org/lkml/2014/4/27/346

More advanced functionality such as the dma-ranges property can easily
be added in a backwards-compatible way. In the absence of a dma-ranges
property it should be safe to default to the whole address space.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 Documentation/devicetree/bindings/iommu/iommu.txt | 109 ++
 1 file changed, 109 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt 
b/Documentation/devicetree/bindings/iommu/iommu.txt
new file mode 100644
index ..2d67b52b656e
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -0,0 +1,109 @@
+This document describes the generic device tree binding for IOMMUs and their
+master(s).
+
+
+IOMMU device node:
+==
+
+An IOMMU can provide the following services:
+
+* Remap address space to allow devices to access physical memory ranges that
+  they otherwise wouldn't be capable of accessing.
+
+  Example: 32-bit DMA to 64-bit physical addresses
+
+* Implement scatter-gather at page level granularity so that the device does
+  not have to.
+
+* Provide system protection against rogue DMA by forcing all accesses to go
+  through the IOMMU and faulting when encountering accesses to unmapped
+  address regions.
+
+* Provide address space isolation between multiple contexts.
+
+  Example: Virtualization
+
+Device nodes compatible with this binding represent hardware with some of the
+above capabilities.
+
+IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
+typically have a fixed association to the master device, whereas multiple-
+master IOMMU devices can translate accesses from more than one master.
+
+Required properties:
+
+- #iommu-cells: The number of cells in an IOMMU specifier. The meaning of the
+  cells is defined by the binding for the IOMMU device.
+
+  Typical values include:
+  * 0: Single-master IOMMU devices are often not configurable, therefore the
+specifying doesn't need to encode any information and can be empty.
+
+  * 1: Multiple-master IOMMU devices need to know for which master they should
+enable translation. Typically the single cell in the specifier corresponds
+to the master device's ID.
+
+
+IOMMU master node:
+==
+
+Devices that access memory through an IOMMU are called masters. A device can
+have multiple master interfaces (to one or more IOMMU devices).
+
+Required properties:
+
+- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
+  master interfaces of the device. One entry in the list describes one master
+  interface of the device.
+
+Optional properties:
+
+- iommu-names: A list of names identifying each entry in the iommus property.
+
+
+Examples:
+=
+
+Single-master IOMMU:
+
+
+   iommu {
+   #iommu-cells = 0;
+   };
+
+   master {
+   iommu = /iommu;
+   };
+
+Multi-master IOMMU:
+---
+
+   iommu {
+   /* the specifier represents the ID of the master */
+   #iommu-cells = 1;
+   };
+
+   master {
+   /* device has master ID 42 in the IOMMU */
+   iommu = /iommu 42;
+   };
+
+Multi-master device:
+
+
+   /* single-master IOMMU */
+   iommu@1 {
+   #iommu-cells = 0;
+   };
+
+   /* multi-master IOMMU */
+   iommu@2 {
+   /* the specifier represents the ID of the master */
+   #iommu-cells = 1;
+   };
+
+   /* device with two master interfaces */
+   master {
+   iommus = /iommu@1,/* master of the single-master IOMMU */
+/iommu@2 42; /* ID 42 in multi-master IOMMU */
+   };
-- 
1.9.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-17 Thread Thierry Reding
On Sat, May 17, 2014 at 05:04:55PM +0900, Cho KyongHo wrote:
 On Fri, 16 May 2014 14:23:18 +0200, Thierry Reding wrote:
[...]
  +Examples:
  +=
  +
  +Single-master IOMMU:
  +
  +
  +   iommu {
  +   #iommu-cells = 0;
  +   };
  +
  +   master {
  +   iommu = /iommu;
  +   };
  +
 
 Great work, Thierry.
 
 One simple comment.
 
 This should be also applicable to multi-master IOMMUs that the masters
 of an IOMMU is not configurable with ID or something.
 I think the title needs to be changed to cover such IOMMUs which always
 translate master's transactions and unable to change the configuration
 of the relationship between the masters and IOMMUs by S/W.

Agreed, how about we add a separate example for that case:

Multiple-master IOMMU with fixed associations:
--

/* multiple-master IOMMU */
iommu {
/*
 * Masters are statically associated with this IOMMU and
 * address translation is always enabled.
 */
#iommu-cells = 0;
};

/* static association with IOMMU */
master@1 {
reg = 1;
iommus = /iommu;
};

/* static association with IOMMU */
master@2 {
reg = 2;
iommus = /iommu;
};

?

Thierry


pgp142AT5D_Jg.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Thierry Reding
On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
 On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
  From: Thierry Reding tred...@nvidia.com
  
  This commit introduces a generic device tree binding for IOMMU devices.
  Only a very minimal subset is described here, but it is enough to cover
  the requirements of both the Exynos System MMU and Tegra SMMU as
  discussed here:
  
  https://lkml.org/lkml/2014/4/27/346
  
  More advanced functionality such as the dma-ranges property can easily
  be added in a backwards-compatible way. In the absence of a dma-ranges
  property it should be safe to default to the whole address space.
  
 
 The basic binding looks fine, but I'd like it to be more explicit
 about dma-ranges. Most importantly, what does the whole address space
 mean?

The whole point was to leave out any mention of dma-ranges from the
binding until we've figured out more of the puzzle.

So what I was trying to avoid was another lengthy discussion on the
topic of dma-ranges. Oh well... =)

 A lot of IOMMUs have only 32-bit bus addresses when targetted
 by a bus master, it would also be normal for some to be smaller and
 some might even support 64-bit.
 
 For the upstream side, I'd hope we always have access to the full
 physical memory, but since this is a brand-new binding, it should
 be straightforward to just ask for upstream dma-ranges properties
 to be set all the way up to the root to confirm that.
 
 For downstream, we don't actually have a good place to put the
 dma-ranges property.

I'm not sure I understand what you mean by upstream and downstream in
this context.

 We can't put it into the iommu node, because  that would imply translating
 to the iommu's parent bus, not the iommu's own bus space.

My understanding was that the purpose of dma-ranges was to define a
mapping from one bus to another. So the general form of

child-address  parent-address  child-size

Would be used to translate a region of size child-size from the
child-address (the I/O address space created by the IOMMU) to the
parent-address (physical address space).

 We also can't put it into the master, because dma-ranges is supposed to be
 in the parent bus.

I don't understand. From the above I would think that the master's node
is precisely where it belongs.

 Finally, it makes no sense to use the dma-ranges property of the master's
 parent bus, because that bus isn't actually involved in the translation.

My understanding here is mostly based on the OpenFirmware working group
proposal for the dma-ranges property[0]. I'll give another example to
try and clarify how I had imagined this to work:

/ {
#address-cells = 2;
#size-cells = 2;

iommu {
/*
 * This is somewhat unusual (or maybe not) in that we
 * need 2 cells to represent the size of an address
 * space that is 32 bits long.
 */
#address-cells = 1;
#size-cells = 2;

#iommu-cells = 1;
};

master {
iommus = /iommu 42;
/*
 * Map I/O addresses 0 - 4 GiB to physical addresses
 * 2 GiB - 6 GiB.
 */
dma-ranges = 0x 0 0x8000 1 0;
};
};

This is somewhat incompatible with [0] in that #address-cells used to
parse the child address must be taken from the iommu node rather than
the child node. But that seems to me to be the only reasonable thing
to do, because after all the IOMMU creates a completely new address
space for the master.

[0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt

 My preferred option would be to always put the address range into
 the iommu descriptor, using the iommu's #address-cells.

That could become impossible to parse. I'm not sure if such hardware
actually exists, but if for some reason we have to split the address
range into two, then there's no longer any way to determine the size
needed for the specifier.

On the other hand what you propose makes it easy to represent multiple
master interfaces on a device. With a separate dma-ranges property how
can you define which ranges apply to which of the master interfaces?

Then again if address ranges can't be broken up in the first place, then
dma-ranges could be considered to be one entry per IOMMU in the iommus
property.

Thierry


pgpRVxs2Okzoy.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Thierry Reding
On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote:
 On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote:
[...]
  My understanding here is mostly based on the OpenFirmware working group
  proposal for the dma-ranges property[0]. I'll give another example to
  try and clarify how I had imagined this to work:
  
  / {
  #address-cells = 2;
  #size-cells = 2;
  
  iommu {
  /*
   * This is somewhat unusual (or maybe not) in that we
   * need 2 cells to represent the size of an address
   * space that is 32 bits long.
   */
  #address-cells = 1;
  #size-cells = 2;
  
  #iommu-cells = 1;
  };
  
  master {
  iommus = /iommu 42;
 
 Comparing this with the other discussion thread, we have a similar
 concept here, in that the iommu is made a logical parent, however...
 
 Firstly, there's an implicit assumption here that the only kind of
 thing the master could possibly be connected to is an IOMMU, with
 no non-trivial interconnect in between.  I'm not sure this is going
 to scale to complex SoCs.

Here we go again. We're now in the very same bad spot that we've been in
so many times before. There are at least two SoCs that we know do not
require anything fancy, yet we're blocking adding support for those use
cases because we think that at some point some IOMMU may require more
than that. But at the same time it seems like we don't have enough data
about what exactly that is, so we keep speculating. At this rate we're
making it impossible to get a reasonable feature set supported upstream.

That's very frustrating.

 If a range of Stream IDs may be issued (especially from something like
 a PCIe RC where the stream ID may be a many-bit value), describing
 the IDs individually may be impractical.

The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has
a need to represent the stream IDs as a range there's nothing keeping it
from defining the specifier accordingly.

Thierry


pgpN1vTPeaWK5.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Thierry Reding
On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
 On Monday 19 May 2014 14:53:37 Thierry Reding wrote:
  On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
   On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
From: Thierry Reding tred...@nvidia.com

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

https://lkml.org/lkml/2014/4/27/346

More advanced functionality such as the dma-ranges property can easily
be added in a backwards-compatible way. In the absence of a dma-ranges
property it should be safe to default to the whole address space.

   
   The basic binding looks fine, but I'd like it to be more explicit
   about dma-ranges. Most importantly, what does the whole address space
   mean?
  
  The whole point was to leave out any mention of dma-ranges from the
  binding until we've figured out more of the puzzle.
  
  So what I was trying to avoid was another lengthy discussion on the
  topic of dma-ranges. Oh well... =)
 
 I think that can't work, because we need a way to specify the
 ranges for some iommu drivers. We have to make sure we at least
 don't prevent it from working.

That was precisely why I wanted to let this out of the binding for now
so that we can actually focus on making existing hardware work rather
than speculate about what we may or may not need.

If you think the current minimal binding will cause future use-cases to
break, then we should change it to avoid that. What you're proposing is
to make the binding more complex on the assumption that we'll get it
right.

Wouldn't it be safer to keep things to the bare minimum necessary to
represent hardware that we have access to and understand now, rather
than speculating about what may be necessary at some point. I'd much
prefer to add complexity on an as-needed basis and when we have a better
understand of where we're headed.

   Finally, it makes no sense to use the dma-ranges property of the master's
   parent bus, because that bus isn't actually involved in the translation.
  
  My understanding here is mostly based on the OpenFirmware working group
  proposal for the dma-ranges property[0]. I'll give another example to
  try and clarify how I had imagined this to work:
  
  / {
  #address-cells = 2;
  #size-cells = 2;
  
  iommu {
  /*
   * This is somewhat unusual (or maybe not) in that we
   * need 2 cells to represent the size of an address
   * space that is 32 bits long.
   */
  #address-cells = 1;
  #size-cells = 2;
 
 You should never need #size-cells  #address-cells

That was always my impression as well. But how then do you represent the
full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
4 GiB - 1, which makes it 4 GiB large. That's:

0 1 0

With #address-cells = 1 and #size-cells = 1 the best you can do is:

0 0x

but that's not accurate.

  #iommu-cells = 1;
  };
  
  master {
  iommus = /iommu 42;
  /*
   * Map I/O addresses 0 - 4 GiB to physical addresses
   * 2 GiB - 6 GiB.
   */
  dma-ranges = 0x 0 0x8000 1 0;
  };
  };
  
  This is somewhat incompatible with [0] in that #address-cells used to
  parse the child address must be taken from the iommu node rather than
  the child node. But that seems to me to be the only reasonable thing
  to do, because after all the IOMMU creates a completely new address
  space for the master.
  
  [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt
 
 I don't think you can have a dma-ranges without a #address-cells and
 #size-cells property in the same node. In your example, I'd also expect
 a child node below 'master' that then interprets the address space
 made up by dma-ranges.

Okay, so what Dave and you have been saying strongly indicates that
dma-ranges isn't the right thing to use here.

 As a comment on the numbers in your example, I don't expect to ever
 see a 4GB IOMMU address space that doesn't start at an offset. Instead
 I'd expect either addresses that encode a device ID, or those that
 are just a subset of the parent address space, with non-overlapping
 bus addresses for each master.

As I understand the Tegra SMMU allows each of the clients to be assigned
a separate address space (4 GiB on earlier generations and 16 GiB on new
generations) and all addresses in each address space can be mapped to
arbitrary physical addresses.

   My preferred option would be to always put

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding
On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
  On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
   On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
[...]
 You should never need #size-cells  #address-cells

That was always my impression as well. But how then do you represent the
full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
4 GiB - 1, which makes it 4 GiB large. That's:

0 1 0

With #address-cells = 1 and #size-cells = 1 the best you can do is:

0 0x

but that's not accurate.
   
   I think we've done both in the past, either extended #size-cells or
   taken 0x as a special token. Note that in your example,
   the iommu actually needs #address-cells = 2 anyway.
  
  But it needs #address-cells = 2 only to encode an ID in addition to
  the address. If this was a single-master IOMMU then there'd be no need
  for the ID.
 
 Right. But for a single-master IOMMU, there is no need to specify
 any additional data, it could have #address-cells=0 if we take the
 optimization you suggested.

Couldn't a single-master IOMMU be windowed?

  I'm not sure I understand the need for 0x100 (all IDs) entry above. If
  bus1's iommus property applies to all devices on the bus, why can't the
  ID 0xb be listed in the iommus property?
  
  bus1 {
  #address-cells = 1;
  #size-cells = 1;
  ranges;
  iommus = {/iommu} 0xb 0  0x1 0x0; // 4GB ID '0xb'
  dma-ranges = 0  0xb 0  0x1 0x0;
  
  anothermaster {
  ...
  };
  };
 
 It depends on how the address is interpreted, but we could make this
 a valid case too.
 
  In which case I guess dma-ranges would be redundant.
 
 No, because the iommus property doesn't translate the address range, it
 just creates a new address space. bus1 and iommu in the example have
 different #address-cells, so you definitely need a non-empty ranges
 property.

Ah, now I get it.

   The main advantage I think would be for IOMMUs that use the PCI b/d/f
   numbers as IDs. These can have #address-cells=3, #size-cells=2
   and have an empty dma-ranges property in the PCI host bridge node,
   and interpret this as using the same encoding as the PCI BARs in
   the ranges property.
  
  I'm somewhat confused here, since you said earlier:
  
   After giving the ranges stuff some more thought, I have come to the
   conclusion that using #iommu-cells should work fine for almost
   all cases, including windowed iommus, because the window is not
   actually needed in the device, but only in the iommu, wihch is of course
   free to interpret the arguments as addresses.
  
  But now you seem to be saying that we should still be using the
  #address-cells and #size-cells properties in the IOMMU node to determine
  the length of the specifier.
 
 I probably wasn't clear. I think we can make it work either way, but
 my feeling is that using #address-cells/#size-cells gives us a nicer
 syntax for the more complex cases.

Okay, so in summary we'd have something like this for simple cases:

Required properties:

- #address-cells: The number of cells in an IOMMU specifier needed to encode
  an address.
- #size-cells: The number of cells in an IOMMU specifier needed to represent
  the length of an address range.

Typical values for the above include:
- #address-cells = 0, size-cells = 0: Single master IOMMU devices are not
  configurable and therefore no additional information needs to be encoded in
  the specifier. This may also apply to multiple master IOMMU devices that do
  not allow the association of masters to be configured.
- #address-cells = 1, size-cells = 0: Multiple master IOMMU devices may
  need to be configured in order to enable translation for a given master. In
  such cases the single address cell corresponds to the master device's ID.
- #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
  window for masters to be configured. The first cell of the address in this
  may contain the master device's ID for example, while the second cell could
  contain the start of the DMA window for the given device. The length of the
  DMA window is specified by two additional cells.

Examples:
=

Single-master IOMMU:


iommu {
#address-cells = 0;
#size-cells = 0;
};

master {
iommus = /iommu;
};

Multiple-master IOMMU with fixed associations:
--

/* multiple-master IOMMU */
iommu {
/*
 * Masters are statically associated with this IOMMU and
 * address translation is always enabled

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
[...]
  Couldn't a single-master IOMMU be windowed?
 
 Ah, yes. That would actually be like an IBM pSeries, which has a windowed
 IOMMU but uses one window per virtual machine. In that case, the window could
 be a property of the iommu node though, rather than part of the address
 in the link.

Does that mean that the IOMMU has one statically configured window which
is the same for each virtual machine? That would require some other
mechanism to assign separate address spaces to each virtual machine,
wouldn't it? But I suspect that if the IOMMU allows that it could be
allocated dynamically at runtime.

  Multiple-master IOMMU with fixed associations:
  --
  
  /* multiple-master IOMMU */
  iommu {
  /*
   * Masters are statically associated with this IOMMU and
   * address translation is always enabled.
   */
  #iommu-cells = 0;
  };
 
 copied wrong? I guess you mean #address-cells=0/#size-cells=0 here.

Yes, I obviously wasn't careful when I copied this over.

  Multiple-master device:
  ---
  
  /* single-master IOMMU */
  iommu@1 {
  reg = 1;
  #address-cells = 0;
  #size-cells = 0;
  };
  
  /* multiple-master IOMMU */
  iommu@2 {
  reg = 2;
  #address-cells = 1;
  #size-cells = 0;
  };
  
  /* device with two master interfaces */
  master {
  iommus = /iommu@1,/* master of the single-master IOMMU */
   /iommu@2 42; /* ID 42 in multiple-master IOMMU */
  };
[...]
  Does that sound about right?
 
 Yes, sounds great. I would probably leave out the Multiple-master device
 from the examples, since that seems to be a rather obscure case.

Agreed. We can easily add such examples if/when such device start to
appear.

 I would like to add an explanation about dma-ranges to the binding:
 
 8
 The parent bus of the iommu must have a valid dma-ranges property
 describing how the physical address space of the IOMMU maps into
 memory.

With physical address space you mean the addresses after translation,
not the I/O virtual addresses, right? But even so, how will this work
when there are multiple IOMMU devices? What determines which IOMMU is
mapped via which entry?

Perhaps having multiple IOMMUs implies that there will have to be some
partitioning of the parent address space to make sure two IOMMUs don't
translate to the same ranges?

 A device with an iommus property will ignore the dma-ranges property
 of the parent node and rely on the IOMMU for translation instead.

Do we need to consider the case where an IOMMU listed in iommus isn't
enabled (status = disabled)? In that case presumably the device would
either not function or may optionally continue to master onto the parent
untranslated.

 Using an iommus property in bus device nodes with dma-ranges
 specifying how child devices relate to the IOMMU is a possible extension
 but is not recommended until this binding gets extended.

Just for my understanding, bus device nodes with iommus and dma-ranges
properties could be equivalently written by explicitly moving the iommus
properties into the child device nodes, right? In which case they should
be the same as the other examples. So that concept is a convenience
notation to reduce duplication, but doesn't fundamentally introduce any
new concept.

Thierry


pgpb8EcG_HoBX.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding
On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
  On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  [...]
Couldn't a single-master IOMMU be windowed?
   
   Ah, yes. That would actually be like an IBM pSeries, which has a windowed
   IOMMU but uses one window per virtual machine. In that case, the window 
   could
   be a property of the iommu node though, rather than part of the address
   in the link.
  
  Does that mean that the IOMMU has one statically configured window which
  is the same for each virtual machine? That would require some other
  mechanism to assign separate address spaces to each virtual machine,
  wouldn't it? But I suspect that if the IOMMU allows that it could be
  allocated dynamically at runtime.
 
 The way it works on pSeries is that upon VM creation, the guest is assigned
 one 256MB window for use by assigned DMA capable devices. When the guest
 creates a mapping, it uses a hypercall to associate a bus address in that
 range with a guest physical address. The hypervisor checks that the bus
 address is within the allowed range, and translates the guest physical
 address into a host physical address, then enters both into the I/O page
 table or I/O TLB.

So when a VM is booted it is passed a device tree with that DMA window?
Given what you describe above this seems to be more of a configuration
option to restrict the IOMMU to a subset of the physical memory for
purposes of virtualization. So I agree that this wouldn't be a good fit
for what we're trying to achieve with iommus or dma-ranges in this
binding.

   I would like to add an explanation about dma-ranges to the binding:
   
   8
   The parent bus of the iommu must have a valid dma-ranges property
   describing how the physical address space of the IOMMU maps into
   memory.
  
  With physical address space you mean the addresses after translation,
  not the I/O virtual addresses, right? But even so, how will this work
  when there are multiple IOMMU devices? What determines which IOMMU is
  mapped via which entry?
  
  Perhaps having multiple IOMMUs implies that there will have to be some
  partitioning of the parent address space to make sure two IOMMUs don't
  translate to the same ranges?
 
 These dma-ranges properties would almost always be for the entire RAM,
 and we can treat anything else as a bug.

Would it typically be a 1:1 mapping? In that case could we define an
empty dma-ranges property to mean exactly that? That would make it
consistent with the ranges property.

 The mapping between what goes into the IOMMU and what comes out of it
 is not reflected in DT at all, since it only happens at runtime.
 The dma-ranges property I mean above describes how what comes out of
 the IOMMU maps into physical memory.

Understood. That makes sense.

   A device with an iommus property will ignore the dma-ranges property
   of the parent node and rely on the IOMMU for translation instead.
  
  Do we need to consider the case where an IOMMU listed in iommus isn't
  enabled (status = disabled)? In that case presumably the device would
  either not function or may optionally continue to master onto the parent
  untranslated.
 
 My reasoning was that the DT should specify whether we use the IOMMU
 or not. Being able to just switch on or off the IOMMU sounds nice as
 well, so we could change the text above to do that.
 
 Another option would be to do this in the IOMMU code, basically
 falling back to the IOMMU parent's dma-ranges property and using
 linear dma_map_ops when that is disabled.

Yes, it should be trivial for the IOMMU core code to take care of this
special case. Still I think it's worth mentioning it in the binding so
that it's clearly specified.

   Using an iommus property in bus device nodes with dma-ranges
   specifying how child devices relate to the IOMMU is a possible extension
   but is not recommended until this binding gets extended.
  
  Just for my understanding, bus device nodes with iommus and dma-ranges
  properties could be equivalently written by explicitly moving the iommus
  properties into the child device nodes, right? In which case they should
  be the same as the other examples. So that concept is a convenience
  notation to reduce duplication, but doesn't fundamentally introduce any
  new concept.
 
 The one case  where that doesn't work is PCI, because we don't list the
 PCI devices in DT normally, and the iommus property would only exist
 at the PCI host controller node.

But it could work in classic OpenFirmware where the device tree can be
populated with the tree of PCI devices enumerated by OF. There are also
devices that have a fixed configuration and where technically the PCI
devices can be listed in the device tree. This is somewhat important if
for example one PCI device is a GPIO controller and needs to be
referenced by phandle

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding
On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
  On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
[...]
  Couldn't a single-master IOMMU be windowed?
 
 Ah, yes. That would actually be like an IBM pSeries, which has a 
 windowed
 IOMMU but uses one window per virtual machine. In that case, the 
 window could
 be a property of the iommu node though, rather than part of the 
 address
 in the link.

Does that mean that the IOMMU has one statically configured window which
is the same for each virtual machine? That would require some other
mechanism to assign separate address spaces to each virtual machine,
wouldn't it? But I suspect that if the IOMMU allows that it could be
allocated dynamically at runtime.
   
   The way it works on pSeries is that upon VM creation, the guest is 
   assigned
   one 256MB window for use by assigned DMA capable devices. When the guest
   creates a mapping, it uses a hypercall to associate a bus address in that
   range with a guest physical address. The hypervisor checks that the bus
   address is within the allowed range, and translates the guest physical
   address into a host physical address, then enters both into the I/O page
   table or I/O TLB.
  
  So when a VM is booted it is passed a device tree with that DMA window?
 
 Correct.
 
  Given what you describe above this seems to be more of a configuration
  option to restrict the IOMMU to a subset of the physical memory for
  purposes of virtualization. So I agree that this wouldn't be a good fit
  for what we're trying to achieve with iommus or dma-ranges in this
  binding.
 
 Thinking about it again now, I wonder if there are any other use cases
 for windowed IOMMUs. If this is the only one, there might be no use
 in the #address-cells model I suggested instead of your original
 #iommu-cells.

So in this case virtualization is the reason why we need the DMA window.
The reason for that is that the guest has no other way of knowing what
other guests might be using, so it's essentially a mechanism for the
host to manage the DMA region and allocate subregions for each guest. If
virtualization isn't an issue then it seems to me that the need for DMA
windows goes away because the operating system will track DMA regions
anyway.

The only reason I can think of why a windowed IOMMU would be useful is
to prevent two or more devices from stepping on each others' toes. But
that's a problem that the OS should already be handling during DMA
buffer allocation, isn't it?

 I would like to add an explanation about dma-ranges to the binding:
 
 8
 The parent bus of the iommu must have a valid dma-ranges property
 describing how the physical address space of the IOMMU maps into
 memory.

With physical address space you mean the addresses after translation,
not the I/O virtual addresses, right? But even so, how will this work
when there are multiple IOMMU devices? What determines which IOMMU is
mapped via which entry?

Perhaps having multiple IOMMUs implies that there will have to be some
partitioning of the parent address space to make sure two IOMMUs don't
translate to the same ranges?
   
   These dma-ranges properties would almost always be for the entire RAM,
   and we can treat anything else as a bug.
  
  Would it typically be a 1:1 mapping? In that case could we define an
  empty dma-ranges property to mean exactly that? That would make it
  consistent with the ranges property.
 
 Yes, I believe that is how it's already defined.

I've gone through the proposal at [0] again, but couldn't find a mention
of an empty dma-ranges property. But regardless I think that a 1:1
mapping is the obvious meaning of an empty dma-ranges property.

[0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt

One thing I'm not sure about is whether dma-ranges should be documented
in this binding at all. Since there's an accepted standard proposal it
would seem that it doesn't need to be specifically mentioned. One other
option would be to link to the above proposal from the binding and then
complement that with what an empty dma-ranges property means.

Or we could possible document this in a file along with other standard
properties. I don't think we currently do that for any properties, but
my concern is that there will always be a limited number of people
knowing about how such properties are supposed to work. If all of a
sudden all these people would disappear, everybody else would be left
with references to these properties but nowhere to look for their
meaning.

 A device with an iommus property will ignore the dma-ranges

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding
On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
 On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
  On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  [...]
  Multiple-master IOMMU:
  --
  
  iommu {
  /* the specifier represents the ID of the master */
  #address-cells = 1;
  #size-cells = 0;

How do we know the size of the input address to the IOMMU?  Do we
get cases for example where the IOMMU only accepts a 32-bit input
address, but some 64-bit capable masters are connected through it?
   
   I was stuck on this question for a while before, but then I realized
   that it doesn't matter at all: It's the IOMMU driver itself that
   manages the address space, and it doesn't matter if a slave can
   address a larger range than the IOMMU can accept. If the IOMMU
   needs to deal with the opposite case (64-bit input addresses
   but a 32-bit master), that limitation can be put into the specifier.
  
  Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
  master device's DMA mask to do the right thing here?
 
 Ah, yes. I guess that's the right way to do it.
 
For determining dma masks, it is the output address that it
important.  Santosh's code can probably be taught to handle this,
if given an additional traversal rule for following iommus
properties.  However, deploying an IOMMU whose output address size
is smaller than the 
   
   Something seems to be missing here. I don't think we want to handle
   the case where the IOMMU output cannot the entire memory address
   space. If necessary, that would mean using both an IOMMU driver
   and swiotlb, but I think it's a reasonable assumption that hardware
   isn't /that/ crazy.
  
  Similarily, should the IOMMU not be treated like any other device here?
  Its DMA mask should determine what address range it can access.
 
 Right. But for that we need a dma-ranges property in the parent of the
 iommu, just so the mask can be set correctly and we don't have to
 rely on the 32-bit fallback case.

Shouldn't the IOMMU driver be the one to set the DMA mask for the device
in exactly the same way that other drivers override the 32-bit default?

Thierry


pgpscswbqbmqO.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding
On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
 On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote:
  On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
  On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  [...]
Couldn't a single-master IOMMU be windowed?
   
   Ah, yes. That would actually be like an IBM pSeries, which has a 
   windowed
   IOMMU but uses one window per virtual machine. In that case, the 
   window could
   be a property of the iommu node though, rather than part of the 
   address
   in the link.
  
  Does that mean that the IOMMU has one statically configured window 
  which
  is the same for each virtual machine? That would require some other
  mechanism to assign separate address spaces to each virtual machine,
  wouldn't it? But I suspect that if the IOMMU allows that it could be
  allocated dynamically at runtime.
 
 The way it works on pSeries is that upon VM creation, the guest is 
 assigned
 one 256MB window for use by assigned DMA capable devices. When the 
 guest
 creates a mapping, it uses a hypercall to associate a bus address in 
 that
 range with a guest physical address. The hypervisor checks that the 
 bus
 address is within the allowed range, and translates the guest physical
 address into a host physical address, then enters both into the I/O 
 page
 table or I/O TLB.

So when a VM is booted it is passed a device tree with that DMA window?
   
   Correct.
   
Given what you describe above this seems to be more of a configuration
option to restrict the IOMMU to a subset of the physical memory for
purposes of virtualization. So I agree that this wouldn't be a good fit
for what we're trying to achieve with iommus or dma-ranges in this
binding.
   
   Thinking about it again now, I wonder if there are any other use cases
   for windowed IOMMUs. If this is the only one, there might be no use
   in the #address-cells model I suggested instead of your original
   #iommu-cells.
  
  So in this case virtualization is the reason why we need the DMA window.
  The reason for that is that the guest has no other way of knowing what
  other guests might be using, so it's essentially a mechanism for the
  host to manage the DMA region and allocate subregions for each guest. If
  virtualization isn't an issue then it seems to me that the need for DMA
  windows goes away because the operating system will track DMA regions
  anyway.
  
  The only reason I can think of why a windowed IOMMU would be useful is
  to prevent two or more devices from stepping on each others' toes. But
  that's a problem that the OS should already be handling during DMA
  buffer allocation, isn't it?
 
 Right. As long as we always unmap the buffers from the IOMMU after they
 have stopped being in use, it's very unlikely that even a broken device
 driver causes a DMA into some bus address that happens to be mapped for
 another device.

I think that if buffers remain mapped in the IOMMU when they have been
deallocated that should be considered a bug.

Thierry


pgpRqG2b99_Wd.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding
On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote:
 On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
  On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
   On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
 
  For determining dma masks, it is the output address that it
  important.  Santosh's code can probably be taught to handle this,
  if given an additional traversal rule for following iommus
  properties.  However, deploying an IOMMU whose output address size
  is smaller than the 
 
 Something seems to be missing here. I don't think we want to handle
 the case where the IOMMU output cannot the entire memory address
 space. If necessary, that would mean using both an IOMMU driver
 and swiotlb, but I think it's a reasonable assumption that hardware
 isn't /that/ crazy.

Similarily, should the IOMMU not be treated like any other device here?
Its DMA mask should determine what address range it can access.
   
   Right. But for that we need a dma-ranges property in the parent of the
   iommu, just so the mask can be set correctly and we don't have to
   rely on the 32-bit fallback case.
  
  Shouldn't the IOMMU driver be the one to set the DMA mask for the device
  in exactly the same way that other drivers override the 32-bit default?
 
 The IOMMU driver could /ask/ for an appropriate mask based on its internal
 design, but if you have an IOMMU with a 64-bit output address connected
 to a 32-bit bus, that should fail.

Are there real use-cases where that really happens? I guess if we need
that the correct thing would be to bitwise AND both the DMA mask of the
IOMMU device (as set by the driver) with that derived from the IOMMU's
parent bus' dma-ranges property.

 Note that it's not obvious what the IOMMU's DMA mask actually means.
 It clearly has to be the mask that is used for allocating the IO page
 tables, but it wouldn't normally be used in the path that allocates
 pages on behalf of a DMA master attached to the IOMMU, because that
 allocation is performed by the code that looks at the other device's
 dma mask.

Interesting. If a DMA buffer is allocated using the master's DMA mask
wouldn't that cause breakage if the IOMMU and master's DMA masks don't
match. It seems to me like the right thing to do for buffer allocation
is to use the IOMMU's DMA mask if a device uses the IOMMU for
translation and use the device's DMA mask when determining to what I/O
virtual address to map that buffer.

Obviously if we always assume that IOMMU hardware is sane and can always
access at least the whole memory then this isn't an issue. But what if a
device can do DMA to a 64-bit address space, but the IOMMU can only
address 32 bits. If the device's DMA mask is used for allocations, then
buffers could reside beyond the 4 GiB boundary that the IOMMU can
address, so effectively the IOMMU wouldn't be able to write to those
buffers.

Thierry


pgpeRVW1YZb4s.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-05-23 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

https://lkml.org/lkml/2014/4/27/346

Signed-off-by: Thierry Reding tred...@nvidia.com
---
Changes in v2:
- add notes about dma-ranges property (drop note from commit message)
- document priorities of iommus property vs. dma-ranges property
- drop #iommu-cells in favour of #address-cells and #size-cells
- remove multiple-master device example

 Documentation/devicetree/bindings/iommu/iommu.txt | 167 ++
 1 file changed, 167 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt 
b/Documentation/devicetree/bindings/iommu/iommu.txt
new file mode 100644
index ..6ce759afcc94
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -0,0 +1,167 @@
+This document describes the generic device tree binding for IOMMUs and their
+master(s).
+
+
+IOMMU device node:
+==
+
+An IOMMU can provide the following services:
+
+* Remap address space to allow devices to access physical memory ranges that
+  they otherwise wouldn't be capable of accessing.
+
+  Example: 32-bit DMA to 64-bit physical addresses
+
+* Implement scatter-gather at page level granularity so that the device does
+  not have to.
+
+* Provide system protection against rogue DMA by forcing all accesses to go
+  through the IOMMU and faulting when encountering accesses to unmapped
+  address regions.
+
+* Provide address space isolation between multiple contexts.
+
+  Example: Virtualization
+
+Device nodes compatible with this binding represent hardware with some of the
+above capabilities.
+
+IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
+typically have a fixed association to the master device, whereas multiple-
+master IOMMU devices can translate accesses from more than one master.
+
+The device tree node of the IOMMU device's parent bus must contain a valid
+dma-ranges property that describes how the physical address space of the
+IOMMU maps to memory. An empty dma-ranges property means that there is a
+1:1 mapping from IOMMU to memory.
+
+Required properties:
+
+- #address-cells: The number of cells in an IOMMU specifier needed to encode
+  an address.
+- #size-cells: The number of cells in an IOMMU specifier needed to represent
+  the length of an address range.
+
+Typical values for the above include:
+- #address-cells = 0, size-cells = 0: Single master IOMMU devices are not
+  configurable and therefore no additional information needs to be encoded in
+  the specifier. This may also apply to multiple master IOMMU devices that do
+  not allow the association of masters to be configured.
+- #address-cells = 1, size-cells = 0: Multiple master IOMMU devices may
+  need to be configured in order to enable translation for a given master. In
+  such cases the single address cell corresponds to the master device's ID.
+- #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
+  window for masters to be configured. The first cell of the address in this
+  may contain the master device's ID for example, while the second cell could
+  contain the start of the DMA window for the given device. The length of the
+  DMA window is specified by two additional cells.
+
+
+IOMMU master node:
+==
+
+Devices that access memory through an IOMMU are called masters. A device can
+have multiple master interfaces (to one or more IOMMU devices).
+
+Required properties:
+
+- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
+  master interfaces of the device. One entry in the list describes one master
+  interface of the device.
+
+When an iommus property is specified in a device tree node, the IOMMU will
+be used for address translation. If a dma-ranges property exists in the
+device's parent node it will be ignored. An exception to this rule is if the
+referenced IOMMU is disabled, in which case the dma-ranges property of the
+parent shall take effect.
+
+Optional properties:
+
+- iommu-names: A list of names identifying each entry in the iommus
+  property.
+
+
+Notes:
+==
+
+One possible extension to the above is to use an iommus property along with
+a dma-ranges property in a bus device node (such as PCI host bridges). This
+can be useful to describe how children on the bus relate to the IOMMU if they
+are not explicitly listed in the device tree (e.g. PCI devices). However, the
+requirements of that use-case haven't been fully determined yet. Implementing
+this is therefore not recommended without further discussion and extension of
+this binding.
+
+
+Examples:
+=
+
+Single-master IOMMU

[PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-05-23 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

https://lkml.org/lkml/2014/4/27/346

Signed-off-by: Thierry Reding tred...@nvidia.com
---
Apologies for the noise, but apparently I mistyped one of the email
addresses, should be fixed now.

Changes in v2:
- add notes about dma-ranges property (drop note from commit message)
- document priorities of iommus property vs. dma-ranges property
- drop #iommu-cells in favour of #address-cells and #size-cells
- remove multiple-master device example

 Documentation/devicetree/bindings/iommu/iommu.txt | 167 ++
 1 file changed, 167 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt 
b/Documentation/devicetree/bindings/iommu/iommu.txt
new file mode 100644
index ..6ce759afcc94
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -0,0 +1,167 @@
+This document describes the generic device tree binding for IOMMUs and their
+master(s).
+
+
+IOMMU device node:
+==
+
+An IOMMU can provide the following services:
+
+* Remap address space to allow devices to access physical memory ranges that
+  they otherwise wouldn't be capable of accessing.
+
+  Example: 32-bit DMA to 64-bit physical addresses
+
+* Implement scatter-gather at page level granularity so that the device does
+  not have to.
+
+* Provide system protection against rogue DMA by forcing all accesses to go
+  through the IOMMU and faulting when encountering accesses to unmapped
+  address regions.
+
+* Provide address space isolation between multiple contexts.
+
+  Example: Virtualization
+
+Device nodes compatible with this binding represent hardware with some of the
+above capabilities.
+
+IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
+typically have a fixed association to the master device, whereas multiple-
+master IOMMU devices can translate accesses from more than one master.
+
+The device tree node of the IOMMU device's parent bus must contain a valid
+dma-ranges property that describes how the physical address space of the
+IOMMU maps to memory. An empty dma-ranges property means that there is a
+1:1 mapping from IOMMU to memory.
+
+Required properties:
+
+- #address-cells: The number of cells in an IOMMU specifier needed to encode
+  an address.
+- #size-cells: The number of cells in an IOMMU specifier needed to represent
+  the length of an address range.
+
+Typical values for the above include:
+- #address-cells = 0, size-cells = 0: Single master IOMMU devices are not
+  configurable and therefore no additional information needs to be encoded in
+  the specifier. This may also apply to multiple master IOMMU devices that do
+  not allow the association of masters to be configured.
+- #address-cells = 1, size-cells = 0: Multiple master IOMMU devices may
+  need to be configured in order to enable translation for a given master. In
+  such cases the single address cell corresponds to the master device's ID.
+- #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
+  window for masters to be configured. The first cell of the address in this
+  may contain the master device's ID for example, while the second cell could
+  contain the start of the DMA window for the given device. The length of the
+  DMA window is specified by two additional cells.
+
+
+IOMMU master node:
+==
+
+Devices that access memory through an IOMMU are called masters. A device can
+have multiple master interfaces (to one or more IOMMU devices).
+
+Required properties:
+
+- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
+  master interfaces of the device. One entry in the list describes one master
+  interface of the device.
+
+When an iommus property is specified in a device tree node, the IOMMU will
+be used for address translation. If a dma-ranges property exists in the
+device's parent node it will be ignored. An exception to this rule is if the
+referenced IOMMU is disabled, in which case the dma-ranges property of the
+parent shall take effect.
+
+Optional properties:
+
+- iommu-names: A list of names identifying each entry in the iommus
+  property.
+
+
+Notes:
+==
+
+One possible extension to the above is to use an iommus property along with
+a dma-ranges property in a bus device node (such as PCI host bridges). This
+can be useful to describe how children on the bus relate to the IOMMU if they
+are not explicitly listed in the device tree (e.g. PCI devices). However, the
+requirements of that use-case haven't been fully determined yet. Implementing
+this is therefore not recommended without further

Re: [PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-06-04 Thread Thierry Reding
On Sun, Jun 01, 2014 at 10:55:46AM +0100, Will Deacon wrote:
 On Fri, May 30, 2014 at 08:54:37PM +0100, Arnd Bergmann wrote:
  On Friday 30 May 2014 22:29:13 Hiroshi Doyu wrote:
   Tegra,SMMU has a similar problem and we have used a fixed size bitmap(64
   bit) to afford 64 stream IDs so that a single device can hold multiple
   IDs. If we apply the same bitmap to the above exmaple:
   
  iommu {
  /* the specifier represents the ID of the master */
  #address-cells = 1;
  #size-cells = 0;
  };
   
   master@a {
   ...
   iommus = smmu (BIT(1) | BIT(2) | BIT(3)); # IDs 1 2 3
   };
   
   master@b {
   ...
   iommus = smmu BIT(4); # ID 4
   };
   
   The disadvantage of this is that this limits the max number of streamIDs
   to support. If # of streamID is increased later more than 64, this
   format cannot cover any more. You have to predict the max # of streamIDs
   in advance if steamID is statically assigned.
   
  
  Well, the iommu specific binding could allow a variable #address-cells.
  That way, you just need to know the number of stream IDs for that instance
  of the iommu.
 
 In general, though, the SMMU will be able to support a large number of
 stream IDs (say a 16-bit space). The restriction we're interested in here is
 how many different stream IDs can be emitted by a single master device
 coming into the SMMU. *That* is a property of the master, not the SMMU.
 
 In the current arm,smmu binding I have a #stream-id-cells property in each
 master. I can then feed that straight into of_parse_phandle_with_args to
 enumerate the IDs for that master. The problem with that is we're
 artificially restricted by MAX_PHANDLE_ARGS.

Maybe I don't fully understand, but since we leave it up to the IOMMU
binding to define the exact meaning of #iommu-cells, can't you simply
use that to your advantage and define something like:

iommus = smmu 0 7;

to mean IDs 0 to 7 for this particular IOMMU type?

Thierry


pgptrPK3LrtPC.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-06-04 Thread Thierry Reding
On Fri, May 30, 2014 at 09:54:37PM +0200, Arnd Bergmann wrote:
 On Friday 30 May 2014 22:29:13 Hiroshi Doyu wrote:
  
  IIUC the original problem, a master with 8 streamIDs means something
  like below, where some devices have multiple IDs but some have a
  single. A sinle #address-cells cannot afford those 2 masters at once.
  
 iommu {
 /* the specifier represents the ID of the master */
 #address-cells = 1;
 #size-cells = 0;
 };
  
  master@a {
  ...
  iommus = smmu 1 2 3; # 3 IDs
  };
  
  master@b {
  ...
  iommus = smmu 4; # 1 ID
  };
 
 This would not be the usual format really. It should instead be
 
   iommus = smmu 1, smmu 2, smmu 3;
 
 which can be tedious to type.

Tedious to type doesn't sound like a good argument to me. I don't see
why the above would necessarily be a bad notation. It's very much up to
the point and very explicit. This very obviously translates to:

This device has three master interfaces, one for smmu ID 1, one
for smmu ID 2 and one for smmu ID 3.

  Tegra,SMMU has a similar problem and we have used a fixed size bitmap(64
  bit) to afford 64 stream IDs so that a single device can hold multiple
  IDs. If we apply the same bitmap to the above exmaple:
  
 iommu {
 /* the specifier represents the ID of the master */
 #address-cells = 1;
 #size-cells = 0;
 };
  
  master@a {
  ...
  iommus = smmu (BIT(1) | BIT(2) | BIT(3)); # IDs 1 2 3
  };
  
  master@b {
  ...
  iommus = smmu BIT(4); # ID 4
  };
  
  The disadvantage of this is that this limits the max number of streamIDs
  to support. If # of streamID is increased later more than 64, this
  format cannot cover any more. You have to predict the max # of streamIDs
  in advance if steamID is statically assigned.
  
 
 Well, the iommu specific binding could allow a variable #address-cells.
 That way, you just need to know the number of stream IDs for that instance
 of the iommu.

That sounds fairly complicated to me. I don't see what that buys us over
the clarity and simplicity that the above explicit notation gives us. Is
it not more common for a device to have a single master rather than a
whole bunch of them?

Thierry


pgpfX16DYEzBD.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-06-04 Thread Thierry Reding
On Fri, May 30, 2014 at 09:01:19PM +0200, Arnd Bergmann wrote:
 On Friday 30 May 2014 12:22:32 Dave Martin wrote:
   +
   +Examples:
   +=
   +
   +Single-master IOMMU:
   +
   +
   + iommu {
   + #address-cells = 0;
   + #size-cells = 0;
   + };
   +
   + master {
   + iommus = /iommu;
   + };
   +
   +Multiple-master IOMMU with fixed associations:
   +--
   +
   + /* multiple-master IOMMU */
   + iommu {
   + /*
   +  * Masters are statically associated with this IOMMU and
   +  * address translation is always enabled.
   +  */
   + #address-cells = 0;
   + #size-cells = 0;
  
  In this example, can different translations be set up for the different
  masters?
  
  With no cells available to contain any sort of ID, it looks like this
  is not possible.
 
 Correct, this example is for an IOMMU that does not use IDs but has a
 shared address space for all devices.

Couldn't these device all still have separate address spaces?

Thierry


pgpbp5MPMJr08.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-06-09 Thread Thierry Reding
On Sat, Jun 07, 2014 at 03:22:13PM +0200, Arnd Bergmann wrote:
 On Saturday 07 June 2014 00:45:45 Thierry Reding wrote:
  This is somewhat off-topic, but given the various concepts discussed in
  this thread I'm beginning to wonder how they will be implemented.
 
 I think it's good you raised the question.
 
  The
  current implementation hooks the IOMMU API into the DMA mapping API, and
  the way this is done is by setting a single IOMMU (or rather a set of
  IOMMU operations) globally per bus type.
 
 I hadn't realized that we have a per-bus iommu_ops pointer. I agree
 this will become a limitation as soon as we have a soc with two different
 IOMMUs that have platform devices attached, and it has to be moved into
 the device or a structure related to that.
 
 If that turns out controversial, we can probably have a set of pseudo
 iommu ops that just call into dev-archdata-iommu_ops-function()
 for ARM.
 
  There are two issues that I can see with that: one is that we can't
  support multiple IOMMUs in the system at all, and the other is that
  there is no context associated with the IOMMU ops, and therefore there
  is no way to differentiate between two instances of the same IOMMU. A
  few drivers use global variables to keep track of context information
  but that won't work with multiple instances, unless they keep a global
  list of all instances and then require explicit lookup in each of the
  IOMMU operations. That sounds more like a workaround rather than a
  proper solution to me.
 
 Supporting multiple iommus that share one iommu driver should work
 without such hacks, as you can put the per-device information into
 dev-device_dma_parameters (this works only for very simple IOMMUs)
 or dev-archdata-iommu

I was talking about the lack of a place to store context for the IOMMU
itself. Currently none of the functions in iommu_ops have a way to get
access to the IOMMU context itself. In fact there's not even a common
structure that could be used for this purpose. I have a couple of local
patches that try to add something like that, along with functions to
more explicitly hook up a device with it's IOMMU(s). It looks somewhat
like this:

struct iommu {
struct device *dev;

struct list_head list;

const struct iommu_ops *ops;
};

/* register and unregister IOMMU device with core */
int iommu_add(struct iommu *iommu);
void iommu_remove(struct iommu *iommu);

/*
 * Attach a device to one or more IOMMUs (according to the
 * iommus property).
 */
int iommu_attach(struct device *dev);
int iommu_detach(struct device *dev);

Does that look like a direction that we would want to pursue?

 (we may want to generalize that, I think someone just posted patches
 for it).

Perhaps you mean this:

[PATCHv3 1/3] device.h: arm dma-iommu: Move out dma_iommu_mapping struct

? From a quick glance that indeed looks like a promising step towards
unifying this across architectures.

  Discussion in this thread indicates that both of the above will be
  required at some point. Have I completely missed something or will we
  have to rework (parts of) the IOMMU API to make this work?
  
  One other thing that I have some difficulty understanding is how we can
  support things like process isolation using the current IOMMU API. Since
  a device will be statically assigned to one IOMMU domain at probe time
  there is no way we can change the address space upon a context switch.
 
 We have just introduced a way to parse dma-ranges in of_dma_configure().
 
 The only way I see this done for platform devices is to do the IOMMU
 configuration in the same place: if an iommus property is found there,
 we call out to the iommu driver that matches the respective iommu device
 and let it configure the master device.
 
 The device already has multiple properties related to iommus:
 'struct device_dma_parameters', 'archdata', 'iommu_group', and
 pdev_archdata for platform devices. This should be enough to set up
 the default iommu dma_map_ops so we can have non-isolated DMA using
 dma_map_* and dma_alloc_coherent.

Right, I think up to that point things should be fine with the existing
IOMMU API and using only DMA mapping functions.

 I haven't given much thought to devices that want to use the IOMMU
 API directly so they can have multiple domains rather than rely on
 the dma-mapping abstraction.

I'm specifically thinking about cases where we want to use the IOMMU to
isolate processes from each other. This is probably most relevant for
GPUs, since they are driven to a large degree from userspace. Other
peripherals are mostly services in kernel space exclusively, so I don't
think the issue is as relevant there.

On Tegra there's two IOMMUs, one system-wide and another one directly
used by the GPU (and programmed by the GPU driver (nouveau)). For the
latter it probably doesn't make sense to expose

Re: [PATCH v2] devicetree: Add generic IOMMU device tree bindings

2014-06-17 Thread Thierry Reding
On Mon, Jun 16, 2014 at 01:57:04PM +0100, Will Deacon wrote:
 On Wed, Jun 04, 2014 at 10:12:38PM +0100, Thierry Reding wrote:
  On Fri, May 30, 2014 at 12:27:28PM +0100, Dave Martin wrote:
   On Fri, May 30, 2014 at 08:30:08AM +0100, Thierry Reding wrote:
  [...]
Arnd, can you take another look at this binding and see if there's
anything else missing? If not I'll go through the document again and
update all #address-cells/#size-cells references with #iommu-cells as
appropriate and submit v3.
   
   How do you envisage propagation of the master ID bits downstream of the
   IOMMU would be described?
   
   We will definitely need a way to describe this for GICv3.  How those
   values are propagated is likely to vary between related SoCs and doesn't
   feel like it should be baked into a driver, especially for the ARM SMMU
   which may get reused in radically different SoC families from different
   vendors.
  
  Well, we've had cases like these in the past (power sequences come to
  mind). Some concepts are just too difficult or unwieldy to be put into
  device tree. I think that this is one of them.
  
   The most likely types of remapping are the adding of a base offset or
   some extra bits to the ID -- because not all MSIs to the GIC will
   necessarily pass through the IOMMU.  It's also possible that we might
   see ID squashing or folding in some systems.
  
  It can easily be argued that if the algorithm used to remap the ID
  varies, the compatibility of the device changes. Therefore I would
  expect any variant of the GICv3 that deviates from the standard
  mapping (if there is such a thing) to have its own compatible string.
 
 There is no standard mapping; it's a property defined at system integration
 time. I fully expect different SoCs to do different things here.

My point was that the mapping itself seems to be fundamental enough to
make devices with different mappings incompatible. Therefore I think
this could probably be handled by using different compatible values,
something along the lines of this:

compatible = vendor,soc-gicv3, arm,gicv3;

Then the mapping can be described in code, which should be a whole lot
easier and more flexible than a more or less generic notation in device
tree.

Thierry


pgp8UpCTM8rzE.pgp
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] iommu: Constify struct iommu_ops

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

This structure is read-only data and should never be modified.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 drivers/iommu/amd_iommu.c   |  4 ++--
 drivers/iommu/arm-smmu.c|  2 +-
 drivers/iommu/exynos-iommu.c|  2 +-
 drivers/iommu/fsl_pamu_domain.c |  2 +-
 drivers/iommu/intel-iommu.c |  4 ++--
 drivers/iommu/iommu.c   | 19 ++-
 drivers/iommu/ipmmu-vmsa.c  |  2 +-
 drivers/iommu/msm_iommu.c   |  2 +-
 drivers/iommu/omap-iommu.c  |  2 +-
 drivers/iommu/shmobile-iommu.c  |  2 +-
 drivers/iommu/tegra-gart.c  |  2 +-
 drivers/iommu/tegra-smmu.c  |  2 +-
 include/linux/iommu.h   |  4 ++--
 13 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4aec6a29e316..7eb0e4246a79 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -81,7 +81,7 @@ LIST_HEAD(hpet_map);
  */
 static struct protection_domain *pt_domain;
 
-static struct iommu_ops amd_iommu_ops;
+static const struct iommu_ops amd_iommu_ops;
 
 static ATOMIC_NOTIFIER_HEAD(ppr_notifier);
 int amd_iommu_max_glx_val = -1;
@@ -3473,7 +3473,7 @@ static int amd_iommu_domain_has_cap(struct iommu_domain 
*domain,
return 0;
 }
 
-static struct iommu_ops amd_iommu_ops = {
+static const struct iommu_ops amd_iommu_ops = {
.domain_init = amd_iommu_domain_init,
.domain_destroy = amd_iommu_domain_destroy,
.attach_dev = amd_iommu_attach_device,
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1599354e974d..67727294e6b5 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1609,7 +1609,7 @@ static void arm_smmu_remove_device(struct device *dev)
iommu_group_remove_device(dev);
 }
 
-static struct iommu_ops arm_smmu_ops = {
+static const struct iommu_ops arm_smmu_ops = {
.domain_init= arm_smmu_domain_init,
.domain_destroy = arm_smmu_domain_destroy,
.attach_dev = arm_smmu_attach_dev,
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 99054d2c040d..d037e87a1fe5 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1170,7 +1170,7 @@ static void exynos_iommu_remove_device(struct device *dev)
iommu_group_remove_device(dev);
 }
 
-static struct iommu_ops exynos_iommu_ops = {
+static const struct iommu_ops exynos_iommu_ops = {
.domain_init = exynos_iommu_domain_init,
.domain_destroy = exynos_iommu_domain_destroy,
.attach_dev = exynos_iommu_attach_device,
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 93072ba44b1d..5b47c5495873 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -1140,7 +1140,7 @@ static u32 fsl_pamu_get_windows(struct iommu_domain 
*domain)
return dma_domain-win_cnt;
 }
 
-static struct iommu_ops fsl_pamu_ops = {
+static const struct iommu_ops fsl_pamu_ops = {
.domain_init= fsl_pamu_domain_init,
.domain_destroy = fsl_pamu_domain_destroy,
.attach_dev = fsl_pamu_attach_device,
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 51b6b77dc3e5..9aab7963f424 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -451,7 +451,7 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
 static DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
-static struct iommu_ops intel_iommu_ops;
+static const struct iommu_ops intel_iommu_ops;
 
 static int __init intel_iommu_setup(char *str)
 {
@@ -4465,7 +4465,7 @@ static void intel_iommu_remove_device(struct device *dev)
iommu_group_remove_device(dev);
 }
 
-static struct iommu_ops intel_iommu_ops = {
+static const struct iommu_ops intel_iommu_ops = {
.domain_init= intel_iommu_domain_init,
.domain_destroy = intel_iommu_domain_destroy,
.attach_dev = intel_iommu_attach_device,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 896ce352d88e..5e9e82c73bbf 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,6 +39,10 @@ static struct kset *iommu_group_kset;
 static struct ida iommu_group_ida;
 static struct mutex iommu_group_mutex;
 
+struct iommu_callback_data {
+   const struct iommu_ops *ops;
+};
+
 struct iommu_group {
struct kobject kobj;
struct kobject *devices_kobj;
@@ -520,7 +524,8 @@ EXPORT_SYMBOL_GPL(iommu_group_id);
 
 static int add_iommu_group(struct device *dev, void *data)
 {
-   struct iommu_ops *ops = data;
+   struct iommu_callback_data *cb = data;
+   const struct iommu_ops *ops = cb-ops;
 
if (!ops-add_device)
return -ENODEV;
@@ -536,7 +541,7 @@ static int iommu_bus_notifier(struct notifier_block *nb,
  unsigned long action, void *data)
 {
struct device *dev = data;
-   struct

[RFC 05/10] ARM: tegra: Add memory controller on Tegra124

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

Add the memory controller and wire up the interrupt that is used to
report errors. Also add an #iommu-cells property to make the device
as an IOMMU.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 arch/arm/boot/dts/tegra124.dtsi | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 0bf050696186..efa0f0c519be 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -560,6 +560,15 @@
reset-names = fuse;
};
 
+   mc: memory-controller@0,70019000 {
+   compatible = nvidia,tegra124-mc;
+   reg = 0x0 0x70019000 0x0 0x1000;
+
+   interrupts = GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH;
+
+   #iommu-cells = 1;
+   };
+
hda@0,7003 {
compatible = nvidia,tegra124-hda, nvidia,tegra30-hda;
reg = 0x0 0x7003 0x0 0x1;
-- 
2.0.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC 01/10] iommu: Add IOMMU device registry

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

Add an IOMMU device registry for drivers to register with and implement
a method for users of the IOMMU API to attach to an IOMMU device. This
allows to support deferred probing and gives the IOMMU API a convenient
hook to perform early initialization of a device if necessary.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 drivers/iommu/iommu.c | 93 +++
 include/linux/iommu.h | 27 +++
 2 files changed, 120 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 806b55d056b7..5e9e82c73bbf 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,8 +29,12 @@
 #include linux/idr.h
 #include linux/notifier.h
 #include linux/err.h
+#include linux/of.h
 #include trace/events/iommu.h
 
+static DEFINE_MUTEX(iommus_lock);
+static LIST_HEAD(iommus);
+
 static struct kset *iommu_group_kset;
 static struct ida iommu_group_ida;
 static struct mutex iommu_group_mutex;
@@ -1004,3 +1008,92 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
+
+int iommu_add(struct iommu *iommu)
+{
+   mutex_lock(iommus_lock);
+   list_add_tail(iommu-list, iommus);
+   mutex_unlock(iommus_lock);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_add);
+
+void iommu_remove(struct iommu *iommu)
+{
+   mutex_lock(iommus_lock);
+   list_del_init(iommu-list);
+   mutex_unlock(iommus_lock);
+}
+EXPORT_SYMBOL_GPL(iommu_remove);
+
+static int of_iommu_attach(struct device *dev)
+{
+   struct of_phandle_iter iter;
+   struct iommu *iommu;
+
+   mutex_lock(iommus_lock);
+
+   of_property_for_each_phandle_with_args(iter, dev-of_node, iommus,
+  #iommu-cells, 0) {
+   bool found = false;
+   int err;
+
+   /* skip disabled IOMMUs */
+   if (!of_device_is_available(iter.out_args.np))
+   continue;
+
+   list_for_each_entry(iommu, iommus, list) {
+   if (iommu-dev-of_node == iter.out_args.np) {
+   err = iommu-ops-attach(iommu, dev);
+   if (err  0) {
+   }
+
+   found = true;
+   }
+   }
+
+   if (!found) {
+   mutex_unlock(iommus_lock);
+   return -EPROBE_DEFER;
+   }
+   }
+
+   mutex_unlock(iommus_lock);
+
+   return 0;
+}
+
+static int of_iommu_detach(struct device *dev)
+{
+   /* TODO: implement */
+   return -ENOSYS;
+}
+
+int iommu_attach(struct device *dev)
+{
+   int err = 0;
+
+   if (IS_ENABLED(CONFIG_OF)  dev-of_node) {
+   err = of_iommu_attach(dev);
+   if (!err)
+   return 0;
+   }
+
+   return err;
+}
+EXPORT_SYMBOL_GPL(iommu_attach);
+
+int iommu_detach(struct device *dev)
+{
+   int err = 0;
+
+   if (IS_ENABLED(CONFIG_OF)  dev-of_node) {
+   err = of_iommu_detach(dev);
+   if (!err)
+   return 0;
+   }
+
+   return err;
+}
+EXPORT_SYMBOL_GPL(iommu_detach);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 284a4683fdc1..ac2ceef194d4 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -43,6 +43,17 @@ struct notifier_block;
 typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
struct device *, unsigned long, int, void *);
 
+struct iommu {
+   struct device *dev;
+
+   struct list_head list;
+
+   const struct iommu_ops *ops;
+};
+
+int iommu_add(struct iommu *iommu);
+void iommu_remove(struct iommu *iommu);
+
 struct iommu_domain_geometry {
dma_addr_t aperture_start; /* First address that can be mapped*/
dma_addr_t aperture_end;   /* Last address that can be mapped */
@@ -130,6 +141,9 @@ struct iommu_ops {
/* Get the numer of window per domain */
u32 (*domain_get_windows)(struct iommu_domain *domain);
 
+   int (*attach)(struct iommu *iommu, struct device *dev);
+   int (*detach)(struct iommu *iommu, struct device *dev);
+
unsigned long pgsize_bitmap;
 };
 
@@ -192,6 +206,10 @@ extern int iommu_domain_window_enable(struct iommu_domain 
*domain, u32 wnd_nr,
  phys_addr_t offset, u64 size,
  int prot);
 extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 
wnd_nr);
+
+int iommu_attach(struct device *dev);
+int iommu_detach(struct device *dev);
+
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
@@ -396,6 +414,15 @@ static inline int iommu_domain_set_attr(struct 
iommu_domain *domain,
return

[RFC 03/10] of: Add NVIDIA Tegra124 memory controller binding

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.

In addition, the memory controller implements an SMMU (IOMMU) which can
translate I/O virtual addresses to physical addresses for clients. This
is useful for scatter-gather operation on devices that don't support it
natively and for virtualization or process separation.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 .../bindings/memory-controllers/nvidia,tegra124-mc.txt   | 12 
 1 file changed, 12 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt

diff --git 
a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt 
b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
new file mode 100644
index ..4c922e839059
--- /dev/null
+++ 
b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
@@ -0,0 +1,12 @@
+NVIDIA Tegra124 Memory Controller device tree bindings
+==
+
+Required properties:
+- compatible: Should be nvidia,tegra124-mc
+- reg: Physical base address and length of the controller's registers.
+- interrupts: The interrupt outputs from the controller.
+- #iommu-cells: Should be 1. The single cell of the IOMMU specifier defines
+  the SWGROUP of the master.
+
+This device implements an IOMMU that complies with the generic IOMMU binding.
+See ../iommu/iommu.txt for details.
-- 
2.0.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 02/10] devicetree: Add generic IOMMU device tree bindings

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

https://lkml.org/lkml/2014/4/27/346

Signed-off-by: Thierry Reding tred...@nvidia.com
---
Changes in v3:
- use #iommu-cells instead of #address-cells/#size-cells
- drop optional iommu-names property

Changes in v2:
- add notes about dma-ranges property (drop note from commit message)
- document priorities of iommus property vs. dma-ranges property
- drop #iommu-cells in favour of #address-cells and #size-cells
- remove multiple-master device example

 Documentation/devicetree/bindings/iommu/iommu.txt | 156 ++
 1 file changed, 156 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt 
b/Documentation/devicetree/bindings/iommu/iommu.txt
new file mode 100644
index ..f8f03f057156
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -0,0 +1,156 @@
+This document describes the generic device tree binding for IOMMUs and their
+master(s).
+
+
+IOMMU device node:
+==
+
+An IOMMU can provide the following services:
+
+* Remap address space to allow devices to access physical memory ranges that
+  they otherwise wouldn't be capable of accessing.
+
+  Example: 32-bit DMA to 64-bit physical addresses
+
+* Implement scatter-gather at page level granularity so that the device does
+  not have to.
+
+* Provide system protection against rogue DMA by forcing all accesses to go
+  through the IOMMU and faulting when encountering accesses to unmapped
+  address regions.
+
+* Provide address space isolation between multiple contexts.
+
+  Example: Virtualization
+
+Device nodes compatible with this binding represent hardware with some of the
+above capabilities.
+
+IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
+typically have a fixed association to the master device, whereas multiple-
+master IOMMU devices can translate accesses from more than one master.
+
+The device tree node of the IOMMU device's parent bus must contain a valid
+dma-ranges property that describes how the physical address space of the
+IOMMU maps to memory. An empty dma-ranges property means that there is a
+1:1 mapping from IOMMU to memory.
+
+Required properties:
+
+- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
+  address.
+
+Typical values for the above include:
+- #iommu-cells = 0: Single master IOMMU devices are not configurable and
+  therefore no additional information needs to be encoded in the specifier.
+  This may also apply to multiple master IOMMU devices that do not allow the
+  association of masters to be configured.
+- #iommu-cells = 1: Multiple master IOMMU devices may need to be configured
+  in order to enable translation for a given master. In such cases the single
+  address cell corresponds to the master device's ID.
+- #iommu-cells = 4: Some IOMMU devices allow the DMA window for masters to
+  be configured. The first cell of the address in this may contain the master
+  device's ID for example, while the second cell could contain the start of
+  the DMA window for the given device. The length of the DMA window is given
+  by the third and fourth cells.
+
+
+IOMMU master node:
+==
+
+Devices that access memory through an IOMMU are called masters. A device can
+have multiple master interfaces (to one or more IOMMU devices).
+
+Required properties:
+
+- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
+  master interfaces of the device. One entry in the list describes one master
+  interface of the device.
+
+When an iommus property is specified in a device tree node, the IOMMU will
+be used for address translation. If a dma-ranges property exists in the
+device's parent node it will be ignored. An exception to this rule is if the
+referenced IOMMU is disabled, in which case the dma-ranges property of the
+parent shall take effect.
+
+
+Notes:
+==
+
+One possible extension to the above is to use an iommus property along with
+a dma-ranges property in a bus device node (such as PCI host bridges). This
+can be useful to describe how children on the bus relate to the IOMMU if they
+are not explicitly listed in the device tree (e.g. PCI devices). However, the
+requirements of that use-case haven't been fully determined yet. Implementing
+this is therefore not recommended without further discussion and extension of
+this binding.
+
+
+Examples:
+=
+
+Single-master IOMMU:
+
+
+   iommu {
+   #iommu-cells = 0;
+   };
+
+   master {
+   iommus = /iommu;
+   };
+
+Multiple-master IOMMU with fixed

[RFC 04/10] memory: Add Tegra124 memory controller support

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.

Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.

This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 drivers/memory/Kconfig   |9 +
 drivers/memory/Makefile  |1 +
 drivers/memory/tegra124-mc.c | 1945 ++
 include/dt-bindings/memory/tegra124-mc.h |   30 +
 4 files changed, 1985 insertions(+)
 create mode 100644 drivers/memory/tegra124-mc.c
 create mode 100644 include/dt-bindings/memory/tegra124-mc.h

diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c96e86d..d0f0e6781570 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -61,6 +61,15 @@ config TEGRA30_MC
  analysis, especially for IOMMU/SMMU(System Memory Management
  Unit) module.
 
+config TEGRA124_MC
+   bool Tegra124 Memory Controller driver
+   depends on ARCH_TEGRA
+   select IOMMU_API
+   help
+ This driver is for the Memory Controller module available on
+ Tegra124 SoCs. It provides an IOMMU that can be used for I/O
+ virtual address translation.
+
 config FSL_IFC
bool
depends on FSL_SOC
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 71160a2b7313..03143927abab 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC) += fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC)   += tegra20-mc.o
 obj-$(CONFIG_TEGRA30_MC)   += tegra30-mc.o
+obj-$(CONFIG_TEGRA124_MC)  += tegra124-mc.o
diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
new file mode 100644
index ..741755b6785d
--- /dev/null
+++ b/drivers/memory/tegra124-mc.c
@@ -0,0 +1,1945 @@
+/*
+ * Copyright (C) 2014 NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include linux/interrupt.h
+#include linux/io.h
+#include linux/iommu.h
+#include linux/kernel.h
+#include linux/module.h
+#include linux/of.h
+#include linux/platform_device.h
+#include linux/slab.h
+
+#include dt-bindings/memory/tegra124-mc.h
+
+#include asm/cacheflush.h
+#ifndef CONFIG_ARM64
+#include asm/dma-iommu.h
+#endif
+
+#define MC_INTSTATUS 0x000
+#define  MC_INT_DECERR_MTS (1  16)
+#define  MC_INT_SECERR_SEC (1  13)
+#define  MC_INT_DECERR_VPR (1  12)
+#define  MC_INT_INVALID_APB_ASID_UPDATE (1  11)
+#define  MC_INT_INVALID_SMMU_PAGE (1  10)
+#define  MC_INT_ARBITRATION_EMEM (1  9)
+#define  MC_INT_SECURITY_VIOLATION (1  8)
+#define  MC_INT_DECERR_EMEM (1  6)
+#define MC_INTMASK 0x004
+#define MC_ERR_STATUS 0x08
+#define MC_ERR_ADR 0x0c
+
+struct latency_allowance {
+   unsigned int reg;
+   unsigned int shift;
+   unsigned int mask;
+   unsigned int def;
+};
+
+struct smmu_enable {
+   unsigned int reg;
+   unsigned int bit;
+};
+
+struct tegra_mc_client {
+   unsigned int id;
+   const char *name;
+   unsigned int swgroup;
+
+   struct smmu_enable smmu;
+   struct latency_allowance latency;
+};
+
+static const struct tegra_mc_client tegra124_mc_clients[] = {
+   {
+   .id = 0x01,
+   .name = display0a,
+   .swgroup = TEGRA_SWGROUP_DC,
+   .smmu = {
+   .reg = 0x228,
+   .bit = 1,
+   },
+   .latency = {
+   .reg = 0x2e8,
+   .shift = 0,
+   .mask = 0xff,
+   .def = 0xc2,
+   },
+   }, {
+   .id = 0x02,
+   .name = display0ab,
+   .swgroup = TEGRA_SWGROUP_DCB,
+   .smmu = {
+   .reg = 0x228,
+   .bit = 2,
+   },
+   .latency = {
+   .reg = 0x2f4,
+   .shift = 0,
+   .mask = 0xff,
+   .def = 0xc6,
+   },
+   }, {
+   .id = 0x03,
+   .name = display0b,
+   .swgroup = TEGRA_SWGROUP_DC,
+   .smmu = {
+   .reg = 0x228,
+   .bit = 3,
+   },
+   .latency = {
+   .reg = 0x2e8,
+   .shift = 16,
+   .mask = 0xff,
+   .def = 0x50

[RFC 06/10] ARM: tegra: tegra124: Enable IOMMU for display controllers

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

Add an iommus property to each of the display controllers and encode the
SWGROUP in the specifier.

Signed-off-by: Thierry Reding tred...@nvidia.com
---
 arch/arm/boot/dts/tegra124.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index efa0f0c519be..82751d2878c4 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -3,6 +3,7 @@
 #include dt-bindings/pinctrl/pinctrl-tegra.h
 #include dt-bindings/pinctrl/pinctrl-tegra-xusb.h
 #include dt-bindings/interrupt-controller/arm-gic.h
+#include dt-bindings/memory/tegra124-mc.h
 
 #include skeleton.dtsi
 
@@ -104,6 +105,8 @@
reset-names = dc;
 
nvidia,head = 0;
+
+   iommus = mc TEGRA_SWGROUP_DC;
};
 
dc@0,5424 {
@@ -117,6 +120,8 @@
reset-names = dc;
 
nvidia,head = 1;
+
+   iommus = mc TEGRA_SWGROUP_DCB;
};
 
hdmi@0,5428 {
-- 
2.0.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC 00/10] Add NVIDIA Tegra124 IOMMU support

2014-06-26 Thread Thierry Reding
From: Thierry Reding tred...@nvidia.com

This series adds support for the IOMMU found on Tegra124 SoCs. The SMMU
groups memory clients into SWGROUPs and each SWGROUP can be assigned to
one I/O virtual address space. Translation of virtual addresses can be
enabled per memory client.

Patch 1 adds an IOMMU device registry. The driver in patch 4 will add
the IOMMU device with this registry, which will in turn be used by the
client drivers to attach to the IOMMU device. Note that the API that is
introduced in this patch may not be sufficient in the long term (f.e.
when multiple master interfaces need to be supported).

Patch 2 is v3 of the generic IOMMU device tree binding that has been
discussed previously. Patch 3 defines the device tree binding for the
NVIDIA Tegra124 memory controller (and references the generic IOMMU
binding).

Patch 4 implements a memory controller driver for NVIDIA Tegra124. It
initializes the latency allowance programming to sensible defaults and
registers an IOMMU device. Note that this is still somewhat work in
progress. The page tables aren't properly cleaned up yet and other
features of the memory controller may be useful to implement
subsequently.

Patches 5 through 8 add the device tree node for the memory controller
and enable IOMMU support in the display and SDMMC controllers as
examples.

Patches 9 and 10 add support for IOMMU to the DRM and SDMMC drivers.
SDMMC uses the DMA mapping API, which will make use of ARM's DMA/IOMMU
integration. DRM has special needs (buffers that are mapped can be
scanned out by either display controller) and not a good fit for the
DMA mapping API, so it uses the IOMMU API directly.

This has been tested using both SDMMC and DRM drivers via the IOMMU. For
DRM when an IOMMU is detected it will use shmem as backing store, which
removes the need for CMA. Importing from gk20a via the Nouveau driver
also works, but buffers occasionally have some kind of offset that I
haven't been able to track down yet.

Thierry

Thierry Reding (10):
  iommu: Add IOMMU device registry
  devicetree: Add generic IOMMU device tree bindings
  of: Add NVIDIA Tegra124 memory controller binding
  memory: Add Tegra124 memory controller support
  ARM: tegra: Add memory controller on Tegra124
  ARM: tegra: tegra124: Enable IOMMU for display controllers
  ARM: tegra: tegra124: Enable IOMMU for SDMMC controllers
  ARM: tegra: Select ARM_DMA_USE_IOMMU
  drm/tegra: Add IOMMU support
  mmc: sdhci-tegra: Add IOMMU support

 Documentation/devicetree/bindings/iommu/iommu.txt  |  156 ++
 .../memory-controllers/nvidia,tegra124-mc.txt  |   12 +
 arch/arm/boot/dts/tegra124.dtsi|   18 +
 arch/arm/mach-tegra/Kconfig|1 +
 drivers/gpu/drm/tegra/dc.c |   21 +
 drivers/gpu/drm/tegra/drm.c|   17 +
 drivers/gpu/drm/tegra/drm.h|3 +
 drivers/gpu/drm/tegra/fb.c |   16 +-
 drivers/gpu/drm/tegra/gem.c|  236 ++-
 drivers/gpu/drm/tegra/gem.h|4 +
 drivers/iommu/iommu.c  |   93 +
 drivers/memory/Kconfig |9 +
 drivers/memory/Makefile|1 +
 drivers/memory/tegra124-mc.c   | 1945 
 drivers/mmc/host/sdhci-tegra.c |8 +
 include/dt-bindings/memory/tegra124-mc.h   |   30 +
 include/linux/iommu.h  |   27 +
 17 files changed, 2573 insertions(+), 24 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
 create mode 100644 drivers/memory/tegra124-mc.c
 create mode 100644 include/dt-bindings/memory/tegra124-mc.h

-- 
2.0.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


  1   2   3   4   5   6   7   >