Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-15 Thread Thierry Reding
On Mon, Apr 28, 2014 at 02:05:30PM +0200, Arnd Bergmann wrote:
[...]
 let me clarify by example:
 
   iommu@1 {
           compatible = "some,simple-iommu";
           reg = <1>;
           #iommu-cells = <0>; /* supports only one master */
   };
 
   iommu@2 {
           compatible = "some,other-iommu";
           reg = <3>;
           #iommu-cells = <1>; /* contains master ID */
   };
 
   iommu@3 {
           compatible = "some,windowed-iommu";
           reg = <2>;
           #iommu-cells = <2>; /* contains dma-window */
   };
 
   device@4 {
           compatible = "some,ethernet";
           iommus = <&{/iommu@1}>;
   };
 
   device@5 {
           compatible = "some,dmaengine";
           iommus = <&{/iommu@2} 0x4000 0x100>,
                    <&{/iommu@3} 0x101>;
   };
 
 The device at address 4 has a one-one relationship with iommu@1, so there
 is no need for any data. device@5 has two master ports. One is connected to
 an IOMMU that has a per-device aperture, device@5 can only issue transfers
 to the 256MB area at 0x4000, and the IOMMU will have to put entries for
 this device into that address. The second master port is connected to
 iommu@3, which uses a master ID that gets passed along with each transfer,
 so that needs to be put into the IOTLBs.

iommu@3 and the second port of device@5 seem to match what we need for
Tegra (and as I understand also Exynos). Can we settle on this for now
so that Hiroshi and Cho can go update their drivers for this binding?

 A variation would be to not use #iommu-cells at all, but provide a
 #address-cells / #size-cells pair in the IOMMU, and have a translation
 as we do for dma-ranges. This is probably most flexible.

The remainder of this discussion seems to indicate that #iommu-cells and
dma-ranges don't have to be mutually exclusive. For some IOMMUs it might
make sense to use both.

In fact perhaps we should require every IOMMU user to also specify a
dma-ranges property, even if for some cases the range would be simply
the complete physical address space. Perhaps in analogy to the ranges
property an empty dma-ranges property could be taken to mean all of the
physical address space.
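
As a rough sketch of that combination (every node name, address and ID below
is made up purely for illustration, not a settled binding):

    soc {
            #address-cells = <1>;
            #size-cells = <1>;
            ranges;
            dma-ranges;              /* empty: DMA sees the whole physical space */

            iommu: iommu@12e20000 {
                    compatible = "some,other-iommu";
                    reg = <0x12e20000 0x1000>;
                    #iommu-cells = <1>;      /* one cell: master ID */
            };

            ethernet@13000000 {
                    compatible = "some,ethernet";
                    reg = <0x13000000 0x1000>;
                    iommus = <&iommu 0x101>; /* master ID 0x101 */
            };
    };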

I'm aware that this doesn't cover any of the more exotic cases out
there, but the fact is that we have real devices out there that ship
with some variations of these simple IOMMUs and I don't think we're
doing ourselves a favour by blocking support for these to be added on
the hope of merging the perfect solution that covers all use-cases.
Patches for Tegra have already been around for close to half a year.

Thierry



Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-15 Thread Cho KyongHo
On Thu, 15 May 2014 22:37:31 +0200, Thierry Reding wrote:
 On Mon, Apr 28, 2014 at 02:05:30PM +0200, Arnd Bergmann wrote:
 [...]
  let me clarify by example:
  
   iommu@1 {
           compatible = "some,simple-iommu";
           reg = <1>;
           #iommu-cells = <0>; /* supports only one master */
   };
 
   iommu@2 {
           compatible = "some,other-iommu";
           reg = <3>;
           #iommu-cells = <1>; /* contains master ID */
   };
 
   iommu@3 {
           compatible = "some,windowed-iommu";
           reg = <2>;
           #iommu-cells = <2>; /* contains dma-window */
   };
 
   device@4 {
           compatible = "some,ethernet";
           iommus = <&{/iommu@1}>;
   };
 
   device@5 {
           compatible = "some,dmaengine";
           iommus = <&{/iommu@2} 0x4000 0x100>,
                    <&{/iommu@3} 0x101>;
   };
  
  The device at address 4 has a one-one relationship with iommu@1, so there
  is no need for any data. device@5 has two master ports. One is connected to
  an IOMMU that has a per-device aperture, device@5 can only issue transfers
  to the 256MB area at 0x4000, and the IOMMU will have to put entries for
  this device into that address. The second master port is connected to
  iommu@3, which uses a master ID that gets passed along with each transfer,
  so that needs to be put into the IOTLBs.
 
 iommu@3 and the second port of device@5 seem to match what we need for
 Tegra (and as I understand also Exynos). Can we settle on this for now
 so that Hiroshi and Cho can go update their drivers for this binding?
 

Currently, the Exynos IOMMU corresponds to the iommu@1 case.

But in the near future it will support multiple masters with a single
context, which means that all masters sharing a single System MMU also
see the same address space.

For some cases, we may need the iommu@3 style, which supports a dma-window.

So, I have no other opinion.

By the way, the IOMMU framework should allow IOMMU driver implementations
to process the parameters given in the 'iommus' property of the master nodes,
because their meaning is implementation-dependent.
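
For illustration, the two Exynos-style cases above might then look roughly
like this under the proposed generic binding (the node names, addresses,
compatible strings and specifier values are placeholders, not the binding
in this patch):

    sysmmu_fimd: sysmmu@14640000 {
            compatible = "some,sysmmu";
            reg = <0x14640000 0x1000>;
            #iommu-cells = <0>;      /* iommu@1 case: one fixed master */
    };

    sysmmu_gsc: sysmmu@13e80000 {
            compatible = "some,windowed-sysmmu";
            reg = <0x13e80000 0x1000>;
            #iommu-cells = <2>;      /* iommu@3 case: dma-window base, size */
    };

    fimd@14400000 {
            compatible = "some,fimd";
            reg = <0x14400000 0x40000>;
            iommus = <&sysmmu_fimd>;
    };

    gsc@13e00000 {
            compatible = "some,gscaler";
            reg = <0x13e00000 0x1000>;
            iommus = <&sysmmu_gsc 0x20000000 0x10000000>;
    };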

  A variation would be to not use #iommu-cells at all, but provide a
  #address-cells / #size-cells pair in the IOMMU, and have a translation
  as we do for dma-ranges. This is probably most flexible.
 
 The remainder of this discussion seems to indicate that #iommu-cells and
 dma-ranges don't have to be mutually exclusive. For some IOMMUs it might
 make sense to use both.
 
 In fact perhaps we should require every IOMMU user to also specify a
 dma-ranges property, even if for some cases the range would be simply
 the complete physical address space. Perhaps in analogy to the ranges
 property an empty dma-ranges property could be taken to mean all of the
 physical address space.
 
 I'm aware that this doesn't cover any of the more exotic cases out
 there, but the fact is that we have real devices out there that ship
 with some variations of these simple IOMMUs and I don't think we're
 doing ourselves a favour by blocking support for these to be added on
 the hope of merging the perfect solution that covers all use-cases.
 Patches for Tegra have already been around for close to half a year.
 
 Thierry


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-02 Thread Dave Martin
On Thu, May 01, 2014 at 06:41:37PM +0100, Stephen Warren wrote:
 On 04/29/2014 03:00 PM, Arnd Bergmann wrote:
 ...
  Yes. It's very complicated unfortunately, because we have to be
  able to deal with arbitrary combinations of a lot of oddball cases
  that can show up in random SoCs:
 ...
  - a device may have DMA access to a bus that is invisible to the CPU
 
 The issue is slightly more general than that. It's more that the bus
 structure seen by a device is simply /different/ than that seen by the
 CPU. I don't think it's a requirement that there be CPU-invisible buses
 for that to be true.
 
 For example, I could conceive of a HW setup like:
 
 primary CPU bus <---------------> other devices
    |        \______________            /
    |                       \          /
    v                        v        v
 device registers         some secondary bus
                                 |
                                 v
                              memory
 
 Here, all the buses are visible to the CPU, yet the path that
 transactions take between the buses is simply different to the CPU. More
 complex situations than the above, while still maintaining that
 description, are certainly possible.
 

I tend to think in terms of links rather than buses.  A link is
effectively a 1:1 point-to-point bus that passes all transactions with
no modification.

So, although "some secondary bus" is visible to the CPUs, crucially the
link between "some secondary bus" and "other devices" is not visible -- in the
sense that transactions issued by the CPUs never flow down that link.
Thus, if the link actually has remappings associated with it, then
devices mastering onto some secondary bus will observe those mappings
but the CPUs won't.  That's precisely what we need to know about when
configuring DMA buffers.

invisible bus situations are therefore a subset of invisible link
situations, and it is the latter which are the source of the complexity.

However, if the extra link(s) don't have any special characteristics, it
may be software-transparent with no need for description, because we
can pretend for logical purposes that there is a single bus in that case.
Effectively that's what we've relied on for simpler systems up to now.

I'm assuming in your example that the direct link between primary CPU
bus and other devices is always used by preference, instead of CPUs'
transactions toward other devices being sent to some secondary bus
first.

Cheers
---Dave


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Dave Martin
On Tue, Apr 29, 2014 at 10:46:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 29 April 2014 19:16:02 Dave Martin wrote:

[...]

  For example, suppose devices can post MSIs to an interrupt controller
  via a mailbox accessed through the IOMMU.  Suppose also that the IOMMU
  generates MSIs itself in order to signal management events or faults
  to a host OS.  Linux (as host) will need to configure the interrupt
  controller separately for the IOMMU and for the IOMMU clients.  This
  means that Linux needs to know which IDs may travel to the interrupt
  controller for which purpose, and they must be distinct.
 
 I don't understand. An MSI controller is just an address that acts
 as a DMA slave for a 4-byte inbound data packet. It has no way of
 knowing who is sending data, other than by the address or the data
 sent to it. Are you talking of something else?

Oops, looks like there are a few points I failed to respond to here...


I'm not an expert on PCI -- I'm prepared to believe it works that way.

GICv3 can discriminate between different MSI senders based on ID
signals on the bus.

 
  I'm not sure whether there is actually a SoC today that is MSI-capable
  and contains an IOMMU, but all the components to build one are out
  there today.  GICv3 is also explicitly designed to support such
  systems.
 
 A lot of SoCs have MSI integrated into the PCI root complex, which
 of course is pointless from MSI perspective, as well as implying that
 the MSI won't go through the IOMMU.
 
 We have briefly mentioned MSI in the review of the Samsung GH7 PCI
 support. It's possible that this one can either use the built-in
 MSI or the one in the GICv2m.

We are likely to get non-PCI MSIs in future SoC systems too, and there
are no standards governing how such systems should look.


  In the future, it is likely that HSA-style GPUs and other high-
  throughput virtualisable bus mastering devices will have capabilities
  of this sort, but I don't think there's anything concrete yet.
 
 Wouldn't they just have IOMMUs with multiple contexts?

Who knows?  A management component of the GPU that is under exclusive
control of the host or hypervisor might be wired up to bypass the IOMMU
completely.

I'm not saying this kind of thing definitely will happen, but I can't
say confidently that it won't.


   how it might be wired up in hardware, but I don't know what it's good for,
   or who would actually do it.
   
 A variation would be to not use #iommu-cells at all, but provide a
 #address-cells / #size-cells pair in the IOMMU, and have a translation
 as we do for dma-ranges. This is probably most flexible.

 That would also allow us to describe ranges of master IDs, which we need
 for things like PCI RCs on the ARM SMMU. Furthermore, basic transformations
 of these ranges could also be described like this, although I think Dave
 (CC'd) has some similar ideas in this area.
  
  Ideally, we would reuse the ePAPR ranges concept and describe the way
  sideband ID signals propagate down the bus hierarchy in a similar way.
 
 It would be 'dma-ranges'. Unfortunately that would imply that each DMA
 master is connected to only one IOMMU, which you say is not necessarily
 the case. The simpler case of "a device is only a master on a single IOMMU
 but can use multiple contexts" would however work fine with dma-ranges.

Partly, yes.  The concept embodied by dma-ranges is correct, but the
topological relationship is not: the assumption that a master device
always masters onto its parent node doesn't work for non-tree-like
topologies.
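
(For reference, the tree-shaped case that dma-ranges does handle looks
roughly like this; the addresses below are made up:

    bus@40000000 {
            compatible = "simple-bus";
            #address-cells = <1>;
            #size-cells = <1>;
            ranges;
            /* masters below this bus see bus address 0x0 mapped to
             * CPU/physical address 0x80000000, a 512MB window */
            dma-ranges = <0x0 0x80000000 0x20000000>;

            dma@40001000 {
                    compatible = "some,dmaengine";
                    reg = <0x40001000 0x1000>;
            };
    };

The implicit assumption is that dma@40001000 masters onto bus@40000000,
i.e. onto its parent node, which is exactly what breaks down for
non-tree topologies.)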

Cheers
---Dave


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Arnd Bergmann
On Thursday 01 May 2014 23:02:14 Cho KyongHo wrote:
  
  - device can only do DMA to a limited address range
  - DMA is noncoherent and needs manual cache management
  - DMA address is at an offset from physical address
  - some devices have an IOMMU
  - some IOMMUs are shared between devices
  - some devices with IOMMU can have multiple simultaneous contexts
  - a device may access some memory directly and some other memory through 
  IOMMU
 
 Do we need to consider this case?
 I don't think a device can have different contexts at the same time.
 If there is such a device in a system, its driver must handle it correctly
 with different device descriptors for the different contexts, for example.
 I mean, if a device has two DMA ports that are in different contexts,
 they can be treated as different devices which are handled by a driver.
 
 I worry that abstracting everything we can think may make the problem harder.

It's the default operation for some of the simpler IOMMUs, see
arch/x86/kernel/amd_gart_64.c for instance. It's possible that AMD
will have the same thing in their ARM64 SoCs, but I don't have
specific information about that.

It can probably be handled in the iommu_map_ops() as a generalization,
at least if we only have to worry about checking whether a memory address
is below the dma_mask in order to decide whether to use the IOMMU or not.

Or we can decide not to handle it at all, and always go through the IOMMU,
which would be slightly slower but still functional.

Arnd


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Dave Martin
On Thu, May 01, 2014 at 02:29:50PM +0100, Arnd Bergmann wrote:
 On Thursday 01 May 2014 12:15:35 Dave Martin wrote:
  On Tue, Apr 29, 2014 at 10:46:18PM +0200, Arnd Bergmann wrote:
   On Tuesday 29 April 2014 19:16:02 Dave Martin wrote:
  
  [...]
  
For example, suppose devices can post MSIs to an interrupt controller
via a mailbox accessed through the IOMMU.  Suppose also that the IOMMU
generates MSIs itself in order to signal management events or faults
to a host OS.  Linux (as host) will need to configure the interrupt
controller separately for the IOMMU and for the IOMMU clients.  This
means that Linux needs to know which IDs may travel to the interrupt
controller for which purpose, and they must be distinct.
   
   I don't understand. An MSI controller is just an address that acts
   as a DMA slave for a 4-byte inbound data packet. It has no way of
   knowing who is sending data, other than by the address or the data
   sent to it. Are you talking of something else?
  
  Oops, looks like there are a few points I failed to respond to here...
  
  
  I'm not an expert on PCI -- I'm prepared to believe it works that way.
  
  GICv3 can discriminate between different MSI senders based on ID
  signals on the bus.
 
 Any idea what this is good for? Do we have to use it? It probably doesn't
 fit very well into the way Linux handles MSIs today.

Marc may be better placed than me to comment on this in detail.

However, I believe it's correct to say that because the GIC is not part
of PCI, end-to-end MSI delivery inherently involves a non-PCI step from
the PCI RC to the GIC itself.

Thus this is likely to be a fundamental requirement for MSIs on ARM SoCs
using GIC, if we want to have a hope of mapping MSIs to VMs efficiently.

I'm not sure whether there is actually a SoC today that is MSI-capable
and contains an IOMMU, but all the components to build one are out
there today.  GICv3 is also explicitly designed to support such
systems.
   
   A lot of SoCs have MSI integrated into the PCI root complex, which
   of course is pointless from MSI perspective, as well as implying that
   the MSI won't go through the IOMMU.
   
   We have briefly mentioned MSI in the review of the Samsung GH7 PCI
   support. It's possible that this one can either use the built-in
   MSI or the one in the GICv2m.
  
  We are likely to get non-PCI MSIs in future SoC systems too, and there
  are no standards governing how such systems should look.
 
 I wouldn't call that MSI though -- using the same term in the code
 can be rather confusing. There are existing SoCs that use message
 based interrupt notification. We are probably better off modeling
 those as regular irqchips in Linux and DT, given that they may
 not be bound by the same constraints as PCI MSI.

We can call it what we like and maybe bury the distinction in irqchip
drivers for some fixed-configuration cases, but it's logically the same
concept.  Naming and subsystem factoring are implementation decisions
for Linux.

For full dynamic assignment of pluggable devices or buses to VMs, I'm
less sure that we can model that as plain irqchips.

In the future, it is likely that HSA-style GPUs and other high-
throughput virtualisable bus mastering devices will have capabilities
of this sort, but I don't think there's anything concrete yet.
   
   Wouldn't they just have IOMMUs with multiple contexts?
  
  Who knows?  A management component of the GPU that is under exclusive
  control of the host or hypervisor might be wired up to bypass the IOMMU
  completely.
  
  I'm not saying this kind of thing definitely will happen, but I can't
  say confidently that it won't.
 
 Supporting this case in DT straight away is going to add a major burden.
 If nobody can say for sure that they are actually going to do it, I'd
 lean towards assuming that we won't need it and not putting the extra
 complexity in.
 
 If someone actually needs it later, let's make it their problem for
 not participating in the design.

This is a fair point, but there is a difference between the bindings and
what kind of wacky configurations a particular version of Linux actually
supports.

DT is supposed to be a description of the hardware, not a description
of how Linux subsystems are structured, though if the two are not
reasonably well aligned that will lead to pain...

The key thing is to make sure the DT bindings are extensible to
things that we can reasonably foresee.

 
 how it might be wired up in hardware, but I don't know what it's good for,
 or who would actually do it.
 
   A variation would be to not use #iommu-cells at all, but provide a
   #address-cells / #size-cells pair in the IOMMU, and have a translation
   as we do for dma-ranges. This is probably most flexible.
  
  That would also allow us to describe ranges of master IDs, which we need
  for things like PCI RCs on the ARM SMMU. 

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Dave Martin
On Thu, May 01, 2014 at 11:02:14PM +0900, Cho KyongHo wrote:
 On Tue, 29 Apr 2014 23:00:29 +0200, Arnd Bergmann wrote:
  On Tuesday 29 April 2014 13:07:54 Grant Grundler wrote:
   On Tue, Apr 29, 2014 at 11:16 AM, Dave Martin dave.mar...@arm.com wrote:
   ...
An IOMMU is really a specialised bridge
   
   Is a GART a bridge?
   
   IOMMUs can provide three basic functions:
   1) remap address space to reach phys mem ranges that the device is
   otherwise not capable of accessing (classic 32-bit DMA to reach 64-bit
   Phys address)
   
   2) implement scatter-gather (page level granularity) so the device
   doesn't have to
   
   3) provide some level of system protection against rogue DMA by
   forcing everything through a DMA mapping interface and faulting when
   encountering unmapped DMA transactions.
  
  [ 4) provide isolation between multiple contexts, typically for purposes
   of virtualization]
  
   I summarize IOMMUs as: "participate in the routing of MMIO
   transactions in the system fabric".
   In this sense, IOMMUs are sort of a bridge. Defining what kind of
   routing they can do (coalesce transactions? remapping MMIO domains?)
   and which address ranges they route would describe most of that
   functionality.
   
   This remapping of MMIO transaction is also usually asymmetric.
   Meaning routing of downstream transactions *might* be completely
   different than the routing + remapping of transactions heading
   upstream. DMA mappings services are designed to handle only the
   transactions generated (aka mastered) by a downstream device.
  
  For the purposes of the DT binding, we have a 'ranges' property
  that defines the downstream translation (CPU-to-MMIO) and a
  'dma-ranges' property for the opposite address translation
  (device-to-memory).
  
   , so it may be cleaner to describe
an IOMMU using a real bus node in the DT, if we also define a way to 
make
master/slave linkages explicit where it matters.
   
   "where it matters" is a bit vague.  Is the goal to just enable DMA
   mapping services to do the right thing for a device that can
   generate DMA?
  
  Yes. It's very complicated unfortunately, because we have to be
  able to deal with arbitrary combinations of a lot of oddball cases
  that can show up in random SoCs:
  
  - device can only do DMA to a limited address range
  - DMA is noncoherent and needs manual cache management
  - DMA address is at an offset from physical address
  - some devices have an IOMMU
  - some IOMMUs are shared between devices
  - some devices with IOMMU can have multiple simultaneous contexts
  - a device may access some memory directly and some other memory through 
  IOMMU
 
 Do we need to consider this case?
 I don't think a device can have different contexts at the same time.
 If there is such a device in a system, its driver must handle it correctly
 with different device descriptors for the different contexts, for example.
 I mean, if a device has two DMA ports that are in different contexts,
 they can be treated as different devices which are handled by a driver.

GPUs will definitely be capable of acting on behalf of multiple contexts
at the same time, in a dynamic fashion.  This doesn't necessarily mean
that it masters through different ports or onto different buses though.

Sketching out how we would describe this in DT doesn't imply that we
need Linux to support it.  It's more about asking: if we have to support
this in the future, how badly will it screw up the current framework?

 I worry that abstracting everything we can think may make the problem harder.

That's always a risk, though pain today may be worth it if is saves a
larger amount of pain in the future.

As I say on the other branch of this thread, I'll follow up with
something a bit more concrete to illustrate the kind of thing I mean.

Cheers
---Dave


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Marc Zyngier
On 01/05/14 15:36, Dave Martin wrote:
 On Thu, May 01, 2014 at 02:29:50PM +0100, Arnd Bergmann wrote:
 On Thursday 01 May 2014 12:15:35 Dave Martin wrote:
 On Tue, Apr 29, 2014 at 10:46:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 29 April 2014 19:16:02 Dave Martin wrote:

 [...]

 For example, suppose devices can post MSIs to an interrupt controller
 via a mailbox accessed through the IOMMU.  Suppose also that the IOMMU
 generates MSIs itself in order to signal management events or faults
 to a host OS.  Linux (as host) will need to configure the interrupt
 controller separately for the IOMMU and for the IOMMU clients.  This
 means that Linux needs to know which IDs may travel to the interrupt
 controller for which purpose, and they must be distinct.

 I don't understand. An MSI controller is just an address that acts
 as a DMA slave for a 4-byte inbound data packet. It has no way of
 knowing who is sending data, other than by the address or the data
 sent to it. Are you talking of something else?

 Oops, looks like there are a few points I failed to respond to here...


 I'm not an expert on PCI -- I'm prepared to believe it works that way.

 GICv3 can discriminate between different MSI senders based on ID
 signals on the bus.

 Any idea what this is good for? Do we have to use it? It probably doesn't
 fit very well into the way Linux handles MSIs today.
 
 Marc may be better placed than me to comment on this in detail.

As to fitting Linux, it seems to match what Linux does fairly well
(see the kvm-arm64/gicv3 branch in my tree). Not saying that it does it
in a very simple way (far from it, actually), but it works.

As to what it is good for (and before someone bursts into an Edwin
Starr interpretation), it mostly has to do with isolation, and the fact
that you may want to let the whole MSI programming to a guest (and yet
ensure that the guest cannot generate interrupts that would be assigned
to other devices). This is done by sampling the requester-id at the ITS
level, and use this information to index a per-device interrupt
translation table (I could talk for hours about the concept and its
variations, mostly using expletives and possibly a hammer, but I think
it is the time for my pink pill).

 However, I believe it's correct to say that because the GIC is not part
 of PCI, end-to-end MSI delivery inherently involves a non-PCI step from
 the PCI RC to the GIC itself.
 
 Thus this is likely to be a fundamental requirement for MSIs on ARM SoCs
 using GIC, if we want to have a hope of mapping MSIs to VMs efficiently.

Indeed. GICv[34] is the ARM way of implementing MSI on SBSA compliant
systems (from level 1 onwards, if memory serves well). People are
actively building systems with this architecture, and relying on it to
provide VM isolation.

 I'm not sure whether there is actually a SoC today that is MSI-capable
 and contains an IOMMU, but all the components to build one are out
 there today.  GICv3 is also explicitly designed to support such
 systems.

 A lot of SoCs have MSI integrated into the PCI root complex, which
 of course is pointless from MSI perspective, as well as implying that
 the MSI won't go through the IOMMU.

 We have briefly mentioned MSI in the review of the Samsung GH7 PCI
 support. It's possible that this one can either use the built-in
 MSI or the one in the GICv2m.

 We are likely to get non-PCI MSIs in future SoC systems too, and there
 are no standards governing how such systems should look.

 I wouldn't call that MSI though -- using the same term in the code
 can be rather confusing. There are existing SoCs that use message
 based interrupt notification. We are probably better off modeling
 those as regular irqchips in Linux and DT, given that they may
 not be bound by the same constraints as PCI MSI.
 
 We can call it what we like and maybe bury the distinction in irqchip
 drivers for some fixed-configuration cases, but it's logically the same
 concept.  Naming and subsystem factoring are implementation decisions
 for Linux.
 
 For full dynamic assignment of pluggable devices or buses to VMs, I'm
 less sure that we can model that as plain irqchips.

Yeah, I've been looking at that. For some restricted cases, the irqchip
model works very well (think of wire to MSI translators, which are
likely to have a fixed configuration). Anything more dynamic requires a
more evolved infrastructure, but I'd hope they would also be on a
discoverable bus, removing most of the need for description in DT.

Cheers,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Arnd Bergmann
On Thursday 01 May 2014 15:36:54 Dave Martin wrote:
 On Thu, May 01, 2014 at 02:29:50PM +0100, Arnd Bergmann wrote:
  On Thursday 01 May 2014 12:15:35 Dave Martin wrote:
 I'm not sure whether there is actually a SoC today that is MSI-capable
 and contains an IOMMU, but all the components to build one are out
 there today.  GICv3 is also explicitly designed to support such
 systems.

A lot of SoCs have MSI integrated into the PCI root complex, which
of course is pointless from MSI perspective, as well as implying that
the MSI won't go through the IOMMU.

We have briefly mentioned MSI in the review of the Samsung GH7 PCI
support. It's possible that this one can either use the built-in
MSI or the one in the GICv2m.
   
   We are likely to get non-PCI MSIs in future SoC systems too, and there
   are no standards governing how such systems should look.
  
  I wouldn't call that MSI though -- using the same term in the code
  can be rather confusing. There are existing SoCs that use message
  based interrupt notification. We are probably better off modeling
  those as regular irqchips in Linux and DT, given that they may
  not be bound by the same constraints as PCI MSI.
 
 We can call it what we like and maybe bury the distinction in irqchip
 drivers for some fixed-configuration cases, but it's logically the same
 concept.  Naming and subsystem factoring are implementation decisions
 for Linux.
 
 For full dynamic assignment of pluggable devices or buses to VMs, I'm
 less sure that we can model that as plain irqchips.

I definitely hope we won't have to deal with plugging non-PCI devices
into VMs. Nothing good can come out of that.

  Supporting this case in DT straight away is going to add a major burden.
  If nobody can say for sure that they are actually going to do it, I'd
  lean towards assuming that we won't need it and not putting the extra
  complexity in.
  
  If someone actually needs it later, let's make it their problem for
  not participating in the design.
 
 This is a fair point, but there is a difference between the bindings and
 what kind of wacky configurations a particular version of Linux actually
 supports.
 
 DT is supposed to be a description of the hardware, not a description
 of how Linux subsystems are structured, though if the two are not
 reasonably well aligned that will lead to pain...
 
 The key thing is to make sure the DT bindings are extensible to
 things that we can reasonably foresee.

Yes, defining them in an extensible way is always a good idea, but I
think it would be better not to define the fine details until we
actually need them in this case.

It would be 'dma-ranges'. Unfortunately that would imply that each DMA
master is connected to only one IOMMU, which you say is not necessarily
the case. The simpler case of "a device is only a master on a single IOMMU
but can use multiple contexts" would however work fine with dma-ranges.
   
   Partly, yes.  The concept embodied by dma-ranges is correct, but the
   topological relationship is not: the assumption that a master device
   always masters onto its parent node doesn't work for non-tree-like
   topologies.
  
  In almost all cases it will fit. When it doesn't, we can work around it by
  defining virtual address spaces the way that the PCI binding does. The only
  major exception that we know we have to handle is IOMMUs.
 
 My concern here is that as new exceptions and oddball or complex systems
 crop up, we will end up repeatedly inventing different bodges to solve
 essentially the same problem.
 
 Unlike some of the other situations we have to deal with, these are valid
 hardware configurations rather than quirks or broken systems.

Can you give an example where this would be done for a good reason?
I can't come up with an example that doesn't involve the hardware
design being seriously screwed.

Arnd


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Arnd Bergmann
On Thursday 01 May 2014 16:11:48 Marc Zyngier wrote:
 On 01/05/14 15:36, Dave Martin wrote:
  On Thu, May 01, 2014 at 02:29:50PM +0100, Arnd Bergmann wrote:
  On Thursday 01 May 2014 12:15:35 Dave Martin wrote:
  On Tue, Apr 29, 2014 at 10:46:18PM +0200, Arnd Bergmann wrote:
  I don't understand. An MSI controller is just an address that acts
  as a DMA slave for a 4-byte inbound data packet. It has no way of
  knowing who is sending data, other than by the address or the data
  sent to it. Are you talking of something else?
 
  Oops, looks like there are a few points I failed to respond to here...
 
 
  I'm not an expert on PCI -- I'm prepared to believe it works that way.
 
  GICv3 can discriminate between different MSI senders based on ID
  signals on the bus.
 
  Any idea what this is good for? Do we have to use it? It probably doesn't
  fit very well into the way Linux handles MSIs today.
  
  Marc may be better placed than me to comment on this in detail.
 
 As to fitting Linux, it seems to match what Linux does fairly well
 (see the kvm-arm64/gicv3 branch in my tree). Not saying that it does it
 in a very simple way (far from it, actually), but it works.

ok.

 As to what it is good for (and before someone bursts into an Edwin
 Starr interpretation), it mostly has to do with isolation, and the fact
 that you may want to let the whole MSI programming to a guest (and yet
 ensure that the guest cannot generate interrupts that would be assigned
 to other devices). This is done by sampling the requester-id at the ITS
 level, and use this information to index a per-device interrupt
 translation table (I could talk for hours about the concept and its
 variations, mostly using expletives and possibly a hammer, but I think
 it is the time for my pink pill).

So the idea is that you want to give the guest unfiltered access to the
PCI config space of an assigned device?

Is that safe?

If the config space access goes through the hypervisor (as I think we
always do today), there should be no issue.

  However, I believe it's correct to say that because the GIC is not part
  of PCI, end-to-end MSI delivery inherently involves a non-PCI step from
  the PCI RC to the GIC itself.
  
  Thus this is likely to be a fundamental requirement for MSIs on ARM SoCs
  using GIC, if we want to have a hope of mapping MSIs to VMs efficiently.
 
 Indeed. GICv[34] is the ARM way of implementing MSI on SBSA compliant
 systems (from level 1 onwards, if memory serves well). People are
 actively building systems with this architecture, and relying on it to
 provide VM isolation.

I don't mind it being there, but if we don't need it, we shouldn't
have to use that isolation feature and be able to just allow any
initiator to send an MSI.

Arnd


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Marc Zyngier
On 01/05/14 16:53, Arnd Bergmann wrote:
 On Thursday 01 May 2014 16:11:48 Marc Zyngier wrote:
 On 01/05/14 15:36, Dave Martin wrote:
 On Thu, May 01, 2014 at 02:29:50PM +0100, Arnd Bergmann wrote:
 On Thursday 01 May 2014 12:15:35 Dave Martin wrote:
 On Tue, Apr 29, 2014 at 10:46:18PM +0200, Arnd Bergmann wrote:
 I don't understand. An MSI controller is just an address that acts
 as a DMA slave for a 4-byte inbound data packet. It has no way of
 knowing who is sending data, other than by the address or the data
 sent to it. Are you talking of something else?

 Oops, looks like there are a few points I failed to respond to here...


 I'm not an expert on PCI -- I'm prepared to believe it works that way.

 GICv3 can discriminate between different MSI senders based on ID
 signals on the bus.

 Any idea what this is good for? Do we have to use it? It probably doesn't
 fit very well into the way Linux handles MSIs today.

 Marc may be better placed than me to comment on this in detail.

 As to fitting Linux, it seems to match what Linux does fairly well
 (see the kvm-arm64/gicv3 branch in my tree). Not saying that it does it
 in a very simple way (far from it, actually), but it works.
 
 ok.
 
 As to what it is good for (and before someone bursts into an Edwin
 Starr interpretation), it mostly has to do with isolation, and the fact
 that you may want to let the whole MSI programming to a guest (and yet
 ensure that the guest cannot generate interrupts that would be assigned
 to other devices). This is done by sampling the requester-id at the ITS
 level, and use this information to index a per-device interrupt
 translation table (I could talk for hours about the concept and its
 variations, mostly using expletives and possibly a hammer, but I think
 it is the time for my pink pill).
 
 So the idea is that you want to give the guest unfiltered access to the
 PCI config space of an assigned device?
 
 Is that safe?
 
 If the config space access goes through the hypervisor (as I think we
 always do today), there should be no issue.

My natural tendency would be to filter everything, as we certainly don't
want the device to be reconfigured in weird and fancy ways (my PCI-foo
is a bit limited, so bear with me). But as usual, GICv3 is not PCI
specific, and is designed to cater for weird and wonderful cases, PCI
being only one possible implementation...

On a non-PCI system, you definitely could assign a MSI-like capable
device, and let the guest do its thing. I'm already seeing that kind of
design.

 However, I believe it's correct to say that because the GIC is not part
 of PCI, end-to-end MSI delivery inherently involves a non-PCI step from
 the PCI RC to the GIC itself.

 Thus this is likely to be a fundamental requirement for MSIs on ARM SoCs
 using GIC, if we want to have a hope of mapping MSIs to VMs efficiently.

 Indeed. GICv[34] is the ARM way of implementing MSI on SBSA compliant
 systems (from level 1 onwards, if memory serves well). People are
 actively building systems with this architecture, and relying on it to
 provide VM isolation.
 
 I don't mind it being there, but if we don't need it, we shouldn't
 have to use that isolation feature and be able to just allow any
 initiator to send an MSI.

It is not that simple. You could design things around a global ITT,
pointed to by all the possible devices in your system. But that table
would be quite big (it has to contain the routing information for all
the possible messages in your system, and must be physically
contiguous). You still need to know the devices (requester-id,
device-id, whatever), as there is no wildcard mechanism.

Also, this approach pretty much kills hotplug (you cannot easily resize
that table to add new interrupt entries). That's why I've opted for a
more dynamic configuration, where each device gets its own ITT as it
appears on the system.

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Grant Grundler
On Thu, May 1, 2014 at 6:29 AM, Arnd Bergmann a...@arndb.de wrote:
...
 GICv3 can discriminate between different MSI senders based on ID
 signals on the bus.

 Any idea what this is good for? Do we have to use it? It probably doesn't
 fit very well into the way Linux handles MSIs today.

I can see this being used for diagnosing failures - e.g. hung system
would leave tracks if an interrupt was or was not provided by the
device. I can't think of a reason why Linux MSI code would need to
support this though.

...
 We are likely to get non-PCI MSIs in future SoC systems too, and there
 are no standards governing how such systems should look.

Why look to the future when one can look in the past? :)

PA-RISC was designed in the 1980s to use MSI to generate all CPU
interrupts. This is as simple as it gets.

The concept is identical to MSI: MMIO routeable address with a payload
to indicate interrupt source (or map that source to some vector
table.) Look at arch/parisc/kernel/smp.c:ipi_send() for an example
(note p->hpa is the processor's host physical address).

Current and  future products do the same things but add more
features that make this more complicated. But the basic transaction
will be identical and it needs to be routed like any other MMIO
transaction by any bridge (including IOMMUs).

 I wouldn't call that MSI though -- using the same term in the code
 can be rather confusing. There are existing SoCs that use message
 based interrupt notification. We are probably better off modeling
 those as regular irqchips in Linux and DT, given that they may
 not be bound by the same constraints as PCI MSI.

PCI device is one source for MSI and PCI defines how to initialize
an MSI source.  The target is not PCI (or not even Intel). Intel
defines how MSI works on their chipsets/CPUs. Other can still do it
differently.

I'm perfectly ok with using MSI to refer to any in band interrupt message.

 Who knows?  A management component of the GPU that is under exclusive
 control of the host or hypervisor might be wired up to bypass the IOMMU
 completely.

Does the linux kernel need to know about a device/component that it
can't control?
Either linux kernel shouldn't be told about it OR should ignore this
device/component.

 Partly, yes.  The concept embodied by dma-ranges is correct, but the
 topological relationship is not: the assumption that a master device
 always masters onto its parent node doesn't work for non-tree-like
 topologies.

 In almost all cases it will fit. When it doesn't, we can work around it by
 defining virtual address spaces the way that the PCI binding does. The only
 major exception that we know we have to handle is IOMMUs.

MMIO routing (and thus dma-ranges) is a graph (very comparable to
network routing). Some simple (e.g. PCI) implementations look like a
tree - but that's not the general case.

DMA cares about more than routing though - cache coherency and
performance (BW and latency) matter too.

cheers,
grant


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-05-01 Thread Stephen Warren
On 04/29/2014 03:00 PM, Arnd Bergmann wrote:
...
 Yes. It's very complicated unfortunately, because we have to be
 able to deal with arbitrary combinations of a lot of oddball cases
 that can show up in random SoCs:
...
 - a device may have DMA access to a bus that is invisible to the CPU

The issue is slightly more general than that. It's more that the bus
structure seen by a device is simply /different/ than that seen by the
CPU. I don't think it's a requirement that there be CPU-invisible buses
for that to be true.

For example, I could conceive of a HW setup like:

primary CPU bus <---------------> other devices
   |        \______________            /
   |                       \          /
   v                        v        v
device registers         some secondary bus
                                |
                                v
                             memory

Here, all the buses are visible to the CPU, yet the path that
transactions take between the buses is simply different to the CPU. More
complex situations than the above, while still maintaining that
description, are certainly possible.


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-30 Thread Dave Martin
On Tue, Apr 29, 2014 at 11:00:29PM +0200, Arnd Bergmann wrote:
 On Tuesday 29 April 2014 13:07:54 Grant Grundler wrote:
  On Tue, Apr 29, 2014 at 11:16 AM, Dave Martin dave.mar...@arm.com wrote:
  ...
   An IOMMU is really a specialised bridge
  
  Is a GART a bridge?

Depends what you mean by bridge.

I would say that it can be logically viewed as a bridge through which
graphics cards master onto the bus.

In this context it's integrated into the bus, but I think the function
it performs is still in series with, and logically distinct from, the
routing done by the bus proper.

  IOMMUs can provide three basic functions:
  1) remap address space to reach phys mem ranges that the device is
  otherwise not capable of accessing (classic 32-bit DMA to reach 64-bit
  Phys address)
  
  2) implement scatter-gather (page level granularity) so the device
  doesn't have to
  
  3) provide some level of system protection against rogue DMA by
  forcing everything through a DMA mapping interface and faulting when
  encountering unmapped DMA transactions.
 
 [ 4) provide isolation between multiple contexts, typically for purposes
  of virtualization]

All of which are classic functionalities of an MMU.  What is different
is that the CPU MMU is built into the CPU and standardised, and either
symmetrically shared or private to the CPU so that there is rarely if
ever any need to describe anything in DT ... whereas an IOMMU is not
tightly coupled to the CPU, and wired up to external hardware in
sometimes arbitrary ways that may require description (especially for
SoC systems).

  I summarize IOMMUs as: "participate in the routing of MMIO
  transactions in the system fabric".
  In this sense, IOMMUs are sort of a bridge. Defining what kind of
  routing they can do (coalesce transactions? remapping MMIO domains?)
  and which address ranges they route would describe most of that
  functionality.

agreed

  This remapping of MMIO transaction is also usually asymmetric.
  Meaning routing of downstream transactions *might* be completely
  different than the routing + remapping of transactions heading
  upstream. DMA mappings services are designed to handle only the
  transactions generated (aka mastered) by a downstream device.

Also agreed.  In the SoC world, the upstream and downstream paths
may be completely separate in the topology, not following the same
path at all, so then it becomes more natural for them to have
independent characteristics.  dma-ranges doesn't work for these
situations.

 For the purposes of the DT binding, we have a 'ranges' property
 that defines the downstream translation (CPU-to-MMIO) and a
 'dma-ranges' property for the opposite address translation
 (device-to-memory).
 
  , so it may be cleaner to describe
   an IOMMU using a real bus node in the DT, if we also define a way to make
   master/slave linkages explicit where it matters.
  
  "where it matters" is a bit vague.  Is the goal to just enable DMA
  mapping services to do the right thing for a device that can
  generate DMA?
 
 Yes. It's very complicated unfortunately, because we have to be
 able to deal with arbitrary combinations of a lot of oddball cases
 that can show up in random SoCs:
 
 - device can only do DMA to a limited address range
 - DMA is noncoherent and needs manual cache management
 - DMA address is at an offset from physical address
 - some devices have an IOMMU
 - some IOMMUs are shared between devices
 - some devices with IOMMU can have multiple simultaneous contexts
 - a device may access some memory directly and some other memory through IOMMU
 - a device may have DMA access to a bus that is invisible to the CPU
 - DMA on some device is only coherent if the IOMMU is enabled
 - DMA on some device is only coherent if the IOMMU is disabled
 - the IOVA range to an IOMMU is device dependent

This sounds bad on the surface, but really these are permutations of a
few key concepts:

 1) Topologies that can't be reduced to a tree, resulting from
   unidirectional buses (common in SoC architectures).

 2) Many-to-many connectivity of devices, unlike the familiar
   many-to-one parent/child relationship from PCI and similar.

 3) Interconnects made up of independently configured and
   connected components, such that they can't be described as optional
   features of some standard bus.

 4) Bridge components and devices that process, route or transform
   transactions based on more than just the destination address.
   Instead, the handling of a transaction can involve some kind of
   device ID, memory type and cacheability attributes and coherency
   domain information in addition to the address.


To address these, we need a few things:

 a) A way to describe the cross-links where the topology is not
reducible to a tree.  This would involve a phandle to describe
a A masters on B relationship analogous to what the parent/
child relationship in DT already means.

 b) For devices with multiple distinct master roles 

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-30 Thread Dave Martin
On Mon, Apr 28, 2014 at 09:55:00PM +0200, Arnd Bergmann wrote:
 On Monday 28 April 2014 20:30:56 Will Deacon wrote:
  Hi Arnd,
  
  [and thanks Thierry for CCing me -- I have been tangled up with this before
  :)]
  
  On Mon, Apr 28, 2014 at 01:05:30PM +0100, Arnd Bergmann wrote:
   On Monday 28 April 2014 13:18:03 Thierry Reding wrote:
There still has to be one cell to specify which master. Unless perhaps
if they can be arbitrarily assigned. I guess even if there's a fixed
mapping that applies to one SoC generation, it might be good to still
employ a specifier and have the mapping in DT for flexibility.
   
   let me clarify by example:
   
  iommu@1 {
          compatible = "some,simple-iommu";
          reg = <1>;
          #iommu-cells = <0>; /* supports only one master */
  };
 
  iommu@2 {
          compatible = "some,other-iommu";
          reg = <3>;
          #iommu-cells = <1>; /* contains master ID */
  };
 
  iommu@3 {
          compatible = "some,windowed-iommu";
          reg = <2>;
          #iommu-cells = <2>; /* contains dma-window */
  };

An IOMMU is really a specialised bridge, so it may be cleaner to describe
an IOMMU using a real bus node in the DT, if we also define a way to make
master/slave linkages explicit where it matters.


The problems of how to describe master/slave linkage, coherency between
masters, and how to describe sideband ID information present on the bus
are really interrelated.


If we can come up with a consistent description for these things, it
should help us to describe IOMMUs, bus mastering peripherals, MSI
controllers and complex bridges in a more uniform way, without having to
reinvent so much for each binding.  That's my hope anyway.

I've been hacking around some proposals on these areas which are a bit
different from the approach suggested here -- I'll try to summarise some
of it intelligibly and post something tomorrow so that we can discuss.


   
  device@4 {
          compatible = "some,ethernet";
          iommus = <&{/iommu@1}>;
  };
 
  device@5 {
          compatible = "some,dmaengine";
          iommus = <&{/iommu@2} 0x4000 0x100>,
                   <&{/iommu@3} 0x101>;
  };
   
   The device at address 4 has a one-one relationship with iommu@1, so there
   is no need for any data. device@5 has two master ports. One is connected to
   an IOMMU that has a per-device aperture, device@5 can only issue transfers
   to the 256MB area at 0x4000, and the IOMMU will have to put entries for
   this device into that address. The second master port is connected to
   iommu@3, which uses a master ID that gets passed along with each transfer,
   so that needs to be put into the IOTLBs.
  
  I think this is definitely going in the right direction, but it's not clear
  to me how the driver for device@5 knows how to configure the two ports.
  We're still lacking topology information (unless that's implicit in the
  ordering of the properties) to say how the mastering capabilities of the
  device are actually routed and configured.
 
 It would be helpful to have a concrete example of a device that has multiple
 masters. I have heard people mention this multiple times, and I can understand

You mean a device that contains multiple independent bus masters,
right?  In particular, a device composed of multiple bus masters that
do different things or should be handled differently by the interconnect.

There has definitely been talk on the list about real devices that
use multiple stream IDs.


I'll ask around for device-like examples, but the most obvious
example is the IOMMU itself.

Transactions generated by the IOMMU clearly need to be handled
differently by the interconnect, compared with transactions
translated and forwarded by IOMMU on behalf of its clients.

For example, suppose devices can post MSIs to an interrupt controller
via a mailbox accessed through the IOMMU.  Suppose also that the IOMMU
generates MSIs itself in order to signal management events or faults
to a host OS.  Linux (as host) will need to configure the interrupt
controller separately for the IOMMU and for the IOMMU clients.  This
means that Linux needs to know which IDs may travel to the interrupt
controller for which purpose, and they must be distinct.

I'm not sure whether there is actually a SoC today that is MSI-capable
and contains an IOMMU, but all the components to build one are out
there today.  GICv3 is also explicitly designed to support such
systems.


In the future, it is likely that HSA-style GPUs and other high-
throughput virtualisable bus mastering devices will have capabilities
of this sort, but I don't think there's anything concrete yet.


 how it might be wired up in hardware, but I don't know what it's good for,
 or who would actually do it.
 
   A variation would be to not use #iommu-cells at all, but provide a
   #address-cells / #size-cells pair in the IOMMU, and have a translation
   as we 

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-29 Thread Hiroshi Doyu

Thierry Reding thierry.red...@gmail.com writes:

 On Sun, Apr 27, 2014 at 08:23:06PM +0200, Arnd Bergmann wrote:
 On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
  +- mmu-masters: A phandle to device nodes representing the master for which
  +               the System MMU can provide a translation. Any additional values
  +               after the phandle will be ignored because a System MMU never
  +               have two or more masters. "#stream-id-cells" specified in the
  +               master's node will be also ignored.
  +               If more than one phandle is specified, only the first phandle
  +               will be treated.
 
 This seems completely backwards: Why would you list the masters for an IOMMU
 in the IOMMU node?
 
 The master should have a standard property pointing to the IOMMU instead.
 
 We don't have a generic binding for IOMMUs yet it seems, but the time is
 overdue to make one.
 
 Consider this NAKed until there is a generic binding for IOMMUs that all
 relevant developers have agreed to.

 I'd like to take this opportunity and revive one of the hibernating
 patch sets that we have for Tegra. The last effort to get things merged
 was back in January I think. I haven't bothered to look up the reference
 since it's probably good to start from scratch anyway.

 The latest version of the binding that was under discussion back then I
 think looked something like this:

   device@... {
           iommus = <&iommu [spec]>[, <&other_iommu [other_spec]>...];
   };

 And possibly with a iommu-names property to go along with that. The idea
 being that a device can be a master on possibly multiple IOMMUs. Using
 the above it would also be possible to have one device be multiple
 masters on the same IOMMU.

 On Tegra the specifier would be used to encode a memory controller's
 client ID. One discussion point back at the time was to encode the ID as
 a bitmask to allow more than a single master per entry. Another solution
 which I think is a little cleaner and more generic, would be to use one
 entry per master and use a single cell to encode the client ID. Devices
 with multiple clients to the same IOMMU could then use multiple entries
 referencing the same IOMMU.

 I've added Hiroshi Doyu on Cc since he knows the Tegra IOMMU best.
 Hiroshi, can you summarize exactly what the proposed bindings were. If
 my memory serves me well they were mostly along the lines of what Arnd
 proposes here, and perhaps they are something that can also be used for
 Exynos.

You can find the detail from:

[PATCHv7 09/12] iommu/tegra: smmu: get swgroups from DT iommus=
  http://lists.linuxfoundation.org/pipermail/iommu/2013-December/007212.html

You can specify any parameters which your iommu requires from a device,
and a device(master) can have multiple IOMMUs.

device@... {
        iommus = <&iommu [spec]>[, <&other_iommu [other_spec]>...];
};
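
For instance, a Tegra-style master with a one-cell memory client specifier
might end up looking roughly like this (the client IDs, addresses and node
names below are invented for illustration only):

    smmu: iommu@70019000 {
            compatible = "some,smmu";
            reg = <0x70019000 0x1000>;
            #iommu-cells = <1>;      /* one cell: swgroup/client ID */
    };

    display@54200000 {
            compatible = "some,display";
            reg = <0x54200000 0x40000>;
            /* two clients of the same device, one entry per client */
            iommus = <&smmu 0x06>, <&smmu 0x07>;
    };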


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-29 Thread Grant Grundler
On Tue, Apr 29, 2014 at 11:16 AM, Dave Martin dave.mar...@arm.com wrote:
...
 An IOMMU is really a specialised bridge

Is a GART a bridge?

IOMMUs can provide three basic functions:
1) remap address space to reach phys mem ranges that the device is
otherwise not capable of accessing (classic 32-bit DMA to reach 64-bit
Phys address)

2) implement scatter-gather (page level granularity) so the device
doesn't have to

3) provide some level of system protection against rogue DMA by
forcing everything through a DMA mapping interface and faulting when
encountering unmapped DMA transactions.


I summarize IOMMUs as: they participate in the routing of MMIO
transactions in the system fabric.
In this sense, IOMMUs are sort of a bridge. Defining what kind of
routing they can do (coalesce transactions? remap MMIO domains?)
and which address ranges they route would describe most of that
functionality.

This remapping of MMIO transactions is also usually asymmetric,
meaning the routing of downstream transactions *might* be completely
different than the routing + remapping of transactions heading
upstream. DMA mapping services are designed to handle only the
transactions generated (aka mastered) by a downstream device.


, so it may be cleaner to describe
 an IOMMU using a real bus node in the DT, if we also define a way to make
 master/slave linkages explicit where it matters.

"Where it matters" is a bit vague. Is the goal just to enable DMA
mapping services to do the right thing for a device that can
generate DMA?


 The problems of how to describe master/slave linkage, coherency between
 masters, and how to describe sideband ID information present on the bus
 are really interrelated.

 If we can come up with a consistent description for these things, it
 should help us to describe IOMMUs, bus mastering peripherals, MSI
 controllers and complex bridges in a more uniform way, without having to
 reinvent so much for each binding.  That's my hope anyway.

I don't know...these all deal with MMIO routing. The parts that are
dynamic should be detectable by platform specific SW (in general).
But where it's not (e.g. ISTR MSI transaction addressing is defined by
Intel Arch), it needs to be described by _something_  and device tree
seems to be a reasonable place to do that.

 I've been hacking around some proposals on these areas which are a bit
 different from the approach suggested here -- I'll try to summarise some
 of it intelligibly and post something tomorrow so that we can discuss.

Are you planning on consolidating Documentation/devicetree/bindings/iommu/ ?
Do you care about Documentation/Intel-IOMMU.txt?

Lots of stuff in Documentation/ touches on different parts of MMIO routing and its uses.

hth,
grant


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-29 Thread Arnd Bergmann
On Tuesday 29 April 2014 19:16:02 Dave Martin wrote:
 On Mon, Apr 28, 2014 at 09:55:00PM +0200, Arnd Bergmann wrote:
  On Monday 28 April 2014 20:30:56 Will Deacon wrote:
 
device@4 {
        compatible = "some,ethernet";
        iommus = <&{/iommu@1}>;
};

device@5 {
        compatible = "some,dmaengine";
        iommus = <&{/iommu@2} 0x4000 0x100>,
                 <&{/iommu@3} 0x101>;
};

The device at address 4 has a one-one relationship with iommu@1, so there
is no need for any data. device@5 has two master ports. One is connected to
an IOMMU that has a per-device aperture, device@5 can only issue transfers
to the 256MB area at 0x4000, and the IOMMU will have to put entries for
this device into that address. The second master port is connected to
iommu@3, which uses a master ID that gets passed along with each transfer,
so that needs to be put into the IOTLBs.
   
   I think this is definitely going in the right direction, but it's not clear
   to me how the driver for device@5 knows how to configure the two ports.
   We're still lacking topology information (unless that's implicit in the
   ordering of the properties) to say how the mastering capabilities of the
   device are actually routed and configured.

  It would be helpful to have a concrete example of a device that has multiple
  masters. I have heard people mention this multiple times, and I can understand
 
 You mean a device that contains multiple independent bus masters,
 right?  In particular, a device composed of multiple bus masters that
 do different things or should be handled differently by the interconnect.

Right.

 There has definitely been talk on the list about real devices that
 use multiple stream IDs.
 
 
 I'll ask around for device-like examples, but the most obvious
 example is the IOMMU itself.
 
 Transactions generated by the IOMMU clearly need to be handled
 differently by the interconnect, compared with transactions
 translated and forwarded by IOMMU on behalf of its clients.
 
 For example, suppose devices can post MSIs to an interrupt controller
 via a mailbox accessed through the IOMMU.  Suppose also that the IOMMU
 generates MSIs itself in order to signal management events or faults
 to a host OS.  Linux (as host) will need to configure the interrupt
 controller separately for the IOMMU and for the IOMMU clients.  This
 means that Linux needs to know which IDs may travel to the interrupt
 controller for which purpose, and they must be distinct.

I don't understand. An MSI controller is just an address that acts
as a DMA slave for a 4-byte inbound data packet. It has no way of
knowing who is sending data, other than by the address or the data
sent to it. Are you talking of something else?

 I'm not sure whether there is actually a SoC today that is MSI-capable
 and contains an IOMMU, but all the components to build one are out
 there today.  GICv3 is also explicitly designed to support such
 systems.

A lot of SoCs have MSI integrated into the PCI root complex, which
of course is pointless from an MSI perspective, as well as implying that
the MSI won't go through the IOMMU.

We have briefly mentioned MSI in the review of the Samsung GH7 PCI
support. It's possible that this one can either use the built-in
MSI or the one in the GICv2m.

 In the future, it is likely that HSA-style GPUs and other high-
 throughput virtualisable bus mastering devices will have capabilities
 of this sort, but I don't think there's anything concrete yet.

Wouldn't they just have IOMMUs with multiple contexts?

  how it might be wired up in hardware, but I don't know what it's good for,
  or who would actually do it.
  
A variation would be to not use #iommu-cells at all, but provide a
#address-cells / #size-cells pair in the IOMMU, and have a translation
as we do for dma-ranges. This is probably most flexible.
   
    That would also allow us to describe ranges of master IDs, which we need for
    things like PCI RCs on the ARM SMMU. Furthermore, basic transformations of
    these ranges could also be described like this, although I think Dave (CC'd)
    has some similar ideas in this area.
 
 Ideally, we would reuse the ePAPR ranges concept and describe the way
 sideband ID signals propagate down the bus hierarchy in a similar way.

It would be 'dma-ranges'. Unfortunately that would imply that each DMA
master is connected to only one IOMMU, which you say is not necessarily
the case. The simpler case, where a device is only a master on a single IOMMU
but can use multiple contexts, would however work fine with dma-ranges.

Arnd


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-29 Thread Arnd Bergmann
On Tuesday 29 April 2014 13:07:54 Grant Grundler wrote:
 On Tue, Apr 29, 2014 at 11:16 AM, Dave Martin dave.mar...@arm.com wrote:
 ...
  An IOMMU is really a specialised bridge
 
 Is a GART a bridge?
 
 IOMMUs can provide three basic functions:
 1) remap address space to reach phys mem ranges that the device is
 otherwise not capable of accessing (classic 32-bit DMA to reach 64-bit
 Phys address)
 
 2) implement scatter-gather (page level granularity) so the device
 doesn't have to
 
 3) provide some level of system protection against rogue DMA by
 forcing everything through a DMA mapping interface and faulting when
 encountering unmapped DMA transactions.

[ 4) provide isolation between multiple contexts, typically for purposes
 of virtualization]

 I summarize IOMMUs as: participate in the routing of MMIO
 transactions in the system fabric.
 In this sense, IOMMUs are sort of a bridge. Defining what kind of
 routing they can do (coalesce transactions? remapping MMIO domains?)
 and which address ranges they route would describe most of that
 functionality.
 
 This remapping of MMIO transaction is also usually asymmetric.
 Meaning routing of downstream transactions *might* be completely
 different than the routing + remapping of transactions heading
 upstream. DMA mappings services are designed to handle only the
 transactions generated (aka mastered) by a downstream device.

For the purposes of the DT binding, we have a 'ranges' property
that defines the downstream translation (CPU-to-MMIO) and a
'dma-ranges' property for the opposite address translation
(device-to-memory).
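
As a crude sketch of the two (addresses made up, one address/size cell
assumed throughout), a bus whose registers sit at an offset in the CPU's
view and whose masters see RAM at yet another offset could look like:

bus@80000000 {
        compatible = "simple-bus";
        #address-cells = <1>;
        #size-cells = <1>;
        /* downstream: child address 0x0 shows up at CPU address 0x80000000 */
        ranges = <0x0 0x80000000 0x10000000>;
        /* upstream: bus (DMA) address 0x40000000 hits RAM at physical 0x0 */
        dma-ranges = <0x40000000 0x0 0x40000000>;

        ethernet@10000 {
                compatible = "some,ethernet";
                reg = <0x10000 0x1000>;
        };
};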

 , so it may be cleaner to describe
  an IOMMU using a real bus node in the DT, if we also define a way to make
  master/slave linkages explicit where it matters.
 
 where it matters is a bit vague.  Is the goal to just enable DMA
 mapping services to do the right thing for a device that can
 generate DMA?

Yes. It's very complicated unfortunately, because we have to be
able to deal with arbitrary combinations of a lot of oddball cases
that can show up in random SoCs:

- device can only do DMA to a limited address range
- DMA is noncoherent and needs manual cache management
- DMA address is at an offset from physical address
- some devices have an IOMMU
- some IOMMUs are shared between devices
- some devices with IOMMU can have multiple simultaneous contexts
- a device may access some memory directly and some other memory through IOMMU
- a device may have DMA access to a bus that is invisible to the CPU
- DMA on some device is only coherent if the IOMMU is enabled
- DMA on some device is only coherent if the IOMMU is disabled
- the IOVA range to an IOMMU is device dependent

  I've been hacking around some proposals on these areas which are a bit
  different from the approach suggested here -- I'll try to summarise some
  of it intelligibly and post something tomorrow so that we can discuss.
 
 Are you planning on consolidating Documentation/devicetree/bindings/iommu/ ?
 Do you care about Documentation/Intel-IOMMU.txt?

I think we can ignore the Intel-IOMMU because that is specialized on PCI
devices, which we don't normally represent in DT. It is also a special
case because the Intel IOMMU is a single-instance device. If it's present
and enabled, it will be used by every device. The case we do need to describe
is when we don't know which IOMMU is used for which master, or how to
configure it.

Arnd


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Thierry Reding
On Sun, Apr 27, 2014 at 08:23:06PM +0200, Arnd Bergmann wrote:
 On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
  +- mmu-masters: A phandle to device nodes representing the master for which
  +   the System MMU can provide a translation. Any additional values
  +  after the phandle will be ignored because a System MMU never
  +  have two or more masters. #stream-id-cells specified in the
  +  master's node will be also ignored.
  +  If more than one phandle is specified, only the first phandle
  +  will be treated.
 
 This seems completely backwards: Why would you list the masters for an IOMMU
 in the IOMMU node?
 
 The master should have a standard property pointing to the IOMMU instead.
 
 We don't have a generic binding for IOMMUs yet it seems, but the time is
 overdue to make one.
 
 Consider this NAKed until there is a generic binding for IOMMUs that all
 relevant developers have agreed to.

I'd like to take this opportunity and revive one of the hibernating
patch sets that we have for Tegra. The last effort to get things merged
was back in January I think. I haven't bothered to look up the reference
since it's probably good to start from scratch anyway.

The latest version of the binding that was under discussion back then I
think looked something like this:

device@... {
        iommus = <&iommu [spec]>[, <&other_iommu [other_spec]>...];
};

And possibly with a iommu-names property to go along with that. The idea
being that a device can be a master on possibly multiple IOMMUs. Using
the above it would also be possible to have one device be multiple
masters on the same IOMMU.

On Tegra the specifier would be used to encode a memory controller's
client ID. One discussion point back at the time was to encode the ID as
a bitmask to allow more than a single master per entry. Another solution
which I think is a little cleaner and more generic, would be to use one
entry per master and use a single cell to encode the client ID. Devices
with multiple clients to the same IOMMU could then use multiple entries
referencing the same IOMMU.
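
As a concrete sketch of that last variant (the node names, unit addresses
and client IDs below are made up, this is not a real Tegra binding):

smmu: iommu@70019000 {
        compatible = "some,iommu";
        reg = <0x70019000 0x1000>;
        #iommu-cells = <1>;     /* one cell per entry: the client ID */
};

display@54200000 {
        compatible = "some,display";
        /* two clients of the same IOMMU, one entry per master */
        iommus = <&smmu 5>, <&smmu 6>;
};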

I've added Hiroshi Doyu on Cc since he knows the Tegra IOMMU best.
Hiroshi, can you summarize exactly what the proposed bindings were. If
my memory serves me well they were mostly along the lines of what Arnd
proposes here, and perhaps they are something that can also be used for
Exynos.

Will Deacon (I think) had some comments on the earlier discussion as
well, so I've added him on Cc for visibility. Sorry if I'm confusing you
with someone else, Will. In that case perhaps you know who to include in
the discussion from the ARM side.

Also adding Stephen Warren for visibility.

Thierry



Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Arnd Bergmann
On Monday 28 April 2014 12:39:20 Thierry Reding wrote:
 On Sun, Apr 27, 2014 at 08:23:06PM +0200, Arnd Bergmann wrote:
  On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
   +- mmu-masters: A phandle to device nodes representing the master for which
   +               the System MMU can provide a translation. Any additional values
   +               after the phandle will be ignored because a System MMU never
   +               have two or more masters. #stream-id-cells specified in the
   +               master's node will be also ignored.
   +               If more than one phandle is specified, only the first phandle
   +               will be treated.
  
  This seems completely backwards: Why would you list the masters for an IOMMU
  in the IOMMU node?
  
  The master should have a standard property pointing to the IOMMU instead.
  
  We don't have a generic binding for IOMMUs yet it seems, but the time is
  overdue to make one.
  
  Consider this NAKed until there is a generic binding for IOMMUs that all
  relevant developers have agreed to.
 
 I'd like to take this opportunity and revive one of the hibernating
 patch sets that we have for Tegra. The last effort to get things merged
 was back in January I think. I haven't bothered to look up the reference
 since it's probably good to start from scratch anyway.
 
 The latest version of the binding that was under discussion back then I
 think looked something like this:
 
   device@... {
    iommus = <&iommu [spec]>[, <&other_iommu [other_spec]>...];
   };
 
 And possibly with a iommu-names property to go along with that. The idea
 being that a device can be a master on possibly multiple IOMMUs. Using
 the above it would also be possible to have one device be multiple
 masters on the same IOMMU.

Yes, that seems reasonable. Just one question: How would you represent a
device that has multiple masters, with at least one connected to an IOMMU
and another one connected to memory directly, without going to the IOMMU?

 On Tegra the specifier would be used to encode a memory controller's
 client ID. One discussion point back at the time was to encode the ID as
 a bitmask to allow more than a single master per entry. Another solution
 which I think is a little cleaner and more generic, would be to use one
 entry per master and use a single cell to encode the client ID. Devices
 with multiple clients to the same IOMMU could then use multiple entries
 referencing the same IOMMU.

I'm not completely following here. Are you talking about the generic
binding, or the part that is tegra specific for the specifier?

My first impression is that the generic binding should just allow an
arbitrary specifier with a variable #iommu-cells, and leave the format
up to the IOMMU driver. A lot of drivers probably only support one
master, so they can just set #iommu-cells=0, others might require
IDs that do not fit into one cell.

Arnd



Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Thierry Reding
On Mon, Apr 28, 2014 at 12:56:03PM +0200, Arnd Bergmann wrote:
 On Monday 28 April 2014 12:39:20 Thierry Reding wrote:
  On Sun, Apr 27, 2014 at 08:23:06PM +0200, Arnd Bergmann wrote:
   On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
 +- mmu-masters: A phandle to device nodes representing the master for which
 +               the System MMU can provide a translation. Any additional values
 +               after the phandle will be ignored because a System MMU never
 +               have two or more masters. #stream-id-cells specified in the
 +               master's node will be also ignored.
 +               If more than one phandle is specified, only the first phandle
 +               will be treated.
   
   This seems completely backwards: Why would you list the masters for an 
   IOMMU
   in the IOMMU node?
   
   The master should have a standard property pointing to the IOMMU instead.
   
   We don't have a generic binding for IOMMUs yet it seems, but the time is
   overdue to make one.
   
   Consider this NAKed until there is a generic binding for IOMMUs that all
   relevant developers have agreed to.
  
  I'd like to take this opportunity and revive one of the hibernating
  patch sets that we have for Tegra. The last effort to get things merged
  was back in January I think. I haven't bothered to look up the reference
  since it's probably good to start from scratch anyway.
  
  The latest version of the binding that was under discussion back then I
  think looked something like this:
  
  device@... {
  iommus = <&iommu [spec]>[, <&other_iommu [other_spec]>...];
  };
  
  And possibly with a iommu-names property to go along with that. The idea
  being that a device can be a master on possibly multiple IOMMUs. Using
  the above it would also be possible to have one device be multiple
  masters on the same IOMMU.
 
 Yes, that seems reasonable. Just one question: How would you represent a
 device that has multiple masters, with at least one connected to an IOMMU
 and another one connected to memory directly, without going to the IOMMU?

Heh, I don't think I've ever thought about that use-case. I guess I was
always assuming that in the absence of an IOMMU the device would simply
access memory directly. From what I can tell that's how Tegra works at
least. If the IOMMU is not enabled for a given client, that client will
access physical memory untranslated.

I suppose if that really must be represented then a global dummy IOMMU
could be introduced to help with these cases.

  On Tegra the specifier would be used to encode a memory controller's
  client ID. One discussion point back at the time was to encode the ID as
  a bitmask to allow more than a single master per entry. Another solution
  which I think is a little cleaner and more generic, would be to use one
  entry per master and use a single cell to encode the client ID. Devices
  with multiple clients to the same IOMMU could then use multiple entries
  referencing the same IOMMU.
 
 I'm not completely following here. Are you talking about the generic
 binding, or the part that is tegra specific for the specifier?
 
 My first impression is that the generic binding should just allow an
 arbitrary specifier with a variable #iommu-cells, and leave the format
 up to the IOMMU driver.

Yes, I was getting ahead of myself. The idea was to have #iommu-cells
and allow the specifier to be IOMMU-specific. On Tegra that would
translate to the memory controller client ID, on other devices I suspect
something similar might exist, but for the generic binding it should be
completely opaque and hence irrelevant.

Really just like any of the other bindings that have foos and #foo-cells
properties.

 A lot of drivers probably only support one
 master, so they can just set #iommu-cells=0, others might require
 IDs that do not fit into one cell.

You mean #iommu-cells = 1 for devices that only require one master?
There still has to be one cell to specify which master. Unless perhaps
if they can be arbitrarily assigned. I guess even if there's a fixed
mapping that applies to one SoC generation, it might be good to still
employ a specifier and have the mapping in DT for flexibility.

Thierry



Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Arnd Bergmann
On Monday 28 April 2014 13:18:03 Thierry Reding wrote:
 On Mon, Apr 28, 2014 at 12:56:03PM +0200, Arnd Bergmann wrote:
  On Monday 28 April 2014 12:39:20 Thierry Reding wrote:
   And possibly with a iommu-names property to go along with that. The idea
   being that a device can be a master on possibly multiple IOMMUs. Using
   the above it would also be possible to have one device be multiple
   masters on the same IOMMU.
  
  Yes, that seems reasonable. Just one question: How would you represent a
  device that has multiple masters, with at least one connected to an IOMMU
  and another one connected to memory directly, without going to the IOMMU?
 
 Heh, I don't think I've ever thought about that use-case. I guess I was
 always assuming that in the absence of an IOMMU the device would simply
 access memory directly. From what I can tell that's how Tegra works at
 least. If the IOMMU is not enabled for a given client, that client will
 access physical memory untranslated.
 
 I suppose if that really must be represented then a global dummy IOMMU
 could be introduced to help with these cases.

It's actually not too uncommon: you can have e.g. the lower 2GB mapped
directly from the device address space into the host memory, but have
an iommu that translates accesses from some range in the upper 2GB of
the 32-bit address space into full 64-bit addresses.

This use case makes no sense if you use the IOMMU for isolation
or virtualization, but it gives better performance for lowmem access
when the only reason to have the IOMMU is to map highmem addresses.
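
To write that down you could combine the two mechanisms, roughly like
this (the highmem_iommu label, the addresses and the two-cell window
specifier are all made up for illustration):

bus@0 {
        compatible = "simple-bus";
        #address-cells = <1>;
        #size-cells = <1>;
        ranges;
        /* lower 2GB of the bus address space maps 1:1 onto host RAM */
        dma-ranges = <0x0 0x0 0x80000000>;

        device@6 {
                compatible = "some,highmem-dma";
                /* transfers above 2GB go through the (windowed) IOMMU */
                iommus = <&highmem_iommu 0x80000000 0x80000000>;
        };
};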

   On Tegra the specifier would be used to encode a memory controller's
   client ID. One discussion point back at the time was to encode the ID as
   a bitmask to allow more than a single master per entry. Another solution
   which I think is a little cleaner and more generic, would be to use one
   entry per master and use a single cell to encode the client ID. Devices
   with multiple clients to the same IOMMU could then use multiple entries
   referencing the same IOMMU.
  
  I'm not completely following here. Are you talking about the generic
  binding, or the part that is tegra specific for the specifier?
  
  My first impression is that the generic binding should just allow an
  arbitrary specifier with a variable #iommu-cells, and leave the format
  up to the IOMMU driver.
 
 Yes, I was getting ahead of myself. The idea was to have #iommu-cells
 and allow the specifier to be IOMMU-specific. On Tegra that would
 translate to the memory controller client ID, on other devices I suspect
 something similar might exist, but for the generic binding it should be
 completely opaque and hence irrelevant.
 
 Really just like any of the other bindings that have foos and #foo-cells
 properties.

Ok.

  A lot of drivers probably only support one
  master, so they can just set #iommu-cells=0, others might require
  IDs that do not fit into one cell.
 
 You mean #iommu-cells = 1 for devices that only require one master?

I meant an IOMMU device that acts as the slave for exactly one device,
even if that device has multiple master ports.

 There still has to be one cell to specify which master. Unless perhaps
 if they can be arbitrarily assigned. I guess even if there's a fixed
 mapping that applies to one SoC generation, it might be good to still
 employ a specifier and have the mapping in DT for flexibility.

let me clarify by example:

iommu@1 {
        compatible = "some,simple-iommu";
        reg = <1>;
        #iommu-cells = <0>; /* supports only one master */
};

iommu@2 {
        compatible = "some,other-iommu";
        reg = <3>;
        #iommu-cells = <1>; /* contains master ID */
};

iommu@3 {
        compatible = "some,windowed-iommu";
        reg = <2>;
        #iommu-cells = <2>; /* contains dma-window */
};

device@4 {
        compatible = "some,ethernet";
        iommus = <&{/iommu@1}>;
};

device@5 {
        compatible = "some,dmaengine";
        iommus = <&{/iommu@2} 0x4000 0x100>,
                 <&{/iommu@3} 0x101>;
};

The device at address 4 has a one-one relationship with iommu@1, so there
is no need for any data. device@5 has two master ports. One is connected to
an IOMMU that has a per-device aperture, device@5 can only issue transfers
to the 256MB area at 0x4000, and the IOMMU will have to put entries for
this device into that address. The second master port is connected to
iommu@3, which uses a master ID that gets passed along with each transfer,
so that needs to be put into the IOTLBs.

A variation would be to not use #iommu-cells at all, but provide a
#address-cells / #size-cells pair in the IOMMU, and have a translation
as we do for dma-ranges. This is probably most flexible.
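
One possible reading of that variation, just as a sketch and not a
worked-out binding, would be dma-ranges-style entries in the specifier,
with the cell counts taken from the IOMMU node (cell meanings made up):

iommu@3 {
        compatible = "some,windowed-iommu";
        reg = <2>;
        #address-cells = <1>;
        #size-cells = <1>;
};

device@5 {
        compatible = "some,dmaengine";
        /* device bus address, IOMMU input address, window size */
        iommus = <&{/iommu@3} 0x40000000 0x0 0x10000000>;
};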

One completely open question that I just noticed is how the kernel should
deal with the case of multiple IOMMUs attached to one master: the
data structures we have assume that we know exactly how to do DMA by
setting the per-device dma_map_ops (iommu or not, coherent or not),
and by setting a pointer to at most one IOMMU.

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Thierry Reding
On Mon, Apr 28, 2014 at 02:05:30PM +0200, Arnd Bergmann wrote:
 On Monday 28 April 2014 13:18:03 Thierry Reding wrote:
  On Mon, Apr 28, 2014 at 12:56:03PM +0200, Arnd Bergmann wrote:
   On Monday 28 April 2014 12:39:20 Thierry Reding wrote:
And possibly with a iommu-names property to go along with that. The idea
being that a device can be a master on possibly multiple IOMMUs. Using
the above it would also be possible to have one device be multiple
masters on the same IOMMU.
   
   Yes, that seems reasonable. Just one question: How would you represent a
   device that has multiple masters, with at least one connected to an IOMMU
   and another one connected to memory directly, without going to the IOMMU?
  
  Heh, I don't think I've ever thought about that use-case. I guess I was
  always assuming that in the absence of an IOMMU the device would simply
  access memory directly. From what I can tell that's how Tegra works at
  least. If the IOMMU is not enabled for a given client, that client will
  access physical memory untranslated.
  
  I suppose if that really must be represented then a global dummy IOMMU
  could be introduced to help with these cases.
 
 It's actually not too uncommon: you can have e.g. the lower 2GB mapped
 directly from the device address space into the host memory, but have
 an iommu that translates accesses from some range in the upper 2GB of
 the 32-bit address space into full 64-bit addresses.
 
 This use case makes no sense if you use the IOMMU for isolation
 or virtualization, but it gives better performance for lowmem access
 when the only reason to have the IOMMU is to map highmem addresses.

Thinking about this some more, isn't the non-IOMMU master something we
can completely ignore in the DT? Or at least it shouldn't be handled by
the IOMMU bindings because, well, it's not an IOMMU to begin with.

Perhaps it's something that should be described using dma-ranges?

   A lot of drivers probably only support one
   master, so they can just set #iommu-cells=0, others might require
   IDs that do not fit into one cell.
  
  You mean #iommu-cells = 1 for devices that only require one master?
 
 I meant an IOMMU device that acts as the slave for exactly one device,
 even if that device has multiple master ports.

Okay, makes sense. I guess depending on the nature of the IOMMU it might
make sense not to expose it as an IOMMU at all. For example if it lives
completely within the register space of its master device. In that case
it could be directly programmed from the device's driver.

  There still has to be one cell to specify which master. Unless perhaps
  if they can be arbitrarily assigned. I guess even if there's a fixed
  mapping that applies to one SoC generation, it might be good to still
  employ a specifier and have the mapping in DT for flexibility.
 
 let me clarify by example:
 
  iommu@1 {
          compatible = "some,simple-iommu";
          reg = <1>;
          #iommu-cells = <0>; /* supports only one master */
  };

  iommu@2 {
          compatible = "some,other-iommu";
          reg = <3>;
          #iommu-cells = <1>; /* contains master ID */
  };

  iommu@3 {
          compatible = "some,windowed-iommu";
          reg = <2>;
          #iommu-cells = <2>; /* contains dma-window */
  };

  device@4 {
          compatible = "some,ethernet";
          iommus = <&{/iommu@1}>;
  };

  device@5 {
          compatible = "some,dmaengine";
          iommus = <&{/iommu@2} 0x4000 0x100>,
                   <&{/iommu@3} 0x101>;
  };
 
 The device at address 4 has a one-one relationship with iommu@1, so there
 is no need for any data. device@5 has two master ports. One is connected to
 an IOMMU that has a per-device aperture, device@5 can only issue transfers
 to the 256MB area at 0x4000, and the IOMMU will have to put entries for
 this device into that address. The second master port is connected to
 iommu@3, which uses a master ID that gets passed along with each transfer,
 so that needs to be put into the IOTLBs.

The above sounds reasonable to me with the exception of the DMA window
specifier. Isn't that precisely the information that we currently
describe using the dma-ranges property?
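
That is, for the windowed port one could imagine writing something like
the following instead of putting the window into the specifier (the
windowed_iommu label and the numbers are made up, a 256MB aperture is
just used as an example):

device@5 {
        compatible = "some,dmaengine";
        /* plain reference, no window cells in the specifier */
        iommus = <&windowed_iommu>;
        /* the per-device aperture described as an address translation */
        dma-ranges = <0x40000000 0x40000000 0x10000000>;
};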

 A variation would be to not use #iommu-cells at all, but provide a
 #address-cells / #size-cells pair in the IOMMU, and have a translation
 as we do for dma-ranges. This is probably most flexible.

I'm not sure I follow. Wouldn't that require masters to be children of
the IOMMU DT nodes for that to work out? Also how would that work for
cases where more data than the address ranges (such as the master ID) is
needed to operate the IOMMU?

 One completely open question that I just noticed is how the kernel should
 deal with the case of multiple IOMMUs attached to one master: the
 data structures we have assume that we know exactly how to do DMA by
 setting the per-device dma_map_ops (iommu or not, coherent or not), and by
 setting a pointer to at most one IOMMU.

Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Stephen Warren
On 04/28/2014 05:18 AM, Thierry Reding wrote:
 On Mon, Apr 28, 2014 at 12:56:03PM +0200, Arnd Bergmann wrote:
...
 A lot of drivers probably only support one
 master, so they can just set #iommu-cells=0, others might require
 IDs that do not fit into one cell.
 
 You mean #iommu-cells = 1 for devices that only require one master?
 There still has to be one cell to specify which master. Unless perhaps
 if they can be arbitrarily assigned. I guess even if there's a fixed
 mapping that applies to one SoC generation, it might be good to still
 employ a specifier and have the mapping in DT for flexibility.

#iommu-cells doesn't include the phandle, so if you want the client
references to be:

property = <&iommu>;

then that's #iommu-cells = <0>, whereas:

property = <&iommu N>;

is #iommu-cells = <1>.
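
Spelled out as complete (made-up) nodes, the two cases look like:

iommu_a: iommu@a {
        compatible = "some,simple-iommu";
        #iommu-cells = <0>;
};

iommu_b: iommu@b {
        compatible = "some,other-iommu";
        #iommu-cells = <1>;
};

client@c {
        compatible = "some,client";
        /* bare phandle for iommu_a, phandle plus one ID cell for iommu_b */
        iommus = <&iommu_a>, <&iommu_b 0x42>;
};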




Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-28 Thread Will Deacon
Hi Arnd,

[and thanks Thierry for CCing me -- I have been tangled up with this before
:)]

On Mon, Apr 28, 2014 at 01:05:30PM +0100, Arnd Bergmann wrote:
 On Monday 28 April 2014 13:18:03 Thierry Reding wrote:
  There still has to be one cell to specify which master. Unless perhaps
  if they can be arbitrarily assigned. I guess even if there's a fixed
  mapping that applies to one SoC generation, it might be good to still
  employ a specifier and have the mapping in DT for flexibility.
 
 let me clarify by example:
 
   iommu@1 {
           compatible = "some,simple-iommu";
           reg = <1>;
           #iommu-cells = <0>; /* supports only one master */
   };

   iommu@2 {
           compatible = "some,other-iommu";
           reg = <3>;
           #iommu-cells = <1>; /* contains master ID */
   };

   iommu@3 {
           compatible = "some,windowed-iommu";
           reg = <2>;
           #iommu-cells = <2>; /* contains dma-window */
   };

   device@4 {
           compatible = "some,ethernet";
           iommus = <&{/iommu@1}>;
   };

   device@5 {
           compatible = "some,dmaengine";
           iommus = <&{/iommu@2} 0x4000 0x100>,
                    <&{/iommu@3} 0x101>;
   };
 
 The device at address 4 has a one-one relationship with iommu@1, so there
 is no need for any data. device@5 has two master ports. One is connected to
 an IOMMU that has a per-device aperture, device@5 can only issue transfers
 to the 256MB area at 0x4000, and the IOMMU will have to put entries for
 this device into that address. The second master port is connected to
 iommu@3, which uses a master ID that gets passed along with each transfer,
 so that needs to be put into the IOTLBs.

I think this is definitely going in the right direction, but it's not clear
to me how the driver for device@5 knows how to configure the two ports.
We're still lacking topology information (unless that's implicit in the
ordering of the properties) to say how the mastering capabilities of the
device are actually routed and configured.

 A variation would be to not use #iommu-cells at all, but provide a
 #address-cells / #size-cells pair in the IOMMU, and have a translation
 as we do for dma-ranges. This is probably most flexible.

That would also allow us to describe ranges of master IDs, which we need for
things like PCI RCs on the ARM SMMU. Furthermore, basic transformations of
these ranges could also be described like this, although I think Dave (CC'd)
has some similar ideas in this area.

 One completely open question that I just noticed is how the kernel should
 deal with the case of multiple IOMMUs attached to one master: the
 data structures we have assume that we know exactly how to do DMA by
 setting the per-device dma_map_ops (iommu or not, coherent or not),
 and by setting a pointer to at most one IOMMU.

Agreed.

Will


Re: [PATCH v12 11/31] documentation: iommu: add binding document of Exynos System MMU

2014-04-27 Thread Arnd Bergmann
On Sunday 27 April 2014 13:07:43 Shaik Ameer Basha wrote:
 +- mmu-masters: A phandle to device nodes representing the master for which
 +   the System MMU can provide a translation. Any additional values
 +  after the phandle will be ignored because a System MMU never
 +  have two or more masters. #stream-id-cells specified in the
 +  master's node will be also ignored.
 +  If more than one phandle is specified, only the first phandle
 +  will be treated.

This seems completely backwards: Why would you list the masters for an IOMMU
in the IOMMU node?

The master should have a standard property pointing to the IOMMU instead.

We don't have a generic binding for IOMMUs yet it seems, but the time is
overdue to make one.

Consider this NAKed until there is a generic binding for IOMMUs that all
relevant developers have agreed to.

Arnd