Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-10-06 Thread Thierry Reding
On Fri, Oct 03, 2014 at 04:08:50PM +0100, Will Deacon wrote:
 Hi Thierry,
 
 On Wed, Oct 01, 2014 at 09:46:10AM +0100, Thierry Reding wrote:
  On Tue, Sep 30, 2014 at 05:00:35PM +0100, Will Deacon wrote:
   On Thu, Sep 25, 2014 at 07:40:23AM +0100, Thierry Reding wrote:
  [...]
So I think what we're going to need is a way to prevent the default
attachment to DMA/IOMMU. Or alternatively not associate devices with
IOMMU domains by default but let drivers explicitly make the decision.
   
   Which drivers and how would they know what to do? I think you might be
   jumping the gun a bit here, given where mainline is with using the IOMMU
   for anything at all.
  
  I don't think I am. I've been working on patches to enable IOMMU on
  Tegra, with the specific use-case that we want to use it to allow
  physically non-contiguous framebuffers to be used for scan out.
  
  In order to do so the DRM driver allocates an IOMMU domain and adds both
  display controllers to it. When a framebuffer is created or imported
  from DMA-BUF, it gets mapped into this domain and both display
  controllers can use the IOVA address as the framebuffer base address.
 
 Does that mean you manually swizzle the dma_map_ops for the device in the
 DRM driver?

No. It means we use the IOMMU API directly instead of the DMA mapping
API.
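
For reference, the pattern is roughly the following (a simplified sketch;
the helper name and the two struct device pointers are made up for
illustration, and the error handling is abbreviated):

  #include <linux/device.h>
  #include <linux/iommu.h>
  #include <linux/platform_device.h>

  /* Attach both display controllers to one shared domain; framebuffers
   * are then mapped into it with iommu_map() when created or imported. */
  static int example_attach_displays(struct device *dca, struct device *dcb)
  {
  	struct iommu_domain *domain;
  	int err;

  	domain = iommu_domain_alloc(&platform_bus_type);
  	if (!domain)
  		return -ENOMEM;

  	err = iommu_attach_device(domain, dca);
  	if (err < 0)
  		goto free;

  	err = iommu_attach_device(domain, dcb);
  	if (err < 0)
  		goto detach;

  	return 0;

  detach:
  	iommu_detach_device(domain, dca);
  free:
  	iommu_domain_free(domain);
  	return err;
  }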

  Given that a device can only be attached to a single domain at a time
  this will cause breakage when the ARM glue code starts automatically
  attaching the display controllers to a default domain.
 
 Why couldn't you just re-use the domain already allocated by the DMA mapping
 API?

Because I don't see how you'd get access to it. And provided that we
could do that it would also mean that there'd be at least two domains
(one for each display controller) and we'd need to decide on using a
single one of them. Which one do we choose? And what about the unused
one? If there's no way to detach it we lose a precious resource.

  What I proposed a while back was to leave it up to the IOMMU driver to
  choose an allocator for the device. Or rather, choose whether to use a
  custom allocator or the DMA/IOMMU integration allocator. The way this
  worked was to keep a list of devices in the IOMMU driver. Devices in
  this list would be added to a domain reserved for DMA/IOMMU integration.
  Those would typically be devices such as SD/MMC, audio, ... devices that
  are in-kernel and need no per-process separation. By default devices
  wouldn't be added to a domain, so devices forming a composite DRM device
  would be able to manage their own domain.
 
 I'd like to have as little of this as possible in the IOMMU drivers, as we
 should leave those to deal with the IOMMU hardware and not domain
 management. Having subsystems manage their own dma ops is an extension to
 the dma-mapping API.

It's not an extension, really. It's more that both need to be able to
coexist. For some devices you may want to create an IOMMU domain and
hook it up with the DMA mapping functions, for others you don't and
handle mapping to IOVA space explicitly.
   
   I think it's an extension in the sense that mainline doesn't currently do
   what you want, regardless of this patch series.
  
  It's interesting since you're now the second person to say this. Can you
  please elaborate why you think that's the case?
 
 Because the only way to set up DMA through an IOMMU on ARM is via the
 arm_iommu_* functions,

No, you can use the IOMMU API directly just fine.

 which are currently called from a subset of the IOMMU drivers themselves:
 
   drivers/gpu/drm/exynos/exynos_drm_iommu.c
   drivers/iommu/ipmmu-vmsa.c
   drivers/iommu/shmobile-iommu.c
   drivers/media/platform/omap3isp/isp.c
 
 Of these, ipmmu-vmsa.c and shmobile.c both allocate a domain per device.
 The omap3 code seems to do something similar. That just leaves the exynos
 driver, which Marek has been reworking anyway.

Right, and as I remember one of the things that Marek did was introduce
a flag to mark drivers as doing their own IOMMU domain management so
that they wouldn't be automatically associated with a mapping.

  I do have local patches that allow precisely this use-case to work
  without changes to the IOMMU core or requiring any extra ARM-specific
  glue.
  
  There's a fair bit of jumping through hoops, because for example you
  don't know what IOMMU instance a domain belongs to at .domain_init()
  time, so I have to defer most of the actual domain initialization until a
  device is actually attached to it, but I digress.
  
Doing so would leave a large number of address spaces available for
things like a GPU driver to keep per-process address spaces for
isolation.

I don't see how we'd be able to do that with the approach that you
propose in this series since it assumes that each device will be
associated with a separate domain.

Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-10-06 Thread Laurent Pinchart
Hi Thierry and Will,

On Monday 06 October 2014 11:52:50 Thierry Reding wrote:
 On Fri, Oct 03, 2014 at 04:08:50PM +0100, Will Deacon wrote:
  On Wed, Oct 01, 2014 at 09:46:10AM +0100, Thierry Reding wrote:
  On Tue, Sep 30, 2014 at 05:00:35PM +0100, Will Deacon wrote:
  On Thu, Sep 25, 2014 at 07:40:23AM +0100, Thierry Reding wrote:
  [...]
  
  So I think what we're going to need is a way to prevent the default
  attachment to DMA/IOMMU. Or alternatively not associate devices with
  IOMMU domains by default but let drivers explicitly make the
  decision.
  
  Which drivers and how would they know what to do? I think you might be
  jumping the gun a bit here, given where mainline is with using the
  IOMMU for anything at all.
  
  I don't think I am. I've been working on patches to enable IOMMU on
  Tegra, with the specific use-case that we want to use it to allow
  physically non-contiguous framebuffers to be used for scan out.
  
  In order to do so the DRM driver allocates an IOMMU domain and adds both
  display controllers to it. When a framebuffer is created or imported
  from DMA-BUF, it gets mapped into this domain and both display
  controllers can use the IOVA address as the framebuffer base address.
  
  Does that mean you manually swizzle the dma_map_ops for the device in the
  DRM driver?
 
 No. It means we use the IOMMU API directly instead of the DMA mapping
 API.

Is there a reason why you can't use the DMA mapping API for this, assuming of 
course that it would provide a way to attach both display controllers to the 
same domain ? Do you need to have explicit control over the VA at which the 
buffers are mapped ?

  Given that a device can only be attached to a single domain at a time
  this will cause breakage when the ARM glue code starts automatically
  attaching the display controllers to a default domain.
  
  Why couldn't you just re-use the domain already allocated by the DMA
  mapping API?
 
 Because I don't see how you'd get access to it. And provided that we
 could do that it would also mean that there'd be at least two domains
 (one for each display controller) and we'd need to decide on using a
 single one of them. Which one do we choose? And what about the unused
 one? If there's no way to detach it we lose a precious resource.

This would also be an issue for my Renesas IOMMU (ipmmu-vmsa) use cases. The 
IOMMU supports up to four domains (each of them having its own hardware TLB) 
and shares them between all the bus masters connected to the IOMMU. The 
connections between bus master and TLBs are configurable. I thus can't live 
with one domain being created per device.

  What I proposed a while back was to leave it up to the IOMMU
  driver to choose an allocator for the device. Or rather, choose
  whether to use a custom allocator or the DMA/IOMMU integration
  allocator. The way this worked was to keep a list of devices in
  the IOMMU driver. Devices in this list would be added to a domain
  reserved for DMA/IOMMU integration. Those would typically be
  devices such as SD/MMC, audio, ... devices that are in-kernel
  and need no per-process separation. By default devices wouldn't
  be added to a domain, so devices forming a composite DRM device
  would be able to manage their own domain.

The problem with your solution is that it requires knowledge of all bus master 
devices in the IOMMU driver. That's not where that knowledge belongs, as it's 
a property of a particular SoC integration, not of the IOMMU itself.

  I'd like to have as little of this as possible in the IOMMU
  drivers, as we should leave those to deal with the IOMMU hardware
  and not domain management. Having subsystems manage their own dma
  ops is an extension to the dma-mapping API.
  
  It's not an extension, really. It's more that both need to be able
  to coexist. For some devices you may want to create an IOMMU domain
  and hook it up with the DMA mapping functions, for others you don't
  and handle mapping to IOVA space explicitly.
  
  I think it's an extension in the sense that mainline doesn't currently
  do what you want, regardless of this patch series.
  
  It's interesting since you're now the second person to say this. Can you
  please elaborate why you think that's the case?
  
  Because the only way to set up DMA through an IOMMU on ARM is via the
  arm_iommu_* functions,
 
 No, you can use the IOMMU API directly just fine.
 
  which are currently called from a subset of the IOMMU drivers themselves:
   drivers/gpu/drm/exynos/exynos_drm_iommu.c
   drivers/iommu/ipmmu-vmsa.c
   drivers/iommu/shmobile-iommu.c
   drivers/media/platform/omap3isp/isp.c
  
  Of these, ipmmu-vmsa.c and shmobile.c both allocate a domain per device.
  The omap3 code seems to do something similar. That just leaves the exynos
  driver, which Marek has been reworking anyway.
 
 Right, and as I remember one of the things that Marek did was introduce
 a flag to mark drivers as doing their own IOMMU domain management so
 that they wouldn't be automatically associated with a mapping.

Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-10-06 Thread Thierry Reding
On Mon, Oct 06, 2014 at 01:50:40PM +0300, Laurent Pinchart wrote:
 Hi Thierry and Will,
 
 On Monday 06 October 2014 11:52:50 Thierry Reding wrote:
  On Fri, Oct 03, 2014 at 04:08:50PM +0100, Will Deacon wrote:
   On Wed, Oct 01, 2014 at 09:46:10AM +0100, Thierry Reding wrote:
   On Tue, Sep 30, 2014 at 05:00:35PM +0100, Will Deacon wrote:
   On Thu, Sep 25, 2014 at 07:40:23AM +0100, Thierry Reding wrote:
   [...]
   
   So I think what we're going to need is a way to prevent the default
   attachment to DMA/IOMMU. Or alternatively not associate devices with
   IOMMU domains by default but let drivers explicitly make the
   decision.
   
   Which drivers and how would they know what to do? I think you might be
   jumping the gun a bit here, given where mainline is with using the
   IOMMU for anything at all.
   
   I don't think I am. I've been working on patches to enable IOMMU on
   Tegra, with the specific use-case that we want to use it to allow
   physically non-contiguous framebuffers to be used for scan out.
   
   In order to do so the DRM driver allocates an IOMMU domain and adds both
   display controllers to it. When a framebuffer is created or imported
   from DMA-BUF, it gets mapped into this domain and both display
   controllers can use the IOVA address as the framebuffer base address.
   
   Does that mean you manually swizzle the dma_map_ops for the device in the
   DRM driver?
  
  No. It means we use the IOMMU API directly instead of the DMA mapping
  API.
 
 Is there a reason why you can't use the DMA mapping API for this, assuming of 
 course that it would provide a way to attach both display controllers to the 
 same domain ? Do you need to have explicit control over the VA at which the 
 buffers are mapped ?

I suppose I could use the DMA mapping API at least for the display parts
if both controllers could be attached to the same domain. However when
we get to the 2D and 3D parts we will probably want to switch the IOMMU
domain depending on the userspace context to prevent applications from
stepping on each other's toes.
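
The switch itself would be little more than re-attaching the engine's
master interface to the domain of the incoming process, along these lines
(hypothetical helper, purely for illustration):

  #include <linux/device.h>
  #include <linux/iommu.h>

  /* Hypothetical sketch: on a context switch, point the 2D/3D engine at
   * the IOMMU domain belonging to the process whose jobs will run next. */
  static int engine_switch_context(struct device *engine,
  				   struct iommu_domain *old_domain,
  				   struct iommu_domain *new_domain)
  {
  	if (old_domain)
  		iommu_detach_device(old_domain, engine);

  	return iommu_attach_device(new_domain, engine);
  }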

And no, I don't need to have explicit control over which VA the buffers
get mapped to.

   Given that a device can only be attached to a single domain at a time
   this will cause breakage when the ARM glue code starts automatically
   attaching the display controllers to a default domain.
   
   Why couldn't you just re-use the domain already allocated by the DMA
   mapping API?
  
  Because I don't see how you'd get access to it. And provided that we
  could do that it would also mean that there'd be at least two domains
  (one for each display controller) and we'd need to decide on using a
  single one of them. Which one do we choose? And what about the unused
  one? If there's no way to detach it we lose a precious resource.
 
 This would also be an issue for my Renesas IOMMU (ipmmu-vmsa) use cases. The 
 IOMMU supports up to four domains (each of them having its own hardware TLB) 
 and shares them between all the bus masters connected to the IOMMU. The 
 connections between bus master and TLBs are configurable. I thus can't live 
 with one domain being created per device.

I suppose one could fake this behind the curtains by making several
domains correspond to the same TLB (it sounds like pretty much the same
concept as an address space on Tegra). But that's just really nasty in
my opinion.

   What I proposed a while back was to leave it up to the IOMMU
   driver to choose an allocator for the device. Or rather, choose
   whether to use a custom allocator or the DMA/IOMMU integration
   allocator. The way this worked was to keep a list of devices in
   the IOMMU driver. Devices in this list would be added to domain
   reserved for DMA/IOMMU integration. Those would typically be
   devices such as SD/MMC, audio, ... devices that are in-kernel
   and need no per-process separation. By default devices wouldn't
   be added to a domain, so devices forming a composite DRM device
   would be able to manage their own domain.
 
 The problem with your solution is that it requires knowledge of all bus
 master devices in the IOMMU driver. That's not where that knowledge
 belongs, as it's a property of a particular SoC integration, not of the
 IOMMU itself.

Right. It will work nicely on Tegra where the IOMMU is closely tied to
the memory controller and therefore does in fact know about all of the
masters. It won't work for something more generic like the ARM SMMU
where the SoC integration really isn't a property of the IOMMU itself.

So Marek's proposal to mark drivers that don't need or want the DMA API
integration sounds like a pretty good alternative to me.
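
Conceptually that only needs a single opt-out bit that the arch glue
checks before wiring up the default mapping. The sketch below is
hypothetical and not Marek's actual patch, just to illustrate the shape
of it:

  #include <linux/device.h>

  /* Hypothetical opt-out flag: a driver that manages its own IOMMU
   * domains asks the core to skip the default attachment. */
  struct iommu_aware_driver {
  	struct device_driver drv;
  	bool manages_iommu_domain;	/* set by e.g. a composite DRM driver */
  };

  /* Would be consulted by the arch glue before creating a default mapping. */
  static bool wants_default_iommu_mapping(const struct iommu_aware_driver *d)
  {
  	return !d->manages_iommu_domain;
  }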

   Yes, that's the plan. Having thought about it some more (after your
   comments), subsystems can still call of_dma_deconfigure if they want to do
   their own IOMMU domain management. That may well be useful for things like
   VFIO, for example.
  
  I think it's really weird to set up some complicated 

Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-10-03 Thread Will Deacon
Hi Thierry,

On Wed, Oct 01, 2014 at 09:46:10AM +0100, Thierry Reding wrote:
 On Tue, Sep 30, 2014 at 05:00:35PM +0100, Will Deacon wrote:
  On Thu, Sep 25, 2014 at 07:40:23AM +0100, Thierry Reding wrote:
 [...]
   So I think what we're going to need is a way to prevent the default
   attachment to DMA/IOMMU. Or alternatively not associate devices with
   IOMMU domains by default but let drivers explicitly make the decision.
  
  Which drivers and how would they know what to do? I think you might be
  jumping the gun a bit here, given where mainline is with using the IOMMU
  for anything at all.
 
 I don't think I am. I've been working on patches to enable IOMMU on
 Tegra, with the specific use-case that we want to use it to allow
 physically non-contiguous framebuffers to be used for scan out.
 
 In order to do so the DRM driver allocates an IOMMU domain and adds both
 display controllers to it. When a framebuffer is created or imported
 from DMA-BUF, it gets mapped into this domain and both display
 controllers can use the IOVA address as the framebuffer base address.

Does that mean you manually swizzle the dma_map_ops for the device in the
DRM driver?

 Given that a device can only be attached to a single domain at a time
 this will cause breakage when the ARM glue code starts automatically
 attaching the display controllers to a default domain.

Why couldn't you just re-use the domain already allocated by the DMA mapping
API?

 What I proposed a while back was to leave it up to the IOMMU driver to
 choose an allocator for the device. Or rather, choose whether to use a
 custom allocator or the DMA/IOMMU integration allocator. The way this
 worked was to keep a list of devices in the IOMMU driver. Devices in
 this list would be added to a domain reserved for DMA/IOMMU integration.
 Those would typically be devices such as SD/MMC, audio, ... devices that
 are in-kernel and need no per-process separation. By default devices
 wouldn't be added to a domain, so devices forming a composite DRM device
 would be able to manage their own domain.

I'd like to have as little of this as possible in the IOMMU drivers, as we
should leave those to deal with the IOMMU hardware and not domain
management. Having subsystems manage their own dma ops is an extension to
the dma-mapping API.
   
   It's not an extension, really. It's more that both need to be able to
   coexist. For some devices you may want to create an IOMMU domain and
   hook it up with the DMA mapping functions, for others you don't and
   handle mapping to IOVA space explicitly.
  
  I think it's an extension in the sense that mainline doesn't currently do
  what you want, regardless of this patch series.
 
 It's interesting since you're now the second person to say this. Can you
 please elaborate why you think that's the case?

Because the only way to set up DMA through an IOMMU on ARM is via the
arm_iommu_* functions, which are currently called from a subset of the
IOMMU drivers themselves:

  drivers/gpu/drm/exynos/exynos_drm_iommu.c
  drivers/iommu/ipmmu-vmsa.c
  drivers/iommu/shmobile-iommu.c
  drivers/media/platform/omap3isp/isp.c

Of these, ipmmu-vmsa.c and shmobile.c both allocate a domain per device.
The omap3 code seems to do something similar. That just leaves the exynos
driver, which Marek has been reworking anyway.
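
For reference, what those callers do today is essentially the following
(the base and size values are placeholders here):

  #include <linux/dma-mapping.h>
  #include <linux/err.h>
  #include <linux/platform_device.h>
  #include <linux/sizes.h>
  #include <asm/dma-iommu.h>

  static int example_setup_iommu_mapping(struct device *dev)
  {
  	struct dma_iommu_mapping *mapping;
  	int err;

  	/* create a per-device domain plus an IOVA allocator for it */
  	mapping = arm_iommu_create_mapping(&platform_bus_type, SZ_1G, SZ_1G);
  	if (IS_ERR(mapping))
  		return PTR_ERR(mapping);

  	/* attach the device and install the IOMMU-backed dma_map_ops */
  	err = arm_iommu_attach_device(dev, mapping);
  	if (err < 0) {
  		arm_iommu_release_mapping(mapping);
  		return err;
  	}

  	return 0;
  }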

 I do have local patches that allow precisely this use-case to work
 without changes to the IOMMU core or requiring any extra ARM-specific
 glue.
 
 There's a fair bit of jumping through hoops, because for example you
 don't know what IOMMU instance a domain belongs to at .domain_init()
 time, so I have to defer most of the actual domain initialization until a
 device is actually attached to it, but I digress.
 
   Doing so would leave a large number of address spaces available for
   things like a GPU driver to keep per-process address spaces for
   isolation.
   
   I don't see how we'd be able to do that with the approach that you
   propose in this series since it assumes that each device will be
   associated with a separate domain.
  
  No, that's an artifact of the existing code on ARM. My series adds a list of
  domains to each device, but those domains are per-IOMMU instance and can
  appear in multiple lists.
 
 So you're saying the end result will be that there's a single domain per
 IOMMU device that will be associated with all devices that have a master
 interface to it?

Yes, that's the plan. Having thought about it some more (after your
comments), subsystems can still call of_dma_deconfigure if they want to do
their own IOMMU domain management. That may well be useful for things like
VFIO, for example.
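
i.e. a subsystem like VFIO could do something along these lines (this
assumes of_dma_deconfigure() from this series only needs the struct
device; the rest is illustrative):

  #include <linux/device.h>
  #include <linux/iommu.h>

  static int subsystem_take_over_iommu(struct device *dev,
  				       struct iommu_domain **out)
  {
  	struct iommu_domain *domain;
  	int err;

  	/* drop the default mapping set up at probe time;
  	 * of_dma_deconfigure() is introduced by this series */
  	of_dma_deconfigure(dev);

  	/* ... and manage a private domain for the device instead */
  	domain = iommu_domain_alloc(dev->bus);
  	if (!domain)
  		return -ENOMEM;

  	err = iommu_attach_device(domain, dev);
  	if (err < 0) {
  		iommu_domain_free(domain);
  		return err;
  	}

  	*out = domain;
  	return 0;
  }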

Will


Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-09-30 Thread Will Deacon
Hi Thierry,

On Thu, Sep 25, 2014 at 07:40:23AM +0100, Thierry Reding wrote:
 On Wed, Sep 24, 2014 at 05:33:38PM +0100, Will Deacon wrote:
  On Tue, Sep 23, 2014 at 08:14:01AM +0100, Thierry Reding wrote:
   On Mon, Sep 22, 2014 at 06:43:37PM +0100, Will Deacon wrote:
Yup. In this case, the iommu_dma_mapping passed to arch_setup_dma_ops
contains a domain and an allocator for each IOMMU instance in the system.
It would then be up to the architecture how it makes use of those, but
the most obvious thing to do would be to attach devices mastering through
an IOMMU instance to that per-instance domain.

The other use-case is isolation (one domain per device), which I guess
matches what the ARM code is doing at the moment.
   
   I think there are two cases here. You can have a composite device that
   wants to manage a single domain (using its own allocator) for a set of
   hardware devices. At the same time a set of devices (think 2D and 3D
    engines) could want to use multiple domains for process separation.
   In that case I'd expect a logical DRM device to allocate one domain per
   process and then associate the 2D and 3D engines with that same domain
   on process switch.
  
  Sure, but that's well outside of what the dma-mapping API is going to setup
  as a default domain. These specialist setups are certainly possible, but I
  think they should be driven by, for example, the DRM code as opposed to
  being in the core dma-mapping code.
 
 I completely agree that these special cases should be driven by the
 drivers that need them. However the problem here is that the current
 patch will already attach the device to an IOMMU domain by default.

Sure, but that's not an unfixable problem if somebody cares enough to do it.
Right now, I see a small handful of callers for the IOMMU API and nearly all
of them would work perfectly well with a default domain. The big exception
to that is VFIO, but that requires the device to be unbound from the host
driver, so we could detach the mapping at that point.

 So I think what we're going to need is a way to prevent the default
 attachment to DMA/IOMMU. Or alternatively not associate devices with
 IOMMU domains by default but let drivers explicitly make the decision.

Which drivers and how would they know what to do? I think you might be
jumping the gun a bit here, given where mainline is with using the IOMMU
for anything at all.

   What I proposed a while back was to leave it up to the IOMMU driver to
   choose an allocator for the device. Or rather, choose whether to use a
   custom allocator or the DMA/IOMMU integration allocator. The way this
   worked was to keep a list of devices in the IOMMU driver. Devices in
    this list would be added to a domain reserved for DMA/IOMMU integration.
   Those would typically be devices such as SD/MMC, audio, ... devices that
   are in-kernel and need no per-process separation. By default devices
   wouldn't be added to a domain, so devices forming a composite DRM device
   would be able to manage their own domain.
  
  I'd like to have as little of this as possible in the IOMMU drivers, as we
  should leave those to deal with the IOMMU hardware and not domain
  management. Having subsystems manage their own dma ops is an extension to
  the dma-mapping API.
 
 It's not an extension, really. It's more that both need to be able to
 coexist. For some devices you may want to create an IOMMU domain and
 hook it up with the DMA mapping functions, for others you don't and
 handle mapping to IOVA space explicitly.

I think it's an extension in the sense that mainline doesn't currently do
what you want, regardless of this patch series. My motivation is to enable
IOMMU-backed DMA-mapping so that I can continue implementing the virtual
SMMU work I started a while back. Patches welcome to enable any other
use-cases -- I don't think they're mutually exclusive.

 There is another issue with the approach you propose. I'm not sure if
 Tegra is special in this case (I'd expect not), but what we do is make
 an IOMMU domain correspond to an address space. Address spaces are a
 pretty limited resource (earlier generations have 4, newer have 128)
 and each address space can be up to 4 GiB. So I've always envisioned
 that we should be using a single IOMMU domain for devices that don't
 expose direct buffer access to userspace (SATA, PCIe, audio, SD/MMC,
 USB, ...). All of those would typically need only a small number of
 small buffers, so using a separate address space for each seems like a
 big waste.

I agree here, the ARM DMA-mapping code should really be doing one domain
per SMMU instance; all I've done is hook up the existing code which wasn't
previously being called.

 Doing so would leave a large number of address spaces available for
 things like a GPU driver to keep per-process address spaces for
 isolation.
 
 I don't see how we'd be able to do that with the approach that you
 propose in this series since it assumes that each device will be
 associated with a separate domain.

Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-09-25 Thread Thierry Reding
On Wed, Sep 24, 2014 at 05:33:38PM +0100, Will Deacon wrote:
 On Tue, Sep 23, 2014 at 08:14:01AM +0100, Thierry Reding wrote:
  On Mon, Sep 22, 2014 at 06:43:37PM +0100, Will Deacon wrote:
   Yup. In this case, the iommu_dma_mapping passed to arch_setup_dma_ops
   contains a domain and an allocator for each IOMMU instance in the system.
   It would then be up to the architecture how it makes use of those, but
   the most obvious thing to do would be to attach devices mastering through
   an IOMMU instance to that per-instance domain.
   
   The other use-case is isolation (one domain per device), which I guess
   matches what the ARM code is doing at the moment.
  
  I think there are two cases here. You can have a composite device that
  wants to manage a single domain (using its own allocator) for a set of
  hardware devices. At the same time a set of devices (think 2D and 3D
  engines) could want to use multiple domains for process separation.
  In that case I'd expect a logical DRM device to allocate one domain per
  process and then associate the 2D and 3D engines with that same domain
  on process switch.
 
 Sure, but that's well outside of what the dma-mapping API is going to setup
 as a default domain. These specialist setups are certainly possible, but I
 think they should be driven by, for example, the DRM code as opposed to
 being in the core dma-mapping code.

I completely agree that these special cases should be driven by the
drivers that need them. However the problem here is that the current
patch will already attach the device to an IOMMU domain by default.

So I think what we're going to need is a way to prevent the default
attachment to DMA/IOMMU. Or alternatively not associate devices with
IOMMU domains by default but let drivers explicitly make the decision.
Either of those two alternatives would require driver-specific
knowledge, which would be another strong argument against doing the
whole IOMMU initialization at device creation time.

  What I proposed a while back was to leave it up to the IOMMU driver to
  choose an allocator for the device. Or rather, choose whether to use a
  custom allocator or the DMA/IOMMU integration allocator. The way this
  worked was to keep a list of devices in the IOMMU driver. Devices in
  this list would be added to a domain reserved for DMA/IOMMU integration.
  Those would typically be devices such as SD/MMC, audio, ... devices that
  are in-kernel and need no per-process separation. By default devices
  wouldn't be added to a domain, so devices forming a composite DRM device
  would be able to manage their own domain.
 
 I'd like to have as little of this as possible in the IOMMU drivers, as we
 should leave those to deal with the IOMMU hardware and not domain
 management. Having subsystems manage their own dma ops is an extension to
 the dma-mapping API.

It's not an extension, really. It's more that both need to be able to
coexist. For some devices you may want to create an IOMMU domain and
hook it up with the DMA mapping functions, for others you don't and
handle mapping to IOVA space explicitly.

There is another issue with the approach you propose. I'm not sure if
Tegra is special in this case (I'd expect not), but what we do is make
an IOMMU domain correspond to an address space. Address spaces are a
pretty limited resource (earlier generations have 4, newer have 128)
and each address space can be up to 4 GiB. So I've always envisioned
that we should be using a single IOMMU domain for devices that don't
expose direct buffer access to userspace (SATA, PCIe, audio, SD/MMC,
USB, ...). All of those would typically need only a small number of
small buffers, so using a separate address space for each seems like a
big waste.

Doing so would leave a large number of address spaces available for
things like a GPU driver to keep per-process address spaces for
isolation.

I don't see how we'd be able to do that with the approach that you
propose in this series since it assumes that each device will be
associated with a separate domain.

Thierry



Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-09-24 Thread Will Deacon
On Tue, Sep 23, 2014 at 08:14:01AM +0100, Thierry Reding wrote:
 On Mon, Sep 22, 2014 at 06:43:37PM +0100, Will Deacon wrote:
  Yup. In this case, the iommu_dma_mapping passed to arch_setup_dma_ops
  contains a domain and an allocator for each IOMMU instance in the system.
  It would then be up to the architecture how it makes use of those, but
  the most obvious thing to do would be to attach devices mastering through
  an IOMMU instance to that per-instance domain.
  
  The other use-case is isolation (one domain per device), which I guess
  matches what the ARM code is doing at the moment.
 
 I think there are two cases here. You can have a composite device that
 wants to manage a single domain (using its own allocator) for a set of
 hardware devices. At the same time a set of devices (think 2D and 3D
 engines) could want to use multiple domains for process separation.
 In that case I'd expect a logical DRM device to allocate one domain per
 process and then associate the 2D and 3D engines with that same domain
 on process switch.

Sure, but that's well outside of what the dma-mapping API is going to setup
as a default domain. These specialist setups are certainly possible, but I
think they should be driven by, for example, the DRM code as opposed to
being in the core dma-mapping code.

 What I proposed a while back was to leave it up to the IOMMU driver to
 choose an allocator for the device. Or rather, choose whether to use a
 custom allocator or the DMA/IOMMU integration allocator. The way this
 worked was to keep a list of devices in the IOMMU driver. Devices in
 this list would be added to a domain reserved for DMA/IOMMU integration.
 Those would typically be devices such as SD/MMC, audio, ... devices that
 are in-kernel and need no per-process separation. By default devices
 wouldn't be added to a domain, so devices forming a composite DRM device
 would be able to manage their own domain.

I'd like to have as little of this as possible in the IOMMU drivers, as we
should leave those to deal with the IOMMU hardware and not domain
management. Having subsystems manage their own dma ops is an extension to
the dma-mapping API.

Will


Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-09-22 Thread Thierry Reding
On Fri, Sep 12, 2014 at 05:34:55PM +0100, Will Deacon wrote:
[...]
 +static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size)
 +{
 +	struct dma_iommu_mapping *mapping;
 +
 +	mapping = arm_iommu_create_mapping(dev->bus, dma_base, size);

If I understand correctly this will be called for each device that has
an IOMMU master interface and will end up creating a new mapping for
each of the devices. Each of these mappings will translate to a domain
in the IOMMU API, which in turn is a separate address space.

How do you envision to support use-cases where a set of devices need to
share a single domain? This is needed for example in DRM where SoCs
often have a set of hardware blocks (each with its own master interface)
that compose the display device. On Tegra for example there are two
display controllers that need access to the same IOVA domain so that
they can scan out framebuffers.

Thierry



Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-09-22 Thread Laurent Pinchart
On Monday 22 September 2014 11:19:35 Thierry Reding wrote:
 On Fri, Sep 12, 2014 at 05:34:55PM +0100, Will Deacon wrote:
 [...]
 
  +static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size)
  +{
  +	struct dma_iommu_mapping *mapping;
  +
  +	mapping = arm_iommu_create_mapping(dev->bus, dma_base, size);
 
 If I understand correctly this will be called for each device that has
 an IOMMU master interface and will end up creating a new mapping for
 each of the devices. Each of these mappings will translate to a domain
 in the IOMMU API, which in turn is a separate address space.
 
 How do you envision to support use-cases where a set of devices need to
 share a single domain? This is needed for example in DRM where SoCs
 often have a set of hardware blocks (each with its own master interface)
 that compose the display device. On Tegra for example there are two
 display controllers that need access to the same IOVA domain so that
 they can scan out framebuffers.

Or simply for IOMMUs that serve multiple masters and support a single domain 
only.

-- 
Regards,

Laurent Pinchart



Re: [RFC PATCH v3 7/7] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2014-09-22 Thread Will Deacon
Hi Thierry,

On Mon, Sep 22, 2014 at 10:19:35AM +0100, Thierry Reding wrote:
 On Fri, Sep 12, 2014 at 05:34:55PM +0100, Will Deacon wrote:
 [...]
  +static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size)
  +{
  +	struct dma_iommu_mapping *mapping;
  +
  +	mapping = arm_iommu_create_mapping(dev->bus, dma_base, size);
 
 If I understand correctly this will be called for each device that has
 an IOMMU master interface and will end up creating a new mapping for
 each of the devices. Each of these mappings will translate to a domain
 in the IOMMU API, which in turn is a separate address space.

Correct, although that's largely because I've bolted on the existing ARM
IOMMU code.

 How do you envision to support use-cases where a set of devices need to
 share a single domain? This is needed for example in DRM where SoCs
 often have a set of hardware blocks (each with its own master interface)
 that compose the display device. On Tegra for example there are two
 display controllers that need access to the same IOVA domain so that
 they can scan out framebuffers.

Yup. In this case, the iommu_dma_mapping passed to arch_setup_dma_ops
contains a domain and an allocator for each IOMMU instance in the system.
It would then be up to the architecture how it makes use of those, but
the most obvious thing to do would be to attach devices mastering through
an IOMMU instance to that per-instance domain.
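
Roughly speaking, each entry passed to the architecture would carry
something like the following (the struct and field names below are only
meant to illustrate the idea, they are not the actual definitions from
the series):

  /* Illustration only, not the struct from this RFC. */
  struct iommu_dma_mapping {
  	struct iommu_domain	*domain;  /* one domain per IOMMU instance */
  	struct iova_domain	*iovad;   /* IOVA allocator shared by its masters */
  	struct kref		 kref;    /* lifetime across all attached devices */
  };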

The other use-case is isolation (one domain per device), which I guess
matches what the ARM code is doing at the moment.

Will
