Re: [PATCH] iommu/arm-smmu: Allow disabling bypass via kernel config

2019-02-15 Thread Rob Clark
On Thu, Feb 14, 2019 at 7:40 PM Doug Anderson  wrote:
>
> Hi,
>
> On Thu, Feb 14, 2019 at 1:32 PM Robin Murphy  wrote:
> >
> > Hi Doug,
> >
> > On 2019-02-14 8:44 pm, Douglas Anderson wrote:
> > > Right now the only way to disable the iommu bypass for the ARM SMMU is
> > > with the kernel command line parameter 'arm-smmu.disable_bypass'.
> > >
> > > In general kernel command line parameters make sense for things that
> > > someone would like to tweak without rebuilding the kernel or for very
> > > basic communication between the bootloader and the kernel, but are
> > > awkward for other things.  Specifically:
> > > * Human parsing of the kernel command line can be difficult since it's
> > >    just a big run-on, space-separated line of text.
> > > * If every bit of the system was configured via the kernel command
> > >line the kernel command line would get very large and even more
> > >    unwieldy.
> > > * Typically there is no easy way in build systems to adjust the
> > >    kernel command line for config-like options.
> > >
> > > Let's introduce a new config option that allows us to disable the
> > > iommu bypass without affecting the existing default nor the existing
> > > ability to adjust the configuration via kernel command line.
> >
> > I say let's just flip the default - for a while now it's been one of
> > those "oh yeah, we should probably do that" things that gets instantly
> > forgotten again, so some 3rd-party demand is plenty to convince me :)
> >
> > There are few reasons to allow unmatched stream bypass, and even fewer
> > good ones, so I'd be happy to shift the command-line burden over to the
> > esoteric cases at this point, and consider the config option in future
> > if anyone from that camp pops up and screams hard enough.
>
> Sure, I can submit that patch if we want.  I presume I'll get lots of
> screaming but I'm used to that.  ;-)
>
> ...specifically I found that when I turned on "disable bypass" on my
> board (sdm845-cheza, which is not yet upstream) that a bunch of things
> that used to work broke.  That's a good thing because all the things
> that broke need to be fixed properly (by adding the IOMMUs) but
> presumably my board is not special in relying on the old insecure
> behavior.

So one niggly bit, beyond the drivers that aren't using the iommu but
should be, is the case where the bootloader lights up the display.  We
actually still have a lot of work to do for this (in that clk and
genpd drivers need to be aware of handover, in addition to just
iommu)..  But I'd rather not make a hard problem harder just yet.

(it gets worse, afaict so far on the windows-arm laptops since in that
case uefi/edk is actually taking the iommu out of bypass and things go
badly when arm-smmu disables clks after probe..)

But I might recommend making the default a kconfig option for now, so
in the distro kernel case, where display driver is a kernel module,
you aren't making a hard problem harder.  For cases where bootloader
isn't enabling display, things are much easier, and I think we could
switch the default sooner.  But I fear for cases where bootloader is
enabling display it is a much harder problem :-(

BR,
-R


>
> I'm about to head on vacation for a week so I'm not sure I'll get to
> re-post before then.  If not I'll post this sometime after I get back
> unless someone beats me to it.
>
> -Doug
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 0/9] vfio/mdev: IOMMU aware mediated device

2019-02-15 Thread Jean-Philippe Brucker
On 14/02/2019 20:14, Alex Williamson wrote:
>> This patch series extends both IOMMU and vfio components to support
>> mdev device passing through when it could be isolated and protected
>> by the IOMMU units. The first part of this series (PATCH 1/09~6/09)
>> adds the interfaces and implementation of the multiple domains per
>> device. The second part (PATCH 7/09~9/09) adds the iommu device
>> attribute to each mdev, determines isolation type according to the
>> existence of an iommu device when attaching group in vfio type1 iommu
>> module, and attaches the domain to iommu aware mediated devices.
>> 
>> References:
>> [1] 
>> https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
>> [2] 
>> https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
>> [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
>> 
>> Best regards,
>> Lu Baolu
>> 
>> Change log:
>>   v5->v6:
> 
> This looks pretty reasonable with Jean-Philippe's nit fixups.  Where do
> we go from here?  I think we need an ack from Kirti since they have an
> interest here.  Presumably this looks ok to the ARM folks.

Looks great from my point of view. I focused on patch 1 since I'm
planning to reuse iommu_dev_features for SVA. I don't have time to test
auxd and mdev on SMMUv3 at the moment but I had a better look and, if it
helps, for patches 1 and 7-9:

Reviewed-by: Jean-Philippe Brucker 

That said, are you planning to add back the mdev_get_iommu_domain()
function, in a separate patch? Because I think the parent driver still
needs a way to retrieve the PASID for an mdev?

Thanks,
Jean

> Do we have
> any consumers of this code yet?  Theoretically I think a vfio-pci-like
> meta driver could be written as an mdev vendor driver with this support
> (restricted to type1 iommu use cases).  Thanks,
> 
> Alex



[PATCH] dma-contiguous: do not allocate a single page from CMA area

2019-02-15 Thread Nicolin Chen
The addresses within a single page are always contiguous, so there is
no need to allocate a single page from the CMA area. Since the CMA
area has a limited, predefined size, it may run out of space in heavy
use cases where a lot of CMA pages end up being allocated for
single-page requests.

However, there is also a concern that a device might care where a
page comes from -- it might expect the page to come from its CMA area
and act differently if it doesn't.

This patch skips one-page-size allocations and returns NULL so as to
let callers allocate normal pages, unless the device has its own CMA
area. This saves CMA space for larger allocations, and also reduces
the CMA fragmentation that results from trivial allocations.

Signed-off-by: Nicolin Chen 
---
 kernel/dma/contiguous.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..09074bd04793 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -186,16 +186,32 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, 
phys_addr_t base,
  *
  * This function allocates memory buffer for specified device. It uses
  * device specific contiguous memory area if available or the default
- * global one. Requires architecture specific dev_get_cma_area() helper
- * function.
+ * global one.
+ *
+ * However, it skips one-page allocations from the global area: the
+ * addresses within one page are always contiguous anyway, so there is
+ * no need to waste CMA pages on them, and skipping them also reduces
+ * fragmentation of the CMA area. A caller should therefore fall back
+ * to allocating a normal page upon a NULL return value.
+ *
+ * Requires architecture specific dev_get_cma_area() helper function.
  */
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
   unsigned int align, bool no_warn)
 {
+   struct cma *cma;
+
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
 
-   return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+   if (dev && dev->cma_area)
+   cma = dev->cma_area;
+   else if (count > 1)
+   cma = dma_contiguous_default_area;
+   else
+   return NULL;
+
+   return cma_alloc(cma, count, align, no_warn);
 }
 
 /**
-- 
2.17.1



Re: [PATCH 4/5] sparc64/pci_sun4v: allow large DMA masks

2019-02-15 Thread David Miller
From: Christoph Hellwig 
Date: Fri, 15 Feb 2019 15:45:58 +0100

> We've been moving to a model where the device just sets the DMA mask
> supported by it, instead of having to fallback to something it thinks
> the platform might support.  Sparc64 is the remaining holdout forcing
> drivers to supply a matching mask.  Change dma_4v_supported to just
> check if the supplied dma mask is large enough. and adjust the mapping
> code to check ATU presence in addition to the DMA mask to decice on
^^

"decide"

> the mapping method.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: David S. Miller 


Re: [PATCH 3/5] sparc64/iommu: allow large DMA masks

2019-02-15 Thread David Miller
From: Christoph Hellwig 
Date: Fri, 15 Feb 2019 15:45:57 +0100

> We've been moving to a model where the device just sets the DMA mask
> supported by it, instead of having to fallback to something it thinks
> the platform might support.  Sparc64 is the remaining holdout forcing
> drivers to supply a matching mask.  Change dma_4u_supported to just
> check if the supplied dma mask is large enough as nothing in the
> iommu.c code (or the core DMA code) actually looks at the DMA mask
> later on.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: David S. Miller 


Re: [RFC PATCH 1/5] iommu: Add APIs for IOMMU PASID management

2019-02-15 Thread Jean-Philippe Brucker
Hi Jacob, Lu,

On 30/01/2019 19:05, Jacob Pan wrote:
> On Mon, 12 Nov 2018 14:44:57 +0800
> Lu Baolu  wrote:
> 
>> This adds APIs for IOMMU drivers and device drivers to manage
>> the PASIDs used for DMA transfer and translation. It is based on
>> the I/O ASID allocator for PASID namespace management and relies
>> on vendor-specific IOMMU drivers for paravirtual PASIDs.
>>
> While trying to integrate this with the VT-d SVM code, I had some thoughts.
> First of all, it seems to me the problem we are trying to solve here is
> to allow a custom PASID/IOASID allocator.
> IOASID, as part of the driver base, is common infrastructure in its own right.
> So it is OK to let device drivers such as the VT-d driver directly
> communicate with IOASID w/o going through the common IOMMU layer. Therefore
> I see little value in adding this wrapper to the ioasid code; instead I
> feel it might be less work to enhance ioasid with the following:
> 
> 1. allow ioasid code to register a system wide custom asid allocator,
> which takes precedence over the idr allocator
> e.g.
> typedef ioasid_t (*ioasid_alloc_fn_t)(ioasid_t min, ioasid_t max, void *data);
> void ioasid_set_allocator(ioasid_alloc_fn_t fn)
> {}
> We can still use the idr for ioasid tracking and private data lookup; since
> the ioasid idr is exclusively owned by ioasid_alloc, we can rest assured
> there is no conflict with other callers. See code comments below.

Yes, I think we need the alloc function to be registered at boot time.
Because even when using a PV allocation instead of idr allocation, we do
need a way to quickly retrieve PASID->mm when handling an I/O page
fault. In the ioasid_alloc() function, we could do, roughly:

static ioasid_t (*ioasid_alloc_fn)(...);

ioasid_t ioasid_alloc(... min, max...)
{
if (ioasid_alloc_fn) {
   ioasid = ioasid_alloc_fn(min, max);

   /* With PV allocation, only use the IDR for storage */
   min = ioasid;
   max = ioasid + 1;
}

idr_lock
ioasid = idr_alloc(... min, max... );
idr_unlock
}

> 2. support setting asid private data _after_ asid is allocated. The use
> case is that PASID allocation and mm bind can be split into separate
> stages. Private data (mm related) are not available at the time of
> allocation, only bind time.
> Since IDR needs the data pointer at allocation time, we can still
> create an empty ioasid_data for ioasid tracking at alloc time. i.e.
> struct ioasid_data {
>   void *ptr; /* private data */
> };

I think I agree as well (though finding the right ordering and locking
in SVA allocation is giving me the biggest headache these days). If the
PASID becomes reachable early we need to make sure that concurrent
threads working on the PASID space can deal with incomplete data, but I
think that might be one of the easiest alternatives.

> 3. allow NULL ioasid_set
> I also had a hard time understanding the use of ioasid_set, since there
> is already a private data associated with each ioasid, is it not enough
> to group ioasid based on private data?

The idea behind ioasid_set is that different modules compete for the
single IOASID space, and associate different data types to their IOASIDs.

(a) iommu-sva associates an mm_struct
(b) the IOMMU driver allocates a PASID for each auxiliary domain, and
probably associates an iommu_domain with it.
(c) device drivers (e.g. AMD KFD) want some PASIDs for their own use in
the same space, and will have their own private data in there.

For (a) we also want the io_pgfault code to find the mm associated with
a PASID:
  iopf_handle_mm_fault()
    iommu_sva_find(pasid)
      ioasid_find(&shared_pasid_set, pasid) -> mm struct

And in this case we really need the search to be only on the
shared_pasid_set, or else the IOPF code gets something that's not an mm
struct.

The easy alternative is to let the IOMMU driver handle both (a) and (b),
and store the same data type for both. I've been considering this a few
times this week, but it doesn't solve case (c).

Thanks,
Jean

> So the usage scenarios can be.
> 1. during boot, vt-d driver register a custom ioasid allocator (vcmd) if
> it detects its running as a guest.
> 
> 2. Running as a guest, all pasid allocation via ioasid_alloc() will go
> to this custom allocators and tracked
> 
> 3. for native case, there is no custom allocator, ioasid just use IDR
> alloc
> 
> 4. for native bind mm, pasid allocation already has private data filled
> in when calling ioasid_alloc
> 
> 5. for guest bind, pasid is allocated with empty private data (called
> by VFIO layer), but private data is filled in later by bind guest
> pasid inside the vt-d driver.
> 
> So my thinking is that we can avoid having another layer of APIs as
> below and keep everything within ioasid. Also allows private data to be
> managed at lower level as compared to VFIO.
> 
> 
> Thanks,
> 
> Jacob
> 
>> Below APIs are added:
>>
>> * iommu_pasid_init(pasid)
>>   - Initialize a PASID consumer. The vendor specific IOMMU
>> drivers are able to 

Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-15 Thread John David Anglin
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually be as 
> simple as this - Dave, are you able to give the diff below a spin?
> 
> Robin.
> 
> ->8-
> diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
> index 7f595355fb79..fe4a7c71fede 100644
> --- a/drivers/dma/mv_xor.c
> +++ b/drivers/dma/mv_xor.c
> @@ -1059,6 +1059,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
>  mv_chan->op_in_desc = XOR_MODE_IN_DESC;
> 
>  dma_dev = &mv_chan->dmadev;
> +dma_dev->dev = &pdev->dev;
>  mv_chan->xordev = xordev;
> 
>  /*
> @@ -1091,7 +1092,6 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
>  dma_dev->device_free_chan_resources = mv_xor_free_chan_resources;
>  dma_dev->device_tx_status = mv_xor_status;
>  dma_dev->device_issue_pending = mv_xor_issue_pending;
> -dma_dev->dev = &pdev->dev;
> 
>  /* set prep routines based on capability */
>  if (dma_has_cap(DMA_INTERRUPT, dma_dev->cap_mask))

The patch is fine and it fixes the boot failure.

I misapplied it in my previous test.

Thanks,
Dave

-- 
John David Anglin  dave.ang...@bell.net

Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-15 Thread John David Anglin
On 2019-02-15 10:22 a.m., John David Anglin wrote:
> On 2019-02-14 12:58 p.m., Robin Murphy wrote:
>> Hmm, having felt brave enough to take a closer look, it might actually be as 
>> simple as this - Dave, are you able to give the diff below a spin?
> Still crashes but in slightly different spot:
>
I think I see what's wrong.

Dave

-- 
John David Anglin  dave.ang...@bell.net


Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)

2019-02-15 Thread John David Anglin
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually be as 
> simple as this - Dave, are you able to give the diff below a spin?
Still crashes but in slightly different spot:

[    0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
[    0.00] Linux version 5.0.0-rc6+ (root@espressobin) (gcc version 6.3.0 
20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP PREEMPT Wed Feb 13
16:17:46 EST 2019
[    0.00] Machine model: Globalscale Marvell ESPRESSOBin Board
[    0.00] earlycon: ar3700_uart0 at MMIO 0xd0012000 (options '')
[    0.00] printk: bootconsole [ar3700_uart0] enabled
[    3.210276] Internal error: Oops: 9645 [#1] PREEMPT SMP
[    3.215932] Modules linked in:
[    3.219072] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc6+ #1
[    3.225519] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[    3.232151] pstate: a005 (NzCv daif -PAN -UAO)
[    3.237090] pc : mv_xor_channel_add+0x4c/0xb28
[    3.241650] lr : mv_xor_probe+0x20c/0x4b8
[    3.245768] sp : ff8010033ac0
[    3.249173] x29: ff8010033ac0 x28: 
[    3.254639] x27: ff8010e76068 x26: 0029
[    3.260104] x25:  x24: 
[    3.265570] x23: ffc03fb47400 x22: ff8010ead000
[    3.271035] x21: ffc03fb47410 x20: ffc03bea8d80
[    3.276501] x19: ffc03fb47400 x18: 
[    3.281966] x17: 000c x16: 000a
[    3.287432] x15: ff8010ead6c8 x14: ffc03beaa003
[    3.292898] x13: ffc03beaa002 x12: 0038
[    3.298363] x11: 1fff x10: 0001
[    3.303829] x9 : 0040 x8 : ff8010ec7928
[    3.309294] x7 : ffc03cc003b8 x6 : 
[    3.314760] x5 :  x4 : 0029
[    3.320226] x3 : 0083 x2 : 80c0
[    3.325691] x1 :  x0 : ffc03fb47410
[    3.331158] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[    3.338056] Call trace:
[    3.340569]  mv_xor_channel_add+0x4c/0xb28
[    3.344779]  mv_xor_probe+0x20c/0x4b8
[    3.348544]  platform_drv_probe+0x50/0xb0
[    3.352663]  really_probe+0x1fc/0x2c0
[    3.356427]  driver_probe_device+0x58/0x100
[    3.360727]  __driver_attach+0xd8/0xe0
[    3.364580]  bus_for_each_dev+0x68/0xc8
[    3.368522]  driver_attach+0x20/0x28
[    3.372196]  bus_add_driver+0x108/0x228
[    3.376139]  driver_register+0x60/0x110
[    3.380081]  __platform_driver_register+0x44/0x50
[    3.384923]  mv_xor_driver_init+0x18/0x20
[    3.389043]  do_one_initcall+0x58/0x170
[    3.392985]  kernel_init_freeable+0x190/0x234
[    3.397465]  kernel_init+0x10/0x108
[    3.401047]  ret_from_fork+0x10/0x1c
[    3.404723] Code: f90067a5 d285 52901802 aa1503e0 (f9003035)
[    3.411004] ---[ end trace 65be82a62724e328 ]---
[    3.415804] Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x000b
[    3.423626] SMP: stopping secondary CPUs
[    3.427661] Kernel Offset: disabled
[    3.431243] CPU features: 0x002,2000200c
[    3.435272] Memory Limit: none
[    3.438412] ---[ end Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x000b ]---

ff8010630440 <mv_xor_channel_add>:
ff8010630440:   a9b37bfd    stp x29, x30, [sp, #-208]!
ff8010630444:   910003fd    mov x29, sp
ff8010630448:   a9025bf5    stp x21, x22, [sp, #32]
ff801063044c:   b00043f6    adrp    x22, ff8010ead000 

ff8010630450:   a90363f7    stp x23, x24, [sp, #48]
ff8010630454:   aa0103f7    mov x23, x1
ff8010630458:   a9046bf9    stp x25, x26, [sp, #64]
ff801063045c:   d281    mov x1, #0x0    
// #0
ff8010630460:   a90153f3    stp x19, x20, [sp, #16]
ff8010630464:   910042f5    add x21, x23, #0x10
ff8010630468:   a90573fb    stp x27, x28, [sp, #80]
ff801063046c:   aa0003f4    mov x20, x0
ff8010630470:   911b22c0    add x0, x22, #0x6c8
ff8010630474:   2a0203f9    mov w25, w2
ff8010630478:   f945    ldr x5, [x0]
ff801063047c:   f90067a5    str x5, [x29, #200]
ff8010630480:   d285    mov x5, #0x0    
// #0
ff8010630484:   52901802    mov w2, #0x80c0 
// #32960
ff8010630488:   aa1503e0    mov x0, x21
ff801063048c:   f9003035    str x21, [x1, #96]
ff8010630490:   72a00c02    movk    w2, #0x60, lsl #16
ff8010630494:   d2806a01    mov x1, #0x350 
...

Dave

-- 
John David Anglin  dave.ang...@bell.net


[PATCH 5/5] Documentation/DMA-API-HOWTO: update dma_mask sections

2019-02-15 Thread Christoph Hellwig
We don't require drivers to guess a DMA mask that might actually
match the system capabilities any more, so fix up the documentation
to clear this up.

Signed-off-by: Christoph Hellwig 
---
 Documentation/DMA-API-HOWTO.txt | 121 +++-
 1 file changed, 41 insertions(+), 80 deletions(-)

diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt
index f0cc3f772265..8e948fae34af 100644
--- a/Documentation/DMA-API-HOWTO.txt
+++ b/Documentation/DMA-API-HOWTO.txt
@@ -146,114 +146,75 @@ What about block I/O and networking buffers?  The block 
I/O and
 networking subsystems make sure that the buffers they use are valid
 for you to DMA from/to.
 
-DMA addressing limitations
+DMA addressing capabilities
 ==
 
-Does your device have any DMA addressing limitations?  For example, is
-your device only capable of driving the low order 24-bits of address?
-If so, you need to inform the kernel of this fact.
+By default, the kernel assumes that your device can address 32-bits of DMA
+addressing.  For a 64-bit capable device, this needs to be increased, and for
+a device with limitations, it needs to be decreased.
 
-By default, the kernel assumes that your device can address the full
-32-bits.  For a 64-bit capable device, this needs to be increased.
-And for a device with limitations, as discussed in the previous
-paragraph, it needs to be decreased.
+Special note about PCI: PCI-X specification requires PCI-X devices to support
+64-bit addressing (DAC) for all transactions.  And at least one platform (SGI
+SN2) requires 64-bit consistent allocations to operate correctly when the IO
+bus is in PCI-X mode.
 
-Special note about PCI: PCI-X specification requires PCI-X devices to
-support 64-bit addressing (DAC) for all transactions.  And at least
-one platform (SGI SN2) requires 64-bit consistent allocations to
-operate correctly when the IO bus is in PCI-X mode.
+For correct operation, you must set the DMA mask to inform the kernel about
+your device's DMA addressing capabilities.
 
-For correct operation, you must interrogate the kernel in your device
-probe routine to see if the DMA controller on the machine can properly
-support the DMA addressing limitation your device has.  It is good
-style to do this even if your device holds the default setting,
-because this shows that you did think about these issues wrt. your
-device.
-
-The query is performed via a call to dma_set_mask_and_coherent()::
+This is performed via a call to dma_set_mask_and_coherent()::
 
int dma_set_mask_and_coherent(struct device *dev, u64 mask);
 
-which will query the mask for both streaming and coherent APIs together.
-If you have some special requirements, then the following two separate
-queries can be used instead:
+which will set the mask for both streaming and coherent APIs together.  If you
+have some special requirements, then the following two separate calls can be
+used instead:
 
-   The query for streaming mappings is performed via a call to
+   The setup for streaming mappings is performed via a call to
dma_set_mask()::
 
int dma_set_mask(struct device *dev, u64 mask);
 
-   The query for consistent allocations is performed via a call
+   The setup for consistent allocations is performed via a call
to dma_set_coherent_mask()::
 
int dma_set_coherent_mask(struct device *dev, u64 mask);
 
-Here, dev is a pointer to the device struct of your device, and mask
-is a bit mask describing which bits of an address your device
-supports.  It returns zero if your card can perform DMA properly on
-the machine given the address mask you provided.  In general, the
-device struct of your device is embedded in the bus-specific device
-struct of your device.  For example, &pdev->dev is a pointer to the
-device struct of a PCI device (pdev is a pointer to the PCI device
-struct of your device).
+Here, dev is a pointer to the device struct of your device, and mask is a bit
+mask describing which bits of an address your device supports.  Often the
+device struct of your device is embedded in the bus-specific device struct of
+your device.  For example, &pdev->dev is a pointer to the device struct of a
+PCI device (pdev is a pointer to the PCI device struct of your device).
 
-If it returns non-zero, your device cannot perform DMA properly on
-this platform, and attempting to do so will result in undefined
-behavior.  You must either use a different mask, or not use DMA.
+These calls usually return zero to indicate your device can perform DMA
+properly on the machine given the address mask you provided, but they might
+return an error if the mask is too small to be supportable on the given
+system.  If it returns non-zero, your device cannot perform DMA properly on
+this platform, and attempting to do so will result in undefined behavior.
+You must not use DMA on this device unless the dma_set_mask family of
+functions has 

[PATCH 2/5] sparc64: refactor the ali DMA quirk

2019-02-15 Thread Christoph Hellwig
Do the quirk first in the dma_supported routines, as we don't need
any of the other checks for it, and remove the duplicate mask checking
that is already done by the callers.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/kernel/iommu.c |  7 +++---
 arch/sparc/kernel/kernel.h|  6 -
 arch/sparc/kernel/pci.c   | 46 ---
 arch/sparc/kernel/pci_sun4v.c |  5 +++-
 4 files changed, 27 insertions(+), 37 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index b1a09080e8da..0c253f1c852e 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -745,14 +745,13 @@ static int dma_4u_supported(struct device *dev, u64 
device_mask)
 {
struct iommu *iommu = dev->archdata.iommu;
 
+   if (ali_sound_dma_hack(dev, device_mask))
+   return 1;
+
if (device_mask > DMA_BIT_MASK(32))
return 0;
if ((device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask)
return 1;
-#ifdef CONFIG_PCI
-   if (dev_is_pci(dev))
-   return pci64_dma_supported(to_pci_dev(dev), device_mask);
-#endif
return 0;
 }
 
diff --git a/arch/sparc/kernel/kernel.h b/arch/sparc/kernel/kernel.h
index ddffd368e057..f6f498ba3198 100644
--- a/arch/sparc/kernel/kernel.h
+++ b/arch/sparc/kernel/kernel.h
@@ -45,7 +45,11 @@ void __irq_entry smp_receive_signal_client(int irq, struct 
pt_regs *regs);
 void __irq_entry smp_kgdb_capture_client(int irq, struct pt_regs *regs);
 
 /* pci.c */
-int pci64_dma_supported(struct pci_dev *pdev, u64 device_mask);
+#ifdef CONFIG_PCI
+int ali_sound_dma_hack(struct device *dev, u64 device_mask);
+#else
+#define ali_sound_dma_hack(dev, mask)  (0)
+#endif
 
 /* signal32.c */
 void do_sigreturn32(struct pt_regs *regs);
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index bcfec6a85d23..5ed43828e078 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -956,51 +956,35 @@ void arch_teardown_msi_irq(unsigned int irq)
 }
 #endif /* !(CONFIG_PCI_MSI) */
 
-static void ali_sound_dma_hack(struct pci_dev *pdev, int set_bit)
+/* ALI sound chips generate 31-bits of DMA, a special register
+ * determines what bit 31 is emitted as.
+ */
+int ali_sound_dma_hack(struct device *dev, u64 device_mask)
 {
+   struct iommu *iommu = dev->archdata.iommu;
struct pci_dev *ali_isa_bridge;
u8 val;
 
-   /* ALI sound chips generate 31-bits of DMA, a special register
-* determines what bit 31 is emitted as.
-*/
+   if (!dev_is_pci(dev))
+   return 0;
+
+   if (to_pci_dev(dev)->vendor != PCI_VENDOR_ID_AL ||
+   to_pci_dev(dev)->device != PCI_DEVICE_ID_AL_M5451 ||
+   device_mask != 0x7fff)
+   return 0;
+
ali_isa_bridge = pci_get_device(PCI_VENDOR_ID_AL,
 PCI_DEVICE_ID_AL_M1533,
 NULL);
 
pci_read_config_byte(ali_isa_bridge, 0x7e, );
-   if (set_bit)
+   if (iommu->dma_addr_mask & 0x8000)
val |= 0x01;
else
val &= ~0x01;
pci_write_config_byte(ali_isa_bridge, 0x7e, val);
pci_dev_put(ali_isa_bridge);
-}
-
-int pci64_dma_supported(struct pci_dev *pdev, u64 device_mask)
-{
-   u64 dma_addr_mask;
-
-   if (pdev == NULL) {
-   dma_addr_mask = 0x;
-   } else {
-   struct iommu *iommu = pdev->dev.archdata.iommu;
-
-   dma_addr_mask = iommu->dma_addr_mask;
-
-   if (pdev->vendor == PCI_VENDOR_ID_AL &&
-   pdev->device == PCI_DEVICE_ID_AL_M5451 &&
-   device_mask == 0x7fff) {
-   ali_sound_dma_hack(pdev,
-  (dma_addr_mask & 0x8000) != 0);
-   return 1;
-   }
-   }
-
-   if (device_mask >= (1UL << 32UL))
-   return 0;
-
-   return (device_mask & dma_addr_mask) == dma_addr_mask;
+   return 1;
 }
 
 void pci_resource_to_user(const struct pci_dev *pdev, int bar,
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index fa0e42b4cbfb..d30eb22b6e11 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -676,6 +676,9 @@ static int dma_4v_supported(struct device *dev, u64 
device_mask)
struct iommu *iommu = dev->archdata.iommu;
u64 dma_addr_mask = iommu->dma_addr_mask;
 
+   if (ali_sound_dma_hack(dev, device_mask))
+   return 1;
+
if (device_mask > DMA_BIT_MASK(32)) {
if (iommu->atu)
dma_addr_mask = iommu->atu->dma_addr_mask;
@@ -685,7 +688,7 @@ static int dma_4v_supported(struct device *dev, u64 
device_mask)
 
if ((device_mask & dma_addr_mask) == dma_addr_mask)
return 1;
-   return pci64_dma_supported(to_pci_dev(dev), 

allow larger than required DMA masks

2019-02-15 Thread Christoph Hellwig
Hi all,

this series finishes off converting our dma mask model to split between
device capabilities (dev->dma_mask and dev->coherent_dma_mask) and system
limitations (dev->bus_dma_mask).  We already accept larger than required
masks in most dma_map_ops implementation, in case of x86 and
implementations based on it since the dawn of time.  Only one parisc
and two sparc64 instances rejected larger-than-required DMA masks, and
this series fixes that up and updates the documentation that devices
don't need to handle DMA mask fallbacks.



[PATCH 4/5] sparc64/pci_sun4v: allow large DMA masks

2019-02-15 Thread Christoph Hellwig
We've been moving to a model where the device just sets the DMA mask
supported by it, instead of having to fallback to something it thinks
the platform might support.  Sparc64 is the remaining holdout forcing
drivers to supply a matching mask.  Change dma_4v_supported to just
check if the supplied dma mask is large enough, and adjust the mapping
code to check ATU presence in addition to the DMA mask to decice on
the mapping method.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/kernel/pci_sun4v.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index d30eb22b6e11..a8af6023c126 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -92,7 +92,7 @@ static long iommu_batch_flush(struct iommu_batch *p, u64 mask)
prot &= (HV_PCI_MAP_ATTR_READ | HV_PCI_MAP_ATTR_WRITE);
 
while (npages != 0) {
-   if (mask <= DMA_BIT_MASK(32)) {
+   if (mask <= DMA_BIT_MASK(32) || !pbm->iommu->atu) {
num = pci_sun4v_iommu_map(devhandle,
  HV_PCI_TSBID(0, entry),
  npages,
@@ -208,7 +208,7 @@ static void *dma_4v_alloc_coherent(struct device *dev, size_t size,
atu = iommu->atu;
 
mask = dev->coherent_dma_mask;
-   if (mask <= DMA_BIT_MASK(32))
+   if (mask <= DMA_BIT_MASK(32) || !atu)
tbl = &iommu->tbl;
else
tbl = &atu->tbl;
@@ -674,21 +674,12 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
 static int dma_4v_supported(struct device *dev, u64 device_mask)
 {
struct iommu *iommu = dev->archdata.iommu;
-   u64 dma_addr_mask = iommu->dma_addr_mask;
 
if (ali_sound_dma_hack(dev, device_mask))
return 1;
-
-   if (device_mask > DMA_BIT_MASK(32)) {
-   if (iommu->atu)
-   dma_addr_mask = iommu->atu->dma_addr_mask;
-   else
-   return 0;
-   }
-
-   if ((device_mask & dma_addr_mask) == dma_addr_mask)
-   return 1;
-   return 0;
+   if (device_mask < iommu->dma_addr_mask)
+   return 0;
+   return 1;
 }
 
 static const struct dma_map_ops sun4v_dma_ops = {
-- 
2.20.1



[PATCH 1/5] ccio: allow large DMA masks

2019-02-15 Thread Christoph Hellwig
There is no harm in setting a 64-bit mask even if we don't need it,
and the current ccio code is one of the few places in the kernel
still rejecting larger than required DMA masks.

Signed-off-by: Christoph Hellwig 
---
 drivers/parisc/ccio-dma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index 8d2fc84119c6..24a68fb7b2f9 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -710,8 +710,8 @@ ccio_dma_supported(struct device *dev, u64 mask)
return 0;
}
 
-   /* only support 32-bit devices (ie PCI/GSC) */
-   return (int)(mask == 0xffffffffUL);
+   /* only support 32-bit or better devices (ie PCI/GSC) */
+   return (int)(mask >= 0xffffffffUL);
 }
 
 /**
-- 
2.20.1



[PATCH 3/5] sparc64/iommu: allow large DMA masks

2019-02-15 Thread Christoph Hellwig
We've been moving to a model where the device just sets the DMA mask
it supports, instead of having to fall back to something it thinks
the platform might support.  Sparc64 is the remaining holdout forcing
drivers to supply a matching mask.  Change dma_4u_supported to just
check that the supplied DMA mask is large enough, as nothing in the
iommu.c code (or the core DMA code) actually looks at the DMA mask
later on.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/kernel/iommu.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 0c253f1c852e..4ae7388b1bff 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -748,11 +748,9 @@ static int dma_4u_supported(struct device *dev, u64 device_mask)
if (ali_sound_dma_hack(dev, device_mask))
return 1;
 
-   if (device_mask > DMA_BIT_MASK(32))
+   if (device_mask < iommu->dma_addr_mask)
return 0;
-   if ((device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask)
-   return 1;
-   return 0;
+   return 1;
 }
 
 static const struct dma_map_ops sun4u_dma_ops = {
-- 
2.20.1



Re: [PATCH] iommu/ipmmu-vmsa: fix device reference leaks

2019-02-15 Thread Wen Yang
Hi 
> > > diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> > > index 7a4529c..cebf56d 100644
> > > --- a/drivers/iommu/ipmmu-vmsa.c
> > > +++ b/drivers/iommu/ipmmu-vmsa.c
> > > @@ -756,6 +756,9 @@ static int ipmmu_init_platform_device(struct device *dev,
> > >
> > > fwspec->iommu_priv = platform_get_drvdata(ipmmu_pdev);
> > >
> > > +   if (!fwspec->iommu_priv)
> > > +   put_device(&ipmmu_pdev->dev);
> > > +
> 
> This doesn't seem to match the patch's subject, and doesn't seem to fix
> the problem.

Thanks for your comments.
I will submit a v2 patch later.
Thank you.

Regards,

Wen
