Re: [PATCH] iommu/arm-smmu: Allow disabling bypass via kernel config
On Thu, Feb 14, 2019 at 7:40 PM Doug Anderson wrote:
> Hi,
>
> On Thu, Feb 14, 2019 at 1:32 PM Robin Murphy wrote:
> >
> > Hi Doug,
> >
> > On 2019-02-14 8:44 pm, Douglas Anderson wrote:
> > > Right now the only way to disable the iommu bypass for the ARM SMMU is
> > > with the kernel command line parameter 'arm-smmu.disable_bypass'.
> > >
> > > In general kernel command line parameters make sense for things that
> > > someone would like to tweak without rebuilding the kernel or for very
> > > basic communication between the bootloader and the kernel, but are
> > > awkward for other things. Specifically:
> > > * Human parsing of the kernel command line can be difficult since it's
> > >   just a big run-on, space-separated line of text.
> > > * If every bit of the system was configured via the kernel command
> > >   line, the kernel command line would get very large and even more
> > >   unwieldy.
> > > * Typically there are not easy ways in build systems to adjust the
> > >   kernel command line for config-like options.
> > >
> > > Let's introduce a new config option that allows us to disable the
> > > iommu bypass without affecting the existing default nor the existing
> > > ability to adjust the configuration via the kernel command line.
> >
> > I say let's just flip the default - for a while now it's been one of
> > those "oh yeah, we should probably do that" things that gets instantly
> > forgotten again, so some 3rd-party demand is plenty to convince me :)
> >
> > There are few reasons to allow unmatched stream bypass, and even fewer
> > good ones, so I'd be happy to shift the command-line burden over to the
> > esoteric cases at this point, and consider the config option in future
> > if anyone from that camp pops up and screams hard enough.
>
> Sure, I can submit that patch if we want. I presume I'll get lots of
> screaming but I'm used to that.
;-)
>
> ...specifically I found that when I turned on "disable bypass" on my
> board (sdm845-cheza, which is not yet upstream) that a bunch of things
> that used to work broke. That's a good thing because all the things
> that broke need to be fixed properly (by adding the IOMMUs) but
> presumably my board is not special in relying on the old insecure
> behavior.

So one niggly bit, beyond the drivers that aren't but should be using
iommu, is the case where the bootloader lights up the display. We
actually still have a lot of work to do for this (in that clk and genpd
drivers need to be aware of handover, in addition to just iommu).. But
I'd rather not make a hard problem harder just yet. (It gets worse,
afaict so far, on the windows-arm laptops, since in that case uefi/edk
is actually taking the iommu out of bypass and things go badly when
arm-smmu disables clks after probe..)

But I might recommend making the default a kconfig option for now, so
in the distro kernel case, where the display driver is a kernel module,
you aren't making a hard problem harder. For cases where the bootloader
isn't enabling display, things are much easier, and I think we could
switch the default sooner. But I fear for cases where the bootloader is
enabling display it is a much harder problem :-(

BR,
-R

> I'm about to head on vacation for a week so I'm not sure I'll get to
> re-post before then. If not I'll post this sometime after I get back
> unless someone beats me to it.
>
> -Doug

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
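If the Kconfig route suggested above were taken, a minimal sketch might look like the fragment below. The option name, default, and help text are illustrative assumptions only (this thread predates any merged option), and the existing `arm-smmu.disable_bypass=` command line parameter is assumed to keep overriding whatever build-time default is chosen:

```kconfig
config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
	bool "Disable bypass of unmatched stream IDs by default"
	depends on ARM_SMMU
	default y
	help
	  Say Y here to make the SMMU fault any DMA transaction from a
	  device that is not matched by a stream mapping, instead of
	  letting it bypass translation. Distro kernels that rely on a
	  bootloader-initialized display may want to say N until the
	  relevant drivers claim their IOMMUs.

	  The arm-smmu.disable_bypass= kernel command line parameter
	  still overrides this default either way.
```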
Re: [PATCH v6 0/9] vfio/mdev: IOMMU aware mediated device
On 14/02/2019 20:14, Alex Williamson wrote:
>> This patch series extends both IOMMU and vfio components to support
>> mdev device passing through when it could be isolated and protected
>> by the IOMMU units. The first part of this series (PATCH 1/09~6/09)
>> adds the interfaces and implementation of the multiple domains per
>> device. The second part (PATCH 7/09~9/09) adds the iommu device
>> attribute to each mdev, determines isolation type according to the
>> existence of an iommu device when attaching group in vfio type1 iommu
>> module, and attaches the domain to iommu aware mediated devices.
>>
>> References:
>> [1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
>> [2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
>> [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
>>
>> Best regards,
>> Lu Baolu
>>
>> Change log:
>> v5->v6:
>
> This looks pretty reasonable with Jean-Philippe's nit fixups. Where do
> we go from here? I think we need an ack from Kirti since they have an
> interest here. Presumably this looks ok to the ARM folks.

Looks great from my point of view. I focused on patch 1 since I'm
planning to reuse iommu_dev_features for SVA. I don't have time to test
auxd and mdev on SMMUv3 at the moment, but I had a better look and, if
it helps, for patches 1 and 7-9:

Reviewed-by: Jean-Philippe Brucker

That said, are you planning to add back the mdev_get_iommu_domain()
function in a separate patch? I ask because I think the parent driver
still needs a way to retrieve the PASID for an mdev.

Thanks,
Jean

> Do we have any consumers of this code yet? Theoretically I think a
> vfio-pci-like meta driver could be written as an mdev vendor driver
> with this support (restricted to type1 iommu use cases). Thanks,
>
> Alex
[PATCH] dma-contiguous: do not allocate a single page from CMA area
The addresses within a single page are always contiguous, so it is not
really necessary to allocate a single page from the CMA area. Since the
CMA area has a limited predefined size, it may run out of space in
heavy use cases, where there might be quite a lot of CMA pages being
allocated for single-page requests. However, there is also a concern
that a device might care where a page comes from -- it might expect the
page to come from the CMA area and act differently if it doesn't.

This patch skips one-page allocations and returns NULL so as to let
callers allocate normal pages, unless the device has its own CMA area.
This would save space in the CMA area for further CMA allocations, and
it would also reduce CMA fragmentation resulting from trivial
allocations.

Signed-off-by: Nicolin Chen
---
 kernel/dma/contiguous.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..09074bd04793 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -186,16 +186,32 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
  *
  * This function allocates memory buffer for specified device. It uses
  * device specific contiguous memory area if available or the default
- * global one. Requires architecture specific dev_get_cma_area() helper
- * function.
+ * global one.
+ *
+ * However, it skips one-page size of allocations from the global area.
+ * As the addresses within one page are always contiguous, so there is
+ * no need to waste CMA pages for that kind; it also helps reduce the
+ * fragmentations in the CMA area. So a caller should be the rebounder
+ * in such case to allocate a normal page upon NULL return value.
+ *
+ * Requires architecture specific dev_get_cma_area() helper function.
 */
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 				       unsigned int align, bool no_warn)
 {
+	struct cma *cma;
+
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+	if (dev && dev->cma_area)
+		cma = dev->cma_area;
+	else if (count > 1)
+		cma = dma_contiguous_default_area;
+	else
+		return NULL;
+
+	return cma_alloc(cma, count, align, no_warn);
 }
 
 /**
-- 
2.17.1
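For illustration, the area-selection logic of the patched function can be modeled in plain, self-contained C. Note that struct cma, struct device, and pick_cma_area() here are toy stand-ins for the kernel types, not the real API:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in types modeling the decision logic of the patched
 * dma_alloc_from_contiguous(); not the kernel's real structures. */
struct cma { int id; };
struct device { struct cma *cma_area; };

static struct cma global_area = { 1 };
static struct cma *dma_contiguous_default_area = &global_area;

/* Returns the CMA area to allocate from, or NULL to tell the caller to
 * fall back to a normal page allocation:
 *  - a device-specific area is always honored, even for one page;
 *  - the shared global area is only used for multi-page requests. */
static struct cma *pick_cma_area(const struct device *dev, size_t count)
{
	if (dev && dev->cma_area)
		return dev->cma_area;
	if (count > 1)
		return dma_contiguous_default_area;
	return NULL;
}
```

The point of the NULL return is exactly the "rebounder" behavior the kernel-doc describes: single-page callers are expected to retry with a normal page allocator.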
Re: [PATCH 4/5] sparc64/pci_sun4v: allow large DMA masks
From: Christoph Hellwig
Date: Fri, 15 Feb 2019 15:45:58 +0100

> We've been moving to a model where the device just sets the DMA mask
> supported by it, instead of having to fall back to something it thinks
> the platform might support. Sparc64 is the remaining holdout forcing
> drivers to supply a matching mask. Change dma_4v_supported to just
> check if the supplied dma mask is large enough, and adjust the mapping
> code to check ATU presence in addition to the DMA mask to decice on
      ^^ "decide"
> the mapping method.
>
> Signed-off-by: Christoph Hellwig

Acked-by: David S. Miller
Re: [PATCH 3/5] sparc64/iommu: allow large DMA masks
From: Christoph Hellwig
Date: Fri, 15 Feb 2019 15:45:57 +0100

> We've been moving to a model where the device just sets the DMA mask
> supported by it, instead of having to fall back to something it thinks
> the platform might support. Sparc64 is the remaining holdout forcing
> drivers to supply a matching mask. Change dma_4u_supported to just
> check if the supplied dma mask is large enough, as nothing in the
> iommu.c code (or the core DMA code) actually looks at the DMA mask
> later on.
>
> Signed-off-by: Christoph Hellwig

Acked-by: David S. Miller
Re: [RFC PATCH 1/5] iommu: Add APIs for IOMMU PASID management
Hi Jacob, Lu,

On 30/01/2019 19:05, Jacob Pan wrote:
> On Mon, 12 Nov 2018 14:44:57 +0800
> Lu Baolu wrote:
>
>> This adds APIs for IOMMU drivers and device drivers to manage
>> the PASIDs used for DMA transfer and translation. It is based on the
>> I/O ASID allocator for PASID namespace management and relies
>> on vendor specific IOMMU drivers for paravirtual PASIDs.
>>
> Trying to integrate this with the VT-d SVM code, I had some thoughts.
> First of all, it seems to me the problem we are trying to solve here
> is to allow a custom PASID/IOASID allocator.
> IOASID, as in driver base, is a common infrastructure in its own
> right. So it is OK to let device drivers such as the VT-d driver
> directly communicate with IOASID w/o going through the common IOMMU
> layer. Therefore I see little value in adding this wrapper to the
> ioasid code; instead I feel it might be less work to enhance ioasid
> with the following:
>
> 1. allow ioasid code to register a system wide custom asid allocator,
> which takes precedence over the idr allocator, e.g.
>
> typedef ioasid_t (*ioasid_alloc_fn_t)(ioasid_t min, ioasid_t max, void *data);
> void ioasid_set_allocator(ioasid_alloc_fn_t fn)
> {}
>
> We can still use idr for tracking ioasid and private data lookup;
> since the ioasid idr is exclusively owned by ioasid_alloc, we can
> rest assured there is no conflict with other callers. See code
> comments below.

Yes, I think we need the alloc function to be registered at boot time,
because even when using a PV allocation instead of idr allocation, we
do need a way to quickly retrieve PASID->mm when handling an I/O page
fault. In the ioasid_alloc() function, we could do, roughly:

static ioasid_t (*ioasid_alloc_fn)(...);

ioasid_t ioasid_alloc(... min, max...)
{
	if (ioasid_alloc_fn) {
		ioasid = ioasid_alloc_fn(min, max);
		/* With PV allocation, only use the IDR for storage */
		min = ioasid;
		max = ioasid + 1;
	}
	idr_lock
	ioasid = idr_alloc(... min, max... );
	idr_unlock
}

> 2. support setting asid private data _after_ asid is allocated. The
> use case is that PASID allocation and mm bind can be split into
> separate stages. Private data (mm related) are not available at the
> time of allocation, only at bind time.
> Since IDR needs the data pointer at allocation time, we can still
> create an empty ioasid_data for ioasid tracking at alloc time, i.e.
>
> struct ioasid_data {
> 	void *ptr; /* private data */
> };

I think I agree as well (though finding the right ordering and locking
in SVA allocation is giving me the biggest headache these days). If the
PASID becomes reachable early, we need to make sure that concurrent
threads working on the PASID space can deal with incomplete data, but
I think that might be one of the easiest alternatives.

> 3. allow NULL ioasid_set
> I also had a hard time understanding the use of ioasid_set; since
> there is already private data associated with each ioasid, is it not
> enough to group ioasids based on private data?

The idea behind ioasid_set is that different modules compete for the
single IOASID space, and associate different data types with their
IOASIDs:

(a) iommu-sva associates an mm_struct
(b) the IOMMU driver allocates a PASID for each auxiliary domain, and
    probably associates an iommu_domain with it
(c) device drivers (e.g. AMD KFD) want some PASIDs for their own use
    in the same space, and will have their own private data in there

For (a) we also want the io_pgfault code to find the mm associated
with a PASID:

  iopf_handle_mm_fault()
    iommu_sva_find(pasid)
      ioasid_find(&shared_pasid_set, pasid) -> mm struct

And in this case we really need the search to be only on the
shared_pasid_set, or else the IOPF code gets something that's not an
mm struct. The easy alternative is to let the IOMMU driver handle both
(a) and (b), and store the same data type for both. I've been
considering this a few times this week, but it doesn't solve case (c).

Thanks,
Jean

> So the usage scenarios can be:
>
> 1. during boot, the vt-d driver registers a custom ioasid allocator
> (vcmd) if it detects it's running as a guest.
>
> 2. Running as a guest, all pasid allocation via ioasid_alloc() will
> go to this custom allocator and be tracked.
>
> 3. for the native case, there is no custom allocator; ioasid just
> uses IDR alloc.
>
> 4. for native bind mm, pasid allocation already has private data
> filled in when calling ioasid_alloc.
>
> 5. for guest bind, pasid is allocated with empty private data (called
> by the VFIO layer), but private data is filled in later by bind guest
> pasid inside the vt-d driver.
>
> So my thinking is that we can avoid having another layer of APIs as
> below and keep everything within ioasid. This also allows private
> data to be managed at a lower level as compared to VFIO.
>
> Thanks,
>
> Jacob
>
>> Below APIs are added:
>>
>> * iommu_pasid_init(pasid)
>>   - Initialize a PASID consumer. The vendor specific IOMMU
>>     drivers are able to
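The custom-allocator hook sketched in this thread can be modeled as a self-contained toy in plain C. The bump counter stands in for the IDR, vcmd_alloc() stands in for a guest's paravirtual (vcmd) allocator, and none of these names are the final kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ioasid_t;
#define INVALID_IOASID ((ioasid_t)-1)

/* Hook type from the sketch above; the eventual kernel API may differ. */
typedef ioasid_t (*ioasid_alloc_fn_t)(ioasid_t min, ioasid_t max, void *data);

static ioasid_alloc_fn_t ioasid_alloc_fn;	/* e.g. a paravirt allocator */
static ioasid_t next_free = 1;			/* toy stand-in for the IDR */

static void ioasid_set_allocator(ioasid_alloc_fn_t fn)
{
	ioasid_alloc_fn = fn;
}

static ioasid_t ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
{
	if (ioasid_alloc_fn) {
		/* Custom allocator picks the ID; the tracking store would
		 * then only record that single ID (min = id, max = id + 1). */
		return ioasid_alloc_fn(min, max, data);
	}
	/* Native case: plain range allocation, as the IDR would do. */
	if (next_free < min)
		next_free = min;
	if (next_free >= max)
		return INVALID_IOASID;
	return next_free++;
}

/* Sample custom allocator, pretending the host handed us min + 41. */
static ioasid_t vcmd_alloc(ioasid_t min, ioasid_t max, void *data)
{
	(void)max; (void)data;
	return min + 41;
}
```

Registering vcmd_alloc() routes all subsequent ioasid_alloc() calls through it, mirroring scenarios 1-3 above: guests use the custom path, native systems fall through to the default allocator.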
Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually
> be as simple as this - Dave, are you able to give the diff below a spin?
>
> Robin.
>
> ->8-
> diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
> index 7f595355fb79..fe4a7c71fede 100644
> --- a/drivers/dma/mv_xor.c
> +++ b/drivers/dma/mv_xor.c
> @@ -1059,6 +1059,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
>  	mv_chan->op_in_desc = XOR_MODE_IN_DESC;
> 
>  	dma_dev = &mv_chan->dmadev;
> +	dma_dev->dev = &pdev->dev;
>  	mv_chan->xordev = xordev;
> 
>  	/*
> @@ -1091,7 +1092,6 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
>  	dma_dev->device_free_chan_resources = mv_xor_free_chan_resources;
>  	dma_dev->device_tx_status = mv_xor_status;
>  	dma_dev->device_issue_pending = mv_xor_issue_pending;
> -	dma_dev->dev = &pdev->dev;
> 
>  	/* set prep routines based on capability */
>  	if (dma_has_cap(DMA_INTERRUPT, dma_dev->cap_mask))

The patch is fine and it fixes the boot failure. I misapplied it in the
previous test.

Thanks,
Dave

-- 
John David Anglin  dave.ang...@bell.net
Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)
On 2019-02-15 10:22 a.m., John David Anglin wrote:
> On 2019-02-14 12:58 p.m., Robin Murphy wrote:
>> Hmm, having felt brave enough to take a closer look, it might actually
>> be as simple as this - Dave, are you able to give the diff below a spin?
> Still crashes but in slightly different spot:

I think I see what's wrong.

Dave

-- 
John David Anglin  dave.ang...@bell.net
Re: ARM64 boot failure on espressobin with 5.0.0-rc6 (1f947a7a011fcceb14cb912f5481a53b18f1879a)
On 2019-02-14 12:58 p.m., Robin Murphy wrote:
> Hmm, having felt brave enough to take a closer look, it might actually
> be as simple as this - Dave, are you able to give the diff below a spin?

Still crashes but in slightly different spot:

[    0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
[    0.00] Linux version 5.0.0-rc6+ (root@espressobin) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP PREEMPT Wed Feb 13 16:17:46 EST 2019
[    0.00] Machine model: Globalscale Marvell ESPRESSOBin Board
[    0.00] earlycon: ar3700_uart0 at MMIO 0xd0012000 (options '')
[    0.00] printk: bootconsole [ar3700_uart0] enabled
[    3.210276] Internal error: Oops: 9645 [#1] PREEMPT SMP
[    3.215932] Modules linked in:
[    3.219072] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc6+ #1
[    3.225519] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[    3.232151] pstate: a005 (NzCv daif -PAN -UAO)
[    3.237090] pc : mv_xor_channel_add+0x4c/0xb28
[    3.241650] lr : mv_xor_probe+0x20c/0x4b8
[    3.245768] sp : ff8010033ac0
[    3.249173] x29: ff8010033ac0 x28:
[    3.254639] x27: ff8010e76068 x26: 0029
[    3.260104] x25: x24:
[    3.265570] x23: ffc03fb47400 x22: ff8010ead000
[    3.271035] x21: ffc03fb47410 x20: ffc03bea8d80
[    3.276501] x19: ffc03fb47400 x18:
[    3.281966] x17: 000c x16: 000a
[    3.287432] x15: ff8010ead6c8 x14: ffc03beaa003
[    3.292898] x13: ffc03beaa002 x12: 0038
[    3.298363] x11: 1fff x10: 0001
[    3.303829] x9 : 0040 x8 : ff8010ec7928
[    3.309294] x7 : ffc03cc003b8 x6 :
[    3.314760] x5 : x4 : 0029
[    3.320226] x3 : 0083 x2 : 80c0
[    3.325691] x1 : x0 : ffc03fb47410
[    3.331158] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[    3.338056] Call trace:
[    3.340569]  mv_xor_channel_add+0x4c/0xb28
[    3.344779]  mv_xor_probe+0x20c/0x4b8
[    3.348544]  platform_drv_probe+0x50/0xb0
[    3.352663]  really_probe+0x1fc/0x2c0
[    3.356427]  driver_probe_device+0x58/0x100
[    3.360727]  __driver_attach+0xd8/0xe0
[    3.364580]  bus_for_each_dev+0x68/0xc8
[    3.368522]  driver_attach+0x20/0x28
[    3.372196]  bus_add_driver+0x108/0x228
[    3.376139]  driver_register+0x60/0x110
[    3.380081]  __platform_driver_register+0x44/0x50
[    3.384923]  mv_xor_driver_init+0x18/0x20
[    3.389043]  do_one_initcall+0x58/0x170
[    3.392985]  kernel_init_freeable+0x190/0x234
[    3.397465]  kernel_init+0x10/0x108
[    3.401047]  ret_from_fork+0x10/0x1c
[    3.404723] Code: f90067a5 d285 52901802 aa1503e0 (f9003035)
[    3.411004] ---[ end trace 65be82a62724e328 ]---
[    3.415804] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[    3.423626] SMP: stopping secondary CPUs
[    3.427661] Kernel Offset: disabled
[    3.431243] CPU features: 0x002,2000200c
[    3.435272] Memory Limit: none
[    3.438412] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b ]---

ff8010630440 <mv_xor_channel_add>:
ff8010630440:	a9b37bfd	stp	x29, x30, [sp, #-208]!
ff8010630444:	910003fd	mov	x29, sp
ff8010630448:	a9025bf5	stp	x21, x22, [sp, #32]
ff801063044c:	b00043f6	adrp	x22, ff8010ead000
ff8010630450:	a90363f7	stp	x23, x24, [sp, #48]
ff8010630454:	aa0103f7	mov	x23, x1
ff8010630458:	a9046bf9	stp	x25, x26, [sp, #64]
ff801063045c:	d281	mov	x1, #0x0	// #0
ff8010630460:	a90153f3	stp	x19, x20, [sp, #16]
ff8010630464:	910042f5	add	x21, x23, #0x10
ff8010630468:	a90573fb	stp	x27, x28, [sp, #80]
ff801063046c:	aa0003f4	mov	x20, x0
ff8010630470:	911b22c0	add	x0, x22, #0x6c8
ff8010630474:	2a0203f9	mov	w25, w2
ff8010630478:	f945	ldr	x5, [x0]
ff801063047c:	f90067a5	str	x5, [x29, #200]
ff8010630480:	d285	mov	x5, #0x0	// #0
ff8010630484:	52901802	mov	w2, #0x80c0	// #32960
ff8010630488:	aa1503e0	mov	x0, x21
ff801063048c:	f9003035	str	x21, [x1, #96]
ff8010630490:	72a00c02	movk	w2, #0x60, lsl #16
ff8010630494:	d2806a01	mov	x1, #0x350
...

Dave

-- 
John David Anglin  dave.ang...@bell.net
[PATCH 5/5] Documentation/DMA-API-HOWTO: update dma_mask sections
We don't require drivers to guess a DMA mask that might actually match
the system capabilities any more, so fix up the documentation to clear
this up.

Signed-off-by: Christoph Hellwig
---
 Documentation/DMA-API-HOWTO.txt | 121 +++-
 1 file changed, 41 insertions(+), 80 deletions(-)

diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt
index f0cc3f772265..8e948fae34af 100644
--- a/Documentation/DMA-API-HOWTO.txt
+++ b/Documentation/DMA-API-HOWTO.txt
@@ -146,114 +146,75 @@ What about block I/O and networking buffers?
 The block I/O and networking subsystems make sure that the buffers
 they use are valid for you to DMA from/to.
 
-DMA addressing limitations
+DMA addressing capabilities
 ==========================
 
-Does your device have any DMA addressing limitations? For example, is
-your device only capable of driving the low order 24-bits of address?
-If so, you need to inform the kernel of this fact.
+By default, the kernel assumes that your device can address 32-bits of DMA
+addressing. For a 64-bit capable device, this needs to be increased, and for
+a device with limitations, it needs to be decreased.
 
-By default, the kernel assumes that your device can address the full
-32-bits. For a 64-bit capable device, this needs to be increased.
-And for a device with limitations, as discussed in the previous
-paragraph, it needs to be decreased.
+Special note about PCI: PCI-X specification requires PCI-X devices to support
+64-bit addressing (DAC) for all transactions. And at least one platform (SGI
+SN2) requires 64-bit consistent allocations to operate correctly when the IO
+bus is in PCI-X mode.
 
-Special note about PCI: PCI-X specification requires PCI-X devices to
-support 64-bit addressing (DAC) for all transactions. And at least
-one platform (SGI SN2) requires 64-bit consistent allocations to
-operate correctly when the IO bus is in PCI-X mode.
+For correct operation, you must set the DMA mask to inform the kernel about
+your devices DMA addressing capabilities.
 
-For correct operation, you must interrogate the kernel in your device
-probe routine to see if the DMA controller on the machine can properly
-support the DMA addressing limitation your device has. It is good
-style to do this even if your device holds the default setting,
-because this shows that you did think about these issues wrt. your
-device.
-
-The query is performed via a call to dma_set_mask_and_coherent()::
+This is performed via a call to dma_set_mask_and_coherent()::
 
 	int dma_set_mask_and_coherent(struct device *dev, u64 mask);
 
-which will query the mask for both streaming and coherent APIs together.
-If you have some special requirements, then the following two separate
-queries can be used instead:
+which will set the mask for both streaming and coherent APIs together. If you
+have some special requirements, then the following two separate calls can be
+used instead:
 
-	The query for streaming mappings is performed via a call to
+	The setup for streaming mappings is performed via a call to
 	dma_set_mask()::
 
 		int dma_set_mask(struct device *dev, u64 mask);
 
-	The query for consistent allocations is performed via a call
+	The setup for consistent allocations is performed via a call
 	to dma_set_coherent_mask()::
 
 		int dma_set_coherent_mask(struct device *dev, u64 mask);
 
-Here, dev is a pointer to the device struct of your device, and mask
-is a bit mask describing which bits of an address your device
-supports. It returns zero if your card can perform DMA properly on
-the machine given the address mask you provided. In general, the
-device struct of your device is embedded in the bus-specific device
-struct of your device. For example, &pdev->dev is a pointer to the
-device struct of a PCI device (pdev is a pointer to the PCI device
-struct of your device).
+Here, dev is a pointer to the device struct of your device, and mask is a bit
+mask describing which bits of an address your device supports. Often the
+device struct of your device is embedded in the bus-specific device struct of
+your device. For example, &pdev->dev is a pointer to the device struct of a
+PCI device (pdev is a pointer to the PCI device struct of your device).
 
-If it returns non-zero, your device cannot perform DMA properly on
-this platform, and attempting to do so will result in undefined
-behavior. You must either use a different mask, or not use DMA.
+These calls usually return zero to indicated your device can perform DMA
+properly on the machine given the address mask you provided, but they might
+return an error if the mask is too small to be supportable on the given
+system. If it returns non-zero, your device cannot perform DMA properly on
+this platform, and attempting to do so will result in undefined behavior.
+You must not use DMA on this device unless the dma_set_mask family of
+functions has
[PATCH 2/5] sparc64: refactor the ali DMA quirk
Do the quirk first in the dma_supported routines, as we don't need any
of the other checks for it, and remove the duplicate mask checking that
is already done by the callers.

Signed-off-by: Christoph Hellwig
---
 arch/sparc/kernel/iommu.c     |  7 +++---
 arch/sparc/kernel/kernel.h    |  6 -
 arch/sparc/kernel/pci.c       | 46 ---
 arch/sparc/kernel/pci_sun4v.c |  5 +++-
 4 files changed, 27 insertions(+), 37 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index b1a09080e8da..0c253f1c852e 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -745,14 +745,13 @@ static int dma_4u_supported(struct device *dev, u64 device_mask)
 {
 	struct iommu *iommu = dev->archdata.iommu;
 
+	if (ali_sound_dma_hack(dev, device_mask))
+		return 1;
+
 	if (device_mask > DMA_BIT_MASK(32))
 		return 0;
 	if ((device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask)
 		return 1;
-#ifdef CONFIG_PCI
-	if (dev_is_pci(dev))
-		return pci64_dma_supported(to_pci_dev(dev), device_mask);
-#endif
 	return 0;
 }
 
diff --git a/arch/sparc/kernel/kernel.h b/arch/sparc/kernel/kernel.h
index ddffd368e057..f6f498ba3198 100644
--- a/arch/sparc/kernel/kernel.h
+++ b/arch/sparc/kernel/kernel.h
@@ -45,7 +45,11 @@ void __irq_entry smp_receive_signal_client(int irq, struct pt_regs *regs);
 void __irq_entry smp_kgdb_capture_client(int irq, struct pt_regs *regs);
 
 /* pci.c */
-int pci64_dma_supported(struct pci_dev *pdev, u64 device_mask);
+#ifdef CONFIG_PCI
+int ali_sound_dma_hack(struct device *dev, u64 device_mask);
+#else
+#define ali_sound_dma_hack(dev, mask)	(0)
+#endif
 
 /* signal32.c */
 void do_sigreturn32(struct pt_regs *regs);
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
index bcfec6a85d23..5ed43828e078 100644
--- a/arch/sparc/kernel/pci.c
+++ b/arch/sparc/kernel/pci.c
@@ -956,51 +956,35 @@ void arch_teardown_msi_irq(unsigned int irq)
 }
 #endif /* !(CONFIG_PCI_MSI) */
 
-static void ali_sound_dma_hack(struct pci_dev *pdev, int set_bit)
+/* ALI sound chips generate 31-bits of DMA, a special register
+ * determines what bit 31 is emitted as.
+ */
+int ali_sound_dma_hack(struct device *dev, u64 device_mask)
 {
+	struct iommu *iommu = dev->archdata.iommu;
 	struct pci_dev *ali_isa_bridge;
 	u8 val;
 
-	/* ALI sound chips generate 31-bits of DMA, a special register
-	 * determines what bit 31 is emitted as.
-	 */
+	if (!dev_is_pci(dev))
+		return 0;
+
+	if (to_pci_dev(dev)->vendor != PCI_VENDOR_ID_AL ||
+	    to_pci_dev(dev)->device != PCI_DEVICE_ID_AL_M5451 ||
+	    device_mask != 0x7fffffff)
+		return 0;
+
 	ali_isa_bridge = pci_get_device(PCI_VENDOR_ID_AL,
 					PCI_DEVICE_ID_AL_M1533,
 					NULL);
 	pci_read_config_byte(ali_isa_bridge, 0x7e, &val);
-	if (set_bit)
+	if (iommu->dma_addr_mask & 0x80000000)
 		val |= 0x01;
 	else
 		val &= ~0x01;
 	pci_write_config_byte(ali_isa_bridge, 0x7e, val);
 	pci_dev_put(ali_isa_bridge);
-}
-
-int pci64_dma_supported(struct pci_dev *pdev, u64 device_mask)
-{
-	u64 dma_addr_mask;
-
-	if (pdev == NULL) {
-		dma_addr_mask = 0xffffffff;
-	} else {
-		struct iommu *iommu = pdev->dev.archdata.iommu;
-
-		dma_addr_mask = iommu->dma_addr_mask;
-
-		if (pdev->vendor == PCI_VENDOR_ID_AL &&
-		    pdev->device == PCI_DEVICE_ID_AL_M5451 &&
-		    device_mask == 0x7fffffff) {
-			ali_sound_dma_hack(pdev,
-					   (dma_addr_mask & 0x80000000) != 0);
-			return 1;
-		}
-	}
-
-	if (device_mask >= (1UL << 32UL))
-		return 0;
-
-	return (device_mask & dma_addr_mask) == dma_addr_mask;
+	return 1;
 }
 
 void pci_resource_to_user(const struct pci_dev *pdev, int bar,
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index fa0e42b4cbfb..d30eb22b6e11 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -676,6 +676,9 @@ static int dma_4v_supported(struct device *dev, u64 device_mask)
 	struct iommu *iommu = dev->archdata.iommu;
 	u64 dma_addr_mask = iommu->dma_addr_mask;
 
+	if (ali_sound_dma_hack(dev, device_mask))
+		return 1;
+
 	if (device_mask > DMA_BIT_MASK(32)) {
 		if (iommu->atu)
 			dma_addr_mask = iommu->atu->dma_addr_mask;
@@ -685,7 +688,7 @@ static int dma_4v_supported(struct device *dev, u64 device_mask)
 	if ((device_mask & dma_addr_mask) == dma_addr_mask)
 		return 1;
-	return pci64_dma_supported(to_pci_dev(dev),
allow larger than required DMA masks
Hi all,

this series finishes off converting our DMA mask model to a split
between device capabilities (dev->dma_mask and dev->coherent_dma_mask)
and system limitations (dev->bus_dma_mask). We already accept larger
than required masks in most dma_map_ops implementations -- in the case
of x86 and implementations based on it, since the dawn of time.

Only one parisc and two sparc64 instances rejected larger than required
DMA masks. This series fixes that up and updates the documentation to
say that devices don't need to handle DMA mask fallbacks.
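The acceptance rule the series converges on ("is the device mask at least as large as what the platform can address?") can be sketched with the kernel's DMA_BIT_MASK() definition. Note that dma_supported_model() below is a stand-in for the per-platform dma_supported routines, not a real kernel function:

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): a mask with the low n bits
 * set, with n == 64 special-cased to avoid an undefined 64-bit shift. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Under the new model, a platform accepts any device mask at least as
 * large as what its bus/IOMMU can address, instead of demanding an
 * exact match (cf. the ccio and sparc64 patches in this series). */
static int dma_supported_model(uint64_t device_mask, uint64_t platform_mask)
{
	return device_mask >= platform_mask;
}
```

So a 64-bit capable device is accepted on a 32-bit-limited platform (the platform simply never hands out addresses above its own limit), while a device whose mask is narrower than the platform minimum is still rejected.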
[PATCH 4/5] sparc64/pci_sun4v: allow large DMA masks
We've been moving to a model where the device just sets the DMA mask
supported by it, instead of having to fall back to something it thinks
the platform might support. Sparc64 is the remaining holdout forcing
drivers to supply a matching mask. Change dma_4v_supported to just
check if the supplied dma mask is large enough, and adjust the mapping
code to check ATU presence in addition to the DMA mask to decide on
the mapping method.

Signed-off-by: Christoph Hellwig
---
 arch/sparc/kernel/pci_sun4v.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index d30eb22b6e11..a8af6023c126 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -92,7 +92,7 @@ static long iommu_batch_flush(struct iommu_batch *p, u64 mask)
 	prot &= (HV_PCI_MAP_ATTR_READ | HV_PCI_MAP_ATTR_WRITE);
 
 	while (npages != 0) {
-		if (mask <= DMA_BIT_MASK(32)) {
+		if (mask <= DMA_BIT_MASK(32) || !pbm->iommu->atu) {
 			num = pci_sun4v_iommu_map(devhandle,
 						  HV_PCI_TSBID(0, entry),
 						  npages,
@@ -208,7 +208,7 @@ static void *dma_4v_alloc_coherent(struct device *dev, size_t size,
 	atu = iommu->atu;
 
 	mask = dev->coherent_dma_mask;
-	if (mask <= DMA_BIT_MASK(32))
+	if (mask <= DMA_BIT_MASK(32) || !atu)
 		tbl = &iommu->tbl;
 	else
 		tbl = &atu->tbl;
@@ -674,21 +674,12 @@ static void dma_4v_unmap_sg(struct device *dev, struct scatterlist *sglist,
 static int dma_4v_supported(struct device *dev, u64 device_mask)
 {
 	struct iommu *iommu = dev->archdata.iommu;
-	u64 dma_addr_mask = iommu->dma_addr_mask;
 
 	if (ali_sound_dma_hack(dev, device_mask))
 		return 1;
-
-	if (device_mask > DMA_BIT_MASK(32)) {
-		if (iommu->atu)
-			dma_addr_mask = iommu->atu->dma_addr_mask;
-		else
-			return 0;
-	}
-
-	if ((device_mask & dma_addr_mask) == dma_addr_mask)
-		return 1;
-	return 0;
+	if (device_mask < iommu->dma_addr_mask)
+		return 0;
+	return 1;
 }
 
 static const struct dma_map_ops sun4v_dma_ops = {
-- 
2.20.1
[PATCH 1/5] ccio: allow large DMA masks
There is no harm in setting a 64-bit mask even if we don't need it, and
the current ccio code is one of the few places in the kernel still
rejecting larger than required DMA masks.

Signed-off-by: Christoph Hellwig
---
 drivers/parisc/ccio-dma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index 8d2fc84119c6..24a68fb7b2f9 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -710,8 +710,8 @@ ccio_dma_supported(struct device *dev, u64 mask)
 		return 0;
 	}
 
-	/* only support 32-bit devices (ie PCI/GSC) */
-	return (int)(mask == 0xffffffffUL);
+	/* only support 32-bit or better devices (ie PCI/GSC) */
+	return (int)(mask >= 0xffffffffUL);
 }
 
 /**
-- 
2.20.1
[PATCH 3/5] sparc64/iommu: allow large DMA masks
We've been moving to a model where the device just sets the DMA mask
supported by it, instead of having to fall back to something it thinks
the platform might support. Sparc64 is the remaining holdout forcing
drivers to supply a matching mask. Change dma_4u_supported to just
check if the supplied dma mask is large enough, as nothing in the
iommu.c code (or the core DMA code) actually looks at the DMA mask
later on.

Signed-off-by: Christoph Hellwig
---
 arch/sparc/kernel/iommu.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 0c253f1c852e..4ae7388b1bff 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -748,11 +748,9 @@ static int dma_4u_supported(struct device *dev, u64 device_mask)
 	if (ali_sound_dma_hack(dev, device_mask))
 		return 1;
 
-	if (device_mask > DMA_BIT_MASK(32))
+	if (device_mask < iommu->dma_addr_mask)
 		return 0;
-	if ((device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask)
-		return 1;
-	return 0;
+	return 1;
 }
 
 static const struct dma_map_ops sun4u_dma_ops = {
-- 
2.20.1
Re: [PATCH] iommu/ipmmu-vmsa: fix device reference leaks
Hi,

>>> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
>>> index 7a4529c..cebf56d 100644
>>> --- a/drivers/iommu/ipmmu-vmsa.c
>>> +++ b/drivers/iommu/ipmmu-vmsa.c
>>> @@ -756,6 +756,9 @@ static int ipmmu_init_platform_device(struct device *dev,
>>>
>>>  	fwspec->iommu_priv = platform_get_drvdata(ipmmu_pdev);
>>>
>>> +	if (!fwspec->iommu_priv)
>>> +		put_device(&ipmmu_pdev->dev);
>>> +
>
> This doesn't seem to match the patch's subject, and doesn't seem to fix
> the problem.

Thanks for your comments. I will submit a v2 patch later.

Thank you.

Regards,
Wen