Re: [PATCH v3 0/6] Convert the intel iommu driver to the dma-iommu api
On 15/09/2020 02:47, Lu Baolu wrote:

Hi Tvrtko,

On 9/14/20 4:04 PM, Tvrtko Ursulin wrote:

Hi,

On 12/09/2020 04:21, Lu Baolu wrote:

Tom Murphy has almost done all the work. His latest patch series was posted here:

https://lore.kernel.org/linux-iommu/20200903201839.7327-1-murph...@tcd.ie/

Thanks a lot!

This series is a follow-up with the changes below:

1. Add a quirk for the i915 driver issue described in Tom's cover letter.

Last week I copied you on an i915 series which appears to remove the need for this quirk. So if we get those i915 patches reviewed and merged, do you still want to pursue this quirk?

It's up to the graphics guys. I don't know the details of the i915 driver. I don't think my tests could cover all cases.

I am the graphics guy. :) I just need some reviews (internally) for my series and then we can merge it, at which point you won't need the quirk patch any more. I'll try to accelerate this.

With regards to testing, you could send your series with my patches on top to our trybot mailing list (intel-gfx-try...@lists.freedesktop.org / https://patchwork.freedesktop.org/project/intel-gfx-trybot/series/?ordering=-last_updated), which would show you whether it is still hitting the DMAR issues in i915.

2. Fix several bugs in the patch "iommu: Allow the dma-iommu api to use bounce buffers" to make the bounce buffer work for untrusted devices.

3. Several cleanups in the iommu/vt-d driver after the conversion.

With the previous version of the series I hit a problem on Ivybridge where apparently the DMA engine width is not respected. At least that is my layman interpretation of the errors.
From the older thread:

<3> [209.526605] DMAR: intel_iommu_map: iommu width (39) is not sufficient for the mapped address (008000)

Relevant IOMMU boot-related messages are:

<6>[ 0.184234] DMAR: Host address width 36
<6>[ 0.184245] DMAR: DRHD base: 0x00fed9 flags: 0x0
<6>[ 0.184288] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap c020e60262 ecap f0101a
<6>[ 0.184308] DMAR: DRHD base: 0x00fed91000 flags: 0x1
<6>[ 0.184337] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap c9008020660262 ecap f0105a
<6>[ 0.184357] DMAR: RMRR base: 0x00d8d28000 end: 0x00d8d46fff
<6>[ 0.184377] DMAR: RMRR base: 0x00db00 end: 0x00df1f
<6>[ 0.184398] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
<6>[ 0.184414] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
<6>[ 0.184428] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
<6>[ 0.185173] DMAR-IR: Enabled IRQ remapping in x2apic mode
<6>[ 0.878934] DMAR: No ATSR found
<6>[ 0.878966] DMAR: dmar0: Using Queued invalidation
<6>[ 0.879007] DMAR: dmar1: Using Queued invalidation
<6>[ 0.915032] DMAR: Intel(R) Virtualization Technology for Directed I/O
<6>[ 0.915060] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
<6>[ 0.915084] software IO TLB: mapped [mem 0xc80d4000-0xcc0d4000] (64MB)

(Full boot log at https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/boot0.txt, failures at https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/igt@i915_selftest@l...@blt.html.)

Does this look familiar or at least plausible to you? Is this something your new series has fixed?

This happens during attaching a domain to a device. It has nothing to do with this patch series. I will look into this issue, but not in this email thread context.
I am not sure what step "attaching a domain to a device" is, but these types of messages:

<3> [209.526605] DMAR: intel_iommu_map: iommu width (39) is not sufficient for the mapped address (008000)

definitely appear to happen at runtime, as i915 is getting exercised by userspace.

Regards,

Tvrtko

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [Intel-gfx] [trivial PATCH] treewide: Convert switch/case fallthrough; to break;
On Wed, 09 Sep 2020, Joe Perches wrote:
> diff --git a/drivers/gpu/drm/i915/display/intel_sprite.c b/drivers/gpu/drm/i915/display/intel_sprite.c
> index 5ac0dbf0e03d..35ac539cc2b1 100644
> --- a/drivers/gpu/drm/i915/display/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/display/intel_sprite.c
> @@ -2861,7 +2861,7 @@ static bool gen12_plane_format_mod_supported(struct drm_plane *_plane,
> 	case I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS:
> 		if (!gen12_plane_supports_mc_ccs(dev_priv, plane->id))
> 			return false;
> -		fallthrough;
> +		break;
> 	case DRM_FORMAT_MOD_LINEAR:
> 	case I915_FORMAT_MOD_X_TILED:
> 	case I915_FORMAT_MOD_Y_TILED:

Acked-by: Jani Nikula for merging via whichever tree seems best.

BR, Jani.

--
Jani Nikula, Intel Open Source Graphics Center
Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;
Hi Joe,

For MTD:

> drivers/mtd/nand/raw/nandsim.c | 2 +-

Reviewed-by: Miquel Raynal

Thanks,
Miquèl
Re: [PATCH] iommu/amd: fix interrupt remapping for avic
On Mon, 2020-09-14 at 21:48 +0700, Suravee Suthikulpanit wrote:
> Maxim,
>
> On 9/13/2020 7:42 PM, Maxim Levitsky wrote:
> > Commit e52d58d54a32 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE")
> > accidentally removed an assumption that modify_irte_ga always set the valid bit
> > and amd_iommu_activate_guest_mode relied on that.
> >
> > Side effect of this is that on my machine, VFIO based VMs with AVIC enabled
> > would eventually crash and show IOMMU errors like that:
> >
> > AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0055 address=0xfffdf800 flags=0x0008]
> >
> > Fixes: e52d58d54a321 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE")
> > Signed-off-by: Maxim Levitsky
> > ---
> >  drivers/iommu/amd/iommu.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index 07ae8b93887e5..aff4cc1869356 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> > @@ -3853,6 +3853,7 @@ int amd_iommu_activate_guest_mode(void *data)
> >  	entry->hi.fields.ga_root_ptr = ir_data->ga_root_ptr;
> >  	entry->hi.fields.vector = ir_data->ga_vector;
> >  	entry->lo.fields_vapic.ga_tag = ir_data->ga_tag;
> > +	entry->lo.fields_remap.valid = 1;
> >
> >  	return modify_irte_ga(ir_data->irq_2_irte.devid,
> >  			      ir_data->irq_2_irte.index, entry, ir_data);
>
> Could you please try with the following patch instead?
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3840,14 +3840,18 @@ int amd_iommu_activate_guest_mode(void *data)
>  {
>  	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
>  	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
> +	u64 valid;
>
>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>  	    !entry || entry->lo.fields_vapic.guest_mode)
>  		return 0;
>
> +	valid = entry->lo.fields_vapic.valid;
> +
>  	entry->lo.val = 0;
>  	entry->hi.val = 0;
>
> +	entry->lo.fields_vapic.valid = valid;
>  	entry->lo.fields_vapic.guest_mode = 1;
>  	entry->lo.fields_vapic.ga_log_intr = 1;
>  	entry->hi.fields.ga_root_ptr = ir_data->ga_root_ptr;
> @@ -3864,12 +3868,14 @@ int amd_iommu_deactivate_guest_mode(void *data)
>  	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
>  	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
>  	struct irq_cfg *cfg = ir_data->cfg;
> -	u64 valid = entry->lo.fields_remap.valid;
> +	u64 valid;
>
>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>  	    !entry || !entry->lo.fields_vapic.guest_mode)
>  		return 0;
>
> +	valid = entry->lo.fields_remap.valid;
> +
>  	entry->lo.val = 0;
>  	entry->hi.val = 0;

I see. I based my approach on the fact that the valid bit was always set to true anyway before, plus that amd_iommu_activate_guest_mode should really only be called when someone activates a valid interrupt remapping entry, but IMHO the approach of preserving the valid bit is safer anyway.

It works on my system (I applied the patch manually, since either your or my email client seems to mangle the patch).

Thanks,
Best regards,
Maxim Levitsky

> --
>
> Thanks,
> Suravee
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Mon, Sep 14, 2020 at 03:44:38PM -0700, Raj, Ashok wrote:
> Hi Jason,
>
> I thought we discussed this at LPC, but still seems to be going in
> circles :-(.

We discussed mdev at LPC, not PASID. PASID applies widely to many devices and needs to be introduced with wide community agreement so all scenarios will be supportable.

> As you had suggested earlier in the mail thread could Jason Wang maybe
> build out what it takes to have a full fledged /dev/sva interface for vDPA
> and figure out how the interfaces should emerge? otherwise it appears
> everyone is talking very high level and with that limited understanding of
> how things work at the moment.

You want Jason Wang to do the work to get Intel PASID support merged? Seems a bit of a strange request.

> This has to move ahead of these email discussions, hoping somone with the
> right ideas would help move this forward.

Why not try yourself to come up with a proposal?

Jason
Re: [PATCH] iommu/amd: fix interrupt remapping for avic
On 9/15/20 6:25 PM, Maxim Levitsky wrote:
> On Mon, 2020-09-14 at 21:48 +0700, Suravee Suthikulpanit wrote:
>> Maxim,
>>
>> On 9/13/2020 7:42 PM, Maxim Levitsky wrote:
>>> Commit e52d58d54a32 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE")
>>> accidentally removed an assumption that modify_irte_ga always set the valid bit
>>> and amd_iommu_activate_guest_mode relied on that.
>>>
>>> Side effect of this is that on my machine, VFIO based VMs with AVIC enabled
>>> would eventually crash and show IOMMU errors like that:
>>>
>>> AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0055 address=0xfffdf800 flags=0x0008]
>>>
>>> Fixes: e52d58d54a321 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE")
>>> Signed-off-by: Maxim Levitsky
>>> ---
>>>  drivers/iommu/amd/iommu.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>>> index 07ae8b93887e5..aff4cc1869356 100644
>>> --- a/drivers/iommu/amd/iommu.c
>>> +++ b/drivers/iommu/amd/iommu.c
>>> @@ -3853,6 +3853,7 @@ int amd_iommu_activate_guest_mode(void *data)
>>>  	entry->hi.fields.ga_root_ptr = ir_data->ga_root_ptr;
>>>  	entry->hi.fields.vector = ir_data->ga_vector;
>>>  	entry->lo.fields_vapic.ga_tag = ir_data->ga_tag;
>>> +	entry->lo.fields_remap.valid = 1;
>>>
>>>  	return modify_irte_ga(ir_data->irq_2_irte.devid,
>>>  			      ir_data->irq_2_irte.index, entry, ir_data);
>>
>> Could you please try with the following patch instead?
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> @@ -3840,14 +3840,18 @@ int amd_iommu_activate_guest_mode(void *data)
>>  {
>>  	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
>>  	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
>> +	u64 valid;
>>
>>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>>  	    !entry || entry->lo.fields_vapic.guest_mode)
>>  		return 0;
>>
>> +	valid = entry->lo.fields_vapic.valid;
>> +
>>  	entry->lo.val = 0;
>>  	entry->hi.val = 0;
>>
>> +	entry->lo.fields_vapic.valid = valid;
>>  	entry->lo.fields_vapic.guest_mode = 1;
>>  	entry->lo.fields_vapic.ga_log_intr = 1;
>>  	entry->hi.fields.ga_root_ptr = ir_data->ga_root_ptr;
>> @@ -3864,12 +3868,14 @@ int amd_iommu_deactivate_guest_mode(void *data)
>>  	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
>>  	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
>>  	struct irq_cfg *cfg = ir_data->cfg;
>> -	u64 valid = entry->lo.fields_remap.valid;
>> +	u64 valid;
>>
>>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>>  	    !entry || !entry->lo.fields_vapic.guest_mode)
>>  		return 0;
>>
>> +	valid = entry->lo.fields_remap.valid;
>> +
>>  	entry->lo.val = 0;
>>  	entry->hi.val = 0;
>
> I see. I based my approach on the fact that the valid bit was always set
> to true anyway before, plus that amd_iommu_activate_guest_mode should
> really only be called when someone activates a valid interrupt remapping
> entry, but IMHO the approach of preserving the valid bit is safer anyway.
>
> It works on my system (I applied the patch manually, since either your or
> my email client seems to mangle the patch).

Sorry for the mangled patch. I'll submit the patch with your information. Thanks for your help reporting, debugging, and testing the patch.

Sincerely,
Suravee

--
Thanks,
Suravee
Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property
On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > From: Thierry Reding
> >
> > Reserved memory regions can be marked as "active" if hardware is
> > expected to access the regions during boot and before the operating
> > system can take control. One example where this is useful is for the
> > operating system to infer whether the region needs to be identity-
> > mapped through an IOMMU.
>
> I like simple solutions, but this hardly seems adequate to solve the
> problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> what is the IOVA that's supposed to be used if identity mapping is not
> used?

The assumption here is that if the region is not active there is no need for the IOVA to be specified, because the kernel will allocate memory and assign any IOVA of its choosing.

Also, note that this is not meant as a way of passing IOMMU setup from the bootloader or firmware to the OS. The purpose of this is to specify that some region of memory is actively being accessed during boot.

The particular case that I'm looking at is where the bootloader sets up a splash screen and keeps it on during boot. The bootloader has not set up an IOMMU mapping, and the identity mapping serves as a way of keeping the accesses by the display hardware working during the transitional period after the IOMMU translations have been enabled by the kernel but before the kernel display driver has had a chance to set up its own IOMMU mappings.

> If you know enough about the regions to assume identity mapping, then
> can't you know if active or not?

We could alternatively add some property that describes the region as requiring an identity mapping. But note that we can't make any assumptions here about the usage of these regions, because the IOMMU driver simply has no way of knowing what they are being used for.
Some additional information is required in device tree for the IOMMU driver to be able to make that decision.

Thierry

> > Signed-off-by: Thierry Reding
> > ---
> >  .../bindings/reserved-memory/reserved-memory.txt | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > index 4dd20de6977f..163d2927e4fc 100644
> > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > @@ -63,6 +63,13 @@ reusable (optional) - empty property
> >     able to reclaim it back. Typically that means that the operating
> >     system can use that region to store volatile or cached data that
> >     can be otherwise regenerated or migrated elsewhere.
> > +active (optional) - empty property
> > +    - If this property is set for a reserved memory region, it indicates
> > +      that some piece of hardware may be actively accessing this region.
> > +      Should the operating system want to enable IOMMU protection for a
> > +      device, all active memory regions must have been identity-mapped
> > +      in order to ensure that non-quiescent hardware during boot can
> > +      continue to access the memory.
> >
> >  Linux implementation note:
> >  - If a "linux,cma-default" property is present, then Linux will use the
> > --
> > 2.28.0
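For the splash-screen case described above, a reserved-memory node using the proposed property could look like the sketch below. The node name, unit address, and sizes are invented for illustration; only the "active" property comes from the binding patch:

```dts
/* Hypothetical example: a framebuffer the bootloader leaves scanning out. */
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	splash: framebuffer@92000000 {
		reg = <0x0 0x92000000 0x0 0x00800000>; /* 8 MiB */
		no-map;
		active; /* display HW keeps reading this during boot */
	};
};
```

With such a node, the OS would know to install an identity mapping for 0x92000000-0x927fffff before enabling IOMMU translation for the display device.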
Re: [PATCH] iommu/amd: fix interrupt remapping for avic
On 9/15/20 1:30 PM, Suravee Suthikulpanit wrote:
> On 9/15/20 6:25 PM, Maxim Levitsky wrote:
>> On Mon, 2020-09-14 at 21:48 +0700, Suravee Suthikulpanit wrote:
>>> Could you please try with the following patch instead?
>>>
>>> --- a/drivers/iommu/amd/iommu.c
>>> +++ b/drivers/iommu/amd/iommu.c
>>> @@ -3840,14 +3840,18 @@ int amd_iommu_activate_guest_mode(void *data)
>>>  {
>>>  	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
>>>  	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
>>> +	u64 valid;
>>>
>>>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>>>  	    !entry || entry->lo.fields_vapic.guest_mode)
>>>  		return 0;
>>>
>>> +	valid = entry->lo.fields_vapic.valid;
>>> +
>>>  	entry->lo.val = 0;
>>>  	entry->hi.val = 0;
>>>
>>> +	entry->lo.fields_vapic.valid = valid;
>>>  	entry->lo.fields_vapic.guest_mode = 1;
>>>  	entry->lo.fields_vapic.ga_log_intr = 1;
>>>  	entry->hi.fields.ga_root_ptr = ir_data->ga_root_ptr;
>>> @@ -3864,12 +3868,14 @@ int amd_iommu_deactivate_guest_mode(void *data)
>>>  	struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
>>>  	struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
>>>  	struct irq_cfg *cfg = ir_data->cfg;
>>> -	u64 valid = entry->lo.fields_remap.valid;
>>> +	u64 valid;
>>>
>>>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>>>  	    !entry || !entry->lo.fields_vapic.guest_mode)
>>>  		return 0;
>>>
>>> +	valid = entry->lo.fields_remap.valid;
>>> +
>>>  	entry->lo.val = 0;
>>>  	entry->hi.val = 0;
>> I see. I based my approach on the fact that valid bit was
>> set always to true anyway before, plus that amd_iommu_activate_guest_mode
>> should be really only called when someone activates a valid interrupt remapping
>> entry, but IMHO the approach of preserving the valid bit is safer anyway.
>>
>> It works on my system (I applied the patch manually, since either your or my email client
>> seems to mangle the patch)
>
> Sorry for the mangled patch. I'll submit the patch w/ your information.
> Thanks for your help reporting, debugging, and
> testing the patch.

I assume you're only doing the valid bit preservation in amd_iommu_activate_guest_mode()? The NULL deref fix in amd_iommu_deactivate_guest_mode() was fixed elsewhere[0], or are you planning on merging both changes like the diff you attached?

Asking also because commit 26e495f341 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE") was added in the v5.4 and v5.8 stable trees, but the v5.4 backport didn't include e52d58d54a321 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE").

Joao

[0] https://lore.kernel.org/linux-iommu/20200910171621.12879-1-joao.m.mart...@oracle.com/
[PATCH] dma-direct: Fix potential NULL pointer dereference
When booting the kernel v5.9-rc4 on a VM, the kernel would panic when printing a warning message in swiotlb_map(). It is because dev->dma_mask can potentially be a NULL pointer. Using the dma_get_mask() macro can avoid the NULL pointer dereference.

Fixes: d323bb44e4d2 ("drm/virtio: Call the right shmem helpers")

[drm] Initialized virtio_gpu 0.1.0 0 for virtio0 on minor 0
BUG: kernel NULL pointer dereference, address:
#PF: supervisor read access in kernel mode
#PF: error_code(0x) - not-present page
PGD 0 P4D 0
Oops: [#1] SMP PTI
CPU: 1 PID: 331 Comm: systemd-udevd Not tainted 5.9.0-rc4 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1 04/01/2014
RIP: 0010:swiotlb_map+0x1ac/0x200
Code: e8 d9 fc ff ff 80 3d 92 ee 4c 01 00 75 51 49 8b 84 24 48 02 00 00 4d 8b 6c 24 50 c6 05 7c ee 4c 01 01 4d 8b bc 24 58 02 00 00 <4c> 8b 30 4d 85 ed 75 04 4d 8b 2c 24 4c 89 e7 e8 10 6b 4f 00 4d 89
RSP: 0018:9f96801af6f8 EFLAGS: 00010246
RAX: RBX: 1000 RCX: 0080
RDX: 007f RSI: 0202 RDI: 0202
RBP: 9f96801af748 R08: R09: 0020
R10: R11: 8fabfffa3000 R12: 8faad02c7810
R13: R14: 0020 R15:
FS: 7fabc63588c0() GS:8fabf7c8() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: CR3: 000151496005 CR4: 00370ee0
DR0: DR1: DR2:
DR3: DR6: fffe0ff0 DR7: 0400
Call Trace:
 dma_direct_map_sg+0x124/0x210
 dma_map_sg_attrs+0x32/0x50
 drm_gem_shmem_get_pages_sgt+0x6a/0x90 [drm]
 virtio_gpu_object_create+0x140/0x2f0 [virtio_gpu]
 ? ww_mutex_unlock+0x26/0x30
 virtio_gpu_mode_dumb_create+0xab/0x160 [virtio_gpu]
 drm_mode_create_dumb+0x82/0x90 [drm]
 drm_client_framebuffer_create+0xaa/0x200 [drm]
 drm_fb_helper_generic_probe+0x59/0x150 [drm_kms_helper]
 drm_fb_helper_single_fb_probe+0x29e/0x3e0 [drm_kms_helper]
 __drm_fb_helper_initial_config_and_unlock+0x41/0xd0 [drm_kms_helper]
 drm_fbdev_client_hotplug+0xe6/0x1a0 [drm_kms_helper]
 drm_fbdev_generic_setup+0xaf/0x170 [drm_kms_helper]
 virtio_gpu_probe+0xea/0x100 [virtio_gpu]
 virtio_dev_probe+0x14b/0x1e0 [virtio]
 really_probe+0x1db/0x440
 driver_probe_device+0xe9/0x160
 device_driver_attach+0x5d/0x70
 __driver_attach+0x8f/0x150
 ? device_driver_attach+0x70/0x70
 bus_for_each_dev+0x7e/0xc0
 driver_attach+0x1e/0x20
 bus_add_driver+0x152/0x1f0
 driver_register+0x74/0xd0
 ? 0xc0529000
 register_virtio_driver+0x20/0x30 [virtio]
 virtio_gpu_driver_init+0x15/0x1000 [virtio_gpu]
 do_one_initcall+0x4a/0x1fa
 ? _cond_resched+0x19/0x30
 ? kmem_cache_alloc_trace+0x16b/0x2e0
 do_init_module+0x62/0x240
 load_module+0xe0e/0x1100
 ? security_kernel_post_read_file+0x5c/0x70
 __do_sys_finit_module+0xbe/0x120
 ? __do_sys_finit_module+0xbe/0x120
 __x64_sys_finit_module+0x1a/0x20
 do_syscall_64+0x38/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Thomas Tai
---
 include/linux/dma-direct.h | 2 +-
 kernel/dma/swiotlb.c       | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 6e87225..7556067 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -168,7 +168,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 		dev_WARN_ONCE(dev, 1,
 			     "DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
-			     &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
+			     &dma_addr, size, dma_get_mask(dev), dev->bus_dma_limit);
 		return DMA_MAPPING_ERROR;
 	}
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c19379fa..aa7727b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -682,7 +682,7 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
 				attrs | DMA_ATTR_SKIP_CPU_SYNC);
 		dev_WARN_ONCE(dev, 1,
 			"swiotlb addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
-			&dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
+			&dma_addr, size, dma_get_mask(dev), dev->bus_dma_limit);
 		return DMA_MAPPING_ERROR;
 	}
--
1.8.3.1
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On Tue, Sep 15, 2020 at 08:03:14AM -0600, Thomas Tai wrote:
> When booting the kernel v5.9-rc4 on a VM, the kernel would panic when
> printing a warning message in swiotlb_map(). It is because dev->dma_mask
> can potentially be a null pointer. Using the dma_get_mask() macro can
> avoid the NULL pointer dereference.

dma_mask must not be zero. This means drm is calling DMA API functions on something weird. This needs to be fixed in the caller.
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On Tue, Sep 15, 2020 at 04:07:19PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 15, 2020 at 08:03:14AM -0600, Thomas Tai wrote:
> > When booting the kernel v5.9-rc4 on a VM, the kernel would panic when
> > printing a warning message in swiotlb_map(). It is because dev->dma_mask
> > can potentially be a null pointer. Using the dma_get_mask() macro can
> > avoid the NULL pointer dereference.
>
> dma_mask must not be zero. This means drm is calling DMA API functions
> on something weird. This needs to be fixed in the caller.

s/zero/NULL/, but the point stands.
Re: [PATCH 07/17] 53c700: improve non-coherent DMA handling
On Tue, 2020-09-15 at 08:27 +0200, Christoph Hellwig wrote:
> On Mon, Sep 14, 2020 at 08:20:18AM -0700, James Bottomley wrote:
> > If you're going to change the macros from taking a device to taking
> > a hostdata structure then the descriptive argument name needs to
> > change ... it can't be dev anymore. I'm happy with it simply
> > becoming 'h' if hostdata is too long.
> >
> > I already asked for this on the first go around:
>
> And I did rename them, those hunks just accidentally slipped into
> patch 12 instead of this one. Fixed for the next versions.

Ah, yes, found it ... thanks for doing that!

James
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On 2020-09-15 10:07 a.m., Christoph Hellwig wrote:
> On Tue, Sep 15, 2020 at 08:03:14AM -0600, Thomas Tai wrote:
>> When booting the kernel v5.9-rc4 on a VM, the kernel would panic when
>> printing a warning message in swiotlb_map(). It is because dev->dma_mask
>> can potentially be a null pointer. Using the dma_get_mask() macro can
>> avoid the NULL pointer dereference.
>
> dma_mask must not be zero. This means drm is calling DMA API functions
> on something weird. This needs to be fixed in the caller.

Thanks, Christoph, for your comment. The caller already fixed the NULL pointer in the latest v5.9-rc5. I am thinking that if we had used dma_get_mask(), the kernel wouldn't have panicked and could have properly printed out the warning message.

Thomas
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On Tue, Sep 15, 2020 at 10:11:51AM -0400, Thomas Tai wrote:
> On 2020-09-15 10:07 a.m., Christoph Hellwig wrote:
>> On Tue, Sep 15, 2020 at 08:03:14AM -0600, Thomas Tai wrote:
>>> When booting the kernel v5.9-rc4 on a VM, the kernel would panic when
>>> printing a warning message in swiotlb_map(). It is because dev->dma_mask
>>> can potentially be a null pointer. Using the dma_get_mask() macro can
>>> avoid the NULL pointer dereference.
>>
>> dma_mask must not be zero. This means drm is calling DMA API functions
>> on something weird. This needs to be fixed in the caller.
>
> Thanks, Christoph for your comment. The caller already fixed the null
> pointer in the latest v5.9-rc5. I am thinking that if we had used the
> dma_get_mask(), the kernel couldn't panic and could properly print out the
> warning message.

If we want to solve this, something like this patch is probably the right way:

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 6e87225600ae35..064870844f06c1 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size,
 {
 	dma_addr_t end = addr + size - 1;

-	if (!dev->dma_mask)
-		return false;
-
 	if (is_ram && !IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) &&
 	    min(addr, end) < phys_to_dma(dev, PFN_PHYS(min_low_pfn)))
 		return false;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 0d129421e75fc8..2b01d8f7baf160 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -144,6 +144,10 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
 	dma_addr_t addr;

 	BUG_ON(!valid_dma_direction(dir));
+
+	if (WARN_ON_ONCE(!dev->dma_mask))
+		return DMA_MAPPING_ERROR;
+
 	if (dma_map_direct(dev, ops))
 		addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
 	else
@@ -179,6 +183,10 @@ int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg, int nents,
 	int ents;

 	BUG_ON(!valid_dma_direction(dir));
+
+	if (WARN_ON_ONCE(!dev->dma_mask))
+		return 0;
+
 	if (dma_map_direct(dev, ops))
 		ents = dma_direct_map_sg(dev, sg, nents, dir, attrs);
 	else
@@ -217,6 +225,9 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
 		return DMA_MAPPING_ERROR;

+	if (WARN_ON_ONCE(!dev->dma_mask))
+		return DMA_MAPPING_ERROR;
+
 	if (dma_map_direct(dev, ops))
 		addr = dma_direct_map_resource(dev, phys_addr, size, dir, attrs);
 	else if (ops->map_resource)
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Mon, Sep 14, 2020 at 04:33:10PM -0600, Alex Williamson wrote:
> Can you explain that further, or spit-ball what you think this /dev/sva
> interface looks like and how a user might interact between vfio and
> this new interface?

When you open it you get some container; inside the container the user can create PASIDs. PASIDs outside that container cannot be reached.

Creating a PASID, or the guest PASID range, would be the entry point for doing all the operations against a PASID or range that this patch series imagines:

- Map process VA mappings to the PASID's DMA virtual address space
- Catch faults
- Setup any special HW stuff like Intel's two level thing, ARM stuff, etc
- Expose resource controls, cgroup, whatever
- Migration special stuff (allocate fixed PASIDs)

A PASID is a handle for an IOMMU page table, and the tools to manipulate it. Within /dev/sva the page table is just 'floating' and not linked to any PCI functions.

The open /dev/sva FD holding the allocated PASIDs would be passed to a kernel driver. This is a security authorization that the specified PASID can be assigned to a PCI device by the kernel. At this point the kernel driver would have the IOMMU permit its bus/device/function to use the PASID. The PASID can be passed to multiple drivers of any driver flavour, so table re-use is possible. Now the IOMMU page table is linked to a device.

The kernel device driver would also do the device specific programming to setup the PASID in the device, attach it to some device object and expose the device for user DMA. For instance IDXD's char dev would map the queue memory and associate the PASID with that queue and setup the HW to be ready for the new enque instruction. The IDXD mdev would link to its emulated PCI BAR and ensure the guest can only use PASIDs included in the /dev/sva container.

The qemu control plane for vIOMMU related to PASID would run over /dev/sva.
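The open/allocate/authorize flow sketched above could be written out roughly as below. This is spit-balled pseudocode for an interface that does not exist: every path, ioctl name, and constant is invented purely for illustration.

```
/* Hypothetical userspace flow -- all names invented for illustration. */

sva_fd = open("/dev/sva", O_RDWR);                 /* PASID container       */
pasid  = ioctl(sva_fd, SVA_ALLOC_PASID, &args);    /* IOMMU page table hdl  */
ioctl(sva_fd, SVA_MAP, { pasid, va, iova, len });  /* populate the table    */

dev_fd = open("/dev/idxd/wq0", O_RDWR);            /* device char dev       */
ioctl(dev_fd, DEV_ATTACH_PASID, { sva_fd, pasid });/* authorize + link the  */
                                                   /* RID to the PASID      */
/* Driver programs the PASID into the queue; user DMA can begin.
 * A vfio-pci or vdpa device could attach to the same sva_fd/pasid,
 * which is the table re-use and shared-control-plane point above. */
```

The key property of the sketch is the two-step handoff: the /dev/sva FD proves the caller owns the PASID, and only then does a device driver wire it to hardware.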
I think the design could go further, where a 'PASID' is just an abstract idea of a page table; then vfio-pci could consume it too as an IOMMU page table handle even though there is no actual PASID. So qemu could end up with one API to universally control the vIOMMU, an API that can be shared between subsystems and is not tied to VFIO.

> allocating pasids and associating them with page tables for that
> two-stage IOMMU setup, performing cache invalidations based on page
> table updates, etc. How does it make more sense for a vIOMMU to
> setup some aspects of the IOMMU through vfio and others through a
> TBD interface?

vfio's IOMMU interface is about RID-based full device ownership and fixed mappings. PASID is about mediation, shared ownership and page faulting. Does PASID overlap with the existing IOMMU RID interface beyond the fact that both use the IOMMU?

> The IOMMU needs to allocate PASIDs, so in that sense it enforces a
> quota via the architectural limits, but is the IOMMU layer going to
> distinguish in-kernel versus user limits? A cgroup limit seems like a
> good idea, but that's not really at the IOMMU layer either and I don't
> see that a /dev/sva and vfio interface couldn't both support a cgroup
> type quota.

These are all good questions. PASID is new; this stuff needs to be sketched out more. A lot of in-kernel users of IOMMU PASID are probably going to be triggered by userspace actions. I think a cgroup quota would end up near the IOMMU layer, so vfio, sva, and any other driver char devs would all be restricted by the cgroup as peers.

> And it's not clear that they'll have compatible requirements. A
> userspace idxd driver might have limited needs versus a vIOMMU backend.
> Does a single quota model adequately support both or are we back to the
> differences between access to a device and ownership of a device?

At the end of the day a PASID is just a number, and the drivers' only use of it is to program it into HW.
All these other differences deal with the IOMMU side of the PASID, how pages are mapped into it, how page fault works, etc, etc. Keeping the two concerns separated seems very clean. A device driver shouldn't care how the PASID is setup. > > > This series is a blueprint within the context of the ownership and > > > permission model that VFIO already provides. It doesn't seem like we > > > can pluck that out on its own, nor is it necessarily the case that VFIO > > > wouldn't want to provide PASID services within its own API even if we > > > did have this undefined /dev/sva interface. > > > > I don't see what you do - VFIO does not own PASID, and in this > > vfio-mdev mode it does not own the PCI device/IOMMU either. So why > > would this need to be part of the VFIO ownership and permission model? > > Doesn't the PASID model essentially just augment the requester ID IOMMU > model so as to manage the IOVAs for a subdevice of a RID? I'd say not really.. PASID is very different from RID because PASID must always be mediated by the kernel. vfio-pci doesn't know how to use PASID because it doesn't k
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On 2020-09-15 10:26 a.m., Christoph Hellwig wrote: On Tue, Sep 15, 2020 at 10:11:51AM -0400, Thomas Tai wrote: On 2020-09-15 10:07 a.m., Christoph Hellwig wrote: On Tue, Sep 15, 2020 at 08:03:14AM -0600, Thomas Tai wrote: When booting the kernel v5.9-rc4 on a VM, the kernel would panic when printing a warning message in swiotlb_map(). It is because dev->dma_mask can potentially be a null pointer. Using the dma_get_mask() macro can avoid the NULL pointer dereference. dma_mask must not be zero. This means drm is calling DMA API functions on something weird. This needs to be fixed in the caller. Thanks, Christoph, for your comment. The caller already fixed the null pointer in the latest v5.9-rc5. I am thinking that if we had used dma_get_mask(), the kernel wouldn't have panicked and could have properly printed the warning message. If we want to solve this, something like this patch is probably the right way: diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h index 6e87225600ae35..064870844f06c1 100644 --- a/include/linux/dma-direct.h +++ b/include/linux/dma-direct.h @@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size, { dma_addr_t end = addr + size - 1; - if (!dev->dma_mask) - return false; - I am concerned that some drivers may rely on this NULL checking. Would you think we can keep this checking and use the following WARN_ON_ONCE()? 
if (is_ram && !IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) && min(addr, end) < phys_to_dma(dev, PFN_PHYS(min_low_pfn))) return false; diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index 0d129421e75fc8..2b01d8f7baf160 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -144,6 +144,10 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page, dma_addr_t addr; BUG_ON(!valid_dma_direction(dir)); + + if (WARN_ON_ONCE(!dev->dma_mask)) + return DMA_MAPPING_ERROR; + if (dma_map_direct(dev, ops)) addr = dma_direct_map_page(dev, page, offset, size, dir, attrs); else @@ -179,6 +183,10 @@ int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg, int nents, int ents; BUG_ON(!valid_dma_direction(dir)); + + if (WARN_ON_ONCE(!dev->dma_mask)) + return 0; + if (dma_map_direct(dev, ops)) ents = dma_direct_map_sg(dev, sg, nents, dir, attrs); else @@ -217,6 +225,9 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr, if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) return DMA_MAPPING_ERROR; + if (WARN_ON_ONCE(!dev->dma_mask)) + return DMA_MAPPING_ERROR; + if (dma_map_direct(dev, ops)) addr = dma_direct_map_resource(dev, phys_addr, size, dir, attrs); else if (ops->map_resource) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On Tue, Sep 15, 2020 at 10:40:39AM -0400, Thomas Tai wrote: >> +++ b/include/linux/dma-direct.h >> @@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, >> dma_addr_t addr, size_t size, >> { >> dma_addr_t end = addr + size - 1; >> - if (!dev->dma_mask) >> -return false; >> - > > I am concerned that some drivers may rely on this NULL checking. Would you > think we can keep this checking and use the following WARN_ON_ONCE()? dma_capable is not a helper for drivers, but just for dma-direct and related code. And this patch adds the checks for the three places where we call into the ->map* methods.
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On 2020-09-15 11:09 a.m., Christoph Hellwig wrote: On Tue, Sep 15, 2020 at 10:40:39AM -0400, Thomas Tai wrote: +++ b/include/linux/dma-direct.h @@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size, { dma_addr_t end = addr + size - 1; -if (!dev->dma_mask) - return false; - I am concerned that some drivers may rely on this NULL checking. Would you think we can keep this checking and use the following WARN_ON_ONCE()? dma_capable is not a helper for drivers, but just for dma-direct and related code. And this patch adds the checks for the three places where we call into the ->map* methods. Ok. That sounds good to me. I will make the suggested changes and run some tests before sending out the V2 patch. Thank you, Thomas
Re: [PATCH] irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module
On Mon 14 Sep 21:04 UTC 2020, John Stultz wrote: > Allows qcom-pdc driver to be loaded as a permanent module. > > An earlier version of this patch was merged in a larger patchset > but was reverted entirely when issues were found with other > drivers, so now that Marc has provided a better solution in his > Hybrid probing patch set, I wanted to re-submit this change. > Reviewed-by: Bjorn Andersson > Cc: Andy Gross > Cc: Bjorn Andersson > Cc: Joerg Roedel > Cc: Thomas Gleixner > Cc: Jason Cooper > Cc: Marc Zyngier > Cc: Linus Walleij > Cc: Maulik Shah > Cc: Lina Iyer > Cc: Saravana Kannan > Cc: Todd Kjos > Cc: Greg Kroah-Hartman > Cc: linux-arm-...@vger.kernel.org > Cc: iommu@lists.linux-foundation.org > Cc: linux-g...@vger.kernel.org > Signed-off-by: John Stultz > --- > drivers/irqchip/Kconfig| 2 +- > drivers/irqchip/qcom-pdc.c | 2 ++ > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig > index bfc9719dbcdc..bb70b7177f94 100644 > --- a/drivers/irqchip/Kconfig > +++ b/drivers/irqchip/Kconfig > @@ -425,7 +425,7 @@ config GOLDFISH_PIC > for Goldfish based virtual platforms. > > config QCOM_PDC > - bool "QCOM PDC" > + tristate "QCOM PDC" > depends on ARCH_QCOM > select IRQ_DOMAIN_HIERARCHY > help > diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c > index 8543fa23da10..59eb3c8473b0 100644 > --- a/drivers/irqchip/qcom-pdc.c > +++ b/drivers/irqchip/qcom-pdc.c > @@ -433,3 +433,5 @@ static int qcom_pdc_init(struct device_node *node, struct > device_node *parent) > IRQCHIP_HYBRID_DRIVER_BEGIN(qcom_pdc) > IRQCHIP_MATCH("qcom,pdc", qcom_pdc_init) > IRQCHIP_HYBRID_DRIVER_END(qcom_pdc) > +MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Power Domain Controller"); > +MODULE_LICENSE("GPL v2"); > -- > 2.17.1 >
a saner API for allocating DMA addressable pages v3
Hi all, this series replaces the DMA_ATTR_NON_CONSISTENT flag to dma_alloc_attrs with a separate new dma_alloc_pages API, which is available on all platforms. In addition to cleaning up the convoluted code path, this ensures that other drivers that have asked for better support for non-coherent DMA to pages without incurring bounce buffering can finally be properly supported. As a follow up I plan to move the implementation of the DMA_ATTR_NO_KERNEL_MAPPING flag over to this framework as well, given that it is also fundamentally a non-coherent allocation. The replacement for that flag would then return a struct page, as it is allowed to actually return pages without a kernel mapping as the name suggests (although most of the time they will actually have a kernel mapping..) In addition to the conversions of the existing non-coherent DMA users, I've also added a patch to convert the firewire ohci driver to use the new dma_alloc_pages API. The first patch is queued up for 5.9 in the media tree, but included here for completeness. A git tree is available here: git://git.infradead.org/users/hch/misc.git dma_alloc_pages Gitweb: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma_alloc_pages Changes since v2: - fix up the patch reshuffle which wasn't quite correct - fix up a few commit messages Changes since v1: - rebased on the latest dma-mapping tree, which merged many of the cleanups - fix an argument passing typo in 53c700, caught by sparse - rename a few macro arguments in 53c700 - pass the right device to the DMA API in the lib82596 drivers - fix memory ownership transfers in sgiseeq - better document what a page in the direct kernel mapping means - split into dma_alloc_pages that returns a struct page and is in the direct mapping vs dma_alloc_noncoherent that can be vmapped - convert the firewire ohci driver to dma_alloc_pages Diffstat:
[PATCH 01/18] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT flag
From: Sergey Senozhatsky The patch partially reverts some of the UAPI bits of the buffer cache management hints. Namely, the queue consistency (memory coherency) user-space hint because, as it turned out, the kernel implementation of this feature was misusing DMA_ATTR_NON_CONSISTENT. The patch reverts both kernel and user space parts: removes the DMA consistency attr functions, rolls back changes to v4l2_requestbuffers, v4l2_create_buffers structures and corresponding UAPI functions (plus compat32 layer) and cleans up the documentation. Signed-off-by: Christoph Hellwig Signed-off-by: Sergey Senozhatsky Signed-off-by: Christoph Hellwig --- .../userspace-api/media/v4l/buffer.rst| 17 --- .../media/v4l/vidioc-create-bufs.rst | 6 +-- .../media/v4l/vidioc-reqbufs.rst | 12 + .../media/common/videobuf2/videobuf2-core.c | 46 +++ .../common/videobuf2/videobuf2-dma-contig.c | 19 .../media/common/videobuf2/videobuf2-dma-sg.c | 3 +- .../media/common/videobuf2/videobuf2-v4l2.c | 18 +--- drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 10 +--- drivers/media/v4l2-core/v4l2-ioctl.c | 5 +- include/media/videobuf2-core.h| 7 +-- include/uapi/linux/videodev2.h| 13 +- 11 files changed, 22 insertions(+), 134 deletions(-) diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst index 57e752aaf414a7..2044ed13cd9d7d 100644 --- a/Documentation/userspace-api/media/v4l/buffer.rst +++ b/Documentation/userspace-api/media/v4l/buffer.rst @@ -701,23 +701,6 @@ Memory Consistency Flags :stub-columns: 0 :widths: 3 1 4 -* .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`: - - - ``V4L2_FLAG_MEMORY_NON_CONSISTENT`` - - 0x0001 - - A buffer is allocated either in consistent (it will be automatically - coherent between the CPU and the bus) or non-consistent memory. 
The - latter can provide performance gains, for instance the CPU cache - sync/flush operations can be avoided if the buffer is accessed by the - corresponding device only and the CPU does not read/write to/from that - buffer. However, this requires extra care from the driver -- it must - guarantee memory consistency by issuing a cache flush/sync when - consistency is needed. If this flag is set V4L2 will attempt to - allocate the buffer in non-consistent memory. The flag takes effect - only if the buffer is used for :ref:`memory mapping ` I/O and the - queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS - ` capability. - .. c:type:: v4l2_memory enum v4l2_memory diff --git a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst index f2a702870fadc1..12cf6b44f414f7 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst @@ -120,13 +120,9 @@ than the number requested. If you want to just query the capabilities without making any other changes, then set ``count`` to 0, ``memory`` to ``V4L2_MEMORY_MMAP`` and ``format.type`` to the buffer type. -* - __u32 - - ``flags`` - - Specifies additional buffer management attributes. - See :ref:`memory-flags`. * - __u32 - - ``reserved``\ [6] + - ``reserved``\ [7] - A place holder for future extensions. Drivers and applications must set the array to zero. diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst index 75d894d9c36c42..0e3e2fde65e850 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst @@ -112,17 +112,10 @@ aborting or finishing any DMA in progress, an implicit ``V4L2_MEMORY_MMAP`` and ``type`` set to the buffer type. 
This will free any previously allocated buffers, so this is typically something that will be done at the start of the application. -* - union { - - (anonymous) -* - __u32 - - ``flags`` - - Specifies additional buffer management attributes. - See :ref:`memory-flags`. * - __u32 - ``reserved``\ [1] - - Kept for backwards compatibility. Use ``flags`` instead. -* - } - - + - A place holder for future extensions. Drivers and applications + must set the array to zero. .. tabularcolumns:: |p{6.1cm}|p{2.2cm}|p{8.7cm}| @@ -169,7 +162,6 @@ aborting or finishing any DMA in progress, an implicit - This capability is set by the driver to indicate that the queue supports cache and memory management hints. However, it's only valid when the queue is used for :ref:`memory mapping ` streaming I/O. See -:ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT
[PATCH 02/18] mm: turn alloc_pages into an inline function
To prevent a compiler error when a method called alloc_pages is added (which I plan to do for the dma_map_ops). Signed-off-by: Christoph Hellwig --- include/linux/gfp.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 67a0774e080b98..dd2577c5407112 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -550,8 +550,10 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order, #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \ alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true) #else -#define alloc_pages(gfp_mask, order) \ - alloc_pages_node(numa_node_id(), gfp_mask, order) +static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order) +{ + return alloc_pages_node(numa_node_id(), gfp_mask, order); +} #define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\ alloc_pages(gfp_mask, order) #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \ -- 2.28.0
[PATCH 03/18] drm/exynos: stop setting DMA_ATTR_NON_CONSISTENT
DMA_ATTR_NON_CONSISTENT is a no-op except on PA-RISC and a few MIPS configs, so don't set it in this ARM specific driver. Signed-off-by: Christoph Hellwig --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c index efa476858db54b..07073222b8f691 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c @@ -42,8 +42,6 @@ static int exynos_drm_alloc_buf(struct exynos_drm_gem *exynos_gem, bool kvmap) if (exynos_gem->flags & EXYNOS_BO_WC || !(exynos_gem->flags & EXYNOS_BO_CACHABLE)) attr |= DMA_ATTR_WRITE_COMBINE; - else - attr |= DMA_ATTR_NON_CONSISTENT; /* FBDev emulation requires kernel mapping */ if (!kvmap) -- 2.28.0
[PATCH 04/18] drm/nouveau/gk20a: stop setting DMA_ATTR_NON_CONSISTENT
DMA_ATTR_NON_CONSISTENT is a no-op except on PA-RISC and a few MIPS configs, so don't set it in this ARM specific driver part. Signed-off-by: Christoph Hellwig --- drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c index 985f2990ab0dda..13d4d7ac0697b4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c @@ -594,8 +594,7 @@ gk20a_instmem_new(struct nvkm_device *device, int index, nvkm_info(&imem->base.subdev, "using IOMMU\n"); } else { - imem->attrs = DMA_ATTR_NON_CONSISTENT | - DMA_ATTR_WEAK_ORDERING | + imem->attrs = DMA_ATTR_WEAK_ORDERING | DMA_ATTR_WRITE_COMBINE; nvkm_info(&imem->base.subdev, "using DMA API\n"); -- 2.28.0
[PATCH 05/18] net/au1000-eth: stop using DMA_ATTR_NON_CONSISTENT
The au1000-eth driver contains none of the manual cache synchronization required for using DMA_ATTR_NON_CONSISTENT. From what I can tell it can be used on both dma coherent and non-coherent DMA platforms, but I suspect it has been buggy on the non-coherent platforms all along. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/amd/au1000_eth.c | 15 ++- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/amd/au1000_eth.c b/drivers/net/ethernet/amd/au1000_eth.c index 75dbd221dc594b..19e195420e2434 100644 --- a/drivers/net/ethernet/amd/au1000_eth.c +++ b/drivers/net/ethernet/amd/au1000_eth.c @@ -1131,10 +1131,9 @@ static int au1000_probe(struct platform_device *pdev) /* Allocate the data buffers * Snooping works fine with eth on all au1xxx */ - aup->vaddr = (u32)dma_alloc_attrs(&pdev->dev, MAX_BUF_SIZE * + aup->vaddr = (u32)dma_alloc_coherent(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS), - &aup->dma_addr, 0, - DMA_ATTR_NON_CONSISTENT); + &aup->dma_addr, 0); if (!aup->vaddr) { dev_err(&pdev->dev, "failed to allocate data buffers\n"); err = -ENOMEM; @@ -1310,9 +1309,8 @@ static int au1000_probe(struct platform_device *pdev) err_remap2: iounmap(aup->mac); err_remap1: - dma_free_attrs(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS), - (void *)aup->vaddr, aup->dma_addr, - DMA_ATTR_NON_CONSISTENT); + dma_free_coherent(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS), + (void *)aup->vaddr, aup->dma_addr); err_vaddr: free_netdev(dev); err_alloc: @@ -1344,9 +1342,8 @@ static int au1000_remove(struct platform_device *pdev) if (aup->tx_db_inuse[i]) au1000_ReleaseDB(aup, aup->tx_db_inuse[i]); - dma_free_attrs(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS), - (void *)aup->vaddr, aup->dma_addr, - DMA_ATTR_NON_CONSISTENT); + dma_free_coherent(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS), + (void *)aup->vaddr, aup->dma_addr); iounmap(aup->macdma); iounmap(aup->mac); -- 2.28.0
[PATCH 06/18] lib82596: move DMA allocation into the callers of i82596_probe
This allows us to get rid of the LIB82596_DMA_ATTR define and prepare for untangling the coherent vs non-coherent DMA allocation API. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/i825xx/lasi_82596.c | 24 ++-- drivers/net/ethernet/i825xx/lib82596.c | 36 drivers/net/ethernet/i825xx/sni_82596.c | 19 + 3 files changed, 40 insertions(+), 39 deletions(-) diff --git a/drivers/net/ethernet/i825xx/lasi_82596.c b/drivers/net/ethernet/i825xx/lasi_82596.c index aec7e98bcc853a..a12218e940a2fa 100644 --- a/drivers/net/ethernet/i825xx/lasi_82596.c +++ b/drivers/net/ethernet/i825xx/lasi_82596.c @@ -96,8 +96,6 @@ #define OPT_SWAP_PORT 0x0001 /* Need to wordswp on the MPU port */ -#define LIB82596_DMA_ATTR DMA_ATTR_NON_CONSISTENT - #define DMA_WBACK(ndev, addr, len) \ do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, DMA_TO_DEVICE); } while (0) @@ -155,7 +153,7 @@ lan_init_chip(struct parisc_device *dev) { struct net_device *netdevice; struct i596_private *lp; - int retval; + int retval = -ENOMEM; int i; if (!dev->irq) { @@ -186,12 +184,22 @@ lan_init_chip(struct parisc_device *dev) lp = netdev_priv(netdevice); lp->options = dev->id.sversion == 0x72 ? 
OPT_SWAP_PORT : 0; + lp->dma = dma_alloc_attrs(&dev->dev, sizeof(struct i596_dma), + &lp->dma_addr, GFP_KERNEL, + DMA_ATTR_NON_CONSISTENT); + if (!lp->dma) + goto out_free_netdev; retval = i82596_probe(netdevice); - if (retval) { - free_netdev(netdevice); - return -ENODEV; - } + if (retval) + goto out_free_dma; + return 0; + +out_free_dma: + dma_free_attrs(&dev->dev, sizeof(struct i596_dma), lp->dma, + lp->dma_addr, DMA_ATTR_NON_CONSISTENT); +out_free_netdev: + free_netdev(netdevice); return retval; } @@ -202,7 +210,7 @@ static int __exit lan_remove_chip(struct parisc_device *pdev) unregister_netdev (dev); dma_free_attrs(&pdev->dev, sizeof(struct i596_private), lp->dma, - lp->dma_addr, LIB82596_DMA_ATTR); + lp->dma_addr, DMA_ATTR_NON_CONSISTENT); free_netdev (dev); return 0; } diff --git a/drivers/net/ethernet/i825xx/lib82596.c b/drivers/net/ethernet/i825xx/lib82596.c index b03757e169e475..b4e4b3eb5758b5 100644 --- a/drivers/net/ethernet/i825xx/lib82596.c +++ b/drivers/net/ethernet/i825xx/lib82596.c @@ -1047,9 +1047,8 @@ static const struct net_device_ops i596_netdev_ops = { static int i82596_probe(struct net_device *dev) { - int i; struct i596_private *lp = netdev_priv(dev); - struct i596_dma *dma; + int ret; /* This lot is ensure things have been cache line aligned. 
*/ BUILD_BUG_ON(sizeof(struct i596_rfd) != 32); @@ -1063,41 +1062,28 @@ static int i82596_probe(struct net_device *dev) if (!dev->base_addr || !dev->irq) return -ENODEV; - dma = dma_alloc_attrs(dev->dev.parent, sizeof(struct i596_dma), - &lp->dma_addr, GFP_KERNEL, - LIB82596_DMA_ATTR); - if (!dma) { - printk(KERN_ERR "%s: Couldn't get shared memory\n", __FILE__); - return -ENOMEM; - } - dev->netdev_ops = &i596_netdev_ops; dev->watchdog_timeo = TX_TIMEOUT; - memset(dma, 0, sizeof(struct i596_dma)); - lp->dma = dma; - - dma->scb.command = 0; - dma->scb.cmd = I596_NULL; - dma->scb.rfd = I596_NULL; + memset(lp->dma, 0, sizeof(struct i596_dma)); + lp->dma->scb.command = 0; + lp->dma->scb.cmd = I596_NULL; + lp->dma->scb.rfd = I596_NULL; spin_lock_init(&lp->lock); - DMA_WBACK_INV(dev, dma, sizeof(struct i596_dma)); + DMA_WBACK_INV(dev, lp->dma, sizeof(struct i596_dma)); - i = register_netdev(dev); - if (i) { - dma_free_attrs(dev->dev.parent, sizeof(struct i596_dma), - dma, lp->dma_addr, LIB82596_DMA_ATTR); - return i; - } + ret = register_netdev(dev); + if (ret) + return ret; DEB(DEB_PROBE, printk(KERN_INFO "%s: 82596 at %#3lx, %pM IRQ %d.\n", dev->name, dev->base_addr, dev->dev_addr, dev->irq)); DEB(DEB_INIT, printk(KERN_INFO "%s: dma at 0x%p (%d bytes), lp->scb at 0x%p\n", -dev->name, dma, (int)sizeof(struct i596_dma), -&dma->scb)); +dev->name, lp->dma, (int)sizeof(struct i596_dma), +&lp->dma->scb)); return 0; } diff --git a/drivers/net/ethernet/i825xx/sni_82596.c b/drivers/n
[PATCH 07/18] 53c700: improve non-coherent DMA handling
Switch the 53c700 driver to only use non-coherent descriptor memory if it really has to because dma_alloc_coherent fails. This doesn't matter for any of the platforms it runs on currently, but that will change soon. To help with this two new helpers to transfer ownership to and from the device are added that abstract the syncing of the non-coherent memory. The two current bidirectional cases are mapped to transfers to the device, as that appears to what they are used for. Note that for parisc, which is the only architecture this driver needs to use non-coherent memory on, the direction argument of dma_cache_sync is ignored, so this will not change behavior in any way. Signed-off-by: Christoph Hellwig --- drivers/scsi/53c700.c | 113 +++--- drivers/scsi/53c700.h | 17 --- 2 files changed, 72 insertions(+), 58 deletions(-) diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index 84b57a8f86bfa9..c59226d7e2f6b5 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -269,6 +269,20 @@ NCR_700_get_SXFER(struct scsi_device *SDp) spi_period(SDp->sdev_target)); } +static inline void dma_sync_to_dev(struct NCR_700_Host_Parameters *h, + void *addr, size_t size) +{ + if (h->noncoherent) + dma_cache_sync(h->dev, addr, size, DMA_TO_DEVICE); +} + +static inline void dma_sync_from_dev(struct NCR_700_Host_Parameters *h, + void *addr, size_t size) +{ + if (h->noncoherent) + dma_cache_sync(h->dev, addr, size, DMA_FROM_DEVICE); +} + struct Scsi_Host * NCR_700_detect(struct scsi_host_template *tpnt, struct NCR_700_Host_Parameters *hostdata, struct device *dev) @@ -283,9 +297,13 @@ NCR_700_detect(struct scsi_host_template *tpnt, if(tpnt->sdev_attrs == NULL) tpnt->sdev_attrs = NCR_700_dev_attrs; - memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript, -GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); - if(memory == NULL) { + memory = dma_alloc_coherent(dev, TOTAL_MEM_SIZE, &pScript, GFP_KERNEL); + if (!memory) { + hostdata->noncoherent = 1; + memory = dma_alloc_attrs(dev, 
TOTAL_MEM_SIZE, &pScript, +GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); + } + if (!memory) { printk(KERN_ERR "53c700: Failed to allocate memory for driver, detaching\n"); return NULL; } @@ -339,11 +357,11 @@ NCR_700_detect(struct scsi_host_template *tpnt, for (j = 0; j < PATCHES; j++) script[LABELPATCHES[j]] = bS_to_host(pScript + SCRIPT[LABELPATCHES[j]]); /* now patch up fixed addresses. */ - script_patch_32(hostdata->dev, script, MessageLocation, + script_patch_32(hostdata, script, MessageLocation, pScript + MSGOUT_OFFSET); - script_patch_32(hostdata->dev, script, StatusAddress, + script_patch_32(hostdata, script, StatusAddress, pScript + STATUS_OFFSET); - script_patch_32(hostdata->dev, script, ReceiveMsgAddress, + script_patch_32(hostdata, script, ReceiveMsgAddress, pScript + MSGIN_OFFSET); hostdata->script = script; @@ -395,8 +413,12 @@ NCR_700_release(struct Scsi_Host *host) struct NCR_700_Host_Parameters *hostdata = (struct NCR_700_Host_Parameters *)host->hostdata[0]; - dma_free_attrs(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script, - hostdata->pScript, DMA_ATTR_NON_CONSISTENT); + if (hostdata->noncoherent) + dma_free_attrs(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script, + hostdata->pScript, DMA_ATTR_NON_CONSISTENT); + else + dma_free_coherent(hostdata->dev, TOTAL_MEM_SIZE, + hostdata->script, hostdata->pScript); return 1; } @@ -804,8 +826,8 @@ process_extended_message(struct Scsi_Host *host, shost_printk(KERN_WARNING, host, "Unexpected SDTR msg\n"); hostdata->msgout[0] = A_REJECT_MSG; - dma_cache_sync(hostdata->dev, hostdata->msgout, 1, DMA_TO_DEVICE); - script_patch_16(hostdata->dev, hostdata->script, + dma_sync_to_dev(hostdata, hostdata->msgout, 1); + script_patch_16(hostdata, hostdata->script, MessageCount, 1); /* SendMsgOut returns, so set up the return * address */ @@ -817,9 +839,8 @@ process_extended_message(struct Scsi_Host *host, printk(KERN_INFO "scsi%d: (%d:%d), Unsolicited WDTR after CMD, Rejecting\n", host->host_no, pun, lun); hostdata->msgout[
[PATCH 08/18] dma-mapping: add a new dma_alloc_noncoherent API
Add a new API to allocate and free memory that is guaranteed to be addressable by a device, but which potentially is not cache coherent for DMA. To transfer ownership to and from the device, the existing streaming DMA API calls dma_sync_single_for_device and dma_sync_single_for_cpu must be used. For now the new calls are implemented on top of dma_alloc_attrs just like the old-noncoherent API, but once all drivers are switched to the new API it will be replaced with a better working implementation that is available on all architectures. Signed-off-by: Christoph Hellwig --- Documentation/core-api/dma-api.rst | 75 ++ include/linux/dma-mapping.h| 12 + 2 files changed, 48 insertions(+), 39 deletions(-) diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst index 90239348b30f6f..ea0413276ddb70 100644 --- a/Documentation/core-api/dma-api.rst +++ b/Documentation/core-api/dma-api.rst @@ -516,48 +516,56 @@ routines, e.g.::: } -Part II - Advanced dma usage - +Part II - Non-coherent DMA allocations +-- -Warning: These pieces of the DMA API should not be used in the -majority of cases, since they cater for unlikely corner cases that -don't belong in usual drivers. +These APIs allow to allocate pages in the kernel direct mapping that are +guaranteed to be DMA addressable. This means that unlike dma_alloc_coherent, +virt_to_page can be called on the resulting address, and the resulting +struct page can be used for everything a struct page is suitable for. -If you don't understand how cache line coherency works between a -processor and an I/O device, you should not be using this part of the -API at all. +If you don't understand how cache line coherency works between a processor and +an I/O device, you should not be using this part of the API. 
:: void * - dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle, - gfp_t flag, unsigned long attrs) + dma_alloc_noncoherent(struct device *dev, size_t size, + dma_addr_t *dma_handle, enum dma_data_direction dir, + gfp_t gfp) -Identical to dma_alloc_coherent() except that when the -DMA_ATTR_NON_CONSISTENT flags is passed in the attrs argument, the -platform will choose to return either consistent or non-consistent memory -as it sees fit. By using this API, you are guaranteeing to the platform -that you have all the correct and necessary sync points for this memory -in the driver should it choose to return non-consistent memory. +This routine allocates a region of <size> bytes of consistent memory. It +returns a pointer to the allocated region (in the processor's virtual address +space) or NULL if the allocation failed. The returned memory may or may not +be in the kernel's direct mapping. Drivers must not call virt_to_page on +the returned memory region. -Note: where the platform can return consistent memory, it will -guarantee that the sync points become nops. +It also returns a <dma_handle> which may be cast to an unsigned integer the +same width as the bus and given to the device as the DMA address base of +the region. -Warning: Handling non-consistent memory is a real pain. You should -only use this API if you positively know your driver will be -required to work on one of the rare (usually non-PCI) architectures -that simply cannot make consistent memory. +The dir parameter specifies if data is read and/or written by the device, +see dma_map_single() for details. + +The gfp parameter allows the caller to specify the ``GFP_`` flags (see +kmalloc()) for the allocation, but rejects flags used to specify a memory +zone such as GFP_DMA or GFP_HIGHMEM. 
+ +Before giving the memory to the device, dma_sync_single_for_device() needs +to be called, and before reading memory written by the device, +dma_sync_single_for_cpu(), just like for streaming DMA mappings that are +reused. :: void - dma_free_attrs(struct device *dev, size_t size, void *cpu_addr, - dma_addr_t dma_handle, unsigned long attrs) + dma_free_noncoherent(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t dma_handle, enum dma_data_direction dir) -Free memory allocated by the dma_alloc_attrs(). All common -parameters must be identical to those otherwise passed to dma_free_coherent, -and the attrs argument must be identical to the attrs passed to -dma_alloc_attrs(). +Free a region of memory previously allocated using dma_alloc_noncoherent(). +dev, size and dma_handle and dir must all be the same as those passed into +dma_alloc_noncoherent(). cpu_addr must be the virtual address returned by +the dma_alloc_noncoherent(). :: @@ -575,17 +583,6 @@ memory or doing partial flushes. into the width returned by this call. It will also always be a power of two for easy alignment
[PATCH 09/18] sgiwd93: convert to dma_alloc_noncoherent
Use the new non-coherent DMA API including proper ownership transfers. This also means we can allocate the memory as DMA_TO_DEVICE instead of bidirectional. Signed-off-by: Christoph Hellwig --- drivers/scsi/sgiwd93.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/sgiwd93.c b/drivers/scsi/sgiwd93.c index 3bdf0deb8f1529..cf1030c9dda17f 100644 --- a/drivers/scsi/sgiwd93.c +++ b/drivers/scsi/sgiwd93.c @@ -95,7 +95,7 @@ void fill_hpc_entries(struct ip22_hostdata *hd, struct scsi_cmnd *cmd, int din) */ hcp->desc.pbuf = 0; hcp->desc.cntinfo = HPCDMA_EOX; - dma_cache_sync(hd->dev, hd->cpu, + dma_sync_single_for_device(hd->dev, hd->dma, (unsigned long)(hcp + 1) - (unsigned long)hd->cpu, DMA_TO_DEVICE); } @@ -234,8 +234,8 @@ static int sgiwd93_probe(struct platform_device *pdev) hdata = host_to_hostdata(host); hdata->dev = &pdev->dev; - hdata->cpu = dma_alloc_attrs(&pdev->dev, HPC_DMA_SIZE, &hdata->dma, -GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); + hdata->cpu = dma_alloc_noncoherent(&pdev->dev, HPC_DMA_SIZE, + &hdata->dma, DMA_TO_DEVICE, GFP_KERNEL); if (!hdata->cpu) { printk(KERN_WARNING "sgiwd93: Could not allocate memory for " "host %d buffer.\n", unit); @@ -274,8 +274,8 @@ static int sgiwd93_probe(struct platform_device *pdev) out_irq: free_irq(irq, host); out_free: - dma_free_attrs(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma, - DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma, + DMA_TO_DEVICE); out_put: scsi_host_put(host); out: @@ -291,8 +291,8 @@ static int sgiwd93_remove(struct platform_device *pdev) scsi_remove_host(host); free_irq(pd->irq, host); - dma_free_attrs(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma, - DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma, + DMA_TO_DEVICE); scsi_host_put(host); return 0; } -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org 
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 10/18] hal2: convert to dma_alloc_noncoherent
Use the new non-coherent DMA API including proper ownership transfers. This also means we can allocate the buffer memory with the proper direction instead of bidirectional. Signed-off-by: Christoph Hellwig --- sound/mips/hal2.c | 58 ++- 1 file changed, 27 insertions(+), 31 deletions(-) diff --git a/sound/mips/hal2.c b/sound/mips/hal2.c index ec84bc4c3a6e77..9ac9b58d7c8cdd 100644 --- a/sound/mips/hal2.c +++ b/sound/mips/hal2.c @@ -441,7 +441,8 @@ static inline void hal2_stop_adc(struct snd_hal2 *hal2) hal2->adc.pbus.pbus->pbdma_ctrl = HPC3_PDMACTRL_LD; } -static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec) +static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec, + enum dma_data_direction buffer_dir) { struct device *dev = hal2->card->dev; struct hal2_desc *desc; @@ -449,15 +450,15 @@ static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec) int count = H2_BUF_SIZE / H2_BLOCK_SIZE; int i; - codec->buffer = dma_alloc_attrs(dev, H2_BUF_SIZE, &buffer_dma, - GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); + codec->buffer = dma_alloc_noncoherent(dev, H2_BUF_SIZE, &buffer_dma, + buffer_dir, GFP_KERNEL); if (!codec->buffer) return -ENOMEM; - desc = dma_alloc_attrs(dev, count * sizeof(struct hal2_desc), - &desc_dma, GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); + desc = dma_alloc_noncoherent(dev, count * sizeof(struct hal2_desc), + &desc_dma, DMA_BIDIRECTIONAL, GFP_KERNEL); if (!desc) { - dma_free_attrs(dev, H2_BUF_SIZE, codec->buffer, buffer_dma, - DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(dev, H2_BUF_SIZE, codec->buffer, buffer_dma, + buffer_dir); return -ENOMEM; } codec->buffer_dma = buffer_dma; @@ -470,20 +471,22 @@ static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec) desc_dma : desc_dma + (i + 1) * sizeof(struct hal2_desc); desc++; } - dma_cache_sync(dev, codec->desc, count * sizeof(struct hal2_desc), - DMA_TO_DEVICE); + dma_sync_single_for_device(dev, codec->desc_dma, + count * 
sizeof(struct hal2_desc), + DMA_BIDIRECTIONAL); codec->desc_count = count; return 0; } -static void hal2_free_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec) +static void hal2_free_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec, + enum dma_data_direction buffer_dir) { struct device *dev = hal2->card->dev; - dma_free_attrs(dev, codec->desc_count * sizeof(struct hal2_desc), - codec->desc, codec->desc_dma, DMA_ATTR_NON_CONSISTENT); - dma_free_attrs(dev, H2_BUF_SIZE, codec->buffer, codec->buffer_dma, - DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(dev, codec->desc_count * sizeof(struct hal2_desc), + codec->desc, codec->desc_dma, DMA_BIDIRECTIONAL); + dma_free_noncoherent(dev, H2_BUF_SIZE, codec->buffer, codec->buffer_dma, + buffer_dir); } static const struct snd_pcm_hardware hal2_pcm_hw = { @@ -509,21 +512,16 @@ static int hal2_playback_open(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct snd_hal2 *hal2 = snd_pcm_substream_chip(substream); - int err; runtime->hw = hal2_pcm_hw; - - err = hal2_alloc_dmabuf(hal2, &hal2->dac); - if (err) - return err; - return 0; + return hal2_alloc_dmabuf(hal2, &hal2->dac, DMA_TO_DEVICE); } static int hal2_playback_close(struct snd_pcm_substream *substream) { struct snd_hal2 *hal2 = snd_pcm_substream_chip(substream); - hal2_free_dmabuf(hal2, &hal2->dac); + hal2_free_dmabuf(hal2, &hal2->dac, DMA_TO_DEVICE); return 0; } @@ -579,7 +577,9 @@ static void hal2_playback_transfer(struct snd_pcm_substream *substream, unsigned char *buf = hal2->dac.buffer + rec->hw_data; memcpy(buf, substream->runtime->dma_area + rec->sw_data, bytes); - dma_cache_sync(hal2->card->dev, buf, bytes, DMA_TO_DEVICE); + dma_sync_single_for_device(hal2->card->dev, + hal2->dac.buffer_dma + rec->hw_data, bytes, + DMA_TO_DEVICE); } @@ -597,22 +597,16 @@ static int hal2_capture_open(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct snd_hal2 *hal2 = 
snd_pcm_substream_chip(substream); - struct hal2_codec *adc = &hal2->adc; - int err
[PATCH 11/18] lib82596: convert to dma_alloc_noncoherent
Use the new non-coherent DMA API including proper ownership transfers. This includes moving the DMA helpers to lib82596 based on an ifdef to avoid include order problems. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/i825xx/lasi_82596.c | 25 ++--- drivers/net/ethernet/i825xx/lib82596.c | 114 ++- drivers/net/ethernet/i825xx/sni_82596.c | 4 - 3 files changed, 80 insertions(+), 63 deletions(-) diff --git a/drivers/net/ethernet/i825xx/lasi_82596.c b/drivers/net/ethernet/i825xx/lasi_82596.c index a12218e940a2fa..96c6f4f36904ed 100644 --- a/drivers/net/ethernet/i825xx/lasi_82596.c +++ b/drivers/net/ethernet/i825xx/lasi_82596.c @@ -96,21 +96,14 @@ #define OPT_SWAP_PORT 0x0001 /* Need to wordswp on the MPU port */ -#define DMA_WBACK(ndev, addr, len) \ - do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, DMA_TO_DEVICE); } while (0) - -#define DMA_INV(ndev, addr, len) \ - do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, DMA_FROM_DEVICE); } while (0) - -#define DMA_WBACK_INV(ndev, addr, len) \ - do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, DMA_BIDIRECTIONAL); } while (0) - #define SYSBUS 0x006c /* big endian CPU, 82596 "big" endian mode */ #define SWAP32(x) (((u32)(x)<<16) | ((((u32)(x)))>>16)) #define SWAP16(x) (x) +#define NONCOHERENT_DMA 1 + #include "lib82596.c" MODULE_AUTHOR("Richard Hirst"); @@ -184,9 +177,9 @@ lan_init_chip(struct parisc_device *dev) lp = netdev_priv(netdevice); lp->options = dev->id.sversion == 0x72 ? 
OPT_SWAP_PORT : 0; - lp->dma = dma_alloc_attrs(&dev->dev, sizeof(struct i596_dma), - &lp->dma_addr, GFP_KERNEL, - DMA_ATTR_NON_CONSISTENT); + lp->dma = dma_alloc_noncoherent(&dev->dev, + sizeof(struct i596_dma), &lp->dma_addr, + DMA_BIDIRECTIONAL, GFP_KERNEL); if (!lp->dma) goto out_free_netdev; @@ -196,8 +189,8 @@ lan_init_chip(struct parisc_device *dev) return 0; out_free_dma: - dma_free_attrs(&dev->dev, sizeof(struct i596_dma), lp->dma, - lp->dma_addr, DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(&dev->dev, sizeof(struct i596_dma), + lp->dma, lp->dma_addr, DMA_BIDIRECTIONAL); out_free_netdev: free_netdev(netdevice); return retval; @@ -209,8 +202,8 @@ static int __exit lan_remove_chip(struct parisc_device *pdev) struct i596_private *lp = netdev_priv(dev); unregister_netdev (dev); - dma_free_attrs(&pdev->dev, sizeof(struct i596_private), lp->dma, - lp->dma_addr, DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(&pdev->dev, sizeof(struct i596_private), lp->dma, + lp->dma_addr, DMA_BIDIRECTIONAL); free_netdev (dev); return 0; } diff --git a/drivers/net/ethernet/i825xx/lib82596.c b/drivers/net/ethernet/i825xx/lib82596.c index b4e4b3eb5758b5..ca2fb303fcc6f6 100644 --- a/drivers/net/ethernet/i825xx/lib82596.c +++ b/drivers/net/ethernet/i825xx/lib82596.c @@ -365,13 +365,44 @@ static int max_cmd_backlog = TX_RING_SIZE-1; static void i596_poll_controller(struct net_device *dev); #endif +static inline dma_addr_t virt_to_dma(struct i596_private *lp, volatile void *v) +{ + return lp->dma_addr + ((unsigned long)v - (unsigned long)lp->dma); +} + +#ifdef NONCOHERENT_DMA +static inline void dma_sync_dev(struct net_device *ndev, volatile void *addr, + size_t len) +{ + dma_sync_single_for_device(ndev->dev.parent, + virt_to_dma(netdev_priv(ndev), addr), len, + DMA_BIDIRECTIONAL); +} + +static inline void dma_sync_cpu(struct net_device *ndev, volatile void *addr, + size_t len) +{ + dma_sync_single_for_cpu(ndev->dev.parent, + virt_to_dma(netdev_priv(ndev), addr), len, + 
DMA_BIDIRECTIONAL); +} +#else +static inline void dma_sync_dev(struct net_device *ndev, volatile void *addr, + size_t len) +{ +} +static inline void dma_sync_cpu(struct net_device *ndev, volatile void *addr, + size_t len) +{ +} +#endif /* NONCOHERENT_DMA */ static inline int wait_istat(struct net_device *dev, struct i596_dma *dma, int delcnt, char *str) { - DMA_INV(dev, &(dma->iscp), sizeof(struct i596_iscp)); + dma_sync_cpu(dev, &(dma->iscp), sizeof(struct i596_iscp)); while (--delcnt && dma->iscp.stat) { udelay(10); - DMA_INV(dev, &(dma->iscp), sizeof(struct i596_iscp)); + dma_sync_cpu(dev, &(dma->iscp), sizeof(struct i596_iscp)); } if (!delcnt) { printk(KERN_ERR "%s: %s, iscp.stat %04x, didn't clear\n", @@ -384,10 +415,10 @@ static inline int wait_istat(struct net_device *dev, struct i596_dma *dma, int d static inline int wa
[PATCH 12/18] sgiseeq: convert to dma_alloc_noncoherent
Use the new non-coherent DMA API including proper ownership transfers. This includes adding additional calls to dma_sync_desc_dev as the old syncing was rather ad-hoc. Thanks to Thomas Bogendoerfer for debugging the ownership transfer issues. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/seeq/sgiseeq.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/seeq/sgiseeq.c b/drivers/net/ethernet/seeq/sgiseeq.c index 8507ff2420143a..37ff25a84030eb 100644 --- a/drivers/net/ethernet/seeq/sgiseeq.c +++ b/drivers/net/ethernet/seeq/sgiseeq.c @@ -112,14 +112,18 @@ struct sgiseeq_private { static inline void dma_sync_desc_cpu(struct net_device *dev, void *addr) { - dma_cache_sync(dev->dev.parent, addr, sizeof(struct sgiseeq_rx_desc), - DMA_FROM_DEVICE); + struct sgiseeq_private *sp = netdev_priv(dev); + + dma_sync_single_for_cpu(dev->dev.parent, VIRT_TO_DMA(sp, addr), + sizeof(struct sgiseeq_rx_desc), DMA_BIDIRECTIONAL); } static inline void dma_sync_desc_dev(struct net_device *dev, void *addr) { - dma_cache_sync(dev->dev.parent, addr, sizeof(struct sgiseeq_rx_desc), - DMA_TO_DEVICE); + struct sgiseeq_private *sp = netdev_priv(dev); + + dma_sync_single_for_device(dev->dev.parent, VIRT_TO_DMA(sp, addr), + sizeof(struct sgiseeq_rx_desc), DMA_BIDIRECTIONAL); } static inline void hpc3_eth_reset(struct hpc3_ethregs *hregs) @@ -403,6 +407,8 @@ static inline void sgiseeq_rx(struct net_device *dev, struct sgiseeq_private *sp rd = &sp->rx_desc[sp->rx_new]; dma_sync_desc_cpu(dev, rd); } + dma_sync_desc_dev(dev, rd); + dma_sync_desc_cpu(dev, &sp->rx_desc[orig_end]); sp->rx_desc[orig_end].rdma.cntinfo &= ~(HPCDMA_EOR); dma_sync_desc_dev(dev, &sp->rx_desc[orig_end]); @@ -443,6 +449,7 @@ static inline void kick_tx(struct net_device *dev, dma_sync_desc_cpu(dev, td); } if (td->tdma.cntinfo & HPCDMA_XIU) { + dma_sync_desc_dev(dev, td); hregs->tx_ndptr = VIRT_TO_DMA(sp, td); hregs->tx_ctrl = HPC3_ETXCTRL_ACTIVE; } @@ -476,6 +483,7 @@ static 
inline void sgiseeq_tx(struct net_device *dev, struct sgiseeq_private *sp if (!(td->tdma.cntinfo & (HPCDMA_XIU))) break; if (!(td->tdma.cntinfo & (HPCDMA_ETXD))) { + dma_sync_desc_dev(dev, td); if (!(status & HPC3_ETXCTRL_ACTIVE)) { hregs->tx_ndptr = VIRT_TO_DMA(sp, td); hregs->tx_ctrl = HPC3_ETXCTRL_ACTIVE; @@ -740,8 +748,8 @@ static int sgiseeq_probe(struct platform_device *pdev) sp = netdev_priv(dev); /* Make private data page aligned */ - sr = dma_alloc_attrs(&pdev->dev, sizeof(*sp->srings), &sp->srings_dma, -GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); + sr = dma_alloc_noncoherent(&pdev->dev, sizeof(*sp->srings), + &sp->srings_dma, DMA_BIDIRECTIONAL, GFP_KERNEL); if (!sr) { printk(KERN_ERR "Sgiseeq: Page alloc failed, aborting.\n"); err = -ENOMEM; @@ -802,8 +810,8 @@ static int sgiseeq_probe(struct platform_device *pdev) return 0; err_out_free_attrs: - dma_free_attrs(&pdev->dev, sizeof(*sp->srings), sp->srings, - sp->srings_dma, DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(&pdev->dev, sizeof(*sp->srings), sp->srings, + sp->srings_dma, DMA_BIDIRECTIONAL); err_out_free_dev: free_netdev(dev); @@ -817,8 +825,8 @@ static int sgiseeq_remove(struct platform_device *pdev) struct sgiseeq_private *sp = netdev_priv(dev); unregister_netdev(dev); - dma_free_attrs(&pdev->dev, sizeof(*sp->srings), sp->srings, - sp->srings_dma, DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(&pdev->dev, sizeof(*sp->srings), sp->srings, + sp->srings_dma, DMA_BIDIRECTIONAL); free_netdev(dev); return 0; -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 13/18] 53c700: convert to dma_alloc_noncoherent
Use the new non-coherent DMA API including proper ownership transfers. Signed-off-by: Christoph Hellwig --- drivers/scsi/53c700.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index c59226d7e2f6b5..5117d90ccd9edf 100644 --- a/drivers/scsi/53c700.c +++ b/drivers/scsi/53c700.c @@ -269,18 +269,25 @@ NCR_700_get_SXFER(struct scsi_device *SDp) spi_period(SDp->sdev_target)); } +static inline dma_addr_t virt_to_dma(struct NCR_700_Host_Parameters *h, void *p) +{ + return h->pScript + ((uintptr_t)p - (uintptr_t)h->script); +} + static inline void dma_sync_to_dev(struct NCR_700_Host_Parameters *h, void *addr, size_t size) { if (h->noncoherent) - dma_cache_sync(h->dev, addr, size, DMA_TO_DEVICE); + dma_sync_single_for_device(h->dev, virt_to_dma(h, addr), + size, DMA_BIDIRECTIONAL); } static inline void dma_sync_from_dev(struct NCR_700_Host_Parameters *h, void *addr, size_t size) { if (h->noncoherent) - dma_cache_sync(h->dev, addr, size, DMA_FROM_DEVICE); + dma_sync_single_for_device(h->dev, virt_to_dma(h, addr), size, + DMA_BIDIRECTIONAL); } struct Scsi_Host * @@ -300,8 +307,8 @@ NCR_700_detect(struct scsi_host_template *tpnt, memory = dma_alloc_coherent(dev, TOTAL_MEM_SIZE, &pScript, GFP_KERNEL); if (!memory) { hostdata->noncoherent = 1; - memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript, -GFP_KERNEL, DMA_ATTR_NON_CONSISTENT); + memory = dma_alloc_noncoherent(dev, TOTAL_MEM_SIZE, &pScript, +DMA_BIDIRECTIONAL, GFP_KERNEL); } if (!memory) { printk(KERN_ERR "53c700: Failed to allocate memory for driver, detaching\n"); @@ -414,8 +421,9 @@ NCR_700_release(struct Scsi_Host *host) (struct NCR_700_Host_Parameters *)host->hostdata[0]; if (hostdata->noncoherent) - dma_free_attrs(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script, - hostdata->pScript, DMA_ATTR_NON_CONSISTENT); + dma_free_noncoherent(hostdata->dev, TOTAL_MEM_SIZE, + hostdata->script, hostdata->pScript, + DMA_BIDIRECTIONAL); else 
dma_free_coherent(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script, hostdata->pScript); -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 14/18] dma-mapping: remove dma_cache_sync
All users are gone now, remove the API. Signed-off-by: Christoph Hellwig --- arch/mips/Kconfig | 1 - arch/mips/jazz/jazzdma.c| 1 - arch/mips/mm/dma-noncoherent.c | 6 -- arch/parisc/Kconfig | 1 - arch/parisc/kernel/pci-dma.c| 6 -- include/linux/dma-mapping.h | 8 include/linux/dma-noncoherent.h | 10 -- kernel/dma/Kconfig | 3 --- kernel/dma/mapping.c| 14 -- 9 files changed, 50 deletions(-) diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index c95fa3a2484cf0..1be91c5d666e61 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -1134,7 +1134,6 @@ config DMA_NONCOHERENT select ARCH_HAS_SYNC_DMA_FOR_DEVICE select ARCH_HAS_DMA_SET_UNCACHED select DMA_NONCOHERENT_MMAP - select DMA_NONCOHERENT_CACHE_SYNC select NEED_DMA_MAP_STATE config SYS_HAS_EARLY_PRINTK diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c index dab4d058cea9b1..2bf849caf507b1 100644 --- a/arch/mips/jazz/jazzdma.c +++ b/arch/mips/jazz/jazzdma.c @@ -620,7 +620,6 @@ const struct dma_map_ops jazz_dma_ops = { .sync_single_for_device = jazz_dma_sync_single_for_device, .sync_sg_for_cpu= jazz_dma_sync_sg_for_cpu, .sync_sg_for_device = jazz_dma_sync_sg_for_device, - .cache_sync = arch_dma_cache_sync, .mmap = dma_common_mmap, .get_sgtable= dma_common_get_sgtable, }; diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c index 97a14adbafc99c..f34ad1f09799f1 100644 --- a/arch/mips/mm/dma-noncoherent.c +++ b/arch/mips/mm/dma-noncoherent.c @@ -137,12 +137,6 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, } #endif -void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size, - enum dma_data_direction direction) -{ - dma_sync_virt_for_device(vaddr, size, direction); -} - #ifdef CONFIG_DMA_PERDEV_COHERENT void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size, const struct iommu_ops *iommu, bool coherent) diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig index 3b0f53dd70bc9b..ed15da1da174e0 100644 --- a/arch/parisc/Kconfig +++ 
b/arch/parisc/Kconfig @@ -195,7 +195,6 @@ config PA11 depends on PA7000 || PA7100LC || PA7200 || PA7300LC select ARCH_HAS_SYNC_DMA_FOR_CPU select ARCH_HAS_SYNC_DMA_FOR_DEVICE - select DMA_NONCOHERENT_CACHE_SYNC config PREFETCH def_bool y diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c index 38c68e131bbe2a..ce38c0b9158125 100644 --- a/arch/parisc/kernel/pci-dma.c +++ b/arch/parisc/kernel/pci-dma.c @@ -454,9 +454,3 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size, { flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size); } - -void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size, - enum dma_data_direction direction) -{ - flush_kernel_dcache_range((unsigned long)vaddr, size); -} diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 4e1de194b45cbf..5b4e97b0846fd3 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -123,8 +123,6 @@ struct dma_map_ops { void (*sync_sg_for_device)(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir); - void (*cache_sync)(struct device *dev, void *vaddr, size_t size, - enum dma_data_direction direction); int (*dma_supported)(struct device *dev, u64 mask); u64 (*get_required_mask)(struct device *dev); size_t (*max_mapping_size)(struct device *dev); @@ -254,8 +252,6 @@ void *dmam_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs); void dmam_free_coherent(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle); -void dma_cache_sync(struct device *dev, void *vaddr, size_t size, - enum dma_data_direction dir); int dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs); @@ -339,10 +335,6 @@ static inline void dmam_free_coherent(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle) { } -static inline void dma_cache_sync(struct device *dev, void 
*vaddr, size_t size, - enum dma_data_direction dir) -{ -} static inline int dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs) diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h index b9bc6c557ea46f..0888656369a45b 100644 --- a/include/linux/dma-nonc
[PATCH 15/18] dma-mapping: add a new dma_alloc_pages API
This API is the equivalent of alloc_pages, except that the returned memory is guaranteed to be DMA addressable by the passed in device. The implementation will also be used to provide a more sensible replacement for DMA_ATTR_NON_CONSISTENT flag. Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages as its backend. Signed-off-by: Christoph Hellwig --- Documentation/core-api/dma-attributes.rst | 8 --- arch/alpha/kernel/pci_iommu.c | 2 + arch/arm/mm/dma-mapping-nommu.c | 2 + arch/arm/mm/dma-mapping.c | 4 ++ arch/ia64/hp/common/sba_iommu.c | 2 + arch/mips/jazz/jazzdma.c | 7 +-- arch/powerpc/kernel/dma-iommu.c | 2 + arch/powerpc/platforms/ps3/system-bus.c | 4 ++ arch/powerpc/platforms/pseries/vio.c | 2 + arch/s390/pci/pci_dma.c | 2 + arch/x86/kernel/amd_gart_64.c | 2 + drivers/iommu/dma-iommu.c | 2 + drivers/iommu/intel/iommu.c | 4 ++ drivers/parisc/ccio-dma.c | 2 + drivers/parisc/sba_iommu.c| 2 + drivers/xen/swiotlb-xen.c | 2 + include/linux/dma-direct.h| 5 ++ include/linux/dma-mapping.h | 34 ++-- include/linux/dma-noncoherent.h | 3 -- kernel/dma/direct.c | 52 ++- kernel/dma/mapping.c | 63 +-- kernel/dma/ops_helpers.c | 35 + kernel/dma/virt.c | 2 + 23 files changed, 206 insertions(+), 37 deletions(-) diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst index 29dcbe8826e85e..1887d92e8e9269 100644 --- a/Documentation/core-api/dma-attributes.rst +++ b/Documentation/core-api/dma-attributes.rst @@ -25,14 +25,6 @@ Since it is optional for platforms to implement DMA_ATTR_WRITE_COMBINE, those that do not will simply ignore the attribute and exhibit default behavior. -DMA_ATTR_NON_CONSISTENT - -DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either -consistent or non-consistent memory as it sees fit. By using this API, -you are guaranteeing to the platform that you have all the correct and -necessary sync points for this memory in the driver. 
- DMA_ATTR_NO_KERNEL_MAPPING -- diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index 6f7de4f4e191e7..447e0fd0ed3895 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -952,5 +952,7 @@ const struct dma_map_ops alpha_pci_ops = { .dma_supported = alpha_pci_supported, .mmap = dma_common_mmap, .get_sgtable= dma_common_get_sgtable, + .alloc_pages= dma_common_alloc_pages, + .free_pages = dma_common_free_pages, }; EXPORT_SYMBOL(alpha_pci_ops); diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c index 287ef898a55e11..43c6d66b6e733a 100644 --- a/arch/arm/mm/dma-mapping-nommu.c +++ b/arch/arm/mm/dma-mapping-nommu.c @@ -176,6 +176,8 @@ static void arm_nommu_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist const struct dma_map_ops arm_nommu_dma_ops = { .alloc = arm_nommu_dma_alloc, .free = arm_nommu_dma_free, + .alloc_pages= dma_direct_alloc_pages, + .free_pages = dma_direct_free_pages, .mmap = arm_nommu_dma_mmap, .map_page = arm_nommu_dma_map_page, .unmap_page = arm_nommu_dma_unmap_page, diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 8a8949174b1c06..7738b4d23f692f 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -199,6 +199,8 @@ static int arm_dma_supported(struct device *dev, u64 mask) const struct dma_map_ops arm_dma_ops = { .alloc = arm_dma_alloc, .free = arm_dma_free, + .alloc_pages= dma_direct_alloc_pages, + .free_pages = dma_direct_free_pages, .mmap = arm_dma_mmap, .get_sgtable= arm_dma_get_sgtable, .map_page = arm_dma_map_page, @@ -226,6 +228,8 @@ static int arm_coherent_dma_mmap(struct device *dev, struct vm_area_struct *vma, const struct dma_map_ops arm_coherent_dma_ops = { .alloc = arm_coherent_dma_alloc, .free = arm_coherent_dma_free, + .alloc_pages= dma_direct_alloc_pages, + .free_pages = dma_direct_free_pages, .mmap = arm_coherent_dma_mmap, .get_sgtable= arm_dma_get_sgtable, .map_page = arm_coherent_dma_map_page, diff 
--git a/a
[PATCH 16/18] dma-mapping: add new {alloc, free}_noncoherent dma_map_ops methods
This will allow IOMMU drivers to allocate non-contigous memory and return a vmapped virtual address. Signed-off-by: Christoph Hellwig --- include/linux/dma-mapping.h | 5 + kernel/dma/mapping.c| 33 +++-- 2 files changed, 32 insertions(+), 6 deletions(-) diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index bf592cf0db4acb..b4b5d75260d6dc 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -80,6 +80,11 @@ struct dma_map_ops { gfp_t gfp); void (*free_pages)(struct device *dev, size_t size, struct page *vaddr, dma_addr_t dma_handle, enum dma_data_direction dir); + void* (*alloc_noncoherent)(struct device *dev, size_t size, + dma_addr_t *dma_handle, enum dma_data_direction dir, + gfp_t gfp); + void (*free_noncoherent)(struct device *dev, size_t size, void *vaddr, + dma_addr_t dma_handle, enum dma_data_direction dir); int (*mmap)(struct device *, struct vm_area_struct *, void *, dma_addr_t, size_t, unsigned long attrs); diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index 6f86c925b8251d..8614d7d2ee59a9 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -502,19 +502,40 @@ EXPORT_SYMBOL_GPL(dma_free_pages); void *dma_alloc_noncoherent(struct device *dev, size_t size, dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp) { - struct page *page; + const struct dma_map_ops *ops = get_dma_ops(dev); + void *vaddr; - page = dma_alloc_pages(dev, size, dma_handle, dir, gfp); - if (!page) - return NULL; - return page_address(page); + if (!ops || !ops->alloc_noncoherent) { + struct page *page; + + page = dma_alloc_pages(dev, size, dma_handle, dir, gfp); + if (!page) + return NULL; + return page_address(page); + } + + size = PAGE_ALIGN(size); + vaddr = ops->alloc_noncoherent(dev, size, dma_handle, dir, gfp); + if (vaddr) + debug_dma_map_page(dev, virt_to_page(vaddr), 0, size, dir, + *dma_handle); + return vaddr; } EXPORT_SYMBOL_GPL(dma_alloc_noncoherent); void dma_free_noncoherent(struct device *dev, 
size_t size, void *vaddr, dma_addr_t dma_handle, enum dma_data_direction dir) { - dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir); + const struct dma_map_ops *ops = get_dma_ops(dev); + + if (!ops || !ops->free_noncoherent) { + dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir); + return; + } + + size = PAGE_ALIGN(size); + debug_dma_unmap_page(dev, dma_handle, size, dir); + ops->free_noncoherent(dev, size, vaddr, dma_handle, dir); } EXPORT_SYMBOL_GPL(dma_free_noncoherent); -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 5/9] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
From: Yu-cheng Yu ENQCMD instruction reads PASID from IA32_PASID MSR. The MSR is stored in the task's supervisor FPU PASID state and is context switched by XSAVES/XRSTORS. Signed-off-by: Yu-cheng Yu Co-developed-by: Fenghua Yu Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v2: - Modify the commit message (Thomas) arch/x86/include/asm/fpu/types.h | 11 ++- arch/x86/include/asm/fpu/xstate.h | 2 +- arch/x86/kernel/fpu/xstate.c | 6 +- 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index c87364ea6446..f5a38a5f3ae1 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -114,7 +114,7 @@ enum xfeature { XFEATURE_Hi16_ZMM, XFEATURE_PT_UNIMPLEMENTED_SO_FAR, XFEATURE_PKRU, - XFEATURE_RSRVD_COMP_10, + XFEATURE_PASID, XFEATURE_RSRVD_COMP_11, XFEATURE_RSRVD_COMP_12, XFEATURE_RSRVD_COMP_13, @@ -134,6 +134,7 @@ enum xfeature { #define XFEATURE_MASK_Hi16_ZMM (1 << XFEATURE_Hi16_ZMM) #define XFEATURE_MASK_PT (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR) #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) +#define XFEATURE_MASK_PASID(1 << XFEATURE_PASID) #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR) #define XFEATURE_MASK_FPSSE(XFEATURE_MASK_FP | XFEATURE_MASK_SSE) @@ -256,6 +257,14 @@ struct arch_lbr_state { struct lbr_entryentries[]; } __packed; +/* + * State component 10 is supervisor state used for context-switching the + * PASID state. 
+ */ +struct ia32_pasid_state { + u64 pasid; +} __packed; + struct xstate_header { u64 xfeatures; u64 xcomp_bv; diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index 14ab815132d4..47a92232d595 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -35,7 +35,7 @@ XFEATURE_MASK_BNDCSR) /* All currently supported supervisor features */ -#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0) +#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID) /* * A supervisor state component may not always contain valuable information, diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 038e19c0019e..67f1a03b9b23 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -37,6 +37,7 @@ static const char *xfeature_names[] = "AVX-512 ZMM_Hi256" , "Processor Trace (unused)" , "Protection Keys User registers", + "PASID state", "unknown xstate feature", }; @@ -51,6 +52,7 @@ static short xsave_cpuid_features[] __initdata = { X86_FEATURE_AVX512F, X86_FEATURE_INTEL_PT, X86_FEATURE_PKU, + X86_FEATURE_ENQCMD, }; /* @@ -318,6 +320,7 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_ZMM_Hi256); print_xstate_feature(XFEATURE_MASK_Hi16_ZMM); print_xstate_feature(XFEATURE_MASK_PKRU); + print_xstate_feature(XFEATURE_MASK_PASID); } /* @@ -592,6 +595,7 @@ static void check_xstate_against_struct(int nr) XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state); XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM, struct avx_512_hi16_state); XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state); + XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state); /* * Make *SURE* to add any feature numbers in below if @@ -601,7 +605,7 @@ static void check_xstate_against_struct(int nr) if ((nr < XFEATURE_YMM) || (nr >= XFEATURE_MAX) || (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) || - ((nr >= XFEATURE_RSRVD_COMP_10) && (nr <= XFEATURE_LBR))) { + ((nr >= 
XFEATURE_RSRVD_COMP_11) && (nr <= XFEATURE_LBR))) { WARN_ONCE(1, "no structure for xstate: %d\n", nr); XSTATE_WARN_ON(1); } -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 0/9] x86: tag application address space for devices
Typical hardware devices require a driver stack to translate application buffers to hardware addresses, and a kernel-user transition to notify the hardware of new work. What if both the translation and transition overhead could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD enabled hardware like Data Streaming Accelerator (DSA) aims to achieve. Applications map portals in their local-address-space and directly submit work to them using a new instruction. This series enables ENQCMD and associated management of the new MSR (MSR_IA32_PASID). This new MSR allows an application address space to be associated with what the PCIe spec calls a Process Address Space ID (PASID). This PASID tag is carried along with all requests between applications and devices and allows devices to interact with the process address space. SVA and ENQCMD enabled device drivers need this series. The phase 2 DSA patches with SVA and ENQCMD support was released on the top of this series: https://lore.kernel.org/patchwork/cover/1244060/ This series only provides simple and basic support for ENQCMD and the MSR: 1. Clean up type definitions (patch 1-2). These patches can be in a separate series. - Define "pasid" as "u32" consistently - Define "flags" as "unsigned int" 2. Explain different various technical terms used in the series (patch 3). 3. Enumerate support for ENQCMD in the processor (patch 4). 4. Handle FPU PASID state and the MSR during context switch (patches 5-6). 5. Define "pasid" in mm_struct (patch 7). 6. Disable ENQCMD when configured out (patch 8). 7. Allocate and free PASID for a process (patch 9). This patch series and the DSA phase 2 series are in https://github.com/intel/idxd-driver/tree/idxd-stage2 References: 1. 
Detailed information on the ENQCMD/ENQCMDS instructions and the IA32_PASID MSR can be found in Intel Architecture Instruction Set Extensions and Future Features Programming Reference: https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf 2. Detailed information on DSA can be found in DSA specification: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification Change log: v8: - Change subject to "drm, iommu:" in patch 1 (Boris). - Address all comments on patch 3 from Boris and Randy. - Re-write commit message for patch 8 (Boris). - Re-write commit message, remove "#ifdef CONFIG_IOMMU_SUPPORT", remove the "if (!ppasid_state)" check in patch 9 (Boris). v7: - Don't fix up PASID in #GP. Instead, update the PASID MSR by IPI and context switch after PASID allocation and free. Inherit PASID from parent. (Andy) v6: - Change return type to u32 for kfd_pasid_alloc() in patch 1 (Felix) v5: - Mark ENQCMD disabled when configured out and use cpu_feature_enabled() to simplify the feature checking code in patch 10 and 12 (PeterZ and Dave Hansen) - Add Reviewed-by: Lu Baolu to patch 1, 2, 10, and 12. v4: - Define PASID as "u32" instead of "unsigned int" in patch 1, 7, 10, 12.
(Christoph) - Drop v3 patch 2 which changes PASID type in ocxl because it's not related to x86 and was rejected by ocxl maintainer Frederic Barrat - A split patch which changes PASID type to u32 in crypto/hisilicon/qm.c was released separately to linux-crypto mailing list because it's not related to x86 and is a standalone patch: v3: - Change names of bind_mm() and unbind_mm() to match to new APIs in patch 4 (Baolu) - Change CONFIG_PCI_PASID to CONFIG_IOMMU_SUPPORT because non-PCI device can have PASID in ARM in patch 8 (Jean) - Add a few sanity checks in __free_pasid() and alloc_pasid() in patch 11 (Baolu) - Add patch 12 to define a new flag "has_valid_pasid" for a task and use the flag to identify if the task has a valid PASID MSR (PeterZ) - Add fpu__pasid_write() to update the MSR in fixup() in patch 13 - Check if mm->pasid can be found in fixup() in patch 13 v2: - Add patches 1-3 to define "pasid" and "flags" as "unsigned int" consistently (Thomas) (these 3 patches could be in a separate patch set) - Add patch 8 to move "pasid" to generic mm_struct (Christoph). Jean-Philippe Brucker released a virtually same patch. Upstream only needs one of the two. - Add patch 9 to initialize PASID in a new mm. - Plus other changes described in each patch (Thomas) Ashok Raj (1): Documentation/x86: Add documentation for SVA (Shared Virtual Addressing) Fenghua Yu (7): drm, iommu: Change type of pasid to u32 iommu/vt-d: Change flags type to unsigned int in binding mm x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions x86/msr-index: Define IA32_PASID MSR mm: Define pasid in mm x86/cpufeatures: Mark ENQCMD as disabled when configured out x86/mmu: Allocate/free PASID Yu-cheng Yu (1): x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature Documentation/x86/index.rst | 1 + Documentation/x86/sva.rst
[PATCH v8 9/9] x86/mmu: Allocate/free PASID
A PASID is allocated for an "mm" the first time any thread binds to an SVM-capable device and is freed from the "mm" when the SVM is unbound by the last thread. It's possible for the "mm" to have different PASID values in different binding/unbinding SVM cycles. The mm's PASID (non-zero for a valid PASID or 0 for an invalid PASID) is propagated to the per-thread PASID MSR for all threads within the mm through IPI, context switch, or inheritance, so that a running thread always has a PASID MSR matching the mm's PASID. Suggested-by: Andy Lutomirski Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v8: - Re-write commit message (Boris). - Remove ppasid_state == NULL check in update_pasid() (Boris). - Move "#ifdef CONFIG_IOMMU_SUPPORT" hunk to patch 8 (Boris). - Add comment when calling update_pasid(). v7: - Don't fix up PASID in #GP. Instead, update the PASID MSR by IPI and context switch after PASID allocation and free. Inherit PASID from parent. (Andy) Before v7: - Allocate a PASID for the mm and free it until mm exit. arch/x86/include/asm/fpu/api.h | 12 ++ arch/x86/include/asm/fpu/internal.h | 7 arch/x86/kernel/fpu/xstate.c| 57 + drivers/iommu/intel/svm.c | 28 +- 4 files changed, 103 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index b774c52e5411..dcd9503b1098 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -62,4 +62,16 @@ extern void switch_fpu_return(void); */ extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name); +/* + * Tasks that are not using SVA have mm->pasid set to zero to note that they + * will not have the valid bit set in MSR_IA32_PASID while they are running. + */ +#define PASID_DISABLED 0 + +#ifdef CONFIG_IOMMU_SUPPORT +/* Update current's PASID MSR/state by mm's PASID. 
*/ +void update_pasid(void); +#else +static inline void update_pasid(void) { } +#endif #endif /* _ASM_X86_FPU_API_H */ diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 0a460f2a3f90..341d00eba3f8 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -583,6 +583,13 @@ static inline void switch_fpu_finish(struct fpu *new_fpu) pkru_val = pk->pkru; } __write_pkru(pkru_val); + + /* +* Expensive PASID MSR write will be avoided in update_pasid() because +* TIF_NEED_FPU_LOAD was set. And the PASID state won't be updated +* unless it's different from mm->pasid to reduce overhead. +*/ + update_pasid(); } /* diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 67f1a03b9b23..5d8047441a0a 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1402,3 +1402,60 @@ int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns, return 0; } #endif /* CONFIG_PROC_PID_ARCH_STATUS */ + +#ifdef CONFIG_IOMMU_SUPPORT +void update_pasid(void) +{ + u64 pasid_state; + u32 pasid; + + if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) + return; + + if (!current->mm) + return; + + pasid = READ_ONCE(current->mm->pasid); + /* Set the valid bit in the PASID MSR/state only for valid pasid. */ + pasid_state = pasid == PASID_DISABLED ? + pasid : pasid | MSR_IA32_PASID_VALID; + + /* +* No need to hold fregs_lock() since the task's fpstate won't +* be changed by others (e.g. ptrace) while the task is being +* switched to or is in IPI. +*/ + if (!test_thread_flag(TIF_NEED_FPU_LOAD)) { + /* The MSR is active and can be directly updated. */ + wrmsrl(MSR_IA32_PASID, pasid_state); + } else { + struct fpu *fpu = ¤t->thread.fpu; + struct ia32_pasid_state *ppasid_state; + struct xregs_state *xsave; + + /* +* The CPU's xstate registers are not currently active. Just +* update the PASID state in the memory buffer here. 
The +* PASID MSR will be loaded when returning to user mode. +*/ + xsave = &fpu->state.xsave; + xsave->header.xfeatures |= XFEATURE_MASK_PASID; + ppasid_state = get_xsave_addr(xsave, XFEATURE_PASID); + /* +* Since XFEATURE_MASK_PASID is set in xfeatures, ppasid_state +* won't be NULL and no need to check its value. +* +* Only update the task's PASID state when it's different +* from the mm's pasid. +*/ + if (ppasid_state->pasid != pasid_state) { + /* +* Invalid fpregs so that state restoring will pick up +* th
[PATCH v8 7/9] mm: Define pasid in mm
PASID is shared by all threads in a process. So the logical place to keep track of it is in the "mm". Both ARM and X86 need to use the PASID in the "mm". Suggested-by: Christoph Hellwig Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v4: - Change PASID type to u32 (Christoph) v3: - Change CONFIG_PCI_PASID to CONFIG_IOMMU_SUPPORT because non-PCI device can have PASID in ARM (Jean) v2: - This new patch moves "pasid" from x86 specific mm_context_t to generic struct mm_struct per Christopher's comment: https://lore.kernel.org/linux-iommu/20200414170252.714402-1-jean-phili...@linaro.org/T/#mb57110ffe1aaa24750eeea4f93b611f0d1913911 - Jean-Philippe Brucker released a virtually same patch. I still put this patch in the series for better review. The upstream kernel only needs one of the two patches eventually. https://lore.kernel.org/linux-iommu/20200519175502.2504091-2-jean-phili...@linaro.org/ - Change CONFIG_IOASID to CONFIG_PCI_PASID (Ashok) include/linux/mm_types.h | 4 1 file changed, 4 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 496c3ff97cce..1ff0615ef19f 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -542,6 +542,10 @@ struct mm_struct { atomic_long_t hugetlb_usage; #endif struct work_struct async_put_work; + +#ifdef CONFIG_IOMMU_SUPPORT + u32 pasid; +#endif } __randomize_layout; /* -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 1/9] drm, iommu: Change type of pasid to u32
PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck Reviewed-by: Lu Baolu Acked-by: Felix Kuehling --- v8: - Change subject to "drm, iommu:" (Boris). v7: - Add "Acked-by: Felix Kuehling " v6: - Change return type to u32 for kfd_pasid_alloc() (Felix) v5: - Reviewed by Lu Baolu v4: - Change PASID type from "unsigned int" to "u32" (Christoph) v2: - Create this new patch to define PASID as "unsigned int" consistently in iommu (Thomas) drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 4 +-- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 6 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h | 4 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 8 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 8 ++--- .../gpu/drm/amd/amdkfd/cik_event_interrupt.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h | 2 +- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 7 ++--- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 8 ++--- drivers/gpu/drm/amd/amdkfd/kfd_events.h | 4 +-- drivers/gpu/drm/amd/amdkfd/kfd_iommu.c| 6 ++-- drivers/gpu/drm/amd/amdkfd/kfd_pasid.c| 4 +-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 20 ++-- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +- .../gpu/drm/amd/include/kgd_kfd_interface.h | 2 +- drivers/iommu/amd/amd_iommu.h | 10 
+++--- drivers/iommu/amd/iommu.c | 31 ++- drivers/iommu/amd/iommu_v2.c | 20 ++-- drivers/iommu/intel/dmar.c| 7 +++-- drivers/iommu/intel/iommu.c | 4 +-- drivers/iommu/intel/pasid.c | 31 +-- drivers/iommu/intel/pasid.h | 24 +++--- drivers/iommu/intel/svm.c | 12 +++ drivers/iommu/iommu.c | 2 +- drivers/misc/uacce/uacce.c| 2 +- include/linux/amd-iommu.h | 8 ++--- include/linux/intel-iommu.h | 12 +++ include/linux/intel-svm.h | 2 +- include/linux/iommu.h | 10 +++--- include/linux/uacce.h | 2 +- 38 files changed, 141 insertions(+), 141 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index ffe149aafc39..dfef5a7e0f5a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -207,11 +207,11 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev *dst, struct kgd_dev *s }) /* GPUVM API */ -int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, unsigned int pasid, +int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, u32 pasid, void **vm, void **process_info, struct dma_fence **ef); int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct kgd_dev *kgd, - struct file *filp, unsigned int pasid, + struct file *filp, u32 pasid, void **vm, void **process_info, struct dma_fence **ef); void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c index bf927f432506..ee531c3988d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c @@ -105,7 +105,7 @@ static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid, unlock_srbm(kgd); } -static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid, +static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid, unsigned int vmid) { struct amdgpu_device *adev = get_amdgpu_device(kgd); diff 
--git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c index 744366c7ee85..4d41317b
[PATCH v8 4/9] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
Work submission instruction comes in two flavors. ENQCMD can be called both in ring 3 and ring 0 and always uses the contents of PASID MSR when shipping the command to the device. ENQCMDS allows a kernel driver to submit commands on behalf of a user process. The driver supplies the PASID value in ENQCMDS. There isn't any usage of ENQCMD in the kernel as of now. The CPU feature flag is shown as "enqcmd" in /proc/cpuinfo. Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v2: - Re-write commit message (Thomas) arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/kernel/cpu/cpuid-deps.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 2901d5df4366..fea10d04d05f 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -353,6 +353,7 @@ #define X86_FEATURE_CLDEMOTE (16*32+25) /* CLDEMOTE instruction */ #define X86_FEATURE_MOVDIRI(16*32+27) /* MOVDIRI instruction */ #define X86_FEATURE_MOVDIR64B (16*32+28) /* MOVDIR64B instruction */ +#define X86_FEATURE_ENQCMD (16*32+29) /* ENQCMD and ENQCMDS instructions */ /* AMD-defined CPU features, CPUID level 0x8007 (EBX), word 17 */ #define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */ diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index 3cbe24ca80ab..3a02707c1f4d 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_CQM_MBM_TOTAL,X86_FEATURE_CQM_LLC }, { X86_FEATURE_CQM_MBM_LOCAL,X86_FEATURE_CQM_LLC }, { X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL }, + { X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES}, {} }; -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 8/9] x86/cpufeatures: Mark ENQCMD as disabled when configured out
Currently, the ENQCMD feature depends on CONFIG_IOMMU_SUPPORT. Add X86_FEATURE_ENQCMD to the disabled features mask. Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v8: - Re-write commit message (Boris). - Move "#ifdef CONFIG_IOMMU_SUPPORT" hunk from patch 9 (Boris). v7: - Split this patch from a previous patch. arch/x86/include/asm/disabled-features.h | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 4ea8584682f9..5861d34f9771 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -56,6 +56,12 @@ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31)) #endif +#ifdef CONFIG_IOMMU_SUPPORT +# define DISABLE_ENQCMD0 +#else +# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -75,7 +81,8 @@ #define DISABLED_MASK130 #define DISABLED_MASK140 #define DISABLED_MASK150 -#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP) +#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \ +DISABLE_ENQCMD) #define DISABLED_MASK170 #define DISABLED_MASK180 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19) -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 6/9] x86/msr-index: Define IA32_PASID MSR
The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier (PASID), a 20-bit value. Bit 31 must be set to indicate the value programmed in the MSR is valid. Hardware uses PASID to identify process address space and direct responses to the right address space. Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v2: - Change "identify process" to "identify process address space" in the commit message (Thomas) arch/x86/include/asm/msr-index.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 2859ee4f39a8..aaddc6a9e237 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -257,6 +257,9 @@ #define MSR_IA32_LASTINTFROMIP 0x01dd #define MSR_IA32_LASTINTTOIP 0x01de +#define MSR_IA32_PASID 0x0d93 +#define MSR_IA32_PASID_VALID BIT_ULL(31) + /* DEBUGCTLMSR bits (others vary by model): */ #define DEBUGCTLMSR_LBR(1UL << 0) /* last branch recording */ #define DEBUGCTLMSR_BTF_SHIFT 1 -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 2/9] iommu/vt-d: Change flags type to unsigned int in binding mm
"flags" passed to intel_svm_bind_mm() is a bit mask and should therefore be defined as "unsigned int" instead of "int". Change its type to "unsigned int". Suggested-by: Thomas Gleixner Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck Reviewed-by: Lu Baolu --- v5: - Reviewed by Lu Baolu v2: - Add this new patch per Thomas' comment. drivers/iommu/intel/svm.c | 7 --- include/linux/intel-iommu.h | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index e78a74a9c1cf..fc90a079e228 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -446,7 +446,8 @@ int intel_svm_unbind_gpasid(struct device *dev, u32 pasid) /* Caller must hold pasid_mutex, mm reference */ static int -intel_svm_bind_mm(struct device *dev, int flags, struct svm_dev_ops *ops, +intel_svm_bind_mm(struct device *dev, unsigned int flags, + struct svm_dev_ops *ops, struct mm_struct *mm, struct intel_svm_dev **sd) { struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL); @@ -1033,7 +1034,7 @@ intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata) { struct iommu_sva *sva = ERR_PTR(-EINVAL); struct intel_svm_dev *sdev = NULL; - int flags = 0; + unsigned int flags = 0; int ret; /* @@ -1042,7 +1043,7 @@ intel_svm_bind(struct device *dev, struct mm_struct *mm, void *drvdata) * and intel_svm etc. 
*/ if (drvdata) - flags = *(int *)drvdata; + flags = *(unsigned int *)drvdata; mutex_lock(&pasid_mutex); ret = intel_svm_bind_mm(dev, flags, NULL, mm, &sdev); if (ret) diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 7322073f62d0..9c3e8337442a 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -765,7 +765,7 @@ struct intel_svm { struct mm_struct *mm; struct intel_iommu *iommu; - int flags; + unsigned int flags; u32 pasid; int gpasid; /* In case that guest PASID is different from host PASID */ struct list_head devs; -- 2.19.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v8 3/9] Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)
From: Ashok Raj ENQCMD and Data Streaming Accelerator (DSA) and all of their associated features are a complicated stack with lots of interconnected pieces. This documentation provides a big picture overview for all of the features. Signed-off-by: Ashok Raj Co-developed-by: Fenghua Yu Signed-off-by: Fenghua Yu Reviewed-by: Tony Luck --- v8: - Address all of comments from Boris and Randy. v7: - Change the doc for updating PASID by IPI and context switch (Andy). v3: - Replace deprecated intel_svm_bind_mm() by iommu_sva_bind_mm() (Baolu) - Fix a couple of typos (Baolu) v2: - Fix the doc format and add the doc in toctree (Thomas) - Modify the doc for better description (Thomas, Tony, Dave) Documentation/x86/index.rst | 1 + Documentation/x86/sva.rst | 256 2 files changed, 257 insertions(+) create mode 100644 Documentation/x86/sva.rst diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index 265d9e9a093b..e5d5ff096685 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -30,3 +30,4 @@ x86-specific Documentation usb-legacy-support i386/index x86_64/index + sva diff --git a/Documentation/x86/sva.rst b/Documentation/x86/sva.rst new file mode 100644 index ..a1f008ef7dad --- /dev/null +++ b/Documentation/x86/sva.rst @@ -0,0 +1,256 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=== +Shared Virtual Addressing (SVA) with ENQCMD +=== + +Background +== + +Shared Virtual Addressing (SVA) allows the processor and device to use the +same virtual addresses avoiding the need for software to translate virtual +addresses to physical addresses. SVA is what PCIe calls Shared Virtual +Memory (SVM). + +In addition to the convenience of using application virtual addresses +by the device, it also doesn't require pinning pages for DMA. +PCIe Address Translation Services (ATS) along with Page Request Interface +(PRI) allow devices to function much the same way as the CPU handling +application page-faults. 
For more information please refer to the PCIe +specification Chapter 10: ATS Specification. + +Use of SVA requires IOMMU support in the platform. IOMMU also is required +to support PCIe features ATS and PRI. ATS allows devices to cache +translations for virtual addresses. The IOMMU driver uses the mmu_notifier() +support to keep the device TLB cache and the CPU cache in sync. PRI allows +the device to request paging the virtual address by using the CPU page tables +before accessing the address. + + +Shared Hardware Workqueues +== + +Unlike Single Root I/O Virtualization (SR-IOV), Scalable IOV (SIOV) permits +the use of Shared Work Queues (SWQ) by both applications and Virtual +Machines (VM's). This allows better hardware utilization vs. hard +partitioning resources that could result in under utilization. In order to +allow the hardware to distinguish the context for which work is being +executed in the hardware by SWQ interface, SIOV uses Process Address Space +ID (PASID), which is a 20-bit number defined by the PCIe SIG. + +PASID value is encoded in all transactions from the device. This allows the +IOMMU to track I/O on a per-PASID granularity in addition to using the PCIe +Resource Identifier (RID) which is the Bus/Device/Function. + + +ENQCMD +== + +ENQCMD is a new instruction on Intel platforms that atomically submits a +work descriptor to a device. The descriptor includes the operation to be +performed, virtual addresses of all parameters, virtual address of a completion +record, and the PASID (process address space ID) of the current process. + +ENQCMD works with non-posted semantics and carries a status back if the +command was accepted by hardware. This allows the submitter to know if the +submission needs to be retried or other device specific mechanisms to +implement fairness or ensure forward progress should be provided. 
+ +ENQCMD is the glue that ensures applications can directly submit commands +to the hardware and also permits hardware to be aware of application context +to perform I/O operations via use of PASID. + +Process Address Space Tagging += + +A new thread-scoped MSR (IA32_PASID) provides the connection between +user processes and the rest of the hardware. When an application first +accesses an SVA-capable device this MSR is initialized with a newly +allocated PASID. The driver for the device calls an IOMMU-specific API +that sets up the routing for DMA and page-requests. + +For example, the Intel Data Streaming Accelerator (DSA) uses +iommu_sva_bind_device(), which will do the following: + +- Allocate the PASID, and program the process page-table (%cr3 register) in the + PASID context entries. +- Register for mmu_notifier() to track any page-table invalidations to keep + the device TLB in sync. For example, when a page-table entry is invalidated,
[PATCH 17/18] dma-iommu: implement ->alloc_noncoherent
Implement the alloc_noncoherent method to provide memory that is neither coherent nor contiguous. Signed-off-by: Christoph Hellwig --- drivers/iommu/dma-iommu.c | 41 +++ 1 file changed, 37 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 00a5b49248e334..c12c1dc43d312e 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -572,6 +572,7 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev, * @size: Size of buffer in bytes * @dma_handle: Out argument for allocated DMA handle * @gfp: Allocation flags + * @prot: pgprot_t to use for the remapped mapping * @attrs: DMA attributes for this allocation * * If @size is less than PAGE_SIZE, then a full CPU page will be allocated, @@ -580,14 +581,14 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev, * Return: Mapped virtual address, or NULL on failure. */ static void *iommu_dma_alloc_remap(struct device *dev, size_t size, - dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs) + dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot, + unsigned long attrs) { struct iommu_domain *domain = iommu_get_dma_domain(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iova_domain *iovad = &cookie->iovad; bool coherent = dev_is_dma_coherent(dev); int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs); - pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, attrs); unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap; struct page **pages; struct sg_table sgt; @@ -1030,8 +1031,10 @@ static void *iommu_dma_alloc(struct device *dev, size_t size, gfp |= __GFP_ZERO; if (IS_ENABLED(CONFIG_DMA_REMAP) && gfpflags_allow_blocking(gfp) && - !(attrs & DMA_ATTR_FORCE_CONTIGUOUS)) - return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs); + !(attrs & DMA_ATTR_FORCE_CONTIGUOUS)) { + return iommu_dma_alloc_remap(dev, size, handle, gfp, + dma_pgprot(dev, PAGE_KERNEL, attrs), attrs); + } if 
(IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !gfpflags_allow_blocking(gfp) && !coherent) @@ -1052,6 +1055,34 @@ static void *iommu_dma_alloc(struct device *dev, size_t size, return cpu_addr; } +#ifdef CONFIG_DMA_REMAP +static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size, + dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp) +{ + if (!gfpflags_allow_blocking(gfp)) { + struct page *page; + + page = dma_common_alloc_pages(dev, size, handle, dir, gfp); + if (!page) + return NULL; + return page_address(page); + } + + return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO, +PAGE_KERNEL, 0); +} + +static void iommu_dma_free_noncoherent(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t handle, enum dma_data_direction dir) +{ + __iommu_dma_unmap(dev, handle, size); + __iommu_dma_free(dev, size, cpu_addr); +} +#else +#define iommu_dma_alloc_noncoherentNULL +#define iommu_dma_free_noncoherent NULL +#endif /* CONFIG_DMA_REMAP */ + static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs) @@ -1122,6 +1153,8 @@ static const struct dma_map_ops iommu_dma_ops = { .free = iommu_dma_free, .alloc_pages= dma_common_alloc_pages, .free_pages = dma_common_free_pages, + .alloc_noncoherent = iommu_dma_alloc_noncoherent, + .free_noncoherent = iommu_dma_free_noncoherent, .mmap = iommu_dma_mmap, .get_sgtable= iommu_dma_get_sgtable, .map_page = iommu_dma_map_page, -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 18/18] firewire-ohci: use dma_alloc_pages
Use dma_alloc_pages to allocate DMAable pages instead of hoping that the architecture either has GFP_DMA32 or not more than 4G of memory. Signed-off-by: Christoph Hellwig --- drivers/firewire/ohci.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c index 020cb15a4d8fcc..9811c40956e54d 100644 --- a/drivers/firewire/ohci.c +++ b/drivers/firewire/ohci.c @@ -674,17 +674,16 @@ static void ar_context_link_page(struct ar_context *ctx, unsigned int index) static void ar_context_release(struct ar_context *ctx) { + struct device *dev = ctx->ohci->card.device; unsigned int i; vunmap(ctx->buffer); - for (i = 0; i < AR_BUFFERS; i++) - if (ctx->pages[i]) { - dma_unmap_page(ctx->ohci->card.device, - ar_buffer_bus(ctx, i), - PAGE_SIZE, DMA_FROM_DEVICE); - __free_page(ctx->pages[i]); - } + for (i = 0; i < AR_BUFFERS; i++) { + if (ctx->pages[i]) + dma_free_pages(dev, PAGE_SIZE, ctx->pages[i], + ar_buffer_bus(ctx, i), DMA_FROM_DEVICE); + } } static void ar_context_abort(struct ar_context *ctx, const char *error_msg) @@ -970,6 +969,7 @@ static void ar_context_tasklet(unsigned long data) static int ar_context_init(struct ar_context *ctx, struct fw_ohci *ohci, unsigned int descriptors_offset, u32 regs) { + struct device *dev = ohci->card.device; unsigned int i; dma_addr_t dma_addr; struct page *pages[AR_BUFFERS + AR_WRAPAROUND_PAGES]; @@ -980,17 +980,13 @@ static int ar_context_init(struct ar_context *ctx, struct fw_ohci *ohci, tasklet_init(&ctx->tasklet, ar_context_tasklet, (unsigned long)ctx); for (i = 0; i < AR_BUFFERS; i++) { - ctx->pages[i] = alloc_page(GFP_KERNEL | GFP_DMA32); + ctx->pages[i] = dma_alloc_pages(dev, PAGE_SIZE, &dma_addr, + DMA_FROM_DEVICE, GFP_KERNEL); if (!ctx->pages[i]) goto out_of_memory; - dma_addr = dma_map_page(ohci->card.device, ctx->pages[i], - 0, PAGE_SIZE, DMA_FROM_DEVICE); - if (dma_mapping_error(ohci->card.device, dma_addr)) { - __free_page(ctx->pages[i]); - ctx->pages[i] = NULL; 
- goto out_of_memory; - } set_page_private(ctx->pages[i], dma_addr); + dma_sync_single_for_device(dev, dma_addr, PAGE_SIZE, + DMA_FROM_DEVICE); } for (i = 0; i < AR_BUFFERS; i++) -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2] iommu: Kconfig: Update help description for IPMMU_VMSA config
The ipmmu-vmsa driver is also used on Renesas RZ/G{1,2} SoCs; update the description for the IPMMU_VMSA config symbol to reflect this. Signed-off-by: Lad Prabhakar Reviewed-by: Chris Paterson Reviewed-by: Geert Uytterhoeven --- v1->v2 * Updated commit description as suggested by Geert * Included RB from Geert --- drivers/iommu/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index bef5d75e306b..d8f71bf31786 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -232,7 +232,7 @@ config IPMMU_VMSA select ARM_DMA_USE_IOMMU help Support for the Renesas VMSA-compatible IPMMU found in the R-Mobile - APE6, R-Car Gen2, and R-Car Gen3 SoCs. + APE6, R-Car Gen{2,3} and RZ/G{1,2} SoCs. If unsure, say N. -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Tue, Sep 15, 2020 at 08:33:41AM -0300, Jason Gunthorpe wrote: > On Mon, Sep 14, 2020 at 03:44:38PM -0700, Raj, Ashok wrote: > > Hi Jason, > > > > I thought we discussed this at LPC, but still seems to be going in > > circles :-(. > > We discussed mdev at LPC, not PASID. > > PASID applies widely to many devices and needs to be introduced with a > wide community agreement so all scenarios will be supportable. True, reading some of the earlier replies I was clearly confused as I thought you were talking about mdev again. But now that you say it, you have moved past mdev and it's the PASID interfaces, correct? > > > As you had suggested earlier in the mail thread could Jason Wang maybe > > build out what it takes to have a full fledged /dev/sva interface for vDPA > > and figure out how the interfaces should emerge? otherwise it appears > > everyone is talking very high level and with that limited understanding of > > how things work at the moment. > > You want Jason Wang to do the work to get Intel PASID support merged? > Seems a bit of a strange request. I was reading mdev in my head. Not PASID, sorry. For native user applications there is just 1 PASID per process. There is no need for quota management. VFIO, being the one used for guests where there are more PASIDs per guest, is where this is enforced today. IIUC, you are asking that part of the interface to move to an API interface that potentially the new /dev/sva and VFIO could share? I think the API's for PASID management themselves are generic (Jean's patchset + Jacob's ioasid set management). Possibly what you need is already available, but not in a specific way that you expect maybe? Let me check with Jacob and let him/Jean pick that up. Cheers, Ashok ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Tue, Sep 15, 2020 at 11:11:54AM -0700, Raj, Ashok wrote: > > PASID applies widely to many device and needs to be introduced with a > > wide community agreement so all scenarios will be supportable. > > True, reading some of the earlier replies I was clearly confused as I > thought you were talking about mdev again. But now that you stay it, you > have moved past mdev and its the PASID interfaces correct? Yes, we agreed mdev for IDXD at LPC, didn't talk about PASID. > For the native user applications have just 1 PASID per > process. There is no need for a quota management. Yes, there is. There is a limited pool of HW PASIDs. If one user fork bombs it can easily claim an unreasonable number from that pool as each process will claim a PASID. That can DOS the rest of the system. If PASID DOS is a worry then it must be solved at the IOMMU level for all user applications that might trigger a PASID allocation. VFIO is not special. > IIUC, you are asking that part of the interface to move to a API interface > that potentially the new /dev/sva and VFIO could share? I think the API's > for PASID management themselves are generic (Jean's patchset + Jacob's > ioasid set management). Yes, the in-kernel APIs are pretty generic now, and can be used by many types of drivers. As JasonW kicked this off, VDPA will need all this identical stuff too. We already know this, and I think Intel VDPA HW will need it, so it should concern you too :) A PASID vIOMMU solution sharable with VDPA and VFIO, based on a PASID control char dev (eg /dev/sva, or maybe /dev/iommu) seems like a reasonable starting point for discussion. Jason
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Tue, Sep 15, 2020 at 03:45:10PM -0300, Jason Gunthorpe wrote: > On Tue, Sep 15, 2020 at 11:11:54AM -0700, Raj, Ashok wrote: > > > PASID applies widely to many device and needs to be introduced with a > > > wide community agreement so all scenarios will be supportable. > > > > True, reading some of the earlier replies I was clearly confused as I > > thought you were talking about mdev again. But now that you stay it, you > > have moved past mdev and its the PASID interfaces correct? > > Yes, we agreed mdev for IDXD at LPC, didn't talk about PASID. > > > For the native user applications have just 1 PASID per > > process. There is no need for a quota management. > > Yes, there is. There is a limited pool of HW PASID's. If one user fork > bombs it can easially claim an unreasonable number from that pool as > each process will claim a PASID. That can DOS the rest of the system. Not sure how you had this played out... For PASID used in ENQCMD today for our SVM usages, we *DO* not automatically propagate or allocate new PASIDs. The new process needs to bind to get a PASID for its own use. For threads of the same process the PASID is inherited. For forks(), we do not auto-allocate them. Since a PASID isn't a shareable resource, this is much like how you would not pass MMIO mmaps that cannot be shared to forked processes, correct? Your doorbell space, for example. > > If PASID DOS is a worry then it must be solved at the IOMMU level for > all user applications that might trigger a PASID allocation. VFIO is > not special. Feels like you can simply avoid the PASID DOS rather than permit it to happen. > > > IIUC, you are asking that part of the interface to move to a API interface > > that potentially the new /dev/sva and VFIO could share? I think the API's > > for PASID management themselves are generic (Jean's patchset + Jacob's > > ioasid set management). > > Yes, the in kernel APIs are pretty generic now, and can be used by > many types of drivers.
Good, so there are no new requirements here I suppose. > > As JasonW kicked this off, VDPA will need all this identical stuff > too. We already know this, and I think Intel VDPA HW will need it, so > it should concern you too :) This is one of those things that I would disagree and commit :-). > > A PASID vIOMMU solution sharable with VDPA and VFIO, based on a PASID > control char dev (eg /dev/sva, or maybe /dev/iommu) seems like a > reasonable starting point for discussion. Looks like now we are getting closer to what we need. :-) Given that the PASID APIs are general purpose today, any driver can use them to take advantage. VFIO fortunately or unfortunately has the IOMMU things abstracted. I suppose that support is also mostly built on top of the generic iommu* API abstractions in a vendor-neutral way? I'm still lost on what is missing that vDPA can't build on top of what is available? Cheers, Ashok
Re: [PATCH 6/6] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset
On Tue, Sep 15, 2020 at 07:41:22AM +0200, Christoph Hellwig wrote: > On Mon, Sep 14, 2020 at 05:01:47PM -0600, Mathieu Poirier wrote: > > [700 lines of the fullquote deleted..] > > > > + for (r = map; r->size; r++) > > > + num_ranges++; > > > + > > > + new_map = kmemdup(map, array_size(num_ranges + 1, sizeof(*map)), > > > + GFP_KERNEL); > > > + if (!new_map) > > > + return -ENOMEM; > > > + to->dma_range_map = new_map; > > > + return 0; > > > +} > > > + > > > > This patch seemed Ok to me but it broke the stm32 remoteproc > > implementation. When > > I tested things out function dma_coerce_mask_and_coherent() returns -5 and > > the > > rest of the initialisation fails. I isolated things to function > > dma_to_pfn() > > [2]. In the original implementation __bus_to_pfn() returns 0xf and > > dev->dma_pfn_offset is equal to 0x38000. As such the function returns > > 0x137fff > > and dma_supported() a non-zero value[3]. > > > > With this set function dma_to_pfn() received a face lift. Function > > __bus_to_pfn() still returns 0xf but translate_dma_to_phys() returns 0, > > which forces dma_supported() to also return 0 and that is where the -5 > > (-EIO) > > comes from. > > > > Taking a further look at translate_dma_to_phys(), @dma_addr never falls > > within the > > bus_dma_region ranges and returns 0. > > > > I'm suspecting an initialisation problem and if it occurred here, it will > > likely show up elsewhere. > > Can you try this incremental patch?
> > diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h > index 088c97181ab146..c6b21acba7a459 100644 > --- a/include/linux/dma-direct.h > +++ b/include/linux/dma-direct.h > @@ -46,7 +46,7 @@ static inline phys_addr_t translate_dma_to_phys(struct > device *dev, > if (dma_addr >= m->dma_start && dma_addr - m->dma_start < > m->size) > return (phys_addr_t)dma_addr + m->offset; > > - return 0; > + return (phys_addr_t)-1; That did the trick - the stm32 platform driver's probe() function completes and the remote processor is operational. That being said, the value returned by function dma_to_pfn() is 0x137fff in the original code and 0xf with your patches applied. Thanks, Mathieu > } > > #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On 2020-09-15 11:09 a.m., Christoph Hellwig wrote: On Tue, Sep 15, 2020 at 10:40:39AM -0400, Thomas Tai wrote: +++ b/include/linux/dma-direct.h @@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size, { dma_addr_t end = addr + size - 1; -if (!dev->dma_mask) - return false; - I am concerned that some drivers may rely on this NULL checking. Would you think we can keep this checking and use the following WARN_ON_ONCE()? dma_capable is not a helper for drivers, but just for dma-direct and related code. And this patch adds the checks for the three places how we call into the ->map* methods. Hi Christoph, I tried out the suggested changes, and it successfully warned about the NULL pointer without panicking. I notice that there are some places outside dma-direct which call dma_capable(): https://elixir.bootlin.com/linux/v5.9-rc5/source/arch/x86/kernel/amd_gart_64.c#L187 https://elixir.bootlin.com/linux/v5.9-rc5/source/drivers/xen/swiotlb-xen.c#L387 Also, if I remove the NULL check in dma_capable(), I may run into the risk of a NULL pointer dereference within the function. @@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size, { dma_addr_t end = addr + size - 1; - if (!dev->dma_mask) - return false; - if (is_ram && !IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) && min(addr, end) < phys_to_dma(dev, PFN_PHYS(min_low_pfn))) return false; return end <= min_not_zero(*dev->dma_mask, dev->bus_dma_limit); ^ | ** risk of a null dereference ** } Given that the WARN_ON_ONCE already did the intended warning, would you be OK with keeping the NULL check in dma_capable()? Thank you, Thomas
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
Hi Jason, On Tue, 15 Sep 2020 15:45:10 -0300, Jason Gunthorpe wrote: > On Tue, Sep 15, 2020 at 11:11:54AM -0700, Raj, Ashok wrote: > > > PASID applies widely to many device and needs to be introduced with a > > > wide community agreement so all scenarios will be supportable. > > > > True, reading some of the earlier replies I was clearly confused as I > > thought you were talking about mdev again. But now that you stay it, you > > have moved past mdev and its the PASID interfaces correct? > > Yes, we agreed mdev for IDXD at LPC, didn't talk about PASID. > > > For the native user applications have just 1 PASID per > > process. There is no need for a quota management. > > Yes, there is. There is a limited pool of HW PASID's. If one user fork > bombs it can easially claim an unreasonable number from that pool as > each process will claim a PASID. That can DOS the rest of the system. > > If PASID DOS is a worry then it must be solved at the IOMMU level for > all user applications that might trigger a PASID allocation. VFIO is > not special. > > > IIUC, you are asking that part of the interface to move to a API > > interface that potentially the new /dev/sva and VFIO could share? I > > think the API's for PASID management themselves are generic (Jean's > > patchset + Jacob's ioasid set management). > > Yes, the in kernel APIs are pretty generic now, and can be used by > many types of drivers. > Right. IOMMU UAPIs are not VFIO specific; we pass a user pointer to the IOMMU layer to process. Similarly for PASID management, the IOASID extensions we are proposing will handle ioasid_set (groups/pools), quota, permissions, and notifications in the IOASID core. There is nothing VFIO specific. https://lkml.org/lkml/2020/8/22/12 > As JasonW kicked this off, VDPA will need all this identical stuff > too.
We already know this, and I think Intel VDPA HW will need it, so > it should concern you too :) > > A PASID vIOMMU solution sharable with VDPA and VFIO, based on a PASID > control char dev (eg /dev/sva, or maybe /dev/iommu) seems like a > reasonable starting point for discussion. > I am not sure what can really be consolidated in /dev/sva. VFIO and VDPA will have their own kernel-user interfaces anyway for their usage models. They are just providing the specific transport while sharing generic IOMMU UAPIs and IOASID management. As I mentioned, PASID management is already consolidated in the IOASID layer, so for VDPA or other users it is just a matter of creating their own ioasid_set and doing allocation. IOASID is also available to in-kernel users, which do not need /dev/sva AFAICT. For bare metal SVA, I don't see a need to create this 'floating' state of the PASID when created by /dev/sva. PASID allocation could happen behind the scenes when users need to bind page tables to a device DMA stream. Security authorization of the PASID is natively enforced when the user tries to bind a page table; there is no need to pass the FD handle of the PASID back to the kernel as you suggested earlier. Thanks, Jacob
Re: [PATCH v4 4/4] PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices
Hello Bjorn, On Tue, Jul 14, 2020 at 1:19 PM Rajat Jain wrote: > > On Sat, Jul 11, 2020 at 12:53 PM Bjorn Helgaas wrote: > > > > On Fri, Jul 10, 2020 at 03:53:59PM -0700, Rajat Jain wrote: > > > On Fri, Jul 10, 2020 at 2:29 PM Raj, Ashok wrote: > > > > On Fri, Jul 10, 2020 at 03:29:22PM -0500, Bjorn Helgaas wrote: > > > > > On Tue, Jul 07, 2020 at 03:46:04PM -0700, Rajat Jain wrote: > > > > > > When enabling ACS, enable translation blocking for external facing > > > > > > ports > > > > > > and untrusted devices. > > > > > > > > > > > > Signed-off-by: Rajat Jain > > > > > > --- > > > > > > v4: Add braces to avoid warning from kernel robot > > > > > > print warning for only external-facing devices. > > > > > > v3: print warning if ACS_TB not supported on > > > > > > external-facing/untrusted ports. > > > > > > Minor code comments fixes. > > > > > > v2: Commit log change > > > > > > > > > > > > drivers/pci/pci.c| 8 > > > > > > drivers/pci/quirks.c | 15 +++ > > > > > > 2 files changed, 23 insertions(+) > > > > > > > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > > > > index 73a8627822140..a5a6bea7af7ce 100644 > > > > > > --- a/drivers/pci/pci.c > > > > > > +++ b/drivers/pci/pci.c > > > > > > @@ -876,6 +876,14 @@ static void pci_std_enable_acs(struct pci_dev > > > > > > *dev) > > > > > > /* Upstream Forwarding */ > > > > > > ctrl |= (cap & PCI_ACS_UF); > > > > > > > > > > > > + /* Enable Translation Blocking for external devices */ > > > > > > + if (dev->external_facing || dev->untrusted) { > > > > > > + if (cap & PCI_ACS_TB) > > > > > > + ctrl |= PCI_ACS_TB; > > > > > > + else if (dev->external_facing) > > > > > > + pci_warn(dev, "ACS: No Translation Blocking on > > > > > > external-facing dev\n"); > > > > > > + } > > > > > > > > > > IIUC, this means that external devices can *never* use ATS and > > > > > can never cache translations. 
> > > > > > Yes, but it already exists today (and this patch doesn't change that): > > > 521376741b2c2 "PCI/ATS: Only enable ATS for trusted devices" > > > > > > IMHO any external device trying to send ATS traffic despite having ATS > > > disabled should count as a bad intent. And this patch is trying to > > > plug that loophole, by blocking the AT traffic from devices that we do > > > not expect to see AT from anyway. > > > > Thinking about this some more, I wonder if Linux should: > > > > - Explicitly disable ATS for every device at enumeration-time, e.g., > > in pci_init_capabilities(), > > > > - Enable PCI_ACS_TB for every device (not just external-facing or > > untrusted ones), > > > > - Disable PCI_ACS_TB for the relevant devices along the path only > > when enabling ATS. > > > > One nice thing about doing that is that the "untrusted" test would be > > only in pci_enable_ats(), and we wouldn't need one in > > pci_std_enable_acs(). > > Sent out v5 with this approach here: > https://patchwork.kernel.org/patch/11663515/ Any feedback on the patch above? It has been waiting for review. Thanks & Best Regards, Rajat > > Thanks, > > Rajat > > > > > > > It's possible BIOS gives us devices with ATS enabled, and this might > > break them, but that seems like something we'd want to find out about. > > > > Bjorn
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Tue, Sep 15, 2020 at 12:26:32PM -0700, Raj, Ashok wrote: > > Yes, there is. There is a limited pool of HW PASID's. If one user fork > > bombs it can easially claim an unreasonable number from that pool as > > each process will claim a PASID. That can DOS the rest of the system. > > Not sure how you had this played out.. For PASID used in ENQCMD today for > our SVM usages, we *DO* not automatically propagate or allocate new PASIDs. > > The new process needs to bind to get a PASID for its own use. For threads > of same process the PASID is inherited. For forks(), we do not > auto-allocate them. Auto-allocate doesn't matter: the PASID is tied to the mm_struct; after fork the program will get a new mm_struct, and it can manually re-trigger PASID allocation for that mm_struct from any SVA kernel driver. 64k processes, each with their own mm_struct, all triggering SVA, will allocate 64k PASIDs and use up the whole 16-bit space. > Given that PASID api's are general purpose today and any driver can use it > to take advantage. VFIO fortunately or unfortunately has the IOMMU things > abstracted. I suppose that support is also mostly built on top of the > generic iommu* api abstractions in a vendor neutral way? > > I'm still lost on what is missing that vDPA can't build on top of what is > available? I think it is basically everything in this patch... Why duplicate all this uAPI? Jason
[PATCH 0/3] drm: panfrost: Coherency support
Hi all, I polished up my original proof-of-concept a little while back, but now that I've got my hands on my Juno again I've been able to actually test it to my satisfaction, so here are proper patches! It probably makes sense for patches #1 and #2 to stay together and both go via drm-misc, provided Will's OK with that. Robin. Robin Murphy (3): iommu/io-pgtable-arm: Support coherency for Mali LPAE drm/panfrost: Support cache-coherent integrations arm64: dts: meson: Describe G12b GPU as coherent arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 drivers/gpu/drm/panfrost/panfrost_device.h | 1 + drivers/gpu/drm/panfrost/panfrost_drv.c | 2 ++ drivers/gpu/drm/panfrost/panfrost_gem.c | 2 ++ drivers/gpu/drm/panfrost/panfrost_mmu.c | 1 + drivers/iommu/io-pgtable-arm.c | 5 - 6 files changed, 14 insertions(+), 1 deletion(-) -- 2.28.0.dirty
[PATCH 2/3] drm/panfrost: Support cache-coherent integrations
When the GPU's ACE-Lite interface is fully wired up and capable of snooping CPU caches, it may be described as "dma-coherent" in devicetree, which will already inform the DMA layer not to perform unnecessary cache maintenance. However, we still need to ensure that the GPU uses the appropriate cacheable outer-shareable attributes in order to generate the requisite snoop signals, and that CPU mappings don't create a mismatch by using a non-cacheable type either. Signed-off-by: Robin Murphy --- drivers/gpu/drm/panfrost/panfrost_device.h | 1 + drivers/gpu/drm/panfrost/panfrost_drv.c| 2 ++ drivers/gpu/drm/panfrost/panfrost_gem.c| 2 ++ drivers/gpu/drm/panfrost/panfrost_mmu.c| 1 + 4 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index c30c719a8059..b31f45315e96 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -84,6 +84,7 @@ struct panfrost_device { /* pm_domains for devices with more than one. */ struct device *pm_domain_devs[MAX_PM_DOMAINS]; struct device_link *pm_domain_links[MAX_PM_DOMAINS]; + bool coherent; struct panfrost_features features; const struct panfrost_compatible *comp; diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index ada51df9a7a3..2a6f2f716b2f 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -588,6 +588,8 @@ static int panfrost_probe(struct platform_device *pdev) if (!pfdev->comp) return -ENODEV; + pfdev->coherent = device_get_dma_attr(&pdev->dev) == DEV_DMA_COHERENT; + /* Allocate and initialze the DRM device. 
*/ ddev = drm_dev_alloc(&panfrost_drm_driver, &pdev->dev); if (IS_ERR(ddev)) diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c index 33355dd302f1..cdf1a8754eba 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c @@ -220,6 +220,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = { */ struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, size_t size) { + struct panfrost_device *pfdev = dev->dev_private; struct panfrost_gem_object *obj; obj = kzalloc(sizeof(*obj), GFP_KERNEL); @@ -229,6 +230,7 @@ struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, size_t INIT_LIST_HEAD(&obj->mappings.list); mutex_init(&obj->mappings.lock); obj->base.base.funcs = &panfrost_gem_funcs; + obj->base.map_cached = pfdev->coherent; return &obj->base.base; } diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index e8f7b11352d2..8852fd378f7a 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -371,6 +371,7 @@ int panfrost_mmu_pgtable_alloc(struct panfrost_file_priv *priv) .pgsize_bitmap = SZ_4K | SZ_2M, .ias= FIELD_GET(0xff, pfdev->features.mmu_features), .oas= FIELD_GET(0xff00, pfdev->features.mmu_features), + .coherent_walk = pfdev->coherent, .tlb= &mmu_tlb_ops, .iommu_dev = pfdev->dev, }; -- 2.28.0.dirty
[PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent
According to a downstream commit I found in the Khadas vendor kernel, the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows how to handle this properly) we should describe it as such. Otherwise the mismatch leads to all manner of fun with mismatched attributes and inadvertently snooping stale data from caches, which would account for at least some of the brokenness observed on this platform. Signed-off-by: Robin Murphy --- arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi index 9b8548e5f6e5..ee8fcae9f9f0 100644 --- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi +++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi @@ -135,3 +135,7 @@ map1 { }; }; }; + +&mali { + dma-coherent; +}; -- 2.28.0.dirty
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On Tue, Sep 15, 2020 at 03:08:51PM -0700, Jacob Pan wrote: > > A PASID vIOMMU solution sharable with VDPA and VFIO, based on a PASID > > control char dev (eg /dev/sva, or maybe /dev/iommu) seems like a > > reasonable starting point for discussion. > > I am not sure what can really be consolidated in /dev/sva. More or less, everything in this patch. All the manipulations of PASID that are required for vIOMMU use case/etc. Basically all PASID control that is not just a 1:1 mapping of the mm_struct. > will have their own kerne-user interfaces anyway for their usage models. > They are just providing the specific transport while sharing generic IOMMU > UAPIs and IOASID management. > As I mentioned PASID management is already consolidated in the IOASID layer, > so for VDPA or other users, it just matter of create its own ioasid_set, > doing allocation. Creating the PASID is not the problem; managing what the PASID maps to is the issue. That is all uAPI that we don't really have today. > IOASID is also available to the in-kernel users which does not > need /dev/sva AFAICT. For bare metal SVA, I don't see a need to create this > 'floating' state of the PASID when created by /dev/sva. PASID allocation > could happen behind the scene when users need to bind page tables to a > device DMA stream. My point is I would like to see one set of uAPI ioctls to bind page tables. I don't want to have VFIO, VDPA, etc, etc uAPIs to do the exact same things only slightly differently. If user space wants to bind page tables, create the PASID with /dev/sva, use ioctls there to setup the page table the way it wants, then pass the now configured PASID to a driver that can use it. Driver does not do page table binding. Do not duplicate all the control plane uAPI in every driver. PASID management and binding are separated from the driver(s) that are using the PASID. Jason
[PATCH 1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE
Midgard GPUs have ACE-Lite master interfaces which allow systems to integrate them in an I/O-coherent manner. It seems that from the GPU's viewpoint, the rest of the system is its outer shareable domain, and so even when snoop signals are wired up, they are only emitted for outer shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does indeed get coherent pagetable walks working nicely for the coherent T620 in the Arm Juno SoC. Reviewed-by: Steven Price Signed-off-by: Robin Murphy --- drivers/iommu/io-pgtable-arm.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index dc7bcf858b6d..e47012006dcc 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -440,7 +440,7 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, << ARM_LPAE_PTE_ATTRINDX_SHIFT); } - if (prot & IOMMU_CACHE) + if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE) pte |= ARM_LPAE_PTE_SH_IS; else pte |= ARM_LPAE_PTE_SH_OS; @@ -1049,6 +1049,9 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie) cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) | ARM_MALI_LPAE_TTBR_READ_INNER | ARM_MALI_LPAE_TTBR_ADRMODE_TABLE; + if (cfg->coherent_walk) + cfg->arm_mali_lpae_cfg.transtab |= ARM_MALI_LPAE_TTBR_SHARE_OUTER; + return &data->iop; out_free_data: -- 2.28.0.dirty
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
Hi Jason, On Tue, 15 Sep 2020 20:51:26 -0300, Jason Gunthorpe wrote: > On Tue, Sep 15, 2020 at 03:08:51PM -0700, Jacob Pan wrote: > > > A PASID vIOMMU solution sharable with VDPA and VFIO, based on a > > > PASID control char dev (eg /dev/sva, or maybe /dev/iommu) seems > > > like a reasonable starting point for discussion. > > > > I am not sure what can really be consolidated in /dev/sva. > > More or less, everything in this patch. All the manipulations of PASID > that are required for vIOMMU use case/etc. Basically all PASID control > that is not just a 1:1 mapping of the mm_struct. > > > will have their own kerne-user interfaces anyway for their usage > > models. They are just providing the specific transport while > > sharing generic IOMMU UAPIs and IOASID management. > > > As I mentioned PASID management is already consolidated in the > > IOASID layer, so for VDPA or other users, it just matter of create > > its own ioasid_set, doing allocation. > > Creating the PASID is not the problem, managing what the PASID maps to > is the issue. That is all uAPI that we don't really have today. > > > IOASID is also available to the in-kernel users which does not > > need /dev/sva AFAICT. For bare metal SVA, I don't see a need to > > create this 'floating' state of the PASID when created by /dev/sva. > > PASID allocation could happen behind the scene when users need to > > bind page tables to a device DMA stream. > > My point is I would like to see one set of uAPI ioctls to bind page > tables. I don't want to have VFIO, VDPA, etc, etc uAPIs to do the > exact same things only slightly differently. > Got your point. I am not familiar with VDPA, but the VFIO UAPI is very thin, mostly passing IOMMU UAPI structs through as opaque data. > If user space wants to bind page tables, create the PASID with > /dev/sva, use ioctls there to setup the page table the way it wants, > then pass the now configured PASID to a driver that can use it. > Are we talking about bare metal SVA?
If so, I don't see the need for userspace to know there is a PASID. All user space needs is that its current mm is bound to a device by the driver. So it can be a one-step process for the user instead of two. > Driver does not do page table binding. Do not duplicate all the > control plane uAPI in every driver. > > PASID managment and binding is seperated from the driver(s) that are > using the PASID. > Why separate? Drivers need to be involved in PASID life cycle management. For example, when tearing down a PASID, the driver needs to stop DMA, IOMMU driver needs to unbind, etc. If the driver is the control point, then things are just in order. I am referring to bare metal SVA. For guest SVA, I agree that binding is separate from PASID allocation. Could you review this doc. in terms of life cycle? https://lkml.org/lkml/2020/8/22/13 My point is that /dev/sva has no value for bare metal SVA; we are just talking about whether guest SVA UAPIs can be consolidated. Or am I missing something? > Jason Thanks, Jacob
[PATCH] iommu/tegra-smmu: Fix tlb_mask
The "num_tlb_lines" might not be a power-of-2 value, being 48 on Tegra210 for example. So the current way of calculating tlb_mask using the num_tlb_lines is not correct: tlb_mask=0x5f in case of num_tlb_lines=48, which will trim a setting of 0x30 (48) to 0x10. Signed-off-by: Nicolin Chen --- drivers/iommu/tegra-smmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index 84fdee473873..0becdbfea306 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -1120,7 +1120,7 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev, BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1; dev_dbg(dev, "address bits: %u, PFN mask: %#lx\n", mc->soc->num_address_bits, smmu->pfn_mask); - smmu->tlb_mask = (smmu->soc->num_tlb_lines << 1) - 1; + smmu->tlb_mask = (1 << fls(smmu->soc->num_tlb_lines)) - 1; dev_dbg(dev, "TLB lines: %u, mask: %#lx\n", smmu->soc->num_tlb_lines, smmu->tlb_mask); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
> From: Jason Gunthorpe > Sent: Tuesday, September 15, 2020 10:29 PM > > > Do they need a device at all? It's not clear to me why RID based > > IOMMU management fits within vfio's scope, but PASID based does not. > > In RID mode vfio-pci completely owns the PCI function, so it is more > natural that VFIO, as the sole device owner, would own the DMA mapping > machinery. Further, the RID IOMMU mode is rarely used outside of VFIO > so there is not much reason to try and disaggregate the API. It is also used by vDPA. > > PASID on the other hand, is shared. vfio-mdev drivers will share the > device with other kernel drivers. PASID and DMA will be concurrent > with VFIO and other kernel drivers/etc. > It looks like you are equating PASID to host-side sharing, while ignoring another valid usage in which a PASID-capable device is passed through to the guest via vfio-pci and PASID is then used by the guest for guest-side sharing. In that case it is an exclusive usage on the host side, so what is the problem with VFIO managing PASID, given that vfio-pci completely owns the function? Thanks Kevin
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On 9/16/20 8:22 AM, Jacob Pan (Jun) wrote:
>> If user space wants to bind page tables, create the PASID with
>> /dev/sva, use ioctls there to setup the page table the way it wants,
>> then pass the now configured PASID to a driver that can use it.
>
> Are we talking about bare metal SVA? If so, I don't see the need for
> userspace to know there is a PASID. All user space needs is that my
> current mm is bound to a device by the driver. So it can be a one-step
> process for the user instead of two.
>
>> Driver does not do page table binding. Do not duplicate all the
>> control plane uAPI in every driver. PASID management and binding is
>> separated from the driver(s) that are using the PASID.
>
> Why separate? Drivers need to be involved in PASID life cycle
> management. For example, when tearing down a PASID, the driver needs
> to stop DMA, the IOMMU driver needs to unbind, etc. If the driver is
> the control point, then things are just in order. I am referring to
> bare metal SVA. For guest SVA, I agree that binding is separate from
> PASID allocation.
>
> Could you review this doc. in terms of life cycle?
> https://lkml.org/lkml/2020/8/22/13
>
> My point is that /dev/sva has no value for bare metal SVA; we are just
> talking about whether the guest SVA uAPIs can be consolidated. Or am I
> missing something?

Not only bare metal SVA, but also subdevice passthrough (Intel Scalable
IOV and ARM SubStream ID) consumes PASIDs that have nothing to do with
user space, hence /dev/sva is unsuited for them.

Best regards,
baolu
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On 2020/9/14 9:31 PM, Jean-Philippe Brucker wrote:
>> If it's possible, I would suggest a generic uAPI instead of a VFIO
>> specific one.
>
> A large part of this work is already generic uAPI, in
> include/uapi/linux/iommu.h.

This is not what I read from this series; all of the following uAPI is
VFIO specific:

struct vfio_iommu_type1_nesting_op;
struct vfio_iommu_type1_pasid_request;

And include/uapi/linux/iommu.h is not included in
include/uapi/linux/vfio.h at all.

> This patchset connects that generic interface to the pre-existing VFIO
> uAPI that deals with IOMMU mappings of an assigned device. But the
> bulk of the work is done by the IOMMU subsystem, and is available to
> all device drivers.

So is there any reason not to introduce the uAPI in the IOMMU drivers
directly?

>> Jason suggested something like /dev/sva. There will be a lot of other
>> subsystems that could benefit from this (e.g. vDPA).
>
> Do you have a more precise idea of the interface /dev/sva would
> provide, how it would interact with VFIO and others?

Can we replace the container fd with an sva fd, like:

sva = open("/dev/sva", O_RDWR);
group = open("/dev/vfio/26", O_RDWR);
ioctl(group, VFIO_GROUP_SET_SVA, &sva);

Then we can do all the SVA stuff through the sva fd, and other
subsystems (like vDPA) only need to implement the equivalent of
VFIO_GROUP_SET_SVA.

> vDPA could transport the generic iommu.h structures via its own uAPI,
> and call the IOMMU API directly without going through an intermediate
> /dev/sva handle.

Is there any value in that transporting? I think we have agreed that
VFIO is not the only user for vSVA. It's not hard to forecast that more
subsystems will want to benefit from vSVA, and we don't want to
duplicate similar uAPIs in all of them.

Thanks

> Thanks,
> Jean
Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
On 2020/9/16 3:26 AM, Raj, Ashok wrote:
>>> IIUC, you are asking that part of the interface move to an API
>>> interface that potentially the new /dev/sva and VFIO could share? I
>>> think the APIs for PASID management themselves are generic (Jean's
>>> patchset + Jacob's ioasid set management).
>>
>> Yes, the in-kernel APIs are pretty generic now, and can be used by
>> many types of drivers.
>
> Good, so there are no new requirements here I suppose.

The requirement is not for the in-kernel APIs but for a generic uAPI.

>> As JasonW kicked this off, VDPA will need all this identical stuff
>> too. We already know this, and I think Intel VDPA HW will need it, so
>> it should concern you too :)
>
> This is one of those things that I would disagree and commit :-)..
>
>> A PASID vIOMMU solution sharable with VDPA and VFIO, based on a PASID
>> control char dev (eg /dev/sva, or maybe /dev/iommu) seems like a
>> reasonable starting point for discussion.

Looks like now we are getting closer to what we need. :-)

> Given that the PASID APIs are general purpose today, any driver can
> use them to take advantage. VFIO fortunately or unfortunately has the
> IOMMU things abstracted. I suppose that support is also mostly built
> on top of the generic iommu* API abstractions in a vendor neutral way?
>
> I'm still lost on what is missing that vDPA can't build on top of what
> is available?

For sure it can, but we may end up with duplicated (or similar) uAPIs,
which is bad.

Thanks

> Cheers,
> Ashok
Re: [PATCH 6/6] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset
On Tue, Sep 15, 2020 at 01:55:01PM -0600, Mathieu Poirier wrote:
> That did the trick - the stm32 platform driver's probe() function
> completes and the remote processor is operational.
>
> That being said, the value returned by the function dma_to_pfn() is
> 0x137fff in the original code and 0xf with your patches applied.

Yes, that is intentional. The old code just applied the range and got an
out of range offset; the new one reports the max offset.
support range based offsets in dma-direct v3
Hi all,

this series adds range-based offsets to the dma-direct implementation.
The guts of the change are a patch from Jim with some modifications from
me, but to do it nicely we need two ARM patches to prepare for it as
well.

Changes since v2:
 - fix a mismerge
 - return (phys_addr_t)-1 from translate_dma_to_phys when there is no
   matching range, to fix dma_capable checks

Changes since v1:
 - rebased on top of the latest dma-mapping for-next tree
 - add two more trivial ARM cleanups
 - remove the DMA property inheritance hack in usb
 - move the remaining copy of the ranges into the remoteproc driver, as
   it should not be seen as a general API but as a quirk for remoteproc
   that we need to fix ASAP

Diffstat:
[PATCH 1/6] ARM/dma-mapping: remove a __arch_page_to_dma #error
The __arch_page_to_dma hook is long gone.

Signed-off-by: Christoph Hellwig
---
 arch/arm/include/asm/dma-mapping.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca3451..70d95677656044 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -23,10 +23,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 	return NULL;
 }
 
-#ifdef __arch_page_to_dma
-#error Please update to __arch_pfn_to_dma
-#endif
-
 /*
  * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
  * functions used internally by the DMA-mapping API to provide DMA
--
2.28.0
Re: [PATCH] dma-direct: Fix potential NULL pointer dereference
On Tue, Sep 15, 2020 at 04:46:17PM -0400, Thomas Tai wrote:
> I tried out the suggested changes, and it successfully warned about the
> NULL pointer without panicking. I notice that there are some places
> outside dma-direct which call dma_capable().
>
> https://elixir.bootlin.com/linux/v5.9-rc5/source/arch/x86/kernel/amd_gart_64.c#L187
>
> https://elixir.bootlin.com/linux/v5.9-rc5/source/drivers/xen/swiotlb-xen.c#L387

All of these still come in through the wrappers in kernel/dma/mapping.c.

> Given that the WARN_ON_ONCE already did the intended warning, would you
> be OK with keeping the NULL check in dma_capable()?

No, the generic DMA mapping layer is the right place.
[PATCH 2/6] ARM/dma-mapping: remove dma_to_virt
dma_to_virt is entirely unused, remove it.

Signed-off-by: Christoph Hellwig
---
 arch/arm/include/asm/dma-mapping.h        | 18 +-
 arch/arm/mach-omap1/include/mach/memory.h |  4
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 70d95677656044..cf2535fb8891f5 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -24,7 +24,7 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 }
 
 /*
- * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
+ * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
  * functions used internally by the DMA-mapping API to provide DMA
  * addresses. They must not be used by drivers.
  */
@@ -46,17 +46,6 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 	return pfn;
 }
 
-static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
-{
-	if (dev) {
-		unsigned long pfn = dma_to_pfn(dev, addr);
-
-		return phys_to_virt(__pfn_to_phys(pfn));
-	}
-
-	return (void *)__bus_to_virt((unsigned long)addr);
-}
-
 static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 {
 	if (dev)
@@ -76,11 +65,6 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 	return __arch_dma_to_pfn(dev, addr);
 }
 
-static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
-{
-	return __arch_dma_to_virt(dev, addr);
-}
-
 static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 {
 	return __arch_virt_to_dma(dev, addr);
diff --git a/arch/arm/mach-omap1/include/mach/memory.h b/arch/arm/mach-omap1/include/mach/memory.h
index 1142560e0078f5..e43697c3297bf2 100644
--- a/arch/arm/mach-omap1/include/mach/memory.h
+++ b/arch/arm/mach-omap1/include/mach/memory.h
@@ -41,10 +41,6 @@
 		__phys_to_pfn(__dma);				\
 	})
 
-#define __arch_dma_to_virt(dev, addr)	({ (void *)		\
-				(is_lbus_device(dev) ?		\
-				 lbus_to_virt(addr) :		\
-				 __phys_to_virt(addr)); })
-
 #define __arch_virt_to_dma(dev, addr)	({ unsigned long __addr = (unsigned long)(addr); \
 		(dma_addr_t) (is_lbus_device(dev) ?		\
 			virt_to_lbus(__addr) :			\
--
2.28.0
[PATCH 3/6] ARM/dma-mapping: move various helpers from dma-mapping.h to dma-direct.h
Move the helpers to translate to and from direct mapping DMA addresses
to dma-direct.h. This not only is the most logical place, but the new
placement also avoids dependency loops with pending commits.

Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
 arch/arm/common/dmabounce.c        |  2 +-
 arch/arm/include/asm/dma-direct.h  | 50 ++
 arch/arm/include/asm/dma-mapping.h | 50 --
 3 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index f4b719bde76367..d3e00ea9208834 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -24,7 +24,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 #include
diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h
index bca0de56753439..fbcf4367b5cb1a 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -2,6 +2,56 @@
 #ifndef ASM_ARM_DMA_DIRECT_H
 #define ASM_ARM_DMA_DIRECT_H 1
 
+#include
+
+/*
+ * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
+ * functions used internally by the DMA-mapping API to provide DMA
+ * addresses. They must not be used by drivers.
+ */
+#ifndef __arch_pfn_to_dma
+static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
+{
+	if (dev)
+		pfn -= dev->dma_pfn_offset;
+	return (dma_addr_t)__pfn_to_bus(pfn);
+}
+
+static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
+{
+	unsigned long pfn = __bus_to_pfn(addr);
+
+	if (dev)
+		pfn += dev->dma_pfn_offset;
+
+	return pfn;
+}
+
+static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
+{
+	if (dev)
+		return pfn_to_dma(dev, virt_to_pfn(addr));
+
+	return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
+}
+
+#else
+static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
+{
+	return __arch_pfn_to_dma(dev, pfn);
+}
+
+static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
+{
+	return __arch_dma_to_pfn(dev, addr);
+}
+
+static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
+{
+	return __arch_virt_to_dma(dev, addr);
+}
+#endif
+
 static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
 	unsigned int offset = paddr & ~PAGE_MASK;
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index cf2535fb8891f5..0a1a536368c3a4 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -8,8 +8,6 @@
 #include
 #include
 
-#include
-
 #include
 #include
 
@@ -23,54 +21,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 	return NULL;
 }
 
-/*
- * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
- * functions used internally by the DMA-mapping API to provide DMA
- * addresses. They must not be used by drivers.
- */
-#ifndef __arch_pfn_to_dma
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
-	return (dma_addr_t)__pfn_to_bus(pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-	unsigned long pfn = __bus_to_pfn(addr);
-
-	if (dev)
-		pfn += dev->dma_pfn_offset;
-
-	return pfn;
-}
-
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-	if (dev)
-		return pfn_to_dma(dev, virt_to_pfn(addr));
-
-	return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
-}
-
-#else
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-	return __arch_pfn_to_dma(dev, pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-	return __arch_dma_to_pfn(dev, addr);
-}
-
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-	return __arch_virt_to_dma(dev, addr);
-}
-#endif
-
 /**
  * arm_dma_alloc - allocate consistent memory for DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
--
2.28.0
[PATCH 4/6] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE
The DMA offset notifier can only be used if PHYS_OFFSET is at least
KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
phys_addr_t. Currently the code compiles fine despite that; a pending
change to the DMA offset handling would create a compiler warning for
this case. Add an ifdef to not compile the code except for LPAE
configs.

Signed-off-by: Christoph Hellwig
Reviewed-by: Robin Murphy
---
 arch/arm/mach-keystone/keystone.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 638808c4e12247..dcd031ba84c2e0 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -24,6 +24,7 @@
 
 #include "keystone.h"
 
+#ifdef CONFIG_ARM_LPAE
 static unsigned long keystone_dma_pfn_offset __read_mostly;
 
 static int keystone_platform_notifier(struct notifier_block *nb,
@@ -48,14 +49,17 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 static struct notifier_block platform_nb = {
 	.notifier_call = keystone_platform_notifier,
 };
+#endif /* CONFIG_ARM_LPAE */
 
 static void __init keystone_init(void)
 {
+#ifdef CONFIG_ARM_LPAE
 	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
 		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
 						   KEYSTONE_LOW_PHYS_START);
 		bus_register_notifier(&platform_bus_type, &platform_nb);
 	}
+#endif
 	keystone_pm_runtime_init();
 }
--
2.28.0
[PATCH 5/6] usb: don't inherit DMA properties for USB devices
As the comment in usb_alloc_dev correctly states, drivers can't use the
DMA API on usb devices, and at least calling dma_set_mask on them is
highly dangerous. Unlike what the comment states, upper level drivers
also can't really use the presence of a dma mask to check for DMA
support, as the dma_mask is set by default for most busses.

Setting the dma_mask comes from "[PATCH] usbcore dma updates (and doc)"
in BitKeeper times. It seems like it was primarily for setting the
NETIF_F_HIGHDMA flag in USB drivers, something that has long been fixed
up since.

Setting the dma_pfn_offset comes from commit b44bbc46a8bb ("usb: core:
setup dma_pfn_offset for USB devices and, interfaces"), which worked
around the fact that the scsi_calculate_bounce_limits function wasn't
going through the proper driver interface to query DMA information, but
that function was removed in commit 21e07dba9fb1 ("scsi: reduce use of
block bounce buffers") years ago.

Signed-off-by: Christoph Hellwig
Reviewed-by: Greg Kroah-Hartman
---
 drivers/usb/core/message.c |  6 --
 drivers/usb/core/usb.c     | 12
 2 files changed, 18 deletions(-)

diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d8f..9e45732dc1d1d1 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1954,12 +1954,6 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
 		intf->dev.bus = &usb_bus_type;
 		intf->dev.type = &usb_if_device_type;
 		intf->dev.groups = usb_interface_groups;
-		/*
-		 * Please refer to usb_alloc_dev() to see why we set
-		 * dma_mask and dma_pfn_offset.
-		 */
-		intf->dev.dma_mask = dev->dev.dma_mask;
-		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
 		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
 		intf->minor = -1;
 		device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index bafc113f2b3ef3..9b4ac4415f1a47 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -599,18 +599,6 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
 	dev->dev.bus = &usb_bus_type;
 	dev->dev.type = &usb_device_type;
 	dev->dev.groups = usb_device_groups;
-	/*
-	 * Fake a dma_mask/offset for the USB device:
-	 * We cannot really use the dma-mapping API (dma_alloc_* and
-	 * dma_map_*) for USB devices but instead need to use
-	 * usb_alloc_coherent and pass data in 'urb's, but some subsystems
-	 * manually look into the mask/offset pair to determine whether
-	 * they need bounce buffers.
-	 * Note: calling dma_set_mask() on a USB device would set the
-	 * mask for the entire HCD, so don't do that.
-	 */
-	dev->dev.dma_mask = bus->sysdev->dma_mask;
-	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
 	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
 	dev->state = USB_STATE_ATTACHED;
 	dev->lpm_disable_count = 1;
--
2.28.0
[PATCH 6/6] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset
From: Jim Quinlan

The new field 'dma_range_map' in struct device is used to facilitate the
use of single or multiple offsets between mapping regions of cpu addrs
and dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was
only capable of holding a single uniform offset and had no region bounds
checking.

The function of_dma_get_range() has been modified so that it takes a
single argument -- the device node -- and returns a map, NULL, or an
error code. The map is an array that holds the information regarding the
DMA regions. Each range entry contains the address offset, the cpu_start
address, the dma_start address, and the size of the region.

of_dma_configure() is the typical manner to set range offsets but there
are a number of ad hoc assignments to "dev->dma_pfn_offset" in the
kernel driver code. These cases now invoke the function
dma_direct_set_offset(dev, cpu_addr, dma_addr, size).

Signed-off-by: Jim Quinlan
[hch: various interface cleanups]
Signed-off-by: Christoph Hellwig
Tested-by: Nathan Chancellor
---
 arch/arm/include/asm/dma-direct.h             |  9 +--
 arch/arm/mach-keystone/keystone.c             | 17 ++---
 arch/arm/mach-omap1/include/mach/memory.h     |  4 +
 arch/sh/drivers/pci/pcie-sh7786.c             |  9 ++-
 arch/x86/pci/sta2x11-fixup.c                  |  6 +-
 drivers/acpi/arm64/iort.c                     |  6 +-
 drivers/base/core.c                           |  2 +
 drivers/gpu/drm/sun4i/sun4i_backend.c         |  8 +-
 drivers/iommu/io-pgtable-arm.c                |  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  9 ++-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      | 11 ++-
 drivers/of/address.c                          | 73 ---
 drivers/of/device.c                           | 44 ++-
 drivers/of/of_private.h                       | 11 +--
 drivers/of/unittest.c                         | 34 ++---
 drivers/remoteproc/remoteproc_core.c          | 24 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    | 10 ++-
 include/linux/device.h                        |  4 +-
 include/linux/dma-direct.h                    | 54 --
 include/linux/dma-mapping.h                   |  9 ++-
 kernel/dma/coherent.c                         |  7 +-
 kernel/dma/direct.c                           | 51 -
 22 files changed, 285 insertions(+), 119 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h
index fbcf4367b5cb1a..436544aeb83405 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -12,8 +12,8 @@
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
+	if (dev && dev->dma_range_map)
+		pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
 	return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -21,9 +21,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
 	unsigned long pfn = __bus_to_pfn(addr);
 
-	if (dev)
-		pfn += dev->dma_pfn_offset;
-
+	if (dev && dev->dma_range_map)
+		pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
 	return pfn;
 }
 
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index dcd031ba84c2e0..09a65c2dfd7327 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include
 #include
+#include
 #include
 #include
 #include
@@ -25,8 +26,6 @@
 #include "keystone.h"
 
 #ifdef CONFIG_ARM_LPAE
-static unsigned long keystone_dma_pfn_offset __read_mostly;
-
 static int keystone_platform_notifier(struct notifier_block *nb,
 				      unsigned long event, void *data)
 {
@@ -39,9 +38,12 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 		return NOTIFY_BAD;
 
 	if (!dev->of_node) {
-		dev->dma_pfn_offset = keystone_dma_pfn_offset;
-		dev_err(dev, "set dma_pfn_offset%08lx\n",
-			dev->dma_pfn_offset);
+		int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START,
+						KEYSTONE_LOW_PHYS_START,
+						KEYSTONE_HIGH_PHYS_SIZE);
+		dev_err(dev, "set dma_offset%08llx%s\n",
+			KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
+			ret ? " failed" : "");
 	}
 	return NOTIFY_OK;
 }
@@ -54,11 +56,8 @@ static struct notifier_block platform_nb = {
 static void __init keystone_init(void)
 {
 #ifdef CONFIG_ARM_LPAE
-	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
-		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
-						   KEYSTONE_LOW_PHYS_START);
+	if (PHYS_OFFSET >= KEYSTONE_HIGH_P