Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options
On 2019/4/8 9:14, Hanjun Guo wrote:
> Hi Zhen,
>
> On 2019/4/7 20:41, Zhen Lei wrote:
>> As Robin Murphy suggested:
>> "It's also not necessarily obvious to the user how this interacts with
>> IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
>> would be better to refactor the whole lot into a single selection of
>> something like IOMMU_DEFAULT_MODE anyway."
>>
>> In this version, I tried to normalize the IOMMU dma mode boot options for all
>> ARCHs. When IOMMU is enabled, there are 3 dma modes: passthrough(bypass),
>> lazy(mapping but defer the IOTLB invalidation), strict. But currently each
>> ARCH defines its own private boot options, which differ from each other. For
>> example, to enable/disable "passthrough", ARM64 uses iommu.passthrough=1/0,
>> X86 uses iommu=pt/nopt, PPC/POWERNV uses iommu=nobypass.
>>
>>
>> Zhen Lei (6):
>>   iommu: use iommu.dma_mode to replace iommu.passthrough and
>>     iommu.strict
>>   iommu: keep dma mode build options consistent with cmdline options
>>   iommu: add iommu_default_dma_mode_get() helper
>>   s390/pci: use common boot option iommu.dma_mode
>>   powernv/iommu: use common boot option iommu.dma_mode
>>   x86/iommu: use common boot option iommu.dma_mode
>
> This will break systems that use the current boot options, and I think
> that is unacceptable. If you want to do this, just introduce iommu.dma_mode
> on top of the existing iommu boot options, leaving the current dma mode
> boot options unchanged, so that iommu.dma_mode works for all archs but
> stays compatible with them.

I only changed the names of the boot options; the functionality is
unchanged. I added all related maintainers/supporters to the "to=" list, so
maybe we can discuss this. Should I add "obsolete" warnings for the old
options and keep them for a while? Still, I think this kind of thing is
best done in one go.

>
> Thanks
> Hanjun
>

--
Thanks!
BestRegards

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
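The compatibility question raised above (keep the old options working while introducing iommu.dma_mode) can be pictured as an alias table. The following is only a userspace sketch in C, not kernel code and not part of the series: `dma_mode_from_option()`, `struct option_alias` and the `deprecated` flag are hypothetical names, and the mode values simply mirror the IOMMU_DMA_MODE_* defines from patch 3/6. Only legacy spellings that map unambiguously onto one mode are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mode values mirror IOMMU_DMA_MODE_* from patch 3/6 of the series. */
enum dma_mode {
	DMA_MODE_STRICT = 0x0,
	DMA_MODE_LAZY = 0x1,
	DMA_MODE_PASSTHROUGH = 0x2,
};

/* Hypothetical alias table: one row per accepted "name=value" string. */
struct option_alias {
	const char *option;
	enum dma_mode mode;
	int deprecated;		/* 1 if this is a legacy spelling */
};

static const struct option_alias aliases[] = {
	{ "iommu.dma_mode=strict",      DMA_MODE_STRICT,      0 },
	{ "iommu.dma_mode=lazy",        DMA_MODE_LAZY,        0 },
	{ "iommu.dma_mode=passthrough", DMA_MODE_PASSTHROUGH, 0 },
	{ "iommu.strict=1",             DMA_MODE_STRICT,      1 },
	{ "iommu.strict=0",             DMA_MODE_LAZY,        1 },
	{ "iommu.passthrough=1",        DMA_MODE_PASSTHROUGH, 1 },
	{ "iommu=pt",                   DMA_MODE_PASSTHROUGH, 1 },
	{ "s390_iommu=strict",          DMA_MODE_STRICT,      1 },
};

/*
 * Look up one command-line token. Returns 0 and fills *mode and
 * *deprecated on a match, or -1 if the token selects no dma mode.
 */
int dma_mode_from_option(const char *opt, enum dma_mode *mode, int *deprecated)
{
	size_t i;

	assert(mode != NULL && deprecated != NULL);
	if (opt == NULL)
		return -1;

	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
		if (strcmp(opt, aliases[i].option) == 0) {
			*mode = aliases[i].mode;
			*deprecated = aliases[i].deprecated;
			return 0;
		}
	}
	return -1;
}
```

A caller could print an "obsolete option" warning whenever `deprecated` comes back as 1. Options such as iommu.passthrough=0 or iommu=nopt are left out on purpose, because on their own they do not say whether lazy or strict mode is wanted.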
Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions
Hi,

On 4/6/19 2:02 AM, James Sewart wrote:

Hey Lu,

My bad, did some debugging on my end. The issue was swapping out
find_domain for iommu_get_domain_for_dev. It seems in some situations the
domain is not attached to the group but the device is expected to have the
domain still stored in its archdata.

I’ve attached the final patch with find_domain unremoved which seems to
work in my testing.

This version works for me now.

Cheers,
James.

Best regards,
Lu Baolu

On 4 Apr 2019, at 07:49, Lu Baolu wrote:

Hi James,

I did a sanity test from my end. The test machine fails to boot. I haven't
seen any valuable kernel log. It results in a purple screen.

Best regards,
Lu Baolu

On 3/29/19 11:26 PM, James Sewart wrote:

Hey Lu,

I’ve attached a preliminary v3, if you could take a look and run some tests
that would be great.

Since v2 I’ve added your default domain type patches, the new device_group
handler, and incorporated Jacob’s feedback.

On 28 Mar 2019, at 18:37, James Sewart wrote:

Hey Lu,

On 26 Mar 2019, at 01:24, Lu Baolu wrote:

Hi James,

On 3/25/19 8:57 PM, James Sewart wrote:

There's an issue that if we choose to alloc a new resv_region with type
IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions to
free this entry type which means refactoring the rmrr regions in
get_resv_regions. Should this work be in this patchset?

Do you mean the rmrr regions are not allocated in get_resv_regions, but are
freed in put_resv_regions? I think we should fix this in this patch set
since this might impact the device passthrough if we don't do it.

They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
freed in put_resv_regions. If we allocate a new resv_region with type
IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
the static RMRR regions.

Either the ISA region is static and not freed as with my implementation, or
the RMRR regions are converted to be allocated on each call to
get_resv_regions and freed in put_resv_regions.

By the way, there's another way in my mind. Let's add a new region type for
LPC devices, e.g. IOMMU_RESV_LPC, and then handle it in the same way as
those MSI regions. Just FYI.

This solution would require adding some extra code to
iommu_group_create_direct_mappings as currently only type IOMMU_RESV_DIRECT
is identity mapped, other types are only reserved.

Best regards,
Lu Baolu

Cheers,
James.

Cheers,
James.
Re: [PATCH v2 0/2] iommu/arm-smmu-v3: make sure the kdump kernel can work well when smmu is enabled
Hi Will,

On 2019/4/4 23:30, Will Deacon wrote:
> Hi Zhen Lei,
>
> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>> v1 --> v2:
>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>>    (Report abort to device, no event recorded) to suppress the event messages
>>    caused by the unexpected devices.
>> 2. rewrite the patch description.
>
> This issue came up a while back:
>
> https://lore.kernel.org/linux-pci/20180302103032.gb19...@arm.com/
>
> and I'd still prefer to solve it using the disable_bypass logic which we
> already have. Something along the lines of the diff below?

Yes, my patches also use disable_bypass=1 (set ste.config=0b000). If
SMMU_IDR0.ST_LEVEL=0 (Linear Stream table supported), then all STE entries
are allocated and initialized (set ste.config=0b000). But if
SMMU_IDR0.ST_LEVEL=1 (2-level Stream Table), we only allocate and
initialize the first level table, and leave the level 2 tables to be
allocated dynamically. That means C_BAD_STREAMID (eventid=0x2) will be
reported if an unexpected device accesses memory without having been
reinitialized in the kdump kernel. So my patches allocate a dummy level 2
table (STE table) and make all level 1 table entries point to it in
advance. That means all memory accesses from unexpected devices are aborted
based on this dummy STE table. When an expected device (one that needs to
be used in the kdump kernel) is attached, we allocate a new level 2 table
(STE table) accordingly, but keep the others still pointing to the dummy
STE table.

>
> We're relying on the DMA API not subsequently requesting a passthrough
> domain, but it should only do that if you've configured your crashkernel
> to do so.
>
> Will
>
> --->8
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..91b8f3b2ee25 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  	/* Clear CR0 and sync (disables SMMU and queue processing) */
>  	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>  	if (reg & CR0_SMMUEN) {
> -		if (is_kdump_kernel()) {
> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> -			arm_smmu_device_disable(smmu);
> -			return -EBUSY;
> -		}
> -
>  		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>  	}
>
>  	ret = arm_smmu_device_disable(smmu);
>

--
Thanks!
BestRegards
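The dummy level 2 table idea described in the reply above can be illustrated with a toy model. This is a userspace sketch in C under stated assumptions, not the arm-smmu-v3 driver: `struct strtab`, `strtab_attach()` and the table sizes are made up for illustration, and each STE is reduced to its Config field (0b000 = abort, 0b101 = stage 1 translation).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define L1_ENTRIES	4	/* toy size, not a hardware value */
#define L2_ENTRIES	8	/* STEs per level 2 table in this model */

#define STE_CFG_ABORT		0x0	/* STE.Config = 0b000 */
#define STE_CFG_S1_TRANS	0x5	/* STE.Config = 0b101 */

/* A level 2 table holds one config value per stream ID. */
struct l2_table {
	uint8_t cfg[L2_ENTRIES];
};

struct strtab {
	struct l2_table dummy;			/* shared, all-abort table */
	struct l2_table *l1[L1_ENTRIES];	/* level 1 "descriptors" */
};

/* Point every level 1 entry at the dummy table up front. */
void strtab_init(struct strtab *st)
{
	unsigned int i;

	for (i = 0; i < L2_ENTRIES; i++)
		st->dummy.cfg[i] = STE_CFG_ABORT;
	for (i = 0; i < L1_ENTRIES; i++)
		st->l1[i] = &st->dummy;
}

/* Attach an expected device: give its level 1 slot a private table. */
int strtab_attach(struct strtab *st, unsigned int sid)
{
	unsigned int l1 = sid / L2_ENTRIES, l2 = sid % L2_ENTRIES, i;

	if (l1 >= L1_ENTRIES)
		return -1;
	if (st->l1[l1] == &st->dummy) {
		struct l2_table *t = malloc(sizeof(*t));

		if (!t)
			return -1;
		for (i = 0; i < L2_ENTRIES; i++)
			t->cfg[i] = STE_CFG_ABORT;
		st->l1[l1] = t;
	}
	st->l1[l1]->cfg[l2] = STE_CFG_S1_TRANS;
	return 0;
}

/* What the model would apply to a transaction from this stream ID. */
int strtab_lookup(const struct strtab *st, unsigned int sid)
{
	unsigned int l1 = sid / L2_ENTRIES;

	if (l1 >= L1_ENTRIES)
		return -1;
	return st->l1[l1]->cfg[sid % L2_ENTRIES];
}

/* Free the private tables; the dummy table is embedded, not allocated. */
void strtab_release(struct strtab *st)
{
	unsigned int i;

	for (i = 0; i < L1_ENTRIES; i++) {
		if (st->l1[i] != &st->dummy)
			free(st->l1[i]);
		st->l1[i] = &st->dummy;
	}
}
```

In this model every stream ID resolves to some STE from the start, so a lookup never hits a missing level 2 table, which is the situation the reply associates with C_BAD_STREAMID. Only the level 1 slots of attached devices cost extra memory.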
Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options
Hi Zhen,

On 2019/4/7 20:41, Zhen Lei wrote:
> As Robin Murphy suggested:
> "It's also not necessarily obvious to the user how this interacts with
> IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
> would be better to refactor the whole lot into a single selection of something
> like IOMMU_DEFAULT_MODE anyway."
>
> In this version, I tried to normalize the IOMMU dma mode boot options for all
> ARCHs. When IOMMU is enabled, there are 3 dma modes: passthrough(bypass),
> lazy(mapping but defer the IOTLB invalidation), strict. But currently each
> ARCH defines its own private boot options, which differ from each other. For
> example, to enable/disable "passthrough", ARM64 uses iommu.passthrough=1/0,
> X86 uses iommu=pt/nopt, PPC/POWERNV uses iommu=nobypass.
>
>
> Zhen Lei (6):
>   iommu: use iommu.dma_mode to replace iommu.passthrough and
>     iommu.strict
>   iommu: keep dma mode build options consistent with cmdline options
>   iommu: add iommu_default_dma_mode_get() helper
>   s390/pci: use common boot option iommu.dma_mode
>   powernv/iommu: use common boot option iommu.dma_mode
>   x86/iommu: use common boot option iommu.dma_mode

This will break systems that use the current boot options, and I think
that is unacceptable. If you want to do this, just introduce iommu.dma_mode
on top of the existing iommu boot options, leaving the current dma mode
boot options unchanged, so that iommu.dma_mode works for all archs but
stays compatible with them.

Thanks
Hanjun
Re: kernel BUG at drivers/iommu/intel-iommu.c:608
On Sun, 7 Apr 2019 16:02:31 -0700
Bart Van Assche wrote:

> On 4/7/19 2:06 PM, Alex Williamson wrote:
> > On Sun, 7 Apr 2019 12:10:38 -0700
> > Bart Van Assche wrote:
> >> If I tell qemu to use PCI pass-through for a PCI adapter and next load the
> >> lpfc driver for an lpfc adapter that has not been passed through to any VM
> >> a kernel bug is hit. Do you perhaps know whether it should be possible to
> >> load a kernel driver in this scenario? If so, do you know what should
> >> change to avoid that this kernel bug is hit? Should the iommu code be
> >> modified or should the lpfc code be modified? I'm asking you because I
> >> think that you introduced the BUG() statement that was hit. See also
> >> commit ab8dfe251571 ("iommu/vt-d: Introduce helper functions to improve
> >> code readability"; v3.17).
> >
> > Do both of these lpfc devices belong to the same IOMMU group?
> > (/sys/kernel/iommu_groups/) Thanks,
>
> Hi Alex,
>
> Apparently the two Emulex (lpfc) and the two QLogic ports are in the same
> IOMMU group:
>
> # lspci | grep -E 'QLogic|Emulex'
> 01:00.0 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
> 01:00.1 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
> 02:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
> 02:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
>
> # ls -d /sys/kernel/iommu_groups/*/devices/:0[12]:00.*
> /sys/kernel/iommu_groups/1/devices/:01:00.0
> /sys/kernel/iommu_groups/1/devices/:01:00.1
> /sys/kernel/iommu_groups/1/devices/:02:00.0
> /sys/kernel/iommu_groups/1/devices/:02:00.1

It's not possible to do what you want with this configuration. An IOMMU
group represents the smallest set of devices that are isolated from other
sets of devices and is also therefore the minimum granularity we can assign
devices to userspace (e.g. QEMU). The kernel reacts to breaking the
isolation of the group with a BUG_ON. If you managed not to hit the BUG_ON
here, you'd hit the BUG_ON in vfio code when the loss of isolation is
detected there.

IOMMU groups are formed at the highest point in the topology which
guarantees isolation. This can be indicated either via native PCIe ACS
support or ACS-equivalent quirks in the code. If the root port provides
neither of these, then all devices downstream are grouped together as well
as all peer root ports in the same PCI slot and all devices downstream of
those. If a multifunction endpoint does not provide ACS or equivalent
quirks, the functions will be grouped together. Not all endpoint devices or
systems are designed for minimum possible granularity. You can learn more
here[1]. Thanks,

Alex

[1] http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html
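The group rule explained above can be restated as a small check. This is a simplified userspace model in C, not the VFIO or intel-iommu code: `group_is_viable()`, `struct pci_dev_model` and the binding states are hypothetical names, and the model assumes a group may be handed to userspace only while none of its members is bound to an ordinary host driver.

```c
#include <assert.h>
#include <stddef.h>

/* Who currently owns a device in this toy model. */
enum binding {
	UNBOUND,	/* no driver attached */
	BOUND_VFIO,	/* owned by userspace through vfio-pci */
	BOUND_HOST,	/* ordinary host driver such as lpfc */
};

struct pci_dev_model {
	const char *bdf;	/* bus:device.function, for readability */
	int group;		/* IOMMU group number */
	enum binding binding;
};

/*
 * Returns 1 if the group exists and no member is driven by a host
 * driver, 0 otherwise. Isolation is per group, not per device.
 */
int group_is_viable(const struct pci_dev_model *devs, size_t n, int group)
{
	size_t i;
	int members = 0;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (devs[i].group != group)
			continue;
		members++;
		if (devs[i].binding == BOUND_HOST)
			return 0;
	}
	return members > 0;
}
```

With the four functions from the lspci output all in group 1, binding any one of them to lpfc while another is assigned to a guest makes the check fail. Which functions were actually passed through is not stated in the thread, so any concrete bindings fed to this model are made up.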
Re: kernel BUG at drivers/iommu/intel-iommu.c:608
On 4/7/19 2:06 PM, Alex Williamson wrote:
> On Sun, 7 Apr 2019 12:10:38 -0700
> Bart Van Assche wrote:
>> If I tell qemu to use PCI pass-through for a PCI adapter and next load the
>> lpfc driver for an lpfc adapter that has not been passed through to any VM
>> a kernel bug is hit. Do you perhaps know whether it should be possible to
>> load a kernel driver in this scenario? If so, do you know what should
>> change to avoid that this kernel bug is hit? Should the iommu code be
>> modified or should the lpfc code be modified? I'm asking you because I
>> think that you introduced the BUG() statement that was hit. See also
>> commit ab8dfe251571 ("iommu/vt-d: Introduce helper functions to improve
>> code readability"; v3.17).
>
> Do both of these lpfc devices belong to the same IOMMU group?
> (/sys/kernel/iommu_groups/) Thanks,

Hi Alex,

Apparently the two Emulex (lpfc) and the two QLogic ports are in the same
IOMMU group:

# lspci | grep -E 'QLogic|Emulex'
01:00.0 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
01:00.1 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
02:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
02:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)

# ls -d /sys/kernel/iommu_groups/*/devices/:0[12]:00.*
/sys/kernel/iommu_groups/1/devices/:01:00.0
/sys/kernel/iommu_groups/1/devices/:01:00.1
/sys/kernel/iommu_groups/1/devices/:02:00.0
/sys/kernel/iommu_groups/1/devices/:02:00.1

Thanks,

Bart.
Re: kernel BUG at drivers/iommu/intel-iommu.c:608
On Sun, 7 Apr 2019 12:10:38 -0700 Bart Van Assche wrote: > Hi Jiang, > > If I tell qemu to use PCI pass-through for a PCI adapter and next load the > lpfc driver for an lpfc adapter that has not been passed through to any VM > a kernel bug is hit. Do you perhaps know whether it should be possible to > a load kernel driver in this scenario? If so, do you know what should change > to avoid that this kernel bug is hit? Should the iommu code be modified or > should the lpfc code be modified? I'm asking you because I think that you > introduced the BUG() statement that was hit. See also commit ab8dfe251571 > ("iommu/vt-d: Introduce helper functions to improve code readability"; v3.17). Do both of these lpfc devices belong to the same IOMMU group? (/sys/kernel/iommu_groups/) Thanks, Alex > [ cut here ] > kernel BUG at drivers/iommu/intel-iommu.c:608! > invalid opcode: [#1] SMP > CPU: 7 PID: 7842 Comm: modprobe Not tainted 5.0.7+ #2 > Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD5H/Z97X-UD5H, BIOS F10 > 08/03/2015 > RIP: 0010:domain_get_iommu+0x50/0x60 > Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 39 c2 48 63 c8 75 ed 31 > c0 c3 48 c1 e1 03 48 8b 05 15 9b cd 00 48 8b 04 08 c3 <0f> 0b 31 c9 eb ee 66 > 2e 0f 1f 84 00 00 00 00 00 41 55 8b 05 d0 9a > RSP: 0018:a7884024ba60 EFLAGS: 00010202 > RAX: 96c8897c60c0 RBX: 0004046c2000 RCX: 96c88edab000 > RDX: fff0 RSI: 96c88e08de80 RDI: 96c8897c60c0 > RBP: R08: 96c88b806b40 R09: 96c88f802f50 > R10: R11: 0001 R12: 96c88edab0b0 > R13: R14: 1000 R15: 96c8897c60c0 > FS: 7fec74c05b80() GS:96c89fbc() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 7fec74c5dc80 CR3: 000404686004 CR4: 001626e0 > Call Trace: > __intel_map_page+0x7e/0x150 > intel_alloc_coherent+0xa7/0x130 > dma_alloc_attrs+0x6b/0xc0 > dma_pool_alloc+0xb8/0x1a0 > lpfc_mem_alloc+0x109/0x3e0 [lpfc] > lpfc_pci_probe_one+0xdac/0x2060 [lpfc] > pci_device_probe+0xc3/0x140 > really_probe+0xd2/0x380 > driver_probe_device+0xae/0xf0 > __driver_attach+0xd5/0x100 > ? 
driver_probe_device+0xf0/0xf0 > bus_for_each_dev+0x5b/0x90 > bus_add_driver+0x208/0x220 > ? 0xc0ad9000 > driver_register+0x66/0xb0 > ? 0xc0ad9000 > lpfc_init+0xd5/0x1000 [lpfc] > do_one_initcall+0x2e/0x181 > ? __vunmap+0x75/0xb0 > do_init_module+0x55/0x1e0 > load_module+0x2438/0x2560 > ? __do_sys_finit_module+0x8f/0xd0 > __do_sys_finit_module+0x8f/0xd0 > do_syscall_64+0x44/0xf0 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > RIP: 0033:0x7fec74d212f9 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 > 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 > 01 c3 48 8b 0d 6f 4b 0c 00 f7 d8 64 89 01 48 > RSP: 002b:7ffea26c1458 EFLAGS: 0246 ORIG_RAX: 0139 > RAX: ffda RBX: 55eeaf1669c0 RCX: 7fec74d212f9 > RDX: RSI: 55eeaf0ee3c8 RDI: 0004 > RBP: R08: R09: 55eeaf166400 > R10: 0004 R11: 0246 R12: 55eeaf0ee3c8 > R13: 0004 R14: 55eeaf16eb90 R15: 55eeaf1669c0 > Modules linked in: lpfc(+) scsi_transport_fc mlx4_ib ib_uverbs ib_core > mlx4_en mlx4_core pci_stub af_packet vhost_net vhost tun vfio_pci vfio_virqfd > vfio_iommu_type1 vfio fuse dm_crypt algif_skcipher af_alg loop devlink bridge > stp llc xt_tcpudp ip6t_rpfilter ip6t_REJECT > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat > ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security > iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 > scsi_transport_iscsi ip_set nfnetlink ebtable_filter ebtables ip6table_filter > ip6_tables iptable_filter ip_tables x_tables bpfilter coretemp hwmon > intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass > crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul > ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic aesni_intel > aes_x86_64 crypto_simd cryptd snd_hda_intel iTCO_wdt mei_me > iTCO_vendor_support joydev mxm_wmi glue_helper e1000e alx > mdio intel_rapl_perf snd_hda_codec mei ptp lpc_ich pcspkr i2c_i801 
> snd_hda_core pps_core mfd_core fan thermal wmi pcc_cpufreq acpi_pad button > snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device snd_pcm > snd_timer snd soundcore ext4 crc16 mbcache jbd2 > hid_generic usbhid sd_mod i915 intel_gtt i2c_algo_bit iosf_mbi drm_kms_helper > syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm > drm_panel_orientation_quirks xhci_pci libata ehci_pci agpgart ehci_hcd > xhci_hcd i2c_core video usbcore usb_common sg dm_multipath > dm_mod scsi_dh_rdac scsi_dh_emc
kernel BUG at drivers/iommu/intel-iommu.c:608
Hi Jiang,

If I tell qemu to use PCI pass-through for a PCI adapter and next load the
lpfc driver for an lpfc adapter that has not been passed through to any VM
a kernel bug is hit. Do you perhaps know whether it should be possible to
load a kernel driver in this scenario? If so, do you know what should
change to avoid hitting this kernel bug? Should the iommu code be modified
or should the lpfc code be modified? I'm asking you because I think that
you introduced the BUG() statement that was hit. See also commit
ab8dfe251571 ("iommu/vt-d: Introduce helper functions to improve code
readability"; v3.17).

Thank you,

Bart.

[ cut here ]
kernel BUG at drivers/iommu/intel-iommu.c:608!
invalid opcode: [#1] SMP
CPU: 7 PID: 7842 Comm: modprobe Not tainted 5.0.7+ #2
Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD5H/Z97X-UD5H, BIOS F10 08/03/2015
RIP: 0010:domain_get_iommu+0x50/0x60
Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 39 c2 48 63 c8 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 15 9b cd 00 48 8b 04 08 c3 <0f> 0b 31 c9 eb ee 66 2e 0f 1f 84 00 00 00 00 00 41 55 8b 05 d0 9a
RSP: 0018:a7884024ba60 EFLAGS: 00010202
RAX: 96c8897c60c0 RBX: 0004046c2000 RCX: 96c88edab000
RDX: fff0 RSI: 96c88e08de80 RDI: 96c8897c60c0
RBP: R08: 96c88b806b40 R09: 96c88f802f50
R10: R11: 0001 R12: 96c88edab0b0
R13: R14: 1000 R15: 96c8897c60c0
FS: 7fec74c05b80() GS:96c89fbc() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7fec74c5dc80 CR3: 000404686004 CR4: 001626e0
Call Trace:
 __intel_map_page+0x7e/0x150
 intel_alloc_coherent+0xa7/0x130
 dma_alloc_attrs+0x6b/0xc0
 dma_pool_alloc+0xb8/0x1a0
 lpfc_mem_alloc+0x109/0x3e0 [lpfc]
 lpfc_pci_probe_one+0xdac/0x2060 [lpfc]
 pci_device_probe+0xc3/0x140
 really_probe+0xd2/0x380
 driver_probe_device+0xae/0xf0
 __driver_attach+0xd5/0x100
 ? driver_probe_device+0xf0/0xf0
 bus_for_each_dev+0x5b/0x90
 bus_add_driver+0x208/0x220
 ? 0xc0ad9000
 driver_register+0x66/0xb0
 ? 0xc0ad9000
 lpfc_init+0xd5/0x1000 [lpfc]
 do_one_initcall+0x2e/0x181
 ? __vunmap+0x75/0xb0
 do_init_module+0x55/0x1e0
 load_module+0x2438/0x2560
 ? __do_sys_finit_module+0x8f/0xd0
 __do_sys_finit_module+0x8f/0xd0
 do_syscall_64+0x44/0xf0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fec74d212f9
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6f 4b 0c 00 f7 d8 64 89 01 48
RSP: 002b:7ffea26c1458 EFLAGS: 0246 ORIG_RAX: 0139
RAX: ffda RBX: 55eeaf1669c0 RCX: 7fec74d212f9
RDX: RSI: 55eeaf0ee3c8 RDI: 0004
RBP: R08: R09: 55eeaf166400
R10: 0004 R11: 0246 R12: 55eeaf0ee3c8
R13: 0004 R14: 55eeaf16eb90 R15: 55eeaf1669c0
Modules linked in: lpfc(+) scsi_transport_fc mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core pci_stub af_packet vhost_net vhost tun vfio_pci vfio_virqfd vfio_iommu_type1 vfio fuse dm_crypt algif_skcipher af_alg loop devlink bridge stp llc xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 scsi_transport_iscsi ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter coretemp hwmon intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic aesni_intel aes_x86_64 crypto_simd cryptd snd_hda_intel iTCO_wdt mei_me iTCO_vendor_support joydev mxm_wmi glue_helper e1000e alx mdio intel_rapl_perf snd_hda_codec mei ptp lpc_ich pcspkr i2c_i801 snd_hda_core pps_core mfd_core fan thermal wmi pcc_cpufreq acpi_pad button snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device snd_pcm snd_timer snd soundcore ext4 crc16 mbcache jbd2 hid_generic usbhid sd_mod i915 intel_gtt i2c_algo_bit iosf_mbi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm drm_panel_orientation_quirks xhci_pci libata ehci_pci agpgart ehci_hcd xhci_hcd i2c_core video usbcore usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod unix ipv6 autofs4 [last unloaded: scsi_transport_fc]
---[ end trace cc531c0d02c790cc ]---
[PATCH v4 4/6] s390/pci: use common boot option iommu.dma_mode
s390_iommu=strict can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei
---
 Documentation/admin-guide/kernel-parameters.txt |  7 ---
 arch/s390/pci/pci_dma.c                         | 20 +---
 drivers/iommu/Kconfig                           |  1 +
 3 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 60409ad23b2ac8b..a2df11945b33fc9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4095,13 +4095,6 @@
 
 	S		[KNL] Run init in single mode
 
-	s390_iommu=	[HW,S390]
-			Set s390 IOTLB flushing mode
-		strict
-			With strict flushing every unmap operation will result in
-			an IOTLB flush. Default is lazy flushing before reuse,
-			which is faster.
-
 	sa1100ir	[NET]
 			See drivers/net/irda/sa1100_ir.c.
 
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 9e52d1527f71495..6f1615c16f33995 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -17,7 +17,6 @@
 
 static struct kmem_cache *dma_region_table_cache;
 static struct kmem_cache *dma_page_table_cache;
-static int s390_iommu_strict;
 
 static int zpci_refresh_global(struct zpci_dev *zdev)
 {
@@ -193,13 +192,13 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
 		if (!zdev->tlb_refresh)
 			return 0;
 	} else {
-		if (!s390_iommu_strict)
+		if (IOMMU_DMA_MODE_IS_LAZY())
 			return 0;
 	}
 
 	ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
 				 PAGE_ALIGN(size));
-	if (ret == -ENOMEM && !s390_iommu_strict) {
+	if (ret == -ENOMEM && IOMMU_DMA_MODE_IS_LAZY()) {
 		/* enable the hypervisor to free some resources */
 		if (zpci_refresh_global(zdev))
 			goto out;
@@ -278,7 +277,7 @@ static dma_addr_t dma_alloc_address(struct device *dev, int size)
 	spin_lock_irqsave(&zdev->iommu_bitmap_lock, flags);
 	offset = __dma_alloc_iommu(dev, zdev->next_bit, size);
 	if (offset == -1) {
-		if (!s390_iommu_strict) {
+		if (IOMMU_DMA_MODE_IS_LAZY()) {
 			/* global flush before DMA addresses are reused */
 			if (zpci_refresh_global(zdev))
 				goto out_error;
@@ -313,7 +312,7 @@ static void dma_free_address(struct device *dev, dma_addr_t dma_addr, int size)
 	if (!zdev->iommu_bitmap)
 		goto out;
 
-	if (s390_iommu_strict)
+	if (!IOMMU_DMA_MODE_IS_LAZY())
 		bitmap_clear(zdev->iommu_bitmap, offset, size);
 	else
 		bitmap_set(zdev->lazy_bitmap, offset, size);
@@ -584,7 +583,7 @@ int zpci_dma_init_device(struct zpci_dev *zdev)
 		rc = -ENOMEM;
 		goto free_dma_table;
 	}
-	if (!s390_iommu_strict) {
+	if (IOMMU_DMA_MODE_IS_LAZY()) {
 		zdev->lazy_bitmap = vzalloc(zdev->iommu_pages / 8);
 		if (!zdev->lazy_bitmap) {
 			rc = -ENOMEM;
@@ -671,12 +670,3 @@ void zpci_dma_exit(void)
 	/* dma_supported is unconditionally true without a callback */
 };
 EXPORT_SYMBOL_GPL(s390_pci_dma_ops);
-
-static int __init s390_iommu_setup(char *str)
-{
-	if (!strncmp(str, "strict", 6))
-		s390_iommu_strict = 1;
-	return 0;
-}
-
-__setup("s390_iommu=", s390_iommu_setup);
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 19118cfdea1c335..d88dc44d60d88ea 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
 choice
 	prompt "IOMMU dma mode"
 	depends on IOMMU_API
+	default IOMMU_DMA_MODE_LAZY if S390_IOMMU
 	default IOMMU_DMA_MODE_STRICT
 	help
 	  IOMMU dma mode, such as: passthrough, lazy, strict.
--
1.8.3
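The difference between the strict and lazy paths touched by this patch can be shown with a toy bitmap model. This is a userspace sketch in C, not the s390 code: `struct toy_zdev`, `toy_free()` and `toy_global_flush()` are hypothetical names, the IOVA space is shrunk to 64 pages, and the per-unmap IOTLB flush of strict mode is folded into the free function for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Toy device state: one bit per IOVA page, 64 pages in total. */
struct toy_zdev {
	uint64_t iommu_bitmap;	/* 1 = page is allocated */
	uint64_t lazy_bitmap;	/* 1 = page freed, flush still pending */
	int lazy;		/* models IOMMU_DMA_MODE_IS_LAZY() */
	unsigned int flushes;	/* number of IOTLB flushes issued */
};

/* Bit mask covering pages [offset, offset + size). */
static uint64_t range_mask(unsigned int offset, unsigned int size)
{
	uint64_t mask;

	assert(size > 0 && offset < 64 && size <= 64 - offset);
	mask = (size == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << size) - 1);
	return mask << offset;
}

/* Strict: flush now and release the pages. Lazy: only remember them. */
void toy_free(struct toy_zdev *z, unsigned int offset, unsigned int size)
{
	uint64_t mask = range_mask(offset, size);

	if (!z->lazy) {
		z->flushes++;
		z->iommu_bitmap &= ~mask;
	} else {
		z->lazy_bitmap |= mask;
	}
}

/* Lazy mode: one global flush releases every deferred page at once. */
void toy_global_flush(struct toy_zdev *z)
{
	z->flushes++;
	z->iommu_bitmap &= ~z->lazy_bitmap;
	z->lazy_bitmap = 0;
}
```

In lazy mode the freed pages stay marked as allocated until the global flush runs, so they cannot be handed out again while a stale IOTLB entry might still exist. The price is that the device can keep reaching those pages until then, which matches the "reduced device isolation" wording in the documentation text of patch 1/6.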
[PATCH v4 1/6] iommu: use iommu.dma_mode to replace iommu.passthrough and iommu.strict
Currently the IOMMU dma has 3 modes: passthrough, lazy, strict. The
passthrough mode bypasses the IOMMU, the lazy mode defers the invalidation
of hardware TLBs, and the strict mode invalidates IOMMU hardware TLBs
synchronously. The three modes are mutually exclusive, so people may be
confused by iommu.passthrough and iommu.strict, because they cannot
coexist. Using iommu.dma_mode to replace them is better.

Signed-off-by: Zhen Lei
---
 Documentation/admin-guide/kernel-parameters.txt | 33 -
 drivers/iommu/Kconfig                           |  4 +--
 drivers/iommu/iommu.c                           | 48 ++---
 include/linux/iommu.h                           |  5 +++
 4 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2b8ee90bb64470d..60409ad23b2ac8b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1792,24 +1792,21 @@
 		nobypass	[PPC/POWERNV]
 			Disable IOMMU bypass, using IOMMU for PCI devices.
 
-	iommu.strict=	[ARM64] Configure TLB invalidation behaviour
-			Format: { "0" | "1" }
-			0 - Lazy mode.
-			  Request that DMA unmap operations use deferred
-			  invalidation of hardware TLBs, for increased
-			  throughput at the cost of reduced device isolation.
-			  Will fall back to strict mode if not supported by
-			  the relevant IOMMU driver.
-			1 - Strict mode (default).
-			  DMA unmap operations invalidate IOMMU hardware TLBs
-			  synchronously.
-
-	iommu.passthrough=
-			[ARM64] Configure DMA to bypass the IOMMU by default.
-			Format: { "0" | "1" }
-			0 - Use IOMMU translation for DMA.
-			1 - Bypass the IOMMU for DMA.
-			unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
+
+	iommu.dma_mode= [ARM64] Configure default dma mode. if unset, use the
+			value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
+		passthrough
+			Configure DMA to bypass the IOMMU by default.
+		lazy
+			Request that DMA unmap operations use deferred
+			invalidation of hardware TLBs, for increased
+			throughput at the cost of reduced device isolation.
+			Will fall back to strict mode if not supported by
+			the relevant IOMMU driver.
+		strict
+			Default. DMA unmap operations invalidate IOMMU hardware
+			TLBs synchronously.
+
 	io7=		[HW] IO7 for Marvel based alpha systems
 			See comment before marvel_specify_io7 in
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816c64..b67fcabd668f7b6 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -79,8 +79,8 @@ config IOMMU_DEFAULT_PASSTHROUGH
 	depends on IOMMU_API
 	help
 	  Enable passthrough by default, removing the need to pass in
-	  iommu.passthrough=on or iommu=pt through command line. If this
-	  is enabled, you can still disable with iommu.passthrough=off
+	  iommu.dma_mode=passthrough or iommu=pt through command line. If this
+	  is enabled, you can still disable with iommu.dma_mode={lazy|strict}
 	  or iommu=nopt depending on the architecture.
 
 	  If unsure, say N here.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 109de67d5d727c2..e4d581e6cb8d210 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -38,12 +38,13 @@
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
+
 #ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
+#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_PASSTHROUGH
 #else
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
+#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_STRICT
 #endif
-static bool iommu_dma_strict __read_mostly = true;
+static int iommu_default_dma_mode __read_mostly = IOMMU_DEFAULT_DMA_MODE;
 
 struct iommu_callback_data {
 	const struct iommu_ops *ops;
@@ -141,25 +142,29 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 static void __iommu_detach_group(struct iommu_domain *domain,
 				 struct iommu_group *group);
 
-static int __init iommu_set_def_domain_type(char *str)
+static int __init iommu_dma_mode_setup(char *str)
 {
-	bool pt;
-	int ret;
+	if (!str)
+		goto fail;
 
-	ret =
[PATCH v4 3/6] iommu: add iommu_default_dma_mode_get() helper
Add IOMMU_DMA_MODE_IS_LAZY() and IOMMU_DMA_MODE_IS_PASSTHROUGH() to make the
code look cleaner. There is no functional change; this just prepares for
the following patches.

Signed-off-by: Zhen Lei
---
 drivers/iommu/iommu.c | 12
 include/linux/iommu.h | 11 +++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b75e23a2ea08da8..876c0966db2481a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -168,6 +168,11 @@ static int __init iommu_dma_mode_setup(char *str)
 }
 early_param("iommu.dma_mode", iommu_dma_mode_setup);
 
+int iommu_default_dma_mode_get(void)
+{
+	return iommu_default_dma_mode;
+}
+
 static ssize_t iommu_group_attr_show(struct kobject *kobj,
 				     struct attribute *__attr, char *buf)
 {
@@ -1109,9 +1114,8 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 	 */
 	if (!group->default_domain) {
 		struct iommu_domain *dom;
-		int def_domain_type =
-			(iommu_default_dma_mode == IOMMU_DMA_MODE_PASSTHROUGH)
-			? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
+		int def_domain_type = IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+				? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
 
 		dom = __iommu_domain_alloc(dev->bus, def_domain_type);
 		if (!dom && def_domain_type != IOMMU_DOMAIN_DMA) {
@@ -1127,7 +1131,7 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 		if (!group->domain)
 			group->domain = dom;
 
-		if (dom && (iommu_default_dma_mode == IOMMU_DMA_MODE_LAZY)) {
+		if (dom && IOMMU_DMA_MODE_IS_LAZY()) {
 			int attr = 1;
 			iommu_domain_set_attr(dom,
 					      DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c3f4e3416176496..1b35ae3f6382e4a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -46,6 +46,10 @@
 #define IOMMU_DMA_MODE_STRICT		0x0
 #define IOMMU_DMA_MODE_LAZY		0x1
 #define IOMMU_DMA_MODE_PASSTHROUGH	0x2
+#define IOMMU_DMA_MODE_IS_LAZY() \
+		(iommu_default_dma_mode_get() == IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+		(iommu_default_dma_mode_get() == IOMMU_DMA_MODE_PASSTHROUGH)
 
 struct iommu_ops;
 struct iommu_group;
@@ -421,6 +425,8 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
 int iommu_probe_device(struct device *dev);
 void iommu_release_device(struct device *dev);
 
+extern int iommu_default_dma_mode_get(void);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -705,6 +711,11 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
 	return NULL;
 }
 
+static inline int iommu_default_dma_mode_get(void)
+{
+	return IOMMU_DMA_MODE_PASSTHROUGH;
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #ifdef CONFIG_IOMMU_DEBUGFS
--
1.8.3
[PATCH v4 0/6] normalize IOMMU dma mode boot options
As Robin Murphy suggested:
"It's also not necessarily obvious to the user how this interacts with
IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
would be better to refactor the whole lot into a single selection of
something like IOMMU_DEFAULT_MODE anyway."

In this version, I tried to normalize the IOMMU dma mode boot options for all
architectures. When the IOMMU is enabled, there are three dma modes:
passthrough (bypass), lazy (map, but defer the IOTLB invalidation), and
strict. Currently, however, each architecture defines its own boot options,
which differ from one another. For example, to enable/disable "passthrough",
ARM64 uses iommu.passthrough=1/0, X86 uses iommu=pt/nopt, and PPC/POWERNV uses
iommu=nobypass.

Zhen Lei (6):
  iommu: use iommu.dma_mode to replace iommu.passthrough and
    iommu.strict
  iommu: keep dma mode build options consistent with cmdline options
  iommu: add iommu_default_dma_mode_get() helper
  s390/pci: use common boot option iommu.dma_mode
  powernv/iommu: use common boot option iommu.dma_mode
  x86/iommu: use common boot option iommu.dma_mode

 Documentation/admin-guide/kernel-parameters.txt | 42 +++-
 arch/ia64/include/asm/iommu.h                   |  2 -
 arch/ia64/kernel/pci-dma.c                      |  2 -
 arch/powerpc/platforms/powernv/pci-ioda.c       | 23 +--
 arch/s390/pci/pci_dma.c                         | 20 +++---
 arch/x86/include/asm/iommu.h                    |  1 -
 arch/x86/kernel/pci-dma.c                       | 20 --
 drivers/iommu/Kconfig                           | 36 ++---
 drivers/iommu/amd_iommu.c                       | 12 +++---
 drivers/iommu/amd_iommu_init.c                  |  4 --
 drivers/iommu/amd_iommu_types.h                 |  6 ---
 drivers/iommu/intel-iommu.c                     |  7 +---
 drivers/iommu/iommu.c                           | 52 -
 include/linux/iommu.h                           | 16
 14 files changed, 108 insertions(+), 135 deletions(-)

-- 
1.8.3
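The single-option idea in the cover letter boils down to mapping one string value onto one of the three modes. The sketch below shows one way such a parser could look. It is not the code from patch 1/6, which is not quoted here. The spellings "strict", "lazy" and "passthrough" come from the Kconfig help text in this series, while the `demo_*` names and the choice of -EINVAL for unknown values are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Same numeric values as IOMMU_DMA_MODE_* in this series. */
enum demo_dma_mode {
	DEMO_DMA_MODE_STRICT = 0,
	DEMO_DMA_MODE_LAZY = 1,
	DEMO_DMA_MODE_PASSTHROUGH = 2,
};

/*
 * Hypothetical parser for the value part of "iommu.dma_mode=<value>".
 * Returns 0 and stores the mode on success, -EINVAL on bad input.
 */
int demo_parse_dma_mode(const char *str, enum demo_dma_mode *mode)
{
	if (str == NULL || mode == NULL)
		return -EINVAL;

	if (strcmp(str, "strict") == 0)
		*mode = DEMO_DMA_MODE_STRICT;
	else if (strcmp(str, "lazy") == 0)
		*mode = DEMO_DMA_MODE_LAZY;
	else if (strcmp(str, "passthrough") == 0)
		*mode = DEMO_DMA_MODE_PASSTHROUGH;
	else
		return -EINVAL;	/* unknown value: leave *mode untouched */

	return 0;
}
```

Because the three values are mutually exclusive, a single enum-valued option cannot express contradictory combinations, such as passthrough together with strict, that two independent boolean options would allow.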
[PATCH v4 2/6] iommu: keep dma mode build options consistent with cmdline options
First, add the build option IOMMU_DMA_MODE_LAZY so that lazy mode can be set
as the default at build time. Then put the three config options in a choice,
so that only one of the three can be selected at a time, the same as for the
boot option iommu.dma_mode.

Signed-off-by: Zhen Lei
---
 drivers/iommu/Kconfig | 30 +++---
 drivers/iommu/iommu.c |  4 +++-
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b67fcabd668f7b6..19118cfdea1c335 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,16 +74,40 @@ config IOMMU_DEBUGFS
 	  debug/iommu directory, and then populate a subdirectory with
 	  entries as required.
 
-config IOMMU_DEFAULT_PASSTHROUGH
-	bool "IOMMU passthrough by default"
+choice
+	prompt "IOMMU dma mode"
 	depends on IOMMU_API
+	default IOMMU_DMA_MODE_STRICT
+	help
+	  IOMMU dma mode, such as: passthrough, lazy, strict.
+
+config IOMMU_DMA_MODE_PASSTHROUGH
+	bool "Configure DMA to bypass the IOMMU"
 	help
 	  Enable passthrough by default, removing the need to pass in
 	  iommu.dma_mode=passthrough or iommu=pt through command line. If this
 	  is enabled, you can still disable with iommu.dma_mode={lazy|strict}
 	  or iommu=nopt depending on the architecture.
 
-	  If unsure, say N here.
+config IOMMU_DMA_MODE_LAZY
+	bool "IOMMU DMA use lazy mode to flush IOTLB and free IOVA"
+	help
+	  Support lazy mode, where for every IOMMU DMA unmap operation, the
+	  flush operation of IOTLB and the free operation of IOVA are deferred.
+	  They are only guaranteed to be done before the related IOVA will be
+	  reused. Removing the need to pass in kernel parameters through
+	  command line. For example, iommu.dma_mode=lazy on ARM64. If this is
+	  enabled, you can still disable with kernel parameters, such as
+	  iommu.dma_mode=strict depending on the architecture.
+
+config IOMMU_DMA_MODE_STRICT
+	bool "IOMMU DMA use strict mode to flush IOTLB and free IOVA"
+	help
+	  For every IOMMU DMA unmap operation, the flush operation of IOTLB and
+	  the free operation of IOVA are guaranteed to be done in the unmap
+	  function.
+
+endchoice
 
 config OF_IOMMU
        def_bool y
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e4d581e6cb8d210..b75e23a2ea08da8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,8 +39,10 @@
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
+#if defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH)
 #define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_PASSTHROUGH
+#elif defined(CONFIG_IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_LAZY
 #else
 #define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_STRICT
 #endif
-- 
1.8.3
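The Kconfig choice guarantees that exactly one of the three CONFIG_ symbols is set, and the preprocessor chain in iommu.c turns that symbol into the built-in default. A minimal standalone sketch of that chain follows. The `#define CONFIG_IOMMU_DMA_MODE_LAZY 1` line only simulates what Kconfig would generate for a lazy-by-default build, and `demo_builtin_default_mode()` is a hypothetical helper, not a kernel function.

```c
/* Mode values as defined in include/linux/iommu.h by this series. */
#define IOMMU_DMA_MODE_STRICT		0x0
#define IOMMU_DMA_MODE_LAZY		0x1
#define IOMMU_DMA_MODE_PASSTHROUGH	0x2

/*
 * Assumption for this sketch: pretend the Kconfig choice selected lazy
 * mode. In a real build this symbol comes from the generated config
 * header, not from a hand-written #define.
 */
#define CONFIG_IOMMU_DMA_MODE_LAZY 1

/* Same selection chain as the patch adds to drivers/iommu/iommu.c. */
#if defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH)
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_PASSTHROUGH
#elif defined(CONFIG_IOMMU_DMA_MODE_LAZY)
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_LAZY
#else
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_STRICT
#endif

/* Hypothetical helper: the mode used when no boot option overrides it. */
int demo_builtin_default_mode(void)
{
	return IOMMU_DEFAULT_DMA_MODE;
}
```

With the simulated selection above, the helper returns IOMMU_DMA_MODE_LAZY. Note that the final `#else` branch makes strict the fallback even when no symbol is defined at all, which matches the choice's `default IOMMU_DMA_MODE_STRICT`.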
[PATCH v4 5/6] powernv/iommu: use common boot option iommu.dma_mode
iommu=nobypass can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei
---
 Documentation/admin-guide/kernel-parameters.txt |  2 --
 arch/powerpc/platforms/powernv/pci-ioda.c       | 23 +--
 drivers/iommu/Kconfig                           |  1 +
 3 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a2df11945b33fc9..f88a8bff3c0caa0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1789,8 +1789,6 @@
 		soft
 		pt		[x86]
 		nopt		[x86]
-		nobypass	[PPC/POWERNV]
-			Disable IOMMU bypass, using IOMMU for PCI devices.
 
 	iommu.dma_mode= [ARM64]
 			Configure default dma mode. if unset, use the
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3ead4c237ed0ec9..be0234c170316bc 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -85,29 +85,8 @@ void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
 	va_end(args);
 }
 
-static bool pnv_iommu_bypass_disabled __read_mostly;
 static bool pci_reset_phbs __read_mostly;
 
-static int __init iommu_setup(char *str)
-{
-	if (!str)
-		return -EINVAL;
-
-	while (*str) {
-		if (!strncmp(str, "nobypass", 8)) {
-			pnv_iommu_bypass_disabled = true;
-			pr_info("PowerNV: IOMMU bypass window disabled.\n");
-			break;
-		}
-		str += strcspn(str, ",");
-		if (*str == ',')
-			str++;
-	}
-
-	return 0;
-}
-early_param("iommu", iommu_setup);
-
 static int __init pci_reset_phbs_setup(char *str)
 {
 	pci_reset_phbs = true;
@@ -2456,7 +2435,7 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
 		return rc;
 	}
 
-	if (!pnv_iommu_bypass_disabled)
+	if (IOMMU_DMA_MODE_IS_PASSTHROUGH())
 		pnv_pci_ioda2_set_bypass(pe, true);
 
 	return 0;
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d88dc44d60d88ea..b053eeaa82ebca8 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
 choice
 	prompt "IOMMU dma mode"
 	depends on IOMMU_API
+	default IOMMU_DMA_MODE_PASSTHROUGH if (PPC_POWERNV && PCI)
 	default IOMMU_DMA_MODE_LAZY if S390_IOMMU
 	default IOMMU_DMA_MODE_STRICT
 	help
-- 
1.8.3
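For anyone updating boot command lines, the substitutions stated in the commit messages of patches 5/6 and 6/6 of this series can be collected into a small lookup table. The sketch below is only an illustration of that mapping. The table entries come from those commit messages, while the `demo_*` names are hypothetical, and nothing like this table is part of the series itself.

```c
#include <stddef.h>
#include <string.h>

/* One legacy boot option and the iommu.dma_mode value that replaces it. */
struct demo_legacy_option {
	const char *legacy;	/* old spelling on the kernel command line */
	const char *dma_mode;	/* value to pass as iommu.dma_mode=<value> */
};

static const struct demo_legacy_option demo_legacy_map[] = {
	{ "iommu=nobypass",	 "strict" },	  /* PPC/POWERNV, patch 5/6 */
	{ "iommu=pt",		 "passthrough" }, /* x86, patch 6/6 */
	{ "iommu=nopt",		 "lazy" },	  /* x86, patch 6/6 */
	{ "intel_iommu=strict",	 "strict" },	  /* x86, patch 6/6 */
	{ "amd_iommu=fullflush", "strict" },	  /* x86, patch 6/6 */
};

/*
 * Hypothetical helper: return the replacement iommu.dma_mode value for a
 * legacy option, or NULL if the option is not one of those listed above.
 */
const char *demo_legacy_to_dma_mode(const char *opt)
{
	size_t i;

	if (opt == NULL)
		return NULL;

	for (i = 0; i < sizeof(demo_legacy_map) / sizeof(demo_legacy_map[0]); i++) {
		if (strcmp(opt, demo_legacy_map[i].legacy) == 0)
			return demo_legacy_map[i].dma_mode;
	}
	return NULL;
}
```

For example, `demo_legacy_to_dma_mode("iommu=nopt")` yields "lazy", so a command line that used `iommu=nopt` would use `iommu.dma_mode=lazy` after this series.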
[PATCH v4 6/6] x86/iommu: use common boot option iommu.dma_mode
iommu=pt can be replaced with iommu.dma_mode=passthrough.
iommu=nopt can be replaced with iommu.dma_mode=lazy.
intel_iommu=strict can be replaced with iommu.dma_mode=strict.
amd_iommu=fullflush can be replaced with iommu.dma_mode=strict.

Note: intel_iommu_strict is not deleted because it can also be assigned in
quirk_calpella_no_shadow_gtt().

Signed-off-by: Zhen Lei
---
 arch/ia64/include/asm/iommu.h   |  2 --
 arch/ia64/kernel/pci-dma.c      |  2 --
 arch/x86/include/asm/iommu.h    |  1 -
 arch/x86/kernel/pci-dma.c       | 20
 drivers/iommu/Kconfig           | 14 ++
 drivers/iommu/amd_iommu.c       | 12 ++--
 drivers/iommu/amd_iommu_init.c  |  4
 drivers/iommu/amd_iommu_types.h |  6 --
 drivers/iommu/intel-iommu.c     |  7 ++-
 9 files changed, 14 insertions(+), 54 deletions(-)

diff --git a/arch/ia64/include/asm/iommu.h b/arch/ia64/include/asm/iommu.h
index 7429a72f3f92199..92aceef63710861 100644
--- a/arch/ia64/include/asm/iommu.h
+++ b/arch/ia64/include/asm/iommu.h
@@ -8,10 +8,8 @@
 extern void no_iommu_init(void);
 #ifdef CONFIG_INTEL_IOMMU
 extern int force_iommu, no_iommu;
-extern int iommu_pass_through;
 extern int iommu_detected;
 #else
-#define iommu_pass_through	(0)
 #define no_iommu		(1)
 #define iommu_detected		(0)
 #endif
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index fe988c49f01ce6a..f5d49cd3fbb01a9 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -22,8 +22,6 @@
 int force_iommu __read_mostly;
 #endif
 
-int iommu_pass_through;
-
 static int __init pci_iommu_init(void)
 {
 	if (iommu_detected)
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index baedab8ac5385f7..b91623d521d9f0f 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -4,7 +4,6 @@
 
 extern int force_iommu, no_iommu;
 extern int iommu_detected;
-extern int iommu_pass_through;
 
 /* 10 seconds */
 #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index d460998ae828514..bd63d80597ae6d0 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -34,21 +34,6 @@
 /* Set this to 1 if there is a HW IOMMU in the system */
 int iommu_detected __read_mostly = 0;
 
-/*
- * This variable becomes 1 if iommu=pt is passed on the kernel command line.
- * If this variable is 1, IOMMU implementations do no DMA translation for
- * devices and allow every device to access to whole physical memory. This is
- * useful if a user wants to use an IOMMU only for KVM device assignment to
- * guests and not for driver dma translation.
- * It is also possible to disable by default in kernel config, and enable with
- * iommu=nopt at boot time.
- */
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-int iommu_pass_through __read_mostly = 1;
-#else
-int iommu_pass_through __read_mostly;
-#endif
-
 extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
 
 /* Dummy device used for NULL arguments (normally ISA). */
@@ -139,11 +124,6 @@ static __init int iommu_setup(char *p)
 		if (!strncmp(p, "soft", 4))
 			swiotlb = 1;
 #endif
-		if (!strncmp(p, "pt", 2))
-			iommu_pass_through = 1;
-		if (!strncmp(p, "nopt", 4))
-			iommu_pass_through = 0;
-
 		gart_parse_options(p);
 
 #ifdef CONFIG_CALGARY_IOMMU
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b053eeaa82ebca8..ff8c5d0d435cf58 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -78,7 +78,7 @@ choice
 	prompt "IOMMU dma mode"
 	depends on IOMMU_API
 	default IOMMU_DMA_MODE_PASSTHROUGH if (PPC_POWERNV && PCI)
-	default IOMMU_DMA_MODE_LAZY if S390_IOMMU
+	default IOMMU_DMA_MODE_LAZY if (AMD_IOMMU || INTEL_IOMMU || S390_IOMMU)
 	default IOMMU_DMA_MODE_STRICT
 	help
 	  IOMMU dma mode, such as: passthrough, lazy, strict.
@@ -87,9 +87,8 @@ config IOMMU_DMA_MODE_PASSTHROUGH
 	bool "Configure DMA to bypass the IOMMU"
 	help
 	  Enable passthrough by default, removing the need to pass in
-	  iommu.dma_mode=passthrough or iommu=pt through command line. If this
-	  is enabled, you can still disable with iommu.dma_mode={lazy|strict}
-	  or iommu=nopt depending on the architecture.
+	  iommu.dma_mode=passthrough through command line. If this is enabled,
+	  you can still disable with iommu.dma_mode={lazy|strict}.
 
 config IOMMU_DMA_MODE_LAZY
 	bool "IOMMU DMA use lazy mode to flush IOTLB and free IOVA"
@@ -97,10 +96,9 @@ config IOMMU_DMA_MODE_LAZY
 	  Support lazy mode, where for every IOMMU DMA unmap operation, the
 	  flush operation of IOTLB and the free operation of IOVA are deferred.
 	  They are only guaranteed to be done before the related IOVA will be
-
Re: [PATCH 02/21] arm64/iommu: improve mmap bounds checking
On Fri, Apr 05, 2019 at 06:30:52PM +0100, Robin Murphy wrote:
> On 27/03/2019 08:04, Christoph Hellwig wrote:
>> The nr_pages checks should be done for all mmap requests, not just those
>> using remap_pfn_range.
>
> Hmm, the logic in iommu_dma_mmap() inherently returns an error for the "off
> >= nr_pages" case already. It's also supposed to be robust against the
> "vma_pages(vma) > nr_pages - off" condition, although by making the partial
> mapping and treating it as a success, rather than doing nothing and
> returning an error. What's the exact motivation here?

Have one error check at the front of the function that is identical
to the mmap checks in the other dma_map_ops instances so that:

 a) we get the same error behavior for partial requests everywhere
 b) we can lift these checks into common code in the next round.
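The up-front check being discussed combines the two conditions quoted above, "off >= nr_pages" and "vma_pages(vma) > nr_pages - off", and rejects the request before any mapping is attempted. A minimal userspace sketch of that predicate follows. The function name and plain `unsigned long` parameters are hypothetical stand-ins for the kernel's vma fields, and returning -ENXIO is an assumption about the error code, not something stated in this thread.

```c
#include <errno.h>

/*
 * Hypothetical stand-alone version of the mmap bounds check.
 *
 * off       - page offset into the buffer requested by the caller
 * vma_pages - number of pages the caller wants mapped
 * nr_pages  - number of pages the buffer actually has
 *
 * Returns 0 if [off, off + vma_pages) lies inside the buffer,
 * or a negative errno if the request must be rejected.
 */
int demo_mmap_bounds_check(unsigned long off, unsigned long vma_pages,
			   unsigned long nr_pages)
{
	/*
	 * The first test runs before the subtraction, so "nr_pages - off"
	 * cannot wrap around when it is evaluated.
	 */
	if (off >= nr_pages || vma_pages > nr_pages - off)
		return -ENXIO;	/* assumed error code for this sketch */

	return 0;
}
```

With a 4-page buffer, a request for 2 pages at offset 2 passes, while a request for 3 pages at offset 2 is rejected outright instead of being partially mapped. This is the "same error behavior for partial requests" named in point a).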