Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options

2019-04-07 Thread Leizhen (ThunderTown)



On 2019/4/8 9:14, Hanjun Guo wrote:
> Hi Zhen,
> 
> On 2019/4/7 20:41, Zhen Lei wrote:
>> As Robin Murphy suggested:
>> "It's also not necessarily obvious to the user how this interacts with
>> IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
>> would be better to refactor the whole lot into a single selection of
>> something like IOMMU_DEFAULT_MODE anyway."
>>
>> In this version, I tried to normalize the IOMMU dma mode boot options for all
>> ARCHs. When the IOMMU is enabled, there are 3 dma modes: passthrough (bypass),
>> lazy (map, but defer the IOTLB invalidation), and strict. But currently each
>> ARCH defines its own private boot options, which differ from each other. For
>> example, to enable/disable "passthrough", ARM64 uses iommu.passthrough=1/0,
>> X86 uses iommu=pt/nopt, and PPC/POWERNV uses iommu=nobypass.
>>
>>
>> Zhen Lei (6):
>>   iommu: use iommu.dma_mode to replace iommu.passthrough and
>>     iommu.strict
>>   iommu: keep dma mode build options consistent with cmdline options
>>   iommu: add iommu_default_dma_mode_get() helper
>>   s390/pci: use common boot option iommu.dma_mode
>>   powernv/iommu: use common boot option iommu.dma_mode
>>   x86/iommu: use common boot option iommu.dma_mode
> 
> This will break systems that use the current boot options, and I think
> that is unacceptable. If you want to do this, just introduce iommu.dma_mode
> on top of the existing iommu boot options, leaving the per-arch dma mode
> boot options unchanged, so that iommu.dma_mode works on all archs while
> staying compatible with them.

I only changed the names of the boot options; their behaviour is unchanged. I
added all the related maintainers/supporters to the "to=" list, so maybe we
can discuss this. Should I add some "obsolete" warnings for the old options
and keep them for a while?
Still, I think this kind of thing is best done in one go.
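
To make the "keep the old options for a while" idea concrete, here is a rough user-space sketch, not kernel code. dma_mode_from_legacy(), dma_mode_name() and the warning text are hypothetical names made up for illustration; only the arm64 spellings mentioned in this thread are modelled, and the rule "last option on the command line wins" is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values mirror the IOMMU_DMA_MODE_* constants introduced by this series. */
enum dma_mode {
    DMA_MODE_STRICT = 0,
    DMA_MODE_LAZY = 1,
    DMA_MODE_PASSTHROUGH = 2,
};

/* Hypothetical helper: the value a user would pass to iommu.dma_mode=. */
const char *dma_mode_name(enum dma_mode mode)
{
    assert(mode >= DMA_MODE_STRICT && mode <= DMA_MODE_PASSTHROUGH);
    switch (mode) {
    case DMA_MODE_LAZY:
        return "lazy";
    case DMA_MODE_PASSTHROUGH:
        return "passthrough";
    default:
        return "strict";
    }
}

/*
 * Hypothetical helper: apply one legacy arm64 option to *mode and print an
 * "obsolete" warning. Returns 0 if the option was a known legacy spelling,
 * -1 otherwise (in which case *mode is left untouched).
 */
int dma_mode_from_legacy(const char *opt, enum dma_mode *mode)
{
    if (!opt || !mode)
        return -1;

    if (strcmp(opt, "iommu.passthrough=1") == 0) {
        *mode = DMA_MODE_PASSTHROUGH;
    } else if (strcmp(opt, "iommu.passthrough=0") == 0) {
        /* Only leave passthrough; keep a lazy/strict choice made earlier. */
        if (*mode == DMA_MODE_PASSTHROUGH)
            *mode = DMA_MODE_STRICT;
    } else if (strcmp(opt, "iommu.strict=1") == 0) {
        *mode = DMA_MODE_STRICT;
    } else if (strcmp(opt, "iommu.strict=0") == 0) {
        *mode = DMA_MODE_LAZY;
    } else {
        return -1;
    }

    fprintf(stderr, "warning: \"%s\" is obsolete, use iommu.dma_mode=%s\n",
            opt, dma_mode_name(*mode));
    return 0;
}
```

With something along these lines the old spellings would keep working during a transition period, and users would be told the replacement on every boot.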

> 
> Thanks
> Hanjun
> 
> 
> .
> 

-- 
Thanks!
BestRegards

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-04-07 Thread Lu Baolu

Hi,

On 4/6/19 2:02 AM, James Sewart wrote:

Hey Lu,

My bad, did some debugging on my end. The issue was swapping out
find_domain for iommu_get_domain_for_dev. It seems in some situations the
domain is not attached to the group but the device is expected to have the
domain still stored in its archdata.

I’ve attached the final patch with find_domain unremoved which seems to
work in my testing.


This version works for me now.



Cheers,
James.


Best regards,
Lu Baolu








On 4 Apr 2019, at 07:49, Lu Baolu  wrote:

Hi James,

I did a sanity test from my end. The test machine fails to boot. I
haven't seen any useful kernel log. It results in a purple screen.

Best regards,
Lu Baolu

On 3/29/19 11:26 PM, James Sewart wrote:

Hey Lu,
I’ve attached a preliminary v3, if you could take a look and run some tests
that would be great.
Since v2 I’ve added your default domain type patches, the new device_group
handler, and incorporated Jacob’s feedback.

On 28 Mar 2019, at 18:37, James Sewart  wrote:

Hey Lu,


On 26 Mar 2019, at 01:24, Lu Baolu  wrote:

Hi James,

On 3/25/19 8:57 PM, James Sewart wrote:

There's an issue: if we choose to alloc a new resv_region with type
IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
to free this entry type, which means refactoring the rmrr regions in
get_resv_regions. Should this work be in this patchset?

Do you mean the rmrr regions are not allocated in get_resv_regions, but
are freed in put_resv_regions? I think we should fix this in this patch
set since this might impact the device passthrough if we don't do it.

They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
freed in put_resv_regions. If we allocate a new resv_region with type
IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
the static RMRR regions.
Either the ISA region is static and not freed as with my implementation,
or the RMRR regions are converted to be allocated on each call to
get_resv_regions and freed in put_resv_regions.


By the way, I have another approach in mind. Let's add a new region type
for LPC devices, e.g. IOMMU_RESV_LPC, and then handle it in the same way
as those MSI regions. Just FYI.


This solution would require adding some extra code to
iommu_group_create_direct_mappings as currently only type
IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.




Best regards,
Lu Baolu


Cheers,
James.

Cheers,
James.




Re: [PATCH v2 0/2] iommu/arm-smmu-v3: make sure the kdump kernel can work well when smmu is enabled

2019-04-07 Thread Leizhen (ThunderTown)
Hi Will,

On 2019/4/4 23:30, Will Deacon wrote:
> Hi Zhen Lei,
> 
> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>> v1 --> v2:
>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>> (Report abort to device, no event recorded) to suppress the event messages
>> caused by the unexpected devices.
>> 2. rewrite the patch description.
> 
> This issue came up a while back:
> 
> https://lore.kernel.org/linux-pci/20180302103032.gb19...@arm.com/
> 
> and I'd still prefer to solve it using the disable_bypass logic which we
> already have. Something along the lines of the diff below?

Yes, my patches also use disable_bypass=1 (set ste.config=0b000). If
SMMU_IDR0.ST_LEVEL=0 (Linear Stream table supported), then all STE entries
are allocated and initialized (set ste.config=0b000). But if
SMMU_IDR0.ST_LEVEL=1 (2-level Stream Table), we only allocate and initialize
the first-level tables, and leave the level 2 tables to be allocated
dynamically. That means C_BAD_STREAMID (eventid=0x2) will be reported if an
unexpected device accesses memory without having been reinitialized in the
kdump kernel. So my patches allocate a dummy level 2 table (STE table) and
make all level 1 table entries point to it in advance. That means all memory
accesses from unexpected devices are aborted based on this dummy STE table.
When an expected device (one that needs to be used in the kdump kernel) is
attached, we allocate a new level 2 table (STE table) for it, but keep the
other entries pointing to the dummy STE table.
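
The dummy level 2 table idea can be modelled in a few lines of user-space C. This is only a sketch under assumptions, not the driver code: struct strtab, strtab_init(), strtab_attach() and strtab_lookup() are made-up names, the table sizes are tiny on purpose, and 0b101 is used merely as a marker for "this stream has been configured".

```c
#include <assert.h>
#include <stdlib.h>

#define L1_ENTRIES 4            /* illustrative sizes, far below real hardware */
#define L2_ENTRIES 8
#define STE_CFG_ABORT    0x0    /* STE.config=0b000: abort, no event recorded */
#define STE_CFG_S1_TRANS 0x5    /* 0b101, used here as "device is configured" */

struct ste {
    unsigned int config;
};

struct strtab {
    struct ste dummy[L2_ENTRIES];   /* shared level 2 table, every STE aborts */
    struct ste *l1[L1_ENTRIES];     /* level 1: one pointer per level 2 table */
};

/* Point every level 1 entry at the shared dummy table up front. */
void strtab_init(struct strtab *st)
{
    for (int i = 0; i < L2_ENTRIES; i++)
        st->dummy[i].config = STE_CFG_ABORT;
    for (int i = 0; i < L1_ENTRIES; i++)
        st->l1[i] = st->dummy;
}

/* Attach an expected device: give its level 1 slot a private level 2 table. */
int strtab_attach(struct strtab *st, unsigned int sid)
{
    unsigned int i1 = sid / L2_ENTRIES, i2 = sid % L2_ENTRIES;

    if (i1 >= L1_ENTRIES)
        return -1;
    if (st->l1[i1] == st->dummy) {
        struct ste *tbl = malloc(L2_ENTRIES * sizeof(*tbl));

        if (!tbl)
            return -1;
        /* Neighbouring streams in the new table still abort. */
        for (int i = 0; i < L2_ENTRIES; i++)
            tbl[i].config = STE_CFG_ABORT;
        st->l1[i1] = tbl;
    }
    st->l1[i1][i2].config = STE_CFG_S1_TRANS;
    return 0;
}

/* A lookup never hits a missing level 2 table, so no C_BAD_STREAMID case. */
unsigned int strtab_lookup(const struct strtab *st, unsigned int sid)
{
    assert(sid < L1_ENTRIES * L2_ENTRIES);
    return st->l1[sid / L2_ENTRIES][sid % L2_ENTRIES].config;
}

/* Free only the private tables; the dummy table lives inside the struct. */
void strtab_free(struct strtab *st)
{
    for (int i = 0; i < L1_ENTRIES; i++) {
        if (st->l1[i] != st->dummy)
            free(st->l1[i]);
        st->l1[i] = st->dummy;
    }
}
```

The cost is a single extra level 2 table, regardless of how many level 1 entries share it.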


> 
> We're relying on the DMA API not subsequently requesting a passthrough
> domain, but it should only do that if you've configured your crashkernel
> to do so.
> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..91b8f3b2ee25 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>  	/* Clear CR0 and sync (disables SMMU and queue processing) */
>  	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>  	if (reg & CR0_SMMUEN) {
> -		if (is_kdump_kernel()) {
> -			arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> -			arm_smmu_device_disable(smmu);
> -			return -EBUSY;
> -		}
> -
>  		dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> +		WARN_ON(is_kdump_kernel() && !disable_bypass);
> +		arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>  	}
>  
>  	ret = arm_smmu_device_disable(smmu);
> 
> .
> 

-- 
Thanks!
BestRegards



Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options

2019-04-07 Thread Hanjun Guo
Hi Zhen,

On 2019/4/7 20:41, Zhen Lei wrote:
> As Robin Murphy suggested:
> "It's also not necessarily obvious to the user how this interacts with
> IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
> would be better to refactor the whole lot into a single selection of something
> like IOMMU_DEFAULT_MODE anyway."
> 
> In this version, I tried to normalize the IOMMU dma mode boot options for all
> ARCHs. When the IOMMU is enabled, there are 3 dma modes: passthrough (bypass),
> lazy (map, but defer the IOTLB invalidation), and strict. But currently each
> ARCH defines its own private boot options, which differ from each other. For
> example, to enable/disable "passthrough", ARM64 uses iommu.passthrough=1/0,
> X86 uses iommu=pt/nopt, and PPC/POWERNV uses iommu=nobypass.
> 
> 
> Zhen Lei (6):
>   iommu: use iommu.dma_mode to replace iommu.passthrough and
>     iommu.strict
>   iommu: keep dma mode build options consistent with cmdline options
>   iommu: add iommu_default_dma_mode_get() helper
>   s390/pci: use common boot option iommu.dma_mode
>   powernv/iommu: use common boot option iommu.dma_mode
>   x86/iommu: use common boot option iommu.dma_mode

This will break systems that use the current boot options, and I think
that is unacceptable. If you want to do this, just introduce iommu.dma_mode
on top of the existing iommu boot options, leaving the per-arch dma mode
boot options unchanged, so that iommu.dma_mode works on all archs while
staying compatible with them.

Thanks
Hanjun



Re: kernel BUG at drivers/iommu/intel-iommu.c:608

2019-04-07 Thread Alex Williamson
On Sun, 7 Apr 2019 16:02:31 -0700
Bart Van Assche  wrote:

> On 4/7/19 2:06 PM, Alex Williamson wrote:
> > On Sun, 7 Apr 2019 12:10:38 -0700
> > Bart Van Assche  wrote:  
> >> If I tell qemu to use PCI pass-through for a PCI adapter and next load the
> >> lpfc driver for an lpfc adapter that has not been passed through to any VM
> >> a kernel bug is hit. Do you perhaps know whether it should be possible to
> >> load a kernel driver in this scenario? If so, do you know what should change
> >> to avoid hitting this kernel bug? Should the iommu code be modified or
> >> should the lpfc code be modified? I'm asking you because I think that you
> >> introduced the BUG() statement that was hit. See also commit ab8dfe251571
> >> ("iommu/vt-d: Introduce helper functions to improve code readability";
> >> v3.17).
> > 
> > Do both of these lpfc devices belong to the same IOMMU group?
> > (/sys/kernel/iommu_groups/)  Thanks,  
> 
> Hi Alex,
> 
> Apparently the two Emulex (lpfc) and the two QLogic ports are in the same
> IOMMU group:
> 
> # lspci | grep -E 'QLogic|Emulex'
> 01:00.0 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
> 01:00.1 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
> 02:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
> 02:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
> 
> # ls -d /sys/kernel/iommu_groups/*/devices/:0[12]:00.*
> /sys/kernel/iommu_groups/1/devices/:01:00.0
> /sys/kernel/iommu_groups/1/devices/:01:00.1
> /sys/kernel/iommu_groups/1/devices/:02:00.0
> /sys/kernel/iommu_groups/1/devices/:02:00.1

It's not possible to do what you want with this configuration.  An IOMMU
group represents the smallest set of devices that are isolated from
other sets of devices and is also therefore the minimum granularity we
can assign devices to userspace (ex. QEMU).  The kernel reacts to
breaking the isolation of the group with a BUG_ON.  If you managed not
to hit the BUG_ON here, you'd hit the BUG_ON in vfio code when the loss
of isolation is detected there. IOMMU groups are formed at the highest
point in the topology which guarantees isolation.  This can be
indicated either via native PCIe ACS support or ACS-equivalent quirks
in the code.  If the root port provides neither of these, then all
devices downstream are grouped together as well as all peer root ports
in the same PCI slot and all devices downstream of those.  If a
multifunction endpoint does not provide ACS or equivalent quirks, the
functions will be grouped together. Not all endpoint devices or systems
are designed for minimum possible granularity.  You can learn more
here[1].  Thanks,

Alex

[1] http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html
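
For anyone who wants to check group membership programmatically rather than by eye, a small sketch follows. It assumes only the /sys/kernel/iommu_groups/<group>/devices/<device> layout shown in the listing above; parse_iommu_group_path() and same_iommu_group() are hypothetical helper names, not an existing API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split a sysfs path of the form
 *   /sys/kernel/iommu_groups/<group>/devices/<device>
 * into the group number and the device name.
 * Returns 0 on success, -1 if the path does not have that shape or the
 * device name does not fit into dev.
 */
int parse_iommu_group_path(const char *path, int *group,
                           char *dev, size_t devlen)
{
    char buf[64];
    int g;

    assert(path && group && dev && devlen > 0);

    if (sscanf(path, "/sys/kernel/iommu_groups/%d/devices/%63s", &g, buf) != 2)
        return -1;
    if (strlen(buf) >= devlen)
        return -1;

    strcpy(dev, buf);
    *group = g;
    return 0;
}

/* Returns 1 if both paths name the same group, 0 if not, -1 on a bad path. */
int same_iommu_group(const char *a, const char *b)
{
    char dev_a[64], dev_b[64];
    int group_a, group_b;

    if (parse_iommu_group_path(a, &group_a, dev_a, sizeof(dev_a)) ||
        parse_iommu_group_path(b, &group_b, dev_b, sizeof(dev_b)))
        return -1;
    return group_a == group_b;
}
```

If two devices come back as members of the same group, assigning one of them to a VM while a host driver binds the other is exactly the configuration described above as not possible.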


Re: kernel BUG at drivers/iommu/intel-iommu.c:608

2019-04-07 Thread Bart Van Assche

On 4/7/19 2:06 PM, Alex Williamson wrote:

On Sun, 7 Apr 2019 12:10:38 -0700
Bart Van Assche  wrote:

If I tell qemu to use PCI pass-through for a PCI adapter and next load the
lpfc driver for an lpfc adapter that has not been passed through to any VM
a kernel bug is hit. Do you perhaps know whether it should be possible to
load a kernel driver in this scenario? If so, do you know what should change
to avoid hitting this kernel bug? Should the iommu code be modified or
should the lpfc code be modified? I'm asking you because I think that you
introduced the BUG() statement that was hit. See also commit ab8dfe251571
("iommu/vt-d: Introduce helper functions to improve code readability"; v3.17).


Do both of these lpfc devices belong to the same IOMMU group?
(/sys/kernel/iommu_groups/)  Thanks,


Hi Alex,

Apparently the two Emulex (lpfc) and the two QLogic ports are in the same
IOMMU group:

# lspci | grep -E 'QLogic|Emulex'
01:00.0 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
01:00.1 Fibre Channel: Emulex Corporation Lancer Gen6: LPe32000 Fibre Channel Host Adapter (rev 01)
02:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
02:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)

# ls -d /sys/kernel/iommu_groups/*/devices/:0[12]:00.*
/sys/kernel/iommu_groups/1/devices/:01:00.0
/sys/kernel/iommu_groups/1/devices/:01:00.1
/sys/kernel/iommu_groups/1/devices/:02:00.0
/sys/kernel/iommu_groups/1/devices/:02:00.1

Thanks,

Bart.


Re: kernel BUG at drivers/iommu/intel-iommu.c:608

2019-04-07 Thread Alex Williamson
On Sun, 7 Apr 2019 12:10:38 -0700
Bart Van Assche  wrote:

> Hi Jiang,
> 
> If I tell qemu to use PCI pass-through for a PCI adapter and next load the
> lpfc driver for an lpfc adapter that has not been passed through to any VM
> a kernel bug is hit. Do you perhaps know whether it should be possible to
> load a kernel driver in this scenario? If so, do you know what should change
> to avoid hitting this kernel bug? Should the iommu code be modified or
> should the lpfc code be modified? I'm asking you because I think that you
> introduced the BUG() statement that was hit. See also commit ab8dfe251571
> ("iommu/vt-d: Introduce helper functions to improve code readability"; v3.17).

Do both of these lpfc devices belong to the same IOMMU group?
(/sys/kernel/iommu_groups/)  Thanks,

Alex

> ------------[ cut here ]------------
> kernel BUG at drivers/iommu/intel-iommu.c:608!
> invalid opcode:  [#1] SMP
> CPU: 7 PID: 7842 Comm: modprobe Not tainted 5.0.7+ #2
> Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD5H/Z97X-UD5H, BIOS F10 08/03/2015
> RIP: 0010:domain_get_iommu+0x50/0x60
> Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 39 c2 48 63 c8 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 15 9b cd 00 48 8b 04 08 c3 <0f> 0b 31 c9 eb ee 66 2e 0f 1f 84 00 00 00 00 00 41 55 8b 05 d0 9a
> RSP: 0018:a7884024ba60 EFLAGS: 00010202
> RAX: 96c8897c60c0 RBX: 0004046c2000 RCX: 96c88edab000
> RDX: fff0 RSI: 96c88e08de80 RDI: 96c8897c60c0
> RBP:  R08: 96c88b806b40 R09: 96c88f802f50
> R10:  R11: 0001 R12: 96c88edab0b0
> R13:  R14: 1000 R15: 96c8897c60c0
> FS:  7fec74c05b80() GS:96c89fbc() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 7fec74c5dc80 CR3: 000404686004 CR4: 001626e0
> Call Trace:
>   __intel_map_page+0x7e/0x150
>   intel_alloc_coherent+0xa7/0x130
>   dma_alloc_attrs+0x6b/0xc0
>   dma_pool_alloc+0xb8/0x1a0
>   lpfc_mem_alloc+0x109/0x3e0 [lpfc]
>   lpfc_pci_probe_one+0xdac/0x2060 [lpfc]
>   pci_device_probe+0xc3/0x140
>   really_probe+0xd2/0x380
>   driver_probe_device+0xae/0xf0
>   __driver_attach+0xd5/0x100
>   ? driver_probe_device+0xf0/0xf0
>   bus_for_each_dev+0x5b/0x90
>   bus_add_driver+0x208/0x220
>   ? 0xc0ad9000
>   driver_register+0x66/0xb0
>   ? 0xc0ad9000
>   lpfc_init+0xd5/0x1000 [lpfc]
>   do_one_initcall+0x2e/0x181
>   ? __vunmap+0x75/0xb0
>   do_init_module+0x55/0x1e0
>   load_module+0x2438/0x2560
>   ? __do_sys_finit_module+0x8f/0xd0
>   __do_sys_finit_module+0x8f/0xd0
>   do_syscall_64+0x44/0xf0
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7fec74d212f9
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6f 4b 0c 00 f7 d8 64 89 01 48
> RSP: 002b:7ffea26c1458 EFLAGS: 0246 ORIG_RAX: 0139
> RAX: ffda RBX: 55eeaf1669c0 RCX: 7fec74d212f9
> RDX:  RSI: 55eeaf0ee3c8 RDI: 0004
> RBP:  R08:  R09: 55eeaf166400
> R10: 0004 R11: 0246 R12: 55eeaf0ee3c8
> R13: 0004 R14: 55eeaf16eb90 R15: 55eeaf1669c0
> Modules linked in: lpfc(+) scsi_transport_fc mlx4_ib ib_uverbs ib_core 
> mlx4_en mlx4_core pci_stub af_packet vhost_net vhost tun vfio_pci vfio_virqfd 
> vfio_iommu_type1 vfio fuse dm_crypt algif_skcipher af_alg loop devlink bridge 
> stp llc xt_tcpudp ip6t_rpfilter ip6t_REJECT 
> nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat 
> ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security 
> iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security 
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
> scsi_transport_iscsi ip_set nfnetlink ebtable_filter ebtables ip6table_filter 
> ip6_tables iptable_filter ip_tables x_tables bpfilter coretemp hwmon 
> intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass 
> crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul 
> ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic aesni_intel 
> aes_x86_64 crypto_simd cryptd snd_hda_intel iTCO_wdt mei_me 
> iTCO_vendor_support joydev mxm_wmi glue_helper e1000e alx
>   mdio intel_rapl_perf snd_hda_codec mei ptp lpc_ich pcspkr i2c_i801 
> snd_hda_core pps_core mfd_core fan thermal wmi pcc_cpufreq acpi_pad button 
> snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device snd_pcm 
> snd_timer snd soundcore ext4 crc16 mbcache jbd2 
> hid_generic usbhid sd_mod i915 intel_gtt i2c_algo_bit iosf_mbi drm_kms_helper 
> syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm 
> drm_panel_orientation_quirks xhci_pci libata ehci_pci agpgart ehci_hcd 
> xhci_hcd i2c_core video usbcore usb_common sg dm_multipath 
> dm_mod scsi_dh_rdac scsi_dh_emc 

kernel BUG at drivers/iommu/intel-iommu.c:608

2019-04-07 Thread Bart Van Assche

Hi Jiang,

If I tell qemu to use PCI pass-through for a PCI adapter and next load the
lpfc driver for an lpfc adapter that has not been passed through to any VM
a kernel bug is hit. Do you perhaps know whether it should be possible to
load a kernel driver in this scenario? If so, do you know what should change
to avoid hitting this kernel bug? Should the iommu code be modified or
should the lpfc code be modified? I'm asking you because I think that you
introduced the BUG() statement that was hit. See also commit ab8dfe251571
("iommu/vt-d: Introduce helper functions to improve code readability"; v3.17).

Thank you,

Bart.

------------[ cut here ]------------
kernel BUG at drivers/iommu/intel-iommu.c:608!
invalid opcode:  [#1] SMP
CPU: 7 PID: 7842 Comm: modprobe Not tainted 5.0.7+ #2
Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD5H/Z97X-UD5H, BIOS F10 08/03/2015
RIP: 0010:domain_get_iommu+0x50/0x60
Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 39 c2 48 63 c8 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 15 9b cd 00 48 8b 04 08 c3 <0f> 0b 31 c9 eb ee 66 2e 0f 1f 84 00 00 00 00 00 41 55 8b 05 d0 9a
RSP: 0018:a7884024ba60 EFLAGS: 00010202
RAX: 96c8897c60c0 RBX: 0004046c2000 RCX: 96c88edab000
RDX: fff0 RSI: 96c88e08de80 RDI: 96c8897c60c0
RBP:  R08: 96c88b806b40 R09: 96c88f802f50
R10:  R11: 0001 R12: 96c88edab0b0
R13:  R14: 1000 R15: 96c8897c60c0
FS:  7fec74c05b80() GS:96c89fbc() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fec74c5dc80 CR3: 000404686004 CR4: 001626e0
Call Trace:
 __intel_map_page+0x7e/0x150
 intel_alloc_coherent+0xa7/0x130
 dma_alloc_attrs+0x6b/0xc0
 dma_pool_alloc+0xb8/0x1a0
 lpfc_mem_alloc+0x109/0x3e0 [lpfc]
 lpfc_pci_probe_one+0xdac/0x2060 [lpfc]
 pci_device_probe+0xc3/0x140
 really_probe+0xd2/0x380
 driver_probe_device+0xae/0xf0
 __driver_attach+0xd5/0x100
 ? driver_probe_device+0xf0/0xf0
 bus_for_each_dev+0x5b/0x90
 bus_add_driver+0x208/0x220
 ? 0xc0ad9000
 driver_register+0x66/0xb0
 ? 0xc0ad9000
 lpfc_init+0xd5/0x1000 [lpfc]
 do_one_initcall+0x2e/0x181
 ? __vunmap+0x75/0xb0
 do_init_module+0x55/0x1e0
 load_module+0x2438/0x2560
 ? __do_sys_finit_module+0x8f/0xd0
 __do_sys_finit_module+0x8f/0xd0
 do_syscall_64+0x44/0xf0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fec74d212f9
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6f 4b 0c 00 f7 d8 64 89 01 48
RSP: 002b:7ffea26c1458 EFLAGS: 0246 ORIG_RAX: 0139
RAX: ffda RBX: 55eeaf1669c0 RCX: 7fec74d212f9
RDX:  RSI: 55eeaf0ee3c8 RDI: 0004
RBP:  R08:  R09: 55eeaf166400
R10: 0004 R11: 0246 R12: 55eeaf0ee3c8
R13: 0004 R14: 55eeaf16eb90 R15: 55eeaf1669c0
Modules linked in: lpfc(+) scsi_transport_fc mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core pci_stub af_packet vhost_net vhost tun vfio_pci vfio_virqfd vfio_iommu_type1 vfio fuse dm_crypt algif_skcipher af_alg loop devlink bridge stp llc xt_tcpudp ip6t_rpfilter ip6t_REJECT 
nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
scsi_transport_iscsi ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter coretemp hwmon intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul snd_hda_codec_hdmi crc32_pclmul 
ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic aesni_intel aes_x86_64 crypto_simd cryptd snd_hda_intel iTCO_wdt mei_me iTCO_vendor_support joydev mxm_wmi glue_helper e1000e alx
 mdio intel_rapl_perf snd_hda_codec mei ptp lpc_ich pcspkr i2c_i801 snd_hda_core pps_core mfd_core fan thermal wmi pcc_cpufreq acpi_pad button snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device snd_pcm snd_timer snd soundcore ext4 crc16 mbcache jbd2 
hid_generic usbhid sd_mod i915 intel_gtt i2c_algo_bit iosf_mbi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm drm_panel_orientation_quirks xhci_pci libata ehci_pci agpgart ehci_hcd xhci_hcd i2c_core video usbcore usb_common sg dm_multipath 
dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod unix ipv6 autofs4 [last unloaded: scsi_transport_fc]

---[ end trace cc531c0d02c790cc ]---


[PATCH v4 4/6] s390/pci: use common boot option iommu.dma_mode

2019-04-07 Thread Zhen Lei
s390_iommu=strict can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei 
---
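
As background for reviewers, the difference between the strict and lazy paths touched below can be modelled in user space roughly as follows. This is a simplified sketch under assumptions, not the s390 code: struct dma_space, dma_alloc_page() and dma_free_page() are made-up names, the address space is capped at 64 pages, allocation is first-fit, and a counter stands in for the IOTLB flush.

```c
#include <assert.h>
#include <stdint.h>

struct dma_space {
    uint64_t iommu_bitmap;  /* bit set: page allocated or not yet reusable */
    uint64_t lazy_bitmap;   /* bit set: page freed, IOTLB flush still pending */
    int lazy;               /* 1: lazy mode, 0: strict mode */
    unsigned int npages;    /* pages in this toy address space, at most 64 */
    unsigned int flushes;   /* number of simulated IOTLB flushes */
};

/* Allocate one page; returns its index, or -1 if the space is exhausted. */
int dma_alloc_page(struct dma_space *s)
{
    assert(s && s->npages <= 64);

    for (int pass = 0; pass < 2; pass++) {
        for (unsigned int i = 0; i < s->npages; i++) {
            uint64_t bit = UINT64_C(1) << i;

            if (!(s->iommu_bitmap & bit)) {
                s->iommu_bitmap |= bit;
                return (int)i;
            }
        }
        if (!s->lazy || !s->lazy_bitmap)
            break;
        /* Lazy mode: one global flush, then the freed pages become reusable. */
        s->flushes++;
        s->iommu_bitmap &= ~s->lazy_bitmap;
        s->lazy_bitmap = 0;
    }
    return -1;
}

/* Free one page: strict flushes right away, lazy only records the page. */
void dma_free_page(struct dma_space *s, int page)
{
    assert(s && s->npages <= 64);
    if (page < 0 || (unsigned int)page >= s->npages)
        return;

    uint64_t bit = UINT64_C(1) << page;

    if (s->lazy) {
        s->lazy_bitmap |= bit;
    } else {
        s->flushes++;
        s->iommu_bitmap &= ~bit;
    }
}
```

In this model strict mode pays one flush per unmap, while lazy mode pays one flush only when the address space runs out, at the price of stale IOTLB entries staying valid in between.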
 Documentation/admin-guide/kernel-parameters.txt |  7 -------
 arch/s390/pci/pci_dma.c                         | 20 +++++---------------
 drivers/iommu/Kconfig                           |  1 +
 3 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 60409ad23b2ac8b..a2df11945b33fc9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4095,13 +4095,6 @@
 
S   [KNL] Run init in single mode
 
-   s390_iommu= [HW,S390]
-   Set s390 IOTLB flushing mode
-   strict
-   With strict flushing every unmap operation will result in
-   an IOTLB flush. Default is lazy flushing before reuse,
-   which is faster.
-
sa1100ir[NET]
See drivers/net/irda/sa1100_ir.c.
 
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 9e52d1527f71495..6f1615c16f33995 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -17,7 +17,6 @@
 
 static struct kmem_cache *dma_region_table_cache;
 static struct kmem_cache *dma_page_table_cache;
-static int s390_iommu_strict;
 
 static int zpci_refresh_global(struct zpci_dev *zdev)
 {
@@ -193,13 +192,13 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
if (!zdev->tlb_refresh)
return 0;
} else {
-   if (!s390_iommu_strict)
+   if (IOMMU_DMA_MODE_IS_LAZY())
return 0;
}
 
ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
 PAGE_ALIGN(size));
-   if (ret == -ENOMEM && !s390_iommu_strict) {
+   if (ret == -ENOMEM && IOMMU_DMA_MODE_IS_LAZY()) {
/* enable the hypervisor to free some resources */
if (zpci_refresh_global(zdev))
goto out;
@@ -278,7 +277,7 @@ static dma_addr_t dma_alloc_address(struct device *dev, int size)
spin_lock_irqsave(&zdev->iommu_bitmap_lock, flags);
offset = __dma_alloc_iommu(dev, zdev->next_bit, size);
if (offset == -1) {
-   if (!s390_iommu_strict) {
+   if (IOMMU_DMA_MODE_IS_LAZY()) {
/* global flush before DMA addresses are reused */
if (zpci_refresh_global(zdev))
goto out_error;
@@ -313,7 +312,7 @@ static void dma_free_address(struct device *dev, dma_addr_t dma_addr, int size)
if (!zdev->iommu_bitmap)
goto out;
 
-   if (s390_iommu_strict)
+   if (!IOMMU_DMA_MODE_IS_LAZY())
bitmap_clear(zdev->iommu_bitmap, offset, size);
else
bitmap_set(zdev->lazy_bitmap, offset, size);
@@ -584,7 +583,7 @@ int zpci_dma_init_device(struct zpci_dev *zdev)
rc = -ENOMEM;
goto free_dma_table;
}
-   if (!s390_iommu_strict) {
+   if (IOMMU_DMA_MODE_IS_LAZY()) {
zdev->lazy_bitmap = vzalloc(zdev->iommu_pages / 8);
if (!zdev->lazy_bitmap) {
rc = -ENOMEM;
@@ -671,12 +670,3 @@ void zpci_dma_exit(void)
/* dma_supported is unconditionally true without a callback */
 };
 EXPORT_SYMBOL_GPL(s390_pci_dma_ops);
-
-static int __init s390_iommu_setup(char *str)
-{
-   if (!strncmp(str, "strict", 6))
-   s390_iommu_strict = 1;
-   return 0;
-}
-
-__setup("s390_iommu=", s390_iommu_setup);
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 19118cfdea1c335..d88dc44d60d88ea 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
 choice
prompt "IOMMU dma mode"
depends on IOMMU_API
+   default IOMMU_DMA_MODE_LAZY if S390_IOMMU
default IOMMU_DMA_MODE_STRICT
help
  IOMMU dma mode, such as: passthrough, lazy, strict.
-- 
1.8.3




[PATCH v4 1/6] iommu: use iommu.dma_mode to replace iommu.passthrough and iommu.strict

2019-04-07 Thread Zhen Lei
Currently the IOMMU dma has 3 modes: passthrough, lazy and strict. The
passthrough mode bypasses the IOMMU, the lazy mode defers the invalidation
of hardware TLBs, and the strict mode invalidates IOMMU hardware TLBs
synchronously. The three modes are mutually exclusive, so people may be
confused by iommu.passthrough and iommu.strict, because they cannot
coexist. It is better to replace them with iommu.dma_mode.

Signed-off-by: Zhen Lei 
---
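
To illustrate the point about mutual exclusion, here is a small user-space sketch, not the code in this patch. parse_dma_mode() and legacy_flags_to_mode() are hypothetical names; the numeric values are the IOMMU_DMA_MODE_* constants used by this series. It shows that the four combinations of the two old flags collapse onto three modes, which a single option can express directly.

```c
#include <assert.h>
#include <string.h>

/* Values as used by this series in include/linux/iommu.h. */
#define IOMMU_DMA_MODE_STRICT      0x0
#define IOMMU_DMA_MODE_LAZY        0x1
#define IOMMU_DMA_MODE_PASSTHROUGH 0x2

/* Hypothetical parser: returns the mode, or -1 for an unknown value. */
int parse_dma_mode(const char *str)
{
    static const struct {
        const char *name;
        int mode;
    } modes[] = {
        { "strict", IOMMU_DMA_MODE_STRICT },
        { "lazy", IOMMU_DMA_MODE_LAZY },
        { "passthrough", IOMMU_DMA_MODE_PASSTHROUGH },
    };

    if (!str)
        return -1;
    for (size_t i = 0; i < sizeof(modes) / sizeof(modes[0]); i++) {
        if (strcmp(str, modes[i].name) == 0)
            return modes[i].mode;
    }
    return -1;
}

/*
 * The old scheme for comparison: two independent 0/1 flags. With
 * passthrough set, the strict flag has no effect, so two of the four
 * combinations describe the same behaviour.
 */
int legacy_flags_to_mode(int passthrough, int strict)
{
    assert((passthrough == 0 || passthrough == 1) &&
           (strict == 0 || strict == 1));

    if (passthrough)
        return IOMMU_DMA_MODE_PASSTHROUGH;
    return strict ? IOMMU_DMA_MODE_STRICT : IOMMU_DMA_MODE_LAZY;
}
```
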
 Documentation/admin-guide/kernel-parameters.txt | 33 -
 drivers/iommu/Kconfig   |  4 +--
 drivers/iommu/iommu.c   | 48 ++---
 include/linux/iommu.h   |  5 +++
 4 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2b8ee90bb64470d..60409ad23b2ac8b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1792,24 +1792,21 @@
nobypass[PPC/POWERNV]
Disable IOMMU bypass, using IOMMU for PCI devices.
 
-   iommu.strict=   [ARM64] Configure TLB invalidation behaviour
-   Format: { "0" | "1" }
-   0 - Lazy mode.
- Request that DMA unmap operations use deferred
- invalidation of hardware TLBs, for increased
- throughput at the cost of reduced device isolation.
- Will fall back to strict mode if not supported by
- the relevant IOMMU driver.
-   1 - Strict mode (default).
- DMA unmap operations invalidate IOMMU hardware TLBs
- synchronously.
-
-   iommu.passthrough=
-   [ARM64] Configure DMA to bypass the IOMMU by default.
-   Format: { "0" | "1" }
-   0 - Use IOMMU translation for DMA.
-   1 - Bypass the IOMMU for DMA.
-   unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
+
+   iommu.dma_mode= [ARM64] Configure default dma mode. if unset, use the
+   value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
+   passthrough
+   Configure DMA to bypass the IOMMU by default.
+   lazy
+   Request that DMA unmap operations use deferred
+   invalidation of hardware TLBs, for increased
+   throughput at the cost of reduced device isolation.
+   Will fall back to strict mode if not supported by
+   the relevant IOMMU driver.
+   strict
+   Default. DMA unmap operations invalidate IOMMU hardware
+   TLBs synchronously.
+
 
io7=[HW] IO7 for Marvel based alpha systems
See comment before marvel_specify_io7 in
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816c64..b67fcabd668f7b6 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -79,8 +79,8 @@ config IOMMU_DEFAULT_PASSTHROUGH
depends on IOMMU_API
 help
  Enable passthrough by default, removing the need to pass in
- iommu.passthrough=on or iommu=pt through command line. If this
- is enabled, you can still disable with iommu.passthrough=off
+ iommu.dma_mode=passthrough or iommu=pt through command line. If this
+ is enabled, you can still disable with iommu.dma_mode={lazy|strict}
  or iommu=nopt depending on the architecture.
 
  If unsure, say N here.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 109de67d5d727c2..e4d581e6cb8d210 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -38,12 +38,13 @@
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
+
 #ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_PASSTHROUGH
 #else
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_STRICT
 #endif
-static bool iommu_dma_strict __read_mostly = true;
+static int iommu_default_dma_mode __read_mostly = IOMMU_DEFAULT_DMA_MODE;
 
 struct iommu_callback_data {
const struct iommu_ops *ops;
@@ -141,25 +142,29 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 static void __iommu_detach_group(struct iommu_domain *domain,
 struct iommu_group *group);
 
-static int __init iommu_set_def_domain_type(char *str)
+static int __init iommu_dma_mode_setup(char *str)
 {
-   bool pt;
-   int ret;
+   if (!str)
+   goto fail;
 
-   ret = 

[PATCH v4 3/6] iommu: add iommu_default_dma_mode_get() helper

2019-04-07 Thread Zhen Lei
Add IOMMU_DMA_MODE_IS_LAZY() and IOMMU_DMA_MODE_IS_PASSTHROUGH() to make
the code look cleaner.

There is no functional change; this just prepares for the following patches.

Signed-off-by: Zhen Lei 
---
 drivers/iommu/iommu.c | 12 ++++++++----
 include/linux/iommu.h | 11 +++++++++++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b75e23a2ea08da8..876c0966db2481a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -168,6 +168,11 @@ static int __init iommu_dma_mode_setup(char *str)
 }
 early_param("iommu.dma_mode", iommu_dma_mode_setup);
 
+int iommu_default_dma_mode_get(void)
+{
+   return iommu_default_dma_mode;
+}
+
 static ssize_t iommu_group_attr_show(struct kobject *kobj,
 struct attribute *__attr, char *buf)
 {
@@ -1109,9 +1114,8 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 */
if (!group->default_domain) {
struct iommu_domain *dom;
-   int def_domain_type =
-   (iommu_default_dma_mode == IOMMU_DMA_MODE_PASSTHROUGH)
-   ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
+   int def_domain_type = IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+   ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
 
dom = __iommu_domain_alloc(dev->bus, def_domain_type);
if (!dom && def_domain_type != IOMMU_DOMAIN_DMA) {
@@ -1127,7 +1131,7 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
if (!group->domain)
group->domain = dom;
 
-   if (dom && (iommu_default_dma_mode == IOMMU_DMA_MODE_LAZY)) {
+   if (dom && IOMMU_DMA_MODE_IS_LAZY()) {
int attr = 1;
iommu_domain_set_attr(dom,
  DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c3f4e3416176496..1b35ae3f6382e4a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -46,6 +46,10 @@
 #define IOMMU_DMA_MODE_STRICT  0x0
 #define IOMMU_DMA_MODE_LAZY0x1
 #define IOMMU_DMA_MODE_PASSTHROUGH 0x2
+#define IOMMU_DMA_MODE_IS_LAZY() \
+   (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+   (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_PASSTHROUGH)
 
 struct iommu_ops;
 struct iommu_group;
@@ -421,6 +425,8 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
 int iommu_probe_device(struct device *dev);
 void iommu_release_device(struct device *dev);
 
+extern int iommu_default_dma_mode_get(void);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -705,6 +711,11 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
return NULL;
 }
 
+static inline int iommu_default_dma_mode_get(void)
+{
+   return IOMMU_DMA_MODE_PASSTHROUGH;
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #ifdef CONFIG_IOMMU_DEBUGFS
-- 
1.8.3


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
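
The helper and the two predicate macros from patch 3/6 can be exercised
outside the kernel. The following is a minimal userspace sketch, not the
kernel code: the mode constants and macro bodies are copied from the hunks
above, while set_default_dma_mode(), pick_domain_type(), wants_flush_queue()
and the DOMAIN_* enum are hypothetical stand-ins for the boot option handler,
the logic in iommu_group_get_for_dev(), and the kernel's IOMMU_DOMAIN_*
constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode values copied from the include/linux/iommu.h hunk. */
#define IOMMU_DMA_MODE_STRICT		0x0
#define IOMMU_DMA_MODE_LAZY		0x1
#define IOMMU_DMA_MODE_PASSTHROUGH	0x2

/* Hypothetical stand-ins for IOMMU_DOMAIN_IDENTITY and IOMMU_DOMAIN_DMA. */
enum domain_type { DOMAIN_IDENTITY, DOMAIN_DMA };

static int iommu_default_dma_mode = IOMMU_DMA_MODE_STRICT;

/* Same shape as the helper added by the patch. */
int iommu_default_dma_mode_get(void)
{
	return iommu_default_dma_mode;
}

#define IOMMU_DMA_MODE_IS_LAZY() \
	(iommu_default_dma_mode_get() == IOMMU_DMA_MODE_LAZY)
#define IOMMU_DMA_MODE_IS_PASSTHROUGH() \
	(iommu_default_dma_mode_get() == IOMMU_DMA_MODE_PASSTHROUGH)

/* Hypothetical setter, standing in for the boot option handler. */
void set_default_dma_mode(int mode)
{
	assert(mode >= IOMMU_DMA_MODE_STRICT &&
	       mode <= IOMMU_DMA_MODE_PASSTHROUGH);
	iommu_default_dma_mode = mode;
}

/* Passthrough selects an identity domain; lazy and strict both use DMA. */
enum domain_type pick_domain_type(void)
{
	return IOMMU_DMA_MODE_IS_PASSTHROUGH() ? DOMAIN_IDENTITY : DOMAIN_DMA;
}

/* Only lazy mode asks for the deferred-invalidation flush queue. */
bool wants_flush_queue(void)
{
	return IOMMU_DMA_MODE_IS_LAZY();
}
```

Note that the real code additionally checks that the domain allocation
succeeded before setting DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE; the sketch leaves
that out.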


[PATCH v4 0/6] normalize IOMMU dma mode boot options

2019-04-07 Thread Zhen Lei
As Robin Murphy suggested:
"It's also not necessarily obvious to the user how this interacts with
IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
would be better to refactor the whole lot into a single selection of something
like IOMMU_DEFAULT_MODE anyway."

In this version, I tried to normalize the IOMMU dma mode boot options for all
ARCHs. When the IOMMU is enabled, there are 3 dma modes: passthrough (bypass),
lazy (map, but defer the IOTLB invalidation), and strict. But currently each
ARCH defines its own private boot options, which differ from one another. For
example, to enable/disable "passthrough", ARM64 uses iommu.passthrough=1/0,
X86 uses iommu=pt/nopt, and PPC/POWERNV uses iommu=nobypass.


Zhen Lei (6):
  iommu: use iommu.dma_mode to replace iommu.passthrough and
iommu.strict
  iommu: keep dma mode build options consistent with cmdline options
  iommu: add iommu_default_dma_mode_get() helper
  s390/pci: use common boot option iommu.dma_mode
  powernv/iommu: use common boot option iommu.dma_mode
  x86/iommu: use common boot option iommu.dma_mode

 Documentation/admin-guide/kernel-parameters.txt | 42 +++-
 arch/ia64/include/asm/iommu.h   |  2 -
 arch/ia64/kernel/pci-dma.c  |  2 -
 arch/powerpc/platforms/powernv/pci-ioda.c   | 23 +--
 arch/s390/pci/pci_dma.c | 20 +++---
 arch/x86/include/asm/iommu.h|  1 -
 arch/x86/kernel/pci-dma.c   | 20 --
 drivers/iommu/Kconfig   | 36 ++---
 drivers/iommu/amd_iommu.c   | 12 +++---
 drivers/iommu/amd_iommu_init.c  |  4 --
 drivers/iommu/amd_iommu_types.h |  6 ---
 drivers/iommu/intel-iommu.c |  7 +---
 drivers/iommu/iommu.c   | 52 -
 include/linux/iommu.h   | 16 
 14 files changed, 108 insertions(+), 135 deletions(-)

-- 
1.8.3
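
To make the three-way selection concrete, here is a minimal userspace sketch
of how a value for iommu.dma_mode= could be parsed. It is an illustration
only: the body of iommu_dma_mode_setup() from patch 1/6 is cut off in this
archive, so parse_dma_mode() and enum dma_mode are hypothetical names, and
only the three value strings (passthrough, lazy, strict) are taken from the
series.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical enum mirroring the three modes named in the cover letter. */
enum dma_mode { DMA_MODE_STRICT, DMA_MODE_LAZY, DMA_MODE_PASSTHROUGH };

/*
 * Hypothetical parser for the value part of "iommu.dma_mode=<value>".
 * Returns 0 and stores the mode on success, -EINVAL for a NULL or
 * unrecognized string (in which case *mode is left untouched).
 */
int parse_dma_mode(const char *str, enum dma_mode *mode)
{
	assert(mode != NULL);

	if (!str)
		return -EINVAL;

	if (!strcmp(str, "passthrough"))
		*mode = DMA_MODE_PASSTHROUGH;
	else if (!strcmp(str, "lazy"))
		*mode = DMA_MODE_LAZY;
	else if (!strcmp(str, "strict"))
		*mode = DMA_MODE_STRICT;
	else
		return -EINVAL;

	return 0;
}
```

Rejecting unknown strings instead of silently falling back keeps the
build-time default in effect when the user mistypes the option.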




[PATCH v4 2/6] iommu: keep dma mode build options consistent with cmdline options

2019-04-07 Thread Zhen Lei
First, add build option IOMMU_DMA_MODE_LAZY, so that we have the
opportunity to set lazy mode as default at build time. Then put the
three config options in a choice, so that only one of the three can be
selected at a time, the same as with the boot option iommu.dma_mode.

Signed-off-by: Zhen Lei 
---
 drivers/iommu/Kconfig | 30 +++---
 drivers/iommu/iommu.c |  4 +++-
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b67fcabd668f7b6..19118cfdea1c335 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,16 +74,40 @@ config IOMMU_DEBUGFS
  debug/iommu directory, and then populate a subdirectory with
  entries as required.
 
-config IOMMU_DEFAULT_PASSTHROUGH
-   bool "IOMMU passthrough by default"
+choice
+   prompt "IOMMU dma mode"
depends on IOMMU_API
+   default IOMMU_DMA_MODE_STRICT
+   help
+ IOMMU dma mode, such as: passthrough, lazy, strict.
+
+config IOMMU_DMA_MODE_PASSTHROUGH
+   bool "Configure DMA to bypass the IOMMU"
 help
  Enable passthrough by default, removing the need to pass in
  iommu.dma_mode=passthrough or iommu=pt through command line. If this
  is enabled, you can still disable with iommu.dma_mode={lazy|strict}
  or iommu=nopt depending on the architecture.
 
- If unsure, say N here.
+config IOMMU_DMA_MODE_LAZY
+   bool "IOMMU DMA use lazy mode to flush IOTLB and free IOVA"
+   help
+ Support lazy mode, where for every IOMMU DMA unmap operation, the
+ flush operation of IOTLB and the free operation of IOVA are deferred.
+ They are only guaranteed to be done before the related IOVA will be
+ reused. Removing the need to pass in kernel parameters through
+ command line. For example, iommu.dma_mode=lazy on ARM64. If this is
+ enabled, you can still disable with kernel parameters, such as
+ iommu.dma_mode=strict depending on the architecture.
+
+config IOMMU_DMA_MODE_STRICT
+   bool "IOMMU DMA use strict mode to flush IOTLB and free IOVA"
+   help
+ For every IOMMU DMA unmap operation, the flush operation of IOTLB and
+ the free operation of IOVA are guaranteed to be done in the unmap
+ function.
+
+endchoice
 
 config OF_IOMMU
def_bool y
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e4d581e6cb8d210..b75e23a2ea08da8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,8 +39,10 @@
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
+#if defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH)
 #define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_PASSTHROUGH
+#elif defined(CONFIG_IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_LAZY
 #else
 #define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_STRICT
 #endif
-- 
1.8.3
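
The effect of the Kconfig choice on the compile-time default can be tried in
isolation. The sketch below reuses the #if chain from the iommu.c hunk of
patch 2/6; the #error guard and dma_mode_name() are hypothetical additions for
illustration. The guard spells out what the choice block already guarantees,
namely that at most one of the three CONFIG_ symbols is set.

```c
#include <assert.h>
#include <string.h>

#define IOMMU_DMA_MODE_STRICT		0x0
#define IOMMU_DMA_MODE_LAZY		0x1
#define IOMMU_DMA_MODE_PASSTHROUGH	0x2

/* Hypothetical guard: a Kconfig choice never sets more than one of these. */
#if (defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH) + \
     defined(CONFIG_IOMMU_DMA_MODE_LAZY) + \
     defined(CONFIG_IOMMU_DMA_MODE_STRICT)) > 1
#error "only one IOMMU dma mode may be selected"
#endif

/* Selection chain as in the patch; strict is the fallback. */
#if defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH)
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_PASSTHROUGH
#elif defined(CONFIG_IOMMU_DMA_MODE_LAZY)
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_LAZY
#else
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_STRICT
#endif

/* Hypothetical helper mapping a mode back to its boot option value. */
const char *dma_mode_name(int mode)
{
	switch (mode) {
	case IOMMU_DMA_MODE_STRICT:
		return "strict";
	case IOMMU_DMA_MODE_LAZY:
		return "lazy";
	case IOMMU_DMA_MODE_PASSTHROUGH:
		return "passthrough";
	default:
		return "unknown";
	}
}
```

Building with, for example, -DCONFIG_IOMMU_DMA_MODE_LAZY switches the default
to lazy; with no CONFIG_ symbol defined the chain falls through to strict.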




[PATCH v4 5/6] powernv/iommu: use common boot option iommu.dma_mode

2019-04-07 Thread Zhen Lei
iommu=nobypass can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei 
---
 Documentation/admin-guide/kernel-parameters.txt |  2 --
 arch/powerpc/platforms/powernv/pci-ioda.c   | 23 +--
 drivers/iommu/Kconfig   |  1 +
 3 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a2df11945b33fc9..f88a8bff3c0caa0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1789,8 +1789,6 @@
soft
pt  [x86]
nopt[x86]
-   nobypass[PPC/POWERNV]
-   Disable IOMMU bypass, using IOMMU for PCI devices.
 
 
iommu.dma_mode= [ARM64] Configure default dma mode. if unset, use the
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3ead4c237ed0ec9..be0234c170316bc 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -85,29 +85,8 @@ void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
va_end(args);
 }
 
-static bool pnv_iommu_bypass_disabled __read_mostly;
 static bool pci_reset_phbs __read_mostly;
 
-static int __init iommu_setup(char *str)
-{
-   if (!str)
-   return -EINVAL;
-
-   while (*str) {
-   if (!strncmp(str, "nobypass", 8)) {
-   pnv_iommu_bypass_disabled = true;
-   pr_info("PowerNV: IOMMU bypass window disabled.\n");
-   break;
-   }
-   str += strcspn(str, ",");
-   if (*str == ',')
-   str++;
-   }
-
-   return 0;
-}
-early_param("iommu", iommu_setup);
-
 static int __init pci_reset_phbs_setup(char *str)
 {
pci_reset_phbs = true;
@@ -2456,7 +2435,7 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
return rc;
}
 
-   if (!pnv_iommu_bypass_disabled)
+   if (IOMMU_DMA_MODE_IS_PASSTHROUGH())
pnv_pci_ioda2_set_bypass(pe, true);
 
return 0;
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d88dc44d60d88ea..b053eeaa82ebca8 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
 choice
prompt "IOMMU dma mode"
depends on IOMMU_API
+   default IOMMU_DMA_MODE_PASSTHROUGH if (PPC_POWERNV && PCI)
default IOMMU_DMA_MODE_LAZY if S390_IOMMU
default IOMMU_DMA_MODE_STRICT
help
-- 
1.8.3
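
The replacements listed in the commit messages of patches 5/6 and 6/6 can be
collected into one lookup table. This is only a sketch of that mapping, e.g.
for a script or a hint printed when an old option is seen; struct
legacy_option, legacy_map[] and legacy_to_dma_mode() are hypothetical and not
part of the series. The option strings themselves are taken from the commit
messages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of an old boot option with its replacement. */
struct legacy_option {
	const char *old_opt;
	const char *new_opt;
};

static const struct legacy_option legacy_map[] = {
	{ "iommu=nobypass",	 "iommu.dma_mode=strict" },	 /* PPC/POWERNV */
	{ "iommu=pt",		 "iommu.dma_mode=passthrough" }, /* x86 */
	{ "iommu=nopt",		 "iommu.dma_mode=lazy" },	 /* x86 */
	{ "intel_iommu=strict",	 "iommu.dma_mode=strict" },	 /* x86 */
	{ "amd_iommu=fullflush", "iommu.dma_mode=strict" },	 /* x86 */
};

/* Returns the replacement option, or NULL if old_opt is not in the table. */
const char *legacy_to_dma_mode(const char *old_opt)
{
	size_t i;

	assert(old_opt != NULL);

	for (i = 0; i < sizeof(legacy_map) / sizeof(legacy_map[0]); i++) {
		if (!strcmp(old_opt, legacy_map[i].old_opt))
			return legacy_map[i].new_opt;
	}

	return NULL;
}
```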




[PATCH v4 6/6] x86/iommu: use common boot option iommu.dma_mode

2019-04-07 Thread Zhen Lei
iommu=pt can be replaced with iommu.dma_mode=passthrough.
iommu=nopt can be replaced with iommu.dma_mode=lazy.
intel_iommu=strict can be replaced with iommu.dma_mode=strict.
amd_iommu=fullflush can be replaced with iommu.dma_mode=strict.

Note: intel_iommu_strict is not deleted because it can also be assigned
in quirk_calpella_no_shadow_gtt().

Signed-off-by: Zhen Lei 
---
 arch/ia64/include/asm/iommu.h   |  2 --
 arch/ia64/kernel/pci-dma.c  |  2 --
 arch/x86/include/asm/iommu.h|  1 -
 arch/x86/kernel/pci-dma.c   | 20 
 drivers/iommu/Kconfig   | 14 ++
 drivers/iommu/amd_iommu.c   | 12 ++--
 drivers/iommu/amd_iommu_init.c  |  4 
 drivers/iommu/amd_iommu_types.h |  6 --
 drivers/iommu/intel-iommu.c |  7 ++-
 9 files changed, 14 insertions(+), 54 deletions(-)

diff --git a/arch/ia64/include/asm/iommu.h b/arch/ia64/include/asm/iommu.h
index 7429a72f3f92199..92aceef63710861 100644
--- a/arch/ia64/include/asm/iommu.h
+++ b/arch/ia64/include/asm/iommu.h
@@ -8,10 +8,8 @@
 extern void no_iommu_init(void);
 #ifdef CONFIG_INTEL_IOMMU
 extern int force_iommu, no_iommu;
-extern int iommu_pass_through;
 extern int iommu_detected;
 #else
-#define iommu_pass_through (0)
 #define no_iommu   (1)
 #define iommu_detected (0)
 #endif
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index fe988c49f01ce6a..f5d49cd3fbb01a9 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -22,8 +22,6 @@
 int force_iommu __read_mostly;
 #endif
 
-int iommu_pass_through;
-
 static int __init pci_iommu_init(void)
 {
if (iommu_detected)
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index baedab8ac5385f7..b91623d521d9f0f 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -4,7 +4,6 @@
 
 extern int force_iommu, no_iommu;
 extern int iommu_detected;
-extern int iommu_pass_through;
 
 /* 10 seconds */
 #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index d460998ae828514..bd63d80597ae6d0 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -34,21 +34,6 @@
 /* Set this to 1 if there is a HW IOMMU in the system */
 int iommu_detected __read_mostly = 0;
 
-/*
- * This variable becomes 1 if iommu=pt is passed on the kernel command line.
- * If this variable is 1, IOMMU implementations do no DMA translation for
- * devices and allow every device to access to whole physical memory. This is
- * useful if a user wants to use an IOMMU only for KVM device assignment to
- * guests and not for driver dma translation.
- * It is also possible to disable by default in kernel config, and enable with
- * iommu=nopt at boot time.
- */
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-int iommu_pass_through __read_mostly = 1;
-#else
-int iommu_pass_through __read_mostly;
-#endif
-
 extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
 
 /* Dummy device used for NULL arguments (normally ISA). */
@@ -139,11 +124,6 @@ static __init int iommu_setup(char *p)
if (!strncmp(p, "soft", 4))
swiotlb = 1;
 #endif
-   if (!strncmp(p, "pt", 2))
-   iommu_pass_through = 1;
-   if (!strncmp(p, "nopt", 4))
-   iommu_pass_through = 0;
-
gart_parse_options(p);
 
 #ifdef CONFIG_CALGARY_IOMMU
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b053eeaa82ebca8..ff8c5d0d435cf58 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -78,7 +78,7 @@ choice
prompt "IOMMU dma mode"
depends on IOMMU_API
default IOMMU_DMA_MODE_PASSTHROUGH if (PPC_POWERNV && PCI)
-   default IOMMU_DMA_MODE_LAZY if S390_IOMMU
+   default IOMMU_DMA_MODE_LAZY if (AMD_IOMMU || INTEL_IOMMU || S390_IOMMU)
default IOMMU_DMA_MODE_STRICT
help
  IOMMU dma mode, such as: passthrough, lazy, strict.
@@ -87,9 +87,8 @@ config IOMMU_DMA_MODE_PASSTHROUGH
bool "Configure DMA to bypass the IOMMU"
 help
  Enable passthrough by default, removing the need to pass in
- iommu.dma_mode=passthrough or iommu=pt through command line. If this
- is enabled, you can still disable with iommu.dma_mode={lazy|strict}
- or iommu=nopt depending on the architecture.
+ iommu.dma_mode=passthrough through command line. If this is enabled,
+ you can still disable with iommu.dma_mode={lazy|strict}.
 
 config IOMMU_DMA_MODE_LAZY
bool "IOMMU DMA use lazy mode to flush IOTLB and free IOVA"
@@ -97,10 +96,9 @@ config IOMMU_DMA_MODE_LAZY
  Support lazy mode, where for every IOMMU DMA unmap operation, the
  flush operation of IOTLB and the free operation of IOVA are deferred.
  They are only guaranteed to be done before the related IOVA will be
- 

Re: [PATCH 02/21] arm64/iommu: improve mmap bounds checking

2019-04-07 Thread Christoph Hellwig
On Fri, Apr 05, 2019 at 06:30:52PM +0100, Robin Murphy wrote:
> On 27/03/2019 08:04, Christoph Hellwig wrote:
>> The nr_pages checks should be done for all mmap requests, not just those
>> using remap_pfn_range.
>
> Hmm, the logic in iommu_dma_mmap() inherently returns an error for the "off 
> >= nr_pages" case already. It's also supposed to be robust against the 
> "vma_pages(vma) > nr_pages - off" condition, although by making the partial 
> mapping and treating it as a success, rather than doing nothing and 
> returning an error. What's the exact motivation here?

Have one error check at the front of the function that is identical
to the mmap checks in the other dma_map_ops instances so that:

 a) we get the same error behavior for partial requests everywhere
 b) we can lift these checks into common code in the next round.
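
A single up-front check of the kind described could look roughly like the
sketch below. It is an assumption-laden illustration rather than the patch
itself: dma_mmap_check_bounds() is a hypothetical name, the arguments stand in
for vma->vm_pgoff, vma_pages(vma) and the buffer size in pages, and the
-ENXIO return value is assumed. The two conditions are the ones quoted from
Robin's mail.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical front-of-function bounds check for an mmap request.
 *   off        - page offset into the buffer (vma->vm_pgoff)
 *   user_pages - pages requested by the caller (vma_pages(vma))
 *   nr_pages   - size of the DMA buffer in pages
 * Rejects partial requests outright instead of mapping what fits.
 */
int dma_mmap_check_bounds(unsigned long off, unsigned long user_pages,
			  unsigned long nr_pages)
{
	/*
	 * Testing off first guarantees nr_pages - off cannot wrap around
	 * in the second comparison.
	 */
	if (off >= nr_pages || user_pages > nr_pages - off)
		return -ENXIO;

	return 0;
}
```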