Re: [PATCH 1/1] iommu/vt-d: Fix double list_add when enabling VMD in scalable mode

2022-02-28 Thread Joerg Roedel
On Mon, Feb 21, 2022 at 01:33:48PM +0800, Lu Baolu wrote:
> Fixes: 474dd1c65064 ("iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries")
> Cc: sta...@vger.kernel.org # v5.14+
> Signed-off-by: Adrian Huang 
> Link: https://lore.kernel.org/r/20220216091307.703-1-adrianhuang0...@gmail.com
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied for v5.17, thanks.



[PATCH 1/1] iommu/vt-d: Fix double list_add when enabling VMD in scalable mode

2022-02-20 Thread Lu Baolu
From: Adrian Huang 

When VMD and IOMMU scalable mode are both enabled, the kernel panics
during boot on the Eagle Stream platform (Sapphire Rapids CPU) with the
following kernel log and call trace:

pci :59:00.5: Adding to iommu group 42
...
vmd :59:00.5: PCI host bridge to bus 1:80
pci 1:80:01.0: [8086:352a] type 01 class 0x060400
pci 1:80:01.0: reg 0x10: [mem 0x-0x0001 64bit]
pci 1:80:01.0: enabling Extended Tags
pci 1:80:01.0: PME# supported from D0 D3hot D3cold
pci 1:80:01.0: DMAR: Setup RID2PASID failed
pci 1:80:01.0: Failed to add to iommu group 42: -16
pci 1:80:03.0: [8086:352b] type 01 class 0x060400
pci 1:80:03.0: reg 0x10: [mem 0x-0x0001 64bit]
pci 1:80:03.0: enabling Extended Tags
pci 1:80:03.0: PME# supported from D0 D3hot D3cold
[ cut here ]
kernel BUG at lib/list_debug.c:29!
invalid opcode:  [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.17.0-rc3+ #7
Hardware name: Lenovo ThinkSystem SR650V3/SB27A86647, BIOS ESE101Y-1.00 01/13/2022
Workqueue: events work_for_cpu_fn
RIP: 0010:__list_add_valid.cold+0x26/0x3f
Code: 9a 4a ab ff 4c 89 c1 48 c7 c7 40 0c d9 9e e8 b9 b1 fe ff 0f
  0b 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 f0 0c d9 9e e8 a2 b1
  fe ff <0f> 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 98 0c d9
  9e e8 8b b1 fe
RSP: :ff5ad434865b3a40 EFLAGS: 00010246
RAX: 0058 RBX: ff4d61160b74b880 RCX: ff4d61255e1fffa8
RDX:  RSI: fffe RDI: 9fd34f20
RBP: ff4d611d8e245c00 R08:  R09: ff5ad434865b3888
R10: ff5ad434865b3880 R11: ff4d61257fdc6fe8 R12: ff4d61160b74b8a0
R13: ff4d61160b74b8a0 R14: ff4d611d8e245c10 R15: ff4d611d8001ba70
FS:  () GS:ff4d611d5ea0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ff4d611fa1401000 CR3: 000aa0210001 CR4: 00771ef0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe07f0 DR7: 0400
PKRU: 5554
Call Trace:
 
 intel_pasid_alloc_table+0x9c/0x1d0
 dmar_insert_one_dev_info+0x423/0x540
 ? device_to_iommu+0x12d/0x2f0
 intel_iommu_attach_device+0x116/0x290
 __iommu_attach_device+0x1a/0x90
 iommu_group_add_device+0x190/0x2c0
 __iommu_probe_device+0x13e/0x250
 iommu_probe_device+0x24/0x150
 iommu_bus_notifier+0x69/0x90
 blocking_notifier_call_chain+0x5a/0x80
 device_add+0x3db/0x7b0
 ? arch_memremap_can_ram_remap+0x19/0x50
 ? memremap+0x75/0x140
 pci_device_add+0x193/0x1d0
 pci_scan_single_device+0xb9/0xf0
 pci_scan_slot+0x4c/0x110
 pci_scan_child_bus_extend+0x3a/0x290
 vmd_enable_domain.constprop.0+0x63e/0x820
 vmd_probe+0x163/0x190
 local_pci_probe+0x42/0x80
 work_for_cpu_fn+0x13/0x20
 process_one_work+0x1e2/0x3b0
 worker_thread+0x1c4/0x3a0
 ? rescuer_thread+0x370/0x370
 kthread+0xc7/0xf0
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 
Modules linked in:
---[ end trace  ]---
...
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1ca0 from 0x8100 (relocation range: 0x8000-0xbfff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The following 'lspci' output shows that devices '1:80:*' are subdevices
of the VMD device :59:00.5:

  $ lspci
  ...
  :59:00.5 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller (rev 20)
  ...
  1:80:01.0 PCI bridge: Intel Corporation Device 352a (rev 03)
  1:80:03.0 PCI bridge: Intel Corporation Device 352b (rev 03)
  1:80:05.0 PCI bridge: Intel Corporation Device 352c (rev 03)
  1:80:07.0 PCI bridge: Intel Corporation Device 352d (rev 03)
  1:81:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
  1:82:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]

The 'list_add double add' symptom is a consequence of the failure
reported in the following messages:

  pci 1:80:01.0: DMAR: Setup RID2PASID failed
  pci 1:80:01.0: Failed to add to iommu group 42: -16
  pci 1:80:03.0: [8086:352b] type 01 class 0x060400
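
For readers unfamiliar with the check behind the 'list_add double add'
report, the following is a minimal userspace sketch of the idea. It is an
illustration only, not the kernel's lib/list_debug.c: the names
list_node, list_add_is_valid() and list_add_checked() are hypothetical,
and the sketch returns false where the kernel would hit BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Hypothetical stand-in for the sanity check: an entry that is already
 * adjacent to the insertion point is reported as a double add.
 */
static bool list_add_is_valid(const struct list_node *new_node,
			      const struct list_node *prev,
			      const struct list_node *next)
{
	if (next->prev != prev || prev->next != next)
		return false;	/* neighbours are inconsistent */
	if (new_node == prev || new_node == next)
		return false;	/* "list_add double add" */
	return true;
}

/* Insert new_node right after head; returns false if the add is rejected. */
static bool list_add_checked(struct list_node *new_node,
			     struct list_node *head)
{
	struct list_node *prev, *next;

	assert(new_node != NULL && head != NULL);
	prev = head;
	next = head->next;

	if (!list_add_is_valid(new_node, prev, next))
		return false;

	next->prev = new_node;
	new_node->next = next;
	new_node->prev = prev;
	prev->next = new_node;
	return true;
}
```

Adding the same node to the same list head twice is accepted the first
time and rejected the second time, because the node is then already the
head's immediate successor.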

Device 1:80:01.0 is a subdevice of the VMD device :59:00.5, so
invoking intel_pasid_alloc_table() for it returns the pasid_table of the
VMD device :59:00.5. Here is the call path:

  intel_pasid_alloc_table
    pci_for_each_dma_alias
      get_alias_pasid_table
        search_pasid_table

pci_real_dma_dev() in pci_for_each_dma_alias() returns the real DMA
device, which is the VMD device :59:00.5. However, the pasid table entry
(pte) of the VMD device :59:00.5 was already configured when the message
"pci :59:00.5: Adding to iommu group 42" was logged. As a result, -EBUSY
is returned when configuring the pasid entry for device 1:80:01.0.
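
The sequence above can be modelled in a few lines of userspace C. This is
a simplified sketch under stated assumptions, not the driver code: the
structures and the helpers real_dma_dev(), lookup_pasid_table() and
setup_rid2pasid() are hypothetical stand-ins, and a single boolean
replaces the real pasid entry. On Linux, EBUSY is 16, which matches the
"-16" in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct pasid_table {
	bool rid2pasid_present;	/* models an already configured entry */
};

struct device {
	const char *name;
	struct device *real_dma_dev;	/* NULL if the device does its own DMA */
	struct pasid_table *table;	/* table owned by a real DMA device */
};

/* Models pci_real_dma_dev(): a VMD subdevice resolves to the VMD device. */
static struct device *real_dma_dev(struct device *dev)
{
	return dev->real_dma_dev ? dev->real_dma_dev : dev;
}

/* Models the alias lookup: a subdevice ends up with its parent's table. */
static struct pasid_table *lookup_pasid_table(struct device *dev)
{
	return real_dma_dev(dev)->table;
}

/* Configure the entry once; a second attempt on the same table is busy. */
static int setup_rid2pasid(struct device *dev)
{
	struct pasid_table *table = lookup_pasid_table(dev);

	if (!table)
		return -ENOMEM;
	if (table->rid2pasid_present)
		return -EBUSY;

	table->rid2pasid_present = true;
	return 0;
}
```

Setting up the VMD device first succeeds; setting up a subdevice whose
real DMA device is that VMD device then fails with -EBUSY, because both
resolve to the same table.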

The failure path then invokes dmar_remove_one_dev_info() to free the
'struct device_domain_info *' allocated from iommu_devinfo_cache.
However, the pasid table is not released because of the following