Re: warning from domain_get_iommu

2020-02-08 Thread Lu Baolu

Hi,

On 2020/2/8 18:19, Jerry Snitselaar wrote:

On Sat Feb 08 20, Lu Baolu wrote:

Hi Jerry,

On 2020/2/7 17:34, Jerry Snitselaar wrote:

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Tue Feb 04 20, Jerry Snitselaar wrote:
I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[    2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    2.832615] ehci-pci: EHCI PCI platform driver
[    2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[    2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1
[    2.838276] ehci-pci :00:1a.0: debug port 2
[    2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
[    2.840671] Modules linked in:
[    2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
[    2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[    2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[    2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[    2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[    2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX:
[    2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[    2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[    2.840671] R10: 0095 R11: c90df928 R12:
[    2.840671] R13: 88ec7f1c8000 R14: 1000 R15:
[    2.840671] FS:  () GS:88ec7f60() knlGS:
[    2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[    2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[    2.840671] Call Trace:
[    2.840671]  __intel_map_single+0x62/0x140
[    2.840671]  intel_alloc_coherent+0xa6/0x130
[    2.840671]  dma_pool_alloc+0xd8/0x1e0
[    2.840671]  e_qh_alloc+0x55/0x130
[    2.840671]  ehci_setup+0x284/0x7b0
[    2.840671]  ehci_pci_setup+0xa3/0x530
[    2.840671]  usb_add_hcd+0x2b6/0x800
[    2.840671]  usb_hcd_pci_probe+0x375/0x460
[    2.840671]  local_pci_probe+0x41/0x90
[    2.840671]  pci_device_probe+0x105/0x1b0
[    2.840671]  driver_probe_device+0x12d/0x460
[    2.840671]  device_driver_attach+0x50/0x60
[    2.840671]  __driver_attach+0x61/0x130
[    2.840671]  ? device_driver_attach+0x60/0x60
[    2.840671]  bus_for_each_dev+0x77/0xc0
[    2.840671]  ? klist_add_tail+0x3b/0x70
[    2.840671]  bus_add_driver+0x14d/0x1e0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  driver_register+0x6b/0xb0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  do_one_initcall+0x46/0x1c3
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  kernel_init_freeable+0x1af/0x258
[    2.840671]  ? rest_init+0xaa/0xaa
[    2.840671]  kernel_init+0xa/0xf9
[    2.840671]  ret_from_fork+0x35/0x40
[    2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[    3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[    3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[    3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[    3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[    3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[    3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
[    3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    3.035900] usb usb1: Product: EHCI Host Controller
[    3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
[    3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs DMA mapping instead of the default passthrough.
intel_alloc_coherent calls iommu_need_mapping before it calls
__intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
 goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.


Hi Baolu,

I think I understand what is happening here. With the kdump boot,
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single g

Re: warning from domain_get_iommu

2020-02-08 Thread Jerry Snitselaar

On Sat Feb 08 20, Lu Baolu wrote:

Hi Jerry,

On 2020/2/7 17:34, Jerry Snitselaar wrote:

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Tue Feb 04 20, Jerry Snitselaar wrote:
I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[    2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    2.832615] ehci-pci: EHCI PCI platform driver
[    2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[    2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1
[    2.838276] ehci-pci :00:1a.0: debug port 2
[    2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
[    2.840671] Modules linked in:
[    2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
[    2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[    2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[    2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[    2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[    2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX:
[    2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[    2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[    2.840671] R10: 0095 R11: c90df928 R12:
[    2.840671] R13: 88ec7f1c8000 R14: 1000 R15:
[    2.840671] FS:  () GS:88ec7f60() knlGS:
[    2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[    2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[    2.840671] Call Trace:
[    2.840671]  __intel_map_single+0x62/0x140
[    2.840671]  intel_alloc_coherent+0xa6/0x130
[    2.840671]  dma_pool_alloc+0xd8/0x1e0
[    2.840671]  e_qh_alloc+0x55/0x130
[    2.840671]  ehci_setup+0x284/0x7b0
[    2.840671]  ehci_pci_setup+0xa3/0x530
[    2.840671]  usb_add_hcd+0x2b6/0x800
[    2.840671]  usb_hcd_pci_probe+0x375/0x460
[    2.840671]  local_pci_probe+0x41/0x90
[    2.840671]  pci_device_probe+0x105/0x1b0
[    2.840671]  driver_probe_device+0x12d/0x460
[    2.840671]  device_driver_attach+0x50/0x60
[    2.840671]  __driver_attach+0x61/0x130
[    2.840671]  ? device_driver_attach+0x60/0x60
[    2.840671]  bus_for_each_dev+0x77/0xc0
[    2.840671]  ? klist_add_tail+0x3b/0x70
[    2.840671]  bus_add_driver+0x14d/0x1e0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  driver_register+0x6b/0xb0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  do_one_initcall+0x46/0x1c3
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  kernel_init_freeable+0x1af/0x258
[    2.840671]  ? rest_init+0xaa/0xaa
[    2.840671]  kernel_init+0xa/0xf9
[    2.840671]  ret_from_fork+0x35/0x40
[    2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[    3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[    3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[    3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[    3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[    3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[    3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
[    3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    3.035900] usb usb1: Product: EHCI Host Controller
[    3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
[    3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs DMA mapping instead of the default passthrough.
intel_alloc_coherent calls iommu_need_mapping before it calls
__intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
 goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.


Hi Baolu,

I think I understand what is happening here. With the kdump boot,
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single gets called and it calls deferred_attach_domain,
w

Re: warning from domain_get_iommu

2020-02-07 Thread Lu Baolu

Hi Jerry,

On 2020/2/7 17:34, Jerry Snitselaar wrote:

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Tue Feb 04 20, Jerry Snitselaar wrote:
I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[    2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    2.832615] ehci-pci: EHCI PCI platform driver
[    2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[    2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1
[    2.838276] ehci-pci :00:1a.0: debug port 2
[    2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
[    2.840671] Modules linked in:
[    2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
[    2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[    2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[    2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[    2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[    2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX:
[    2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[    2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[    2.840671] R10: 0095 R11: c90df928 R12:
[    2.840671] R13: 88ec7f1c8000 R14: 1000 R15:
[    2.840671] FS:  () GS:88ec7f60() knlGS:
[    2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[    2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[    2.840671] Call Trace:
[    2.840671]  __intel_map_single+0x62/0x140
[    2.840671]  intel_alloc_coherent+0xa6/0x130
[    2.840671]  dma_pool_alloc+0xd8/0x1e0
[    2.840671]  e_qh_alloc+0x55/0x130
[    2.840671]  ehci_setup+0x284/0x7b0
[    2.840671]  ehci_pci_setup+0xa3/0x530
[    2.840671]  usb_add_hcd+0x2b6/0x800
[    2.840671]  usb_hcd_pci_probe+0x375/0x460
[    2.840671]  local_pci_probe+0x41/0x90
[    2.840671]  pci_device_probe+0x105/0x1b0
[    2.840671]  driver_probe_device+0x12d/0x460
[    2.840671]  device_driver_attach+0x50/0x60
[    2.840671]  __driver_attach+0x61/0x130
[    2.840671]  ? device_driver_attach+0x60/0x60
[    2.840671]  bus_for_each_dev+0x77/0xc0
[    2.840671]  ? klist_add_tail+0x3b/0x70
[    2.840671]  bus_add_driver+0x14d/0x1e0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  driver_register+0x6b/0xb0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  do_one_initcall+0x46/0x1c3
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  kernel_init_freeable+0x1af/0x258
[    2.840671]  ? rest_init+0xaa/0xaa
[    2.840671]  kernel_init+0xa/0xf9
[    2.840671]  ret_from_fork+0x35/0x40
[    2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[    3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[    3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[    3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[    3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[    3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[    3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
[    3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    3.035900] usb usb1: Product: EHCI Host Controller
[    3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
[    3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs DMA mapping instead of the default passthrough.
intel_alloc_coherent calls iommu_need_mapping before it calls
__intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
 goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.


Hi Baolu,

I think I understand what is happening here. With the kdump boot,
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single gets called and it calls deferred_attach_domain,
which sets the domain to the group d

Re: warning from domain_get_iommu

2020-02-07 Thread Jerry Snitselaar

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Tue Feb 04 20, Jerry Snitselaar wrote:

I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus 
number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 
domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS 
P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 
75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 
31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 
[2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12: 
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 
[2.840671] FS:  () GS:88ec7f60() 
knlGS:
[2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671]  __intel_map_single+0x62/0x140
[2.840671]  intel_alloc_coherent+0xa6/0x130
[2.840671]  dma_pool_alloc+0xd8/0x1e0
[2.840671]  e_qh_alloc+0x55/0x130
[2.840671]  ehci_setup+0x284/0x7b0
[2.840671]  ehci_pci_setup+0xa3/0x530
[2.840671]  usb_add_hcd+0x2b6/0x800
[2.840671]  usb_hcd_pci_probe+0x375/0x460
[2.840671]  local_pci_probe+0x41/0x90
[2.840671]  pci_device_probe+0x105/0x1b0
[2.840671]  driver_probe_device+0x12d/0x460
[2.840671]  device_driver_attach+0x50/0x60
[2.840671]  __driver_attach+0x61/0x130
[2.840671]  ? device_driver_attach+0x60/0x60
[2.840671]  bus_for_each_dev+0x77/0xc0
[2.840671]  ? klist_add_tail+0x3b/0x70
[2.840671]  bus_add_driver+0x14d/0x1e0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  driver_register+0x6b/0xb0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  do_one_initcall+0x46/0x1c3
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  kernel_init_freeable+0x1af/0x258
[2.840671]  ? rest_init+0xaa/0xaa
[2.840671]  kernel_init+0xa/0xf9
[2.840671]  ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, 
bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 
ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs DMA mapping instead of the default passthrough.
intel_alloc_coherent calls iommu_need_mapping before it calls
__intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
 goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.


Hi Baolu,

I think I understand what is happening here. With the kdump boot,
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single gets called and it calls deferred_attach_domain,
which sets the domain to the group domain, which in this case is the
identity domain. Then it calls domain_get_i
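
The sequence described above can be walked through with a small userspace
model. This is only a sketch under stated assumptions, not the driver's
code: every model_* name is hypothetical, the "deferred" flag stands in for
DEFER_DEVICE_DOMAIN_INFO, and the rule that the WARN_ON fires for any
non-DMA domain is an assumption that should be checked against
drivers/iommu/intel-iommu.c in the tree being debugged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
enum model_domain_type { MODEL_DOMAIN_IDENTITY, MODEL_DOMAIN_DMA };

struct model_domain {
	enum model_domain_type type;
};

struct model_device {
	bool deferred;                     /* stands in for DEFER_DEVICE_DOMAIN_INFO */
	bool needs_remap;                  /* e.g. the device has a 32-bit DMA mask */
	struct model_domain *group_domain; /* default domain of the device's group */
	struct model_domain *domain;       /* attached domain; NULL while deferred */
};

int model_warn_count; /* how many times the modelled WARN_ON fired */

/* Returns true when remapping is needed; skips the switch while deferred. */
bool model_need_mapping(struct model_device *dev, struct model_domain *dma_domain)
{
	assert(dev != NULL && dma_domain != NULL);
	if (!dev->needs_remap)
		return false;
	if (!dev->deferred && dev->domain != NULL &&
	    dev->domain->type == MODEL_DOMAIN_IDENTITY) {
		/* Normal case: move the device over to a DMA domain. */
		dev->group_domain = dma_domain;
		dev->domain = dma_domain;
	}
	return true;
}

/* Deferred attach: the device simply picks up its group's domain. */
struct model_domain *model_deferred_attach(struct model_device *dev)
{
	if (dev->deferred) {
		dev->domain = dev->group_domain;
		dev->deferred = false;
	}
	return dev->domain;
}

/* Assumption: anything other than a DMA domain trips the warning. */
bool model_domain_get_iommu(const struct model_domain *domain)
{
	if (domain == NULL || domain->type != MODEL_DOMAIN_DMA) {
		model_warn_count++;
		return false;
	}
	return true;
}

/* Models intel_alloc_coherent -> __intel_map_single for one allocation. */
bool model_alloc_coherent(struct model_device *dev, struct model_domain *dma_domain)
{
	if (!model_need_mapping(dev, dma_domain))
		return true; /* passthrough, nothing to map */
	return model_domain_get_iommu(model_deferred_attach(dev));
}
```

In this model the first allocation on a deferred device lands in the
identity domain and counts one warning; the second allocation sees the
identity domain, switches to the DMA domain and succeeds, which is at least
consistent with the "Using iommu dma mapping" line that follows the trace.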

Re: warning from domain_get_iommu

2020-02-06 Thread Jerry Snitselaar

On Tue Feb 04 20, Jerry Snitselaar wrote:

I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus 
number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 
domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS 
P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 
75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 
31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 
[2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12: 
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 
[2.840671] FS:  () GS:88ec7f60() 
knlGS:
[2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671]  __intel_map_single+0x62/0x140
[2.840671]  intel_alloc_coherent+0xa6/0x130
[2.840671]  dma_pool_alloc+0xd8/0x1e0
[2.840671]  e_qh_alloc+0x55/0x130
[2.840671]  ehci_setup+0x284/0x7b0
[2.840671]  ehci_pci_setup+0xa3/0x530
[2.840671]  usb_add_hcd+0x2b6/0x800
[2.840671]  usb_hcd_pci_probe+0x375/0x460
[2.840671]  local_pci_probe+0x41/0x90
[2.840671]  pci_device_probe+0x105/0x1b0
[2.840671]  driver_probe_device+0x12d/0x460
[2.840671]  device_driver_attach+0x50/0x60
[2.840671]  __driver_attach+0x61/0x130
[2.840671]  ? device_driver_attach+0x60/0x60
[2.840671]  bus_for_each_dev+0x77/0xc0
[2.840671]  ? klist_add_tail+0x3b/0x70
[2.840671]  bus_add_driver+0x14d/0x1e0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  driver_register+0x6b/0xb0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  do_one_initcall+0x46/0x1c3
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  kernel_init_freeable+0x1af/0x258
[2.840671]  ? rest_init+0xaa/0xaa
[2.840671]  kernel_init+0xa/0xf9
[2.840671]  ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, 
bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 
ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs DMA mapping instead of the default passthrough.
intel_alloc_coherent calls iommu_need_mapping before it calls
__intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
  goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


warning from domain_get_iommu

2020-02-04 Thread Jerry Snitselaar

I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus 
number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 
domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS 
P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 
75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 
31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 
[2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12: 
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 
[2.840671] FS:  () GS:88ec7f60() 
knlGS:
[2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671]  __intel_map_single+0x62/0x140
[2.840671]  intel_alloc_coherent+0xa6/0x130
[2.840671]  dma_pool_alloc+0xd8/0x1e0
[2.840671]  e_qh_alloc+0x55/0x130
[2.840671]  ehci_setup+0x284/0x7b0
[2.840671]  ehci_pci_setup+0xa3/0x530
[2.840671]  usb_add_hcd+0x2b6/0x800
[2.840671]  usb_hcd_pci_probe+0x375/0x460
[2.840671]  local_pci_probe+0x41/0x90
[2.840671]  pci_device_probe+0x105/0x1b0
[2.840671]  driver_probe_device+0x12d/0x460
[2.840671]  device_driver_attach+0x50/0x60
[2.840671]  __driver_attach+0x61/0x130
[2.840671]  ? device_driver_attach+0x60/0x60
[2.840671]  bus_for_each_dev+0x77/0xc0
[2.840671]  ? klist_add_tail+0x3b/0x70
[2.840671]  bus_add_driver+0x14d/0x1e0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  driver_register+0x6b/0xb0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  do_one_initcall+0x46/0x1c3
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  kernel_init_freeable+0x1af/0x258
[2.840671]  ? rest_init+0xaa/0xaa
[2.840671]  kernel_init+0xa/0xf9
[2.840671]  ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, 
bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 
ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs DMA mapping instead of the default passthrough.
intel_alloc_coherent calls iommu_need_mapping before it calls
__intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
   goto error;

It is possible to deref the null pointer later otherwise.
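
To make the shape of that check concrete, here is a minimal userspace
sketch. It is not the kernel function: the sketch_* types, the map_count
field and SKETCH_MAPPING_ERROR are all hypothetical stand-ins, and the
condition under which the lookup returns NULL is assumed for illustration.
The only point is that the caller bails out to an error label before the
pointer is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only what is needed. */
enum sketch_domain_type { SKETCH_DOMAIN_IDENTITY, SKETCH_DOMAIN_DMA };

struct sketch_iommu {
	unsigned int map_count; /* touched below to show a later dereference */
};

struct sketch_domain {
	enum sketch_domain_type type;
	struct sketch_iommu *iommu;
};

/* Hypothetical error value returned when no mapping could be set up. */
#define SKETCH_MAPPING_ERROR (~0UL)

/* Modelled lookup: may return NULL (assumed here for non-DMA domains). */
struct sketch_iommu *sketch_domain_get_iommu(struct sketch_domain *domain)
{
	if (domain->type != SKETCH_DOMAIN_DMA)
		return NULL;
	return domain->iommu;
}

/* Modelled mapping path with the proposed NULL check in place. */
unsigned long sketch_map_single(struct sketch_domain *domain, unsigned long paddr)
{
	struct sketch_iommu *iommu;

	assert(domain != NULL);

	iommu = sketch_domain_get_iommu(domain);
	if (!iommu)
		goto error;

	/* Without the check above, this dereference could hit NULL. */
	iommu->map_count++;
	return paddr;

error:
	return SKETCH_MAPPING_ERROR;
}
```

With a DMA-type domain the sketch returns the address it was given and
bumps map_count; with an identity-type domain it returns
SKETCH_MAPPING_ERROR and leaves the iommu untouched.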

Regards,
Jerry
