Re: warning from domain_get_iommu
Hi,

On 2020/2/8 18:19, Jerry Snitselaar wrote:
> On Sat Feb 08 20, Lu Baolu wrote:
>> Hi Jerry,
>>
>> On 2020/2/7 17:34, Jerry Snitselaar wrote:
>>> On Thu Feb 06 20, Jerry Snitselaar wrote:
>>>> On Tue Feb 04 20, Jerry Snitselaar wrote:
>>>>> [...]
>>>>
>>>> I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.
>>>
>>> Hi Baolu,
>>>
>>> I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single g
Re: warning from domain_get_iommu
On Sat Feb 08 20, Lu Baolu wrote:
> Hi Jerry,
>
> On 2020/2/7 17:34, Jerry Snitselaar wrote:
>> On Thu Feb 06 20, Jerry Snitselaar wrote:
>>> On Tue Feb 04 20, Jerry Snitselaar wrote:
>>>> [...]
>>>
>>> I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.
>>
>> Hi Baolu,
>>
>> I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and it calls deferred_attach_domain, w
Re: warning from domain_get_iommu
Hi Jerry,

On 2020/2/7 17:34, Jerry Snitselaar wrote:
> On Thu Feb 06 20, Jerry Snitselaar wrote:
>> On Tue Feb 04 20, Jerry Snitselaar wrote:
>>> [...]
>>
>> I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.
>
> Hi Baolu,
>
> I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and it calls deferred_attach_domain, which sets the domain to the group d
Re: warning from domain_get_iommu
On Thu Feb 06 20, Jerry Snitselaar wrote:
> On Tue Feb 04 20, Jerry Snitselaar wrote:
>> [...]
>
> I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.

Hi Baolu,

I think I understand what is happening here. With the kdump boot, translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping, it returns true but doesn't do the DMA domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and it calls deferred_attach_domain, which sets the domain to the group domain, which in this case is the identity domain. Then it calls domain_get_i
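
The sequence described above can be sketched as a small user-space model. This is only an illustration, not the driver code: every `toy_` name is hypothetical, the model assumes that the lookup warns and fails whenever the attached domain is not a DMA domain, and it assumes that a non-deferred device is switched to a DMA domain during the need-mapping check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the domain types discussed in the thread. */
enum toy_type { TOY_DMA, TOY_IDENTITY };

struct toy_domain {
	enum toy_type type;
};

struct toy_device {
	bool deferred;                   /* models the DEFER_DEVICE_DOMAIN_INFO state */
	struct toy_domain *domain;       /* currently attached domain, NULL if none */
	struct toy_domain *group_domain; /* default domain of the device's group */
};

static struct toy_domain toy_dma_domain = { TOY_DMA };
static struct toy_domain toy_identity_domain = { TOY_IDENTITY };

/* Reports that mapping is needed; skips the DMA domain switch when deferred. */
static bool toy_need_mapping(struct toy_device *dev)
{
	if (dev->deferred)
		return true;             /* assumption: switch is skipped here */
	dev->domain = &toy_dma_domain;
	return true;
}

/* Performs the postponed attach, using whatever the group domain is. */
static void toy_deferred_attach(struct toy_device *dev)
{
	if (dev->deferred) {
		dev->domain = dev->group_domain;
		dev->deferred = false;
	}
}

/* Assumption: the lookup only accepts DMA domains and warns otherwise. */
static bool toy_lookup_ok(const struct toy_domain *d)
{
	if (d == NULL || d->type != TOY_DMA) {
		fprintf(stderr, "toy warning: lookup on a non-DMA domain\n");
		return false;
	}
	return true;
}

/* Mapping path: deferred attach first, then the lookup that may warn. */
static bool toy_map_single(struct toy_device *dev)
{
	toy_deferred_attach(dev);
	return toy_lookup_ok(dev->domain);
}
```

In this model, a device that starts out deferred with an identity group domain passes the need-mapping check and then fails the lookup in the mapping path, while a non-deferred device is switched to the DMA domain first and maps without a warning.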
Re: warning from domain_get_iommu
On Tue Feb 04 20, Jerry Snitselaar wrote:
> [...]

I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.

___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
warning from domain_get_iommu
I'm working on getting a system to reproduce this, and verify it also occurs with 5.5, but I have a report of a case where the kdump kernel gives warnings like the following on an HP DL360 Gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX:
[2.840671] RDX: fff0 RSI: RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12:
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15:
[2.840671] FS: () GS:88ec7f60() knlGS:
[2.840671] CS: 0010 DS: ES: CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671] __intel_map_single+0x62/0x140
[2.840671] intel_alloc_coherent+0xa6/0x130
[2.840671] dma_pool_alloc+0xd8/0x1e0
[2.840671] e_qh_alloc+0x55/0x130
[2.840671] ehci_setup+0x284/0x7b0
[2.840671] ehci_pci_setup+0xa3/0x530
[2.840671] usb_add_hcd+0x2b6/0x800
[2.840671] usb_hcd_pci_probe+0x375/0x460
[2.840671] local_pci_probe+0x41/0x90
[2.840671] pci_device_probe+0x105/0x1b0
[2.840671] driver_probe_device+0x12d/0x460
[2.840671] device_driver_attach+0x50/0x60
[2.840671] __driver_attach+0x61/0x130
[2.840671] ? device_driver_attach+0x60/0x60
[2.840671] bus_for_each_dev+0x77/0xc0
[2.840671] ? klist_add_tail+0x3b/0x70
[2.840671] bus_add_driver+0x14d/0x1e0
[2.840671] ? ehci_hcd_init+0xaa/0xaa
[2.840671] ? do_early_param+0x91/0x91
[2.840671] driver_register+0x6b/0xb0
[2.840671] ? ehci_hcd_init+0xaa/0xaa
[2.840671] do_one_initcall+0x46/0x1c3
[2.840671] ? do_early_param+0x91/0x91
[2.840671] kernel_init_freeable+0x1af/0x258
[2.840671] ? rest_init+0xaa/0xaa
[2.840671] kernel_init+0xa/0xf9
[2.840671] ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it needs DMA mapping instead of the default passthrough. intel_alloc_coherent calls iommu_need_mapping before it calls __intel_map_single, so I'm not sure why it is tripping over the WARN_ON in domain_get_iommu.

One thing I noticed while looking at this is that domain_get_iommu can return NULL. So should there be something like the following in __intel_map_single after the domain_get_iommu call?

	if (!iommu)
		goto error;

Otherwise it is possible to dereference the NULL pointer later.

Regards,
Jerry
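
As an illustration of the guard proposed in the original report, here is a minimal user-space sketch. It is not the driver code: the `toy_` types and functions and the `TOY_MAPPING_ERROR` value are hypothetical stand-ins, and the lookup is simply assumed to return NULL for some domains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct toy_iommu {
	unsigned long base;          /* pretend translation offset */
};

struct toy_domain {
	struct toy_iommu *iommu;     /* NULL models a lookup that fails */
};

#define TOY_MAPPING_ERROR (~0UL)     /* hypothetical error cookie */

/* Lookup that may legitimately return NULL. */
static struct toy_iommu *toy_get_iommu(const struct toy_domain *d)
{
	return d->iommu;
}

/* Mapping helper that bails out instead of dereferencing NULL. */
static unsigned long toy_map_single(const struct toy_domain *d,
				    unsigned long paddr)
{
	struct toy_iommu *iommu = toy_get_iommu(d);

	if (!iommu)
		goto error;          /* the guard suggested in the report */

	return iommu->base + paddr;  /* safe: iommu is known to be non-NULL */

error:
	return TOY_MAPPING_ERROR;
}
```

With the guard in place, a failed lookup turns into an error return the caller can check, rather than a later NULL dereference.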