Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
On Tue Dec 17 19, Jerry Snitselaar wrote:
> On Tue Dec 17 19, Jerry Snitselaar wrote:
>> In addition to checking for a null pointer, verify that
>> info does not have the value DEFER_DEVICE_DOMAIN_INFO or
>> DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
>> __dmar_remove_one_dev_info will panic when trying to access
>> a member of the device_domain_info struct.
>>
>> [1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e
>> [1.464241] PGD 0 P4D 0
>> [1.464241] Oops: [#1] SMP PTI
>> [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1
>> [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
>> [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
>> [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $
>> [1.464241] RSP: :c90dfd10 EFLAGS: 00010082
>> [1.464241] RAX: 0001 RBX: fffe RCX:
>> [1.464241] RDX: 0001 RSI: 0004 RDI: fffe
>> [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039
>> [1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20
>> [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15:
>> [1.464241] FS: () GS:88ec7a60() knlGS:
>> [1.464241] CS: 0010 DS: ES: CR0: 80050033
>> [1.464241] CR2: 004e CR3: 006c7900a001 C 001606b0
>> [1.464241] Call Trace:
>> [1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40
>> [1.464241] intel_iommu_add_device+0x124/0x180
>> [1.464241] ? iommu_probe_device+0x40/0x40
>> [1.464241] add_iommu_group+0xa/0x20
>> [1.464241] bus_for_each_dev+0x77/0xc0
>> [1.464241] ? down_write+0xe/0x40
>> [1.464241] bus_set_iommu+0x85/0xc0
>> [1.464241] intel_iommu_init+0x4b4/0x777
>> [1.464241] ? e820__memblock_setup+0x63/0x63
>> [1.464241] ? do_early_param+0x91/0x91
>> [1.464241] pci_iommu_init+0x19/0x45
>> [1.464241] do_one_initcall+0x46/0x1c3
>> [1.464241] ? do_early_param+0x91/0x91
>> [1.464241] kernel_init_freeable+0x1af/0x258
>> [1.464241] ? rest_init+0xaa/0xaa
>> [1.464241] kernel_init+0xa/0x107
>> [1.464241] ret_from_fork+0x35/0x40
>> [1.464241] Modules linked in:
>> [1.464241] CR2: 004e
>> [1.464241] ---[ end trace 0927d2ba8b8032b5 ]---
>>
>> Cc: Joerg Roedel
>> Cc: Lu Baolu
>> Cc: David Woodhouse
>> Cc: sta...@vger.kernel.org # v5.3+
>> Cc: iommu@lists.linux-foundation.org
>> Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
>> Signed-off-by: Jerry Snitselaar
>> ---
>>  drivers/iommu/intel-iommu.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 0c8d81f56a30..e42a09794fa2 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
>>
>>  	spin_lock_irqsave(&device_domain_lock, flags);
>>  	info = dev->archdata.iommu;
>> -	if (info)
>> +	if (info && info != DEFER_DEVICE_DOMAIN_INFO
>> +	    && info != DUMMY_DEVICE_DOMAIN_INFO)
>>  		__dmar_remove_one_dev_info(info);
>>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>>  }
>> --
>> 2.24.0
>>
>> ___
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
> Nack this.
>
> Apparently the issue is just being seen with the kdump kernel. I'm
> wondering if it is already solved by 6c3a44ed3c55 ("iommu/vt-d: Turn
> off translations at shutdown"). Testing a 5.5 build now.

And a minute later I got a response. The 5.5 kernel hits the original
panic when booting into the kdump kernel.

I need to test with this patch on 5.5, but with a test build of our
kernel with this patch the problem just moves to:

[3.742317] pci :01:00.0: Using iommu dma mapping
[3.744020] pci :01:00.1: Adding to iommu group 86
[3.746697] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0Modules linked in:
[3.746697] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-167.el8.iommu6.x86_64 #1
[3.746697] Hardware name: HP ProLiant DL560 Gen9/ProLiant DL560 Gen9, BIOS P85 07/21/2019
[3.746697] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1d0
[3.746697] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c$
[3.746697] RSP: :c90f3bd8 EFLAGS: 0002
[3.746697] RAX: 0101 RBX: 0046 RCX: 7f17
[3.746697] RDX: RSI: RDI: 82e8a600
[3.746697] RBP: 6fd0ec00 R08: 000
Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
On Tue Dec 17 19, Jerry Snitselaar wrote:
> In addition to checking for a null pointer, verify that
> info does not have the value DEFER_DEVICE_DOMAIN_INFO or
> DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
> __dmar_remove_one_dev_info will panic when trying to access
> a member of the device_domain_info struct.
>
> [1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e
> [1.464241] PGD 0 P4D 0
> [1.464241] Oops: [#1] SMP PTI
> [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1
> [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
> [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
> [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $
> [1.464241] RSP: :c90dfd10 EFLAGS: 00010082
> [1.464241] RAX: 0001 RBX: fffe RCX:
> [1.464241] RDX: 0001 RSI: 0004 RDI: fffe
> [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039
> [1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20
> [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15:
> [1.464241] FS: () GS:88ec7a60() knlGS:
> [1.464241] CS: 0010 DS: ES: CR0: 80050033
> [1.464241] CR2: 004e CR3: 006c7900a001 C 001606b0
> [1.464241] Call Trace:
> [1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40
> [1.464241] intel_iommu_add_device+0x124/0x180
> [1.464241] ? iommu_probe_device+0x40/0x40
> [1.464241] add_iommu_group+0xa/0x20
> [1.464241] bus_for_each_dev+0x77/0xc0
> [1.464241] ? down_write+0xe/0x40
> [1.464241] bus_set_iommu+0x85/0xc0
> [1.464241] intel_iommu_init+0x4b4/0x777
> [1.464241] ? e820__memblock_setup+0x63/0x63
> [1.464241] ? do_early_param+0x91/0x91
> [1.464241] pci_iommu_init+0x19/0x45
> [1.464241] do_one_initcall+0x46/0x1c3
> [1.464241] ? do_early_param+0x91/0x91
> [1.464241] kernel_init_freeable+0x1af/0x258
> [1.464241] ? rest_init+0xaa/0xaa
> [1.464241] kernel_init+0xa/0x107
> [1.464241] ret_from_fork+0x35/0x40
> [1.464241] Modules linked in:
> [1.464241] CR2: 004e
> [1.464241] ---[ end trace 0927d2ba8b8032b5 ]---
>
> Cc: Joerg Roedel
> Cc: Lu Baolu
> Cc: David Woodhouse
> Cc: sta...@vger.kernel.org # v5.3+
> Cc: iommu@lists.linux-foundation.org
> Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
> Signed-off-by: Jerry Snitselaar
> ---
>  drivers/iommu/intel-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0c8d81f56a30..e42a09794fa2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
>
>  	spin_lock_irqsave(&device_domain_lock, flags);
>  	info = dev->archdata.iommu;
> -	if (info)
> +	if (info && info != DEFER_DEVICE_DOMAIN_INFO
> +	    && info != DUMMY_DEVICE_DOMAIN_INFO)
>  		__dmar_remove_one_dev_info(info);
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>  }
> --
> 2.24.0
>
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

Nack this.

Apparently the issue is just being seen with the kdump kernel. I'm
wondering if it is already solved by 6c3a44ed3c55 ("iommu/vt-d: Turn
off translations at shutdown"). Testing a 5.5 build now.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
On Tue, Dec 17, 2019 at 10:56 AM Jerry Snitselaar wrote:
>
> In addition to checking for a null pointer, verify that
> info does not have the value DEFER_DEVICE_DOMAIN_INFO or
> DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
> __dmar_remove_one_dev_info will panic when trying to access
> a member of the device_domain_info struct.
>
> [1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e
> [1.464241] PGD 0 P4D 0
> [1.464241] Oops: [#1] SMP PTI
> [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1
> [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
> [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
> [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $
> [1.464241] RSP: :c90dfd10 EFLAGS: 00010082
> [1.464241] RAX: 0001 RBX: fffe RCX:
> [1.464241] RDX: 0001 RSI: 0004 RDI: fffe
> [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039
> [1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20
> [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15:
> [1.464241] FS: () GS:88ec7a60() knlGS:
> [1.464241] CS: 0010 DS: ES: CR0: 80050033
> [1.464241] CR2: 004e CR3: 006c7900a001 C 001606b0
> [1.464241] Call Trace:
> [1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40
> [1.464241] intel_iommu_add_device+0x124/0x180
> [1.464241] ? iommu_probe_device+0x40/0x40
> [1.464241] add_iommu_group+0xa/0x20
> [1.464241] bus_for_each_dev+0x77/0xc0
> [1.464241] ? down_write+0xe/0x40
> [1.464241] bus_set_iommu+0x85/0xc0
> [1.464241] intel_iommu_init+0x4b4/0x777
> [1.464241] ? e820__memblock_setup+0x63/0x63
> [1.464241] ? do_early_param+0x91/0x91
> [1.464241] pci_iommu_init+0x19/0x45
> [1.464241] do_one_initcall+0x46/0x1c3
> [1.464241] ? do_early_param+0x91/0x91
> [1.464241] kernel_init_freeable+0x1af/0x258
> [1.464241] ? rest_init+0xaa/0xaa
> [1.464241] kernel_init+0xa/0x107
> [1.464241] ret_from_fork+0x35/0x40
> [1.464241] Modules linked in:
> [1.464241] CR2: 004e
> [1.464241] ---[ end trace 0927d2ba8b8032b5 ]---
>
> Cc: Joerg Roedel
> Cc: Lu Baolu
> Cc: David Woodhouse
> Cc: sta...@vger.kernel.org # v5.3+
> Cc: iommu@lists.linux-foundation.org
> Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
> Signed-off-by: Jerry Snitselaar
> ---
>  drivers/iommu/intel-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0c8d81f56a30..e42a09794fa2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
>
>  	spin_lock_irqsave(&device_domain_lock, flags);
>  	info = dev->archdata.iommu;
> -	if (info)
> +	if (info && info != DEFER_DEVICE_DOMAIN_INFO
> +	    && info != DUMMY_DEVICE_DOMAIN_INFO)
>  		__dmar_remove_one_dev_info(info);
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
>  }
> --
> 2.24.0
>
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>

I'm not positive that the DUMMY_DEVICE_DOMAIN_INFO check is needed.
It seemed like there were checks for that most places before
dmar_remove_one_dev_info would be called, but I wasn't certain.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
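
For anyone wondering which sentinel the oops actually hit: the faulting
instruction bytes <4c> 8b 67 50 decode to "mov 0x50(%rdi),%r12", and RDI
in the log ends in fffe. The rough sketch below redoes that address
arithmetic in userspace C. It assumes the two sentinels are the pointer
casts of -1 (DUMMY) and -2 (DEFER); the names DUMMY_SENTINEL,
DEFER_SENTINEL, MEMBER_OFFSET and fault_address are made up for this
illustration and are not taken from the driver.

```c
#include <stdint.h>

/* Assumption: the driver's sentinels are pointer casts of -1 and -2. */
#define DUMMY_SENTINEL ((uintptr_t)-1)
#define DEFER_SENTINEL ((uintptr_t)-2)

/* Displacement from the faulting instruction: mov 0x50(%rdi),%r12 */
#define MEMBER_OFFSET 0x50u

/*
 * Address the CPU touches when it loads the member at MEMBER_OFFSET
 * through a pointer whose value is `base`. Unsigned arithmetic wraps
 * around, just as the hardware address calculation does.
 */
static uintptr_t fault_address(uintptr_t base)
{
	return base + MEMBER_OFFSET;
}
```

Under those assumptions fault_address(DEFER_SENTINEL) is 0x4e, which
matches the CR2 value in the log, while DUMMY_SENTINEL would have given
0x4f. So this particular oops looks like the DEFER case; the log by
itself says nothing about whether the DUMMY check is also needed.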
[RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
In addition to checking for a null pointer, verify that
info does not have the value DEFER_DEVICE_DOMAIN_INFO or
DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
__dmar_remove_one_dev_info will panic when trying to access
a member of the device_domain_info struct.

[1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e
[1.464241] PGD 0 P4D 0
[1.464241] Oops: [#1] SMP PTI
[1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1
[1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
[1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $
[1.464241] RSP: :c90dfd10 EFLAGS: 00010082
[1.464241] RAX: 0001 RBX: fffe RCX:
[1.464241] RDX: 0001 RSI: 0004 RDI: fffe
[1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039
[1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20
[1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15:
[1.464241] FS: () GS:88ec7a60() knlGS:
[1.464241] CS: 0010 DS: ES: CR0: 80050033
[1.464241] CR2: 004e CR3: 006c7900a001 C 001606b0
[1.464241] Call Trace:
[1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40
[1.464241] intel_iommu_add_device+0x124/0x180
[1.464241] ? iommu_probe_device+0x40/0x40
[1.464241] add_iommu_group+0xa/0x20
[1.464241] bus_for_each_dev+0x77/0xc0
[1.464241] ? down_write+0xe/0x40
[1.464241] bus_set_iommu+0x85/0xc0
[1.464241] intel_iommu_init+0x4b4/0x777
[1.464241] ? e820__memblock_setup+0x63/0x63
[1.464241] ? do_early_param+0x91/0x91
[1.464241] pci_iommu_init+0x19/0x45
[1.464241] do_one_initcall+0x46/0x1c3
[1.464241] ? do_early_param+0x91/0x91
[1.464241] kernel_init_freeable+0x1af/0x258
[1.464241] ? rest_init+0xaa/0xaa
[1.464241] kernel_init+0xa/0x107
[1.464241] ret_from_fork+0x35/0x40
[1.464241] Modules linked in:
[1.464241] CR2: 004e
[1.464241] ---[ end trace 0927d2ba8b8032b5 ]---

Cc: Joerg Roedel
Cc: Lu Baolu
Cc: David Woodhouse
Cc: sta...@vger.kernel.org # v5.3+
Cc: iommu@lists.linux-foundation.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..e42a09794fa2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)

 	spin_lock_irqsave(&device_domain_lock, flags);
 	info = dev->archdata.iommu;
-	if (info)
+	if (info && info != DEFER_DEVICE_DOMAIN_INFO
+	    && info != DUMMY_DEVICE_DOMAIN_INFO)
 		__dmar_remove_one_dev_info(info);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
-- 
2.24.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
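
The idea behind the check in the patch can be tried outside the kernel.
What follows is only a userspace sketch, not driver code: the struct is
a one-member stand-in, the sentinel values (-1 and -2 cast to a pointer)
are an assumption about how the driver defines them, and the helper name
info_is_dereferenceable is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real structure, which has many more members. */
struct device_domain_info {
	int placeholder;
};

/* Assumed sentinel encodings: non-NULL values that are not valid objects. */
#define DUMMY_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(intptr_t)-1)
#define DEFER_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(intptr_t)-2)

/*
 * Hypothetical helper: true only when info points at a real object.
 * A plain "if (info)" test lets both sentinels through because they
 * are non-NULL, which is the situation the patch guards against.
 */
static bool info_is_dereferenceable(const struct device_domain_info *info)
{
	return info != NULL &&
	       info != DEFER_DEVICE_DOMAIN_INFO &&
	       info != DUMMY_DEVICE_DOMAIN_INFO;
}
```

With this helper the patched condition would read
"if (info_is_dereferenceable(info))". Whether a shared helper is
preferable to the open-coded comparison in the diff is a style question
the patch does not address.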