Re: iommu_sva_bind_device question
On Sat, Jun 25, 2022 at 12:52:27PM -0700, Fenghua Yu wrote:
> Hi, Jerry and Baolu,
>
> On Fri, Jun 24, 2022 at 07:47:30AM -0700, Jerry Snitselaar wrote:
> > > > > > > Hi Baolu & Dave,
> > > > > fails.
> > > > >
> > > > > You also will get the following warning if you don't have scalable
> > > > > mode enabled (either not enabled by default, or if enabled by default
> > > > > and passed intel_iommu=on,sm_off):
> > > >
> > > > If scalable mode is disabled, iommu_dev_enable_feature(IOMMU_SVA) will
> > > > return failure, hence driver should not call iommu_sva_bind_device().
> > > > I guess below will disappear if above is fixed in the idxd driver.
>
> Yes, Jerry's patch fixes the WARNING as well.
>
> > > >
> > > > Best regards,
> > > > baolu
> > > >
> > >
> > > It looks like there was a recent maintainer change, and Fenghua is now
> > > the maintainer. Fenghua thoughts on this? With 42a1b73852c4
> > > ("dmaengine: idxd: Separate user and kernel pasid enabling") the code
> > > no longer depends on iommu_dev_feature_enable succeeding. Testing with
> > > something like this works (ran dmatest without sm_on, and
> > > dsa_user_test_runner.sh with sm_on, plus booting with various
> > > intel_iommu= combinations):
> > >
> > > diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> > > index 355fb3ef4cbf..5b49fd5c1e25 100644
> > > --- a/drivers/dma/idxd/init.c
> > > +++ b/drivers/dma/idxd/init.c
> > > @@ -514,13 +514,14 @@ static int idxd_probe(struct idxd_device *idxd)
> > > if (IS_ENABLED(CONFIG_INTEL_IDXD_SVM) && sva) {
> > > if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA))
> > > dev_warn(dev, "Unable to turn on user SVA
> > > feature.\n");
> > > - else
> > > + else {
> > > set_bit(IDXD_FLAG_USER_PASID_ENABLED,
> > > &idxd->flags);
> > >
> > > - if (idxd_enable_system_pasid(idxd))
>
> Please add "{" after this if.
>
> > > - dev_warn(dev, "No in-kernel DMA with PASID.\n");
> > > - else
> then "}" before this else.
>
> > > - set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
> > > + if (idxd_enable_system_pasid(idxd))
> > > + dev_warn(dev, "No in-kernel DMA with
> > > PASID.\n");
> > > + else
> > > + set_bit(IDXD_FLAG_PASID_ENABLED,
> > > &idxd->flags);
> > > + }
> > > } else if (!sva) {
> > > dev_warn(dev, "User forced SVA off via module param.\n");
> > > }
>
> The patch was copied/pasted here. So the tabs are lost at beginning of each
> line. So it cannot be applied. Please change the tabs back.
>
> Could you please send this patch in a separate email so that it has a
> right patch format and description and ready to be picked up?
>

Sure, if you feel this is the correct solution. Just to be clear you would
like the end result to be:

	if (IS_ENABLED(CONFIG_INTEL_IDXD_SVM) && sva) {
		if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA))
			dev_warn(dev, "Unable to turn on user SVA feature.\n");
		else {
			set_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);

			if (idxd_enable_system_pasid(idxd)) {
				dev_warn(dev, "No in-kernel DMA with PASID.\n");
			} else
				set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
		}
	} else if (!sva) {
		dev_warn(dev, "User forced SVA off via module param.\n");
	}

> > >
> > > The commit description is a bit confusing, because it talks about there
> > > being no dependency, but ties user pasid to enabling/disabling the SVA
> > > feature, which system pasid would depend on as well.
> > >
> > > Regards,
> > > Jerry
> >
> > Things like that warning message "Unable to turn on user SVA feature" when
> > iommu_dev_enable_feature fails though seems to be misleading with user
> > inserted in
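The control flow discussed in this message can be modeled outside the kernel. The following is a minimal sketch, not the idxd driver code: `fake_enable_sva()`, `fake_enable_system_pasid()`, `struct fake_env`, and the `FLAG_*` bits are hypothetical stand-ins for iommu_dev_enable_feature(), idxd_enable_system_pasid(), and the IDXD_FLAG_* bits. It only illustrates that the system PASID path (which leads to the bind call) is reached solely when enabling the SVA feature succeeded.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two IDXD_FLAG_*_PASID_ENABLED bits. */
#define FLAG_USER_PASID   (1u << 0)
#define FLAG_KERNEL_PASID (1u << 1)

struct fake_env {
	bool sva_feature_ok;  /* would enabling the SVA feature succeed? */
	bool system_pasid_ok; /* would enabling the system PASID succeed? */
	int bind_calls;       /* how many times the bind path was reached */
};

/* Mock helper: returns 0 on success, nonzero on failure. */
static int fake_enable_sva(const struct fake_env *env)
{
	return env->sva_feature_ok ? 0 : -1;
}

/* Mock helper: counts each attempt so a test can see whether it ran. */
static int fake_enable_system_pasid(struct fake_env *env)
{
	env->bind_calls++;
	return env->system_pasid_ok ? 0 : -1;
}

/* Returns the flags that would be set during probe in this model. */
static unsigned int probe_pasid_flags(struct fake_env *env, bool sva_param)
{
	unsigned int flags = 0;

	if (!sva_param)
		return flags; /* user forced SVA off */
	if (fake_enable_sva(env))
		return flags; /* feature enable failed: never try to bind */

	flags |= FLAG_USER_PASID;
	if (!fake_enable_system_pasid(env))
		flags |= FLAG_KERNEL_PASID;
	return flags;
}
```

With `sva_feature_ok` set to false, `bind_calls` stays at zero, which is the property the proposed patch is after.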
Re: iommu_sva_bind_device question
On Fri, Jun 24, 2022 at 08:55:08AM +0800, Baolu Lu wrote: > On 2022/6/24 01:02, Jerry Snitselaar wrote: > > Hi Baolu & Dave, > > > > I noticed last night that on a Sapphire Rapids system if you boot without > > intel_iommu=on, the idxd driver will crash during probe in > > iommu_sva_bind_device(). > > Should there be a sanity check before calling dev_iommu_ops(), or is the > > expectation > > that the caller would verify it is safe to call? This seemed to be > > uncovered by > > the combination of 3f6634d997db ("iommu: Use right way to retrieve > > iommu_ops"), and > > 42a1b73852c4 ("dmaengine: idxd: Separate user and kernel pasid enabling"). > > > > [ 21.423729] BUG: kernel NULL pointer dereference, address: > > 0038 > > [ 21.445108] #PF: supervisor read access in kernel mode > > [ 21.450912] #PF: error_code(0x) - not-present page > > [ 21.456706] PGD 0 > > [ 21.459047] Oops: [#1] PREEMPT SMP NOPTI > > [ 21.464004] CPU: 0 PID: 1420 Comm: kworker/0:3 Not tainted > > 5.19.0-0.rc3.27.eln120.x86_64 #1 > > [ 21.464011] Hardware name: Intel Corporation EAGLESTREAM/EAGLESTREAM, > > BIOS EGSDCRB1.SYS.0067.D12.2110190954 10/19/2021 > > [ 21.464015] Workqueue: events work_for_cpu_fn > > [ 21.464030] RIP: 0010:iommu_sva_bind_device+0x1d/0xe0 > > [ 21.464046] Code: c3 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 > > 41 57 41 56 49 89 d6 41 55 41 54 55 53 48 83 ec 08 48 8b 87 d8 02 00 00 > > <48> 8b 40 38 48 8b 50 10 48 83 7a 70 00 48 89 14 24 0f 84 91 00 00 > > [ 21.464050] RSP: 0018:ff7245d9096b7db8 EFLAGS: 00010296 > > [ 21.464054] RAX: RBX: ff1eadeec8a51000 RCX: > > > > [ 21.464058] RDX: ff7245d9096b7e24 RSI: RDI: > > ff1eadeec8a510d0 > > [ 21.464060] RBP: ff1eadeec8a51000 R08: b1a12300 R09: > > ff1eadffbfce25b4 > > [ 21.464062] R10: R11: 0038 R12: > > c09f8000 > > [ 21.464065] R13: ff1eadeec8a510d0 R14: ff7245d9096b7e24 R15: > > ff1eaddf54429000 > > [ 21.464067] FS: () GS:ff1eadee7f60() > > knlGS: > > [ 21.464070] CS: 0010 DS: ES: CR0: 80050033 > > [ 21.464072] CR2: 
0038 CR3: 0008c0e10006 CR4: 00771ef0
> > [ 21.464074] DR0: DR1: DR2:
> > [ 21.464076] DR3: DR6: fffe07f0 DR7: 0400
> > [ 21.464078] PKRU: 5554
> > [ 21.464079] Call Trace:
> > [ 21.464083]
> > [ 21.464092] idxd_pci_probe+0x259/0x1070 [idxd]
> > [ 21.464121] local_pci_probe+0x3e/0x80
> > [ 21.464132] work_for_cpu_fn+0x13/0x20
> > [ 21.464136] process_one_work+0x1c4/0x380
> > [ 21.464143] worker_thread+0x1ab/0x380
> > [ 21.464147] ? _raw_spin_lock_irqsave+0x23/0x50
> > [ 21.464158] ? process_one_work+0x380/0x380
> > [ 21.464161] kthread+0xe6/0x110
> > [ 21.464168] ? kthread_complete_and_exit+0x20/0x20
> > [ 21.464172] ret_from_fork+0x1f/0x30
> >
> > I figure either there needs to be a check in iommu_sva_bind_device, or
> > idxd needs to check in idxd_enable_system_pasid that
> > idxd->pdev->dev.iommu is not null before it tries calling
> > iommu_sva_bind_device.
>
> As documented around the iommu_sva_bind_device() interface:
>
>  * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
>  * initialize the required SVA features.
>
> idxd->pdev->dev.iommu should be checked in there.
>
> Dave, any thoughts?
>
> Best regards,
> baolu

Duh, sorry I missed that in the comments.

It calls iommu_dev_enable_feature(), but then goes into code that calls
iommu_sva_bind_device whether or not iommu_dev_enable_feature() fails.

You also will get the following warning if you don't have scalable
mode enabled (either not enabled by default, or if enabled by default
and passed intel_iommu=on,sm_off):

[ 24.645784] idxd :6a:01.0: enabling device (0144 -> 0146)
[ 24.645871] idxd :6a:01.0: Unable to turn on user SVA feature.
[ 24.645932] [ cut here ]
[ 24.645935] WARNING: CPU: 0 PID: 422 at drivers/iommu/intel/pasid.c:253 intel_pasid_get_entry.isra.0+0xcd/0xe0
[ 24.675872] Modules linked in: intel_uncore(+) drm_ttm_helper isst_if_mbox_pci(+) idxd(+) snd i2c_i801(+) isst_if_mmio ttm isst_if_common mei fjes(+) soundcore intel_vsec i2c_ismt i2c_smbus idxd_bus ipmi_ssif acpi_ipmi ipmi_si acpi_pad acpi_power_me
iommu_sva_bind_device question
Hi Baolu & Dave,

I noticed last night that on a Sapphire Rapids system, if you boot without
intel_iommu=on, the idxd driver will crash during probe in
iommu_sva_bind_device(). Should there be a sanity check before calling
dev_iommu_ops(), or is the expectation that the caller would verify it is
safe to call? This seemed to be uncovered by the combination of
3f6634d997db ("iommu: Use right way to retrieve iommu_ops") and
42a1b73852c4 ("dmaengine: idxd: Separate user and kernel pasid enabling").

[ 21.423729] BUG: kernel NULL pointer dereference, address: 0038
[ 21.445108] #PF: supervisor read access in kernel mode
[ 21.450912] #PF: error_code(0x) - not-present page
[ 21.456706] PGD 0
[ 21.459047] Oops: [#1] PREEMPT SMP NOPTI
[ 21.464004] CPU: 0 PID: 1420 Comm: kworker/0:3 Not tainted 5.19.0-0.rc3.27.eln120.x86_64 #1
[ 21.464011] Hardware name: Intel Corporation EAGLESTREAM/EAGLESTREAM, BIOS EGSDCRB1.SYS.0067.D12.2110190954 10/19/2021
[ 21.464015] Workqueue: events work_for_cpu_fn
[ 21.464030] RIP: 0010:iommu_sva_bind_device+0x1d/0xe0
[ 21.464046] Code: c3 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55 41 54 55 53 48 83 ec 08 48 8b 87 d8 02 00 00 <48> 8b 40 38 48 8b 50 10 48 83 7a 70 00 48 89 14 24 0f 84 91 00 00
[ 21.464050] RSP: 0018:ff7245d9096b7db8 EFLAGS: 00010296
[ 21.464054] RAX: RBX: ff1eadeec8a51000 RCX:
[ 21.464058] RDX: ff7245d9096b7e24 RSI: RDI: ff1eadeec8a510d0
[ 21.464060] RBP: ff1eadeec8a51000 R08: b1a12300 R09: ff1eadffbfce25b4
[ 21.464062] R10: R11: 0038 R12: c09f8000
[ 21.464065] R13: ff1eadeec8a510d0 R14: ff7245d9096b7e24 R15: ff1eaddf54429000
[ 21.464067] FS: () GS:ff1eadee7f60() knlGS:
[ 21.464070] CS: 0010 DS: ES: CR0: 80050033
[ 21.464072] CR2: 0038 CR3: 0008c0e10006 CR4: 00771ef0
[ 21.464074] DR0: DR1: DR2:
[ 21.464076] DR3: DR6: fffe07f0 DR7: 0400
[ 21.464078] PKRU: 5554
[ 21.464079] Call Trace:
[ 21.464083]
[ 21.464092] idxd_pci_probe+0x259/0x1070 [idxd]
[ 21.464121] local_pci_probe+0x3e/0x80
[ 21.464132] work_for_cpu_fn+0x13/0x20
[ 21.464136] process_one_work+0x1c4/0x380
[ 21.464143] worker_thread+0x1ab/0x380
[ 21.464147] ? _raw_spin_lock_irqsave+0x23/0x50
[ 21.464158] ? process_one_work+0x380/0x380
[ 21.464161] kthread+0xe6/0x110
[ 21.464168] ? kthread_complete_and_exit+0x20/0x20
[ 21.464172] ret_from_fork+0x1f/0x30

I figure either there needs to be a check in iommu_sva_bind_device, or
idxd needs to check in idxd_enable_system_pasid that
idxd->pdev->dev.iommu is not null before it tries calling
iommu_sva_bind_device.

Regards,
Jerry
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
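The "check before dereferencing" option raised in this message could look roughly like the sketch below. It is an illustration under stated assumptions, not kernel code: the `mock_*` structs are heavily simplified stand-ins for struct device, struct dev_iommu, and struct iommu_ops, and `mock_sva_bind()` is a hypothetical name. The point is only that a NULL per-device IOMMU pointer is turned into an error return instead of a NULL pointer dereference.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified mocks; the real kernel structures have many more fields. */
struct mock_iommu_ops {
	int (*sva_bind)(void);
};

struct mock_dev_iommu {
	const struct mock_iommu_ops *ops;
};

struct mock_device {
	struct mock_dev_iommu *iommu; /* NULL when no IOMMU manages the device */
};

/* Trivial bind callback used to exercise the success path. */
static int mock_bind_ok(void)
{
	return 0;
}

/* Hypothetical guarded bind: validate every pointer before using it. */
static int mock_sva_bind(struct mock_device *dev)
{
	if (!dev || !dev->iommu || !dev->iommu->ops || !dev->iommu->ops->sva_bind)
		return -ENODEV;

	return dev->iommu->ops->sva_bind();
}
```

Whether such a check belongs in the IOMMU core or in the calling driver is exactly the question the message asks; the sketch takes no position on that.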
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Thu, Jun 23, 2022 at 10:29:35AM +0800, Baolu Lu wrote: > On 2022/6/22 23:05, Jerry Snitselaar wrote: > > On Wed, Jun 22, 2022 at 7:52 AM Baolu Lu wrote: > > > On 2022/6/16 02:36, Steve Wahl wrote: > > > > To support up to 64 sockets with 10 DMAR units each (640), make the > > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > > > > set. > > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > > > > remapping doesn't support X2APIC mode x2apic disabled"; and the system > > > > fails to boot properly. > > > > > > > > Signed-off-by: Steve Wahl > > > > Reviewed-by: Kevin Tian > > > > --- > > > > > > > > Note that we could not find a reason for connecting > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > > > > it seemed like the two would continue to match on earlier processors. > > > > There doesn't appear to be kernel code that assumes that the value of > > > > one is related to the other. > > > > > > > > v2: Make this value a config option, rather than a fixed constant. The > > > > default > > > > values should match previous configuration except in the MAXSMP case. > > > > Keeping the > > > > value at a power of two was requested by Kevin Tian. > > > > > > > > v3: Make the config option dependent upon DMAR_TABLE, as it is not used > > > > without this. 
> > > > > > > >drivers/iommu/intel/Kconfig | 7 +++ > > > >include/linux/dmar.h| 6 +- > > > >2 files changed, 8 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > > > > index 39a06d245f12..07aaebcb581d 100644 > > > > --- a/drivers/iommu/intel/Kconfig > > > > +++ b/drivers/iommu/intel/Kconfig > > > > @@ -9,6 +9,13 @@ config DMAR_PERF > > > >config DMAR_DEBUG > > > >bool > > > > > > > > +config DMAR_UNITS_SUPPORTED > > > > + int "Number of DMA Remapping Units supported" > > > > + depends on DMAR_TABLE > > > > + default 1024 if MAXSMP > > > > + default 128 if X86_64 > > > > + default 64 > > > With this patch applied, the IOMMU configuration looks like: > > > > > > [*] AMD IOMMU support > > > AMD IOMMU Version 2 driver > > > [*] Enable AMD IOMMU internals in DebugFS > > > (1024) Number of DMA Remapping Units supported <<<< NEW > > > [*] Support for Intel IOMMU using DMA Remapping Devices > > > [*] Export Intel IOMMU internals in Debugfs > > > [*] Support for Shared Virtual Memory with Intel IOMMU > > > [*] Enable Intel DMA Remapping Devices by default > > > [*] Enable Intel IOMMU scalable mode by default > > > [*] Support for Interrupt Remapping > > > [*] OMAP IOMMU Support > > > [*] Export OMAP IOMMU internals in DebugFS > > > [*] Rockchip IOMMU Support > > > > > > The NEW item looks confusing. It looks to be a generic configurable > > > value though it's actually Intel DMAR specific. Any thoughts? > > > > > > Best regards, > > > baolu > > > > > Would moving it under INTEL_IOMMU at least have it show up below > > "Support for Intel IOMMU using DMA Remapping Devices"? I'm not sure it > > can be better than that, because IRQ_REMAP selects DMAR_TABLE, so we > > can't stick it in the if INTEL_IOMMU section. > > It's more reasonable to move it under INTEL_IOMMU, but the trouble is > that this also stands even if INTEL_IOMMU is not configured. 
My thought was only that, with it placed after the 'config INTEL_IOMMU'
block and before 'if INTEL_IOMMU', it would show up like:

[*] Support for Intel IOMMU using DMA Remapping Devices
(1024) Number of DMA Remapping Units supported <<<< NEW

>
> The real problem here is that the iommu sequence ID overflows if
> DMAR_UNITS_SUPPORTED is not big enough. This is purely a software
> implementation issue, I am not sure whether user opt-in when building a
> kernel package could help a lot here.
>

Is this something that could be figured out when parsing the dmar table?
It looks like currently iommu_refcnt[], iommu_did[], and dmar_seq_ids[]
depend on it.

Regards,
Jerry

> If we can't find a better way, can we just step back?
>
> Best regards,
> baolu
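The idea of working the number out while parsing the DMAR table can be sketched as a counting pass over the table's remapping structures. This is a rough illustration, not how the kernel parses the table: `count_drhd_units()` and `SKETCH_TYPE_DRHD` are hypothetical names, the buffer is assumed to hold only the remapping structures that follow the fixed table header, and each structure is assumed to start with a 16-bit type and a 16-bit length stored in the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TYPE_DRHD 0 /* assumed type value of a DRHD structure */

/*
 * Count DRHD entries in a buffer of remapping structures.
 * Returns the count, or -1 if an entry has an impossible length.
 */
static int count_drhd_units(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	int count = 0;

	while (off + 4 <= len) {
		uint16_t type, length;

		/* memcpy avoids unaligned reads of the two header fields. */
		memcpy(&type, buf + off, sizeof(type));
		memcpy(&length, buf + off + 2, sizeof(length));

		if (length < 4 || length > len - off)
			return -1; /* malformed entry, stop walking */
		if (type == SKETCH_TYPE_DRHD)
			count++;
		off += length;
	}
	return count;
}
```

A count obtained this way could in principle size the per-unit arrays dynamically instead of relying on a build-time constant, though the arrays mentioned above would then need to be allocated at runtime.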
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Wed, Jun 22, 2022 at 7:52 AM Baolu Lu wrote: > > On 2022/6/16 02:36, Steve Wahl wrote: > > To support up to 64 sockets with 10 DMAR units each (640), make the > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > > set. > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > > remapping doesn't support X2APIC mode x2apic disabled"; and the system > > fails to boot properly. > > > > Signed-off-by: Steve Wahl > > Reviewed-by: Kevin Tian > > --- > > > > Note that we could not find a reason for connecting > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > > it seemed like the two would continue to match on earlier processors. > > There doesn't appear to be kernel code that assumes that the value of > > one is related to the other. > > > > v2: Make this value a config option, rather than a fixed constant. The > > default > > values should match previous configuration except in the MAXSMP case. > > Keeping the > > value at a power of two was requested by Kevin Tian. > > > > v3: Make the config option dependent upon DMAR_TABLE, as it is not used > > without this. 
> >
> >  drivers/iommu/intel/Kconfig | 7 +++
> >  include/linux/dmar.h        | 6 +-
> >  2 files changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
> > index 39a06d245f12..07aaebcb581d 100644
> > --- a/drivers/iommu/intel/Kconfig
> > +++ b/drivers/iommu/intel/Kconfig
> > @@ -9,6 +9,13 @@ config DMAR_PERF
> >  config DMAR_DEBUG
> >  	bool
> >
> > +config DMAR_UNITS_SUPPORTED
> > +	int "Number of DMA Remapping Units supported"
> > +	depends on DMAR_TABLE
> > +	default 1024 if MAXSMP
> > +	default 128 if X86_64
> > +	default 64
>
> With this patch applied, the IOMMU configuration looks like:
>
> [*] AMD IOMMU support
> AMD IOMMU Version 2 driver
> [*] Enable AMD IOMMU internals in DebugFS
> (1024) Number of DMA Remapping Units supported <<<< NEW
> [*] Support for Intel IOMMU using DMA Remapping Devices
> [*] Export Intel IOMMU internals in Debugfs
> [*] Support for Shared Virtual Memory with Intel IOMMU
> [*] Enable Intel DMA Remapping Devices by default
> [*] Enable Intel IOMMU scalable mode by default
> [*] Support for Interrupt Remapping
> [*] OMAP IOMMU Support
> [*] Export OMAP IOMMU internals in DebugFS
> [*] Rockchip IOMMU Support
>
> The NEW item looks confusing. It looks to be a generic configurable
> value though it's actually Intel DMAR specific. Any thoughts?
>
> Best regards,
> baolu
>

Would moving it under INTEL_IOMMU at least have it show up below
"Support for Intel IOMMU using DMA Remapping Devices"? I'm not sure it
can be better than that, because IRQ_REMAP selects DMAR_TABLE, so we
can't stick it in the if INTEL_IOMMU section.

Regards,
Jerry
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Wed, Jun 15, 2022 at 01:36:50PM -0500, Steve Wahl wrote: > To support up to 64 sockets with 10 DMAR units each (640), make the > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > set. > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > remapping doesn't support X2APIC mode x2apic disabled"; and the system > fails to boot properly. > > Signed-off-by: Steve Wahl > Reviewed-by: Kevin Tian Reviewed-by: Jerry Snitselaar > --- > > Note that we could not find a reason for connecting > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > it seemed like the two would continue to match on earlier processors. > There doesn't appear to be kernel code that assumes that the value of > one is related to the other. > > v2: Make this value a config option, rather than a fixed constant. The > default > values should match previous configuration except in the MAXSMP case. > Keeping the > value at a power of two was requested by Kevin Tian. > > v3: Make the config option dependent upon DMAR_TABLE, as it is not used > without this. 
>
>  drivers/iommu/intel/Kconfig | 7 +++
>  include/linux/dmar.h        | 6 +-
>  2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
> index 39a06d245f12..07aaebcb581d 100644
> --- a/drivers/iommu/intel/Kconfig
> +++ b/drivers/iommu/intel/Kconfig
> @@ -9,6 +9,13 @@ config DMAR_PERF
>  config DMAR_DEBUG
>  	bool
>
> +config DMAR_UNITS_SUPPORTED
> +	int "Number of DMA Remapping Units supported"
> +	depends on DMAR_TABLE
> +	default 1024 if MAXSMP
> +	default 128 if X86_64
> +	default 64
> +
>  config INTEL_IOMMU
>  	bool "Support for Intel IOMMU using DMA Remapping Devices"
>  	depends on PCI_MSI && ACPI && (X86 || IA64)
> diff --git a/include/linux/dmar.h b/include/linux/dmar.h
> index 45e903d84733..0c03c1845c23 100644
> --- a/include/linux/dmar.h
> +++ b/include/linux/dmar.h
> @@ -18,11 +18,7 @@
>
>  struct acpi_dmar_header;
>
> -#ifdef CONFIG_X86
> -# define DMAR_UNITS_SUPPORTED MAX_IO_APICS
> -#else
> -# define DMAR_UNITS_SUPPORTED 64
> -#endif
> +#define DMAR_UNITS_SUPPORTED CONFIG_DMAR_UNITS_SUPPORTED
>
>  /* DMAR Flags */
>  #define DMAR_INTR_REMAP 0x1
> --
> 2.26.2
>
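The failure mode described in the commit message ("Failed to allocate seq_id" once the hardware has more units than the compile-time limit) can be illustrated with a toy allocator. This is a sketch under assumptions, not the kernel's implementation: `SKETCH_UNITS_SUPPORTED`, `seq_id_used[]`, and `sketch_alloc_seq_id()` are hypothetical names, and the limit is set to 4 only to keep the example small.

```c
#include <stdbool.h>

#define SKETCH_UNITS_SUPPORTED 4 /* stand-in for DMAR_UNITS_SUPPORTED */

/* One slot per possible unit; fixed at build time, like the real limit. */
static bool seq_id_used[SKETCH_UNITS_SUPPORTED];

/* Return the first free sequence ID, or -1 when every ID is taken. */
static int sketch_alloc_seq_id(void)
{
	for (int i = 0; i < SKETCH_UNITS_SUPPORTED; i++) {
		if (!seq_id_used[i]) {
			seq_id_used[i] = true;
			return i;
		}
	}
	return -1; /* more units present than the limit allows */
}
```

In this model the fifth call fails no matter how the hardware is laid out, which is why raising the limit (or making it configurable) is needed for machines with more units than the old constant allowed.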
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote: > On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote: > > On 2022/6/14 09:54, Jerry Snitselaar wrote: > > > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu wrote: > > > > > > > > On 2022/6/14 09:44, Jerry Snitselaar wrote: > > > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu > > > > > wrote: > > > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote: > > > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote: > > > > > > > > To support up to 64 sockets with 10 DMAR units each (640), make > > > > > > > > the > > > > > > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > > > > > > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when > > > > > > > > MAXSMP is > > > > > > > > set. > > > > > > > > > > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED > > > > > > > > (previously set > > > > > > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: > > > > > > > > Failed to > > > > > > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and > > > > > > > > "x2apic: IRQ > > > > > > > > remapping doesn't support X2APIC mode x2apic disabled"; and the > > > > > > > > system > > > > > > > > fails to boot properly. > > > > > > > > > > > > > > > > Signed-off-by: Steve Wahl > > > > > > > > --- > > > > > > > > > > > > > > > > Note that we could not find a reason for connecting > > > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. > > > > > > > > Perhaps > > > > > > > > it seemed like the two would continue to match on earlier > > > > > > > > processors. > > > > > > > > There doesn't appear to be kernel code that assumes that the > > > > > > > > value of > > > > > > > > one is related to the other. > > > > > > > > > > > > > > > > v2: Make this value a config option, rather than a fixed > > > > > > > > constant. The default > > > > > > > > values should match previous configuration except in the MAXSMP > > > > > > > > case. 
Keeping the > > > > > > > > value at a power of two was requested by Kevin Tian. > > > > > > > > > > > > > > > > drivers/iommu/intel/Kconfig | 6 ++ > > > > > > > > include/linux/dmar.h| 6 +- > > > > > > > > 2 files changed, 7 insertions(+), 5 deletions(-) > > > > > > > > > > > > > > > > diff --git a/drivers/iommu/intel/Kconfig > > > > > > > > b/drivers/iommu/intel/Kconfig > > > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644 > > > > > > > > --- a/drivers/iommu/intel/Kconfig > > > > > > > > +++ b/drivers/iommu/intel/Kconfig > > > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF > > > > > > > > config DMAR_DEBUG > > > > > > > >bool > > > > > > > > > > > > > > > > +config DMAR_UNITS_SUPPORTED > > > > > > > > +int "Number of DMA Remapping Units supported" > > > > > > > Also, should there be a "depends on (X86 || IA64)" here? > > > > > > Do you have any compilation errors or warnings? > > > > > > > > > > > > Best regards, > > > > > > baolu > > > > > > > > > > > I think it is probably harmless since it doesn't get used elsewhere, > > > > > but our tooling was complaining to me because DMAR_UNITS_SUPPORTED was > > > > > being autogenerated into the configs for the non-x86 architectures we > > > > > build (aarch64, s390x, ppcle64). > > > > > We have files corresponding to the config options that it looks at, > > > > > and I had one for x86 and not the others so it noticed the > > > > > discrepancy. > > > > > > > > So with "depends on (X86 || IA64)", that tool doesn't complain anymore, > > > > right? > > > > > > > > Best regards, > > > > baolu > > > > > > > > > > Yes, with the depends it no longer happens. > > > > The dmar code only exists on X86 and IA64 arch's. Adding this depending > > makes sense to me. I will add it if no objections. > > I think that works after Baolu's patchset that makes intel-iommu.h > private. I'm pretty sure it wouldn't have worked before that. > > No objections. 
>

Yes, I think applying it with the depends prior to Baolu's change would
still run into the issue from the KTR report if someone compiled without
INTEL_IOMMU enabled.

This was dealing with being able to do something like:

make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config

and finding CONFIG_DMAR_UNITS_SUPPORTED=64.

Thinking some more though, instead of the depends being on the arch,
would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?

Regards,
Jerry

> --> Steve
>
> --
> Steve Wahl, Hewlett Packard Enterprise
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu wrote: > > On 2022/6/14 09:44, Jerry Snitselaar wrote: > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu wrote: > >> On 2022/6/14 04:57, Jerry Snitselaar wrote: > >>> On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote: > >>>> To support up to 64 sockets with 10 DMAR units each (640), make the > >>>> value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > >>>> CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > >>>> set. > >>>> > >>>> If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > >>>> to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > >>>> allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > >>>> remapping doesn't support X2APIC mode x2apic disabled"; and the system > >>>> fails to boot properly. > >>>> > >>>> Signed-off-by: Steve Wahl > >>>> --- > >>>> > >>>> Note that we could not find a reason for connecting > >>>> DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > >>>> it seemed like the two would continue to match on earlier processors. > >>>> There doesn't appear to be kernel code that assumes that the value of > >>>> one is related to the other. > >>>> > >>>> v2: Make this value a config option, rather than a fixed constant. The > >>>> default > >>>> values should match previous configuration except in the MAXSMP case. > >>>> Keeping the > >>>> value at a power of two was requested by Kevin Tian. 
> >>>> > >>>>drivers/iommu/intel/Kconfig | 6 ++ > >>>>include/linux/dmar.h| 6 +- > >>>>2 files changed, 7 insertions(+), 5 deletions(-) > >>>> > >>>> diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > >>>> index 247d0f2d5fdf..fdbda77ac21e 100644 > >>>> --- a/drivers/iommu/intel/Kconfig > >>>> +++ b/drivers/iommu/intel/Kconfig > >>>> @@ -9,6 +9,12 @@ config DMAR_PERF > >>>>config DMAR_DEBUG > >>>> bool > >>>> > >>>> +config DMAR_UNITS_SUPPORTED > >>>> +int "Number of DMA Remapping Units supported" > >>> Also, should there be a "depends on (X86 || IA64)" here? > >> Do you have any compilation errors or warnings? > >> > >> Best regards, > >> baolu > >> > > I think it is probably harmless since it doesn't get used elsewhere, > > but our tooling was complaining to me because DMAR_UNITS_SUPPORTED was > > being autogenerated into the configs for the non-x86 architectures we > > build (aarch64, s390x, ppcle64). > > We have files corresponding to the config options that it looks at, > > and I had one for x86 and not the others so it noticed the > > discrepancy. > > So with "depends on (X86 || IA64)", that tool doesn't complain anymore, > right? > > Best regards, > baolu > Yes, with the depends it no longer happens. Regards, Jerry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu wrote: > > On 2022/6/14 04:57, Jerry Snitselaar wrote: > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote: > >> To support up to 64 sockets with 10 DMAR units each (640), make the > >> value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > >> CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > >> set. > >> > >> If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > >> to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > >> allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > >> remapping doesn't support X2APIC mode x2apic disabled"; and the system > >> fails to boot properly. > >> > >> Signed-off-by: Steve Wahl > >> --- > >> > >> Note that we could not find a reason for connecting > >> DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > >> it seemed like the two would continue to match on earlier processors. > >> There doesn't appear to be kernel code that assumes that the value of > >> one is related to the other. > >> > >> v2: Make this value a config option, rather than a fixed constant. The > >> default > >> values should match previous configuration except in the MAXSMP case. > >> Keeping the > >> value at a power of two was requested by Kevin Tian. > >> > >> drivers/iommu/intel/Kconfig | 6 ++ > >> include/linux/dmar.h| 6 +- > >> 2 files changed, 7 insertions(+), 5 deletions(-) > >> > >> diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > >> index 247d0f2d5fdf..fdbda77ac21e 100644 > >> --- a/drivers/iommu/intel/Kconfig > >> +++ b/drivers/iommu/intel/Kconfig > >> @@ -9,6 +9,12 @@ config DMAR_PERF > >> config DMAR_DEBUG > >> bool > >> > >> +config DMAR_UNITS_SUPPORTED > >> +int "Number of DMA Remapping Units supported" > > > > Also, should there be a "depends on (X86 || IA64)" here? > > Do you have any compilation errors or warnings? 
> > Best regards, > baolu > I think it is probably harmless since it doesn't get used elsewhere, but our tooling was complaining to me because DMAR_UNITS_SUPPORTED was being autogenerated into the configs for the non-x86 architectures we build (aarch64, s390x, ppcle64). We have files corresponding to the config options that it looks at, and I had one for x86 and not the others so it noticed the discrepancy. > > > >> +default 1024 if MAXSMP > >> +default 128 if X86_64 > >> +default 64 > >> + > >> config INTEL_IOMMU > >> bool "Support for Intel IOMMU using DMA Remapping Devices" > >> depends on PCI_MSI && ACPI && (X86 || IA64) > >> diff --git a/include/linux/dmar.h b/include/linux/dmar.h > >> index 45e903d84733..0c03c1845c23 100644 > >> --- a/include/linux/dmar.h > >> +++ b/include/linux/dmar.h > >> @@ -18,11 +18,7 @@ > >> > >> struct acpi_dmar_header; > >> > >> -#ifdef CONFIG_X86 > >> -# defineDMAR_UNITS_SUPPORTEDMAX_IO_APICS > >> -#else > >> -# defineDMAR_UNITS_SUPPORTED64 > >> -#endif > >> +#define DMAR_UNITS_SUPPORTEDCONFIG_DMAR_UNITS_SUPPORTED > >> > >> /* DMAR Flags */ > >> #define DMAR_INTR_REMAP0x1 > >> -- > >> 2.26.2 > >> > > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote: > To support up to 64 sockets with 10 DMAR units each (640), make the > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > set. > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > remapping doesn't support X2APIC mode x2apic disabled"; and the system > fails to boot properly. > > Signed-off-by: Steve Wahl > --- > > Note that we could not find a reason for connecting > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > it seemed like the two would continue to match on earlier processors. > There doesn't appear to be kernel code that assumes that the value of > one is related to the other. > > v2: Make this value a config option, rather than a fixed constant. The > default > values should match previous configuration except in the MAXSMP case. > Keeping the > value at a power of two was requested by Kevin Tian. > > drivers/iommu/intel/Kconfig | 6 ++ > include/linux/dmar.h| 6 +- > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > index 247d0f2d5fdf..fdbda77ac21e 100644 > --- a/drivers/iommu/intel/Kconfig > +++ b/drivers/iommu/intel/Kconfig > @@ -9,6 +9,12 @@ config DMAR_PERF > config DMAR_DEBUG > bool > > +config DMAR_UNITS_SUPPORTED > + int "Number of DMA Remapping Units supported" Also, should there be a "depends on (X86 || IA64)" here? 
> + default 1024 if MAXSMP > + default 128 if X86_64 > + default 64 > + > config INTEL_IOMMU > bool "Support for Intel IOMMU using DMA Remapping Devices" > depends on PCI_MSI && ACPI && (X86 || IA64) > diff --git a/include/linux/dmar.h b/include/linux/dmar.h > index 45e903d84733..0c03c1845c23 100644 > --- a/include/linux/dmar.h > +++ b/include/linux/dmar.h > @@ -18,11 +18,7 @@ > > struct acpi_dmar_header; > > -#ifdef CONFIG_X86 > -# define DMAR_UNITS_SUPPORTEDMAX_IO_APICS > -#else > -# define DMAR_UNITS_SUPPORTED64 > -#endif > +#define DMAR_UNITS_SUPPORTEDCONFIG_DMAR_UNITS_SUPPORTED > > /* DMAR Flags */ > #define DMAR_INTR_REMAP 0x1 > -- > 2.26.2 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
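The boot failure described in the commit message can be illustrated with a small stand-alone sketch. This is not the kernel's allocator: `alloc_seq_id()` and `count_allocated()` are hypothetical helpers, and the fallback value of 128 only mirrors the previous MAX_IO_APICS-derived limit quoted above. It shows why a 640-unit system cannot hand every unit an id when the compile-time limit is 128:

```c
#include <stdbool.h>

/* Assumption: stand-in for the Kconfig-generated value; 128 mirrors the
 * previous x86 limit quoted in the commit message. */
#ifndef CONFIG_DMAR_UNITS_SUPPORTED
#define CONFIG_DMAR_UNITS_SUPPORTED 128
#endif
#define DMAR_UNITS_SUPPORTED CONFIG_DMAR_UNITS_SUPPORTED

/* Hypothetical id table: one slot per supported DMAR unit. */
static bool seq_id_used[DMAR_UNITS_SUPPORTED];

/* Return the first free sequence id, or -1 when every slot is taken. */
static int alloc_seq_id(void)
{
	for (int i = 0; i < DMAR_UNITS_SUPPORTED; i++) {
		if (!seq_id_used[i]) {
			seq_id_used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Try to allocate an id for each of `units` units; return how many succeeded. */
static int count_allocated(int units)
{
	int ok = 0;

	for (int i = 0; i < units; i++)
		if (alloc_seq_id() >= 0)
			ok++;
	return ok;
}
```

With the limit at 128, only the first 128 of 640 units would get an id in this model; building with a larger CONFIG_DMAR_UNITS_SUPPORTED is what the patch makes possible.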
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> To support up to 64 sockets with 10 DMAR units each (640), make the
> value of DMAR_UNITS_SUPPORTED adjustable by a config variable,
> CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is
> set.
>
> If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set
> to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to
> allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ
> remapping doesn't support X2APIC mode x2apic disabled"; and the system
> fails to boot properly.
>
> Signed-off-by: Steve Wahl
> ---
>
> Note that we could not find a reason for connecting
> DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps
> it seemed like the two would continue to match on earlier processors.
> There doesn't appear to be kernel code that assumes that the value of
> one is related to the other.
>
> v2: Make this value a config option, rather than a fixed constant. The
> default values should match previous configuration except in the
> MAXSMP case. Keeping the value at a power of two was requested by
> Kevin Tian.
>
>  drivers/iommu/intel/Kconfig | 6 ++
>  include/linux/dmar.h        | 6 +-
>  2 files changed, 7 insertions(+), 5 deletions(-)
>

Baolu, do you have this queued up for v5.20? Also, do you have a public
repo where you keep the vt-d changes before sending Joerg the patches
for a release?

Regards,
Jerry
Re: [PATCH] iommu/vt-d: Fix unmap_pages support
On Fri, 2021-11-12 at 10:59 +0800, Lu Baolu wrote: > Hi Alex, > > On 11/11/21 8:32 AM, Alex Williamson wrote: > > When supporting only the .map and .unmap callbacks of iommu_ops, > > the IOMMU driver can make assumptions about the size and alignment > > used for mappings based on the driver provided pgsize_bitmap. VT-d > > previously used essentially PAGE_MASK for this bitmap as any power > > of two mapping was acceptably filled by native page sizes. > > > > However, with the .map_pages and .unmap_pages interface we're now > > getting page-size and count arguments. If we simply combine these > > as (page-size * count) and make use of the previous map/unmap > > functions internally, any size and alignment assumptions are very > > different. > > > > As an example, a given vfio device assignment VM will often create > > a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff]. On a system that > > does not support IOMMU super pages, the unmap_pages interface will > > ask to unmap 1024 4KB pages at the base IOVA. > > dma_pte_clear_level() > > will recurse down to level 2 of the page table where the first half > > of the pfn range exactly matches the entire pte level. We clear > > the > > pte, increment the pfn by the level size, but (oops) the next pte > > is > > on a new page, so we exit the loop an pop back up a level. When we > > then update the pfn based on that higher level, we seem to assume > > that the previous pfn value was at the start of the level. In this > > case the level size is 256K pfns, which we add to the base pfn and > > get a results of 0x7fe00, which is clearly greater than 0x401ff, > > so we're done. Meanwhile we never cleared the ptes for the > > remainder > > of the range. When the VM remaps this range, we're overwriting > > valid > > ptes and the VT-d driver complains loudly, as reported by the user > > report linked below. 
> >
> > The fix for this seems relatively simple, if each iteration of the
> > loop in dma_pte_clear_level() is assumed to clear to the end of the
> > level pte page, then our next pfn should be calculated from level_pfn
> > rather than our working pfn.
> >
> > Fixes: 3f34f1259776 ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback")
> > Reported-by: Ajay Garg
> > Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargn...@gmail.com/
> > Signed-off-by: Alex Williamson
>
> Thank you for fixing this! I will queue it for v5.16.
>
> Best regards,
> baolu
>

Hi Baolu,

Do you have an estimate of when this will be submitted?

Regards,
Jerry

> > ---
> >  drivers/iommu/intel/iommu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index d75f59ae28e6..f6395f5425f0 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1249,7 +1249,7 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
> >  						       freelist);
> >  		}
> >  next:
> > -		pfn += level_size(level);
> > +		pfn = level_pfn + level_size(level);
> >  	} while (!first_pte_in_page(++pte) && pfn <= last_pfn);
> >
> >  	if (first_pte)
> >
>
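The arithmetic from Alex's description can be checked with a short stand-alone sketch. The helpers below are simplified stand-ins written for this illustration (a 9-bit stride per level is assumed, and `level_pfn_of()`, `next_pfn_old()` and `next_pfn_fixed()` are hypothetical names, not kernel functions). They compare the old and new way of advancing pfn at the level whose entries each cover 256K pfns:

```c
#define LEVEL_STRIDE 9

/* Number of 4KB pfns covered by one pte at the given page-table level. */
static unsigned long level_size(int level)
{
	return 1UL << ((level - 1) * LEVEL_STRIDE);
}

/* First pfn of the pte-sized region that contains pfn. */
static unsigned long level_pfn_of(unsigned long pfn, int level)
{
	return pfn & ~(level_size(level) - 1);
}

/* Old behaviour: advance from the working pfn, wherever it sits in the level. */
static unsigned long next_pfn_old(unsigned long pfn, int level)
{
	return pfn + level_size(level);
}

/* Fixed behaviour: advance from the start of the current level entry. */
static unsigned long next_pfn_fixed(unsigned long pfn, int level)
{
	return level_pfn_of(pfn, level) + level_size(level);
}
```

For the range [0x3fe00 - 0x401ff], the old calculation at level 3 gives 0x7fe00, which is past last_pfn, so the loop stops early. The fixed calculation gives 0x40000, which is still inside the range, so the remaining ptes get cleared.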
Re: [PATCH] iommu/amd: Add sanity check for interrupt remapping table length macros
Suravee Suthikulpanit @ 2020-12-10 09:24 MST: > Currently, macros related to the interrupt remapping table length are > defined separately. This has resulted in an oversight in which one of > the macros were missed when changing the length. To prevent this, > redefine the macros to add built-in sanity check. > > Also, rename macros to use the name of the DTE[IntTabLen] field as > specified in the AMD IOMMU specification. There is no functional change. > > Suggested-by: Linus Torvalds > Reviewed-by: Tom Lendacky > Signed-off-by: Suravee Suthikulpanit > Cc: Will Deacon > Cc: Jerry Snitselaar > Cc: Joerg Roedel > --- > drivers/iommu/amd/amd_iommu_types.h | 19 ++- > drivers/iommu/amd/init.c| 6 +++--- > drivers/iommu/amd/iommu.c | 2 +- > 3 files changed, 14 insertions(+), 13 deletions(-) > > diff --git a/drivers/iommu/amd/amd_iommu_types.h > b/drivers/iommu/amd/amd_iommu_types.h > index 494b42a31b7a..899ce62df3f0 100644 > --- a/drivers/iommu/amd/amd_iommu_types.h > +++ b/drivers/iommu/amd/amd_iommu_types.h > @@ -255,11 +255,19 @@ > /* Bit value definition for dte irq remapping fields*/ > #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6) > #define DTE_IRQ_REMAP_INTCTL_MASK(0x3ULL << 60) > -#define DTE_IRQ_TABLE_LEN_MASK (0xfULL << 1) > #define DTE_IRQ_REMAP_INTCTL(2ULL << 60) > -#define DTE_IRQ_TABLE_LEN (9ULL << 1) > #define DTE_IRQ_REMAP_ENABLE1ULL > > +/* > + * AMD IOMMU hardware only support 512 IRTEs despite > + * the architectural limitation of 2048 entries. 
> + */ > +#define DTE_INTTAB_ALIGNMENT128 > +#define DTE_INTTABLEN_VALUE 9ULL > +#define DTE_INTTABLEN (DTE_INTTABLEN_VALUE << 1) > +#define DTE_INTTABLEN_MASK (0xfULL << 1) > +#define MAX_IRQS_PER_TABLE (1 << DTE_INTTABLEN_VALUE) > + > #define PAGE_MODE_NONE0x00 > #define PAGE_MODE_1_LEVEL 0x01 > #define PAGE_MODE_2_LEVEL 0x02 > @@ -409,13 +417,6 @@ extern bool amd_iommu_np_cache; > /* Only true if all IOMMUs support device IOTLBs */ > extern bool amd_iommu_iotlb_sup; > > -/* > - * AMD IOMMU hardware only support 512 IRTEs despite > - * the architectural limitation of 2048 entries. > - */ > -#define MAX_IRQS_PER_TABLE 512 > -#define IRQ_TABLE_ALIGNMENT 128 > - > struct irq_remap_table { > raw_spinlock_t lock; > unsigned min_index; > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c > index 23a790f8f550..6bec8913d064 100644 > --- a/drivers/iommu/amd/init.c > +++ b/drivers/iommu/amd/init.c > @@ -989,10 +989,10 @@ static bool copy_device_table(void) > > irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE; > int_ctl = old_devtb[devid].data[2] & DTE_IRQ_REMAP_INTCTL_MASK; > - int_tab_len = old_devtb[devid].data[2] & DTE_IRQ_TABLE_LEN_MASK; > + int_tab_len = old_devtb[devid].data[2] & DTE_INTTABLEN_MASK; > if (irq_v && (int_ctl || int_tab_len)) { > if ((int_ctl != DTE_IRQ_REMAP_INTCTL) || > - (int_tab_len != DTE_IRQ_TABLE_LEN)) { > + (int_tab_len != DTE_INTTABLEN)) { > pr_err("Wrong old irq remapping flag: %#x\n", > devid); > return false; > } > @@ -2674,7 +2674,7 @@ static int __init early_amd_iommu_init(void) > remap_cache_sz = MAX_IRQS_PER_TABLE * (sizeof(u64) * 2); > amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache", > remap_cache_sz, > - IRQ_TABLE_ALIGNMENT, > + DTE_INTTAB_ALIGNMENT, > 0, NULL); > if (!amd_iommu_irq_cache) > goto out; > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c > index b9cf59443843..f7abf16d1e3a 100644 > --- a/drivers/iommu/amd/iommu.c > +++ b/drivers/iommu/amd/iommu.c > @@ -3191,7 +3191,7 @@ 
static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
>  	dte &= ~DTE_IRQ_PHYS_ADDR_MASK;
>  	dte |= iommu_virt_to_phys(table->table);
>  	dte |= DTE_IRQ_REMAP_INTCTL;
> -	dte |= DTE_IRQ_TABLE_LEN;
> +	dte |= DTE_INTTABLEN;
>  	dte |= DTE_IRQ_REMAP_ENABLE;
>
>  	amd_iommu_dev_table[devid].data[2] = dte;

Reviewed-by: Jerry Snitselaar
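As a sketch of what the patch's "built-in sanity check" buys, the macros from the diff can be compiled stand-alone with compile-time assertions. The `_Static_assert` lines and the `dte_irq_table_entries()` decoder are additions made for this illustration only; they are not part of the patch:

```c
#include <stdint.h>

/* Macros as introduced by the patch: one source value, everything derived. */
#define DTE_INTTABLEN_VALUE	9ULL
#define DTE_INTTABLEN		(DTE_INTTABLEN_VALUE << 1)
#define DTE_INTTABLEN_MASK	(0xfULL << 1)
#define MAX_IRQS_PER_TABLE	(1 << DTE_INTTABLEN_VALUE)

/* Illustration only: the derived values cannot drift apart any more. */
_Static_assert(MAX_IRQS_PER_TABLE == 512, "table holds 2^IntTabLen entries");
_Static_assert((DTE_INTTABLEN & ~DTE_INTTABLEN_MASK) == 0,
	       "encoded length fits inside the DTE field");

/* Hypothetical helper: decode the entry count from a DTE data[2] word. */
static unsigned int dte_irq_table_entries(uint64_t dte)
{
	return 1u << ((dte & DTE_INTTABLEN_MASK) >> 1);
}
```

With the earlier encoding of (8ULL << 1), the same decoder would report 256 entries, which is the mismatch the previous one-liner had to fix.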
Re: [GIT PULL] IOMMU fix for 5.10 (-final)
On Wed, Dec 9, 2020 at 12:18 PM Linus Torvalds wrote:
>
> On Wed, Dec 9, 2020 at 11:12 AM Jerry Snitselaar wrote:
> >
> > Since the field in the device table entry format expects it to be n
> > where there are 2^n entries in the table I guess it should be:
> >
> > #define DTE_IRQ_TABLE_LEN 9
> > #define MAX_IRQS_PER_TABLE (1 << DTE_IRQ_TABLE_LEN)
>
> No, that "DTE_IRQ_TABLE_LEN" is not the size shift - it's the size
> shift value in that DTE field, which is shifted up by 1.
>
> That's why the current code does that
>
>    #define DTE_IRQ_TABLE_LEN (9ULL << 1)
>
> there..
>
> Which was why I suggested that new #define that is the *actual* shift
> value, and then the DTE thing and the MAX_IRQS_PER_TABLE values would
> depend on that.
>
>    Linus
>

Yes, when I read it my head was translating it as setting them both
to 512, and then I forgot that it gets shifted over by 1. Considering
I was the one who noticed the original problem of it still being 8,
that was a nice brain fart. This should be fixed like you suggest.
Re: [GIT PULL] IOMMU fix for 5.10 (-final)
On Wed, Dec 9, 2020 at 12:12 PM Jerry Snitselaar wrote: > > > Will Deacon @ 2020-12-09 11:50 MST: > > > On Wed, Dec 09, 2020 at 10:07:46AM -0800, Linus Torvalds wrote: > >> On Wed, Dec 9, 2020 at 6:12 AM Will Deacon wrote: > >> > > >> > Please pull this one-liner AMD IOMMU fix for 5.10. It's actually a fix > >> > for a fix, where the size of the interrupt remapping table was increased > >> > but a related constant for the size of the interrupt table was forgotten. > >> > >> Pulled. > > > > Thanks. > > > >> However, why didn't this then add some sanity checking for the two > >> different #defines to be in sync? > >> > >> IOW, something like > >> > >>#define AMD_IOMMU_IRQ_TABLE_SHIFT 9 > >> > >>#define MAX_IRQS_PER_TABLE (1 << AMD_IOMMU_IRQ_TABLE_SHIFT) > >>#define DTE_IRQ_TABLE_LEN ((u64)AMD_IOMMU_IRQ_TABLE_SHIFT << 1) > > Since the field in the device table entry format expects it to be n > where there are 2^n entries in the table I guess it should be: > > #define DTE_IRQ_TABLE_LEN 9 > #define MAX_IRQS_PER_TABLE (1 << DTE_IRQ_TABLE_LEN) > No, ignore that. I'm being stupid. > >> > >> or whatever. Hmm? > > > > This looks like a worthwhile change to me, but I don't have any hardware > > so I've been very reluctant to make even "obvious" driver changes here. > > > > Suravee -- please can you post a patch implementing the above? > > > >> That way this won't happen again, but perhaps equally importantly the > >> linkage will be more clear, and there won't be those random constants. > >> > >> Naming above is probably garbage - I assume there's some actual > >> architectural name for that irq table length field in the DTE? > > > > The one in the spec is even better: "IntTabLen". > > > > Will > > ___ > > iommu mailing list > > iommu@lists.linux-foundation.org > > https://lists.linuxfoundation.org/mailman/listinfo/iommu > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [GIT PULL] IOMMU fix for 5.10 (-final)
Will Deacon @ 2020-12-09 11:50 MST: > On Wed, Dec 09, 2020 at 10:07:46AM -0800, Linus Torvalds wrote: >> On Wed, Dec 9, 2020 at 6:12 AM Will Deacon wrote: >> > >> > Please pull this one-liner AMD IOMMU fix for 5.10. It's actually a fix >> > for a fix, where the size of the interrupt remapping table was increased >> > but a related constant for the size of the interrupt table was forgotten. >> >> Pulled. > > Thanks. > >> However, why didn't this then add some sanity checking for the two >> different #defines to be in sync? >> >> IOW, something like >> >>#define AMD_IOMMU_IRQ_TABLE_SHIFT 9 >> >>#define MAX_IRQS_PER_TABLE (1 << AMD_IOMMU_IRQ_TABLE_SHIFT) >>#define DTE_IRQ_TABLE_LEN ((u64)AMD_IOMMU_IRQ_TABLE_SHIFT << 1) Since the field in the device table entry format expects it to be n where there are 2^n entries in the table I guess it should be: #define DTE_IRQ_TABLE_LEN 9 #define MAX_IRQS_PER_TABLE (1 << DTE_IRQ_TABLE_LEN) >> >> or whatever. Hmm? > > This looks like a worthwhile change to me, but I don't have any hardware > so I've been very reluctant to make even "obvious" driver changes here. > > Suravee -- please can you post a patch implementing the above? > >> That way this won't happen again, but perhaps equally importantly the >> linkage will be more clear, and there won't be those random constants. >> >> Naming above is probably garbage - I assume there's some actual >> architectural name for that irq table length field in the DTE? > > The one in the spec is even better: "IntTabLen". > > Will > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
Suravee Suthikulpanit @ 2020-12-07 02:19 MST:

> According to the AMD IOMMU spec, the commit 73db2fc595f3
> ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
> also requires the interrupt table length (IntTabLen) to be set to 9
> (power of 2) in the device table mapping entry (DTE).
>
> Fixes: 73db2fc595f3 ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
> Reported-by: Jerry Snitselaar
> Signed-off-by: Suravee Suthikulpanit
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 89647700bab2..494b42a31b7a 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -257,7 +257,7 @@
>  #define DTE_IRQ_REMAP_INTCTL_MASK	(0x3ULL << 60)
>  #define DTE_IRQ_TABLE_LEN_MASK	(0xfULL << 1)
>  #define DTE_IRQ_REMAP_INTCTL	(2ULL << 60)
> -#define DTE_IRQ_TABLE_LEN	(8ULL << 1)
> +#define DTE_IRQ_TABLE_LEN	(9ULL << 1)
>  #define DTE_IRQ_REMAP_ENABLE	1ULL
>
>  #define PAGE_MODE_NONE		0x00

Reviewed-by: Jerry Snitselaar
Re: [PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries
Suravee Suthikulpanit @ 2020-10-14 19:50 MST:

> Certain device drivers allocate IO queues on a per-cpu basis.
> On AMD EPYC platform, which can support up-to 256 cpu threads,
> this can exceed the current MAX_IRQ_PER_TABLE limit of 256,
> and result in the error message:
>
>     AMD-Vi: Failed to allocate IRTE
>
> This has been observed with certain NVME devices.
>
> AMD IOMMU hardware can actually support upto 512 interrupt
> remapping table entries. Therefore, update the driver to
> match the hardware limit.
>
> Please note that this also increases the size of interrupt remapping
> table to 8KB per device when using the 128-bit IRTE format.
>
> Signed-off-by: Suravee Suthikulpanit
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 30a5d412255a..427484c45589 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
>  /* Only true if all IOMMUs support device IOTLBs */
>  extern bool amd_iommu_iotlb_sup;
>
> -#define MAX_IRQS_PER_TABLE	256
> +/*
> + * AMD IOMMU hardware only support 512 IRTEs despite
> + * the architectural limitation of 2048 entries.
> + */
> +#define MAX_IRQS_PER_TABLE	512
>  #define IRQ_TABLE_ALIGNMENT	128
>
>  struct irq_remap_table {

With this change, should DTE_IRQ_TABLE_LEN be changed to 9? If I
understand the spec correctly, leaving it at 8 tells the hardware the
table is only 256 entries long.

Regards,
Jerry
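The "8KB per device" figure in the commit message follows from simple arithmetic, sketched below. The `struct irte_128` layout is an assumption made for illustration (a 128-bit entry modeled as two 64-bit words), and `irq_table_bytes()` is a hypothetical helper, not a driver function:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 128-bit IRTE modeled as two 64-bit words. */
struct irte_128 {
	uint64_t lo;
	uint64_t hi;
};

/* Bytes needed for one interrupt remapping table with `entries` IRTEs. */
static size_t irq_table_bytes(unsigned int entries)
{
	return (size_t)entries * sizeof(struct irte_128);
}
```

Under that assumption, 256 entries take 4096 bytes and 512 entries take 8192 bytes, which matches the doubling described in the patch.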
Re: Question about domain_init (v5.3-v5.7)
Jerry Snitselaar @ 2020-11-30 10:50 MST: > Lu Baolu @ 2020-11-26 19:12 MST: > >> Hi Jerry, >> >> On 11/27/20 5:35 AM, Jerry Snitselaar wrote: >>> Lu Baolu @ 2020-11-26 04:01 MST: >>> >>>> Hi Jerry, >>>> >>>> On 2020/11/26 4:27, Jerry Snitselaar wrote: >>>>> Is there a reason we check the requested guest address width against >>>>> the >>>>> iommu's mgaw, instead of the agaw that we already know for the iommu? >>>>> I've run into a case with a new system where the mgaw reported is 57, >>>>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports >>>>> the highest supported agaw is 48 and the domain_init code fails here. In >>>> >>>> Isn't this a platform bug? If it's too late to fix it in the BIOS, you >>>> maybe have to add a platform specific quirk to set mgaw to the highest >>>> supported agaw? >>>> >>>> Best regards, >>>> baolu >>> Is there somewhere you can point me to that discusses how they >>> should be >>> setting the mgaw? I misunderstood when I previously asked you about >>> whether the mgaw could be a value that was greater than any of sagaw. >>> If it is a bios issue, then they should fix it there. >> >> MGAW indicates the max gpa width supported by 2nd translation. The VT-d >> spec requires that this value must be at least equal to the host >> physical addressibility. According to this, BIOS is good, right? >> >> For this failure case, domain_init() just wants to find a suitable agaw >> for the private domain. I think it makes sense to check against >> iommu->agaw instead of cap_mgaw. >> >> Best regards, >> baolu >> > > From this bit in the spec about MGAW: > > Guest addressability for a given DMA request is limited to the > minimum of the value reported through this field and the adjusted > guest address width of the corresponding page-table structure. > (Adjusted guest address widths supported by hardware are reported > through the SAGAW field). > > That does suggest it should be adjusted down to the sagaw value in this case, > yes? 
> Just want to make sure I'm understanding it correctly. Or I guess that is really talking about if you had an mgaw lower than the the sagaw, the dma request would be limited to that lower mgaw value. > >>> >>>> >>>>> other places like prepare_domain_attach_device, the dmar domain agaw >>>>> gets adjusted down to the iommu agaw. The agaw of the iommu gets >>>>> determined based off what is reported for sagaw. I'm wondering if it >>>>> can't instead do: >>>>> --- >>>>>drivers/iommu/intel-iommu.c | 4 ++-- >>>>>1 file changed, 2 insertions(+), 2 deletions(-) >>>>> diff --git a/drivers/iommu/intel-iommu.c >>>>> b/drivers/iommu/intel-iommu.c >>>>> index 6ca5c92ef2e5..a8e41ec36d9e 100644 >>>>> --- a/drivers/iommu/intel-iommu.c >>>>> +++ b/drivers/iommu/intel-iommu.c >>>>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, >>>>> struct intel_iommu *iommu, >>>>> domain_reserve_special_ranges(domain); >>>>> /* calculate AGAW */ >>>>> - if (guest_width > cap_mgaw(iommu->cap)) >>>>> - guest_width = cap_mgaw(iommu->cap); >>>>> + if (guest_width > agaw_to_width(iommu->agaw)) >>>>> + guest_width = agaw_to_width(iommu->agaw); >>>>> domain->gaw = guest_width; >>>>> adjust_width = guestwidth_to_adjustwidth(guest_width); >>>>> agaw = width_to_agaw(adjust_width); >>>>> -- >>>>> 2.27.0 >>>>> >>>>> Thoughts? With the former code the ehci device for the ilo fails when >>>>> trying to get a private domain. >>>>> Thanks, >>>>> Jerry >>>>> >>> ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
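To make the failure concrete, here is a stand-alone sketch of the width and agaw arithmetic discussed in this thread. The helper names match those in the diff, but their bodies are my assumption of how the v5.x helpers behave (9-bit stride, base width of 30), and `requested_agaw()` is a hypothetical wrapper added for this illustration. A requested guest width of 57 is also an assumption:

```c
#define LEVEL_STRIDE	9
#define MAX_AGAW_WIDTH	64

/* Address width in bits for a given agaw (assumed: 30 + 9 * agaw, capped). */
static int agaw_to_width(int agaw)
{
	int width = 30 + agaw * LEVEL_STRIDE;

	return width < MAX_AGAW_WIDTH ? width : MAX_AGAW_WIDTH;
}

/* Smallest agaw whose width covers `width` bits (rounds up). */
static int width_to_agaw(int width)
{
	return (width - 30 + LEVEL_STRIDE - 1) / LEVEL_STRIDE;
}

/* Round a guest width up to the next width a page-table depth can express. */
static int guestwidth_to_adjustwidth(int gaw)
{
	int r = (gaw - 12) % 9;
	int agaw = (r == 0) ? gaw : gaw + 9 - r;

	return agaw > 64 ? 64 : agaw;
}

/* Hypothetical wrapper: the agaw domain_init would ask for after clamping
 * the requested guest width to limit_width. */
static int requested_agaw(int guest_width, int limit_width)
{
	if (guest_width > limit_width)
		guest_width = limit_width;
	return width_to_agaw(guestwidth_to_adjustwidth(guest_width));
}
```

Clamping to an MGAW of 57 yields agaw 3 in this model, which the platform described above does not report in SAGAW. Clamping to agaw_to_width(2), i.e. 48 bits, yields agaw 2, which it does report, so the proposed change would let domain_init succeed on that system.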
Re: Question about domain_init (v5.3-v5.7)
Lu Baolu @ 2020-11-26 19:12 MST: > Hi Jerry, > > On 11/27/20 5:35 AM, Jerry Snitselaar wrote: >> Lu Baolu @ 2020-11-26 04:01 MST: >> >>> Hi Jerry, >>> >>> On 2020/11/26 4:27, Jerry Snitselaar wrote: >>>> Is there a reason we check the requested guest address width against >>>> the >>>> iommu's mgaw, instead of the agaw that we already know for the iommu? >>>> I've run into a case with a new system where the mgaw reported is 57, >>>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports >>>> the highest supported agaw is 48 and the domain_init code fails here. In >>> >>> Isn't this a platform bug? If it's too late to fix it in the BIOS, you >>> maybe have to add a platform specific quirk to set mgaw to the highest >>> supported agaw? >>> >>> Best regards, >>> baolu >> Is there somewhere you can point me to that discusses how they >> should be >> setting the mgaw? I misunderstood when I previously asked you about >> whether the mgaw could be a value that was greater than any of sagaw. >> If it is a bios issue, then they should fix it there. > > MGAW indicates the max gpa width supported by 2nd translation. The VT-d > spec requires that this value must be at least equal to the host > physical addressibility. According to this, BIOS is good, right? > > For this failure case, domain_init() just wants to find a suitable agaw > for the private domain. I think it makes sense to check against > iommu->agaw instead of cap_mgaw. > > Best regards, > baolu > >From this bit in the spec about MGAW: Guest addressability for a given DMA request is limited to the minimum of the value reported through this field and the adjusted guest address width of the corresponding page-table structure. (Adjusted guest address widths supported by hardware are reported through the SAGAW field). That does suggest it should be adjusted down to the sagaw value in this case, yes? Just want to make sure I'm understanding it correctly. 
>> >>> >>>> other places like prepare_domain_attach_device, the dmar domain agaw >>>> gets adjusted down to the iommu agaw. The agaw of the iommu gets >>>> determined based off what is reported for sagaw. I'm wondering if it >>>> can't instead do: >>>> --- >>>>drivers/iommu/intel-iommu.c | 4 ++-- >>>>1 file changed, 2 insertions(+), 2 deletions(-) >>>> diff --git a/drivers/iommu/intel-iommu.c >>>> b/drivers/iommu/intel-iommu.c >>>> index 6ca5c92ef2e5..a8e41ec36d9e 100644 >>>> --- a/drivers/iommu/intel-iommu.c >>>> +++ b/drivers/iommu/intel-iommu.c >>>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, >>>> struct intel_iommu *iommu, >>>>domain_reserve_special_ranges(domain); >>>>/* calculate AGAW */ >>>> - if (guest_width > cap_mgaw(iommu->cap)) >>>> - guest_width = cap_mgaw(iommu->cap); >>>> + if (guest_width > agaw_to_width(iommu->agaw)) >>>> + guest_width = agaw_to_width(iommu->agaw); >>>>domain->gaw = guest_width; >>>>adjust_width = guestwidth_to_adjustwidth(guest_width); >>>>agaw = width_to_agaw(adjust_width); >>>> -- >>>> 2.27.0 >>>> >>>> Thoughts? With the former code the ehci device for the ilo fails when >>>> trying to get a private domain. >>>> Thanks, >>>> Jerry >>>> >> ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Question about domain_init (v5.3-v5.7)
Lu Baolu @ 2020-11-26 19:12 MST:

> Hi Jerry,
>
> On 11/27/20 5:35 AM, Jerry Snitselaar wrote:
>> Lu Baolu @ 2020-11-26 04:01 MST:
>>
>>> Hi Jerry,
>>>
>>> On 2020/11/26 4:27, Jerry Snitselaar wrote:
>>>> Is there a reason we check the requested guest address width against
>>>> the iommu's mgaw, instead of the agaw that we already know for the iommu?
>>>> I've run into a case with a new system where the mgaw reported is 57,
>>>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
>>>> the highest supported agaw is 48 and the domain_init code fails here. In
>>>
>>> Isn't this a platform bug? If it's too late to fix it in the BIOS, you
>>> maybe have to add a platform specific quirk to set mgaw to the highest
>>> supported agaw?
>>>
>>> Best regards,
>>> baolu
>> Is there somewhere you can point me to that discusses how they
>> should be setting the mgaw? I misunderstood when I previously asked you
>> about whether the mgaw could be a value that was greater than any of sagaw.
>> If it is a bios issue, then they should fix it there.
>
> MGAW indicates the max gpa width supported by 2nd translation. The VT-d
> spec requires that this value must be at least equal to the host
> physical addressability. According to this, BIOS is good, right?
>

Yes, the host address width is 46. MGAW reports 57 (56+1), and the
highest sagaw bit is for 48.

> For this failure case, domain_init() just wants to find a suitable agaw
> for the private domain. I think it makes sense to check against
> iommu->agaw instead of cap_mgaw.
>
> Best regards,
> baolu
>
>>
>>>
>>>> other places like prepare_domain_attach_device, the dmar domain agaw
>>>> gets adjusted down to the iommu agaw. The agaw of the iommu gets
>>>> determined based off what is reported for sagaw. I'm wondering if it
>>>> can't instead do:
>>>> ---
>>>>  drivers/iommu/intel-iommu.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 6ca5c92ef2e5..a8e41ec36d9e 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>>>>  	domain_reserve_special_ranges(domain);
>>>>  	/* calculate AGAW */
>>>> -	if (guest_width > cap_mgaw(iommu->cap))
>>>> -		guest_width = cap_mgaw(iommu->cap);
>>>> +	if (guest_width > agaw_to_width(iommu->agaw))
>>>> +		guest_width = agaw_to_width(iommu->agaw);
>>>>  	domain->gaw = guest_width;
>>>>  	adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>>  	agaw = width_to_agaw(adjust_width);
>>>> --
>>>> 2.27.0
>>>>
>>>> Thoughts? With the former code the ehci device for the ilo fails when
>>>> trying to get a private domain.
>>>> Thanks,
>>>> Jerry
>>>>
>>
Re: Question about domain_init (v5.3-v5.7)
Lu Baolu @ 2020-11-26 04:01 MST:

> Hi Jerry,
>
> On 2020/11/26 4:27, Jerry Snitselaar wrote:
>> Is there a reason we check the requested guest address width against
>> the iommu's mgaw, instead of the agaw that we already know for the iommu?
>> I've run into a case with a new system where the mgaw reported is 57,
>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
>> the highest supported agaw is 48 and the domain_init code fails here. In
>
> Isn't this a platform bug? If it's too late to fix it in the BIOS, you
> maybe have to add a platform specific quirk to set mgaw to the highest
> supported agaw?
>
> Best regards,
> baolu

Is there somewhere you can point me to that discusses how they should be
setting the mgaw? I misunderstood when I previously asked you about
whether the mgaw could be a value that was greater than any of sagaw.
If it is a bios issue, then they should fix it there.

>
>> other places like prepare_domain_attach_device, the dmar domain agaw
>> gets adjusted down to the iommu agaw. The agaw of the iommu gets
>> determined based off what is reported for sagaw. I'm wondering if it
>> can't instead do:
>> ---
>>  drivers/iommu/intel-iommu.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 6ca5c92ef2e5..a8e41ec36d9e 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>>  	domain_reserve_special_ranges(domain);
>>  	/* calculate AGAW */
>> -	if (guest_width > cap_mgaw(iommu->cap))
>> -		guest_width = cap_mgaw(iommu->cap);
>> +	if (guest_width > agaw_to_width(iommu->agaw))
>> +		guest_width = agaw_to_width(iommu->agaw);
>>  	domain->gaw = guest_width;
>>  	adjust_width = guestwidth_to_adjustwidth(guest_width);
>>  	agaw = width_to_agaw(adjust_width);
>> --
>> 2.27.0
>>
>> Thoughts? With the former code the ehci device for the ilo fails when
>> trying to get a private domain.
>> Thanks,
>> Jerry
>>
Question about domain_init (v5.3-v5.7)
Is there a reason we check the requested guest address width against the
iommu's mgaw, instead of the agaw that we already know for the iommu?
I've run into a case with a new system where the mgaw reported is 57,
but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
the highest supported agaw is 48 and the domain_init code fails here. In
other places like prepare_domain_attach_device, the dmar domain agaw
gets adjusted down to the iommu agaw. The agaw of the iommu gets
determined based off what is reported for sagaw. I'm wondering if it
can't instead do:

---
 drivers/iommu/intel-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6ca5c92ef2e5..a8e41ec36d9e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
 	domain_reserve_special_ranges(domain);
 
 	/* calculate AGAW */
-	if (guest_width > cap_mgaw(iommu->cap))
-		guest_width = cap_mgaw(iommu->cap);
+	if (guest_width > agaw_to_width(iommu->agaw))
+		guest_width = agaw_to_width(iommu->agaw);
 	domain->gaw = guest_width;
 	adjust_width = guestwidth_to_adjustwidth(guest_width);
 	agaw = width_to_agaw(adjust_width);
-- 
2.27.0

Thoughts? With the former code the ehci device for the ilo fails when
trying to get a private domain.

Thanks,
Jerry
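To make the arithmetic in the question above concrete, here is a minimal user-space sketch of the width/AGAW conversions. The helper bodies are written from memory of the driver of that era and should be treated as assumptions rather than the authoritative kernel code; `pick_agaw()`, `SAGAW_BITS`, and the requested width of 57 used in the note below are hypothetical and exist only for this illustration.

```c
#include <assert.h>

#define LEVEL_STRIDE   9   /* address bits resolved per page-table level */
#define MAX_AGAW_WIDTH 64
#define SAGAW_BITS     5   /* assumed number of SAGAW bits that get scanned */

/* Address width covered by an AGAW index: 30, 39, 48, 57, 64. */
static int agaw_to_width(int agaw)
{
	int width;

	assert(agaw >= 0);
	width = 30 + agaw * LEVEL_STRIDE;
	return width > MAX_AGAW_WIDTH ? MAX_AGAW_WIDTH : width;
}

/* Smallest AGAW index whose width covers `width` (rounds up). */
static int width_to_agaw(int width)
{
	return (width - 30 + LEVEL_STRIDE - 1) / LEVEL_STRIDE;
}

/* Round a guest width up to the next 12 + 9*n boundary, capped at 64. */
static int guestwidth_to_adjustwidth(int gaw)
{
	int r = (gaw - 12) % LEVEL_STRIDE;
	int adj = r ? gaw + LEVEL_STRIDE - r : gaw;

	return adj > MAX_AGAW_WIDTH ? MAX_AGAW_WIDTH : adj;
}

/*
 * Hypothetical helper: clamp the requested width to `limit`, then look
 * for the first supported AGAW at or above the computed one.
 * Returns the AGAW index, or -1 when no SAGAW bit matches.
 */
static int pick_agaw(int guest_width, int limit, unsigned int sagaw)
{
	int agaw;

	if (guest_width > limit)
		guest_width = limit;
	agaw = width_to_agaw(guestwidth_to_adjustwidth(guest_width));
	if (agaw < 0)
		agaw = 0;
	for (; agaw < SAGAW_BITS; agaw++)
		if (sagaw & (1u << agaw))
			return agaw;
	return -1;
}
```

Under these assumptions, clamping a requested width of 57 to an MGAW of 57 yields AGAW index 3, which a SAGAW mask containing only the 48-bit entry (bit 2) cannot satisfy, so `pick_agaw(57, 57, 1u << 2)` fails. Clamping to `agaw_to_width(2)`, which is 48, yields index 2 and succeeds, matching the behaviour the proposed change is after.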
kdump boot failing with IVRS checksum failure
Hello Joerg,

We are seeing a kdump kernel boot failure in test on an HP DL325 Gen10,
and it was tracked down to 387caf0b759a ("iommu/amd: Treat per-device
exclusion ranges as r/w unity-mapped regions"). It reproduces on 5.9-rc5
and goes away when the commit is reverted. A follow-on commit that
depends on it, 2ca6b6dc8512 ("iommu/amd: Remove unused variable"), was
reverted as well.

I'm working on getting system access and want to see what the IVRS table
looks like, but thought I'd give you a heads up.

Regards,
Jerry
Re: [PATCH v2 0/2] iommu: Move AMD and Intel Kconfig + Makefile bits into their directories
Jerry Snitselaar @ 2020-06-30 13:06 MST:

> This patchset implements the suggestion from Linus to move the
> Kconfig and Makefile bits for AMD and Intel into their respective
> directories.
>
> v2: Rebase against v5.8-rc3. Dropped ---help--- changes from Kconfig as
> that was dealt with in systemwide cleanup.
>
> Jerry Snitselaar (2):
>   iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
>   iommu/amd: Move Kconfig and Makefile bits down into amd directory
>

Hi Joerg,

Looks like I forgot to cc you on this cover letter for v2.
Does this work for you now?

Regards,
Jerry
[PATCH v2 2/2] iommu/amd: Move Kconfig and Makefile bits down into amd directory
Move AMD Kconfig and Makefile bits down into the amd directory with the rest of the AMD specific files. Cc: Joerg Roedel Cc: Suravee Suthikulpanit Signed-off-by: Jerry Snitselaar --- drivers/iommu/Kconfig | 45 +- drivers/iommu/Makefile | 5 + drivers/iommu/amd/Kconfig | 44 + drivers/iommu/amd/Makefile | 4 4 files changed, 50 insertions(+), 48 deletions(-) create mode 100644 drivers/iommu/amd/Kconfig create mode 100644 drivers/iommu/amd/Makefile diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 281cd6bd0fe0..24000e7ed0fa 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -132,50 +132,7 @@ config IOMMU_PGTABLES_L2 def_bool y depends on MSM_IOMMU && MMU && SMP && CPU_DCACHE_DISABLE=n -# AMD IOMMU support -config AMD_IOMMU - bool "AMD IOMMU support" - select SWIOTLB - select PCI_MSI - select PCI_ATS - select PCI_PRI - select PCI_PASID - select IOMMU_API - select IOMMU_IOVA - select IOMMU_DMA - depends on X86_64 && PCI && ACPI - help - With this option you can enable support for AMD IOMMU hardware in - your system. An IOMMU is a hardware component which provides - remapping of DMA memory accesses from devices. With an AMD IOMMU you - can isolate the DMA memory of different devices and protect the - system from misbehaving device drivers or hardware. - - You can find out if your system has an AMD IOMMU if you look into - your BIOS for an option to enable it or if you have an IVRS ACPI - table. - -config AMD_IOMMU_V2 - tristate "AMD IOMMU Version 2 driver" - depends on AMD_IOMMU - select MMU_NOTIFIER - help - This option enables support for the AMD IOMMUv2 features of the IOMMU - hardware. Select this option if you want to use devices that support - the PCI PRI and PASID interface. - -config AMD_IOMMU_DEBUGFS - bool "Enable AMD IOMMU internals in DebugFS" - depends on AMD_IOMMU && IOMMU_DEBUGFS - help - !!!WARNING!!! !!!WARNING!!! !!!WARNING!!! !!!WARNING!!! - - DO NOT ENABLE THIS OPTION UNLESS YOU REALLY, -REALLY- KNOW WHAT YOU ARE DOING!!! 
- Exposes AMD IOMMU device internals in DebugFS. - - This option is -NOT- intended for production environments, and should - not generally be enabled. - +source "drivers/iommu/amd/Kconfig" source "drivers/iommu/intel/Kconfig" config IRQ_REMAP diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 71dd2f382e78..f356bc12b1c7 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y += intel/ +obj-y += amd/ intel/ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o @@ -12,9 +12,6 @@ obj-$(CONFIG_IOASID) += ioasid.o obj-$(CONFIG_IOMMU_IOVA) += iova.o obj-$(CONFIG_OF_IOMMU) += of_iommu.o obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o -obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o -obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o -obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o obj-$(CONFIG_ARM_SMMU) += arm_smmu.o arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig new file mode 100644 index ..1f061d91e0b8 --- /dev/null +++ b/drivers/iommu/amd/Kconfig @@ -0,0 +1,44 @@ +# SPDX-License-Identifier: GPL-2.0-only +# AMD IOMMU support +config AMD_IOMMU + bool "AMD IOMMU support" + select SWIOTLB + select PCI_MSI + select PCI_ATS + select PCI_PRI + select PCI_PASID + select IOMMU_API + select IOMMU_IOVA + select IOMMU_DMA + depends on X86_64 && PCI && ACPI + help + With this option you can enable support for AMD IOMMU hardware in + your system. An IOMMU is a hardware component which provides + remapping of DMA memory accesses from devices. With an AMD IOMMU you + can isolate the DMA memory of different devices and protect the + system from misbehaving device drivers or hardware. 
+ + You can find out if your system has an AMD IOMMU if you look into + your BIOS for an option to enable it or if you have an IVRS ACPI + table. + +config AMD_IOMMU_V2 + tristate "AMD IOMMU Version 2 driver" + depends on AMD_IOMMU + select MMU_NOTIFIER + help + This option enables support for the AMD IOMMUv2 features of the IOMMU + hardware. Select this option if you want to use devices that support + the PCI PRI and PASID int
[PATCH v2 0/2] iommu: Move AMD and Intel Kconfig + Makefile bits into their directories
This patchset implements the suggestion from Linus to move the
Kconfig and Makefile bits for AMD and Intel into their respective
directories.

v2: Rebase against v5.8-rc3. Dropped ---help--- changes from Kconfig as
    that was dealt with in systemwide cleanup.

Jerry Snitselaar (2):
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/amd: Move Kconfig and Makefile bits down into amd directory
[PATCH v2 1/2] iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
Move Intel Kconfig and Makefile bits down into intel directory with the rest of the Intel specific files. Cc: Joerg Roedel Cc: Lu Baolu Signed-off-by: Jerry Snitselaar --- drivers/iommu/Kconfig| 86 +--- drivers/iommu/Makefile | 8 +--- drivers/iommu/intel/Kconfig | 86 drivers/iommu/intel/Makefile | 7 +++ 4 files changed, 96 insertions(+), 91 deletions(-) create mode 100644 drivers/iommu/intel/Kconfig create mode 100644 drivers/iommu/intel/Makefile diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 6dc49ed8377a..281cd6bd0fe0 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -176,91 +176,7 @@ config AMD_IOMMU_DEBUGFS This option is -NOT- intended for production environments, and should not generally be enabled. -# Intel IOMMU support -config DMAR_TABLE - bool - -config INTEL_IOMMU - bool "Support for Intel IOMMU using DMA Remapping Devices" - depends on PCI_MSI && ACPI && (X86 || IA64) - select IOMMU_API - select IOMMU_IOVA - select NEED_DMA_MAP_STATE - select DMAR_TABLE - select SWIOTLB - select IOASID - help - DMA remapping (DMAR) devices support enables independent address - translations for Direct Memory Access (DMA) from devices. - These DMA remapping devices are reported via ACPI tables - and include PCI device scope covered by these DMA - remapping devices. - -config INTEL_IOMMU_DEBUGFS - bool "Export Intel IOMMU internals in Debugfs" - depends on INTEL_IOMMU && IOMMU_DEBUGFS - help - !!!WARNING!!! - - DO NOT ENABLE THIS OPTION UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!!! - - Expose Intel IOMMU internals in Debugfs. - - This option is -NOT- intended for production environments, and should - only be enabled for debugging Intel IOMMU. 
- -config INTEL_IOMMU_SVM - bool "Support for Shared Virtual Memory with Intel IOMMU" - depends on INTEL_IOMMU && X86_64 - select PCI_PASID - select PCI_PRI - select MMU_NOTIFIER - select IOASID - help - Shared Virtual Memory (SVM) provides a facility for devices - to access DMA resources through process address space by - means of a Process Address Space ID (PASID). - -config INTEL_IOMMU_DEFAULT_ON - def_bool y - prompt "Enable Intel DMA Remapping Devices by default" - depends on INTEL_IOMMU - help - Selecting this option will enable a DMAR device at boot time if - one is found. If this option is not selected, DMAR support can - be enabled by passing intel_iommu=on to the kernel. - -config INTEL_IOMMU_BROKEN_GFX_WA - bool "Workaround broken graphics drivers (going away soon)" - depends on INTEL_IOMMU && BROKEN && X86 - help - Current Graphics drivers tend to use physical address - for DMA and avoid using DMA APIs. Setting this config - option permits the IOMMU driver to set a unity map for - all the OS-visible memory. Hence the driver can continue - to use physical addresses for DMA, at least until this - option is removed in the 2.6.32 kernel. - -config INTEL_IOMMU_FLOPPY_WA - def_bool y - depends on INTEL_IOMMU && X86 - help - Floppy disk drivers are known to bypass DMA API calls - thereby failing to work when IOMMU is enabled. This - workaround will setup a 1:1 mapping for the first - 16MiB to make floppy (an ISA device) work. - -config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON - bool "Enable Intel IOMMU scalable mode by default" - depends on INTEL_IOMMU - help - Selecting this option will enable by default the scalable mode if - hardware presents the capability. The scalable mode is defined in - VT-d 3.0. The scalable mode capability could be checked by reading - /sys/devices/virtual/iommu/dmar*/intel-iommu/ecap. If this option - is not selected, scalable mode support could also be enabled by - passing intel_iommu=sm_on to the kernel. 
If not sure, please use - the default value. +source "drivers/iommu/intel/Kconfig" config IRQ_REMAP bool "Support for Interrupt Remapping" diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 342190196dfb..71dd2f382e78 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 +obj-y += intel/ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o @@ -17,13 +18,8 @@ obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o obj-$(CONFIG_ARM_SMMU) += arm_smmu.o arm_smmu-objs
Re: [PATCH 00/13] iommu: Remove usage of dev->archdata.iommu
On Thu Jun 25 20, Joerg Roedel wrote:
>From: Joerg Roedel
>
>Hi,
>
>here is a patch-set to remove the usage of dev->archdata.iommu from
>the IOMMU code in the kernel and replace its uses by the iommu per-device
>private data field. The changes also remove the field entirely from
>the architectures which no longer need it.
>
>On PowerPC the field is called dev->archdata.iommu_domain and was only
>used by the PAMU IOMMU driver. It gets removed as well.
>
>The patches have been runtime tested on Intel VT-d and compile tested
>with allyesconfig for:
>
>	* x86 (32 and 64 bit)
>	* arm and arm64
>	* ia64 (only drivers/ because build failed for me in
>	        arch/ia64)
>	* PPC64
>
>Besides that the changes also survived my IOMMU tree compile tests.
>
>Please review.
>
>Regards,
>
>	Joerg
>
>Joerg Roedel (13):
>  iommu/exynos: Use dev_iommu_priv_get/set()
>  iommu/vt-d: Use dev_iommu_priv_get/set()
>  iommu/msm: Use dev_iommu_priv_get/set()
>  iommu/omap: Use dev_iommu_priv_get/set()
>  iommu/rockchip: Use dev_iommu_priv_get/set()
>  iommu/tegra: Use dev_iommu_priv_get/set()
>  iommu/pamu: Use dev_iommu_priv_get/set()
>  iommu/mediatek: Do no use dev->archdata.iommu
>  x86: Remove dev->archdata.iommu pointer
>  ia64: Remove dev->archdata.iommu pointer
>  arm: Remove dev->archdata.iommu pointer
>  arm64: Remove dev->archdata.iommu pointer
>  powerpc/dma: Remove dev->archdata.iommu_domain
>
> arch/arm/include/asm/device.h                 |  3 ---
> arch/arm64/include/asm/device.h               |  3 ---
> arch/ia64/include/asm/device.h                |  3 ---
> arch/powerpc/include/asm/device.h             |  3 ---
> arch/x86/include/asm/device.h                 |  3 ---
> .../gpu/drm/i915/selftests/mock_gem_device.c  | 10 --
> drivers/iommu/exynos-iommu.c                  | 20 +--
> drivers/iommu/fsl_pamu_domain.c               |  8
> drivers/iommu/intel/iommu.c                   | 18 -
> drivers/iommu/msm_iommu.c                     |  4 ++--
> drivers/iommu/mtk_iommu.h                     |  2 ++
> drivers/iommu/mtk_iommu_v1.c                  | 10 --
> drivers/iommu/omap-iommu.c                    | 20 +--
> drivers/iommu/rockchip-iommu.c                |  8
> drivers/iommu/tegra-gart.c                    |  8
> drivers/iommu/tegra-smmu.c                    |  8
> .../media/platform/s5p-mfc/s5p_mfc_iommu.h    |  4 +++-
> 17 files changed, 64 insertions(+), 71 deletions(-)
>
>--
>2.27.0
>

Reviewed-by: Jerry Snitselaar
Re: [PATCH 1/1] iommu/vt-d: Fix misuse of iommu_domain_identity_map()
On Fri Jun 19 20, Lu Baolu wrote:
>The iommu_domain_identity_map() helper takes start/end PFN as arguments.
>Fix a misuse case where the start and end addresses are passed.
>
>Fixes: e70b081c6f376 ("iommu/vt-d: Remove IOVA handling code from the non-dma_ops path")
>Cc: Tom Murphy
>Reported-by: Alex Williamson
>Signed-off-by: Lu Baolu

Reviewed-by: Jerry Snitselaar
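The address-versus-PFN distinction behind this fix can be sketched in a few lines of user-space C. This is only an illustration of the unit conversion, not the patch itself: `SKETCH_PAGE_SHIFT`, `struct pfn_range`, and `addr_to_pfn_range()` are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Inclusive range of page frame numbers. */
struct pfn_range {
	uint64_t first;
	uint64_t last;
};

/*
 * Convert an inclusive byte-address range into the inclusive PFN range
 * that a PFN-based helper expects. Passing the raw addresses instead
 * would describe a range 4096 times too far out.
 */
static struct pfn_range addr_to_pfn_range(uint64_t start, uint64_t end)
{
	struct pfn_range r;

	assert(start <= end);
	r.first = start >> SKETCH_PAGE_SHIFT;
	r.last = end >> SKETCH_PAGE_SHIFT;
	return r;
}
```

For example, the byte range 0x100000..0x1fffff corresponds to PFNs 0x100..0x1ff under the assumed page size.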
[PATCH 2/2] iommu/amd: Move Kconfig and Makefile bits down into amd directory
Move AMD Kconfig and Makefile bits down into the amd directory with the rest of the AMD specific files. Cc: Joerg Roedel Cc: Suravee Suthikulpanit Signed-off-by: Jerry Snitselaar --- drivers/iommu/Kconfig | 45 +- drivers/iommu/Makefile | 5 + drivers/iommu/amd/Kconfig | 44 + drivers/iommu/amd/Makefile | 4 4 files changed, 50 insertions(+), 48 deletions(-) create mode 100644 drivers/iommu/amd/Kconfig create mode 100644 drivers/iommu/amd/Makefile diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index b12d4ec124f6..78a8be0053b3 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -132,50 +132,7 @@ config IOMMU_PGTABLES_L2 def_bool y depends on MSM_IOMMU && MMU && SMP && CPU_DCACHE_DISABLE=n -# AMD IOMMU support -config AMD_IOMMU - bool "AMD IOMMU support" - select SWIOTLB - select PCI_MSI - select PCI_ATS - select PCI_PRI - select PCI_PASID - select IOMMU_API - select IOMMU_IOVA - select IOMMU_DMA - depends on X86_64 && PCI && ACPI - ---help--- - With this option you can enable support for AMD IOMMU hardware in - your system. An IOMMU is a hardware component which provides - remapping of DMA memory accesses from devices. With an AMD IOMMU you - can isolate the DMA memory of different devices and protect the - system from misbehaving device drivers or hardware. - - You can find out if your system has an AMD IOMMU if you look into - your BIOS for an option to enable it or if you have an IVRS ACPI - table. - -config AMD_IOMMU_V2 - tristate "AMD IOMMU Version 2 driver" - depends on AMD_IOMMU - select MMU_NOTIFIER - ---help--- - This option enables support for the AMD IOMMUv2 features of the IOMMU - hardware. Select this option if you want to use devices that support - the PCI PRI and PASID interface. - -config AMD_IOMMU_DEBUGFS - bool "Enable AMD IOMMU internals in DebugFS" - depends on AMD_IOMMU && IOMMU_DEBUGFS - ---help--- - !!!WARNING!!! !!!WARNING!!! !!!WARNING!!! !!!WARNING!!! 
- - DO NOT ENABLE THIS OPTION UNLESS YOU REALLY, -REALLY- KNOW WHAT YOU ARE DOING!!! - Exposes AMD IOMMU device internals in DebugFS. - - This option is -NOT- intended for production environments, and should - not generally be enabled. - +source "drivers/iommu/amd/Kconfig" source "drivers/iommu/intel/Kconfig" config IRQ_REMAP diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 71dd2f382e78..f356bc12b1c7 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y += intel/ +obj-y += amd/ intel/ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o @@ -12,9 +12,6 @@ obj-$(CONFIG_IOASID) += ioasid.o obj-$(CONFIG_IOMMU_IOVA) += iova.o obj-$(CONFIG_OF_IOMMU) += of_iommu.o obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o -obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o -obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o -obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o obj-$(CONFIG_ARM_SMMU) += arm_smmu.o arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig new file mode 100644 index ..1f061d91e0b8 --- /dev/null +++ b/drivers/iommu/amd/Kconfig @@ -0,0 +1,44 @@ +# SPDX-License-Identifier: GPL-2.0-only +# AMD IOMMU support +config AMD_IOMMU + bool "AMD IOMMU support" + select SWIOTLB + select PCI_MSI + select PCI_ATS + select PCI_PRI + select PCI_PASID + select IOMMU_API + select IOMMU_IOVA + select IOMMU_DMA + depends on X86_64 && PCI && ACPI + help + With this option you can enable support for AMD IOMMU hardware in + your system. An IOMMU is a hardware component which provides + remapping of DMA memory accesses from devices. With an AMD IOMMU you + can isolate the DMA memory of different devices and protect the + system from misbehaving device drivers or hardware. 
+ + You can find out if your system has an AMD IOMMU if you look into + your BIOS for an option to enable it or if you have an IVRS ACPI + table. + +config AMD_IOMMU_V2 + tristate "AMD IOMMU Version 2 driver" + depends on AMD_IOMMU + select MMU_NOTIFIER + help + This option enables support for the AMD IOMMUv2 features of the IOMMU + hardware. Select this option if you want to use devices that support + the PCI
[PATCH 0/2] iommu: Move AMD and Intel Kconfig + Makefile bits into their directories
This patchset implements the suggestion from Linus to move the
Kconfig and Makefile bits for AMD and Intel into their respective
directories. It also cleans up a couple of Kconfig entries to use the
newer help attribute instead of ---help--- (complaint from checkpatch).

Jerry Snitselaar (2):
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/amd: Move Kconfig and Makefile bits down into amd directory
[PATCH 1/2] iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
Move Intel Kconfig and Makefile bits down into intel directory with the rest of the Intel specific files. Cc: Joerg Roedel Cc: Lu Baolu Signed-off-by: Jerry Snitselaar --- drivers/iommu/Kconfig| 86 +--- drivers/iommu/Makefile | 8 +--- drivers/iommu/intel/Kconfig | 86 drivers/iommu/intel/Makefile | 7 +++ 4 files changed, 96 insertions(+), 91 deletions(-) create mode 100644 drivers/iommu/intel/Kconfig create mode 100644 drivers/iommu/intel/Makefile diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index aca76383f201..b12d4ec124f6 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -176,91 +176,7 @@ config AMD_IOMMU_DEBUGFS This option is -NOT- intended for production environments, and should not generally be enabled. -# Intel IOMMU support -config DMAR_TABLE - bool - -config INTEL_IOMMU - bool "Support for Intel IOMMU using DMA Remapping Devices" - depends on PCI_MSI && ACPI && (X86 || IA64) - select IOMMU_API - select IOMMU_IOVA - select NEED_DMA_MAP_STATE - select DMAR_TABLE - select SWIOTLB - select IOASID - help - DMA remapping (DMAR) devices support enables independent address - translations for Direct Memory Access (DMA) from devices. - These DMA remapping devices are reported via ACPI tables - and include PCI device scope covered by these DMA - remapping devices. - -config INTEL_IOMMU_DEBUGFS - bool "Export Intel IOMMU internals in Debugfs" - depends on INTEL_IOMMU && IOMMU_DEBUGFS - help - !!!WARNING!!! - - DO NOT ENABLE THIS OPTION UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!!! - - Expose Intel IOMMU internals in Debugfs. - - This option is -NOT- intended for production environments, and should - only be enabled for debugging Intel IOMMU. 
- -config INTEL_IOMMU_SVM - bool "Support for Shared Virtual Memory with Intel IOMMU" - depends on INTEL_IOMMU && X86 - select PCI_PASID - select PCI_PRI - select MMU_NOTIFIER - select IOASID - help - Shared Virtual Memory (SVM) provides a facility for devices - to access DMA resources through process address space by - means of a Process Address Space ID (PASID). - -config INTEL_IOMMU_DEFAULT_ON - def_bool y - prompt "Enable Intel DMA Remapping Devices by default" - depends on INTEL_IOMMU - help - Selecting this option will enable a DMAR device at boot time if - one is found. If this option is not selected, DMAR support can - be enabled by passing intel_iommu=on to the kernel. - -config INTEL_IOMMU_BROKEN_GFX_WA - bool "Workaround broken graphics drivers (going away soon)" - depends on INTEL_IOMMU && BROKEN && X86 - ---help--- - Current Graphics drivers tend to use physical address - for DMA and avoid using DMA APIs. Setting this config - option permits the IOMMU driver to set a unity map for - all the OS-visible memory. Hence the driver can continue - to use physical addresses for DMA, at least until this - option is removed in the 2.6.32 kernel. - -config INTEL_IOMMU_FLOPPY_WA - def_bool y - depends on INTEL_IOMMU && X86 - ---help--- - Floppy disk drivers are known to bypass DMA API calls - thereby failing to work when IOMMU is enabled. This - workaround will setup a 1:1 mapping for the first - 16MiB to make floppy (an ISA device) work. - -config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON - bool "Enable Intel IOMMU scalable mode by default" - depends on INTEL_IOMMU - help - Selecting this option will enable by default the scalable mode if - hardware presents the capability. The scalable mode is defined in - VT-d 3.0. The scalable mode capability could be checked by reading - /sys/devices/virtual/iommu/dmar*/intel-iommu/ecap. If this option - is not selected, scalable mode support could also be enabled by - passing intel_iommu=sm_on to the kernel. 
If not sure, please use - the default value. +source "drivers/iommu/intel/Kconfig" config IRQ_REMAP bool "Support for Interrupt Remapping" diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 342190196dfb..71dd2f382e78 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 +obj-y += intel/ obj-$(CONFIG_IOMMU_API) += iommu.o obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o @@ -17,13 +18,8 @@ obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o obj-$(CONFIG_ARM_SMMU) += arm_smmu.o ar
[PATCH] iommu: add include/uapi/linux/iommu.h to MAINTAINERS file
When include/uapi/linux/iommu.h was created it was never
added to the file list in MAINTAINERS.

Cc: Joerg Roedel
Signed-off-by: Jerry Snitselaar
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e1897ed32930..061648b6e393 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8954,6 +8954,7 @@ F:	drivers/iommu/
 F:	include/linux/iommu.h
 F:	include/linux/iova.h
 F:	include/linux/of_iommu.h
+F:	include/uapi/linux/iommu.h
 
 IO_URING
 M:	Jens Axboe
-- 
2.24.0
[PATCH] iommu: Don't attach deferred device in iommu_group_do_dma_attach
Attaching a deferred device should be delayed until the DMA API is called.

Cc: iommu@lists.linux-foundation.org
Suggested-by: Joerg Roedel
Signed-off-by: Jerry Snitselaar
---
If you have already thrown a patch together, then ignore this. Also
feel free to swap out the signed-off-by with yours since this is more
your patch than mine. You can put a reviewed-by and tested-by instead
for me.

 drivers/iommu/iommu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b5ea203f6c68..d43120eb1dc5 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
 static int iommu_group_do_dma_attach(struct device *dev, void *data)
 {
 	struct iommu_domain *domain = data;
+	int ret = 0;
 
-	return __iommu_attach_device(domain, dev);
+	if (!iommu_is_attach_deferred(domain, dev))
+		ret = __iommu_attach_device(domain, dev);
+
+	return ret;
 }
 
 static int __iommu_group_dma_attach(struct iommu_group *group)
-- 
2.24.0
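The control flow of the patch above can be exercised outside the kernel with a small mock. Everything prefixed `mock_` below is a hypothetical stand-in invented for this sketch (the real `struct device`, `struct iommu_domain`, and attach helpers are far richer); only the shape of the deferred-attach check is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device. */
struct mock_device {
	bool attach_deferred;	/* driver asked to defer attachment */
	bool attached;		/* set once the device is attached */
};

/* Hypothetical stand-in for struct iommu_domain. */
struct mock_domain {
	int attach_calls;	/* how many devices were really attached */
};

static bool mock_is_attach_deferred(struct mock_domain *domain,
				    struct mock_device *dev)
{
	(void)domain;
	return dev->attach_deferred;
}

static int mock_attach_device(struct mock_domain *domain,
			      struct mock_device *dev)
{
	domain->attach_calls++;
	dev->attached = true;
	return 0;
}

/* Same shape as the patched callback: skip devices that defer attach. */
static int mock_group_do_dma_attach(struct mock_device *dev, void *data)
{
	struct mock_domain *domain = data;
	int ret = 0;

	assert(dev != NULL && domain != NULL);
	if (!mock_is_attach_deferred(domain, dev))
		ret = mock_attach_device(domain, dev);

	return ret;
}
```

Calling `mock_group_do_dma_attach()` on one ordinary and one deferred device returns 0 for both, but only the ordinary device ends up attached; the deferred one is left for a later attach, which in the real driver would presumably happen on first DMA API use.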
Re: [PATCH 0/2] iommu: Move Intel and AMD drivers into their own subdirectory
On Thu Jun 04 20, Lu Baolu wrote: Hi Joerg, On 6/2/20 5:26 PM, Joerg Roedel wrote: Hi, two small patches to move the Intel and AMD IOMMU drivers into their own subdirectory under drivers/iommu/ to make the file structure a bit less cluttered. Does the MAINTAINERS file need to update? Best regards, baolu Yes, that should be updated to point at the new directories. Good catch. Regards, Joerg Joerg Roedel (2): iommu/amd: Move AMD IOMMU driver into subdirectory iommu/vt-d: Move Intel IOMMU driver into subdirectory drivers/iommu/Makefile | 18 +- drivers/iommu/{ => amd}/amd_iommu.h| 0 drivers/iommu/{ => amd}/amd_iommu_types.h | 0 .../{amd_iommu_debugfs.c => amd/debugfs.c} | 0 drivers/iommu/{amd_iommu_init.c => amd/init.c} | 2 +- drivers/iommu/{amd_iommu.c => amd/iommu.c} | 2 +- .../iommu/{amd_iommu_v2.c => amd/iommu_v2.c} | 0 .../iommu/{amd_iommu_quirks.c => amd/quirks.c} | 0 .../{intel-iommu-debugfs.c => intel/debugfs.c} | 0 drivers/iommu/{ => intel}/dmar.c | 2 +- drivers/iommu/{ => intel}/intel-pasid.h| 0 drivers/iommu/{intel-iommu.c => intel/iommu.c} | 2 +- .../irq_remapping.c} | 2 +- drivers/iommu/{intel-pasid.c => intel/pasid.c} | 0 drivers/iommu/{intel-svm.c => intel/svm.c} | 0 drivers/iommu/{intel-trace.c => intel/trace.c} | 0 16 files changed, 14 insertions(+), 14 deletions(-) rename drivers/iommu/{ => amd}/amd_iommu.h (100%) rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%) rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%) rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%) rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%) rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%) rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%) rename drivers/iommu/{intel-iommu-debugfs.c => intel/debugfs.c} (100%) rename drivers/iommu/{ => intel}/dmar.c (99%) rename drivers/iommu/{ => intel}/intel-pasid.h (100%) rename drivers/iommu/{intel-iommu.c => intel/iommu.c} (99%) rename drivers/iommu/{intel_irq_remapping.c => 
intel/irq_remapping.c} (99%) rename drivers/iommu/{intel-pasid.c => intel/pasid.c} (100%) rename drivers/iommu/{intel-svm.c => intel/svm.c} (100%) rename drivers/iommu/{intel-trace.c => intel/trace.c} (100%) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] iommu: Move Intel and AMD drivers into their own subdirectory
On Tue Jun 02 20, Joerg Roedel wrote: Hi, two small patches to move the Intel and AMD IOMMU drivers into their own subdirectory under drivers/iommu/ to make the file structure a bit less cluttered. Regards, Joerg Joerg Roedel (2): iommu/amd: Move AMD IOMMU driver into subdirectory iommu/vt-d: Move Intel IOMMU driver into subdirectory drivers/iommu/Makefile | 18 +- drivers/iommu/{ => amd}/amd_iommu.h| 0 drivers/iommu/{ => amd}/amd_iommu_types.h | 0 .../{amd_iommu_debugfs.c => amd/debugfs.c} | 0 drivers/iommu/{amd_iommu_init.c => amd/init.c} | 2 +- drivers/iommu/{amd_iommu.c => amd/iommu.c} | 2 +- .../iommu/{amd_iommu_v2.c => amd/iommu_v2.c} | 0 .../iommu/{amd_iommu_quirks.c => amd/quirks.c} | 0 .../{intel-iommu-debugfs.c => intel/debugfs.c} | 0 drivers/iommu/{ => intel}/dmar.c | 2 +- drivers/iommu/{ => intel}/intel-pasid.h| 0 drivers/iommu/{intel-iommu.c => intel/iommu.c} | 2 +- .../irq_remapping.c} | 2 +- drivers/iommu/{intel-pasid.c => intel/pasid.c} | 0 drivers/iommu/{intel-svm.c => intel/svm.c} | 0 drivers/iommu/{intel-trace.c => intel/trace.c} | 0 16 files changed, 14 insertions(+), 14 deletions(-) rename drivers/iommu/{ => amd}/amd_iommu.h (100%) rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%) rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%) rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%) rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%) rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%) rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%) rename drivers/iommu/{intel-iommu-debugfs.c => intel/debugfs.c} (100%) rename drivers/iommu/{ => intel}/dmar.c (99%) rename drivers/iommu/{ => intel}/intel-pasid.h (100%) rename drivers/iommu/{intel-iommu.c => intel/iommu.c} (99%) rename drivers/iommu/{intel_irq_remapping.c => intel/irq_remapping.c} (99%) rename drivers/iommu/{intel-pasid.c => intel/pasid.c} (100%) rename drivers/iommu/{intel-svm.c => intel/svm.c} (100%) rename drivers/iommu/{intel-trace.c 
=> intel/trace.c} (100%) -- 2.17.1 Reviewed-by: Jerry Snitselaar
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Tue Jun 02 20, Jerry Snitselaar wrote:
> On Tue Jun 02 20, Joerg Roedel wrote:
> > Hi Jerry,
> >
> > On Mon, Jun 01, 2020 at 05:02:36PM -0700, Jerry Snitselaar wrote:
> > > Yeah, that will solve the panic.
> >
> > If you still see the kdump faults, can you please try with the
> > attached diff? I was not able to reproduce them in my setup.
> >
> > Regards,
> >
> > Joerg
>
> I have another hp proliant server now, and reproduced. I will have
> the patch below tested shortly. Minor change, I switched
> group->domain to domain since group isn't an argument, and *data
> being passed in comes from group->domain anyways.

Looks like it solves the problem for both the epyc system and the hp
proliant server.

> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index b5ea203f6c68..5a6d509f72b6 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
> >  static int iommu_group_do_dma_attach(struct device *dev, void *data)
> >  {
> >  	struct iommu_domain *domain = data;
> > +	int ret = 0;
> >
> > -	return __iommu_attach_device(domain, dev);
> > +	if (!iommu_is_attach_deferred(group->domain, dev))
> > +		ret = __iommu_attach_device(group->domain, dev);
> > +
> > +	return ret;
> >  }
> >
> >  static int __iommu_group_dma_attach(struct iommu_group *group)
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Tue Jun 02 20, Joerg Roedel wrote:
> Hi Jerry,
>
> On Mon, Jun 01, 2020 at 05:02:36PM -0700, Jerry Snitselaar wrote:
> > Yeah, that will solve the panic.
>
> If you still see the kdump faults, can you please try with the
> attached diff? I was not able to reproduce them in my setup.
>
> Regards,
>
> Joerg

I have another hp proliant server now, and reproduced. I will have
the patch below tested shortly. Minor change, I switched
group->domain to domain since group isn't an argument, and *data
being passed in comes from group->domain anyways.

> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index b5ea203f6c68..5a6d509f72b6 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
>  static int iommu_group_do_dma_attach(struct device *dev, void *data)
>  {
>  	struct iommu_domain *domain = data;
> +	int ret = 0;
>
> -	return __iommu_attach_device(domain, dev);
> +	if (!iommu_is_attach_deferred(group->domain, dev))
> +		ret = __iommu_attach_device(group->domain, dev);
> +
> +	return ret;
>  }
>
>  static int __iommu_group_dma_attach(struct iommu_group *group)
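The change described above (taking the domain from *data rather than from group->domain) can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the structs, the mock_* callbacks and sketch_for_each_dev() are hypothetical stand-ins, not the kernel definitions, and the attach is called through the ops pointer directly to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types; not the real definitions. */
struct iommu_domain;

struct device {
	bool defer_attach;	/* hypothetical: driver wants the attach postponed */
	int attach_count;	/* how often attach_dev ran for this device */
};

struct iommu_ops {
	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
};

struct iommu_domain {
	const struct iommu_ops *ops;
};

static bool mock_is_attach_deferred(struct iommu_domain *domain, struct device *dev)
{
	(void)domain;
	return dev->defer_attach;
}

static int mock_attach_dev(struct iommu_domain *domain, struct device *dev)
{
	(void)domain;
	dev->attach_count++;
	return 0;
}

static const struct iommu_ops mock_ops = {
	.is_attach_deferred = mock_is_attach_deferred,
	.attach_dev = mock_attach_dev,
};

static bool iommu_is_attach_deferred(struct iommu_domain *domain, struct device *dev)
{
	if (domain->ops->is_attach_deferred)
		return domain->ops->is_attach_deferred(domain, dev);
	return false;
}

/*
 * Per-device callback: only dev and data are in scope, so the domain
 * has to come from data. There is no "group" variable to dereference.
 */
static int iommu_group_do_dma_attach(struct device *dev, void *data)
{
	struct iommu_domain *domain = data;
	int ret = 0;

	assert(domain != NULL);
	if (!iommu_is_attach_deferred(domain, dev))
		ret = domain->ops->attach_dev(domain, dev);

	return ret;
}

/* Hypothetical iterator standing in for the walk over a group's devices. */
static int sketch_for_each_dev(struct device **devs, size_t n, void *data,
			       int (*fn)(struct device *dev, void *data))
{
	for (size_t i = 0; i < n; i++) {
		int ret = fn(devs[i], data);

		if (ret)
			return ret;
	}
	return 0;
}
```

Passing the group's domain as the data argument of the iterator gives the callback everything it needs; a device whose attach is deferred is skipped and the walk still returns 0.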
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Tue Jun 02 20, Lu Baolu wrote:
> Hi Jerry,
>
> On 6/1/20 6:42 PM, Jerry Snitselaar wrote:
> > > Hi Joerg,
> > >
> > > With this patchset, I have an epyc system where if I boot with
> > > iommu=nopt and force a dump I will see some io page faults for a
> > > nic on the system. The vmcore is harvested and the system
> > > reboots. I haven't reproduced it on other systems yet, but
> > > without the patchset I don't see the io page faults during the
> > > kdump.
> > >
> > > Regards,
> > > Jerry
> >
> > I just hit an issue on a separate intel based system (kdump
> > iommu=nopt), where it panics during intel_iommu_attach_device, in
> > is_aux_domain, due to device_domain_info being
> > DEFER_DEVICE_DOMAIN_INFO. That doesn't get set to a valid address
> > until the domain_add_dev_info call.
> >
> > Is it as simple as the following?
>
> I guess you won't hit this issue if you use iommu/next branch of
> Joerg's tree. We've changed to use a generic helper to retrieve the
> valid per device iommu data or NULL (if there's no).
>
> Best regards,
> baolu

Yeah, that will solve the panic.

> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 29d3940847d3..f1bbeed46a4c 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain)
> >  {
> >  	struct device_domain_info *info = dev->archdata.iommu;
> >
> > -	return info && info->auxd_enabled &&
> > -		domain->type == IOMMU_DOMAIN_UNMANAGED;
> > +	return info && info != DEFER_DEVICE_DOMAIN_INFO &&
> > +		info->auxd_enabled && domain->type == IOMMU_DOMAIN_UNMANAGED;
> >  }
> >
> >  static void auxiliary_link_device(struct dmar_domain *domain,
> >
> > Regards,
> > Jerry
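For readers who have not met the sentinel-pointer pattern behind this panic, here is a small user-space sketch of the guarded check. It assumes simplified stand-in structs, a hypothetical iommu_info field in place of dev->archdata.iommu, and a sentinel value of -2 chosen for the sketch; none of that is taken from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch-only domain types; the kernel uses IOMMU_DOMAIN_* constants. */
enum sketch_domain_type { SKETCH_DOMAIN_UNMANAGED, SKETCH_DOMAIN_DMA };

struct device_domain_info {
	bool auxd_enabled;
};

struct device {
	/* hypothetical field standing in for dev->archdata.iommu */
	struct device_domain_info *iommu_info;
};

struct iommu_domain {
	enum sketch_domain_type type;
};

/*
 * Sentinel meaning "attach deferred, no real info yet". It is not a
 * valid address, so it must never be dereferenced. The value -2 is an
 * assumption made for this sketch.
 */
#define DEFER_DEVICE_DOMAIN_INFO \
	((struct device_domain_info *)(intptr_t)(-2))

static bool is_aux_domain(const struct device *dev,
			  const struct iommu_domain *domain)
{
	struct device_domain_info *info = dev->iommu_info;

	/* Reject NULL and the sentinel before touching info->auxd_enabled. */
	return info && info != DEFER_DEVICE_DOMAIN_INFO &&
	       info->auxd_enabled && domain->type == SKETCH_DOMAIN_UNMANAGED;
}
```

Because && short-circuits left to right, the sentinel comparison has to come before the first dereference; with the original ordering a non-NULL sentinel passes the "info &&" test and the next operand reads through an invalid pointer.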
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Mon Jun 01 20, Jerry Snitselaar wrote: On Fri May 29 20, Jerry Snitselaar wrote: On Tue Apr 14 20, Joerg Roedel wrote: Hi, here is the second version of this patch-set. The first version with some more introductory text can be found here: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ Changes v1->v2: * Rebased to v5.7-rc1 * Re-wrote the arm-smmu changes as suggested by Robin Murphy * Re-worked the Exynos patches to hopefully not break the driver anymore * Fixed a missing mutex_unlock() reported by Marek Szyprowski, thanks for that. There is also a git-branch available with these patches applied: https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 Please review. Thanks, Joerg Joerg Roedel (32): iommu: Move default domain allocation to separate function iommu/amd: Implement iommu_ops->def_domain_type call-back iommu/vt-d: Wire up iommu_ops->def_domain_type iommu/amd: Remove dma_mask check from check_device() iommu/amd: Return -ENODEV in add_device when device is not handled by IOMMU iommu: Add probe_device() and remove_device() call-backs iommu: Move default domain allocation to iommu_probe_device() iommu: Keep a list of allocated groups in __iommu_probe_device() iommu: Move new probe_device path to separate function iommu: Split off default domain allocation from group assignment iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device() iommu: Export bus_iommu_probe() and make is safe for re-probing iommu/amd: Remove dev_data->passthrough iommu/amd: Convert to probe/release_device() call-backs iommu/vt-d: Convert to probe/release_device() call-backs iommu/arm-smmu: Convert to probe/release_device() call-backs iommu/pamu: Convert to probe/release_device() call-backs iommu/s390: Convert to probe/release_device() call-backs iommu/virtio: Convert to probe/release_device() call-backs iommu/msm: Convert to probe/release_device() call-backs iommu/mediatek: Convert to 
probe/release_device() call-backs iommu/mediatek-v1 Convert to probe/release_device() call-backs iommu/qcom: Convert to probe/release_device() call-backs iommu/rockchip: Convert to probe/release_device() call-backs iommu/tegra: Convert to probe/release_device() call-backs iommu/renesas: Convert to probe/release_device() call-backs iommu/omap: Remove orphan_dev tracking iommu/omap: Convert to probe/release_device() call-backs iommu/exynos: Use first SYSMMU in controllers list for IOMMU core iommu/exynos: Convert to probe/release_device() call-backs iommu: Remove add_device()/remove_device() code-paths iommu: Unexport iommu_group_get_for_dev() Sai Praneeth Prakhya (1): iommu: Add def_domain_type() callback in iommu_ops drivers/iommu/amd_iommu.c | 97 drivers/iommu/amd_iommu_types.h | 1 - drivers/iommu/arm-smmu-v3.c | 38 +-- drivers/iommu/arm-smmu.c| 39 ++-- drivers/iommu/exynos-iommu.c| 24 +- drivers/iommu/fsl_pamu_domain.c | 22 +- drivers/iommu/intel-iommu.c | 68 +- drivers/iommu/iommu.c | 393 +--- drivers/iommu/ipmmu-vmsa.c | 60 ++--- drivers/iommu/msm_iommu.c | 34 +-- drivers/iommu/mtk_iommu.c | 24 +- drivers/iommu/mtk_iommu_v1.c| 50 ++-- drivers/iommu/omap-iommu.c | 99 ++-- drivers/iommu/qcom_iommu.c | 24 +- drivers/iommu/rockchip-iommu.c | 26 +-- drivers/iommu/s390-iommu.c | 22 +- drivers/iommu/tegra-gart.c | 24 +- drivers/iommu/tegra-smmu.c | 31 +-- drivers/iommu/virtio-iommu.c| 41 +--- include/linux/iommu.h | 21 +- 20 files changed, 533 insertions(+), 605 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu Hi Joerg, With this patchset, I have an epyc system where if I boot with iommu=nopt and force a dump I will see some io page faults for a nic on the system. The vmcore is harvested and the system reboots. I haven't reproduced it on other systems yet, but without the patchset I don't see the io page faults during the kdump. 
Regards, Jerry I just hit an issue on a separate intel based system (kdump iommu=nopt), where it panics in during intel_iommu_attach_device, in is_aux_domain, due to device_domain_info being DEFER_DEVICE_DOMAIN_INFO. That doesn't get set to a valid address until the domain_add_dev_info call. Is it as simple as the following? diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 29d3940847d3..f1bbeed46a4c 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain) { struct device_domain_info *info = dev->archdata.iommu; - return info && info->auxd_enable
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Fri May 29 20, Jerry Snitselaar wrote: On Tue Apr 14 20, Joerg Roedel wrote: Hi, here is the second version of this patch-set. The first version with some more introductory text can be found here: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ Changes v1->v2: * Rebased to v5.7-rc1 * Re-wrote the arm-smmu changes as suggested by Robin Murphy * Re-worked the Exynos patches to hopefully not break the driver anymore * Fixed a missing mutex_unlock() reported by Marek Szyprowski, thanks for that. There is also a git-branch available with these patches applied: https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 Please review. Thanks, Joerg Joerg Roedel (32): iommu: Move default domain allocation to separate function iommu/amd: Implement iommu_ops->def_domain_type call-back iommu/vt-d: Wire up iommu_ops->def_domain_type iommu/amd: Remove dma_mask check from check_device() iommu/amd: Return -ENODEV in add_device when device is not handled by IOMMU iommu: Add probe_device() and remove_device() call-backs iommu: Move default domain allocation to iommu_probe_device() iommu: Keep a list of allocated groups in __iommu_probe_device() iommu: Move new probe_device path to separate function iommu: Split off default domain allocation from group assignment iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device() iommu: Export bus_iommu_probe() and make is safe for re-probing iommu/amd: Remove dev_data->passthrough iommu/amd: Convert to probe/release_device() call-backs iommu/vt-d: Convert to probe/release_device() call-backs iommu/arm-smmu: Convert to probe/release_device() call-backs iommu/pamu: Convert to probe/release_device() call-backs iommu/s390: Convert to probe/release_device() call-backs iommu/virtio: Convert to probe/release_device() call-backs iommu/msm: Convert to probe/release_device() call-backs iommu/mediatek: Convert to probe/release_device() call-backs iommu/mediatek-v1 Convert to 
probe/release_device() call-backs iommu/qcom: Convert to probe/release_device() call-backs iommu/rockchip: Convert to probe/release_device() call-backs iommu/tegra: Convert to probe/release_device() call-backs iommu/renesas: Convert to probe/release_device() call-backs iommu/omap: Remove orphan_dev tracking iommu/omap: Convert to probe/release_device() call-backs iommu/exynos: Use first SYSMMU in controllers list for IOMMU core iommu/exynos: Convert to probe/release_device() call-backs iommu: Remove add_device()/remove_device() code-paths iommu: Unexport iommu_group_get_for_dev() Sai Praneeth Prakhya (1): iommu: Add def_domain_type() callback in iommu_ops drivers/iommu/amd_iommu.c | 97 drivers/iommu/amd_iommu_types.h | 1 - drivers/iommu/arm-smmu-v3.c | 38 +-- drivers/iommu/arm-smmu.c| 39 ++-- drivers/iommu/exynos-iommu.c| 24 +- drivers/iommu/fsl_pamu_domain.c | 22 +- drivers/iommu/intel-iommu.c | 68 +- drivers/iommu/iommu.c | 393 +--- drivers/iommu/ipmmu-vmsa.c | 60 ++--- drivers/iommu/msm_iommu.c | 34 +-- drivers/iommu/mtk_iommu.c | 24 +- drivers/iommu/mtk_iommu_v1.c| 50 ++-- drivers/iommu/omap-iommu.c | 99 ++-- drivers/iommu/qcom_iommu.c | 24 +- drivers/iommu/rockchip-iommu.c | 26 +-- drivers/iommu/s390-iommu.c | 22 +- drivers/iommu/tegra-gart.c | 24 +- drivers/iommu/tegra-smmu.c | 31 +-- drivers/iommu/virtio-iommu.c| 41 +--- include/linux/iommu.h | 21 +- 20 files changed, 533 insertions(+), 605 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu Hi Joerg, With this patchset, I have an epyc system where if I boot with iommu=nopt and force a dump I will see some io page faults for a nic on the system. The vmcore is harvested and the system reboots. I haven't reproduced it on other systems yet, but without the patchset I don't see the io page faults during the kdump. 
Regards, Jerry I just hit an issue on a separate intel based system (kdump iommu=nopt), where it panics in during intel_iommu_attach_device, in is_aux_domain, due to device_domain_info being DEFER_DEVICE_DOMAIN_INFO. That doesn't get set to a valid address until the domain_add_dev_info call. Is it as simple as the following? diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 29d3940847d3..f1bbeed46a4c 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain) { struct device_domain_info *info = dev->archdata.iommu; - return info && info->auxd_enabled && - domain->type ==
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Tue Apr 14 20, Joerg Roedel wrote: Hi, here is the second version of this patch-set. The first version with some more introductory text can be found here: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ Changes v1->v2: * Rebased to v5.7-rc1 * Re-wrote the arm-smmu changes as suggested by Robin Murphy * Re-worked the Exynos patches to hopefully not break the driver anymore * Fixed a missing mutex_unlock() reported by Marek Szyprowski, thanks for that. There is also a git-branch available with these patches applied: https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 Please review. Thanks, Joerg Joerg Roedel (32): iommu: Move default domain allocation to separate function iommu/amd: Implement iommu_ops->def_domain_type call-back iommu/vt-d: Wire up iommu_ops->def_domain_type iommu/amd: Remove dma_mask check from check_device() iommu/amd: Return -ENODEV in add_device when device is not handled by IOMMU iommu: Add probe_device() and remove_device() call-backs iommu: Move default domain allocation to iommu_probe_device() iommu: Keep a list of allocated groups in __iommu_probe_device() iommu: Move new probe_device path to separate function iommu: Split off default domain allocation from group assignment iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device() iommu: Export bus_iommu_probe() and make is safe for re-probing iommu/amd: Remove dev_data->passthrough iommu/amd: Convert to probe/release_device() call-backs iommu/vt-d: Convert to probe/release_device() call-backs iommu/arm-smmu: Convert to probe/release_device() call-backs iommu/pamu: Convert to probe/release_device() call-backs iommu/s390: Convert to probe/release_device() call-backs iommu/virtio: Convert to probe/release_device() call-backs iommu/msm: Convert to probe/release_device() call-backs iommu/mediatek: Convert to probe/release_device() call-backs iommu/mediatek-v1 Convert to probe/release_device() call-backs 
iommu/qcom: Convert to probe/release_device() call-backs iommu/rockchip: Convert to probe/release_device() call-backs iommu/tegra: Convert to probe/release_device() call-backs iommu/renesas: Convert to probe/release_device() call-backs iommu/omap: Remove orphan_dev tracking iommu/omap: Convert to probe/release_device() call-backs iommu/exynos: Use first SYSMMU in controllers list for IOMMU core iommu/exynos: Convert to probe/release_device() call-backs iommu: Remove add_device()/remove_device() code-paths iommu: Unexport iommu_group_get_for_dev() Sai Praneeth Prakhya (1): iommu: Add def_domain_type() callback in iommu_ops drivers/iommu/amd_iommu.c | 97 drivers/iommu/amd_iommu_types.h | 1 - drivers/iommu/arm-smmu-v3.c | 38 +-- drivers/iommu/arm-smmu.c| 39 ++-- drivers/iommu/exynos-iommu.c| 24 +- drivers/iommu/fsl_pamu_domain.c | 22 +- drivers/iommu/intel-iommu.c | 68 +- drivers/iommu/iommu.c | 393 +--- drivers/iommu/ipmmu-vmsa.c | 60 ++--- drivers/iommu/msm_iommu.c | 34 +-- drivers/iommu/mtk_iommu.c | 24 +- drivers/iommu/mtk_iommu_v1.c| 50 ++-- drivers/iommu/omap-iommu.c | 99 ++-- drivers/iommu/qcom_iommu.c | 24 +- drivers/iommu/rockchip-iommu.c | 26 +-- drivers/iommu/s390-iommu.c | 22 +- drivers/iommu/tegra-gart.c | 24 +- drivers/iommu/tegra-smmu.c | 31 +-- drivers/iommu/virtio-iommu.c| 41 +--- include/linux/iommu.h | 21 +- 20 files changed, 533 insertions(+), 605 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu Hi Joerg, With this patchset, I have an epyc system where if I boot with iommu=nopt and force a dump I will see some io page faults for a nic on the system. The vmcore is harvested and the system reboots. I haven't reproduced it on other systems yet, but without the patchset I don't see the io page faults during the kdump. Regards, Jerry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Implement deferred domain attachment
On Mon May 18 20, Joerg Roedel wrote:
> On Fri, May 15, 2020 at 08:23:13PM +0100, Robin Murphy wrote:
> > But that's not what this is; this is (supposed to be) the exact
> > same "don't actually perform the attach yet" logic as before, just
> > restricting it to default domains in the one place that it
> > actually needs to be, so as not to fundamentally bugger up
> > iommu_attach_device() in a way that prevents it from working as
> > expected at the correct point later.
>
> You are right, that is better. I tested it and it seems to work.
> Updated diff attached, with a minor cleanup included. Mind sending
> it as a proper patch I can send upstream?
>
> Thanks,
>
> Joerg
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 7b375421afba..a9d02bc3ab5b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -693,6 +693,15 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
>  	return ret;
>  }
>
> +static bool iommu_is_attach_deferred(struct iommu_domain *domain,
> +				     struct device *dev)
> +{
> +	if (domain->ops->is_attach_deferred)
> +		return domain->ops->is_attach_deferred(domain, dev);
> +
> +	return false;
> +}
> +
>  /**
>   * iommu_group_add_device - add a device to an iommu group
>   * @group: the group into which to add the device (reference should be held)
> @@ -705,6 +714,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
>  {
>  	int ret, i = 0;
>  	struct group_device *device;
> +	struct iommu_domain *domain;
>
>  	device = kzalloc(sizeof(*device), GFP_KERNEL);
>  	if (!device)
> @@ -747,7 +757,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
>
>  	mutex_lock(&group->mutex);
>  	list_add_tail(&device->list, &group->devices);
> -	if (group->domain)
> +	domain = group->domain;
> +	if (domain && !iommu_is_attach_deferred(domain, dev))
>  		ret = __iommu_attach_device(group->domain, dev);
>  	mutex_unlock(&group->mutex);
>  	if (ret)
> @@ -1653,9 +1664,6 @@ static int __iommu_attach_device(struct iommu_domain *domain,
>  				 struct device *dev)
>  {
>  	int ret;
> -	if ((domain->ops->is_attach_deferred != NULL) &&
> -	    domain->ops->is_attach_deferred(domain, dev))
> -		return 0;
>
>  	if (unlikely(domain->ops->attach_dev == NULL))
>  		return -ENODEV;
> @@ -1727,8 +1735,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
>  static void __iommu_detach_device(struct iommu_domain *domain,
>  				  struct device *dev)
>  {
> -	if ((domain->ops->is_attach_deferred != NULL) &&
> -	    domain->ops->is_attach_deferred(domain, dev))
> +	if (iommu_is_attach_deferred(domain, dev))
>  		return;
>
>  	if (unlikely(domain->ops->detach_dev == NULL))

This worked for me as well.
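Robin's point, that deferral should apply only where a device joins a group that already has a domain, while an explicit attach later really attaches, can be illustrated with a user-space sketch. The structs, mock callbacks and the group_add_device_attach() helper below are hypothetical simplifications, not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types; not the real definitions. */
struct iommu_domain;

struct device {
	bool defer_attach;	/* hypothetical: attach should wait */
	bool attached;
};

struct iommu_ops {
	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
};

struct iommu_domain {
	const struct iommu_ops *ops;
};

struct iommu_group {
	struct iommu_domain *domain;
};

static bool mock_is_attach_deferred(struct iommu_domain *domain, struct device *dev)
{
	(void)domain;
	return dev->defer_attach;
}

static int mock_attach_dev(struct iommu_domain *domain, struct device *dev)
{
	(void)domain;
	dev->attached = true;
	dev->defer_attach = false;
	return 0;
}

static const struct iommu_ops mock_ops = {
	.is_attach_deferred = mock_is_attach_deferred,
	.attach_dev = mock_attach_dev,
};

static bool iommu_is_attach_deferred(struct iommu_domain *domain, struct device *dev)
{
	if (domain->ops->is_attach_deferred)
		return domain->ops->is_attach_deferred(domain, dev);

	return false;
}

/* The low-level attach no longer looks at the deferral flag at all. */
static int __iommu_attach_device(struct iommu_domain *domain, struct device *dev)
{
	if (domain->ops->attach_dev == NULL)
		return -ENODEV;

	return domain->ops->attach_dev(domain, dev);
}

/* Only the "device joins a group" path honours the deferral. */
static int group_add_device_attach(struct iommu_group *group, struct device *dev)
{
	struct iommu_domain *domain = group->domain;
	int ret = 0;

	if (domain && !iommu_is_attach_deferred(domain, dev))
		ret = __iommu_attach_device(domain, dev);

	return ret;
}
```

In this arrangement a deferred device is left untouched when it is added to the group, and a later direct call to the attach function still does the work instead of silently returning 0.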
Re: [PATCH] iommu: Implement deferred domain attachment
On Mon May 18 20, Joerg Roedel wrote:
> On Fri, May 15, 2020 at 08:23:13PM +0100, Robin Murphy wrote:
> > But that's not what this is; this is (supposed to be) the exact
> > same "don't actually perform the attach yet" logic as before, just
> > restricting it to default domains in the one place that it
> > actually needs to be, so as not to fundamentally bugger up
> > iommu_attach_device() in a way that prevents it from working as
> > expected at the correct point later.
>
> You are right, that is better. I tested it and it seems to work.
> Updated diff attached, with a minor cleanup included. Mind sending
> it as a proper patch I can send upstream?
>
> Thanks,
>
> Joerg

I should have this tested this afternoon.
Re: amd kdump failure with iommu=nopt
On Thu May 14 20, Joerg Roedel wrote:
> On Thu, May 14, 2020 at 05:36:23PM +0200, Joerg Roedel wrote:
> > This commit also removes the deferred attach of the device to its
> > new domain. Does the attached diff fix the problem for you?
> > +static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
> > +					  struct device *dev)
> > +{
> >  	if (unlikely(domain->ops->attach_dev == NULL))
> >  		return -ENODEV;
> >
> >  	ret = domain->ops->attach_dev(domain, dev);
> >  	if (!ret)
> >  		trace_attach_device_to_domain(dev);
> > +
> >  	return ret;
> >  }
>
> Sorry, this didn't compile, here is an updated version that actually
> compiles:
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 4050569188be..f54ebb964271 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1889,13 +1889,19 @@ void iommu_domain_free(struct iommu_domain *domain)
>  }
>  EXPORT_SYMBOL_GPL(iommu_domain_free);
>
> -static int __iommu_attach_device(struct iommu_domain *domain,
> -				 struct device *dev)
> +static bool __iommu_is_attach_deferred(struct iommu_domain *domain,
> +				       struct device *dev)
> +{
> +	if (!domain->ops->is_attach_deferred)
> +		return false;
> +
> +	return domain->ops->is_attach_deferred(domain, dev);
> +}
> +
> +static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
> +					  struct device *dev)
>  {
>  	int ret;
> -	if ((domain->ops->is_attach_deferred != NULL) &&
> -	    domain->ops->is_attach_deferred(domain, dev))
> -		return 0;
>
>  	if (unlikely(domain->ops->attach_dev == NULL))
>  		return -ENODEV;
> @@ -1903,9 +1909,19 @@ static int __iommu_attach_device(struct iommu_domain *domain,
>  	ret = domain->ops->attach_dev(domain, dev);
>  	if (!ret)
>  		trace_attach_device_to_domain(dev);
> +
>  	return ret;
>  }
>
> +static int __iommu_attach_device(struct iommu_domain *domain,
> +				 struct device *dev)
> +{
> +	if (__iommu_is_attach_deferred(domain, dev))
> +		return 0;
> +
> +	return __iommu_attach_device_no_defer(domain, dev);
> +}
> +
>  int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
>  {
>  	struct iommu_group *group;
> @@ -2023,7 +2039,12 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
>   */
>  struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>  {
> -	return dev->iommu_group->default_domain;
> +	struct iommu_domain *domain = dev->iommu_group->default_domain;
> +
> +	if (__iommu_is_attach_deferred(domain, dev))
> +		__iommu_attach_device_no_defer(domain, dev);
> +
> +	return domain;
>  }
>
>  /*

Yes, that works.

Tested-by: Jerry Snitselaar
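The idea in this diff, postponing the attach and performing it lazily the first time the DMA domain is looked up, can be tried out in user space with the sketch below. The function names follow the diff, but the structs and mock callbacks are hypothetical stand-ins, and the mock driver is assumed to stop reporting "deferred" once the attach has happened.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types; not the real definitions. */
struct iommu_domain;
struct iommu_group;

struct device {
	struct iommu_group *iommu_group;
	bool defer_attach;	/* hypothetical: set while the attach is pending */
	int attach_count;
};

struct iommu_ops {
	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
};

struct iommu_domain {
	const struct iommu_ops *ops;
};

struct iommu_group {
	struct iommu_domain *default_domain;
};

static bool mock_is_attach_deferred(struct iommu_domain *domain, struct device *dev)
{
	(void)domain;
	return dev->defer_attach;
}

/* Assumption: the driver clears its deferral state once attached. */
static int mock_attach_dev(struct iommu_domain *domain, struct device *dev)
{
	(void)domain;
	dev->defer_attach = false;
	dev->attach_count++;
	return 0;
}

static const struct iommu_ops mock_ops = {
	.is_attach_deferred = mock_is_attach_deferred,
	.attach_dev = mock_attach_dev,
};

static bool __iommu_is_attach_deferred(struct iommu_domain *domain, struct device *dev)
{
	if (!domain->ops->is_attach_deferred)
		return false;

	return domain->ops->is_attach_deferred(domain, dev);
}

static int __iommu_attach_device_no_defer(struct iommu_domain *domain, struct device *dev)
{
	if (domain->ops->attach_dev == NULL)
		return -ENODEV;

	return domain->ops->attach_dev(domain, dev);
}

/* Normal attach path: report success but do nothing while deferred. */
static int __iommu_attach_device(struct iommu_domain *domain, struct device *dev)
{
	if (__iommu_is_attach_deferred(domain, dev))
		return 0;

	return __iommu_attach_device_no_defer(domain, dev);
}

/* First DMA-domain lookup performs the attach that was postponed. */
static struct iommu_domain *iommu_get_dma_domain(struct device *dev)
{
	struct iommu_domain *domain = dev->iommu_group->default_domain;

	if (__iommu_is_attach_deferred(domain, dev))
		__iommu_attach_device_no_defer(domain, dev);

	return domain;
}
```

With this split the device keeps its old translation until a driver actually asks for the DMA domain; after that first lookup the attach is not repeated.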
amd kdump failure with iommu=nopt
We've seen kdump failures with recent kernels (5.5, 5.6, 5.7-rc1) on amd systems when iommu is enabled in translation mode. In the cases so far there has been mpt3sas involved, but I'm also seeing io page faults for ahci right before mpt3sas has an io page fault: [ 15.156620] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xfff9b300 flags=0x0020] [ 15.166889] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xfff9b320 flags=0x0020] [ 15.177169] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 15.186100] ata4.00: failed to IDENTIFY (device reports invalid type, err_mask=0x0) [ 15.193786] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f730c0 flags=0x0020] [ 15.204059] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f732c0 flags=0x0020] [ 15.214327] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f734c0 flags=0x0020] [ 15.224597] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f736c0 flags=0x0020] [ 15.234867] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f738c0 flags=0x0020] [ 15.245138] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73ac0 flags=0x0020] [ 15.255407] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73cc0 flags=0x0020] [ 15.265677] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73ec0 flags=0x0020] [ 20.599101] ata2.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) [ 20.916172] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 20.922429] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xfff9b300 flags=0x0020] [ 20.932703] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xfff9b320 flags=0x0020] [ 20.943234] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 20.949430] ata4.00: failed to 
IDENTIFY (device reports invalid type, err_mask=0x0) [ 20.957115] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f730c0 flags=0x0020] [ 20.967384] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f732c0 flags=0x0020] [ 20.977654] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f734c0 flags=0x0020] [ 20.987923] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f736c0 flags=0x0020] [ 20.998193] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f738c0 flags=0x0020] [ 21.008464] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73ac0 flags=0x0020] [ 21.018733] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73cc0 flags=0x0020] [ 21.029005] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73ec0 flags=0x0020] [ 26.231097] ata2.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) [ 26.238415] ata2: limiting SATA link speed to 3.0 Gbps [ 26.548169] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 26.564483] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 320) [ 26.571026] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f730c0 flags=0x0020] [ 26.581301] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f732c0 flags=0x0020] [ 26.591568] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f734c0 flags=0x0020] [ 26.601839] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f736c0 flags=0x0020] [ 26.612109] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f738c0 flags=0x0020] [ 26.622377] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73ac0 flags=0x0020] [ 26.632647] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73cc0 flags=0x0020] [ 
26.642917] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0042 address=0xf1f73ec0 flags=0x0020] [ 26.654047] ata2.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) [ 26.743097] xhci_hcd :05:00.3: Error while assigning device slot ID [ 26.749718] xhci_hcd :05:00.3: Max number of devices this xHCI host supports is 64. [ 26.757730] usb usb1-port2: couldn't allocate usb_device [ 26.987555] mpt3sas version 33.100.00.00 loaded [ 26.994668] mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (226256 kB) [ 27.060443] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k [ 27.068469] mpt3sas_cm0: MSI-X vectors supported: 96 [ 27.073444] no of cores: 1, max_msix_vectors: -1 [ 27.078244] mpt3sa
Re: [PATCH v4 0/3] Replace private domain with per-group default domain
On Wed May 06 20, Lu Baolu wrote:
> Some devices are required to use a specific type (identity or dma)
> of default domain when they are used with a vendor iommu. When the
> system level default domain type is different from it, the vendor
> iommu driver has to request a new default domain with either
> iommu_request_dma_domain_for_dev() or iommu_request_dm_for_dev()
> in the add_dev() callback. Unfortunately, these two helpers only
> work when the group hasn't been assigned to any other devices,
> hence, some vendor iommu driver has to use a private domain if
> it fails to request a new default one.
>
> Joerg proposed an on-going proposal which makes the default domain
> framework to support configuring per-group default domain during
> boot process.
>
> https://lkml.org/lkml/2020/4/14/616
> [This has been applied in iommu/next.]
>
> Hence, there is no need to keep the private domain implementation
> in the Intel IOMMU driver. This patch series aims to remove it.
>
> Best regards,
> baolu
>
> Change log:
> v3->v4:
>  - Make the commit message of the first patch more comprehensive.
>
> v2->v3:
>  - Port necessary patches on the top of Joerg's new proposal.
>    https://lkml.org/lkml/2020/4/14/616
>    The per-group default domain proposed previously in this series
>    will be deprecated due to a race concern between domain switching
>    and device driver probing.
>
> v1->v2:
>  - Rename the iommu ops callback to def_domain_type
>
> Lu Baolu (3):
>   iommu/vt-d: Allow 32bit devices to uses DMA domain
>   iommu/vt-d: Allow PCI sub-hierarchy to use DMA domain
>   iommu/vt-d: Apply per-device dma_ops
>
>  drivers/iommu/intel-iommu.c | 396 +++-
>  1 file changed, 26 insertions(+), 370 deletions(-)
>
> --
> 2.17.1

Reviewed-by: Jerry Snitselaar
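As a rough illustration of the def_domain_type idea mentioned in the change log (the vendor driver reports a per-device requirement and the core falls back to the system default when there is none), here is a user-space sketch. The SKETCH_* codes, the requires_* flags, vendor_def_domain_type() and pick_default_domain_type() are all hypothetical names made up for the example; only the callback name def_domain_type comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch-only type codes; 0 is assumed here to mean "no requirement". */
enum {
	SKETCH_DOMAIN_ANY = 0,
	SKETCH_DOMAIN_IDENTITY = 1,
	SKETCH_DOMAIN_DMA = 2,
};

struct device {
	bool requires_identity;	/* hypothetical: must bypass translation */
	bool requires_dma;	/* hypothetical: must be translated */
};

struct iommu_ops {
	int (*def_domain_type)(struct device *dev);
};

/* Hypothetical vendor callback: report what this device needs, if anything. */
static int vendor_def_domain_type(struct device *dev)
{
	if (dev->requires_identity)
		return SKETCH_DOMAIN_IDENTITY;
	if (dev->requires_dma)
		return SKETCH_DOMAIN_DMA;

	return SKETCH_DOMAIN_ANY;
}

/* Core side: a device requirement wins, otherwise use the system default. */
static int pick_default_domain_type(const struct iommu_ops *ops,
				    struct device *dev, int system_default)
{
	int type = SKETCH_DOMAIN_ANY;

	if (ops->def_domain_type)
		type = ops->def_domain_type(dev);

	return type == SKETCH_DOMAIN_ANY ? system_default : type;
}
```

Deciding the type before the default domain is allocated is what makes a later switch, and hence a private per-device domain, unnecessary in this model.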
Re: question about iommu_need_mapping
On Thu Feb 20 20, Jerry Snitselaar wrote:
> On Thu Feb 20 20, Lu Baolu wrote:
>> Hi Jerry,
>>
>> On 2020/2/20 7:55, Jerry Snitselaar wrote:
>>> Is it possible for a device to end up with dev->archdata.iommu == NULL
>>> on iommu_need_mapping in the following instance:
>>>
>>> 1. iommu_group has dma domain for default
>>> 2. device gets private identity domain in intel_iommu_add_device
>>> 3. iommu_need_mapping gets called with that device.
>>> 4. dmar_remove_one_dev_info sets dev->archdata.iommu = NULL via
>>>    unlink_domain_info.
>>> 5. request_default_domain_for_dev exits after checking that
>>>    group->default_domain exists, and group->default_domain->type is dma.
>>> 6. iommu_request_dma_domain_for_dev returns 0 from
>>>    request_default_domain_for_dev and a private dma domain isn't
>>>    created for the device.
>>
>> Yes. It's possible.
>>
>>> The case I was seeing went away with commit 9235cb13d7d1 ("iommu/vt-d:
>>> Allow devices with RMRRs to use identity domain"), because it changed
>>> which domain the group and devices were using, but it seems like it is
>>> still a possibility with the code. Baolu, you mentioned possibly
>>> removing the domain switch. Commit 98b2fffb5e27 ("iommu/vt-d: Handle
>>> 32bit device with identity default domain") makes it sound like the
>>> domain switch is required.
>>
>> It's more "nice to have" than "required" if the iommu driver doesn't
>> disable swiotlb explicitly. The device access of system memory higher
>> than the device's addressing capability could go through the bounced
>> buffer implemented in swiotlb.
>>
>> Best regards,
>> baolu
>
> Hi Baolu,
>
> Would this mean switching to bounce_dma_ops instead?

Never mind. I see that it would go into the dma_direct code.

Regards,
Jerry
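A minimal sketch of the check behind Baolu's remark, assuming a device whose addressing capability is described by a DMA mask: a buffer needs to go through a bounce buffer only when some byte of it lies above what the device can address. The helper name needs_bounce() and its parameters are hypothetical and are not kernel APIs; the real decision is made inside the kernel's dma_direct and swiotlb code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: returns true when a buffer of
 * `size` bytes at physical address `paddr` is not fully reachable by a
 * device whose addressable range is [0, dma_mask].
 */
static bool needs_bounce(uint64_t paddr, uint64_t size, uint64_t dma_mask)
{
    uint64_t last;

    if (size == 0)
        return false;

    /* Address of the last byte the transfer touches. */
    last = paddr + size - 1;

    /* Treat arithmetic wraparound as unreachable. */
    if (last < paddr)
        return true;

    return last > dma_mask;
}
```

With a 32-bit mask (0xffffffff), a buffer that ends exactly at 4 GiB - 1 is reachable directly, while one that crosses or starts above 4 GiB would have to be bounced.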
Re: question about iommu_need_mapping
On Thu Feb 20 20, Lu Baolu wrote:
> Hi Jerry,
>
> On 2020/2/20 7:55, Jerry Snitselaar wrote:
>> Is it possible for a device to end up with dev->archdata.iommu == NULL
>> on iommu_need_mapping in the following instance:
>>
>> 1. iommu_group has dma domain for default
>> 2. device gets private identity domain in intel_iommu_add_device
>> 3. iommu_need_mapping gets called with that device.
>> 4. dmar_remove_one_dev_info sets dev->archdata.iommu = NULL via
>>    unlink_domain_info.
>> 5. request_default_domain_for_dev exits after checking that
>>    group->default_domain exists, and group->default_domain->type is dma.
>> 6. iommu_request_dma_domain_for_dev returns 0 from
>>    request_default_domain_for_dev and a private dma domain isn't
>>    created for the device.
>
> Yes. It's possible.
>
>> The case I was seeing went away with commit 9235cb13d7d1 ("iommu/vt-d:
>> Allow devices with RMRRs to use identity domain"), because it changed
>> which domain the group and devices were using, but it seems like it is
>> still a possibility with the code. Baolu, you mentioned possibly
>> removing the domain switch. Commit 98b2fffb5e27 ("iommu/vt-d: Handle
>> 32bit device with identity default domain") makes it sound like the
>> domain switch is required.
>
> It's more "nice to have" than "required" if the iommu driver doesn't
> disable swiotlb explicitly. The device access of system memory higher
> than the device's addressing capability could go through the bounced
> buffer implemented in swiotlb.
>
> Best regards,
> baolu

Hi Baolu,

Would this mean switching to bounce_dma_ops instead?

Regards,
Jerry
question about iommu_need_mapping
Is it possible for a device to end up with dev->archdata.iommu == NULL
on iommu_need_mapping in the following instance:

1. iommu_group has dma domain for default
2. device gets private identity domain in intel_iommu_add_device
3. iommu_need_mapping gets called with that device.
4. dmar_remove_one_dev_info sets dev->archdata.iommu = NULL via
   unlink_domain_info.
5. request_default_domain_for_dev exits after checking that
   group->default_domain exists, and group->default_domain->type is dma.
6. iommu_request_dma_domain_for_dev returns 0 from
   request_default_domain_for_dev and a private dma domain isn't created
   for the device.

The case I was seeing went away with commit 9235cb13d7d1 ("iommu/vt-d:
Allow devices with RMRRs to use identity domain"), because it changed
which domain the group and devices were using, but it seems like it is
still a possibility with the code. Baolu, you mentioned possibly
removing the domain switch. Commit 98b2fffb5e27 ("iommu/vt-d: Handle
32bit device with identity default domain") makes it sound like the
domain switch is required.

Regards,
Jerry
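The six steps above can be replayed in a small toy model. This is only a sketch of the sequence as described in the message, not kernel code: every toy_* name is hypothetical, the `info` field merely stands in for dev->archdata.iommu, and the request helper models nothing beyond the early exit from step 5.

```c
#include <stddef.h>

enum toy_type { TOY_IDENTITY, TOY_DMA };

struct toy_domain { enum toy_type type; };
struct toy_group  { struct toy_domain *default_domain; };

struct toy_dev {
    struct toy_group  *group;
    struct toy_domain *info;   /* stands in for dev->archdata.iommu */
};

/* Step 4: removing the device info clears the per-device pointer. */
static void toy_remove_dev_info(struct toy_dev *dev)
{
    dev->info = NULL;
}

/*
 * Steps 5 and 6: if the group already has a default domain of the
 * wanted type, report success without attaching the device to anything.
 * Other paths are not modelled and simply return an error.
 */
static int toy_request_default_domain(struct toy_dev *dev, enum toy_type wanted)
{
    struct toy_domain *def = dev->group->default_domain;

    if (def != NULL && def->type == wanted)
        return 0;

    return -1;
}
```

Starting from a group whose default domain is dma and a device holding a private identity domain, calling toy_remove_dev_info() and then toy_request_default_domain(dev, TOY_DMA) returns 0 while dev->info stays NULL, which is the state the question is asking about.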
Re: dmar fault right around domain switch in iommu_need_mapping
On Wed Feb 19 20, Lu Baolu wrote:
> Hi Jerry,
>
> On 2020/2/18 23:45, Jerry Snitselaar wrote:
>> Hi Joerg and Baolu,
>>
>> I'm chasing down one last issue. I'm waiting to hear back from them
>> testing with Joerg's patchset, but I'm guessing this will still pop
>> up. It looks like right around when the domain switch occurs in
>> iommu_need_mapping there are some dmar faults (below is from 5.6-rc1
>> plus earlier fix attempt that moved deferred attach to beginning of
>> iommu_need_mapping):
>>
>> [ 12.546920] DMAR: DRHD: handling fault status reg 2
>> [ 12.546923] DMAR: [DMA Read] Request device [02:00.0] PASID fault addr 791dd000 [fault reason 02] Present bit in context entry is clear
>> [ 12.635193] hpsa :02:00.0: Using iommu dma mapping
>> [ 12.776712] hpsa :02:00.0: DMAR: 32bit DMA uses non-identity mapping
>> [ 14.091219] DMAR: [DMA Read] Request device [07:00.0] PASID fault addr 791dd000 [fault reason 02] Present bit in context entry is clear
>> [ 14.180842] DMAR: DRHD: handling fault status reg 202
>> [ 14.180845] DMAR: [DMA Read] Request device [07:00.0] PASID fault addr 791dd000 [fault reason 02] Present bit in context entry is clear
>> [ 14.268756] DMAR: DRHD: handling fault status reg 302
>> [ 15.542551] hpsa :07:00.0: Using iommu dma mapping
>> [ 15.567256] hpsa :07:00.0: DMAR: 32bit DMA uses non-identity mapping
>>
>> It seems to only happen right then, and then things are fine. Happens
>> during both regular and kdump boot. With the kdump boot the faults
>> are from the hpilo in the logs I'm looking at, so it doesn't seem to
>> be tied to a device, or certain rmrr. The faulting address always
>> seems to be the base address of the rmrr. The dmar tables look sane.
>
> Perhaps like this?
>
> The device was booted with an identity domain (iommu=pt). When loading
> the driver for this device, the iommu driver finds that it's a 32-bit
> device and tries to convert it to a DMA domain. The rmrr is still
> active during the switch, hence you see dma faults during that time
> window.
>
> Best regards,
> baolu

It looks like it doesn't occur with Joerg's patchset.
Re: [PATCH 3/5 v2] iommu/vt-d: Do deferred attachment in iommu_need_mapping()
On Tue Feb 18 20, Joerg Roedel wrote:
> Hi Baolu,
>
> On Tue, Feb 18, 2020 at 10:38:14AM +0800, Lu Baolu wrote:
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 42cdcce1602e..32f43695a22b 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -2541,9 +2541,6 @@ static void do_deferred_attach(struct device *dev)
>>>  static struct dmar_domain *deferred_attach_domain(struct device *dev)
>>>  {
>>> -	if (unlikely(attach_deferred(dev)))
>>> -		do_deferred_attach(dev);
>>> -
>>
>> This should also be moved to the call place of deferred_attach_domain()
>> in bounce_map_single().
>>
>> bounce_map_single() assumes that devices always use DMA domain, so it
>> doesn't call iommu_need_mapping(). We could do_deferred_attach() there
>> manually.
>
> Good point, thanks for your review. Updated patch below.
>
> From 3a5b8a66d288d86ac1fd45092e7d96f842d0cccf Mon Sep 17 00:00:00 2001
> From: Joerg Roedel
> Date: Mon, 17 Feb 2020 17:20:59 +0100
> Subject: [PATCH 3/5] iommu/vt-d: Do deferred attachment in
>  iommu_need_mapping()
>
> The attachment of deferred devices needs to happen before the check
> whether the device is identity mapped or not. Otherwise the check will
> return wrong results and cause warnings or boot failures in kdump
> kernels, like
>
> WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 domain_get_iommu+0x61/0x70
>
> [...]
>
> Call Trace:
>  __intel_map_single+0x55/0x190
>  intel_alloc_coherent+0xac/0x110
>  dmam_alloc_attrs+0x50/0xa0
>  ahci_port_start+0xfb/0x1f0 [libahci]
>  ata_host_start.part.39+0x104/0x1e0 [libata]
>
> With the earlier check the kdump boot succeeds and a crashdump is
> written.
>
> Signed-off-by: Joerg Roedel

Reviewed-by: Jerry Snitselaar
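The ordering argument in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the patch itself: all toy_* names and the three-state domain enum are hypothetical, and the only point shown is that the deferred attachment is resolved before the identity check looks at the device's domain.

```c
#include <stdbool.h>

enum toy_domain_type { TOY_DOMAIN_NONE, TOY_DOMAIN_IDENTITY, TOY_DOMAIN_DMA };

struct toy_device {
    bool attach_deferred;               /* attachment postponed, e.g. in a kdump boot */
    enum toy_domain_type domain;        /* TOY_DOMAIN_NONE until attached */
    enum toy_domain_type group_default; /* domain the device should end up in */
};

static void toy_do_deferred_attach(struct toy_device *dev)
{
    dev->domain = dev->group_default;
    dev->attach_deferred = false;
}

static bool toy_identity_mapping(const struct toy_device *dev)
{
    return dev->domain == TOY_DOMAIN_IDENTITY;
}

/* Returns true when DMA for this device has to go through IOMMU mappings. */
static bool toy_need_mapping(struct toy_device *dev)
{
    /* Attach first, so the identity check sees the real domain. */
    if (dev->attach_deferred)
        toy_do_deferred_attach(dev);

    return !toy_identity_mapping(dev);
}
```

If the attach step were skipped, a deferred device destined for an identity domain would still be in TOY_DOMAIN_NONE when checked, and the toy check would wrongly report that it needs mapping.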
dmar fault right around domain switch in iommu_need_mapping
Hi Joerg and Baolu,

I'm chasing down one last issue. I'm waiting to hear back from them
testing with Joerg's patchset, but I'm guessing this will still pop
up. It looks like right around when the domain switch occurs in
iommu_need_mapping there are some dmar faults (below is from 5.6-rc1
plus earlier fix attempt that moved deferred attach to beginning of
iommu_need_mapping):

[ 12.546920] DMAR: DRHD: handling fault status reg 2
[ 12.546923] DMAR: [DMA Read] Request device [02:00.0] PASID fault addr 791dd000 [fault reason 02] Present bit in context entry is clear
[ 12.635193] hpsa :02:00.0: Using iommu dma mapping
[ 12.776712] hpsa :02:00.0: DMAR: 32bit DMA uses non-identity mapping
[ 14.091219] DMAR: [DMA Read] Request device [07:00.0] PASID fault addr 791dd000 [fault reason 02] Present bit in context entry is clear
[ 14.180842] DMAR: DRHD: handling fault status reg 202
[ 14.180845] DMAR: [DMA Read] Request device [07:00.0] PASID fault addr 791dd000 [fault reason 02] Present bit in context entry is clear
[ 14.268756] DMAR: DRHD: handling fault status reg 302
[ 15.542551] hpsa :07:00.0: Using iommu dma mapping
[ 15.567256] hpsa :07:00.0: DMAR: 32bit DMA uses non-identity mapping

It seems to only happen right then, and then things are fine. Happens
during both regular and kdump boot. With the kdump boot the faults are
from the hpilo in the logs I'm looking at, so it doesn't seem to be
tied to a device, or certain rmrr. The faulting address always seems to
be the base address of the rmrr. The dmar tables look sane.

Regards,
Jerry
Re: [PATCH 5/5] iommu/vt-d: Simplify check in identity_mapping()
On Mon Feb 17 20, Joerg Roedel wrote:
> From: Joerg Roedel
>
> The function only has one call-site and there it is never called with
> dummy or deferred devices. Simplify the check in the function to
> account for that.
>
> Signed-off-by: Joerg Roedel

Reviewed-by: Jerry Snitselaar
Re: [PATCH 3/5] iommu/vt-d: Do deferred attachment in iommu_need_mapping()
On Mon Feb 17 20, Joerg Roedel wrote:
> From: Joerg Roedel
>
> The attachment of deferred devices needs to happen before the check
> whether the device is identity mapped or not. Otherwise the check will
> return wrong results and cause warnings or boot failures in kdump
> kernels, like
>
> WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 domain_get_iommu+0x61/0x70
>
> [...]
>
> Call Trace:
>  __intel_map_single+0x55/0x190
>  intel_alloc_coherent+0xac/0x110
>  dmam_alloc_attrs+0x50/0xa0
>  ahci_port_start+0xfb/0x1f0 [libahci]
>  ata_host_start.part.39+0x104/0x1e0 [libata]
>
> With the earlier check the kdump boot succeeds and a crashdump is
> written.
>
> Signed-off-by: Joerg Roedel

Reviewed-by: Jerry Snitselaar
Re: [PATCH 4/5] iommu/vt-d: Remove deferred_attach_domain()
On Mon Feb 17 20, Joerg Roedel wrote:
> From: Joerg Roedel
>
> The function is now only a wrapper around find_domain(). Remove the
> function and call find_domain() directly at the call-sites.
>
> Signed-off-by: Joerg Roedel

Reviewed-by: Jerry Snitselaar
Re: [PATCH 2/5] iommu/vt-d: Move deferred device attachment into helper function
On Mon Feb 17 20, Joerg Roedel wrote:
> From: Joerg Roedel
>
> Move the code that does the deferred device attachment into a separate
> helper function.
>
> Signed-off-by: Joerg Roedel

Reviewed-by: Jerry Snitselaar
Re: [PATCH 1/5] iommu/vt-d: Add attach_deferred() helper
On Mon Feb 17 20, Joerg Roedel wrote:
> From: Joerg Roedel
>
> Implement a helper function to check whether a device's attach process
> is deferred.
>
> Signed-off-by: Joerg Roedel

Reviewed-by: Jerry Snitselaar
Re: arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1
On Mon Feb 17 20, Robin Murphy wrote:
> On 16/02/2020 10:11 pm, Jerry Snitselaar wrote:
>> On Fri Feb 14 20, Robin Murphy wrote:
>>> Hi Jerry,
>>>
>>> On 2020-02-14 8:13 pm, Jerry Snitselaar wrote:
>>>> Hi Will,
>>>>
>>>> On a gigabyte system with Cavium CN8xx, when doing a fio test against
>>>> an nvme drive we are seeing the following:
>>>>
>>>> [ 637.161194] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010003f6000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.174329] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80136000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.186887] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010002ee000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.199275] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010003c7000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.211885] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801000392000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.224580] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80118000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.237241] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80100036, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.249657] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801ba000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.262120] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8013e000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>> [ 637.274468] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801000304000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>>>
>>> Those "IOVAs" don't look much like IOVAs from the DMA allocator - if
>>> they were physical addresses, would they correspond to an expected
>>> region of the physical memory map?
>>>
>>> I would suspect that this is most likely misbehaviour in the NVMe
>>> driver (issuing a write to a non-DMA-mapped address), and the SMMU is
>>> just doing its job in blocking and reporting it.
>>>
>>>> I also reproduced with 5.5-rc7, and will check 5.6-rc1 later today.
>>>> I couldn't narrow it down further into 5.4-rc1. I don't know smmu or
>>>> the code well, any thoughts on where to start digging into this?
>>>>
>>>> fio test that is being run is:
>>>>
>>>> #fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite -ioengine=libaio -bs=4k -runtime=43200 -size=-group_reporting -name=mytest -numjobs=32
>>>
>>> Just to clarify, do other tests work OK on the same device?
>>>
>>> Thanks,
>>> Robin.
>>
>> I was able to get back on the system today. I think I know what the
>> problem is:
>>
>> [ 0.036189] iommu: Gigabyte R120-T34-00 detected, force iommu passthrough mode
>> [ 6.324282] iommu: Default domain type: Translated
>>
>> So the new default domain code in 5.4 overrides the iommu quirk code
>> setting default passthrough. Testing a quick patch that tracks whether
>> the default domain was set in the quirk code, and leaves it alone if
>> it was. So far it seems to be working.
>
> Ah, OK. Could you point me at that quirk code? I can't seem to track
> it down in mainline, and seeing this much leaves me dubious that it's
> even correct - matching a particular board implies that it's a
> firmware issue (as far as I'm aware the SMMUs in CN88xx SoCs are
> usable in general), but if the firmware description is wrong to the
> point that DMA ops translation doesn't work, then no other translation
> (e.g. VFIO) is likely to work either. In that case it's simply not
> safe to enable the SMMU at all, and fudging the default domain type
> merely hides one symptom of the problem.
>
> Robin.

Ugh. It is a RHEL only patch, but for some reason it is applied to the
ark kernel builds as well. Sorry for the noise.
Re: arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1
On Fri Feb 14 20, Robin Murphy wrote:
> Hi Jerry,
>
> On 2020-02-14 8:13 pm, Jerry Snitselaar wrote:
>> Hi Will,
>>
>> On a gigabyte system with Cavium CN8xx, when doing a fio test against
>> an nvme drive we are seeing the following:
>>
>> [ 637.161194] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010003f6000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.174329] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80136000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.186887] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010002ee000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.199275] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010003c7000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.211885] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801000392000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.224580] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80118000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.237241] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80100036, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.249657] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801ba000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.262120] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8013e000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>> [ 637.274468] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801000304000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
>
> Those "IOVAs" don't look much like IOVAs from the DMA allocator - if
> they were physical addresses, would they correspond to an expected
> region of the physical memory map?
>
> I would suspect that this is most likely misbehaviour in the NVMe
> driver (issuing a write to a non-DMA-mapped address), and the SMMU is
> just doing its job in blocking and reporting it.
>
>> I also reproduced with 5.5-rc7, and will check 5.6-rc1 later today.
>> I couldn't narrow it down further into 5.4-rc1. I don't know smmu or
>> the code well, any thoughts on where to start digging into this?
>>
>> fio test that is being run is:
>>
>> #fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite -ioengine=libaio -bs=4k -runtime=43200 -size=-group_reporting -name=mytest -numjobs=32
>
> Just to clarify, do other tests work OK on the same device?
>
> Thanks,
> Robin.

I was able to get back on the system today. I think I know what the
problem is:

[0.036189] iommu: Gigabyte R120-T34-00 detected, force iommu passthrough mode
[6.324282] iommu: Default domain type: Translated

So the new default domain code in 5.4 overrides the iommu quirk code
setting default passthrough. Testing a quick patch that tracks whether
the default domain was set in the quirk code, and leaves it alone if it
was. So far it seems to be working.
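The "quick patch" mentioned above is not shown in the thread. As a rough illustration of the idea only (remember who set the default domain type, and leave it alone if a quirk did), here is a toy model; every toy_* name is hypothetical and none of this corresponds to actual kernel or distribution code.

```c
#include <stdbool.h>

enum toy_def_domain { TOY_DEF_TRANSLATED, TOY_DEF_PASSTHROUGH };

struct toy_iommu_state {
    enum toy_def_domain def_domain;
    bool set_by_quirk;   /* true once a board quirk has chosen the type */
};

/* Board quirk: force passthrough and record that the quirk made the choice. */
static void toy_quirk_force_passthrough(struct toy_iommu_state *st)
{
    st->def_domain = TOY_DEF_PASSTHROUGH;
    st->set_by_quirk = true;
}

/* Generic default-domain setup: only applies when no quirk has decided. */
static void toy_set_default_domain(struct toy_iommu_state *st,
                                   enum toy_def_domain wanted)
{
    if (st->set_by_quirk)
        return;

    st->def_domain = wanted;
}
```

In this model, running the quirk and then the generic setup with TOY_DEF_TRANSLATED leaves the state at passthrough, which matches the behaviour the two log lines show being lost in 5.4.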
arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1
Hi Will,

On a gigabyte system with Cavium CN8xx, when doing a fio test against
an nvme drive we are seeing the following:

[ 637.161194] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010003f6000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.174329] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80136000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.186887] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010002ee000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.199275] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8010003c7000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.211885] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801000392000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.224580] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80118000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.237241] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x80100036, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.249657] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801ba000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.262120] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x8013e000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[ 637.274468] arm-smmu arm-smmu.1.auto: Unhandled context fault: fsr=0x8402, iova=0x801000304000, fsynr=0x70091, cbfrsynra=0x9000, cb=7

I also reproduced with 5.5-rc7, and will check 5.6-rc1 later today. I
couldn't narrow it down further into 5.4-rc1. I don't know smmu or the
code well, any thoughts on where to start digging into this?

fio test that is being run is:

#fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite -ioengine=libaio -bs=4k -runtime=43200 -size=-group_reporting -name=mytest -numjobs=32

Regards,
Jerry
Re: warning from domain_get_iommu
On Sat Feb 08 20, Lu Baolu wrote: Hi Jerry, On 2020/2/7 17:34, Jerry Snitselaar wrote: On Thu Feb 06 20, Jerry Snitselaar wrote: On Tue Feb 04 20, Jerry Snitselaar wrote: I'm working on getting a system to reproduce this, and verify it also occurs with 5.5, but I have a report of a case where the kdump kernel gives warnings like the following on a hp dl360 gen9: [ 2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 2.832615] ehci-pci: EHCI PCI platform driver [ 2.834190] ehci-pci :00:1a.0: EHCI Host Controller [ 2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1 [ 2.838276] ehci-pci :00:1a.0: debug port 2 [ 2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 [ 2.840671] Modules linked in: [ 2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1 [ 2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 [ 2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 [ 2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6 [ 2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202 [ 2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: [ 2.840671] RDX: fff0 RSI: RDI: 88ec7f1c8000 [ 2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00 [ 2.840671] R10: 0095 R11: c90df928 R12: [ 2.840671] R13: 88ec7f1c8000 R14: 1000 R15: [ 2.840671] FS: () GS:88ec7f60() knlGS: [ 2.840671] CS: 0010 DS: ES: CR0: 80050033 [ 2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0 [ 2.840671] Call Trace: [ 2.840671] __intel_map_single+0x62/0x140 [ 2.840671] intel_alloc_coherent+0xa6/0x130 [ 2.840671] dma_pool_alloc+0xd8/0x1e0 [ 2.840671] e_qh_alloc+0x55/0x130 [ 2.840671] ehci_setup+0x284/0x7b0 [ 2.840671] ehci_pci_setup+0xa3/0x530 [ 2.840671] usb_add_hcd+0x2b6/0x800 [ 2.840671] usb_hcd_pci_probe+0x375/0x460 [ 
2.840671] local_pci_probe+0x41/0x90 [ 2.840671] pci_device_probe+0x105/0x1b0 [ 2.840671] driver_probe_device+0x12d/0x460 [ 2.840671] device_driver_attach+0x50/0x60 [ 2.840671] __driver_attach+0x61/0x130 [ 2.840671] ? device_driver_attach+0x60/0x60 [ 2.840671] bus_for_each_dev+0x77/0xc0 [ 2.840671] ? klist_add_tail+0x3b/0x70 [ 2.840671] bus_add_driver+0x14d/0x1e0 [ 2.840671] ? ehci_hcd_init+0xaa/0xaa [ 2.840671] ? do_early_param+0x91/0x91 [ 2.840671] driver_register+0x6b/0xb0 [ 2.840671] ? ehci_hcd_init+0xaa/0xaa [ 2.840671] do_one_initcall+0x46/0x1c3 [ 2.840671] ? do_early_param+0x91/0x91 [ 2.840671] kernel_init_freeable+0x1af/0x258 [ 2.840671] ? rest_init+0xaa/0xaa [ 2.840671] kernel_init+0xa/0xf9 [ 2.840671] ret_from_fork+0x35/0x40 [ 2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- [ 3.010848] ehci-pci :00:1a.0: Using iommu dma mapping [ 3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping [ 3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported [ 3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000 [ 3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00 [ 3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18 [ 3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 3.035900] usb usb1: Product: EHCI Host Controller [ 3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd [ 3.039691] usb usb1: SerialNumber: :00:1a.0 It looks like the device finishes initializing once it figures out it needs dma mapping instead of the default passthrough. intel_alloc_coherent calls iommu_need_mapping, before it calls __intel_map_single, so I'm not sure why it is tripping over the WARN_ON in domain_get_iommu. one thing I noticed while looking at this is that domain_get_iommu can return NULL. So should there be something like the following in __intel_map_single after the domain_get_iommu call? if (!iommu) goto error; It is possible to deref the null pointer later otherwise. 
Regards, Jerry I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE. Hi Baolu, I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and
Re: warning from domain_get_iommu
On Thu Feb 06 20, Jerry Snitselaar wrote: On Tue Feb 04 20, Jerry Snitselaar wrote: I'm working on getting a system to reproduce this, and verify it also occurs with 5.5, but I have a report of a case where the kdump kernel gives warnings like the following on a hp dl360 gen9: [2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [2.832615] ehci-pci: EHCI PCI platform driver [2.834190] ehci-pci :00:1a.0: EHCI Host Controller [2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1 [2.838276] ehci-pci :00:1a.0: debug port 2 [2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 [2.840671] Modules linked in: [2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1 [2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 [2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 [2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6 [2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202 [2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: [2.840671] RDX: fff0 RSI: RDI: 88ec7f1c8000 [2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00 [2.840671] R10: 0095 R11: c90df928 R12: [2.840671] R13: 88ec7f1c8000 R14: 1000 R15: [2.840671] FS: () GS:88ec7f60() knlGS: [2.840671] CS: 0010 DS: ES: CR0: 80050033 [2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0 [2.840671] Call Trace: [2.840671] __intel_map_single+0x62/0x140 [2.840671] intel_alloc_coherent+0xa6/0x130 [2.840671] dma_pool_alloc+0xd8/0x1e0 [2.840671] e_qh_alloc+0x55/0x130 [2.840671] ehci_setup+0x284/0x7b0 [2.840671] ehci_pci_setup+0xa3/0x530 [2.840671] usb_add_hcd+0x2b6/0x800 [2.840671] usb_hcd_pci_probe+0x375/0x460 [2.840671] local_pci_probe+0x41/0x90 [2.840671] pci_device_probe+0x105/0x1b0 [2.840671] 
driver_probe_device+0x12d/0x460 [2.840671] device_driver_attach+0x50/0x60 [2.840671] __driver_attach+0x61/0x130 [2.840671] ? device_driver_attach+0x60/0x60 [2.840671] bus_for_each_dev+0x77/0xc0 [2.840671] ? klist_add_tail+0x3b/0x70 [2.840671] bus_add_driver+0x14d/0x1e0 [2.840671] ? ehci_hcd_init+0xaa/0xaa [2.840671] ? do_early_param+0x91/0x91 [2.840671] driver_register+0x6b/0xb0 [2.840671] ? ehci_hcd_init+0xaa/0xaa [2.840671] do_one_initcall+0x46/0x1c3 [2.840671] ? do_early_param+0x91/0x91 [2.840671] kernel_init_freeable+0x1af/0x258 [2.840671] ? rest_init+0xaa/0xaa [2.840671] kernel_init+0xa/0xf9 [2.840671] ret_from_fork+0x35/0x40 [2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- [3.010848] ehci-pci :00:1a.0: Using iommu dma mapping [3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping [3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported [3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000 [3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00 [3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18 [3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [3.035900] usb usb1: Product: EHCI Host Controller [3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd [3.039691] usb usb1: SerialNumber: :00:1a.0 It looks like the device finishes initializing once it figures out it needs dma mapping instead of the default passthrough. intel_alloc_coherent calls iommu_need_mapping, before it calls __intel_map_single, so I'm not sure why it is tripping over the WARN_ON in domain_get_iommu. one thing I noticed while looking at this is that domain_get_iommu can return NULL. So should there be something like the following in __intel_map_single after the domain_get_iommu call? if (!iommu) goto error; It is possible to deref the null pointer later otherwise. Regards, Jerry I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE. 
Hi Baolu, I think I understand what is happening here. With the kdump boot translation is pre-enabled, so in intel_iommu_add_device things are getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent calls iommu_need_mapping it returns true, but doesn't do the dma domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then __intel_map_single gets called and it calls deferred_attach_domain, which sets the domain to the group domain, which in this case is the
Re: Seeing some another issue with mixed domains in the same iommu_group
On Thu Feb 06 20, Jerry Snitselaar wrote:
> ...
>
> The above cases seem to be avoided by:
>
> 9235cb13d7d1 | 2020-01-24 | iommu/vt-d: Allow devices with RMRRs to use identity domain (Lu Baolu)
>
> which results in the watchdog device no longer taking a dma domain and
> switching the group default.
>
> Without that patch though when it gets into the iommu_need_mapping code
> for :01:00.4 after the following:
>
>	dmar_remove_one_dev_info(dev);
>	ret = iommu_request_dma_domain_for_dev(dev);
>
> ret is 0 and dev->archdata.iommu is NULL. Even with 9235cb13d7d1
> device_def_domain_type can return dma, but I'm not sure how likely it
> is for there to be an iommu group like that again where the group
> default ends up dma, a device gets removed and added to the identity
> domain, and then ends up in that code in iommu_need_mapping.

Hi Baolu,

Would something along these lines make sense?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..40cc8f5a3ebb 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3614,6 +3614,20 @@ static bool iommu_need_mapping(struct device *dev)
 			}
 			dmar_remove_one_dev_info(dev);
 			get_private_domain_for_dev(dev);
+		} else {
+			if (dev->archdata.iommu == NULL) {
+				struct iommu_domain *domain;
+				struct iommu_group *group;
+				struct dmar_domain *dmar_domain, *tmp;
+
+				group = iommu_group_get_for_dev(dev);
+				domain = iommu_group_default_domain(group);
+				dmar_domain = to_dmar_domain(domain);
+				tmp = set_domain_for_dev(dev, dmar_domain);
+			}
 		}

 		dev_info(dev, "32bit DMA uses non-identity mapping\n");
--

Obviously needs some checks added, but this was just an initial test I
was trying.

Regards,
Jerry
Re: Seeing some another issue with mixed domains in the same iommu_group
On Thu Feb 06 20, Jerry Snitselaar wrote: On Thu Feb 06 20, Jerry Snitselaar wrote: Hi Baolu, I'm seeing another issue with the devices in the HP ilo when the system is booted with intel_iommu=on and iommu=pt (iommu=nopt does not run into problems). first system: 01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support (rev 05) 01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH 01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging (rev 05) 01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard Virtual USB Controller (rev 02) [ 21.208103] pci :01:00.0: Adding to iommu group 24 [ 21.210911] pci :01:00.0: Using iommu dma mapping [ 21.212635] pci :01:00.1: Adding to iommu group 24 [ 21.214326] pci :01:00.1: Device uses a private identity domain. [ 21.216507] pci :01:00.2: Adding to iommu group 24 [ 21.618173] pci :01:00.4: Adding to iommu group 24 [ 21.619839] pci :01:00.4: Device uses a private identity domain. [ 26.206832] uhci_hcd: USB Universal Host Controller Interface driver [ 26.209044] uhci_hcd :01:00.4: UHCI Host Controller [ 26.210897] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 26.213247] uhci_hcd :01:00.4: detected 8 ports [ 26.214810] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 26.217153] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 26.219171] uhci_hcd :01:00.4: 32bit DMA uses non-identity mapping [ 26.221261] uhci_hcd :01:00.4: unable to allocate consistent memory for frame list [ 26.223787] uhci_hcd :01:00.4: startup error -16 [ 26.225381] uhci_hcd :01:00.4: USB bus 3 deregistered [ 26.227378] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 26.229296] uhci_hcd: probe of :01:00.4 failed with error -16 different system with similar issue: 01:00.0 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support [103c:3306] (rev 07) 01:00.1 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200eH3 [102b:0538] (rev 02) (prog-if 00 [VGA controller]) 01:00.2 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging [103c:3307] (rev 07) 01:00.4 USB controller [0c03]: Hewlett-Packard Company iLO5 Virtual USB Controller [103c:22f6] (prog-if 20 [EHCI]) [ 13.695663] pci :01:00.0: Adding to iommu group 10 [ 13.703667] pci :01:00.0: Using iommu dma mapping [ 13.708871] pci :01:00.1: Adding to iommu group 10 [ 13.714033] pci :01:00.1: DMAR: Device uses a private identity domain. [ 13.721033] pci :01:00.2: Adding to iommu group 10 [ 13.726290] pci :01:00.4: Adding to iommu group 10 [ 13.731453] pci :01:00.4: DMAR: Device uses a private identity domain. 
[ 17.157796] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 17.164348] ehci-pci: EHCI PCI platform driver [ 17.170061] ehci-pci :01:00.4: EHCI Host Controller [ 17.175457] ehci-pci :01:00.4: new USB bus registered, assigned bus number 1 [ 17.182912] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 17.189988] ehci-pci :01:00.4: can't setup: -12 [ 17.194884] ehci-pci :01:00.4: USB bus 1 deregistered [ 17.200567] ehci-pci :01:00.4: init :01:00.4 fail, -12 [ 17.206508] ehci-pci: probe of :01:00.4 failed with error -12 I'm looking through the code and trying to debug it, but any thoughts on this? Regards, Jerry In iommu_need_mapping, in a case like the above does something like dmar_insert_one_dev_info need to happen to associate the device back with the group default domain? In intel_iommu_add_device it is going to get removed and added to the identity domain, and then in iommu_need_mapping it gets removed from the identity domain, and iommu_request_dma_domain_for_dev should return 0 because the group default domain at this point is the correct type. The above cases seem to be avoided by: 9235cb13d7d1 | 2020-01-24 | iommu/vt-d: Allow devices with RMRRs to use identity domain (Lu Baolu) which results in the watchdog device no longer taking a dma domain and switching the group default. Without that patch though when it gets into the iommu_need_mapping code for :01:00.4 after the following: dmar_remove_one_dev_info(dev); ret = iommu_request_dma_domain_for_dev(dev); ret is 0 and dev->archdata.iommu is NULL. Even with 9235cb13d7d1 device_def_domain_type can return return dma, but I'm not sure how likely it is for there to be an iommu group like that again where the group default ends up dma, a device gets removed and added to the identity domain, and then e
Re: Seeing some another issue with mixed domains in the same iommu_group
On Thu Feb 06 20, Jerry Snitselaar wrote: Hi Baolu, I'm seeing another issue with the devices in the HP ilo when the system is booted with intel_iommu=on and iommu=pt (iommu=nopt does not run into problems). first system: 01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support (rev 05) 01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH 01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging (rev 05) 01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard Virtual USB Controller (rev 02) [ 21.208103] pci :01:00.0: Adding to iommu group 24 [ 21.210911] pci :01:00.0: Using iommu dma mapping [ 21.212635] pci :01:00.1: Adding to iommu group 24 [ 21.214326] pci :01:00.1: Device uses a private identity domain. [ 21.216507] pci :01:00.2: Adding to iommu group 24 [ 21.618173] pci :01:00.4: Adding to iommu group 24 [ 21.619839] pci :01:00.4: Device uses a private identity domain. [ 26.206832] uhci_hcd: USB Universal Host Controller Interface driver [ 26.209044] uhci_hcd :01:00.4: UHCI Host Controller [ 26.210897] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 26.213247] uhci_hcd :01:00.4: detected 8 ports [ 26.214810] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 26.217153] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 26.219171] uhci_hcd :01:00.4: 32bit DMA uses non-identity mapping [ 26.221261] uhci_hcd :01:00.4: unable to allocate consistent memory for frame list [ 26.223787] uhci_hcd :01:00.4: startup error -16 [ 26.225381] uhci_hcd :01:00.4: USB bus 3 deregistered [ 26.227378] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 26.229296] uhci_hcd: probe of :01:00.4 failed with error -16 different system with similar issue: 01:00.0 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support [103c:3306] (rev 07) 01:00.1 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200eH3 [102b:0538] (rev 02) (prog-if 00 [VGA controller]) 01:00.2 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging [103c:3307] (rev 07) 01:00.4 USB controller [0c03]: Hewlett-Packard Company iLO5 Virtual USB Controller [103c:22f6] (prog-if 20 [EHCI]) [ 13.695663] pci :01:00.0: Adding to iommu group 10 [ 13.703667] pci :01:00.0: Using iommu dma mapping [ 13.708871] pci :01:00.1: Adding to iommu group 10 [ 13.714033] pci :01:00.1: DMAR: Device uses a private identity domain. [ 13.721033] pci :01:00.2: Adding to iommu group 10 [ 13.726290] pci :01:00.4: Adding to iommu group 10 [ 13.731453] pci :01:00.4: DMAR: Device uses a private identity domain. 
[ 17.157796] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 17.164348] ehci-pci: EHCI PCI platform driver [ 17.170061] ehci-pci :01:00.4: EHCI Host Controller [ 17.175457] ehci-pci :01:00.4: new USB bus registered, assigned bus number 1 [ 17.182912] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 17.189988] ehci-pci :01:00.4: can't setup: -12 [ 17.194884] ehci-pci :01:00.4: USB bus 1 deregistered [ 17.200567] ehci-pci :01:00.4: init :01:00.4 fail, -12 [ 17.206508] ehci-pci: probe of :01:00.4 failed with error -12 I'm looking through the code and trying to debug it, but any thoughts on this? Regards, Jerry In iommu_need_mapping, in a case like the above does something like dmar_insert_one_dev_info need to happen to associate the device back with the group default domain? In intel_iommu_add_device it is going to get removed and added to the identity domain, and then in iommu_need_mapping it gets removed from the identity domain, and iommu_request_dma_domain_for_dev should return 0 because the group default domain at this point is the correct type. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Seeing some another issue with mixed domains in the same iommu_group
Hi Baolu, I'm seeing another issue with the devices in the HP ilo when the system is booted with intel_iommu=on and iommu=pt (iommu=nopt does not run into problems). first system: 01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support (rev 05) 01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH 01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging (rev 05) 01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard Virtual USB Controller (rev 02) [ 21.208103] pci :01:00.0: Adding to iommu group 24 [ 21.210911] pci :01:00.0: Using iommu dma mapping [ 21.212635] pci :01:00.1: Adding to iommu group 24 [ 21.214326] pci :01:00.1: Device uses a private identity domain. [ 21.216507] pci :01:00.2: Adding to iommu group 24 [ 21.618173] pci :01:00.4: Adding to iommu group 24 [ 21.619839] pci :01:00.4: Device uses a private identity domain. [ 26.206832] uhci_hcd: USB Universal Host Controller Interface driver [ 26.209044] uhci_hcd :01:00.4: UHCI Host Controller [ 26.210897] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 26.213247] uhci_hcd :01:00.4: detected 8 ports [ 26.214810] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 26.217153] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 26.219171] uhci_hcd :01:00.4: 32bit DMA uses non-identity mapping [ 26.221261] uhci_hcd :01:00.4: unable to allocate consistent memory for frame list [ 26.223787] uhci_hcd :01:00.4: startup error -16 [ 26.225381] uhci_hcd :01:00.4: USB bus 3 deregistered [ 26.227378] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 26.229296] uhci_hcd: probe of :01:00.4 failed with error -16 different system with similar issue: 01:00.0 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support [103c:3306] (rev 07) 01:00.1 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200eH3 [102b:0538] (rev 02) (prog-if 00 [VGA controller]) 01:00.2 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out Standard Management Processor Support and Messaging [103c:3307] (rev 07) 01:00.4 USB controller [0c03]: Hewlett-Packard Company iLO5 Virtual USB Controller [103c:22f6] (prog-if 20 [EHCI]) [ 13.695663] pci :01:00.0: Adding to iommu group 10 [ 13.703667] pci :01:00.0: Using iommu dma mapping [ 13.708871] pci :01:00.1: Adding to iommu group 10 [ 13.714033] pci :01:00.1: DMAR: Device uses a private identity domain. [ 13.721033] pci :01:00.2: Adding to iommu group 10 [ 13.726290] pci :01:00.4: Adding to iommu group 10 [ 13.731453] pci :01:00.4: DMAR: Device uses a private identity domain. 
[ 17.157796] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 17.164348] ehci-pci: EHCI PCI platform driver [ 17.170061] ehci-pci :01:00.4: EHCI Host Controller [ 17.175457] ehci-pci :01:00.4: new USB bus registered, assigned bus number 1 [ 17.182912] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 17.189988] ehci-pci :01:00.4: can't setup: -12 [ 17.194884] ehci-pci :01:00.4: USB bus 1 deregistered [ 17.200567] ehci-pci :01:00.4: init :01:00.4 fail, -12 [ 17.206508] ehci-pci: probe of :01:00.4 failed with error -12 I'm looking through the code and trying to debug it, but any thoughts on this? Regards, Jerry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: warning from domain_get_iommu
On Tue Feb 04 20, Jerry Snitselaar wrote: I'm working on getting a system to reproduce this, and verify it also occurs with 5.5, but I have a report of a case where the kdump kernel gives warnings like the following on a hp dl360 gen9: [2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [2.832615] ehci-pci: EHCI PCI platform driver [2.834190] ehci-pci :00:1a.0: EHCI Host Controller [2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1 [2.838276] ehci-pci :00:1a.0: debug port 2 [2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 [2.840671] Modules linked in: [2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1 [2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 [2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 [2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6 [2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202 [2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: [2.840671] RDX: fff0 RSI: RDI: 88ec7f1c8000 [2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00 [2.840671] R10: 0095 R11: c90df928 R12: [2.840671] R13: 88ec7f1c8000 R14: 1000 R15: [2.840671] FS: () GS:88ec7f60() knlGS: [2.840671] CS: 0010 DS: ES: CR0: 80050033 [2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0 [2.840671] Call Trace: [2.840671] __intel_map_single+0x62/0x140 [2.840671] intel_alloc_coherent+0xa6/0x130 [2.840671] dma_pool_alloc+0xd8/0x1e0 [2.840671] e_qh_alloc+0x55/0x130 [2.840671] ehci_setup+0x284/0x7b0 [2.840671] ehci_pci_setup+0xa3/0x530 [2.840671] usb_add_hcd+0x2b6/0x800 [2.840671] usb_hcd_pci_probe+0x375/0x460 [2.840671] local_pci_probe+0x41/0x90 [2.840671] pci_device_probe+0x105/0x1b0 [2.840671] driver_probe_device+0x12d/0x460 [2.840671] 
device_driver_attach+0x50/0x60 [2.840671] __driver_attach+0x61/0x130 [2.840671] ? device_driver_attach+0x60/0x60 [2.840671] bus_for_each_dev+0x77/0xc0 [2.840671] ? klist_add_tail+0x3b/0x70 [2.840671] bus_add_driver+0x14d/0x1e0 [2.840671] ? ehci_hcd_init+0xaa/0xaa [2.840671] ? do_early_param+0x91/0x91 [2.840671] driver_register+0x6b/0xb0 [2.840671] ? ehci_hcd_init+0xaa/0xaa [2.840671] do_one_initcall+0x46/0x1c3 [2.840671] ? do_early_param+0x91/0x91 [2.840671] kernel_init_freeable+0x1af/0x258 [2.840671] ? rest_init+0xaa/0xaa [2.840671] kernel_init+0xa/0xf9 [2.840671] ret_from_fork+0x35/0x40 [2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- [3.010848] ehci-pci :00:1a.0: Using iommu dma mapping [3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping [3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported [3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000 [3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00 [3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18 [3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [3.035900] usb usb1: Product: EHCI Host Controller [3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd [3.039691] usb usb1: SerialNumber: :00:1a.0 It looks like the device finishes initializing once it figures out it needs dma mapping instead of the default passthrough. intel_alloc_coherent calls iommu_need_mapping, before it calls __intel_map_single, so I'm not sure why it is tripping over the WARN_ON in domain_get_iommu. one thing I noticed while looking at this is that domain_get_iommu can return NULL. So should there be something like the following in __intel_map_single after the domain_get_iommu call? if (!iommu) goto error; It is possible to deref the null pointer later otherwise. Regards, Jerry I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE. 
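To make the NULL-check suggestion for __intel_map_single a bit more concrete, here is a minimal user-space sketch of the pattern. It is not the kernel code: the mock_* types, the iova_base field, and MOCK_MAP_ERROR are stand-ins made up so the example compiles on its own, and the rule that only DMA domains get a non-NULL result is an assumption about why domain_get_iommu warns. The only point is that a caller of a lookup that can return NULL should take the error path before dereferencing the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
enum mock_domain_type { MOCK_DOMAIN_DMA, MOCK_DOMAIN_IDENTITY };

struct mock_iommu {
	uint64_t iova_base;	/* made-up field, only so there is something to dereference */
};

struct mock_domain {
	enum mock_domain_type type;
	struct mock_iommu *iommu;
};

/* Plays the role of a DMA mapping error value. */
#define MOCK_MAP_ERROR UINT64_MAX

/*
 * Models a lookup like domain_get_iommu(). Assumption: non-DMA domains are
 * not expected here, so they get NULL back (the kernel would also WARN).
 */
static struct mock_iommu *mock_domain_get_iommu(const struct mock_domain *domain)
{
	if (domain->type != MOCK_DOMAIN_DMA)
		return NULL;
	return domain->iommu;
}

/* Models the caller: check the lookup result before using it. */
static uint64_t mock_map_single(const struct mock_domain *domain, uint64_t paddr)
{
	struct mock_iommu *iommu;

	assert(domain != NULL);
	iommu = mock_domain_get_iommu(domain);
	if (!iommu)
		return MOCK_MAP_ERROR;	/* the "goto error" path from the mail */

	return iommu->iova_base + paddr;
}
```

With this shape, a device still sitting in an identity domain gets a mapping error back instead of a later NULL dereference. Whether failing the mapping is the right behaviour for the kdump case is a separate question that the sketch does not answer.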
warning from domain_get_iommu
I'm working on getting a system to reproduce this, and verify it also occurs with 5.5, but I have a report of a case where the kdump kernel gives warnings like the following on a hp dl360 gen9: [2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [2.832615] ehci-pci: EHCI PCI platform driver [2.834190] ehci-pci :00:1a.0: EHCI Host Controller [2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1 [2.838276] ehci-pci :00:1a.0: debug port 2 [2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60 [2.840671] Modules linked in: [2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-170.el8.kdump2.x86_64 #1 [2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 [2.840671] RIP: 0010:domain_get_iommu+0x55/0x60 [2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6 [2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202 [2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: [2.840671] RDX: fff0 RSI: RDI: 88ec7f1c8000 [2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00 [2.840671] R10: 0095 R11: c90df928 R12: [2.840671] R13: 88ec7f1c8000 R14: 1000 R15: [2.840671] FS: () GS:88ec7f60() knlGS: [2.840671] CS: 0010 DS: ES: CR0: 80050033 [2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0 [2.840671] Call Trace: [2.840671] __intel_map_single+0x62/0x140 [2.840671] intel_alloc_coherent+0xa6/0x130 [2.840671] dma_pool_alloc+0xd8/0x1e0 [2.840671] e_qh_alloc+0x55/0x130 [2.840671] ehci_setup+0x284/0x7b0 [2.840671] ehci_pci_setup+0xa3/0x530 [2.840671] usb_add_hcd+0x2b6/0x800 [2.840671] usb_hcd_pci_probe+0x375/0x460 [2.840671] local_pci_probe+0x41/0x90 [2.840671] pci_device_probe+0x105/0x1b0 [2.840671] driver_probe_device+0x12d/0x460 [2.840671] device_driver_attach+0x50/0x60 [2.840671] __driver_attach+0x61/0x130 
[2.840671] ? device_driver_attach+0x60/0x60 [2.840671] bus_for_each_dev+0x77/0xc0 [2.840671] ? klist_add_tail+0x3b/0x70 [2.840671] bus_add_driver+0x14d/0x1e0 [2.840671] ? ehci_hcd_init+0xaa/0xaa [2.840671] ? do_early_param+0x91/0x91 [2.840671] driver_register+0x6b/0xb0 [2.840671] ? ehci_hcd_init+0xaa/0xaa [2.840671] do_one_initcall+0x46/0x1c3 [2.840671] ? do_early_param+0x91/0x91 [2.840671] kernel_init_freeable+0x1af/0x258 [2.840671] ? rest_init+0xaa/0xaa [2.840671] kernel_init+0xa/0xf9 [2.840671] ret_from_fork+0x35/0x40 [2.840671] ---[ end trace e87b0d9a1c8135c4 ]--- [3.010848] ehci-pci :00:1a.0: Using iommu dma mapping [3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping [3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported [3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000 [3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00 [3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.18 [3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [3.035900] usb usb1: Product: EHCI Host Controller [3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 ehci_hcd [3.039691] usb usb1: SerialNumber: :00:1a.0 It looks like the device finishes initializing once it figures out it needs dma mapping instead of the default passthrough. intel_alloc_coherent calls iommu_need_mapping, before it calls __intel_map_single, so I'm not sure why it is tripping over the WARN_ON in domain_get_iommu. one thing I noticed while looking at this is that domain_get_iommu can return NULL. So should there be something like the following in __intel_map_single after the domain_get_iommu call? if (!iommu) goto error; It is possible to deref the null pointer later otherwise. Regards, Jerry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/vt-d: call __dmar_remove_one_dev_info with valid pointer
It is possible for archdata.iommu to be set to
DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO so check for
those values before calling __dmar_remove_one_dev_info. Without a
check it can result in a null pointer dereference. This has been seen
while booting a kdump kernel on an HP dl380 gen9.

Cc: Joerg Roedel
Cc: Lu Baolu
Cc: David Woodhouse
Cc: sta...@vger.kernel.org # 5.3+
Cc: linux-ker...@vger.kernel.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1801f0aaf013..932267f49f9a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)

 	spin_lock_irqsave(&device_domain_lock, flags);
 	info = dev->archdata.iommu;
-	if (info)
+	if (info && info != DEFER_DEVICE_DOMAIN_INFO
+	    && info != DUMMY_DEVICE_DOMAIN_INFO)
 		__dmar_remove_one_dev_info(info);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
--
2.24.0
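For anyone wondering why a sentinel shows up as a "NULL pointer dereference", a small user-space sketch may help. It assumes the two sentinels are encoded as -1 and -2 cast to a pointer, which is how I understand the driver defines them; the mock_* names and the counter are made up for illustration. Under that assumption, reading a member at offset 0x50 through the -2 sentinel wraps around to address 0x4e, which would be consistent with the CR2 value in the oops from the RFC posting of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct device_domain_info. */
struct mock_dev_info {
	int unused;
};

/* Assumed sentinel encoding: small negative integers cast to a pointer. */
#define MOCK_DUMMY_INFO ((struct mock_dev_info *)(intptr_t)-1)
#define MOCK_DEFER_INFO ((struct mock_dev_info *)(intptr_t)-2)

/* The check from the patch: non-NULL alone is not enough. */
static bool mock_info_is_real(const struct mock_dev_info *info)
{
	return info && info != MOCK_DEFER_INFO && info != MOCK_DUMMY_INFO;
}

/* Address a load of the member at `offset` would touch, without loading it. */
static uintptr_t mock_member_address(const struct mock_dev_info *info, size_t offset)
{
	return (uintptr_t)info + offset;
}

static int mock_removals;	/* counts how often the inner helper ran */

/* Stands in for __dmar_remove_one_dev_info(); must only see real pointers. */
static void mock_remove_locked(struct mock_dev_info *info)
{
	assert(mock_info_is_real(info));
	mock_removals++;
}

/* Stands in for dmar_remove_one_dev_info() with the patched condition. */
static void mock_remove_one_dev_info(struct mock_dev_info *info)
{
	if (mock_info_is_real(info))
		mock_remove_locked(info);
}
```

The address arithmetic relies on the usual flat, two's-complement pointer representation, so it is a demonstration for typical Linux targets rather than portable C.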
Re: [PATCH v2] iommu/vt-d: Don't reject nvme host due to scope mismatch
On Sun Jan 05 20, jimyan wrote:
> On a system with an Intel PCIe port configured as a nvme host device, iommu
> initialization fails with
>
>     DMAR: Device scope type does not match for :80:00.0
>
> This is because the DMAR table reports this device as having scope 2
> (ACPI_DMAR_SCOPE_TYPE_BRIDGE):
>
> but the device has a type 0 PCI header:
>
> 80:00.0 Class 0600: Device 8086:2020 (rev 06)
> 00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
> 30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00
>
> VT-d works perfectly on this system, so there's no reason to bail out
> on initialization due to this apparent scope mismatch. Add the class
> 0x06 ("PCI_BASE_CLASS_BRIDGE") as a heuristic for allowing DMAR
> initialization for non-bridge PCI devices listed with scope bridge.
>
> Signed-off-by: jimyan

Reviewed-by: Jerry Snitselaar
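As a cross-check of the hex dump quoted in the patch description, here is a small sketch that decodes the first 16 bytes of that config space. The byte offsets follow the standard PCI type 0 header layout as I understand it (vendor at 0x00, device at 0x02, class code at 0x09 to 0x0b, header type at 0x0e); the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* First row of the config space dump quoted in the patch description. */
const uint8_t dump_row0[16] = {
	0x86, 0x80, 0x20, 0x20, 0x47, 0x05, 0x10, 0x00,
	0x06, 0x00, 0x00, 0x06, 0x10, 0x00, 0x00, 0x00,
};

/* Little-endian 16-bit read from the first 16 bytes of config space. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	assert(off <= 14);
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* 24-bit class code: base class, sub-class, programming interface. */
static uint32_t cfg_class24(const uint8_t *cfg)
{
	return ((uint32_t)cfg[0x0b] << 16) | ((uint32_t)cfg[0x0a] << 8) | cfg[0x09];
}

/* Header type with the multi-function bit masked off. */
static uint8_t cfg_hdr_type(const uint8_t *cfg)
{
	return cfg[0x0e] & 0x7f;
}
```

Decoded this way the dump gives vendor 8086, device 2020, header type 0, and a class code whose top byte is 0x06 and whose top 16 bits are 0x0600, which matches the "Class 0600" line and is the value the class-based heuristic keys on.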
Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
On Tue Jan 07 20, Lu Baolu wrote:
> Hi Jerry,
>
> On 1/7/20 1:05 AM, Jerry Snitselaar wrote:
> > On Wed Jan 01 20, Roland Dreier via iommu wrote:
> > > > We saw more devices with the same mismatch quirk. So maintaining them in
> > > > a quirk table will make it more readable and maintainable.
> > >
> > > I guess I disagree about the maintainable part, given that this patch
> > > already regresses Broadwell NTB.
> > >
> > > I'm not even sure what the DMAR table says about NTB on my Skylake
> > > systems, exactly because the existing code means I did not have any
> > > problems. But we might need to add device 201Ch too.
> > >
> > > Maybe we don't need the mismatch check at all? Your patch sets the
> > > quirk if any possibly mismatching device is present in the system, so
> > > we'll ignore any scope mismatch on a system with, say, the 8086:2020
> > > NVMe host in it. So could we just drop the check completely and not
> > > have a quirk to disable the check?
> > >
> > > - R.
> >
> > If the check is removed what happens for cases where there is an actual
> > problem in the dmar table? I just worked an issue with some Intel
> > people where a purley system had an rmrr entry pointing to a bridge as
> > the endpoint device instead of the raid module sitting behind it.
>
> The latest solution was here. https://lkml.org/lkml/2020/1/5/103, does
> this work for you?
>
> Best regards,
> baolu

Hi Baolu,

They resolved it by updating the rmrr entry in the dmar table to add
the extra path needed for it to point at the raid module. Looking
at the code though I imagine without the firmware update they would
still have the problem because IIRC it was a combo of an endpoint
scope type, and a pci bridge header so that first check would fail
as it did before. My worry was if the suggestion is to remove the
check completely, a case like that wouldn't report anything wrong.

Jim's latest patch I think solves the issue for what he was seeing
and the NTB case.
Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
On Wed Jan 01 20, Roland Dreier via iommu wrote:
> > We saw more devices with the same mismatch quirk. So maintaining them in
> > a quirk table will make it more readable and maintainable.
>
> I guess I disagree about the maintainable part, given that this patch
> already regresses Broadwell NTB.
>
> I'm not even sure what the DMAR table says about NTB on my Skylake
> systems, exactly because the existing code means I did not have any
> problems. But we might need to add device 201Ch too.
>
> Maybe we don't need the mismatch check at all? Your patch sets the
> quirk if any possibly mismatching device is present in the system, so
> we'll ignore any scope mismatch on a system with, say, the 8086:2020
> NVMe host in it. So could we just drop the check completely and not
> have a quirk to disable the check?
>
> - R.

If the check is removed what happens for cases where there is an actual
problem in the dmar table? I just worked an issue with some Intel
people where a purley system had an rmrr entry pointing to a bridge as
the endpoint device instead of the raid module sitting behind it.
Re: [PATCH 1/1] iommu/vt-d: Add a quirk flag for scope mismatched devices
On Tue Dec 24 19, Lu Baolu wrote:
> We expect devices with endpoint scope to have normal PCI headers, and
> devices with bridge scope to have bridge PCI headers. However, some PCI
> devices may be listed in the DMAR table with bridge scope, even though
> they have a normal PCI header. Add a quirk flag for those special devices.
>
> Cc: Roland Dreier
> Cc: Jim Yan
> Signed-off-by: Lu Baolu
> ---

Reviewed-by: Jerry Snitselaar
Re: [PATCH] iommu/vt-d: Don't reject nvme host due to scope mismatch
On Fri Dec 20 19, jimyan wrote:
> On a system with an Intel PCIe port configured as a nvme host device, iommu
> initialization fails with
>
>     DMAR: Device scope type does not match for :80:00.0
>
> This is because the DMAR table reports this device as having scope 2
> (ACPI_DMAR_SCOPE_TYPE_BRIDGE):

Isn't that a problem to be fixed in the DMAR table then?

> but the device has a type 0 PCI header:
>
> 80:00.0 Class 0600: Device 8086:2020 (rev 06)
> 00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
> 30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00
>
> VT-d works perfectly on this system, so there's no reason to bail out
> on initialization due to this apparent scope mismatch. Add the class
> 0x600 ("PCI_CLASS_BRIDGE_HOST") as a heuristic for allowing DMAR
> initialization for non-bridge PCI devices listed with scope bridge.
>
> Signed-off-by: jimyan
> ---
>  drivers/iommu/dmar.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index eecd6a421667..9faf2f0e0237 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -244,6 +244,7 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
>  		     info->dev->hdr_type != PCI_HEADER_TYPE_NORMAL) ||
>  		    (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
>  		     (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
> +		      info->dev->class >> 8 != PCI_CLASS_BRIDGE_HOST &&
>  		      info->dev->class >> 8 != PCI_CLASS_BRIDGE_OTHER))) {
>  			pr_warn("Device scope type does not match for %s\n",
>  				pci_name(info->dev));
> --
> 2.11.0
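To see what the patched condition accepts and rejects, here is a user-space sketch of the mismatch test as a plain predicate. The bridge-scope half follows the hunk in the patch; the endpoint-scope half is not fully visible in the diff context, so it is filled in as an assumption from the rule that endpoint scope expects a normal header. The numeric constants are the values I believe the ACPI and PCI headers use, and the names are shortened for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; the mail only confirms that bridge scope is 2. */
enum { SCOPE_ENDPOINT = 1, SCOPE_BRIDGE = 2 };
enum { HDR_NORMAL = 0, HDR_BRIDGE = 1 };
enum { CLASS_BRIDGE_HOST = 0x0600, CLASS_BRIDGE_OTHER = 0x0680 };

/*
 * Returns true when the DMAR scope type and the PCI header disagree.
 * class24 is the 24-bit class code, as in the kernel's dev->class.
 */
static bool scope_mismatch(int entry_type, int hdr_type, uint32_t class24)
{
	uint32_t class16;

	assert(class24 <= 0xffffff);
	class16 = class24 >> 8;

	/* Assumed first half: endpoint scope expects a normal (type 0) header. */
	if (entry_type == SCOPE_ENDPOINT && hdr_type != HDR_NORMAL)
		return true;

	/* Bridge scope on a type 0 header is tolerated only for these classes. */
	if (entry_type == SCOPE_BRIDGE && hdr_type == HDR_NORMAL &&
	    class16 != CLASS_BRIDGE_HOST && class16 != CLASS_BRIDGE_OTHER)
		return true;

	return false;
}
```

With these rules the 8086:2020 device from the report (bridge scope, type 0 header, class 0600) passes, while a table entry that combines endpoint scope with a bridge header is still reported as a mismatch.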
Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
On Tue Dec 17 19, Jerry Snitselaar wrote: On Tue Dec 17 19, Jerry Snitselaar wrote: In addition to checking for a null pointer, verify that info does not have the value DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values __dmar_remove_one_dev_info will panic when trying to access a member of the device_domain_info struct. [1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e [1.464241] PGD 0 P4D 0 [1.464241] Oops: [#1] SMP PTI [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1 [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250 [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $ [1.464241] RSP: :c90dfd10 EFLAGS: 00010082 [1.464241] RAX: 0001 RBX: fffe RCX: [1.464241] RDX: 0001 RSI: 0004 RDI: fffe [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039 [1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20 [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: [1.464241] FS: () GS:88ec7a60() knlGS: [1.464241] CS: 0010 DS: ES: CR0: 80050033 [1.464241] CR2: 004e CR3: 006c7900a001 C 001606b0 [1.464241] Call Trace: [1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40 [1.464241] intel_iommu_add_device+0x124/0x180 [1.464241] ? iommu_probe_device+0x40/0x40 [1.464241] add_iommu_group+0xa/0x20 [1.464241] bus_for_each_dev+0x77/0xc0 [1.464241] ? down_write+0xe/0x40 [1.464241] bus_set_iommu+0x85/0xc0 [1.464241] intel_iommu_init+0x4b4/0x777 [1.464241] ? e820__memblock_setup+0x63/0x63 [1.464241] ? do_early_param+0x91/0x91 [1.464241] pci_iommu_init+0x19/0x45 [1.464241] do_one_initcall+0x46/0x1c3 [1.464241] ? do_early_param+0x91/0x91 [1.464241] kernel_init_freeable+0x1af/0x258 [1.464241] ? 
rest_init+0xaa/0xaa [1.464241] kernel_init+0xa/0x107 [1.464241] ret_from_fork+0x35/0x40 [1.464241] Modules linked in: [1.464241] CR2: 004e [1.464241] ---[ end trace 0927d2ba8b8032b5 ]--- Cc: Joerg Roedel Cc: Lu Baolu Cc: David Woodhouse Cc: sta...@vger.kernel.org # v5.3+ Cc: iommu@lists.linux-foundation.org Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one") Signed-off-by: Jerry Snitselaar --- drivers/iommu/intel-iommu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 0c8d81f56a30..e42a09794fa2 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev) spin_lock_irqsave(&device_domain_lock, flags); info = dev->archdata.iommu; - if (info) + if (info && info != DEFER_DEVICE_DOMAIN_INFO + && info != DUMMY_DEVICE_DOMAIN_INFO) __dmar_remove_one_dev_info(info); spin_unlock_irqrestore(&device_domain_lock, flags); } -- 2.24.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu Nack this. Apparently the issue is just being seen with the kdump kernel. I'm wondering if it is already solved by 6c3a44ed3c55 ("iommu/vt-d: Turn off translations at shutdown"). Testing a 5.5 build now. And a minute later I got a response. The 5.5 kernel hits the original panic when booting into the kdump kernel. 
I need to test with this patch on 5.5, but with a test build of our kernel with this patch the problem just moves to: [3.742317] pci :01:00.0: Using iommu dma mapping [3.744020] pci :01:00.1: Adding to iommu group 86 [3.746697] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0Modules linked in: [3.746697] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-167.el8.iommu6.x86_64 #1 [3.746697] Hardware name: HP ProLiant DL560 Gen9/ProLiant DL560 Gen9, BIOS P85 07/21/2019 [3.746697] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1d0 [3.746697] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c$ [3.746697] RSP: :c90f3bd8 EFLAGS: 0002 [3.746697] RAX: 0101 RBX: 0046 RCX: 7f17 [3.746697] RDX: RSI: RDI:
Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
On Tue Dec 17 19, Jerry Snitselaar wrote: In addition to checking for a null pointer, verify that info does not have the value DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values __dmar_remove_one_dev_info will panic when trying to access a member of the device_domain_info struct. [1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e [1.464241] PGD 0 P4D 0 [1.464241] Oops: [#1] SMP PTI [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1 [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019 [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250 [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $ [1.464241] RSP: :c90dfd10 EFLAGS: 00010082 [1.464241] RAX: 0001 RBX: fffe RCX: [1.464241] RDX: 0001 RSI: 0004 RDI: fffe [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039 [1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20 [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: [1.464241] FS: () GS:88ec7a60() knlGS: [1.464241] CS: 0010 DS: ES: CR0: 80050033 [1.464241] CR2: 004e CR3: 006c7900a001 C 001606b0 [1.464241] Call Trace: [1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40 [1.464241] intel_iommu_add_device+0x124/0x180 [1.464241] ? iommu_probe_device+0x40/0x40 [1.464241] add_iommu_group+0xa/0x20 [1.464241] bus_for_each_dev+0x77/0xc0 [1.464241] ? down_write+0xe/0x40 [1.464241] bus_set_iommu+0x85/0xc0 [1.464241] intel_iommu_init+0x4b4/0x777 [1.464241] ? e820__memblock_setup+0x63/0x63 [1.464241] ? do_early_param+0x91/0x91 [1.464241] pci_iommu_init+0x19/0x45 [1.464241] do_one_initcall+0x46/0x1c3 [1.464241] ? do_early_param+0x91/0x91 [1.464241] kernel_init_freeable+0x1af/0x258 [1.464241] ? 
rest_init+0xaa/0xaa [1.464241] kernel_init+0xa/0x107 [1.464241] ret_from_fork+0x35/0x40 [1.464241] Modules linked in: [1.464241] CR2: 004e [1.464241] ---[ end trace 0927d2ba8b8032b5 ]--- Cc: Joerg Roedel Cc: Lu Baolu Cc: David Woodhouse Cc: sta...@vger.kernel.org # v5.3+ Cc: iommu@lists.linux-foundation.org Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one") Signed-off-by: Jerry Snitselaar --- drivers/iommu/intel-iommu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 0c8d81f56a30..e42a09794fa2 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev) spin_lock_irqsave(&device_domain_lock, flags); info = dev->archdata.iommu; - if (info) + if (info && info != DEFER_DEVICE_DOMAIN_INFO + && info != DUMMY_DEVICE_DOMAIN_INFO) __dmar_remove_one_dev_info(info); spin_unlock_irqrestore(&device_domain_lock, flags); } -- 2.24.0

Nack this. Apparently the issue is just being seen with the kdump kernel. I'm wondering if it is already solved by 6c3a44ed3c55 ("iommu/vt-d: Turn off translations at shutdown"). Testing a 5.5 build now.
Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
On Tue, Dec 17, 2019 at 10:56 AM Jerry Snitselaar wrote: > > In addition to checking for a null pointer, verify that > info does not have the value DEFER_DEVICE_DOMAIN_INFO or > DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values > __dmar_remove_one_dev_info will panic when trying to access > a member of the device_domain_info struct. > > [1.464241] BUG: unable to handle kernel NULL pointer dereference at > 004e > [1.464241] PGD 0 P4D 0 > [1.464241] Oops: [#1] SMP PTI > [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW > - - - 4.18.0-160.el8.x86_64 #1 > [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, > BIOS P89 07/21/2019 > [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250 > [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 > 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb > <4c> 8b 67 50 48 8b 6f 58 $ > [1.464241] RSP: :c90dfd10 EFLAGS: 00010082 > [1.464241] RAX: 0001 RBX: fffe RCX: > > [1.464241] RDX: 0001 RSI: 0004 RDI: > fffe > [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: > 0039 > [1.464241] R10: R11: c90dfa58 R12: > 88ec7a0eec20 > [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: > > [1.464241] FS: () GS:88ec7a60() > knlGS: > [1.464241] CS: 0010 DS: ES: CR0: 80050033 > [1.464241] CR2: 004e CR3: 006c7900a001 C > 001606b0 > [1.464241] Call Trace: > [1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40 > [1.464241] intel_iommu_add_device+0x124/0x180 > [1.464241] ? iommu_probe_device+0x40/0x40 > [1.464241] add_iommu_group+0xa/0x20 > [1.464241] bus_for_each_dev+0x77/0xc0 > [1.464241] ? down_write+0xe/0x40 > [1.464241] bus_set_iommu+0x85/0xc0 > [1.464241] intel_iommu_init+0x4b4/0x777 > [1.464241] ? e820__memblock_setup+0x63/0x63 > [1.464241] ? do_early_param+0x91/0x91 > [1.464241] pci_iommu_init+0x19/0x45 > [1.464241] do_one_initcall+0x46/0x1c3 > [1.464241] ? do_early_param+0x91/0x91 > [1.464241] kernel_init_freeable+0x1af/0x258 > [1.464241] ? 
rest_init+0xaa/0xaa > [1.464241] kernel_init+0xa/0x107 > [1.464241] ret_from_fork+0x35/0x40 > [1.464241] Modules linked in: > [1.464241] CR2: 004e > [1.464241] ---[ end trace 0927d2ba8b8032b5 ]--- > > Cc: Joerg Roedel > Cc: Lu Baolu > Cc: David Woodhouse > Cc: sta...@vger.kernel.org # v5.3+ > Cc: iommu@lists.linux-foundation.org > Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one") > Signed-off-by: Jerry Snitselaar > --- > drivers/iommu/intel-iommu.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 0c8d81f56a30..e42a09794fa2 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev) > > spin_lock_irqsave(&device_domain_lock, flags); > info = dev->archdata.iommu; > - if (info) > + if (info && info != DEFER_DEVICE_DOMAIN_INFO > + && info != DUMMY_DEVICE_DOMAIN_INFO) > __dmar_remove_one_dev_info(info); > spin_unlock_irqrestore(&device_domain_lock, flags); > } > -- > 2.24.0

I'm not positive that the DUMMY_DEVICE_DOMAIN_INFO check is needed. It seemed like there were checks for that most places before dmar_remove_one_dev_info would be called, but I wasn't certain.
[RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info
In addition to checking for a null pointer, verify that
info does not have the value DEFER_DEVICE_DOMAIN_INFO or
DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
__dmar_remove_one_dev_info will panic when trying to access
a member of the device_domain_info struct.

[1.464241] BUG: unable to handle kernel NULL pointer dereference at 004e
[1.464241] PGD 0 P4D 0
[1.464241] Oops: [#1] SMP PTI
[1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW - - - 4.18.0-160.el8.x86_64 #1
[1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 07/21/2019
[1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
[1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 8b 6f 58 $
[1.464241] RSP: :c90dfd10 EFLAGS: 00010082
[1.464241] RAX: 0001 RBX: fffe RCX:
[1.464241] RDX: 0001 RSI: 0004 RDI: fffe
[1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 0039
[1.464241] R10: R11: c90dfa58 R12: 88ec7a0eec20
[1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15:
[1.464241] FS: () GS:88ec7a60() knlGS:
[1.464241] CS: 0010 DS: ES: CR0: 80050033
[1.464241] CR2: 004e CR3: 006c7900a001 CR4: 001606b0
[1.464241] Call Trace:
[1.464241] dmar_remove_one_dev_info.isra.68+0x27/0x40
[1.464241] intel_iommu_add_device+0x124/0x180
[1.464241] ? iommu_probe_device+0x40/0x40
[1.464241] add_iommu_group+0xa/0x20
[1.464241] bus_for_each_dev+0x77/0xc0
[1.464241] ? down_write+0xe/0x40
[1.464241] bus_set_iommu+0x85/0xc0
[1.464241] intel_iommu_init+0x4b4/0x777
[1.464241] ? e820__memblock_setup+0x63/0x63
[1.464241] ? do_early_param+0x91/0x91
[1.464241] pci_iommu_init+0x19/0x45
[1.464241] do_one_initcall+0x46/0x1c3
[1.464241] ? do_early_param+0x91/0x91
[1.464241] kernel_init_freeable+0x1af/0x258
[1.464241] ? rest_init+0xaa/0xaa
[1.464241] kernel_init+0xa/0x107
[1.464241] ret_from_fork+0x35/0x40
[1.464241] Modules linked in:
[1.464241] CR2: 004e
[1.464241] ---[ end trace 0927d2ba8b8032b5 ]---

Cc: Joerg Roedel
Cc: Lu Baolu
Cc: David Woodhouse
Cc: sta...@vger.kernel.org # v5.3+
Cc: iommu@lists.linux-foundation.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..e42a09794fa2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
 
 	spin_lock_irqsave(&device_domain_lock, flags);
 	info = dev->archdata.iommu;
-	if (info)
+	if (info && info != DEFER_DEVICE_DOMAIN_INFO
+	    && info != DUMMY_DEVICE_DOMAIN_INFO)
 		__dmar_remove_one_dev_info(info);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
-- 
2.24.0
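
The guard described in this patch can be sketched in user space. This is only a minimal sketch under stated assumptions, not the kernel code: the struct layout, the helper names info_is_real() and fault_address(), and the sentinel encodings (-1 and -2 cast to pointers, which is how this era's intel-iommu.c is believed to define them) are all illustrative. It also shows why reading a member at offset 0x50 through the -2 sentinel would fault at a small address such as 0x4e, which appears consistent with the CR2 value and the "<4c> 8b 67 50" instruction bytes in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct; only the shape matters. */
struct device_domain_info {
	unsigned char pad[0x50];	/* assumption: the member read first sits at 0x50 */
	void *member;
};

/* Assumed sentinel encodings: small negative integers cast to pointers. */
#define DUMMY_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(intptr_t)-1)
#define DEFER_DEVICE_DOMAIN_INFO ((struct device_domain_info *)(intptr_t)-2)

/* True only when info points at a real object that is safe to dereference. */
static int info_is_real(const struct device_domain_info *info)
{
	return info && info != DEFER_DEVICE_DOMAIN_INFO &&
	       info != DUMMY_DEVICE_DOMAIN_INFO;
}

/* Address the CPU would touch when reading a member at the given offset. */
static uintptr_t fault_address(const struct device_domain_info *info,
			       size_t offset)
{
	return (uintptr_t)info + offset;
}

/* Small self-check that exercises both helpers. */
static void sentinel_demo(void)
{
	struct device_domain_info real = { { 0 }, NULL };

	assert(info_is_real(&real));
	assert(!info_is_real(NULL));
	assert(!info_is_real(DEFER_DEVICE_DOMAIN_INFO));
	assert(!info_is_real(DUMMY_DEVICE_DOMAIN_INFO));
	/* -2 + 0x50 wraps around to 0x4e */
	assert(fault_address(DEFER_DEVICE_DOMAIN_INFO,
			     offsetof(struct device_domain_info, member)) == 0x4e);
}
```

A plain NULL test lets both sentinels through because they are non-zero, which is the gap the two extra comparisons in the patch are meant to close.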
Re: panic in dmar_remove_one_dev_info
On Mon Dec 16 19, Jerry Snitselaar wrote:
> HP is seeing a panic on gen9 dl360 and dl560 while testing these
> other changes we've been working on. I just took an initial look, but
> have to run to a dentist appointment so couldn't dig too deep. It looks
> like the device sets dev->archdata.iommu to DEFER_DEVICE_DOMAIN_INFO in
> intel_iommu_add_device, and then it needs a private domain so
> dmar_remove_one_dev_info gets called. That code path ends up trying to
> use DEFER_DEVICE_DOMAIN_INFO as a pointer. I don't know if there just
> needs to be a check in there to bail out if it sees
> DEFER_DEVICE_DOMAIN_INFO, or if something more is needed. I'll look at
> it some more when I get back home.
>
> Regards,
> Jerry

Hi Baolu,

Does this look sane?

--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
 
 	spin_lock_irqsave(&device_domain_lock, flags);
 	info = dev->archdata.iommu;
-	if (info)
+	if (info && info != DEFER_DEVICE_DOMAIN_INFO
+	    && info != DUMMY_DEVICE_DOMAIN_INFO)
 		__dmar_remove_one_dev_info(info);
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }

Regards,
Jerry
panic in dmar_remove_one_dev_info
HP is seeing a panic on gen9 dl360 and dl560 while testing these other changes we've been working on. I just took an initial look, but have to run to a dentist appointment so couldn't dig too deep. It looks like the device sets dev->archdata.iommu to DEFER_DEVICE_DOMAIN_INFO in intel_iommu_add_device, and then it needs a private domain so dmar_remove_one_dev_info gets called. That code path ends up trying to use DEFER_DEVICE_DOMAIN_INFO as a pointer. I don't know if there just needs to be a check in there to bail out if it sees DEFER_DEVICE_DOMAIN_INFO, or if something more is needed. I'll look at it some more when I get back home.

Regards,
Jerry
Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error
On Thu Dec 12 19, Lu Baolu wrote: Hi, On 12/12/19 9:49 AM, Jerry Snitselaar wrote: On Wed Dec 11 19, Lu Baolu wrote: If the default DMA domain of a group doesn't fit a device, it will still sit in the group but use a private identity domain. When map/unmap/iova_to_phys come through iommu API, the driver should still serve them, otherwise, other devices in the same group will be impacted. Since identity domain has been mapped with the whole available memory space and RMRRs, we don't need to worry about the impact on it. Link: https://www.spinics.net/lists/iommu/msg40416.html Cc: Jerry Snitselaar Reported-by: Jerry Snitselaar Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Cc: sta...@vger.kernel.org # v5.3+ Signed-off-by: Lu Baolu Reviewed-by: Jerry Snitselaar Can you please try this fix and check whether it can fix your problem? If it helps, do you mind adding a Tested-by? Best regards, baolu Tested-by: Jerry Snitselaar --- drivers/iommu/intel-iommu.c | 8 1 file changed, 8 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 0c8d81f56a30..b73bebea9148 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct iommu_domain *domain, int prot = 0; int ret; - if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN) - return -EINVAL; - if (iommu_prot & IOMMU_READ) prot |= DMA_PTE_READ; if (iommu_prot & IOMMU_WRITE) @@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain, /* Cope with horrid API which requires us to unmap more than the size argument if it happens to be a large-page mapping. 
*/ BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level)); - if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN) - return 0; if (size < VTD_PAGE_SIZE << level_to_offset_bits(level)) size = VTD_PAGE_SIZE << level_to_offset_bits(level); @@ -5556,9 +5551,6 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain, int level = 0; u64 phys = 0; - if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN) - return 0; - pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level); if (pte) phys = dma_pte_addr(pte); -- 2.17.1
Re: [PATCH] iommu/vt-d: Set ISA bridge reserved region as relaxable
On Wed Dec 11 19, Alex Williamson wrote:
> Commit d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via
> iommu_get_resv_regions") created a direct-mapped reserved memory region
> in order to replace the static identity mapping of the ISA address
> space, where the latter was then removed in commit df4f3c603aeb
> ("iommu/vt-d: Remove static identity map code"). According to the
> history of this code and the Kconfig option surrounding it, this direct
> mapping exists for the benefit of legacy ISA drivers that are not
> compatible with the DMA API.
>
> In conjunction with commit 9b77e5c79840 ("vfio/type1: check dma map
> request is within a valid iova range") this change introduced a
> regression where the vfio IOMMU backend enforces reserved memory regions
> per IOMMU group, preventing userspace from creating IOMMU mappings
> conflicting with prescribed reserved regions. A necessary prerequisite
> for the vfio change was the introduction of "relaxable" direct mappings
> introduced by commit adfd37382090 ("iommu: Introduce
> IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions"). These relaxable
> direct mappings provide the same identity mapping support in the default
> domain, but also indicate that the reservation is software imposed and
> may be relaxed under some conditions, such as device assignment.
>
> Convert the ISA bridge direct-mapped reserved region to relaxable to
> reflect that the restriction is self imposed and need not be enforced
> by drivers such as vfio.
>
> Fixes: d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions")
> Cc: sta...@vger.kernel.org # v5.3+
> Link: https://lore.kernel.org/linux-iommu/20191211082304.2d4fa...@x1.home
> Reported-by: cprt
> Tested-by: cprt
> Signed-off-by: Alex Williamson

Tested-by: Jerry Snitselaar
Reviewed-by: Jerry Snitselaar

> ---
>  drivers/iommu/intel-iommu.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0c8d81f56a30..6eb0dd7489a1 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5737,7 +5737,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>  
>  		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>  			reg = iommu_alloc_resv_region(0, 1UL << 24, 0,
> -						      IOMMU_RESV_DIRECT);
> +					IOMMU_RESV_DIRECT_RELAXABLE);
>  			if (reg)
>  				list_add_tail(&reg->list, head);
>  		}
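
The distinction the commit message draws between a direct region and a relaxable direct region can be illustrated with a small policy check. This is a simplified sketch under stated assumptions: the enum is a local mirror of the region kinds named in the thread (the numeric values are arbitrary and not taken from the kernel), and blocks_user_mapping() is a hypothetical helper, not a function from vfio or the IOMMU core.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the reserved-region kinds discussed in the thread.
 * The values are arbitrary; only the names follow the kernel's. */
enum resv_type {
	RESV_DIRECT,		/* firmware-mandated identity mapping */
	RESV_DIRECT_RELAXABLE,	/* software-imposed, may be relaxed */
	RESV_MSI,		/* interrupt window */
};

/* Hypothetical policy: a consumer doing device assignment refuses user
 * mappings that overlap a reserved region unless it is relaxable. */
static bool blocks_user_mapping(enum resv_type type)
{
	return type != RESV_DIRECT_RELAXABLE;
}

/* Small self-check of the policy. */
static void resv_demo(void)
{
	assert(blocks_user_mapping(RESV_DIRECT));
	assert(blocks_user_mapping(RESV_MSI));
	assert(!blocks_user_mapping(RESV_DIRECT_RELAXABLE));
}
```

Under this model, switching the ISA bridge region from the first kind to the second leaves the identity mapping in the default domain alone while no longer fencing off the low 16MB from userspace mappings.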
[PATCH] iommu/vt-d: Allocate reserved region for ISA with correct permission
Currently the reserved region for ISA is allocated with no
permissions. If a dma domain is being used, mapping this region will
fail. Set the permissions to DMA_PTE_READ|DMA_PTE_WRITE.

Cc: Joerg Roedel
Cc: Lu Baolu
Cc: iommu@lists.linux-foundation.org
Cc: sta...@vger.kernel.org # v5.3+
Fixes: d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions")
Signed-off-by: Jerry Snitselaar
---
 drivers/iommu/intel-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..998529cebcf2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5736,7 +5736,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
 		struct pci_dev *pdev = to_pci_dev(device);
 
 		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
-			reg = iommu_alloc_resv_region(0, 1UL << 24, 0,
+			reg = iommu_alloc_resv_region(0, 1UL << 24, prot,
 						      IOMMU_RESV_DIRECT);
 			if (reg)
 				list_add_tail(&reg->list, head);
-- 
2.24.0
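
The failure mode this patch addresses can be shown with a stand-alone version of the permission test. This is a minimal sketch under stated assumptions: check_mapping_prot() is a hypothetical helper that models the check the debug output elsewhere in this thread attributes to __domain_mapping, and the bit values given to DMA_PTE_READ and DMA_PTE_WRITE are assumed rather than copied from the kernel headers.

```c
#include <assert.h>
#include <errno.h>

/* Assumed bit values; the kernel defines these in its VT-d headers. */
#define DMA_PTE_READ	1
#define DMA_PTE_WRITE	2

/* Hypothetical model of the check: a mapping request that carries
 * neither read nor write permission is rejected with -EINVAL. */
static int check_mapping_prot(int prot)
{
	if ((prot & (DMA_PTE_READ | DMA_PTE_WRITE)) == 0)
		return -EINVAL;
	return 0;
}

/* Small self-check covering the before and after cases of the patch. */
static void prot_demo(void)
{
	/* Region allocated with prot == 0: the mapping is refused. */
	assert(check_mapping_prot(0) == -EINVAL);
	/* Region allocated with read and write set: the mapping proceeds. */
	assert(check_mapping_prot(DMA_PTE_READ | DMA_PTE_WRITE) == 0);
}
```

On Linux, -EINVAL is -22, which matches the "returned -22" line in the debug log quoted in the related "Fix dmar pte read access not set error" thread.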
Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error
On Thu Dec 12 19, Jerry Snitselaar wrote: On Fri Dec 13 19, Lu Baolu wrote: Hi, On 12/13/19 8:30 AM, Jerry Snitselaar wrote: On Thu Dec 12 19, Lu Baolu wrote: Hi, On 12/12/19 9:49 AM, Jerry Snitselaar wrote: On Wed Dec 11 19, Lu Baolu wrote: If the default DMA domain of a group doesn't fit a device, it will still sit in the group but use a private identity domain. When map/unmap/iova_to_phys come through iommu API, the driver should still serve them, otherwise, other devices in the same group will be impacted. Since identity domain has been mapped with the whole available memory space and RMRRs, we don't need to worry about the impact on it. Link: https://www.spinics.net/lists/iommu/msg40416.html Cc: Jerry Snitselaar Reported-by: Jerry Snitselaar Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Cc: sta...@vger.kernel.org # v5.3+ Signed-off-by: Lu Baolu Reviewed-by: Jerry Snitselaar Can you please try this fix and check whether it can fix your problem? If it helps, do you mind adding a Tested-by? Best regards, baolu I'm testing with this patch, my patch that moves the direct mapping call, and Alex's patch for the ISA bridge. It solved the 2 iommu mapping errors I was seeing with default passthrough, I no longer see all the dmar pte read access errors, and the system boots allowing me to login. I'm tracking down 2 issues at the moment. With passthrough I see a problem with 01:00.4 that I mentioned in the earlier email: [ 78.978573] uhci_hcd: USB Universal Host Controller Interface driver [ 78.980842] uhci_hcd :01:00.4: UHCI Host Controller [ 78.982738] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 78.985222] uhci_hcd :01:00.4: detected 8 ports [ 78.986907] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 78.989316] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 78.994634] uhci_hcd :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 7 :01:00.4: unable to allocate consistent memory for frame list [ 79.499891] uhci_hcd :01:00.4: startup error -16 [ 79.501588] uhci_hcd :01:00.4: USB bus 3 deregistered [ 79.503494] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 79.505497] uhci_hcd: probe of :01:00.4 failed with error -16 If I boot the system with iommu=nopt I see an iommu map failure due to the prot check in __domain_mapping: [ 40.940589] pci :00:1f.0: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 40.943558] pci :00:1f.0: iommu_group_create_direct_mappings: iterating through mappings [ 40.946402] pci :00:1f.0: iommu_group_create_direct_mappings: calling apply_resv_region [ 40.949184] pci :00:1f.0: iommu_group_create_direct_mappings: entry type is direct [ 40.951819] DMAR: intel_iommu_map: enter [ 40.953128] DMAR: __domain_mapping: prot & (DMA_PTE_READ|DMA_PTE_WRITE) == 0 [ 40.955486] DMAR: domain_mapping: __domain_mapping failed [ 40.957348] DMAR: intel_iommu_map: domain_pfn_mapping returned -22 [ 40.959466] DMAR: intel_iommu_map: leave [ 40.959468] iommu: iommu_map: ops->map failed iova 0x0 pa 0x pgsize 0x1000 [ 40.963511] pci :00:1f.0: iommu_group_create_direct_mappings: iommu_map failed [ 40.966026] pci :00:1f.0: iommu_group_create_direct_mappings: leaving func [ 40.968487] pci :00:1f.0: iommu_group_add_device: calling __iommu_attach_device [ 40.971016] pci :00:1f.0: Adding to iommu group 19 [ 40.972731] pci :00:1f.0: DMAR: domain->type is dma /sys/kernel/iommu_groups/19 [root@hp-dl388g8-07 19]# cat reserved_regions 0x 0x00ff direct 0xbdf6e000 0xbdf84fff direct 0xfee0 0xfeef msi 00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC Controller This seems to be another issue? 
Best regards, baolu In intel_iommu_get_resv_regions this iommu_alloc_resv_region is called with prot set to 0: if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { reg = iommu_alloc_resv_region(0, 1UL << 24, 0, IOMMU_RESV_DIRECT_RELAXABLE); if (reg) Looking at the older code for the ISA bridge it looks like it called iommu_prepare_identity_map -> domain_prepare_identity_map -> iommu_domain_identity_map -> and finally __domain_mapping with DMA_PTE_READ|DMA_PTE_WRITE? I wonder if this is an issue with the region starting at 0x0 and this bit in iommu_group_create_mappings: phys_addr = iommu_iova_to_phys(domain, addr); if (phys_addr) continue; Disregard this Off to stick in some more debugging statements. Regards, Jerry ___ iommu mailing list iommu
Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error
On Fri Dec 13 19, Lu Baolu wrote: Hi, On 12/13/19 8:30 AM, Jerry Snitselaar wrote: On Thu Dec 12 19, Lu Baolu wrote: Hi, On 12/12/19 9:49 AM, Jerry Snitselaar wrote: On Wed Dec 11 19, Lu Baolu wrote: If the default DMA domain of a group doesn't fit a device, it will still sit in the group but use a private identity domain. When map/unmap/iova_to_phys come through iommu API, the driver should still serve them, otherwise, other devices in the same group will be impacted. Since identity domain has been mapped with the whole available memory space and RMRRs, we don't need to worry about the impact on it. Link: https://www.spinics.net/lists/iommu/msg40416.html Cc: Jerry Snitselaar Reported-by: Jerry Snitselaar Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Cc: sta...@vger.kernel.org # v5.3+ Signed-off-by: Lu Baolu Reviewed-by: Jerry Snitselaar Can you please try this fix and check whether it can fix your problem? If it helps, do you mind adding a Tested-by? Best regards, baolu I'm testing with this patch, my patch that moves the direct mapping call, and Alex's patch for the ISA bridge. It solved the 2 iommu mapping errors I was seeing with default passthrough, I no longer see all the dmar pte read access errors, and the system boots allowing me to login. I'm tracking down 2 issues at the moment. With passthrough I see a problem with 01:00.4 that I mentioned in the earlier email: [ 78.978573] uhci_hcd: USB Universal Host Controller Interface driver [ 78.980842] uhci_hcd :01:00.4: UHCI Host Controller [ 78.982738] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 78.985222] uhci_hcd :01:00.4: detected 8 ports [ 78.986907] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 78.989316] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 78.994634] uhci_hcd :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 7 :01:00.4: unable to allocate consistent memory for frame list [ 79.499891] uhci_hcd :01:00.4: startup error -16 [ 79.501588] uhci_hcd :01:00.4: USB bus 3 deregistered [ 79.503494] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 79.505497] uhci_hcd: probe of :01:00.4 failed with error -16 If I boot the system with iommu=nopt I see an iommu map failure due to the prot check in __domain_mapping: [ 40.940589] pci :00:1f.0: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 40.943558] pci :00:1f.0: iommu_group_create_direct_mappings: iterating through mappings [ 40.946402] pci :00:1f.0: iommu_group_create_direct_mappings: calling apply_resv_region [ 40.949184] pci :00:1f.0: iommu_group_create_direct_mappings: entry type is direct [ 40.951819] DMAR: intel_iommu_map: enter [ 40.953128] DMAR: __domain_mapping: prot & (DMA_PTE_READ|DMA_PTE_WRITE) == 0 [ 40.955486] DMAR: domain_mapping: __domain_mapping failed [ 40.957348] DMAR: intel_iommu_map: domain_pfn_mapping returned -22 [ 40.959466] DMAR: intel_iommu_map: leave [ 40.959468] iommu: iommu_map: ops->map failed iova 0x0 pa 0x pgsize 0x1000 [ 40.963511] pci :00:1f.0: iommu_group_create_direct_mappings: iommu_map failed [ 40.966026] pci :00:1f.0: iommu_group_create_direct_mappings: leaving func [ 40.968487] pci :00:1f.0: iommu_group_add_device: calling __iommu_attach_device [ 40.971016] pci :00:1f.0: Adding to iommu group 19 [ 40.972731] pci :00:1f.0: DMAR: domain->type is dma /sys/kernel/iommu_groups/19 [root@hp-dl388g8-07 19]# cat reserved_regions 0x 0x00ff direct 0xbdf6e000 0xbdf84fff direct 0xfee0 0xfeef msi 00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC Controller This seems to be another issue? 
Best regards, baolu

In intel_iommu_get_resv_regions this iommu_alloc_resv_region is called with prot set to 0:

if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) { reg = iommu_alloc_resv_region(0, 1UL << 24, 0, IOMMU_RESV_DIRECT_RELAXABLE); if (reg)

I wonder if this is an issue with the region starting at 0x0 and this bit in iommu_group_create_direct_mappings:

phys_addr = iommu_iova_to_phys(domain, addr); if (phys_addr) continue;

Off to stick in some more debugging statements.

Regards,
Jerry
Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error
On Thu Dec 12 19, Lu Baolu wrote: Hi, On 12/12/19 9:49 AM, Jerry Snitselaar wrote: On Wed Dec 11 19, Lu Baolu wrote: If the default DMA domain of a group doesn't fit a device, it will still sit in the group but use a private identity domain. When map/unmap/iova_to_phys come through iommu API, the driver should still serve them, otherwise, other devices in the same group will be impacted. Since identity domain has been mapped with the whole available memory space and RMRRs, we don't need to worry about the impact on it. Link: https://www.spinics.net/lists/iommu/msg40416.html Cc: Jerry Snitselaar Reported-by: Jerry Snitselaar Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Cc: sta...@vger.kernel.org # v5.3+ Signed-off-by: Lu Baolu Reviewed-by: Jerry Snitselaar Can you please try this fix and check whether it can fix your problem? If it helps, do you mind adding a Tested-by? Best regards, baolu I'm testing with this patch, my patch that moves the direct mapping call, and Alex's patch for the ISA bridge. It solved the 2 iommu mapping errors I was seeing with default passthrough, I no longer see all the dmar pte read access errors, and the system boots allowing me to login. I'm tracking down 2 issues at the moment. With passthrough I see a problem with 01:00.4 that I mentioned in the earlier email: [ 78.978573] uhci_hcd: USB Universal Host Controller Interface driver [ 78.980842] uhci_hcd :01:00.4: UHCI Host Controller [ 78.982738] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 78.985222] uhci_hcd :01:00.4: detected 8 ports [ 78.986907] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 78.989316] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 78.994634] uhci_hcd :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 7 :01:00.4: unable to allocate consistent memory for frame list [ 79.499891] uhci_hcd :01:00.4: startup error -16 [ 79.501588] uhci_hcd :01:00.4: USB bus 3 deregistered [ 79.503494] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 79.505497] uhci_hcd: probe of :01:00.4 failed with error -16 If I boot the system with iommu=nopt I see an iommu map failure due to the prot check in __domain_mapping: [ 40.940589] pci :00:1f.0: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 40.943558] pci :00:1f.0: iommu_group_create_direct_mappings: iterating through mappings [ 40.946402] pci :00:1f.0: iommu_group_create_direct_mappings: calling apply_resv_region [ 40.949184] pci :00:1f.0: iommu_group_create_direct_mappings: entry type is direct [ 40.951819] DMAR: intel_iommu_map: enter [ 40.953128] DMAR: __domain_mapping: prot & (DMA_PTE_READ|DMA_PTE_WRITE) == 0 [ 40.955486] DMAR: domain_mapping: __domain_mapping failed [ 40.957348] DMAR: intel_iommu_map: domain_pfn_mapping returned -22 [ 40.959466] DMAR: intel_iommu_map: leave [ 40.959468] iommu: iommu_map: ops->map failed iova 0x0 pa 0x pgsize 0x1000 [ 40.963511] pci :00:1f.0: iommu_group_create_direct_mappings: iommu_map failed [ 40.966026] pci :00:1f.0: iommu_group_create_direct_mappings: leaving func [ 40.968487] pci :00:1f.0: iommu_group_add_device: calling __iommu_attach_device [ 40.971016] pci :00:1f.0: Adding to iommu group 19 [ 40.972731] pci :00:1f.0: DMAR: domain->type is dma /sys/kernel/iommu_groups/19 [root@hp-dl388g8-07 19]# cat reserved_regions 0x 0x00ff direct 0xbdf6e000 0xbdf84fff direct 0xfee0 0xfeef msi 00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC Controller --- drivers/iommu/intel-iommu.c | 8 1 file changed, 8 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 0c8d81f56a30..b73bebea9148 
100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct iommu_domain *domain, int prot = 0; int ret; - if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN) - return -EINVAL; - if (iommu_prot & IOMMU_READ) prot |= DMA_PTE_READ; if (iommu_prot & IOMMU_WRITE) @@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain, /* Cope with horrid API which requires us to unmap more than the size argument if it happens to be a large-page mapping. */ BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level)); - if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN) - return 0; if (size < VTD_PAGE_SIZE << level_to_offset_bits(level)) size = VTD_PAGE_SIZE << level_to_offset_bits(level); @@ -5556,9 +5551,6 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain, int level = 0;
Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error
On Wed Dec 11 19, Lu Baolu wrote:
> If the default DMA domain of a group doesn't fit a device, it
> will still sit in the group but use a private identity domain.
> When map/unmap/iova_to_phys come through iommu API, the driver
> should still serve them, otherwise, other devices in the same
> group will be impacted. Since identity domain has been mapped
> with the whole available memory space and RMRRs, we don't need
> to worry about the impact on it.
>
> Link: https://www.spinics.net/lists/iommu/msg40416.html
> Cc: Jerry Snitselaar
> Reported-by: Jerry Snitselaar
> Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private")
> Cc: sta...@vger.kernel.org # v5.3+
> Signed-off-by: Lu Baolu

Reviewed-by: Jerry Snitselaar

> ---
>  drivers/iommu/intel-iommu.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0c8d81f56a30..b73bebea9148 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  	int prot = 0;
>  	int ret;
>  
> -	if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> -		return -EINVAL;
> -
>  	if (iommu_prot & IOMMU_READ)
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
> @@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain,
>  	/* Cope with horrid API which requires us to unmap more than the
>  	   size argument if it happens to be a large-page mapping. */
>  	BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level));
> -	if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> -		return 0;
>  
>  	if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
>  		size = VTD_PAGE_SIZE << level_to_offset_bits(level);
> @@ -5556,9 +5551,6 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	int level = 0;
>  	u64 phys = 0;
>  
> -	if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> -		return 0;
> -
>  	pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level);
>  	if (pte)
>  		phys = dma_pte_addr(pte);
> -- 
> 2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
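To make the effect of dropping the DOMAIN_FLAG_LOSE_CHILDREN early return easier to see, here is a minimal user-space sketch of the map path. It is not the driver code: struct map_domain, model_map() and the keep_flag_check parameter are hypothetical names for illustration, and the numeric values of the flag and protection bits are placeholders. The final prot check is modeled on the "__domain_mapping: prot & (DMA_PTE_READ|DMA_PTE_WRITE) == 0" debug message and the -22 return seen in the debug output in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values for the sketch; not taken from kernel headers. */
#define IOMMU_READ                (1 << 0)
#define IOMMU_WRITE               (1 << 1)
#define DMA_PTE_READ              (1 << 0)
#define DMA_PTE_WRITE             (1 << 1)
#define DOMAIN_FLAG_LOSE_CHILDREN (1 << 1)

/* Hypothetical stand-in for the driver's domain structure. */
struct map_domain {
	int flags;      /* DOMAIN_FLAG_* bits */
	int nr_mapped;  /* number of successful map calls */
};

/*
 * Model of the map callback. keep_flag_check = 1 models the code before
 * the patch, keep_flag_check = 0 models the code after it.
 */
static int model_map(struct map_domain *d, int iommu_prot, int keep_flag_check)
{
	int prot = 0;

	assert(d != NULL);

	/* Early return that the patch removes. */
	if (keep_flag_check && (d->flags & DOMAIN_FLAG_LOSE_CHILDREN))
		return -EINVAL;

	if (iommu_prot & IOMMU_READ)
		prot |= DMA_PTE_READ;
	if (iommu_prot & IOMMU_WRITE)
		prot |= DMA_PTE_WRITE;

	/* Assumed behaviour: a mapping with no access bits is rejected. */
	if ((prot & (DMA_PTE_READ | DMA_PTE_WRITE)) == 0)
		return -EINVAL;

	d->nr_mapped++;
	return 0;
}
```

With the flag set, model_map(&d, IOMMU_READ | IOMMU_WRITE, 1) fails before any protection bits are looked at, while the same call with keep_flag_check = 0 goes through. A request with no read or write permission is rejected in both cases.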
Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error
On Wed Dec 11 19, Lu Baolu wrote:
> If the default DMA domain of a group doesn't fit a device, it
> will still sit in the group but use a private identity domain.
> When map/unmap/iova_to_phys come through iommu API, the driver
> should still serve them, otherwise, other devices in the same
> group will be impacted. Since identity domain has been mapped
> with the whole available memory space and RMRRs, we don't need
> to worry about the impact on it.

Does this pose any potential issues with the reverse case where the
group has a default identity domain, and the first device fits that,
but a later device in the group needs dma and gets a private dma
domain?

> Link: https://www.spinics.net/lists/iommu/msg40416.html
> Cc: Jerry Snitselaar
> Reported-by: Jerry Snitselaar
> Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private")
> Cc: sta...@vger.kernel.org # v5.3+
> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/intel-iommu.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0c8d81f56a30..b73bebea9148 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  	int prot = 0;
>  	int ret;
>  
> -	if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> -		return -EINVAL;
> -
>  	if (iommu_prot & IOMMU_READ)
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
> @@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain,
>  	/* Cope with horrid API which requires us to unmap more than the
>  	   size argument if it happens to be a large-page mapping. */
>  	BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level));
> -	if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> -		return 0;
>  
>  	if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
>  		size = VTD_PAGE_SIZE << level_to_offset_bits(level);
> @@ -5556,9 +5551,6 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	int level = 0;
>  	u64 phys = 0;
>  
> -	if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
> -		return 0;
> -
>  	pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level);
>  	if (pte)
>  		phys = dma_pte_addr(pte);
> -- 
> 2.17.1
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Tue Dec 10 19, Lu Baolu wrote: Hi, On 12/10/19 1:18 PM, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: [snip] A call to iommu_map is failing. [ 36.686881] pci :01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating through mappings [ 36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region [ 36.695526] pci :01:00.2: e_direct_mappings: entry type is direct [ 37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map failed [ 37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving func [ 37.206385] pci :01:00.2: iommu_group_add_device: calling __iommu_attach_device [ 37.208950] pci :01:00.2: Adding to iommu group 25 [ 37.210660] pci :01:00.2: DMAR: domain->type is dma It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check at the beginning of intel_iommu_map. I will verify, but it looks like that is getting set when intel_iommu_add_device is called for 01:00.1. request_default_domain_for_dev for 01:00.1 will return -EBUSY because iommu_group_device_count(group) != 1. Okay, I will send you a fix patch later. Thanks! Best regards, baolu One issue I see is: [ 38.869182] uhci_hcd :01:00.4: UHCI Host Controller [ 39.371173] uhci_hcd :01:00.4: new USB bus registered, assigned bus number 3 [ 39.373708] uhci_hcd :01:00.4: detected 8 ports [ 39.375333] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports [ 39.377820] uhci_hcd :01:00.4: irq 16, io base 0x3c00 [ 39.379921] uhci_hcd :01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 39.382269] uhci_hcd :01:00.4: unable to allocate consistent memory for frame list [ 39.384920] uhci_hcd :01:00.4: startup error -16 [ 39.386619] uhci_hcd :01:00.4: USB bus 3 deregistered [ 39.388640] uhci_hcd :01:00.4: init :01:00.4 fail, -16 [ 39.390616] uhci_hcd: probe of :01:00.4 failed with error -16 I'm not sure if this is related to the flag and what is allowed now by the api. I need to go look at the code to see what it is doing. I'll try debugging it tonight. Regards, Jerry Also fails for 01:00.4: [ 37.212448] pci :01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating through mappings [ 37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region [ 37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable [ 37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map failed [ 37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving func [ 37.231648] pci :01:00.4: iommu_group_add_device: calling __iommu_attach_device [ 37.234194] pci :01:00.4: Adding to iommu group 25 [ 37.236192] pci :01:00.4: DMAR: domain->type is dma [ 37.237958] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain [ 37.241061] pci :01:00.4: don't change mappings of existing d37.489870] pci :01:00.4: DMAR: Device uses a private identity domain. 
There is an RMRR for 0xbddde000-0xbdddefff:

[63Ah 1594 2] Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596 2] Length : 0036
[63Eh 1598 2] Reserved :
[640h 1600 2] PCI Segment Number :
[642h 1602 8] Base Address : BDDDE000
[64Ah 1610 8] End Address (limit) : BDDDEFFF

[652h 1618 1] Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619 1] Entry Length : 0A
[654h 1620 2] Reserved :
[656h 1622 1] Enumeration ID : 00
[657h 1623 1] PCI Bus Number : 00
[658h 1624 2] PCI Path : 1C,07
[65Ah 1626 2] PCI Path : 00,00

[65Ch 1628 1] Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629 1] Entry Length : 0A
[65Eh 1630 2] Reserved :
[660h 1632 1] Enumeration ID : 00
[661h 1633 1] PCI Bus Number : 00
[662h 1634 2] PCI Path : 1C,07
[664h 1636 2] PCI Path : 00,02

[666h 1638 1] Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639 1] Entry Length : 0A
[668h 1640 2] Reserved :
[66Ah 1642 1] Enumeration ID : 00
[66Bh 1643 1]
[PATCH] iommu: set group default domain before creating direct mappings
iommu_group_create_direct_mappings uses group->default_domain, but
right after it is called, request_default_domain_for_dev calls
iommu_domain_free for the default domain, and sets the group default
domain to a different domain. Move the
iommu_group_create_direct_mappings call to after the group default
domain is set, so the direct mappings get associated with that domain.

Cc: Joerg Roedel
Cc: Lu Baolu
Cc: iommu@lists.linux-foundation.org
Cc: sta...@vger.kernel.org
Fixes: 7423e01741dd ("iommu: Add API to request DMA domain for device")
Signed-off-by: Jerry Snitselaar
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..fa908179b80b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2282,13 +2282,13 @@ request_default_domain_for_dev(struct device *dev, unsigned long type)
 		goto out;
 	}
 
-	iommu_group_create_direct_mappings(group, dev);
-
 	/* Make the domain the default for this group */
 	if (group->default_domain)
 		iommu_domain_free(group->default_domain);
 	group->default_domain = domain;
 
+	iommu_group_create_direct_mappings(group, dev);
+
 	dev_info(dev, "Using iommu %s mapping\n",
 		 type == IOMMU_DOMAIN_DMA ? "dma" : "direct");
 
-- 
2.24.0
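The ordering problem this patch describes can be shown with a small user-space model. This is only a sketch under stated assumptions: the grp_* types and functions are hypothetical stand-ins rather than kernel APIs, the map_after_switch parameter exists only to compare the two orderings, and the early exit for non-DMA domains follows the (!domain || domain->type != IOMMU_DOMAIN_DMA) check quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

enum grp_domain_type { GRP_IDENTITY, GRP_DMA };

/* Hypothetical stand-in for an iommu domain: counts direct mappings. */
struct grp_domain {
	enum grp_domain_type type;
	int direct_maps;
};

/* Hypothetical stand-in for an iommu group. */
struct grp_group {
	struct grp_domain *default_domain;
};

/* Direct mappings always go into whatever the group default is right now. */
static void grp_create_direct_mappings(struct grp_group *g)
{
	struct grp_domain *d = g->default_domain;

	/* Models the early exit: only DMA domains get direct mappings. */
	if (!d || d->type != GRP_DMA)
		return;
	d->direct_maps++;
}

/*
 * map_after_switch = 0 models the old ordering (map, then switch),
 * map_after_switch = 1 models the ordering proposed in the patch.
 */
static void grp_request_default_domain(struct grp_group *g,
				       struct grp_domain *new_domain,
				       int map_after_switch)
{
	assert(g != NULL && new_domain != NULL);

	if (!map_after_switch)
		grp_create_direct_mappings(g);

	/* The kernel also frees the old default domain at this point. */
	g->default_domain = new_domain;

	if (map_after_switch)
		grp_create_direct_mappings(g);
}
```

With an identity default domain in place and a DMA domain being requested, the old ordering leaves the new DMA domain with no direct mappings at all, whereas the new ordering maps the reserved regions into the domain the group actually ends up using.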
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Tue Dec 10 19, Lu Baolu wrote: Hi, On 12/10/19 2:16 PM, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: [snip] A call to iommu_map is failing. [ 36.686881] pci :01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating through mappings [ 36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region [ 36.695526] pci :01:00.2: e_direct_mappings: entry type is direct [ 37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map failed [ 37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving func [ 37.206385] pci :01:00.2: iommu_group_add_device: calling __iommu_attach_device [ 37.208950] pci :01:00.2: Adding to iommu group 25 [ 37.210660] pci :01:00.2: DMAR: domain->type is dma It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check at the beginning of intel_iommu_map. I will verify, but it looks like that is getting set when intel_iommu_add_device is called for 01:00.1. request_default_domain_for_dev for 01:00.1 will return -EBUSY because iommu_group_device_count(group) != 1. Also I see 01:00.0 and others that are the first in a group exiting iommu_group_create_direct_mappings at the (!domain || domain->type != IOMMU_DOMAIN_DMA) check. In request_default_domain_for_dev default_domain doesn't getting set until after that call. Should the iommu_group_create_direct_mappings call be moved below where group->default_domain gets set? Doing this the system boots, and I don't get any dmar pte read errors. I still see the map failing because of the DOMAIN_FLAG_LOSE_CHILDREN in those cases mentioned above, but it no longer is spitting out tons of dmar pte read errors. You can post a patch if you think this is worth of. 
Best regards, baolu I will send a patch tomorrow. In the case where you have default passthrough enabled, if the default domain type for the first device in a group is dma the call will fail, because iommu_group_create_direct_mappings uses group->default_domain and that will have an identity type until group->default_domain gets set right after the iommu_group_create_direct_mappings call. Regards, Jerry Also fails for 01:00.4: [ 37.212448] pci :01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating through mappings [ 37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region [ 37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable [ 37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map failed [ 37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving func [ 37.231648] pci :01:00.4: iommu_group_add_device: calling __iommu_attach_device [ 37.234194] pci :01:00.4: Adding to iommu group 25 [ 37.236192] pci :01:00.4: DMAR: domain->type is dma [ 37.237958] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain [ 37.241061] pci :01:00.4: don't change mappings of existing d37.489870] pci :01:00.4: DMAR: Device uses a private identity domain. 
There is an RMRR for 0xbddde000-0xbdddefff:

[63Ah 1594 2] Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596 2] Length : 0036
[63Eh 1598 2] Reserved :
[640h 1600 2] PCI Segment Number :
[642h 1602 8] Base Address : BDDDE000
[64Ah 1610 8] End Address (limit) : BDDDEFFF

[652h 1618 1] Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619 1] Entry Length : 0A
[654h 1620 2] Reserved :
[656h 1622 1] Enumeration ID : 00
[657h 1623 1] PCI Bus Number : 00
[658h 1624 2] PCI Path : 1C,07
[65Ah 1626 2] PCI Path : 00,00

[65Ch 1628 1] Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629 1] Entry Length : 0A
[65Eh 1630 2] Reserved :
[660h 1632 1] Enumeration ID : 00
[661h 1633 1] PCI Bus Number : 00
[662h 1634 2] PCI Path : 1C,07
[664h 1636 2] PCI Path : 00,02

[666h 1638 1] Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639 1] Entry Length : 0
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Mon Dec 09 19, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: [snip] A call to iommu_map is failing. [ 36.686881] pci :01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating through mappings [ 36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region [ 36.695526] pci :01:00.2: e_direct_mappings: entry type is direct [ 37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map failed [ 37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving func [ 37.206385] pci :01:00.2: iommu_group_add_device: calling __iommu_attach_device [ 37.208950] pci :01:00.2: Adding to iommu group 25 [ 37.210660] pci :01:00.2: DMAR: domain->type is dma It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check at the beginning of intel_iommu_map. I will verify, but it looks like that is getting set when intel_iommu_add_device is called for 01:00.1. request_default_domain_for_dev for 01:00.1 will return -EBUSY because iommu_group_device_count(group) != 1. Also I see 01:00.0 and others that are the first in a group exiting iommu_group_create_direct_mappings at the (!domain || domain->type != IOMMU_DOMAIN_DMA) check. In request_default_domain_for_dev default_domain doesn't getting set until after that call. Should the iommu_group_create_direct_mappings call be moved below where group->default_domain gets set? Doing this the system boots, and I don't get any dmar pte read errors. I still see the map failing because of the DOMAIN_FLAG_LOSE_CHILDREN in those cases mentioned above, but it no longer is spitting out tons of dmar pte read errors. 
Also fails for 01:00.4:

[ 37.212448] pci :01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings
[ 37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating through mappings
[ 37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region
[ 37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable
[ 37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000
[ 37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map failed
[ 37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving func
[ 37.231648] pci :01:00.4: iommu_group_add_device: calling __iommu_attach_device
[ 37.234194] pci :01:00.4: Adding to iommu group 25
[ 37.236192] pci :01:00.4: DMAR: domain->type is dma
[ 37.237958] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain
[ 37.241061] pci :01:00.4: don't change mappings of existing devices.
[ 37.489870] pci :01:00.4: DMAR: Device uses a private identity domain.
There is an RMRR for 0xbddde000-0xbdddefff:

[63Ah 1594 2] Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596 2] Length : 0036
[63Eh 1598 2] Reserved :
[640h 1600 2] PCI Segment Number :
[642h 1602 8] Base Address : BDDDE000
[64Ah 1610 8] End Address (limit) : BDDDEFFF

[652h 1618 1] Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619 1] Entry Length : 0A
[654h 1620 2] Reserved :
[656h 1622 1] Enumeration ID : 00
[657h 1623 1] PCI Bus Number : 00
[658h 1624 2] PCI Path : 1C,07
[65Ah 1626 2] PCI Path : 00,00

[65Ch 1628 1] Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629 1] Entry Length : 0A
[65Eh 1630 2] Reserved :
[660h 1632 1] Enumeration ID : 00
[661h 1633 1] PCI Bus Number : 00
[662h 1634 2] PCI Path : 1C,07
[664h 1636 2] PCI Path : 00,02

[666h 1638 1] Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639 1] Entry Length : 0A
[668h 1640 2] Reserved :
[66Ah 1642 1] Enumeration ID : 00
[66Bh 1643 1] PCI Bus Number : 00
[66Ch 1644 2] PCI Path : 1C,07
[66Eh 1646 2] PCI Path : 00,04
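As a sanity check on the numbers in this RMRR entry, the following sketch computes how many 4 KiB pages the region spans and whether the failing iova from the log falls inside it. The rmrr_* names are hypothetical helpers written for this example, not kernel functions. The base and limit are the Base Address and End Address (limit) fields from the table, with the limit treated as inclusive, and the page size is the pgsize 0x1000 printed in the iommu_map failure message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RMRR_PAGE_SIZE 0x1000ULL /* pgsize from the iommu_map log line */

/* Hypothetical holder for one RMRR range; end is an inclusive limit. */
struct rmrr_range {
	uint64_t base;
	uint64_t end;
};

/* Number of pages needed to identity-map the whole range. */
static uint64_t rmrr_page_count(const struct rmrr_range *r)
{
	assert(r != NULL && r->end >= r->base);
	return (r->end - r->base + 1 + RMRR_PAGE_SIZE - 1) / RMRR_PAGE_SIZE;
}

/* Returns 1 if addr lies inside the range, 0 otherwise. */
static int rmrr_contains(const struct rmrr_range *r, uint64_t addr)
{
	assert(r != NULL);
	return addr >= r->base && addr <= r->end;
}
```

For base 0xBDDDE000 and limit 0xBDDDEFFF this gives a single page, which is consistent with the one failing iommu_map call at iova 0xbddde000 for each device listed in the device scope.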
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Mon Dec 09 19, Jerry Snitselaar wrote: On Mon Dec 09 19, Jerry Snitselaar wrote: [snip] A call to iommu_map is failing. [ 36.686881] pci :01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating through mappings [ 36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region [ 36.695526] pci :01:00.2: e_direct_mappings: entry type is direct [ 37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map failed [ 37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving func [ 37.206385] pci :01:00.2: iommu_group_add_device: calling __iommu_attach_device [ 37.208950] pci :01:00.2: Adding to iommu group 25 [ 37.210660] pci :01:00.2: DMAR: domain->type is dma It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check at the beginning of intel_iommu_map. I will verify, but it looks like that is getting set when intel_iommu_add_device is called for 01:00.1. request_default_domain_for_dev for 01:00.1 will return -EBUSY because iommu_group_device_count(group) != 1. Also I see 01:00.0 and others that are the first in a group exiting iommu_group_create_direct_mappings at the (!domain || domain->type != IOMMU_DOMAIN_DMA) check. In request_default_domain_for_dev default_domain doesn't getting set until after that call. Should the iommu_group_create_direct_mappings call be moved below where group->default_domain gets set? 
Also fails for 01:00.4:

[ 37.212448] pci :01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings
[ 37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating through mappings
[ 37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region
[ 37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable
[ 37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000
[ 37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map failed
[ 37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving func
[ 37.231648] pci :01:00.4: iommu_group_add_device: calling __iommu_attach_device
[ 37.234194] pci :01:00.4: Adding to iommu group 25
[ 37.236192] pci :01:00.4: DMAR: domain->type is dma
[ 37.237958] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain
[ 37.241061] pci :01:00.4: don't change mappings of existing devices.
[ 37.489870] pci :01:00.4: DMAR: Device uses a private identity domain.
There is an RMRR for 0xbddde000-0xbdddefff:

[63Ah 1594 2] Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596 2] Length : 0036
[63Eh 1598 2] Reserved :
[640h 1600 2] PCI Segment Number :
[642h 1602 8] Base Address : BDDDE000
[64Ah 1610 8] End Address (limit) : BDDDEFFF

[652h 1618 1] Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619 1] Entry Length : 0A
[654h 1620 2] Reserved :
[656h 1622 1] Enumeration ID : 00
[657h 1623 1] PCI Bus Number : 00
[658h 1624 2] PCI Path : 1C,07
[65Ah 1626 2] PCI Path : 00,00

[65Ch 1628 1] Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629 1] Entry Length : 0A
[65Eh 1630 2] Reserved :
[660h 1632 1] Enumeration ID : 00
[661h 1633 1] PCI Bus Number : 00
[662h 1634 2] PCI Path : 1C,07
[664h 1636 2] PCI Path : 00,02

[666h 1638 1] Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639 1] Entry Length : 0A
[668h 1640 2] Reserved :
[66Ah 1642 1] Enumeration ID : 00
[66Bh 1643 1] PCI Bus Number : 00
[66Ch 1644 2] PCI Path : 1C,07
[66Eh 1646 2] PCI Path : 00,04
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Mon Dec 09 19, Jerry Snitselaar wrote: [snip] A call to iommu_map is failing. [ 36.686881] pci :01:00.2: iommu_group_add_device: calling iommu_group_create_direct_mappings [ 36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating through mappings [ 36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling apply_resv_region [ 36.695526] pci :01:00.2: e_direct_mappings: entry type is direct [ 37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000 [ 37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map failed [ 37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving func [ 37.206385] pci :01:00.2: iommu_group_add_device: calling __iommu_attach_device [ 37.208950] pci :01:00.2: Adding to iommu group 25 [ 37.210660] pci :01:00.2: DMAR: domain->type is dma It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check at the beginning of intel_iommu_map. I will verify, but it looks like that is getting set when intel_iommu_add_device is called for 01:00.1. request_default_domain_for_dev for 01:00.1 will return -EBUSY because iommu_group_device_count(group) != 1. 
Also fails for 01:00.4:

[ 37.212448] pci :01:00.4: iommu_group_add_device: calling iommu_group_create_direct_mappings
[ 37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating through mappings
[ 37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling apply_resv_region
[ 37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type is direct-relaxable
[ 37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 0xbddde000 pgsize 0x1000
[ 37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map failed
[ 37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving func
[ 37.231648] pci :01:00.4: iommu_group_add_device: calling __iommu_attach_device
[ 37.234194] pci :01:00.4: Adding to iommu group 25
[ 37.236192] pci :01:00.4: DMAR: domain->type is dma
[ 37.237958] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain
[ 37.241061] pci :01:00.4: don't change mappings of existing devices.
[ 37.489870] pci :01:00.4: DMAR: Device uses a private identity domain.
There is an RMRR for 0xbddde000-0xbdddefff:

[63Ah 1594 2] Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596 2] Length : 0036
[63Eh 1598 2] Reserved :
[640h 1600 2] PCI Segment Number :
[642h 1602 8] Base Address : BDDDE000
[64Ah 1610 8] End Address (limit) : BDDDEFFF

[652h 1618 1] Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619 1] Entry Length : 0A
[654h 1620 2] Reserved :
[656h 1622 1] Enumeration ID : 00
[657h 1623 1] PCI Bus Number : 00
[658h 1624 2] PCI Path : 1C,07
[65Ah 1626 2] PCI Path : 00,00

[65Ch 1628 1] Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629 1] Entry Length : 0A
[65Eh 1630 2] Reserved :
[660h 1632 1] Enumeration ID : 00
[661h 1633 1] PCI Bus Number : 00
[662h 1634 2] PCI Path : 1C,07
[664h 1636 2] PCI Path : 00,02

[666h 1638 1] Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639 1] Entry Length : 0A
[668h 1640 2] Reserved :
[66Ah 1642 1] Enumeration ID : 00
[66Bh 1643 1] PCI Bus Number : 00
[66Ch 1644 2] PCI Path : 1C,07
[66Eh 1646 2] PCI Path : 00,04
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Tue Dec 10 19, Lu Baolu wrote: Hi, On 12/10/19 8:52 AM, Jerry Snitselaar wrote: On Sun Dec 08 19, Lu Baolu wrote: Hi, On 12/7/19 10:41 AM, Jerry Snitselaar wrote: On Fri Dec 06 19, Jerry Snitselaar wrote: On Sat Dec 07 19, Lu Baolu wrote: Hi Jerry, On 12/6/19 3:24 PM, Jerry Snitselaar wrote: On Fri Dec 06 19, Lu Baolu wrote: [snip] Can you please try below change? Let's check whether the afending address has been mapped for device 01.00.2. $ git diff diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index db7bfd4f2d20..d9daf66be849 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -663,6 +663,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group, ret = iommu_map(domain, addr, addr, pg_size, entry->prot); if (ret) goto out; + + dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx] for group %d\n", addr, addr + pg_size, group->id); } } I am doubting that device 01.00.2 is not in the device scope of [ 4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 0x00bdf7efff By the way, does device 01.00.2 works well after binding the driver? When I boot it with passthrough it doesn't get to a point where I can login. I think the serial console on these systems is tied to the ilo, so the conserver connection could be making things worse. Unfortunately the system is remote. I should have more time now to focus on debugging this. Attaching console output for the above patch. It seems that device 01.00.2 isn't in the scope of RMRR [base: 0x00bdf6f000 end: 0x00bdf7efff]. But it still tries to access the address within it, hence faults generated. You can check it with ACPI/DMAR table. Best regards, baolu I believe it is the 3rd endpoint device entry in dmar data below. So question about request_default_domain_for_dev. Since a dma mapping is already done for 1.00.0, and that sets the default_domain for the group (I think), won't it bail out for 1.00.2 at this check? 
if (group->default_domain && group->default_domain->type == type) goto out; Or I guess request_default_domain_for_dev wouldn't even be called for 1.00.2. intel_iommu_add_device it wouldn't even call one of the request functions with 1.00.2 since domain->type would be dma from 1.00.0, and device_def_domain_type should return dma. Can you please add some debug messages and check what really happens here? Best regards, baolu [ 25.000544] pci :01:00.0: Adding to iommu group 25 [ 25.502243] pci :01:00.0: DMAR: domain->type is identity << intel_iommu_add_device (alloced in iommu_group_get_for_dev) [ 25.504239] pci :01:00.0: DMAR: device default domain type is dma. requesting dma domain << intel_iommu_add_device [ 25.507954] pci :01:00.0: Using iommu dma mapping << request_default_domain_for_dev (now default domain for group is dma) [ 25.509765] pci :01:00.1: Adding to iommu group 25 [ 25.511514] pci :01:00.1: DMAR: domain->type is dma << intel_iommu_add_device [ 25.513263] pci :01:00.1: DMAR: device default domain type is identity. requesting identity domain << intel_iommu_add_device [ 25.516435] pci :01:00.1: don't change mappings of existing devices. << request_default_domain_for_dev [ 25.518669] pci :01:00.1: DMAR: Device uses a private identity domain. << intel_iommu_add_device [ 25.521061] pci :01:00.2: Adding to iommu group 25 [ 25.522791] pci :01:00.2: DMAR: domain->type is dma << intel_iommu_add_device [ 25.524706] pci :01:00.4: Adding to iommu group 25 [ 25.526458] pci :01:00.4: DMAR: domain->type is dma << intel_iommu_add_device [ 25.528213] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain << intel_iommu_add_device [ 25.531284] pci :01:00.4: don't change mappings of existing devices. << request_default_domain_for_dev [ 25.533500] pci :01:00.4: DMAR: Device uses a private identity domain. 
<< intel_iommu_add_device So the domain type is dma after 01:00.0 gets added, and when intel_iommu_add_device is called for 01:00.2 it will go into the if section. Since the device default domain type for 01:00.2 is dma nothing happens in there, and it goes on to 01:00.4. Is the "private identity domain" message really accurate since everyone will use si_domain? Adding some more debugging. The facts that we have seen: 1) 01.00.2 uses the default domain in group 25. The domain type of this default domain is DMA. 2) iommu_group_create_direct_mappings() *should* be called when adding 01.00.2 into group 25. As the result, RMRR for this device *should* be identity mapped. 3) By checkin
Re: dmar pte read access not set error messages on hp dl388 gen8 systems
On Sun Dec 08 19, Lu Baolu wrote: Hi, On 12/7/19 10:41 AM, Jerry Snitselaar wrote: On Fri Dec 06 19, Jerry Snitselaar wrote: On Sat Dec 07 19, Lu Baolu wrote: Hi Jerry, On 12/6/19 3:24 PM, Jerry Snitselaar wrote: On Fri Dec 06 19, Lu Baolu wrote: [snip] Can you please try below change? Let's check whether the afending address has been mapped for device 01.00.2. $ git diff diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index db7bfd4f2d20..d9daf66be849 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -663,6 +663,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group, ret = iommu_map(domain, addr, addr, pg_size, entry->prot); if (ret) goto out; + + dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx] for group %d\n", addr, addr + pg_size, group->id); } } I am doubting that device 01.00.2 is not in the device scope of [ 4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 0x00bdf7efff By the way, does device 01.00.2 works well after binding the driver? When I boot it with passthrough it doesn't get to a point where I can login. I think the serial console on these systems is tied to the ilo, so the conserver connection could be making things worse. Unfortunately the system is remote. I should have more time now to focus on debugging this. Attaching console output for the above patch. It seems that device 01.00.2 isn't in the scope of RMRR [base: 0x00bdf6f000 end: 0x00bdf7efff]. But it still tries to access the address within it, hence faults generated. You can check it with ACPI/DMAR table. Best regards, baolu I believe it is the 3rd endpoint device entry in dmar data below. So question about request_default_domain_for_dev. Since a dma mapping is already done for 1.00.0, and that sets the default_domain for the group (I think), won't it bail out for 1.00.2 at this check? if (group->default_domain && group->default_domain->type == type) goto out; Or I guess request_default_domain_for_dev wouldn't even be called for 1.00.2. 
In intel_iommu_add_device it wouldn't even call one of the request
functions with 1.00.2 since domain->type would be dma from 1.00.0, and
device_def_domain_type should return dma.

Can you please add some debug messages and check what really happens
here?

Best regards,
baolu

[   25.000544] pci :01:00.0: Adding to iommu group 25
[   25.502243] pci :01:00.0: DMAR: domain->type is identity  << intel_iommu_add_device (alloced in iommu_group_get_for_dev)
[   25.504239] pci :01:00.0: DMAR: device default domain type is dma. requesting dma domain  << intel_iommu_add_device
[   25.507954] pci :01:00.0: Using iommu dma mapping  << request_default_domain_for_dev (now default domain for group is dma)
[   25.509765] pci :01:00.1: Adding to iommu group 25
[   25.511514] pci :01:00.1: DMAR: domain->type is dma  << intel_iommu_add_device
[   25.513263] pci :01:00.1: DMAR: device default domain type is identity. requesting identity domain  << intel_iommu_add_device
[   25.516435] pci :01:00.1: don't change mappings of existing devices.  << request_default_domain_for_dev
[   25.518669] pci :01:00.1: DMAR: Device uses a private identity domain.  << intel_iommu_add_device
[   25.521061] pci :01:00.2: Adding to iommu group 25
[   25.522791] pci :01:00.2: DMAR: domain->type is dma  << intel_iommu_add_device
[   25.524706] pci :01:00.4: Adding to iommu group 25
[   25.526458] pci :01:00.4: DMAR: domain->type is dma  << intel_iommu_add_device
[   25.528213] pci :01:00.4: DMAR: device default domain type is identity. requesting identity domain  << intel_iommu_add_device
[   25.531284] pci :01:00.4: don't change mappings of existing devices.  << request_default_domain_for_dev
[   25.533500] pci :01:00.4: DMAR: Device uses a private identity domain.  << intel_iommu_add_device

So the domain type is dma after 01:00.0 gets added, and when
intel_iommu_add_device is called for 01:00.2 it will go into the if
section. Since the device default domain type for 01:00.2 is dma,
nothing happens in there, and it goes on to 01:00.4.
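
To make the sequence in the log easier to follow, here is a toy model of
the decision flow as I read it from the debug output. This is only a
sketch and not the kernel code: toy_group, toy_add_device,
toy_request_default_domain and the enums are made-up names, and the rule
that a request only succeeds while the device is alone in its group is an
assumption inferred from the "don't change mappings of existing devices"
message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's types. */
enum dom_type { DOM_IDENTITY, DOM_DMA };

enum add_result {
	ADD_KEPT_GROUP_DEFAULT,    /* device default matches group default */
	ADD_CHANGED_GROUP_DEFAULT, /* request succeeded, group default switched */
	ADD_PRIVATE_DOMAIN         /* request refused, device gets a private domain */
};

struct toy_group {
	enum dom_type default_type; /* current default domain type of the group */
	int ndevs;                  /* devices added to the group so far */
};

/*
 * Assumed rule (inferred from the log, not copied from the kernel):
 * the group's default domain type may only change while the requesting
 * device is the only one in the group.
 */
int toy_request_default_domain(struct toy_group *g, enum dom_type want)
{
	if (g->default_type == want)
		return 0;   /* already the requested type */
	if (g->ndevs != 1)
		return -1;  /* "don't change mappings of existing devices" */
	g->default_type = want;
	return 0;
}

/* Models what the log shows happening for each "Adding to iommu group". */
enum add_result toy_add_device(struct toy_group *g, enum dom_type dev_default)
{
	assert(g != NULL);
	g->ndevs++;

	/* Same type as the group default: no request is made (the 01:00.2 case). */
	if (g->default_type == dev_default)
		return ADD_KEPT_GROUP_DEFAULT;

	/* Mismatch: ask for the device's default type. */
	if (toy_request_default_domain(g, dev_default) == 0)
		return ADD_CHANGED_GROUP_DEFAULT;

	/* Request refused: the device falls back to a private domain. */
	return ADD_PRIVATE_DOMAIN;
}
```

Starting from a group whose default type is identity and adding devices
with defaults dma, identity, dma, identity (01:00.0, .1, .2, .4 in the
log) gives: group default changed to dma, private domain, nothing done,
private domain. That matches the messages above, including the absence
of any request for 01:00.2.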

Is the "private identity domain" message really accurate since everyone
will use si_domain? Adding some more debugging.

Regards,
Jerry