Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Mon, Aug 19, 2019 at 1:16 AM Roger Pau Monné wrote: > > On Sun, Aug 18, 2019 at 10:00:17PM -0700, Roman Shaposhnik wrote: > > Hi Roger! > > > > Some good news, some bad news ;-) > > > > Good news is that on the newer BIOS, your original patch seems to work fine. > > > > IOW, with newer BIOS: > > 1. without your original patch I see garbled screen > > 2. with your original patch everything boots normally. > > That would be my expectation. > > > Bad news is that with older BIOS, your original patch seems to either > > work or not depending on the BIOS some of the BIOS settings. > > > > IOW, with older BIOS: > > 1. without your original patch I see garbled screen (regardless of > > BIOS settings) > > 2. with your original patch AND resetting to a factory defaults of > > BIOS settings -- everything boots normally > > 3. when I load up our custom settings -- the only thing that can > > boot normally is the latest patch > > > > So, would it make sense and commit your original patch for now? This > > will unlock me with newer BIOSes on these boxes. > > I think so, can you please provide a tested-by to the patch: > > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html Done. > Feel free to also note the weird behaviour you are seeing with this > box and firmware version. > > > As for the older BIOSes, I still feel that it would be great for Xen > > to boot more reliably -- especially since Xen 4.10 seems to be doing > > fine regardless of BIOS version and settings. > > > > What do you think? > > We could add a quirk for this fimrware version and hardware if we know > exactly what the issue is. Have you figured out if it's related to the > flush? ie: iommu_iotlb_flush vs iommu_iotlb_flush_all vs iommu_flush_all? I am actually at Open Source Summit in San Diego right now -- away from my lab. I'll try to capture as much data on Fri as possible when I come back. Please stay tuned. Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Sun, Aug 18, 2019 at 10:00:17PM -0700, Roman Shaposhnik wrote: > Hi Roger! > > Some good news, some bad news ;-) > > Good news is that on the newer BIOS, your original patch seems to work fine. > > IOW, with newer BIOS: > 1. without your original patch I see garbled screen > 2. with your original patch everything boots normally. That would be my expectation. > Bad news is that with older BIOS, your original patch seems to either > work or not depending on the BIOS some of the BIOS settings. > > IOW, with older BIOS: > 1. without your original patch I see garbled screen (regardless of > BIOS settings) > 2. with your original patch AND resetting to a factory defaults of > BIOS settings -- everything boots normally > 3. when I load up our custom settings -- the only thing that can > boot normally is the latest patch > > So, would it make sense and commit your original patch for now? This > will unlock me with newer BIOSes on these boxes. I think so, can you please provide a tested-by to the patch: https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html Feel free to also note the weird behaviour you are seeing with this box and firmware version. > As for the older BIOSes, I still feel that it would be great for Xen > to boot more reliably -- especially since Xen 4.10 seems to be doing > fine regardless of BIOS version and settings. > > What do you think? We could add a quirk for this fimrware version and hardware if we know exactly what the issue is. Have you figured out if it's related to the flush? ie: iommu_iotlb_flush vs iommu_iotlb_flush_all vs iommu_flush_all? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Hi Roger! Some good news, some bad news ;-) Good news is that on the newer BIOS, your original patch seems to work fine. IOW, with newer BIOS: 1. without your original patch I see garbled screen 2. with your original patch everything boots normally. Bad news is that with older BIOS, your original patch seems to either work or not depending on the BIOS some of the BIOS settings. IOW, with older BIOS: 1. without your original patch I see garbled screen (regardless of BIOS settings) 2. with your original patch AND resetting to a factory defaults of BIOS settings -- everything boots normally 3. when I load up our custom settings -- the only thing that can boot normally is the latest patch So, would it make sense and commit your original patch for now? This will unlock me with newer BIOSes on these boxes. As for the older BIOSes, I still feel that it would be great for Xen to boot more reliably -- especially since Xen 4.10 seems to be doing fine regardless of BIOS version and settings. What do you think? Btw, just to be clear -- when I say your original patch here's what I mean: diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..3605614aaf 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } Thanks, Roman. On Wed, Aug 14, 2019 at 1:06 AM Roger Pau Monné wrote: > > On Tue, Aug 13, 2019 at 12:24:32PM -0700, Roman Shaposhnik wrote: > > Hi Roger, > > > > sorry for the delay -- I hope you will understand that I actually had > > a good reason. See below: > > No problem, just wanted to make sure this doesn't fell through the > cracks. > > > On Mon, Aug 12, 2019 at 1:57 AM Roger Pau Monné > > wrote: > > > > > > Ping? > > > > > > I know I've posted this quite recently, but have you been able to test > > > the two proposed patches? > > > > > > ie: the one here and: > > > > > > https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html > > > > > > I would like to figure out exactly what's going on and fix this > > > properly. > > > > Turns out this may actually be related to the BIOS version on these > > boxes. I have > > one with the BIOS build from 06/07/2018 and the other one is from 2017. So > > with > > 2 of your proposed patches -- we now have a 2x2 test matrix. The awful > > part seems > > to be that the behavior looks to be slightly different. > > > > I need an extra day to summarize it all and I'll follow up quickly. It > > just so far I lost > > time trying to figure out while the same build would behave > > differently on different > > boxes only to find out that the BIOS is different (and the really > > awful part is that > > they both report as version 5.0.1.1 -- it is only when you look at the > > timestamp > > of the build -- that's where you see them being different :-( ). > > Oh, that's awful. I hope you can get something clear out of this. > > Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Aug 13, 2019 at 12:24:32PM -0700, Roman Shaposhnik wrote: > Hi Roger, > > sorry for the delay -- I hope you will understand that I actually had > a good reason. See below: No problem, just wanted to make sure this doesn't fell through the cracks. > On Mon, Aug 12, 2019 at 1:57 AM Roger Pau Monné wrote: > > > > Ping? > > > > I know I've posted this quite recently, but have you been able to test > > the two proposed patches? > > > > ie: the one here and: > > > > https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html > > > > I would like to figure out exactly what's going on and fix this > > properly. > > Turns out this may actually be related to the BIOS version on these > boxes. I have > one with the BIOS build from 06/07/2018 and the other one is from 2017. So > with > 2 of your proposed patches -- we now have a 2x2 test matrix. The awful > part seems > to be that the behavior looks to be slightly different. > > I need an extra day to summarize it all and I'll follow up quickly. It > just so far I lost > time trying to figure out while the same build would behave > differently on different > boxes only to find out that the BIOS is different (and the really > awful part is that > they both report as version 5.0.1.1 -- it is only when you look at the > timestamp > of the build -- that's where you see them being different :-( ). Oh, that's awful. I hope you can get something clear out of this. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Hi Roger, sorry for the delay -- I hope you will understand that I actually had a good reason. See below: On Mon, Aug 12, 2019 at 1:57 AM Roger Pau Monné wrote: > > Ping? > > I know I've posted this quite recently, but have you been able to test > the two proposed patches? > > ie: the one here and: > > https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html > > I would like to figure out exactly what's going on and fix this > properly. Turns out this may actually be related to the BIOS version on these boxes. I have one with the BIOS build from 06/07/2018 and the other one is from 2017. So with 2 of your proposed patches -- we now have a 2x2 test matrix. The awful part seems to be that the behavior looks to be slightly different. I need an extra day to summarize it all and I'll follow up quickly. It just so far I lost time trying to figure out while the same build would behave differently on different boxes only to find out that the BIOS is different (and the really awful part is that they both report as version 5.0.1.1 -- it is only when you look at the timestamp of the build -- that's where you see them being different :-( ). Thanks, Roman. > On Wed, Aug 07, 2019 at 09:35:34AM +0200, Roger Pau Monné wrote: > > On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote: > > > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné > > > wrote: > > > > > > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > > > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > > > > > This patch completely fixes the problem for me! > > > > > > > > > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > > > > > > > > > Thanks for testing! > > > > > > > > > > It's still not clear to me why the previous approach didn't work, but > > > > > I think this patch is better because it removes the usage of > > > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this > > > > > formally now. > > > > > > > > Sorry to bother again, but since we still don't understand why the > > > > previous fix didn't work for you, and I can't reproduce this with my > > > > hardware, could you give the attached patch a try? > > > > > > No worries -- and thanks for helping to get it over the finish line -- > > > this is much appreciated! > > > > > > I'm happy to say that this latest patch is also working just fine. So > > > I guess this is the one that's going to land in Xen 4.13? > > > > No, not really, sorry this was still a debug patch. > > > > So I think the behaviour you are seeing can only be explained if the > > IOMMU is already enabled by the firmware when booting into Xen, can > > this be the case? > > > > I have a patch I would like you to try to confirm this, can you please > > give it a spin and report back (ideally with the Xen boot log). > > > > Thanks, Roger. > > ---8<--- > > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > > index fef97c82f6..3605614aaf 100644 > > --- a/xen/arch/x86/mm/p2m.c > > +++ b/xen/arch/x86/mm/p2m.c > > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned > > long gfn_l, > > > > if ( !paging_mode_translate(p2m->domain) ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > +if ( !has_iommu_pt(d) ) > > return 0; > > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > > IOMMUF_readable | IOMMUF_writable); > > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, > > unsigned long gfn_l) > > > > if ( !paging_mode_translate(d) ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > +if ( !has_iommu_pt(d) ) > > return 0; > > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > > } > > diff --git a/xen/drivers/passthrough/vtd/iommu.c > > b/xen/drivers/passthrough/vtd/iommu.c > > index 5d72270c5b..9dd0ed7f63 100644 > > --- a/xen/drivers/passthrough/vtd/iommu.c > > +++ b/xen/drivers/passthrough/vtd/iommu.c > > @@ -2316,6 +2316,9 @@ static int __init vtd_setup(void) > > */ > > for_each_drhd_unit ( drhd ) > > { > > +unsigned long flags; > > +uint32_t val; > > + > > iommu = drhd->iommu; > > > > printk("Intel VT-d iommu %"PRIu32" supported page sizes: 4kB", > > @@ -2351,6 +2354,22 @@ static int __init vtd_setup(void) > > if ( !vtd_ept_page_compatible(iommu) ) > > iommu_hap_pt_share = 0; > > > > +spin_lock_irqsave(&iommu->register_lock, flags); > > +val = dmar_readl(iommu->reg, DMAR_GSTS_REG); > > +/* > > + * TODO: needs to be revisited once Xen supports booting with an > > + * already enabled IOMMU. > > + */ > > +if ( val & DMA_GSTS_TES ) > > +{ > > +printk(XENLOG_WARNING VTDPREFIX > > + "IOMMU: DMA remapping already enabled, disabling it\n"); > > +dmar_writel(iommu->reg, DMA
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Ping? I know I've posted this quite recently, but have you been able to test the two proposed patches? ie: the one here and: https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html I would like to figure out exactly what's going on and fix this properly. Thanks, Roger. On Wed, Aug 07, 2019 at 09:35:34AM +0200, Roger Pau Monné wrote: > On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote: > > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: > > > > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > > > > This patch completely fixes the problem for me! > > > > > > > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > > > > > > > Thanks for testing! > > > > > > > > It's still not clear to me why the previous approach didn't work, but > > > > I think this patch is better because it removes the usage of > > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this > > > > formally now. > > > > > > Sorry to bother again, but since we still don't understand why the > > > previous fix didn't work for you, and I can't reproduce this with my > > > hardware, could you give the attached patch a try? > > > > No worries -- and thanks for helping to get it over the finish line -- > > this is much appreciated! > > > > I'm happy to say that this latest patch is also working just fine. So > > I guess this is the one that's going to land in Xen 4.13? > > No, not really, sorry this was still a debug patch. > > So I think the behaviour you are seeing can only be explained if the > IOMMU is already enabled by the firmware when booting into Xen, can > this be the case? > > I have a patch I would like you to try to confirm this, can you please > give it a spin and report back (ideally with the Xen boot log). > > Thanks, Roger. > ---8<--- > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > index fef97c82f6..3605614aaf 100644 > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned > long gfn_l, > > if ( !paging_mode_translate(p2m->domain) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > IOMMUF_readable | IOMMUF_writable); > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned > long gfn_l) > > if ( !paging_mode_translate(d) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > } > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 5d72270c5b..9dd0ed7f63 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2316,6 +2316,9 @@ static int __init vtd_setup(void) > */ > for_each_drhd_unit ( drhd ) > { > +unsigned long flags; > +uint32_t val; > + > iommu = drhd->iommu; > > printk("Intel VT-d iommu %"PRIu32" supported page sizes: 4kB", > @@ -2351,6 +2354,22 @@ static int __init vtd_setup(void) > if ( !vtd_ept_page_compatible(iommu) ) > iommu_hap_pt_share = 0; > > +spin_lock_irqsave(&iommu->register_lock, flags); > +val = dmar_readl(iommu->reg, DMAR_GSTS_REG); > +/* > + * TODO: needs to be revisited once Xen supports booting with an > + * already enabled IOMMU. > + */ > +if ( val & DMA_GSTS_TES ) > +{ > +printk(XENLOG_WARNING VTDPREFIX > + "IOMMU: DMA remapping already enabled, disabling it\n"); > +dmar_writel(iommu->reg, DMAR_GCMD_REG, val & ~DMA_GCMD_TE); > +IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, dmar_readl, > + !(val & DMA_GSTS_TES), val); > +} > +spin_unlock_irqrestore(&iommu->register_lock, flags); > + > ret = iommu_set_interrupt(drhd); > if ( ret ) > { > > > ___ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Aug 07, 2019 at 10:31:40AM +0200, Jan Beulich wrote: > On 07.08.2019 09:35, Roger Pau Monné wrote: > > On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote: > > > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné > > > wrote: > > > > > > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > > > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > > > > > This patch completely fixes the problem for me! > > > > > > > > > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > > > > > > > > > Thanks for testing! > > > > > > > > > > It's still not clear to me why the previous approach didn't work, but > > > > > I think this patch is better because it removes the usage of > > > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this > > > > > formally now. > > > > > > > > Sorry to bother again, but since we still don't understand why the > > > > previous fix didn't work for you, and I can't reproduce this with my > > > > hardware, could you give the attached patch a try? > > > > > > No worries -- and thanks for helping to get it over the finish line -- > > > this is much appreciated! > > > > > > I'm happy to say that this latest patch is also working just fine. So > > > I guess this is the one that's going to land in Xen 4.13? > > > > No, not really, sorry this was still a debug patch. > > > > So I think the behaviour you are seeing can only be explained if the > > IOMMU is already enabled by the firmware when booting into Xen, can > > this be the case? > > Even then - why would the existing flush not suffice in this case? It should indeed, I'm just trying to figure out what's different in his setup. Also I'm not sure the iommu init code was designed taking into account that the iommu DMA translation might already be enabled by the firmware. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 07.08.2019 11:57, Roger Pau Monné wrote: On Wed, Aug 07, 2019 at 09:08:58AM +0200, Jan Beulich wrote: On 06.08.2019 23:48, Roman Shaposhnik wrote: On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: This patch completely fixes the problem for me! Thanks Roger! I'd love to see this in Xen 4.13 Thanks for testing! It's still not clear to me why the previous approach didn't work, but I think this patch is better because it removes the usage of {set/clear}_identity_p2m_entry from PV domains. I will submit this formally now. Sorry to bother again, but since we still don't understand why the previous fix didn't work for you, and I can't reproduce this with my hardware, could you give the attached patch a try? No worries -- and thanks for helping to get it over the finish line -- this is much appreciated! I'm happy to say that this latest patch is also working just fine. So I guess this is the one that's going to land in Xen 4.13? Not necessarily - the other patch is also a candidate, but its description would need to explain what was actually wrong. AFAICT the only difference between the non-working version and the working version is the flush, so I've added it here. Now I'm afraid I still can't draw a helpful conclusion from Roman's successful test: intel_iommu_hwdom_init(), after having called setup_hwdom_rmrr(), calls iommu_flush_all() (with one other, seemingly innocent call in between). The only conclusion I _could_ draw is that iommu_flush_all() doesn't do what its name says. Which would be quite bad. But [orig] iommu_flush_all() -> iommu_flush_iotlb_global(flush_non_present_entry=0) -> flush->iotlb(DMA_TLB_GLOBAL_FLUSH, flush_non_present_entry=0) [patch] iommu_flush_iotlb_all() -> iommu_flush_iotlb(dma_old_pte_present=0, page_count=0) -> iommu_flush_iotlb_dsi(flush_non_present_entry=0) -> flush->iotlb(DMA_TLB_DSI_FLUSH, flush_non_present_entry=0) suggests to me that (as one would infer from the names) is the more through flush. I must be overlooking something ... I went over the iotlb queued invalidation code and it seems fine to me, I haven't been able to spot any issues with it, and as you say above the only difference is the flush type (DSI vs GLOBAL). Again and just to make sure it's actually the flush type (and nothing in between) the factor that makes this work or break, can you try the above patch? And I take it the "you" here was meant for Roman (only on Cc), not me. Jan Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..3605614aaf 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 5d72270c5b..885185ad09 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, mrmrr->count = 1; list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs); -return 0; +return iommu_flush_all(); } static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev) ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Aug 07, 2019 at 09:08:58AM +0200, Jan Beulich wrote: > On 06.08.2019 23:48, Roman Shaposhnik wrote: > > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: > > > > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > > > > This patch completely fixes the problem for me! > > > > > > > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > > > > > > > Thanks for testing! > > > > > > > > It's still not clear to me why the previous approach didn't work, but > > > > I think this patch is better because it removes the usage of > > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this > > > > formally now. > > > > > > Sorry to bother again, but since we still don't understand why the > > > previous fix didn't work for you, and I can't reproduce this with my > > > hardware, could you give the attached patch a try? > > > > No worries -- and thanks for helping to get it over the finish line -- > > this is much appreciated! > > > > I'm happy to say that this latest patch is also working just fine. So > > I guess this is the one that's going to land in Xen 4.13? > > Not necessarily - the other patch is also a candidate, but its > description would need to explain what was actually wrong. > > > > AFAICT the only difference between the non-working version and the > > > working version is the flush, so I've added it here. > > Now I'm afraid I still can't draw a helpful conclusion from Roman's > successful test: intel_iommu_hwdom_init(), after having called > setup_hwdom_rmrr(), calls iommu_flush_all() (with one other, > seemingly innocent call in between). The only conclusion I _could_ > draw is that iommu_flush_all() doesn't do what its name says. Which > would be quite bad. But > > [orig] > iommu_flush_all() > -> iommu_flush_iotlb_global(flush_non_present_entry=0) > -> flush->iotlb(DMA_TLB_GLOBAL_FLUSH, flush_non_present_entry=0) > > [patch] > iommu_flush_iotlb_all() > -> iommu_flush_iotlb(dma_old_pte_present=0, page_count=0) > -> iommu_flush_iotlb_dsi(flush_non_present_entry=0) > -> flush->iotlb(DMA_TLB_DSI_FLUSH, flush_non_present_entry=0) > > suggests to me that (as one would infer from the names) is the > more through flush. I must be overlooking something ... I went over the iotlb queued invalidation code and it seems fine to me, I haven't been able to spot any issues with it, and as you say above the only difference is the flush type (DSI vs GLOBAL). Again and just to make sure it's actually the flush type (and nothing in between) the factor that makes this work or break, can you try the above patch? Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..3605614aaf 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 5d72270c5b..885185ad09 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, mrmrr->count = 1; list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs); -return 0; +return iommu_flush_all(); } static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev) ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 07.08.2019 09:35, Roger Pau Monné wrote: On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote: On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: This patch completely fixes the problem for me! Thanks Roger! I'd love to see this in Xen 4.13 Thanks for testing! It's still not clear to me why the previous approach didn't work, but I think this patch is better because it removes the usage of {set/clear}_identity_p2m_entry from PV domains. I will submit this formally now. Sorry to bother again, but since we still don't understand why the previous fix didn't work for you, and I can't reproduce this with my hardware, could you give the attached patch a try? No worries -- and thanks for helping to get it over the finish line -- this is much appreciated! I'm happy to say that this latest patch is also working just fine. So I guess this is the one that's going to land in Xen 4.13? No, not really, sorry this was still a debug patch. So I think the behaviour you are seeing can only be explained if the IOMMU is already enabled by the firmware when booting into Xen, can this be the case? Even then - why would the existing flush not suffice in this case? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote: > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: > > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > > > This patch completely fixes the problem for me! > > > > > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > > > > > Thanks for testing! > > > > > > It's still not clear to me why the previous approach didn't work, but > > > I think this patch is better because it removes the usage of > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this > > > formally now. > > > > Sorry to bother again, but since we still don't understand why the > > previous fix didn't work for you, and I can't reproduce this with my > > hardware, could you give the attached patch a try? > > No worries -- and thanks for helping to get it over the finish line -- > this is much appreciated! > > I'm happy to say that this latest patch is also working just fine. So > I guess this is the one that's going to land in Xen 4.13? No, not really, sorry this was still a debug patch. So I think the behaviour you are seeing can only be explained if the IOMMU is already enabled by the firmware when booting into Xen, can this be the case? I have a patch I would like you to try to confirm this, can you please give it a spin and report back (ideally with the Xen boot log). Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..3605614aaf 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 5d72270c5b..9dd0ed7f63 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2316,6 +2316,9 @@ static int __init vtd_setup(void) */ for_each_drhd_unit ( drhd ) { +unsigned long flags; +uint32_t val; + iommu = drhd->iommu; printk("Intel VT-d iommu %"PRIu32" supported page sizes: 4kB", @@ -2351,6 +2354,22 @@ static int __init vtd_setup(void) if ( !vtd_ept_page_compatible(iommu) ) iommu_hap_pt_share = 0; +spin_lock_irqsave(&iommu->register_lock, flags); +val = dmar_readl(iommu->reg, DMAR_GSTS_REG); +/* + * TODO: needs to be revisited once Xen supports booting with an + * already enabled IOMMU. + */ +if ( val & DMA_GSTS_TES ) +{ +printk(XENLOG_WARNING VTDPREFIX + "IOMMU: DMA remapping already enabled, disabling it\n"); +dmar_writel(iommu->reg, DMAR_GCMD_REG, val & ~DMA_GCMD_TE); +IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, dmar_readl, + !(val & DMA_GSTS_TES), val); +} +spin_unlock_irqrestore(&iommu->register_lock, flags); + ret = iommu_set_interrupt(drhd); if ( ret ) { ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 06.08.2019 23:48, Roman Shaposhnik wrote: On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: This patch completely fixes the problem for me! Thanks Roger! I'd love to see this in Xen 4.13 Thanks for testing! It's still not clear to me why the previous approach didn't work, but I think this patch is better because it removes the usage of {set/clear}_identity_p2m_entry from PV domains. I will submit this formally now. Sorry to bother again, but since we still don't understand why the previous fix didn't work for you, and I can't reproduce this with my hardware, could you give the attached patch a try? No worries -- and thanks for helping to get it over the finish line -- this is much appreciated! I'm happy to say that this latest patch is also working just fine. So I guess this is the one that's going to land in Xen 4.13? Not necessarily - the other patch is also a candidate, but its description would need to explain what was actually wrong. AFAICT the only difference between the non-working version and the working version is the flush, so I've added it here. Now I'm afraid I still can't draw a helpful conclusion from Roman's successful test: intel_iommu_hwdom_init(), after having called setup_hwdom_rmrr(), calls iommu_flush_all() (with one other, seemingly innocent call in between). The only conclusion I _could_ draw is that iommu_flush_all() doesn't do what its name says. Which would be quite bad. But [orig] iommu_flush_all() -> iommu_flush_iotlb_global(flush_non_present_entry=0) -> flush->iotlb(DMA_TLB_GLOBAL_FLUSH, flush_non_present_entry=0) [patch] iommu_flush_iotlb_all() -> iommu_flush_iotlb(dma_old_pte_present=0, page_count=0) -> iommu_flush_iotlb_dsi(flush_non_present_entry=0) -> flush->iotlb(DMA_TLB_DSI_FLUSH, flush_non_present_entry=0) suggests to me that (as one would infer from the names) is the more through flush. I must be overlooking something ... Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné wrote: > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > > This patch completely fixes the problem for me! > > > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > > > Thanks for testing! > > > > It's still not clear to me why the previous approach didn't work, but > > I think this patch is better because it removes the usage of > > {set/clear}_identity_p2m_entry from PV domains. I will submit this > > formally now. > > Sorry to bother again, but since we still don't understand why the > previous fix didn't work for you, and I can't reproduce this with my > hardware, could you give the attached patch a try? No worries -- and thanks for helping to get it over the finish line -- this is much appreciated! I'm happy to say that this latest patch is also working just fine. So I guess this is the one that's going to land in Xen 4.13? Thanks, Roman. > AFAICT the only difference between the non-working version and the > working version is the flush, so I've added it here. > > Thanks, Roger. > ---8<--- > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > index fef97c82f6..3605614aaf 100644 > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned > long gfn_l, > > if ( !paging_mode_translate(p2m->domain) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > IOMMUF_readable | IOMMUF_writable); > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned > long gfn_l) > > if ( !paging_mode_translate(d) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > } > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 5d72270c5b..9fd5c97be2 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, > bool_t map, > mrmrr->count = 1; > list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs); > > -return 0; > +return iommu_iotlb_flush_all(d, IOMMU_FLUSHF_added | > IOMMU_FLUSHF_modified); > } > > static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev) > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote: > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > > This patch completely fixes the problem for me! > > > > Thanks Roger! I'd love to see this in Xen 4.13 > > Thanks for testing! > > It's still not clear to me why the previous approach didn't work, but > I think this patch is better because it removes the usage of > {set/clear}_identity_p2m_entry from PV domains. I will submit this > formally now. Sorry to bother again, but since we still don't understand why the previous fix didn't work for you, and I can't reproduce this with my hardware, could you give the attached patch a try? AFAICT the only difference between the non-working version and the working version is the flush, so I've added it here. Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..3605614aaf 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 5d72270c5b..9fd5c97be2 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, mrmrr->count = 1; list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs); -return 0; +return iommu_iotlb_flush_all(d, IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified); } static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev) ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote: > On Thu, Aug 1, 2019 at 1:16 AM Roger Pau Monné wrote: > > > > On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote: > > > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper > > > wrote: > > > > > > > > On 31/07/2019 20:35, Roman Shaposhnik wrote: > > > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné > > > > > wrote: > > > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: > > > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > > > Sorry -- got a bit distracted yesterday. Attached is the log with > > > > only > > > > your latest patch attached. Interestingly enough the box booted > > > > fine > > > > without screen artifacts. So I guess we're getting closer... > > > > > > > > Thanks for all the help! > > > > >>> That's quite weird, there's no functional changes between the > > > > >>> previous patches and this one, the only difference is that this > > > > >>> patch > > > > >>> has more verbose output. > > > > >>> > > > > >>> Are you sure you didn't have any local patches on top of Xen that > > > > >>> could explain this difference in behaviour? > > > > >> FWIW, can you please try the plain patch again: > > > > >> > > > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html > > > > >> > > > > >> And report back? > > > > >> > > > > >> I would like to get this committed ASAP if it does fix your issue. > > > > > I'd like to say that it did -- but I tried it again just now and it > > > > > still garbled screen and tons of: > > > > > > > > > > (XEN) printk: 26665 messages suppressed. > > > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > > > > > 8e14c000, iommu reg = 82c0008de000 > > > > > > > > > > I'm very much confused by what's going on, but it seems that's the > > > > > case -- adding those debug print statements make the issue go away > > > > > > > > > > Here are the patches that are being applied: > > > > >NOT WORKING: > > > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch > > > > > > > > > >WORKING: > > > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch > > > > > > > > > > At this point I'm really not sure what's going on. > > > > > > > > Ok. seeing as you've double checked this, the mystery deepens. > > > > > > > > My bet is on the intel_iommu_lookup_page() call having side effects[1]. > > > > If you take out the debugging in the middle of the loop in > > > > rmrr_identity_mapping(), does the problem reproduce again? > > > > > > > > ~Andrew > > > > > > > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100% > > > > more memory allocation and higher-level PTE construction than looks wise > > > > for what is supposed to be a getter. > > > > > > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the > > > culprit. > > > > > > I've did the experiment in the other direction -- adding a dummy call: > > > > > > https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23 > > > on top of original Roger's patch makes system boot NORMALLY. > > > > I'm again quite lost, and I don't really understand why mappings added > > by arch_iommu_hwdom_init seems to work fine while mappings added by > > rmrr_identity_mapping don't. > > > > I have yet another patch for you to try, which attempts to mimic > > exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping, > > can you please give it a try? > > > > This has the added bonus of limiting the use of > > {set/clear}_identity_p2m_entry to translated domains only, since > > rmrr_identity_mapping was the only caller against PV domains. > > > > Thanks, Roger. > > ---8<--- > > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > > index fef97c82f6..d36a58b1a6 100644 > > --- a/xen/arch/x86/mm/p2m.c > > +++ b/xen/arch/x86/mm/p2m.c > > @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, > > unsigned long gfn_l, > > > > if ( !paging_mode_translate(p2m->domain) ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > -return 0; > > -return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > > -IOMMUF_readable | IOMMUF_writable); > > +ASSERT_UNREACHABLE(); > > +return -ENXIO; > > } > > > > gfn_lock(p2m, gfn, 0); > > @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, > > unsigned long gfn_l) > > > > if ( !paging_mode_translate(d) ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > -return 0; > > -return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > > +ASSERT_UNREACHABLE(); > > +return -ENXIO; > > } > > > > gfn_lock(p2m, gfn, 0); > > diff --git a/xen/drivers/passthrough/vtd/i
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Thu, Aug 1, 2019 at 1:16 AM Roger Pau Monné wrote: > > On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote: > > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper > > wrote: > > > > > > On 31/07/2019 20:35, Roman Shaposhnik wrote: > > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné > > > > wrote: > > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: > > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > > Sorry -- got a bit distracted yesterday. Attached is the log with > > > only > > > your latest patch attached. Interestingly enough the box booted fine > > > without screen artifacts. So I guess we're getting closer... > > > > > > Thanks for all the help! > > > >>> That's quite weird, there's no functional changes between the > > > >>> previous patches and this one, the only difference is that this patch > > > >>> has more verbose output. > > > >>> > > > >>> Are you sure you didn't have any local patches on top of Xen that > > > >>> could explain this difference in behaviour? > > > >> FWIW, can you please try the plain patch again: > > > >> > > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html > > > >> > > > >> And report back? > > > >> > > > >> I would like to get this committed ASAP if it does fix your issue. > > > > I'd like to say that it did -- but I tried it again just now and it > > > > still garbled screen and tons of: > > > > > > > > (XEN) printk: 26665 messages suppressed. > > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > > > > 8e14c000, iommu reg = 82c0008de000 > > > > > > > > I'm very much confused by what's going on, but it seems that's the > > > > case -- adding those debug print statements make the issue go away > > > > > > > > Here are the patches that are being applied: > > > >NOT WORKING: > > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch > > > > > > > >WORKING: > > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch > > > > > > > > At this point I'm really not sure what's going on. > > > > > > Ok. seeing as you've double checked this, the mystery deepens. > > > > > > My bet is on the intel_iommu_lookup_page() call having side effects[1]. > > > If you take out the debugging in the middle of the loop in > > > rmrr_identity_mapping(), does the problem reproduce again? > > > > > > ~Andrew > > > > > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100% > > > more memory allocation and higher-level PTE construction than looks wise > > > for what is supposed to be a getter. > > > > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit. > > > > I've did the experiment in the other direction -- adding a dummy call: > > > > https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23 > > on top of original Roger's patch makes system boot NORMALLY. > > I'm again quite lost, and I don't really understand why mappings added > by arch_iommu_hwdom_init seems to work fine while mappings added by > rmrr_identity_mapping don't. > > I have yet another patch for you to try, which attempts to mimic > exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping, > can you please give it a try? > > This has the added bonus of limiting the use of > {set/clear}_identity_p2m_entry to translated domains only, since > rmrr_identity_mapping was the only caller against PV domains. > > Thanks, Roger. > ---8<--- > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > index fef97c82f6..d36a58b1a6 100644 > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned > long gfn_l, > > if ( !paging_mode_translate(p2m->domain) ) > { > -if ( !need_iommu_pt_sync(d) ) > -return 0; > -return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > -IOMMUF_readable | IOMMUF_writable); > +ASSERT_UNREACHABLE(); > +return -ENXIO; > } > > gfn_lock(p2m, gfn, 0); > @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, unsigned > long gfn_l) > > if ( !paging_mode_translate(d) ) > { > -if ( !need_iommu_pt_sync(d) ) > -return 0; > -return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > +ASSERT_UNREACHABLE(); > +return -ENXIO; > } > > gfn_lock(p2m, gfn, 0); > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 5d72270c5b..62df5ca5aa 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1969,6 +1969,7 @@ static int rmrr_identity_mapping(struct domain *d, > bool_t map, > unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address)
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Thu, Aug 1, 2019 at 1:45 AM Roger Pau Monné wrote: > > On Wed, Jul 31, 2019 at 12:30:04PM -0700, Roman Shaposhnik wrote: > > On Wed, Jul 31, 2019 at 1:36 AM Roger Pau Monné > > wrote: > > > > > > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > > > Sorry -- got a bit distracted yesterday. Attached is the log with only > > > > your latest patch attached. Interestingly enough the box booted fine > > > > without screen artifacts. So I guess we're getting closer... > > > > > > > > Thanks for all the help! > > > > > > That's quite weird, there's no functional changes between the > > > previous patches and this one, the only difference is that this patch > > > has more verbose output. > > > > That's super weird indeed. I'm re-trying your very first patch right > > now to see if it may be a pilot error on my part (hope not). > > > > > Are you sure you didn't have any local patches on top of Xen that > > > could explain this difference in behaviour? > > > > Pretty sure -- but let me push my branch in EVE and show to you all > > (you will need Docker installed to build Xen the way EVE does). > > I'm afraid it's quite hard for me to figure out whether you have any > patches on top of Xen or not. It is actually super simple. We use Docker containers as method for guaranteeing that our build environment is always pristine and controlled. Precisely to avoid this type of issues like random patches getting in, etc. So... let me explain (not so much to convince you to use it, but rather to have an extra pair of eye on an off chance that I actually missed something): All of our builds steps are captured in the following Dockerfile: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile it is based on Alpine Linux 3.7 and we only install the following packages into a printing Alpine: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L5 then we proceed with downloading Xen from a printing upsteram tarball: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L36 and only apply patches that sit next to the Dockerfile and have patch file extension: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L50 At this point this is a single patch: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch there's literally nothing else in that folder: https://github.com/rvs/eve/blob/xen-bug/pkg/xen Then we proceed with building Xen in a very vanilla fashion: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L64 That's it! That's literally all the build logic. > For further testing can you make sure you don't have any custom > patches on top of Xen (if you indeed had any in the previous > attempts)? I don't think there's any way for extra patches to sneak in, but please see the above explanation. Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 12:30:04PM -0700, Roman Shaposhnik wrote: > On Wed, Jul 31, 2019 at 1:36 AM Roger Pau Monné wrote: > > > > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > > Sorry -- got a bit distracted yesterday. Attached is the log with only > > > your latest patch attached. Interestingly enough the box booted fine > > > without screen artifacts. So I guess we're getting closer... > > > > > > Thanks for all the help! > > > > That's quite weird, there's no functional changes between the > > previous patches and this one, the only difference is that this patch > > has more verbose output. > > That's super weird indeed. I'm re-trying your very first patch right > now to see if it may be a pilot error on my part (hope not). > > > Are you sure you didn't have any local patches on top of Xen that > > could explain this difference in behaviour? > > Pretty sure -- but let me push my branch in EVE and show to you all > (you will need Docker installed to build Xen the way EVE does). I'm afraid it's quite hard for me to figure out whether you have any patches on top of Xen or not. For further testing can you make sure you don't have any custom patches on top of Xen (if you indeed had any in the previous attempts)? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote: > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper > wrote: > > > > On 31/07/2019 20:35, Roman Shaposhnik wrote: > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné > > > wrote: > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > Sorry -- got a bit distracted yesterday. Attached is the log with only > > your latest patch attached. Interestingly enough the box booted fine > > without screen artifacts. So I guess we're getting closer... > > > > Thanks for all the help! > > >>> That's quite weird, there's no functional changes between the > > >>> previous patches and this one, the only difference is that this patch > > >>> has more verbose output. > > >>> > > >>> Are you sure you didn't have any local patches on top of Xen that > > >>> could explain this difference in behaviour? > > >> FWIW, can you please try the plain patch again: > > >> > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html > > >> > > >> And report back? > > >> > > >> I would like to get this committed ASAP if it does fix your issue. > > > I'd like to say that it did -- but I tried it again just now and it > > > still garbled screen and tons of: > > > > > > (XEN) printk: 26665 messages suppressed. > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > > > 8e14c000, iommu reg = 82c0008de000 > > > > > > I'm very much confused by what's going on, but it seems that's the > > > case -- adding those debug print statements make the issue go away > > > > > > Here are the patches that are being applied: > > >NOT WORKING: > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch > > > > > >WORKING: > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch > > > > > > At this point I'm really not sure what's going on. > > > > Ok. seeing as you've double checked this, the mystery deepens. > > > > My bet is on the intel_iommu_lookup_page() call having side effects[1]. > > If you take out the debugging in the middle of the loop in > > rmrr_identity_mapping(), does the problem reproduce again? > > > > ~Andrew > > > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100% > > more memory allocation and higher-level PTE construction than looks wise > > for what is supposed to be a getter. > > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit. > > I've did the experiment in the other direction -- adding a dummy call: > > https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23 > on top of original Roger's patch makes system boot NORMALLY. I'm again quite lost, and I don't really understand why mappings added by arch_iommu_hwdom_init seems to work fine while mappings added by rmrr_identity_mapping don't. I have yet another patch for you to try, which attempts to mimic exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping, can you please give it a try? This has the added bonus of limiting the use of {set/clear}_identity_p2m_entry to translated domains only, since rmrr_identity_mapping was the only caller against PV domains. Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..d36a58b1a6 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) -return 0; -return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, -IOMMUF_readable | IOMMUF_writable); +ASSERT_UNREACHABLE(); +return -ENXIO; } gfn_lock(p2m, gfn, 0); @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) -return 0; -return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); +ASSERT_UNREACHABLE(); +return -ENXIO; } gfn_lock(p2m, gfn, 0); diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 5d72270c5b..62df5ca5aa 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -1969,6 +1969,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K; struct mapped_rmrr *mrmrr; struct domain_iommu *hd = dom_iommu(d); +unsigned int flush_flags = 0; ASSERT(pcidevs_locked()); ASSERT(rmrr->base_address < rmrr->end_address); @@ -1982,7 +1983,7 @@ static int rmrr_identity_mapping(struct domain *d, bo
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 31.07.2019 21:46, Andrew Cooper wrote: > My bet is on the intel_iommu_lookup_page() call having side effects[1]. > If you take out the debugging in the middle of the loop in > rmrr_identity_mapping(), does the problem reproduce again? While your guess was right according to Roman's latest results, I'd still like to note that ... > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100% > more memory allocation and higher-level PTE construction than looks wise > for what is supposed to be a getter. ... there being 0 passed as last argument is supposed to suppress any allocations (and from looking at the function I can't really see any side effects yet in this case; all I do see is that the first check of "alloc" could/should happen before looking up PCI device and DRHD). I've also double checked - there clearly is a flush after the call to setup_hwdom_rmrr(). I'm not even convinced this flush has much value, seeing that translation gets enabled only afterwards. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper wrote: > > On 31/07/2019 20:35, Roman Shaposhnik wrote: > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné > > wrote: > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > Sorry -- got a bit distracted yesterday. Attached is the log with only > your latest patch attached. Interestingly enough the box booted fine > without screen artifacts. So I guess we're getting closer... > > Thanks for all the help! > >>> That's quite weird, there's no functional changes between the > >>> previous patches and this one, the only difference is that this patch > >>> has more verbose output. > >>> > >>> Are you sure you didn't have any local patches on top of Xen that > >>> could explain this difference in behaviour? > >> FWIW, can you please try the plain patch again: > >> > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html > >> > >> And report back? > >> > >> I would like to get this committed ASAP if it does fix your issue. > > I'd like to say that it did -- but I tried it again just now and it > > still garbled screen and tons of: > > > > (XEN) printk: 26665 messages suppressed. > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > > 8e14c000, iommu reg = 82c0008de000 > > > > I'm very much confused by what's going on, but it seems that's the > > case -- adding those debug print statements make the issue go away > > > > Here are the patches that are being applied: > >NOT WORKING: > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch > > > >WORKING: > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch > > > > At this point I'm really not sure what's going on. > > Ok. seeing as you've double checked this, the mystery deepens. > > My bet is on the intel_iommu_lookup_page() call having side effects[1]. > If you take out the debugging in the middle of the loop in > rmrr_identity_mapping(), does the problem reproduce again? > > ~Andrew > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100% > more memory allocation and higher-level PTE construction than looks wise > for what is supposed to be a getter. Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit. I've did the experiment in the other direction -- adding a dummy call: https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23 on top of original Roger's patch makes system boot NORMALLY. Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 31/07/2019 20:35, Roman Shaposhnik wrote: > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné wrote: >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: Sorry -- got a bit distracted yesterday. Attached is the log with only your latest patch attached. Interestingly enough the box booted fine without screen artifacts. So I guess we're getting closer... Thanks for all the help! >>> That's quite weird, there's no functional changes between the >>> previous patches and this one, the only difference is that this patch >>> has more verbose output. >>> >>> Are you sure you didn't have any local patches on top of Xen that >>> could explain this difference in behaviour? >> FWIW, can you please try the plain patch again: >> >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html >> >> And report back? >> >> I would like to get this committed ASAP if it does fix your issue. > I'd like to say that it did -- but I tried it again just now and it > still garbled screen and tons of: > > (XEN) printk: 26665 messages suppressed. > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > 8e14c000, iommu reg = 82c0008de000 > > I'm very much confused by what's going on, but it seems that's the > case -- adding those debug print statements make the issue go away > > Here are the patches that are being applied: >NOT WORKING: > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch > >WORKING: > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch > > At this point I'm really not sure what's going on. Ok. seeing as you've double checked this, the mystery deepens. My bet is on the intel_iommu_lookup_page() call having side effects[1]. If you take out the debugging in the middle of the loop in rmrr_identity_mapping(), does the problem reproduce again? ~Andrew [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100% more memory allocation and higher-level PTE construction than looks wise for what is supposed to be a getter. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 2:46 AM Jan Beulich wrote: > > On 31.07.2019 10:58, Andrew Cooper wrote: > > On 31/07/2019 09:34, Jan Beulich wrote: > >> On 30.07.2019 19:56, Roman Shaposhnik wrote: > >>> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich wrote: > On 23.07.2019 20:25, Roman Shaposhnik wrote: > > Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug > > booted the system just fine. > Btw (I've noticed this only now) - are you saying without "iommu=debug" > the box does _not_ boot fine, despite the other option? > >>> Yes. But it made sense to me since iommu=debug (as per your > >>> explanation) overwhelms the CPU and I guess adding > >>> iommu_inclusive_mapping=1 avoids the code path that does it? > >> I'm afraid I don't follow: My question was whether > >> "iommu_inclusive_mapping=1" alone would not allow the box to boot. > >> Without "iommu=debug" there's no excessive logging afaict, no > >> matter what other IOMMU options you use. > > > > Without inclusive mappings, the system boots but the screen is junk and > > there are DMA faults all over the place. With inclusive mappings, it > > all "works" fine. > > > > With debug enabled, the DMA faults are spat out to the console for a > > short while, until an APIC error occurs and the system wedges completely. > > Right - the verbosity _with_ "iommu=debug" may be a problem. Hence me > wondering whether the system indeed wouldn't boot _without_ that option. Without iommu=debug (and any other option) the screen is garbled. Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné wrote: > > On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: > > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > > Sorry -- got a bit distracted yesterday. Attached is the log with only > > > your latest patch attached. Interestingly enough the box booted fine > > > without screen artifacts. So I guess we're getting closer... > > > > > > Thanks for all the help! > > > > That's quite weird, there's no functional changes between the > > previous patches and this one, the only difference is that this patch > > has more verbose output. > > > > Are you sure you didn't have any local patches on top of Xen that > > could explain this difference in behaviour? > > FWIW, can you please try the plain patch again: > > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html > > And report back? > > I would like to get this committed ASAP if it does fix your issue. I'd like to say that it did -- but I tried it again just now and it still garbled screen and tons of: (XEN) printk: 26665 messages suppressed. (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr 8e14c000, iommu reg = 82c0008de000 I'm very much confused by what's going on, but it seems that's the case -- adding those debug print statements make the issue go away Here are the patches that are being applied: NOT WORKING: https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch WORKING: https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch At this point I'm really not sure what's going on. Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 1:36 AM Roger Pau Monné wrote: > > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > Sorry -- got a bit distracted yesterday. Attached is the log with only > > your latest patch attached. Interestingly enough the box booted fine > > without screen artifacts. So I guess we're getting closer... > > > > Thanks for all the help! > > That's quite weird, there's no functional changes between the > previous patches and this one, the only difference is that this patch > has more verbose output. That's super weird indeed. I'm re-trying your very first patch right now to see if it may be a pilot error on my part (hope not). > Are you sure you didn't have any local patches on top of Xen that > could explain this difference in behaviour? Pretty sure -- but let me push my branch in EVE and show to you all (you will need Docker installed to build Xen the way EVE does). But if you want to check for yourself: $ git clone https://github.com/rvs/eve.git -b xen-bug $ cd eve/pkg/xen $ docker build --no-cache . $ docker export $(docker create XXX a) | tar xvf - boot $ ls boot/xen.gz Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 31.07.2019 10:58, Andrew Cooper wrote: > On 31/07/2019 09:34, Jan Beulich wrote: >> On 30.07.2019 19:56, Roman Shaposhnik wrote: >>> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich wrote: On 23.07.2019 20:25, Roman Shaposhnik wrote: > Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug > booted the system just fine. Btw (I've noticed this only now) - are you saying without "iommu=debug" the box does _not_ boot fine, despite the other option? >>> Yes. But it made sense to me since iommu=debug (as per your >>> explanation) overwhelms the CPU and I guess adding >>> iommu_inclusive_mapping=1 avoids the code path that does it? >> I'm afraid I don't follow: My question was whether >> "iommu_inclusive_mapping=1" alone would not allow the box to boot. >> Without "iommu=debug" there's no excessive logging afaict, no >> matter what other IOMMU options you use. > > Without inclusive mappings, the system boots but the screen is junk and > there are DMA faults all over the place. With inclusive mappings, it > all "works" fine. > > With debug enabled, the DMA faults are spat out to the console for a > short while, until an APIC error occurs and the system wedges completely. Right - the verbosity _with_ "iommu=debug" may be a problem. Hence me wondering whether the system indeed wouldn't boot _without_ that option. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 31/07/2019 09:34, Jan Beulich wrote: > On 30.07.2019 19:56, Roman Shaposhnik wrote: >> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich wrote: >>> On 23.07.2019 20:25, Roman Shaposhnik wrote: Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug booted the system just fine. >>> Btw (I've noticed this only now) - are you saying without "iommu=debug" >>> the box does _not_ boot fine, despite the other option? >> Yes. But it made sense to me since iommu=debug (as per your >> explanation) overwhelms the CPU and I guess adding >> iommu_inclusive_mapping=1 avoids the code path that does it? > I'm afraid I don't follow: My question was whether > "iommu_inclusive_mapping=1" alone would not allow the box to boot. > Without "iommu=debug" there's no excessive logging afaict, no > matter what other IOMMU options you use. Without inclusive mappings, the system boots but the screen is junk and there are DMA faults all over the place. With inclusive mappings, it all "works" fine. With debug enabled, the DMA faults are spat out to the console for a short while, until an APIC error occurs and the system wedges completely. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 30.07.2019 19:56, Roman Shaposhnik wrote: > On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich wrote: >> >> On 23.07.2019 20:25, Roman Shaposhnik wrote: >>> Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug >>> booted the system just fine. >> >> Btw (I've noticed this only now) - are you saying without "iommu=debug" >> the box does _not_ boot fine, despite the other option? > > Yes. But it made sense to me since iommu=debug (as per your > explanation) overwhelms the CPU and I guess adding > iommu_inclusive_mapping=1 avoids the code path that does it? I'm afraid I don't follow: My question was whether "iommu_inclusive_mapping=1" alone would not allow the box to boot. Without "iommu=debug" there's no excessive logging afaict, no matter what other IOMMU options you use. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 30.07.2019 19:55, Roman Shaposhnik wrote: > Sorry -- got a bit distracted yesterday. Attached is the log with only > your latest patch attached. Interestingly enough the box booted fine > without screen artifacts. So I guess we're getting closer... I'm afraid I can't make sense of this and the prior results: Afaics it was only debugging code which got changed, the actual code changes to address the issue remained the same. Hence the earlier patch should have worked too, if this latest one does. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote: > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > > Sorry -- got a bit distracted yesterday. Attached is the log with only > > your latest patch attached. Interestingly enough the box booted fine > > without screen artifacts. So I guess we're getting closer... > > > > Thanks for all the help! > > That's quite weird, there's no functional changes between the > previous patches and this one, the only difference is that this patch > has more verbose output. > > Are you sure you didn't have any local patches on top of Xen that > could explain this difference in behaviour? FWIW, can you please try the plain patch again: https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html And report back? I would like to get this committed ASAP if it does fix your issue. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote: > Sorry -- got a bit distracted yesterday. Attached is the log with only > your latest patch attached. Interestingly enough the box booted fine > without screen artifacts. So I guess we're getting closer... > > Thanks for all the help! That's quite weird, there's no functional changes between the previous patches and this one, the only difference is that this patch has more verbose output. Are you sure you didn't have any local patches on top of Xen that could explain this difference in behaviour? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich wrote: > > On 23.07.2019 20:25, Roman Shaposhnik wrote: > > Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug > > booted the system just fine. > > Btw (I've noticed this only now) - are you saying without "iommu=debug" > the box does _not_ boot fine, despite the other option? Yes. But it made sense to me since iommu=debug (as per your explanation) overwhelms the CPU and I guess adding iommu_inclusive_mapping=1 avoids the code path that does it? Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Ping? It would be very helpful for us in order to get this sorted out if you could answer the questions below and try the new debug patch :). On Fri, Jul 26, 2019 at 11:35:45AM +0200, Roger Pau Monné wrote: > On Thu, Jul 25, 2019 at 05:47:19PM -0700, Roman Shaposhnik wrote: > > Hi Roger! > > > > With your patch (and build as a debug build) Xen crashes on boot > > (which I guess was the point of your BUG_ON statement). > > Yes, that's very weird, seems like entries are not added to the iommu > page tables but I have no idea why, AFAICT this works fine on my > system. > > Do you have any patches on top of RELEASE-4.12.0? > > I have another patch with more verbose output, could you give it a > try? It's maybe going to be more chatty than the previous one. > > I'm sorry to keep you testing stuff, but since I cannot reproduce this > locally I have to rely on you to provide the debug output. > > Thanks, Roger. > ---8<--- > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > index b9bbb8f485..75f8359a99 100644 > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -1331,7 +1331,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned > long gfn_l, > > if ( !paging_mode_translate(p2m->domain) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > IOMMUF_readable | IOMMUF_writable); > @@ -1422,7 +1422,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned > long gfn_l) > > if ( !paging_mode_translate(d) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > } > diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c > index 117b869b0c..214c5d515f 100644 > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -291,8 +291,18 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn, > unsigned long i; > int rc = 0; > > +if (dfn_x(dfn) >= 0x8d800 && dfn_x(dfn) < 0x9 ) > +{ > +printk(" iommu_map %#lx\n", dfn_x(dfn)); > +process_pending_softirqs(); > +} > + > if ( !iommu_enabled || !hd->platform_ops ) > +{ > +printk("iommu_enabled: %d platform_ops %p\n", > + iommu_enabled, hd->platform_ops); > return 0; > +} > > ASSERT(IS_ALIGNED(dfn_x(dfn), (1ul << page_order))); > ASSERT(IS_ALIGNED(mfn_x(mfn), (1ul << page_order))); > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 50a0e25224..8c3fcb50ae 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2009,12 +2009,33 @@ static int rmrr_identity_mapping(struct domain *d, > bool_t map, > if ( !map ) > return -ENOENT; > > +printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn); > while ( base_pfn < end_pfn ) > { > int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag); > +mfn_t mfn; > +unsigned int f; > > if ( err ) > return err; > + > +err = intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f); > +if ( err ) > +{ > +printk("intel_iommu_lookup_page err: %d\n", err); > +BUG(); > +} > +if ( base_pfn != mfn_x(mfn) ) > +{ > +printk("base_pfn: %#lx mfn: %#lx\n", base_pfn, mfn_x(mfn)); > +BUG(); > +} > +if ( f != (IOMMUF_readable | IOMMUF_writable) ) > +{ > +printk("flags: %#x\n", f); > +BUG(); > +} > + > base_pfn++; > } > > @@ -2263,6 +2284,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain > *d) > u16 bdf; > int ret, i; > > +printk(" setting up regions\n"); > pcidevs_lock(); > for_each_rmrr_device ( rmrr, bdf, i ) > { > > ___ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Thu, Jul 25, 2019 at 05:47:19PM -0700, Roman Shaposhnik wrote: > Hi Roger! > > With your patch (and build as a debug build) Xen crashes on boot > (which I guess was the point of your BUG_ON statement). Yes, that's very weird, seems like entries are not added to the iommu page tables but I have no idea why, AFAICT this works fine on my system. Do you have any patches on top of RELEASE-4.12.0? I have another patch with more verbose output, could you give it a try? It's maybe going to be more chatty than the previous one. I'm sorry to keep you testing stuff, but since I cannot reproduce this locally I have to rely on you to provide the debug output. Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index b9bbb8f485..75f8359a99 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1331,7 +1331,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1422,7 +1422,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index 117b869b0c..214c5d515f 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -291,8 +291,18 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn, unsigned long i; int rc = 0; +if (dfn_x(dfn) >= 0x8d800 && dfn_x(dfn) < 0x9 ) +{ +printk(" iommu_map %#lx\n", dfn_x(dfn)); +process_pending_softirqs(); +} + if ( !iommu_enabled || !hd->platform_ops ) +{ +printk("iommu_enabled: %d platform_ops %p\n", + iommu_enabled, hd->platform_ops); return 0; +} ASSERT(IS_ALIGNED(dfn_x(dfn), (1ul << page_order))); ASSERT(IS_ALIGNED(mfn_x(mfn), (1ul << page_order))); diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 50a0e25224..8c3fcb50ae 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2009,12 +2009,33 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, if ( !map ) return -ENOENT; +printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn); while ( base_pfn < end_pfn ) { int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag); +mfn_t mfn; +unsigned int f; if ( err ) return err; + +err = intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f); +if ( err ) +{ +printk("intel_iommu_lookup_page err: %d\n", err); +BUG(); +} +if ( base_pfn != mfn_x(mfn) ) +{ +printk("base_pfn: %#lx mfn: %#lx\n", base_pfn, mfn_x(mfn)); +BUG(); +} +if ( f != (IOMMUF_readable | IOMMUF_writable) ) +{ +printk("flags: %#x\n", f); +BUG(); +} + base_pfn++; } @@ -2263,6 +2284,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain *d) u16 bdf; int ret, i; +printk(" setting up regions\n"); pcidevs_lock(); for_each_rmrr_device ( rmrr, bdf, i ) { ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23.07.2019 20:25, Roman Shaposhnik wrote: > Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug > booted the system just fine. Btw (I've noticed this only now) - are you saying without "iommu=debug" the box does _not_ boot fine, despite the other option? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Wed, Jul 24, 2019 at 10:42 AM Rich Persaud wrote: > > On Jul 19, 2019, at 15:31, Roman Shaposhnik wrote: > > Hi! > > we're using Xen on Advantech ARK-2250 Embedded Box PC: > > https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf > > > Roman, > > Good to see Xen being used on fanless devices. Oh, there's WAY more of those in EVE under Xen management ;-) https://wiki.lfedge.org/display/EVE/Hardware+Platforms+Supporting+EVE > Does the AMI BIOS for the i7 6600U Skylake CPU [1] variant of ARK-2250 [2] > support Intel TXT DRTM and discrete TPM, which would enable boot integrity > [3] protection for Xen, read-only dom0 and stateless VMs? > Boot integrity is valuable on edge devices. Funny you should mention this -- that's exactly what we're playing with right now in LF Edge Project EVE. Do you want to pop up on the mailing list or slack channel there? (not sure this is the right topic for Xen-devel). And just so that we're on the same page, here's what we are after when it comes to root of trust in EVE (I really need to do a write up on this soon): * measured boot (we're really not interested secure boot that much) * measured boot of the DomUs * proxy TPM to the DomUs > [1] CPU spec: > https://ark.intel.com/content/www/us/en/ark/products/88192/intel-core-i7-6600u-processor-4m-cache-up-to-3-40-ghz.html > > [2] PC spec: > https://www.advantech.com/products/ark-2000_series_embedded_box_pcs/ark-2250l/mod_66ebc4e0-9a0c-489c-96a5-70a8054e9037 > > [3] TrenchBoot, Xen Summit 2019, https://youtube.com/watch?v=f0LZFSq4Ack Thanks for the notes! Much appreciated! Thanks, Roman. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Hi Roger! With your patch (and build as a debug build) Xen crashes on boot (which I guess was the point of your BUG_ON statement). The log is attached Thanks, Roman. On Wed, Jul 24, 2019 at 7:11 AM Roger Pau Monné wrote: > > On Tue, Jul 23, 2019 at 10:32:26AM -0700, Roman Shaposhnik wrote: > > Hi Roger! > > > > I applied your patch, removed no-igfx and I still see the original > > problem. Please let me know what other logs/debugs would you need at > > this point. > > I'm not sure why you don't get the rmrrs added to the iommu page > tables, AFAICT it works on my test box. > > I have a patch with extra debug messages and checks, could you give it > a test, I'm attaching it below. Note that you don't need the previous > patch, since it's already contained in the debug patch below. > > Please paste the Xen bootlog with the patch applied when your reply. > > Thank, Roger. > ---8<--- > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > index fef97c82f6..3605614aaf 100644 > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned > long gfn_l, > > if ( !paging_mode_translate(p2m->domain) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > IOMMUF_readable | IOMMUF_writable); > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned > long gfn_l) > > if ( !paging_mode_translate(d) ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > return 0; > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > } > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 8b27d7e775..ea303b5d45 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2009,12 +2009,19 @@ static int rmrr_identity_mapping(struct domain *d, > bool_t map, > if ( !map ) > return -ENOENT; > > +printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn); > while ( base_pfn < end_pfn ) > { > int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag); > +mfn_t mfn; > +unsigned int f; > > if ( err ) > return err; > +BUG_ON(intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f)); > +BUG_ON(base_pfn != mfn_x(mfn)); > +BUG_ON(f != (IOMMUF_readable | IOMMUF_writable)); > + > base_pfn++; > } > > @@ -2263,6 +2270,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain > *d) > u16 bdf; > int ret, i; > > +printk(" setting up regions\n"); > pcidevs_lock(); > for_each_rmrr_device ( rmrr, bdf, i ) > { > 0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968e018 0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89641018 0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x89630018 Xen 4.12.0 (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=y Fri Jul 26 00:32:25 UTC 2019 (XEN) Latest ChangeSet: (XEN) Bootloader: GRUB 2.03 (XEN) Command line: loglvl=all com1=115200,8n1 console=com1,vga dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false (XEN) Xen image load base address: 0x8800 (XEN) Video information: (XEN) VGA is graphics mode 1680x1050, 32 bpp (XEN) Disc information: (XEN) Found 0 MBR signatures (XEN) Found 1 EDD information structures (XEN) EFI RAM map: (XEN) - 00058000 (usable) (XEN) 00058000 - 00059000 (reserved) (XEN) 00059000 - 0009f000 (usable) (XEN) 0009f000 - 000a (reserved) (XEN) 0010 - 8648a000 (usable) (XEN) 8648a000 - 8648b000 (ACPI NVS) (XEN) 8648b000 - 864b5000 (reserved) (XEN) 864b5000 - 8c224000 (usable) (XEN) 8c224000 - 8c528000 (reserved) (XEN) 8c528000 - 8c736000 (usable) (XEN) 8c736000 - 8cea7000 (ACPI NVS) (XEN) 8cea7000 - 8d2ff000 (reserved) (XEN) 8d2ff000 - 8d30 (usable) (XEN) 8d30 - 8d40 (reserved) (XEN) e000 - f000 (reserved) (XEN) fe00 - fe011000 (reserved) (XEN) fec0 - fec01000 (reserved) (XEN) fee0 - fee01000 (reserved) (XEN) ff00 - 0001 (reserved) (XEN) 0001 - 00016e00 (usable) (XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA) (XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA A M I 1072009 INTL 20120913) (XEN) ACPI: FACS 8CE8EF80, 0040 (XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA A M I 1072009 AMI 10013) (XEN) ACP
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> On Jul 19, 2019, at 15:31, Roman Shaposhnik wrote: > > Hi! > > we're using Xen on Advantech ARK-2250 Embedded Box PC: > > https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf Roman, Good to see Xen being used on fanless devices. Does the AMI BIOS for the i7 6600U Skylake CPU [1] variant of ARK-2250 [2] support Intel TXT DRTM and discrete TPM, which would enable boot integrity [3] protection for Xen, read-only dom0 and stateless VMs? Boot integrity is valuable on edge devices. Rich [1] CPU spec: https://ark.intel.com/content/www/us/en/ark/products/88192/intel-core-i7-6600u-processor-4m-cache-up-to-3-40-ghz.html [2] PC spec: https://www.advantech.com/products/ark-2000_series_embedded_box_pcs/ark-2250l/mod_66ebc4e0-9a0c-489c-96a5-70a8054e9037 [3] TrenchBoot, Xen Summit 2019, https://youtube.com/watch?v=f0LZFSq4Ack ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Jul 23, 2019 at 10:32:26AM -0700, Roman Shaposhnik wrote: > Hi Roger! > > I applied your patch, removed no-igfx and I still see the original > problem. Please let me know what other logs/debugs would you need at > this point. I'm not sure why you don't get the rmrrs added to the iommu page tables, AFAICT it works on my test box. I have a patch with extra debug messages and checks, could you give it a test, I'm attaching it below. Note that you don't need the previous patch, since it's already contained in the debug patch below. Please paste the Xen bootlog with the patch applied when your reply. Thank, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..3605614aaf 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 8b27d7e775..ea303b5d45 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2009,12 +2009,19 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map, if ( !map ) return -ENOENT; +printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn); while ( base_pfn < end_pfn ) { int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag); +mfn_t mfn; +unsigned int f; if ( err ) return err; +BUG_ON(intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f)); +BUG_ON(base_pfn != mfn_x(mfn)); +BUG_ON(f != (IOMMUF_readable | IOMMUF_writable)); + base_pfn++; } @@ -2263,6 +2270,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain *d) u16 bdf; int ret, i; +printk(" setting up regions\n"); pcidevs_lock(); for_each_rmrr_device ( rmrr, bdf, i ) { ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 24.07.2019 14:00, Jan Beulich wrote: > On 23.07.2019 19:58, Roman Shaposhnik wrote: >> No worries. Take a look at the head of the log attached. > > One more thing for you to tweak to make the log even more useful: > As per > > (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n Tue Jul 23 > 17:15:48 UTC 2019 > > this is a non-debug build. If you used a debug one, there ought to > be another confirming message next to > > (XEN) [VT-D]found ACPI_DMAR_RMRR: > (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved memory; > need "iommu_inclusive_mapping=1"? > (XEN) [VT-D] endpoint: :00:02.0 > > plus others in case of errors beyond the memory map related one > (which, as was pointed out already, is a warning only). In case > you're interested, see register_one_rmrr() in particular. Oh, additionally you want to add "loglvl=all" to your hypervisor command line. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23.07.2019 19:58, Roman Shaposhnik wrote: > No worries. Take a look at the head of the log attached. One more thing for you to tweak to make the log even more useful: As per (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n Tue Jul 23 17:15:48 UTC 2019 this is a non-debug build. If you used a debug one, there ought to be another confirming message next to (XEN) [VT-D]found ACPI_DMAR_RMRR: (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved memory; need "iommu_inclusive_mapping=1"? (XEN) [VT-D] endpoint: :00:02.0 plus others in case of errors beyond the memory map related one (which, as was pointed out already, is a warning only). In case you're interested, see register_one_rmrr() in particular. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 24/07/2019 12:23, Andrew Cooper wrote: > On 23/07/2019 18:32, Roman Shaposhnik wrote: >> Oh, and it seems that that https://downloads.xenproject.org/ SSL >> certificate expired yesterday -- perhaps somebody can take a look at >> that. > Thanks for reporting. It should be being taken care of. And it should be fixed now. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23/07/2019 18:32, Roman Shaposhnik wrote: > Oh, and it seems that that https://downloads.xenproject.org/ SSL > certificate expired yesterday -- perhaps somebody can take a look at > that. Thanks for reporting. It should be being taken care of. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Jul 23, 2019 at 11:12 AM Andrew Cooper wrote: > > On 23/07/2019 18:58, Roman Shaposhnik wrote: > > On Tue, Jul 23, 2019 at 10:50 AM Andrew Cooper > wrote: > > On 23/07/2019 18:48, Roman Shaposhnik wrote: > > On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper > wrote: > > On 23/07/2019 18:32, Roman Shaposhnik wrote: > > Hi Roger! > > I applied your patch, removed no-igfx and I still see the original > problem. Please let me know what other logs/debugs would you need at > this point. > > Please can you collect a full boot log with iommu=debug > > How long of an output should I expect when iommu=debug is enabled? > I've just enabled it and I'm looking at what appears to be an endless > scroll of debug info. > > This is all I see for the good 5 minutes at this point. Culminating with: > (XEN) (XEN) APIC error on CPU0: 40(00) > > and a failure to boot. > > Note that this is still without no-igfx > > I'm attaching the tail end of this log. > > Sadly, what is useful is the head of the log, before it starts > complaining loudly about every DMA fault. > > No worries. Take a look at the head of the log attached. > > Btw, I'm kind of curious why iommu=debug would actually make it crash > > > The system is rather sickly, and is debugging at a rate slower than incoming > faults, which is going to starve whichever CPU is taking the IOMMU interrupt. > > I wouldn't worry about the APIC error now. Got it. Makes sense. > Curiously, there is one single intremap error on boot, which is likely > unrelated. > > (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu > reg = 82c0008e > > (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear > > > This will be irq0 from the IO-APIC. > > Can you try booting following the guidance from > > (XEN) [VT-D]found ACPI_DMAR_RMRR: > > (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved memory; > need "iommu_inclusive_mapping=1"? > > (XEN) [VT-D] endpoint: :00:02.0 > > > which I noted on my first reply? Given that Rogers patch didn't help, > something else is going on. Sorry -- missed that option the first time around. Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug booted the system just fine. There are no issues with screen. I am attaching a full log. Btw, my understanding is that this may point to a BIOS issue. Which would be fair conclusion, but I've got to wonder why Xen 4.11 didn't seem to be susceptible to this BIOS issue. Thanks, Roman. 0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968d018 0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89640018 0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x8962f018 Xen 4.12.0 (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n Tue Jul 23 17:15:48 UTC 2019 (XEN) Latest ChangeSet: (XEN) Bootloader: GRUB 2.03 (XEN) Command line: com1=115200,8n1 console=com1,vga iommu_inclusive_mapping=1 iommu=debug dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false (XEN) Xen image load base address: 0x8800 (XEN) Video information: (XEN) VGA is graphics mode 1680x1050, 32 bpp (XEN) Disc information: (XEN) Found 0 MBR signatures (XEN) Found 1 EDD information structures (XEN) EFI RAM map: (XEN) - 00058000 (usable) (XEN) 00058000 - 00059000 (reserved) (XEN) 00059000 - 0009f000 (usable) (XEN) 0009f000 - 000a (reserved) (XEN) 0010 - 8648a000 (usable) (XEN) 8648a000 - 8648b000 (ACPI NVS) (XEN) 8648b000 - 864b5000 (reserved) (XEN) 864b5000 - 8c224000 (usable) (XEN) 8c224000 - 8c528000 (reserved) (XEN) 8c528000 - 8c736000 (usable) (XEN) 8c736000 - 8cea7000 (ACPI NVS) (XEN) 8cea7000 - 8d2ff000 (reserved) (XEN) 8d2ff000 - 8d30 (usable) (XEN) 8d30 - 8d40 (reserved) (XEN) e000 - f000 (reserved) (XEN) fe00 - fe011000 (reserved) (XEN) fec0 - fec01000 (reserved) (XEN) fee0 - fee01000 (reserved) (XEN) ff00 - 0001 (reserved) (XEN) 0001 - 00016e00 (usable) (XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA) (XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA A M I 1072009 INTL 20120913) (XEN) ACPI: FACS 8CE8EF80, 0040 (XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FPDT 8CE6C508, 0044 (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FIDT 8CE6C550, 009C (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: MCFG 8CE6C5F0, 003C (r1 ALASKA A M I 1072009 MSFT 97) (XEN) ACPI: HPET 8CE6C630, 0038 (r1 ALASKA A M I 1072009 AMI.5000B) (XEN) ACPI: LP
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23/07/2019 18:58, Roman Shaposhnik wrote: > On Tue, Jul 23, 2019 at 10:50 AM Andrew Cooper > wrote: >> On 23/07/2019 18:48, Roman Shaposhnik wrote: >>> On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper >>> wrote: On 23/07/2019 18:32, Roman Shaposhnik wrote: > Hi Roger! > > I applied your patch, removed no-igfx and I still see the original > problem. Please let me know what other logs/debugs would you need at > this point. Please can you collect a full boot log with iommu=debug >>> How long of an output should I expect when iommu=debug is enabled? >>> I've just enabled it and I'm looking at what appears to be an endless >>> scroll of debug info. >>> >>> This is all I see for the good 5 minutes at this point. Culminating with: >>> (XEN) (XEN) APIC error on CPU0: 40(00) >>> >>> and a failure to boot. >>> >>> Note that this is still without no-igfx >>> >>> I'm attaching the tail end of this log. >> Sadly, what is useful is the head of the log, before it starts >> complaining loudly about every DMA fault. > No worries. Take a look at the head of the log attached. > > Btw, I'm kind of curious why iommu=debug would actually make it crash The system is rather sickly, and is debugging at a rate slower than incoming faults, which is going to starve whichever CPU is taking the IOMMU interrupt. I wouldn't worry about the APIC error now. Curiously, there is one single intremap error on boot, which is likely unrelated. (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg = 82c0008e (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear This will be irq0 from the IO-APIC. Can you try booting following the guidance from (XEN) [VT-D]found ACPI_DMAR_RMRR: (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved memory; need "iommu_inclusive_mapping=1"? (XEN) [VT-D] endpoint: :00:02.0 which I noted on my first reply? Given that Rogers patch didn't help, something else is going on. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Tue, Jul 23, 2019 at 10:50 AM Andrew Cooper wrote: > > On 23/07/2019 18:48, Roman Shaposhnik wrote: > > On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper > > wrote: > >> On 23/07/2019 18:32, Roman Shaposhnik wrote: > >>> Hi Roger! > >>> > >>> I applied your patch, removed no-igfx and I still see the original > >>> problem. Please let me know what other logs/debugs would you need at > >>> this point. > >> Please can you collect a full boot log with iommu=debug > > How long of an output should I expect when iommu=debug is enabled? > > I've just enabled it and I'm looking at what appears to be an endless > > scroll of debug info. > > > > This is all I see for the good 5 minutes at this point. Culminating with: > > (XEN) (XEN) APIC error on CPU0: 40(00) > > > > and a failure to boot. > > > > Note that this is still without no-igfx > > > > I'm attaching the tail end of this log. > > Sadly, what is useful is the head of the log, before it starts > complaining loudly about every DMA fault. No worries. Take a look at the head of the log attached. Btw, I'm kind of curious why iommu=debug would actually make it crash Thanks, Roman. 0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968d018 0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89640018 0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x8962f018 Xen 4.12.0 (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n Tue Jul 23 17:15:48 UTC 2019 (XEN) Latest ChangeSet: (XEN) Bootloader: GRUB 2.03 (XEN) Command line: com1=115200,8n1 console=com1,vga iommu=debug dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false (XEN) Xen image load base address: 0x8800 (XEN) Video information: (XEN) VGA is graphics mode 1680x1050, 32 bpp (XEN) Disc information: (XEN) Found 0 MBR signatures (XEN) Found 1 EDD information structures (XEN) EFI RAM map: (XEN) - 00058000 (usable) (XEN) 00058000 - 00059000 (reserved) (XEN) 00059000 - 0009f000 (usable) (XEN) 0009f000 - 000a (reserved) (XEN) 0010 - 8648a000 (usable) (XEN) 8648a000 - 8648b000 (ACPI NVS) (XEN) 8648b000 - 864b5000 (reserved) (XEN) 864b5000 - 8c224000 (usable) (XEN) 8c224000 - 8c528000 (reserved) (XEN) 8c528000 - 8c736000 (usable) (XEN) 8c736000 - 8cea7000 (ACPI NVS) (XEN) 8cea7000 - 8d2ff000 (reserved) (XEN) 8d2ff000 - 8d30 (usable) (XEN) 8d30 - 8d40 (reserved) (XEN) e000 - f000 (reserved) (XEN) fe00 - fe011000 (reserved) (XEN) fec0 - fec01000 (reserved) (XEN) fee0 - fee01000 (reserved) (XEN) ff00 - 0001 (reserved) (XEN) 0001 - 00016e00 (usable) (XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA) (XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA A M I 1072009 INTL 20120913) (XEN) ACPI: FACS 8CE8EF80, 0040 (XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FPDT 8CE6C508, 0044 (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: FIDT 8CE6C550, 009C (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: MCFG 8CE6C5F0, 003C (r1 ALASKA A M I 1072009 MSFT 97) (XEN) ACPI: HPET 8CE6C630, 0038 (r1 ALASKA A M I 1072009 AMI.5000B) (XEN) ACPI: LPIT 8CE6C668, 0094 (r1 INTEL SKL-ULT0 MSFT 5F) (XEN) ACPI: SSDT 8CE6C700, 0248 (r2 INTEL sensrhub0 INTL 20120913) (XEN) ACPI: SSDT 8CE6C948, 2BAE (r2 INTEL PtidDevc 1000 INTL 20120913) (XEN) ACPI: SSDT 8CE6F4F8, 0BE3 (r2 INTEL Ther_Rvp 1000 INTL 20120913) (XEN) ACPI: SSDT 8CE700E0, 04A3 (r2 INTEL zpodd 1000 INTL 20120913) (XEN) ACPI: DBGP 8CE70588, 0034 (r1 INTEL 0 MSFT 5F) (XEN) ACPI: DBG2 8CE705C0, 0054 (r0 INTEL 0 MSFT 5F) (XEN) ACPI: SSDT 8CE70618, 06E9 (r2 INTEL xh_rvp070 INTL 20120913) (XEN) ACPI: SSDT 8CE70D08, 547E (r2 SaSsdt SaSsdt 3000 INTL 20120913) (XEN) ACPI: UEFI 8CE76188, 0042 (r10 0) (XEN) ACPI: SSDT 8CE761D0, 0E73 (r2 CpuRef CpuSsdt 3000 INTL 20120913) (XEN) ACPI: BGRT 8CE77048, 0038 (r1 ALASKA A M I 1072009 AMI 10013) (XEN) ACPI: DMAR 8CE77080, 00A8 (r1 INTEL SKL 1 INTL1) (XEN) ACPI: TPM2 8CE77128, 0034 (r3Tpm2Tabl1 AMI 0) (XEN) ACPI: ASF! 8CE77160, 00A5 (r32 INTEL HCG1 TFSMF4240) (XEN) System RAM: 4003MB (4099736kB) (XEN) Domain heap initialised (XEN) ACPI: 32/64X FACS address mismatch in FADT - 8ce8ef80/, using 32 (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119 (XEN) Enabling APIC mode: Flat. Using 1 I/O
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23/07/2019 18:48, Roman Shaposhnik wrote: > On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper > wrote: >> On 23/07/2019 18:32, Roman Shaposhnik wrote: >>> Hi Roger! >>> >>> I applied your patch, removed no-igfx and I still see the original >>> problem. Please let me know what other logs/debugs would you need at >>> this point. >> Please can you collect a full boot log with iommu=debug > How long of an output should I expect when iommu=debug is enabled? > I've just enabled it and I'm looking at what appears to be an endless > scroll of debug info. > > This is all I see for the good 5 minutes at this point. Culminating with: > (XEN) (XEN) APIC error on CPU0: 40(00) > > and a failure to boot. > > Note that this is still without no-igfx > > I'm attaching the tail end of this log. Sadly, what is useful is the head of the log, before it starts complaining loudly about every DMA fault. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23/07/2019 18:32, Roman Shaposhnik wrote: > Hi Roger! > > I applied your patch, removed no-igfx and I still see the original > problem. Please let me know what other logs/debugs would you need at > this point. Please can you collect a full boot log with iommu=debug ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Hi Roger! I applied your patch, removed no-igfx and I still see the original problem. Please let me know what other logs/debugs would you need at this point. Btw, just to make it clear what patch got applied I'm attaching it to this email. Oh, and it seems that that https://downloads.xenproject.org/ SSL certificate expired yesterday -- perhaps somebody can take a look at that. Thanks, Roman. On Mon, Jul 22, 2019 at 4:47 PM Andrew Cooper wrote: > > On 23/07/2019 00:36, Roman Shaposhnik wrote: > > Hi Everyone! > > > > Thanks a million for an extremely quick turnaround. I am in my lab > > again next to the box in question. > > > > Should I go ahead and test the latest patch or wait for the official > > one to be submitted? > > > > Thanks, > > Roman. > > Use this patch to test with. Roger forgot to CC you on the official > one, but the code changes are identical. > > ~Andrew 01-iommu-mappings.patch Description: Binary data ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 23/07/2019 00:36, Roman Shaposhnik wrote: > Hi Everyone! > > Thanks a million for an extremely quick turnaround. I am in my lab > again next to the box in question. > > Should I go ahead and test the latest patch or wait for the official > one to be submitted? > > Thanks, > Roman. Use this patch to test with. Roger forgot to CC you on the official one, but the code changes are identical. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
Hi Everyone! Thanks a million for an extremely quick turnaround. I am in my lab again next to the box in question. Should I go ahead and test the latest patch or wait for the official one to be submitted? Thanks, Roman. On Mon, Jul 22, 2019 at 8:22 AM Roger Pau Monné wrote: > > On Mon, Jul 22, 2019 at 05:02:35PM +0200, Paul Durrant wrote: > > > -Original Message- > > > From: Roger Pau Monne > > > Sent: 22 July 2019 15:40 > > > To: Paul Durrant > > > Cc: 'Roman Shaposhnik' ; > > > xen-devel@lists.xenproject.org; jgr...@suse.com; Andrew > > > Cooper ; boris.ostrov...@oracle.com; > > > jbeul...@suse.com > > > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx > > > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > > > index fef97c82f6..88a2430c8c 100644 > > > --- a/xen/arch/x86/mm/p2m.c > > > +++ b/xen/arch/x86/mm/p2m.c > > > @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, > > > mfn_t mfn, > > > */ > > > for ( i = 0; i < (1UL << page_order); ++i, ++page ) > > > { > > > -if ( !need_iommu_pt_sync(d) ) > > > +if ( !has_iommu_pt(d) ) > > > /* nothing */; > > > else if ( get_page_and_type(page, d, PGT_writable_page) ) > > > put_page_and_type(page); > > > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, > > > unsigned long gfn_l, > > > > > > if ( !paging_mode_translate(p2m->domain) ) > > > { > > > -if ( !need_iommu_pt_sync(d) ) > > > +if ( !has_iommu_pt(d) ) > > > return 0; > > > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), > > > PAGE_ORDER_4K, > > > IOMMUF_readable | IOMMUF_writable); > > > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, > > > unsigned long gfn_l) > > > > > > if ( !paging_mode_translate(d) ) > > > { > > > -if ( !need_iommu_pt_sync(d) ) > > > +if ( !has_iommu_pt(d) ) > > > return 0; > > > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > > > } > > > > Yes, this all looks ok to me... although I still find it counterintuitive > > that we make p2m calls for PV domains. > > I agree, albeit I'm not sure of how to get rid of those, will need to > look at the callers. For iommu callers we could likely just call > iommu_legacy_{map/unmap} for PV. > > I'm going to formally submit this patch then. > > Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Mon, Jul 22, 2019 at 05:02:35PM +0200, Paul Durrant wrote: > > -Original Message- > > From: Roger Pau Monne > > Sent: 22 July 2019 15:40 > > To: Paul Durrant > > Cc: 'Roman Shaposhnik' ; xen-devel@lists.xenproject.org; > > jgr...@suse.com; Andrew > > Cooper ; boris.ostrov...@oracle.com; > > jbeul...@suse.com > > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx > > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > > index fef97c82f6..88a2430c8c 100644 > > --- a/xen/arch/x86/mm/p2m.c > > +++ b/xen/arch/x86/mm/p2m.c > > @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, > > mfn_t mfn, > > */ > > for ( i = 0; i < (1UL << page_order); ++i, ++page ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > +if ( !has_iommu_pt(d) ) > > /* nothing */; > > else if ( get_page_and_type(page, d, PGT_writable_page) ) > > put_page_and_type(page); > > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned > > long gfn_l, > > > > if ( !paging_mode_translate(p2m->domain) ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > +if ( !has_iommu_pt(d) ) > > return 0; > > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > > IOMMUF_readable | IOMMUF_writable); > > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, > > unsigned long gfn_l) > > > > if ( !paging_mode_translate(d) ) > > { > > -if ( !need_iommu_pt_sync(d) ) > > +if ( !has_iommu_pt(d) ) > > return 0; > > return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); > > } > > Yes, this all looks ok to me... although I still find it counterintuitive > that we make p2m calls for PV domains. I agree, albeit I'm not sure of how to get rid of those, will need to look at the callers. For iommu callers we could likely just call iommu_legacy_{map/unmap} for PV. I'm going to formally submit this patch then. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> -Original Message- > From: Roger Pau Monne > Sent: 22 July 2019 15:40 > To: Paul Durrant > Cc: 'Roman Shaposhnik' ; xen-devel@lists.xenproject.org; > jgr...@suse.com; Andrew > Cooper ; boris.ostrov...@oracle.com; > jbeul...@suse.com > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx > > On Mon, Jul 22, 2019 at 04:03:44PM +0200, Paul Durrant wrote: > > > -Original Message- > > [snip] > > > > > diff --git a/xen/drivers/passthrough/iommu.c > > > > > b/xen/drivers/passthrough/iommu.c > > > > > index 79ec6719f5..9d91f0d633 100644 > > > > > --- a/xen/drivers/passthrough/iommu.c > > > > > +++ b/xen/drivers/passthrough/iommu.c > > > > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain > > > > > *d) > > > > > register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m > > > > > table", 0); > > > > > > > > > > hd->status = IOMMU_STATUS_initializing; > > > > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); > > > > > +hd->need_sync = !iommu_use_hap_pt(d); > > > > > > > > But this is going to mean the if() below is true for non-strict dom0, > > > > which means it pointlessly > > > maps the dom0 page list when hwdom_iommu_map() should have already mapped > > > all conventional RAM. > > > > > > Right, this all seems quite broken. Non-translated guests (ie: PV) > > > would always need iommu page-table sync, but set_identity_p2m_entry > > > contains the following: > > > > > > if ( !paging_mode_translate(p2m->domain) ) > > > { > > > if ( !need_iommu_pt_sync(d) ) > > > return 0; > > > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > > > IOMMUF_readable | IOMMUF_writable); > > > } > > > > > > I wonder whether the usage of need_iommu_pt_sync is wrong there, and > > > should instead be !has_iommu_pt(d), since non-translated domains would > > > never share the iommu page-tables anyway. > > > > You want syncing if the domain has IOMMU page tables and shared EPT is not > > in use, so this logic > just seems wrong. > > Right, so for a PV domain this would mean syncing if the iommu is in > used, because there's no sharing at all. > > > > > > > In any case, I think need_sync must be set when the iommu page tables > > > are not shared, and gating it on iommu_hwdom_strict seems wrong to me, > > > the strictness of the iommu doesn't affect whether a sync is need or > > > not. > > > > I think need_sync is gated on strict mode because, in 'relaxed' mode, the > > mappings that are set up > when dom0 is started are supposed to be static (modulo hotplug RAM)... so > modifications to the > domain's page list are not supposed to have any effect, and so no > synchronization need be done. > > Hm, right, in relaxed mode dom0 is supposed to have all RAM regions > mapped in the iommu page-tables, so there's no need to modify the > iommu page tables at run time. > > This approach seems slightly broken if dma operations agains mmio > regions are attempted by dom0, but anyway, this seems to have worked > fine so far. > > > > > > > I've updated the patch to avoid the pointless mapping of dom0 page > > > list, but I haven't included the change to set_identity_p2m_entry. > > > > > > > So, I think the albeit odd looking logic in iommu_hwdom_init() was actually > > correct, but the code in > set_identity_p2m_entry() is wrong. > > Ack, see the patch below. I think it might also be helpful to add a > comment to the setting of need_sync in iommu_hwdom_init in order to > mention that relaxed domains don't need sync because all ram is > already mapped in the iommu page tables. Agreed. > > Thanks, Roger. > ---8<--- > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c > index fef97c82f6..88a2430c8c 100644 > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, mfn_t > mfn, > */ > for ( i = 0; i < (1UL << page_order); ++i, ++page ) > { > -if ( !need_iommu_pt_sync(d) ) > +if ( !has_iommu_pt(d) ) > /* nothing */; >
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Mon, Jul 22, 2019 at 04:03:44PM +0200, Paul Durrant wrote: > > -Original Message- > [snip] > > > > diff --git a/xen/drivers/passthrough/iommu.c > > > > b/xen/drivers/passthrough/iommu.c > > > > index 79ec6719f5..9d91f0d633 100644 > > > > --- a/xen/drivers/passthrough/iommu.c > > > > +++ b/xen/drivers/passthrough/iommu.c > > > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > > > > register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m > > > > table", 0); > > > > > > > > hd->status = IOMMU_STATUS_initializing; > > > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); > > > > +hd->need_sync = !iommu_use_hap_pt(d); > > > > > > But this is going to mean the if() below is true for non-strict dom0, > > > which means it pointlessly > > maps the dom0 page list when hwdom_iommu_map() should have already mapped > > all conventional RAM. > > > > Right, this all seems quite broken. Non-translated guests (ie: PV) > > would always need iommu page-table sync, but set_identity_p2m_entry > > contains the following: > > > > if ( !paging_mode_translate(p2m->domain) ) > > { > > if ( !need_iommu_pt_sync(d) ) > > return 0; > > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > > IOMMUF_readable | IOMMUF_writable); > > } > > > > I wonder whether the usage of need_iommu_pt_sync is wrong there, and > > should instead be !has_iommu_pt(d), since non-translated domains would > > never share the iommu page-tables anyway. > > You want syncing if the domain has IOMMU page tables and shared EPT is not in > use, so this logic just seems wrong. Right, so for a PV domain this would mean syncing if the iommu is in used, because there's no sharing at all. > > > > In any case, I think need_sync must be set when the iommu page tables > > are not shared, and gating it on iommu_hwdom_strict seems wrong to me, > > the strictness of the iommu doesn't affect whether a sync is need or > > not. > > I think need_sync is gated on strict mode because, in 'relaxed' mode, the > mappings that are set up when dom0 is started are supposed to be static > (modulo hotplug RAM)... so modifications to the domain's page list are not > supposed to have any effect, and so no synchronization need be done. Hm, right, in relaxed mode dom0 is supposed to have all RAM regions mapped in the iommu page-tables, so there's no need to modify the iommu page tables at run time. This approach seems slightly broken if dma operations agains mmio regions are attempted by dom0, but anyway, this seems to have worked fine so far. > > > > I've updated the patch to avoid the pointless mapping of dom0 page > > list, but I haven't included the change to set_identity_p2m_entry. > > > > So, I think the albeit odd looking logic in iommu_hwdom_init() was actually > correct, but the code in set_identity_p2m_entry() is wrong. Ack, see the patch below. I think it might also be helpful to add a comment to the setting of need_sync in iommu_hwdom_init in order to mention that relaxed domains don't need sync because all ram is already mapped in the iommu page tables. Thanks, Roger. ---8<--- diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index fef97c82f6..88a2430c8c 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, mfn_t mfn, */ for ( i = 0; i < (1UL << page_order); ++i, ++page ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) /* nothing */; else if ( get_page_and_type(page, d, PGT_writable_page) ) put_page_and_type(page); @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l, if ( !paging_mode_translate(p2m->domain) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l) if ( !paging_mode_translate(d) ) { -if ( !need_iommu_pt_sync(d) ) +if ( !has_iommu_pt(d) ) return 0; return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K); } ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> -Original Message- [snip] > > > diff --git a/xen/drivers/passthrough/iommu.c > > > b/xen/drivers/passthrough/iommu.c > > > index 79ec6719f5..9d91f0d633 100644 > > > --- a/xen/drivers/passthrough/iommu.c > > > +++ b/xen/drivers/passthrough/iommu.c > > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > > > register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m > > > table", 0); > > > > > > hd->status = IOMMU_STATUS_initializing; > > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); > > > +hd->need_sync = !iommu_use_hap_pt(d); > > > > But this is going to mean the if() below is true for non-strict dom0, which > > means it pointlessly > maps the dom0 page list when hwdom_iommu_map() should have already mapped all > conventional RAM. > > Right, this all seems quite broken. Non-translated guests (ie: PV) > would always need iommu page-table sync, but set_identity_p2m_entry > contains the following: > > if ( !paging_mode_translate(p2m->domain) ) > { > if ( !need_iommu_pt_sync(d) ) > return 0; > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, > IOMMUF_readable | IOMMUF_writable); > } > > I wonder whether the usage of need_iommu_pt_sync is wrong there, and > should instead be !has_iommu_pt(d), since non-translated domains would > never share the iommu page-tables anyway. You want syncing if the domain has IOMMU page tables and shared EPT is not in use, so this logic just seems wrong. > > In any case, I think need_sync must be set when the iommu page tables > are not shared, and gating it on iommu_hwdom_strict seems wrong to me, > the strictness of the iommu doesn't affect whether a sync is need or > not. I think need_sync is gated on strict mode because, in 'relaxed' mode, the mappings that are set up when dom0 is started are supposed to be static (modulo hotplug RAM)... so modifications to the domain's page list are not supposed to have any effect, and so no synchronization need be done. > > I've updated the patch to avoid the pointless mapping of dom0 page > list, but I haven't included the change to set_identity_p2m_entry. > So, I think the albeit odd looking logic in iommu_hwdom_init() was actually correct, but the code in set_identity_p2m_entry() is wrong. Paul > Thanks, Roger. > ---8<--- > diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c > index 79ec6719f5..abf9e38607 100644 > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -185,8 +185,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", > 0); > > hd->status = IOMMU_STATUS_initializing; > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); > -if ( need_iommu_pt_sync(d) ) > +hd->need_sync = !iommu_use_hap_pt(d); > +if ( iommu_hwdom_strict && need_iommu_pt_sync(d) ) > { > struct page_info *page; > unsigned int i = 0, flush_flags = 0; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Mon, Jul 22, 2019 at 01:54:00PM +0200, Paul Durrant wrote: > > -Original Message- > > From: Roger Pau Monne > > Sent: 22 July 2019 12:49 > > To: Paul Durrant ; 'Roman Shaposhnik' > > > > Cc: xen-devel@lists.xenproject.org; jgr...@suse.com; Andrew Cooper > > ; > > boris.ostrov...@oracle.com; jbeul...@suse.com > > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx > > > > On Mon, Jul 22, 2019 at 08:20:36AM +, Paul Durrant wrote: > > > > -Original Message- > > > [snip] > > > > > (XEN) Domain heap initialised > > > > > (XEN) ACPI: 32/64X FACS address mismatch in FADT - > > > > > 8ce8ef80/, using 32 > > > > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119 > > > > > (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs > > > > > (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved > > > > > memory; need "iommu_inclusive_mapping=1"? > > > > > > This is your problem. In versions prior to 4.11 (I think, and certainly > > > 4.12) > > iommu_inclusive_mapping used to default on, whereas now it appears to > > default off. In most > > circumstances this is fine because there is a new flag, > > iommu_hwdom_reserved, which defaults on and > > this makes sure that all e820 reserved regions are identity mapped (which > > usually covers undeclared > > RMRRs). You have the opposite problem... a declared RMRR which is not > > reserved, so you will need > > iommu_inclusive_mapping. > > > > I think there's a bug in the initialization of iommu for a PV dom0, > > which leaves dom0 without rmrr entries. Can you try the patch below > > and report whether it solves your issue? > > > > Thanks, Roger. > > ---8<--- > > diff --git a/xen/drivers/passthrough/iommu.c > > b/xen/drivers/passthrough/iommu.c > > index 79ec6719f5..9d91f0d633 100644 > > --- a/xen/drivers/passthrough/iommu.c > > +++ b/xen/drivers/passthrough/iommu.c > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > > register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m > > table", 0); > > > > hd->status = IOMMU_STATUS_initializing; > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); > > +hd->need_sync = !iommu_use_hap_pt(d); > > But this is going to mean the if() below is true for non-strict dom0, which > means it pointlessly maps the dom0 page list when hwdom_iommu_map() should > have already mapped all conventional RAM. Right, this all seems quite broken. Non-translated guests (ie: PV) would always need iommu page-table sync, but set_identity_p2m_entry contains the following: if ( !paging_mode_translate(p2m->domain) ) { if ( !need_iommu_pt_sync(d) ) return 0; return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K, IOMMUF_readable | IOMMUF_writable); } I wonder whether the usage of need_iommu_pt_sync is wrong there, and should instead be !has_iommu_pt(d), since non-translated domains would never share the iommu page-tables anyway. In any case, I think need_sync must be set when the iommu page tables are not shared, and gating it on iommu_hwdom_strict seems wrong to me, the strictness of the iommu doesn't affect whether a sync is need or not. I've updated the patch to avoid the pointless mapping of dom0 page list, but I haven't included the change to set_identity_p2m_entry. Thanks, Roger. ---8<--- diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index 79ec6719f5..abf9e38607 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -185,8 +185,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0); hd->status = IOMMU_STATUS_initializing; -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); -if ( need_iommu_pt_sync(d) ) +hd->need_sync = !iommu_use_hap_pt(d); +if ( iommu_hwdom_strict && need_iommu_pt_sync(d) ) { struct page_info *page; unsigned int i = 0, flush_flags = 0; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> -Original Message- > From: Roger Pau Monne > Sent: 22 July 2019 12:49 > To: Paul Durrant ; 'Roman Shaposhnik' > > Cc: xen-devel@lists.xenproject.org; jgr...@suse.com; Andrew Cooper > ; > boris.ostrov...@oracle.com; jbeul...@suse.com > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx > > On Mon, Jul 22, 2019 at 08:20:36AM +, Paul Durrant wrote: > > > -Original Message- > > [snip] > > > > (XEN) Domain heap initialised > > > > (XEN) ACPI: 32/64X FACS address mismatch in FADT - > > > > 8ce8ef80/, using 32 > > > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119 > > > > (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs > > > > (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved > > > > memory; need "iommu_inclusive_mapping=1"? > > > > This is your problem. In versions prior to 4.11 (I think, and certainly > > 4.12) > iommu_inclusive_mapping used to default on, whereas now it appears to default > off. In most > circumstances this is fine because there is a new flag, iommu_hwdom_reserved, > which defaults on and > this makes sure that all e820 reserved regions are identity mapped (which > usually covers undeclared > RMRRs). You have the opposite problem... a declared RMRR which is not > reserved, so you will need > iommu_inclusive_mapping. > > I think there's a bug in the initialization of iommu for a PV dom0, > which leaves dom0 without rmrr entries. Can you try the patch below > and report whether it solves your issue? > > Thanks, Roger. > ---8<--- > diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c > index 79ec6719f5..9d91f0d633 100644 > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", > 0); > > hd->status = IOMMU_STATUS_initializing; > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); > +hd->need_sync = !iommu_use_hap_pt(d); But this is going to mean the if() below is true for non-strict dom0, which means it pointlessly maps the dom0 page list when hwdom_iommu_map() should have already mapped all conventional RAM. Paul > if ( need_iommu_pt_sync(d) ) > { > struct page_info *page; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On Mon, Jul 22, 2019 at 08:20:36AM +, Paul Durrant wrote: > > -Original Message- > [snip] > > > (XEN) Domain heap initialised > > > (XEN) ACPI: 32/64X FACS address mismatch in FADT - > > > 8ce8ef80/, using 32 > > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119 > > > (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs > > > (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved > > > memory; need "iommu_inclusive_mapping=1"? > > This is your problem. In versions prior to 4.11 (I think, and certainly 4.12) > iommu_inclusive_mapping used to default on, whereas now it appears to default > off. In most circumstances this is fine because there is a new flag, > iommu_hwdom_reserved, which defaults on and this makes sure that all e820 > reserved regions are identity mapped (which usually covers undeclared RMRRs). > You have the opposite problem... a declared RMRR which is not reserved, so > you will need iommu_inclusive_mapping. I think there's a bug in the initialization of iommu for a PV dom0, which leaves dom0 without rmrr entries. Can you try the patch below and report whether it solves your issue? Thanks, Roger. ---8<--- diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index 79ec6719f5..9d91f0d633 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0); hd->status = IOMMU_STATUS_initializing; -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d); +hd->need_sync = !iommu_use_hap_pt(d); if ( need_iommu_pt_sync(d) ) { struct page_info *page; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> -Original Message- [snip] > > (XEN) Domain heap initialised > > (XEN) ACPI: 32/64X FACS address mismatch in FADT - > > 8ce8ef80/, using 32 > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119 > > (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs > > (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved > > memory; need "iommu_inclusive_mapping=1"? This is your problem. In versions prior to 4.11 (I think, and certainly 4.12) iommu_inclusive_mapping used to default on, whereas now it appears to default off. In most circumstances this is fine because there is a new flag, iommu_hwdom_reserved, which defaults on and this makes sure that all e820 reserved regions are identity mapped (which usually covers undeclared RMRRs). You have the opposite problem... a declared RMRR which is not reserved, so you will need iommu_inclusive_mapping. Paul ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
De-htmling... - [snip] For RMRRs themselves, system firmware is well known for abiding by the spec [citation needed], but an RMRR must be honoured, because the entire purpose of them is to state "this device won't function without access to this area". An RMRR in a hole, while a violation of the spec, is obviously fine to use in practice, so we should just accept it and stop complaining. OTOH, RMRRs which hit other memory (particularly RAM) need more care, and probably want to force override the e820 to reserved. Nothing good will come from trusting the e820 over the DMAR table here, seeing as there is clearly an error somewhere in the firmware-provided information. However - I'm struggling to locate anywhere which actually walks dom0's RMRR list and inserts them into the IOMMU. Anyone got any hints? - I think you want setup_hwdom_rmrr() in passthrough/vtd/iommu.c Paul ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
On 19/07/2019 20:31, Roman Shaposhnik wrote: > Hi! > > we're using Xen on Advantech ARK-2250 Embedded Box PC: > > https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf > > After upgrading to Xen 4.12.0 from 4.11.0 we now have to utilize > iommu=no-igfx > workaround as per: > https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#iommu > > Without the workaround the screen appears to be garbled with colored > static noise and the following message keeps showing up: > (XEN) printk: 26235 messages suppressed. > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > 8e43c000, iommu reg = 82c00021d000 > (XEN) printk: 26303 messages suppressed. > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > 8e2c6000, iommu reg = 82c00021d000 > > Once iommu=no-igfx is applied the box can boot fine. > > At the end of this email, you can see a full log of the box booting > all the way into Dom0 with iommu=no-igfx applied. I am also attaching > similar log without no-igfx > > Please let me know if you need any more information to help us diagnose this. This will be a consequence of trying to remove various pieces of stupidity with the preexisting IOMMU logic, in an attempt to unify the PV and PVH paths. As for the symptoms you're seeing, that is because the GPU is not being given access to the RAM stolen for graphics purposes. Picking the log apart: (XEN) EFI RAM map: (XEN) - 00058000 (usable) (XEN) 00058000 - 00059000 (reserved) (XEN) 00059000 - 0009f000 (usable) (XEN) 0009f000 - 000a (reserved) (XEN) 0010 - 8648a000 (usable) (XEN) 8648a000 - 8648b000 (ACPI NVS) (XEN) 8648b000 - 864b5000 (reserved) (XEN) 864b5000 - 8c224000 (usable) (XEN) 8c224000 - 8c528000 (reserved) (XEN) 8c528000 - 8c736000 (usable) (XEN) 8c736000 - 8cea7000 (ACPI NVS) (XEN) 8cea7000 - 8d2ff000 (reserved) (XEN) 8d2ff000 - 8d30 (usable) (XEN) 8d30 - 8d40 (reserved) (XEN) e000 - f000 (reserved) (XEN) fe00 - fe011000 (reserved) (XEN) fec0 - fec01000 (reserved) (XEN) fee0 - fee01000 (reserved) (XEN) ff00 - 0001 (reserved) (XEN) 0001 - 00016e00 (usable) ... (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs (XEN) [VT-D] RMRR address range 8d80..8fff not in reserved memory; need "iommu_inclusive_mapping=1"? (XEN) Switched to APIC driver x2apic_cluster. ... (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr 8e48, iommu reg = 82c00021d000 (XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg = 82c00021f000 (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear The RMRR identified is a hole in the e820, and the range which is causing IOMMU faults. Clearly it isn't being set up correctly. First of all, can you check what effect booting with iommu_inclusive_mapping=1 has please? While at it, iommu=debug would also be helpful. Back to the log. Strictly speaking, this is a violation of the VT-d spec. Section 8.4 Reserved Memory Region Reporting Structure says: "BIOS must report the RMRR reported memory addresses as reserved (or as EFI runtime) in the system memory map returned through methods such as INT15, EFI GetMemoryMap etc." However, Xen's logic here is very broken, and in need of fixing. For that message, it only checks the first and last address for being reserved, not the entire region, which will give it plenty of false negatives. For RMRRs themselves, system firmware is well known for abiding by the spec [citation needed], but an RMRR must be honoured, because the entire purpose of them is to state "this device won't function without access to this area". An RMRR in a hole, while a violation of the spec, is obviously fine to use in practice, so we should just accept it and stop complaining. OTOH, RMRRs which hit other memory (particularly RAM) need more care, and probably want to force override the e820 to reserved. Nothing good will come from trusting the e820 over the DMAR table here, seeing as there is clearly an error somewhere in the firmware-provided information. However - I'm struggling to locate anywhere which actually walks dom0's RMRR list and inserts them into the IOMMU. Anyone got any hints? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
CCing relevant maintainers as well (sorry for not doing it first time around) On Fri, Jul 19, 2019 at 12:31 PM Roman Shaposhnik wrote: > > Hi! > > we're using Xen on Advantech ARK-2250 Embedded Box PC: > > https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf > > After upgrading to Xen 4.12.0 from 4.11.0 we now have to utilize > iommu=no-igfx > workaround as per: > https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#iommu > > Without the workaround the screen appears to be garbled with colored > static noise and the following message keeps showing up: > (XEN) printk: 26235 messages suppressed. > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > 8e43c000, iommu reg = 82c00021d000 > (XEN) printk: 26303 messages suppressed. > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr > 8e2c6000, iommu reg = 82c00021d000 > > Once iommu=no-igfx is applied the box can boot fine. > > At the end of this email, you can see a full log of the box booting > all the way into Dom0 with iommu=no-igfx applied. I am also attaching > similar log without no-igfx > > Please let me know if you need any more information to help us diagnose this. > > Thanks, > Roman. > > 0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968d018 > 0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89640018 > 0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x8962f018 > Xen 4.12.0 > (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n Mon > Jun 17 18:50:07 UTC 2019 > (XEN) Latest ChangeSet: > (XEN) Bootloader: GRUB 2.03 > (XEN) Command line: iommu=no-igfx com1=115200,8n1 console=com1,vga > dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false > (XEN) Xen image load base address: 0x8860 > (XEN) Video information: > (XEN) VGA is graphics mode 1680x1050, 32 bpp > (XEN) Disc information: > (XEN) Found 0 MBR signatures > (XEN) Found 1 EDD information structures > (XEN) EFI RAM map: > (XEN) - 00058000 (usable) > (XEN) 00058000 - 00059000 (reserved) > (XEN) 00059000 - 0009f000 (usable) > (XEN) 0009f000 - 000a (reserved) > (XEN) 0010 - 8648a000 (usable) > (XEN) 8648a000 - 8648b000 (ACPI NVS) > (XEN) 8648b000 - 864b5000 (reserved) > (XEN) 864b5000 - 8c224000 (usable) > (XEN) 8c224000 - 8c528000 (reserved) > (XEN) 8c528000 - 8c736000 (usable) > (XEN) 8c736000 - 8cea7000 (ACPI NVS) > (XEN) 8cea7000 - 8d2ff000 (reserved) > (XEN) 8d2ff000 - 8d30 (usable) > (XEN) 8d30 - 8d40 (reserved) > (XEN) e000 - f000 (reserved) > (XEN) fe00 - fe011000 (reserved) > (XEN) fec0 - fec01000 (reserved) > (XEN) fee0 - fee01000 (reserved) > (XEN) ff00 - 0001 (reserved) > (XEN) 0001 - 00016e00 (usable) > (XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA) > (XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA A M I 1072009 AMI 10013) > (XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA A M I 1072009 AMI 10013) > (XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA A M I 1072009 INTL 20120913) > (XEN) ACPI: FACS 8CE8EF80, 0040 > (XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA A M I 1072009 AMI 10013) > (XEN) ACPI: FPDT 8CE6C508, 0044 (r1 ALASKA A M I 1072009 AMI 10013) > (XEN) ACPI: FIDT 8CE6C550, 009C (r1 ALASKA A M I 1072009 AMI 10013) > (XEN) ACPI: MCFG 8CE6C5F0, 003C (r1 ALASKA A M I 1072009 MSFT 97) > (XEN) ACPI: HPET 8CE6C630, 0038 (r1 ALASKA A M I 1072009 AMI.5000B) > (XEN) ACPI: LPIT 8CE6C668, 0094 (r1 INTEL SKL-ULT0 MSFT 5F) > (XEN) ACPI: SSDT 8CE6C700, 0248 (r2 INTEL sensrhub0 INTL 20120913) > (XEN) ACPI: SSDT 8CE6C948, 2BAE (r2 INTEL PtidDevc 1000 INTL 20120913) > (XEN) ACPI: SSDT 8CE6F4F8, 0BE3 (r2 INTEL Ther_Rvp 1000 INTL 20120913) > (XEN) ACPI: SSDT 8CE700E0, 04A3 (r2 INTEL zpodd 1000 INTL 20120913) > (XEN) ACPI: DBGP 8CE70588, 0034 (r1 INTEL 0 MSFT 5F) > (XEN) ACPI: DBG2 8CE705C0, 0054 (r0 INTEL 0 MSFT 5F) > (XEN) ACPI: SSDT 8CE70618, 06E9 (r2 INTEL xh_rvp070 INTL 20120913) > (XEN) ACPI: SSDT 8CE70D08, 547E (r2 SaSsdt SaSsdt 3000 INTL 20120913) > (XEN) ACPI: UEFI 8CE76188, 0042 (r10 0) > (XEN) ACPI: SSDT 8CE761D0, 0E73 (r2 CpuRef CpuSsdt 3000 INTL 20120913) > (XEN) ACPI: BGRT 8CE77048, 0038 (r1 ALASKA A M I 1072009 AMI 10013) > (XEN) ACPI: DMAR 8CE77080, 00A8 (r1 INTEL SKL 1 INTL1) > (XEN) ACPI: TPM2 8CE77128, 0034 (r3Tpm2Tabl1 AMI 0) > (XEN) ACPI: ASF! 8CE77160, 00A5 (r32 INTEL HCG1 TFSMF4240) > (XE