Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-19 Thread Roman Shaposhnik
On Mon, Aug 19, 2019 at 1:16 AM Roger Pau Monné  wrote:
>
> On Sun, Aug 18, 2019 at 10:00:17PM -0700, Roman Shaposhnik wrote:
> > Hi Roger!
> >
> > Some good news, some bad news ;-)
> >
> > Good news is that on the newer BIOS, your original patch seems to work fine.
> >
> > IOW, with newer BIOS:
> > 1. without your original patch I see garbled screen
> > 2. with your original patch everything boots normally.
>
> That would be my expectation.
>
> > Bad news is that with older BIOS, your original patch seems to either
> > work or not depending on the BIOS some of the BIOS settings.
> >
> > IOW, with older BIOS:
> > 1. without your original patch I see garbled screen (regardless of
> > BIOS settings)
> > 2. with your original patch AND resetting to a factory defaults of
> > BIOS settings -- everything boots normally
> > 3. when I load up our custom settings -- the only thing that can
> > boot normally is the latest patch
> >
> > So, would it make sense and commit your original patch for now? This
> > will unlock me with newer BIOSes on these boxes.
>
> I think so, can you please provide a tested-by to the patch:
>
> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html

Done.

> Feel free to also note the weird behaviour you are seeing with this
> box and firmware version.
>
> > As for the older BIOSes, I still feel that it would be great for Xen
> > to boot more reliably -- especially since Xen 4.10 seems to be doing
> > fine regardless of BIOS version and settings.
> >
> > What do you think?
>
> We could add a quirk for this fimrware version and hardware if we know
> exactly what the issue is. Have you figured out if it's related to the
> flush? ie: iommu_iotlb_flush vs iommu_iotlb_flush_all vs iommu_flush_all?

I am actually at Open Source Summit in San Diego right now -- away from
my lab. I'll try to capture as much data on Fri as possible when I come back.

Please stay tuned.

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-19 Thread Roger Pau Monné
On Sun, Aug 18, 2019 at 10:00:17PM -0700, Roman Shaposhnik wrote:
> Hi Roger!
> 
> Some good news, some bad news ;-)
> 
> Good news is that on the newer BIOS, your original patch seems to work fine.
> 
> IOW, with newer BIOS:
> 1. without your original patch I see garbled screen
> 2. with your original patch everything boots normally.

That would be my expectation.

> Bad news is that with older BIOS, your original patch seems to either
> work or not depending on the BIOS some of the BIOS settings.
> 
> IOW, with older BIOS:
> 1. without your original patch I see garbled screen (regardless of
> BIOS settings)
> 2. with your original patch AND resetting to a factory defaults of
> BIOS settings -- everything boots normally
> 3. when I load up our custom settings -- the only thing that can
> boot normally is the latest patch
> 
> So, would it make sense and commit your original patch for now? This
> will unlock me with newer BIOSes on these boxes.

I think so, can you please provide a tested-by to the patch:

https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html

Feel free to also note the weird behaviour you are seeing with this
box and firmware version.

> As for the older BIOSes, I still feel that it would be great for Xen
> to boot more reliably -- especially since Xen 4.10 seems to be doing
> fine regardless of BIOS version and settings.
> 
> What do you think?

We could add a quirk for this fimrware version and hardware if we know
exactly what the issue is. Have you figured out if it's related to the
flush? ie: iommu_iotlb_flush vs iommu_iotlb_flush_all vs iommu_flush_all?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-18 Thread Roman Shaposhnik
Hi Roger!

Some good news, some bad news ;-)

Good news is that on the newer BIOS, your original patch seems to work fine.

IOW, with newer BIOS:
1. without your original patch I see garbled screen
2. with your original patch everything boots normally.

Bad news is that with older BIOS, your original patch seems to either
work or not depending on the BIOS some of the BIOS settings.

IOW, with older BIOS:
1. without your original patch I see garbled screen (regardless of
BIOS settings)
2. with your original patch AND resetting to a factory defaults of
BIOS settings -- everything boots normally
3. when I load up our custom settings -- the only thing that can
boot normally is the latest patch

So, would it make sense and commit your original patch for now? This
will unlock me with newer BIOSes on these boxes.

As for the older BIOSes, I still feel that it would be great for Xen
to boot more reliably -- especially since Xen 4.10 seems to be doing
fine regardless of BIOS version and settings.

What do you think?

Btw, just to be clear -- when I say your original patch here's what I mean:

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..3605614aaf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d,
unsigned long gfn_l,

 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d,
unsigned long gfn_l)

 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }

Thanks,
Roman.

On Wed, Aug 14, 2019 at 1:06 AM Roger Pau Monné  wrote:
>
> On Tue, Aug 13, 2019 at 12:24:32PM -0700, Roman Shaposhnik wrote:
> > Hi Roger,
> >
> > sorry for the delay -- I hope you will understand that I actually had
> > a good reason. See below:
>
> No problem, just wanted to make sure this doesn't fell through the
> cracks.
>
> > On Mon, Aug 12, 2019 at 1:57 AM Roger Pau Monné  
> > wrote:
> > >
> > > Ping?
> > >
> > > I know I've posted this quite recently, but have you been able to test
> > > the two proposed patches?
> > >
> > > ie: the one here and:
> > >
> > > https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html
> > >
> > > I would like to figure out exactly what's going on and fix this
> > > properly.
> >
> > Turns out this may actually be related to the BIOS version on these
> > boxes. I have
> > one with the BIOS build from 06/07/2018 and the other one is from 2017. So 
> > with
> > 2 of your proposed patches -- we now have a 2x2 test matrix. The awful
> > part seems
> > to be that the behavior looks to be slightly different.
> >
> > I need an extra day to summarize it all and I'll follow up quickly. It
> > just so far I lost
> > time trying to figure out while the same build would behave
> > differently on different
> > boxes only to find out that the BIOS is different (and the really
> > awful part is that
> > they both report as version 5.0.1.1 -- it is only when you look at the 
> > timestamp
> > of the build -- that's where you see them being different :-( ).
>
> Oh, that's awful. I hope you can get something clear out of this.
>
> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-14 Thread Roger Pau Monné
On Tue, Aug 13, 2019 at 12:24:32PM -0700, Roman Shaposhnik wrote:
> Hi Roger,
> 
> sorry for the delay -- I hope you will understand that I actually had
> a good reason. See below:

No problem, just wanted to make sure this doesn't fell through the
cracks.

> On Mon, Aug 12, 2019 at 1:57 AM Roger Pau Monné  wrote:
> >
> > Ping?
> >
> > I know I've posted this quite recently, but have you been able to test
> > the two proposed patches?
> >
> > ie: the one here and:
> >
> > https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html
> >
> > I would like to figure out exactly what's going on and fix this
> > properly.
> 
> Turns out this may actually be related to the BIOS version on these
> boxes. I have
> one with the BIOS build from 06/07/2018 and the other one is from 2017. So 
> with
> 2 of your proposed patches -- we now have a 2x2 test matrix. The awful
> part seems
> to be that the behavior looks to be slightly different.
> 
> I need an extra day to summarize it all and I'll follow up quickly. It
> just so far I lost
> time trying to figure out while the same build would behave
> differently on different
> boxes only to find out that the BIOS is different (and the really
> awful part is that
> they both report as version 5.0.1.1 -- it is only when you look at the 
> timestamp
> of the build -- that's where you see them being different :-( ).

Oh, that's awful. I hope you can get something clear out of this.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-13 Thread Roman Shaposhnik
Hi Roger,

sorry for the delay -- I hope you will understand that I actually had
a good reason. See below:

On Mon, Aug 12, 2019 at 1:57 AM Roger Pau Monné  wrote:
>
> Ping?
>
> I know I've posted this quite recently, but have you been able to test
> the two proposed patches?
>
> ie: the one here and:
>
> https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html
>
> I would like to figure out exactly what's going on and fix this
> properly.

Turns out this may actually be related to the BIOS version on these
boxes. I have
one with the BIOS build from 06/07/2018 and the other one is from 2017. So with
2 of your proposed patches -- we now have a 2x2 test matrix. The awful
part seems
to be that the behavior looks to be slightly different.

I need an extra day to summarize it all and I'll follow up quickly. It
just so far I lost
time trying to figure out while the same build would behave
differently on different
boxes only to find out that the BIOS is different (and the really
awful part is that
they both report as version 5.0.1.1 -- it is only when you look at the timestamp
of the build -- that's where you see them being different :-( ).

Thanks,
Roman.

> On Wed, Aug 07, 2019 at 09:35:34AM +0200, Roger Pau Monné wrote:
> > On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote:
> > > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  
> > > wrote:
> > > >
> > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> > > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > > > > > This patch completely fixes the problem for me!
> > > > > >
> > > > > > Thanks Roger! I'd love to see this in Xen 4.13
> > > > >
> > > > > Thanks for testing!
> > > > >
> > > > > It's still not clear to me why the previous approach didn't work, but
> > > > > I think this patch is better because it removes the usage of
> > > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this
> > > > > formally now.
> > > >
> > > > Sorry to bother again, but since we still don't understand why the
> > > > previous fix didn't work for you, and I can't reproduce this with my
> > > > hardware, could you give the attached patch a try?
> > >
> > > No worries -- and thanks for helping to get it over the finish line --
> > > this is much appreciated!
> > >
> > > I'm happy to say that this latest patch is also working just fine. So
> > > I guess this is the one that's going to land in Xen 4.13?
> >
> > No, not really, sorry this was still a debug patch.
> >
> > So I think the behaviour you are seeing can only be explained if the
> > IOMMU is already enabled by the firmware when booting into Xen, can
> > this be the case?
> >
> > I have a patch I would like you to try to confirm this, can you please
> > give it a spin and report back (ideally with the Xen boot log).
> >
> > Thanks, Roger.
> > ---8<---
> > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> > index fef97c82f6..3605614aaf 100644
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> > long gfn_l,
> >
> >  if ( !paging_mode_translate(p2m->domain) )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > +if ( !has_iommu_pt(d) )
> >  return 0;
> >  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> >  IOMMUF_readable | IOMMUF_writable);
> > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, 
> > unsigned long gfn_l)
> >
> >  if ( !paging_mode_translate(d) )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > +if ( !has_iommu_pt(d) )
> >  return 0;
> >  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> >  }
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index 5d72270c5b..9dd0ed7f63 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -2316,6 +2316,9 @@ static int __init vtd_setup(void)
> >   */
> >  for_each_drhd_unit ( drhd )
> >  {
> > +unsigned long flags;
> > +uint32_t val;
> > +
> >  iommu = drhd->iommu;
> >
> >  printk("Intel VT-d iommu %"PRIu32" supported page sizes: 4kB",
> > @@ -2351,6 +2354,22 @@ static int __init vtd_setup(void)
> >  if ( !vtd_ept_page_compatible(iommu) )
> >  iommu_hap_pt_share = 0;
> >
> > +spin_lock_irqsave(&iommu->register_lock, flags);
> > +val = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> > +/*
> > + * TODO: needs to be revisited once Xen supports booting with an
> > + * already enabled IOMMU.
> > + */
> > +if ( val & DMA_GSTS_TES )
> > +{
> > +printk(XENLOG_WARNING VTDPREFIX
> > +   "IOMMU: DMA remapping already enabled, disabling it\n");
> > +dmar_writel(iommu->reg, DMA

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-12 Thread Roger Pau Monné
Ping?

I know I've posted this quite recently, but have you been able to test
the two proposed patches?

ie: the one here and:

https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00643.html

I would like to figure out exactly what's going on and fix this
properly.

Thanks, Roger.

On Wed, Aug 07, 2019 at 09:35:34AM +0200, Roger Pau Monné wrote:
> On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote:
> > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:
> > >
> > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > > > > This patch completely fixes the problem for me!
> > > > >
> > > > > Thanks Roger! I'd love to see this in Xen 4.13
> > > >
> > > > Thanks for testing!
> > > >
> > > > It's still not clear to me why the previous approach didn't work, but
> > > > I think this patch is better because it removes the usage of
> > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this
> > > > formally now.
> > >
> > > Sorry to bother again, but since we still don't understand why the
> > > previous fix didn't work for you, and I can't reproduce this with my
> > > hardware, could you give the attached patch a try?
> > 
> > No worries -- and thanks for helping to get it over the finish line --
> > this is much appreciated!
> > 
> > I'm happy to say that this latest patch is also working just fine. So
> > I guess this is the one that's going to land in Xen 4.13?
> 
> No, not really, sorry this was still a debug patch.
> 
> So I think the behaviour you are seeing can only be explained if the
> IOMMU is already enabled by the firmware when booting into Xen, can
> this be the case?
> 
> I have a patch I would like you to try to confirm this, can you please
> give it a spin and report back (ideally with the Xen boot log).
> 
> Thanks, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index fef97c82f6..3605614aaf 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l,
>  
>  if ( !paging_mode_translate(p2m->domain) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
>  IOMMUF_readable | IOMMUF_writable);
> @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l)
>  
>  if ( !paging_mode_translate(d) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
>  }
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index 5d72270c5b..9dd0ed7f63 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2316,6 +2316,9 @@ static int __init vtd_setup(void)
>   */
>  for_each_drhd_unit ( drhd )
>  {
> +unsigned long flags;
> +uint32_t val;
> +
>  iommu = drhd->iommu;
>  
>  printk("Intel VT-d iommu %"PRIu32" supported page sizes: 4kB",
> @@ -2351,6 +2354,22 @@ static int __init vtd_setup(void)
>  if ( !vtd_ept_page_compatible(iommu) )
>  iommu_hap_pt_share = 0;
>  
> +spin_lock_irqsave(&iommu->register_lock, flags);
> +val = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> +/*
> + * TODO: needs to be revisited once Xen supports booting with an
> + * already enabled IOMMU.
> + */
> +if ( val & DMA_GSTS_TES )
> +{
> +printk(XENLOG_WARNING VTDPREFIX
> +   "IOMMU: DMA remapping already enabled, disabling it\n");
> +dmar_writel(iommu->reg, DMAR_GCMD_REG, val & ~DMA_GCMD_TE);
> +IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, dmar_readl,
> +  !(val & DMA_GSTS_TES), val);
> +}
> +spin_unlock_irqrestore(&iommu->register_lock, flags);
> +
>  ret = iommu_set_interrupt(drhd);
>  if ( ret )
>  {
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-07 Thread Roger Pau Monné
On Wed, Aug 07, 2019 at 10:31:40AM +0200, Jan Beulich wrote:
> On 07.08.2019 09:35, Roger Pau Monné  wrote:
> > On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote:
> > > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  
> > > wrote:
> > > > 
> > > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> > > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > > > > > This patch completely fixes the problem for me!
> > > > > > 
> > > > > > Thanks Roger! I'd love to see this in Xen 4.13
> > > > > 
> > > > > Thanks for testing!
> > > > > 
> > > > > It's still not clear to me why the previous approach didn't work, but
> > > > > I think this patch is better because it removes the usage of
> > > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this
> > > > > formally now.
> > > > 
> > > > Sorry to bother again, but since we still don't understand why the
> > > > previous fix didn't work for you, and I can't reproduce this with my
> > > > hardware, could you give the attached patch a try?
> > > 
> > > No worries -- and thanks for helping to get it over the finish line --
> > > this is much appreciated!
> > > 
> > > I'm happy to say that this latest patch is also working just fine. So
> > > I guess this is the one that's going to land in Xen 4.13?
> > 
> > No, not really, sorry this was still a debug patch.
> > 
> > So I think the behaviour you are seeing can only be explained if the
> > IOMMU is already enabled by the firmware when booting into Xen, can
> > this be the case?
> 
> Even then - why would the existing flush not suffice in this case?

It should indeed, I'm just trying to figure out what's different in
his setup.

Also I'm not sure the iommu init code was designed taking into account
that the iommu DMA translation might already be enabled by the
firmware.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-07 Thread Jan Beulich

On 07.08.2019 11:57, Roger Pau Monné  wrote:

On Wed, Aug 07, 2019 at 09:08:58AM +0200, Jan Beulich wrote:

On 06.08.2019 23:48, Roman Shaposhnik wrote:

On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:


On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:

On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:

This patch completely fixes the problem for me!

Thanks Roger! I'd love to see this in Xen 4.13


Thanks for testing!

It's still not clear to me why the previous approach didn't work, but
I think this patch is better because it removes the usage of
{set/clear}_identity_p2m_entry from PV domains. I will submit this
formally now.


Sorry to bother again, but since we still don't understand why the
previous fix didn't work for you, and I can't reproduce this with my
hardware, could you give the attached patch a try?


No worries -- and thanks for helping to get it over the finish line --
this is much appreciated!

I'm happy to say that this latest patch is also working just fine. So
I guess this is the one that's going to land in Xen 4.13?


Not necessarily - the other patch is also a candidate, but its
description would need to explain what was actually wrong.


AFAICT the only difference between the non-working version and the
working version is the flush, so I've added it here.


Now I'm afraid I still can't draw a helpful conclusion from Roman's
successful test: intel_iommu_hwdom_init(), after having called
setup_hwdom_rmrr(), calls iommu_flush_all() (with one other,
seemingly innocent call in between). The only conclusion I _could_
draw is that iommu_flush_all() doesn't do what its name says. Which
would be quite bad. But

[orig]
iommu_flush_all()
-> iommu_flush_iotlb_global(flush_non_present_entry=0)
-> flush->iotlb(DMA_TLB_GLOBAL_FLUSH, flush_non_present_entry=0)

[patch]
iommu_flush_iotlb_all()
-> iommu_flush_iotlb(dma_old_pte_present=0, page_count=0)
-> iommu_flush_iotlb_dsi(flush_non_present_entry=0)
-> flush->iotlb(DMA_TLB_DSI_FLUSH, flush_non_present_entry=0)

suggests to me that (as one would infer from the names) is the
more through flush. I must be overlooking something ...


I went over the iotlb queued invalidation code and it seems fine to
me, I haven't been able to spot any issues with it, and as you say
above the only difference is the flush type (DSI vs GLOBAL).

Again and just to make sure it's actually the flush type (and nothing
in between) the factor that makes this work or break, can you try the
above patch?


And I take it the "you" here was meant for Roman (only on Cc),
not me.

Jan


Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..3605614aaf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
  
  if ( !paging_mode_translate(p2m->domain) )

  {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
  return 0;
  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
  IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
  
  if ( !paging_mode_translate(d) )

  {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
  return 0;
  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
  }
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 5d72270c5b..885185ad09 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
  mrmrr->count = 1;
  list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs);
  
-return 0;

+return iommu_flush_all();
  }
  
  static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)






___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-07 Thread Roger Pau Monné
On Wed, Aug 07, 2019 at 09:08:58AM +0200, Jan Beulich wrote:
> On 06.08.2019 23:48, Roman Shaposhnik wrote:
> > On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:
> > > 
> > > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> > > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > > > > This patch completely fixes the problem for me!
> > > > > 
> > > > > Thanks Roger! I'd love to see this in Xen 4.13
> > > > 
> > > > Thanks for testing!
> > > > 
> > > > It's still not clear to me why the previous approach didn't work, but
> > > > I think this patch is better because it removes the usage of
> > > > {set/clear}_identity_p2m_entry from PV domains. I will submit this
> > > > formally now.
> > > 
> > > Sorry to bother again, but since we still don't understand why the
> > > previous fix didn't work for you, and I can't reproduce this with my
> > > hardware, could you give the attached patch a try?
> > 
> > No worries -- and thanks for helping to get it over the finish line --
> > this is much appreciated!
> > 
> > I'm happy to say that this latest patch is also working just fine. So
> > I guess this is the one that's going to land in Xen 4.13?
> 
> Not necessarily - the other patch is also a candidate, but its
> description would need to explain what was actually wrong.
> 
> > > AFAICT the only difference between the non-working version and the
> > > working version is the flush, so I've added it here.
> 
> Now I'm afraid I still can't draw a helpful conclusion from Roman's
> successful test: intel_iommu_hwdom_init(), after having called
> setup_hwdom_rmrr(), calls iommu_flush_all() (with one other,
> seemingly innocent call in between). The only conclusion I _could_
> draw is that iommu_flush_all() doesn't do what its name says. Which
> would be quite bad. But
> 
> [orig]
> iommu_flush_all()
> -> iommu_flush_iotlb_global(flush_non_present_entry=0)
> -> flush->iotlb(DMA_TLB_GLOBAL_FLUSH, flush_non_present_entry=0)
> 
> [patch]
> iommu_flush_iotlb_all()
> -> iommu_flush_iotlb(dma_old_pte_present=0, page_count=0)
> -> iommu_flush_iotlb_dsi(flush_non_present_entry=0)
> -> flush->iotlb(DMA_TLB_DSI_FLUSH, flush_non_present_entry=0)
> 
> suggests to me that (as one would infer from the names) is the
> more through flush. I must be overlooking something ...

I went over the iotlb queued invalidation code and it seems fine to
me, I haven't been able to spot any issues with it, and as you say
above the only difference is the flush type (DSI vs GLOBAL).

Again and just to make sure it's actually the flush type (and nothing
in between) the factor that makes this work or break, can you try the
above patch?

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..3605614aaf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 5d72270c5b..885185ad09 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 mrmrr->count = 1;
 list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs);
 
-return 0;
+return iommu_flush_all();
 }
 
 static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-07 Thread Jan Beulich

On 07.08.2019 09:35, Roger Pau Monné  wrote:

On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote:

On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:


On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:

On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:

This patch completely fixes the problem for me!

Thanks Roger! I'd love to see this in Xen 4.13


Thanks for testing!

It's still not clear to me why the previous approach didn't work, but
I think this patch is better because it removes the usage of
{set/clear}_identity_p2m_entry from PV domains. I will submit this
formally now.


Sorry to bother again, but since we still don't understand why the
previous fix didn't work for you, and I can't reproduce this with my
hardware, could you give the attached patch a try?


No worries -- and thanks for helping to get it over the finish line --
this is much appreciated!

I'm happy to say that this latest patch is also working just fine. So
I guess this is the one that's going to land in Xen 4.13?


No, not really, sorry this was still a debug patch.

So I think the behaviour you are seeing can only be explained if the
IOMMU is already enabled by the firmware when booting into Xen, can
this be the case?


Even then - why would the existing flush not suffice in this case?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-07 Thread Roger Pau Monné
On Tue, Aug 06, 2019 at 02:48:51PM -0700, Roman Shaposhnik wrote:
> On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:
> >
> > On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> > > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > > > This patch completely fixes the problem for me!
> > > >
> > > > Thanks Roger! I'd love to see this in Xen 4.13
> > >
> > > Thanks for testing!
> > >
> > > It's still not clear to me why the previous approach didn't work, but
> > > I think this patch is better because it removes the usage of
> > > {set/clear}_identity_p2m_entry from PV domains. I will submit this
> > > formally now.
> >
> > Sorry to bother again, but since we still don't understand why the
> > previous fix didn't work for you, and I can't reproduce this with my
> > hardware, could you give the attached patch a try?
> 
> No worries -- and thanks for helping to get it over the finish line --
> this is much appreciated!
> 
> I'm happy to say that this latest patch is also working just fine. So
> I guess this is the one that's going to land in Xen 4.13?

No, not really, sorry this was still a debug patch.

So I think the behaviour you are seeing can only be explained if the
IOMMU is already enabled by the firmware when booting into Xen, can
this be the case?

I have a patch I would like you to try to confirm this, can you please
give it a spin and report back (ideally with the Xen boot log).

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..3605614aaf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 5d72270c5b..9dd0ed7f63 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2316,6 +2316,9 @@ static int __init vtd_setup(void)
  */
 for_each_drhd_unit ( drhd )
 {
+unsigned long flags;
+uint32_t val;
+
 iommu = drhd->iommu;
 
 printk("Intel VT-d iommu %"PRIu32" supported page sizes: 4kB",
@@ -2351,6 +2354,22 @@ static int __init vtd_setup(void)
 if ( !vtd_ept_page_compatible(iommu) )
 iommu_hap_pt_share = 0;
 
+spin_lock_irqsave(&iommu->register_lock, flags);
+val = dmar_readl(iommu->reg, DMAR_GSTS_REG);
+/*
+ * TODO: needs to be revisited once Xen supports booting with an
+ * already enabled IOMMU.
+ */
+if ( val & DMA_GSTS_TES )
+{
+printk(XENLOG_WARNING VTDPREFIX
+   "IOMMU: DMA remapping already enabled, disabling it\n");
+dmar_writel(iommu->reg, DMAR_GCMD_REG, val & ~DMA_GCMD_TE);
+IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, dmar_readl,
+  !(val & DMA_GSTS_TES), val);
+}
+spin_unlock_irqrestore(&iommu->register_lock, flags);
+
 ret = iommu_set_interrupt(drhd);
 if ( ret )
 {


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-07 Thread Jan Beulich

On 06.08.2019 23:48, Roman Shaposhnik wrote:

On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:


On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:

On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:

This patch completely fixes the problem for me!

Thanks Roger! I'd love to see this in Xen 4.13


Thanks for testing!

It's still not clear to me why the previous approach didn't work, but
I think this patch is better because it removes the usage of
{set/clear}_identity_p2m_entry from PV domains. I will submit this
formally now.


Sorry to bother again, but since we still don't understand why the
previous fix didn't work for you, and I can't reproduce this with my
hardware, could you give the attached patch a try?


No worries -- and thanks for helping to get it over the finish line --
this is much appreciated!

I'm happy to say that this latest patch is also working just fine. So
I guess this is the one that's going to land in Xen 4.13?


Not necessarily - the other patch is also a candidate, but its
description would need to explain what was actually wrong.


AFAICT the only difference between the non-working version and the
working version is the flush, so I've added it here.


Now I'm afraid I still can't draw a helpful conclusion from Roman's
successful test: intel_iommu_hwdom_init(), after having called
setup_hwdom_rmrr(), calls iommu_flush_all() (with one other,
seemingly innocent call in between). The only conclusion I _could_
draw is that iommu_flush_all() doesn't do what its name says. Which
would be quite bad. But

[orig]
iommu_flush_all()
-> iommu_flush_iotlb_global(flush_non_present_entry=0)
-> flush->iotlb(DMA_TLB_GLOBAL_FLUSH, flush_non_present_entry=0)

[patch]
iommu_flush_iotlb_all()
-> iommu_flush_iotlb(dma_old_pte_present=0, page_count=0)
-> iommu_flush_iotlb_dsi(flush_non_present_entry=0)
-> flush->iotlb(DMA_TLB_DSI_FLUSH, flush_non_present_entry=0)

suggests to me that (as one would infer from the names) is the
more through flush. I must be overlooking something ...

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-06 Thread Roman Shaposhnik
On Tue, Aug 6, 2019 at 9:18 AM Roger Pau Monné  wrote:
>
> On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> > On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > > This patch completely fixes the problem for me!
> > >
> > > Thanks Roger! I'd love to see this in Xen 4.13
> >
> > Thanks for testing!
> >
> > It's still not clear to me why the previous approach didn't work, but
> > I think this patch is better because it removes the usage of
> > {set/clear}_identity_p2m_entry from PV domains. I will submit this
> > formally now.
>
> Sorry to bother again, but since we still don't understand why the
> previous fix didn't work for you, and I can't reproduce this with my
> hardware, could you give the attached patch a try?

No worries -- and thanks for helping to get it over the finish line --
this is much appreciated!

I'm happy to say that this latest patch is also working just fine. So
I guess this is the one that's going to land in Xen 4.13?

Thanks,
Roman.

> AFAICT the only difference between the non-working version and the
> working version is the flush, so I've added it here.
>
> Thanks, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index fef97c82f6..3605614aaf 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l,
>
>  if ( !paging_mode_translate(p2m->domain) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
>  IOMMUF_readable | IOMMUF_writable);
> @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l)
>
>  if ( !paging_mode_translate(d) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
>  }
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index 5d72270c5b..9fd5c97be2 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, 
> bool_t map,
>  mrmrr->count = 1;
>  list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs);
>
> -return 0;
> +return iommu_iotlb_flush_all(d, IOMMU_FLUSHF_added | 
> IOMMU_FLUSHF_modified);
>  }
>
>  static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)
>

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-06 Thread Roger Pau Monné
On Fri, Aug 02, 2019 at 10:05:40AM +0200, Roger Pau Monné wrote:
> On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> > This patch completely fixes the problem for me!
> > 
> > Thanks Roger! I'd love to see this in Xen 4.13
> 
> Thanks for testing!
> 
> It's still not clear to me why the previous approach didn't work, but
> I think this patch is better because it removes the usage of
> {set/clear}_identity_p2m_entry from PV domains. I will submit this
> formally now.

Sorry to bother again, but since we still don't understand why the
previous fix didn't work for you, and I can't reproduce this with my
hardware, could you give the attached patch a try?

AFAICT the only difference between the non-working version and the
working version is the flush, so I've added it here.

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..3605614aaf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 5d72270c5b..9fd5c97be2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2026,7 +2026,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 mrmrr->count = 1;
 list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs);
 
-return 0;
+return iommu_iotlb_flush_all(d, IOMMU_FLUSHF_added | 
IOMMU_FLUSHF_modified);
 }
 
 static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-02 Thread Roger Pau Monné
On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> On Thu, Aug 1, 2019 at 1:16 AM Roger Pau Monné  wrote:
> >
> > On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote:
> > > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper
> > >  wrote:
> > > >
> > > > On 31/07/2019 20:35, Roman Shaposhnik wrote:
> > > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné 
> > > > >  wrote:
> > > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> > > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > > >  Sorry -- got a bit distracted yesterday. Attached is the log with 
> > > >  only
> > > >  your latest patch attached. Interestingly enough the box booted 
> > > >  fine
> > > >  without screen artifacts. So I guess we're getting closer...
> > > > 
> > > >  Thanks for all the help!
> > > > >>> That's quite weird, there's no functional changes between the
> > > > >>> previous patches and this one, the only difference is that this 
> > > > >>> patch
> > > > >>> has more verbose output.
> > > > >>>
> > > > >>> Are you sure you didn't have any local patches on top of Xen that
> > > > >>> could explain this difference in behaviour?
> > > > >> FWIW, can you please try the plain patch again:
> > > > >>
> > > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
> > > > >>
> > > > >> And report back?
> > > > >>
> > > > >> I would like to get this committed ASAP if it does fix your issue.
> > > > > I'd like to say that it did -- but I tried it again just now and it
> > > > > still garbled screen and tons of:
> > > > >
> > > > > (XEN) printk: 26665 messages suppressed.
> > > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> > > > > 8e14c000, iommu reg = 82c0008de000
> > > > >
> > > > > I'm very much confused by what's going on, but it seems that's the
> > > > > case -- adding those debug print statements make the issue go away
> > > > >
> > > > > Here are the patches that are being applied:
> > > > >NOT WORKING:
> > > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
> > > > >
> > > > >WORKING: 
> > > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
> > > > >
> > > > > At this point I'm really not sure what's going on.
> > > >
> > > > Ok.  seeing as you've double checked this, the mystery deepens.
> > > >
> > > > My bet is on the intel_iommu_lookup_page() call having side effects[1].
> > > > If you take out the debugging in the middle of the loop in
> > > > rmrr_identity_mapping(), does the problem reproduce again?
> > > >
> > > > ~Andrew
> > > >
> > > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> > > > more memory allocation and higher-level PTE construction than looks wise
> > > > for what is supposed to be a getter.
> > >
> > > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the 
> > > culprit.
> > >
> > > I've did the experiment in the other direction -- adding a dummy call:
> > >  
> > > https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23
> > > on top of original Roger's patch makes system boot NORMALLY.
> >
> > I'm again quite lost, and I don't really understand why mappings added
> > by arch_iommu_hwdom_init seems to work fine while mappings added by
> > rmrr_identity_mapping don't.
> >
> > I have yet another patch for you to try, which attempts to mimic
> > exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping,
> > can you please give it a try?
> >
> > This has the added bonus of limiting the use of
> > {set/clear}_identity_p2m_entry to translated domains only, since
> > rmrr_identity_mapping was the only caller against PV domains.
> >
> > Thanks, Roger.
> > ---8<---
> > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> > index fef97c82f6..d36a58b1a6 100644
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, 
> > unsigned long gfn_l,
> >
> >  if ( !paging_mode_translate(p2m->domain) )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > -return 0;
> > -return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> > -IOMMUF_readable | IOMMUF_writable);
> > +ASSERT_UNREACHABLE();
> > +return -ENXIO;
> >  }
> >
> >  gfn_lock(p2m, gfn, 0);
> > @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, 
> > unsigned long gfn_l)
> >
> >  if ( !paging_mode_translate(d) )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > -return 0;
> > -return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> > +ASSERT_UNREACHABLE();
> > +return -ENXIO;
> >  }
> >
> >  gfn_lock(p2m, gfn, 0);
> > diff --git a/xen/drivers/passthrough/vtd/i

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-01 Thread Roman Shaposhnik
On Thu, Aug 1, 2019 at 1:16 AM Roger Pau Monné  wrote:
>
> On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote:
> > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper
> >  wrote:
> > >
> > > On 31/07/2019 20:35, Roman Shaposhnik wrote:
> > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné  
> > > > wrote:
> > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > >  Sorry -- got a bit distracted yesterday. Attached is the log with 
> > >  only
> > >  your latest patch attached. Interestingly enough the box booted fine
> > >  without screen artifacts. So I guess we're getting closer...
> > > 
> > >  Thanks for all the help!
> > > >>> That's quite weird, there's no functional changes between the
> > > >>> previous patches and this one, the only difference is that this patch
> > > >>> has more verbose output.
> > > >>>
> > > >>> Are you sure you didn't have any local patches on top of Xen that
> > > >>> could explain this difference in behaviour?
> > > >> FWIW, can you please try the plain patch again:
> > > >>
> > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
> > > >>
> > > >> And report back?
> > > >>
> > > >> I would like to get this committed ASAP if it does fix your issue.
> > > > I'd like to say that it did -- but I tried it again just now and it
> > > > still garbled screen and tons of:
> > > >
> > > > (XEN) printk: 26665 messages suppressed.
> > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> > > > 8e14c000, iommu reg = 82c0008de000
> > > >
> > > > I'm very much confused by what's going on, but it seems that's the
> > > > case -- adding those debug print statements make the issue go away
> > > >
> > > > Here are the patches that are being applied:
> > > >NOT WORKING:
> > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
> > > >
> > > >WORKING: 
> > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
> > > >
> > > > At this point I'm really not sure what's going on.
> > >
> > > Ok.  seeing as you've double checked this, the mystery deepens.
> > >
> > > My bet is on the intel_iommu_lookup_page() call having side effects[1].
> > > If you take out the debugging in the middle of the loop in
> > > rmrr_identity_mapping(), does the problem reproduce again?
> > >
> > > ~Andrew
> > >
> > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> > > more memory allocation and higher-level PTE construction than looks wise
> > > for what is supposed to be a getter.
> >
> > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit.
> >
> > I've did the experiment in the other direction -- adding a dummy call:
> >  
> > https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23
> > on top of original Roger's patch makes system boot NORMALLY.
>
> I'm again quite lost, and I don't really understand why mappings added
> by arch_iommu_hwdom_init seems to work fine while mappings added by
> rmrr_identity_mapping don't.
>
> I have yet another patch for you to try, which attempts to mimic
> exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping,
> can you please give it a try?
>
> This has the added bonus of limiting the use of
> {set/clear}_identity_p2m_entry to translated domains only, since
> rmrr_identity_mapping was the only caller against PV domains.
>
> Thanks, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index fef97c82f6..d36a58b1a6 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l,
>
>  if ( !paging_mode_translate(p2m->domain) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> -return 0;
> -return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> -IOMMUF_readable | IOMMUF_writable);
> +ASSERT_UNREACHABLE();
> +return -ENXIO;
>  }
>
>  gfn_lock(p2m, gfn, 0);
> @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l)
>
>  if ( !paging_mode_translate(d) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> -return 0;
> -return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> +ASSERT_UNREACHABLE();
> +return -ENXIO;
>  }
>
>  gfn_lock(p2m, gfn, 0);
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index 5d72270c5b..62df5ca5aa 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1969,6 +1969,7 @@ static int rmrr_identity_mapping(struct domain *d, 
> bool_t map,
>  unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) 

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-01 Thread Roman Shaposhnik
On Thu, Aug 1, 2019 at 1:45 AM Roger Pau Monné  wrote:
>
> On Wed, Jul 31, 2019 at 12:30:04PM -0700, Roman Shaposhnik wrote:
> > On Wed, Jul 31, 2019 at 1:36 AM Roger Pau Monné  
> > wrote:
> > >
> > > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > > > Sorry -- got a bit distracted yesterday. Attached is the log with only
> > > > your latest patch attached. Interestingly enough the box booted fine
> > > > without screen artifacts. So I guess we're getting closer...
> > > >
> > > > Thanks for all the help!
> > >
> > > That's quite weird, there's no functional changes between the
> > > previous patches and this one, the only difference is that this patch
> > > has more verbose output.
> >
> > That's super weird indeed. I'm re-trying your very first patch right
> > now to see if it may be a pilot error on my part (hope not).
> >
> > > Are you sure you didn't have any local patches on top of Xen that
> > > could explain this difference in behaviour?
> >
> > Pretty sure -- but let me push my branch in EVE and show to you all
> > (you will need Docker installed to build Xen the way EVE does).
>
> I'm afraid it's quite hard for me to figure out whether you have any
> patches on top of Xen or not.

It is actually super simple. We use Docker containers as method for
guaranteeing that our build environment is always pristine and controlled.
Precisely to avoid this type of issues like random patches getting in, etc.

So... let me explain (not so much to convince you to use it, but rather
to have an extra pair of eye on an off chance that I actually missed
something):

All of our builds steps are captured in the following Dockerfile:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile
it is based on Alpine Linux 3.7 and we only install the following
packages into a printing Alpine:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L5
then we proceed with downloading Xen from a printing upsteram
tarball:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L36
and only apply patches that sit next to the Dockerfile and have
patch file extension:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L50

At this point this is a single patch:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
there's literally nothing else in that folder:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen

Then we proceed with building Xen in a very vanilla fashion:
  https://github.com/rvs/eve/blob/xen-bug/pkg/xen/Dockerfile#L64

That's it! That's literally all the build logic.

> For further testing can you make sure you don't have any custom
> patches on top of Xen (if you indeed had any in the previous
> attempts)?

I don't think there's any way for extra patches to sneak in, but please
see the above explanation.

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-01 Thread Roger Pau Monné
On Wed, Jul 31, 2019 at 12:30:04PM -0700, Roman Shaposhnik wrote:
> On Wed, Jul 31, 2019 at 1:36 AM Roger Pau Monné  wrote:
> >
> > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > > Sorry -- got a bit distracted yesterday. Attached is the log with only
> > > your latest patch attached. Interestingly enough the box booted fine
> > > without screen artifacts. So I guess we're getting closer...
> > >
> > > Thanks for all the help!
> >
> > That's quite weird, there's no functional changes between the
> > previous patches and this one, the only difference is that this patch
> > has more verbose output.
> 
> That's super weird indeed. I'm re-trying your very first patch right
> now to see if it may be a pilot error on my part (hope not).
> 
> > Are you sure you didn't have any local patches on top of Xen that
> > could explain this difference in behaviour?
> 
> Pretty sure -- but let me push my branch in EVE and show to you all
> (you will need Docker installed to build Xen the way EVE does).

I'm afraid it's quite hard for me to figure out whether you have any
patches on top of Xen or not.

For further testing can you make sure you don't have any custom
patches on top of Xen (if you indeed had any in the previous
attempts)?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-01 Thread Roger Pau Monné
On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote:
> On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper
>  wrote:
> >
> > On 31/07/2019 20:35, Roman Shaposhnik wrote:
> > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné  
> > > wrote:
> > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> >  Sorry -- got a bit distracted yesterday. Attached is the log with only
> >  your latest patch attached. Interestingly enough the box booted fine
> >  without screen artifacts. So I guess we're getting closer...
> > 
> >  Thanks for all the help!
> > >>> That's quite weird, there's no functional changes between the
> > >>> previous patches and this one, the only difference is that this patch
> > >>> has more verbose output.
> > >>>
> > >>> Are you sure you didn't have any local patches on top of Xen that
> > >>> could explain this difference in behaviour?
> > >> FWIW, can you please try the plain patch again:
> > >>
> > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
> > >>
> > >> And report back?
> > >>
> > >> I would like to get this committed ASAP if it does fix your issue.
> > > I'd like to say that it did -- but I tried it again just now and it
> > > still garbled screen and tons of:
> > >
> > > (XEN) printk: 26665 messages suppressed.
> > > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> > > 8e14c000, iommu reg = 82c0008de000
> > >
> > > I'm very much confused by what's going on, but it seems that's the
> > > case -- adding those debug print statements make the issue go away
> > >
> > > Here are the patches that are being applied:
> > >NOT WORKING:
> > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
> > >
> > >WORKING: 
> > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
> > >
> > > At this point I'm really not sure what's going on.
> >
> > Ok.  seeing as you've double checked this, the mystery deepens.
> >
> > My bet is on the intel_iommu_lookup_page() call having side effects[1].
> > If you take out the debugging in the middle of the loop in
> > rmrr_identity_mapping(), does the problem reproduce again?
> >
> > ~Andrew
> >
> > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> > more memory allocation and higher-level PTE construction than looks wise
> > for what is supposed to be a getter.
> 
> Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit.
> 
> I've did the experiment in the other direction -- adding a dummy call:
>  
> https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23
> on top of original Roger's patch makes system boot NORMALLY.

I'm again quite lost, and I don't really understand why mappings added
by arch_iommu_hwdom_init seems to work fine while mappings added by
rmrr_identity_mapping don't.

I have yet another patch for you to try, which attempts to mimic
exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping,
can you please give it a try?

This has the added bonus of limiting the use of
{set/clear}_identity_p2m_entry to translated domains only, since
rmrr_identity_mapping was the only caller against PV domains.

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..d36a58b1a6 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
-return 0;
-return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
-IOMMUF_readable | IOMMUF_writable);
+ASSERT_UNREACHABLE();
+return -ENXIO;
 }
 
 gfn_lock(p2m, gfn, 0);
@@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
-return 0;
-return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
+ASSERT_UNREACHABLE();
+return -ENXIO;
 }
 
 gfn_lock(p2m, gfn, 0);
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 5d72270c5b..62df5ca5aa 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1969,6 +1969,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
 struct mapped_rmrr *mrmrr;
 struct domain_iommu *hd = dom_iommu(d);
+unsigned int flush_flags = 0;
 
 ASSERT(pcidevs_locked());
 ASSERT(rmrr->base_address < rmrr->end_address);
@@ -1982,7 +1983,7 @@ static int rmrr_identity_mapping(struct domain *d, bo

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-08-01 Thread Jan Beulich
On 31.07.2019 21:46, Andrew Cooper wrote:
> My bet is on the intel_iommu_lookup_page() call having side effects[1].
> If you take out the debugging in the middle of the loop in
> rmrr_identity_mapping(), does the problem reproduce again?

While your guess was right according to Roman's latest results, I'd
still like to note that ...

> [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> more memory allocation and higher-level PTE construction than looks wise
> for what is supposed to be a getter.

... there being 0 passed as last argument is supposed to suppress
any allocations (and from looking at the function I can't really
see any side effects yet in this case; all I do see is that the
first check of "alloc" could/should happen before looking up
PCI device and DRHD).

I've also double checked - there clearly is a flush after the call
to setup_hwdom_rmrr(). I'm not even convinced this flush has much
value, seeing that translation gets enabled only afterwards.

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Roman Shaposhnik
On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper
 wrote:
>
> On 31/07/2019 20:35, Roman Shaposhnik wrote:
> > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné  
> > wrote:
> >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
>  Sorry -- got a bit distracted yesterday. Attached is the log with only
>  your latest patch attached. Interestingly enough the box booted fine
>  without screen artifacts. So I guess we're getting closer...
> 
>  Thanks for all the help!
> >>> That's quite weird, there's no functional changes between the
> >>> previous patches and this one, the only difference is that this patch
> >>> has more verbose output.
> >>>
> >>> Are you sure you didn't have any local patches on top of Xen that
> >>> could explain this difference in behaviour?
> >> FWIW, can you please try the plain patch again:
> >>
> >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
> >>
> >> And report back?
> >>
> >> I would like to get this committed ASAP if it does fix your issue.
> > I'd like to say that it did -- but I tried it again just now and it
> > still garbled screen and tons of:
> >
> > (XEN) printk: 26665 messages suppressed.
> > (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> > 8e14c000, iommu reg = 82c0008de000
> >
> > I'm very much confused by what's going on, but it seems that's the
> > case -- adding those debug print statements make the issue go away
> >
> > Here are the patches that are being applied:
> >NOT WORKING:
> > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
> >
> >WORKING: 
> > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
> >
> > At this point I'm really not sure what's going on.
>
> Ok.  seeing as you've double checked this, the mystery deepens.
>
> My bet is on the intel_iommu_lookup_page() call having side effects[1].
> If you take out the debugging in the middle of the loop in
> rmrr_identity_mapping(), does the problem reproduce again?
>
> ~Andrew
>
> [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> more memory allocation and higher-level PTE construction than looks wise
> for what is supposed to be a getter.

Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit.

I've did the experiment in the other direction -- adding a dummy call:
 
https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23
on top of original Roger's patch makes system boot NORMALLY.

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Andrew Cooper
On 31/07/2019 20:35, Roman Shaposhnik wrote:
> On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné  wrote:
>> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
>>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
 Sorry -- got a bit distracted yesterday. Attached is the log with only
 your latest patch attached. Interestingly enough the box booted fine
 without screen artifacts. So I guess we're getting closer...

 Thanks for all the help!
>>> That's quite weird, there's no functional changes between the
>>> previous patches and this one, the only difference is that this patch
>>> has more verbose output.
>>>
>>> Are you sure you didn't have any local patches on top of Xen that
>>> could explain this difference in behaviour?
>> FWIW, can you please try the plain patch again:
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
>>
>> And report back?
>>
>> I would like to get this committed ASAP if it does fix your issue.
> I'd like to say that it did -- but I tried it again just now and it
> still garbled screen and tons of:
>
> (XEN) printk: 26665 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> 8e14c000, iommu reg = 82c0008de000
>
> I'm very much confused by what's going on, but it seems that's the
> case -- adding those debug print statements make the issue go away
>
> Here are the patches that are being applied:
>NOT WORKING:
> https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
>
>WORKING: 
> https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
>
> At this point I'm really not sure what's going on.

Ok.  seeing as you've double checked this, the mystery deepens.

My bet is on the intel_iommu_lookup_page() call having side effects[1]. 
If you take out the debugging in the middle of the loop in
rmrr_identity_mapping(), does the problem reproduce again?

~Andrew

[1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
more memory allocation and higher-level PTE construction than looks wise
for what is supposed to be a getter.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Roman Shaposhnik
On Wed, Jul 31, 2019 at 2:46 AM Jan Beulich  wrote:
>
> On 31.07.2019 10:58, Andrew Cooper wrote:
> > On 31/07/2019 09:34, Jan Beulich wrote:
> >> On 30.07.2019 19:56, Roman Shaposhnik wrote:
> >>> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich  wrote:
>  On 23.07.2019 20:25, Roman Shaposhnik wrote:
> > Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
> > booted the system just fine.
>  Btw (I've noticed this only now) - are you saying without "iommu=debug"
>  the box does _not_ boot fine, despite the other option?
> >>> Yes. But it made sense to me since iommu=debug (as per your
> >>> explanation) overwhelms the CPU and I guess adding
> >>> iommu_inclusive_mapping=1 avoids the code path that does it?
> >> I'm afraid I don't follow: My question was whether
> >> "iommu_inclusive_mapping=1" alone would not allow the box to boot.
> >> Without "iommu=debug" there's no excessive logging afaict, no
> >> matter what other IOMMU options you use.
> >
> > Without inclusive mappings, the system boots but the screen is junk and
> > there are DMA faults all over the place.  With inclusive mappings, it
> > all "works" fine.
> >
> > With debug enabled, the DMA faults are spat out to the console for a
> > short while, until an APIC error occurs and the system wedges completely.
>
> Right - the verbosity _with_ "iommu=debug" may be a problem. Hence me
> wondering whether the system indeed wouldn't boot _without_ that option.

Without iommu=debug (and any other option) the screen is garbled.

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Roman Shaposhnik
On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné  wrote:
>
> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> > On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > > Sorry -- got a bit distracted yesterday. Attached is the log with only
> > > your latest patch attached. Interestingly enough the box booted fine
> > > without screen artifacts. So I guess we're getting closer...
> > >
> > > Thanks for all the help!
> >
> > That's quite weird, there's no functional changes between the
> > previous patches and this one, the only difference is that this patch
> > has more verbose output.
> >
> > Are you sure you didn't have any local patches on top of Xen that
> > could explain this difference in behaviour?
>
> FWIW, can you please try the plain patch again:
>
> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
>
> And report back?
>
> I would like to get this committed ASAP if it does fix your issue.

I'd like to say that it did -- but I tried it again just now and it
still garbled screen and tons of:

(XEN) printk: 26665 messages suppressed.
(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
8e14c000, iommu reg = 82c0008de000

I'm very much confused by what's going on, but it seems that's the
case -- adding those debug print statements make the issue go away

Here are the patches that are being applied:
   NOT WORKING:
https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch

   WORKING: 
https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch

At this point I'm really not sure what's going on.

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Roman Shaposhnik
On Wed, Jul 31, 2019 at 1:36 AM Roger Pau Monné  wrote:
>
> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > Sorry -- got a bit distracted yesterday. Attached is the log with only
> > your latest patch attached. Interestingly enough the box booted fine
> > without screen artifacts. So I guess we're getting closer...
> >
> > Thanks for all the help!
>
> That's quite weird, there's no functional changes between the
> previous patches and this one, the only difference is that this patch
> has more verbose output.

That's super weird indeed. I'm re-trying your very first patch right
now to see if it may be a pilot error on my part (hope not).

> Are you sure you didn't have any local patches on top of Xen that
> could explain this difference in behaviour?

Pretty sure -- but let me push my branch in EVE and show to you all
(you will need Docker installed to build Xen the way EVE does).

But if you want to check for yourself:
   $ git clone https://github.com/rvs/eve.git -b xen-bug
   $ cd eve/pkg/xen
   $ docker build  --no-cache  .
   $ docker export $(docker create XXX a) | tar xvf - boot
   $ ls boot/xen.gz

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Jan Beulich
On 31.07.2019 10:58, Andrew Cooper wrote:
> On 31/07/2019 09:34, Jan Beulich wrote:
>> On 30.07.2019 19:56, Roman Shaposhnik wrote:
>>> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich  wrote:
 On 23.07.2019 20:25, Roman Shaposhnik wrote:
> Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
> booted the system just fine.
 Btw (I've noticed this only now) - are you saying without "iommu=debug"
 the box does _not_ boot fine, despite the other option?
>>> Yes. But it made sense to me since iommu=debug (as per your
>>> explanation) overwhelms the CPU and I guess adding
>>> iommu_inclusive_mapping=1 avoids the code path that does it?
>> I'm afraid I don't follow: My question was whether
>> "iommu_inclusive_mapping=1" alone would not allow the box to boot.
>> Without "iommu=debug" there's no excessive logging afaict, no
>> matter what other IOMMU options you use.
> 
> Without inclusive mappings, the system boots but the screen is junk and
> there are DMA faults all over the place.  With inclusive mappings, it
> all "works" fine.
> 
> With debug enabled, the DMA faults are spat out to the console for a
> short while, until an APIC error occurs and the system wedges completely.

Right - the verbosity _with_ "iommu=debug" may be a problem. Hence me
wondering whether the system indeed wouldn't boot _without_ that option.

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Andrew Cooper
On 31/07/2019 09:34, Jan Beulich wrote:
> On 30.07.2019 19:56, Roman Shaposhnik wrote:
>> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich  wrote:
>>> On 23.07.2019 20:25, Roman Shaposhnik wrote:
 Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
 booted the system just fine.
>>> Btw (I've noticed this only now) - are you saying without "iommu=debug"
>>> the box does _not_ boot fine, despite the other option?
>> Yes. But it made sense to me since iommu=debug (as per your
>> explanation) overwhelms the CPU and I guess adding
>> iommu_inclusive_mapping=1 avoids the code path that does it?
> I'm afraid I don't follow: My question was whether
> "iommu_inclusive_mapping=1" alone would not allow the box to boot.
> Without "iommu=debug" there's no excessive logging afaict, no
> matter what other IOMMU options you use.

Without inclusive mappings, the system boots but the screen is junk and
there are DMA faults all over the place.  With inclusive mappings, it
all "works" fine.

With debug enabled, the DMA faults are spat out to the console for a
short while, until an APIC error occurs and the system wedges completely.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Jan Beulich
On 30.07.2019 19:56, Roman Shaposhnik wrote:
> On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich  wrote:
>>
>> On 23.07.2019 20:25, Roman Shaposhnik wrote:
>>> Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
>>> booted the system just fine.
>>
>> Btw (I've noticed this only now) - are you saying without "iommu=debug"
>> the box does _not_ boot fine, despite the other option?
> 
> Yes. But it made sense to me since iommu=debug (as per your
> explanation) overwhelms the CPU and I guess adding
> iommu_inclusive_mapping=1 avoids the code path that does it?

I'm afraid I don't follow: My question was whether
"iommu_inclusive_mapping=1" alone would not allow the box to boot.
Without "iommu=debug" there's no excessive logging afaict, no
matter what other IOMMU options you use.

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Jan Beulich
On 30.07.2019 19:55, Roman Shaposhnik wrote:
> Sorry -- got a bit distracted yesterday. Attached is the log with only
> your latest patch attached. Interestingly enough the box booted fine
> without screen artifacts. So I guess we're getting closer...

I'm afraid I can't make sense of this and the prior results: Afaics
it was only debugging code which got changed, the actual code
changes to address the issue remained the same. Hence the earlier
patch should have worked too, if this latest one does.

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Roger Pau Monné
On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > Sorry -- got a bit distracted yesterday. Attached is the log with only
> > your latest patch attached. Interestingly enough the box booted fine
> > without screen artifacts. So I guess we're getting closer...
> > 
> > Thanks for all the help!
> 
> That's quite weird, there's no functional changes between the
> previous patches and this one, the only difference is that this patch
> has more verbose output.
> 
> Are you sure you didn't have any local patches on top of Xen that
> could explain this difference in behaviour?

FWIW, can you please try the plain patch again:

https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html

And report back?

I would like to get this committed ASAP if it does fix your issue.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-31 Thread Roger Pau Monné
On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> Sorry -- got a bit distracted yesterday. Attached is the log with only
> your latest patch attached. Interestingly enough the box booted fine
> without screen artifacts. So I guess we're getting closer...
> 
> Thanks for all the help!

That's quite weird, there's no functional changes between the
previous patches and this one, the only difference is that this patch
has more verbose output.

Are you sure you didn't have any local patches on top of Xen that
could explain this difference in behaviour?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-30 Thread Roman Shaposhnik
On Fri, Jul 26, 2019 at 1:06 AM Jan Beulich  wrote:
>
> On 23.07.2019 20:25, Roman Shaposhnik wrote:
> > Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
> > booted the system just fine.
>
> Btw (I've noticed this only now) - are you saying without "iommu=debug"
> the box does _not_ boot fine, despite the other option?

Yes. But it made sense to me since iommu=debug (as per your
explanation) overwhelms the CPU and I guess adding
iommu_inclusive_mapping=1 avoids the code path that does it?

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-30 Thread Roger Pau Monné
Ping?

It would be very helpful for us in order to get this sorted out if you
could answer the questions below and try the new debug patch :).

On Fri, Jul 26, 2019 at 11:35:45AM +0200, Roger Pau Monné wrote:
> On Thu, Jul 25, 2019 at 05:47:19PM -0700, Roman Shaposhnik wrote:
> > Hi Roger!
> > 
> > With your patch (and build as a debug build) Xen crashes on boot
> > (which I guess was the point of your BUG_ON statement).
> 
> Yes, that's very weird, seems like entries are not added to the iommu
> page tables but I have no idea why, AFAICT this works fine on my
> system.
> 
> Do you have any patches on top of RELEASE-4.12.0?
> 
> I have another patch with more verbose output, could you give it a
> try? It's maybe going to be more chatty than the previous one.
> 
> I'm sorry to keep you testing stuff, but since I cannot reproduce this
> locally I have to rely on you to provide the debug output.
> 
> Thanks, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index b9bbb8f485..75f8359a99 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1331,7 +1331,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l,
>  
>  if ( !paging_mode_translate(p2m->domain) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
>  IOMMUF_readable | IOMMUF_writable);
> @@ -1422,7 +1422,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l)
>  
>  if ( !paging_mode_translate(d) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
>  }
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 117b869b0c..214c5d515f 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -291,8 +291,18 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
>  unsigned long i;
>  int rc = 0;
>  
> +if (dfn_x(dfn) >= 0x8d800 && dfn_x(dfn) < 0x9 )
> +{
> +printk(" iommu_map %#lx\n", dfn_x(dfn));
> +process_pending_softirqs();
> +}
> +
>  if ( !iommu_enabled || !hd->platform_ops )
> +{
> +printk("iommu_enabled: %d platform_ops %p\n",
> +   iommu_enabled, hd->platform_ops);
>  return 0;
> +}
>  
>  ASSERT(IS_ALIGNED(dfn_x(dfn), (1ul << page_order)));
>  ASSERT(IS_ALIGNED(mfn_x(mfn), (1ul << page_order)));
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index 50a0e25224..8c3fcb50ae 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2009,12 +2009,33 @@ static int rmrr_identity_mapping(struct domain *d, 
> bool_t map,
>  if ( !map )
>  return -ENOENT;
>  
> +printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn);
>  while ( base_pfn < end_pfn )
>  {
>  int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> +mfn_t mfn;
> +unsigned int f;
>  
>  if ( err )
>  return err;
> +
> +err = intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f);
> +if ( err )
> +{
> +printk("intel_iommu_lookup_page err: %d\n", err);
> +BUG();
> +}
> +if ( base_pfn != mfn_x(mfn) )
> +{
> +printk("base_pfn: %#lx mfn: %#lx\n", base_pfn, mfn_x(mfn));
> +BUG();
> +}
> +if ( f != (IOMMUF_readable | IOMMUF_writable) )
> +{
> +printk("flags: %#x\n", f);
> +BUG();
> +}
> +
>  base_pfn++;
>  }
>  
> @@ -2263,6 +2284,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain 
> *d)
>  u16 bdf;
>  int ret, i;
>  
> +printk(" setting up regions\n");
>  pcidevs_lock();
>  for_each_rmrr_device ( rmrr, bdf, i )
>  {
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-26 Thread Roger Pau Monné
On Thu, Jul 25, 2019 at 05:47:19PM -0700, Roman Shaposhnik wrote:
> Hi Roger!
> 
> With your patch (and build as a debug build) Xen crashes on boot
> (which I guess was the point of your BUG_ON statement).

Yes, that's very weird, seems like entries are not added to the iommu
page tables but I have no idea why, AFAICT this works fine on my
system.

Do you have any patches on top of RELEASE-4.12.0?

I have another patch with more verbose output, could you give it a
try? It's maybe going to be more chatty than the previous one.

I'm sorry to keep you testing stuff, but since I cannot reproduce this
locally I have to rely on you to provide the debug output.

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index b9bbb8f485..75f8359a99 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1331,7 +1331,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1422,7 +1422,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 117b869b0c..214c5d515f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -291,8 +291,18 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
 unsigned long i;
 int rc = 0;
 
+if (dfn_x(dfn) >= 0x8d800 && dfn_x(dfn) < 0x9 )
+{
+printk(" iommu_map %#lx\n", dfn_x(dfn));
+process_pending_softirqs();
+}
+
 if ( !iommu_enabled || !hd->platform_ops )
+{
+printk("iommu_enabled: %d platform_ops %p\n",
+   iommu_enabled, hd->platform_ops);
 return 0;
+}
 
 ASSERT(IS_ALIGNED(dfn_x(dfn), (1ul << page_order)));
 ASSERT(IS_ALIGNED(mfn_x(mfn), (1ul << page_order)));
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 50a0e25224..8c3fcb50ae 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2009,12 +2009,33 @@ static int rmrr_identity_mapping(struct domain *d, 
bool_t map,
 if ( !map )
 return -ENOENT;
 
+printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn);
 while ( base_pfn < end_pfn )
 {
 int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
+mfn_t mfn;
+unsigned int f;
 
 if ( err )
 return err;
+
+err = intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f);
+if ( err )
+{
+printk("intel_iommu_lookup_page err: %d\n", err);
+BUG();
+}
+if ( base_pfn != mfn_x(mfn) )
+{
+printk("base_pfn: %#lx mfn: %#lx\n", base_pfn, mfn_x(mfn));
+BUG();
+}
+if ( f != (IOMMUF_readable | IOMMUF_writable) )
+{
+printk("flags: %#x\n", f);
+BUG();
+}
+
 base_pfn++;
 }
 
@@ -2263,6 +2284,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain 
*d)
 u16 bdf;
 int ret, i;
 
+printk(" setting up regions\n");
 pcidevs_lock();
 for_each_rmrr_device ( rmrr, bdf, i )
 {

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-26 Thread Jan Beulich
On 23.07.2019 20:25, Roman Shaposhnik wrote:
> Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
> booted the system just fine.

Btw (I've noticed this only now) - are you saying without "iommu=debug"
the box does _not_ boot fine, despite the other option?

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-25 Thread Roman Shaposhnik
On Wed, Jul 24, 2019 at 10:42 AM Rich Persaud  wrote:
>
> On Jul 19, 2019, at 15:31, Roman Shaposhnik  wrote:
>
> Hi!
>
> we're using Xen on Advantech ARK-2250 Embedded Box PC:
>
> https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf
>
>
> Roman,
>
> Good to see Xen being used on fanless devices.

Oh, there's WAY more of those in EVE under Xen management ;-)
 https://wiki.lfedge.org/display/EVE/Hardware+Platforms+Supporting+EVE

> Does the AMI BIOS for the i7 6600U Skylake CPU [1] variant of ARK-2250 [2]
> support Intel TXT DRTM and discrete TPM, which would enable boot integrity 
> [3] protection for Xen, read-only dom0 and stateless VMs?
> Boot integrity is valuable on edge devices.

Funny you should mention this -- that's exactly what we're playing
with right now in LF Edge Project EVE. Do you want to pop up on the
mailing list or slack channel there? (not sure this is the right topic
for Xen-devel).

And just so that we're on the same page, here's what we are after when
it comes to root of trust in EVE (I really need to do a write up on
this soon):
   * measured boot (we're really not interested secure boot that much)
   * measured boot of the DomUs
   * proxy TPM to the DomUs

> [1] CPU spec: 
> https://ark.intel.com/content/www/us/en/ark/products/88192/intel-core-i7-6600u-processor-4m-cache-up-to-3-40-ghz.html
>
> [2] PC spec: 
> https://www.advantech.com/products/ark-2000_series_embedded_box_pcs/ark-2250l/mod_66ebc4e0-9a0c-489c-96a5-70a8054e9037
>
> [3] TrenchBoot, Xen Summit 2019, https://youtube.com/watch?v=f0LZFSq4Ack

Thanks for the notes! Much appreciated!

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-25 Thread Roman Shaposhnik
Hi Roger!

With your patch (and build as a debug build) Xen crashes on boot
(which I guess was the point of your BUG_ON statement).

The log is attached

Thanks,
Roman.

On Wed, Jul 24, 2019 at 7:11 AM Roger Pau Monné  wrote:
>
> On Tue, Jul 23, 2019 at 10:32:26AM -0700, Roman Shaposhnik wrote:
> > Hi Roger!
> >
> > I applied your patch, removed no-igfx and I still see the original
> > problem. Please let me know what other logs/debugs would you need at
> > this point.
>
> I'm not sure why you don't get the rmrrs added to the iommu page
> tables, AFAICT it works on my test box.
>
> I have a patch with extra debug messages and checks, could you give it
> a test, I'm attaching it below. Note that you don't need the previous
> patch, since it's already contained in the debug patch below.
>
> Please paste the Xen bootlog with the patch applied when your reply.
>
> Thank, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index fef97c82f6..3605614aaf 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l,
>
>  if ( !paging_mode_translate(p2m->domain) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
>  IOMMUF_readable | IOMMUF_writable);
> @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
> long gfn_l)
>
>  if ( !paging_mode_translate(d) )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  return 0;
>  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
>  }
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index 8b27d7e775..ea303b5d45 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2009,12 +2009,19 @@ static int rmrr_identity_mapping(struct domain *d, 
> bool_t map,
>  if ( !map )
>  return -ENOENT;
>
> +printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn);
>  while ( base_pfn < end_pfn )
>  {
>  int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> +mfn_t mfn;
> +unsigned int f;
>
>  if ( err )
>  return err;
> +BUG_ON(intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f));
> +BUG_ON(base_pfn != mfn_x(mfn));
> +BUG_ON(f != (IOMMUF_readable | IOMMUF_writable));
> +
>  base_pfn++;
>  }
>
> @@ -2263,6 +2270,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain 
> *d)
>  u16 bdf;
>  int ret, i;
>
> +printk(" setting up regions\n");
>  pcidevs_lock();
>  for_each_rmrr_device ( rmrr, bdf, i )
>  {
>
0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968e018
0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89641018
0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x89630018
 Xen 4.12.0
(XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=y  Fri Jul 26 
00:32:25 UTC 2019
(XEN) Latest ChangeSet:
(XEN) Bootloader: GRUB 2.03
(XEN) Command line: loglvl=all com1=115200,8n1 console=com1,vga 
dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false
(XEN) Xen image load base address: 0x8800
(XEN) Video information:
(XEN)  VGA is graphics mode 1680x1050, 32 bpp
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) EFI RAM map:
(XEN)   - 00058000 (usable)
(XEN)  00058000 - 00059000 (reserved)
(XEN)  00059000 - 0009f000 (usable)
(XEN)  0009f000 - 000a (reserved)
(XEN)  0010 - 8648a000 (usable)
(XEN)  8648a000 - 8648b000 (ACPI NVS)
(XEN)  8648b000 - 864b5000 (reserved)
(XEN)  864b5000 - 8c224000 (usable)
(XEN)  8c224000 - 8c528000 (reserved)
(XEN)  8c528000 - 8c736000 (usable)
(XEN)  8c736000 - 8cea7000 (ACPI NVS)
(XEN)  8cea7000 - 8d2ff000 (reserved)
(XEN)  8d2ff000 - 8d30 (usable)
(XEN)  8d30 - 8d40 (reserved)
(XEN)  e000 - f000 (reserved)
(XEN)  fe00 - fe011000 (reserved)
(XEN)  fec0 - fec01000 (reserved)
(XEN)  fee0 - fee01000 (reserved)
(XEN)  ff00 - 0001 (reserved)
(XEN)  0001 - 00016e00 (usable)
(XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA   A M I   1072009 INTL 20120913)
(XEN) ACPI: FACS 8CE8EF80, 0040
(XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACP

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-24 Thread Rich Persaud
> On Jul 19, 2019, at 15:31, Roman Shaposhnik  wrote:
> 
> Hi!
> 
> we're using Xen on Advantech ARK-2250 Embedded Box PC:
>
> https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf

Roman, 

Good to see Xen being used on fanless devices.  Does the AMI BIOS for the i7 
6600U Skylake CPU [1] variant of ARK-2250 [2] support Intel TXT DRTM and 
discrete TPM, which would enable boot integrity [3] protection for Xen, 
read-only dom0 and stateless VMs?  Boot integrity is valuable on edge devices.

Rich


[1] CPU spec: 
https://ark.intel.com/content/www/us/en/ark/products/88192/intel-core-i7-6600u-processor-4m-cache-up-to-3-40-ghz.html

[2] PC spec: 
https://www.advantech.com/products/ark-2000_series_embedded_box_pcs/ark-2250l/mod_66ebc4e0-9a0c-489c-96a5-70a8054e9037

[3] TrenchBoot, Xen Summit 2019, https://youtube.com/watch?v=f0LZFSq4Ack

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-24 Thread Roger Pau Monné
On Tue, Jul 23, 2019 at 10:32:26AM -0700, Roman Shaposhnik wrote:
> Hi Roger!
> 
> I applied your patch, removed no-igfx and I still see the original
> problem. Please let me know what other logs/debugs would you need at
> this point.

I'm not sure why you don't get the rmrrs added to the iommu page
tables, AFAICT it works on my test box.

I have a patch with extra debug messages and checks, could you give it
a test, I'm attaching it below. Note that you don't need the previous
patch, since it's already contained in the debug patch below.

Please paste the Xen bootlog with the patch applied when your reply.

Thank, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..3605614aaf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 8b27d7e775..ea303b5d45 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2009,12 +2009,19 @@ static int rmrr_identity_mapping(struct domain *d, 
bool_t map,
 if ( !map )
 return -ENOENT;
 
+printk(" mapping %#lx - %#lx\n", base_pfn, end_pfn);
 while ( base_pfn < end_pfn )
 {
 int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
+mfn_t mfn;
+unsigned int f;
 
 if ( err )
 return err;
+BUG_ON(intel_iommu_lookup_page(d, _dfn(base_pfn), &mfn, &f));
+BUG_ON(base_pfn != mfn_x(mfn));
+BUG_ON(f != (IOMMUF_readable | IOMMUF_writable));
+
 base_pfn++;
 }
 
@@ -2263,6 +2270,7 @@ static void __hwdom_init setup_hwdom_rmrr(struct domain 
*d)
 u16 bdf;
 int ret, i;
 
+printk(" setting up regions\n");
 pcidevs_lock();
 for_each_rmrr_device ( rmrr, bdf, i )
 {


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-24 Thread Jan Beulich
On 24.07.2019 14:00, Jan Beulich wrote:
> On 23.07.2019 19:58, Roman Shaposhnik wrote:
>> No worries. Take a look at the head of the log attached.
> 
> One more thing for you to tweak to make the log even more useful:
> As per
> 
> (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n  Tue Jul 23 
> 17:15:48 UTC 2019
> 
> this is a non-debug build. If you used a debug one, there ought to
> be another confirming message next to
> 
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved memory; 
> need "iommu_inclusive_mapping=1"?
> (XEN) [VT-D] endpoint: :00:02.0
> 
> plus others in case of errors beyond the memory map related one
> (which, as was pointed out already, is a warning only). In case
> you're interested, see register_one_rmrr() in particular.

Oh, additionally you want to add "loglvl=all" to your hypervisor
command line.

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-24 Thread Jan Beulich
On 23.07.2019 19:58, Roman Shaposhnik wrote:
> No worries. Take a look at the head of the log attached.

One more thing for you to tweak to make the log even more useful:
As per

(XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n  Tue Jul 23 
17:15:48 UTC 2019

this is a non-debug build. If you used a debug one, there ought to
be another confirming message next to

(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved memory; 
need "iommu_inclusive_mapping=1"?
(XEN) [VT-D] endpoint: :00:02.0

plus others in case of errors beyond the memory map related one
(which, as was pointed out already, is a warning only). In case
you're interested, see register_one_rmrr() in particular.

Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-24 Thread Andrew Cooper
On 24/07/2019 12:23, Andrew Cooper wrote:
> On 23/07/2019 18:32, Roman Shaposhnik wrote:
>> Oh, and it seems that that https://downloads.xenproject.org/ SSL
>> certificate expired yesterday -- perhaps somebody can take a look at
>> that.
> Thanks for reporting.  It should be being taken care of.

And it should be fixed now.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-24 Thread Andrew Cooper
On 23/07/2019 18:32, Roman Shaposhnik wrote:
> Oh, and it seems that that https://downloads.xenproject.org/ SSL
> certificate expired yesterday -- perhaps somebody can take a look at
> that.

Thanks for reporting.  It should be being taken care of.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-23 Thread Roman Shaposhnik
On Tue, Jul 23, 2019 at 11:12 AM Andrew Cooper
 wrote:
>
> On 23/07/2019 18:58, Roman Shaposhnik wrote:
>
> On Tue, Jul 23, 2019 at 10:50 AM Andrew Cooper
>  wrote:
>
> On 23/07/2019 18:48, Roman Shaposhnik wrote:
>
> On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper
>  wrote:
>
> On 23/07/2019 18:32, Roman Shaposhnik wrote:
>
> Hi Roger!
>
> I applied your patch, removed no-igfx and I still see the original
> problem. Please let me know what other logs/debugs would you need at
> this point.
>
> Please can you collect a full boot log with iommu=debug
>
> How long of an output should I expect when iommu=debug is enabled?
> I've just enabled it and I'm looking at what appears to be an endless
> scroll of debug info.
>
> This is all I see for the good 5 minutes at this point. Culminating with:
> (XEN)   (XEN) APIC error on CPU0: 40(00)
>
> and a failure to boot.
>
> Note that this is still without no-igfx
>
> I'm attaching the tail end of this log.
>
> Sadly, what is useful is the head of the log, before it starts
> complaining loudly about every DMA fault.
>
> No worries. Take a look at the head of the log attached.
>
> Btw, I'm kind of curious why iommu=debug would actually make it crash
>
>
> The system is rather sickly, and is debugging at a rate slower than incoming 
> faults, which is going to starve whichever CPU is taking the IOMMU interrupt.
>
> I wouldn't worry about the APIC error now.

Got it. Makes sense.

> Curiously, there is one single intremap error on boot, which is likely 
> unrelated.
>
> (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu 
> reg = 82c0008e
>
> (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
>
>
> This will be irq0 from the IO-APIC.
>
> Can you try booting following the guidance from
>
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
>
> (XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved memory; 
> need "iommu_inclusive_mapping=1"?
>
> (XEN) [VT-D] endpoint: :00:02.0
>
>
> which I noted on my first reply?  Given that Rogers patch didn't help, 
> something else is going on.

Sorry -- missed that option the first time around.

Interestingly enough, adding iommu_inclusive_mapping=1 AND iommu=debug
booted the system just fine.

There are no issues with screen.

I am attaching a full log.

Btw, my understanding is that this may point to a BIOS issue. Which
would be fair conclusion, but I've got to wonder why Xen 4.11 didn't
seem to be susceptible to this BIOS issue.

Thanks,
Roman.
0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968d018
0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89640018
0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x8962f018
 Xen 4.12.0
(XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n  Tue Jul 23 
17:15:48 UTC 2019
(XEN) Latest ChangeSet:
(XEN) Bootloader: GRUB 2.03
(XEN) Command line: com1=115200,8n1 console=com1,vga iommu_inclusive_mapping=1 
iommu=debug dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false
(XEN) Xen image load base address: 0x8800
(XEN) Video information:
(XEN)  VGA is graphics mode 1680x1050, 32 bpp
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) EFI RAM map:
(XEN)   - 00058000 (usable)
(XEN)  00058000 - 00059000 (reserved)
(XEN)  00059000 - 0009f000 (usable)
(XEN)  0009f000 - 000a (reserved)
(XEN)  0010 - 8648a000 (usable)
(XEN)  8648a000 - 8648b000 (ACPI NVS)
(XEN)  8648b000 - 864b5000 (reserved)
(XEN)  864b5000 - 8c224000 (usable)
(XEN)  8c224000 - 8c528000 (reserved)
(XEN)  8c528000 - 8c736000 (usable)
(XEN)  8c736000 - 8cea7000 (ACPI NVS)
(XEN)  8cea7000 - 8d2ff000 (reserved)
(XEN)  8d2ff000 - 8d30 (usable)
(XEN)  8d30 - 8d40 (reserved)
(XEN)  e000 - f000 (reserved)
(XEN)  fe00 - fe011000 (reserved)
(XEN)  fec0 - fec01000 (reserved)
(XEN)  fee0 - fee01000 (reserved)
(XEN)  ff00 - 0001 (reserved)
(XEN)  0001 - 00016e00 (usable)
(XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA   A M I   1072009 INTL 20120913)
(XEN) ACPI: FACS 8CE8EF80, 0040
(XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FPDT 8CE6C508, 0044 (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FIDT 8CE6C550, 009C (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: MCFG 8CE6C5F0, 003C (r1 ALASKA   A M I   1072009 MSFT   97)
(XEN) ACPI: HPET 8CE6C630, 0038 (r1 ALASKA   A M I   1072009 AMI.5000B)
(XEN) ACPI: LP

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-23 Thread Andrew Cooper
On 23/07/2019 18:58, Roman Shaposhnik wrote:
> On Tue, Jul 23, 2019 at 10:50 AM Andrew Cooper
>  wrote:
>> On 23/07/2019 18:48, Roman Shaposhnik wrote:
>>> On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper
>>>  wrote:
 On 23/07/2019 18:32, Roman Shaposhnik wrote:
> Hi Roger!
>
> I applied your patch, removed no-igfx and I still see the original
> problem. Please let me know what other logs/debugs would you need at
> this point.
 Please can you collect a full boot log with iommu=debug
>>> How long of an output should I expect when iommu=debug is enabled?
>>> I've just enabled it and I'm looking at what appears to be an endless
>>> scroll of debug info.
>>>
>>> This is all I see for the good 5 minutes at this point. Culminating with:
>>> (XEN)   (XEN) APIC error on CPU0: 40(00)
>>>
>>> and a failure to boot.
>>>
>>> Note that this is still without no-igfx
>>>
>>> I'm attaching the tail end of this log.
>> Sadly, what is useful is the head of the log, before it starts
>> complaining loudly about every DMA fault.
> No worries. Take a look at the head of the log attached.
>
> Btw, I'm kind of curious why iommu=debug would actually make it crash

The system is rather sickly, and is debugging at a rate slower than
incoming faults, which is going to starve whichever CPU is taking the
IOMMU interrupt.

I wouldn't worry about the APIC error now.

Curiously, there is one single intremap error on boot, which is likely
unrelated.

(XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg 
= 82c0008e

(XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear


This will be irq0 from the IO-APIC.

Can you try booting following the guidance from

(XEN) [VT-D]found ACPI_DMAR_RMRR:

(XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved memory; 
need "iommu_inclusive_mapping=1"?

(XEN) [VT-D] endpoint: :00:02.0


which I noted on my first reply?  Given that Rogers patch didn't help,
something else is going on.

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-23 Thread Roman Shaposhnik
On Tue, Jul 23, 2019 at 10:50 AM Andrew Cooper
 wrote:
>
> On 23/07/2019 18:48, Roman Shaposhnik wrote:
> > On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper
> >  wrote:
> >> On 23/07/2019 18:32, Roman Shaposhnik wrote:
> >>> Hi Roger!
> >>>
> >>> I applied your patch, removed no-igfx and I still see the original
> >>> problem. Please let me know what other logs/debugs would you need at
> >>> this point.
> >> Please can you collect a full boot log with iommu=debug
> > How long of an output should I expect when iommu=debug is enabled?
> > I've just enabled it and I'm looking at what appears to be an endless
> > scroll of debug info.
> >
> > This is all I see for the good 5 minutes at this point. Culminating with:
> > (XEN)   (XEN) APIC error on CPU0: 40(00)
> >
> > and a failure to boot.
> >
> > Note that this is still without no-igfx
> >
> > I'm attaching the tail end of this log.
>
> Sadly, what is useful is the head of the log, before it starts
> complaining loudly about every DMA fault.

No worries. Take a look at the head of the log attached.

Btw, I'm kind of curious why iommu=debug would actually make it crash

Thanks,
Roman.
0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968d018
0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89640018
0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x8962f018
 Xen 4.12.0
(XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n  Tue Jul 23 
17:15:48 UTC 2019
(XEN) Latest ChangeSet:
(XEN) Bootloader: GRUB 2.03
(XEN) Command line: com1=115200,8n1 console=com1,vga iommu=debug 
dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false
(XEN) Xen image load base address: 0x8800
(XEN) Video information:
(XEN)  VGA is graphics mode 1680x1050, 32 bpp
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) EFI RAM map:
(XEN)   - 00058000 (usable)
(XEN)  00058000 - 00059000 (reserved)
(XEN)  00059000 - 0009f000 (usable)
(XEN)  0009f000 - 000a (reserved)
(XEN)  0010 - 8648a000 (usable)
(XEN)  8648a000 - 8648b000 (ACPI NVS)
(XEN)  8648b000 - 864b5000 (reserved)
(XEN)  864b5000 - 8c224000 (usable)
(XEN)  8c224000 - 8c528000 (reserved)
(XEN)  8c528000 - 8c736000 (usable)
(XEN)  8c736000 - 8cea7000 (ACPI NVS)
(XEN)  8cea7000 - 8d2ff000 (reserved)
(XEN)  8d2ff000 - 8d30 (usable)
(XEN)  8d30 - 8d40 (reserved)
(XEN)  e000 - f000 (reserved)
(XEN)  fe00 - fe011000 (reserved)
(XEN)  fec0 - fec01000 (reserved)
(XEN)  fee0 - fee01000 (reserved)
(XEN)  ff00 - 0001 (reserved)
(XEN)  0001 - 00016e00 (usable)
(XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA   A M I   1072009 INTL 20120913)
(XEN) ACPI: FACS 8CE8EF80, 0040
(XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FPDT 8CE6C508, 0044 (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FIDT 8CE6C550, 009C (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: MCFG 8CE6C5F0, 003C (r1 ALASKA   A M I   1072009 MSFT   97)
(XEN) ACPI: HPET 8CE6C630, 0038 (r1 ALASKA   A M I   1072009 AMI.5000B)
(XEN) ACPI: LPIT 8CE6C668, 0094 (r1 INTEL   SKL-ULT0 MSFT   5F)
(XEN) ACPI: SSDT 8CE6C700, 0248 (r2 INTEL  sensrhub0 INTL 20120913)
(XEN) ACPI: SSDT 8CE6C948, 2BAE (r2 INTEL  PtidDevc 1000 INTL 20120913)
(XEN) ACPI: SSDT 8CE6F4F8, 0BE3 (r2 INTEL  Ther_Rvp 1000 INTL 20120913)
(XEN) ACPI: SSDT 8CE700E0, 04A3 (r2 INTEL zpodd 1000 INTL 20120913)
(XEN) ACPI: DBGP 8CE70588, 0034 (r1 INTEL  0 MSFT   5F)
(XEN) ACPI: DBG2 8CE705C0, 0054 (r0 INTEL  0 MSFT   5F)
(XEN) ACPI: SSDT 8CE70618, 06E9 (r2  INTEL xh_rvp070 INTL 20120913)
(XEN) ACPI: SSDT 8CE70D08, 547E (r2 SaSsdt  SaSsdt  3000 INTL 20120913)
(XEN) ACPI: UEFI 8CE76188, 0042 (r10 0)
(XEN) ACPI: SSDT 8CE761D0, 0E73 (r2 CpuRef  CpuSsdt 3000 INTL 20120913)
(XEN) ACPI: BGRT 8CE77048, 0038 (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: DMAR 8CE77080, 00A8 (r1 INTEL  SKL 1 INTL1)
(XEN) ACPI: TPM2 8CE77128, 0034 (r3Tpm2Tabl1 AMI 0)
(XEN) ACPI: ASF! 8CE77160, 00A5 (r32 INTEL   HCG1 TFSMF4240)
(XEN) System RAM: 4003MB (4099736kB)
(XEN) Domain heap initialised
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 8ce8ef80/, 
using 32
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O 

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-23 Thread Andrew Cooper
On 23/07/2019 18:48, Roman Shaposhnik wrote:
> On Tue, Jul 23, 2019 at 10:35 AM Andrew Cooper
>  wrote:
>> On 23/07/2019 18:32, Roman Shaposhnik wrote:
>>> Hi Roger!
>>>
>>> I applied your patch, removed no-igfx and I still see the original
>>> problem. Please let me know what other logs/debugs would you need at
>>> this point.
>> Please can you collect a full boot log with iommu=debug
> How long of an output should I expect when iommu=debug is enabled?
> I've just enabled it and I'm looking at what appears to be an endless
> scroll of debug info.
>
> This is all I see for the good 5 minutes at this point. Culminating with:
> (XEN)   (XEN) APIC error on CPU0: 40(00)
>
> and a failure to boot.
>
> Note that this is still without no-igfx
>
> I'm attaching the tail end of this log.

Sadly, what is useful is the head of the log, before it starts
complaining loudly about every DMA fault.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-23 Thread Andrew Cooper
On 23/07/2019 18:32, Roman Shaposhnik wrote:
> Hi Roger!
>
> I applied your patch, removed no-igfx and I still see the original
> problem. Please let me know what other logs/debugs would you need at
> this point.

Please can you collect a full boot log with iommu=debug

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-23 Thread Roman Shaposhnik
Hi Roger!

I applied your patch, removed no-igfx and I still see the original
problem. Please let me know what other logs/debugs would you need at
this point.

Btw, just to make it clear what patch got applied I'm attaching it to
this email.

Oh, and it seems that that https://downloads.xenproject.org/ SSL
certificate expired yesterday -- perhaps somebody can take a look at
that.

Thanks,
Roman.

On Mon, Jul 22, 2019 at 4:47 PM Andrew Cooper  wrote:
>
> On 23/07/2019 00:36, Roman Shaposhnik wrote:
> > Hi Everyone!
> >
> > Thanks a million for an extremely quick turnaround. I am in my lab
> > again next to the box in question.
> >
> > Should I go ahead and test the latest patch or wait for the official
> > one to be submitted?
> >
> > Thanks,
> > Roman.
>
> Use this patch to test with.  Roger forgot to CC you on the official
> one, but the code changes are identical.
>
> ~Andrew


01-iommu-mappings.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Andrew Cooper
On 23/07/2019 00:36, Roman Shaposhnik wrote:
> Hi Everyone!
>
> Thanks a million for an extremely quick turnaround. I am in my lab
> again next to the box in question.
>
> Should I go ahead and test the latest patch or wait for the official
> one to be submitted?
>
> Thanks,
> Roman.

Use this patch to test with.  Roger forgot to CC you on the official
one, but the code changes are identical.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Roman Shaposhnik
Hi Everyone!

Thanks a million for an extremely quick turnaround. I am in my lab
again next to the box in question.

Should I go ahead and test the latest patch or wait for the official
one to be submitted?

Thanks,
Roman.

On Mon, Jul 22, 2019 at 8:22 AM Roger Pau Monné  wrote:
>
> On Mon, Jul 22, 2019 at 05:02:35PM +0200, Paul Durrant wrote:
> > > -Original Message-
> > > From: Roger Pau Monne 
> > > Sent: 22 July 2019 15:40
> > > To: Paul Durrant 
> > > Cc: 'Roman Shaposhnik' ; 
> > > xen-devel@lists.xenproject.org; jgr...@suse.com; Andrew
> > > Cooper ; boris.ostrov...@oracle.com; 
> > > jbeul...@suse.com
> > > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> > > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> > > index fef97c82f6..88a2430c8c 100644
> > > --- a/xen/arch/x86/mm/p2m.c
> > > +++ b/xen/arch/x86/mm/p2m.c
> > > @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, 
> > > mfn_t mfn,
> > >   */
> > >  for ( i = 0; i < (1UL << page_order); ++i, ++page )
> > >  {
> > > -if ( !need_iommu_pt_sync(d) )
> > > +if ( !has_iommu_pt(d) )
> > >  /* nothing */;
> > >  else if ( get_page_and_type(page, d, PGT_writable_page) )
> > >  put_page_and_type(page);
> > > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, 
> > > unsigned long gfn_l,
> > >
> > >  if ( !paging_mode_translate(p2m->domain) )
> > >  {
> > > -if ( !need_iommu_pt_sync(d) )
> > > +if ( !has_iommu_pt(d) )
> > >  return 0;
> > >  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), 
> > > PAGE_ORDER_4K,
> > >  IOMMUF_readable | IOMMUF_writable);
> > > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, 
> > > unsigned long gfn_l)
> > >
> > >  if ( !paging_mode_translate(d) )
> > >  {
> > > -if ( !need_iommu_pt_sync(d) )
> > > +if ( !has_iommu_pt(d) )
> > >  return 0;
> > >  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> > >  }
> >
> > Yes, this all looks ok to me... although I still find it counterintuitive 
> > that we make p2m calls for PV domains.
>
> I agree, albeit I'm not sure of how to get rid of those, will need to
> look at the callers. For iommu callers we could likely just call
> iommu_legacy_{map/unmap} for PV.
>
> I'm going to formally submit this patch then.
>
> Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Roger Pau Monné
On Mon, Jul 22, 2019 at 05:02:35PM +0200, Paul Durrant wrote:
> > -Original Message-
> > From: Roger Pau Monne 
> > Sent: 22 July 2019 15:40
> > To: Paul Durrant 
> > Cc: 'Roman Shaposhnik' ; xen-devel@lists.xenproject.org; 
> > jgr...@suse.com; Andrew
> > Cooper ; boris.ostrov...@oracle.com; 
> > jbeul...@suse.com
> > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> > index fef97c82f6..88a2430c8c 100644
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, 
> > mfn_t mfn,
> >   */
> >  for ( i = 0; i < (1UL << page_order); ++i, ++page )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > +if ( !has_iommu_pt(d) )
> >  /* nothing */;
> >  else if ( get_page_and_type(page, d, PGT_writable_page) )
> >  put_page_and_type(page);
> > @@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
> > long gfn_l,
> > 
> >  if ( !paging_mode_translate(p2m->domain) )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > +if ( !has_iommu_pt(d) )
> >  return 0;
> >  return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> >  IOMMUF_readable | IOMMUF_writable);
> > @@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, 
> > unsigned long gfn_l)
> > 
> >  if ( !paging_mode_translate(d) )
> >  {
> > -if ( !need_iommu_pt_sync(d) )
> > +if ( !has_iommu_pt(d) )
> >  return 0;
> >  return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> >  }
> 
> Yes, this all looks ok to me... although I still find it counterintuitive 
> that we make p2m calls for PV domains.

I agree, albeit I'm not sure of how to get rid of those, will need to
look at the callers. For iommu callers we could likely just call
iommu_legacy_{map/unmap} for PV.

I'm going to formally submit this patch then.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Paul Durrant


> -Original Message-
> From: Roger Pau Monne 
> Sent: 22 July 2019 15:40
> To: Paul Durrant 
> Cc: 'Roman Shaposhnik' ; xen-devel@lists.xenproject.org; 
> jgr...@suse.com; Andrew
> Cooper ; boris.ostrov...@oracle.com; 
> jbeul...@suse.com
> Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> 
> On Mon, Jul 22, 2019 at 04:03:44PM +0200, Paul Durrant wrote:
> > > -Original Message-
> > [snip]
> > > > > diff --git a/xen/drivers/passthrough/iommu.c 
> > > > > b/xen/drivers/passthrough/iommu.c
> > > > > index 79ec6719f5..9d91f0d633 100644
> > > > > --- a/xen/drivers/passthrough/iommu.c
> > > > > +++ b/xen/drivers/passthrough/iommu.c
> > > > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain 
> > > > > *d)
> > > > >  register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m 
> > > > > table", 0);
> > > > >
> > > > >  hd->status = IOMMU_STATUS_initializing;
> > > > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> > > > > +hd->need_sync = !iommu_use_hap_pt(d);
> > > >
> > > > But this is going to mean the if() below is true for non-strict dom0, 
> > > > which means it pointlessly
> > > maps the dom0 page list when hwdom_iommu_map() should have already mapped 
> > > all conventional RAM.
> > >
> > > Right, this all seems quite broken. Non-translated guests (ie: PV)
> > > would always need iommu page-table sync, but set_identity_p2m_entry
> > > contains the following:
> > >
> > > if ( !paging_mode_translate(p2m->domain) )
> > > {
> > > if ( !need_iommu_pt_sync(d) )
> > > return 0;
> > > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> > > IOMMUF_readable | IOMMUF_writable);
> > > }
> > >
> > > I wonder whether the usage of need_iommu_pt_sync is wrong there, and
> > > should instead be !has_iommu_pt(d), since non-translated domains would
> > > never share the iommu page-tables anyway.
> >
> > You want syncing if the domain has IOMMU page tables and shared EPT is not 
> > in use, so this logic
> just seems wrong.
> 
> Right, so for a PV domain this would mean syncing if the iommu is in
> used, because there's no sharing at all.
> 
> > >
> > > In any case, I think need_sync must be set when the iommu page tables
> > > are not shared, and gating it on iommu_hwdom_strict seems wrong to me,
> > > the strictness of the iommu doesn't affect whether a sync is need or
> > > not.
> >
> > I think need_sync is gated on strict mode because, in 'relaxed' mode, the 
> > mappings that are set up
> when dom0 is started are supposed to be static (modulo hotplug RAM)... so 
> modifications to the
> domain's page list are not supposed to have any effect, and so no 
> synchronization need be done.
> 
> Hm, right, in relaxed mode dom0 is supposed to have all RAM regions
> mapped in the iommu page-tables, so there's no need to modify the
> iommu page tables at run time.
> 
> This approach seems slightly broken if dma operations agains mmio
> regions are attempted by dom0, but anyway, this seems to have worked
> fine so far.
> 
> > >
> > > I've updated the patch to avoid the pointless mapping of dom0 page
> > > list, but I haven't included the change to set_identity_p2m_entry.
> > >
> >
> > So, I think the albeit odd looking logic in iommu_hwdom_init() was actually 
> > correct, but the code in
> set_identity_p2m_entry() is wrong.
> 
> Ack, see the patch below. I think it might also be helpful to add a
> comment to the setting of need_sync in iommu_hwdom_init in order to
> mention that relaxed domains don't need sync because all ram is
> already mapped in the iommu page tables.

Agreed.

> 
> Thanks, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index fef97c82f6..88a2430c8c 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, mfn_t 
> mfn,
>   */
>  for ( i = 0; i < (1UL << page_order); ++i, ++page )
>  {
> -if ( !need_iommu_pt_sync(d) )
> +if ( !has_iommu_pt(d) )
>  /* nothing */;
> 

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Roger Pau Monné
On Mon, Jul 22, 2019 at 04:03:44PM +0200, Paul Durrant wrote:
> > -Original Message-
> [snip]
> > > > diff --git a/xen/drivers/passthrough/iommu.c 
> > > > b/xen/drivers/passthrough/iommu.c
> > > > index 79ec6719f5..9d91f0d633 100644
> > > > --- a/xen/drivers/passthrough/iommu.c
> > > > +++ b/xen/drivers/passthrough/iommu.c
> > > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> > > >  register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m 
> > > > table", 0);
> > > >
> > > >  hd->status = IOMMU_STATUS_initializing;
> > > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> > > > +hd->need_sync = !iommu_use_hap_pt(d);
> > >
> > > But this is going to mean the if() below is true for non-strict dom0, 
> > > which means it pointlessly
> > maps the dom0 page list when hwdom_iommu_map() should have already mapped 
> > all conventional RAM.
> > 
> > Right, this all seems quite broken. Non-translated guests (ie: PV)
> > would always need iommu page-table sync, but set_identity_p2m_entry
> > contains the following:
> > 
> > if ( !paging_mode_translate(p2m->domain) )
> > {
> > if ( !need_iommu_pt_sync(d) )
> > return 0;
> > return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> > IOMMUF_readable | IOMMUF_writable);
> > }
> > 
> > I wonder whether the usage of need_iommu_pt_sync is wrong there, and
> > should instead be !has_iommu_pt(d), since non-translated domains would
> > never share the iommu page-tables anyway.
> 
> You want syncing if the domain has IOMMU page tables and shared EPT is not in 
> use, so this logic just seems wrong.

Right, so for a PV domain this would mean syncing if the iommu is in
used, because there's no sharing at all.

> > 
> > In any case, I think need_sync must be set when the iommu page tables
> > are not shared, and gating it on iommu_hwdom_strict seems wrong to me,
> > the strictness of the iommu doesn't affect whether a sync is need or
> > not.
> 
> I think need_sync is gated on strict mode because, in 'relaxed' mode, the 
> mappings that are set up when dom0 is started are supposed to be static 
> (modulo hotplug RAM)... so modifications to the domain's page list are not 
> supposed to have any effect, and so no synchronization need be done.

Hm, right, in relaxed mode dom0 is supposed to have all RAM regions
mapped in the iommu page-tables, so there's no need to modify the
iommu page tables at run time.

This approach seems slightly broken if dma operations agains mmio
regions are attempted by dom0, but anyway, this seems to have worked
fine so far.

> > 
> > I've updated the patch to avoid the pointless mapping of dom0 page
> > list, but I haven't included the change to set_identity_p2m_entry.
> > 
> 
> So, I think the albeit odd looking logic in iommu_hwdom_init() was actually 
> correct, but the code in set_identity_p2m_entry() is wrong.

Ack, see the patch below. I think it might also be helpful to add a
comment to the setting of need_sync in iommu_hwdom_init in order to
mention that relaxed domains don't need sync because all ram is
already mapped in the iommu page tables.

Thanks, Roger.
---8<---
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fef97c82f6..88a2430c8c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -836,7 +836,7 @@ guest_physmap_add_page(struct domain *d, gfn_t gfn, mfn_t 
mfn,
  */
 for ( i = 0; i < (1UL << page_order); ++i, ++page )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 /* nothing */;
 else if ( get_page_and_type(page, d, PGT_writable_page) )
 put_page_and_type(page);
@@ -1341,7 +1341,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 
 if ( !paging_mode_translate(p2m->domain) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
 IOMMUF_readable | IOMMUF_writable);
@@ -1432,7 +1432,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned 
long gfn_l)
 
 if ( !paging_mode_translate(d) )
 {
-if ( !need_iommu_pt_sync(d) )
+if ( !has_iommu_pt(d) )
 return 0;
 return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
 }


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Paul Durrant
> -Original Message-
[snip]
> > > diff --git a/xen/drivers/passthrough/iommu.c 
> > > b/xen/drivers/passthrough/iommu.c
> > > index 79ec6719f5..9d91f0d633 100644
> > > --- a/xen/drivers/passthrough/iommu.c
> > > +++ b/xen/drivers/passthrough/iommu.c
> > > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> > >  register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m 
> > > table", 0);
> > >
> > >  hd->status = IOMMU_STATUS_initializing;
> > > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> > > +hd->need_sync = !iommu_use_hap_pt(d);
> >
> > But this is going to mean the if() below is true for non-strict dom0, which 
> > means it pointlessly
> maps the dom0 page list when hwdom_iommu_map() should have already mapped all 
> conventional RAM.
> 
> Right, this all seems quite broken. Non-translated guests (ie: PV)
> would always need iommu page-table sync, but set_identity_p2m_entry
> contains the following:
> 
> if ( !paging_mode_translate(p2m->domain) )
> {
> if ( !need_iommu_pt_sync(d) )
> return 0;
> return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> IOMMUF_readable | IOMMUF_writable);
> }
> 
> I wonder whether the usage of need_iommu_pt_sync is wrong there, and
> should instead be !has_iommu_pt(d), since non-translated domains would
> never share the iommu page-tables anyway.

You want syncing if the domain has IOMMU page tables and shared EPT is not in 
use, so this logic just seems wrong.

> 
> In any case, I think need_sync must be set when the iommu page tables
> are not shared, and gating it on iommu_hwdom_strict seems wrong to me,
> the strictness of the iommu doesn't affect whether a sync is need or
> not.

I think need_sync is gated on strict mode because, in 'relaxed' mode, the 
mappings that are set up when dom0 is started are supposed to be static (modulo 
hotplug RAM)... so modifications to the domain's page list are not supposed to 
have any effect, and so no synchronization need be done.

> 
> I've updated the patch to avoid the pointless mapping of dom0 page
> list, but I haven't included the change to set_identity_p2m_entry.
> 

So, I think the albeit odd looking logic in iommu_hwdom_init() was actually 
correct, but the code in set_identity_p2m_entry() is wrong.

  Paul

> Thanks, Roger.
> ---8<---
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 79ec6719f5..abf9e38607 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -185,8 +185,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>  register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 
> 0);
> 
>  hd->status = IOMMU_STATUS_initializing;
> -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> -if ( need_iommu_pt_sync(d) )
> +hd->need_sync = !iommu_use_hap_pt(d);
> +if ( iommu_hwdom_strict && need_iommu_pt_sync(d) )
>  {
>  struct page_info *page;
>  unsigned int i = 0, flush_flags = 0;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Roger Pau Monné
On Mon, Jul 22, 2019 at 01:54:00PM +0200, Paul Durrant wrote:
> > -Original Message-
> > From: Roger Pau Monne 
> > Sent: 22 July 2019 12:49
> > To: Paul Durrant ; 'Roman Shaposhnik' 
> > 
> > Cc: xen-devel@lists.xenproject.org; jgr...@suse.com; Andrew Cooper 
> > ;
> > boris.ostrov...@oracle.com; jbeul...@suse.com
> > Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> > 
> > On Mon, Jul 22, 2019 at 08:20:36AM +, Paul Durrant wrote:
> > > > -Original Message-
> > > [snip]
> > > > > (XEN) Domain heap initialised
> > > > > (XEN) ACPI: 32/64X FACS address mismatch in FADT -
> > > > > 8ce8ef80/, using 32
> > > > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119
> > > > > (XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
> > > > > (XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved
> > > > > memory; need "iommu_inclusive_mapping=1"?
> > >
> > > This is your problem. In versions prior to 4.11 (I think, and certainly 
> > > 4.12)
> > iommu_inclusive_mapping used to default on, whereas now it appears to 
> > default off. In most
> > circumstances this is fine because there is a new flag, 
> > iommu_hwdom_reserved, which defaults on and
> > this makes sure that all e820 reserved regions are identity mapped (which 
> > usually covers undeclared
> > RMRRs). You have the opposite problem... a declared RMRR which is not 
> > reserved, so you will need
> > iommu_inclusive_mapping.
> > 
> > I think there's a bug in the initialization of iommu for a PV dom0,
> > which leaves dom0 without rmrr entries. Can you try the patch below
> > and report whether it solves your issue?
> > 
> > Thanks, Roger.
> > ---8<---
> > diff --git a/xen/drivers/passthrough/iommu.c 
> > b/xen/drivers/passthrough/iommu.c
> > index 79ec6719f5..9d91f0d633 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> >  register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m 
> > table", 0);
> > 
> >  hd->status = IOMMU_STATUS_initializing;
> > -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> > +hd->need_sync = !iommu_use_hap_pt(d);
> 
> But this is going to mean the if() below is true for non-strict dom0, which 
> means it pointlessly maps the dom0 page list when hwdom_iommu_map() should 
> have already mapped all conventional RAM.

Right, this all seems quite broken. Non-translated guests (ie: PV)
would always need iommu page-table sync, but set_identity_p2m_entry
contains the following:

if ( !paging_mode_translate(p2m->domain) )
{
if ( !need_iommu_pt_sync(d) )
return 0;
return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
IOMMUF_readable | IOMMUF_writable);
}

I wonder whether the usage of need_iommu_pt_sync is wrong there, and
should instead be !has_iommu_pt(d), since non-translated domains would
never share the iommu page-tables anyway.

In any case, I think need_sync must be set when the iommu page tables
are not shared, and gating it on iommu_hwdom_strict seems wrong to me,
the strictness of the iommu doesn't affect whether a sync is need or
not.

I've updated the patch to avoid the pointless mapping of dom0 page
list, but I haven't included the change to set_identity_p2m_entry.

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 79ec6719f5..abf9e38607 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -185,8 +185,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
 register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
 
 hd->status = IOMMU_STATUS_initializing;
-hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
-if ( need_iommu_pt_sync(d) )
+hd->need_sync = !iommu_use_hap_pt(d);
+if ( iommu_hwdom_strict && need_iommu_pt_sync(d) )
 {
 struct page_info *page;
 unsigned int i = 0, flush_flags = 0;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne 
> Sent: 22 July 2019 12:49
> To: Paul Durrant ; 'Roman Shaposhnik' 
> 
> Cc: xen-devel@lists.xenproject.org; jgr...@suse.com; Andrew Cooper 
> ;
> boris.ostrov...@oracle.com; jbeul...@suse.com
> Subject: Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx
> 
> On Mon, Jul 22, 2019 at 08:20:36AM +, Paul Durrant wrote:
> > > -Original Message-
> > [snip]
> > > > (XEN) Domain heap initialised
> > > > (XEN) ACPI: 32/64X FACS address mismatch in FADT -
> > > > 8ce8ef80/, using 32
> > > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119
> > > > (XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
> > > > (XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved
> > > > memory; need "iommu_inclusive_mapping=1"?
> >
> > This is your problem. In versions prior to 4.11 (I think, and certainly 
> > 4.12)
> iommu_inclusive_mapping used to default on, whereas now it appears to default 
> off. In most
> circumstances this is fine because there is a new flag, iommu_hwdom_reserved, 
> which defaults on and
> this makes sure that all e820 reserved regions are identity mapped (which 
> usually covers undeclared
> RMRRs). You have the opposite problem... a declared RMRR which is not 
> reserved, so you will need
> iommu_inclusive_mapping.
> 
> I think there's a bug in the initialization of iommu for a PV dom0,
> which leaves dom0 without rmrr entries. Can you try the patch below
> and report whether it solves your issue?
> 
> Thanks, Roger.
> ---8<---
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 79ec6719f5..9d91f0d633 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>  register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 
> 0);
> 
>  hd->status = IOMMU_STATUS_initializing;
> -hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> +hd->need_sync = !iommu_use_hap_pt(d);

But this is going to mean the if() below is true for non-strict dom0, which 
means it pointlessly maps the dom0 page list when hwdom_iommu_map() should have 
already mapped all conventional RAM.

  Paul

>  if ( need_iommu_pt_sync(d) )
>  {
>  struct page_info *page;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Roger Pau Monné
On Mon, Jul 22, 2019 at 08:20:36AM +, Paul Durrant wrote:
> > -Original Message-
> [snip]
> > > (XEN) Domain heap initialised
> > > (XEN) ACPI: 32/64X FACS address mismatch in FADT -
> > > 8ce8ef80/, using 32
> > > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119
> > > (XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
> > > (XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved
> > > memory; need "iommu_inclusive_mapping=1"?
> 
> This is your problem. In versions prior to 4.11 (I think, and certainly 4.12) 
> iommu_inclusive_mapping used to default on, whereas now it appears to default 
> off. In most circumstances this is fine because there is a new flag, 
> iommu_hwdom_reserved, which defaults on and this makes sure that all e820 
> reserved regions are identity mapped (which usually covers undeclared RMRRs). 
> You have the opposite problem... a declared RMRR which is not reserved, so 
> you will need iommu_inclusive_mapping.

I think there's a bug in the initialization of iommu for a PV dom0,
which leaves dom0 without rmrr entries. Can you try the patch below
and report whether it solves your issue?

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 79ec6719f5..9d91f0d633 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
 register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
 
 hd->status = IOMMU_STATUS_initializing;
-hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
+hd->need_sync = !iommu_use_hap_pt(d);
 if ( need_iommu_pt_sync(d) )
 {
 struct page_info *page;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Paul Durrant
> -Original Message-
[snip]
> > (XEN) Domain heap initialised
> > (XEN) ACPI: 32/64X FACS address mismatch in FADT -
> > 8ce8ef80/, using 32
> > (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec0, GSI 0-119
> > (XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
> > (XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved
> > memory; need "iommu_inclusive_mapping=1"?

This is your problem. In versions prior to 4.11 (I think, and certainly 4.12) 
iommu_inclusive_mapping used to default on, whereas now it appears to default 
off. In most circumstances this is fine because there is a new flag, 
iommu_hwdom_reserved, which defaults on and this makes sure that all e820 
reserved regions are identity mapped (which usually covers undeclared RMRRs). 
You have the opposite problem... a declared RMRR which is not reserved, so you 
will need iommu_inclusive_mapping.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-22 Thread Paul Durrant
De-htmling...

-
[snip]
For RMRRs themselves, system firmware is well known for abiding by the spec 
[citation needed], but an RMRR must be honoured, because the entire purpose of 
them is to state "this device won't function without access to this area".

An RMRR in a hole, while a violation of the spec, is obviously fine to use in 
practice, so we should just accept it and stop complaining.

OTOH, RMRRs which hit other memory (particularly RAM) need more care, and 
probably want to force override the e820 to reserved.  Nothing good will come 
from trusting the e820 over the DMAR table here, seeing as there is clearly an 
error somewhere in the firmware-provided information.

However - I'm struggling to locate anywhere which actually walks dom0's RMRR 
list and inserts them into the IOMMU.  Anyone got any hints?
-

I think you want setup_hwdom_rmrr() in passthrough/vtd/iommu.c

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-20 Thread Andrew Cooper
On 19/07/2019 20:31, Roman Shaposhnik wrote:
> Hi!
>
> we're using Xen on Advantech ARK-2250 Embedded Box PC:
> 
> https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf
>
> After upgrading to Xen 4.12.0 from 4.11.0 we now have to utilize  
> iommu=no-igfx
> workaround as per:
>  https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#iommu
>
> Without the workaround the screen appears to be garbled with colored
> static noise and the following message keeps showing up:
> (XEN) printk: 26235 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> 8e43c000, iommu reg = 82c00021d000
> (XEN) printk: 26303 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> 8e2c6000, iommu reg = 82c00021d000
>
> Once iommu=no-igfx is applied the box can boot fine.
>
> At the end of this email, you can see a full log of the box booting
> all the way into Dom0 with iommu=no-igfx applied. I am also attaching
> similar log without no-igfx
>
> Please let me know if you need any more information to help us diagnose this.

This will be a consequence of trying to remove various pieces of
stupidity with the preexisting IOMMU logic, in an attempt to unify the
PV and PVH paths.

As for the symptoms you're seeing, that is because the GPU is not being
given access to the RAM stolen for graphics purposes.

Picking the log apart:

(XEN) EFI RAM map:
(XEN)   - 00058000 (usable)
(XEN)  00058000 - 00059000 (reserved)
(XEN)  00059000 - 0009f000 (usable)
(XEN)  0009f000 - 000a (reserved)
(XEN)  0010 - 8648a000 (usable)
(XEN)  8648a000 - 8648b000 (ACPI NVS)
(XEN)  8648b000 - 864b5000 (reserved)
(XEN)  864b5000 - 8c224000 (usable)
(XEN)  8c224000 - 8c528000 (reserved)
(XEN)  8c528000 - 8c736000 (usable)
(XEN)  8c736000 - 8cea7000 (ACPI NVS)
(XEN)  8cea7000 - 8d2ff000 (reserved)
(XEN)  8d2ff000 - 8d30 (usable)
(XEN)  8d30 - 8d40 (reserved)
(XEN)  e000 - f000 (reserved)
(XEN)  fe00 - fe011000 (reserved)
(XEN)  fec0 - fec01000 (reserved)
(XEN)  fee0 - fee01000 (reserved)
(XEN)  ff00 - 0001 (reserved)
(XEN)  0001 - 00016e00 (usable)
...
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(XEN) [VT-D]  RMRR address range 8d80..8fff not in reserved memory; 
need "iommu_inclusive_mapping=1"?
(XEN) Switched to APIC driver x2apic_cluster.
...
(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr 8e48, 
iommu reg = 82c00021d000
(XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set
(XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg 
= 82c00021f000
(XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear


The RMRR identified is a hole in the e820, and the range which is
causing IOMMU faults.  Clearly it isn't being set up correctly.

First of all, can you check what effect booting with
iommu_inclusive_mapping=1 has please?  While at it, iommu=debug would
also be helpful.

Back to the log.  Strictly speaking, this is a violation of the VT-d
spec.  Section 8.4 Reserved Memory Region Reporting Structure says:

"BIOS must report the RMRR reported memory addresses as reserved (or as
EFI runtime) in the system memory map returned through methods such as
INT15, EFI GetMemoryMap etc."

However, Xen's logic here is very broken, and in need of fixing.

For that message, it only checks the first and last address for being
reserved, not the entire region, which will give it plenty of false
negatives.

For RMRRs themselves, system firmware is well known for abiding by the
spec [citation needed], but an RMRR must be honoured, because the entire
purpose of them is to state "this device won't function without access
to this area".

An RMRR in a hole, while a violation of the spec, is obviously fine to
use in practice, so we should just accept it and stop complaining.

OTOH, RMRRs which hit other memory (particularly RAM) need more care,
and probably want to force override the e820 to reserved.  Nothing good
will come from trusting the e820 over the DMAR table here, seeing as
there is clearly an error somewhere in the firmware-provided information.

However - I'm struggling to locate anywhere which actually walks dom0's
RMRR list and inserts them into the IOMMU.  Anyone got any hints?

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx

2019-07-19 Thread Roman Shaposhnik
CCing relevant maintainers as well (sorry for not doing it first time around)

On Fri, Jul 19, 2019 at 12:31 PM Roman Shaposhnik  wrote:
>
> Hi!
>
> we're using Xen on Advantech ARK-2250 Embedded Box PC:
> 
> https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf
>
> After upgrading to Xen 4.12.0 from 4.11.0 we now have to utilize  
> iommu=no-igfx
> workaround as per:
>  https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#iommu
>
> Without the workaround the screen appears to be garbled with colored
> static noise and the following message keeps showing up:
> (XEN) printk: 26235 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> 8e43c000, iommu reg = 82c00021d000
> (XEN) printk: 26303 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault addr
> 8e2c6000, iommu reg = 82c00021d000
>
> Once iommu=no-igfx is applied the box can boot fine.
>
> At the end of this email, you can see a full log of the box booting
> all the way into Dom0 with iommu=no-igfx applied. I am also attaching
> similar log without no-igfx
>
> Please let me know if you need any more information to help us diagnose this.
>
> Thanks,
> Roman.
>
> 0x:0x00:0x02.0x0: ROM: 0x1 bytes at 0x8968d018
> 0x:0x02:0x00.0x0: ROM: 0x1 bytes at 0x89640018
> 0x:0x00:0x1f.0x6: ROM: 0x10c00 bytes at 0x8962f018
>  Xen 4.12.0
> (XEN) Xen version 4.12.0 (@) (gcc (Alpine 6.4.0) 6.4.0) debug=n  Mon
> Jun 17 18:50:07 UTC 2019
> (XEN) Latest ChangeSet:
> (XEN) Bootloader: GRUB 2.03
> (XEN) Command line: iommu=no-igfx com1=115200,8n1 console=com1,vga
> dom0_mem=1024M,max:1024M dom0_max_vcpus=1 dom0_vcpus_pin smt=false
> (XEN) Xen image load base address: 0x8860
> (XEN) Video information:
> (XEN)  VGA is graphics mode 1680x1050, 32 bpp
> (XEN) Disc information:
> (XEN)  Found 0 MBR signatures
> (XEN)  Found 1 EDD information structures
> (XEN) EFI RAM map:
> (XEN)   - 00058000 (usable)
> (XEN)  00058000 - 00059000 (reserved)
> (XEN)  00059000 - 0009f000 (usable)
> (XEN)  0009f000 - 000a (reserved)
> (XEN)  0010 - 8648a000 (usable)
> (XEN)  8648a000 - 8648b000 (ACPI NVS)
> (XEN)  8648b000 - 864b5000 (reserved)
> (XEN)  864b5000 - 8c224000 (usable)
> (XEN)  8c224000 - 8c528000 (reserved)
> (XEN)  8c528000 - 8c736000 (usable)
> (XEN)  8c736000 - 8cea7000 (ACPI NVS)
> (XEN)  8cea7000 - 8d2ff000 (reserved)
> (XEN)  8d2ff000 - 8d30 (usable)
> (XEN)  8d30 - 8d40 (reserved)
> (XEN)  e000 - f000 (reserved)
> (XEN)  fe00 - fe011000 (reserved)
> (XEN)  fec0 - fec01000 (reserved)
> (XEN)  fee0 - fee01000 (reserved)
> (XEN)  ff00 - 0001 (reserved)
> (XEN)  0001 - 00016e00 (usable)
> (XEN) ACPI: RSDP 8CE49000, 0024 (r2 ALASKA)
> (XEN) ACPI: XSDT 8CE490A8, 00CC (r1 ALASKA   A M I   1072009 AMI 10013)
> (XEN) ACPI: FACP 8CE6C370, 010C (r5 ALASKA   A M I   1072009 AMI 10013)
> (XEN) ACPI: DSDT 8CE49208, 23167 (r2 ALASKA   A M I   1072009 INTL 20120913)
> (XEN) ACPI: FACS 8CE8EF80, 0040
> (XEN) ACPI: APIC 8CE6C480, 0084 (r3 ALASKA   A M I   1072009 AMI 10013)
> (XEN) ACPI: FPDT 8CE6C508, 0044 (r1 ALASKA   A M I   1072009 AMI 10013)
> (XEN) ACPI: FIDT 8CE6C550, 009C (r1 ALASKA   A M I   1072009 AMI 10013)
> (XEN) ACPI: MCFG 8CE6C5F0, 003C (r1 ALASKA   A M I   1072009 MSFT   97)
> (XEN) ACPI: HPET 8CE6C630, 0038 (r1 ALASKA   A M I   1072009 AMI.5000B)
> (XEN) ACPI: LPIT 8CE6C668, 0094 (r1 INTEL   SKL-ULT0 MSFT   5F)
> (XEN) ACPI: SSDT 8CE6C700, 0248 (r2 INTEL  sensrhub0 INTL 20120913)
> (XEN) ACPI: SSDT 8CE6C948, 2BAE (r2 INTEL  PtidDevc 1000 INTL 20120913)
> (XEN) ACPI: SSDT 8CE6F4F8, 0BE3 (r2 INTEL  Ther_Rvp 1000 INTL 20120913)
> (XEN) ACPI: SSDT 8CE700E0, 04A3 (r2 INTEL zpodd 1000 INTL 20120913)
> (XEN) ACPI: DBGP 8CE70588, 0034 (r1 INTEL  0 MSFT   5F)
> (XEN) ACPI: DBG2 8CE705C0, 0054 (r0 INTEL  0 MSFT   5F)
> (XEN) ACPI: SSDT 8CE70618, 06E9 (r2  INTEL xh_rvp070 INTL 20120913)
> (XEN) ACPI: SSDT 8CE70D08, 547E (r2 SaSsdt  SaSsdt  3000 INTL 20120913)
> (XEN) ACPI: UEFI 8CE76188, 0042 (r10 0)
> (XEN) ACPI: SSDT 8CE761D0, 0E73 (r2 CpuRef  CpuSsdt 3000 INTL 20120913)
> (XEN) ACPI: BGRT 8CE77048, 0038 (r1 ALASKA   A M I   1072009 AMI 10013)
> (XEN) ACPI: DMAR 8CE77080, 00A8 (r1 INTEL  SKL 1 INTL1)
> (XEN) ACPI: TPM2 8CE77128, 0034 (r3Tpm2Tabl1 AMI 0)
> (XEN) ACPI: ASF! 8CE77160, 00A5 (r32 INTEL   HCG1 TFSMF4240)
> (XE