Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.

2016-01-14 Thread Jan Beulich
>>> On 23.12.15 at 09:25,  wrote:
> @@ -182,7 +186,7 @@ static int enter_state(u32 state)
>  error = tboot_s3_resume();
>  break;
>  case ACPI_STATE_S5:
> -acpi_enter_sleep_state(ACPI_STATE_S5);
> +error = acpi_enter_sleep_state(ACPI_STATE_S5);

I can't see how this is related to the purpose of the patch. I don't
mind such error checking being added, but not in this huge patch.
It would anyway be nice if you could see about splitting this
apart, to aid reviewing and - in case it would be needed after
committing - bisection.

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2443,11 +2443,18 @@ static int __get_page_type(struct page_info *page, 
> unsigned long type,
>  if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>  {
>  if ( (x & PGT_type_mask) == PGT_writable_page )
> -iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +{
> +rc = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +return rc;
> +}
>  else if ( type == PGT_writable_page )
> -iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> -   page_to_mfn(page),
> -   IOMMUF_readable|IOMMUF_writable);
> +{
> +rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> +page_to_mfn(page),
> +IOMMUF_readable|IOMMUF_writable);
> +if ( rc )
> +return rc;
> +}
>  }
>  }

Again you can't simply return here, or else you leak the type
reference, and you indefinitely stall any other CPU waiting for
page validation to happen.

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -829,15 +829,23 @@ out:
>   need_modify_vtd_table )
>  {
>  if ( iommu_hap_pt_share )
> -iommu_pte_flush(d, gfn, _entry->epte, order, 
> vtd_pte_present);
> +rc = iommu_pte_flush(d, gfn, _entry->epte, order, 
> vtd_pte_present);
>  else
>  {
>  if ( iommu_flags )
>  for ( i = 0; i < (1 << order); i++ )
> -iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> +{
> +rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, 
> iommu_flags);
> +if ( rc )
> +break;
> +}

And the pattern repeats - you can't just exit without undoing what
so far was done.

>  else
>  for ( i = 0; i < (1 << order); i++ )
> -iommu_unmap_page(d, gfn + i);
> +{
> +rc = iommu_unmap_page(d, gfn + i);
> +if ( rc )
> +break;
> +}

As a special case, unmapping should perhaps continue despite an error,
in an attempt to do best effort cleanup.

I'm not going to continue further down, as I suspect I'll find more of
the same class of issues.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.

2016-01-14 Thread Jan Beulich
>>> On 25.12.15 at 03:53,  wrote:
>>  From: Xu, Quan
>> Sent: Wednesday, December 23, 2015 4:26 PM
>> --- a/xen/arch/x86/acpi/power.c
>> +++ b/xen/arch/x86/acpi/power.c
>> @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
>> 
>>  static int device_power_down(void)
>>  {
>> +int rc;
>> +
>>  console_suspend();
>> 
>>  time_suspend();
>> @@ -53,7 +55,9 @@ static int device_power_down(void)
>> 
>>  ioapic_suspend();
>> 
>> -iommu_suspend();
>> +rc = iommu_suspend();
>> +if ( rc )
>> +return rc;
>> 
>>  lapic_suspend();
>> 
> 
> Looks error handling is not only a problem in VT-d code. Above
> actually should check return values of all suspend callbacks. Just
> checking iommu_suspend is not enough, but it's a good improvement
> anyway...

No, it's not - it leaves the system in a non-working state without
undoing whatever succeeded already.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.

2016-01-05 Thread Xu, Quan
On December 25 2015 10:54 AM,  wrote:
> > From: Xu, Quan
> > Sent: Wednesday, December 23, 2015 4:26 PM
> >
> > This patch checks all kinds of error and all the way up the call trees
> > of VT-d Device-TLB flush.
> >
> > Signed-off-by: Quan Xu 
> > ---
> >  xen/arch/x86/acpi/power.c |   8 +-
> >  xen/arch/x86/crash.c  |   3 +-
> >  xen/arch/x86/domain_build.c   |   5 +-
> >  xen/arch/x86/mm.c |  15 ++-
> >  xen/arch/x86/mm/p2m-ept.c |  14 ++-
> >  xen/arch/x86/mm/p2m-pt.c  |  14 ++-
> >  xen/arch/x86/mm/p2m.c |  19 +++-
> >  xen/arch/x86/x86_64/mm.c  |   7 +-
> >  xen/common/domain.c   |   3 +-
> >  xen/common/grant_table.c  |   5 +-
> >  xen/common/memory.c   |  13 ++-
> >  xen/drivers/passthrough/amd/iommu_init.c  |   4 +-
> >  xen/drivers/passthrough/amd/pci_amd_iommu.c   |   4 +-
> >  xen/drivers/passthrough/arm/smmu.c|  13 ++-
> >  xen/drivers/passthrough/iommu.c   |  47 +---
> >  xen/drivers/passthrough/vtd/extern.h  |   4 +-
> >  xen/drivers/passthrough/vtd/iommu.c   | 157
> > --
> >  xen/drivers/passthrough/vtd/qinval.c  |   2 +-
> >  xen/drivers/passthrough/vtd/quirks.c  |  26 +++--
> >  xen/drivers/passthrough/vtd/x86/vtd.c |  13 ++-
> >  xen/drivers/passthrough/x86/iommu.c   |   6 +-
> >  xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   4 +-
> >  xen/include/asm-x86/iommu.h   |   2 +-
> >  xen/include/xen/iommu.h   |  20 ++--
> >  24 files changed, 300 insertions(+), 108 deletions(-)
> >

Kevin, 
Thanks for your comments!! It would take much time to review it.
I will fix them in next v5. May I discuss with you f2f for some dubious case?


Quan



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.

2015-12-24 Thread Tian, Kevin
> From: Xu, Quan
> Sent: Wednesday, December 23, 2015 4:26 PM
> 
> This patch checks all kinds of error and all the way up
> the call trees of VT-d Device-TLB flush.
> 
> Signed-off-by: Quan Xu 
> ---
>  xen/arch/x86/acpi/power.c |   8 +-
>  xen/arch/x86/crash.c  |   3 +-
>  xen/arch/x86/domain_build.c   |   5 +-
>  xen/arch/x86/mm.c |  15 ++-
>  xen/arch/x86/mm/p2m-ept.c |  14 ++-
>  xen/arch/x86/mm/p2m-pt.c  |  14 ++-
>  xen/arch/x86/mm/p2m.c |  19 +++-
>  xen/arch/x86/x86_64/mm.c  |   7 +-
>  xen/common/domain.c   |   3 +-
>  xen/common/grant_table.c  |   5 +-
>  xen/common/memory.c   |  13 ++-
>  xen/drivers/passthrough/amd/iommu_init.c  |   4 +-
>  xen/drivers/passthrough/amd/pci_amd_iommu.c   |   4 +-
>  xen/drivers/passthrough/arm/smmu.c|  13 ++-
>  xen/drivers/passthrough/iommu.c   |  47 +---
>  xen/drivers/passthrough/vtd/extern.h  |   4 +-
>  xen/drivers/passthrough/vtd/iommu.c   | 157
> --
>  xen/drivers/passthrough/vtd/qinval.c  |   2 +-
>  xen/drivers/passthrough/vtd/quirks.c  |  26 +++--
>  xen/drivers/passthrough/vtd/x86/vtd.c |  13 ++-
>  xen/drivers/passthrough/x86/iommu.c   |   6 +-
>  xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   4 +-
>  xen/include/asm-x86/iommu.h   |   2 +-
>  xen/include/xen/iommu.h   |  20 ++--
>  24 files changed, 300 insertions(+), 108 deletions(-)
> 
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index f41f0de..1974721 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
> 
>  static int device_power_down(void)
>  {
> +int rc;
> +
>  console_suspend();
> 
>  time_suspend();
> @@ -53,7 +55,9 @@ static int device_power_down(void)
> 
>  ioapic_suspend();
> 
> -iommu_suspend();
> +rc = iommu_suspend();
> +if ( rc )
> +return rc;
> 
>  lapic_suspend();
> 

Looks error handling is not only a problem in VT-d code. Above
actually should check return values of all suspend callbacks. Just
checking iommu_suspend is not enough, but it's a good improvement
anyway...

[...]

> diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
> index bca6fe7..a11bc2a 100644
> --- a/xen/arch/x86/domain_build.c
> +++ b/xen/arch/x86/domain_build.c
> @@ -1627,7 +1627,10 @@ int __init construct_dom0(
>  }
> 
>  if ( d->domain_id == hardware_domid )
> -iommu_hwdom_init(d);
> +{
> +if ( iommu_hwdom_init(d) )
> +printk("Xen warning : IOMMU hardware domain init failed.\n");
> +}

if construct_dom0 fails, guess we can panic here? e.g. simply move earlier
BUG_ON(rc != 0) after above trunk. In an ideal case we may disable 
iommu_enabled upon error at this point, to allow moving forward. But that
can be improved separately.

[...]

> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 202ff76..3c1db05 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2443,11 +2443,18 @@ static int __get_page_type(struct page_info *page, 
> unsigned
> long type,
>  if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>  {
>  if ( (x & PGT_type_mask) == PGT_writable_page )
> -iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +{
> +rc = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +return rc;
> +}

looks you return absolutely regardless of error check. There are still some
useful code after this point...

>  else if ( type == PGT_writable_page )
> -iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> -   page_to_mfn(page),
> -   IOMMUF_readable|IOMMUF_writable);
> +{
> +rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> +page_to_mfn(page),
> +IOMMUF_readable|IOMMUF_writable);
> +if ( rc )
> +return rc;
> +}

this one is correct.

[...]

> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index 709920a..b2b340d 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -675,11 +675,19 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long
> gfn, mfn_t mfn,
>  }

curious why there's no similar check on code below:

if ( iommu_use_hap_pt(p2m->domain) )
{
if ( iommu_old_flags )
**amd_iommu_flush_pages(p2m->domain, gfn, page_order)**;
}

>  else if ( 

[Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.

2015-12-23 Thread Quan Xu
This patch checks all kinds of error and all the way up
the call trees of VT-d Device-TLB flush.

Signed-off-by: Quan Xu 
---
 xen/arch/x86/acpi/power.c |   8 +-
 xen/arch/x86/crash.c  |   3 +-
 xen/arch/x86/domain_build.c   |   5 +-
 xen/arch/x86/mm.c |  15 ++-
 xen/arch/x86/mm/p2m-ept.c |  14 ++-
 xen/arch/x86/mm/p2m-pt.c  |  14 ++-
 xen/arch/x86/mm/p2m.c |  19 +++-
 xen/arch/x86/x86_64/mm.c  |   7 +-
 xen/common/domain.c   |   3 +-
 xen/common/grant_table.c  |   5 +-
 xen/common/memory.c   |  13 ++-
 xen/drivers/passthrough/amd/iommu_init.c  |   4 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   4 +-
 xen/drivers/passthrough/arm/smmu.c|  13 ++-
 xen/drivers/passthrough/iommu.c   |  47 +---
 xen/drivers/passthrough/vtd/extern.h  |   4 +-
 xen/drivers/passthrough/vtd/iommu.c   | 157 --
 xen/drivers/passthrough/vtd/qinval.c  |   2 +-
 xen/drivers/passthrough/vtd/quirks.c  |  26 +++--
 xen/drivers/passthrough/vtd/x86/vtd.c |  13 ++-
 xen/drivers/passthrough/x86/iommu.c   |   6 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   4 +-
 xen/include/asm-x86/iommu.h   |   2 +-
 xen/include/xen/iommu.h   |  20 ++--
 24 files changed, 300 insertions(+), 108 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index f41f0de..1974721 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
 
 static int device_power_down(void)
 {
+int rc;
+
 console_suspend();
 
 time_suspend();
@@ -53,7 +55,9 @@ static int device_power_down(void)
 
 ioapic_suspend();
 
-iommu_suspend();
+rc = iommu_suspend();
+if ( rc )
+return rc;
 
 lapic_suspend();
 
@@ -182,7 +186,7 @@ static int enter_state(u32 state)
 error = tboot_s3_resume();
 break;
 case ACPI_STATE_S5:
-acpi_enter_sleep_state(ACPI_STATE_S5);
+error = acpi_enter_sleep_state(ACPI_STATE_S5);
 break;
 default:
 error = -EINVAL;
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index 888a214..59e1af6 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -170,7 +170,8 @@ static void nmi_shootdown_cpus(void)
 
 /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
  * happy when booting if interrupt/dma remapping is still enabled */
-iommu_crash_shutdown();
+if ( iommu_crash_shutdown() )
+printk("Failed to shut down IOMMU.\n");
 
 __stop_this_cpu();
 
diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index bca6fe7..a11bc2a 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -1627,7 +1627,10 @@ int __init construct_dom0(
 }
 
 if ( d->domain_id == hardware_domid )
-iommu_hwdom_init(d);
+{
+if ( iommu_hwdom_init(d) )
+printk("Xen warning : IOMMU hardware domain init failed.\n");
+}
 
 return 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 202ff76..3c1db05 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2443,11 +2443,18 @@ static int __get_page_type(struct page_info *page, 
unsigned long type,
 if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
 {
 if ( (x & PGT_type_mask) == PGT_writable_page )
-iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+{
+rc = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+return rc;
+}
 else if ( type == PGT_writable_page )
-iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
-   page_to_mfn(page),
-   IOMMUF_readable|IOMMUF_writable);
+{
+rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
+page_to_mfn(page),
+IOMMUF_readable|IOMMUF_writable);
+if ( rc )
+return rc;
+}
 }
 }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9860c6c..2ed43b0 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -829,15 +829,23 @@ out:
  need_modify_vtd_table )
 {
 if ( iommu_hap_pt_share )
-iommu_pte_flush(d, gfn, _entry->epte, order, vtd_pte_present);
+rc = iommu_pte_flush(d, gfn, _entry->epte, order, 
vtd_pte_present);
 else
 {
 if ( iommu_flags )
 for ( i = 0; i < (1 << order); i++ )
-