Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.
>>> On 23.12.15 at 09:25,wrote: > @@ -182,7 +186,7 @@ static int enter_state(u32 state) > error = tboot_s3_resume(); > break; > case ACPI_STATE_S5: > -acpi_enter_sleep_state(ACPI_STATE_S5); > +error = acpi_enter_sleep_state(ACPI_STATE_S5); I can't see how this is related to the purpose of the patch. I don't mind such error checking being added, but not in this huge patch. It would anyway be nice if you could see about splitting this apart, to aid reviewing and - in case it would be needed after committing - bisection. > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -2443,11 +2443,18 @@ static int __get_page_type(struct page_info *page, > unsigned long type, > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > { > if ( (x & PGT_type_mask) == PGT_writable_page ) > -iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > +{ > +rc = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > +return rc; > +} > else if ( type == PGT_writable_page ) > -iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > - page_to_mfn(page), > - IOMMUF_readable|IOMMUF_writable); > +{ > +rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > +page_to_mfn(page), > +IOMMUF_readable|IOMMUF_writable); > +if ( rc ) > +return rc; > +} > } > } Again you can't simply return here, or else you leak the type reference, and you indefinitely stall any other CPU waiting for page validation to happen. > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -829,15 +829,23 @@ out: > need_modify_vtd_table ) > { > if ( iommu_hap_pt_share ) > -iommu_pte_flush(d, gfn, _entry->epte, order, > vtd_pte_present); > +rc = iommu_pte_flush(d, gfn, _entry->epte, order, > vtd_pte_present); > else > { > if ( iommu_flags ) > for ( i = 0; i < (1 << order); i++ ) > -iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags); > +{ > +rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > iommu_flags); > +if ( rc ) > +break; > +} And the pattern repeats - you can't just exit without undoing what so far was done. > else > for ( i = 0; i < (1 << order); i++ ) > -iommu_unmap_page(d, gfn + i); > +{ > +rc = iommu_unmap_page(d, gfn + i); > +if ( rc ) > +break; > +} As a special case, unmapping should perhaps continue despite an error, in an attempt to do best effort cleanup. I'm not going to continue further down, as I suspect I'll find more of the same class of issues. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.
>>> On 25.12.15 at 03:53,wrote: >> From: Xu, Quan >> Sent: Wednesday, December 23, 2015 4:26 PM >> --- a/xen/arch/x86/acpi/power.c >> +++ b/xen/arch/x86/acpi/power.c >> @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void); >> >> static int device_power_down(void) >> { >> +int rc; >> + >> console_suspend(); >> >> time_suspend(); >> @@ -53,7 +55,9 @@ static int device_power_down(void) >> >> ioapic_suspend(); >> >> -iommu_suspend(); >> +rc = iommu_suspend(); >> +if ( rc ) >> +return rc; >> >> lapic_suspend(); >> > > Looks error handling is not only a problem in VT-d code. Above > actually should check return values of all suspend callbacks. Just > checking iommu_suspend is not enough, but it's a good improvement > anyway... No, it's not - it leaves the system in a non-working state without undoing whatever succeeded already. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.
On December 25 2015 10:54 AM,wrote: > > From: Xu, Quan > > Sent: Wednesday, December 23, 2015 4:26 PM > > > > This patch checks all kinds of error and all the way up the call trees > > of VT-d Device-TLB flush. > > > > Signed-off-by: Quan Xu > > --- > > xen/arch/x86/acpi/power.c | 8 +- > > xen/arch/x86/crash.c | 3 +- > > xen/arch/x86/domain_build.c | 5 +- > > xen/arch/x86/mm.c | 15 ++- > > xen/arch/x86/mm/p2m-ept.c | 14 ++- > > xen/arch/x86/mm/p2m-pt.c | 14 ++- > > xen/arch/x86/mm/p2m.c | 19 +++- > > xen/arch/x86/x86_64/mm.c | 7 +- > > xen/common/domain.c | 3 +- > > xen/common/grant_table.c | 5 +- > > xen/common/memory.c | 13 ++- > > xen/drivers/passthrough/amd/iommu_init.c | 4 +- > > xen/drivers/passthrough/amd/pci_amd_iommu.c | 4 +- > > xen/drivers/passthrough/arm/smmu.c| 13 ++- > > xen/drivers/passthrough/iommu.c | 47 +--- > > xen/drivers/passthrough/vtd/extern.h | 4 +- > > xen/drivers/passthrough/vtd/iommu.c | 157 > > -- > > xen/drivers/passthrough/vtd/qinval.c | 2 +- > > xen/drivers/passthrough/vtd/quirks.c | 26 +++-- > > xen/drivers/passthrough/vtd/x86/vtd.c | 13 ++- > > xen/drivers/passthrough/x86/iommu.c | 6 +- > > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 4 +- > > xen/include/asm-x86/iommu.h | 2 +- > > xen/include/xen/iommu.h | 20 ++-- > > 24 files changed, 300 insertions(+), 108 deletions(-) > > Kevin, Thanks for your comments!! It would take much time to review it. I will fix them in next v5. May I discuss with you f2f for some dubious case? Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.
> From: Xu, Quan > Sent: Wednesday, December 23, 2015 4:26 PM > > This patch checks all kinds of error and all the way up > the call trees of VT-d Device-TLB flush. > > Signed-off-by: Quan Xu> --- > xen/arch/x86/acpi/power.c | 8 +- > xen/arch/x86/crash.c | 3 +- > xen/arch/x86/domain_build.c | 5 +- > xen/arch/x86/mm.c | 15 ++- > xen/arch/x86/mm/p2m-ept.c | 14 ++- > xen/arch/x86/mm/p2m-pt.c | 14 ++- > xen/arch/x86/mm/p2m.c | 19 +++- > xen/arch/x86/x86_64/mm.c | 7 +- > xen/common/domain.c | 3 +- > xen/common/grant_table.c | 5 +- > xen/common/memory.c | 13 ++- > xen/drivers/passthrough/amd/iommu_init.c | 4 +- > xen/drivers/passthrough/amd/pci_amd_iommu.c | 4 +- > xen/drivers/passthrough/arm/smmu.c| 13 ++- > xen/drivers/passthrough/iommu.c | 47 +--- > xen/drivers/passthrough/vtd/extern.h | 4 +- > xen/drivers/passthrough/vtd/iommu.c | 157 > -- > xen/drivers/passthrough/vtd/qinval.c | 2 +- > xen/drivers/passthrough/vtd/quirks.c | 26 +++-- > xen/drivers/passthrough/vtd/x86/vtd.c | 13 ++- > xen/drivers/passthrough/x86/iommu.c | 6 +- > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 4 +- > xen/include/asm-x86/iommu.h | 2 +- > xen/include/xen/iommu.h | 20 ++-- > 24 files changed, 300 insertions(+), 108 deletions(-) > > diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c > index f41f0de..1974721 100644 > --- a/xen/arch/x86/acpi/power.c > +++ b/xen/arch/x86/acpi/power.c > @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void); > > static int device_power_down(void) > { > +int rc; > + > console_suspend(); > > time_suspend(); > @@ -53,7 +55,9 @@ static int device_power_down(void) > > ioapic_suspend(); > > -iommu_suspend(); > +rc = iommu_suspend(); > +if ( rc ) > +return rc; > > lapic_suspend(); > Looks error handling is not only a problem in VT-d code. Above actually should check return values of all suspend callbacks. Just checking iommu_suspend is not enough, but it's a good improvement anyway... [...] > diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c > index bca6fe7..a11bc2a 100644 > --- a/xen/arch/x86/domain_build.c > +++ b/xen/arch/x86/domain_build.c > @@ -1627,7 +1627,10 @@ int __init construct_dom0( > } > > if ( d->domain_id == hardware_domid ) > -iommu_hwdom_init(d); > +{ > +if ( iommu_hwdom_init(d) ) > +printk("Xen warning : IOMMU hardware domain init failed.\n"); > +} if construct_dom0 fails, guess we can panic here? e.g. simply move earlier BUG_ON(rc != 0) after above trunk. In an ideal case we may disable iommu_enabled upon error at this point, to allow moving forward. But that can be improved separately. [...] > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c > index 202ff76..3c1db05 100644 > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -2443,11 +2443,18 @@ static int __get_page_type(struct page_info *page, > unsigned > long type, > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > { > if ( (x & PGT_type_mask) == PGT_writable_page ) > -iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > +{ > +rc = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > +return rc; > +} looks you return absolutely regardless of error check. There are still some useful code after this point... > else if ( type == PGT_writable_page ) > -iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > - page_to_mfn(page), > - IOMMUF_readable|IOMMUF_writable); > +{ > +rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > +page_to_mfn(page), > +IOMMUF_readable|IOMMUF_writable); > +if ( rc ) > +return rc; > +} this one is correct. [...] > diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c > index 709920a..b2b340d 100644 > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -675,11 +675,19 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long > gfn, mfn_t mfn, > } curious why there's no similar check on code below: if ( iommu_use_hap_pt(p2m->domain) ) { if ( iommu_old_flags ) **amd_iommu_flush_pages(p2m->domain, gfn, page_order)**; } > else if (
[Xen-devel] [PATCH v4 1/3] VT-d: Check VT-d Device-TLB flush error.
This patch checks all kinds of error and all the way up the call trees of VT-d Device-TLB flush. Signed-off-by: Quan Xu--- xen/arch/x86/acpi/power.c | 8 +- xen/arch/x86/crash.c | 3 +- xen/arch/x86/domain_build.c | 5 +- xen/arch/x86/mm.c | 15 ++- xen/arch/x86/mm/p2m-ept.c | 14 ++- xen/arch/x86/mm/p2m-pt.c | 14 ++- xen/arch/x86/mm/p2m.c | 19 +++- xen/arch/x86/x86_64/mm.c | 7 +- xen/common/domain.c | 3 +- xen/common/grant_table.c | 5 +- xen/common/memory.c | 13 ++- xen/drivers/passthrough/amd/iommu_init.c | 4 +- xen/drivers/passthrough/amd/pci_amd_iommu.c | 4 +- xen/drivers/passthrough/arm/smmu.c| 13 ++- xen/drivers/passthrough/iommu.c | 47 +--- xen/drivers/passthrough/vtd/extern.h | 4 +- xen/drivers/passthrough/vtd/iommu.c | 157 -- xen/drivers/passthrough/vtd/qinval.c | 2 +- xen/drivers/passthrough/vtd/quirks.c | 26 +++-- xen/drivers/passthrough/vtd/x86/vtd.c | 13 ++- xen/drivers/passthrough/x86/iommu.c | 6 +- xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 4 +- xen/include/asm-x86/iommu.h | 2 +- xen/include/xen/iommu.h | 20 ++-- 24 files changed, 300 insertions(+), 108 deletions(-) diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c index f41f0de..1974721 100644 --- a/xen/arch/x86/acpi/power.c +++ b/xen/arch/x86/acpi/power.c @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void); static int device_power_down(void) { +int rc; + console_suspend(); time_suspend(); @@ -53,7 +55,9 @@ static int device_power_down(void) ioapic_suspend(); -iommu_suspend(); +rc = iommu_suspend(); +if ( rc ) +return rc; lapic_suspend(); @@ -182,7 +186,7 @@ static int enter_state(u32 state) error = tboot_s3_resume(); break; case ACPI_STATE_S5: -acpi_enter_sleep_state(ACPI_STATE_S5); +error = acpi_enter_sleep_state(ACPI_STATE_S5); break; default: error = -EINVAL; diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c index 888a214..59e1af6 100644 --- a/xen/arch/x86/crash.c +++ b/xen/arch/x86/crash.c @@ -170,7 +170,8 @@ static void nmi_shootdown_cpus(void) /* Crash shutdown any IOMMU functionality as the crashdump kernel is not * happy when booting if interrupt/dma remapping is still enabled */ -iommu_crash_shutdown(); +if ( iommu_crash_shutdown() ) +printk("Failed to shut down IOMMU.\n"); __stop_this_cpu(); diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c index bca6fe7..a11bc2a 100644 --- a/xen/arch/x86/domain_build.c +++ b/xen/arch/x86/domain_build.c @@ -1627,7 +1627,10 @@ int __init construct_dom0( } if ( d->domain_id == hardware_domid ) -iommu_hwdom_init(d); +{ +if ( iommu_hwdom_init(d) ) +printk("Xen warning : IOMMU hardware domain init failed.\n"); +} return 0; diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 202ff76..3c1db05 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -2443,11 +2443,18 @@ static int __get_page_type(struct page_info *page, unsigned long type, if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) { if ( (x & PGT_type_mask) == PGT_writable_page ) -iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); +{ +rc = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); +return rc; +} else if ( type == PGT_writable_page ) -iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), - page_to_mfn(page), - IOMMUF_readable|IOMMUF_writable); +{ +rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), +page_to_mfn(page), +IOMMUF_readable|IOMMUF_writable); +if ( rc ) +return rc; +} } } diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index 9860c6c..2ed43b0 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -829,15 +829,23 @@ out: need_modify_vtd_table ) { if ( iommu_hap_pt_share ) -iommu_pte_flush(d, gfn, _entry->epte, order, vtd_pte_present); +rc = iommu_pte_flush(d, gfn, _entry->epte, order, vtd_pte_present); else { if ( iommu_flags ) for ( i = 0; i < (1 << order); i++ ) -