Re: [Xen-devel] [PATCH v13 3/3] IOMMU: fix vt-d Device-TLB flush timeout issue

2016-07-04 Thread Xu, Quan
On July 04, 2016 2:16 PM, Tian, Kevin  wrote:
> > From: Xu, Quan
> > Sent: Wednesday, June 29, 2016 2:00 PM
> >
> > From: Quan Xu 
> >
> > If Device-TLB flush timed out, we hide the target ATS device
> > immediately. By hiding the device, we make sure it can't be assigned
> > to any domain any longer (see device_assigned).
> >
> > Signed-off-by: Quan Xu 
> >
> > CC: Jan Beulich 
> > CC: Kevin Tian 
> > CC: Feng Wu 
> >
> > ---
> > v13:
> >1. drop domain crash logic, which is added to the vendor
> >   independent layer in patch #2.
> > >2. rename dev_invalidate_iotlb_timeout() to
> > >   iommu_dev_iotlb_flush_timeout()
> > >   and move it to the vendor independent layer.
> > ---
> >  xen/drivers/passthrough/iommu.c   | 21 +
> >  xen/drivers/passthrough/pci.c |  6 ++--
> >  xen/drivers/passthrough/vtd/extern.h  |  5 ++--
> >  xen/drivers/passthrough/vtd/qinval.c  | 56 +++
> >  xen/drivers/passthrough/vtd/x86/ats.c | 11 ++-
> >  xen/include/xen/iommu.h   |  3 ++
> >  xen/include/xen/pci.h |  1 +
> >  7 files changed, 76 insertions(+), 27 deletions(-)
> >
> > diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> > index d793f5d..5db8ae6 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -361,6 +361,27 @@ int iommu_iotlb_flush_all(struct domain *d)
> >  return rc;
> >  }
> >
> > +void iommu_dev_iotlb_flush_timeout(struct domain *d,
> > +   struct pci_dev *pdev)
> > +{
> > +pcidevs_lock();
> > +
> > +ASSERT(pdev->domain);
> > +if ( d != pdev->domain )
> > +return;
> 
> return w/o releasing the lock!
> 
Yes, I do need to release the lock before returning.

> and is above scenario actually possible (a flush timeout is captured when the
> device doesn't belong to previous domain)? If not, better to move the
> condition into ASSERT.

IMO, this is possible.
  -- Not all call trees of the device IOTLB flush are under pcidevs_lock (e.g.
...--iommu_iotlb_flush()-- xenmem_add_to_physmap()... ).
  -- In extreme cases, the domain may have been freed, or the device may have
been detached or even attached to another domain.
That's also why a domain pointer is introduced here.
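
A minimal sketch of the early-return fix acknowledged above, assuming simplified stand-ins: struct domain, struct pci_dev, the 'hidden' field and the counter that models the pcidevs lock are all hypothetical and are not the Xen definitions. It only illustrates routing every exit path through a single unlock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Xen types; not the real definitions. */
struct domain { int domain_id; };
struct pci_dev { struct domain *domain; int hidden; };

/* A counter models the pcidevs lock so that balance can be checked. */
static int lock_depth;
static void pcidevs_lock(void)   { lock_depth++; }
static void pcidevs_unlock(void) { lock_depth--; }

/* Returns 1 if the device was hidden, 0 if it no longer belongs to d. */
int flush_timeout_sketch(struct domain *d, struct pci_dev *pdev)
{
    int hidden = 0;

    pcidevs_lock();

    /* The device may have been detached or reassigned in the meantime. */
    if ( pdev->domain != d )
        goto out;

    pdev->domain = NULL;
    pdev->hidden = 1;      /* stands in for pci_hide_existing_device() */
    hidden = 1;

 out:
    pcidevs_unlock();      /* every exit path releases the lock */
    return hidden;
}
```

With a single exit label, the mismatch case and the normal case both drop the lock exactly once.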

> 
> > +
> > +list_del(&pdev->domain_list);
> > +pdev->domain = NULL;
> > +pci_hide_existing_device(pdev);
> > +if ( !d->is_shutting_down && printk_ratelimit() )
> > +printk(XENLOG_ERR
> > +   "dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
> > +   d->domain_id, pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> > +   PCI_FUNC(pdev->devfn));
> > +
> > +pcidevs_unlock();
> 
> please move above warning out of the lock.
> 

I think I'd better leave it as is.

I use 'pdev' to print the information, similar to pci_release_devices().
If I used the seg, bus and devfn variables directly instead of 'pdev', I would
agree to move the warning out of the lock, similar to:

iommu_do_pci_domctl()
{
 case XEN_DOMCTL_assign_device... 
}

Correct me if I am wrong.
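
A rough sketch of the alternative described above, assuming the identifiers are copied into locals while the lock is held so that the message can be produced after unlocking. The struct layout, the lock counter and report_flush_failure() are hypothetical, and snprintf() stands in for printk() so that the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Conventional devfn encoding: slot in bits 7:3, function in bits 2:0. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn) ((devfn) & 0x07)

/* Hypothetical stand-in; not the real Xen struct pci_dev. */
struct pci_dev { unsigned int seg, bus, devfn; };

/* A counter models the pcidevs lock. */
static int lock_depth;
static void pcidevs_lock(void)   { lock_depth++; }
static void pcidevs_unlock(void) { lock_depth--; }

/* Formats the warning into buf; returns the lock depth seen while formatting. */
int report_flush_failure(int domain_id, const struct pci_dev *pdev,
                         char *buf, size_t len)
{
    unsigned int seg, bus, devfn;

    pcidevs_lock();
    /* Snapshot the fields while pdev is known to be stable. */
    seg = pdev->seg;
    bus = pdev->bus;
    devfn = pdev->devfn;
    pcidevs_unlock();

    /* pdev is not dereferenced past this point. */
    snprintf(buf, len, "dom%d: ATS device %04x:%02x:%02x.%u flush failed",
             domain_id, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
    return lock_depth;
}
```

The trade-off is three extra locals in exchange for a shorter critical section.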

Quan




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v13 3/3] IOMMU: fix vt-d Device-TLB flush timeout issue

2016-07-04 Thread Tian, Kevin
> From: Xu, Quan
> Sent: Wednesday, June 29, 2016 2:00 PM
> 
> From: Quan Xu 
> 
> If Device-TLB flush timed out, we hide the target ATS device
> immediately. By hiding the device, we make sure it can't be
> assigned to any domain any longer (see device_assigned).
> 
> Signed-off-by: Quan Xu 
> 
> CC: Jan Beulich 
> CC: Kevin Tian 
> CC: Feng Wu 
> 
> ---
> v13:
>1. drop domain crash logic, which is added to the vendor
>   independent layer in patch #2.
>2. rename dev_invalidate_iotlb_timeout() to iommu_dev_iotlb_flush_timeout()
>   and move it to the vendor independent layer.
> ---
>  xen/drivers/passthrough/iommu.c   | 21 +
>  xen/drivers/passthrough/pci.c |  6 ++--
>  xen/drivers/passthrough/vtd/extern.h  |  5 ++--
>  xen/drivers/passthrough/vtd/qinval.c  | 56 +++
>  xen/drivers/passthrough/vtd/x86/ats.c | 11 ++-
>  xen/include/xen/iommu.h   |  3 ++
>  xen/include/xen/pci.h |  1 +
>  7 files changed, 76 insertions(+), 27 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index d793f5d..5db8ae6 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -361,6 +361,27 @@ int iommu_iotlb_flush_all(struct domain *d)
>  return rc;
>  }
> 
> +void iommu_dev_iotlb_flush_timeout(struct domain *d,
> +   struct pci_dev *pdev)
> +{
> +pcidevs_lock();
> +
> +ASSERT(pdev->domain);
> +if ( d != pdev->domain )
> +return;

return w/o releasing the lock!

and is above scenario actually possible (a flush timeout is captured when
the device doesn't belong to previous domain)? If not, better to move
the condition into ASSERT.

> +
> +list_del(&pdev->domain_list);
> +pdev->domain = NULL;
> +pci_hide_existing_device(pdev);
> +if ( !d->is_shutting_down && printk_ratelimit() )
> +printk(XENLOG_ERR
> +   "dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
> +   d->domain_id, pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +   PCI_FUNC(pdev->devfn));
> +
> +pcidevs_unlock();

please move above warning out of the lock.

Thanks
Kevin



[Xen-devel] [PATCH v13 3/3] IOMMU: fix vt-d Device-TLB flush timeout issue

2016-06-29 Thread Xu, Quan
From: Quan Xu 

If Device-TLB flush timed out, we hide the target ATS device
immediately. By hiding the device, we make sure it can't be
assigned to any domain any longer (see device_assigned).

Signed-off-by: Quan Xu 

CC: Jan Beulich 
CC: Kevin Tian 
CC: Feng Wu 

---
v13:
   1. drop domain crash logic, which is added to the vendor
  independent layer in patch #2.
   2. rename dev_invalidate_iotlb_timeout() to iommu_dev_iotlb_flush_timeout()
  and move it to the vendor independent layer.
---
 xen/drivers/passthrough/iommu.c   | 21 +
 xen/drivers/passthrough/pci.c |  6 ++--
 xen/drivers/passthrough/vtd/extern.h  |  5 ++--
 xen/drivers/passthrough/vtd/qinval.c  | 56 +++
 xen/drivers/passthrough/vtd/x86/ats.c | 11 ++-
 xen/include/xen/iommu.h   |  3 ++
 xen/include/xen/pci.h |  1 +
 7 files changed, 76 insertions(+), 27 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index d793f5d..5db8ae6 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -361,6 +361,27 @@ int iommu_iotlb_flush_all(struct domain *d)
 return rc;
 }
 
+void iommu_dev_iotlb_flush_timeout(struct domain *d,
+   struct pci_dev *pdev)
+{
+pcidevs_lock();
+
+ASSERT(pdev->domain);
+if ( d != pdev->domain )
+return;
+
+list_del(&pdev->domain_list);
+pdev->domain = NULL;
+pci_hide_existing_device(pdev);
+if ( !d->is_shutting_down && printk_ratelimit() )
+printk(XENLOG_ERR
+   "dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
+   d->domain_id, pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+   PCI_FUNC(pdev->devfn));
+
+pcidevs_unlock();
+}
+
 int __init iommu_setup(void)
 {
 int rc = -ENODEV;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index bb5f344..58bfb79 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -419,7 +419,7 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev *pdev)
 xfree(pdev);
 }
 
-static void _pci_hide_device(struct pci_dev *pdev)
+void pci_hide_existing_device(struct pci_dev *pdev)
 {
 if ( pdev->domain )
 return;
@@ -436,7 +436,7 @@ int __init pci_hide_device(int bus, int devfn)
 pdev = alloc_pdev(get_pseg(0), bus, devfn);
 if ( pdev )
 {
-_pci_hide_device(pdev);
+pci_hide_existing_device(pdev);
 rc = 0;
 }
 pcidevs_unlock();
@@ -466,7 +466,7 @@ int __init pci_ro_device(int seg, int bus, int devfn)
 }
 
 __set_bit(PCI_BDF2(bus, devfn), pseg->ro_map);
-_pci_hide_device(pdev);
+pci_hide_existing_device(pdev);
 
 return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index 45357f2..efaff28 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -25,6 +25,7 @@
 
 #define VTDPREFIX "[VT-D]"
 
+struct pci_ats_dev;
 extern bool_t rwbf_quirk;
 
 void print_iommu_regs(struct acpi_drhd_unit *drhd);
@@ -60,8 +61,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
  u64 addr, unsigned int size_order, u64 type);
 
 int __must_check qinval_device_iotlb_sync(struct iommu *iommu,
-  u32 max_invs_pend,
-  u16 sid, u16 size, u64 addr);
+  struct pci_ats_dev *ats_dev,
+  u16 did, u16 size, u64 addr);
 
 unsigned int get_cache_line_size(void);
 void cacheline_flush(char *);
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index 4492b29..7a5c433 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -27,11 +27,11 @@
 #include "dmar.h"
 #include "vtd.h"
 #include "extern.h"
+#include "../ats.h"
 
 #define VTD_QI_TIMEOUT 1
 
-static int __must_check invalidate_sync(struct iommu *iommu,
-bool_t flush_dev_iotlb);
+static int __must_check invalidate_sync(struct iommu *iommu);
 
 static void print_qi_regs(struct iommu *iommu)
 {
@@ -103,7 +103,7 @@ static int __must_check queue_invalidate_context_sync(struct iommu *iommu,
 
 unmap_vtd_domain_page(qinval_entries);
 
-return invalidate_sync(iommu, 0);
+return invalidate_sync(iommu);
 }
 
 static int __must_check queue_invalidate_iotlb_sync(struct iommu *iommu,
@@ -140,7 +140,7 @@ static int __must_check queue_invalidate_iotlb_sync(struct iommu *iommu,
 qinval_update_qtail(iommu, index);
 spin_unlock_irqrestore(&iommu->register_lock, flags);
 
-return invalidate_sync(iommu, 0);
+return invalidate_sync(iommu);
 }
 
 static int __must_check