Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-22 Thread Xu, Quan
On June 23, 2016 12:18 AM, Jan Beulich  wrote:
> >>> On 22.06.16 at 17:54,  wrote:
> > On June 17, 2016 3:01 PM, Jan Beulich  wrote:
> >> And again I don't understand: ASSERT()s are to verify assumed state.
> >> If
> > static
> >> code analysis resulted in understanding a function is unreachable
> >> when qi_ctrl->qinval_maddr is zero (because qinval ought to have got
> >> disabled if
> > any
> >> of the table setup failed), then adding ASSERT() would (a) document
> >> that and
> >> (b) allow to know quickly if something broke that assumption.
> >
> > other than enable_qinval() -- yes, I need to convert conditionals of
> > qi_ctrl->qinval_maddr into  ASSERT()s..
> > But in enable_qinval(), I am still not quite sure whether I need to
> > convert these conditionals of  qi_ctrl->qinval_maddr into ASSERT()s or
> > not.
> 
> No, I don't think you want to so there - you'd bring the system down in case
> of an actual initialization error. ASSERT()s should only be used on conditions
> controlled entirely by the hypervisor.
> 

Jan, thank you. Now I am clear.

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-22 Thread Jan Beulich
>>> On 22.06.16 at 17:54,  wrote:
> On June 17, 2016 3:01 PM, Jan Beulich  wrote:
>> And again I don't understand: ASSERT()s are to verify assumed state. If 
> static
>> code analysis resulted in understanding a function is unreachable when
>> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if 
> any
>> of the table setup failed), then adding ASSERT() would (a) document that and
>> (b) allow to know quickly if something broke that assumption.
> 
> other than enable_qinval() -- yes, I need to convert conditionals of 
> qi_ctrl->qinval_maddr into
>  ASSERT()s..
> But in enable_qinval(), I am still not quite sure whether I need to convert 
> these conditionals of
>  qi_ctrl->qinval_maddr into ASSERT()s or not.

No, I don't think you want to so there - you'd bring the system down
in case of an actual initialization error. ASSERT()s should only be
used on conditions controlled entirely by the hypervisor.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-22 Thread Xu, Quan
On June 17, 2016 3:01 PM, Jan Beulich  wrote:
> And again I don't understand: ASSERT()s are to verify assumed state. If static
> code analysis resulted in understanding a function is unreachable when
> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if 
> any
> of the table setup failed), then adding ASSERT() would (a) document that and
> (b) allow to know quickly if something broke that assumption.

other than enable_qinval() -- yes, I need to convert conditionals of 
qi_ctrl->qinval_maddr into
 ASSERT()s..
But in enable_qinval(), I am still not quite sure whether I need to convert 
these conditionals of
 qi_ctrl->qinval_maddr into ASSERT()s or not.

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue (+ crash logic )

2016-06-20 Thread Xu, Quan
On June 18, 2016 5:54 PM, Xu, Quan  wrote:
> On June 17, 2016 9:35 PM, Julien Grall  wrote:
> > On 17/06/16 09:51, Xu, Quan wrote:
> > > + arm/amd maintainers..
> > >
> > > On June 01, 2016 5:05 PM, Xu, Quan  wrote:
> > >> If Device-TLB flush timed out, we hide the target ATS device
> > >> immediately and crash the domain owning this ATS device. If
> > >> impacted domain is hardware domain, just throw out a warning.
> > >>
> > >> By hiding the device, we make sure it can't be assigned to any
> > >> domain any longer (see device_assigned).
> > >
> > > DomU (other than Dom0) gets crashed when a device IOTLB flush times
> out.
> > I suppose that's what you will want on ARM/AMD then too.
> >
> > Correct it is what we want for ARM :).
> >
> 
> 
> That's good :):)..
> Julien, thanks.
> 
> > > We need to move up the crash logic , as similar as iommu_map_page()
> > > /
> > iommu_unmap_page().
> > >
> > >  - add the crash logic to iommu_iotlb_flush() / 
> > > iommu_iotlb_flush_all().
> > >
> > >  - when IOMMU/MMU share page tables, we need to fix it one by one.
> > >  -- on amd, we need to add the crash logic to
> > amd_iommu_flush_pages().
> > >  -- on intel, we need to add the crash logic to iommu_pte_flush().
> > >  -- on arm, we benefit that we add the crash logic to
> > iommu_iotlb_flush().
> >
> > Right.
> >
> 
> :):)
> 
> > >
> > >
> > > Taken together, we need to add crash logic to
> > >   iommu_iotlb_flush() /
> > > iommu_iotlb_flush_all() /
> > iommu_pte_flush() / amd_iommu_flush_pages().
> >




Think twice,  now I am inclined to leave amd_iommu_flush_pages() as is,
As the IOTLB flush error has not been yet bubbled up on AMD platform, it is 
pointless to  add
domain crash logic there.

Quan




> > For iommu_iotlb_* yes as it is common code. I don't know for the others.
> >
> IMO, that's is enough. I hope the other maintainers can give some comments.
> Thanks!!
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue (+ crash logic )

2016-06-18 Thread Xu, Quan
On June 17, 2016 9:35 PM, Julien Grall  wrote:
> On 17/06/16 09:51, Xu, Quan wrote:
> > + arm/amd maintainers..
> >
> > On June 01, 2016 5:05 PM, Xu, Quan  wrote:
> >> If Device-TLB flush timed out, we hide the target ATS device
> >> immediately and crash the domain owning this ATS device. If impacted
> >> domain is hardware domain, just throw out a warning.
> >>
> >> By hiding the device, we make sure it can't be assigned to any domain
> >> any longer (see device_assigned).
> >
> > DomU (other than Dom0) gets crashed when a device IOTLB flush times out.
> I suppose that's what you will want on ARM/AMD then too.
> 
> Correct it is what we want for ARM :).
> 


That's good :):)..
Julien, thanks.

> > We need to move up the crash logic , as similar as iommu_map_page() /
> iommu_unmap_page().
> >
> >  - add the crash logic to iommu_iotlb_flush() / iommu_iotlb_flush_all().
> >
> >  - when IOMMU/MMU share page tables, we need to fix it one by one.
> >  -- on amd, we need to add the crash logic to
> amd_iommu_flush_pages().
> >  -- on intel, we need to add the crash logic to iommu_pte_flush().
> >  -- on arm, we benefit that we add the crash logic to
> iommu_iotlb_flush().
> 
> Right.
> 

:):)

> >
> >
> > Taken together, we need to add crash logic to
> >   iommu_iotlb_flush() / iommu_iotlb_flush_all() /
> iommu_pte_flush() / amd_iommu_flush_pages().
> 
> For iommu_iotlb_* yes as it is common code. I don't know for the others.
> 
IMO, that's is enough. I hope the other maintainers can give some comments. 
Thanks!!

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue (+ crash logic )

2016-06-17 Thread Julien Grall

Hi Quan,

On 17/06/16 09:51, Xu, Quan wrote:

+ arm/amd maintainers..

On June 01, 2016 5:05 PM, Xu, Quan  wrote:

If Device-TLB flush timed out, we hide the target ATS device immediately and
crash the domain owning this ATS device. If impacted domain is hardware
domain, just throw out a warning.

By hiding the device, we make sure it can't be assigned to any domain any
longer (see device_assigned).


DomU (other than Dom0) gets crashed when a device IOTLB flush times out. I 
suppose that's what you will want on ARM/AMD then too.


Correct it is what we want for ARM :).


We need to move up the crash logic , as similar as iommu_map_page() / 
iommu_unmap_page().

 - add the crash logic to iommu_iotlb_flush() / iommu_iotlb_flush_all().

 - when IOMMU/MMU share page tables, we need to fix it one by one.
 -- on amd, we need to add the crash logic to amd_iommu_flush_pages().
 -- on intel, we need to add the crash logic to iommu_pte_flush().
 -- on arm, we benefit that we add the crash logic to 
iommu_iotlb_flush().


Right.




Taken together, we need to add crash logic to
  iommu_iotlb_flush() / iommu_iotlb_flush_all() / 
iommu_pte_flush() / amd_iommu_flush_pages().


For iommu_iotlb_* yes as it is common code. I don't know for the others.

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue (+ crash logic )

2016-06-17 Thread Xu, Quan
+ arm/amd maintainers..

On June 01, 2016 5:05 PM, Xu, Quan  wrote:
> If Device-TLB flush timed out, we hide the target ATS device immediately and
> crash the domain owning this ATS device. If impacted domain is hardware
> domain, just throw out a warning.
> 
> By hiding the device, we make sure it can't be assigned to any domain any
> longer (see device_assigned).

DomU (other than Dom0) gets crashed when a device IOTLB flush times out. I 
suppose that's what you will want on ARM/AMD then too.
We need to move up the crash logic , as similar as iommu_map_page() / 
iommu_unmap_page().

- add the crash logic to iommu_iotlb_flush() / iommu_iotlb_flush_all().

- when IOMMU/MMU share page tables, we need to fix it one by one.
-- on amd, we need to add the crash logic to amd_iommu_flush_pages().
-- on intel, we need to add the crash logic to iommu_pte_flush().
-- on arm, we benefit that we add the crash logic to 
iommu_iotlb_flush().


Taken together, we need to add crash logic to
 iommu_iotlb_flush() / iommu_iotlb_flush_all() / 
iommu_pte_flush() / amd_iommu_flush_pages().

Thoughts?

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 10:15,  wrote:
> On June 17, 2016 3:01 PM, Jan Beulich  wrote:
>> >>> On 17.06.16 at 08:08,  wrote:
>> 
>> > On June 16, 2016 5:05 PM, Jan Beulich  wrote:
>> >> >>> On 16.06.16 at 10:42,  wrote:
>> >> > On June 02, 2016 7:07 PM, Jan Beulich  wrote:
>> >> >> >>> On 01.06.16 at 11:05,  wrote:
>> >> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16
>> did,
>> >> >> > + struct pci_ats_dev
>> >> >> > +*ats_dev) {
>> >> >> > +struct domain *d = NULL;
>> >> >> > +struct pci_dev *pdev;
>> >> >> > +
>> >> >> > +if ( test_bit(did, iommu->domid_bitmap) )
>> >> >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> >> >> > +
>> >> >> > +/*
>> >> >> > + * In case the domain has been freed or the IOMMU domid bitmap
>> is
>> >> >> > + * not valid, the device no longer belongs to this domain.
>> >> >> > + */
>> >> >> > +if ( d == NULL )
>> >> >> > +return;
>> >> >> > +
>> >> >> > +pcidevs_lock();
>> >> >> > +
>> >> >> > +for_each_pdev(d, pdev)
>> >> >> > +{
>> >> >> > +if ( (pdev->seg == ats_dev->seg) &&
>> >> >> > + (pdev->bus == ats_dev->bus) &&
>> >> >> > + (pdev->devfn == ats_dev->devfn) )
>> >> >> > +{
>> >> >> > +ASSERT(pdev->domain);
>> >> >> > +list_del(>domain_list);
>> >> >> > +pdev->domain = NULL;
>> >> >> > +pci_hide_existing_device(pdev);
>> >> >> > +break;
>> >> >> > +}
>> >> >> > +}
>> >> >> > +
>> >> >> > +pcidevs_unlock();
>> >> >>
>> >> >> ... this loop (and locking). (Of course such a change may better
>> >> >> be done in another preparatory patch.)
>> >> >>
>> >> >
>> >> > To eliminate the locking?  I am afraid the locking is still a must
>> >> > here even without the loop, also referring  to device_assigned()..
>> >>
>> >> If the entire loop gets eliminated, what would be left is
>> >>
>> >> pcidevs_lock();
>> >> pcidevs_unlock();
>> >>
>> >> which I don't think does any good.
>> >
>> > Why? I can't follow it..
>> 
>> I don't understand your question. Can you explain what use above code
>> sequence is, in your opinion? Or else - what does the "why"
>> refer to?
>> 
> 
> Ah, there may be a gap between us. without this loop,  these pdev operation 
> should be still there, such as,
> 
> 
> +ASSERT(pdev->domain);
> +list_del(>domain_list);
> +pdev->domain = NULL;
> +pci_hide_existing_device(pdev);
> 
> So, the left is not only:
>pcidevs_lock();
>pcidevs_unlock();

Oh, indeed. My bad.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-17 Thread Xu, Quan
On June 17, 2016 3:01 PM, Jan Beulich  wrote:
> >>> On 17.06.16 at 08:08,  wrote:
> 
> > On June 16, 2016 5:05 PM, Jan Beulich  wrote:
> >> >>> On 16.06.16 at 10:42,  wrote:
> >> > On June 02, 2016 7:07 PM, Jan Beulich  wrote:
> >> >> >>> On 01.06.16 at 11:05,  wrote:
> >> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16
> did,
> >> >> > + struct pci_ats_dev
> >> >> > +*ats_dev) {
> >> >> > +struct domain *d = NULL;
> >> >> > +struct pci_dev *pdev;
> >> >> > +
> >> >> > +if ( test_bit(did, iommu->domid_bitmap) )
> >> >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >> >> > +
> >> >> > +/*
> >> >> > + * In case the domain has been freed or the IOMMU domid bitmap
> is
> >> >> > + * not valid, the device no longer belongs to this domain.
> >> >> > + */
> >> >> > +if ( d == NULL )
> >> >> > +return;
> >> >> > +
> >> >> > +pcidevs_lock();
> >> >> > +
> >> >> > +for_each_pdev(d, pdev)
> >> >> > +{
> >> >> > +if ( (pdev->seg == ats_dev->seg) &&
> >> >> > + (pdev->bus == ats_dev->bus) &&
> >> >> > + (pdev->devfn == ats_dev->devfn) )
> >> >> > +{
> >> >> > +ASSERT(pdev->domain);
> >> >> > +list_del(>domain_list);
> >> >> > +pdev->domain = NULL;
> >> >> > +pci_hide_existing_device(pdev);
> >> >> > +break;
> >> >> > +}
> >> >> > +}
> >> >> > +
> >> >> > +pcidevs_unlock();
> >> >>
> >> >> ... this loop (and locking). (Of course such a change may better
> >> >> be done in another preparatory patch.)
> >> >>
> >> >
> >> > To eliminate the locking?  I am afraid the locking is still a must
> >> > here even without the loop, also referring  to device_assigned()..
> >>
> >> If the entire loop gets eliminated, what would be left is
> >>
> >> pcidevs_lock();
> >> pcidevs_unlock();
> >>
> >> which I don't think does any good.
> >
> > Why? I can't follow it..
> 
> I don't understand your question. Can you explain what use above code
> sequence is, in your opinion? Or else - what does the "why"
> refer to?
> 

Ah, there may be a gap between us. without this loop,  these pdev operation 
should be still there, such as,


+ASSERT(pdev->domain);
+list_del(>domain_list);
+pdev->domain = NULL;
+pci_hide_existing_device(pdev);

So, the left is not only:
   pcidevs_lock();
   pcidevs_unlock();


> >> >> > +static int __must_check dev_invalidate_sync(struct iommu
> >> >> > +*iommu,
> >> >> > +u16
> >> >> did,
> >> >> > +struct pci_ats_dev
> >> >> > +*ats_dev) {
> >> >> > +struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> >> >> > +int rc = 0;
> >> >> > +
> >> >> > +if ( qi_ctrl->qinval_maddr )
> >> >> > +{
> >> >> > +rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> >> >> > +
> >> >> > +if ( rc == -ETIMEDOUT )
> >> >> > +dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> >> >> > +}
> >> >> > +
> >> >> > +return rc;
> >> >> > +}
> >> >>
> >> >> I've never really understood why invalidate_sync() returns success
> >> >> when it didn't do anything. Now that you copy this same behavior
> >> >> here, I really need to ask you to explain that.
> >> >>
> >> >
> >> > It is acceptable to me, returning success when it didn't do
> >> > anything
> >> > -- this is worth reflection and criticism:(..
> >> > It is better:
> >> > +if ( qi_ctrl->qinval_maddr )
> >> > +...
> >> > +else
> >> > +rc = -ENOENT;
> >>
> >> Right. And perhaps a separate patch to make invalidate_sync() do the
> same.
> >
> > Agreed.
> >
> >> Question is whether this really ought to be a conditional, or whether
> > instead
> >> this code is unreachable when qinval is not in use, in which case
> >> these conditionals would imo better be converted to ASSERT()s.
> >>
> >
> > IMO, this should be a conditional.
> > As mentioned below, strictly speaking, this is a bug. We can't
> > ASSERT() based on a bug..
> > A coming patch may fix it..
> 
> And again I don't understand: ASSERT()s are to verify assumed state. If static
> code analysis resulted in understanding a function is unreachable when
> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if 
> any
> of the table setup failed), then adding ASSERT() would (a) document that and
> (b) allow to know quickly if something broke that assumption.
> 

You are right.  A separate patch does this.

> But then again I may simply misunderstand your wording: "We can't ASSERT()
> based on a bug" is really pretty unclear to me.
> 

I supposed the variable in ASSERT() is always true, but disable_qinval() needs 
to make
qi_ctrl->qinval_maddr  zero, but today it doesn't do this -- a bug.
With your explanation,  I got it now. Thanks.


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-17 Thread Jan Beulich
>>> On 17.06.16 at 08:08,  wrote:

> On June 16, 2016 5:05 PM, Jan Beulich  wrote:
>> >>> On 16.06.16 at 10:42,  wrote:
>> > On June 02, 2016 7:07 PM, Jan Beulich  wrote:
>> >> >>> On 01.06.16 at 11:05,  wrote:
>> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
>> >> > + struct pci_ats_dev
>> >> > +*ats_dev) {
>> >> > +struct domain *d = NULL;
>> >> > +struct pci_dev *pdev;
>> >> > +
>> >> > +if ( test_bit(did, iommu->domid_bitmap) )
>> >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> >> > +
>> >> > +/*
>> >> > + * In case the domain has been freed or the IOMMU domid bitmap is
>> >> > + * not valid, the device no longer belongs to this domain.
>> >> > + */
>> >> > +if ( d == NULL )
>> >> > +return;
>> >> > +
>> >> > +pcidevs_lock();
>> >> > +
>> >> > +for_each_pdev(d, pdev)
>> >> > +{
>> >> > +if ( (pdev->seg == ats_dev->seg) &&
>> >> > + (pdev->bus == ats_dev->bus) &&
>> >> > + (pdev->devfn == ats_dev->devfn) )
>> >> > +{
>> >> > +ASSERT(pdev->domain);
>> >> > +list_del(>domain_list);
>> >> > +pdev->domain = NULL;
>> >> > +pci_hide_existing_device(pdev);
>> >> > +break;
>> >> > +}
>> >> > +}
>> >> > +
>> >> > +pcidevs_unlock();
>> >>
>> >> ... this loop (and locking). (Of course such a change may better be
>> >> done in another preparatory patch.)
>> >>
>> >
>> > To eliminate the locking?  I am afraid the locking is still a must
>> > here even without the loop, also referring  to device_assigned()..
>> 
>> If the entire loop gets eliminated, what would be left is
>> 
>> pcidevs_lock();
>> pcidevs_unlock();
>> 
>> which I don't think does any good.
> 
> Why? I can't follow it..

I don't understand your question. Can you explain what use above
code sequence is, in your opinion? Or else - what does the "why"
refer to?

>> >> > +static int __must_check dev_invalidate_sync(struct iommu *iommu,
>> >> > +u16
>> >> did,
>> >> > +struct pci_ats_dev
>> >> > +*ats_dev) {
>> >> > +struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
>> >> > +int rc = 0;
>> >> > +
>> >> > +if ( qi_ctrl->qinval_maddr )
>> >> > +{
>> >> > +rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
>> >> > +
>> >> > +if ( rc == -ETIMEDOUT )
>> >> > +dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
>> >> > +}
>> >> > +
>> >> > +return rc;
>> >> > +}
>> >>
>> >> I've never really understood why invalidate_sync() returns success
>> >> when it didn't do anything. Now that you copy this same behavior
>> >> here, I really need to ask you to explain that.
>> >>
>> >
>> > It is acceptable to me, returning success when it didn't do anything
>> > -- this is worth reflection and criticism:(..
>> > It is better:
>> > +if ( qi_ctrl->qinval_maddr )
>> > +...
>> > +else
>> > +rc = -ENOENT;
>> 
>> Right. And perhaps a separate patch to make invalidate_sync() do the same.
> 
> Agreed.
> 
>> Question is whether this really ought to be a conditional, or whether 
> instead
>> this code is unreachable when qinval is not in use, in which case these
>> conditionals would imo better be converted to ASSERT()s.
>> 
> 
> IMO, this should be a conditional.
> As mentioned below, strictly speaking, this is a bug. We can't  ASSERT() 
> based on a bug..
> A coming patch may fix it..

And again I don't understand: ASSERT()s are to verify assumed
state. If static code analysis resulted in understanding a function
is unreachable when qi_ctrl->qinval_maddr is zero (because
qinval ought to have got disabled if any of the table setup failed),
then adding ASSERT() would (a) document that and (b) allow to
know quickly if something broke that assumption.

But then again I may simply misunderstand your wording: "We
can't ASSERT() based on a bug" is really pretty unclear to me.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-17 Thread Xu, Quan

On June 16, 2016 5:05 PM, Jan Beulich  wrote:
> >>> On 16.06.16 at 10:42,  wrote:
> > On June 02, 2016 7:07 PM, Jan Beulich  wrote:
> >> >>> On 01.06.16 at 11:05,  wrote:
> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> >> > + struct pci_ats_dev
> >> > +*ats_dev) {
> >> > +struct domain *d = NULL;
> >> > +struct pci_dev *pdev;
> >> > +
> >> > +if ( test_bit(did, iommu->domid_bitmap) )
> >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >> > +
> >> > +/*
> >> > + * In case the domain has been freed or the IOMMU domid bitmap is
> >> > + * not valid, the device no longer belongs to this domain.
> >> > + */
> >> > +if ( d == NULL )
> >> > +return;
> >> > +
> >> > +pcidevs_lock();
> >> > +
> >> > +for_each_pdev(d, pdev)
> >> > +{
> >> > +if ( (pdev->seg == ats_dev->seg) &&
> >> > + (pdev->bus == ats_dev->bus) &&
> >> > + (pdev->devfn == ats_dev->devfn) )
> >> > +{
> >> > +ASSERT(pdev->domain);
> >> > +list_del(>domain_list);
> >> > +pdev->domain = NULL;
> >> > +pci_hide_existing_device(pdev);
> >> > +break;
> >> > +}
> >> > +}
> >> > +
> >> > +pcidevs_unlock();
> >>
> >> ... this loop (and locking). (Of course such a change may better be
> >> done in another preparatory patch.)
> >>
> >
> > To eliminate the locking?  I am afraid the locking is still a must
> > here even without the loop, also referring  to device_assigned()..
> 
> If the entire loop gets eliminated, what would be left is
> 
> pcidevs_lock();
> pcidevs_unlock();
> 
> which I don't think does any good.
> 

Why? I can't follow it..


> >> > +static int __must_check dev_invalidate_sync(struct iommu *iommu,
> >> > +u16
> >> did,
> >> > +struct pci_ats_dev
> >> > +*ats_dev) {
> >> > +struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> >> > +int rc = 0;
> >> > +
> >> > +if ( qi_ctrl->qinval_maddr )
> >> > +{
> >> > +rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> >> > +
> >> > +if ( rc == -ETIMEDOUT )
> >> > +dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> >> > +}
> >> > +
> >> > +return rc;
> >> > +}
> >>
> >> I've never really understood why invalidate_sync() returns success
> >> when it didn't do anything. Now that you copy this same behavior
> >> here, I really need to ask you to explain that.
> >>
> >
> > It is acceptable to me, returning success when it didn't do anything
> > -- this is worth reflection and criticism:(..
> > It is better:
> > +if ( qi_ctrl->qinval_maddr )
> > +...
> > +else
> > +rc = -ENOENT;
> 
> Right. And perhaps a separate patch to make invalidate_sync() do the same.

Agreed.

> Question is whether this really ought to be a conditional, or whether instead
> this code is unreachable when qinval is not in use, in which case these
> conditionals would imo better be converted to ASSERT()s.
> 

IMO, this should be a conditional.
As mentioned below, strictly speaking, this is a bug. We can't  ASSERT() based 
on a bug..
A coming patch may fix it..


> > A question:
> > I find the page related to qi_ctrl->qinval_maddr is not freed at all.
> > IMO, In disable_qinval (), we need to do:
> >  - free the page related to qi_ctrl->qinval_maddr.
> >  - qi_ctrl->qinval_maddr = 0;
> 
> Well, that's a correct observation, but not a big problem imo: If this was a 
> per-
> domain resource, it surely would need fixing. But if freeing a couple of these
> pages (one per IOMMU) causes synchronization headaches (e.g. because
> there may still be dangling references to it), then I think freeing them is 
> not a
> must. But if freeing them is safe (like you seem to imply), then I'm certainly
> fine with you fixing this (not that my opinion would matter much here, as I'm
> not the maintainer of this code).
> 

Agreed, thanks for your explanation. At least I will leave it as is in this 
patch set.

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-16 Thread Jan Beulich
>>> On 16.06.16 at 10:42,  wrote:
> On June 02, 2016 7:07 PM, Jan Beulich  wrote:
>> >>> On 01.06.16 at 11:05,  wrote:
>> > --- a/xen/drivers/passthrough/vtd/extern.h
>> > +++ b/xen/drivers/passthrough/vtd/extern.h
>> > @@ -21,6 +21,7 @@
>> >  #define _VTD_EXTERN_H_
>> >
>> >  #include "dmar.h"
>> > +#include "../ats.h"
>> 
>> Why? You don't de-reference struct pci_ats_dev * in this file, so all you'd 
>> need
>> is a forward declaration. But then this is not in line with your v11 change
>> description above, so I wonder whether you actually sent a stale patch.
> 
> Sorry, this patch is really strange to me.

Well, it's yours, so you ought to be able to explain it.

>> After all I thought the v10 discussion (see
>> http://lists.xenproject.org/archives/html/xen-devel/2016- 
>> 05/msg02208.html
>> ) had made clear that this passing down,
> 
> Sure, what you said is very clear. But I read these code again, I found a  
> pci_get_pdev_by_domain()
> Can also get *pdev without below loop.. (also hardware domain calls 
> pci_get_pdev_by_domain() to get pdev.)

Which would not eliminate the loop - pci_get_pdev_by_domain()
does have a loop itself.

> To be honest,  now I don't like to add a struct pci_dev * inside struct 
> pci_ats_dev, as I need to change
> ' struct pci_ats_dev *pdev' to ' struct pci_ats_dev * pci_ats_dev ' in some 
> functions as well.

I don't see why variables would need renaming. If you need a
variable of type struct pci_dev * in a function which already has
a variable named pdev, simply name the new variable differently
(e.g. pci_dev).

>> besides reducing the number of
>> arguments of some function, would also be meant to eliminate ...
>> 
>> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
>> > + struct pci_ats_dev *ats_dev)
>> > +{
>> > +struct domain *d = NULL;
>> > +struct pci_dev *pdev;
>> > +
>> > +if ( test_bit(did, iommu->domid_bitmap) )
>> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> > +
>> > +/*
>> > + * In case the domain has been freed or the IOMMU domid bitmap is
>> > + * not valid, the device no longer belongs to this domain.
>> > + */
>> > +if ( d == NULL )
>> > +return;
>> > +
>> > +pcidevs_lock();
>> > +
>> > +for_each_pdev(d, pdev)
>> > +{
>> > +if ( (pdev->seg == ats_dev->seg) &&
>> > + (pdev->bus == ats_dev->bus) &&
>> > + (pdev->devfn == ats_dev->devfn) )
>> > +{
>> > +ASSERT(pdev->domain);
>> > +list_del(>domain_list);
>> > +pdev->domain = NULL;
>> > +pci_hide_existing_device(pdev);
>> > +break;
>> > +}
>> > +}
>> > +
>> > +pcidevs_unlock();
>> 
>> ... this loop (and locking). (Of course such a change may better be done in
>> another preparatory patch.)
>> 
> 
> To eliminate the locking?  I am afraid the locking is still a must here even 
> without the loop, also referring  to device_assigned()..

If the entire loop gets eliminated, what would be left is

pcidevs_lock();
pcidevs_unlock();

which I don't think does any good.

>> > +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16
>> did,
>> > +struct pci_ats_dev
>> > +*ats_dev) {
>> > +struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
>> > +int rc = 0;
>> > +
>> > +if ( qi_ctrl->qinval_maddr )
>> > +{
>> > +rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
>> > +
>> > +if ( rc == -ETIMEDOUT )
>> > +dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
>> > +}
>> > +
>> > +return rc;
>> > +}
>> 
>> I've never really understood why invalidate_sync() returns success when it
>> didn't do anything. Now that you copy this same behavior here, I really need
>> to ask you to explain that.
>> 
> 
> It is acceptable to me, returning success when it didn't do anything -- this 
> is worth reflection and criticism:(..
> It is better:
> +if ( qi_ctrl->qinval_maddr )
> +...
> +else
> +rc = -ENOENT;

Right. And perhaps a separate patch to make invalidate_sync()
do the same. Question is whether this really ought to be a
conditional, or whether instead this code is unreachable when
qinval is not in use, in which case these conditionals would imo
better be converted to ASSERT()s.

> A question:
> I find the page related to qi_ctrl->qinval_maddr is not freed at all. IMO,
> In disable_qinval (), we need to do:
>  - free the page related to qi_ctrl->qinval_maddr.
>  - qi_ctrl->qinval_maddr = 0;

Well, that's a correct observation, but not a big problem imo: If this
was a per-domain resource, it surely would need fixing. But if freeing
a couple of these pages (one per IOMMU) causes synchronization
headaches (e.g. because there may still be dangling references to
it), then I think freeing them is 

Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-16 Thread Xu, Quan
(* I will CC arm/amd maintainer after this vt-d specific discussion, and then 
send out my proposal...)

On June 02, 2016 7:07 PM, Jan Beulich  wrote:
> >>> On 01.06.16 at 11:05,  wrote:
> > --- a/xen/drivers/passthrough/vtd/extern.h
> > +++ b/xen/drivers/passthrough/vtd/extern.h
> > @@ -21,6 +21,7 @@
> >  #define _VTD_EXTERN_H_
> >
> >  #include "dmar.h"
> > +#include "../ats.h"
> 
> Why? You don't de-reference struct pci_ats_dev * in this file, so all you'd 
> need
> is a forward declaration. But then this is not in line with your v11 change
> description above, so I wonder whether you actually sent a stale patch.

Sorry, this patch is really strange to me.

> After all I thought the v10 discussion (see
> http://lists.xenproject.org/archives/html/xen-devel/2016-
> 05/msg02208.html
> ) had made clear that this passing down,

Sure, what you said is very clear. But I read these code again, I found a  
pci_get_pdev_by_domain()
Can also get *pdev without below loop.. (also hardware domain calls 
pci_get_pdev_by_domain() to get pdev.)

To be honest,  now I don't like to add a struct pci_dev * inside struct 
pci_ats_dev, as I need to change
' struct pci_ats_dev *pdev' to ' struct pci_ats_dev * pci_ats_dev ' in some 
functions as well.


> besides reducing the number of
> arguments of some function, would also be meant to eliminate ...
> 
> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> > + struct pci_ats_dev *ats_dev)
> > +{
> > +struct domain *d = NULL;
> > +struct pci_dev *pdev;
> > +
> > +if ( test_bit(did, iommu->domid_bitmap) )
> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> > +
> > +/*
> > + * In case the domain has been freed or the IOMMU domid bitmap is
> > + * not valid, the device no longer belongs to this domain.
> > + */
> > +if ( d == NULL )
> > +return;
> > +
> > +pcidevs_lock();
> > +
> > +for_each_pdev(d, pdev)
> > +{
> > +if ( (pdev->seg == ats_dev->seg) &&
> > + (pdev->bus == ats_dev->bus) &&
> > + (pdev->devfn == ats_dev->devfn) )
> > +{
> > +ASSERT(pdev->domain);
> > +list_del(>domain_list);
> > +pdev->domain = NULL;
> > +pci_hide_existing_device(pdev);
> > +break;
> > +}
> > +}
> > +
> > +pcidevs_unlock();
> 
> ... this loop (and locking). (Of course such a change may better be done in
> another preparatory patch.)
> 

To eliminate the locking?  I am afraid the locking is still a must here even 
without the loop, also referring  to device_assigned()..


> 
> > +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16
> did,
> > +struct pci_ats_dev
> > +*ats_dev) {
> > +struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> > +int rc = 0;
> > +
> > +if ( qi_ctrl->qinval_maddr )
> > +{
> > +rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> > +
> > +if ( rc == -ETIMEDOUT )
> > +dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> > +}
> > +
> > +return rc;
> > +}
> 
> I've never really understood why invalidate_sync() returns success when it
> didn't do anything. Now that you copy this same behavior here, I really need
> to ask you to explain that.
> 

It is acceptable to me, returning success when it didn't do anything -- this is 
worth reflection and criticism:(..
It is better:
+if ( qi_ctrl->qinval_maddr )
+...
+else
+rc = -ENOENT;

A question:
I find the page related to qi_ctrl->qinval_maddr is not freed at all. IMO,
In disable_qinval (), we need to do:
 - free the page related to qi_ctrl->qinval_maddr.
 - qi_ctrl->qinval_maddr = 0;

Right?

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-02 Thread Jan Beulich
>>> On 01.06.16 at 11:05,  wrote:
> --- a/xen/drivers/passthrough/vtd/extern.h
> +++ b/xen/drivers/passthrough/vtd/extern.h
> @@ -21,6 +21,7 @@
>  #define _VTD_EXTERN_H_
>  
>  #include "dmar.h"
> +#include "../ats.h"

Why? You don't de-reference struct pci_ats_dev * in this file, so
all you'd need is a forward declaration. But then this is not in line
with your v11 change description above, so I wonder whether
you actually sent a stale patch. After all I thought the v10
discussion (see
http://lists.xenproject.org/archives/html/xen-devel/2016-05/msg02208.html
) had made clear that this passing down, besides reducing the
number of arguments of some function, would also be meant to
eliminate ...

> +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> + struct pci_ats_dev *ats_dev)
> +{
> +struct domain *d = NULL;
> +struct pci_dev *pdev;
> +
> +if ( test_bit(did, iommu->domid_bitmap) )
> +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> +
> +/*
> + * In case the domain has been freed or the IOMMU domid bitmap is
> + * not valid, the device no longer belongs to this domain.
> + */
> +if ( d == NULL )
> +return;
> +
> +pcidevs_lock();
> +
> +for_each_pdev(d, pdev)
> +{
> +if ( (pdev->seg == ats_dev->seg) &&
> + (pdev->bus == ats_dev->bus) &&
> + (pdev->devfn == ats_dev->devfn) )
> +{
> +ASSERT(pdev->domain);
> +list_del(>domain_list);
> +pdev->domain = NULL;
> +pci_hide_existing_device(pdev);
> +break;
> +}
> +}
> +
> +pcidevs_unlock();

... this loop (and locking). (Of course such a change may better be
done in another preparatory patch.)

> +if ( !is_hardware_domain(d) )
> +domain_crash(d);
> +else
> +printk(XENLOG_WARNING VTDPREFIX
> +   " dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
> +   d->domain_id, ats_dev->seg, ats_dev->bus,
> +   PCI_SLOT(ats_dev->devfn), PCI_FUNC(ats_dev->devfn));

Please use the same logic for logging and crashing as you do in
the other series, so that at least on average a resulting DomU
crash will be accompanied with some indication of the reason
beyond just the source file name and line number.

> +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16 did,
> +struct pci_ats_dev *ats_dev)
> +{
> +struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> +int rc = 0;
> +
> +if ( qi_ctrl->qinval_maddr )
> +{
> +rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> +
> +if ( rc == -ETIMEDOUT )
> +dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> +}
> +
> +return rc;
> +}

I've never really understood why invalidate_sync() returns success
when it didn't do anything. Now that you copy this same behavior
here, I really need to ask you to explain that.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue

2016-06-01 Thread Xu, Quan
From: Quan Xu 

If Device-TLB flush timed out, we hide the target ATS device
immediately and crash the domain owning this ATS device. If
impacted domain is hardware domain, just throw out a warning.

By hiding the device, we make sure it can't be assigned to any
domain any longer (see device_assigned).

v11:
   1. Pass down struct pci_dev *, instead of SBDF inside struct
  pci_ats_dev.
   2. change invalidate_sync() back as I add a specific function
  - dev_invalidate_sync() for device IOTLB.

Signed-off-by: Quan Xu 
---
 xen/drivers/passthrough/pci.c |  6 +--
 xen/drivers/passthrough/vtd/extern.h  |  5 +-
 xen/drivers/passthrough/vtd/qinval.c  | 86 +--
 xen/drivers/passthrough/vtd/x86/ats.c |  7 +--
 xen/include/xen/pci.h |  1 +
 5 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 98936f55c..843dc88 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -419,7 +419,7 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev 
*pdev)
 xfree(pdev);
 }
 
-static void _pci_hide_device(struct pci_dev *pdev)
+void pci_hide_existing_device(struct pci_dev *pdev)
 {
 if ( pdev->domain )
 return;
@@ -436,7 +436,7 @@ int __init pci_hide_device(int bus, int devfn)
 pdev = alloc_pdev(get_pseg(0), bus, devfn);
 if ( pdev )
 {
-_pci_hide_device(pdev);
+pci_hide_existing_device(pdev);
 rc = 0;
 }
 pcidevs_unlock();
@@ -466,7 +466,7 @@ int __init pci_ro_device(int seg, int bus, int devfn)
 }
 
 __set_bit(PCI_BDF2(bus, devfn), pseg->ro_map);
-_pci_hide_device(pdev);
+pci_hide_existing_device(pdev);
 
 return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/extern.h 
b/xen/drivers/passthrough/vtd/extern.h
index 45357f2..94ca97a 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -21,6 +21,7 @@
 #define _VTD_EXTERN_H_
 
 #include "dmar.h"
+#include "../ats.h"
 #include 
 
 #define VTDPREFIX "[VT-D]"
@@ -60,8 +61,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
  u64 addr, unsigned int size_order, u64 type);
 
 int __must_check qinval_device_iotlb_sync(struct iommu *iommu,
-  u32 max_invs_pend,
-  u16 sid, u16 size, u64 addr);
+  struct pci_ats_dev *ats_dev,
+  u16 did, u16 size, u64 addr);
 
 unsigned int get_cache_line_size(void);
 void cacheline_flush(char *);
diff --git a/xen/drivers/passthrough/vtd/qinval.c 
b/xen/drivers/passthrough/vtd/qinval.c
index 9116588..bea3e86 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -30,8 +30,7 @@
 
 #define IOMMU_QI_TIMEOUT MILLISECS(1)
 
-static int __must_check invalidate_sync(struct iommu *iommu,
-bool_t flush_dev_iotlb);
+static int __must_check invalidate_sync(struct iommu *iommu);
 
 static void print_qi_regs(struct iommu *iommu)
 {
@@ -103,7 +102,7 @@ static int __must_check 
queue_invalidate_context_sync(struct iommu *iommu,
 
 unmap_vtd_domain_page(qinval_entries);
 
-return invalidate_sync(iommu, 0);
+return invalidate_sync(iommu);
 }
 
 static int __must_check queue_invalidate_iotlb_sync(struct iommu *iommu,
@@ -140,7 +139,7 @@ static int __must_check queue_invalidate_iotlb_sync(struct 
iommu *iommu,
 qinval_update_qtail(iommu, index);
 spin_unlock_irqrestore(>register_lock, flags);
 
-return invalidate_sync(iommu, 0);
+return invalidate_sync(iommu);
 }
 
 static int __must_check queue_invalidate_wait(struct iommu *iommu,
@@ -200,20 +199,81 @@ static int __must_check queue_invalidate_wait(struct 
iommu *iommu,
 return -EOPNOTSUPP;
 }
 
-static int __must_check invalidate_sync(struct iommu *iommu,
-bool_t flush_dev_iotlb)
+static int __must_check invalidate_sync(struct iommu *iommu)
 {
 struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
 
 if ( qi_ctrl->qinval_maddr )
-return queue_invalidate_wait(iommu, 0, 1, 1, flush_dev_iotlb);
+return queue_invalidate_wait(iommu, 0, 1, 1, 0);
 
 return 0;
 }
 
+static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
+ struct pci_ats_dev *ats_dev)
+{
+struct domain *d = NULL;
+struct pci_dev *pdev;
+
+if ( test_bit(did, iommu->domid_bitmap) )
+d = rcu_lock_domain_by_id(iommu->domid_map[did]);
+
+/*
+ * In case the domain has been freed or the IOMMU domid bitmap is
+ * not valid, the device no longer belongs to this domain.
+ */
+if ( d == NULL )
+return;
+
+pcidevs_lock();
+
+for_each_pdev(d, pdev)
+{
+if ( (pdev->seg