Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-15 Thread Zhang, Yang Z
Jan Beulich wrote on 2015-10-15: On 15.10.15 at 10:52, wrote: >> Jan Beulich wrote on 2015-10-15: >> On 15.10.15 at 09:28, wrote: The premise for a misbehaving guest to impact the system is that the IOMMU is buggy which takes long time to complete the invalidation. In othe

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-15 Thread Jan Beulich
>>> On 15.10.15 at 10:52, wrote: > Jan Beulich wrote on 2015-10-15: > On 15.10.15 at 09:28, wrote: >>> The premise for a misbehaving guest to impact the system is that the >>> IOMMU is buggy which takes long time to complete the invalidation. >>> In other words, if all invalidations are able

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-15 Thread Zhang, Yang Z
Jan Beulich wrote on 2015-10-15: On 15.10.15 at 09:28, wrote: >> Jan Beulich wrote on 2015-10-15: >> On 15.10.15 at 03:03, wrote: Jan Beulich wrote on 2015-10-14: > As long as the multi-millisecond spins aren't going to go away by > other means, I think conversion to async m

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-15 Thread Jan Beulich
>>> On 15.10.15 at 09:28, wrote: > Jan Beulich wrote on 2015-10-15: > On 15.10.15 at 03:03, wrote: >>> Jan Beulich wrote on 2015-10-14: As long as the multi-millisecond spins aren't going to go away by other means, I think conversion to async mode is ultimately unavoidable. >>> >>>

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-15 Thread Zhang, Yang Z
Jan Beulich wrote on 2015-10-15: On 15.10.15 at 03:03, wrote: >> Jan Beulich wrote on 2015-10-14: >>> As long as the multi-millisecond spins aren't going to go away by >>> other means, I think conversion to async mode is ultimately unavoidable. >> >> I am not fully agreed. I think the time t

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-14 Thread Jan Beulich
>>> On 15.10.15 at 03:03, wrote: > Jan Beulich wrote on 2015-10-14: >> As long as the multi-millisecond spins aren't going to go away by >> other means, I think conversion to async mode is ultimately unavoidable. > > I am not fully agreed. I think the time to spin is important. To me, less > tha

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-14 Thread Zhang, Yang Z
Jan Beulich wrote on 2015-10-14: On 14.10.15 at 07:12, wrote: >> Jan Beulich wrote on 2015-10-13: >> On 13.10.15 at 07:27, wrote: Jan Beulich wrote on 2015-10-12: On 12.10.15 at 03:42, wrote: >> So, my suggestion is that we can rely on user to not assign the >> ATS

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-14 Thread Xu, Quan
>> >>> On 13.10.2015 at 22:50 wrote: > >>> On 13.10.15 at 16:29, wrote: > >> > >>>On 29.09.2015 at 15:22 wrote: > >> >>> On 29.09.15 at 04:53, wrote: > >> Monday, September 28, 2015 2:47 PM, wrote: > >> >> >>> On 28.09.15 at 05:08, wrote: > >> >> Thursday, September 24, 2015 12:27 AM

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-14 Thread Xu, Quan
>> >>>On 13.10.2015 at 17:35, wrote: > At 11:09 + on 11 Oct (1444561760), Xu, Quan wrote: > What in particular is worrying you abo

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-14 Thread Jan Beulich
>>> On 14.10.15 at 07:12, wrote: > Jan Beulich wrote on 2015-10-13: > On 13.10.15 at 07:27, wrote: >>> Jan Beulich wrote on 2015-10-12: >>> On 12.10.15 at 03:42, wrote: > So, my suggestion is that we can rely on user to not assign the > ATS device if hypervisor says it cannot sup

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-13 Thread Zhang, Yang Z
Jan Beulich wrote on 2015-10-13: On 13.10.15 at 07:27, wrote: >> Jan Beulich wrote on 2015-10-12: >> On 12.10.15 at 03:42, wrote: So, my suggestion is that we can rely on user to not assign the ATS device if hypervisor says it cannot support such device. For example, if hy

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-13 Thread Jan Beulich
>>> On 13.10.15 at 16:29, wrote: >> > >>>On 29.09.2015 at 15:22 wrote: >> >>> On 29.09.15 at 04:53, wrote: >> Monday, September 28, 2015 2:47 PM, wrote: >> >> >>> On 28.09.15 at 05:08, wrote: >> >> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: >>The extra ref taken will pre

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-13 Thread Xu, Quan
>> >>>On 29.09.2015 at 15:22 wrote: > >>> On 29.09.15 at 04:53, wrote: > Monday, September 28, 2015 2:47 PM, wrote: > >> >>> On 28.09.15 at 05:08, wrote: > >> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: >The extra ref taken will prevent the page from getting freed. Jan

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-13 Thread Tim Deegan
Hi, At 11:09 +0000 on 11 Oct (1444561760), Xu, Quan wrote: > One question: do two lists refer to page_list and arch.relmem_list? No, I was wondering if a page ever needed to be queued waiting for two different flushes -- e.g. if there are multiple IOMMUs. > I know you prefer __scheme_A__(I think

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-13 Thread Jan Beulich
>>> On 13.10.15 at 07:27, wrote: > Jan Beulich wrote on 2015-10-12: > On 12.10.15 at 03:42, wrote: >>> So, my suggestion is that we can rely on user to not assign the ATS >>> device if hypervisor says it cannot support such device. For >>> example, if hypervisor find the invalidation isn't co

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-12 Thread Zhang, Yang Z
Jan Beulich wrote on 2015-10-12: On 12.10.15 at 03:42, wrote: >> According the discussion and suggestion you made in past several >> weeks, obviously, it is not an easy task. So I am wondering whether >> it is worth to do it since: >> 1. ATS device is not popular. I only know one NIC from Myr

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-12 Thread Jan Beulich
>>> On 12.10.15 at 03:42, wrote: > According the discussion and suggestion you made in past several weeks, > obviously, it is not an easy task. So I am wondering whether it is worth to > do it since: > 1. ATS device is not popular. I only know one NIC from Myricom has ATS > capabilities. > 2.

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-12 Thread Jan Beulich
>>> On 11.10.15 at 13:09, wrote: > On 11.10.2015 at 2:25, wrote: >> At 17:02 + on 07 Oct (1444237344), Xu, Quan wrote: >> > Q2: how do you know when to drop them? >> >- log (or something) when the IOMMU entry is removed/overwritten; and >> >- drop the entry when the flush completes. >

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-11 Thread Zhang, Yang Z
Xu, Quan wrote on 2015-09-16: > Introduction > > >VT-d code currently has a number of cases where completion of > certain operations is being waited for by way of spinning. The > majority of instances use that variable indirectly through > IOMMU_WAIT_OP() macro, allowing for loop

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-11 Thread Xu, Quan
On 11.10.2015 at 2:25, wrote: > At 17:02 +0000 on 07 Oct (1444237344), Xu, Quan wrote: > > Q2: how do you know when to drop them? > >- log (or something) when the IOMMU entry is removed/overwritten; and > >- drop the entry when the flush completes. > > > >-- We can add a new page_list_

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-10 Thread Tim Deegan
Hi, At 17:02 +0000 on 07 Oct (1444237344), Xu, Quan wrote: > __scheme A__ > Q1: - when to take the references? > take the reference when the IOMMU entry is _created_; > in detail: > --iommu_map_page(), or > --ept_set_entry() [Once IOMMU shares EPT page table.] > > That leaves on

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-09 Thread Xu, Quan
>> >>> On 09.10.2015 at 15:18 wrote: > >>> On 09.10.15 at 09:06, wrote: > >> > >>>On 08.10.2015 at 16:52 wrote: > >> >>> On 07.10.15 at 19:02, wrote: > >> > __scheme B__ > >> > Q1: - when to take the references? > >> > > >> > take the reference when the IOMMU entry is _ > removed/overwrit

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-09 Thread Jan Beulich
>>> On 09.10.15 at 09:06, wrote: >> > >>>On 08.10.2015 at 16:52 wrote: >> >>> On 07.10.15 at 19:02, wrote: >> > Q3: what to do about mappings of other domains' memory (i.e. grant and >> > foreign mappings). >> >Between two domains, now I have only one idea to fix this tricky >> > issue -- wa

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-09 Thread Xu, Quan
>> >>>On 08.10.2015 at 16:52 wrote: > >>> On 07.10.15 at 19:02, wrote: > > __scheme A__ > > Q1: - when to take the references? > > take the reference when the IOMMU entry is _created_; > > in detail: > > --iommu_map_page(), or > > --ept_set_entry() [Once IOMMU shares EPT page ta

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-08 Thread Jan Beulich
>>> On 07.10.15 at 19:02, wrote: > __scheme A__ > Q1: - when to take the references? > take the reference when the IOMMU entry is _created_; > in detail: > --iommu_map_page(), or > --ept_set_entry() [Once IOMMU shares EPT page table.] > > That leaves one question: > -- how t

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-07 Thread Xu, Quan
>>> >> On October 01, 2015, at 5:09 PM wrote: > At 15:05 + on 30 Sep (1443625549), Xu, Quan wrote: > > >> >>> On September 29, 2015, at 5:12 PM, wrote: > > Could I introduce a new typed reference which can only been deref in > > QI interrupt handler(or associated tasklet)?? --(stop me, I al

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-10-01 Thread Tim Deegan
Hi, At 15:05 +0000 on 30 Sep (1443625549), Xu, Quan wrote: > >> >>> On September 29, 2015, at 5:12 PM, wrote: > > So you'll need to do something else to make the unmap safe. > >The usual > > method in Xen is to hold a reference to the page (for read-only > > mappings) > > > Read-only mapping re

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-30 Thread Xu, Quan
>> >>> On September 29, 2015, at 5:12 PM, wrote: > At 03:08 + on 28 Sep (1443409723), Xu, Quan wrote: > > >>> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: > > > 7/13: I'm not convinced that making the vcpu spin calling > > > sched_yield() is a very good plan. Better to explicitly

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-30 Thread Jan Beulich
>>> On 30.09.15 at 15:55, wrote: >> >> >> >>> On September 29, 2015 at 3:22 PM, wrote: >> >>> On 29.09.15 at 04:53, wrote: >> Monday, September 28, 2015 2:47 PM, wrote: >> >> >>> On 28.09.15 at 05:08, wrote: >> >> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: >> > >> > For

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-30 Thread Xu, Quan
> >> >> >>> On September 29, 2015 at 3:22 PM, wrote: > >>> On 29.09.15 at 04:53, wrote: > Monday, September 28, 2015 2:47 PM, wrote: > >> >>> On 28.09.15 at 05:08, wrote: > >> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: > > > > For Tim's suggestion --"to make the IOMMU tab

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-29 Thread Jan Beulich
>>> On 29.09.15 at 11:11, wrote: > With the flush taking longer than Xen can wait for, you'll need to > do something more complex, e.g.: > - keep a log of all relevant pending derefs, to be processed when the >flush completes; or > - have some other method of preventing changes of ownership/

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-29 Thread Tim Deegan
Hi, At 03:08 +0000 on 28 Sep (1443409723), Xu, Quan wrote: > >>> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: > > 7/13: I'm not convinced that making the vcpu spin calling > > sched_yield() is a very good plan. Better to explicitly pause the domain > > if you > > need its vcpus not t

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-29 Thread Jan Beulich
>>> On 29.09.15 at 04:53, wrote: Monday, September 28, 2015 2:47 PM, wrote: >> >>> On 28.09.15 at 05:08, wrote: >> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: > >> It would be a guest kernel bug, but all _we_ care about is that such a guest > kernel >> bug won't affect th

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-28 Thread Xu, Quan
>>> Monday, September 28, 2015 2:47 PM, wrote: > >>> On 28.09.15 at 05:08, wrote: > Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: > It would be a guest kernel bug, but all _we_ care about is that such a guest > kernel > bug won't affect the hypervisor or other guests. It won't a

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-27 Thread Jan Beulich
>>> On 28.09.15 at 05:08, wrote: Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: >> 7/13: I'm not convinced that making the vcpu spin calling >> sched_yield() is a very good plan. Better to explicitly pause the domain if >> you >> need its vcpus not to run. But first -- why does I

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-27 Thread Xu, Quan
Tim, thanks for your review. >>> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote: > Hi, > > At 14:09 +0000 on 21 Sep (1442844587), Xu, Quan wrote: > > George / Tim, > > Could you help me review these memory patches? Thanks! > > The interrupt-mapping and chipset control parts of this are

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-23 Thread Tim Deegan
Hi, At 14:09 +0000 on 21 Sep (1442844587), Xu, Quan wrote: > George / Tim, > Could you help me review these memory patches? Thanks! The interrupt-mapping and chipset control parts of this are outside my understanding. :) And I'm not an x86/mm maintainer any more, but I'll have a look: 7/13: I'm

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-21 Thread Jan Beulich
>>> On 21.09.15 at 16:03, wrote: On 21.09.15 at 20:04, < jbeul...@suse.com > wrote: >> >>> On 21.09.15 at 11:46, wrote: >> >>> >>> On 21.09.15 at 16:51, < jbeul...@suse.com > wrote: >> >>- Anything else? >> > >> > >> > Just test the extreme case. The ATS specification mandates a timeout >>

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-21 Thread Xu, Quan
George / Tim, Could you help me review these memory patches? Thanks! -Quan > -Original Message- > From: Xu, Quan > Sent: Wednesday, September 16, 2015 9:24 PM > To: andrew.coop...@citrix.com; Dong, Eddie; ian.campb...@citrix.com; > ian.jack...@eu.citrix.com; jbeul...@suse.com; Nakajima,

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-21 Thread Xu, Quan
>>> On 21.09.15 at 20:04, < jbeul...@suse.com > wrote: > >>> On 21.09.15 at 11:46, wrote: > >>> >>> On 21.09.15 at 16:51, < jbeul...@suse.com > wrote: > >>- Anything else? > > > > > > Just test the extreme case. The ATS specification mandates a timeout > > of 1 _minute_ for cache flush, even thou

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-21 Thread Jan Beulich
>>> On 21.09.15 at 11:46, wrote: >>> >>> On 21.09.15 at 16:51, < jbeul...@suse.com > wrote: >>- Anything else? > > > Just test the extreme case. The ATS specification mandates a timeout of 1 > _minute_ for cache flush, even though it doesn't take so much time for cache > flush. > In my design,

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-21 Thread Xu, Quan
Thanks Jan. >> >>> On 21.09.15 at 16:51, < jbeul...@suse.com > wrote: >>> On 17.09.15 at 05:26, wrote: > Much more information: >If I run a service in this domain and tested this waitqueue case. > The domain is still working after 60s, but It prints out Call Trace with > $dmesg: > > [ 161

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-21 Thread Jan Beulich
>>> On 17.09.15 at 05:26, wrote: > Much more information: >If I run a service in this domain and tested this waitqueue case. The > domain is still working after 60s, but It prints out Call Trace with $dmesg: > > [ 161.978599] BUG: soft lockup - CPU#0 stuck for 57s! [kworker/0:1:272] Not su

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-17 Thread Ian Jackson
Julien Grall writes ("Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device"): > On 16/09/2015 14:47, Ian Jackson wrote: > > I don't consider myself qualified to review that. I think the > > MAINTAINERS file should have an entry f

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-17 Thread Julien Grall
On 16/09/2015 14:47, Ian Jackson wrote: Julien Grall writes ("Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device"): On 16/09/15 11:46, Ian Jackson wrote: JOOI why did you CC me ? I did a quick scan of these patches and they don't seem to

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-16 Thread Xu, Quan
> -Original Message- > From: Xu, Quan > Sent: Wednesday, September 16, 2015 9:24 PM > To: andrew.coop...@citrix.com; Dong, Eddie; ian.campb...@citrix.com; > ian.jack...@eu.citrix.com; jbeul...@suse.com; Nakajima, Jun; k...@xen.org; > Tian, Kevin; t...@xen.org; Zhang, Yang Z; george.dun...

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-16 Thread Ian Jackson
Julien Grall writes ("Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device"): > On 16/09/15 11:46, Ian Jackson wrote: > > JOOI why did you CC me ? I did a quick scan of these patches and they > > don't seem to have any tools impact.

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-16 Thread Xu, Quan
> -Original Message- > From: Ian Jackson [mailto:ian.jack...@eu.citrix.com] > Sent: Wednesday, September 16, 2015 6:47 PM > To: Xu, Quan > Cc: andrew.coop...@citrix.com; Dong, Eddie; ian.campb...@citrix.com; > ian.jack...@eu.citrix.com; jbeul...@suse.com; Nakajima, Jun; k...@xen.org; > Ti

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-16 Thread Julien Grall
On 16/09/15 11:46, Ian Jackson wrote: > Quan Xu writes ("[Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS > Device"): >> Introduction >> > > Thanks for your submission. > > JOOI why did you CC me ? I did a quick scan of these patches and they > don't seem to have any to

Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-16 Thread Ian Jackson
Quan Xu writes ("[Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device"): > Introduction > Thanks for your submission. JOOI why did you CC me ? I did a quick scan of these patches and they don't seem to have any tools impact. I would prefer not to be CC'd unless ther

[Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device

2015-09-15 Thread Quan Xu
Introduction VT-d code currently has a number of cases where completion of certain operations is being waited for by way of spinning. The majority of instances use that variable indirectly through IOMMU_WAIT_OP() macro, allowing for loops of up to 1 second (DMAR_OPERATION_TIMEOU
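
For readers skimming the thread, the spin-until-done-or-timeout pattern that the cover letter describes can be sketched roughly as below. This is only an illustration, not Xen's IOMMU_WAIT_OP() code: `spin_wait()`, `now_ns()`, `SPIN_TIMEOUT_NS`, and the two toy completion callbacks are hypothetical names made up for the example, and it reads the wall clock via C11 `timespec_get()` for portability, where a hypervisor would use its own time source.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a 1-second spin limit; not Xen's definition. */
#define SPIN_TIMEOUT_NS 1000000000ULL

/* Current time in nanoseconds (wall clock, C11 timespec_get). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Busy-poll done(ctx) until it reports completion or timeout_ns elapses.
 * Returns true on completion, false on timeout. The calling CPU does no
 * other work while it spins, which is the cost the thread is debating.
 */
static bool spin_wait(bool (*done)(void *), void *ctx, uint64_t timeout_ns)
{
    uint64_t start = now_ns();

    for (;;) {
        if (done(ctx))
            return true;
        if (now_ns() - start > timeout_ns)
            return false;
    }
}

/* Toy "hardware" that completes after a fixed number of polls. */
static bool polls_exhausted(void *ctx)
{
    int *remaining = ctx;

    if (*remaining == 0)
        return true;
    --*remaining;
    return false;
}

/* Toy "hardware" that never completes, to exercise the timeout path. */
static bool never_done(void *ctx)
{
    (void)ctx;
    return false;
}
```

Calling `spin_wait(polls_exhausted, &polls, SPIN_TIMEOUT_NS)` with `polls` set to a small count returns true almost at once, while `spin_wait(never_done, NULL, ...)` holds the CPU for the full timeout before returning false. That second case is the behaviour an asynchronous flush is meant to avoid.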