Re: [Xen-devel] [PATCH v2] x86/apicv: fix RTC periodic timer and apicv issue
On September 24, 2016 7:34 AM, Tian Kevin < kevin.t...@intel.com > wrote: >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: Friday, September 23, 2016 11:34 PM >> >> >>> On 20.09.16 at 15:30,wrote: >> > --- a/xen/arch/x86/hvm/vlapic.c >> > +++ b/xen/arch/x86/hvm/vlapic.c >> > @@ -433,6 +433,12 @@ void vlapic_EOI_set(struct vlapic *vlapic) >> > void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) { >> > struct domain *d = vlapic_domain(vlapic); >> > +struct vcpu *v = vlapic_vcpu(vlapic); >> > +struct hvm_intack pt_intack; >> > + >> > +pt_intack.vector = vector; >> > +pt_intack.source = hvm_intsrc_lapic; >> > +pt_intr_post(v, pt_intack); >> >> This also sits on the EOI LAPIC register write path, i.e. the change >> then also affects non-apicv environments. > >The new logic should be entered only when EOI-induced exit happens. > Yes, more that the EOI-induced exit is conditional, specifically, the bitmap is set by vmx_set_eoi_exit_bitmap(). Jan, what do you think? While I recall from v1 discussion, you have the same comment. We can dig it deep.. >> >> Furthermore - don't we have the same problem as with v1 again then? >> What prevents multiple EOIs to come here before the timer interrupt >> actually gets handled? You'd then clear ->irq_issued each time, >> rendering your change to pt_update_irq() ineffective. > >based on this patch, one irq_issued should cause only one EOI on related pt >vector and this EOI exit clears irq_issued then next EOI would be seen only >upon >another injection when irq_issued is set again... However there might be an >issue if this pt vector is shared with another device interrupt, which >although is >not a typical case... > Agreed. >> >> In any event please use hvm_intack_lapic() instead of open coding it >> (and then I don't think you need a local variable at all). >> Indeed. >> > --- a/xen/arch/x86/hvm/vmx/intr.c >> > +++ b/xen/arch/x86/hvm/vmx/intr.c >> > @@ -333,8 +333,6 @@ void vmx_intr_assist(void) > >> > clear_bit(i, >arch.hvm_vmx.eoi_exitmap_changed); >> > __vmwrite(EOI_EXIT_BITMAP(i), >v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> > } >> > - >> > -pt_intr_post(v, intack); >> > } >> >> I'll defer to the VMX maintainers to determine whether removing this >> but not the other one is correct. > >This is correct. The other one is for non-apicv scenario. > >> >> > --- a/xen/arch/x86/hvm/vpt.c >> > +++ b/xen/arch/x86/hvm/vpt.c >> > @@ -252,7 +252,8 @@ int pt_update_irq(struct vcpu *v) >> > } >> > else >> > { >> > -if ( (pt->last_plt_gtime + pt->period) < max_lag ) >> > +if ( (pt->last_plt_gtime + pt->period) < max_lag && >> > + !pt->irq_issued ) >> > { >> > max_lag = pt->last_plt_gtime + pt->period; >> > earliest_pt = pt; >> >> Looking at the rest of the code I really wonder why this check hadn't >> been there from the beginning. Tim, do recall whether this was >> intentional (as opposed to simply having been an oversight)? >> > >Possibly simplify the emulation by relying on IRR/ISR to handling pending >situation? I think IRR is enough. But there is a challenge that the pt interrupt is __not_only__ handled by LAPIC ( see the bottom of pt_update_irq() ).. Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [ vTPM ] ownership disappear
On September 20, 2016 9:14, > Hi everyone, > when I create a vTPM instance and attached to it a VM, I use tpm-tools to > take thw ownership of the vTPM. Then, > if I destroy the vTPM and the VM, my expectation is that when I re-create the > same vTPM attached to the same VM, > the owership is already taken, but this is not the case: if I re-run > tpm_takeownership, the command is successfull. > Why this happens? Ernesto, I think the emulated TPM is always started with 'ST_CLEAR' status.. in vtpm.c: |main() |-main_loop() -- tpm_emulator_init(opt_args.startup ...) in tpm_emulator_init(), the opt_args.startup is with ''ST_CLEAR'' struct Opt_args opt_args = { .startup = ST_CLEAR, ... }; btw, in my experience, when tpm_takeownership is successfull and you re-run it again, it also print out '''success''' cc: Daniel De GraafEmil Condrea Huawei 2012 lab Quan Xu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] x86/apicv: fix RTC periodic timer and apicv issue
>From 97760602b5c94745e76ed78d23e8fdf9988d234e Mon Sep 17 00:00:00 2001 From: Quan XuDate: Tue, 20 Sep 2016 21:12:54 +0800 Subject: [PATCH v2] x86/apicv: fix RTC periodic timer and apicv issue When Xen apicv is enabled, wall clock time is faster on Windows7-32 guest with high payload (with 2vCPU, captured from xentrace, in high payload, the count of IPI interrupt increases rapidly between these vCPUs). If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector 0xd1) are both pending (index of bit set in vIRR), unfortunately, the IPI intrrupt is high priority than periodic timer interrupt. Xen updates IPI interrupt bit set in vIRR to guest interrupt status (RVI) as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI interrupt within VMX non-root operation without a VM-Exit. Within VMX non-root operation, if periodic timer interrupt index of bit is set in vIRR and highest, the apicv delivers periodic timer interrupt within VMX non-root operation as well. But in current code, if Xen doesn't update periodic timer interrupt bit set in vIRR to guest interrupt status (RVI) directly, Xen is not aware of this case to decrease the count (pending_intr_nr) of pending periodic timer interrupt, then Xen will deliver a periodic timer interrupt again. And that we update periodic timer interrupt in every VM-entry, there is a chance that already-injected instance (before EOI-induced exit happens) will incur another pending IRR setting if there is a VM-exit happens between virtual interrupt injection (vIRR->0, vISR->1) and EOI-induced exit (vISR->0), since pt_intr_post hasn't been invoked yet, then the guest receives more periodic timer interrupt. So change to pt_intr_post in EOI-induced exit handler and skip periodic timer when it is not be completely consumed (irq_issued is ture). Signed-off-by: Yifei Jiang Signed-off-by: Rongguang He Signed-off-by: Quan Xu --- v2: -change to pt_intr_post in EOI-induced exit handler. -skip periodic timer when it is not be completely consumed (irq_issued is ture). --- xen/arch/x86/hvm/vlapic.c | 6 ++ xen/arch/x86/hvm/vmx/intr.c | 2 -- xen/arch/x86/hvm/vpt.c | 3 ++- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index 1d5d287..f83d6ab 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -433,6 +433,12 @@ void vlapic_EOI_set(struct vlapic *vlapic) void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) { struct domain *d = vlapic_domain(vlapic); +struct vcpu *v = vlapic_vcpu(vlapic); +struct hvm_intack pt_intack; + +pt_intack.vector = vector; +pt_intack.source = hvm_intsrc_lapic; +pt_intr_post(v, pt_intack); if ( vlapic_test_and_clear_vector(vector, >regs->data[APIC_TMR]) ) vioapic_update_EOI(d, vector); diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c index 8fca08c..29d9bbf 100644 --- a/xen/arch/x86/hvm/vmx/intr.c +++ b/xen/arch/x86/hvm/vmx/intr.c @@ -333,8 +333,6 @@ void vmx_intr_assist(void) clear_bit(i, >arch.hvm_vmx.eoi_exitmap_changed); __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]); } - -pt_intr_post(v, intack); } else { diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c index 5c48fdb..a9da436 100644 --- a/xen/arch/x86/hvm/vpt.c +++ b/xen/arch/x86/hvm/vpt.c @@ -252,7 +252,8 @@ int pt_update_irq(struct vcpu *v) } else { -if ( (pt->last_plt_gtime + pt->period) < max_lag ) +if ( (pt->last_plt_gtime + pt->period) < max_lag && + !pt->irq_issued ) { max_lag = pt->last_plt_gtime + pt->period; earliest_pt = pt; -- 1.8.3.4 0001-x86-apicv-fix-RTC-periodic-timer-and-apicv-issue.patch Description: 0001-x86-apicv-fix-RTC-periodic-timer-and-apicv-issue.patch ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On September 19, 2016 5:25 PM, Tian Kevin <kevin.t...@intel.com> wrote: >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> Sent: Monday, September 12, 2016 5:08 PM >> >> On September 12, 2016 3:58 PM, Tian, Kevin <kevin.t...@intel.com> wrote: >> >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> >> Sent: Friday, September 09, 2016 11:02 AM >> >> >> >> On August 30, 2016 1:58 PM, Tian Kevin < kevin.t...@intel.com > wrote: >> >> >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> >> >> Sent: Friday, August 19, 2016 8:59 PM >> >> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c >> >> index 1d5d287..cc247c3 100644 >> >> --- a/xen/arch/x86/hvm/vlapic.c >> >> +++ b/xen/arch/x86/hvm/vlapic.c >> >> @@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic) >> >> void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) { >> >> struct domain *d = vlapic_domain(vlapic); >> >> +struct hvm_intack pt_intack; >> >> + >> >> +pt_intack.vector = vector; >> >> +pt_intack.source = hvm_intsrc_lapic; >> >> +pt_intr_post(vlapic_vcpu(vlapic), pt_intack); >> >> >> >> if ( vlapic_test_and_clear_vector(vector, >> >>regs->data[APIC_TMR]) ) >> >> vioapic_update_EOI(d, vector); diff --git >> >> a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c index >> >> 8fca08c..29d9bbf 100644 >> >> --- a/xen/arch/x86/hvm/vmx/intr.c >> >> +++ b/xen/arch/x86/hvm/vmx/intr.c >> >> @@ -333,8 +333,6 @@ void vmx_intr_assist(void) >> >> clear_bit(i, >arch.hvm_vmx.eoi_exitmap_changed); >> >> __vmwrite(EOI_EXIT_BITMAP(i), >> >v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> >> } >> >> - >> >> -pt_intr_post(v, intack); >> >> } >> >> else >> >> { >> >> >> > >> >Because we update pt irq in every vmentry, there is a chance that >> >already-injected instance (before EOI-induced exit happens) will >> >incur another pending IRR setting if there is a VM-exit happens >> >between HW virtual interrupt injection (vIRR->0, vISR->1) and >> >EOI-induced exit (vISR->0), since pt_intr_post hasn't been invoked >> >yet. I guess this is the reason why you still see faster wallclock. >> > >> >> Agreed. A good description. My bad description is from another aspect. >> >> >I think you need mark this pending_intr_post situation explicitly. >> >Then pt_update_irq should skip such pt timer when pending_intr_post >> >of that timer is true (otherwise the update is meaningless since >> >previous one hasn't been posted yet). Then with your change to post >> >in EOI-induced exit handler, it should work correctly to meet the >> >goal >> >- one virtual interrupt delivery for one pending pt intr... >> > >> I think we are at least on the right track. >> But I can't follow ' pending_intr_post ', a new parameter? Thanks. >> >> > >yes, a new parameter to record whether a intr_post operation is pending The existing parameter ' irq_issued ' looks good. I have tested with below modification last night, and it is working. If it is okay, I will send out v2.. Quan modification = diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index 1d5d287..cc247c3 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic) void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) { struct domain *d = vlapic_domain(vlapic); +struct hvm_intack pt_intack; + +pt_intack.vector = vector; +pt_intack.source = hvm_intsrc_lapic; +pt_intr_post(vlapic_vcpu(vlapic), pt_intack); if ( vlapic_test_and_clear_vector(vector, >regs->data[APIC_TMR]) ) vioapic_update_EOI(d, vector); diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c index 8fca08c..29d9bbf 100644 --- a/xen/arch/x86/hvm/vmx/intr.c +++ b/xen/arch/x86/hvm/vmx/intr.c @@ -333,8 +333,6 @@ void vmx_intr_assist(void) clear_bit(i, >arch.hvm_vmx.eoi_exitmap_changed); __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]); } - -pt_intr_post(v, intack); } else { diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c index 5c48fdb..620ca68 100644 --- a/xen/arch/x86/hvm/vpt.c +++ b/xen/arch/x86/hvm/vpt.c @@ -267,6 +267,11 @@ int pt_update_irq(struct vcpu *v) return -1; } +if ( earliest_pt->irq_issued ) +{ +spin_unlock(>arch.hvm_vcpu.tm_lock); +return -1; +} earliest_pt->irq_issued = 1; irq = earliest_pt->irq; is_lapic = (earliest_pt->source == PTSRC_lapic); ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On September 12, 2016 3:58 PM, Tian, Kevin <kevin.t...@intel.com> wrote: >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> Sent: Friday, September 09, 2016 11:02 AM >> >> On August 30, 2016 1:58 PM, Tian Kevin < kevin.t...@intel.com > wrote: >> >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> >> Sent: Friday, August 19, 2016 8:59 PM >> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c >> index 1d5d287..cc247c3 100644 >> --- a/xen/arch/x86/hvm/vlapic.c >> +++ b/xen/arch/x86/hvm/vlapic.c >> @@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic) void >> vlapic_handle_EOI(struct vlapic *vlapic, u8 vector) { >> struct domain *d = vlapic_domain(vlapic); >> +struct hvm_intack pt_intack; >> + >> +pt_intack.vector = vector; >> +pt_intack.source = hvm_intsrc_lapic; >> +pt_intr_post(vlapic_vcpu(vlapic), pt_intack); >> >> if ( vlapic_test_and_clear_vector(vector, >>regs->data[APIC_TMR]) ) >> vioapic_update_EOI(d, vector); diff --git >> a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c index >> 8fca08c..29d9bbf 100644 >> --- a/xen/arch/x86/hvm/vmx/intr.c >> +++ b/xen/arch/x86/hvm/vmx/intr.c >> @@ -333,8 +333,6 @@ void vmx_intr_assist(void) >> clear_bit(i, >arch.hvm_vmx.eoi_exitmap_changed); >> __vmwrite(EOI_EXIT_BITMAP(i), >v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> } >> - >> -pt_intr_post(v, intack); >> } >> else >> { >> > >Because we update pt irq in every vmentry, there is a chance that >already-injected instance (before EOI-induced exit happens) will incur another >pending IRR setting if there is a VM-exit happens between HW virtual interrupt >injection (vIRR->0, vISR->1) and EOI-induced exit (vISR->0), since pt_intr_post >hasn't been invoked yet. I guess this is the reason why you still see faster >wallclock. > Agreed. A good description. My bad description is from another aspect. >I think you need mark this pending_intr_post situation explicitly. >Then pt_update_irq should skip such pt timer when pending_intr_post of that >timer is true (otherwise the update is meaningless since previous one hasn't >been posted yet). Then with your change to post in EOI-induced exit handler, it >should work correctly to meet the goal >- one virtual interrupt delivery for one pending pt intr... > I think we are at least on the right track. But I can't follow ' pending_intr_post ', a new parameter? Thanks. Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 30, 2016 1:58 PM, Tian Kevin < kevin.t...@intel.com > wrote: >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> Sent: Friday, August 19, 2016 8:59 PM >> >> When Xen apicv is enabled, wall clock time is faster on Windows7-32 >> guest with high payload (with 2vCPU, captured from xentrace, in high >> payload, the count of IPI interrupt increases rapidly between these vCPUs). >> >> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector >> 0xd1) are both pending (index of bit set in VIRR), unfortunately, the >> IPI intrrupt is high priority than periodic timer interrupt. Xen >> updates IPI interrupt bit set in VIRR to guest interrupt status (RVI) >> as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI >> interrupt within VMX non-root operation without a VM exit. Within VMX >> non-root operation, if periodic timer interrupt index of bit is set in >> VIRR and highest, the apicv delivers periodic timer interrupt within VMX >non-root operation as well. >> >> But in current code, if Xen doesn't update periodic timer interrupt >> bit set in VIRR to guest interrupt status (RVI) directly, Xen is not >> aware of this case to decrease the count (pending_intr_nr) of pending >> periodic timer interrupt, then Xen will deliver a periodic timer >> interrupt again. The guest receives more periodic timer interrupt. >> >> If the periodic timer interrut is delivered and not the highest >> priority, make Xen be aware of this case to decrease the count of >> pending periodic timer interrupt. >> >> Signed-off-by: Yifei Jiang <jiangyi...@huawei.com> >> Signed-off-by: Rongguang He <herongguang...@huawei.com> >> Signed-off-by: Quan Xu <xuqu...@huawei.com> >> --- >> Why RFC: >> 1. I am not quite sure for other cases, such as nested case. >> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including >external >>interrupts, non-maskable interrupt system-management interrrupts, >exceptions >>and VM exit) may occur before delivery of a periodic timer interrupt, the >periodic >>timer interrupt may be lost when a coming periodic timer interrupt is >delivered. >>Actually, and so current code is. >> --- >> xen/arch/x86/hvm/vmx/intr.c | 16 +++- >> 1 file changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c >> index 8fca08c..d3a034e 100644 >> --- a/xen/arch/x86/hvm/vmx/intr.c >> +++ b/xen/arch/x86/hvm/vmx/intr.c >> @@ -334,7 +334,21 @@ void vmx_intr_assist(void) >> __vmwrite(EOI_EXIT_BITMAP(i), >v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> } >> >> -pt_intr_post(v, intack); >> + /* >> +* If the periodic timer interrut is delivered and not the highest >priority, >> +* make Xen be aware of this case to decrease the count of pending >periodic >> +* timer interrupt. >> +*/ >> +if ( pt_vector != -1 && intack.vector > pt_vector ) >> +{ >> +struct hvm_intack pt_intack; >> + >> +pt_intack.vector = pt_vector; >> +pt_intack.source = hvm_intsrc_lapic; >> +pt_intr_post(v, pt_intack); >> +} >> +else >> +pt_intr_post(v, intack); >> } > >Here is my thought. w/o above patch one pending pt interrupt may result in >multiple injections of guest timer interrupt, as you identified, when pt_vector >happens to be not the highest pending vector. Then w/ your patch, multiple >pending pt interrupt instances may be combined as one injection of guest timer >interrupt. Especially intack.vector >>pt_vector doesn't mean pt_intr is eligible for injection, which might >be blocked due to other reasons. As Jan pointed out, this path is very >fragile, so >better we can have a fix to sustain the original behavior with one pending pt >instance causing one actual injection. > >What about doing pt_intr_post in EOI-induced VM-exit instead? EOI exit bit for >pt_intr is already set to 1 which means every injection would incur an >EOI-induced VM-exit: > > /* >* Set eoi_exit_bitmap for periodic timer interrup to cause EOI-induced >VM >* exit, then pending periodic time interrups have the chance to be >injected >* for compensation >*/ >if (pt_vector != -1) >vmx_set_eoi_exit_bitmap(v, pt_vector); > >I don't think delaying post makes much difference here. Even we do post earlier >like your p
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On September 06, 2016 9:21 PM, Jan Beulichwrote: On 06.09.16 at 13:22, wrote: >> On September 06, 2016 5:05 PM, Jan Beulich >> On 06.09.16 at 10:53, wrote: to be honest, my fix is also a tradeoff.. w/ my patch, multiple pending pt interrupt instances may be combined as one injection of guest timer interrupt.. but as Yang said, It should be ok since it happens rarely and even native OS will encounter the same problem if it disable the interrupt for too >>>long. >>> >>>And one OS may properly deal with that situation while another might >>>not. For example, let me remind you of issues the hypervisor had in >>>earlier days when no event/softirq processing happened for long >>>enough, resulting in time management issues due to missed platform >>>timer overflow handling. IOW an OS may avoid that situation by simply not >disabling IRQs for too long a time. >>> >> Thanks for your reminding, >> Jan, do you have any suggestion to fix this PT and apicv issue? >> What do you think about Kevin's? > >Kevin's suggestion sounded reasonable, iirc, and I'd rely on his and your >better >VMX knowledge for the low level details. > I'll try to verify Kevin's later. Kevin is really a master but I am just a rookie. I always appreciate both of your education.. Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On September 06, 2016 5:05 PM, Jan BeulichOn 06.09.16 at 10:53, wrote: >> to be honest, my fix is also a tradeoff.. w/ my patch, multiple >> pending pt interrupt instances may be combined as one injection of >> guest timer interrupt.. >> but as Yang said, It should be ok since it happens rarely and even >> native OS will encounter the same problem if it disable the interrupt for too >long. > >And one OS may properly deal with that situation while another might not. For >example, let me remind you of issues the hypervisor had in earlier days when no >event/softirq processing happened for long enough, resulting in time >management issues due to missed platform timer overflow handling. IOW an OS >may avoid that situation by simply not disabling IRQs for too long a time. > Thanks for your reminding, Jan, do you have any suggestion to fix this PT and apicv issue? What do you think about Kevin's? Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 30, 2016 1:58 PM, Tian, Kevin <kevin.t...@intel.com> wrote: >> From: Xuquan (Euler) [mailto:xuqu...@huawei.com] >> Sent: Friday, August 19, 2016 8:59 PM >> >> When Xen apicv is enabled, wall clock time is faster on Windows7-32 >> guest with high payload (with 2vCPU, captured from xentrace, in high >> payload, the count of IPI interrupt increases rapidly between these vCPUs). >> >> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector >> 0xd1) are both pending (index of bit set in VIRR), unfortunately, the >> IPI intrrupt is high priority than periodic timer interrupt. Xen >> updates IPI interrupt bit set in VIRR to guest interrupt status (RVI) >> as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI >> interrupt within VMX non-root operation without a VM exit. Within VMX >> non-root operation, if periodic timer interrupt index of bit is set in >> VIRR and highest, the apicv delivers periodic timer interrupt within VMX >non-root operation as well. >> >> But in current code, if Xen doesn't update periodic timer interrupt >> bit set in VIRR to guest interrupt status (RVI) directly, Xen is not >> aware of this case to decrease the count (pending_intr_nr) of pending >> periodic timer interrupt, then Xen will deliver a periodic timer >> interrupt again. The guest receives more periodic timer interrupt. >> >> If the periodic timer interrut is delivered and not the highest >> priority, make Xen be aware of this case to decrease the count of >> pending periodic timer interrupt. >> >> Signed-off-by: Yifei Jiang <jiangyi...@huawei.com> >> Signed-off-by: Rongguang He <herongguang...@huawei.com> >> Signed-off-by: Quan Xu <xuqu...@huawei.com> >> --- >> Why RFC: >> 1. I am not quite sure for other cases, such as nested case. >> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including >external >>interrupts, non-maskable interrupt system-management interrrupts, >exceptions >>and VM exit) may occur before delivery of a periodic timer interrupt, the >periodic >>timer interrupt may be lost when a coming periodic timer interrupt is >delivered. >>Actually, and so current code is. >> --- >> xen/arch/x86/hvm/vmx/intr.c | 16 +++- >> 1 file changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c >> index 8fca08c..d3a034e 100644 >> --- a/xen/arch/x86/hvm/vmx/intr.c >> +++ b/xen/arch/x86/hvm/vmx/intr.c >> @@ -334,7 +334,21 @@ void vmx_intr_assist(void) >> __vmwrite(EOI_EXIT_BITMAP(i), >v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> } >> >> -pt_intr_post(v, intack); >> + /* >> +* If the periodic timer interrut is delivered and not the highest >priority, >> +* make Xen be aware of this case to decrease the count of pending >periodic >> +* timer interrupt. >> +*/ >> +if ( pt_vector != -1 && intack.vector > pt_vector ) >> +{ >> +struct hvm_intack pt_intack; >> + >> +pt_intack.vector = pt_vector; >> +pt_intack.source = hvm_intsrc_lapic; >> +pt_intr_post(v, pt_intack); >> +} >> +else >> +pt_intr_post(v, intack); >> } > >Here is my thought. w/o above patch one pending pt interrupt may result in >multiple injections of guest timer interrupt, as you identified, when pt_vector >happens to be not the highest pending vector. Then w/ your patch, multiple >pending pt interrupt instances may be combined as one injection of guest timer >interrupt. Especially intack.vector >>pt_vector doesn't mean pt_intr is eligible for injection, which might >be blocked due to other reasons. As Jan pointed out, this path is very >fragile, so >better we can have a fix to sustain the original behavior with one pending pt >instance causing one actual injection. > Agreed. Good summary.. >What about doing pt_intr_post in EOI-induced VM-exit instead? EOI exit bit for >pt_intr is already set to 1 which means every injection would incur an >EOI-induced VM-exit: > > /* >* Set eoi_exit_bitmap for periodic timer interrup to cause EOI-induced >VM >* exit, then pending periodic time interrups have the chance to be >injected >* for compensation >*/ >if (pt_vector != -1) >vmx_set_eoi_exit_bitmap(v, pt_vector); > >I don't think delaying post makes much difference here. Even we do
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 22, 2016 11:12 PM,wrote: On 22.08.16 at 16:02, wrote: >> On August 22, 2016 8:04 PM, wrote: >> On 22.08.16 at 13:41, wrote: On August 22, 2016 6:36 PM, wrote: On 19.08.16 at 14:58, wrote: >> From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 >00:00:00 >> 2001 >> From: Quan Xu >> Date: Fri, 19 Aug 2016 20:40:31 +0800 >> Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv >> issue >> >> When Xen apicv is enabled, wall clock time is faster on >> Windows7-32 guest with high payload (with 2vCPU, captured from >> xentrace, in high payload, the count of IPI interrupt increases rapidly >between these vCPUs). >> >> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector >> 0xd1) are both pending (index of bit set in VIRR), unfortunately, >> the IPI intrrupt is high priority than periodic timer interrupt. >> Xen updates IPI interrupt bit set in VIRR to guest interrupt >> status >> (RVI) as a high priority and apicv (Virtual-Interrupt Delivery) >> delivers IPI interrupt within VMX non-root operation without a VM >> exit. Within VMX non-root operation, if periodic timer interrupt >> index of bit is set in VIRR and highest, the apicv delivers >> periodic timer >>>interrupt within VMX non-root operation as well. >> >> But in current code, if Xen doesn't update periodic timer >> interrupt bit set in VIRR to guest interrupt status (RVI) >> directly, Xen is not aware of this case to decrease the count >> (pending_intr_nr) of pending periodic timer interrupt, then Xen >> will deliver a periodic timer interrupt again. The guest receives >> more periodic timer interrupt. >> >> If the periodic timer interrut is delivered and not the highest >> priority, make Xen be aware of this case to decrease the count of >> pending periodic timer interrupt. > >I can see the issue you're trying to address, but for one - doesn't >this lead to other double accounting, namely once the pt irq becomes >the highest priority one? > It is does NOT lead to other double accounting.. As if the pt irq becomes the highest priority one, the intack is pt one.. Then: +else +pt_intr_post(v, intack); >>> >>>As just said in reply to Yang: If this is still the same interrupt >>>instance >> as in a >>>prior run through here which took the if() branch, this change looks >>>like >> having >>>the potential of double accounting. >>> >> >> I very appreciate your detail review. It looks like, but actually it >> doesn't happen. >> >> As the key parameter 'pt->irq_issued'.. >> >> In pt_update_irq(), once the PT irq is issued, set the pt->irq_issued.. >> In pt_intr_post(), clear the pt->irq_issued before touching the count >> 'pt->pending_intr_nr'.. >> >> According to your assumption, at the second call to pt_intr_post(), As >> if 'pt->irq_issued' is clear, pt is NULL in is_pt_irq() check, then >> return, there is no chance to touch the count 'pt->pending_intr_nr'.. > >Hmm, wait: That second pass would also get us through >pt_update_irq() a second time, which might cause irq_issued to get set again. > iiuc this case is IRQ combination, issue twice but delivery once.. Another enhancement may be required here, IF hvm_intsrc_lapic && the PT IRQ index bit in VIRR is not clear THEN don't issue PT IRQ in pt_update_irq() FI >Granted this code is fragile, therefore please excuse that I'm trying to be >extra >careful with accepting changes to it. Understood :):)... Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 22, 2016 8:04 PM,wrote: On 22.08.16 at 13:41, wrote: >> On August 22, 2016 6:36 PM, wrote: >> On 19.08.16 at 14:58, wrote: From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 00:00:00 2001 From: Quan Xu Date: Fri, 19 Aug 2016 20:40:31 +0800 Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue When Xen apicv is enabled, wall clock time is faster on Windows7-32 guest with high payload (with 2vCPU, captured from xentrace, in high payload, the count of IPI interrupt increases rapidly between these vCPUs). If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector 0xd1) are both pending (index of bit set in VIRR), unfortunately, the IPI intrrupt is high priority than periodic timer interrupt. Xen updates IPI interrupt bit set in VIRR to guest interrupt status (RVI) as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI interrupt within VMX non-root operation without a VM exit. Within VMX non-root operation, if periodic timer interrupt index of bit is set in VIRR and highest, the apicv delivers periodic timer >interrupt within VMX non-root operation as well. But in current code, if Xen doesn't update periodic timer interrupt bit set in VIRR to guest interrupt status (RVI) directly, Xen is not aware of this case to decrease the count (pending_intr_nr) of pending periodic timer interrupt, then Xen will deliver a periodic timer interrupt again. The guest receives more periodic timer interrupt. If the periodic timer interrut is delivered and not the highest priority, make Xen be aware of this case to decrease the count of pending periodic timer interrupt. >>> >>>I can see the issue you're trying to address, but for one - doesn't >>>this lead to other double accounting, namely once the pt irq becomes >>>the highest priority one? >>> >> >> It is does NOT lead to other double accounting.. >> As if the pt irq becomes the highest priority one, the intack is pt one.. >> Then: >> >> +else >> +pt_intr_post(v, intack); > >As just said in reply to Yang: If this is still the same interrupt instance as >in a >prior run through here which took the if() branch, this change looks like >having >the potential of double accounting. > I very appreciate your detail review. It looks like, but actually it doesn't happen. As the key parameter 'pt->irq_issued'.. In pt_update_irq(), once the PT irq is issued, set the pt->irq_issued.. In pt_intr_post(), clear the pt->irq_issued before touching the count 'pt->pending_intr_nr'.. According to your assumption, at the second call to pt_intr_post(), As if 'pt->irq_issued' is clear, pt is NULL in is_pt_irq() check, then return, there is no chance to touch the count 'pt->pending_intr_nr'.. -- void pt_intr_post(struct vcpu *v, struct hvm_intack intack) { ... pt = is_pt_irq(v, intack); if ( pt == NULL ) { spin_unlock(>arch.hvm_vcpu.tm_lock); return; } ... ... } static struct periodic_time *is_pt_irq() { list_for_each_entry ( pt, head, list ) { if ( pt->pending_intr_nr && pt->irq_issued___ && (intack.vector == pt_irq_vector(pt, intack.source)) ) return pt; } return NULL; } __IIUC__, this question is based on the following pseudocode detail the behavior of virtual-interrupt delivery is __not__ atomic: Vector <- RVI; VISR[Vector] <- 1; SVI <- Vector; VPPR<- Vector & F0H; VIRR[Vector] <- 0; IF any bits set in VIRR Then RVI<-highest index of bit set in VIRR ELSE RVI <-0 FI Deliver interrupt with Vector through IDT ... We'd better check this first, as Yang said, this is atomic.. Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 22, 2016 9:10 PM, Yang Zhang wrote: >On 2016/8/22 20:04, Jan Beulich wrote: > On 22.08.16 at 13:41,wrote: >>> On August 22, 2016 6:36 PM, wrote: >>> On 19.08.16 at 14:58, wrote: > From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 00:00:00 > 2001 > From: Quan Xu > Date: Fri, 19 Aug 2016 20:40:31 +0800 > Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv > issue > > When Xen apicv is enabled, wall clock time is faster on Windows7-32 > guest with high payload (with 2vCPU, captured from xentrace, in > high payload, the count of IPI interrupt increases rapidly between these >vCPUs). > > If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector > 0xd1) are both pending (index of bit set in VIRR), unfortunately, > the IPI intrrupt is high priority than periodic timer interrupt. > Xen updates IPI interrupt bit set in VIRR to guest interrupt status > (RVI) as a high priority and apicv (Virtual-Interrupt Delivery) > delivers IPI interrupt within VMX non-root operation without a VM > exit. Within VMX non-root operation, if periodic timer interrupt > index of bit is set in VIRR and highest, the apicv delivers periodic timer >interrupt within VMX non-root operation as well. > > But in current code, if Xen doesn't update periodic timer interrupt > bit set in VIRR to guest interrupt status (RVI) directly, Xen is > not aware of this case to decrease the count (pending_intr_nr) of > pending periodic timer interrupt, then Xen will deliver a periodic > timer interrupt again. The guest receives more periodic timer > interrupt. > > If the periodic timer interrut is delivered and not the highest > priority, make Xen be aware of this case to decrease the count of > pending periodic timer interrupt. I can see the issue you're trying to address, but for one - doesn't this lead to other double accounting, namely once the pt irq becomes the highest priority one? >>> >>> It is does NOT lead to other double accounting.. >>> As if the pt irq becomes the highest priority one, the intack is pt one.. >>> Then: >>> >>> +else >>> +pt_intr_post(v, intack); >> >> As just said in reply to Yang: If this is still the same interrupt >> instance as in a prior run through here which took the if() branch, >> this change looks like having the potential of double accounting. > >This cannot happen: if pt intr is the second priority one, then corresponding >vIRR >bit will be cleared by hardware while the highest priority intr is delivered to >guest. And the next vmx_intr_assit cannot aware of it since the vIRR bit is >zero. > I am afraid that this may happen.. as I mentioned as 2nd RFC reason, Within VMX non-root operation, an Asynchronous Enclave Exit (including external interrupts, non-maskable interrupt system-management interrrupts, exceptions and VM exit) may occur before delivery of a periodic timer interrupt, the pt bit of VIRR is still there... but I will explain more why it doesn't have the potential of double accounting in current code in next email. Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 22, 2016 7:40 PM, Yang Zhang wrote: >On 2016/8/22 18:35, Jan Beulich wrote: > On 19.08.16 at 14:58,wrote: >>> From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 00:00:00 >>> 2001 >>> From: Quan Xu >>> Date: Fri, 19 Aug 2016 20:40:31 +0800 >>> Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv >>> issue >>> >>> When Xen apicv is enabled, wall clock time is faster on Windows7-32 >>> guest with high payload (with 2vCPU, captured from xentrace, in high >>> payload, the count of IPI interrupt increases rapidly between these vCPUs). >>> >>> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector >>> 0xd1) are both pending (index of bit set in VIRR), unfortunately, the >>> IPI intrrupt is high priority than periodic timer interrupt. Xen >>> updates IPI interrupt bit set in VIRR to guest interrupt status (RVI) >>> as a high priority and apicv (Virtual-Interrupt Delivery) delivers >>> IPI interrupt within VMX non-root operation without a VM exit. Within >>> VMX non-root operation, if periodic timer interrupt index of bit is >>> set in VIRR and highest, the apicv delivers periodic timer interrupt within >VMX non-root operation as well. >>> >>> But in current code, if Xen doesn't update periodic timer interrupt >>> bit set in VIRR to guest interrupt status (RVI) directly, Xen is not >>> aware of this case to decrease the count (pending_intr_nr) of pending >>> periodic timer interrupt, then Xen will deliver a periodic timer >>> interrupt again. The guest receives more periodic timer interrupt. >>> >>> If the periodic timer interrut is delivered and not the highest >>> priority, make Xen be aware of this case to decrease the count of >>> pending periodic timer interrupt. >> >> I can see the issue you're trying to address, but for one - doesn't >> this lead to other double accounting, namely once the pt irq becomes >> the highest priority one? > >"intack.vector > pt_vector" >I think above check in code can avoid the double accounting problem. It >recalculate pending timer interrupt only when the highest priority interrupt is >not timer. > Yes, >> >>> --- a/xen/arch/x86/hvm/vmx/intr.c >>> +++ b/xen/arch/x86/hvm/vmx/intr.c >>> @@ -334,7 +334,21 @@ void vmx_intr_assist(void) >>> __vmwrite(EOI_EXIT_BITMAP(i), >v->arch.hvm_vmx.eoi_exit_bitmap[i]); >>> } >>> >>> -pt_intr_post(v, intack); >>> + /* >>> +* If the periodic timer interrut is delivered and not the highest >priority, >>> +* make Xen be aware of this case to decrease the count of >pending periodic >>> +* timer interrupt. >>> +*/ >>> +if ( pt_vector != -1 && intack.vector > pt_vector ) >>> +{ >>> +struct hvm_intack pt_intack; >>> + >>> +pt_intack.vector = pt_vector; >>> +pt_intack.source = hvm_intsrc_lapic; >>> +pt_intr_post(v, pt_intack); >>> +} >>> +else >>> +pt_intr_post(v, intack); >> >> And then I'm having two problems with this change: Processing other >> than the highest priority irq here looks to be non-architectural. From >> looking over pt_intr_post() there's only pt related accounting and no >> actual interrupt handling there, but I'd like this to be (a) confirmed >> and (b) called out in the comment. >> >> Plus the change above is in no way apicv specific, so I'd also like to >> see clarification why this change is valid (i.e. at least not making >> things worse) even without apicv in use. > >It is apicv specific case. Above code will execute only when cpu has >virtual interrupt delivery which is part of apicv feature. > Yes, I think so too. Thanks for your clarification!! Quan >> >> And of course in any event VMX maintainer feedback is going to be >> needed here. >> >> Jan >> > > >-- >Yang >Alibaba Cloud Computing ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 22, 2016 6:36 PM,wrote: On 19.08.16 at 14:58, wrote: >> From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 00:00:00 2001 >> From: Quan Xu >> Date: Fri, 19 Aug 2016 20:40:31 +0800 >> Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue >> >> When Xen apicv is enabled, wall clock time is faster on Windows7-32 guest >> with high payload (with 2vCPU, captured from xentrace, in high payload, the >> count of IPI interrupt increases rapidly between these vCPUs). >> >> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector 0xd1) >> are both pending (index of bit set in VIRR), unfortunately, the IPI intrrupt >> is high priority than periodic timer interrupt. Xen updates IPI interrupt bit >> set in VIRR to guest interrupt status (RVI) as a high priority and apicv >> (Virtual-Interrupt Delivery) delivers IPI interrupt within VMX non-root >> operation without a VM exit. Within VMX non-root operation, if periodic timer >> interrupt index of bit is set in VIRR and highest, the apicv delivers >> periodic >> timer interrupt within VMX non-root operation as well. >> >> But in current code, if Xen doesn't update periodic timer interrupt bit set >> in VIRR to guest interrupt status (RVI) directly, Xen is not aware of this >> case to decrease the count (pending_intr_nr) of pending periodic timer >> interrupt, >> then Xen will deliver a periodic timer interrupt again. The guest receives >> more >> periodic timer interrupt. >> >> If the periodic timer interrut is delivered and not the highest priority, >> make >> Xen be aware of this case to decrease the count of pending periodic timer >> interrupt. > >I can see the issue you're trying to address, but for one - doesn't >this lead to other double accounting, namely once the pt irq becomes >the highest priority one? > It is does NOT lead to other double accounting.. As if the pt irq becomes the highest priority one, the intack is pt one.. Then: +else +pt_intr_post(v, intack); >> --- a/xen/arch/x86/hvm/vmx/intr.c >> +++ b/xen/arch/x86/hvm/vmx/intr.c >> @@ -334,7 +334,21 @@ void vmx_intr_assist(void) >> __vmwrite(EOI_EXIT_BITMAP(i), >> v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> } >> >> -pt_intr_post(v, intack); >> + /* >> +* If the periodic timer interrut is delivered and not the highest >> priority, >> +* make Xen be aware of this case to decrease the count of pending >> periodic >> +* timer interrupt. >> +*/ >> +if ( pt_vector != -1 && intack.vector > pt_vector ) >> +{ >> +struct hvm_intack pt_intack; >> + >> +pt_intack.vector = pt_vector; >> +pt_intack.source = hvm_intsrc_lapic; >> +pt_intr_post(v, pt_intack); >> +} >> +else >> +pt_intr_post(v, intack); > >And then I'm having two problems with this change: Processing other >than the highest priority irq here looks to be non-architectural. From >looking over pt_intr_post() there's only pt related accounting and no >actual interrupt handling there, but I'd like this to be > (a) confirmed To me, YES.. but I hope Kevin could confirm it again. >and (b) called out in the comment. Agreed. Good idea. > >Plus the change above is in no way apicv specific, so I'd also like to >see clarification why this change is valid (i.e. at least not making >things worse) even without apicv in use. > __iiuc__, it is apicv specific, as these change is under condition 'cpu_has_vmx_virtual_intr_delivery' (Virtual Interrupt Delivery is one of apicv features) as below: if ( cpu_has_vmx_virtual_intr_delivery ...) { ___change___ } >And of course in any event VMX maintainer feedback is going to be >needed here. Thanks for your comments!! Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
On August 22, 2016 5:38 PM, Yang Zhang wrote: >On 2016/8/19 20:58, Xuquan (Euler) wrote: >> From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 00:00:00 2001 >> From: Quan Xu <xuqu...@huawei.com> >> Date: Fri, 19 Aug 2016 20:40:31 +0800 >> Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue >> >> When Xen apicv is enabled, wall clock time is faster on Windows7-32 >> guest with high payload (with 2vCPU, captured from xentrace, in high >> payload, the count of IPI interrupt increases rapidly between these vCPUs). >> >> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector >> 0xd1) are both pending (index of bit set in VIRR), unfortunately, the >> IPI intrrupt is high priority than periodic timer interrupt. Xen >> updates IPI interrupt bit set in VIRR to guest interrupt status (RVI) >> as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI >> interrupt within VMX non-root operation without a VM exit. Within VMX >> non-root operation, if periodic timer interrupt index of bit is set in >> VIRR and highest, the apicv delivers periodic timer interrupt within VMX >> non-root operation as well. >> >> But in current code, if Xen doesn't update periodic timer interrupt >> bit set in VIRR to guest interrupt status (RVI) directly, Xen is not >> aware of this case to decrease the count (pending_intr_nr) of pending >> periodic timer interrupt, then Xen will deliver a periodic timer >> interrupt again. The guest receives more periodic timer interrupt. >> >> If the periodic timer interrut is delivered and not the highest >> priority, make Xen be aware of this case to decrease the count of >> pending periodic timer interrupt. > >Nice catch! > >> >> Signed-off-by: Yifei Jiang <jiangyi...@huawei.com> >> Signed-off-by: Rongguang He <herongguang...@huawei.com> >> Signed-off-by: Quan Xu <xuqu...@huawei.com> >> --- >> Why RFC: >> 1. I am not quite sure for other cases, such as nested case. > >We don't support the nested APICv now. So it is ok for this patch. > >> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including >> external >>interrupts, non-maskable interrupt system-management interrrupts, >> exceptions >>and VM exit) may occur before delivery of a periodic timer interrupt, the >> periodic >>timer interrupt may be lost when a coming periodic timer interrupt is >> delivered. >>Actually, and so current code is. > >I think you mean the combination not interrupt lost. It should be ok since it >happens rarely and even native OS will encounter the same problem if it >disable the interrupt for too long. > Yes, it is 'combination'. >> --- >> xen/arch/x86/hvm/vmx/intr.c | 16 +++- >> 1 file changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c >> index 8fca08c..d3a034e 100644 >> --- a/xen/arch/x86/hvm/vmx/intr.c >> +++ b/xen/arch/x86/hvm/vmx/intr.c >> @@ -334,7 +334,21 @@ void vmx_intr_assist(void) >> __vmwrite(EOI_EXIT_BITMAP(i), >> v->arch.hvm_vmx.eoi_exit_bitmap[i]); >> } >> >> -pt_intr_post(v, intack); >> + /* >> +* If the periodic timer interrut is delivered and not the highest >> priority, >> +* make Xen be aware of this case to decrease the count of pending >> periodic >> +* timer interrupt. >> +*/ >> +if ( pt_vector != -1 && intack.vector > pt_vector ) >> +{ >> +struct hvm_intack pt_intack; >> + >> +pt_intack.vector = pt_vector; >> +pt_intack.source = hvm_intsrc_lapic; >> +pt_intr_post(v, pt_intack); >> +} >> +else >> +pt_intr_post(v, intack); >> } >> else >> { >> -- >> 1.7.12.4 >> > >Looks good to me. > >Reviewed-by: Yang Zhang <yang.zhang...@gmail.com> > Thanks for your comments!! Quan >-- >Yang >Alibaba Cloud Computing ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
>From 9b2df963c13ad27e2cffbeddfa3267782ac3da2a Mon Sep 17 00:00:00 2001 From: Quan XuDate: Fri, 19 Aug 2016 20:40:31 +0800 Subject: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue When Xen apicv is enabled, wall clock time is faster on Windows7-32 guest with high payload (with 2vCPU, captured from xentrace, in high payload, the count of IPI interrupt increases rapidly between these vCPUs). If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector 0xd1) are both pending (index of bit set in VIRR), unfortunately, the IPI intrrupt is high priority than periodic timer interrupt. Xen updates IPI interrupt bit set in VIRR to guest interrupt status (RVI) as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI interrupt within VMX non-root operation without a VM exit. Within VMX non-root operation, if periodic timer interrupt index of bit is set in VIRR and highest, the apicv delivers periodic timer interrupt within VMX non-root operation as well. But in current code, if Xen doesn't update periodic timer interrupt bit set in VIRR to guest interrupt status (RVI) directly, Xen is not aware of this case to decrease the count (pending_intr_nr) of pending periodic timer interrupt, then Xen will deliver a periodic timer interrupt again. The guest receives more periodic timer interrupt. If the periodic timer interrut is delivered and not the highest priority, make Xen be aware of this case to decrease the count of pending periodic timer interrupt. Signed-off-by: Yifei Jiang Signed-off-by: Rongguang He Signed-off-by: Quan Xu --- Why RFC: 1. I am not quite sure for other cases, such as nested case. 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including external interrupts, non-maskable interrupt system-management interrrupts, exceptions and VM exit) may occur before delivery of a periodic timer interrupt, the periodic timer interrupt may be lost when a coming periodic timer interrupt is delivered. Actually, and so current code is. --- xen/arch/x86/hvm/vmx/intr.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c index 8fca08c..d3a034e 100644 --- a/xen/arch/x86/hvm/vmx/intr.c +++ b/xen/arch/x86/hvm/vmx/intr.c @@ -334,7 +334,21 @@ void vmx_intr_assist(void) __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]); } -pt_intr_post(v, intack); + /* +* If the periodic timer interrut is delivered and not the highest priority, +* make Xen be aware of this case to decrease the count of pending periodic +* timer interrupt. +*/ +if ( pt_vector != -1 && intack.vector > pt_vector ) +{ +struct hvm_intack pt_intack; + +pt_intack.vector = pt_vector; +pt_intack.source = hvm_intsrc_lapic; +pt_intr_post(v, pt_intack); +} +else +pt_intr_post(v, intack); } else { -- 1.7.12.4 0001-x86-apicv-fix-RTC-periodic-timer-and-apicv-issue.patch Description: 0001-x86-apicv-fix-RTC-periodic-timer-and-apicv-issue.patch ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] help -- A question about ' XENFEAT_auto_translated_physmap':
On August 09, 2016 9:02 PM, < jbeul...@suse.com > wrote: > >>> On 09.08.16 at 14:36,wrote: > > Hi Jan, > > > > A question about ' XENFEAT_auto_translated_physmap': > > > > In linux code, in arch/x86/xen/mmu.c, > > I assume you know that I'm not a maintainer of the Linux code. > > > __xen_pgd_walk() > > { > > > > > > if (xen_feature(XENFEAT_auto_translated_physmap)) > > return 0; > > > > } > > > > > > > > Why xen_feature(XENFEAT_auto_translated_physmap) is true, then return > > directly? > > If not return directly, is there any potential risk? > > Well, the function is specifically there for operations (pinning/unpinning) > which are required only for the not-auto-translated case. > Why would anyone want to traverse a page table tree just to do nothing on > each of the entries? Jan, thank you!! As I am struggling with a dom0 crash, the kernel is old, 3.0.X.. Now there is a crash in [] _pin_lock+0x165/0x2a0<*crash* unable to handle kernel paging request at 8b1021826000 static void _pin_lock(struct mm_struct *mm, int lock) { 171 pgd_t *pgd = mm->pgd; 172 unsigned g; 173 174 for (g = 0; g <= ((TASK_SIZE_MAX-1) / PGDIR_SIZE); g++, pgd++) { 175 pud_t *pud; 176 unsigned u; 177 178 if (pgd_none(*pgd)) 179 continue; 180 pud = pud_offset(pgd, 0); 181 for (u = 0; u < PTRS_PER_PUD; u++, pud++) { 182 pmd_t *pmd; 183 unsigned m; 184 185 if (pud_none(*pud)) 186 continue; 187 pmd = pmd_offset(pud, 0); 188 for (m = 0; m < PTRS_PER_PMD; m++, pmd++) { 189 spinlock_t *ptl; 190 191 if (pmd_none(*pmd)) <-*crash* 192 continue; 193 ptl = pte_lockptr(0, pmd); 194 if (lock) 195 spin_lock(ptl); 196 else 197 spin_unlock(ptl); 198 } 199 } 200 } 201 } 202 #endif } Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] help -- A question about ' XENFEAT_auto_translated_physmap':
Hi Jan, A question about ' XENFEAT_auto_translated_physmap': In linux code, in arch/x86/xen/mmu.c, __xen_pgd_walk() { if (xen_feature(XENFEAT_auto_translated_physmap)) return 0; } Why xen_feature(XENFEAT_auto_translated_physmap) is true, then return directly? If not return directly, is there any potential risk? Detail for __xen_pgd_walk(): static int __xen_pgd_walk(struct mm_struct *mm, pgd_t *pgd, int (*func)(struct mm_struct *mm, struct page *, enum pt_level), unsigned long limit) { int flush = 0; unsigned hole_low, hole_high; unsigned pgdidx_limit, pudidx_limit, pmdidx_limit; unsigned pgdidx, pudidx, pmdidx; /* The limit is the last byte to be touched */ limit--; BUG_ON(limit >= FIXADDR_TOP); if (xen_feature(XENFEAT_auto_translated_physmap)) return 0; /* * 64-bit has a great big hole in the middle of the address * space, which contains the Xen mappings. On 32-bit these * will end up making a zero-sized hole and so is a no-op. */ hole_low = pgd_index(USER_LIMIT); hole_high = pgd_index(PAGE_OFFSET); pgdidx_limit = pgd_index(limit); #if PTRS_PER_PUD > 1 pudidx_limit = pud_index(limit); #else pudidx_limit = 0; #endif #if PTRS_PER_PMD > 1 pmdidx_limit = pmd_index(limit); #else pmdidx_limit = 0; #endif for (pgdidx = 0; pgdidx <= pgdidx_limit; pgdidx++) { pud_t *pud; if (pgdidx >= hole_low && pgdidx < hole_high) continue; if (!pgd_val(pgd[pgdidx])) continue; pud = pud_offset([pgdidx], 0); if (PTRS_PER_PUD > 1) /* not folded */ flush |= (*func)(mm, virt_to_page(pud), PT_PUD); for (pudidx = 0; pudidx < PTRS_PER_PUD; pudidx++) { pmd_t *pmd; if (pgdidx == pgdidx_limit && pudidx > pudidx_limit) goto out; if (pud_none(pud[pudidx])) continue; pmd = pmd_offset([pudidx], 0); if (PTRS_PER_PMD > 1) /* not folded */ flush |= (*func)(mm, virt_to_page(pmd), PT_PMD); for (pmdidx = 0; pmdidx < PTRS_PER_PMD; pmdidx++) { struct page *pte; if (pgdidx == pgdidx_limit && pudidx == pudidx_limit && pmdidx > pmdidx_limit) goto out; if (pmd_none(pmd[pmdidx])) continue; pte = pmd_page(pmd[pmdidx]); flush |= (*func)(mm, pte, PT_PTE); } } } out: /* Do the top level last, so that the callbacks can use it as a cue to do final things like tlb flushes. */ flush |= (*func)(mm, virt_to_page(pgd), PT_PGD); return flush; } Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 15/19] Qemu-Xen-vTPM: Xen frontend driver infrastructure
On August 07, 2016 7:39 PM, Emil Condreawrote: On Mon, Jul 25, 2016 at 7:01 PM, Anthony PERARD wrote: > > > > > +{ > > > +xs_transaction_t xbt = XBT_NULL; > > > + > > > +if (xendev->fe_state == xbus) { > > > +return 0; > > > +} > > > + > > > +xendev->fe_state = xbus; > > > +if (xendev->fe == NULL) { > > > +xen_pv_printf(NULL, 0, "xendev->fe is NULL\n"); > > > +return -1; > > > +} > > > + > > > +retry_transaction: Add a space here, then s/retry_transaction:/ retry_transaction:/ the same for the other cases.. > > > +xbt = xs_transaction_start(xenstore); > > > +if (xbt == XBT_NULL) { > > > +goto abort_transaction; > > > +} > > > > There is a transaction started, but I don't think it is used by the > > function below. Could you remove the transaction? > > I will remove it. For current version I don't see a direct usage of this > transaction. > Quan, did you have a specific reason for past versions for this transaction? No specific reason, maybe I copied these code from xen / libxl .. btw, why does libxl use ' retry_transaction: ' logic ... but QEMU doesn't ? Thank you both. Quan > > > > > > > +if (xenstore_write_int(xendev->fe, "state", xbus)) { > > > +goto abort_transaction; > > > +} > > > + > > > +if (!xs_transaction_end(xenstore, xbt, 0)) { > > > +if (errno == EAGAIN) { > > > +goto retry_transaction; > > > +} > > > +} > > > + > > > +return 0; > > > + > > > +abort_transaction: > > > +xs_transaction_end(xenstore, xbt, 1); > > > +return -1; > > > +} ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RESEND] MAINTAINERS: update email
Wei, thanks!! This email is sent out from outlook (the Linux can't send out email). the format may be not correct. Quan On Aug 01 2016 at 18:46:00pm, Wei Liu <wei.l...@citrix.com> wrote: On Mon, Aug 01, 2016 at 10:36:43AM +, Xuquan (Euler) wrote: > Signed-off-by: Quan Xu <xuqu...@huawei.com> > Reviewed-by: Kevin Tian <kevin.t...@intel.com> Thank you both. Quan, I think there is something wrong with your email setup. When I tried to git am this patch, I got: fatal: corrupt patch at line 11 Patch failed at 0001 MAINTAINERS: update email The copy of the patch that failed is found in: /local/work/COMMITTER/xen.git/.git/rebase-apply/patch When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". Looking at the email it seems that it is encoded somehow. No need to resend though. This patch is simple enough that I've fixed it for you. And I changed the subject line to: MAINTAINERS: update Quan Xu's email address before pushing to staging. > --- > MAINTAINERS | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/MAINTAINERS b/MAINTAINERS index 65c2447..97720a8 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -415,7 +415,7 @@ F: xen/include/asm-x86/hvm/monitor.h > > VTPM > M: Daniel De Graaf <dgde...@tycho.nsa.gov> > -M: Quan Xu <quan...@intel.com> > +M: Quan Xu <xuqu...@huawei.com> > S: Supported > F: extras/mini-os/tpm* > F: extras/mini-os/include/tpm* > -- > 1.7.12.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH RESEND] MAINTAINERS: update email
Signed-off-by: Quan XuReviewed-by: Kevin Tian --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 65c2447..97720a8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -415,7 +415,7 @@ F: xen/include/asm-x86/hvm/monitor.h VTPM M: Daniel De Graaf -M: Quan Xu +M: Quan Xu S: Supported F: extras/mini-os/tpm* F: extras/mini-os/include/tpm* -- 1.7.12.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel