On Wed, Dec 05, 2012 at 01:51:36PM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2012-12-05:
> > On Wed, Dec 05, 2012 at 06:02:59AM +0000, Zhang, Yang Z wrote:
> >> Gleb Natapov wrote on 2012-12-05:
> >>> On Wed, Dec 05, 2012 at 01:55:17AM +0000, Zhang, Yang Z wrote:
> >>>> Gleb Natapov wrote on 2012-12-04:
> >>>>> On Tue, Dec 04, 2012 at 06:39:50AM +0000, Zhang, Yang Z wrote:
> >>>>>> Gleb Natapov wrote on 2012-12-03:
> >>>>>>> On Mon, Dec 03, 2012 at 03:01:03PM +0800, Yang Zhang wrote:
> >>>>>>>> Virtual interrupt delivery means KVM no longer has to inject
> >>>>>>>> vAPIC interrupts manually; the hardware takes care of it fully.
> >>>>>>>> This requires some special awareness in the existing interrupt
> >>>>>>>> injection path:
> >>>>>>>> 
> >>>>>>>> - For a pending interrupt, instead of direct injection, we may
> >>>>>>>>   need to update architecture-specific indicators before
> >>>>>>>>   resuming to the guest.
> >>>>>>>> - A pending interrupt that is masked by the ISR should also be
> >>>>>>>>   considered in the update above, since the hardware will decide
> >>>>>>>>   the right time to inject it. Currently, has_interrupt and
> >>>>>>>>   get_interrupt only return a valid vector from the injection
> >>>>>>>>   p.o.v.
> >>>>>>> Most of my previous comments still apply.
> >>>>>>> 
> >>>>>>>> +void kvm_set_eoi_exitmap(struct kvm_vcpu *vcpu, int vector,
> >>>>>>>> +            int trig_mode, int always_set)
> >>>>>>>> +{
> >>>>>>>> +    if (kvm_x86_ops->set_eoi_exitmap)
> >>>>>>>> +            kvm_x86_ops->set_eoi_exitmap(vcpu, vector,
> >>>>>>>> +                                    trig_mode, always_set);
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>>  /*
> >>>>>>>>   * Add a pending IRQ into lapic.
> >>>>>>>>   * Return 1 if successfully added and 0 if discarded.
> >>>>>>>> @@ -661,6 +669,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >>>>>>>>              if (unlikely(!apic_enabled(apic)))
> >>>>>>>>                      break;
> >>>>>>>> +            kvm_set_eoi_exitmap(vcpu, vector, trig_mode, 0);
> >>>>>>> As I said in the last review rebuild the bitmap when ioapic or irq
> >>>>>>> notifier configuration changes, user request bit to notify vcpus to
> >>>>>>> reload the bitmap.
> >>>>>> It is too complicated. When programming an ioapic entry, we cannot
> >>>>>> get the target vcpu easily. We need to read the destination format
> >>>>>> register and the logical destination register to find out the
> >>>>>> target vcpu when logical mode is used. Also, we must trap every
> >>>>>> modification to those two registers to update the eoi bitmap.
> >>>>> No need to check the target vcpu. Enable the exit on all vcpus for
> >>>>> the vector
> >>>> This is wrong. As we know, a modern OS uses per-VCPU vectors. We
> >>>> cannot ensure that all vectors have the same trigger mode. And what's
> >>>> worse, if the same vector on another vcpu is used to handle
> >>>> high-frequency interrupts (like a 10G NIC), it will hurt performance.
> >>>> 
> >>> I never saw an OS reuse a vector that is used by the ioapic, and as
> >>> far as I can see this
> >> Could you point out which code does this check in the Linux kernel? I
> >> don't see any special checks when the Linux kernel allocates a vector.
> >> 
> > arch/x86/kernel/apic/io_apic.c:create_irq_nr(). It uses
> > apic->target_cpus() to get the cpu mask, and target_cpus() returns the
> > mask of all online cpus. Actually, you wrote arch_pi_create_irq() in
> > the PI patches to work around this behaviour and allocate a vector per
> > cpu, no?
> Yes, when creating a new irq, it will allocate the vector from all
> online cpus. But after the user changes the irq affinity, the vector is
> reallocated with the new cpumask, and this leaves the vector available
> on the other cpus.
>  
Since all cpus are checked during vector allocation, a vector will not be
reused if it is already allocated on any cpu.

> > Are you aware of any guest that I can run, examine its ioapic/apic
> > configuration, and see that the same vector is used on different vcpus
> > for different devices? Can you point me to it?
> > 
Can you answer this?

> >>> is not how Linux code works. Furthermore, it will not work with KVM
> >>> currently, since an apic EOI is redirected to the ioapic based on the
> >>> vector alone, not on a vector/vcpu pair, and as far as I am aware
> >>> this is how real HW works.
> >> Yes, real HW works this way. But why is it helpful in this case?
> > It makes it impossible to use the same vector for different devices on
> > different cpus if the vector is delivered to at least one cpu through
> > the ioapic. It may cause spurious interrupts, and it will bring havoc
> > to our ack notifiers (albeit that is a KVM implementation problem).
> > Also, look at the various comments in arch/x86/kernel/apic/io_apic.c;
> > it looks like ioapics tend to misbehave if you look at them funny. Who
> > knows what trouble EOIing the same vector twice may bring on real HW.
> > 
> >> 
> >>>>> programmed into the ioapic. Which two registers? All accesses to
> >>>>> the ioapic are trapped, and reconfiguration is rare.
> >>>> In logical mode, the destination VCPU depends on each CPU's
> >>>> destination format register and logical destination register, so we
> >>>> must also trap those two registers. And if it uses the
> >>>> lowest-priority delivery mode, the PPR needs to be trapped too.
> >>> Since the PPR will change on each interrupt injection, the cost would
> >>> be higher than the current approach. No need for any of that if the
> >>> bitmask is global.
> >> No, the bitmask is per VCPU. Also, why would it work if the bitmask
> >> were global?
> > Make it global. Why would what work?
> > 
> > And we need to trap the format/logical destination/id registers anyway,
> > since we need to build the kvm->arch.apic_map table that is used to
> > deliver interrupts. BTW, you can use this table to build a per-VCPU eoi
> > bitmask too, but I am not convinced it is needed in practice.
> Even if KVM uses a simple way to implement the lowest-priority delivery
> mode, we still need to trap all interrupts that use that mode, because
> each interrupt changes the CPU's priority, and we would need to
> recalculate the priority and iterate over every ioapic entry to renew
> the eoi-exiting bitmap. The cost would be worse than the current
> approach. I don't think it is worth doing.
> 
Just set the bit on every vcpu that can receive the interrupt with
lowest-priority delivery.

--
                        Gleb.