On Wed, Aug 12, 2009 at 12:18:10PM +0300, Avi Kivity wrote:
> On 08/12/2009 12:04 PM, Gleb Natapov wrote:
>> On Wed, Aug 12, 2009 at 11:27:13AM +0300, Avi Kivity wrote:
>>
>>> On 08/11/2009 03:31 PM, Gleb Natapov wrote:
>>>
>>>
>>> What is the motivation for this change?
>>>
>>>
>> The motivation was explained in 0/0. I want to get rid of lock on
>> general irq injection path so the lock have to be pushed into ioapic
>> since multiple cpus can access it concurrently. PIC has such lock
>> already.
>>
>
> Ah, the real motivation is msi. Pushing locks down doesn't help if we
> keep locking them. But for msi we avoid the lock entirely.
>
Yes. MSI is one. Multiple IOAPICs may inject interrupt in parallel too
(if we will choose to implement multiple IOAPICs sometime).
>>> Why a spinlock and not a mutex?
>>>
>>>
>> Protected sections are small and we do not sleep there.
>>
>
> So what? A mutex is better since it allows preemption (and still has
> spinlock performance if it isn't preempted).
>
This lock will be taken during irq injection from irqfd, so may be leave
it as spinlock and take it _irqsave()? Do we want to allow irq injection
from interrupt context? Otherwise if you say that performance is the
same I don't care one way or the other.
>
>
>>> Need to explain why this is safe. I'm not sure it is, because we touch
>>> state afterwards in pic_intack(). We need to do all vcpu-synchronous
>>> operations before dropping the lock.
>>>
>> Forst pic_intack() calls pic_clear_isr() only in auto eoi mode and this mode
>> is already broken for assigned devices. Second for level triggered
>> interrupts pic_intack() does nothing after calling pic_clear_isr() and
>> third I can move pic_clear_isr() call to the end of pic_intack().
>>
>
> I meant, in a comment.
I you agree with above I'll add it as a comment.
>
>>>> void kvm_pic_clear_isr_ack(struct kvm *kvm)
>>>> @@ -238,7 +240,9 @@ void kvm_pic_reset(struct kvm_kpic_state *s)
>>>> if (vcpu0&& kvm_apic_accept_pic_intr(vcpu0))
>>>> if (s->irr& (1<< irq) || s->isr& (1<<
>>>> irq)) {
>>>> n = irq + irqbase;
>>>> + spin_unlock(&s->pics_state->lock);
>>>> kvm_notify_acked_irq(kvm,
>>>> SELECT_PIC(n), n);
>>>> + spin_lock(&s->pics_state->lock);
>>>>
>>>>
>>> Ditto here, needs to be moved until after done changing state.
>>>
>>>
>> I am not sure this code is even needed. IOAPIC don't call notifiers on
>> reset.
>>
>
> It should. What if there's a reset with an assigned device? We need to
> release the device interrupt (after doing FLR?).
Doing this will just re-enable host interrupt while irq condition is not
cleared in the device. The host will hang. So I think we really shouldn't.
>
>>
>>>> -static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int pin,
>>>> - int trigger_mode)
>>>> +static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
>>>> + int trigger_mode)
>>>> {
>>>> - union kvm_ioapic_redirect_entry *ent;
>>>> + int i;
>>>> +
>>>> + for (i = 0; i< IOAPIC_NUM_PINS; i++) {
>>>> + union kvm_ioapic_redirect_entry *ent =&ioapic->redirtbl[i];
>>>> +
>>>> + if (ent->fields.vector != vector)
>>>> + continue;
>>>>
>>>> - ent =&ioapic->redirtbl[pin];
>>>> + spin_unlock(&ioapic->lock);
>>>> + kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
>>>> + spin_lock(&ioapic->lock);
>>>>
>>>>
>>>>
>>> I *think* we need to clear remote_irr before dropping the lock. I
>>> *know* there's a missing comment here.
>>>
>> I don't see why we clear remote_irr before dropping the lock. If, while
>> lock was dropped, interrupt was delivered to this entry it will be
>> injected when ack notifier returns.
>>
>
> But we'll clear remote_irr afterward the redelivery, and we should to
> that only after the new interrupt is acked.
It depend on whether you consider calling ack notifiers a part of
interrupt acknowledgement process. If you do then remote_irr should not
be cleared before ack notifiers since ack process is not completed yet.
With current users functionally it shouldn't matter when we clear
remote_irr. I prefer doing it like we do it now since this how it was
before my patches and since code is simpler this way.
>
>>>> - kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, pin);
>>>> + if (trigger_mode != IOAPIC_LEVEL_TRIG)
>>>> + continue;
>>>>
>>>> - if (trigger_mode == IOAPIC_LEVEL_TRIG) {
>>>> ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
>>>> ent->fields.remote_irr = 0;
>>>> - if (!ent->fields.mask&& (ioapic->irr& (1<< pin)))
>>>> - ioapic_service(ioapic, pin);
>>>> + if (!ent->fields.mask&& (ioapic->irr& (1<< i)))
>>>> + ioapic_service(ioapic, i);
>>>> }
>>>> }
>>>>
>>>>
>>> To make the patch easier to read, suggest keeping the loop in the other
>>> function.
>>>
>>>
>> I don't follow. All __kvm_ioapic_update_eoi() contains is the loop, so
>> the loop is already in its own function. Do you mean move the context of
>> the loop into the other function and leave only for(;;) fun(); in
>> __kvm_ioapic_update_eoi()?
>>
>
> No, I mean keep the for loop in kvm_ioapic_update_eoi().
>
Can't do that. __kvm_ioapic_update_eoi() is called from other place with
lock held already.
--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html