Re: [kvm-devel] KVM in-kernel APIC update

Avi Kivity Wed, 04 Apr 2007 09:52:58 -0700

Gregory Haskins wrote:
> What I was planning on doing was using that QEMU patch I provided to 
> intercept all pic_send_irq() calls and forward them directly to the kernel 
> via a new ioctl().  This ioctl would be directed at the VM fd, not the VCPU, 
> since its a pure ISA global pin reference and wont know the targeted vcpu 
> until the 8259/IOAPIC perform their translation.
>


Hmm.  If the ioapic is in the kernel, then it's a platform-wide resource 
and you would need a vm ioctl.  If ioapic emulation is in userspace, 
then the ioapic logic will have decided which cpu is targeted and you 
would issue a vcpu ioctl.

> So that being said, I think the interface between userspace and kernel would 
> be no more complex than it is today.  I.e. we would just be passing a single 
> int via an ioctl.  

I think the interface should mirror the hardware interface at the point 
of the "cut".  For example, if we keep the ioapic in userspace, the 
interface is ioapic/apic bus messages.  If we push the ioapic into the 
kernel, the interface describes the various ioapic pins and how the 
ioapics are connected to each other and to the processors (i.e. the 
topology).


> The danger, as you have pointed out, is accepting the QEMU patch that I 
> submitted that potentially diverges the pic code from QEMU upstream.  What 
> this buys us, however, is is that any code (kernel or userspace) would be 
> able to inject an ISA interrupt.  Using an ISA interrupt has the advantage of 
> already being represented in the ACPI/MP tables presented by Bochs, and thus 
> any compliant OS will automatically route the IRQ.
>
> Where things do get potentially complicated with the interface is if we go 
> with a hybrid solution.  Leaving the LAPIC in kernel, but the IOAPIC/8259 in 
> userspace requires a much wider interface so the IOAPIC can deliver APIC 
> style messages to the kernel.  Also, IOAPIC EOI for level sensitive 
> interrupts become more difficult and complex.  Putting LAPIC + IOAPIC#0 in 
> the kernel and leaving 8259 outside might actually work fairly well, but 
> in-kernel ISA interrupts will only work with OSs which enable the APIC.  This 
> may or may not be an issue and may be an acceptable tradeoff.  
>   

Everything should keep working, that is a must.  We just need the 
interfaces to follow the hardware faithfully.  The issue with the ioapic 
eoi is worrying me performance wise, though; it looks like we need to 
push the ioapic too if we are to have no-compromise performance on 
unmodified OSes.


>> There are two extreme models, which I think are both needed.  On one 
>> end, support for closed OSes (e.g. Windows) requires fairly strict 
>> conformance to the PCI model, which means going through the IOAPIC or 
>> PIC or however the interrupt lines are wired in qemu.  This seems to 
>> indicate 
>> that an in-kernel IOAPIC is needed.  On the other end (Linux), 
>> a legacy-free and emulation-free device can just inject interrupts 
>> directly and use shared memory to ack interrupts and indicate their source.
>>     
>
> Well, sort of.  The problem as I see it in both cases is still IRQ routing.   
> If you knew of a free vector to take (regardless of OS) you could forgo the 
> ACPI/MP declarations all together and register the vector with your "device" 
> (e.g. via a hypercall, etc) and have the device issue direct LAPIC messages 
> with the proper vector.  I think this would work assuming the device used 
> edge triggered interrupts (which don't require IOAPIC IRR registration or 
> EOI).
>
>   

For unmodified guests, use the existing pci irq routing.  I certainly 
wouldn't want to debug anything else.  For modified guests, there's no 
real problem.

>>>   
>>>       
>> Since you need locking anyway, best to use the unlocked versions 
>> (__set_bit()).
>>     
>
> Ack.  Yes, locking needs to be added.  Votes for appropriate mechanism? (e.g. 
> spinlock, mutex, etc?)
>
>   

spin_lock_irq(), as this lock will frequently have to be taken in host 
irq handlers.  Need to be extra careful with the locking in 
{vmx,svm}_vcpu_run().

>   
>>> @@ -108,20 +109,12 @@ static unsigned get_addr_size(struct kvm_vcpu *vcpu)
>>>  
>>>  static inline u8 pop_irq(struct kvm_vcpu *vcpu)
>>>  {
>>> -   int word_index = __ffs(vcpu->irq_summary);
>>> -   int bit_index = __ffs(vcpu->irq_pending[word_index]);
>>> -   int irq = word_index * BITS_PER_LONG + bit_index;
>>> -
>>> -   clear_bit(bit_index, &vcpu->irq_pending[word_index]);
>>> -   if (!vcpu->irq_pending[word_index])
>>> -           clear_bit(word_index, &vcpu->irq_summary);
>>> -   return irq;
>>> +   return kvm_vcpu_irq_read(vcpu, 0, NULL);
>>>  }
>>>  
>>>  static inline void push_irq(struct kvm_vcpu *vcpu, u8 irq)
>>>  {
>>> -   set_bit(irq, vcpu->irq_pending);
>>> -   set_bit(irq / BITS_PER_LONG, &vcpu->irq_summary);
>>> +   kvm_vcpu_irq_inject(vcpu, irq, 0);
>>>  }
>>>   
>>>       
>> It would be helpful to unify the vmx and svm irq code first (I can merge 
>> something like that immediately), so this patch becomes simpler.
>>     
>
> Well, I could just change any occurrence of "pop_irq" with 
> kvm_vcpu_irq_read() and any occurrence of push_irq() to 
> kvm_vcpu_irq_inject(), but I don't get the impression that this is what you 
> were referring to.  Could you elaborate?
>   

I meant having a cleanup patch before that pushes irq handling into 
common code, then the apic patch could modify that to call the kernel 
apic code if necessary.


-- 
error compiling committee.c: too many arguments to function


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Re: [kvm-devel] KVM in-kernel APIC update

Reply via email to