Gregory Haskins wrote: > What I was planning on doing was using that QEMU patch I provided to > intercept all pic_send_irq() calls and forward them directly to the kernel > via a new ioctl(). This ioctl would be directed at the VM fd, not the VCPU, > since its a pure ISA global pin reference and wont know the targeted vcpu > until the 8259/IOAPIC perform their translation. >
Hmm. If the ioapic is in the kernel, then it's a platform-wide resource and you would need a vm ioctl. If ioapic emulation is in userspace, then the ioapic logic will have decided which cpu is targeted and you would issue a vcpu ioctl. > So that being said, I think the interface between userspace and kernel would > be no more complex than it is today. I.e. we would just be passing a single > int via an ioctl. I think the interface should mirror the hardware interface at the point of the "cut". For example, if we keep the ioapic in userspace, the interface is ioapic/apic bus messages. If we push the ioapic into the kernel, the interface describes the various ioapic pins and how the ioapics are connected to each other and to the processors (i.e. the topology). > The danger, as you have pointed out, is accepting the QEMU patch that I > submitted that potentially diverges the pic code from QEMU upstream. What > this buys us, however, is is that any code (kernel or userspace) would be > able to inject an ISA interrupt. Using an ISA interrupt has the advantage of > already being represented in the ACPI/MP tables presented by Bochs, and thus > any compliant OS will automatically route the IRQ. > > Where things do get potentially complicated with the interface is if we go > with a hybrid solution. Leaving the LAPIC in kernel, but the IOAPIC/8259 in > userspace requires a much wider interface so the IOAPIC can deliver APIC > style messages to the kernel. Also, IOAPIC EOI for level sensitive > interrupts become more difficult and complex. Putting LAPIC + IOAPIC#0 in > the kernel and leaving 8259 outside might actually work fairly well, but > in-kernel ISA interrupts will only work with OSs which enable the APIC. This > may or may not be an issue and may be an acceptable tradeoff. > Everything should keep working, that is a must. We just need the interfaces to follow the hardware faithfully. The issue with the ioapic eoi is worrying me performance wise, though; it looks like we need to push the ioapic too if we are to have no-compromise performance on unmodified OSes. >> There are two extreme models, which I think are both needed. On one >> end, support for closed OSes (e.g. Windows) requires fairly strict >> conformance to the PCI model, which means going through the IOAPIC or >> PIC or however the interrupt lines are wired in qemu. This seems to >> indicate >> that an in-kernel IOAPIC is needed. On the other end (Linux), >> a legacy-free and emulation-free device can just inject interrupts >> directly and use shared memory to ack interrupts and indicate their source. >> > > Well, sort of. The problem as I see it in both cases is still IRQ routing. > If you knew of a free vector to take (regardless of OS) you could forgo the > ACPI/MP declarations all together and register the vector with your "device" > (e.g. via a hypercall, etc) and have the device issue direct LAPIC messages > with the proper vector. I think this would work assuming the device used > edge triggered interrupts (which don't require IOAPIC IRR registration or > EOI). > > For unmodified guests, use the existing pci irq routing. I certainly wouldn't want to debug anything else. For modified guests, there's no real problem. >>> >>> >> Since you need locking anyway, best to use the unlocked versions >> (__set_bit()). >> > > Ack. Yes, locking needs to be added. Votes for appropriate mechanism? (e.g. > spinlock, mutex, etc?) > > spin_lock_irq(), as this lock will frequently have to be taken in host irq handlers. Need to be extra careful with the locking in {vmx,svm}_vcpu_run(). > >>> @@ -108,20 +109,12 @@ static unsigned get_addr_size(struct kvm_vcpu *vcpu) >>> >>> static inline u8 pop_irq(struct kvm_vcpu *vcpu) >>> { >>> - int word_index = __ffs(vcpu->irq_summary); >>> - int bit_index = __ffs(vcpu->irq_pending[word_index]); >>> - int irq = word_index * BITS_PER_LONG + bit_index; >>> - >>> - clear_bit(bit_index, &vcpu->irq_pending[word_index]); >>> - if (!vcpu->irq_pending[word_index]) >>> - clear_bit(word_index, &vcpu->irq_summary); >>> - return irq; >>> + return kvm_vcpu_irq_read(vcpu, 0, NULL); >>> } >>> >>> static inline void push_irq(struct kvm_vcpu *vcpu, u8 irq) >>> { >>> - set_bit(irq, vcpu->irq_pending); >>> - set_bit(irq / BITS_PER_LONG, &vcpu->irq_summary); >>> + kvm_vcpu_irq_inject(vcpu, irq, 0); >>> } >>> >>> >> It would be helpful to unify the vmx and svm irq code first (I can merge >> something like that immediately), so this patch becomes simpler. >> > > Well, I could just change any occurrence of "pop_irq" with > kvm_vcpu_irq_read() and any occurrence of push_irq() to > kvm_vcpu_irq_inject(), but I don't get the impression that this is what you > were referring to. Could you elaborate? > I meant having a cleanup patch before that pushes irq handling into common code, then the apic patch could modify that to call the kernel apic code if necessary. -- error compiling committee.c: too many arguments to function ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel