Gregory Haskins wrote: > Hi All, > The following is another patch that I broke out of the in-kernel-apic patch > that I sent earlier. This focuses purely on the irqdevice abstraction > without introducing the concept of the in-kernel LAPIC. It also provides an > implementation of a "userint" model, which should support a system with a > QEMU based emulation of the (A)PIC (e.g. current behavior). I have cleaned > up quite a few things based on all of your comments as well, so hopefully its > a little more polished than last time. > > As far as testing, I have confirmed that I can boot a 64-bit linux guest with > no discernable difference in behavior. > > Note that this patch applies after the in-kernel-mmio.patch that I sent > earlier today which has not yet been accepted/commited. This patch should > not depend on it. However, if you do have problems applying it let me know > and I can generate another one that is not after in-kernel-mmio in the series. >
A patchset is fine. > --- a/drivers/kvm/Makefile > +++ b/drivers/kvm/Makefile > @@ -2,7 +2,7 @@ > # Makefile for Kernel-based Virtual Machine module > # > > -kvm-objs := kvm_main.o mmu.o x86_emulate.o > +kvm-objs := kvm_main.o mmu.o x86_emulate.o kvm_userint.o > kvm_ prefixes are not needed for drivers/kvm/. kvm_main is an historical exception. > obj-$(CONFIG_KVM) += kvm.o > kvm-intel-objs = vmx.o > obj-$(CONFIG_KVM_INTEL) += kvm-intel.o > diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h > index c1923df..6caf60b 100644 > --- a/drivers/kvm/kvm.h > +++ b/drivers/kvm/kvm.h > @@ -13,6 +13,7 @@ > #include <linux/mm.h> > > #include "vmx.h" > +#include "kvm_irqdevice.h" > Ditto. > + > +#ifndef __KVM_IRQDEVICE_H > +#define __KVM_IRQDEVICE_H > + > +#define KVM_IRQFLAGS_NMI (1 << 0) > +#define KVM_IRQFLAGS_PEEK (1 << 1) > is PEEK still needed? > + > +struct kvm_irqdevice; > + > +struct kvm_irqsink { > + void (*raise_intr)(struct kvm_irqsink *this, > + struct kvm_irqdevice *dev); > + > + void *private; > +}; > + > +struct kvm_irqdevice { > + int (*pending)(struct kvm_irqdevice *this, int flags); > + int (*read_vector)(struct kvm_irqdevice *this, int flags); > I'm a bit confused here. The kvm_irqsink mechanism implies a push mechanism. ->pending and ->read_vector imply a pull mechanism. > + > +/* > + * kvm_irq_init() > + * > + * Description: > + * Initialized the kvm_irqdevice for use. Should be called before calling > + * any derived implementation init functions > + * > + * Parameters: > + * struct kvm_irqdevice *dev: This device > + * > + * Returns: (void) > + */ > This is infinitely better than the usual kvm documentation, but it would be better still to follow kerneldoc. > +static inline void kvm_irq_init(struct kvm_irqdevice *dev) > +{ > + memset(dev, 0, sizeof(*dev)); > +} > + > +/* > + * kvm_irq_pending() > + * > + * Description: > + * Efficiently determines if an interrupt is pending on an irqdevice > + * > + * Parameters: > + * struct kvm_irqdevice *dev: The device > + * int flags: Modifies the behavior as follows: > + * > + * KVM_IRQFLAGS_NMI: Mask everything but NMIs > + * > + * Returns: (int) > + * -1 = failure > + * 0 = no iterrupts pending (per "flags" criteria) > + * >0 = one or more interrupts are pending > + */ > +static inline int kvm_irq_pending(struct kvm_irqdevice *dev, int flags) > +{ > + return dev->pending(dev, flags); > +} > Usually -errno is returned for failure. Can this reasonably fail, though? it would simplify upper layers if it couldn't. > + > +/* > + * kvm_irq_set_pin() > + * > + * Description: > + * Allows the caller to assert/deassert an IRQ input pin to the device > + * according to device policy. > + * > + * Parameters: > + * struct kvm_irqdevice *dev: The device > + * int pin: The input pin to alter > + * int level: The value to set (1 = assert, 0 = deassert) > + * int flags: Reserved, must be 0 > Just remove it for now. When we find a use for it, we can add it. It's only userspace interfaces that are brittle. > + /* walk the interrupt-bitmap and inject an IRQ for each bit found */ > + for (i = 0; i < NR_IRQ_WORDS; ++i) { > + unsigned long word = sregs->interrupt_bitmap[i]; > + while(word) { > spaceafterwhile. Also, can just use a single loop with __test_bit(), as we're not performance critical here. > + int bit_index = __ffs(word); > + int irq = i * BITS_PER_LONG + bit_index; > + > + kvm_irq_set_pin(&vcpu->irq_dev, irq, 1, 0); > + > + clear_bit(bit_index, &word); > + } > + } > > set_segment(vcpu, &sregs->cs, VCPU_SREG_CS); > set_segment(vcpu, &sregs->ds, VCPU_SREG_DS); > @@ -2318,6 +2317,33 @@ out1: > > diff --git a/drivers/kvm/kvm_userint.c b/drivers/kvm/kvm_userint.c > new file mode 100644 > index 0000000..ded3098 > --- /dev/null > +++ b/drivers/kvm/kvm_userint.c > @@ -0,0 +1,201 @@ > +/* > + * kvm_userint.c: User Interrupts IRQ device > + * > + * This acts as an extention of an interrupt controller that exists > elsewhere > + * (typically in userspace/QEMU). Because this PIC is a pseudo device that > + * is downstream from a real emulated PIC, the "IRQ-to-vector" mapping has > + * already occured. Therefore, this PIC has the following unusal properties: > + * > + * 1) It has 256 "pins" which are literal vectors (e.g.no translation) > + * 2) It only supports "auto-EOI" behavior since it is expected that the > + * upstream emulated PIC will handle the real EOIs (if applicable) > + * 3) It only listens to "asserts" on the pins (deasserts are dropped) > + * because its an auto-EOI device anyway. > Wierd, but well explained. > +static inline int bitarray_findhighest(struct bitarray *this) > +{ > + if (!this->summary) > + return -1; > + else { > + int word_index = __ffs(this->summary); > + int bit_index = __ffs(this->pending[word_index]); > + > + return word_index * BITS_PER_LONG + bit_index; > + } > +} > Um, this is the lowest? > + > +/*---------------------------------------------------------------------- > + * userint interface - provides the actual kvm_irqdevice implementation > + *---------------------------------------------------------------------*/ > + > +typedef struct { > + spinlock_t lock; > + struct bitarray irq_pending; > + int nmi_pending; > +}kvm_userint; > No typedefs in kernel code for ordinary structs. > + > +static int userint_pending(struct kvm_irqdevice *this, int flags) > +{ > + kvm_userint *s = (kvm_userint*)this->private; > + int ret; > + > + spin_lock_irq(&s->lock); > + > + if (flags & KVM_IRQFLAGS_NMI) > + ret = s->nmi_pending; > + else > + ret = bitarray_pending(&s->irq_pending); > You probably want: ret = s->nmi_pending; if (!(flags & KVM_IRQFLAGS_NMI)) ret |= bitarray_pending(...); To avoid calling this twice when interrupts are enabled. > +static int userint_read_vector(struct kvm_irqdevice *this, int flags) > +{ > + kvm_userint *s = (kvm_userint*)this->private; > + int irq; > + > + spin_lock_irq(&s->lock); > + > + /* NMIs take priority, so if there is an NMI pending, or > + * if we are filtering out NMIs, only consider them > + */ > + if (s->nmi_pending || (flags & KVM_IRQFLAGS_NMI)) > + irq = s->nmi_pending ? 2 : -1; > + else > + irq = bitarray_findhighest(&s->irq_pending); > Same here. Actually the comment is correct (even though it does not start with a /* on a line by itself). In general, looks very good. The interface is a bit over-rich (->pending, ->read_vector, and irq_sink) but that's not a showstopper. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel