Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Avi Kivity

Gregory Haskins wrote:

 


Sure, the interface remains the same (write 8 bytes), but the
implementation can change.  For example, we can implement it to work
from interrupt context, once we hack the locking appropriately.



I was thinking more along the lines of eventfd_signal().  AIO and vbus
currently use this interface, as opposed to the more polymorhpic
f_ops-write().

  


But eventfd_signal basically marries us to eventfd.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Avi Kivity

Gregory Haskins wrote:

 


But eventfd_signal basically marries us to eventfd.



Well, only if we expect the fd to have eventfd semantics.  There are
advantages to doing so, as we have discussed, because things like AIO
can polymorhpically signal an interrupt without even knowing whats
behind the eventfd.  But this isn't a strict requirement to support
AIO.  Really all we need is a way for both kernel and userspace to
signal.  Perhaps I should export an irqfd_signal() function from kvm,
which today will map to eventfd_signal(), and tomorrow to ??.  I don't
think using f_ops-write() is an option for in-kernel signaling, so we
need some kind of interface here.

Does that sound reasonable?
  


irqfd_signal() ties the user of irqfd to kvm.  I want this user to be 
independent of kvm; it should work with eventfd, kvm's eventfd lookalike 
(if we move away from eventfd) or pipes.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Gregory Haskins
Avi Kivity wrote:
 Gregory Haskins wrote:

  

 But eventfd_signal basically marries us to eventfd.
 

 Well, only if we expect the fd to have eventfd semantics.  There are
 advantages to doing so, as we have discussed, because things like AIO
 can polymorhpically signal an interrupt without even knowing whats
 behind the eventfd.  But this isn't a strict requirement to support
 AIO.  Really all we need is a way for both kernel and userspace to
 signal.  Perhaps I should export an irqfd_signal() function from kvm,
 which today will map to eventfd_signal(), and tomorrow to ??.  I don't
 think using f_ops-write() is an option for in-kernel signaling, so we
 need some kind of interface here.

 Does that sound reasonable?
   

 irqfd_signal() ties the user of irqfd to kvm.  I want this user to be
 independent of kvm; it should work with eventfd, kvm's eventfd
 lookalike (if we move away from eventfd) or pipes.

So what is your proposal for such interface?

-Greg




signature.asc
Description: OpenPGP digital signature


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Avi Kivity

Avi Kivity wrote:


So what is your proposal for such interface?

  


-write().



Alternatively, a new fileop -signal_event(), which would map to 
eventfd_signal() or irqfd_signal().  This would be defined to work in 
irq contexts.  It may also be useful for uio interrupts.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Gregory Haskins
Avi Kivity wrote:
 Avi Kivity wrote:

 So what is your proposal for such interface?

   

 -write().


 Alternatively, a new fileop -signal_event(), which would map to
 eventfd_signal() or irqfd_signal().  This would be defined to work in
 irq contexts.  It may also be useful for uio interrupts.

Hmm...I'm not crazy about either of those.  write() has obvious
limitations both from a interrupt execution context, as well as the
awkwardness of dealing with creating+passing a viable userspace
pointer from kernel code.  On the other hand, a new fileop doesn't quite
seem appropriate either since it doesn't apply to the overall fileop
abstraction very well.

We could potentially have a separate vtable interface just for
event-type fds, and make eventfd and irqfd the first implementations. 
But I am not sure it is worth it.  What I suggest is that we work within
the existing eventfd interface.  It was designed specifically to signal
events, after all.

If at some point in the future we need to ensure that the callbacks are
not invoked from a preempt-off/irq-off critical section, we can revist
the eventfd internals at that time.  Note that since we would like to
support signaling from interrupt context anyway, trying to get rid of
the wqh critical section that we have today may be a fools errand (*). 
Instead, we should probably focus on making the injection path support
non-preemptible contexts, as this will have the biggest benefits and
gains in the long run.

Thoughts?
-Greg

(*) The biggest benefit is that you can do tricks like
  if (preemptible()) do_it_now(); else
do_it_later();




signature.asc
Description: OpenPGP digital signature


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Avi Kivity

Gregory Haskins wrote:

Avi Kivity wrote:
  

Avi Kivity wrote:


So what is your proposal for such interface?

  


-write().

  

Alternatively, a new fileop -signal_event(), which would map to
eventfd_signal() or irqfd_signal().  This would be defined to work in
irq contexts.  It may also be useful for uio interrupts.



Hmm...I'm not crazy about either of those.  write() has obvious
limitations both from a interrupt execution context, as well as the
awkwardness of dealing with creating+passing a viable userspace
pointer from kernel code.  On the other hand, a new fileop doesn't quite
seem appropriate either since it doesn't apply to the overall fileop
abstraction very well.

We could potentially have a separate vtable interface just for
event-type fds, and make eventfd and irqfd the first implementations. 
But I am not sure it is worth it.  What I suggest is that we work within

the existing eventfd interface.  It was designed specifically to signal
events, after all.

If at some point in the future we need to ensure that the callbacks are
not invoked from a preempt-off/irq-off critical section, we can revist
the eventfd internals at that time.  Note that since we would like to
support signaling from interrupt context anyway, trying to get rid of
the wqh critical section that we have today may be a fools errand (*). 
Instead, we should probably focus on making the injection path support

non-preemptible contexts, as this will have the biggest benefits and
gains in the long run.

  

But again, you're forcing everyone who uses irqfd to require eventfd.

Maybe we should change eventfd_signal() to fall back to -write if the 
file happens not to be an eventfd.  It could also handle the 
nonpreemptible context as well.



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Gregory Haskins
Avi Kivity wrote:
 Gregory Haskins wrote:

  

 Sure, the interface remains the same (write 8 bytes), but the
 implementation can change.  For example, we can implement it to work
 from interrupt context, once we hack the locking appropriately.
 

 I was thinking more along the lines of eventfd_signal().  AIO and vbus
 currently use this interface, as opposed to the more polymorhpic
 f_ops-write().

   

 But eventfd_signal basically marries us to eventfd.

Well, only if we expect the fd to have eventfd semantics.  There are
advantages to doing so, as we have discussed, because things like AIO
can polymorhpically signal an interrupt without even knowing whats
behind the eventfd.  But this isn't a strict requirement to support
AIO.  Really all we need is a way for both kernel and userspace to
signal.  Perhaps I should export an irqfd_signal() function from kvm,
which today will map to eventfd_signal(), and tomorrow to ??.  I don't
think using f_ops-write() is an option for in-kernel signaling, so we
need some kind of interface here.

Does that sound reasonable?

-Greg




signature.asc
Description: OpenPGP digital signature


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-28 Thread Gregory Haskins
Avi Kivity wrote:
 Gregory Haskins wrote:
 Avi Kivity wrote:
  
 Avi Kivity wrote:

 So what is your proposal for such interface?

   
 -write().

   
 Alternatively, a new fileop -signal_event(), which would map to
 eventfd_signal() or irqfd_signal().  This would be defined to work in
 irq contexts.  It may also be useful for uio interrupts.

 
 Hmm...I'm not crazy about either of those.  write() has obvious
 limitations both from a interrupt execution context, as well as the
 awkwardness of dealing with creating+passing a viable userspace
 pointer from kernel code.  On the other hand, a new fileop doesn't quite
 seem appropriate either since it doesn't apply to the overall fileop
 abstraction very well.

 We could potentially have a separate vtable interface just for
 event-type fds, and make eventfd and irqfd the first implementations.
 But I am not sure it is worth it.  What I suggest is that we work within
 the existing eventfd interface.  It was designed specifically to signal
 events, after all.

 If at some point in the future we need to ensure that the callbacks are
 not invoked from a preempt-off/irq-off critical section, we can revist
 the eventfd internals at that time.  Note that since we would like to
 support signaling from interrupt context anyway, trying to get rid of
 the wqh critical section that we have today may be a fools errand
 (*). Instead, we should probably focus on making the injection path
 support
 non-preemptible contexts, as this will have the biggest benefits and
 gains in the long run.

   
 But again, you're forcing everyone who uses irqfd to require eventfd.

 Maybe we should change eventfd_signal() to fall back to -write if the
 file happens not to be an eventfd.  It could also handle the
 nonpreemptible context as well.


Yep, that works.  And the nice thing from my perspective is: we don't
need a v4 ;)

-Greg



signature.asc
Description: OpenPGP digital signature


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-27 Thread Avi Kivity

Gregory Haskins wrote:

This allows an eventfd to be registered as an irq source with a guest.  Any
signaling operation on the eventfd (via userspace or kernel) will inject
the registered GSI at the next available window.

 
+struct kvm_irqfd {

+   __u32 fd;
+   __u32 gsi;
+};
+
  


I think it's better to have ioctl create and return the fd.  This way we 
aren't tied to eventfd (though it makes a lot of sense to use it).


Also, please add a flags field and some padding so we can extend it later.


+
+#include linux/kvm_host.h
+#include linux/eventfd.h
+#include linux/workqueue.h
+#include linux/wait.h
+#include linux/poll.h
+#include linux/file.h
+#include linux/list.h
+
+struct _irqfd {
+   struct kvm   *kvm;
+   int   gsi;
+   struct file  *file;
+   struct list_head  list;
+   poll_tablept;
+   wait_queue_head_t*wqh;
+   wait_queue_t  wait;
+   struct work_structwork;
+};
+
+static void
+irqfd_inject(struct work_struct *work)
+{
+   struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
+   struct kvm *kvm = irqfd-kvm;
+
+   mutex_lock(kvm-lock);
+   kvm_set_irq(kvm, kvm-irqfd.src, irqfd-gsi, 1);
  


Need to lower the irq too (though irqfd only supports edge triggered 
interrupts).



+   mutex_unlock(kvm-lock);
+}
+
+static int
+irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
+{
+   struct _irqfd *irqfd = container_of(wait, struct _irqfd, wait);
+
+   /*
+* The eventfd calls its wake_up with interrupts disabled,
+* so we need to defer the IRQ injection until later since we need
+* to acquire the kvm-lock to do so.
+*/
+   schedule_work(irqfd-work);
+
+   return 0;
+}
  


One day we'll have lockless injection and we'll want to drop this.  I 
guess if we create the fd ourselves we can make it work, but I don't see 
how we can do this with eventfd.



+int
+kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
+{
+   struct _irqfd *irqfd;
+   struct file *file;
+   int ret;
+
+   irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+   if (!irqfd)
+   return -ENOMEM;
+
+   irqfd-kvm = kvm;
+   irqfd-gsi = gsi;
+   INIT_LIST_HEAD(irqfd-list);
+   init_waitqueue_func_entry(irqfd-wait, irqfd_wakeup);
+   init_poll_funcptr(irqfd-pt, irqfd_ptable_queue_proc);
+   INIT_WORK(irqfd-work, irqfd_inject);
+
+   file = eventfd_fget(fd);
+   if (IS_ERR(file)) {
+   ret = PTR_ERR(file);
+   goto fail;
+   }
+
+   ret = file-f_op-poll(file, irqfd-pt);
+   /* do we need to look for errors in ret? */
  


Do we?


+
+   irqfd-file = file;
+
+   mutex_lock(kvm-lock);
+   if (kvm-irqfd.src == -1) {
+   ret = kvm_request_irq_source_id(kvm);
+   BUG_ON(ret  0);
  


I think you can reuse the userspace irq source (since it's just another 
way for userspace to inject an interrupt).  It isn't really needed since 
the irq source stuff is only needed to support level triggered interrupts.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-27 Thread Gregory Haskins
Gregory Haskins wrote:
 Avi Kivity wrote:
   

 One day we'll have lockless injection and we'll want to drop this.  I
 guess if we create the fd ourselves we can make it work, but I don't
 see how we can do this with eventfd.

 

 Hmm...this is a good point.  There probably is no way to use eventfd
 off the shelf in a way that doesn't cause this callback to be in a
 critical section.  Should we just worry about switching away from
 eventfd when this occurs, or should I implement a custom anon-fd now?
   

Something else to consider:  eventfd supports calling eventfd_signal()
from interrupt context, so
even if the callback were not invoked from a preempt/irq off CS due to
the wqh-lock, the context may still be a CS anyway.  Now, userspace and
vbus based injection do not need to worry about this, but I wonder if
this is a  desirable attribute for some other source (device-assignment
perhaps)?

If so, is there a way to design the lockless-injection code such that
its still friendly to irq-context, or is this a mode that we would never
want to support anyway?  I think this would be a good thing to have at
least to maintain compatibility with the existing eventfd interface,
which has its own advantages.

-Greg




signature.asc
Description: OpenPGP digital signature


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-27 Thread Avi Kivity

Gregory Haskins wrote:

Something else to consider:  eventfd supports calling eventfd_signal()
from interrupt context, so
even if the callback were not invoked from a preempt/irq off CS due to
the wqh-lock, the context may still be a CS anyway.  Now, userspace and
vbus based injection do not need to worry about this, but I wonder if
this is a  desirable attribute for some other source (device-assignment
perhaps)?
  


I'm no networking expert, but device assignment certainly wants to queue 
an interrupt from an interrupt (basically it forwards the interrupt from 
host to guest).  Block is also simple enough to trigger the interrupt 
from the completion context.


For networking, we'd eventually want to do the host-guest copy using a 
dma engine, and we'd want to inject the interrupt from the dma engine's 
completion handler.




If so, is there a way to design the lockless-injection code such that
its still friendly to irq-context, or is this a mode that we would never
want to support anyway?  I think this would be a good thing to have at
least to maintain compatibility with the existing eventfd interface,
which has its own advantages.
  


This has RCU painted all over it in a 1200-point font.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-27 Thread Gregory Haskins
Avi Kivity wrote:
 Gregory Haskins wrote:
 This allows an eventfd to be registered as an irq source with a
 guest.  Any
 signaling operation on the eventfd (via userspace or kernel) will
 inject
 the registered GSI at the next available window.

  
 +struct kvm_irqfd {
 +__u32 fd;
 +__u32 gsi;
 +};
 +
 
 I think it's better to have ioctl create and return the fd.  This way
 we aren't tied to eventfd (though it makes a lot of sense to use it).
 

 I dont mind either way, but I am not sure it buys us much as the one
 driving the fd would need to understand if the interface is
 eventfd-esque or something else anyway.  Let me know if you still want
 to see this changed.
   

 Sure, the interface remains the same (write 8 bytes), but the
 implementation can change.  For example, we can implement it to work
 from interrupt context, once we hack the locking appropriately.

I was thinking more along the lines of eventfd_signal().  AIO and vbus
currently use this interface, as opposed to the more polymorhpic
f_ops-write().



 +static void
 +irqfd_inject(struct work_struct *work)
 +{
 +struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
 +struct kvm *kvm = irqfd-kvm;
 +
 +mutex_lock(kvm-lock);
 +kvm_set_irq(kvm, kvm-irqfd.src, irqfd-gsi, 1);
 
 Need to lower the irq too (though irqfd only supports edge triggered
 interrupts).

 
 Should I just do back-to-back 1+0 inside the same lock?

   

 Yes.  Might be nice to add a kvm_toggle_irq(), but let's leave that
 until later.

Ok.



  

 One day we'll have lockless injection and we'll want to drop this.  I
 guess if we create the fd ourselves we can make it work, but I don't
 see how we can do this with eventfd.

 

 Hmm...this is a good point.  There probably is no way to use eventfd
 off the shelf in a way that doesn't cause this callback to be in a
 critical section.  Should we just worry about switching away from
 eventfd when this occurs, or should I implement a custom anon-fd now?
   

 I'd just go with eventfd, and switch when it becomes relevant.  As
 long as the kernel allocates the fd, we're free to do as we like.

Sounds good.

-Greg




signature.asc
Description: OpenPGP digital signature


Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-24 Thread Gregory Haskins
Hi David,
   See inline for an answer to your last question:

Gregory Haskins wrote:
 This allows an eventfd to be registered as an irq source with a guest.  Any
 signaling operation on the eventfd (via userspace or kernel) will inject
 the registered GSI at the next available window.

 Signed-off-by: Gregory Haskins ghask...@novell.com
 ---

  arch/x86/kvm/Makefile|2 -
  arch/x86/kvm/x86.c   |1 
  include/linux/kvm.h  |7 ++
  include/linux/kvm_host.h |7 ++
  virt/kvm/irqfd.c |  146 
 ++
  virt/kvm/kvm_main.c  |   12 
  6 files changed, 174 insertions(+), 1 deletions(-)
  create mode 100644 virt/kvm/irqfd.c

 diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
 index b43c4ef..d5fff51 100644
 --- a/arch/x86/kvm/Makefile
 +++ b/arch/x86/kvm/Makefile
 @@ -3,7 +3,7 @@
  #
  
  common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 -coalesced_mmio.o irq_comm.o)
 +coalesced_mmio.o irq_comm.o irqfd.o)
  ifeq ($(CONFIG_KVM_TRACE),y)
  common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
  endif
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 8cb8542..88c45e6 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -1027,6 +1027,7 @@ int kvm_dev_ioctl_check_extension(long ext)
   case KVM_CAP_REINJECT_CONTROL:
   case KVM_CAP_IRQ_INJECT_STATUS:
   case KVM_CAP_ASSIGN_DEV_IRQ:
 + case KVM_CAP_IRQFD:
   r = 1;
   break;
   case KVM_CAP_COALESCED_MMIO:
 diff --git a/include/linux/kvm.h b/include/linux/kvm.h
 index 3db5d8d..2404775 100644
 --- a/include/linux/kvm.h
 +++ b/include/linux/kvm.h
 @@ -415,6 +415,7 @@ struct kvm_trace_rec {
  #define KVM_CAP_ASSIGN_DEV_IRQ 29
  /* Another bug in KVM_SET_USER_MEMORY_REGION fixed: */
  #define KVM_CAP_JOIN_MEMORY_REGIONS_WORKS 30
 +#define KVM_CAP_IRQFD 31
  
  #ifdef KVM_CAP_IRQ_ROUTING
  
 @@ -454,6 +455,11 @@ struct kvm_irq_routing {
  
  #endif
  
 +struct kvm_irqfd {
 + __u32 fd;
 + __u32 gsi;
 +};
 +
  /*
   * ioctls for VM fds
   */
 @@ -498,6 +504,7 @@ struct kvm_irq_routing {
  #define KVM_ASSIGN_SET_MSIX_ENTRY \
   _IOW(KVMIO, 0x74, struct kvm_assigned_msix_entry)
  #define KVM_DEASSIGN_DEV_IRQ   _IOW(KVMIO, 0x75, struct kvm_assigned_irq)
 +#define KVM_ASSIGN_IRQFD  _IOW(KVMIO, 0x76, struct kvm_irqfd)
  
  /*
   * ioctls for vcpu fds
 diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
 index 095ebb6..57f8dfe 100644
 --- a/include/linux/kvm_host.h
 +++ b/include/linux/kvm_host.h
 @@ -134,6 +134,10 @@ struct kvm {
   struct list_head vm_list;
   struct kvm_io_bus mmio_bus;
   struct kvm_io_bus pio_bus;
 + struct {
 + struct list_head items;
 + int  src;
 + } irqfd;
   struct kvm_vm_stat stat;
   struct kvm_arch arch;
   atomic_t users_count;
 @@ -524,4 +528,7 @@ static inline void kvm_free_irq_routing(struct kvm *kvm) 
 {}
  
  #endif
  
 +int kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi);
 +void kvm_irqfd_release(struct kvm *kvm);
 +
  #endif
 diff --git a/virt/kvm/irqfd.c b/virt/kvm/irqfd.c
 new file mode 100644
 index 000..d2e0915
 --- /dev/null
 +++ b/virt/kvm/irqfd.c
 @@ -0,0 +1,146 @@
 +/*
 + * irqfd: Allows an eventfd to be used to inject an interrupt to the guest
 + *
 + * Credit goes to Avi Kivity for the original idea.
 + *
 + * Copyright 2009 Novell.  All Rights Reserved.
 + *
 + * Author:
 + *   Gregory Haskins ghask...@novell.com
 + *
 + * This file is free software; you can redistribute it and/or modify
 + * it under the terms of version 2 of the GNU General Public License
 + * as published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.   See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License
 + * along with this program; if not, write to the Free Software Foundation,
 + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
 + */
 +
 +#include linux/kvm_host.h
 +#include linux/eventfd.h
 +#include linux/workqueue.h
 +#include linux/wait.h
 +#include linux/poll.h
 +#include linux/file.h
 +#include linux/list.h
 +
 +struct _irqfd {
 + struct kvm   *kvm;
 + int   gsi;
 + struct file  *file;
 + struct list_head  list;
 + poll_tablept;
 + wait_queue_head_t*wqh;
 + wait_queue_t  wait;
 + struct work_structwork;
 +};
 +
 +static void
 +irqfd_inject(struct work_struct *work)
 +{
 + struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
 + struct kvm *kvm = irqfd-kvm;
 +
 + mutex_lock(kvm-lock);
 + 

Re: [KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-24 Thread Davide Libenzi
On Fri, 24 Apr 2009, Gregory Haskins wrote:

 Hi David,
See inline for an answer to your last question:

Ok, so basically the eventfd lifetime is embedded inside the irqfd 
lifetime. Looks OK to me.



- Davide


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM PATCH v2 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-04-23 Thread Gregory Haskins
This allows an eventfd to be registered as an irq source with a guest.  Any
signaling operation on the eventfd (via userspace or kernel) will inject
the registered GSI at the next available window.

Signed-off-by: Gregory Haskins ghask...@novell.com
---

 arch/x86/kvm/Makefile|2 -
 arch/x86/kvm/x86.c   |1 
 include/linux/kvm.h  |7 ++
 include/linux/kvm_host.h |7 ++
 virt/kvm/irqfd.c |  146 ++
 virt/kvm/kvm_main.c  |   12 
 6 files changed, 174 insertions(+), 1 deletions(-)
 create mode 100644 virt/kvm/irqfd.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index b43c4ef..d5fff51 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -3,7 +3,7 @@
 #
 
 common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
-coalesced_mmio.o irq_comm.o)
+coalesced_mmio.o irq_comm.o irqfd.o)
 ifeq ($(CONFIG_KVM_TRACE),y)
 common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
 endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8cb8542..88c45e6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1027,6 +1027,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_REINJECT_CONTROL:
case KVM_CAP_IRQ_INJECT_STATUS:
case KVM_CAP_ASSIGN_DEV_IRQ:
+   case KVM_CAP_IRQFD:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 3db5d8d..2404775 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -415,6 +415,7 @@ struct kvm_trace_rec {
 #define KVM_CAP_ASSIGN_DEV_IRQ 29
 /* Another bug in KVM_SET_USER_MEMORY_REGION fixed: */
 #define KVM_CAP_JOIN_MEMORY_REGIONS_WORKS 30
+#define KVM_CAP_IRQFD 31
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -454,6 +455,11 @@ struct kvm_irq_routing {
 
 #endif
 
+struct kvm_irqfd {
+   __u32 fd;
+   __u32 gsi;
+};
+
 /*
  * ioctls for VM fds
  */
@@ -498,6 +504,7 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_SET_MSIX_ENTRY \
_IOW(KVMIO, 0x74, struct kvm_assigned_msix_entry)
 #define KVM_DEASSIGN_DEV_IRQ   _IOW(KVMIO, 0x75, struct kvm_assigned_irq)
+#define KVM_ASSIGN_IRQFD  _IOW(KVMIO, 0x76, struct kvm_irqfd)
 
 /*
  * ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 095ebb6..57f8dfe 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -134,6 +134,10 @@ struct kvm {
struct list_head vm_list;
struct kvm_io_bus mmio_bus;
struct kvm_io_bus pio_bus;
+   struct {
+   struct list_head items;
+   int  src;
+   } irqfd;
struct kvm_vm_stat stat;
struct kvm_arch arch;
atomic_t users_count;
@@ -524,4 +528,7 @@ static inline void kvm_free_irq_routing(struct kvm *kvm) {}
 
 #endif
 
+int kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi);
+void kvm_irqfd_release(struct kvm *kvm);
+
 #endif
diff --git a/virt/kvm/irqfd.c b/virt/kvm/irqfd.c
new file mode 100644
index 000..d2e0915
--- /dev/null
+++ b/virt/kvm/irqfd.c
@@ -0,0 +1,146 @@
+/*
+ * irqfd: Allows an eventfd to be used to inject an interrupt to the guest
+ *
+ * Credit goes to Avi Kivity for the original idea.
+ *
+ * Copyright 2009 Novell.  All Rights Reserved.
+ *
+ * Author:
+ * Gregory Haskins ghask...@novell.com
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include linux/kvm_host.h
+#include linux/eventfd.h
+#include linux/workqueue.h
+#include linux/wait.h
+#include linux/poll.h
+#include linux/file.h
+#include linux/list.h
+
+struct _irqfd {
+   struct kvm   *kvm;
+   int   gsi;
+   struct file  *file;
+   struct list_head  list;
+   poll_tablept;
+   wait_queue_head_t*wqh;
+   wait_queue_t  wait;
+   struct work_structwork;
+};
+
+static void
+irqfd_inject(struct work_struct *work)
+{
+   struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
+   struct kvm *kvm = irqfd-kvm;
+
+   mutex_lock(kvm-lock);
+   kvm_set_irq(kvm, kvm-irqfd.src, irqfd-gsi, 1);
+   mutex_unlock(kvm-lock);
+}
+
+static int
+irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
+{
+