Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove

2010-01-21 Thread Davide Libenzi
On Thu, 21 Jan 2010, Michael S. Tsirkin wrote: This is a backport of commit: 03db343a6320f780937078433fa7d8da955e6fce modified in a way that introduces some code duplication on the one hand, but reduces the risk of regressing existing eventfd users on the other hand. KVM needs a wait to

Re: [PATCHv2 1/3] eventfd: allow atomic read and waitqueue remove

2010-01-21 Thread Davide Libenzi
On Thu, 21 Jan 2010, Avi Kivity wrote: On 01/21/2010 07:32 PM, Michael S. Tsirkin wrote: Can't you read from the file? IMO no, the read could block. But you're in process context. An eventfd never blocks. Can you control the eventfd flags? Because if yes, O_NONBLOCK

Re: [PATCH 0/2] kvm: fix spurious interrupt with irqfd

2010-01-20 Thread Davide Libenzi
On Wed, 20 Jan 2010, Avi Kivity wrote: I'm uncomfortable with pushing Davide's patch into stable and possibly causing regressions with unrelated applications. Can't a guest live with a spurious interrupt? It's not like they're unknown. Better material for .33 instead of stable, I agree. -

[patch] eventfd - allow atomic read and waitqueue remove

2010-01-13 Thread Davide Libenzi
, in order to keep dependencies localized. So you just ignore this. Avi, this fixes a few checkpatch warnings, so you should get this instead of the one posted previously. Signed-off-by: Davide Libenzi davi...@xmailserver.org - Davide --- fs/eventfd.c| 89

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-11 Thread Davide Libenzi
On Mon, 11 Jan 2010, Michael S. Tsirkin wrote: Yes, I think this will work. Will test and report. Thanks! If you ever happen to find a solution in KVM that does not require eventfd changes, that'd be even better :) Otherwise ping me when you tested, that I'll puch this to Andrew. - Davide

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-11 Thread Davide Libenzi
On Mon, 11 Jan 2010, Michael S. Tsirkin wrote: On Mon, Jan 11, 2010 at 11:14:14AM -0800, Davide Libenzi wrote: On Mon, 11 Jan 2010, Michael S. Tsirkin wrote: Yes, I think this will work. Will test and report. Thanks! If you ever happen to find a solution in KVM that does not require

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-10 Thread Davide Libenzi
On Sun, 10 Jan 2010, Michael S. Tsirkin wrote: What is you do (under proper irqfd locking) something like: eventfd_ctx_read(ctx, 1, cnt); if (irqfd-cnt != cnt) { irqfd-cnt = cnt; schedule_work(irqfd-inject); } And deactivation deep

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-10 Thread Davide Libenzi
On Sun, 10 Jan 2010, Michael S. Tsirkin wrote: Not sure what you mean by irqfd locking. IMO we must take waitqueue lock, otherwise we can not prevent wait queue callback from being invoked, right? Note that if we take our own lock under wait queue callback, we won't be able to use this lock

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-10 Thread Davide Libenzi
On Sun, 10 Jan 2010, Michael S. Tsirkin wrote: On Sun, Jan 10, 2010 at 09:27:34AM -0800, Davide Libenzi wrote: On Sun, 10 Jan 2010, Michael S. Tsirkin wrote: Not sure what you mean by irqfd locking. IMO we must take waitqueue lock, otherwise we can not prevent wait queue callback

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-07 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote: Sure, I was trying to be as brief as possible, here's a detailed summary. Description of the system (MSI emulation in KVM): KVM supports an ioctl to assign/deassign an eventfd file to interrupt message in guest OS. When this eventfd is

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-07 Thread Davide Libenzi
On Thu, 7 Jan 2010, Davide Libenzi wrote: When you unmask, shouldn't the fd be in a clear state anyway? Forget that. If it's used for IRQs you likely need to have a current state when you unmask. - Davide -- To unsubscribe from this list: send the line unsubscribe kvm in the body

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-07 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote: Sure, I was trying to be as brief as possible, here's a detailed summary. Description of the system (MSI emulation in KVM): KVM supports an ioctl to assign/deassign an eventfd file to interrupt message in guest OS. When this eventfd is

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Wed, 6 Jan 2010, Michael S. Tsirkin wrote: On Tue, Sep 01, 2009 at 07:24:24AM -0700, Davide Libenzi wrote: On Tue, 1 Sep 2009, Avi Kivity wrote: On 09/01/2009 02:45 AM, Davide Libenzi wrote: On Thu, 27 Aug 2009, Davide Libenzi wrote: On Thu, 27 Aug 2009, Michael

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Wed, 6 Jan 2010, Michael S. Tsirkin wrote: I tried to explain, no? That patch was taking wait queue spinlock and was assuming that eventfd_read_ctx is called from a task that can block. KVM attaches its own poller so this is not a good fit for us. Hope this clarifies. and yet again

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote: On top of creating an interface which requires a lock, which noone can get from the interface itself, since it's not exposed. I think here's how KVM gets it: the way it does is by calling poll with our own poll table, then in poll_queue_proc

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote: On Wed, Jan 06, 2010 at 02:46:06PM -0800, Davide Libenzi wrote: On Thu, 7 Jan 2010, Michael S. Tsirkin wrote: On top of creating an interface which requires a lock, which noone can get from the interface itself, since it's

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2010-01-06 Thread Davide Libenzi
On Thu, 7 Jan 2010, Michael S. Tsirkin wrote: OK. What I think we need is a way to remove ourselves from the eventfd wait queue and clear the counter atomically. We currently do remove_wait_queue(irqfd-wqh, irqfd-wait); where wqh saves the eventfd wait queue head. You do a

Re: [GIT PULL] AlacrityVM guest drivers for 2.6.33

2009-12-22 Thread Davide Libenzi
On Tue, 22 Dec 2009, Gregory Haskins wrote: On 12/22/09 2:39 PM, Davide Libenzi wrote: On Tue, 22 Dec 2009, Gregory Haskins wrote: On 12/22/09 1:53 PM, Avi Kivity wrote: I asked why the irqfd/ioeventfd mechanisms are insufficient, and you did not reply. BTW: the ioeventfd issue

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Michael S. Tsirkin wrote: On Thu, Aug 27, 2009 at 07:13:32AM +0300, Avi Kivity wrote: On 08/27/2009 02:30 AM, Davide Libenzi wrote: On Wed, 26 Aug 2009, Davide Libenzi wrote: I see no kernel equivalent to read(), but that's easily done. Adding

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Paolo Bonzini wrote: Ok, so why not using the eventfd counter as state? On the device side: void write_state(int sfd, int state) { u64 cnt; /* Clear the current state, sfd is in non-blocking mode */ read(sfd,cnt, sizeof(cnt)); /* Writes new

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Michael S. Tsirkin wrote: On Thu, Aug 27, 2009 at 07:21:49AM -0700, Davide Libenzi wrote: On Thu, 27 Aug 2009, Paolo Bonzini wrote: Ok, so why not using the eventfd counter as state? On the device side: void write_state(int sfd, int state

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Michael S. Tsirkin wrote: Oh, I stopped pushing EFD_STATE since we have a solution. I am just trying to grok what does and what does not consititute a use-once addition, in your mind, and what does and what does not belong in eventfd. The reason atomic does not belong

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Michael S. Tsirkin wrote: Oh, I stopped pushing EFD_STATE since we have a solution. Do you guys need the kernel-side eventfd_ctx_read() I posted or not? Because if nobody uses it, I'm not going to push it. - Davide -- To unsubscribe from this list: send the line

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-26 Thread Davide Libenzi
On Wed, 26 Aug 2009, Michael S. Tsirkin wrote: On Tue, Aug 25, 2009 at 02:57:01PM -0700, Davide Libenzi wrote: On Tue, 25 Aug 2009, Michael S. Tsirkin wrote: Yes, we don't want that. The best thing is to try to restate the problem in a way that is generic, and then either solve

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-26 Thread Davide Libenzi
On Wed, 26 Aug 2009, Avi Kivity wrote: On 08/26/2009 08:45 PM, Davide Libenzi wrote: OK, if I get it correctly, there is one eventfd signaler (the device), and one eventfd reader (the hypervisor), right? Each hypervisor listens for multiple devices detecting state changes

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-26 Thread Davide Libenzi
On Wed, 26 Aug 2009, Avi Kivity wrote: On 08/26/2009 10:13 PM, Davide Libenzi wrote: Ok, so why not using the eventfd counter as state? On the device side: void write_state(int sfd, int state) { u64 cnt; /* Clear the current state, sfd is in non-blocking mode

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-26 Thread Davide Libenzi
On Wed, 26 Aug 2009, Gleb Natapov wrote: On Wed, Aug 26, 2009 at 10:42:05PM +0300, Avi Kivity wrote: On 08/26/2009 10:13 PM, Davide Libenzi wrote: Ok, so why not using the eventfd counter as state? On the device side: void write_state(int sfd, int state) { u64 cnt

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-26 Thread Davide Libenzi
On Wed, 26 Aug 2009, Davide Libenzi wrote: I see no kernel equivalent to read(), but that's easily done. Adding an in-kernel read based on ctx, that is no problem at all. Something like the untested below. I had thought you said the eventfd readers where in userspace, but I might have

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-25 Thread Davide Libenzi
On Tue, 25 Aug 2009, Michael S. Tsirkin wrote: Yes, we don't want that. The best thing is to try to restate the problem in a way that is generic, and then either solve or best use existing solution. Right? I thought I had that, but apparently not. The reason I'm Cc-ing you is not to try

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-24 Thread Davide Libenzi
On Sun, 23 Aug 2009, Michael S. Tsirkin wrote: On Sun, Aug 23, 2009 at 04:40:51PM +0300, Avi Kivity wrote: On 08/23/2009 04:36 PM, Michael S. Tsirkin wrote: More important here is realization that eventfd is a mutex/semaphore implementation, not a generic event reporting interface as we

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-24 Thread Davide Libenzi
On Mon, 24 Aug 2009, Avi Kivity wrote: On 08/24/2009 09:25 PM, Davide Libenzi wrote: Indeed, the default eventfd behaviour is like, well, an event. Signaling (kernel side) or writing (userspace side), signals the event. Waiting (reading) it, will reset the event. If you use EFD_SEMAPHORE

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-24 Thread Davide Libenzi
On Tue, 25 Aug 2009, Michael S. Tsirkin wrote: On Mon, Aug 24, 2009 at 11:25:01AM -0700, Davide Libenzi wrote: On Sun, 23 Aug 2009, Michael S. Tsirkin wrote: On Sun, Aug 23, 2009 at 04:40:51PM +0300, Avi Kivity wrote: On 08/23/2009 04:36 PM, Michael S. Tsirkin wrote: More

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-24 Thread Davide Libenzi
On Tue, 25 Aug 2009, Paolo Bonzini wrote: There are userspace libraries that do almost everything, but you hardly see things like pthread_(EFD_STATE-like)_create() or similar system interfaces based on such abstraction. It actually seems as close to a condition variable as an eventfd

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-21 Thread Davide Libenzi
On Thu, 20 Aug 2009, Paolo Bonzini wrote: On 08/20/2009 07:44 PM, Davide Libenzi wrote: On Thu, 20 Aug 2009, Avi Kivity wrote: On 08/20/2009 07:20 PM, Davide Libenzi wrote: I briefly looked at this while in vacation, although I did not reply hoping the horrible feeling about

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-20 Thread Davide Libenzi
On Thu, 20 Aug 2009, Michael S. Tsirkin wrote: Davide, Looks like I got an ack from Avi and no comments from others, so the following patchset is identical to the last RFC. Can it be merged for 2.6.32? Thanks! I briefly looked at this while in vacation, although I did not reply hoping the

Re: [PATCH 0/2] eventfd: new EFD_STATE flag

2009-08-20 Thread Davide Libenzi
On Thu, 20 Aug 2009, Avi Kivity wrote: On 08/20/2009 07:20 PM, Davide Libenzi wrote: I briefly looked at this while in vacation, although I did not reply hoping the horrible feeling about this code would go away. It didn't. I find this to be an ugly and ad-hoc multiplexing of eventfd

Re: [KVM PATCH v9 2/5] eventfd: use locked POLLHUP

2009-07-02 Thread Davide Libenzi
it is not possible to use this feature of eventfd in a race-free way. This patch changes the POLLHUP code to use the locked variant to rectify this problem. Signed-off-by: Gregory Haskins ghask...@novell.com CC: Davide Libenzi davi...@xmailserver.org --- fs/eventfd.c |7 +-- 1 files

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-25 Thread Davide Libenzi
On Thu, 25 Jun 2009, Rusty Russell wrote: On Thu, 25 Jun 2009 08:15:11 am Davide Libenzi wrote: Some components would like to know if userspace dropped the fd, and take proper action accordingly (release resources, drop module instances, etc...). Like to know? Possibly. Need to know

Re: [KVM PATCH v5 0/4] irqfd fixes and enhancements

2009-06-25 Thread Davide Libenzi
On Thu, 25 Jun 2009, Gregory Haskins wrote: So I know we talked yesterday in the review session about whether it was actually worth all this complexity to deal with the POLLHUP or if we should just revert to the prior two syscall model and be done with it. Rusty reflected these same

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-24 Thread Davide Libenzi
On Wed, 24 Jun 2009, Rusty Russell wrote: On Tue, 23 Jun 2009 03:33:22 am Davide Libenzi wrote: What you're doing there, is setting up a kernel-to-kernel (since userspace only role is to create the eventfd) communication, using a file* as accessory. That IMO is plain wrong. The most

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: Maybe there is such a reason, and it hasn't yet been beaten into my skull. But I still think it should be done in a separate patch. One which comes with a full description of the reasons for the change, btw. Since your skull seems pretty hard to

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Davide Libenzi
On Wed, 24 Jun 2009, Andrew Morton wrote: Split what? My skull? Heh :) umm, yes please, I believe the patches should be split. And I'm still not seeing the justification for forcing CONFIG_EVENTFD onto all CONFIG_AIO users! Eventfd notifications became part of the AIO API (it's not even

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-23 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: To be honest, I am not sure. I would guess its not a *huge* deal, though it was obviously enough of a concern to at least discuss it. I can definitely say that I think the other issues which are being fixed are substantially more important. Ok

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Rusty Russell wrote: The first 'struct eventfd_ctx;' line is not required. Will repost dropping that. - Davide -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Mon, 22 Jun 2009, Gregory Haskins wrote: To be honest, I am not sure. I would guess its not a *huge* deal, though it was obviously enough of a concern to at least discuss it. I can definitely say that I think

[patch] eventfd - revised interface and cleanups

2009-06-23 Thread Davide Libenzi
. Signed-off-by: Davide Libenzi davi...@xmailserver.org - Davide --- drivers/lguest/lg.h |2 drivers/lguest/lguest_user.c |4 - fs/aio.c | 24 ++ fs/eventfd.c | 101 ++- include/linux/aio.h

Re: [patch] eventfd - revised interface and cleanups

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Avi Kivity wrote: On 06/23/2009 07:47 PM, Davide Libenzi wrote: The following patch changes the eventfd interface to de-couple the eventfd memory context, from the file pointer instance. Without such change, there is no clean way to racely free handle the POLLHUP

Re: [patch] eventfd - revised interface and cleanups

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Randy Dunlap wrote: kernel-doc syntax requires the function name + short description on one line, followed by parameters. Any longer function description then comes after the parameters. See Documentation/kernel-doc-nano-HOWTO.txt for more info, or ask me. Will fix,

Re: [patch] eventfd - revised interface and cleanups

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Gregory Haskins wrote: I think the lack of a way to got from a file* to a ctx* (or back) might be a problem. I am not an expert, but if I understood the gist of what Al Viro (cc'd) was saying early on, an fd can change meaning out from under you without warning. With

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: On Tue, 23 Jun 2009 12:25:36 -0700 (PDT) Davide Libenzi davi...@xmailserver.org wrote: The following patch changes the eventfd interface to de-couple the eventfd memory context, from the file pointer instance. Without such change

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: On Tue, 23 Jun 2009 12:25:36 -0700 (PDT) Davide Libenzi davi...@xmailserver.org wrote: Another cleanup this patch does, is making AIO select EVENTFD, instead of adding a bunch of empty function stubs inside eventfd.h in order to handle

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: On Tue, 23 Jun 2009 12:25:36 -0700 (PDT) Davide Libenzi davi...@xmailserver.org wrote: The following patch changes the eventfd interface to de-couple the eventfd memory context, from the file pointer instance. Without such change

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: That isn't for us to decide. Entire syscalls can be disabled in config. That is not a well defined separate syscall though. It's a member/feature of the aiocb. - Davide -- To unsubscribe from this list: send the line unsubscribe kvm in the body of

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: +struct eventfd_ctx *eventfd_ctx_get(struct eventfd_ctx *ctx) +{ + kref_get(ctx-kref); + return ctx; +} +EXPORT_SYMBOL_GPL(eventfd_ctx_get); doesn't match the code. You appear to have not seen the above

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: Should functions be describing all the returned error codes, ala man pages? I think so. This becomes pretty painful when the function calls other functions, for which just relays the error code. Should we be just documenting the error codes

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: On Tue, 23 Jun 2009 14:34:50 -0700 (PDT) Davide Libenzi davi...@xmailserver.org wrote: On Tue, 23 Jun 2009, Andrew Morton wrote: Should functions be describing all the returned error codes, ala man pages? I think so

[patch] eventfd - revised interface and cleanups (3rd rev)

2009-06-23 Thread Davide Libenzi
are using the eventfd context instead of the file*. Another cleanup this patch does, is making AIO select EVENTFD, instead of adding a bunch of empty function stubs inside eventfd.h in order to handle the (AIO !EVENTFD) case. Signed-off-by: Davide Libenzi davi...@xmailserver.org - Davide

Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-23 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote: On Tue, 23 Jun 2009 14:25:05 -0700 (PDT) Davide Libenzi davi...@xmailserver.org wrote: On Tue, 23 Jun 2009, Andrew Morton wrote: That isn't for us to decide. Entire syscalls can be disabled in config. That is not a well defined separate

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Sun, 21 Jun 2009, Gregory Haskins wrote: This looks great, Davide. I am fairly certain I can now solve the races and even implement Michael's DEASSIGN feature with this patch in place. I will actually fire

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: I am probably confused or perhaps have the wrong terminology, but isnt that ok. I am concerned about the consumer (the guy getting the POLLINs) to be able to detect POLLHUP when the last producer (f_ops-write() from userspace, eventfd_signal() from

Re: [KVM PATCH v3 3/3] KVM: Fix races in irqfd using new eventfd_kref_get interface

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Michael S. Tsirkin wrote: Not really, it's impossible to document all races one have thought about and avoided. Well, when some new code has non-trivial locking/racing logics, you better document it as clearly as possible, akpm announced time ago. - Davide -- To

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: The general thesis is for decoupling of the two subsystems. In order to do this, you need some form of polymorphism and an intermediate handle mechanism which is userspace friendly. File-descriptors already fit this role neatly, with the int fd

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Michael S. Tsirkin wrote: On Mon, Jun 22, 2009 at 11:03:22AM -0700, Davide Libenzi wrote: In your case of kernel-to-kernel scenario, why would you need eventfd at all, if userspace role in that model is simply to create it? That's not 100% true. We have a mode where

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: Michael S. Tsirkin wrote: On Mon, Jun 22, 2009 at 11:51:42AM -0700, Davide Libenzi wrote: A file* based kernel-to-kernel interface is rather wrong IMO. But eventfd_ctx should work fine. Yeah, and I guess we can always just

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Mon, 22 Jun 2009, Michael S. Tsirkin wrote: On Mon, Jun 22, 2009 at 11:03:22AM -0700, Davide Libenzi wrote: In your case of kernel-to-kernel scenario, why would you need eventfd at all, if userspace role

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: So up in userspace, the vbus pci-device would have an open reference to the kvm guest (derived from /dev/kvm) and an open reference to a vbus (derived from /dev/vbus). Lets call these kvmfd, and vbusfd, respectively. For something like an

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-22 Thread Davide Libenzi
On Mon, 22 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Mon, 22 Jun 2009, Gregory Haskins wrote: So up in userspace, the vbus pci-device would have an open reference to the kvm guest (derived from /dev/kvm) and an open reference to a vbus (derived from /dev/vbus

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-21 Thread Davide Libenzi
On Sat, 20 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Sat, 20 Jun 2009, Davide Libenzi wrote: On Sat, 20 Jun 2009, Davide Libenzi wrote: How about the one below? Maybe with an interface that can be undone w/out a file* :) This is another

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-21 Thread Davide Libenzi
On Sun, 21 Jun 2009, Gregory Haskins wrote: This looks great, Davide. I am fairly certain I can now solve the races and even implement Michael's DEASSIGN feature with this patch in place. I will actually fire it up tomorrow when I am back in the office and give it a spin, but I do not spy

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-20 Thread Davide Libenzi
On Fri, 19 Jun 2009, Gregory Haskins wrote: In the POLLIN event, you schedule_work(irqfd-inject) and there are no races there AFAICS (you basically do not care of anything eventfd memory related at all). For POLLHUP, you do: spin_lock(irqfd-slock); if (irqfd-wqh)

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-20 Thread Davide Libenzi
On Sat, 20 Jun 2009, Davide Libenzi wrote: How about the one below? Maybe with an interface that can be undone w/out a file* :) - Davide --- fs/eventfd.c| 34 +- include/linux/eventfd.h |8 2 files changed, 41 insertions(+), 1

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-20 Thread Davide Libenzi
On Sat, 20 Jun 2009, Davide Libenzi wrote: On Sat, 20 Jun 2009, Davide Libenzi wrote: How about the one below? Maybe with an interface that can be undone w/out a file* :) This is another alternative, based on a low-carb diet of your notifier patch. Same concept of de-coupling VFS

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-19 Thread Davide Libenzi
On Fri, 19 Jun 2009, Gregory Haskins wrote: eventfd currently emits a POLLHUP wakeup on f_ops-release() to generate a notifier-release() callback. This lets notification clients know if the eventfd is about to go away and is very useful particularly for in-kernel clients. However, as it

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-19 Thread Davide Libenzi
On Fri, 19 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Fri, 19 Jun 2009, Gregory Haskins wrote: eventfd currently emits a POLLHUP wakeup on f_ops-release() to generate a notifier-release() callback. This lets notification clients know if the eventfd is about to go

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-19 Thread Davide Libenzi
On Fri, 19 Jun 2009, Gregory Haskins wrote: I am fairly confident it is not that simple after having thought about this issue over the last few days. But I've been wrong in the past. Propose a patch and I will review it for races/correctness, if you like. Perhaps a combination of that plus

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-19 Thread Davide Libenzi
On Fri, 19 Jun 2009, Davide Libenzi wrote: On Fri, 19 Jun 2009, Gregory Haskins wrote: I am fairly confident it is not that simple after having thought about this issue over the last few days. But I've been wrong in the past. Propose a patch and I will review it for races/correctness

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-18 Thread Davide Libenzi
On Thu, 18 Jun 2009, Gregory Haskins wrote: Actually there is only one (the tx-thread) aside from the eventfd imposed workqueue one. Incidentally, I would love to get rid of the other thread too, so I am not just picking on eventfd ;). The other is a lot harder since it has to update the

Re: [KVM PATCH 3/4] eventfd: add generalized notifier interface

2009-06-18 Thread Davide Libenzi
On Thu, 18 Jun 2009, Gregory Haskins wrote: We currently open-code notification registration via the f_ops-poll() method of the eventfd. Lets abstract this into a notification API extension of eventfd, while still using the wait-queue as the underlying mechanism. This will allow us to

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-18 Thread Davide Libenzi
On Thu, 18 Jun 2009, Michael S. Tsirkin wrote: On Wed, Jun 17, 2009 at 04:21:19PM -0700, Davide Libenzi wrote: The interface is just ugly IMO. You have eventfd_signal() that can sleep, or not, depending on the registered -signal() function implementations. This is pretty bad, a lot worse

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-18 Thread Davide Libenzi
On Thu, 18 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Thu, 18 Jun 2009, Gregory Haskins wrote: Actually there is only one (the tx-thread) aside from the eventfd imposed workqueue one. Incidentally, I would love to get rid of the other thread too, so I am not just

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-17 Thread Davide Libenzi
On Tue, 16 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Tue, 16 Jun 2009, Gregory Haskins wrote: Does this all make sense? This conversation has been *really* long, and I haven't had time to look at the patch yet. But looking at the amount of changes

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-17 Thread Davide Libenzi
On Wed, 17 Jun 2009, Gregory Haskins wrote: Can you elaborate? I currently do not see how I could do the proposed concept inside of irqfd while still using eventfd. Of course, that would be possible if we fork irqfd from eventfd, and perhaps this is what you are proposing. As previously

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-17 Thread Davide Libenzi
On Wed, 17 Jun 2009, Gregory Haskins wrote: Davide Libenzi wrote: How much the (possible, but not certain) kernel thread context switch time weighs in the overall KVM IRQ service time? Generally each one is costing me about 7us on average. For something like high-speed

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-17 Thread Davide Libenzi
On Wed, 17 Jun 2009, Gregory Haskins wrote: I am not clear on what you are asking here. So in case this was a sincere question on how things work, here is a highlight of the typical flow of a packet that ingresses on a physical adapter and routes to KVM via vbus. a) interrupt from eth to

Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier interface

2009-06-16 Thread Davide Libenzi
On Tue, 16 Jun 2009, Gregory Haskins wrote: Does this all make sense? This conversation has been *really* long, and I haven't had time to look at the patch yet. But looking at the amount of changes, and the amount of even more changes talked in this thread, there's a very slim chance that

Re: [KVM-RFC PATCH 2/2] kvm: use POLLHUP to close an irqfd instead of an explicit ioctl

2009-06-03 Thread Davide Libenzi
On Wed, 3 Jun 2009, Gregory Haskins wrote: Thanks again for the review, Paul. IIUC, you think the design is ok as it is. Davide, In light of this, would you like to submit patch 1/2 formally with your SOB at your earliest convenience? Or would you prefer that I submit it and you

Re: [KVM-RFC PATCH 2/2] kvm: use POLLHUP to close an irqfd instead of an explicit ioctl

2009-06-02 Thread Davide Libenzi
On Tue, 2 Jun 2009, Gregory Haskins wrote: @@ -64,12 +101,28 @@ irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key) { struct _irqfd *irqfd = container_of(wait, struct _irqfd, wait); - /* - * The wake_up is called with interrupts disabled. Therefore we

Re: [KVM PATCH v10] kvm: add support for irqfd

2009-05-27 Thread Davide Libenzi
On Wed, 27 May 2009, Michael S. Tsirkin wrote: On Wed, May 27, 2009 at 10:06:50AM -0400, Gregory Haskins wrote: Michael S. Tsirkin wrote: On Wed, May 20, 2009 at 10:30:49AM -0400, Gregory Haskins wrote: +static int +kvm_assign_irqfd(struct kvm *kvm, int fd, int gsi) +{ +

Re: [KVM PATCH v10] kvm: add support for irqfd

2009-05-26 Thread Davide Libenzi
On Tue, 26 May 2009, Michael S. Tsirkin wrote: On Wed, May 20, 2009 at 10:30:49AM -0400, Gregory Haskins wrote: +static int +irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key) +{ + struct _irqfd *irqfd = container_of(wait, struct _irqfd, wait); + + /* +*

Re: [KVM PATCH v7 2/3] kvm: add support for irqfd via eventfd-notification interface

2009-05-14 Thread Davide Libenzi
On Thu, 14 May 2009, Gregory Haskins wrote: Avi Kivity wrote: Gregory Haskins wrote: KVM provides a complete virtual system environment for guests, including support for injecting interrupts modeled after the real exception/interrupt facilities present on the native platform (such as

Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-12 Thread Davide Libenzi
On Mon, 11 May 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Wed, 6 May 2009, Gregory Haskins wrote: If there isn't any more feedback on the series from Al, Avi, etc...please formally submit your eventfd patch so this series is available for Avi to pull in for inclusion

Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-07 Thread Davide Libenzi
On Thu, 7 May 2009, Avi Kivity wrote: What's your take on adding irq context safe callbacks to irqfd? To give some background here, we would like to use eventfd as a generic connector between components, so the components do not know about each other. So far eventfd successfully abstracts

Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-07 Thread Davide Libenzi
On Thu, 7 May 2009, Avi Kivity wrote: Davide Libenzi wrote: What's your take on adding irq context safe callbacks to irqfd? To give some background here, we would like to use eventfd as a generic connector between components, so the components do not know about each other

Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-07 Thread Davide Libenzi
On Thu, 7 May 2009, Avi Kivity wrote: What's your take on adding irq context safe callbacks to irqfd? To give some background here, we would like to use eventfd as a generic connector between components, so the components do not know about each other. So far eventfd successfully abstracts

Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-06 Thread Davide Libenzi
On Wed, 6 May 2009, Gregory Haskins wrote: Al, Davide, Gregory Haskins wrote: + +int +kvm_irqfd(struct kvm *kvm, int gsi, int flags) +{ + struct _irqfd *irqfd; + struct file *file = NULL; + int fd = -1; + int ret; + + irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);

Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-06 Thread Davide Libenzi
On Wed, 6 May 2009, Gregory Haskins wrote: I think we are ok in this regard (at least in v5) without the callback. kvm holds irqfd, which holds eventfd. In a normal situation, we will have eventfd with 2 references. If userspace closes the eventfd, it will drop 1 of the 2 eventfd file

Re: [KVM PATCH v4 0/2] irqfd

2009-05-05 Thread Davide Libenzi
On Mon, 4 May 2009, Gregory Haskins wrote: (Applies to kvm.git:7da2e3ba, plus you will also need Davide Libenzi's eventfd_file_create() patch, which you can find here: http://www.mail-archive.com/kvm@vger.kernel.org/msg13923.html Ping me back if Al acks the irqfd thing, that I'll take a

Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-03 Thread Davide Libenzi
On Sun, 3 May 2009, Al Viro wrote: On Mon, Apr 27, 2009 at 02:33:34PM -0400, Gregory Haskins wrote: + /* We re-use eventfd for irqfd */ + fd = sys_eventfd2(0, 0); + if (fd 0) { + ret = fd; + goto fail; + } + + /* We maintain a reference to eventfd

Re: [KVM PATCH v3 2/2] kvm: add support for irqfd via eventfd-notification interface

2009-05-03 Thread Davide Libenzi
On Sun, 3 May 2009, Al Viro wrote: IOW, the sane solution would be to export something that returns your struct file *. And stop playing with fd completely. This builds but it's not tested at all. - Make all the work of the old anon_inode_getfd(), done by a new anon_inode_getfile(), with

Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

2009-04-24 Thread Davide Libenzi
On Fri, 24 Apr 2009, Gregory Haskins wrote: Davide Libenzi wrote: On Thu, 23 Apr 2009, Gregory Haskins wrote: Take a look at init_waitqueue_func_entry(), in particula at that func parameter. Then look at how __wake_up_common() does its thing. You don't need to be waiting for our

  1   2   >