Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-06 Thread Paolo Bonzini
On 05/03/2012 18:35, Avi Kivity wrote:
   The I/O handlers would still use the qemu mutex, no?  we'd just protect
   the select() (taking the mutex before releasing the global lock,
   and reacquiring it afterwards).
 
  Yes, that could work, but it is _really_ ugly. 
 Yes, it is...
 

Let's just fix it in NBD then.

Paolo



Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Jan Kiszka
On 2012-03-05 09:34, Paolo Bonzini wrote:
 This is quite ugly.  Two threads, one running main_loop_wait and
 one running qemu_aio_wait, can race with each other on running the
 same iohandler.  The result is that an iohandler could run while the
 underlying socket is not readable or writable, with possibly ill effects.

Hmm, isn't it a problem already that a socket is polled by two threads
at the same time? Can't that be avoided?

Long-term, I'd like to cut out certain file descriptors from the main
loop and process them completely in separate threads (for separate
locking, prioritization etc.). Dunno how NBD works, but maybe it should
be reworked like this already.

Jan

 
 This shows as a failure to boot an IDE disk using the NBD device.
 We can consider it a bug in NBD or in the main loop.  The patch fixes
 this in main_loop_wait, which is always going to lose the race because
 qemu_aio_wait runs select with the global lock held.
 
 Reported-by: Laurent Vivier <laur...@vivier.eu>
 Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
 ---
   Anthony, if you think this is too ugly, tell me and I can
   post an NBD fix too.
 
  main-loop.c |    7 +++++++
  1 files changed, 7 insertions(+), 0 deletions(-)
 
 diff --git a/main-loop.c b/main-loop.c
 index db23de0..3beccff 100644
 --- a/main-loop.c
 +++ b/main-loop.c
 @@ -458,6 +458,13 @@ int main_loop_wait(int nonblocking)
  
      if (timeout > 0) {
          qemu_mutex_lock_iothread();
 +
 +        /* Poll again.  A qemu_aio_wait() on another thread
 +         * could have made the fdsets stale.
 +         */
 +        tv.tv_sec = 0;
 +        tv.tv_usec = 0;
 +        ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
      }
  
      glib_select_poll(&rfds, &wfds, &xfds, (ret < 0));
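
The race is easy to reproduce outside QEMU, for what it's worth.  The
stand-alone program below (not QEMU code; the sleeps exist only to force
the interleaving) shows the effect the commit message describes: both
threads select() on the same socket, the "aio" thread consumes the event,
and the "main loop" thread then dispatches from a stale fd_set, so its
handler runs although the socket is no longer readable.

    #include <errno.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    static int fd;   /* read side of the socket, polled by both threads */

    static void wait_readable(const char *who)
    {
        fd_set rfds;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        select(fd + 1, &rfds, NULL, NULL, NULL);
        printf("%s: select() reports the fd readable\n", who);
    }

    /* Stand-in for qemu_aio_wait(): it polls the same fd and consumes
     * the event.  The sleep just makes sure the main thread's select()
     * returns before the byte disappears. */
    static void *aio_thread(void *opaque)
    {
        char c;

        (void)opaque;
        wait_readable("aio thread");
        usleep(200 * 1000);
        (void)read(fd, &c, 1);
        return NULL;
    }

    int main(void)
    {
        int fds[2];
        pthread_t t;
        char c;

        socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
        fd = fds[0];
        fcntl(fd, F_SETFL, O_NONBLOCK);

        pthread_create(&t, NULL, aio_thread, NULL);
        usleep(100 * 1000);              /* let the aio thread block first */
        (void)write(fds[1], "x", 1);     /* wakes up both select()s */

        wait_readable("main loop");      /* this result is about to go stale */
        pthread_join(t, NULL);           /* aio thread eats the byte */

        /* The "iohandler" now runs on a socket that is not readable. */
        if (read(fd, &c, 1) < 0 && errno == EAGAIN) {
            printf("main loop: handler ran on a non-readable socket\n");
        }
        return 0;
    }

Built with -pthread, the final read() fails with EAGAIN, which is the
benign version of the ill effects seen with IDE over NBD.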

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Paolo Bonzini
On 05/03/2012 10:07, Jan Kiszka wrote:
  This is quite ugly.  Two threads, one running main_loop_wait and
  one running qemu_aio_wait, can race with each other on running the
  same iohandler.  The result is that an iohandler could run while the
  underlying socket is not readable or writable, with possibly ill effects.
 
 Hmm, isn't it a problem already that a socket is polled by two threads
 at the same time? Can't that be avoided?

We still have synchronous I/O in the device models.  That's the root
cause of the bug, I suppose.

 Long-term, I'd like to cut out certain file descriptors from the main
 loop and process them completely in separate threads (for separate
 locking, prioritization etc.). Dunno how NBD works, but maybe it should
 be reworked like this already.

Me too; I even made a very simple proof of concept a couple of weeks ago
(search for a thread switching the block layer from coroutines to
threads).  It worked, though it is obviously not upstreamable in any way.

In that world order, EventNotifiers would replace
qemu_aio_set_fd_handler, and socket-based protocols such as NBD would
run with blocking I/O in their own thread.  In addition to one thread
per I/O request (from a thread pool), there would be one arbiter thread
that reads replies and dispatches them to the appropriate I/O request
thread.  The arbiter thread replaces the read callback in
qemu_aio_set_fd_handler.
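
As a toy model of that scheme (not the actual proof of concept; pthread
condition variables stand in for EventNotifiers, the parent process stands
in for the NBD server, and all names are made up):

    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>

    #define NREQ 4

    struct request {
        pthread_mutex_t lock;
        pthread_cond_t done;
        int completed;
    };

    static struct request reqs[NREQ];
    static int sock;    /* client end, shared by requesters and arbiter */

    /* One thread per I/O request: send the request, then sleep until the
     * arbiter marks it complete (an EventNotifier in the real design). */
    static void *request_thread(void *opaque)
    {
        uint32_t id = (uint32_t)(uintptr_t)opaque;
        struct request *r = &reqs[id];

        (void)write(sock, &id, sizeof(id));      /* blocking I/O */
        pthread_mutex_lock(&r->lock);
        while (!r->completed) {
            pthread_cond_wait(&r->done, &r->lock);
        }
        pthread_mutex_unlock(&r->lock);
        printf("request %u completed\n", id);
        return NULL;
    }

    /* The arbiter is the only reader of the socket: it blocks in read(),
     * parses the reply and wakes whichever request thread owns it. */
    static void *arbiter_thread(void *opaque)
    {
        uint32_t id;

        (void)opaque;
        for (int n = 0; n < NREQ; n++) {
            if (read(sock, &id, sizeof(id)) != sizeof(id)) {
                break;
            }
            pthread_mutex_lock(&reqs[id].lock);
            reqs[id].completed = 1;
            pthread_cond_signal(&reqs[id].done);
            pthread_mutex_unlock(&reqs[id].lock);
        }
        return NULL;
    }

    int main(void)
    {
        int fds[2];
        pthread_t arb, th[NREQ];
        uint32_t id;

        socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
        sock = fds[0];

        for (uint32_t i = 0; i < NREQ; i++) {
            pthread_mutex_init(&reqs[i].lock, NULL);
            pthread_cond_init(&reqs[i].done, NULL);
            pthread_create(&th[i], NULL, request_thread,
                           (void *)(uintptr_t)i);
        }
        pthread_create(&arb, NULL, arbiter_thread, NULL);

        /* Stand-in for the NBD server: echo each request id as a reply. */
        for (int n = 0; n < NREQ; n++) {
            if (read(fds[1], &id, sizeof(id)) == sizeof(id)) {
                (void)write(fds[1], &id, sizeof(id));
            }
        }

        for (int i = 0; i < NREQ; i++) {
            pthread_join(th[i], NULL);
        }
        pthread_join(arb, NULL);
        return 0;
    }

The property that matters is that the arbiter is the only reader, so the
per-request threads never race on the socket at all.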

The problem is, even though it worked, making this thread-safe is
another story.  I suspect that in practice it is very difficult to do
without resurrecting RCU patches.

Paolo



Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Avi Kivity
On 03/05/2012 11:07 AM, Jan Kiszka wrote:
 On 2012-03-05 09:34, Paolo Bonzini wrote:
  This is quite ugly.  Two threads, one running main_loop_wait and
  one running qemu_aio_wait, can race with each other on running the
  same iohandler.  The result is that an iohandler could run while the
  underlying socket is not readable or writable, with possibly ill effects.

 Hmm, isn't it a problem already that a socket is polled by two threads
 at the same time? Can't that be avoided?

Could it be done simply by adding a mutex there?  It's hardly a clean
fix, but it's not a clean problem.

 Long-term, I'd like to cut out certain file descriptors from the main
 loop and process them completely in separate threads (for separate
 locking, prioritization etc.). Dunno how NBD works, but maybe it should
 be reworked like this already.

Ideally qemu_set_fd_handler2() should be made thread local, and each
device thread would run a copy of the main loop, just working on
different data.
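
Concretely, something like the following, sketched with a GCC-style
__thread table (fixed size and read handlers only, for brevity; an
illustration, not a proposed API):

    #include <sys/select.h>

    typedef void IOHandler(void *opaque);

    struct fd_entry {
        int fd;
        IOHandler *fd_read;
        void *opaque;
    };

    /* One table per thread: fds registered by a device thread are only
     * ever polled by that thread's copy of the loop. */
    static __thread struct fd_entry handlers[16];
    static __thread int nr_handlers;

    void thread_set_fd_read_handler(int fd, IOHandler *fd_read, void *opaque)
    {
        struct fd_entry e = { fd, fd_read, opaque };
        handlers[nr_handlers++] = e;
    }

    /* Each device thread runs this loop body on its own data, instead of
     * funneling everything through the global main_loop_wait(). */
    void thread_main_loop_iteration(void)
    {
        fd_set rfds;
        int i, nfds = -1;

        FD_ZERO(&rfds);
        for (i = 0; i < nr_handlers; i++) {
            FD_SET(handlers[i].fd, &rfds);
            if (handlers[i].fd > nfds) {
                nfds = handlers[i].fd;
            }
        }
        if (select(nfds + 1, &rfds, NULL, NULL, NULL) > 0) {
            for (i = 0; i < nr_handlers; i++) {
                if (FD_ISSET(handlers[i].fd, &rfds)) {
                    handlers[i].fd_read(handlers[i].opaque);
                }
            }
        }
    }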

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Jan Kiszka
On 2012-03-05 15:24, Avi Kivity wrote:
 Long-term, I'd like to cut out certain file descriptors from the main
 loop and process them completely in separate threads (for separate
 locking, prioritization etc.). Dunno how NBD works, but maybe it should
 be reworked like this already.
 
 Ideally qemu_set_fd_handler2() should be made thread local, and each
 device thread would run a copy of the main loop, just working on
 different data.

qemu_set_fd_handler2 is not only called from an iothread. Rather, we
need an object and associated lock that is related to the io-path (i.e.
frontend device + backend driver). That object has to be passed to
services like qemu_set_fd_handler2.
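
As an interface sketch only (IOPathContext and the extra parameter are
hypothetical, not existing QEMU API; the real qemu_set_fd_handler2() takes
everything except the context argument):

    #include <pthread.h>

    typedef int IOCanReadHandler(void *opaque);
    typedef void IOHandler(void *opaque);

    /* One object per io-path (frontend device + backend driver).  The
     * loop would take ctx->lock around polling this fd and around
     * running its handlers, instead of the global qemu mutex. */
    typedef struct IOPathContext {
        pthread_mutex_t lock;
    } IOPathContext;

    /* Hypothetical variant of qemu_set_fd_handler2() with the io-path
     * object as an explicit first argument. */
    int iopath_set_fd_handler2(IOPathContext *ctx, int fd,
                               IOCanReadHandler *fd_read_poll,
                               IOHandler *fd_read, IOHandler *fd_write,
                               void *opaque);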

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Paolo Bonzini
On 05/03/2012 15:24, Avi Kivity wrote:
 On 03/05/2012 11:07 AM, Jan Kiszka wrote:
 On 2012-03-05 09:34, Paolo Bonzini wrote:
 This is quite ugly.  Two threads, one running main_loop_wait and
 one running qemu_aio_wait, can race with each other on running the
 same iohandler.  The result is that an iohandler could run while the
 underlying socket is not readable or writable, with possibly ill effects.

 Hmm, isn't it a problem already that a socket is polled by two threads
 at the same time? Can't that be avoided?
 
 Could it be done simply by adding a mutex there?  It's hardly a clean
 fix, but it's not a clean problem.

Hmm, I don't think so.  It would need to protect execution of the
iohandlers too, and pretty much everything can happen there including a
nested loop.  Of course recursive mutexes exist, but it sounds like too
big an axe.

I could add a generation count updated by qemu_aio_wait(), and rerun the
select() only if the generation count changes during its execution.
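
Sketched (hypothetical; the iothread lock functions are declared here only
so the fragment stands alone, and the counter is only ever written under
the global lock):

    #include <sys/select.h>

    /* Provided by QEMU; declared here only so the fragment compiles. */
    void qemu_mutex_lock_iothread(void);
    void qemu_mutex_unlock_iothread(void);

    /* Incremented, under the global lock, each time qemu_aio_wait()
     * runs its own select() and dispatches handlers. */
    static unsigned aio_generation;

    int select_checking_generation(int nfds, fd_set *rfds, fd_set *wfds,
                                   fd_set *xfds, struct timeval *tv)
    {
        unsigned gen = aio_generation;   /* read under the global lock */
        int ret;

        qemu_mutex_unlock_iothread();
        ret = select(nfds, rfds, wfds, xfds, tv);
        qemu_mutex_lock_iothread();

        if (gen != aio_generation) {
            /* qemu_aio_wait() ran meanwhile; re-validate the fds that
             * were reported ready before dispatching on them. */
            struct timeval zero = { 0, 0 };
            ret = select(nfds, rfds, wfds, xfds, &zero);
        }
        return ret;
    }

The zero-timeout select() re-checks exactly the fds that the first
select() filled in, which is the same trick the posted patch applies
unconditionally.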

Or we can call it an NBD bug.  I'm not against that, but it seemed to me
that the problem is more general.

Paolo



Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Avi Kivity
On 03/05/2012 04:30 PM, Paolo Bonzini wrote:
 On 05/03/2012 15:24, Avi Kivity wrote:
  On 03/05/2012 11:07 AM, Jan Kiszka wrote:
  On 2012-03-05 09:34, Paolo Bonzini wrote:
  This is quite ugly.  Two threads, one running main_loop_wait and
  one running qemu_aio_wait, can race with each other on running the
  same iohandler.  The result is that an iohandler could run while the
  underlying socket is not readable or writable, with possibly ill effects.
 
  Hmm, isn't it a problem already that a socket is polled by two threads
  at the same time? Can't that be avoided?
  
  Could it be done simply by adding a mutex there?  It's hardly a clean
  fix, but it's not a clean problem.

 Hmm, I don't think so.  It would need to protect execution of the
 iohandlers too, and pretty much everything can happen there including a
 nested loop.  Of course recursive mutexes exist, but it sounds like too
 big an axe.

The I/O handlers would still use the qemu mutex, no?  we'd just protect
the select() (taking the mutex before releasing the global lock,
and reacquiring it afterwards).
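
Sketched out, with the caveat that the ordering is delicate (the mutex
name is hypothetical, and the iothread lock functions are declared only so
the fragment stands alone):

    #include <pthread.h>
    #include <sys/select.h>

    /* Provided by QEMU; declared here only so the fragment compiles. */
    void qemu_mutex_lock_iothread(void);
    void qemu_mutex_unlock_iothread(void);

    /* Serializes the main loop's select() against qemu_aio_wait()'s. */
    static pthread_mutex_t select_mutex = PTHREAD_MUTEX_INITIALIZER;

    int main_loop_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *xfds,
                         struct timeval *tv)
    {
        int ret;

        /* Taken while the global lock is still held, so qemu_aio_wait()
         * cannot slip in between the two operations. */
        pthread_mutex_lock(&select_mutex);
        qemu_mutex_unlock_iothread();

        ret = select(nfds, rfds, wfds, xfds, tv);

        /* Drop the select mutex first: qemu_aio_wait() runs select() with
         * the global lock held, so retaking the global lock while still
         * holding the select mutex could deadlock. */
        pthread_mutex_unlock(&select_mutex);
        qemu_mutex_lock_iothread();
        return ret;
    }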

 I could add a generation count updated by qemu_aio_wait(), and rerun the
 select() only if the generation count changes during its execution.

 Or we can call it an NBD bug.  I'm not against that, but it seemed to me
 that the problem is more general.

What about making sure all callers of qemu_aio_wait() run from
coroutines (or threads)?  Then they just ask the main thread to wake
them up, instead of dispatching completions themselves.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Paolo Bonzini
On 05/03/2012 16:14, Avi Kivity wrote:
  Hmm, I don't think so.  It would need to protect execution of the
  iohandlers too, and pretty much everything can happen there including a
  nested loop.  Of course recursive mutexes exist, but it sounds like too
  big an axe.
 The I/O handlers would still use the qemu mutex, no?  we'd just protect
 the select() (taking the mutex before releasing the global lock,
 and reacquiring it afterwards).

Yes, that could work, but it is _really_ ugly.  I still prefer this
patch or fixing NBD.  At least both contain the hack in a single place.

  I could add a generation count updated by qemu_aio_wait(), and rerun the
  select() only if the generation count changes during its execution.
 
  Or we can call it an NBD bug.  I'm not against that, but it seemed to me
  that the problem is more general.
 What about making sure all callers of qemu_aio_wait() run from
 coroutines (or threads)?  Then they just ask the main thread to wake
 them up, instead of dispatching completions themselves.

That would open another Pandora's box.  The point of having a separate
main loop is that only AIO can happen during qemu_aio_wait() or
qemu_aio_flush().  In particular you don't want the monitor to process
input while you're running another monitor command.

Paolo



Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Avi Kivity
On 03/05/2012 06:14 PM, Paolo Bonzini wrote:
 On 05/03/2012 16:14, Avi Kivity wrote:
   Hmm, I don't think so.  It would need to protect execution of the
   iohandlers too, and pretty much everything can happen there including a
   nested loop.  Of course recursive mutexes exist, but it sounds like too
   big an axe.
  The I/O handlers would still use the qemu mutex, no?  we'd just protect
  the select() (taking the mutex before releasing the global lock,
  and reacquiring it afterwards).

 Yes, that could work, but it is _really_ ugly. 

Yes, it is...

  I still prefer this
 patch or fixing NBD.  At least both contain the hack in a single place.



   I could add a generation count updated by qemu_aio_wait(), and rerun the
   select() only if the generation count changes during its execution.
  
   Or we can call it an NBD bug.  I'm not against that, but it seemed to me
   that the problem is more general.
  What about making sure all callers of qemu_aio_wait() run from
  coroutines (or threads)?  Then they just ask the main thread to wake
  them up, instead of dispatching completions themselves.

 That would open another Pandora's box.  The point of having a separate
 main loop is that only AIO can happen during qemu_aio_wait() or
 qemu_aio_flush().  In particular you don't want the monitor to process
 input while you're running another monitor command.

Hmm, yes, we're abusing the type of completion here as a kind of weird
locking.  It's conceptually broken since an aio completion could trigger
anything.  Usually it just involves block format driver and device code,
but in theory, it can affect the state of whoever's running qemu_aio_wait().

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Avi Kivity
On 03/05/2012 04:30 PM, Jan Kiszka wrote:
 On 2012-03-05 15:24, Avi Kivity wrote:
  Long-term, I'd like to cut out certain file descriptors from the main
  loop and process them completely in separate threads (for separate
  locking, prioritization etc.). Dunno how NBD works, but maybe it should
  be reworked like this already.
  
  Ideally qemu_set_fd_handler2() should be made thread local, and each
  device thread would run a copy of the main loop, just working on
  different data.

 qemu_set_fd_handler2 is not only called from an iothread. Rather, we
 need an object and associated lock that is related to the io-path (i.e.
 frontend device + backend driver). That object has to be passed to
 services like qemu_set_fd_handler2.

Not sure I like implicit lock-taking.  In particular, how does it
interact with unregistering an fd_handler?

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait

2012-03-05 Thread Jan Kiszka
On 2012-03-05 18:39, Avi Kivity wrote:
 On 03/05/2012 04:30 PM, Jan Kiszka wrote:
 On 2012-03-05 15:24, Avi Kivity wrote:
 Long-term, I'd like to cut out certain file descriptors from the main
 loop and process them completely in separate threads (for separate
 locking, prioritization etc.). Dunno how NBD works, but maybe it should
 be reworked like this already.

 Ideally qemu_set_fd_handler2() should be made thread local, and each
 device thread would run a copy of the main loop, just working on
 different data.

 qemu_set_fd_handler2 is not only called from an iothread. Rather, we
 need an object and associated lock that is related to the io-path (i.e.
 frontend device + backend driver). That object has to be passed to
 services like qemu_set_fd_handler2.
 
 Not sure I like implicit lock-taking.  In particular, how does it
 interact with unregistering an fd_handler?

I wasn't suggesting implicit lock-taking, just decoupling from our
infamous global lock. My point is that thread-local won't help here.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux