Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-15 Thread Eric Blake

On 02/15/2018 04:12 AM, Kevin Wolf wrote:

Having acquired the AioContext does not make this function return true.
The semantics are:
1. Current thread is the IOThread that runs the AioContext
2. Current thread is the main loop and the AioContext is the global
AioContext.

The function tests whether the current thread is the "native" or "home"
thread for this AioContext.  Perhaps we could also call it the "poller"
thread because only that thread is allowed to call aio_poll(ctx, true).

    if (aio_context_in_native_thread(ctx)) {
        ...
    } else {
        ...
    }

What do you think?


"home" or "native" both work for me. Or if we want to keep the name
short, maybe just changing the order and s/iothread/thread/ would be
enough: bool in_aio_context_thread(AioContext *ctx) - do you think that
would still be prone to misunderstandings?


in_aio_context_home_thread() sounds slightly better than 
in_aio_context_thread(), but swapping the word order makes a nice 
difference; either name is better than what we currently have.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-15 Thread Kevin Wolf
Am 15.02.2018 um 10:27 hat Stefan Hajnoczi geschrieben:
> On Wed, Feb 14, 2018 at 04:31:45PM -0600, Eric Blake wrote:
> > On 02/14/2018 08:06 AM, Stefan Hajnoczi wrote:
> > > On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> > > I hope this explains things!  The main issue that raised these questions
> > > was that aio_context_in_iothread() has a misleading name.  Shall we
> > > rename it?
> > 
> > Maybe, but that's a separate patch.  What name would we bikeshed, maybe
> > aio_context_correct_thread() (we are the correct thread if we are the
> > iothread that owns ctx, or if we are the main thread and have properly
> > acquired ctx) 
> 
> Having acquired the AioContext does not make this function return true.
> The semantics are:
> 1. Current thread is the IOThread that runs the AioContext
> 2. Current thread is the main loop and the AioContext is the global
>AioContext.
> 
> The function tests whether the current thread is the "native" or "home"
> thread for this AioContext.  Perhaps we could also call it the "poller"
> thread because only that thread is allowed to call aio_poll(ctx, true).
> 
>     if (aio_context_in_native_thread(ctx)) {
>         ...
>     } else {
>         ...
>     }
> 
> What do you think?

"home" or "native" both work for me. Or if we want to keep the name
short, maybe just changing the order and s/iothread/thread/ would be
enough: bool in_aio_context_thread(AioContext *ctx) - do you think that
would still be prone to misunderstandings?

Kevin




Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-15 Thread Stefan Hajnoczi
On Wed, Feb 14, 2018 at 04:31:45PM -0600, Eric Blake wrote:
> On 02/14/2018 08:06 AM, Stefan Hajnoczi wrote:
> > On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> > I hope this explains things!  The main issue that raised these questions
> > was that aio_context_in_iothread() has a misleading name.  Shall we
> > rename it?
> 
> Maybe, but that's a separate patch.  What name would we bikeshed, maybe
> aio_context_correct_thread() (we are the correct thread if we are the
> iothread that owns ctx, or if we are the main thread and have properly
> acquired ctx) 

Having acquired the AioContext does not make this function return true.
The semantics are:
1. Current thread is the IOThread that runs the AioContext
2. Current thread is the main loop and the AioContext is the global
   AioContext.

The function tests whether the current thread is the "native" or "home"
thread for this AioContext.  Perhaps we could also call it the "poller"
thread because only that thread is allowed to call aio_poll(ctx, true).

    if (aio_context_in_native_thread(ctx)) {
        ...
    } else {
        ...
    }

What do you think?




Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-14 Thread Eric Blake

On 02/13/2018 10:01 AM, Eric Blake wrote:

On 02/13/2018 08:20 AM, Stefan Hajnoczi wrote:

BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
activity while a condition evaluates to true.  This is used to implement
synchronous operations where it acts as a condvar between the IOThread
running the operation and the main loop waiting for the operation.  It
can also be called from the thread that owns the AioContext and in that
case it's just a nested event loop.





It may be that your patch is correct (as I'm not an expert on the rules 
in play here), but more comments may help.  Or you may have a NULL 
dereference bug lurking.  So at this point, I can't give R-b, even 
though the refactoring of the BDRV_POLL_WHILE() macro into a separate 
helper makes sense from the high level view.


Okay, based on your responses, I can now give

Reviewed-by: Eric Blake 

although it may still help to do followups with better documentation 
and/or a rename of the confusing functions.





Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-14 Thread Eric Blake

On 02/14/2018 08:06 AM, Stefan Hajnoczi wrote:

On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:

Trying to understand here:



+#define AIO_WAIT_WHILE(wait, ctx, cond) ({  \
+bool waited_ = false;   \
+bool busy_ = true;  \
+AioWait *wait_ = (wait);\
+AioContext *ctx_ = (ctx);   \
+if (aio_context_in_iothread(ctx_)) {\
+while ((cond) || busy_) {   \
+busy_ = aio_poll(ctx_, (cond)); \
+waited_ |= !!(cond) | busy_;\
+}   \


If we are in an iothread already, we never dereference wait,


No, the name and documentation for aio_context_in_iothread() is
misleading.  It actually means "does this AioContext belong to the
current thread?", which is more general than just the IOThread case.

aio_context_in_iothread() returns true when:
1. We are the IOThread that owns ctx. <-- the case you thought of
2. We are the main loop and ctx == qemu_get_aio_context().
^--- the sneaky case that BDRV_POLL_WHILE() has always relied on


Thanks, that helps.



+AIO_WAIT_WHILE(bdrv_get_aio_wait(bs_), \
+   bdrv_get_aio_context(bs_),  \
+   cond); })


...we can pass NULL as the wait parameter, which will crash.


It won't crash since if (aio_context_in_iothread(ctx_)) will take the true
case when bs_ == NULL.


Okay, you've solved that one.




+++ b/block/io.c



   void bdrv_wakeup(BlockDriverState *bs)
   {
-/* The barrier (or an atomic op) is in the caller.  */
-if (atomic_read(&bs->wakeup)) {
-aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
-}
+aio_wait_kick(bdrv_get_aio_wait(bs));


this is another case where passing NULL...


bdrv_wakeup() is only called when bs != NULL.


And looks like we're safe, there, as well.



I hope this explains things!  The main issue that raised these questions
was that aio_context_in_iothread() has a misleading name.  Shall we
rename it?


Maybe, but that's a separate patch.  What name would we bikeshed?  Maybe 
aio_context_correct_thread() (we are the correct thread if we are the 
iothread that owns ctx, or if we are the main thread and have properly 
acquired ctx) or aio_context_use_okay() (we can only use the ctx if we 
own it [native iothread] or have acquired it [main loop])?




I'm having a hard time picking a new name because it must not be
confused with AioContext acquire/release, which doesn't influence the
"native" AioContext that the current thread has an affinity with.






Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-14 Thread Stefan Hajnoczi
On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> Trying to understand here:
> 
> 
> > +#define AIO_WAIT_WHILE(wait, ctx, cond) ({  \
> > +bool waited_ = false;   \
> > +bool busy_ = true;  \
> > +AioWait *wait_ = (wait);\
> > +AioContext *ctx_ = (ctx);   \
> > +if (aio_context_in_iothread(ctx_)) {\
> > +while ((cond) || busy_) {   \
> > +busy_ = aio_poll(ctx_, (cond)); \
> > +waited_ |= !!(cond) | busy_;\
> > +}   \
> 
> If we are in an iothread already, we never dereference wait,

No, the name and documentation for aio_context_in_iothread() is
misleading.  It actually means "does this AioContext belong to the
current thread?", which is more general than just the IOThread case.

aio_context_in_iothread() returns true when:
1. We are the IOThread that owns ctx. <-- the case you thought of
2. We are the main loop and ctx == qemu_get_aio_context().
   ^--- the sneaky case that BDRV_POLL_WHILE() has always relied on
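The two cases can be modeled with a small sketch. This is not the QEMU implementation; the types and names (AioContextSketch, sketch_*) are invented for illustration, assuming a thread-local "bound context" that IOThreads set at startup and that falls back to the global context in the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct AioContextSketch {
    int id;
} AioContextSketch;

static AioContextSketch global_ctx;            /* the main loop's context */
static __thread AioContextSketch *current_ctx; /* set when an IOThread runs */

static AioContextSketch *sketch_get_aio_context(void)
{
    return &global_ctx;
}

static AioContextSketch *sketch_get_current_aio_context(void)
{
    /* A main-loop thread has no IOThread context bound, so it falls
     * back to the global context. */
    return current_ctx ? current_ctx : sketch_get_aio_context();
}

/* Both cases above reduce to one question: is ctx the context that the
 * current thread runs? */
static bool sketch_in_home_thread(AioContextSketch *ctx)
{
    return ctx == sketch_get_current_aio_context();
}
```

In the main loop the check is true only for the global context (case 2); in an IOThread it is true only for that thread's own context (case 1).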

> > +} else {\
> > +assert(qemu_get_current_aio_context() ==\
> > +   qemu_get_aio_context()); \
> > +assert(!wait_->need_kick);  \
> 
> but if we are in the main loop, wait must be non-NULL.

The else statement only handles the case where the main loop is waiting
for an IOThread AioContext.

s/in the main loop/in the main loop and ctx is an IOThread AioContext/

> 
> > +++ b/include/block/block.h
> > @@ -2,6 +2,7 @@
> >   #define BLOCK_H
> >   #include "block/aio.h"
> > +#include "block/aio-wait.h"
> >   #include "qapi-types.h"
> >   #include "qemu/iov.h"
> >   #include "qemu/coroutine.h"
> > @@ -367,41 +368,14 @@ void bdrv_drain_all_begin(void);
> >   void bdrv_drain_all_end(void);
> >   void bdrv_drain_all(void);
> > +/* Returns NULL when bs == NULL */
> > +AioWait *bdrv_get_aio_wait(BlockDriverState *bs);
> 
> This can return NULL, so it is only ever safe to use in an iothread; because
> if it is used in the main loop,...
> 
> > +
> >   #define BDRV_POLL_WHILE(bs, cond) ({   \
> 
> > +AIO_WAIT_WHILE(bdrv_get_aio_wait(bs_), \
> > +   bdrv_get_aio_context(bs_),  \
> > +   cond); })
> 
> ...we can pass NULL as the wait parameter, which will crash.

It won't crash since if (aio_context_in_iothread(ctx_)) will take the true
case when bs_ == NULL.

> > +++ b/block/io.c
> 
> >   void bdrv_wakeup(BlockDriverState *bs)
> >   {
> > -/* The barrier (or an atomic op) is in the caller.  */
> > -if (atomic_read(&bs->wakeup)) {
> > -aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
> > -}
> > +aio_wait_kick(bdrv_get_aio_wait(bs));
> 
> this is another case where passing NULL...

bdrv_wakeup() is only called when bs != NULL.

I hope this explains things!  The main issue that raised these questions
was that aio_context_in_iothread() has a misleading name.  Shall we
rename it?

I'm having a hard time picking a new name because it must not be
confused with AioContext acquire/release, which doesn't influence the
"native" AioContext that the current thread has an affinity with.
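The distinction can be made concrete with a toy model (invented names, not QEMU code): lock ownership and thread affinity are separate pieces of state, and only the latter answers the "home thread" question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct LockedCtx {
    int locked_by;          /* toy stand-in for the AioContext lock */
} LockedCtx;

static __thread LockedCtx *home_ctx;  /* fixed affinity, set at thread start */

static void lctx_acquire(LockedCtx *c, int tid) { c->locked_by = tid; }
static void lctx_release(LockedCtx *c)          { c->locked_by = 0; }

/* Affinity, not lock ownership, decides the answer. */
static bool lctx_in_home_thread(LockedCtx *c)
{
    return c == home_ctx;
}
```

Acquiring another context's lock changes locked_by but never home_ctx, which is exactly why a name suggesting "correct thread to touch ctx" would mislead.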




Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState

2018-02-13 Thread Eric Blake

On 02/13/2018 08:20 AM, Stefan Hajnoczi wrote:

BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
activity while a condition evaluates to true.  This is used to implement
synchronous operations where it acts as a condvar between the IOThread
running the operation and the main loop waiting for the operation.  It
can also be called from the thread that owns the AioContext and in that
case it's just a nested event loop.

BlockBackend needs this behavior but doesn't always have a
BlockDriverState it can use.  This patch extracts BDRV_POLL_WHILE() into
the AioWait abstraction, which can be used with AioContext and isn't
tied to BlockDriverState anymore.

This feature could be built directly into AioContext, but then all users
would kick the event loop even if they signal different conditions.
Imagine an AioContext with many BlockDriverStates: each time a request
completes, any waiter would wake up and re-check its condition.  It's
nicer to keep a separate AioWait object for each condition instead.

Please see "block/aio-wait.h" for details on the API.

The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE()
and AioContext polling.

Signed-off-by: Stefan Hajnoczi 
---


Trying to understand here:



+#define AIO_WAIT_WHILE(wait, ctx, cond) ({  \
+bool waited_ = false;   \
+bool busy_ = true;  \
+AioWait *wait_ = (wait);\
+AioContext *ctx_ = (ctx);   \
+if (aio_context_in_iothread(ctx_)) {\
+while ((cond) || busy_) {   \
+busy_ = aio_poll(ctx_, (cond)); \
+waited_ |= !!(cond) | busy_;\
+}   \


If we are in an iothread already, we never dereference wait,


+} else {\
+assert(qemu_get_current_aio_context() ==\
+   qemu_get_aio_context()); \
+assert(!wait_->need_kick);  \


but if we are in the main loop, wait must be non-NULL.


+++ b/include/block/block.h
@@ -2,6 +2,7 @@
  #define BLOCK_H
  
  #include "block/aio.h"

+#include "block/aio-wait.h"
  #include "qapi-types.h"
  #include "qemu/iov.h"
  #include "qemu/coroutine.h"
@@ -367,41 +368,14 @@ void bdrv_drain_all_begin(void);
  void bdrv_drain_all_end(void);
  void bdrv_drain_all(void);
  
+/* Returns NULL when bs == NULL */

+AioWait *bdrv_get_aio_wait(BlockDriverState *bs);


This can return NULL, so it is only ever safe to use in an iothread; 
because if it is used in the main loop,...



+
  #define BDRV_POLL_WHILE(bs, cond) ({   \



+AIO_WAIT_WHILE(bdrv_get_aio_wait(bs_), \
+   bdrv_get_aio_context(bs_),  \
+   cond); })


...we can pass NULL as the wait parameter, which will crash.


+++ b/block.c
@@ -4716,6 +4716,11 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
  return bs->aio_context;
  }
  
+AioWait *bdrv_get_aio_wait(BlockDriverState *bs)
+{
+    return bs ? &bs->wait : NULL;
+}


So, do we need documentation to that fact?  Also,


+++ b/block/io.c



  void bdrv_wakeup(BlockDriverState *bs)
  {
-    /* The barrier (or an atomic op) is in the caller.  */
-    if (atomic_read(&bs->wakeup)) {
-        aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
-    }
+    aio_wait_kick(bdrv_get_aio_wait(bs));


this is another case where passing NULL...


+++ b/util/aio-wait.c



+void aio_wait_kick(AioWait *wait)
+{
+    /* The barrier (or an atomic op) is in the caller.  */
+    if (atomic_read(&wait->need_kick)) {


...is bad.  Does that mean bdrv_wakeup() can only be called when bs is 
non-NULL?  Does that need documentation?
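For illustration, the need_kick handshake between the waiter and the kicker could be modeled like this. It is a toy sketch with invented names (ToyAioWait, toy_*), not the patch's code, and it deliberately keeps the unconditional dereference so the NULL concern above is visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool need_kick;    /* is a waiter parked in aio_poll()? */
} ToyAioWait;

/* Waiter side (main loop): announce interest before checking cond. */
static void toy_wait_prepare(ToyAioWait *w)
{
    atomic_store(&w->need_kick, true);
}

static void toy_wait_done(ToyAioWait *w)
{
    atomic_store(&w->need_kick, false);
}

/* Completion side (IOThread): only wake the main loop if someone is
 * actually waiting, so non-waited completions stay cheap.  Note this
 * dereferences w unconditionally; passing NULL here would crash. */
static bool toy_kick(ToyAioWait *w)
{
    return atomic_load(&w->need_kick);  /* real code would schedule a BH */
}
```

The flag makes kicks cheap when nobody waits, which is the point of keeping one AioWait per condition rather than kicking the whole AioContext.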


It may be that your patch is correct (as I'm not an expert on the rules 
in play here), but more comments may help.  Or you may have a NULL 
dereference bug lurking.  So at this point, I can't give R-b, even 
though the refactoring of the BDRV_POLL_WHILE() macro into a separate 
helper makes sense from the high level view.

