Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On 02/15/2018 04:12 AM, Kevin Wolf wrote:
> > Having acquired the AioContext does not make this function return true.
> > The semantics are:
> > 1. Current thread is the IOThread that runs the AioContext
> > 2. Current thread is the main loop and the AioContext is the global
> >    AioContext.
> >
> > The function tests whether the current thread is the "native" or "home"
> > thread for this AioContext. Perhaps we could also call it the "poller"
> > thread, because only that thread is allowed to call aio_poll(ctx, true).
> >
> >     if (aio_context_in_native_thread(ctx)) {
> >         ...
> >     } else {
> >         ...
> >     }
> >
> > What do you think?
>
> "home" or "native" both work for me. Or if we want to keep the name
> short, maybe just changing the order and s/iothread/thread/ would be
> enough:
>
>     bool in_aio_context_thread(AioContext *ctx)
>
> Do you think that would still be prone to misunderstandings?

in_aio_context_home_thread() sounds slightly better than
in_aio_context_thread(), but either way, swapping the word order makes a
nice difference: either name is better than what we currently have.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On 15.02.2018 at 10:27, Stefan Hajnoczi wrote:
> On Wed, Feb 14, 2018 at 04:31:45PM -0600, Eric Blake wrote:
> > On 02/14/2018 08:06 AM, Stefan Hajnoczi wrote:
> > > On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> > > I hope this explains things! The main issue that raised these questions
> > > was that aio_context_in_iothread() has a misleading name. Shall we
> > > rename it?
> >
> > Maybe, but that's a separate patch. What name would we bikeshed? Maybe
> > aio_context_correct_thread() (we are the correct thread if we are the
> > iothread that owns ctx, or if we are the main thread and have properly
> > acquired ctx)
>
> Having acquired the AioContext does not make this function return true.
> The semantics are:
> 1. Current thread is the IOThread that runs the AioContext
> 2. Current thread is the main loop and the AioContext is the global
>    AioContext.
>
> The function tests whether the current thread is the "native" or "home"
> thread for this AioContext. Perhaps we could also call it the "poller"
> thread, because only that thread is allowed to call aio_poll(ctx, true).
>
>     if (aio_context_in_native_thread(ctx)) {
>         ...
>     } else {
>         ...
>     }
>
> What do you think?

"home" or "native" both work for me. Or if we want to keep the name
short, maybe just changing the order and s/iothread/thread/ would be
enough:

    bool in_aio_context_thread(AioContext *ctx)

Do you think that would still be prone to misunderstandings?

Kevin
Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On Wed, Feb 14, 2018 at 04:31:45PM -0600, Eric Blake wrote:
> On 02/14/2018 08:06 AM, Stefan Hajnoczi wrote:
> > On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> > I hope this explains things! The main issue that raised these questions
> > was that aio_context_in_iothread() has a misleading name. Shall we
> > rename it?
>
> Maybe, but that's a separate patch. What name would we bikeshed? Maybe
> aio_context_correct_thread() (we are the correct thread if we are the
> iothread that owns ctx, or if we are the main thread and have properly
> acquired ctx)

Having acquired the AioContext does not make this function return true.
The semantics are:
1. Current thread is the IOThread that runs the AioContext
2. Current thread is the main loop and the AioContext is the global
   AioContext.

The function tests whether the current thread is the "native" or "home"
thread for this AioContext. Perhaps we could also call it the "poller"
thread, because only that thread is allowed to call aio_poll(ctx, true).

    if (aio_context_in_native_thread(ctx)) {
        ...
    } else {
        ...
    }

What do you think?
Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On 02/13/2018 10:01 AM, Eric Blake wrote:
> On 02/13/2018 08:20 AM, Stefan Hajnoczi wrote:
> > BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
> > activity while a condition evaluates to true. This is used to implement
> > synchronous operations where it acts as a condvar between the IOThread
> > running the operation and the main loop waiting for the operation. It
> > can also be called from the thread that owns the AioContext and in that
> > case it's just a nested event loop.
>
> It may be that your patch is correct (as I'm not an expert on the rules
> in play here), but more comments may help. Or you may have a NULL
> dereference bug lurking. So at this point, I can't give R-b, even though
> the refactoring of the BDRV_POLL_WHILE() macro into a separate helper
> makes sense from the high level view.

Okay, based on your responses, I can now give

Reviewed-by: Eric Blake

although it may still help to do follow-ups with better documentation
and/or a rename of the confusing functions.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On 02/14/2018 08:06 AM, Stefan Hajnoczi wrote:
> On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> > Trying to understand here:
> >
> > > +#define AIO_WAIT_WHILE(wait, ctx, cond) ({                \
> > > +    bool waited_ = false;                                 \
> > > +    bool busy_ = true;                                    \
> > > +    AioWait *wait_ = (wait);                              \
> > > +    AioContext *ctx_ = (ctx);                             \
> > > +    if (aio_context_in_iothread(ctx_)) {                  \
> > > +        while ((cond) || busy_) {                         \
> > > +            busy_ = aio_poll(ctx_, (cond));               \
> > > +            waited_ |= !!(cond) | busy_;                  \
> > > +        }                                                 \
> >
> > If we are in an iothread already, we never dereference wait,
>
> No, the name and documentation for aio_context_in_iothread() is
> misleading. It actually means "does this AioContext belong to the
> current thread?", which is more general than just the IOThread case.
>
> aio_context_in_iothread() returns true when:
>
> 1. We are the IOThread that owns ctx. <-- the case you thought of
>
> 2. We are the main loop and ctx == qemu_get_aio_context().
>    ^--- the sneaky case that BDRV_POLL_WHILE() has always relied on

Thanks, that helps.

> > > +    AIO_WAIT_WHILE(bdrv_get_aio_wait(bs_),                \
> > > +                   bdrv_get_aio_context(bs_),             \
> > > +                   cond); })
> >
> > ...we can pass NULL as the wait parameter, which will crash.
>
> It won't crash, since if (aio_context_in_iothread(ctx_)) will take the
> true case when bs_ == NULL.

Okay, you've solved that one.

> > > +++ b/block/io.c
> > >  void bdrv_wakeup(BlockDriverState *bs)
> > >  {
> > > -    /* The barrier (or an atomic op) is in the caller. */
> > > -    if (atomic_read(&bs->wakeup)) {
> > > -        aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
> > > -    }
> > > +    aio_wait_kick(bdrv_get_aio_wait(bs));
> >
> > this is another case where passing NULL...
>
> bdrv_wakeup() is only called when bs != NULL.

And it looks like we're safe there as well.

> I hope this explains things! The main issue that raised these questions
> was that aio_context_in_iothread() has a misleading name. Shall we
> rename it?

Maybe, but that's a separate patch. What name would we bikeshed? Maybe
aio_context_correct_thread() (we are the correct thread if we are the
iothread that owns ctx, or if we are the main thread and have properly
acquired ctx), or aio_context_use_okay() (we can only use the ctx if we
own it [native iothread] or have acquired it [main loop])?

> I'm having a hard time picking a new name because it must not be
> confused with AioContext acquire/release, which doesn't influence the
> "native" AioContext that the current thread has an affinity with.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On Tue, Feb 13, 2018 at 10:01:06AM -0600, Eric Blake wrote:
> Trying to understand here:
>
> > +#define AIO_WAIT_WHILE(wait, ctx, cond) ({                \
> > +    bool waited_ = false;                                 \
> > +    bool busy_ = true;                                    \
> > +    AioWait *wait_ = (wait);                              \
> > +    AioContext *ctx_ = (ctx);                             \
> > +    if (aio_context_in_iothread(ctx_)) {                  \
> > +        while ((cond) || busy_) {                         \
> > +            busy_ = aio_poll(ctx_, (cond));               \
> > +            waited_ |= !!(cond) | busy_;                  \
> > +        }                                                 \
>
> If we are in an iothread already, we never dereference wait,

No, the name and documentation for aio_context_in_iothread() is
misleading. It actually means "does this AioContext belong to the
current thread?", which is more general than just the IOThread case.

aio_context_in_iothread() returns true when:

1. We are the IOThread that owns ctx. <-- the case you thought of

2. We are the main loop and ctx == qemu_get_aio_context().
   ^--- the sneaky case that BDRV_POLL_WHILE() has always relied on

> > +    } else {                                              \
> > +        assert(qemu_get_current_aio_context() ==          \
> > +               qemu_get_aio_context());                   \
> > +        assert(!wait_->need_kick);                        \
>
> but if we are in the main loop, wait must be non-NULL.

The else statement only handles the case where the main loop is waiting
for an IOThread AioContext.

s/in the main loop/in the main loop and ctx is an IOThread AioContext/

> > +++ b/include/block/block.h
> > @@ -2,6 +2,7 @@
> >  #define BLOCK_H
> >
> >  #include "block/aio.h"
> > +#include "block/aio-wait.h"
> >  #include "qapi-types.h"
> >  #include "qemu/iov.h"
> >  #include "qemu/coroutine.h"
> > @@ -367,41 +368,14 @@ void bdrv_drain_all_begin(void);
> >  void bdrv_drain_all_end(void);
> >  void bdrv_drain_all(void);
> >
> > +/* Returns NULL when bs == NULL */
> > +AioWait *bdrv_get_aio_wait(BlockDriverState *bs);
>
> This can return NULL, so it is only ever safe to use in an iothread;
> because if it is used in the main loop,...
>
> > +
> >  #define BDRV_POLL_WHILE(bs, cond) ({                      \
> > +    AIO_WAIT_WHILE(bdrv_get_aio_wait(bs_),                \
> > +                   bdrv_get_aio_context(bs_),             \
> > +                   cond); })
>
> ...we can pass NULL as the wait parameter, which will crash.

It won't crash, since if (aio_context_in_iothread(ctx_)) will take the
true case when bs_ == NULL.

> > +++ b/block/io.c
> >  void bdrv_wakeup(BlockDriverState *bs)
> >  {
> > -    /* The barrier (or an atomic op) is in the caller. */
> > -    if (atomic_read(&bs->wakeup)) {
> > -        aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
> > -    }
> > +    aio_wait_kick(bdrv_get_aio_wait(bs));
>
> this is another case where passing NULL...

bdrv_wakeup() is only called when bs != NULL.

I hope this explains things! The main issue that raised these questions
was that aio_context_in_iothread() has a misleading name. Shall we
rename it?

I'm having a hard time picking a new name because it must not be
confused with AioContext acquire/release, which doesn't influence the
"native" AioContext that the current thread has an affinity with.
Re: [Qemu-devel] [PATCH v2 1/4] block: extract AIO_WAIT_WHILE() from BlockDriverState
On 02/13/2018 08:20 AM, Stefan Hajnoczi wrote:
> BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
> activity while a condition evaluates to true. This is used to implement
> synchronous operations where it acts as a condvar between the IOThread
> running the operation and the main loop waiting for the operation. It
> can also be called from the thread that owns the AioContext and in that
> case it's just a nested event loop.
>
> BlockBackend needs this behavior but doesn't always have a
> BlockDriverState it can use. This patch extracts BDRV_POLL_WHILE() into
> the AioWait abstraction, which can be used with AioContext and isn't
> tied to BlockDriverState anymore.
>
> This feature could be built directly into AioContext but then all users
> would kick the event loop even if they signal different conditions.
> Imagine an AioContext with many BlockDriverStates: each time a request
> completes, any waiter would wake up and re-check its condition. It's
> nicer to keep a separate AioWait object for each condition instead.
>
> Please see "block/aio-wait.h" for details on the API.
>
> The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE()
> and AioContext polling.
>
> Signed-off-by: Stefan Hajnoczi
> ---

Trying to understand here:

> +#define AIO_WAIT_WHILE(wait, ctx, cond) ({                \
> +    bool waited_ = false;                                 \
> +    bool busy_ = true;                                    \
> +    AioWait *wait_ = (wait);                              \
> +    AioContext *ctx_ = (ctx);                             \
> +    if (aio_context_in_iothread(ctx_)) {                  \
> +        while ((cond) || busy_) {                         \
> +            busy_ = aio_poll(ctx_, (cond));               \
> +            waited_ |= !!(cond) | busy_;                  \
> +        }                                                 \

If we are in an iothread already, we never dereference wait,

> +    } else {                                              \
> +        assert(qemu_get_current_aio_context() ==          \
> +               qemu_get_aio_context());                   \
> +        assert(!wait_->need_kick);                        \

but if we are in the main loop, wait must be non-NULL.

> +++ b/include/block/block.h
> @@ -2,6 +2,7 @@
>  #define BLOCK_H
>
>  #include "block/aio.h"
> +#include "block/aio-wait.h"
>  #include "qapi-types.h"
>  #include "qemu/iov.h"
>  #include "qemu/coroutine.h"
> @@ -367,41 +368,14 @@ void bdrv_drain_all_begin(void);
>  void bdrv_drain_all_end(void);
>  void bdrv_drain_all(void);
>
> +/* Returns NULL when bs == NULL */
> +AioWait *bdrv_get_aio_wait(BlockDriverState *bs);

This can return NULL, so it is only ever safe to use in an iothread,
because if it is used in the main loop,...

> +
>  #define BDRV_POLL_WHILE(bs, cond) ({                      \
> +    AIO_WAIT_WHILE(bdrv_get_aio_wait(bs_),                \
> +                   bdrv_get_aio_context(bs_),             \
> +                   cond); })

...we can pass NULL as the wait parameter, which will crash.

> +++ b/block.c
> @@ -4716,6 +4716,11 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
>      return bs->aio_context;
>  }
>
> +AioWait *bdrv_get_aio_wait(BlockDriverState *bs)
> +{
> +    return bs ? &bs->wait : NULL;
> +}

So, do we need documentation to that fact? Also,

> +++ b/block/io.c
>  void bdrv_wakeup(BlockDriverState *bs)
>  {
> -    /* The barrier (or an atomic op) is in the caller. */
> -    if (atomic_read(&bs->wakeup)) {
> -        aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
> -    }
> +    aio_wait_kick(bdrv_get_aio_wait(bs));

this is another case where passing NULL...

> +++ b/util/aio-wait.c
> +void aio_wait_kick(AioWait *wait)
> +{
> +    /* The barrier (or an atomic op) is in the caller. */
> +    if (atomic_read(&wait->need_kick)) {

...is bad. Does that mean bdrv_wakeup() can only be called when bs is
non-NULL? Does that need documentation?

It may be that your patch is correct (as I'm not an expert on the rules
in play here), but more comments may help. Or you may have a NULL
dereference bug lurking. So at this point, I can't give R-b, even though
the refactoring of the BDRV_POLL_WHILE() macro into a separate helper
makes sense from the high level view.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org