On 18/02/2018 19:20, Stefan Hajnoczi wrote:
> Paolo's patches have been getting us closer to multiqueue block layer
> support but there is a final set of changes required that has become
> clearer to me just recently.  I'm curious if this matches Paolo's
> vision and whether anyone else has comments.
>
> We need to push the AioContext lock down into BlockDriverState so that
> thread-safety is not tied to a single AioContext but to the
> BlockDriverState itself.  We also need to audit block layer code to
> identify places that assume everything is run from a single
> AioContext.

This is mostly done already.  Within BlockDriverState,
dirty_bitmap_mutex, reqs_lock and the BQL are good enough in many cases.
Drivers already have their own mutex.

> After this is done the final piece is to eliminate
> bdrv_set_aio_context().  BlockDriverStates should not be associated
> with an AioContext.  Instead they should use whichever AioContext they
> are invoked under.  The current thread's AioContext can be fetched
> using qemu_get_current_aio_context().  This is either the main loop
> AioContext or an IOThread AioContext.
>
> The .bdrv_attach/detach_aio_context() callbacks will no longer be
> necessary in a world where block driver code is thread-safe and any
> AioContext can be used.

This is not entirely possible.  In particular, network drivers still
have a "home context", which is where the file descriptor callbacks are
attached.  They could still dispatch I/O from any thread in a
multiqueue setup.  This is the remaining intermediate step between "no
AioContext lock" and "multiqueue".
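To make the "home context" distinction concrete, here is a minimal sketch in plain C.  It is not QEMU code: Ctx, NetDriver, Route, current_ctx and route_request() are all hypothetical names, and get_current_ctx() only mimics the idea behind qemu_get_current_aio_context().  The assumption is that a request is submitted from whichever context the calling thread runs, while fd readiness is still handled in the driver's fixed home context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event loop context; not QEMU's AioContext. */
typedef struct Ctx {
    const char *name;
} Ctx;

static Ctx main_ctx = { "main-loop" };

/* Each thread records the context it runs; NULL means "main loop". */
static _Thread_local Ctx *current_ctx;

static Ctx *get_current_ctx(void)
{
    return current_ctx ? current_ctx : &main_ctx;
}

/* Hypothetical network driver state with a fixed "home context". */
typedef struct NetDriver {
    Ctx *home_ctx;              /* where the fd callbacks stay registered */
} NetDriver;

/* Which context submits a request and which one sees its completion. */
typedef struct Route {
    Ctx *submit_ctx;
    Ctx *completion_ctx;
} Route;

static Route route_request(const NetDriver *drv)
{
    assert(drv->home_ctx != NULL);
    /* Submission follows the caller; completion follows the socket. */
    Route r = { get_current_ctx(), drv->home_ctx };
    return r;
}
```

A thread that has set current_ctx to its own context gets that context back as submit_ctx, while completion_ctx stays at the driver's home context regardless of who submitted.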

> bdrv_drain_all() and friends do not require extensive modifications
> because the bdrv_wakeup() mechanism already works properly when there
> are multiple IOThreads involved.

Yes, this is already done indeed.

> Block jobs no longer need to be in the same AioContext as the
> BlockDriverState.  For simplicity we may choose to always run them in
> the main loop AioContext by default.  This may have a performance
> impact on tight loops like bdrv_is_allocated() and the initial
> mirroring phase, but maybe not.
>
> The upshot of all this is that bdrv_set_aio_context() goes away while
> all block driver code needs to be more aware of thread-safety.  It can
> no longer assume that everything is called from one AioContext.


> We should optimize file-posix.c and qcow2.c for maximum parallelism
> using fine-grained locks and other techniques.  The remaining block
> drivers can use one CoMutex per BlockDriverState.

Even better: since there is one thread pool and linux-aio context per
I/O thread, file-posix.c should just submit I/O to the current thread
with no locking whatsoever.  There is still reqs_lock, but that can be
optimized easily (see
http://lists.gnu.org/archive/html/qemu-devel/2017-04/msg03323.html; now
that we have QemuLockable, reqs_lock could also just become a QemuSpin).
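As a rough illustration of the reqs_lock idea, the sketch below guards a list of in-flight requests with a tiny test-and-set spinlock built on C11 atomics.  None of this is QEMU code and it does not reproduce the patch linked above: SpinLock is only a stand-in for QemuSpin, and Device, Request, request_begin() and request_end() are hypothetical names.  The assumption is that the critical sections are a few pointer updates, which is what makes spinning acceptable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal test-and-set spinlock; a stand-in, not QEMU's QemuSpin. */
typedef struct SpinLock {
    atomic_flag locked;
} SpinLock;

static void spin_lock(SpinLock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked,
                                             memory_order_acquire)) {
        /* Busy-wait; only sensible because critical sections are tiny. */
    }
}

static void spin_unlock(SpinLock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Hypothetical tracked request, linked into a per-device list. */
typedef struct Request {
    struct Request *prev, *next;
} Request;

typedef struct Device {
    SpinLock reqs_lock;         /* protects head and in_flight */
    Request *head;
    size_t in_flight;
} Device;

static void device_init(Device *d)
{
    atomic_flag_clear(&d->reqs_lock.locked);
    d->head = NULL;
    d->in_flight = 0;
}

static void request_begin(Device *d, Request *r)
{
    spin_lock(&d->reqs_lock);
    r->prev = NULL;
    r->next = d->head;
    if (d->head) {
        d->head->prev = r;
    }
    d->head = r;
    d->in_flight++;
    spin_unlock(&d->reqs_lock);
}

static void request_end(Device *d, Request *r)
{
    spin_lock(&d->reqs_lock);
    assert(d->in_flight > 0);
    if (r->prev) {
        r->prev->next = r->next;
    } else {
        d->head = r->next;
    }
    if (r->next) {
        r->next->prev = r->prev;
    }
    d->in_flight--;
    spin_unlock(&d->reqs_lock);
}

static size_t requests_in_flight(Device *d)
{
    spin_lock(&d->reqs_lock);
    size_t n = d->in_flight;
    spin_unlock(&d->reqs_lock);
    return n;
}
```

The I/O submission itself would happen outside the lock, on the current thread's own thread pool or linux-aio context; only the bookkeeping is shared.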

qcow2.c could be adjusted to use rwlocks.

> I'm excited that we're relatively close to multiqueue now.  I don't
> want to jinx it by saying 2018 is the year of the multiqueue block
> layer, but I'll say it anyway :).

Heh.  I have stopped pushing my patches (and scratched a few itches with
patchew instead) because I'm still a bit burned out from recent KVM
stuff, but this may be the injection of enthusiasm that I needed. :)

Actually, I'd be content with removing the AioContext lock in the first
half of 2018.  A third of that is gone already (doh!), but we're
pretty close, thanks to you and all the others who have helped review
the past 100 or so patches!

