On Tue, Apr 26, 2016 at 11:54:19AM +0100, Stefan Hajnoczi wrote:
> On Fri, Apr 15, 2016 at 01:31:55PM +0200, Paolo Bonzini wrote:
> > [this time including the mailing list]
> >
> > This is yet another tiny bit of the multiqueue work, this time affecting
> > the synchronization infrastructure for coroutines. Currently, coroutines
> > synchronize between the main I/O thread and the dataplane iothread through
> > the AioContext lock. However, for multiqueue a single BDS will be used
> > by multiple iothreads and hence multiple AioContexts. This calls for
> > a different approach to coroutine synchronization. This series is my
> > first attempt at it. Besides multiqueue, I think throttling groups
> > (which can already span multiple AioContexts) could also take advantage
> > of the new CoMutexes.
> >
> > The series has two main parts:
> >
> > - it makes CoMutex bind coroutines to an AioContext. It is of course
> > still possible to move coroutines around with explicit yield and
> > qemu_coroutine_enter, but a CoMutex will now reenter the coroutine
> > where it was running before, rather than in the AioContext of
> > the previous owner. To do this, a new function aio_co_schedule is
> > introduced to run a coroutine on a given iothread. I think this could
> > be used in other places too; for now it lets a CoMutex protect stuff
> > across multiple AioContexts without them moving around(*). Of course
> > currently a CoMutex is generally not used across multiple iothreads,
> > because you have to acquire/release the AioContext around CoMutex
> > critical sections. However...
> >
> > - ... the second change is exactly to make CoMutex thread-safe and remove
> > the need for a "thread-based" mutex around it. The new CoMutex is
> > exactly the same as a mutex implementation that you'd find in an
> > operating system. iothreads are the moral equivalent of CPUs in
> > a kernel, while coroutines resemble kernel threads running without
> > preemption on a CPU. Even if you have to take concurrency into account,
> > the lack of preemption while running coroutines or bottom halves
> > keeps the complexity at bay. For example, it is easy to synchronize
> > between qemu_co_mutex_lock's yield and the qemu_coroutine_enter in
> > aio_co_schedule's bottom half.
> >
> > Same as before, CoMutex puts coroutines to sleep with
> > qemu_coroutine_yield and wakes them up with the new aio_co_schedule.
> > I could have wrapped CoMutex's CoQueue with a "regular" thread mutex or
> > spinlock. The resulting code would have looked a lot like RFifoLock
> > (with CoQueue replacing RFifoLock's condition variable). Rather,
> > inspired by the parallel between coroutines and non-preemptive kernel
> > threads, I chose to adopt the same lock-free mutex algorithm as OSv.
> > The algorithm only needs two to four atomic ops for a lock-unlock pair
> > (two when uncontended). To cover CoQueue, each CoQueue is made to
> > depend on a CoMutex, similar to condition variables. Most CoQueues
> > already have a corresponding CoMutex so this is not a big deal;
> > converting the others is left for a future series. I did this because
> > CoQueue looks a lot like OSv's waitqueues; so if necessary, we could
> > even take OSv's support for wait morphing (which avoids the thundering
> > herd problem) and add it to CoMutex and CoQueue. This may be useful
> > when making tracked_requests thread-safe.
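
For readers following along: the CoQueue-plus-CoMutex pairing described above
is essentially the pthread condition variable pattern. Here is a minimal
sketch of how I imagine callers will look once CoQueue takes a mutex argument;
the two-argument qemu_co_queue_wait() signature and the Resource type are my
assumptions for illustration, not code taken from the series:

    #include <stdbool.h>
    #include "qemu/coroutine.h"

    /* Hypothetical resource guarded by a CoMutex, with a CoQueue used
     * like a condition variable by coroutines that find it busy.
     */
    typedef struct Resource {
        CoMutex lock;
        CoQueue waiters;
        bool busy;
    } Resource;

    static void coroutine_fn resource_acquire(Resource *r)
    {
        qemu_co_mutex_lock(&r->lock);
        while (r->busy) {
            /* Assumed semantics: sleep with the mutex released and wake
             * up with it held again, like pthread_cond_wait().
             */
            qemu_co_queue_wait(&r->waiters, &r->lock);
        }
        r->busy = true;
        qemu_co_mutex_unlock(&r->lock);
    }

    static void coroutine_fn resource_release(Resource *r)
    {
        qemu_co_mutex_lock(&r->lock);
        r->busy = false;
        qemu_co_queue_next(&r->waiters);    /* wake one waiter */
        qemu_co_mutex_unlock(&r->lock);
    }

If that is the intended shape, it also explains why wait morphing becomes
possible later: the queue knows which mutex the woken coroutine will
immediately try to take.
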
> > Kevin: this has nothing to do with my old plan from Brno, and it's
> > possibly a lot closer to what you wanted. Your idea was to automatically
> > release the "thread mutex" when a coroutine yields; I think you should
> > be fine with not having a thread mutex at all!
> >
> > This will need some changes in the formats because, for multiqueue,
> > CoMutexes would need to be used like "real" thread mutexes. Code like
> > this:
> >
> >     ...
> >     qemu_co_mutex_unlock()
> >     ... /* still access shared data, but don't yield */
> >     qemu_coroutine_yield()
> >
> > might be required to use this other pattern:
> >
> >     ... /* access shared data, but don't yield */
> >     qemu_co_mutex_unlock()
> >     qemu_coroutine_yield()
> >
> > because "adding a second CPU" is already introducing concurrency that
> > wasn't there before. The "non-preemptive multitasking" reference only
> > applies to things that access AioContext-local data. This includes the
> > synchronization primitives implemented in this series, the thread pool,
> > and the Linux AIO support, but not much else. It still simplifies _those_,
> > though. :)
> >
> > Anyhow, we'll always have some BlockDrivers that do not support multiqueue,
> > such as the network protocols. Thus it would be possible to handle the
> > formats one at a time. raw-posix, raw and qcow2 would already form a
> > pretty good set, and the first two do not use CoMutex at all.
> >
> > The patch has quite a lot of new code, but about half of it is testcases.
> > The new test covers correctness and (when run with --verbose) also takes a
> > stab at measuring performance; the result is that the performance of
> > CoMutex is comparable to that of pthread mutexes. The only snag is that
> > you cannot make a direct comparison between CoMutex (fair) and
> > pthread_mutex_t (unfair). For this reason the testcase also measures the
> > performance of a quick-and-dirty implementation of a fair mutex, based on
> > MCS locks + futexes.
> >
> > There's a lot of meat in the above text, and I hope it will make the code
> > clearer and compensate for the terse commit messages. :) I'll probably
> > write a single-threaded testcase too, just to provide some more unit
> > test comparison of "before" and "after".
> >
> > I haven't even started a guest with these patches, let alone run
> > qemu-iotests... generally the changes are well confined to unit-tested
> > code; patch 2, for example, is completely untested. There are a couple of
> > other places that at least need more comments, but I wanted to throw
> > the patch out for an early review of the general approach.
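
A side note on the fair-mutex comparison, since not everyone has MCS locks
paged in: waiters form a FIFO queue and each one waits on its own node, which
is what makes the lock fair. The sketch below only illustrates that idea with
C11 atomics and plain spinning; the actual testcase uses futexes to sleep and
certainly differs in detail:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct MCSNode {
        _Atomic(struct MCSNode *) next;
        atomic_bool locked;
    } MCSNode;

    typedef struct {
        _Atomic(MCSNode *) tail;   /* NULL when the lock is free */
    } MCSLock;

    static void mcs_lock(MCSLock *lock, MCSNode *node)
    {
        MCSNode *prev;

        atomic_store(&node->next, NULL);
        atomic_store(&node->locked, true);

        /* Join the tail of the waiter queue. */
        prev = atomic_exchange(&lock->tail, node);
        if (prev) {
            /* Lock is held: publish our node, then wait for the previous
             * owner to hand the lock over (a futex wait in the testcase).
             */
            atomic_store(&prev->next, node);
            while (atomic_load(&node->locked)) {
                /* spin */
            }
        }
    }

    static void mcs_unlock(MCSLock *lock, MCSNode *node)
    {
        MCSNode *next = atomic_load(&node->next);

        if (!next) {
            /* No visible successor: try to mark the lock free. */
            MCSNode *expected = node;
            if (atomic_compare_exchange_strong(&lock->tail, &expected, NULL)) {
                return;
            }
            /* A successor is still queuing up; wait until it appears. */
            while (!(next = atomic_load(&node->next))) {
                /* spin */
            }
        }
        /* Hand over in FIFO order. */
        atomic_store(&next->locked, false);
    }

The reason the comparison needs this at all is the one you give: CoMutex wakes
waiters in order while pthread_mutex_t does not, so a fair lock is the
apples-to-apples baseline.
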
> > Paolo Bonzini (11):
> >   coroutine: use QSIMPLEQ instead of QTAILQ
> >   throttle-groups: restart throttled requests from coroutine context
> >   coroutine: delete qemu_co_enter_next
> >   aio: introduce aio_co_schedule
> >   coroutine-lock: reschedule coroutine on the AioContext it was running on
> >   coroutine-lock: make CoMutex thread-safe
> >   coroutine-lock: add limited spinning to CoMutex
> >   test-aio-multithread: add performance comparison with thread-based mutexes
> >   coroutine-lock: place CoMutex before CoQueue in header
> >   coroutine-lock: add mutex argument to CoQueue APIs
> >   coroutine-lock: make CoRwlock thread-safe and fair
> >
> >  async.c                      |  38 ++++
> >  block/backup.c               |   2 +-
> >  block/io.c                   |   2 +-
> >  block/qcow2-cluster.c        |   4 +-
> >  block/sheepdog.c             |   2 +-
> >  block/throttle-groups.c      |  71 +++++--
> >  hw/9pfs/9p.c                 |   2 +-
> >  include/block/aio.h          |  14 ++
> >  include/qemu/atomic.h        |   3 +
> >  include/qemu/coroutine.h     | 106 ++++----
> >  include/qemu/coroutine_int.h |  14 +-
> >  iothread.c                   |  16 ++
> >  stubs/iothread.c             |  11 ++
> >  tests/Makefile               |  11 +-
> >  tests/iothread.c             | 107 ++++++++++
> >  tests/iothread.h             |  25 +++
> >  tests/test-aio-multithread.c | 452 +++++++++++++++++++++++++++++++++++++++++++
> >  trace-events                 |   8 +-
> >  util/qemu-coroutine-lock.c   | 257 ++++++++++++++++++++----
> >  util/qemu-coroutine.c        |   2 +-
> >  20 files changed, 1037 insertions(+), 110 deletions(-)
> >  create mode 100644 tests/iothread.c
> >  create mode 100644 tests/iothread.h
> >  create mode 100644 tests/test-aio-multithread.c
>
> Looks promising, modulo questions and comments I posted on patches.
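
One more note while I am at it: my reading of the new aio_co_schedule()
("aio: introduce aio_co_schedule" above) is that a coroutine can move itself
to another iothread with a schedule-then-yield pair, and that the CoMutex
wakeup path is the same two steps done on the sleeper's behalf. A tiny
sketch, with the signature assumed rather than copied from the patch:

    #include "qemu/coroutine.h"
    #include "block/aio.h"

    /* Hypothetical helper, not from the series: continue the calling
     * coroutine in the given AioContext.  Assumes
     * void aio_co_schedule(AioContext *ctx, Coroutine *co).
     */
    static void coroutine_fn co_move_to(AioContext *target)
    {
        /* Arrange to be reentered from a bottom half in 'target'... */
        aio_co_schedule(target, qemu_coroutine_self());

        /* ...and stop running here; the next time this coroutine runs,
         * it is in the target iothread.
         */
        qemu_coroutine_yield();
    }

If the real interface differs, ignore the details; the point is that
scheduling and yielding are separate steps, which is exactly where the
"easy to synchronize" claim in the cover letter has to hold.
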
I should add that I didn't review the internals of the new thread-safe CoMutex implementation due to time constraints. If you want me to think through the cases I can do it, but I'm happy to trust you and the OSv code.