On Thu, Jan 06, 2022 at 05:20:57PM +0900, Hiroki Narukawa wrote:
> Phil, thanks for notifying me.
>
> Coroutine pool size was 64 from long ago, and the basis was organized in
> the commit message in c740ad92.
>
> At that time, virtio-blk queue-size and num-queue were not configurable,
> and equivalent values were 128 and 1.
>
> Coroutine pool size 64 was fine then.
>
> Later queue-size and num-queue got configurable, and default values were
> increased.
>
> Coroutine pool with size 64 exhausts frequently with random disk IO in
> new size, and slows down.
>
> This commit adjusts coroutine pool size adaptively with new values.
>
> This commit adds 64 by default, but now coroutine is not only for block
> devices, and is not too much burden compared with the new default pool
> size of 128 * vCPUs.
>
> Signed-off-by: Hiroki Narukawa <hnaru...@yahoo-corp.jp>
> ---
>  hw/block/virtio-blk.c    |  3 +++
>  include/qemu/coroutine.h |  5 +++++
>  util/qemu-coroutine.c    | 15 +++++++++++----
>  3 files changed, 19 insertions(+), 4 deletions(-)

Have you measured with QEMU 6.1 or later? Commit
d7ddd0a1618a75b31dc308bb37365ce1da972154 ("linux-aio: limit the batch
size using `aio-max-batch` parameter") can hide this issue, so it may
not be apparent in recent QEMU releases.

I like your approach better than what I tried recently (I ended up
dropping the patch from my queue because it doesn't handle coroutines
created in one thread and terminated in another thread correctly):
https://patchew.org/QEMU/20210913153524.1190696-1-stefa...@redhat.com/

> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index f139cd7cc9..726dbe14de 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -32,6 +32,7 @@
>  #include "hw/virtio/virtio-bus.h"
>  #include "migration/qemu-file-types.h"
>  #include "hw/virtio/virtio-access.h"
> +#include "qemu/coroutine.h"
>
>  /* Config size before the discard support (hide associated config fields) */
>  #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \
> @@ -1222,6 +1223,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
>      for (i = 0; i < conf->num_queues; i++) {
>          virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output);
>      }
> +    qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size
> +                                            / 2);

This over-provisions coroutine pools when IOThreads are configured,
because --device virtio-blk-pci,iothread=iothread2 will only submit I/O
requests in iothread2, for example. Other threads don't need to
increase their limit.

However, I think it's okay to use this inexact approach. It's still
better than the current hardcoded 64 coroutine pool size.
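To put rough numbers on it: assuming the current defaults of
queue-size=256 and num-queues equal to the vCPU count (both assumptions
about a typical configuration, not something the patch enforces), each
virtio-blk device raises the limit by num_queues * queue_size / 2 =
128 * vCPUs, which matches the figure in the commit message. An
untested sketch of the arithmetic, just for illustration:

  /* Illustration only, not part of the patch: the pool limit after one
   * virtio-blk device is realized with assumed default settings.
   */
  static unsigned int example_pool_batch_size(unsigned int vcpus)
  {
      const unsigned int queue_size = 256;   /* assumed default queue-size */
      const unsigned int num_queues = vcpus; /* assumed default num-queues */

      /* initial 64 plus the per-device increase */
      return 64 + num_queues * queue_size / 2; /* e.g. 4 vCPUs -> 576 */
  }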
>      virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
>      if (err != NULL) {
>          error_propagate(errp, err);
> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> index 4829ff373d..e52ed76ab2 100644
> --- a/include/qemu/coroutine.h
> +++ b/include/qemu/coroutine.h
> @@ -331,6 +331,11 @@ void qemu_co_sleep_wake(QemuCoSleep *w);
>   */
>  void coroutine_fn yield_until_fd_readable(int fd);
>
> +/**
> + * Increase coroutine pool size
> + */
> +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size);
> +
>  #include "qemu/lockable.h"
>
>  #endif /* QEMU_COROUTINE_H */
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 38fb6d3084..080a1e0126 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -20,12 +20,14 @@
>  #include "qemu/coroutine_int.h"
>  #include "block/aio.h"
>
> +/** Initial batch size is 64, and is increased on demand */
>  enum {
> -    POOL_BATCH_SIZE = 64,
> +    POOL_INITIAL_BATCH_SIZE = 64,
>  };
>
>  /** Free list to speed up creation */
>  static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
> +static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE;
>  static unsigned int release_pool_size;
>  static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
>  static __thread unsigned int alloc_pool_size;
> @@ -49,7 +51,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
>      if (CONFIG_COROUTINE_POOL) {
>          co = QSLIST_FIRST(&alloc_pool);
>          if (!co) {
> -            if (release_pool_size > POOL_BATCH_SIZE) {
> +            if (release_pool_size > pool_batch_size) {
>                  /* Slow path; a good place to register the destructor, too.  */
>                  if (!coroutine_pool_cleanup_notifier.notify) {
>                      coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
> @@ -86,12 +88,12 @@ static void coroutine_delete(Coroutine *co)
>      co->caller = NULL;
>
>      if (CONFIG_COROUTINE_POOL) {
> -        if (release_pool_size < POOL_BATCH_SIZE * 2) {
> +        if (release_pool_size < pool_batch_size * 2) {
>              QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next);
>              qatomic_inc(&release_pool_size);
>              return;
>          }
> -        if (alloc_pool_size < POOL_BATCH_SIZE) {
> +        if (alloc_pool_size < pool_batch_size) {
>              QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
>              alloc_pool_size++;
>              return;
> @@ -202,3 +204,8 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
>  {
>      return co->ctx;
>  }
> +
> +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size)
> +{
> +    qatomic_add(&pool_batch_size, additional_pool_size);

If qatomic_add() is used to modify pool_batch_size, then qatomic_read()
should be used for loads. At a minimum it serves as documentation that
this is an atomic variable.
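For example (untested sketch, just to show where the loads live;
reading the value once into a local per call site would also work and
avoids re-reading it within one function):

  /* in qemu_coroutine_create() */
  if (release_pool_size > qatomic_read(&pool_batch_size)) {
      ...

  /* in coroutine_delete() */
  unsigned int batch_size = qatomic_read(&pool_batch_size);
  if (release_pool_size < batch_size * 2) {
      ...
  }
  if (alloc_pool_size < batch_size) {
      ...
  }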