> On Thu, Jan 06, 2022 at 05:20:57PM +0900, Hiroki Narukawa wrote:
> 
> Phil, thanks for notifying me.
> 
> > The coroutine pool size has been 64 for a long time; the rationale is
> > described in the commit message of c740ad92.
> > 
> > At that time, virtio-blk queue-size and num-queue were not configurable,
> > and the equivalent values were 128 and 1.
> > 
> > Coroutine pool size 64 was fine then.
> > 
> > Later, queue-size and num-queue became configurable, and the default
> > values were increased.
> > 
> > With the new defaults, a coroutine pool of size 64 is frequently
> > exhausted under random disk IO, and requests slow down.
> > 
> > This commit adjusts the coroutine pool size adaptively to the configured
> > values.
> > 
> > This commit keeps the base of 64: coroutines are not used only by block
> > devices, but that base is not much of a burden compared with the new
> > default pool size of 128 * vCPUs.
> > 
> > Signed-off-by: Hiroki Narukawa <hnaru...@yahoo-corp.jp>
> > ---
> >  hw/block/virtio-blk.c    |  3 +++
> >  include/qemu/coroutine.h |  5 +++++
> >  util/qemu-coroutine.c    | 15 +++++++++++----
> >  3 files changed, 19 insertions(+), 4 deletions(-)
> 
> Have you measured with QEMU 6.1 or later? Commit
> d7ddd0a1618a75b31dc308bb37365ce1da972154 ("linux-aio: limit the batch size 
> using `aio-max-batch` parameter") can hide this issue so it may not be 
> apparent in recent QEMU releases.
> 
> I like your approach better than what I tried recently (I ended up dropping 
> the patch from my queue because it doesn't handle coroutines created in one 
> thread and terminated in another thread correctly):
> https://patchew.org/QEMU/20210913153524.1190696-1-stefa...@redhat.com/

Yes, I measured with both QEMU 6.0 and QEMU 6.2, and both were affected by
the coroutine pool size. The two versions did not differ much in my
measurements.
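
For reference, the arithmetic behind the new sizing is simple. Assuming the
current virtio-blk defaults (queue-size=256 and num-queues equal to the vCPU
count), each device raises the limit by 128 * vCPUs on top of the 64 base,
roughly as in this sketch (illustrative helper, not code from the patch):

    /* Sketch of the pool sizing in this patch, assuming virtio-blk defaults
     * of queue-size=256 and num-queues equal to the vCPU count. */
    static unsigned int example_pool_batch_size(unsigned int vcpus)
    {
        const unsigned int base = 64;            /* POOL_INITIAL_BATCH_SIZE */
        const unsigned int queue_size = 256;     /* virtio-blk default */
        const unsigned int num_queues = vcpus;   /* virtio-blk default */

        /* one device adds num_queues * queue_size / 2 = 128 * vcpus */
        return base + num_queues * queue_size / 2;
    }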

> 
> > 
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index f139cd7cc9..726dbe14de 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -32,6 +32,7 @@
> >  #include "hw/virtio/virtio-bus.h"
> >  #include "migration/qemu-file-types.h"
> >  #include "hw/virtio/virtio-access.h"
> > +#include "qemu/coroutine.h"
> >  
> >  /* Config size before the discard support (hide associated config fields) */
> >  #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \
> > @@ -1222,6 +1223,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
> >      for (i = 0; i < conf->num_queues; i++) {
> >          virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output);
> >      }
> > +    qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size
> > +                                            / 2);
> 
> This over-provisions coroutine pools when IOThreads are configured, because 
> --device virtio-blk-pci,iothread=iothread2 will only submit I/O requests in 
> iothread2, for example. Other threads don't need to increase their limit.
> 
> However, I think it's okay to use this inexact approach. It's still better 
> than the current hardcoded 64 coroutine pool size.
> 
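
(For context, the IOThread configuration Stefan mentions corresponds roughly
to a command line like the following; "iothread2" and "drive0" are just
illustrative names:)

    -object iothread,id=iothread2 \
    -drive if=none,id=drive0,file=disk.img,format=raw \
    -device virtio-blk-pci,drive=drive0,iothread=iothread2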

Thank you for pointing that out. Considering your comment, it seems it would
be better if pool_batch_size were a thread-local variable, but I couldn't
find a way to initialize the pool size from the same thread as the caller.
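
If it were made thread-local, the variable and the grow function would
presumably look something like the sketch below, mirroring the existing
__thread alloc_pool; the open problem is that virtio_blk_device_realize()
does not run in the IOThread that later creates the coroutines (rough
sketch only, not a tested change):

    /* hypothetical thread-local variant, not part of this patch */
    static __thread unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE;

    void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size)
    {
        /* only affects the calling thread, so it would have to be called
         * from the thread that actually submits the requests */
        pool_batch_size += additional_pool_size;
    }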

> >      virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
> >      if (err != NULL) {
> >          error_propagate(errp, err);
> > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> > index 4829ff373d..e52ed76ab2 100644
> > --- a/include/qemu/coroutine.h
> > +++ b/include/qemu/coroutine.h
> > @@ -331,6 +331,11 @@ void qemu_co_sleep_wake(QemuCoSleep *w);
> >   */
> >  void coroutine_fn yield_until_fd_readable(int fd);
> >  
> > +/**
> > + * Increase coroutine pool size
> > + */
> > +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size);
> > +
> >  #include "qemu/lockable.h"
> >  
> >  #endif /* QEMU_COROUTINE_H */
> > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> > index 38fb6d3084..080a1e0126 100644
> > --- a/util/qemu-coroutine.c
> > +++ b/util/qemu-coroutine.c
> > @@ -20,12 +20,14 @@
> >  #include "qemu/coroutine_int.h"
> >  #include "block/aio.h"
> >  
> > +/** Initial batch size is 64, and is increased on demand */
> >  enum {
> > -    POOL_BATCH_SIZE = 64,
> > +    POOL_INITIAL_BATCH_SIZE = 64,
> >  };
> >  
> >  /** Free list to speed up creation */
> >  static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
> > +static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE;
> >  static unsigned int release_pool_size;
> >  static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
> >  static __thread unsigned int alloc_pool_size;
> > @@ -49,7 +51,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
> >      if (CONFIG_COROUTINE_POOL) {
> >          co = QSLIST_FIRST(&alloc_pool);
> >          if (!co) {
> > -            if (release_pool_size > POOL_BATCH_SIZE) {
> > +            if (release_pool_size > pool_batch_size) {
> >                  /* Slow path; a good place to register the destructor, too.  */
> >                  if (!coroutine_pool_cleanup_notifier.notify) {
> >                      coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
> > @@ -86,12 +88,12 @@ static void coroutine_delete(Coroutine *co)
> >      co->caller = NULL;
> >  
> >      if (CONFIG_COROUTINE_POOL) {
> > -        if (release_pool_size < POOL_BATCH_SIZE * 2) {
> > +        if (release_pool_size < pool_batch_size * 2) {
> >              QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next);
> >              qatomic_inc(&release_pool_size);
> >              return;
> >          }
> > -        if (alloc_pool_size < POOL_BATCH_SIZE) {
> > +        if (alloc_pool_size < pool_batch_size) {
> >              QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
> >              alloc_pool_size++;
> >              return;
> > @@ -202,3 +204,8 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co)
> >  {
> >      return co->ctx;
> >  }
> > +
> > +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size)
> > +{
> > +    qatomic_add(&pool_batch_size, additional_pool_size);
> > +}
> 
> If qatomic_add() is used to modify pool_batch_size then qatomic_read() should 
> be used for loads. At a minimum it serves as documentation that this is an 
> atomic variable.
>

Fixed this in patch v2 and resent it.
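
For illustration, the loads now go through qatomic_read(), pairing with the
qatomic_add() in qemu_coroutine_increase_pool_batch_size(), roughly like this
(sketch, not the verbatim v2 hunks):

    if (release_pool_size > qatomic_read(&pool_batch_size)) { ... }
    if (release_pool_size < qatomic_read(&pool_batch_size) * 2) { ... }
    if (alloc_pool_size < qatomic_read(&pool_batch_size)) { ... }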
