When creating multiple VMs in parallel using virt-install, pool refresh
fails with:

  error: internal error: pool 'default' has asynchronous jobs running.

This happens because a concurrent volume creation holds an async job
on the pool. The storagePoolRefresh() function immediately returns
VIR_ERR_INTERNAL_ERROR instead of waiting for the async job to finish.

Patch 1 changes the error code from VIR_ERR_INTERNAL_ERROR to
VIR_ERR_OPERATION_INVALID in all four affected operations (pool-refresh,
pool-destroy, pool-undefine, pool-delete), making it consistent with the
adjacent "pool is not active" and "pool is starting up" checks.

Patch 2 adds a condition variable to virStoragePoolObj so that
storagePoolRefresh() waits up to 30 seconds for async jobs to drain
instead of failing immediately. The other three operations keep the
immediate error since waiting to destroy/delete/undefine a pool during
volume creation is not a sensible user workflow.

Reproducing the bug requires artificially widening the race window
since fallocate() completes in microseconds on local filesystems.
I used LD_PRELOAD to inject a 15-second delay into fallocate64()
for files in the pool directory. Details and validation results are
in the individual patch notes below the '---' line.
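
As a rough illustration of that kind of interposer, the sketch below
delays fallocate64() for files under one directory and then calls the
real function. It is not the exact shim used for validation: POOL_DIR,
DELAY_SECS and path_in_dir() are assumptions made for this example, and
the directory shown is only the usual location of the 'default' pool.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define POOL_DIR   "/var/lib/libvirt/images"  /* assumed pool target path */
#define DELAY_SECS 15                         /* delay before allocating */

/* Returns 1 if path lies below dir (dir followed by '/'), else 0. */
int path_in_dir(const char *path, const char *dir)
{
    size_t n = strlen(dir);

    return strncmp(path, dir, n) == 0 && path[n] == '/';
}

/* Interposed fallocate64(): sleep for files in POOL_DIR, then forward
 * the call to the next definition in the lookup order (libc). */
int fallocate64(int fd, int mode, off64_t offset, off64_t len)
{
    static int (*real)(int, int, off64_t, off64_t);
    char link[64];
    char path[PATH_MAX];
    ssize_t n;

    if (!real)
        real = (int (*)(int, int, off64_t, off64_t))
            dlsym(RTLD_NEXT, "fallocate64");
    if (!real) {
        errno = ENOSYS;
        return -1;
    }

    /* Resolve the descriptor back to a path via procfs (Linux only). */
    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
    n = readlink(link, path, sizeof(path) - 1);
    if (n > 0) {
        path[n] = '\0';
        if (path_in_dir(path, POOL_DIR))
            sleep(DELAY_SECS);
    }

    return real(fd, mode, offset, len);
}
```

Built as a shared object (for example with gcc -shared -fPIC, adding
-ldl on older glibc) and preloaded into the process that performs the
allocation, this keeps the async job alive long enough for a concurrent
pool refresh to hit the check.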

https://issues.redhat.com/browse/RHEL-150758

Lucas Amaral (2):
  storage: fix error code for async jobs check in pool operations
  storage: wait for async jobs to drain during pool refresh

 src/conf/virstorageobj.c     | 39 ++++++++++++++++++++++++++++++++++++
 src/conf/virstorageobj.h     |  3 +++
 src/libvirt_private.syms     |  1 +
 src/storage/storage_driver.c | 25 ++++++++++++++++-------
 4 files changed, 61 insertions(+), 7 deletions(-)

-- 
2.52.0
