Re: [PATCH for-6.1? v2 3/7] job: @force parameter for job_cancel_sync{,_all}()

2021-08-03 Thread Kevin Wolf
Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:
> Callers should be able to specify whether they want job_cancel_sync() to
> force-cancel the job or not.
> 
> In fact, almost all invocations do not care about consistency of the
> result and just want the job to terminate as soon as possible, so they
> should pass force=true.  The replication block driver is the exception.
> 
> This changes some iotest outputs, because quitting qemu while a mirror
> job is active will now lead to it being cancelled instead of completed,
> which is what we want.  (Cancelling a READY mirror job with force=false
> may take an indefinite amount of time, which we do not want when
> quitting.  If users want consistent results, they must have all jobs be
> done before they quit qemu.)
> 
> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
> Signed-off-by: Max Reitz 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 

> diff --git a/job.c b/job.c
> index e7a5d28854..9e971d64cf 100644
> --- a/job.c
> +++ b/job.c
> @@ -763,7 +763,12 @@ static void job_completed_txn_abort(Job *job)
>  if (other_job != job) {
>  ctx = other_job->aio_context;
>  aio_context_acquire(ctx);
> -job_cancel_async(other_job, false);
> +/*
> + * This is a transaction: If one job failed, no result will 
> matter.
> + * Therefore, pass force=true to terminate all other jobs as 
> quickly
> + * as possible.
> + */
> +job_cancel_async(other_job, true);
>  aio_context_release(ctx);
>  }
>  }

Sneaking in a hunk that is unrelated to what the commit message
promises? How naughty! :-)

(But I guess the change makes sense.)

Kevin




[PATCH for-6.1? v2 3/7] job: @force parameter for job_cancel_sync{, _all}()

2021-07-26 Thread Max Reitz
Callers should be able to specify whether they want job_cancel_sync() to
force-cancel the job or not.

In fact, almost all invocations do not care about consistency of the
result and just want the job to terminate as soon as possible, so they
should pass force=true.  The replication block driver is the exception.

This changes some iotest outputs, because quitting qemu while a mirror
job is active will now lead to it being cancelled instead of completed,
which is what we want.  (Cancelling a READY mirror job with force=false
may take an indefinite amount of time, which we do not want when
quitting.  If users want consistent results, they must have all jobs be
done before they quit qemu.)

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 include/qemu/job.h| 10 ++---
 block/replication.c   |  4 +-
 blockdev.c|  4 +-
 job.c | 27 +---
 qemu-nbd.c|  2 +-
 softmmu/runstate.c|  2 +-
 storage-daemon/qemu-storage-daemon.c  |  2 +-
 tests/unit/test-block-iothread.c  |  2 +-
 tests/unit/test-blockjob.c|  2 +-
 tests/qemu-iotests/109.out| 60 +++
 tests/qemu-iotests/tests/qsd-jobs.out |  2 +-
 11 files changed, 61 insertions(+), 56 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 41162ed494..5e8edbc2c8 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -506,19 +506,19 @@ void job_user_cancel(Job *job, bool force, Error **errp);
 
 /**
  * Synchronously cancel the @job.  The completion callback is called
- * before the function returns.  The job may actually complete
- * instead of canceling itself; the circumstances under which this
- * happens depend on the kind of job that is active.
+ * before the function returns.  If @force is false, the job may
+ * actually complete instead of canceling itself; the circumstances
+ * under which this happens depend on the kind of job that is active.
  *
  * Returns the return value from the job if the job actually completed
  * during the call, or -ECANCELED if it was canceled.
  *
  * Callers must hold the AioContext lock of job->aio_context.
  */
-int job_cancel_sync(Job *job);
+int job_cancel_sync(Job *job, bool force);
 
 /** Synchronously cancels all jobs using job_cancel_sync(). */
-void job_cancel_sync_all(void);
+void job_cancel_sync_all(bool force);
 
 /**
  * @job: The job to be completed.
diff --git a/block/replication.c b/block/replication.c
index 32444b9a8f..e7a9327b12 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -149,7 +149,7 @@ static void replication_close(BlockDriverState *bs)
 if (s->stage == BLOCK_REPLICATION_FAILOVER) {
 commit_job = >commit_job->job;
 assert(commit_job->aio_context == qemu_get_current_aio_context());
-job_cancel_sync(commit_job);
+job_cancel_sync(commit_job, false);
 }
 
 if (s->mode == REPLICATION_MODE_SECONDARY) {
@@ -726,7 +726,7 @@ static void replication_stop(ReplicationState *rs, bool 
failover, Error **errp)
  * disk, secondary disk in backup_job_completed().
  */
 if (s->backup_job) {
-job_cancel_sync(>backup_job->job);
+job_cancel_sync(>backup_job->job, false);
 }
 
 if (!failover) {
diff --git a/blockdev.c b/blockdev.c
index 3d8ac368a1..aa95918c02 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1848,7 +1848,7 @@ static void drive_backup_abort(BlkActionState *common)
 aio_context = bdrv_get_aio_context(state->bs);
 aio_context_acquire(aio_context);
 
-job_cancel_sync(>job->job);
+job_cancel_sync(>job->job, true);
 
 aio_context_release(aio_context);
 }
@@ -1949,7 +1949,7 @@ static void blockdev_backup_abort(BlkActionState *common)
 aio_context = bdrv_get_aio_context(state->bs);
 aio_context_acquire(aio_context);
 
-job_cancel_sync(>job->job);
+job_cancel_sync(>job->job, true);
 
 aio_context_release(aio_context);
 }
diff --git a/job.c b/job.c
index e7a5d28854..9e971d64cf 100644
--- a/job.c
+++ b/job.c
@@ -763,7 +763,12 @@ static void job_completed_txn_abort(Job *job)
 if (other_job != job) {
 ctx = other_job->aio_context;
 aio_context_acquire(ctx);
-job_cancel_async(other_job, false);
+/*
+ * This is a transaction: If one job failed, no result will matter.
+ * Therefore, pass force=true to terminate all other jobs as 
quickly
+ * as possible.
+ */
+job_cancel_async(other_job, true);
 aio_context_release(ctx);
 }
 }
@@ -964,12 +969,24 @@ static void job_cancel_err(Job *job, Error **errp)
 job_cancel(job, false);
 }
 
-int job_cancel_sync(Job *job)
+/**
+ * Same as