Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-03-06 Thread John Snow


On 02/28/2018 12:04 PM, Kevin Wolf wrote:
> Am 24.02.2018 um 00:51 hat John Snow geschrieben:
>> Some jobs upon finalization may need to perform some work that can
>> still fail. If these jobs are part of a transaction, it's important
>> that these callbacks fail the entire transaction.
>>
>> We allow for a new callback in addition to commit/abort/clean that
>> allows us the opportunity to have fairly late-breaking failures
>> in the transactional process.
>>
>> The expected flow is:
>>
>> - All jobs in a transaction converge to the WAITING state
>>   (added in a forthcoming commit)
>> - All jobs prepare to call either commit/abort
>> - If any job fails, is canceled, or fails preparation, all jobs
>>   call their .abort callback.
>> - All jobs enter the PENDING state, awaiting manual intervention
>>   (also added in a forthcoming commit)
>> - block-job-finalize is issued by the user/management layer
>> - All jobs call their commit callbacks.
>>
>> Signed-off-by: John Snow 
> 
> You almost made me believe the scary thought that we need transactional
> graph modifications, but after writing half of the reply, I think it's
> just that your order here is wrong.
> 

Sorry, yes, this blurb was outdated. I regret that it wasted your time.

> So .prepare is the last thing in the whole process that is allowed to
> fail. Graph manipulations such as bdrv_replace_node() can fail. Graph
> manipulations can also only be made in response to block-job-finalize
> because the management layer must be aware of them. Take them together
> and you have a problem.
> 
> Didn't we already establish earlier that .prepare/.commit/.abort must be
> called together and cannot be separated by waiting for a QMP command
> because of locking and things?
> 

Right; so what really happens is that in response to the FINALIZE verb,
the prepare loop is done first to check for success, and then commit or
abort are dispatched as appropriate.

> So if you go to PENDING first, then wait for block-job-finalize and only
> then call .prepare/.commit/.abort, we should be okay for both problems.
> 
> And taking a look at the final state, that seems to be what you do, so
> in the end, it's probably just the commit message that needs a fix.

Yep, sorry.

> 
>>  blockjob.c   | 34 +++---
>>  include/block/blockjob_int.h | 10 ++
>>  2 files changed, 41 insertions(+), 3 deletions(-)
>>
>> diff --git a/blockjob.c b/blockjob.c
>> index 8f02c03880..1c010ec100 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
>>  }
>>  }
>>  
>> +static int block_job_prepare(BlockJob *job)
>> +{
>> +if (job->ret) {
>> +goto out;
>> +}
>> +if (job->driver->prepare) {
>> +job->ret = job->driver->prepare(job);
>> +}
>> + out:
>> +return job->ret;
>> +}
> 
> Why not just if (job->ret == 0 && job->driver->prepare) and save the
> goto?
> 

Churn. ¯\_(ツ)_/¯

> Kevin
> 



Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-28 Thread Kevin Wolf
Am 24.02.2018 um 00:51 hat John Snow geschrieben:
> Some jobs upon finalization may need to perform some work that can
> still fail. If these jobs are part of a transaction, it's important
> that these callbacks fail the entire transaction.
> 
> We allow for a new callback in addition to commit/abort/clean that
> allows us the opportunity to have fairly late-breaking failures
> in the transactional process.
> 
> The expected flow is:
> 
> - All jobs in a transaction converge to the WAITING state
>   (added in a forthcoming commit)
> - All jobs prepare to call either commit/abort
> - If any job fails, is canceled, or fails preparation, all jobs
>   call their .abort callback.
> - All jobs enter the PENDING state, awaiting manual intervention
>   (also added in a forthcoming commit)
> - block-job-finalize is issued by the user/management layer
> - All jobs call their commit callbacks.
> 
> Signed-off-by: John Snow 

You almost made me believe the scary thought that we need transactional
graph modifications, but after writing half of the reply, I think it's
just that your order here is wrong.

So .prepare is the last thing in the whole process that is allowed to
fail. Graph manipulations such as bdrv_replace_node() can fail. Graph
manipulations can also only be made in response to block-job-finalize
because the management layer must be aware of them. Take them together
and you have a problem.

Didn't we already establish earlier that .prepare/.commit/.abort must be
called together and cannot be separated by waiting for a QMP command
because of locking and things?

So if you go to PENDING first, then wait for block-job-finalize and only
then call .prepare/.commit/.abort, we should be okay for both problems.

And taking a look at the final state, that seems to be what you do, so
in the end, it's probably just the commit message that needs a fix.

>  blockjob.c   | 34 +++---
>  include/block/blockjob_int.h | 10 ++
>  2 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 8f02c03880..1c010ec100 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
>  }
>  }
>  
> +static int block_job_prepare(BlockJob *job)
> +{
> +if (job->ret) {
> +goto out;
> +}
> +if (job->driver->prepare) {
> +job->ret = job->driver->prepare(job);
> +}
> + out:
> +return job->ret;
> +}

Why not just if (job->ret == 0 && job->driver->prepare) and save the
goto?

Kevin



Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-27 Thread John Snow


On 02/27/2018 02:56 PM, Eric Blake wrote:
> On 02/23/2018 05:51 PM, John Snow wrote:
>> Some jobs upon finalization may need to perform some work that can
>> still fail. If these jobs are part of a transaction, it's important
>> that these callbacks fail the entire transaction.
>>
>> We allow for a new callback in addition to commit/abort/clean that
>> allows us the opportunity to have fairly late-breaking failures
>> in the transactional process.
>>
>> The expected flow is:
>>
>> - All jobs in a transaction converge to the WAITING state
>>    (added in a forthcoming commit)
>> - All jobs prepare to call either commit/abort
>> - If any job fails, is canceled, or fails preparation, all jobs
>>    call their .abort callback.
>> - All jobs enter the PENDING state, awaiting manual intervention
>>    (also added in a forthcoming commit)
>> - block-job-finalize is issued by the user/management layer
>> - All jobs call their commit callbacks.
>>
>> Signed-off-by: John Snow 
>> ---
>>   blockjob.c   | 34 +++---
>>   include/block/blockjob_int.h | 10 ++
>>   2 files changed, 41 insertions(+), 3 deletions(-)
>>
> 
>> @@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
>>   job->cancelled = true;
>>   }
>>   -static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
>> +static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
>>   {
>>   AioContext *ctx;
>>   BlockJob *job, *next;
>> +    int rc;
>>     QLIST_FOREACH_SAFE(job, >jobs, txn_list, next) {
>>   ctx = blk_get_aio_context(job->blk);
>>   aio_context_acquire(ctx);
>> -    fn(job);
>> +    rc = fn(job);
>>   aio_context_release(ctx);
>> +    if (rc) {
>> +    break;
>> +    }
> 
> This short-circuits the application of the function to the rest of the
> group.  Is that ever going to be a problem?
> 

With what I've written, I don't think so -- but I can't guarantee
someone won't misunderstand the semantics of it and it will become a
problem. It is a potentially dangerous function in that way.

--js



Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-27 Thread Eric Blake

On 02/23/2018 05:51 PM, John Snow wrote:

Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the WAITING state
   (added in a forthcoming commit)
- All jobs prepare to call either commit/abort
- If any job fails, is canceled, or fails preparation, all jobs
   call their .abort callback.
- All jobs enter the PENDING state, awaiting manual intervention
   (also added in a forthcoming commit)
- block-job-finalize is issued by the user/management layer
- All jobs call their commit callbacks.

Signed-off-by: John Snow 
---
  blockjob.c   | 34 +++---
  include/block/blockjob_int.h | 10 ++
  2 files changed, 41 insertions(+), 3 deletions(-)




@@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
  job->cancelled = true;
  }
  
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))

+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
  {
  AioContext *ctx;
  BlockJob *job, *next;
+int rc;
  
  QLIST_FOREACH_SAFE(job, >jobs, txn_list, next) {

  ctx = blk_get_aio_context(job->blk);
  aio_context_acquire(ctx);
-fn(job);
+rc = fn(job);
  aio_context_release(ctx);
+if (rc) {
+break;
+}


This short-circuits the application of the function to the rest of the 
group.  Is that ever going to be a problem?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-23 Thread John Snow
Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the WAITING state
  (added in a forthcoming commit)
- All jobs prepare to call either commit/abort
- If any job fails, is canceled, or fails preparation, all jobs
  call their .abort callback.
- All jobs enter the PENDING state, awaiting manual intervention
  (also added in a forthcoming commit)
- block-job-finalize is issued by the user/management layer
- All jobs call their commit callbacks.

Signed-off-by: John Snow 
---
 blockjob.c   | 34 +++---
 include/block/blockjob_int.h | 10 ++
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 8f02c03880..1c010ec100 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
 }
 }
 
+static int block_job_prepare(BlockJob *job)
+{
+if (job->ret) {
+goto out;
+}
+if (job->driver->prepare) {
+job->ret = job->driver->prepare(job);
+}
+ out:
+return job->ret;
+}
+
 static void block_job_commit(BlockJob *job)
 {
 assert(!job->ret);
@@ -417,7 +429,7 @@ static void block_job_clean(BlockJob *job)
 }
 }
 
-static void block_job_completed_single(BlockJob *job)
+static int block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
@@ -452,6 +464,7 @@ static void block_job_completed_single(BlockJob *job)
 block_job_txn_unref(job->txn);
 block_job_event_concluded(job);
 block_job_unref(job);
+return 0;
 }
 
 static void block_job_cancel_async(BlockJob *job)
@@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
 job->cancelled = true;
 }
 
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
 {
 AioContext *ctx;
 BlockJob *job, *next;
+int rc;
 
 QLIST_FOREACH_SAFE(job, >jobs, txn_list, next) {
 ctx = blk_get_aio_context(job->blk);
 aio_context_acquire(ctx);
-fn(job);
+rc = fn(job);
 aio_context_release(ctx);
+if (rc) {
+break;
+}
 }
+return rc;
 }
 
 static void block_job_do_dismiss(BlockJob *job)
@@ -567,6 +585,8 @@ static void block_job_completed_txn_success(BlockJob *job)
 {
 BlockJobTxn *txn = job->txn;
 BlockJob *other_job;
+int rc = 0;
+
 /*
  * Successful completion, see if there are other running jobs in this
  * txn.
@@ -576,6 +596,14 @@ static void block_job_completed_txn_success(BlockJob *job)
 return;
 }
 }
+
+/* Jobs may require some prep-work to complete without failure */
+rc = block_job_txn_apply(txn, block_job_prepare);
+if (rc) {
+block_job_completed_txn_abort(job);
+return;
+}
+
 /* We are the last completed job, commit the transaction. */
 block_job_txn_apply(txn, block_job_completed_single);
 }
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 259d49b32a..642adce68b 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -53,6 +53,16 @@ struct BlockJobDriver {
  */
 void (*complete)(BlockJob *job, Error **errp);
 
+/**
+ * If the callback is not NULL, prepare will be invoked when all the jobs
+ * belonging to the same transaction complete; or upon this job's 
completion
+ * if it is not in a transaction.
+ *
+ * This callback will not be invoked if the job has already failed.
+ * If it fails, abort and then clean will be called.
+ */
+int (*prepare)(BlockJob *job);
+
 /**
  * If the callback is not NULL, it will be invoked when all the jobs
  * belonging to the same transaction complete; or upon this job's
-- 
2.14.3