Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-03-06 Thread John Snow


On 02/28/2018 12:04 PM, Kevin Wolf wrote:
> Am 24.02.2018 um 00:51 hat John Snow geschrieben:
>> Some jobs upon finalization may need to perform some work that can
>> still fail. If these jobs are part of a transaction, it's important
>> that these callbacks fail the entire transaction.
>>
>> We allow for a new callback in addition to commit/abort/clean that
>> allows us the opportunity to have fairly late-breaking failures
>> in the transactional process.
>>
>> The expected flow is:
>>
>> - All jobs in a transaction converge to the WAITING state
>>   (added in a forthcoming commit)
>> - All jobs prepare to call either commit/abort
>> - If any job fails, is canceled, or fails preparation, all jobs
>>   call their .abort callback.
>> - All jobs enter the PENDING state, awaiting manual intervention
>>   (also added in a forthcoming commit)
>> - block-job-finalize is issued by the user/management layer
>> - All jobs call their commit callbacks.
>>
>> Signed-off-by: John Snow 
> 
> You almost made me believe the scary thought that we need transactional
> graph modifications, but after writing half of the reply, I think it's
> just that your order here is wrong.
> 

Sorry, yes, this blurb was outdated. I regret that it wasted your time.

> So .prepare is the last thing in the whole process that is allowed to
> fail. Graph manipulations such as bdrv_replace_node() can fail. Graph
> manipulations can also only be made in response to block-job-finalize
> because the management layer must be aware of them. Take them together
> and you have a problem.
> 
> Didn't we already establish earlier that .prepare/.commit/.abort must be
> called together and cannot be separated by waiting for a QMP command
> because of locking and things?
> 

Right; so what really happens is that in response to the FINALIZE verb,
the prepare loop is done first to check for success, and then commit or
abort are dispatched as appropriate.

> So if you go to PENDING first, then wait for block-job-finalize and only
> then call .prepare/.commit/.abort, we should be okay for both problems.
> 
> And taking a look at the final state, that seems to be what you do, so
> in the end, it's probably just the commit message that needs a fix.

Yep, sorry.

> 
>>  blockjob.c   | 34 +++---
>>  include/block/blockjob_int.h | 10 ++
>>  2 files changed, 41 insertions(+), 3 deletions(-)
>>
>> diff --git a/blockjob.c b/blockjob.c
>> index 8f02c03880..1c010ec100 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
>>  }
>>  }
>>  
>> +static int block_job_prepare(BlockJob *job)
>> +{
>> +if (job->ret) {
>> +goto out;
>> +}
>> +if (job->driver->prepare) {
>> +job->ret = job->driver->prepare(job);
>> +}
>> + out:
>> +return job->ret;
>> +}
> 
> Why not just if (job->ret == 0 && job->driver->prepare) and save the
> goto?
> 

Churn. ¯\_(ツ)_/¯

> Kevin
> 



Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-28 Thread Kevin Wolf
Am 24.02.2018 um 00:51 hat John Snow geschrieben:
> Some jobs upon finalization may need to perform some work that can
> still fail. If these jobs are part of a transaction, it's important
> that these callbacks fail the entire transaction.
> 
> We allow for a new callback in addition to commit/abort/clean that
> allows us the opportunity to have fairly late-breaking failures
> in the transactional process.
> 
> The expected flow is:
> 
> - All jobs in a transaction converge to the WAITING state
>   (added in a forthcoming commit)
> - All jobs prepare to call either commit/abort
> - If any job fails, is canceled, or fails preparation, all jobs
>   call their .abort callback.
> - All jobs enter the PENDING state, awaiting manual intervention
>   (also added in a forthcoming commit)
> - block-job-finalize is issued by the user/management layer
> - All jobs call their commit callbacks.
> 
> Signed-off-by: John Snow 

You almost made me believe the scary thought that we need transactional
graph modifications, but after writing half of the reply, I think it's
just that your order here is wrong.

So .prepare is the last thing in the whole process that is allowed to
fail. Graph manipulations such as bdrv_replace_node() can fail. Graph
manipulations can also only be made in response to block-job-finalize
because the management layer must be aware of them. Take them together
and you have a problem.

Didn't we already establish earlier that .prepare/.commit/.abort must be
called together and cannot be separated by waiting for a QMP command
because of locking and things?

So if you go to PENDING first, then wait for block-job-finalize and only
then call .prepare/.commit/.abort, we should be okay for both problems.

And taking a look at the final state, that seems to be what you do, so
in the end, it's probably just the commit message that needs a fix.

>  blockjob.c   | 34 +++---
>  include/block/blockjob_int.h | 10 ++
>  2 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 8f02c03880..1c010ec100 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
>  }
>  }
>  
> +static int block_job_prepare(BlockJob *job)
> +{
> +if (job->ret) {
> +goto out;
> +}
> +if (job->driver->prepare) {
> +job->ret = job->driver->prepare(job);
> +}
> + out:
> +return job->ret;
> +}

Why not just if (job->ret == 0 && job->driver->prepare) and save the
goto?

Kevin



Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-27 Thread John Snow


On 02/27/2018 02:56 PM, Eric Blake wrote:
> On 02/23/2018 05:51 PM, John Snow wrote:
>> Some jobs upon finalization may need to perform some work that can
>> still fail. If these jobs are part of a transaction, it's important
>> that these callbacks fail the entire transaction.
>>
>> We allow for a new callback in addition to commit/abort/clean that
>> allows us the opportunity to have fairly late-breaking failures
>> in the transactional process.
>>
>> The expected flow is:
>>
>> - All jobs in a transaction converge to the WAITING state
>>    (added in a forthcoming commit)
>> - All jobs prepare to call either commit/abort
>> - If any job fails, is canceled, or fails preparation, all jobs
>>    call their .abort callback.
>> - All jobs enter the PENDING state, awaiting manual intervention
>>    (also added in a forthcoming commit)
>> - block-job-finalize is issued by the user/management layer
>> - All jobs call their commit callbacks.
>>
>> Signed-off-by: John Snow 
>> ---
>>   blockjob.c   | 34 +++---
>>   include/block/blockjob_int.h | 10 ++
>>   2 files changed, 41 insertions(+), 3 deletions(-)
>>
> 
>> @@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
>>   job->cancelled = true;
>>   }
>>   -static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
>> +static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
>>   {
>>   AioContext *ctx;
>>   BlockJob *job, *next;
>> +    int rc;
>>     QLIST_FOREACH_SAFE(job, >jobs, txn_list, next) {
>>   ctx = blk_get_aio_context(job->blk);
>>   aio_context_acquire(ctx);
>> -    fn(job);
>> +    rc = fn(job);
>>   aio_context_release(ctx);
>> +    if (rc) {
>> +    break;
>> +    }
> 
> This short-circuits the application of the function to the rest of the
> group.  Is that ever going to be a problem?
> 

With what I've written, I don't think so -- but I can't guarantee
someone won't misunderstand the semantics of it and it will become a
problem. It is a potentially dangerous function in that way.

--js



Re: [Qemu-devel] [RFC v4 15/21] blockjobs: add prepare callback

2018-02-27 Thread Eric Blake

On 02/23/2018 05:51 PM, John Snow wrote:

Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the WAITING state
   (added in a forthcoming commit)
- All jobs prepare to call either commit/abort
- If any job fails, is canceled, or fails preparation, all jobs
   call their .abort callback.
- All jobs enter the PENDING state, awaiting manual intervention
   (also added in a forthcoming commit)
- block-job-finalize is issued by the user/management layer
- All jobs call their commit callbacks.

Signed-off-by: John Snow 
---
  blockjob.c   | 34 +++---
  include/block/blockjob_int.h | 10 ++
  2 files changed, 41 insertions(+), 3 deletions(-)




@@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
  job->cancelled = true;
  }
  
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))

+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
  {
  AioContext *ctx;
  BlockJob *job, *next;
+int rc;
  
  QLIST_FOREACH_SAFE(job, >jobs, txn_list, next) {

  ctx = blk_get_aio_context(job->blk);
  aio_context_acquire(ctx);
-fn(job);
+rc = fn(job);
  aio_context_release(ctx);
+if (rc) {
+break;
+}


This short-circuits the application of the function to the rest of the 
group.  Is that ever going to be a problem?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org