Re: [Qemu-block] Transactions, Jobs, and Cancellation

2016-09-30 Thread John Snow



On 09/30/2016 02:30 PM, Eric Blake wrote:

On 09/30/2016 11:35 AM, John Snow wrote:

Hi Eric (as a proxy for Libvirt);

I want to make a change to transactions such that they do not actually
start the jobs until the entire transaction is error-checked for validity.

This would be a change from the current setup where:

- Some jobs are started
- One job cannot start
- Existing jobs are cancelled, emitting job events
- QMP transaction returns failure.


In this case, the number of events emitted is less than the number of
events comprising the overall transaction, right?



Yes, because some may fail to execute %jobtype%_start(), and the 
transaction halts and returns ERROR. Some jobs may have already started, 
and those are canceled, producing an event.




to something more like:

- Some jobs are queued to start
- One job cannot start
- Existing queued jobs are un-created
- No events are emitted, but the QMP transaction fails.


Is the failure guaranteed to be synchronous to the QMP command that
requested the 'transaction' command?  If so, I think you're fine: in
general, libvirt is looking for events (and presumably N events for a
'transaction' with N components) _only_ if the 'transaction' command
itself succeeded; but if the 'transaction' command fails, then there is
no expectation of events.



Yep, the synchronous case -- before we return any info on the QMP 
transaction at all.



If the failure is asynchronous (and can happen even after 'transaction'
returns success), then it gets a bit weirder; but libvirt still tries to
operate on the principle that events are best-effort notifications and
may be missed, so as long as it is always possible to poll QMP and learn
the same information as would have been presented in the omitted events,
we are probably still okay.



Yeah, no changes to the async case, regardless of the value of the 
trasnactional-cancel property.




If I understand correctly, it's possible for the transaction to fail
even after it has started several jobs because they begin operating
during the prepare phase.

Then, because the transaction preparation has failed, QEMU will cancel
the transaction instead of proceeding, which will generate some
BLOCK_JOB_CANCELLED events.

This may affect libvirt and others if I change these semantics.

Does that sound appropriate to you, or would you from a libvirt
perspective RATHER get events for "uncreated" jobs like you do now? I
could make it do either, but I'd rather prefer to simply not emit events
for jobs that didn't truly never start.


Avoiding events for jobs that never start makes sense if the transaction
itself returns failure.



I can also begin emitting events for jobs that *actually* start if it
would help to disambiguate the cases between old and new transactions.

Note: This has nothing to do with the transactional-cancel property,
which only impacts what happens when jobs created by a transaction fail
AFTER a successful return from qmp_transaction.


Ah, so it sounds like this is ALL about the synchronous case (that is,
the old behavior was that _even_ when 'transaction' fails, some events
were possibly leaked for < N portions of the transaction, if the earlier
portions were started to the point that they could emit a cancelled
event as a result of the later portions never starting).  So I think
your idea makes sense.



Great!

--js



Re: [Qemu-block] Transactions, Jobs, and Cancellation

2016-09-30 Thread Eric Blake
On 09/30/2016 11:35 AM, John Snow wrote:
> Hi Eric (as a proxy for Libvirt);
> 
> I want to make a change to transactions such that they do not actually
> start the jobs until the entire transaction is error-checked for validity.
> 
> This would be a change from the current setup where:
> 
> - Some jobs are started
> - One job cannot start
> - Existing jobs are cancelled, emitting job events
> - QMP transaction returns failure.

In this case, the number of events emitted is less than the number of
events comprising the overall transaction, right?

> 
> to something more like:
> 
> - Some jobs are queued to start
> - One job cannot start
> - Existing queued jobs are un-created
> - No events are emitted, but the QMP transaction fails.

Is the failure guaranteed to be synchronous to the QMP command that
requested the 'transaction' command?  If so, I think you're fine: in
general, libvirt is looking for events (and presumably N events for a
'transaction' with N components) _only_ if the 'transaction' command
itself succeeded; but if the 'transaction' command fails, then there is
no expectation of events.

If the failure is asynchronous (and can happen even after 'transaction'
returns success), then it gets a bit weirder; but libvirt still tries to
operate on the principle that events are best-effort notifications and
may be missed, so as long as it is always possible to poll QMP and learn
the same information as would have been presented in the omitted events,
we are probably still okay.

> 
> If I understand correctly, it's possible for the transaction to fail
> even after it has started several jobs because they begin operating
> during the prepare phase.
> 
> Then, because the transaction preparation has failed, QEMU will cancel
> the transaction instead of proceeding, which will generate some
> BLOCK_JOB_CANCELLED events.
> 
> This may affect libvirt and others if I change these semantics.
> 
> Does that sound appropriate to you, or would you from a libvirt
> perspective RATHER get events for "uncreated" jobs like you do now? I
> could make it do either, but I'd rather prefer to simply not emit events
> for jobs that didn't truly never start.

Avoiding events for jobs that never start makes sense if the transaction
itself returns failure.

> 
> I can also begin emitting events for jobs that *actually* start if it
> would help to disambiguate the cases between old and new transactions.
> 
> Note: This has nothing to do with the transactional-cancel property,
> which only impacts what happens when jobs created by a transaction fail
> AFTER a successful return from qmp_transaction.

Ah, so it sounds like this is ALL about the synchronous case (that is,
the old behavior was that _even_ when 'transaction' fails, some events
were possibly leaked for < N portions of the transaction, if the earlier
portions were started to the point that they could emit a cancelled
event as a result of the later portions never starting).  So I think
your idea makes sense.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] Transactions, Jobs, and Cancellation

2016-09-30 Thread John Snow

Hi Eric (as a proxy for Libvirt);

I want to make a change to transactions such that they do not actually 
start the jobs until the entire transaction is error-checked for validity.


This would be a change from the current setup where:

- Some jobs are started
- One job cannot start
- Existing jobs are cancelled, emitting job events
- QMP transaction returns failure.

to something more like:

- Some jobs are queued to start
- One job cannot start
- Existing queued jobs are un-created
- No events are emitted, but the QMP transaction fails.

If I understand correctly, it's possible for the transaction to fail 
even after it has started several jobs because they begin operating 
during the prepare phase.


Then, because the transaction preparation has failed, QEMU will cancel 
the transaction instead of proceeding, which will generate some 
BLOCK_JOB_CANCELLED events.


This may affect libvirt and others if I change these semantics.

Does that sound appropriate to you, or would you from a libvirt 
perspective RATHER get events for "uncreated" jobs like you do now? I 
could make it do either, but I'd rather prefer to simply not emit events 
for jobs that didn't truly never start.


I can also begin emitting events for jobs that *actually* start if it 
would help to disambiguate the cases between old and new transactions.


Note: This has nothing to do with the transactional-cancel property, 
which only impacts what happens when jobs created by a transaction fail 
AFTER a successful return from qmp_transaction.


--js