On 10/05/2017 07:38 AM, Kevin Wolf wrote:
> Am 05.10.2017 um 03:46 hat John Snow geschrieben:
>> On 10/04/2017 02:27 PM, Kevin Wolf wrote:
>>> Am 04.10.2017 um 03:52 hat John Snow geschrieben:
>>>> For jobs that complete when a monitor isn't looking, there's no way to
>>>> tell what the job's final return code was. We need to allow jobs to
>>>> remain in the list until queried for reliable management.
>>>
>>> Just a short summary of what I discussed with John on IRC:
>>>
>>> Another important reason why we want to have an explicit end of block
>>> jobs is that job completion often makes changes to the graph. For a
>>> management tool that manages the block graph on a node level, it is a
>>> big problem if graph changes can happen at any point that can lead to
>>> bad race conditions. Giving the management tool control over the end of
>>> the block job makes it aware that graph changes happen.
>>>
>>> This means that compared to this RFC series, we need to move the waiting
>>> earlier in the process:
>>>
>>> 1. Block job is done and calls block_job_completed()
>>> 2. Wait for other block jobs in the same job transaction to complete
>>> 3. Send a (new) QMP event to the management tool to notify it that the
>>>    job is ready to be reaped
>>
>> Oh, I suppose to distinguish it from "COMPLETED" in that sense, because
>> it isn't actually COMPLETED anymore under your vision, so it requires a
>> new event in this proposal.
>>
>> This becomes a bit messy, bumping up against both "READY" and a
>> transactional pre-completed state semantically. Uh, for lack of a
>> better word in the timeframe I'd like to complete this email in, let's
>> call this new theoretical state "PENDING"?
>>
>> So presently, a job goes through the following life cycle:
>>
>> 1. CREATED --> RUNNING
>> 2. RUNNING <--> PAUSED
>> 3. RUNNING --> (READY | COMPLETED | CANCELED)
>> 4. READY --> (COMPLETED | CANCELED)
>> 5. (COMPLETED | CANCELED) --> NULL
>>
>> Where we emit an event upon entering "READY", "COMPLETED" or "CANCELED".
>
> Roughly yes, but it's not quite true because you can still pause and
> unpause ready jobs. So READY and PAUSED are kind of orthogonal.
>
>> My patchset here effectively adds a new optional terminal state:
>>
>> 5. (COMPLETED | CANCELED) --> (NULL | FINISHED)
>> 6. FINISHED --> NULL
>>
>> Where the last transition from FINISHED to NULL is performed via
>> block-job-reap, but notably we get to re-use the events for COMPLETED |
>> CANCELED to indicate the availability of this operation to be performed.
>>
>> What happens in the case of transactionally managed jobs presently is
>> that jobs get stuck as they enter the COMPLETED|CANCELED state. If you
>> were to query them they behave as if they're RUNNING. There's no
>> discrete state that exists for this presently.
>>
>> You can cancel these as normal, but I'm not sure if you can pause them,
>> actually. (Note to self: test that.) I think they behave almost exactly
>> like any RUNNING job would.
>
> Except that they don't do any work any more. This is an important
> difference for a mirror job which would normally keep copying new writes
> until it sends the COMPLETED event. So when libvirt restarts and it sees
> a "RUNNING" mirror job, it can't decide whether it is still copying
> things or has already completed.
>
> Looks like this is another reason why we want a separate state here.
>
>> What you're proposing here is the formalization of the pre-completion
>> state ("PENDING") and that in this state, a job outside of a transaction
>> can exist until it is manually told to finally, once and for all,
>> actually finish its business. We can use this as a hook to perform any
>> last graph changes so they will not come as a surprise to the management
>> application. Maybe this operation should be called "Finalize". Again,
>> for lack of a better term in the timeframe, I'll refer to it as such for
>> now.
>
> "finalize" doesn't sound too bad.
>
>> I think importantly this actually distinguishes it from "reap" in that
>> the commit phase can still fail, so we can't let the job follow that
>> auto transition back to the NULL state.
>
> Let me see if I understand correctly: We want to make sure that the
> management tool sees the final return value for the job. We have already
> decided that events aren't enough for this because the management tool
> could be restarted while we send the event, so the information is lost.
> Having it as a return value of block-job-reap isn't enough either
> because it could be lost the same way. We need a separate phase where
> libvirt can query the return value and from which we don't automatically
> transition away.
>
> I'm afraid that you are right.
>
>> That means that we'd need both a block-job-finalize AND a
>> block-job-reap to accomplish both of the following goals:
>>
>> (1) Allow the management application to control graph changes [libvirt]
>> (2) Prevent auto transitions to NULL state for asynchronous clients [A
>> requi