On 10/06/2016 06:44 PM, John Snow wrote:
On 10/05/2016 11:17 AM, Kevin Wolf wrote:
Am 01.10.2016 um 00:00 hat John Snow geschrieben:
Instead of automatically starting jobs at creation time via backup_start
et al, we'd like to return a job object pointer that can be started
manually at later point in time.
For now, add the block_job_start mechanism and start the jobs
automatically as we have been doing, with conversions job-by-job coming
in later patches.
Of note: cancellation of unstarted jobs will perform all the normal
cleanup as if the job had started, particularly abort and clean. The
only difference is that we will not emit any events, because the job
never actually started.
Signed-off-by: John Snow <js...@redhat.com>
Should we make sure that jobs are only added to the block_jobs list once
they are started? It doesn't sound like a good idea to make a job
without a coroutine user-accessible.
That would certainly help limit exposure to a potentially dangerous
object, but some operations on these un-started jobs are still perfectly
reasonable, like cancellation. Even things like "set speed" are
perfectly reasonable on an unstarted job.
I'd rather just verify that having an accessible job cannot be operated
on unsafely via the public interface, even though that's more work.
So here's the list:
-block_job_set_speed: Depends on the set_speed implementation, but
backup_set_speed makes no assumptions and that's the only job I am
attempting to convert in this series.
-block_job_cancel: Edited to specifically support pre-started jobs in
-block_job_complete: Edited to check for a coroutine specifically, but
even without, a job will not be able to become ready without running first.
-block_job_query: Harmless* (*I could probably expose a 'started'
variable so that libvirt didn't get confused as to why there are jobs
that exist but are not busy nor paused.)
-block_job_user_paused: Harmless, if meaningless.
-block_job_resume: **We will attempt to call block_job_enter, but
conditional on job->co, we do nothing, so it's harmless if not
misleading that you can pause/resume to your heart's content.
-block_job_user_resume: ^ http://i.imgur.com/2zYxrIe.png ^
-block_job_cancel_sync: Safe, due to edits to block_job_cancel. Polling
loop WILL complete as normal, because all jobs will finish through
block_job_completed one way or another.
-block_job_cancel_sync_all: As safe as the above.
-block_job_complete_sync: Safe, complete will return error for unstarted
-block_job_iostatus_reset: Harmless, I think -- backup does not
implement this method. (Huh, *no* jobs implement iostatus_reset at the
-block_job_txn_new: Doesn't operate on jobs.
-block_job_txn_unref: Also doesn't operate on jobs.
-block_job_txn_add_job: Safe and intended! There is trickery here,
though, as once a job is introduced into transactions it opens it up to
the private interface. This adds the following functions to considerations:
-block_job_completed: Safe, does not assume a coroutine anywhere.
-block_job_completed_single: Specifically updated to be context-aware of
if we are pre-started or not. This is the "real" completion mechanism
for BlockJobs that gets run for transactional OR individual jobs.
-block_job_completed_txn_abort: ***BUG***! The problem with the patch as
I've sent it is that cancel calls completed (under the premise that
nobody else would ever get to be able to), but we call both cancel AND
completed from this function, which will cause us to call completed
twice on pre-started jobs. I will fix this for the next version.
Actually, I was mistaken. The way I wrote it is fine, and it will work
qmp_transaction encounters an error during prepare and calls the abort
method for each action.
- The abort method calls block_job_cancel on the not-yet-started job,
- Which in turn calls block_job_completed with ret = -ECANCELED
which then either calls:
- block_job_completed_single with a failure mode (which calls .abort and
.clean) if we are not in a transaction, or
txn_abort is going to:
1) Set txn->aborting to true, and
2) Cancel every other job in the transaction using block_job_cancel_sync.
These jobs will also come to txn_abort, but since txn->aborting is true,
will become a NOP.
Then, finally, we will call block_job_completed_single on every job in
the transaction which will perform our proper cleanup by calling .abort
and .clean for each job.
No duplication, I was mistaken...!
-block_job_completed_txn_success: Should never be called; if it IS, the
presence of an unstarted job in the transaction will cause an early
return. And even if I am utterly wrong and every job in the transaction
completes successfully (somehow?) calling block_job_completed_single is
perfectly safe by design.
Everything else is either internal to block job instances or the
There may be:
(A) Bugs in my code/thinking, or
(B) Improvements we can make to the readability,
but I believe that this is (Apart from the mentioned bug) not dangerous.