Ok, let's discuss it on list.
On 27.09.2016 19:19, John Snow wrote:
Replying off-list to follow your lead, but this might be a good
discussion to have upstream where Stefan and Jeff can see it.
On 09/26/2016 11:53 AM, Vladimir Sementsov-Ogievskiy wrote:
Hi John!
What about this series? If you are working on this, I can tell you about
another related problem.
I got mired in trying to think about how to delete a BlockJob that
hasn't technically been started yet -- this patch series is unsafe in
that it will be unable to "cancel" a job that hasn't started yet.
(And I don't want to start a job just to cancel it, that's gross.)
I need to expand the job modifications here a little bit to make this
completely safe. I may need to create a job->started bool that allows
you to call a job_delete() function if and only if started is false.
I'll try to send this along this week.
I'm working on an async scheme for backup. I started with something like
just wrapping backup_do_cow in a coroutine and using this coroutine in
the full-backup loop instead of calling backup_do_cow directly. This
simple approach seems better than the one used in mirror: in mirror we
create a new coroutine for each async read and write, whereas here one
new coroutine is created per copy (= read + write). But then I decided
to rewrite this as a worker-coroutine scheme. I.e., we do not create a
new coroutine for each request and limit their count with
"while (job->inflight_requests >= MAX_IN_FLIGHT) { yield }"; instead we
create MAX_IN_FLIGHT worker coroutines at backup start, and then they
just copy cluster by cluster, using a common cop_bitmap to synchronize.
All cluster copying is done in the backup workers, and the
write-notifiers only change the order in which the workers copy
clusters. And I liked it all until I figured out the error handling.
Makes sense to me so far.
I can't handle errors in the workers, because I need to call
block_job_error_action, which may stop the block job, and it seems to be
a problem if several workers call it simultaneously. So, a worker has to
What kind of problems are you seeing? Maybe we can fix them. Is it for
the STOP case? (Or also for IGNORE/REPORT?)
Otherwise, just stash an error object in a location that the master
coroutine can find, pause/terminate all worker threads, and signal the
error.
defer error handling to the main coroutine of the block job (in my
solution this coroutine doesn't do any copying, but only handles
block-job stuff like pauses, stops, and throttling). So, we will have a
queue of workers waiting for the main coroutine to handle their errors.
This looks too complicated.
Really? I don't think it's too bad. The worker detects an error and
creates a record of the problem, then the worker yields.
The controller detects the worker has yielded and sees the record of
the error. The controller opts not to re-schedule the worker.
The controller then instructs all remaining workers to yield. Once all
workers are yielded, the controller invokes block_job_error_action.
From there, we can handle the error accordingly.
Either way it sounds like we're going to have to manage the complexity
of what happens when a job with worker coroutines is paused by e.g.
the QMP monitor: the controller needs to send and coordinate messages
to the workers; so this doesn't sound like too much added complexity
unless I'm dreaming.
And finally I came to the idea that all these problems and complications
exist because the blockjob was not designed for asynchronous work: a
blockjob has only one working coroutine. So the true way is to maintain
several working coroutines at the generic blockjob layer. This will lead
to simpler and more transparent async schemes for backup and mirror (and
maybe other blockjobs).
So, the question is: what about refactoring blockjobs to move from one
working coroutine to several? Is it hard, and is it possible? I do not
understand all the blockjob-related stuff.
I think jobs will need to remain "one coroutine, one job" for now, but
there's no reason why drive-backup or blockdev-backup can't just
create multiple jobs each if that's what they need to do. (The backup
job object could, in theory, just have another job pointer to a helper
job if it really became necessary.)
Let's try to solve your design problem first and see if we can't make
error reporting behave nicely in a contested environment.
Hmm, ok, I'll go this way for now. But anyway, I think that in the end
the async scheme in backup and mirror should be the same, and it should
share the same code, so everything worker-related should be split out of
backup into its own file, or moved into the generic blockjob layer.
On 08.08.2016 22:09, John Snow wrote:
There are a few problems with transactional job completion right now.
First, if jobs complete so quickly that they finish before the remaining
jobs get a chance to join the transaction, the completion mode can leave
well-known state, the QLIST can get corrupted, and the transactional
jobs can complete in batches or phases instead of all together.
Second, if two or more jobs defer to the main loop at roughly the same
time, it's possible for one job's cleanup to directly invoke the other
job's cleanup from within the same thread, leading to a situation that
will deadlock the entire transaction.
Thanks to Vladimir for pointing out these modes of failure.
________________________________________________________________________________
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch job-manual-start
https://github.com/jnsnow/qemu/tree/job-manual-start
This version is tagged job-manual-start-v1:
https://github.com/jnsnow/qemu/releases/tag/job-manual-start-v1
John Snow (4):
blockjob: add block_job_start
blockjob: refactor backup_start as backup_job_create
blockjob: add .clean property
iotests: add transactional failure race test
Vladimir Sementsov-Ogievskiy (1):
blockjob: fix dead pointer in txn list
block/backup.c | 50 +++++++-----
block/commit.c | 2 +-
block/mirror.c | 2 +-
block/stream.c | 2 +-
blockdev.c | 194 ++++++++++++++++++++++++++-------------------
blockjob.c | 24 +++++-
include/block/block_int.h | 19 ++---
include/block/blockjob.h | 16 ++++
tests/qemu-iotests/124 | 91 +++++++++++++++++++++
tests/qemu-iotests/124.out | 4 +-
tests/test-blockjob-txn.c | 2 +-
11 files changed, 284 insertions(+), 122 deletions(-)
--
Best regards,
Vladimir