09.04.2021 12:51, Max Reitz wrote:
On 08.04.21 19:26, Vladimir Sementsov-Ogievskiy wrote:
08.04.2021 20:04, John Snow wrote:
On 4/8/21 12:58 PM, Vladimir Sementsov-Ogievskiy wrote:
job-complete command is async. Can we instead just add a boolean like
job->completion_requested, and set it if job-complete called in STANDBY state,
and on job_resume job_complete will be called automatically if this boolean is
true?
job_complete has a synchronous setup, though -- we lose out on a lot of
synchronous error checking in that circumstance.
yes, that's a problem..
I was not able to audit it to determine that it'd be safe to attempt that setup
during a drained section -- I imagine it won't work and will fail, though.
So I thought we'd have to signal completion and run the setup *later*, but what do we do
if we get an error then? Does the entire job fail? Do we emit some new event?
("BLOCK_JOB_COMPLETION_FAILED" ?) Is it recoverable?
Isn't it possible even now, that after successful job-complete job still fails
and we report BLOCK_JOB_COMPLETED with error?
And actually, how much benefit user get from the fact that job-complete may
fail?
We can make job-complete a simple always-success boolean flag setter like
job-pause.
I wanted to say the following:
But job-pause does always succeed, in contrast to block-job-complete.
block-job-complete is more akin to job-finalize, which too is a
synchronous operation.
But when I wrote that last sentence, I asked myself whether what
mirror_complete() does isn’t actually a remnant of what we had to do when we
didn’t have job-finalize yet. Shouldn’t that all be in mirror_exit_common()?
What’s the advantage of opening the backing chain or putting blockers on the
to-replace node in block-job-complete? Aren’t that all graph-changing
operation, basically, i.e. stuff that should be done in job-finalize?
If we move everything to mirror_exit_common(), all that remains to do is
basically set some should_complete flag (could even be part of the Job struct),
and then the whole problem disappears.
Thoughts?
Sounds good.. ButI want to understand first one simple thing: can job fail even
after block-job-complete succeeded?
As I understand current users think that it can't. And block-job-complete is documented
as "This command completes an active background block operation synchronously".
So it's assumed that if block-job-complete succeeded we are totally done.
But maybe, it's wrong? Can mirror_prepare fail after mirror_complete success?
And user must check job status after job is finalized? Or check error in
BLOCK_JOB_COMPLETED event?
--
Best regards,
Vladimir