On 26.07.19 13:49, Kevin Wolf wrote:
> Am 26.07.2019 um 12:50 hat Max Reitz geschrieben:
>> On 25.07.19 18:27, Kevin Wolf wrote:
>>> This fixes device like IDE that can still start new requests from I/O
>>
>> *devices
>>
>>> handlers in the CPU thread while the block backend is drained.
>>>
>>> The basic assumption is that in a drain section, no new requests should
>>> be allowed through a BlockBackend (blk_drained_begin/end don't exist,
>>> we get drain sections only on the node level). However, there are two
>>> special cases where requests should not be queued:
>>>
>>> 1. Block jobs: We already make sure that block jobs are paused in a
>>>    drain section, so they won't start new requests. However, if the
>>>    drain_begin is called on the job's BlockBackend first, it can happen
>>>    that we deadlock because the job stays busy until it reaches a pause
>>>    point - which it can't if its requests aren't processed any more.
>>>
>>>    The proper solution here would be to make all requests through the
>>>    job's filter node instead of using a BlockBackend. For now, just
>>>    disabling request queuing on the job BlockBackend is simpler.
>>
>> Yep, seems reasonable.
>>
>> (We’d need a relationship that a BB is owned by some job, and then pause
>> the job when the BB is drained, I suppose. But that’s exactly
>> accomplished by not making the job use a BB, but its BdrvChild
>> references instead.)
>
> We actually had this before commit ad90feba, when we changed it to use
> the job's BdrvChild objects instead. All block jobs have both currently,
> they just don't use their BdrvChild objects much.
>
>>> 2. In test cases where making requests through bdrv_* would be
>>>    cumbersome because we'd need a BdrvChild. As we already got the
>>>    functionality to disable request queuing from 1., use it in tests,
>>>    too, for convenience.
>>>
>>> Signed-off-by: Kevin Wolf <kw...@redhat.com>
>>> ---
>>>  include/sysemu/block-backend.h | 11 +++---
>>>  block/backup.c                 |  1 +
>>>  block/block-backend.c          | 69 +++++++++++++++++++++++++++++-----
>>>  block/commit.c                 |  2 +
>>>  block/mirror.c                 |  6 ++-
>>>  blockjob.c                     |  3 ++
>>>  tests/test-bdrv-drain.c        |  1 +
>>>  7 files changed, 76 insertions(+), 17 deletions(-)
>>
>> [...]
>>
>>> diff --git a/block/block-backend.c b/block/block-backend.c
>>> index fdd6b01ecf..603b281743 100644
>>> --- a/block/block-backend.c
>>> +++ b/block/block-backend.c
>>
>> [...]
>>
>>> @@ -1127,13 +1136,26 @@ static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
>>>      return 0;
>>>  }
>>>
>>> +static void blk_wait_while_drained(BlockBackend *blk)
>>
>> +coroutine_fn? (Maybe even blk_co_wait...)
>>
>>> +{
>>> +    if (blk->quiesce_counter && !blk->disable_request_queuing) {
>>> +        qemu_co_queue_wait(&blk->queued_requests, NULL);
>>> +    }
>>> +}
>>> +
>>>  int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
>>>                                 unsigned int bytes, QEMUIOVector *qiov,
>>> -                               BdrvRequestFlags flags)
>>> +                               BdrvRequestFlags flags, bool wait_while_drained)
>>
>> What’s the purpose of this parameter? How would it hurt to always
>> wait_while_drained?
>>
>> I see the following callers of blk_co_p{read,write}v() that call it with
>> wait_while_drained=false:
>>
>> 1. blk_aio_{read,write}_entry(): They wait themselves, so they don’t
>>    need these functions to wait. But OTOH, because they have waited, we
>>    know that the BB is not quiesced here, so we won’t wait here anyway.
>>    (These functions should be coroutine_fn, too, by the way)
>
> I think I was worried that the coroutine might yield between the two
> places. Later I noticed that blk_wait_while_drained() must be the very
> first thing anyway, so maybe it doesn't matter any more now.
>
> If we did yield here for requests coming from blk_aio_prwv(), in_flight
> would be increased and drain would deadlock.
>
> Would you prefer if I just unconditionally wait if we're drained?

I think I would, yes.

>> 2. mirror: It disables request queuing anyway, so wait_while_drained
>>    doesn’t have any effect.
>
> Yes, I wasn't sure what to use there. false seemed like it would be
> less likely to cause misunderstandings because it just repeats what
> would happen anyway.
>
>>>  {
>>>      int ret;
>>> -    BlockDriverState *bs = blk_bs(blk);
>>> +    BlockDriverState *bs;
>>>
>>> +    if (wait_while_drained) {
>>> +        blk_wait_while_drained(blk);
>>> +    }
>>
>> [...]
>>
>> What about blk_co_flush()? Should that wait, too?
>
> Hm, probably, yes.
>
>>> @@ -2232,6 +2278,9 @@ static void blk_root_drained_end(BdrvChild *child, int *drained_end_counter)
>>>      if (blk->dev_ops && blk->dev_ops->drained_end) {
>>>          blk->dev_ops->drained_end(blk->dev_opaque);
>>>      }
>>> +    while (qemu_co_enter_next(&blk->queued_requests, NULL)) {
>>> +        /* Resume all queued requests */
>>> +    }
>>
>> Wouldn’t qemu_co_queue_restart_all(&blk->queued_requests) achieve the same?
>
> It would fail an assertion because we're not in coroutine context.
> (Guess what my first attempt was!) :-)

Max