On 03.04.20 16:50, Kevin Wolf wrote: > Am 03.04.2020 um 14:42 hat Max Reitz geschrieben: >> On 03.04.20 12:44, Kevin Wolf wrote: >>> Calling blk_wait_while_drained() while blk->in_flight is increased for >>> the current request is wrong because it will cause the drain operation >>> to deadlock. >>> >>> Many callers of blk_wait_while_drained() have already increased >>> blk->in_flight when called in a blk_aio_*() path, but can also be called >>> in synchonous code paths where blk->in_flight isn't increased. This >>> means that these calls of blk_wait_while_drained() are wrong at least in >>> some cases. >>> >>> In order to fix this, increase blk->in_flight even for synchronous >>> operations and temporarily decrease the counter again in >>> blk_wait_while_drained(). >>> >>> Fixes: cf3129323f900ef5ddbccbe86e4fa801e88c566e >>> Signed-off-by: Kevin Wolf <kw...@redhat.com> >>> --- >>> block/block-backend.c | 8 ++++---- >>> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> blk_co_pdiscard() and blk_co_flush() are called from outside of >> block-backend.c (namely from mirror.c and nbd/server.c). Is that OK? > > Hm... I think you're right that the NBD server has a problem now because > we might now decrease blk->in_flight without having increased it. > (Mirror should be fine anyway because it sets disable_request_queuing.) > > At first I was going to suggest that we could do the opposite of this > patch and just move the dec/wait/inc sequence (which this patch removes > for read/write) to all coroutine entry functions, so direct calls > wouldn't incorrectly decrease the counter. > > But this is not what we want either, we do want to queue requests for > drained BlockBackends even in the blk_co_*() API. > > Do you have another idea or do we have to turn blk_co_*() into wrappers > around the existing functions, which would gain an additional bool > parameter that tells whether we need to dec/inc or not?
So that whenever blk_co_* is called from outside of block-backend.c, we don’t dec/inc? Sounds reasonable to me. The only alternative I see would be ensuring we call blk_wait_while_drained() only outside of in_flight sections (without having to dec/inc around it). But we can’t call it in synchronous sections. And for those synchronous calls, we also have to wrap the in_flight section around the whole asynchronous boilerplate, so there is no place where they can call bdrv_wait_while_drained() without dec/inc around it. So I can’t think of another way either. Max
signature.asc
Description: OpenPGP digital signature