Am 12.09.2018 um 19:03 hat Denis V. Lunev geschrieben: > On 09/12/2018 04:15 PM, Kevin Wolf wrote: > > Am 12.09.2018 um 14:03 hat Denis Plotnikov geschrieben: > >> On 10.09.2018 15:41, Kevin Wolf wrote: > >>> Am 29.06.2018 um 14:40 hat Denis Plotnikov geschrieben: > >>>> Fixes the problem of ide request appearing when the BDS is in > >>>> the "drained section". > >>>> > >>>> Without the patch the request can come and be processed by the main > >>>> event loop, as the ide requests are processed by the main event loop > >>>> and the main event loop doesn't stop when its context is in the > >>>> "drained section". > >>>> The request execution is postponed until the end of "drained section". > >>>> > >>>> The patch doesn't modify ide specific code, as well as any other > >>>> device code. Instead, it modifies the infrastructure of asynchronous > >>>> Block Backend requests, in favor of postponing the requests arisen > >>>> when in "drained section" to remove the possibility of request appearing > >>>> for all the infrastructure clients. > >>>> > >>>> This approach doesn't make vCPU processing the request wait untill > >>>> the end of request processing. > >>>> > >>>> Signed-off-by: Denis Plotnikov <dplotni...@virtuozzo.com> > >>> I generally agree with the idea that requests should be queued during a > >>> drained section. However, I think there are a few fundamental problems > >>> with the implementation in this series: > >>> > >>> 1) aio_disable_external() is already a layering violation and we'd like > >>> to get rid of it (by replacing it with a BlockDevOps callback from > >>> BlockBackend to the devices), so adding more functionality there > >>> feels like a step in the wrong direction. > >>> > >>> 2) Only blk_aio_* are fixed, while we also have synchronous public > >>> interfaces (blk_pread/pwrite) as well as coroutine-based ones > >>> (blk_co_*). They need to be postponed as well. > >> Good point! Thanks! > > Should we really prohibit all public interfaces, as they are reused > inside block level?
We need to fix that. blk_*() should never be called from inside the BDS layer. > There is also a problem which is not stated in the clear words yet. > We have potential deadlock in the code under the following > conditions, which should be also taken into the consideration. > > <path from the controller> > bdrv_co_pwritev > bdrv_inc_in_flight > bdrv_aligned_pwritev > notifier_list_with_return_notify > backup_before_write_notify > backup_do_cow > backup_cow_with_bounce_buffer > blk_co_preadv > > Here blk_co_preadv() must finish its work before we will release the > notifier and finish request initiated from the controller and which > has incremented in-fligh counter. Yes, before_write notifiers are evil. I've objected to them since the day they were introduced and I'm surprised it's becoming a problem only now. We should probably change the backup job to insert a job node rather sooner than later. Then it doesn't need to call blk_*() any more. > Thus we should differentiate requests initiated at the controller > level and requests initiated in the block layer. This is sad but > true. The difference is supposed to be whether a request goes through a BlockBackend or not. Kevin