On 10/03/2016 10:40, Christian Borntraeger wrote: > On 03/10/2016 10:03 AM, Christian Borntraeger wrote: >> On 03/10/2016 02:51 AM, Fam Zheng wrote: >> [...] >>> The aio_poll() inside "blk_set_aio_context(s->conf->conf.blk, s->ctx)" looks >>> suspicious: >>> >>> main thread iothread >>> ---------------------------------------------------------------------------- >>> virtio_blk_handle_output() >>> virtio_blk_data_plane_start() >>> vblk->dataplane_started = true; >>> blk_set_aio_context() >>> bdrv_set_aio_context() >>> bdrv_drain() >>> aio_poll() >>> <snip...> >>> virtio_blk_handle_output() >>> /* s->dataplane_started is true */ >>> !!! -> virtio_blk_handle_request() >>> event_notifier_set(ioeventfd) >>> aio_poll() >>> >>> virtio_blk_handle_request() >>> >>> Christian, could you try the followed patch? The aio_poll above is replaced >>> with a "limited aio_poll" that doesn't disptach ioeventfd. >>> >>> (Note: perhaps moving "vblk->dataplane_started = true;" after >>> blk_set_aio_context() also *works around* this.) >>> >>> --- >>> >>> diff --git a/block.c b/block.c >>> index ba24b8e..e37e8f7 100644 >>> --- a/block.c >>> +++ b/block.c >>> @@ -4093,7 +4093,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs, >>> >>> void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context) >>> { >>> - bdrv_drain(bs); /* ensure there are no in-flight requests */ >>> + /* ensure there are no in-flight requests */ >>> + bdrv_drained_begin(bs); >>> + bdrv_drained_end(bs); >>> >>> bdrv_detach_aio_context(bs); >>> >> >> That seems to do the trick. > > Or not. Crashed again :-(
I would put bdrv_drained_end just before aio_context_release. But secondarily, I'm thinking of making the logic simpler to understand in two ways: 1) adding a mutex around virtio_blk_data_plane_start/stop. 2) moving event_notifier_set(virtio_queue_get_host_notifier(s->vq)); virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true); to a bottom half (created with aio_bh_new in s->ctx). The bottom half takes the mutex, checks again "if (vblk->dataplane_started)" and if it's true starts the processing. Thanks, Paolo