On 19/05/2015 20:37, Fam Zheng wrote:
> On Tue, 05/19 10:49, Paolo Bonzini wrote:
>> On 19/05/2015 18:48, Fam Zheng wrote:
>>>>> This is too late. As a rule, the blocker must be established before
>>>>> calling bdrv_drain, and removed on the next yield (in this case,
>>>>> before the assignment to last_pause_ns).
>>>
>>> I don't understand. If the blocker is removed before mirror_run
>>> returns, wouldn't more device IO already hit the source image by the
>>> time mirror_exit runs?
>>
>> If you go to mirror_exit, you won't reach the assignment (so you have
>> to remove the blocker in mirror_exit too).
>>
>> But if you don't go to mirror_exit because cnt != 0, you must remove
>> the blocker before the next I/O.
>
> OK, but I'm still not clear how it is too late in this patch. Although
> the blocker is set after bdrv_drain, we know there is no dirty data
> because cnt is 0, and we'll be holding a blocker when releasing the
> AioContext, so no new IO is allowed.
So you rely on the caller of mirror_run not calling aio_context_release
between bdrv_drain and block_job_defer_to_main_loop? That indeed should
work, but why not stick to the common pattern of blocking I/O before
bdrv_drain? That's how bdrv_drain is almost always used in the code, so
it's a safe thing to do, and the preemption points are then documented
more clearly.

I think it would be nice to have all bdrv_drain calls:

- either preceded by a device I/O blocker,
- or preceded by a comment explaining why the call is there and why it
  doesn't need the blocker.

This is not a NACK, but I would like to understand the disadvantages of
what I am suggesting here.

Paolo
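
P.S. To make the suggestion concrete, here is a rough sketch of the
ordering I have in mind, assuming a device I/O op blocker type along the
lines of what this series proposes (I am calling it
BLOCK_OP_TYPE_DEVICE_IO here). The surrounding mirror_run logic is
abbreviated, so treat this as an illustration of the pattern, not a
complete patch:

/* Sketch only: BLOCK_OP_TYPE_DEVICE_IO stands in for the device I/O
 * op blocker proposed in this series; the rest of mirror_run is
 * abbreviated from block/mirror.c. */
static void coroutine_fn mirror_run(void *opaque)
{
    MirrorBlockJob *s = opaque;
    BlockDriverState *bs = s->common.bs;
    Error *blocker = NULL;
    int64_t cnt, last_pause_ns;
    bool should_complete;

    error_setg(&blocker, "mirror job draining device I/O");

    for (;;) {
        /* ... dirty-sector copying elided; should_complete is set once
         * the job has been asked to complete ... */

        cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
        if (cnt == 0 && should_complete) {
            /* Establish the blocker *before* bdrv_drain, so that no
             * new device I/O can slip in between the drain and the
             * re-check of the dirty count below. */
            bdrv_op_block(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
            bdrv_drain(bs);
            cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
            if (cnt == 0) {
                /* Keep the blocker across block_job_defer_to_main_loop;
                 * mirror_exit is then responsible for removing it. */
                break;
            }
            /* The drain flushed more dirty data, so we will loop again:
             * lift the blocker before the next yield so that device I/O
             * can resume. */
            bdrv_op_unblock(bs, BLOCK_OP_TYPE_DEVICE_IO, blocker);
        }

        last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
        /* ... sleep/yield and loop ... */
    }

    /* ... eventually defer to the main loop; mirror_exit unblocks ... */
}

With this ordering, every preemption point either runs with the blocker
held or after it has been explicitly removed, so the correctness
argument does not depend on what the caller does with the AioContext.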