On 08/07/21 12:36, Stefan Hajnoczi wrote:
What is very clear from this patch is that it
is strictly related to the brdv_* and lower level calls, because
they also internally check or even use the aiocontext lock.
Therefore, in order to make it work, I temporarly added some
aiocontext_acquire/release pair around the function that
still assert for them or assume they are hold and temporarly
unlock (unlock() - lock()).

Sounds like the issue is that this patch series assumes AioContext locks
are no longer required for calling the blk_*()/bdrv_*() APIs? That is
not the case yet, so you had to then add those aio_context_lock() calls
back in elsewhere. This approach introduces unnecessary risk. I think we
should wait until blk_*()/bdrv_*() no longer requires the caller to hold
the AioContext lock before applying this series.

In general I'm in favor of pushing the lock further down into smaller and smaller critical sections; it's a good approach to make further audits easier until it's "obvious" that the lock is unnecessary. I haven't yet reviewed Emanuele's patches to see if this is what he's doing where he's adding the acquire/release calls, but that's my understanding of both his cover letter and your reply.

The I/O blk_*()/bdrv_*() *should* not require the caller to hold the AioContext lock; all drivers use their own CoMutex or QemuMutex when needed, and generic code should also be ready (caveat emptor). Others, such as reopen, are a mess that requires a separate audit. Restricting acquire/release to be only around those seems like a good starting point.

Paolo


Reply via email to