On Thu, Sep 22, 2022 at 4:42 PM Emanuele Giuseppe Esposito <eespo...@redhat.com> wrote: > > Am 18/09/2022 um 19:12 schrieb Emanuele Giuseppe Esposito: > >> In replication_stop, we call job_cancel_sync() inside > >> aio_context_acquire - aio_context_release section. Should it be fixed? > > > I don't think it breaks anything. The question is: what is the > > aiocontext there protecting? Do we need it? I have no idea. > > Ok Paolo reminded me that job_cancel_sync() calls > AIO_WAIT_WHILE_UNLOCKED. This means that it indeed needs to be wrapped > in an aiocontext release() + acquire() block. > > >> Another question, sometimes you move job_start out of > >> aio-context-acquire critical section, sometimes not. As I understand, > >> it's of for job_start to be called both with acquired aio-context or not > >> acquired? > >> > > Same as above, job_start does not need the aiocontext to be taken > > (otherwise job_lock would be useless). > > I still think here it does not matter if the aiocontext is taken or not.
What matters is that the AioContext lock is taken either inside or outside the job_lock. job_start() takes the job mutex, so you _always_ need to ensure that the job mutex is taken inside AioContext lock and never the opposite. >From quick grep of AIO_WAIT_WHILE_UNLOCKED(), job_finish_sync_locked() is the only user and the only place where the two locks interact. It is explicitly called with AioContext locks _not_ taken, and releases job_lock when the AioContext lock might be taken inside AIO_WAIT_WHILE_UNLOCKED(); so it should be fine. This also validates the idea of AIO_WAIT_WHILE_UNLOCKED(). Paolo