On 14.09.2018 at 17:12, Paolo Bonzini wrote:
> On 13/09/2018 18:59, Kevin Wolf wrote:
> > On 13.09.2018 at 17:10, Paolo Bonzini wrote:
> >> On 13/09/2018 14:52, Kevin Wolf wrote:
> >>> +    if (qemu_get_current_aio_context() == qemu_get_aio_context()) {
> >>> +        /* If we are in the main thread, the callback is allowed to unref
> >>> +         * the BlockBackend, so we have to hold an additional reference */
> >>> +        blk_ref(acb->rwco.blk);
> >>> +    }
> >>>      acb->common.cb(acb->common.opaque, acb->rwco.ret);
> >>> +    blk_dec_in_flight(acb->rwco.blk);
> >>> +    if (qemu_get_current_aio_context() == qemu_get_aio_context()) {
> >>> +        blk_unref(acb->rwco.blk);
> >>> +    }
> >>
> >> Is this something that happens only for some specific callers?  That is,
> >> which callers are sure that the callback is invoked from the main thread?
> >
> > I can't seem to reproduce the problem I saw any more even when reverting
> > the bdrv_ref/unref pair. If I remember correctly it was actually a
> > nested aio_poll() that was running a block job completion or something
> > like that - which would obviously only happen on the main thread because
> > the job intentionally defers to the main thread.
> >
> > The only reason I made this conditional is that I think bdrv_unref()
> > still isn't safe outside the main thread, is it?
>
> Yes, making it conditional is correct, but it is quite fishy even with
> the conditional.
>
> As you mention, you could have a nested aio_poll() in the main thread,
> for example invoked from a bottom half, but in that case I'd rather
> track the caller that is creating the bottom half and see if it lacks a
> bdrv_ref/bdrv_unref (or perhaps it's even higher in the tree that is
> missing).

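For reference, the hazard the extra reference in the hunk above was meant to
guard against can be sketched with a toy refcounted object. This is only an
illustration under assumptions: Backend, backend_ref(), backend_unref(),
complete_request() and run_demo() are hypothetical names made up for this
sketch, not QEMU's BlockBackend API, and there is no threading or AioContext
involved. The callback drops the last external reference, so without the
temporary reference the in-flight decrement would touch freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted backend (not QEMU's BlockBackend). */
typedef struct Backend {
    int refcnt;
    int in_flight;
} Backend;

/* Demo-only observers, written when the object is destroyed. */
static bool demo_freed;
static int demo_in_flight_at_free;

static Backend *backend_new(void)
{
    Backend *b = calloc(1, sizeof(*b));
    assert(b != NULL);
    b->refcnt = 1;
    return b;
}

static void backend_ref(Backend *b)
{
    b->refcnt++;
}

static void backend_unref(Backend *b)
{
    assert(b->refcnt > 0);
    if (--b->refcnt == 0) {
        /* Record the state seen at destruction time, then free. */
        demo_freed = true;
        demo_in_flight_at_free = b->in_flight;
        free(b);
    }
}

typedef void CompletionFunc(void *opaque, int ret);

/* A completion callback that drops the caller's (last) reference. */
static void drop_ref_cb(void *opaque, int ret)
{
    (void)ret;
    backend_unref(opaque);
}

/* Complete a request while holding a temporary reference across the
 * callback, so the object is still valid for the in-flight decrement. */
static void complete_request(Backend *b, CompletionFunc *cb, void *opaque,
                             int ret)
{
    backend_ref(b);
    cb(opaque, ret);
    b->in_flight--;     /* safe: the temporary reference keeps b alive */
    backend_unref(b);   /* drops the last reference here and frees b */
}

/* Returns the in-flight count observed when the object was freed,
 * or -1 if it was never freed. */
int run_demo(void)
{
    Backend *b = backend_new();

    demo_freed = false;
    demo_in_flight_at_free = -1;
    b->in_flight = 1;
    complete_request(b, drop_ref_cb, b, 0);
    return demo_freed ? demo_in_flight_at_free : -1;
}
```

Calling run_demo() from a main() shows that the object is destroyed only
after the in-flight counter has been decremented. Removing the
backend_ref()/backend_unref() pair in complete_request() turns the decrement
into a use-after-free in this toy model; whether the real code path can
actually hit that case is the open question in this thread.
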
I went back to the commit where I first added the patch (it already
contained the ref/unref pair) and checked whether I could reproduce a
bug with the pair removed. I couldn't.

I'm starting to think that maybe I was just overly cautious with the
ref/unref. I may have confused the nested aio_poll() crash with a
different situation. I've dealt with so many crashes and hangs while
working on this series that it's quite possible.

Kevin