On 08.08.2017 at 14:53, Stefan Hajnoczi wrote:
> On Fri, Aug 04, 2017 at 01:46:17PM +0200, Paolo Bonzini wrote:
> > On 04/08/2017 11:58, Stefan Hajnoczi wrote:
> > >> the root cause of this bug is related to this as well:
> > >> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg02945.html
> > >>
> > >> From commit 99723548 we started assuming (incorrectly?) that blk_
> > >> functions always WILL have an attached BDS, but this is not always true,
> > >> for instance, flushing the cache from an empty CDROM.
> > >>
> > >> Paolo, can we move the flight counter increment outside of the
> > >> block-backend layer, is that safe?
> > > I think the bdrv_inc_in_flight(blk_bs(blk)) needs to be fixed
> > > regardless of the throttling timer issue discussed below. BB cannot
> > > assume that the BDS graph is non-empty.
> >
> > Can we make bdrv_aio_* return NULL (even temporarily) if there is no
> > attached BDS? That would make it much easier to fix.
>
> There are many blk_aio_*() callers. Returning NULL forces them to
> perform extra error handling.

Yes, that's my concern. We removed NULL returns a long time ago. Most
callers probably don't check for it any more.

> When you say "temporarily" do you mean it returns NULL but schedules a
> one-shot BH to invoke the callback? I wonder if we can use a singleton
> aiocb instead of NULL for -ENOMEDIUM errors.

This doesn't help. As soon as you involve BHs, you need to consider them
during blk_drain(), otherwise the drain can return too early.

And if you want to consider them during blk_drain()... well, I made an
attempt, maybe we can make it work with some more changes. But I'm
starting to see that it's not a trivial change; though admittedly, the
NULL return thing doesn't look trivial either.

Kevin
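
To illustrate why a one-shot BH has to be visible to the drain, here is a toy model in C. It is only a sketch and not QEMU code: ToyBackend, ToyBH, toy_aio_fail_deferred(), toy_poll(), toy_drain() and TOY_ENOMEDIUM are hypothetical names, and the model assumes a single in-flight counter and a single deferred-callback slot. If the failed request is not counted as in flight, the drain returns before the completion callback has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names are real QEMU APIs. */
#define TOY_ENOMEDIUM 123   /* placeholder error value (assumption) */

typedef void CompletionFunc(void *opaque, int ret);

typedef struct ToyBH {
    CompletionFunc *cb;
    void *opaque;
    int ret;
    bool scheduled;   /* a deferred callback is pending */
    bool tracked;     /* the request was counted in in_flight */
} ToyBH;

typedef struct ToyBackend {
    int in_flight;    /* requests the drain waits for */
    ToyBH bh;         /* single one-shot deferred callback slot */
} ToyBackend;

/*
 * Fail a request because no medium is attached, but deliver the error
 * from a deferred callback instead of returning NULL to the caller.
 * With track == false the request is invisible to toy_drain().
 */
static void toy_aio_fail_deferred(ToyBackend *be, CompletionFunc *cb,
                                  void *opaque, bool track)
{
    if (track) {
        be->in_flight++;
    }
    be->bh = (ToyBH){ .cb = cb, .opaque = opaque, .ret = -TOY_ENOMEDIUM,
                      .scheduled = true, .tracked = track };
}

/* One event loop iteration: run the pending deferred callback, if any. */
static bool toy_poll(ToyBackend *be)
{
    if (!be->bh.scheduled) {
        return false;
    }
    assert(be->bh.cb != NULL);
    be->bh.scheduled = false;
    be->bh.cb(be->bh.opaque, be->bh.ret);
    if (be->bh.tracked) {
        be->in_flight--;
    }
    return true;
}

/* Drain: keep polling until the in-flight counter drops to zero. */
static void toy_drain(ToyBackend *be)
{
    while (be->in_flight > 0) {
        toy_poll(be);
    }
}

static void mark_done(void *opaque, int ret)
{
    (void)ret;
    *(bool *)opaque = true;
}

/* Returns whether the callback had run by the time the drain returned. */
static bool callback_ran_before_drain_returned(bool track)
{
    ToyBackend be = { 0 };
    bool done = false;

    toy_aio_fail_deferred(&be, mark_done, &done, track);
    toy_drain(&be);
    return done;
}
```

In this model, callback_ran_before_drain_returned(false) reports that the drain finished with the callback still pending, while callback_ran_before_drain_returned(true) reports that the callback ran first. The real block layer is considerably more involved, so this only shows the shape of the problem.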