On 08/08/2017 14:53, Stefan Hajnoczi wrote:
> On Fri, Aug 04, 2017 at 01:46:17PM +0200, Paolo Bonzini wrote:
>> On 04/08/2017 11:58, Stefan Hajnoczi wrote:
>>>> the root cause of this bug is related to this as well:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg02945.html
>>>> From commit 99723548 we started assuming (incorrectly?) that blk_
>>>> functions always WILL have an attached BDS, but this is not always true,
>>>> for instance, flushing the cache from an empty CDROM.
>>>> Paolo, can we move the flight counter increment outside of the
>>>> block-backend layer, is that safe?
>>> I think the bdrv_inc_in_flight(blk_bs(blk)) needs to be fixed
>>> regardless of the throttling timer issue discussed below.  BB cannot
>>> assume that the BDS graph is non-empty.
>> Can we make bdrv_aio_* return NULL (even temporarily) if there is no
>> attached BDS?  That would make it much easier to fix.
> There are many blk_aio_*() callers.  Returning NULL forces them to
> perform extra error handling.

Most of them either are for non-removable devices and check for a
non-empty BB at startup.  The others (ide-cd and scsi-cd, sd) check the
BlockBackend's blk_is_available or blk_is_inserted state themselves.

It does require some auditing of course, remember that anything that
would return NULL, would now be crashing already in bdrv_inc_in_flight.
We'd be seeing NULL pointer dereferences left and right, if it were so bad.

> When you say "temporarily" do you mean it returns NULL but schedules a
> one-shot BH to invoke the callback?  I wonder if we can use a singleton
> aiocb instead of NULL for -ENOMEDIUM errors.

No, I mean undoing that in 2.11.


Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to