On Mon, Jul 17, 2017 at 5:43 PM, John Snow <js...@redhat.com> wrote:
> On 07/17/2017 06:26 AM, Dr. David Alan Gilbert wrote:
>> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
>>> On Thu, Jul 13, 2017 at 08:01:16PM +0100, Dr. David Alan Gilbert (git) wrote:
>>>> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>>>>
>>>> There's a rare exit seg if the guest is accessing
>>>> IO during exit.
>>>> It's always hitting the atomic_inc(&bs->in_flight) with a NULL
>>>> bs. This was added recently in 99723548 but I don't see it
>>>> as the cause.
>>>>
>>>> Flip vl.c around so we pause the cpus before closing the block devices,
>>>> that way we shouldn't have anything trying to access them when
>>>> they're gone.
>>>>
>>>> This was originally Red Hat bz
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1451015
>>>>
>>>> Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
>>>> Reported-by: Cong Li <c...@redhat.com>
>>>>
>>>> --
>>>> This is a very rare race, I'll leave it running in a loop to see if
>>>> we hit anything else and to check this really fixes it.
>>>>
>>>> I do worry if there are other cases that can trigger this - e.g.
>>>> hot-unplug or ejecting a CD.
>>>>
>>>> ---
>>>> vl.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> Reviewed-by: Stefan Hajnoczi <stefa...@redhat.com>
>>
>> Thanks; and the test I left running seems solid - ~12k runs
>> over the weekend with no seg.
>>
>> Dave
>>
>> --
>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>>
>
> the root cause of this bug is related to this as well:
> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg02945.html
>
> From commit 99723548 we started assuming (incorrectly?) that blk_
> functions always WILL have an attached BDS, but this is not always true,
> for instance, flushing the cache from an empty CDROM.
>
> Paolo, can we move the flight counter increment outside of the
> block-backend layer, is that safe?

I think the bdrv_inc_in_flight(blk_bs(blk)) needs to be fixed
regardless of the throttling timer issue discussed below. BB cannot
assume that the BDS graph is non-empty.

Stefan
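
To illustrate the point about an empty BDS graph, here is a rough sketch of a NULL-tolerant in-flight counter. It is not the actual QEMU code or a proposed patch: SketchBDS, SketchBackend and the sketch_* functions are hypothetical stand-ins for BlockDriverState, BlockBackend, blk_bs() and the in-flight helpers, and it uses plain increments where the real code uses atomic operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BDS node; not the real QEMU structure. */
typedef struct SketchBDS {
    unsigned in_flight;   /* requests currently in flight on this node */
} SketchBDS;

/* Hypothetical stand-in for a BB, which may have no node attached. */
typedef struct SketchBackend {
    SketchBDS *root;      /* NULL when the graph is empty, e.g. empty CDROM */
} SketchBackend;

/* Analogue of blk_bs(): returns NULL when nothing is attached. */
static SketchBDS *sketch_blk_bs(const SketchBackend *blk)
{
    return blk->root;
}

/* Only count the request if there is a node to count it on. */
static void sketch_inc_in_flight(SketchBackend *blk)
{
    SketchBDS *bs = sketch_blk_bs(blk);

    if (bs) {
        bs->in_flight++;
    }
}

/* Mirror of the increment; again a no-op for an empty backend. */
static void sketch_dec_in_flight(SketchBackend *blk)
{
    SketchBDS *bs = sketch_blk_bs(blk);

    if (bs) {
        assert(bs->in_flight > 0);
        bs->in_flight--;
    }
}
```

With this shape, a flush on an empty backend skips the counter instead of dereferencing NULL. The sketch ignores the case where a medium is inserted or removed between the increment and the matching decrement, which a real fix would have to consider.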