Re: [Xen-devel] xen-block: race condition when stopping the device (WAS: Re: [xen-4.13-testing test] 144736: regressions - FAIL)

2019-12-16 Thread Durrant, Paul
> -----Original Message-----
> From: Xen-devel  On Behalf Of
> Durrant, Paul
> Sent: 16 December 2019 09:34
> To: Julien Grall ; Ian Jackson 
> Cc: Jürgen Groß ; Stefano Stabellini
> ; qemu-de...@nongnu.org; osstest service owner
> ; Anthony Perard
> ; xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] xen-block: race condition when stopping the
> device (WAS: Re: [xen-4.13-testing test] 144736: regressions - FAIL)
> 
> > -----Original Message-----
> [snip]
> > >>
> > >> This feels like a race condition between the init/free code and the
> > >> handler. Anthony, does it ring any bell?
> > >>
> > >
> > > From that stack backtrace it looks like an iothread managed to run
> > > after the sring was NULLed. This should not be able to happen as the
> > > dataplane should have been moved back onto QEMU's main thread context
> > > before the ring is unmapped.
> >
> > My knowledge of this code is fairly limited, so correct me if I am
> wrong.
> >
> > blk_set_aio_context() would set the context for the block aio. AFAICT,
> > the only aio for the block is xen_block_complete_aio().
> 
> Not quite. xen_block_dataplane_start() calls
> xen_device_bind_event_channel() and that will add an event channel fd into
> the aio context, so the shared ring is polled by the iothread as well as
> block i/o completion.
> 
> >
> > In the stack above, we are not dealing with a block aio but an aio tied
> > to the event channel (see the call from xen_device_poll). So I don't
> > think the blk_set_aio_context() would affect the aio.
> >
> 
> For the reason I outline above, it does.
> 
> > So it would be possible to get the iothread running because we received
> > a notification on the event channel while we are stopping the block (i.e.
> > xen_block_dataplane_stop()).
> >
> 
> We should assume an iothread can essentially run at any time, as it is a
> polling entity. It should eventually block polling on fds assigned to its
> aio context, but I don't think the abstraction guarantees that it cannot be
> awoken for other reasons (e.g. off a timeout). However, an event from the
> frontend will certainly cause the evtchn fd poll to wake up.
> 
> > If xen_block_dataplane_stop() grabs the context lock first, then the
> > iothread dealing with the event may wait on the lock until it's released.
> >
> > By the time the lock is grabbed, we may have freed all the resources
> > (including srings), so the event iothread will end up dereferencing a
> > NULL pointer.
> >
> 
> I think the problem may actually be that xen_block_dataplane_event() does
> not acquire the context and thus is not synchronized with
> xen_block_dataplane_stop(). The documentation in multiple-iothreads.txt is
> not clear whether a poll handler called by an iothread needs to acquire
> the context though; TBH I would not have thought it necessary.
> 
> > It feels to me we need a way to quiesce all the iothreads (blk,
> > event,...) before continuing. But I am a bit unsure how to do this in
> > QEMU.
> >
> 
> Looking at virtio-blk.c I see that it does seem to close off its evtchn
> equivalent from iothread context via aio_wait_bh_oneshot(). So I wonder
> whether the 'right' thing to do is to call
> xen_device_unbind_event_channel() using the same mechanism to ensure
> xen_block_dataplane_event() can't race.

Digging around the virtio-blk history I see:

commit 1010cadf62332017648abee0d7a3dc7f2eef9632
Author: Stefan Hajnoczi 
Date:   Wed Mar 7 14:42:03 2018 +

virtio-blk: fix race between .ioeventfd_stop() and vq handler

If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock.  By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!

Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread.  This way no races with the vq handler are
possible.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Fam Zheng 
Acked-by: Paolo Bonzini 
Message-id: 20180307144205.20619-3-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 

...so I think xen-block has exactly the same problem. I think we may also be 
missing a qemu_bh_cancel() to make sure block aio completions are stopped. I'll 
prep a patch.
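
Something along these lines, as a sketch only (field and helper names are 
assumed from hw/block/dataplane/xen-block.c of this vintage, and the code is 
meant to sit in that file; it is not a verbatim patch):

    /* Runs in the IOThread's context via aio_wait_bh_oneshot(). */
    static void xen_block_dataplane_detach_bh(void *opaque)
    {
        XenBlockDataPlane *dataplane = opaque;

        /* Unbind inside the IOThread: once this BH has run, the evtchn
         * fd handler (and hence xen_block_dataplane_event()) cannot be
         * mid-flight. */
        xen_device_unbind_event_channel(dataplane->xendev,
                                        dataplane->event_channel,
                                        &error_abort);
        dataplane->event_channel = NULL;

        /* Cancel any completion BH the event handler scheduled, so no
         * block aio completion work runs after this point. */
        qemu_bh_cancel(dataplane->bh);
    }

    void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
    {
        /* ... existing early-exit checks ... */

        /* Run the BH in the IOThread's context and wait for it, exactly
         * as virtio_blk_data_plane_stop() does. */
        aio_wait_bh_oneshot(dataplane->ctx, xen_block_dataplane_detach_bh,
                            dataplane);

        /* Only now is it safe to move the dataplane back onto the main
         * thread context and unmap the shared ring. */
    }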

  Paul

> 
>   Paul
> 
> > Cheers,
> >
> > --
> > Julien Grall

Re: [Xen-devel] xen-block: race condition when stopping the device (WAS: Re: [xen-4.13-testing test] 144736: regressions - FAIL)

2019-12-16 Thread Durrant, Paul
> -----Original Message-----
[snip]
> >>
> >> This feels like a race condition between the init/free code and the
> >> handler. Anthony, does it ring any bell?
> >>
> >
> > From that stack backtrace it looks like an iothread managed to run
> > after the sring was NULLed. This should not be able to happen as the
> > dataplane should have been moved back onto QEMU's main thread context
> > before the ring is unmapped.
> 
> My knowledge of this code is fairly limited, so correct me if I am wrong.
> 
> blk_set_aio_context() would set the context for the block aio. AFAICT,
> the only aio for the block is xen_block_complete_aio().

Not quite. xen_block_dataplane_start() calls xen_device_bind_event_channel() 
and that will add an event channel fd into the aio context, so the shared ring 
is polled by the iothread as well as block i/o completion.
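
For reference, that binding looks roughly like this (paraphrased from 
hw/xen/xen-bus.c of this vintage, with error handling omitted and the 
surrounding function's channel/xendev/port variables assumed, so treat it as 
a sketch rather than the exact code):

    /* Inside xen_device_bind_event_channel(): open a dedicated evtchn
     * handle, bind the frontend's port, then hand the fd to the
     * dataplane's AioContext. */
    channel->xeh = xenevtchn_open(NULL, 0);
    channel->local_port = xenevtchn_bind_interdomain(channel->xeh,
                                                     xendev->frontend_id,
                                                     port);

    /* From this point the IOThread polls the evtchn fd alongside the
     * block layer's completion fds. */
    aio_set_fd_handler(channel->ctx, xenevtchn_fd(channel->xeh),
                       true, /* "external" event source */
                       xen_device_event, NULL, xen_device_poll, channel);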

> 
> In the stack above, we are not dealing with a block aio but an aio tied
> to the event channel (see the call from xen_device_poll). So I don't
> think the blk_set_aio_context() would affect the aio.
> 

For the reason I outline above, it does.

> So it would be possible to get the iothread running because we received
> a notification on the event channel while we are stopping the block (i.e.
> xen_block_dataplane_stop()).
> 

We should assume an iothread can essentially run at any time, as it is a 
polling entity. It should eventually block polling on fds assigned to its aio
context, but I don't think the abstraction guarantees that it cannot be awoken
for other reasons (e.g. off a timeout). However, an event from the frontend
will certainly cause the evtchn fd poll to wake up.
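
Concretely, when the frontend notifies, the IOThread's poll loop wakes and 
runs the evtchn fd's read handler, which is roughly the following (again 
paraphrased from hw/xen/xen-bus.c; simplified):

    static void xen_device_event(void *opaque)
    {
        XenEventChannel *channel = opaque;
        evtchn_port_t port = xenevtchn_pending(channel->xeh);

        if (port == channel->local_port) {
            /* For xen-block this invokes xen_block_dataplane_event(). */
            channel->handler(channel->opaque);
            xenevtchn_unmask(channel->xeh, port);
        }
    }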

> If xen_block_dataplane_stop() grabs the context lock first, then the
> iothread dealing with the event may wait on the lock until it's released.
> 
> By the time the lock is grabbed, we may have freed all the resources
> (including srings), so the event iothread will end up dereferencing a
> NULL pointer.
> 

I think the problem may actually be that xen_block_dataplane_event() does not 
acquire the context and thus is not synchronized with 
xen_block_dataplane_stop(). The documentation in multiple-iothreads.txt is not 
clear whether a poll handler called by an iothread needs to acquire the context 
though; TBH I would not have thought it necessary.
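
To illustrate: making the handler take the lock would look something like the 
sketch below (hypothetical, not the current code, and the handler body name is 
assumed). The virtio-blk commit quoted in the follow-up message above shows 
why this alone is not enough: the handler can still acquire the lock after the 
stop path has already torn everything down, so it needs to be prevented from 
running at all.

    static void xen_block_dataplane_event(void *opaque)
    {
        XenBlockDataPlane *dataplane = opaque;

        aio_context_acquire(dataplane->ctx);
        /* Too late: dataplane->sring may already have been NULLed by
         * xen_block_dataplane_stop() while we waited for the lock. */
        xen_block_handle_requests(dataplane);
        aio_context_release(dataplane->ctx);
    }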

> It feels to me we need a way to quiesce all the iothreads (blk,
> event,...) before continuing. But I am a bit unsure how to do this in
> QEMU.
> 

Looking at virtio-blk.c I see that it does seem to close off its evtchn 
equivalent from iothread context via aio_wait_bh_oneshot(). So I wonder whether 
the 'right' thing to do is to call xen_device_unbind_event_channel() using the 
same mechanism to ensure xen_block_dataplane_event() can't race.

  Paul

> Cheers,
> 
> --
> Julien Grall
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel