Re: [Qemu-block] [PATCH] vl: introduce vm_shutdown()

2018-02-28 Thread Stefan Hajnoczi
On Wed, Feb 28, 2018 at 09:58:13AM +0800, Fam Zheng wrote:
> On Tue, 02/27 15:30, Stefan Hajnoczi wrote:
> > On Fri, Feb 23, 2018 at 04:20:44PM +0800, Fam Zheng wrote:
> > > On Tue, 02/20 13:10, Stefan Hajnoczi wrote:
> > > > 1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
> > > >virtio_scsi_ctx_check() assertion failure because the BDS AioContext
> > > >has been modified by iothread_stop_all().
> > > 
> > > Does this patch fix the issue completely? IIUC virtio_scsi_handle_cmd() can
> > > already be entered at the time the main thread calls virtio_scsi_clear_aio(),
> > > so this race condition still exists:
> > > 
> > >   main thread                         iothread
> > > ---------------------------------------------------------------------
> > >   vm_shutdown
> > >     ...
> > >       virtio_bus_stop_ioeventfd
> > >         virtio_scsi_dataplane_stop
> > >           aio_poll()
> > >             ...
> > >                                       virtio_scsi_data_plane_handle_cmd()
> > >           aio_context_acquire(s->ctx)
> > >                                         virtio_scsi_acquire(s).enter
> > >           virtio_scsi_clear_aio()
> > >           aio_context_release(s->ctx)
> > >                                         virtio_scsi_acquire(s).return
> > >                                         virtio_scsi_handle_cmd_vq()
> > >                                           ...
> > >                                             virtqueue_pop()
> > > 
> > > Is it possible that the above virtqueue_pop() still returns one element
> > > that was queued before vm_shutdown() was called?
> > 
> > No, it can't, because virtio_scsi_clear_aio() invokes
> > virtio_queue_host_notifier_aio_read(&vq->host_notifier) to process the
> > virtqueue.  By the time we get back to the iothread's
> > virtio_scsi_data_plane_handle_cmd() the virtqueue is already empty.
> > 
> > Vcpus have been paused so no additional elements can slip into the
> > virtqueue.
> 
> So there is:
> 
> static void virtio_queue_host_notifier_aio_read(EventNotifier *n)
> {
>     VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
> 
>     if (event_notifier_test_and_clear(n)) {
>         virtio_queue_notify_aio_vq(vq);
>     }
> }
> 
> Guest kicks after adding an element to the VQ, but we check the ioeventfd
> before trying virtqueue_pop(). Is that a problem? If VCPUs are paused after
> enqueuing but before kicking the VQ, the ioeventfd is not set, so the
> virtqueue is not processed here.

You are right.

This race condition also affects the existing 'stop' command, where
ioeventfd is disabled in the same way.  I'll send a v2 with a patch to
fix this.
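
Roughly, the idea is to process the virtqueue unconditionally on the
teardown path instead of gating on the notifier state.  An untested
sketch (the function name is made up; this is not necessarily what v2
will look like):

static void virtio_queue_host_notifier_aio_flush(EventNotifier *n)
{
    VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

    /* Clear a pending notification if there is one, but run the vq
     * handler either way: the guest may have enqueued an element and
     * had its vcpus paused before it could kick.
     */
    event_notifier_test_and_clear(n);
    virtio_queue_notify_aio_vq(vq);
}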

Stefan




Re: [Qemu-block] [PATCH] vl: introduce vm_shutdown()

2018-02-27 Thread Fam Zheng
On Tue, 02/27 15:30, Stefan Hajnoczi wrote:
> On Fri, Feb 23, 2018 at 04:20:44PM +0800, Fam Zheng wrote:
> > On Tue, 02/20 13:10, Stefan Hajnoczi wrote:
> > > 1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
> > >virtio_scsi_ctx_check() assertion failure because the BDS AioContext
> > >has been modified by iothread_stop_all().
> > 
> > Does this patch fix the issue completely? IIUC virtio_scsi_handle_cmd() can
> > already be entered at the time the main thread calls virtio_scsi_clear_aio(),
> > so this race condition still exists:
> > 
> >   main thread                         iothread
> > ---------------------------------------------------------------------
> >   vm_shutdown
> >     ...
> >       virtio_bus_stop_ioeventfd
> >         virtio_scsi_dataplane_stop
> >           aio_poll()
> >             ...
> >                                       virtio_scsi_data_plane_handle_cmd()
> >           aio_context_acquire(s->ctx)
> >                                         virtio_scsi_acquire(s).enter
> >           virtio_scsi_clear_aio()
> >           aio_context_release(s->ctx)
> >                                         virtio_scsi_acquire(s).return
> >                                         virtio_scsi_handle_cmd_vq()
> >                                           ...
> >                                             virtqueue_pop()
> > 
> > Is it possible that the above virtqueue_pop() still returns one element
> > that was queued before vm_shutdown() was called?
> 
> No, it can't, because virtio_scsi_clear_aio() invokes
> virtio_queue_host_notifier_aio_read(&vq->host_notifier) to process the
> virtqueue.  By the time we get back to the iothread's
> virtio_scsi_data_plane_handle_cmd() the virtqueue is already empty.
> 
> Vcpus have been paused so no additional elements can slip into the
> virtqueue.

So there is:

static void virtio_queue_host_notifier_aio_read(EventNotifier *n)
{
    VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

    if (event_notifier_test_and_clear(n)) {
        virtio_queue_notify_aio_vq(vq);
    }
}

Guest kicks after adding an element to the VQ, but we check the ioeventfd
before trying virtqueue_pop(). Is that a problem? If VCPUs are paused after
enqueuing but before kicking the VQ, the ioeventfd is not set, so the
virtqueue is not processed here.
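
To make the window concrete, here is a tiny standalone model of it (all
names are made up, nothing below is QEMU code):

#include <stdbool.h>
#include <stdio.h>

static bool notifier;   /* models the ioeventfd */
static int  queued;     /* models elements sitting in the virtqueue */

/* models event_notifier_test_and_clear() */
static bool test_and_clear(void)
{
    bool was_set = notifier;
    notifier = false;
    return was_set;
}

/* models virtio_queue_host_notifier_aio_read(): drains only if notified */
static void drain_if_notified(void)
{
    if (test_and_clear()) {
        queued = 0;
    }
}

int main(void)
{
    queued++;            /* guest enqueues an element...              */
                         /* ...vcpus are paused here, before the kick,
                          * so the notifier is never set              */
    drain_if_notified(); /* teardown path sees no notification        */
    printf("elements left behind: %d\n", queued);  /* prints 1 */
    return 0;
}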

Fam




Re: [Qemu-block] [PATCH] vl: introduce vm_shutdown()

2018-02-27 Thread Stefan Hajnoczi
On Fri, Feb 23, 2018 at 04:20:44PM +0800, Fam Zheng wrote:
> On Tue, 02/20 13:10, Stefan Hajnoczi wrote:
> > 1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
> >virtio_scsi_ctx_check() assertion failure because the BDS AioContext
> >has been modified by iothread_stop_all().
> 
> Does this patch fix the issue completely? IIUC virtio_scsi_handle_cmd() can
> already be entered at the time the main thread calls virtio_scsi_clear_aio(),
> so this race condition still exists:
> 
>   main thread                         iothread
> ---------------------------------------------------------------------
>   vm_shutdown
>     ...
>       virtio_bus_stop_ioeventfd
>         virtio_scsi_dataplane_stop
>           aio_poll()
>             ...
>                                       virtio_scsi_data_plane_handle_cmd()
>           aio_context_acquire(s->ctx)
>                                         virtio_scsi_acquire(s).enter
>           virtio_scsi_clear_aio()
>           aio_context_release(s->ctx)
>                                         virtio_scsi_acquire(s).return
>                                         virtio_scsi_handle_cmd_vq()
>                                           ...
>                                             virtqueue_pop()
> 
> Is it possible that the above virtqueue_pop() still returns one element
> that was queued before vm_shutdown() was called?

No, it can't, because virtio_scsi_clear_aio() invokes
virtio_queue_host_notifier_aio_read(&vq->host_notifier) to process the
virtqueue.  By the time we get back to the iothread's
virtio_scsi_data_plane_handle_cmd() the virtqueue is already empty.

Vcpus have been paused so no additional elements can slip into the
virtqueue.
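
Conceptually the disable path does something like this (a hand-written
sketch, not the actual virtio.c code):

static void host_notifier_disable_sketch(AioContext *ctx, VirtQueue *vq)
{
    /* Stop watching the ioeventfd... */
    aio_set_event_notifier(ctx, &vq->host_notifier, true, NULL, NULL);
    /* ...then run one final read so that anything already signalled
     * is processed before the handler goes away.
     */
    virtio_queue_host_notifier_aio_read(&vq->host_notifier);
}

That final read is why the virtqueue is already empty by the time the
iothread's handler resumes.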




Re: [Qemu-block] [PATCH] vl: introduce vm_shutdown()

2018-02-23 Thread Fam Zheng
On Tue, 02/20 13:10, Stefan Hajnoczi wrote:
> 1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
>virtio_scsi_ctx_check() assertion failure because the BDS AioContext
>has been modified by iothread_stop_all().

Does this patch fix the issue completely? IIUC virtio_scsi_handle_cmd() can
already be entered at the time the main thread calls virtio_scsi_clear_aio(),
so this race condition still exists:

  main thread                         iothread
---------------------------------------------------------------------
  vm_shutdown
    ...
      virtio_bus_stop_ioeventfd
        virtio_scsi_dataplane_stop
          aio_poll()
            ...
                                      virtio_scsi_data_plane_handle_cmd()
          aio_context_acquire(s->ctx)
                                        virtio_scsi_acquire(s).enter
          virtio_scsi_clear_aio()
          aio_context_release(s->ctx)
                                        virtio_scsi_acquire(s).return
                                        virtio_scsi_handle_cmd_vq()
                                          ...
                                            virtqueue_pop()

Is it possible that the above virtqueue_pop() still returns one element that was
queued before vm_shutdown() was called?

If so, I think we additionally need an "s->ioeventfd_stopped" flag, set in
virtio_scsi_stop_ioeventfd() and checked in
virtio_scsi_data_plane_handle_cmd().
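
Something along these lines (an untested sketch; the flag name and
placement are only a suggestion):

static void virtio_scsi_data_plane_handle_cmd(VirtIODevice *vdev,
                                              VirtQueue *vq)
{
    VirtIOSCSI *s = VIRTIO_SCSI(vdev);

    virtio_scsi_acquire(s);
    /* Bail out if the main thread has already stopped ioeventfd:
     * the dataplane is being torn down and the vq must not be
     * processed from here any more.
     */
    if (!s->ioeventfd_stopped) {
        virtio_scsi_handle_cmd_vq(s, vq);
    }
    virtio_scsi_release(s);
}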

Fam