On Wed, Feb 28, 2018 at 09:58:13AM +0800, Fam Zheng wrote: > On Tue, 02/27 15:30, Stefan Hajnoczi wrote: > > On Fri, Feb 23, 2018 at 04:20:44PM +0800, Fam Zheng wrote: > > > On Tue, 02/20 13:10, Stefan Hajnoczi wrote: > > > > 1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the > > > > virtio_scsi_ctx_check() assertion failure because the BDS AioContext > > > > has been modified by iothread_stop_all(). > > > > > > Does this patch fix the issue completely? IIUC virtio_scsi_handle_cmd can > > > already be entered at the time of main thread calling > > > virtio_scsi_clear_aio(), > > > so this race condition still exists: > > > > > > main thread iothread > > > ----------------------------------------------------------------------------- > > > vm_shutdown > > > ... > > > virtio_bus_stop_ioeventfd > > > virtio_scsi_dataplane_stop > > > aio_poll() > > > ... > > > > > > virtio_scsi_data_plane_handle_cmd() > > > aio_context_acquire(s->ctx) > > > virtio_scsi_acquire(s).enter > > > virtio_scsi_clear_aio() > > > aio_context_release(s->ctx) > > > > > > virtio_scsi_acquire(s).return > > > virtio_scsi_handle_cmd_vq() > > > ... > > > virtqueue_pop() > > > > > > Is it possible that the above virtqueue_pop() still returns one element > > > that was > > > queued before vm_shutdown() was called? > > > > No, it can't because virtio_scsi_clear_aio() invokes > > virtio_queue_host_notifier_aio_read(&vq->host_notifier) to process the > > virtqueue. By the time we get back to iothread's > > virtio_scsi_data_plane_handle_cmd() the virtqueue is already empty. > > > > Vcpus have been paused so no additional elements can slip into the > > virtqueue. > > So there is: > > static void virtio_queue_host_notifier_aio_read(EventNotifier *n) > { > VirtQueue *vq = container_of(n, VirtQueue, host_notifier); > if (event_notifier_test_and_clear(n)) { > virtio_queue_notify_aio_vq(vq); > } > } > > Guest kicks after adding an element to VQ, but we check ioeventfd before > trying > virtqueue_pop(). Is that a problem? If VCPUs are paused after enqueuing but > before kicking VQ, the ioeventfd is not set, the virtqueue is not processed > here.
You are right. This race condition also affects the existing 'stop' command, where ioeventfd is disabled in the same way. I'll send a v2 with a patch to fix this. Stefan
signature.asc
Description: PGP signature