On 14/01/2014 15:24, Eric Farman wrote:
> When an unplug is triggered via QMP, the routine scsi_req_cancel
> is called to cancel any outstanding requests.  However, the I/Os
> themselves were instantiated via an asynchronous call that will
> drive scsi_*_complete routines after the unplug call stack finishes.
> As all references to the request have been released by the cancel
> thread, the scsi_*_complete routines experience a range of failures
> when it attempts to manipulate the released storage.

This should never happen.  See scsi_req_cancel:

    void scsi_req_cancel(SCSIRequest *req)
    {
        trace_scsi_req_cancel(req->dev->id, req->lun, req->tag);
        if (!req->enqueued) {
            return;
        }
        scsi_req_ref(req);
        scsi_req_dequeue(req);
        req->io_canceled = true;
        if (req->ops->cancel_io) {
            req->ops->cancel_io(req);
        }
        if (req->bus->info->cancel) {
            req->bus->info->cancel(req);
        }
        scsi_req_unref(req);
    }

After req->ops->cancel_io returns, the following invariant must hold:

    Any AIO callbacks will have been called before req->ops->cancel_io
    returns, or they never will.

The invariant is also present in bdrv_aio_cancel, and should be respected
at all levels: dma_aio_cancel in dma-helpers.c, thread_pool_cancel in
thread-pool.c, laio_cancel in block/linux-aio.c, and so on.

scsi_cancel_io (in hw/scsi/scsi-disk.c) is very careful in its handling
of reference counts and aiocb, with the exact purpose of triggering an
assertion failure if the invariant is not respected.

Now that I look more at the code, at least this patch is needed:

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index bce617c..ee1f5eb 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2306,6 +2306,7 @@ static const SCSIReqOps scsi_disk_emulate_reqops
     .send_command = scsi_disk_emulate_command,
     .read_data    = scsi_disk_emulate_read_data,
     .write_data   = scsi_disk_emulate_write_data,
+    .cancel_io    = scsi_cancel_io,
     .get_buf      = scsi_get_buf,
 };


but it should only have an effect in very special cases, with commands
such as UNMAP, WRITE SAME or MODE SELECT.

Paolo
