On 9/22/23 02:36, Wenchao Hao wrote:
Fix the issue of skipping scsi_try_bus_device_reset() for devices
which is in progress of removing in following order:

T1:                                     T2:scsi_error_handle
__scsi_remove_device
   scsi_device_set_state(sdev, SDEV_DEL)
                                        // would skip device with SDEV_DEL state
                                        shost_for_each_device()
                                          scsi_try_bus_device_reset
                                        flush all commands
  ...
  scsi_device is released

Some drivers like smartpqi only implement eh_device_reset_handler,
if device reset is skipped, the commands which had been sent to
firmware or devices hardware are not cleared. The error handle
would flush all these commands in scsi_unjam_host().

When the commands are finished by hardware, use after free issue is
triggered.

Add parameter "check_state" to macro shost_for_each_device() to
determine if check device status when traversal scsi_device
of Scsi_Host, and set this parameter to false when traversal
in scsi_error_handle to address this issue.

The above is incomprehensible to me. Please explain more clearly why this change is needed.

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d0911bc28663..db8b9e42267c 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -704,6 +704,23 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable)
        return 0;
  }
+static int __scsi_device_get(struct scsi_device *sdev, bool check_state)

"check_state" is a bad argument name because it does not clearly explain the purpose of this argument. Would "include_deleted" perhaps be a better name?

+{
+       if (check_state &&
+           (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL))
+               goto fail;
+       if (!try_module_get(sdev->host->hostt->module))
+               goto fail;
+       if (!get_device(&sdev->sdev_gendev))
+               goto fail_put_module;
+       return 0;
+
+fail_put_module:
+       module_put(sdev->host->hostt->module);
+fail:
+       return -ENXIO;
+}

Looking at the above code, I think we need two functions: one that does not include the sdev->sdev_state check and a second function that includes the sdev->sdev_state check (scsi_device_get()) and calls the first. That will result in code that is easier to read than calls to a function with a boolean argument.

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index c498a12f7715..e166d053c839 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -389,21 +389,25 @@ extern void __starget_for_each_device(struct scsi_target 
*, void *,
/* only exposed to implement shost_for_each_device */
  extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
-                                                 struct scsi_device *);
+                                                 struct scsi_device *,
+                                                 bool);
/**
   * shost_for_each_device - iterate over all devices of a host
   * @sdev: the &struct scsi_device to use as a cursor
   * @shost: the &struct scsi_host to iterate over
+ * @check_state: if skip check scsi_device's state to skip some devices
+ *               scsi_device with SDEV_DEL or SDEV_CANCEL would be skipped
+ *               if this is true
   *
   * Iterator that returns each device attached to @shost.  This loop
   * takes a reference on each device and releases it at the end.  If
   * you break out of the loop, you must call scsi_device_put(sdev).
   */
-#define shost_for_each_device(sdev, shost) \
-       for ((sdev) = __scsi_iterate_devices((shost), NULL); \
+#define shost_for_each_device(sdev, shost, check_state) \
+       for ((sdev) = __scsi_iterate_devices((shost), NULL, check_state); \
             (sdev); \
-            (sdev) = __scsi_iterate_devices((shost), (sdev)))
+            (sdev) = __scsi_iterate_devices((shost), (sdev), check_state))
/**
   * __shost_for_each_device - iterate over all devices of a host (UNLOCKED)

Since only the SCSI error handler passes 0 as 'check_state' argument to shost_for_each_device(), instead of adding a boolean argument to that macro, please do the following:
* Introduce a new macro for the check_state = 1 case.
* Keep the semantics for shost_for_each_device().

With this approach no SCSI LLDs will have to be modified.

Thanks,

Bart.

--
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/open-iscsi/ce9cef41-29e2-4056-a60b-b0e4ee1cc17e%40acm.org.

Reply via email to