Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling

Bart Van Assche Wed, 19 Jun 2013 08:29:00 -0700

On 06/19/13 15:44, Jack Wang wrote:

+               /*
+                * It can occur that after fast_io_fail_tmo expired and before
+                * dev_loss_tmo expired that the SCSI error handler has
+                * offlined one or more devices. scsi_target_unblock() doesn't
+                * change the state of these devices into running, so do that
+                * explicitly.
+                */
+               spin_lock_irq(shost->host_lock);
+               __shost_for_each_device(sdev, shost)
+                       if (sdev->sdev_state == SDEV_OFFLINE)
+                               sdev->sdev_state = SDEV_RUNNING;
+               spin_unlock_irq(shost->host_lock);


Do you have a test case to verify this behaviour?


Hello Jack,

This is what I came up with after analyzing why a so-called "port
flapping" test failed. The concept of that test is simple: use
ibportstate to disable and reenable the proper IB port on the switch
with random intervals and check whether I/O starts running again if the
path remains operational long enough. When running such a test for a few
days with random intervals between a few seconds and a few minutes,
sooner or later it will occur that scsi_try_host_reset() succeeds and
that scsi_eh_test_devices() fails. That will cause the SCSI error
handler to offline devices. Hence the above code to change the offline
state into running after a reconnect succeeds. I'm not proud of that
code but I couldn't find a better solution. Maybe the above code won't
be necessary anymore once we switch to Hannes' new SCSI error handler.
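
To make the effect of the quoted hunk easier to see, here is a rough
userspace model of it. This is only a sketch and not kernel code: the
model_* names are hypothetical stand-ins for struct scsi_device and its
sdev_state values, and the host_lock locking from the patch is left out
because the model is single-threaded.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's SDEV_* device states. */
enum model_sdev_state {
	MODEL_SDEV_RUNNING,
	MODEL_SDEV_BLOCK,
	MODEL_SDEV_OFFLINE,
};

/* Hypothetical, much reduced stand-in for struct scsi_device. */
struct model_sdev {
	int id;
	enum model_sdev_state state;
};

/*
 * Models the loop in the patch: after a successful reconnect, every
 * device that the error handler had offlined is put back into the
 * running state. Devices in any other state are not touched here.
 * Returns the number of devices that were changed.
 */
static size_t model_revive_offline(struct model_sdev *devs, size_t n)
{
	size_t revived = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].state == MODEL_SDEV_OFFLINE) {
			devs[i].state = MODEL_SDEV_RUNNING;
			revived++;
		}
	}
	return revived;
}
```

Calling model_revive_offline() on an array in which two devices are
offline changes exactly those two to running, and a second call changes
nothing.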


Bart.
