> On Aug 10, 2015, at 10:41 PM, Benjamin Herrenschmidt
> <[email protected]> wrote:
>
> On Mon, 2015-08-10 at 12:09 -0500, Matthew R. Ochs wrote:
>> Introduce support for enhanced I/O error handling.
>>
>> Signed-off-by: Matthew R. Ochs <[email protected]>
>> Signed-off-by: Manoj N. Kumar <[email protected]>
>> ---
>
> So I'm not necessarily very qualified to review SCSI bits as I haven't
> done anything close to the Linux SCSI code for almost a decade but I
> have a couple of questions and nits…
Thanks for reviewing Ben!
>
>> wait_queue_head_t tmf_waitq;
>> bool tmf_active;
>> - u8 err_recovery_active:1;
>> + wait_queue_head_t limbo_waitq;
>> + enum cxlflash_state state;
>> };
>>
>> struct afu_cmd {
>> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
>> index 76a7286..18359d4 100644
>> --- a/drivers/scsi/cxlflash/main.c
>> +++ b/drivers/scsi/cxlflash/main.c
>> @@ -380,6 +380,18 @@ static int cxlflash_queuecommand(struct Scsi_Host
>> *host, struct scsi_cmnd *scp)
>> }
>> spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
>>
>> + switch (cfg->state) {
>> + case STATE_LIMBO:
>> + pr_debug_ratelimited("%s: device in limbo!\n", __func__);
>> + rc = SCSI_MLQUEUE_HOST_BUSY;
>> + goto out;
>> + case STATE_FAILTERM:
>> + pr_debug_ratelimited("%s: device has failed!\n", __func__);
>> + goto error;
>> + default:
>> + break;
>> + }
>
> I noticed that you mostly read and write that new state outside of your
> tmf_waitq.lock. Is there any other lock or mutex protecting you ? In
> this specific case ?
No there isn’t.
>
> I know in the old day queuecommand was called with a host lock, is that
> still the case ?
No, it’s no longer the case.
> Or you just don't care about an occasional spurrious
> SCSI_MLQUEUE_HOST_BUSY ?
This is pretty much true. =)
>
>> cmd = cxlflash_cmd_checkout(afu);
>> if (unlikely(!cmd)) {
>> pr_err("%s: could not get a free command\n", __func__);
>> @@ -428,6 +440,10 @@ static int cxlflash_queuecommand(struct Scsi_Host
>> *host, struct scsi_cmnd *scp)
>>
>> out:
>> return rc;
>> +error:
>> + scp->result = (DID_NO_CONNECT << 16);
>> + scp->scsi_done(scp);
>> + return 0;
>> }
>>
>> /**
>> @@ -455,9 +471,21 @@ static int cxlflash_eh_device_reset_handler(struct
>> scsi_cmnd *scp)
>> get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
>> get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
>>
>> - rcr = send_tmf(afu, scp, TMF_LUN_RESET);
>> - if (unlikely(rcr))
>> + switch (cfg->state) {
>> + case STATE_NORMAL:
>> + rcr = send_tmf(afu, scp, TMF_LUN_RESET);
>> + if (unlikely(rcr))
>> + rc = FAILED;
>> + break;
>> + case STATE_LIMBO:
>> + wait_event(cfg->limbo_waitq, cfg->state != STATE_LIMBO);
>> + if (cfg->state == STATE_NORMAL)
>> + break;
>> + /* fall through */
>> + default:
>> rc = FAILED;
>> + break;
>> + }
>
> Same here, since you are doing wait_event, I assume no lock is held
> (unless it's a mutex) and in subsequent places I am not listing.
>
> As I said, it could well be that it all works fine but it would be worth
> mentioning in this case because it's not obvious from simply reading the
> code.
It has been working well, and yes, your assumption is correct. We can look at
adding more/better serialization as a future enhancement.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html