On 10.08.20 16:13, Alan Stern wrote:
> On Mon, Aug 10, 2020 at 02:03:17PM +0200, Martin Kepplinger wrote:
>> On 09.08.20 17:26, Alan Stern wrote:
>>> This is a somewhat fragile approach.  You don't know for certain that 
>>> scsi_noretry_cmd will be called.  Also, scsi_noretry_cmd can be called 
>>> from other places.
>>>
>>> It would be better to clear the expecting_media_change flag just before 
>>> returning from scsi_decide_disposition.  That way its use is localized 
>>> to one routine, not spread out between two.
>>>
>>> Alan Stern
>>>
>>
>> Hi Alan,
>>
>> maybe you're right. I initially just thought that I'd allow for specific
>> error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY,
>> ERROR) despite having the flag set.
>>
>> The below version works equally fine for me but I'm not sure if it's
>> actually more safe.
>>
>> James, when exposing a new writable sysfs option like
>> "suspend_no_media_change"(?) that drivers can check before setting the
>> new "expecting_media_change" flag (during resume), would this addition
>> make sense to you?
>>
>> thanks,
>>
>>                       martin
>>
>>
>>
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>                              return NEEDS_RETRY;
>>                      }
>>              }
>> +            if (scmd->device->expecting_media_change) {
>> +                    if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
>> +                            /*
>> +                             * clear the expecting_media_change in
>> +                             * scsi_decide_disposition() because we
>> +                             * need to catch possible "fail fast" overrides
>> +                             * that block readahead can cause.
>> +                             */
>> +                            return NEEDS_RETRY;
>> +                    }
>> +            }
>> +
>>              /*
>>               * we might also expect a cc/ua if another LUN on the target
>>               * reported a UA with an ASC/ASCQ of 3F 0E -
>> @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
>>       * the request was not marked fast fail.  Note that above,
>>       * even if the request is marked fast fail, we still requeue
>>       * for queue congestion conditions (QUEUE_FULL or BUSY) */
>> -    if ((++scmd->retries) <= scmd->allowed
>> -        && !scsi_noretry_cmd(scmd)) {
>> -            return NEEDS_RETRY;
>> +    if ((++scmd->retries) <= scmd->allowed) {
>> +            /*
>> +             * but scsi_noretry_cmd() cannot override the
>> +             * expecting_media_change flag.
>> +             */
>> +            if (!scsi_noretry_cmd(scmd) ||
>> +                scmd->device->expecting_media_change) {
>> +                    scmd->device->expecting_media_change = 0;
>> +                    return NEEDS_RETRY;
>> +            } else {
>> +                    /* marked fast fail and not expected. */
>> +                    return SUCCESS;
>> +            }
>>      } else {
> 
> This may not matter...  but it's worth pointing out that 
> expecting_media_change doesn't get cleared if ++scmd->retries > 
> scmd->allowed.

absolutely worth pointing out and I'm not yet sure about that one.

> 
> It also doesn't get cleared in cases where the device _doesn't_ 
> report a Unit Attention.

true. but don't we set the flag for a future UA we don't yet know of? If
we would want to clear it outside of a UA, I think we'd need to keep
track of a suspend/resume cycle and if we see that we *had* successfully
"done requests" after resuming, we could clear it...

> 
> Alan Stern
> 

Reply via email to