Re: [PATCH 2/5] scsi: improved eh timeout handler

Douglas Gilbert Thu, 07 Nov 2013 10:34:54 -0800

On 13-11-07 01:45 AM, Hannes Reinecke wrote:

On 11/06/2013 06:23 PM, Mike Christie wrote:

On 11/05/2013 10:48 PM, Hannes Reinecke wrote:

On 11/05/2013 08:19 PM, Mike Christie wrote:

On 11/04/2013 11:05 PM, Hannes Reinecke wrote:

+
+       scmd->eh_eflags |= SCSI_EH_ABORT_SCHEDULED;
+       SCSI_LOG_ERROR_RECOVERY(3,
+               scmd_printk(KERN_INFO, scmd,
+                           "scmd %p abort scheduled\n", scmd));
+       schedule_delayed_work(&scmd->abort_work, HZ / 100);
+       return SUCCESS;
+}


Do we want to use our own workqueue_struct with WQ_MEM_RECLAIM set?

Errm. Yes, why?

I must admit I'm not _that_ familiar with workqueues ...
Care to explain?


We all share the above workqueue_struct's pool of threads, so if we get
stuck behind code doing GFP_KERNEL allocs that end up needing to write
data to the disk we are now trying to abort commands on, then we could
get stuck. With WQ_MEM_RECLAIM, we have our own backup thread that gets
created at workqueue_struct create time which can get used in cases like
that so we can always make forward progress.
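
A minimal sketch of what that could look like, assuming a single
dedicated queue for the abort work. The names scsi_abort_wq,
scsi_abort_wq_init(), scsi_abort_wq_exit() and scsi_schedule_abort() are
made up for illustration and are not taken from the patch; only
alloc_workqueue(), queue_delayed_work() and destroy_workqueue() are
existing workqueue API. This is a kernel fragment, not a standalone
program:

```c
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/workqueue.h>

/* Hypothetical: a workqueue used only for scheduled command aborts. */
static struct workqueue_struct *scsi_abort_wq;

static int scsi_abort_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM reserves a rescuer thread for this queue when it
	 * is created, so queued work can still run under memory pressure.
	 * A max_active of 0 selects the default concurrency limit.
	 */
	scsi_abort_wq = alloc_workqueue("scsi_abort", WQ_MEM_RECLAIM, 0);
	return scsi_abort_wq ? 0 : -ENOMEM;
}

static void scsi_abort_wq_exit(void)
{
	destroy_workqueue(scsi_abort_wq);
}

/*
 * Hypothetical helper: stands in for the schedule_delayed_work() call
 * in the hunk above, which queues on the shared system workqueue.
 */
static void scsi_schedule_abort(struct delayed_work *abort_work)
{
	queue_delayed_work(scsi_abort_wq, abort_work, HZ / 100);
}
```

Whether the queue should be global or per host is a separate design
choice that this sketch does not try to settle.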

Ah. Right. Yes, that makes sense.

I guess I'll have to redo the patches _yet again_.


I wonder if it might be useful to flag a LU (disk)
with "try really hard to recover me, perhaps at the
expense of other LUs". Seems like a LU containing the
rootfs or swap might qualify for setting such a flag.
And LUs that have this flag cleared could be assumed
to not get wedged in the fashion that Mike pointed out.

Doug Gilbert


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
