On 12/17/2012 10:58 PM, Sreekanth Reddy wrote:
> This patch stops the driver to invoke kthread (which remove the dead ioc)
> for some time while EEH recovery has started.
Thank you for posting this, the issue we have seen is resolved now.
Shouldn't be an additional initialization added?
So after a transient event the non_operational_loop is reset again?
Tomas
diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index fd3b3d7..480111c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -208,6 +208,8 @@ _base_fault_reset_work(struct work_struct *work)
return; /* don't rearm timer */
}
+ ioc->non_operational_loop = 0;
+
if ((doorbell & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_FAULT) {
rc = mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP,
FORCE_BIG_HAMMER);
>
> Signed-off-by: Sreekanth Reddy <[email protected]>
> ---
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index ffd85c5..2349531 100755
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -155,7 +155,7 @@ _base_fault_reset_work(struct work_struct *work)
> struct task_struct *p;
>
> spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags);
> - if (ioc->shost_recovery)
> + if (ioc->shost_recovery || ioc->pci_error_recovery)
> goto rearm_timer;
> spin_unlock_irqrestore(&ioc->ioc_reset_in_progress_lock, flags);
>
> @@ -164,6 +164,20 @@ _base_fault_reset_work(struct work_struct *work)
> printk(MPT2SAS_INFO_FMT "%s : SAS host is non-operational
> !!!!\n",
> ioc->name, __func__);
>
> + /* It may be possible that EEH recovery can resolve some of
> + * pci bus failure issues rather removing the dead ioc function
> + * by considering controller is in a non-operational state. So
> + * here priority is given to the EEH recovery. If it doesn't
> + * not resolve this issue, mpt2sas driver will consider this
> + * controller to non-operational state and remove the dead ioc
> + * function.
> + */
> + if (ioc->non_operational_loop++ < 5) {
> + spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock,
> + flags);
> + goto rearm_timer;
> + }
> +
> /*
> * Call _scsih_flush_pending_cmds callback so that we flush all
> * pending commands back to OS. This call is required to aovid
> @@ -4386,6 +4400,7 @@ mpt2sas_base_attach(struct MPT2SAS_ADAPTER *ioc)
> if (missing_delay[0] != -1 && missing_delay[1] != -1)
> _base_update_missing_delay(ioc, missing_delay[0],
> missing_delay[1]);
> + ioc->non_operational_loop = 0;
>
> return 0;
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h
> b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index 543d8d6..c6ee7aa 100755
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -835,6 +835,7 @@ struct MPT2SAS_ADAPTER {
> u16 cpu_msix_table_sz;
> u32 ioc_reset_count;
> MPT2SAS_FLUSH_RUNNING_CMDS schedule_dead_ioc_flush_running_cmds;
> + u32 non_operational_loop;
>
> /* internal commands, callback index */
> u8 scsi_io_cb_idx;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html