Re: [PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-10 Thread Akinobu Mita
2015-03-10 19:20 GMT+09:00 Gilad Broner :
>>> +static bool inject_cmd_hang_tr(struct ufs_hba *hba)
>>> +{
>>> +   int tag;
>>> +
>>> +   tag = find_first_bit(>outstanding_reqs, hba->nutrs);
>>> +   if (tag == hba->nutrs)
>>> +   return 0;
>>> +
>>> +   __clear_bit(tag, >outstanding_reqs);
>>> +   hba->lrb[tag].cmd = NULL;
>>> +   __clear_bit(tag, >lrb_in_use);
>>
>> hba->lrb_in_use is a bitmap set by test_and_set_bit_lock().  So
>> this should be cleared by clear_bit_unlock().
>
> You are correct. Thanks.
>
>>
>> And as soon as the bit corresponds to this slot in hba->lrb_in_use is
>> cleared, this slot could be reused.  But if the request corresponds
>> to the slot is not completed yet, it could sacrifice the new request,
>> too.  So should we only inject into the commands which have been
>> completed like this?
>
> Please note that we only clear the bit in hba->lrb_in_use. scsi_done is
> not called for this request. Therefore, the tag is not yet free in the
> block layer and next calls for queuecommand will not pass down this tag to
> be used in the UFS driver. So there is no danger of a new request being
> sacrificed.

OK, I see there is no danger as far as the commands are comming
through queuecommand().  But what about the query requests?
PATCH 1/4 in this series has added ioctl interface for query
request which also acquires a tag in hba->lrb_in_use through
ufshcd_get_dev_cmd_tag().  Although it is very difficult to
happen, is it possible for new query requests to be clashed by
inject_cmd_hang_tr() in the same way I described earlier?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-10 Thread Gilad Broner
>> +static bool inject_cmd_hang_tr(struct ufs_hba *hba)
>> +{
>> +   int tag;
>> +
>> +   tag = find_first_bit(>outstanding_reqs, hba->nutrs);
>> +   if (tag == hba->nutrs)
>> +   return 0;
>> +
>> +   __clear_bit(tag, >outstanding_reqs);
>> +   hba->lrb[tag].cmd = NULL;
>> +   __clear_bit(tag, >lrb_in_use);
>
> hba->lrb_in_use is a bitmap set by test_and_set_bit_lock().  So
> this should be cleared by clear_bit_unlock().

You are correct. Thanks.

>
> And as soon as the bit corresponds to this slot in hba->lrb_in_use is
> cleared, this slot could be reused.  But if the request corresponds
> to the slot is not completed yet, it could sacrifice the new request,
> too.  So should we only inject into the commands which have been
> completed like this?

Please note that we only clear the bit in hba->lrb_in_use. scsi_done is
not called for this request. Therefore, the tag is not yet free in the
block layer and next calls for queuecommand will not pass down this tag to
be used in the UFS driver. So there is no danger of a new request being
sacrificed.

On a different note, we are debating internally on a few other changes so
until we consolidate those I will drop this patch with error injection.

Gilad.

-- 
Qualcomm Israel, on behalf of Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-10 Thread Gilad Broner
 +static bool inject_cmd_hang_tr(struct ufs_hba *hba)
 +{
 +   int tag;
 +
 +   tag = find_first_bit(hba-outstanding_reqs, hba-nutrs);
 +   if (tag == hba-nutrs)
 +   return 0;
 +
 +   __clear_bit(tag, hba-outstanding_reqs);
 +   hba-lrb[tag].cmd = NULL;
 +   __clear_bit(tag, hba-lrb_in_use);

 hba-lrb_in_use is a bitmap set by test_and_set_bit_lock().  So
 this should be cleared by clear_bit_unlock().

You are correct. Thanks.


 And as soon as the bit corresponds to this slot in hba-lrb_in_use is
 cleared, this slot could be reused.  But if the request corresponds
 to the slot is not completed yet, it could sacrifice the new request,
 too.  So should we only inject into the commands which have been
 completed like this?

Please note that we only clear the bit in hba-lrb_in_use. scsi_done is
not called for this request. Therefore, the tag is not yet free in the
block layer and next calls for queuecommand will not pass down this tag to
be used in the UFS driver. So there is no danger of a new request being
sacrificed.

On a different note, we are debating internally on a few other changes so
until we consolidate those I will drop this patch with error injection.

Gilad.

-- 
Qualcomm Israel, on behalf of Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-10 Thread Akinobu Mita
2015-03-10 19:20 GMT+09:00 Gilad Broner gbro...@codeaurora.org:
 +static bool inject_cmd_hang_tr(struct ufs_hba *hba)
 +{
 +   int tag;
 +
 +   tag = find_first_bit(hba-outstanding_reqs, hba-nutrs);
 +   if (tag == hba-nutrs)
 +   return 0;
 +
 +   __clear_bit(tag, hba-outstanding_reqs);
 +   hba-lrb[tag].cmd = NULL;
 +   __clear_bit(tag, hba-lrb_in_use);

 hba-lrb_in_use is a bitmap set by test_and_set_bit_lock().  So
 this should be cleared by clear_bit_unlock().

 You are correct. Thanks.


 And as soon as the bit corresponds to this slot in hba-lrb_in_use is
 cleared, this slot could be reused.  But if the request corresponds
 to the slot is not completed yet, it could sacrifice the new request,
 too.  So should we only inject into the commands which have been
 completed like this?

 Please note that we only clear the bit in hba-lrb_in_use. scsi_done is
 not called for this request. Therefore, the tag is not yet free in the
 block layer and next calls for queuecommand will not pass down this tag to
 be used in the UFS driver. So there is no danger of a new request being
 sacrificed.

OK, I see there is no danger as far as the commands are comming
through queuecommand().  But what about the query requests?
PATCH 1/4 in this series has added ioctl interface for query
request which also acquires a tag in hba-lrb_in_use through
ufshcd_get_dev_cmd_tag().  Although it is very difficult to
happen, is it possible for new query requests to be clashed by
inject_cmd_hang_tr() in the same way I described earlier?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-04 Thread Akinobu Mita
2015-03-02 23:56 GMT+09:00 Gilad Broner :
> From: Sujit Reddy Thumma 
>
> Use fault-injection framework to simulate error conditions
> in the controller and verify error handling mechanisms
> implemented in UFS host controller driver.
>
> This is used only during development and hence
> guarded by CONFIG_UFS_FAULT_INJECTION debug config option.

This feature looks useful, but I have a couple of comments and
question.

> +static bool inject_cmd_hang_tr(struct ufs_hba *hba)
> +{
> +   int tag;
> +
> +   tag = find_first_bit(>outstanding_reqs, hba->nutrs);
> +   if (tag == hba->nutrs)
> +   return 0;
> +
> +   __clear_bit(tag, >outstanding_reqs);
> +   hba->lrb[tag].cmd = NULL;
> +   __clear_bit(tag, >lrb_in_use);

hba->lrb_in_use is a bitmap set by test_and_set_bit_lock().  So
this should be cleared by clear_bit_unlock().

And as soon as the bit corresponds to this slot in hba->lrb_in_use is
cleared, this slot could be reused.  But if the request corresponds
to the slot is not completed yet, it could sacrifice the new request,
too.  So should we only inject into the commands which have been
completed like this?

tr_doorbell = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL);
completed_reqs = tr_doorbell ^ hba->outstanding_reqs;
tag = find_first_bit(_reqs, hba->nutrs);
...

Or clear the corresponding bit in REG_UTP_TASK_REQ_LIST_CLEAR, just
like what inject_fatal_err_tr() does?

> +
> +   /* command hang injected */
> +   return 1;
> +}
> +
> +static int inject_cmd_hang_tm(struct ufs_hba *hba)
> +{
> +   int tag;
> +
> +   tag = find_first_bit(>outstanding_tasks, hba->nutmrs);
> +   if (tag == hba->nutmrs)
> +   return 0;
> +
> +   __clear_bit(tag, >outstanding_tasks);
> +   __clear_bit(tag, >tm_slots_in_use);
> +
> +   /* command hang injected */
> +   return 1;
> +}

The same goes for hba->tm_slots_in_use.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-04 Thread Akinobu Mita
2015-03-02 23:56 GMT+09:00 Gilad Broner gbro...@codeaurora.org:
 From: Sujit Reddy Thumma sthu...@codeaurora.org

 Use fault-injection framework to simulate error conditions
 in the controller and verify error handling mechanisms
 implemented in UFS host controller driver.

 This is used only during development and hence
 guarded by CONFIG_UFS_FAULT_INJECTION debug config option.

This feature looks useful, but I have a couple of comments and
question.

 +static bool inject_cmd_hang_tr(struct ufs_hba *hba)
 +{
 +   int tag;
 +
 +   tag = find_first_bit(hba-outstanding_reqs, hba-nutrs);
 +   if (tag == hba-nutrs)
 +   return 0;
 +
 +   __clear_bit(tag, hba-outstanding_reqs);
 +   hba-lrb[tag].cmd = NULL;
 +   __clear_bit(tag, hba-lrb_in_use);

hba-lrb_in_use is a bitmap set by test_and_set_bit_lock().  So
this should be cleared by clear_bit_unlock().

And as soon as the bit corresponds to this slot in hba-lrb_in_use is
cleared, this slot could be reused.  But if the request corresponds
to the slot is not completed yet, it could sacrifice the new request,
too.  So should we only inject into the commands which have been
completed like this?

tr_doorbell = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL);
completed_reqs = tr_doorbell ^ hba-outstanding_reqs;
tag = find_first_bit(completed_reqs, hba-nutrs);
...

Or clear the corresponding bit in REG_UTP_TASK_REQ_LIST_CLEAR, just
like what inject_fatal_err_tr() does?

 +
 +   /* command hang injected */
 +   return 1;
 +}
 +
 +static int inject_cmd_hang_tm(struct ufs_hba *hba)
 +{
 +   int tag;
 +
 +   tag = find_first_bit(hba-outstanding_tasks, hba-nutmrs);
 +   if (tag == hba-nutmrs)
 +   return 0;
 +
 +   __clear_bit(tag, hba-outstanding_tasks);
 +   __clear_bit(tag, hba-tm_slots_in_use);
 +
 +   /* command hang injected */
 +   return 1;
 +}

The same goes for hba-tm_slots_in_use.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-02 Thread Gilad Broner
From: Sujit Reddy Thumma 

Use fault-injection framework to simulate error conditions
in the controller and verify error handling mechanisms
implemented in UFS host controller driver.

This is used only during development and hence
guarded by CONFIG_UFS_FAULT_INJECTION debug config option.

Signed-off-by: Sujit Reddy Thumma 
---
 drivers/scsi/ufs/ufs-debugfs.c | 140 +
 drivers/scsi/ufs/ufs-debugfs.h |   4 ++
 drivers/scsi/ufs/ufshcd.c  |   2 +
 drivers/scsi/ufs/ufshcd.h  |   5 ++
 lib/Kconfig.debug  |  14 +
 5 files changed, 165 insertions(+)

diff --git a/drivers/scsi/ufs/ufs-debugfs.c b/drivers/scsi/ufs/ufs-debugfs.c
index 27ab053..bac72d0 100644
--- a/drivers/scsi/ufs/ufs-debugfs.c
+++ b/drivers/scsi/ufs/ufs-debugfs.c
@@ -17,6 +17,7 @@
  *
  */
 
+#include 
 #include "ufs-debugfs.h"
 #include "unipro.h"
 
@@ -41,6 +42,143 @@ struct desc_field_offset {
} while (0)
 #define DOORBELL_CLR_TOUT_US   (1000 * 1000) /* 1 sec */
 
+#ifdef CONFIG_UFS_FAULT_INJECTION
+
+#define INJECT_COMMAND_HANG (0x0)
+
+static DECLARE_FAULT_ATTR(fail_default_attr);
+static char *fail_request;
+module_param(fail_request, charp, 0);
+
+static bool inject_fatal_err_tr(struct ufs_hba *hba, u8 ocs_err)
+{
+   int tag;
+
+   tag = find_first_bit(>outstanding_reqs, hba->nutrs);
+   if (tag == hba->nutrs)
+   return 0;
+
+   ufshcd_writel(hba, ~(1 << tag), REG_UTP_TRANSFER_REQ_LIST_CLEAR);
+   (>lrb[tag])->utr_descriptor_ptr->header.dword_2 =
+   cpu_to_be32(ocs_err);
+
+   /* fatal error injected */
+   return 1;
+}
+
+static bool inject_fatal_err_tm(struct ufs_hba *hba, u8 ocs_err)
+{
+   int tag;
+
+   tag = find_first_bit(>outstanding_tasks, hba->nutmrs);
+   if (tag == hba->nutmrs)
+   return 0;
+
+   ufshcd_writel(hba, ~(1 << tag), REG_UTP_TASK_REQ_LIST_CLEAR);
+   (>utmrdl_base_addr[tag])->header.dword_2 =
+   cpu_to_be32(ocs_err);
+
+   /* fatal error injected */
+   return 1;
+}
+
+static bool inject_cmd_hang_tr(struct ufs_hba *hba)
+{
+   int tag;
+
+   tag = find_first_bit(>outstanding_reqs, hba->nutrs);
+   if (tag == hba->nutrs)
+   return 0;
+
+   __clear_bit(tag, >outstanding_reqs);
+   hba->lrb[tag].cmd = NULL;
+   __clear_bit(tag, >lrb_in_use);
+
+   /* command hang injected */
+   return 1;
+}
+
+static int inject_cmd_hang_tm(struct ufs_hba *hba)
+{
+   int tag;
+
+   tag = find_first_bit(>outstanding_tasks, hba->nutmrs);
+   if (tag == hba->nutmrs)
+   return 0;
+
+   __clear_bit(tag, >outstanding_tasks);
+   __clear_bit(tag, >tm_slots_in_use);
+
+   /* command hang injected */
+   return 1;
+}
+
+void ufsdbg_fail_request(struct ufs_hba *hba, u32 *intr_status)
+{
+   u8 ocs_err;
+   static const u32 errors[] = {
+   CONTROLLER_FATAL_ERROR,
+   SYSTEM_BUS_FATAL_ERROR,
+   INJECT_COMMAND_HANG,
+   };
+
+   if (!should_fail(>debugfs_files.fail_attr, 1))
+   goto out;
+
+   *intr_status = errors[prandom_u32() % ARRAY_SIZE(errors)];
+   dev_info(hba->dev, "%s: fault-inject error: 0x%x\n",
+   __func__, *intr_status);
+
+   switch (*intr_status) {
+   case CONTROLLER_FATAL_ERROR: /* fall through */
+   ocs_err = OCS_FATAL_ERROR;
+   goto set_ocs;
+   case SYSTEM_BUS_FATAL_ERROR:
+   ocs_err = OCS_INVALID_CMD_TABLE_ATTR;
+set_ocs:
+   if (!inject_fatal_err_tr(hba, ocs_err))
+   if (!inject_fatal_err_tm(hba, ocs_err))
+   *intr_status = 0;
+   break;
+   case INJECT_COMMAND_HANG:
+   if (!inject_cmd_hang_tr(hba))
+   inject_cmd_hang_tm(hba);
+   break;
+   default:
+   BUG();
+   /* some configurations ignore panics caused by BUG() */
+   break;
+   }
+out:
+   return;
+}
+
+static void ufsdbg_setup_fault_injection(struct ufs_hba *hba)
+{
+   hba->debugfs_files.fail_attr = fail_default_attr;
+
+   if (fail_request)
+   setup_fault_attr(>debugfs_files.fail_attr, fail_request);
+
+   /* suppress dump stack everytime failure is injected */
+   hba->debugfs_files.fail_attr.verbose = 0;
+
+   if (IS_ERR(fault_create_debugfs_attr("inject_fault",
+   hba->debugfs_files.debugfs_root,
+   >debugfs_files.fail_attr)))
+   dev_err(hba->dev, "%s: failed to create debugfs entry\n",
+   __func__);
+}
+#else
+void ufsdbg_fail_request(struct ufs_hba *hba, u32 *intr_status)
+{
+}
+
+static void ufsdbg_setup_fault_injection(struct ufs_hba *hba)
+{
+}
+#endif /* 

[PATCH v4 4/4] scsi: ufs: inject errors to verify error handling

2015-03-02 Thread Gilad Broner
From: Sujit Reddy Thumma sthu...@codeaurora.org

Use fault-injection framework to simulate error conditions
in the controller and verify error handling mechanisms
implemented in UFS host controller driver.

This is used only during development and hence
guarded by CONFIG_UFS_FAULT_INJECTION debug config option.

Signed-off-by: Sujit Reddy Thumma sthu...@codeaurora.org
---
 drivers/scsi/ufs/ufs-debugfs.c | 140 +
 drivers/scsi/ufs/ufs-debugfs.h |   4 ++
 drivers/scsi/ufs/ufshcd.c  |   2 +
 drivers/scsi/ufs/ufshcd.h  |   5 ++
 lib/Kconfig.debug  |  14 +
 5 files changed, 165 insertions(+)

diff --git a/drivers/scsi/ufs/ufs-debugfs.c b/drivers/scsi/ufs/ufs-debugfs.c
index 27ab053..bac72d0 100644
--- a/drivers/scsi/ufs/ufs-debugfs.c
+++ b/drivers/scsi/ufs/ufs-debugfs.c
@@ -17,6 +17,7 @@
  *
  */
 
+#include linux/random.h
 #include ufs-debugfs.h
 #include unipro.h
 
@@ -41,6 +42,143 @@ struct desc_field_offset {
} while (0)
 #define DOORBELL_CLR_TOUT_US   (1000 * 1000) /* 1 sec */
 
+#ifdef CONFIG_UFS_FAULT_INJECTION
+
+#define INJECT_COMMAND_HANG (0x0)
+
+static DECLARE_FAULT_ATTR(fail_default_attr);
+static char *fail_request;
+module_param(fail_request, charp, 0);
+
+static bool inject_fatal_err_tr(struct ufs_hba *hba, u8 ocs_err)
+{
+   int tag;
+
+   tag = find_first_bit(hba-outstanding_reqs, hba-nutrs);
+   if (tag == hba-nutrs)
+   return 0;
+
+   ufshcd_writel(hba, ~(1  tag), REG_UTP_TRANSFER_REQ_LIST_CLEAR);
+   (hba-lrb[tag])-utr_descriptor_ptr-header.dword_2 =
+   cpu_to_be32(ocs_err);
+
+   /* fatal error injected */
+   return 1;
+}
+
+static bool inject_fatal_err_tm(struct ufs_hba *hba, u8 ocs_err)
+{
+   int tag;
+
+   tag = find_first_bit(hba-outstanding_tasks, hba-nutmrs);
+   if (tag == hba-nutmrs)
+   return 0;
+
+   ufshcd_writel(hba, ~(1  tag), REG_UTP_TASK_REQ_LIST_CLEAR);
+   (hba-utmrdl_base_addr[tag])-header.dword_2 =
+   cpu_to_be32(ocs_err);
+
+   /* fatal error injected */
+   return 1;
+}
+
+static bool inject_cmd_hang_tr(struct ufs_hba *hba)
+{
+   int tag;
+
+   tag = find_first_bit(hba-outstanding_reqs, hba-nutrs);
+   if (tag == hba-nutrs)
+   return 0;
+
+   __clear_bit(tag, hba-outstanding_reqs);
+   hba-lrb[tag].cmd = NULL;
+   __clear_bit(tag, hba-lrb_in_use);
+
+   /* command hang injected */
+   return 1;
+}
+
+static int inject_cmd_hang_tm(struct ufs_hba *hba)
+{
+   int tag;
+
+   tag = find_first_bit(hba-outstanding_tasks, hba-nutmrs);
+   if (tag == hba-nutmrs)
+   return 0;
+
+   __clear_bit(tag, hba-outstanding_tasks);
+   __clear_bit(tag, hba-tm_slots_in_use);
+
+   /* command hang injected */
+   return 1;
+}
+
+void ufsdbg_fail_request(struct ufs_hba *hba, u32 *intr_status)
+{
+   u8 ocs_err;
+   static const u32 errors[] = {
+   CONTROLLER_FATAL_ERROR,
+   SYSTEM_BUS_FATAL_ERROR,
+   INJECT_COMMAND_HANG,
+   };
+
+   if (!should_fail(hba-debugfs_files.fail_attr, 1))
+   goto out;
+
+   *intr_status = errors[prandom_u32() % ARRAY_SIZE(errors)];
+   dev_info(hba-dev, %s: fault-inject error: 0x%x\n,
+   __func__, *intr_status);
+
+   switch (*intr_status) {
+   case CONTROLLER_FATAL_ERROR: /* fall through */
+   ocs_err = OCS_FATAL_ERROR;
+   goto set_ocs;
+   case SYSTEM_BUS_FATAL_ERROR:
+   ocs_err = OCS_INVALID_CMD_TABLE_ATTR;
+set_ocs:
+   if (!inject_fatal_err_tr(hba, ocs_err))
+   if (!inject_fatal_err_tm(hba, ocs_err))
+   *intr_status = 0;
+   break;
+   case INJECT_COMMAND_HANG:
+   if (!inject_cmd_hang_tr(hba))
+   inject_cmd_hang_tm(hba);
+   break;
+   default:
+   BUG();
+   /* some configurations ignore panics caused by BUG() */
+   break;
+   }
+out:
+   return;
+}
+
+static void ufsdbg_setup_fault_injection(struct ufs_hba *hba)
+{
+   hba-debugfs_files.fail_attr = fail_default_attr;
+
+   if (fail_request)
+   setup_fault_attr(hba-debugfs_files.fail_attr, fail_request);
+
+   /* suppress dump stack everytime failure is injected */
+   hba-debugfs_files.fail_attr.verbose = 0;
+
+   if (IS_ERR(fault_create_debugfs_attr(inject_fault,
+   hba-debugfs_files.debugfs_root,
+   hba-debugfs_files.fail_attr)))
+   dev_err(hba-dev, %s: failed to create debugfs entry\n,
+   __func__);
+}
+#else
+void ufsdbg_fail_request(struct ufs_hba *hba, u32 *intr_status)
+{
+}
+
+static void