On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
On 10/17/2016 12:24 PM, Sumit Saxena wrote:
megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer
without sending it to firmware as firmware takes care of flushing cache.
This patch will change the driver behavior wrt SYNCHRONIZE_CACHE handling.
If underlying firmware has support to handle the SYNCHORNIZE_CACHE, driver
will send it for firmware otherwise complete it back to SCSI layer with
SUCCESS immediately.
If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
"canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will be

This behavior can be controlled via module parameter and user can fallback
to old behavior of returning SYNCHRONIZE_CACHE by driver only without
sending it to firmware.

Signed-off-by: Sumit Saxena <sumit.sax...@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.de...@broadcom.com>
  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
  drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
  4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h 
index ca86c88..43fd14f 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
  #define MR_MAX_MSIX_REG_ARRAY                   16
  #define MR_RDPQ_MODE_OFFSET                   0X00800000
+#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET                0X01000000
  * register set for both 1068 and 1078 controllers
  * structure extended for 1078 registers
@@ -2140,6 +2142,7 @@ struct megasas_instance {
        u8 is_imr;
        u8 is_rdpq;
        bool dev_handle;
+       bool fw_sync_cache_support;
  struct MR_LD_VF_MAP {
        u32 size;
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
index 092387f..a4a8e2f 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -104,6 +104,10 @@ unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
  module_param(scmd_timeout, int, S_IRUGO);
  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See 
+bool block_sync_cache;
+module_param(block_sync_cache, bool, S_IRUGO);
+MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver Default: 0(Send it 
to firmware)");
@@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host *shost, struct 
scsi_cmnd *scmd)
                goto out_done;
- switch (scmd->cmnd[0]) {
-               /*
-                * FW takes care of flush cache on its own
-                * No need to send it down
-                */
+       if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
+               (!instance->fw_sync_cache_support)) {
                scmd->result = DID_OK << 16;
                goto out_done;
-       default:
-               break;
return instance->instancet->build_and_issue_cmd(instance, scmd);
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c 
index 2159f6a..8237580 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance *instance)
                ret = 1;
                goto fail_fw_init;
+       if (!block_sync_cache)
+               instance->fw_sync_cache_support = (scratch_pad_2 &
+                       MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
IOCInitMessage =
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h 
index e3bee04..2857154 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
@@ -72,6 +72,8 @@
  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET     (0x0000006C)
+extern bool block_sync_cache;
   * Raid context flags

Be extra careful with that.

SYNCHRONIZE_CACHE has (potentially) a global scope, as there might be an
array-wide cache, and a cache flush would affect the entire cache.
Linux is using SYNCHRONIZE_CACHE as a per device setting, ie it assumes
that the effects of a cache flush is restricted to the device in question.

If this is _not_ the case I'd rather not enable it.
Have you checked that enabling this functionality doesn't have negative
performance impact?



This must go in - without this fix, there is no data integrity for any file 

In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE commands even when acting in JBOD/non-RAID mode.

Of course, actually doing a SYNCHRONIZE_CACHE to drives will be slower than throwing it away, but this is not optional.

We really need to have some ways to validate that our IO stack is properly and safely configured.

I would love to see a couple of things:

* having T10 & T13 report the existence of a volatile write cache - this is different than WCE set, some devices have a write cache and are battery/flash backed.

* having a robust tool to test over power failure/disconnect that our assumptions are true - any write is durable after a SYNCHRONIZE_CACHE or CACHE_FLUSH_EXT is sent and ack'ed



To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to