On 19.10.2016 11:50, Ching Huang wrote:
> Hi Tomas,
>
> SCSI command checking in queuecommand function of arcmsr can be removed
> safely.
> Now driver can pass all scsi command to controller firmware.
thanks for your quick reply. Is this (passing the SYNCHRONIZE_CACHE command
to firmware) safe for all arcmsr cards, even the oldest?
Please start with your patch a new thread and add your 'Signed-off-by:' line.
>
> diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c
> b/drivers/scsi/arcmsr/arcmsr_hba.c
> --- a/drivers/scsi/arcmsr/arcmsr_hba.c 2016-10-19 16:18:45.000000000
> +0800
> +++ b/drivers/scsi/arcmsr/arcmsr_hba.c 2016-10-19 16:31:59.665524699
> +0800
> @@ -2623,13 +2623,6 @@ static int arcmsr_queue_command_lck(stru
> cmd->scsi_done = done;
> cmd->host_scribble = NULL;
> cmd->result = 0;
> - if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
> - if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
> - cmd->result = (DID_NO_CONNECT << 16);
> - }
> - cmd->scsi_done(cmd);
> - return 0;
> - }
> if (target == 16) {
> /* virtual device for iop message transfer */
> arcmsr_handle_virtual_command(acb, cmd);
>
> Thanks
> Ching
> On Tue, 2016-10-18 at 15:56 +0200, Tomas Henzl wrote:
>> Hi,
>>
>> similar suspicious code path can be found in the queuecommand functions in
>> other drivers too
>> these are -
>> pmcraid.c
>> arcmsr_hba.c
>> cc-ing maintainers -
>> (but both drivers seem to be unmaintained for a while -
>> I've added Ching for arcmsr and Raghava for pmcraid)
>>
>> please read this thread and verify that your driver and firmware is safe.
>>
>> It is expected that a raid card controls the disk write cache (switches it
>> off)
>> but this might not be true for all modes of operation a raid adapter handles
>> - pass through/non-RAID etc. In such case when disk write cache is enabled
>> and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
>> is very much possible during unexpected power loss or even a clean shutdown.
>>
>> tomash
>>
>> On 17.10.2016 19:51, James Bottomley wrote:
>>> On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
>>>>> -----Original Message-----
>>>>> From: James Bottomley [mailto:[email protected]]
>>>>> Sent: Monday, October 17, 2016 10:50 PM
>>>>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena;
>>>>> [email protected]
>>>>> Cc: [email protected]; [email protected];
>>>>> [email protected]; Christoph Hellwig; Martin K. Petersen;
>>>>> Jeff
>>>>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
>>>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>>>>> command
>>>>> to firmware
>>>>>
>>>>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
>>>>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
>>>>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
>>>>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>>>>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>>>>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
>>>>>>>>>> back to
>>>>>>>>>> SCSI layer without sending it to firmware as firmware
>>>>>>>>>> takes
>>>>>>>>>> care of flushing cache. This patch will change the
>>>>>>>>>> driver
>>>>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
>>>>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
>>>>>>>>>> driver
>>>>>>>>>> will send it for firmware otherwise complete it back to
>>>>>>>>>> SCSI
>>>>>>>>>> layer with SUCCESS immediately. If Firmware handle
>>>>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
>>>>>>>>>> "canHandleSyncCache"
>>>>>>>>>> bit in scratch pad register(offset
>>>>>>>>>> 0x00B4) will be set.
>>>>>>>>>>
>>>>>>>>>> This behavior can be controlled via module parameter and
>>>>>>>>>> user
>>>>>>>>>> can fallback to old behavior of returning
>>>>>>>>>> SYNCHRONIZE_CACHE by
>>>>>>>>>> driver only without sending it to firmware.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Sumit Saxena <[email protected]>
>>>>>>>>>> Signed-off-by: Kashyap Desai <[email protected]>
>>>>>>>>>> ---
>>>>>>>>>> drivers/scsi/megaraid/megaraid_sas.h | 3 +++
>>>>>>>>>> drivers/scsi/megaraid/megaraid_sas_base.c | 14
>>>>>>>>>> ++++++---
>>>>>>>>>> -----
>>>>>>>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c | 3 +++
>>>>>>>>>> drivers/scsi/megaraid/megaraid_sas_fusion.h | 2 ++
>>>>>>>>>> 4 files changed, 14 insertions(+), 8 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> index ca86c88..43fd14f 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>>>>>>> #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT 14
>>>>>>>>>> #define MR_MAX_MSIX_REG_ARRAY 16
>>>>>>>>>> #define MR_RDPQ_MODE_OFFSET 0X0
>>>>>>>>>> 0800
>>>>>>>>>> 000
>>>>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET 0
>>>>>>>>>> X010
>>>>>>>>>> 0000
>>>>>>>>>> 0
>>>>>>>>>> +
>>>>>>>>>> /*
>>>>>>>>>> * register set for both 1068 and 1078 controllers
>>>>>>>>>> * structure extended for 1078 registers @@ -2140,6
>>>>>>>>>> +2142,7
>>>>>>>>>> @@ struct megasas_instance {
>>>>>>>>>> u8 is_imr;
>>>>>>>>>> u8 is_rdpq;
>>>>>>>>>> bool dev_handle;
>>>>>>>>>> + bool fw_sync_cache_support;
>>>>>>>>>> };
>>>>>>>>>> struct MR_LD_VF_MAP {
>>>>>>>>>> u32 size;
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> index 092387f..a4a8e2f 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
>>>>>>>>>> module_param(scmd_timeout, int, S_IRUGO);
>>>>>>>>>> MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>>>>>>>>> (10
>>>>>>>>>> -90s), default 90s. See megasas_reset_timer.");
>>>>>>>>>>
>>>>>>>>>> +bool block_sync_cache;
>>>>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
>>>>>>>>>> driver
>>>>>>>>>> Default: 0(Send it to firmware)");
>>>>>>>>>> +
>>>>>>>>>> MODULE_LICENSE("GPL");
>>>>>>>>>> MODULE_VERSION(MEGASAS_VERSION);
>>>>>>>>>> MODULE_AUTHOR("[email protected]");
>>>>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
>>>>> Scsi_Host
>>>>>>>>>> *shost, struct scsi_cmnd *scmd)
>>>>>>>>>> goto out_done;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> - switch (scmd->cmnd[0]) {
>>>>>>>>>> - case SYNCHRONIZE_CACHE:
>>>>>>>>>> - /*
>>>>>>>>>> - * FW takes care of flush cache on its
>>>>>>>>>> own
>>>>>>>>>> - * No need to send it down
>>>>>>>>>> - */
>>>>>>>>>> + if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>>>>>>>> + (!instance->fw_sync_cache_support)) {
>>>>>>>>>> scmd->result = DID_OK << 16;
>>>>>>>>>> goto out_done;
>>>>>>>>>> - default:
>>>>>>>>>> - break;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> return instance->instancet
>>>>>>>>>> ->build_and_issue_cmd(instance, scmd);
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> index 2159f6a..8237580 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
>>>>>>>>>> megasas_instance *instance)
>>>>>>>>>> ret = 1;
>>>>>>>>>> goto fail_fw_init;
>>>>>>>>>> }
>>>>>>>>>> + if (!block_sync_cache)
>>>>>>>>>> + instance->fw_sync_cache_support =
>>>>>>>>>> (scratch_pad_2
>>>>>>>>>> &
>>>>>>>>>> + MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
>>>>>>>>>> ? 1
>>>>>>>>>> :
>>>>>>>>>> 0;
>>>>>>>>>>
>>>>>>>>>> IOCInitMessage =
>>>>>>>>>> dma_alloc_coherent(&instance->pdev->dev,
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> index e3bee04..2857154 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> @@ -72,6 +72,8 @@
>>>>>>>>>> #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
>>>>>>>>>> (0x0000030C)
>>>>>>>>>> #define MPI2_REPLY_POST_HOST_INDEX_OFFSET (0x00
>>>>>>>>>> 0000
>>>>>>>>>> 6C)
>>>>>>>>>>
>>>>>>>>>> +extern bool block_sync_cache;
>>>>>>>>>> +
>>>>>>>>>> /*
>>>>>>>>>> * Raid context flags
>>>>>>>>>> */
>>>>>>>>>>
>>>>>>>>> Be extra careful with that.
>>>>>>>>>
>>>>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
>>>>>>>>> there
>>>>>>>>> might be an array-wide cache, and a cache flush would
>>>>>>>>> affect the
>>>>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
>>>>>>>>> device
>>>>>>>>> setting, ie it assumes that the effects of a cache flush is
>>>>>>>>> restricted to the device in question.
>>>>>>>>>
>>>>>>>>> If this is _not_ the case I'd rather not enable it.
>>>>>>>>> Have you checked that enabling this functionality doesn't
>>>>>>>>> have
>>>>>>>>> negative performance impact?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Hannes
>>>>>>>> This must go in - without this fix, there is no data
>>>>>>>> integrity for
>>>>>>>> any file system.
>>>>>>> That's not what I get from the change log. What it says to me
>>>>>>> is
>>>>>>> that the caches are currently firmware managed. Barring
>>>>>>> firmware
>>>>>>> bugs, that means that we currently don't have any integrity
>>>>>>> issues.
>>>>>> Your understanding (or the change log) is incorrect.
>>>>> There's no current way in English of construing "as firmware takes
>>>>> care of
>>>>> flushing cache" to mean its inverse, so the changelog needs
>>>>> updating to
>>>>> explain
>>>>> that firmware does *not* take care of cache flushing, so
>>>>> effectively
>>>>> nothing
>>>>> does and we'll need a cc to stable if the problem is that nothing
>>>>> takes
>>>>> care of
>>>>> flushing the drive caches.
>>>>>
>>>>> James
>>>> Sorry for confusion. More accurate to say -
>>>>
>>>> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to
>>>> SCSI layer without sending it down to firmware as firmware supposed
>>>> to takes care of flushing cache for all Virtual Disk at the time of
>>>> system reboot/shutdown. Because of above FW rule driver coded wrongly
>>>> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD
>>>> as well. (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass
>>>> the same to the Firmware. ). We figure out that even for VD, why
>>>> driver should send back SYNC_CACHE command...let's have the same in
>>>> FW (giving all control in FW)
>>>>
>>>> New behavior described -
>>>>
>>>> IF application requests SYNCH_CACHE
>>>> IF any FW which supports new API bit called canHandleSyncCache
>>>> Driver sends SYNCH_CACHE command to the FW
>>>> IF 'VirtualDisk'
>>>> FW does not send SYNCH_CACHE to drives
>>>> FW returns SUCCESS
>>>> ELSE IF 'JBOD'
>>>> FW sends SYNCH_CACHE to drive
>>>> FW obtains status from drive and returns same status back to
>>>> driver
>>>> ENDIF
>>>>
>>>> Fixing this is robust if FW and driver changes are inline. See below
>>>> for more detail.
>>>>
>>>> Case - 1
>>>> This patch is attempt to fix one level problem where Virtual Disks
>>>> are not managed correctly in MR FW. There are some MR FW (E.a Gen2
>>>> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not
>>>> reply correct for SYNCH_CACHE. This was handled in past and carry
>>>> forward (in driver + FW ) to all new generation products as well. We
>>>> tried to collect all possible details why it was handled such way to
>>>> provide better driver fix for this issue(mainly to avoid this FW
>>>> checks and module parameters etc..). But now it looks like to make
>>>> generic fix (Device ID based etc..)is even risky, so went with this
>>>> protective approach. It is very risky to find out which Device ID
>>>> and FW has capability to manage SYNC_CACHE, so we managed to figure
>>>> out which are the new FW using FW capability bit.
>>>>
>>>> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported
>>>> command (this is a firmware bug) and immediately it will be failed to
>>>> SML. This will cause File system to go Read-only.
>>> That's a better description ... what you're saying is that the patch
>>> doesn't actually fix the bug Ric is worrying about.
>>>
>>>> Case -2
>>>> One more thing which we are trying and known driver can send
>>>> "SYNC_CACHE" unconditionally to all generation of FW. For this we
>>>> are doing more unit testing on various controllers/FW and planning to
>>>> provide another patch to fix JBOD path for any FW. (It will not be
>>>> based on FW capability/module parameter).
>>> OK, but we really need this ASAP. As Ric said, data integrity is at
>>> risk until this is fixed. Can you also reference the commit that
>>> introduced the problem so we know how far to do the sable backports?
>>>
>>>> If we remove module parameter, we can ask customer to disable WCE on
>>>> drive to get similar impact.
>>> Thanks,
>>>
>>> James
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html