Re: [PATCH 06/16] aacraid: Added sysfs for driver version
On 02/15/2017 07:12 PM, Raghava Aditya Renukunta wrote: > > I agree , but it makes it easier to get the driver version when I am > developing > and I don't know which driver version is currently loaded > > In addition internally our test automation suites use this information as > opposed to modinfo to get the driver version running. OK then. Speaking of test automation, is there something you may be able to share? Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
On 02/15/2017 07:08 PM, Raghava Aditya Renukunta wrote: > > I will put in the queue for the next set of patch submissions and chip away > at it bit by bit. Yay, thanks. -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
[PATCH] scsi_dh: only attach to SCSI devices
Any device might be setting a queuedata structure, so we need to check if the queuedata really belongs to a SCSI device before proceeding. Signed-off-by: Hannes Reinecke--- drivers/scsi/scsi_dh.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_dh.c b/drivers/scsi/scsi_dh.c index b8d3b97..da104ed 100644 --- a/drivers/scsi/scsi_dh.c +++ b/drivers/scsi/scsi_dh.c @@ -226,7 +226,9 @@ static struct scsi_device *get_sdev_from_queue(struct request_queue *q) spin_lock_irqsave(q->queue_lock, flags); sdev = q->queuedata; - if (!sdev || !get_device(>sdev_gendev)) + if (!sdev || + !scsi_is_sdev_device(>sdev_gendev) || + !get_device(>sdev_gendev)) sdev = NULL; spin_unlock_irqrestore(q->queue_lock, flags); -- 1.8.5.6
Re: [PATCH v4] sd: Check for unaligned partial completion
Bart, On 2/16/17 12:28, Bart Van Assche wrote: > On Thu, 2017-02-16 at 11:52 +0900, Damien Le Moal wrote: >> Thanks for the pointers. I checked libiscsi tests. And from what is done >> there, it seems to me that it is basically impossible to distinguished >> between a buggy hardware response and an in-purpose (or buggy) not >> aligned data-out buffer size. >> E.g. the test in test_write10_residuals.c which issues a 1 block write >> with a buf size of 1 and checks that residual is 1 - block size. >> For that case, with both this patch and the original mpt3sas patch, the >> rounding up of resid to the block size will break the test. >> >> Yet, it is very surprising that LIO and SCST_SCSI issue the "buggy" >> request as is without checking and eventually correcting things first. >> So unless LIO and SCST_SCSI are fixed to do that on the request they >> send to the device, both patches should be dropped. >> Users with buggy mpt3 HBAs will need to update the HBA FW... >> >> So the conclusion may be that we need to drop everything ? The mpt3sas >> patch breaks ZBC now, so that must be removed too. > > Hello Damien, > > For the example you cited (bufflen - residual) = block size. This is > why I proposed to round (bufflen - residual) instead of residual. Whether > bufflen or (bufflen - residual) is rounded doesn't matter if bufflen is a > multiple of the logical block size. But it matters if bufflen is not a > multiple of the logical block size. To be sure, I went through the scsi submission and completion path of REQ_TYPE_FS again. For REQ_OP_DISCARD, REQ_OP_WRITE_SAME, REQ_OP_ZONE_REPORT and REQ_OP_ZONE_RESET, the resid check in sd_done is skipped thanks to the special handling for these commands in the switch-case statement. So we do not need to worry about scsi_buflen() alignment. For REQ_OP_FLUSH, scsi_buflen() == 0. So no problem either. All that remains of REQ_TYPE_FS are REQ_OP_READ and REQ_OP_WRITE. For these, cmd->sdb.length (that is scsi_buflen) is set to count * sector_size in sd_setup_read_write_cmnd. So scsi_buflen() will always be aligned. This also means that things like the libiscsi tests with incorrect data buffer size will actually not be seen as is by sd_done. For the example of the test_write10_residuals.c of 1 block write using a 1B buffer, sd_done will see scsi_buflen() of 1 block. So it seems to me that for the buggy mpt3sas hardware, we really only need to check resid alignment and correct it if necessary. The code you proposed if (resid & (sector_size - 1)) { resid = min(good_bytes, round_up(resid, sector_size)); good_bytes -= resid; scsi_set_resid(SCpnt, resid); } looks just perfect for this. Let me know if I missed something. Best regards. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Research Group, Western Digital Corporation damien.lem...@wdc.com (+81) 0466-98-3593 (ext. 513593) 1 kirihara-cho, Fujisawa, Kanagawa, 252-0888 Japan www.wdc.com, www.hgst.com
Re: [PATCH v4] sd: Check for unaligned partial completion
Martin, On 2/16/17 12:36, Martin K. Petersen wrote: > Damien, > > Damien> So the conclusion may be that we need to drop everything ? The > Damien> mpt3sas patch breaks ZBC now, so that must be removed too. > > Nah. > > But it's important that we restrict the rounding to TYPE_FS requests > (i.e. I/Os issued by the kernel where we control what's going on). Your > patch already does this by virtue of changing sd_done(). So for that > reason I'm OK with relocating the rounding from mpt3sas to sd.c. > > However, I still think it's a fundamental problem that mpt3sas reports > completion in terms of "SAS frame bytes (successfully?) transferred" > instead of "blocks acknowledged by the target". The drive can obviously > not write a partial sector. So that approach to completion reporting > seems completely bogus to me. > > Anyway. My main comment is that it seems like the rounding logic would > be better contained in sd_completed_bytes() since that is already doing > sanity checking on what's reported. Indeed, that would be nicer. But sd_completed_bytes is only called when sense data (a sense key) was also returned with the command. It is not clear to me if that is always the case when an unaligned resid is reported by the bogus mpt3sas HBA. Handling all possible (bogus) cases may require a more extensive rewrite of sd_completed_bytes, which may not be the best of choice this late in the RC cycle. Should I keep the fix in its current place and defer the move to sd_completed_bytes to a later patch for 4.11 ? > Also, I am no fan of perpetuating the arcane SCSI logging stuff now that > we have tracing. So just make it a regular sd_printk(). Sure, will do. Thank you. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Research Group, Western Digital Corporation damien.lem...@wdc.com (+81) 0466-98-3593 (ext. 513593) 1 kirihara-cho, Fujisawa, Kanagawa, 252-0888 Japan www.wdc.com, www.hgst.com
Re: sense handling improvements
> "Christoph" == Christoph Hellwigwrites: Christoph> this series is on top of the scsi_request changes in Jens' Christoph> tree and further improves the handling of the sense buffer. Very nice cleanup! Christoph> The first patch prevents any possibily of reusing stale sense Christoph> codes in sense headers, and is a bug fix that we should Christoph> probably get into the block tree ASAP. Christoph> The rest cleans up handling of the parsed sense data and Christoph> could go in either through the block tree, or a SCSI branch Christoph> on top of the block tree. I can bring them in after Linus' initial block pull. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v4] sd: Check for unaligned partial completion
> "Damien" == Damien Le Moalwrites: Damien, Damien> So the conclusion may be that we need to drop everything ? The Damien> mpt3sas patch breaks ZBC now, so that must be removed too. Nah. But it's important that we restrict the rounding to TYPE_FS requests (i.e. I/Os issued by the kernel where we control what's going on). Your patch already does this by virtue of changing sd_done(). So for that reason I'm OK with relocating the rounding from mpt3sas to sd.c. However, I still think it's a fundamental problem that mpt3sas reports completion in terms of "SAS frame bytes (successfully?) transferred" instead of "blocks acknowledged by the target". The drive can obviously not write a partial sector. So that approach to completion reporting seems completely bogus to me. Anyway. My main comment is that it seems like the rounding logic would be better contained in sd_completed_bytes() since that is already doing sanity checking on what's reported. Also, I am no fan of perpetuating the arcane SCSI logging stuff now that we have tracing. So just make it a regular sd_printk(). Thanks! -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v4] sd: Check for unaligned partial completion
On Thu, 2017-02-16 at 11:52 +0900, Damien Le Moal wrote: > Thanks for the pointers. I checked libiscsi tests. And from what is done > there, it seems to me that it is basically impossible to distinguished > between a buggy hardware response and an in-purpose (or buggy) not > aligned data-out buffer size. > E.g. the test in test_write10_residuals.c which issues a 1 block write > with a buf size of 1 and checks that residual is 1 - block size. > For that case, with both this patch and the original mpt3sas patch, the > rounding up of resid to the block size will break the test. > > Yet, it is very surprising that LIO and SCST_SCSI issue the "buggy" > request as is without checking and eventually correcting things first. > So unless LIO and SCST_SCSI are fixed to do that on the request they > send to the device, both patches should be dropped. > Users with buggy mpt3 HBAs will need to update the HBA FW... > > So the conclusion may be that we need to drop everything ? The mpt3sas > patch breaks ZBC now, so that must be removed too. Hello Damien, For the example you cited (bufflen - residual) = block size. This is why I proposed to round (bufflen - residual) instead of residual. Whether bufflen or (bufflen - residual) is rounded doesn't matter if bufflen is a multiple of the logical block size. But it matters if bufflen is not a multiple of the logical block size. Bart.
Re: [PATCH v4] sd: Check for unaligned partial completion
Bart, On 2/16/17 11:52, Damien Le Moal wrote: > Bart, > > On 2/16/17 10:10, Bart Van Assche wrote: >> On Thu, 2017-02-16 at 09:54 +0900, Damien Le Moal wrote: >>> On 2/16/17 01:42, Bart Van Assche wrote: An additional concern: what if the size of the Data-Out buffer is not a multiple of the logical block size? Shouldn't we round down (good_bytes - resid) instead of rounding up resid? >>> >>> The only REQ_TYPE_FS request that may not operate on LBA size aligned >>> payloads is REQ_OP_ZONE_REPORT which is handles in a different case of >>> the switch statement. So I think it is safe. >> >> Hello Damien, >> >> Are you aware that it is the software that submits a request that controls >> the buffer length and not the device that processes a request? Submitting >> Data-Out buffers with odd sizes is one of the tests in the libiscsi test >> suite. See e.g. the source file test-tool/test_write10_residuals.c in that >> test suite. The request submitted by the libiscsi test suite reach the >> kernel of the target system either through SG IO or through the iSCSI >> target driver. When using iSCSI, both the LIO and SCST SCSI target >> frameworks translate these requests into REQ_TYPE_FS requests. I think we >> should aim not to affect the outcome of libiscsi tests when the underlying >> device is a SCSI disk. > > Thanks for the pointers. I checked libiscsi tests. And from what is done > there, it seems to me that it is basically impossible to distinguished > between a buggy hardware response and an in-purpose (or buggy) not > aligned data-out buffer size. > E.g. the test in test_write10_residuals.c which issues a 1 block write > with a buf size of 1 and checks that residual is 1 - block size. > For that case, with both this patch and the original mpt3sas patch, the > rounding up of resid to the block size will break the test. > > Yet, it is very surprising that LIO and SCST_SCSI issue the "buggy" > request as is without checking and eventually correcting things first. > So unless LIO and SCST_SCSI are fixed to do that on the request they > send to the device, both patches should be dropped. > Users with buggy mpt3 HBAs will need to update the HBA FW... > > So the conclusion may be that we need to drop everything ? The mpt3sas > patch breaks ZBC now, so that must be removed too. > > Thoughts ? Or we rewrite to this: if (!(good_bytes & (sector_size - 1)) && resid & (sector_size - 1)) { resid = min(good_bytes, round_up(resid, sector_size)); good_bytes -= resid; scsi_set_resid(SCpnt, resid); } That is, do the resid correction only and only if the command buffer size is aligned. But still not sure that is safe in all possible cases. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Research Group, Western Digital Corporation damien.lem...@wdc.com (+81) 0466-98-3593 (ext. 513593) 1 kirihara-cho, Fujisawa, Kanagawa, 252-0888 Japan www.wdc.com, www.hgst.com
Re: [PATCH v4] sd: Check for unaligned partial completion
Bart, On 2/16/17 10:10, Bart Van Assche wrote: > On Thu, 2017-02-16 at 09:54 +0900, Damien Le Moal wrote: >> On 2/16/17 01:42, Bart Van Assche wrote: >>> An additional concern: what if the size of the Data-Out buffer is not a >>> multiple of the logical block size? Shouldn't we round down (good_bytes - >>> resid) instead of rounding up resid? >> >> The only REQ_TYPE_FS request that may not operate on LBA size aligned >> payloads is REQ_OP_ZONE_REPORT which is handles in a different case of >> the switch statement. So I think it is safe. > > Hello Damien, > > Are you aware that it is the software that submits a request that controls > the buffer length and not the device that processes a request? Submitting > Data-Out buffers with odd sizes is one of the tests in the libiscsi test > suite. See e.g. the source file test-tool/test_write10_residuals.c in that > test suite. The request submitted by the libiscsi test suite reach the > kernel of the target system either through SG IO or through the iSCSI > target driver. When using iSCSI, both the LIO and SCST SCSI target > frameworks translate these requests into REQ_TYPE_FS requests. I think we > should aim not to affect the outcome of libiscsi tests when the underlying > device is a SCSI disk. Thanks for the pointers. I checked libiscsi tests. And from what is done there, it seems to me that it is basically impossible to distinguished between a buggy hardware response and an in-purpose (or buggy) not aligned data-out buffer size. E.g. the test in test_write10_residuals.c which issues a 1 block write with a buf size of 1 and checks that residual is 1 - block size. For that case, with both this patch and the original mpt3sas patch, the rounding up of resid to the block size will break the test. Yet, it is very surprising that LIO and SCST_SCSI issue the "buggy" request as is without checking and eventually correcting things first. So unless LIO and SCST_SCSI are fixed to do that on the request they send to the device, both patches should be dropped. Users with buggy mpt3 HBAs will need to update the HBA FW... So the conclusion may be that we need to drop everything ? The mpt3sas patch breaks ZBC now, so that must be removed too. Thoughts ? Best regards. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Research Group, Western Digital Corporation damien.lem...@wdc.com (+81) 0466-98-3593 (ext. 513593) 1 kirihara-cho, Fujisawa, Kanagawa, 252-0888 Japan www.wdc.com, www.hgst.com
Re: [PATCH] scsi: megaraid_sas: handle dma_addr_t right on 32-bit
> "Arnd" == Arnd Bergmannwrites: Arnd> When building with a dma_addr_t that is different from pointer Arnd> size, we get this warning: Applied to 4.11/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [patch] scsi: megaraid_sas: array overflow in megasas_dump_frame()
> "Dan" == Dan Carpenterwrites: Dan> The "sz" variable is in terms of bytes, but we're treating the Dan> buffer as an array of __le32 so we have to divide by 4. Dan> Fixes: def0eab3af86 ("scsi: megaraid_sas: enhance debug logs in OCR Dan> context") Signed-off-by: Dan Carpenter Applied to 4.11/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 1/2] Don't blacklist nvme
On Wed, Feb 15 2017 at 9:01pm -0500, Mike Snitzerwrote: > On Tue, Feb 14 2017 at 6:00pm -0500, > Keith Busch wrote: > > > On Tue, Feb 14, 2017 at 01:35:45PM -0800, Bart Van Assche wrote: > > > On 02/14/2017 01:19 PM, Keith Busch wrote: > > > > These devices are mulitpath capable, and have been able to stack with > > > > dm-mpath since kernel 4.2. > > > > > > > > - str = > > > > STRDUP("^(ram|raw|loop|fd|md|dm-|sr|scd|st|dcssblk|nvme)[0-9]"); > > > > + str = > > > > STRDUP("^(ram|raw|loop|fd|md|dm-|sr|scd|st|dcssblk)[0-9]"); > > > > > > Have you checked whether dm-mpath works properly with nvme? Last time I > > > tried that dm-mpath crashed when it called scsi_dh_attach() because that > > > last function assumes that it is passed a SCSI device queue instead of > > > checking whether or not the queue passed to that function is a SCSI > > > device queue. > > > > Good point. I was unknowingly running with CONFIG_SCSI_DH disabled, > > and blissfully unaware of its existence! After enabling that option, > > I see what you mean. > > > > If we don't want to mess with the kernel, I can change the multipath-tools > > to get around that by appending the following to NVMe hwentry in the > > second patch in this series: > > > > .retain_hwhandler = RETAIN_HWHANDLER_OFF, > > > > And the problem goes away. > > That gives me a clue, I'll take a look. Looks like drivers/scsi/scsi_dh.c:get_sdev_from_queue() needs to first check if the request_queue is from a SCSI device (rather than assuming it is). Not quite sure how to check that though. By pushing the check down into scsi_dh it implies DM multipath will be calling into scsi_dh code for other transports (NVMe, etc). So an alternative would be to have DM multipath guard SCSI specific code with an appropriate check. Again, not sure how that check (e.g. some new is_scsi_request_queue) should be implemented. Mike
Re: [PATCH] snic: switch to pci_irq_alloc_vectors
> "Christoph" == Christoph Hellwigwrites: Applied to 4.11/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v4] sd: Check for unaligned partial completion
On Thu, 2017-02-16 at 09:54 +0900, Damien Le Moal wrote: > On 2/16/17 01:42, Bart Van Assche wrote: > > An additional concern: what if the size of the Data-Out buffer is not a > > multiple of the logical block size? Shouldn't we round down (good_bytes - > > resid) instead of rounding up resid? > > The only REQ_TYPE_FS request that may not operate on LBA size aligned > payloads is REQ_OP_ZONE_REPORT which is handles in a different case of > the switch statement. So I think it is safe. Hello Damien, Are you aware that it is the software that submits a request that controls the buffer length and not the device that processes a request? Submitting Data-Out buffers with odd sizes is one of the tests in the libiscsi test suite. See e.g. the source file test-tool/test_write10_residuals.c in that test suite. The request submitted by the libiscsi test suite reach the kernel of the target system either through SG IO or through the iSCSI target driver. When using iSCSI, both the LIO and SCST SCSI target frameworks translate these requests into REQ_TYPE_FS requests. I think we should aim not to affect the outcome of libiscsi tests when the underlying device is a SCSI disk. Thanks, Bart.
Re: [PATCH v4] sd: Check for unaligned partial completion
Bart, On 2/16/17 01:42, Bart Van Assche wrote: > On Wed, 2017-02-15 at 11:12 +0900, Damien Le Moal wrote: >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index 1f5d92a..d05a328 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -1790,6 +1790,8 @@ static int sd_done(struct scsi_cmnd *SCpnt) >> { >> int result = SCpnt->result; >> unsigned int good_bytes = result ? 0 : scsi_bufflen(SCpnt); >> +unsigned int sector_size = SCpnt->device->sector_size; >> +unsigned int resid; >> struct scsi_sense_hdr sshdr; >> struct scsi_disk *sdkp = scsi_disk(SCpnt->request->rq_disk); >> struct request *req = SCpnt->request; >> @@ -1820,6 +1822,24 @@ static int sd_done(struct scsi_cmnd *SCpnt) >> scsi_set_resid(SCpnt, blk_rq_bytes(req)); >> } >> break; >> +default: >> +/* >> + * In case of bogus fw or device, we could end up having >> + * an unaligned partial completion. Check this here and force >> + * alignment. >> + */ >> +resid = scsi_get_resid(SCpnt); >> +if (resid & (sector_size - 1)) { >> +SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt, >> +"Unaligned partial completion (resid=%u, >> sector_sz=%u)\n", >> +resid, sector_size)); >> +resid = round_up(resid, sector_size); >> +if (resid < good_bytes) >> +good_bytes -= resid; >> +else >> +good_bytes = 0; >> +scsi_set_resid(SCpnt, resid); >> +} >> } >> >> if (result) { > > An additional concern: what if the size of the Data-Out buffer is not a > multiple of the logical block size? Shouldn't we round down (good_bytes - > resid) instead of rounding up resid? The only REQ_TYPE_FS request that may not operate on LBA size aligned payloads is REQ_OP_ZONE_REPORT which is handles in a different case of the switch statement. So I think it is safe. Best regards. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Research Group, Western Digital Corporation damien.lem...@wdc.com (+81) 0466-98-3593 (ext. 513593) 1 kirihara-cho, Fujisawa, Kanagawa, 252-0888 Japan www.wdc.com, www.hgst.com
Re: [PATCH v4] sd: Check for unaligned partial completion
Bart, On 2/16/17 00:10, Bart Van Assche wrote: > On Wed, 2017-02-15 at 11:12 +0900, Damien Le Moal wrote: >> +resid = round_up(resid, sector_size); >> +if (resid < good_bytes) >> +good_bytes -= resid; >> +else >> +good_bytes = 0; >> +scsi_set_resid(SCpnt, resid); > > Hello Damien, > > If the Data-Out buffer is smaller than a single logical block, can the above > code cause resid to exceed the Data-Out buffer size? I think we should avoid > to convert a residual overflow into a residual underflow. Additionally, will > round_up() work correctly if resid is negative (residual underflow)? How > about using the following (untested) code instead of the above? > > if (resid > 0) { > resid = min(good_bytes, round_up(resid, sector_size)); > good_bytes -= resid; > scsi_set_resid(SCpnt, resid); > } Indeed, that would be more solid and cover all possible weird values of resid. I will resend a v5. Thanks. -- Damien Le Moal, Ph.D. Sr. Manager, System Software Research Group, Western Digital Corporation damien.lem...@wdc.com (+81) 0466-98-3593 (ext. 513593) 1 kirihara-cho, Fujisawa, Kanagawa, 252-0888 Japan www.wdc.com, www.hgst.com
[PATCH 3.16 226/306] scsi: mpt3sas: Fix secure erase premature termination
3.16.40-rc1 review patch. If anyone has any objections, please let me know. -- From: Andrey Grodzovskycommit 18f6084a989ba1b38702f9af37a2e4049a924be6 upstream. This is a work around for a bug with LSI Fusion MPT SAS2 when perfoming secure erase. Due to the very long time the operation takes, commands issued during the erase will time out and will trigger execution of the abort hook. Even though the abort hook is called for the specific command which timed out, this leads to entire device halt (scsi_state terminated) and premature termination of the secure erase. Set device state to busy while ATA passthrough commands are in progress. [mkp: hand applied to 4.9/scsi-fixes, tweaked patch description] Signed-off-by: Andrey Grodzovsky Acked-by: Sreekanth Reddy Cc: Cc: Sathya Prakash Cc: Chaitra P B Cc: Suganath Prabu Subramani Cc: Sreekanth Reddy Cc: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Ben Hutchings --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -1625,7 +1625,10 @@ _scsih_get_volume_capabilities(struct MP return 0; } - +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd) +{ + return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16); +} /** * _scsih_enable_tlr - setting TLR flags @@ -3541,6 +3544,13 @@ _scsih_qcmd(struct Scsi_Host *shost, str scsi_print_command(scmd); #endif + /* +* Lock the device for any subsequent command until command is +* done. +*/ + if (ata_12_16_cmd(scmd)) + scsi_internal_device_block(scmd->device); + sas_device_priv_data = scmd->device->hostdata; if (!sas_device_priv_data || !sas_device_priv_data->sas_target) { scmd->result = DID_NO_CONNECT << 16; @@ -4041,6 +4051,9 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *i if (scmd == NULL) return 1; + if (ata_12_16_cmd(scmd)) + scsi_internal_device_unblock(scmd->device, SDEV_RUNNING); + mpi_request = mpt3sas_base_get_msg_frame(ioc, smid); if (mpi_reply == NULL) {
[PATCH 3/3] qla2xxx: Fix Regression introduced by pci_alloc_irq_vectors_affinity call.
From: Michael HernandezFor target mode, we need to increase minimum vectors value by one to account for ATIO queue. Following stack trace will be seen Call Trace: qla24xx_config_rings+0x15a/0x230 [qla2xxx] qla2x00_init_rings+0x1a1/0x3a0 [qla2xxx] qla2x00_restart_isp+0x5c/0x120 [qla2xxx] qla2x00_abort_isp+0x138/0x430 [qla2xxx] ? __schedule+0x260/0x580 qla2x00_do_dpc+0x3bc/0x920 [qla2xxx] ? qla2x00_relogin+0x290/0x290 [qla2xxx] ? schedule+0x3a/0xa0 ? qla2x00_relogin+0x290/0x290 [qla2xxx] kthread+0x103/0x140 ? __kthread_init_worker+0x40/0x40 ret_from_fork+0x29/0x40 RIP: qlt_24xx_config_rings+0x6c/0x90 Fixes: 17e5fc5 ("scsi: qla2xxx: fix MSI-X vector affinity") Signed-off-by: Michael Hernandez Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_isr.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 6e0a0f1..6b324ac 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3013,14 +3013,17 @@ struct qla_init_msix_entry { int i, ret; struct qla_msix_entry *qentry; scsi_qla_host_t *vha = pci_get_drvdata(ha->pdev); + int min_vecs = QLA_BASE_VECTORS; struct irq_affinity desc = { .pre_vectors = QLA_BASE_VECTORS, }; - if (QLA_TGT_MODE_ENABLED() && IS_ATIO_MSIX_CAPABLE(ha)) + if (QLA_TGT_MODE_ENABLED() && IS_ATIO_MSIX_CAPABLE(ha)) { desc.pre_vectors++; + min_vecs++; + } - ret = pci_alloc_irq_vectors_affinity(ha->pdev, QLA_BASE_VECTORS, + ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs, ha->msix_count, PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, ); -- 1.8.3.1
[PATCH 2/3] qla2xxx: Fix response queue count for Target mode.
From: Michael HernandezTarget mode initialization was not calculating response queue values correctly resulting into one less MSI-X vector. Fixes: 093df73 ("scsi: qla2xxx: Fix Target mode handling with Multiqueue changes.") Signed-off-by: Michael Hernandez Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_os.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 8174cee..71b6b20 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1916,12 +1916,13 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) if (ql2xmqsupport) { /* MB interrupt uses 1 vector */ ha->max_req_queues = ha->msix_count - 1; - ha->max_rsp_queues = ha->max_req_queues; /* ATIOQ needs 1 vector. That's 1 less QPair */ if (QLA_TGT_MODE_ENABLED()) ha->max_req_queues--; + ha->max_rsp_queues = ha->max_req_queues; + /* Queue pairs is the max value minus * the base queue pair */ ha->max_qpairs = ha->max_req_queues - 1; -- 1.8.3.1
[PATCH 1/3] qla2xxx: Cleaned up queue configuration code.
From: Michael HernandezThis patch cleaned up queue configuration code, such that once initialized, we should not touch msix_count value. This will prevent incorrect numbers of MSI-X vectors requested while performing target mode configuration. Fixes: d745952 ("scsi: qla2xxx: Add multiple queue pair functionality.") Signed-off-by: Michael Hernandez Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_os.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index d01c90c..8174cee 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1815,6 +1815,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) /* Determine queue resources */ ha->max_req_queues = ha->max_rsp_queues = 1; + ha->msix_count = QLA_BASE_VECTORS; if (!ql2xmqsupport || (!IS_QLA25XX(ha) && !IS_QLA81XX(ha))) goto mqiobase_exit; @@ -1842,9 +1843,8 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) "BAR 3 not enabled.\n"); mqiobase_exit: - ha->msix_count = ha->max_rsp_queues + 1; ql_dbg_pci(ql_dbg_init, ha->pdev, 0x001c, - "MSIX Count:%d.\n", ha->msix_count); + "MSIX Count: %d.\n", ha->msix_count); return (0); iospace_error_exit: @@ -1892,6 +1892,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) /* 83XX 26XX always use MQ type access for queues * - mbar 2, a.k.a region 4 */ ha->max_req_queues = ha->max_rsp_queues = 1; + ha->msix_count = QLA_BASE_VECTORS; ha->mqiobase = ioremap(pci_resource_start(ha->pdev, 4), pci_resource_len(ha->pdev, 4)); @@ -1934,14 +1935,8 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) "BAR 1 not enabled.\n"); mqiobase_exit: - ha->msix_count = ha->max_rsp_queues + 1; - if (QLA_TGT_MODE_ENABLED()) - ha->msix_count++; - - qlt_83xx_iospace_config(ha); - ql_dbg_pci(ql_dbg_init, ha->pdev, 0x011f, - "MSIX Count:%d.\n", ha->msix_count); + "MSIX Count: %d.\n", ha->msix_count); return 0; iospace_error_exit: -- 1.8.3.1
[PATCH 0/3] qla2xxx: Bug fixes and cleanup for the driver.
Hi Martin, This series contains small cleanup + fix for regression that was introduced by pci_alloc_irq_vectors_affinity() call in driver. Please apply this series to 4.10/scsi-fixes at your earliest convenience. Thanks, Himanshu Michael Hernandez (3): qla2xxx: Cleaned up queue configuration code. qla2xxx: Fix response queue count for Target mode. qla2xxx: Fix Regression introduced by pci_alloc_irq_vectors_affinity call. drivers/scsi/qla2xxx/qla_isr.c | 7 +-- drivers/scsi/qla2xxx/qla_os.c | 16 ++-- 2 files changed, 11 insertions(+), 12 deletions(-) -- 1.8.3.1
RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
> -Original Message- > From: Johannes Thumshirn [mailto:jthumsh...@suse.de] > Sent: Wednesday, February 15, 2017 12:44 AM > To: Raghava Aditya Renukunta >; j...@linux.vnet.ibm.com; > martin.peter...@oracle.com; linux-scsi@vger.kernel.org > Cc: Dave Carroll ; Gana Sridaran > ; Scott Benesh > ; dan.carpen...@oracle.com > Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw > assert > > EXTERNAL EMAIL > > > On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > > When the command thread performs a periodic time sync and the > firmware is > > going through an assert during that time, the command thread waits for > the > > response that would never arrive. The SCSI Mid layer's error handler would > > eventually reset the controller, but the eh_handler just issues a > > "thread stop" to the command thread which is stuck on a semaphore and > the > > eh_thread would in turn goes to sleep waiting for the command_thread to > > respond to the stop which never happens. > > > > Fixed by allowing SIGTERM for the command thread, and during the > eh_reset > > call, sends termination signal to the command thread. As a follow-up, the > > eh_reset handler would take care of the controller reset. > > > > Signed-off-by: Raghava Aditya Renukunta > > > Reviewed-by: David Carroll > > --- > > This look a bit scary. Can't the kthread be converted to a workqueue so > we could call cancel_work_sync()? Could you please elaborate on the reasons why this fix is scary? I understand that killing a thread is not standard (for any reason), and if there are other nuanced issues I would like to understand them. Thank you, Raghava Aditya > > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH v3 00/16] lpfc: Add NVME Fabrics support
On 02/15/2017 08:04 AM, James Smart wrote: > On 2/15/2017 2:12 AM, Sagi Grimberg wrote: >> Also, can you send your patchset threaded? It usually does so >> when generating the patches with git format-patch, not sure >> how this is not the case with your set... It would make my >> life a bit easier. > > Well, I am using git format-patch to create them. But have to revert to > a manual mailing request as I haven't been able to get git-send-email > working past our firewalls. Sorry. Hello James, In the headers of your patch e-mails I found the following: "User-Agent: Heirloom mailx 12.5 6/20/10". As far as I know the transports supported by that e-mail client are IMAP, POP3 and SMTP. I think git supports both SMTP and IMAP for sending e-mails. Thanks, Bart.
RE: [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family)
> -Original Message- > From: Johannes Thumshirn [mailto:jthumsh...@suse.de] > Sent: Wednesday, February 15, 2017 12:49 AM > To: Raghava Aditya Renukunta >; j...@linux.vnet.ibm.com; > martin.peter...@oracle.com; linux-scsi@vger.kernel.org > Cc: Dave Carroll ; Gana Sridaran > ; Scott Benesh > ; dan.carpen...@oracle.com > Subject: Re: [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART > Family) > > EXTERNAL EMAIL > > > On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > > When the SMART family of controller panic (KERNEL_PANIC) , they do not > ^ controllers?^ extra space > > honor IOP resets. So better to skip it and directly perform a IWBR reset. > > > > Signed-off-by: Raghava Aditya Renukunta > > > Reviewed-by: David Carroll > > --- > > drivers/scsi/aacraid/src.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c > > index b23c818..5bb9865 100644 > > --- a/drivers/scsi/aacraid/src.c > > +++ b/drivers/scsi/aacraid/src.c > > @@ -714,6 +714,12 @@ static int aac_src_restart_adapter(struct aac_dev > *dev, int bled, u8 reset_type) > > pr_err("%s%d: adapter kernel panic'd %x.\n", > > dev->name, dev->id, bled); > > > > + /* > > + * WHen there is a BlinkLED, IOP_RESET has not effect >^ When > > + */ > > + if (bled >= 2 && dev->sa_firmware && (reset_type & HW_IOP_RESET)) > ^ No need for the >parenthesis > > + reset_type &= ~HW_IOP_RESET; > > + > > dev->a_ops.adapter_enable_int = aac_src_disable_interrupt; > > > > switch (reset_type) { > > > > Apart from that, > Reviewed-by: Johannes Thumshirn Yes I will fix this in the next patch set. Regards, Raghava Aditya > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
RE: [PATCH 06/16] aacraid: Added sysfs for driver version
> -Original Message- > From: Johannes Thumshirn [mailto:jthumsh...@suse.de] > Sent: Wednesday, February 15, 2017 12:33 AM > To: Raghava Aditya Renukunta >; j...@linux.vnet.ibm.com; > martin.peter...@oracle.com; linux-scsi@vger.kernel.org > Cc: Dave Carroll ; Gana Sridaran > ; Scott Benesh > ; dan.carpen...@oracle.com > Subject: Re: [PATCH 06/16] aacraid: Added sysfs for driver version > > EXTERNAL EMAIL > > > On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > > Added support to retrieve driver version from a new sysfs variable called > > driver_version. It makes it easier for the user to figure out the driver > > version that is currently running. > > > > Signed-off-by: Raghava Aditya Renukunta > > > Reviewed-by: David Carroll > > --- > > Can't this be retrieved via modinfo? I agree , but it makes it easier to get the driver version when I am developing and I don't know which driver version is currently loaded In addition internally our test automation suites use this information as opposed to modinfo to get the driver version running. Regards, Raghava Aditya > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH] scsi, block: fix memory leak of sdpk on when gd fails to allocate
On Fri, 2017-02-03 at 19:38 +, Colin King wrote: > From: Colin Ian King> > On an allocation failure of gd, the current exit path is via out_free_devt > which leaves sdpk still allocated and hence it gets leaked. Fix this by > correcting the order of resource free'ing with a change in the error exit > path labels. > > Detected by CoverityScan, CID#1399519 ("Resource Leak") > > Fixes: 0dba1314d4f81115dc ("scsi, block: fix duplicate bdi name registration > crashes") > Signed-off-by: Colin Ian King > --- > drivers/scsi/sd.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index cb6e68d..99e1206 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -3213,10 +3213,10 @@ static int sd_probe(struct device *dev) > sd_devt = NULL; > out_put: > put_disk(gd); > - out_free: > - kfree(sdkp); > out_free_devt: > kfree(sd_devt); > + out_free: > + kfree(sdkp); > out: > scsi_autopm_put_device(sdp); > return error; Hello Colin, This patch looks fine to me. But since it is a fix for a patch that exists in Jens' tree and that is not yet upstream, please resubmit it to Jens. Bart.
RE: [PATCH 05/16] aacraid: Fix memory leak in fib init path
> -Original Message- > From: Johannes Thumshirn [mailto:jthumsh...@suse.de] > Sent: Wednesday, February 15, 2017 12:32 AM > To: Raghava Aditya Renukunta >; j...@linux.vnet.ibm.com; > martin.peter...@oracle.com; linux-scsi@vger.kernel.org > Cc: Dave Carroll ; Gana Sridaran > ; Scott Benesh > ; dan.carpen...@oracle.com > Subject: Re: [PATCH 05/16] aacraid: Fix memory leak in fib init path > > EXTERNAL EMAIL > > > On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > > aac_fib_map_free frees misaligned fib dma memory, additionally it does > not > > free up the whole memory. > > > > Fixed by changing the code to free up the correct and full memory > > allocation. > > > > Cc: sta...@vger.kernel.org > > Fixes: e8b12f0fb835223 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC > based controller family) > > Signed-off-by: Raghava Aditya Renukunta > > > Reviewed-by: David Carroll > > --- > > drivers/scsi/aacraid/commsup.c | 20 +--- > > 1 file changed, 9 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/scsi/aacraid/commsup.c > b/drivers/scsi/aacraid/commsup.c > > index f7a3bcb..863c98d 100644 > > --- a/drivers/scsi/aacraid/commsup.c > > +++ b/drivers/scsi/aacraid/commsup.c > > @@ -97,8 +97,8 @@ void aac_fib_map_free(struct aac_dev *dev) > > { > > if (dev->hw_fib_va && dev->max_cmd_size) { > > pci_free_consistent(dev->pdev, > > - (dev->max_cmd_size * > > - (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB)), > > + (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) > > + * (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB) + 31, > > dev->hw_fib_va, dev->hw_fib_pa); > > Can you please do something like: > > size_t alloc_size; > int numtags; > > numtags = dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB; > alloc_size = (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) * > numtags + 31; > pci_free_consistent(dev->pdev, alloc_size, dev->hw_fib_va, > dev->hw_fib_pa); > > And please indent correctly. If it indentation doesn't work correctly > because you hit the 80 chars limit, that's a sign something should be > reconsidered. Yes, I will rework this. Regards, Raghava Aditya > Thanks, > Johannes > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
RE: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
> -Original Message- > From: Johannes Thumshirn [mailto:jthumsh...@suse.de] > Sent: Wednesday, February 15, 2017 12:20 AM > To: Raghava Aditya Renukunta >; j...@linux.vnet.ibm.com; > martin.peter...@oracle.com; linux-scsi@vger.kernel.org > Cc: Dave Carroll ; Gana Sridaran > ; Scott Benesh > ; dan.carpen...@oracle.com > Subject: Re: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units > > EXTERNAL EMAIL > > > On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > > Arrconf management utility at times sends fibs with AdapterProcessed set > > in its fibs. This causes the controller to panic and lockup. > > > > Fixed by failing the commands that have AdapterProcessed set in its flag. > > > > Signed-off-by: Raghava Aditya Renukunta > > > Reviewed-by: David Carroll > > --- > > drivers/scsi/aacraid/commsup.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/scsi/aacraid/commsup.c > b/drivers/scsi/aacraid/commsup.c > > index 6220b47..f7a3bcb 100644 > > --- a/drivers/scsi/aacraid/commsup.c > > +++ b/drivers/scsi/aacraid/commsup.c > > @@ -522,6 +522,10 @@ int aac_fib_send(u16 command, struct fib *fibptr, > unsigned long size, > > > > if (!(hw_fib->header.XferState & cpu_to_le32(HostOwned))) > > return -EBUSY; > > + > > + if (hw_fib->header.XferState & cpu_to_le32(AdapterProcessed)) > > + return -EINVAL; > > + > > As far as I can see the fib_xfer_state enum isn't an exported as an > official ABI, so it's a good candidate (whole of aacraid.h actually) for > the next round of camel case removals. > > Anyways, this can wait: > Reviewed-by: Johannes Thumshirn I will put in the queue for the next set of patch submissions and chip away at it bit by bit. Thank you, Raghava Aditya Renukunta > > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
RE: [PATCH 03/16] aacraid: Fix for excessive prints on EEH
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > > This issue showed up on a kdump debug(single CPU on powerkvm), when > EEH > > errors rendered the adapter unusable. The driver correctly detected the > > issue and attempted to restart the controller, in doing so the driver > > attempted to read the status registers of the controller. This triggered > > additional eeh errors which continued for a good 6 minutes. > > > > Fixed by returning without waiting when EEH error is reported. > > > > Signed-off-by: Raghava Aditya Renukunta >> > Reviewed-by: David Carroll > > --- > > drivers/scsi/aacraid/commsup.c | 38 > +- > > 1 file changed, 37 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/scsi/aacraid/commsup.c > b/drivers/scsi/aacraid/commsup.c > > index 56090f5..6220b47 100644 > > --- a/drivers/scsi/aacraid/commsup.c > > +++ b/drivers/scsi/aacraid/commsup.c > > @@ -461,6 +461,30 @@ int aac_queue_get(struct aac_dev * dev, u32 * > index, u32 qid, struct hw_fib * hw > > return 0; > > } > > Please do > > > +#ifdef CONFIG_EEH > > +static inline int aac_check_eeh_failure(struct aac_dev *dev) > > +{ > > + /* Check for an EEH failure for the given > > + * device node. Function eeh_dev_check_failure() > > + * returns 0 if there has not been an EEH error > > + * otherwise returns a non-zero value. > > + * > > + * Need to be called before any PCI operation, > > + * i.e.,before aac_adapter_check_health() > > + */ > > + struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev); > > + > > + if (eeh_dev_check_failure(edev)) { > > + /* The EEH mechanisms will handle this > > + * error and reset the device if > > + * necessary. > > + */ > > + return 1; > > + } > > + return 0; > > +} > > #else > static inline int aac_check_eeh_failure(struct aac_dev *dev) > { > return 0; > } > > > +#endif > > + > > [...] > > > + > > +#if defined(CONFIG_EEH) > > + if (aac_check_eeh_failure(dev)) > > + return -EFAULT; > > +#endif > > + > > So the #if defined() blocks become unnecessary. Yes I can do that. Regards, Raghava Aditya > Thanks, > Johannes > -- > Johannes Thumshirn Storage > jthumsh...@suse.de+49 911 74053 689 > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg > GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH v4] sd: Check for unaligned partial completion
On Wed, 2017-02-15 at 11:12 +0900, Damien Le Moal wrote: > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 1f5d92a..d05a328 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -1790,6 +1790,8 @@ static int sd_done(struct scsi_cmnd *SCpnt) > { > int result = SCpnt->result; > unsigned int good_bytes = result ? 0 : scsi_bufflen(SCpnt); > + unsigned int sector_size = SCpnt->device->sector_size; > + unsigned int resid; > struct scsi_sense_hdr sshdr; > struct scsi_disk *sdkp = scsi_disk(SCpnt->request->rq_disk); > struct request *req = SCpnt->request; > @@ -1820,6 +1822,24 @@ static int sd_done(struct scsi_cmnd *SCpnt) > scsi_set_resid(SCpnt, blk_rq_bytes(req)); > } > break; > + default: > + /* > + * In case of bogus fw or device, we could end up having > + * an unaligned partial completion. Check this here and force > + * alignment. > + */ > + resid = scsi_get_resid(SCpnt); > + if (resid & (sector_size - 1)) { > + SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt, > + "Unaligned partial completion (resid=%u, > sector_sz=%u)\n", > + resid, sector_size)); > + resid = round_up(resid, sector_size); > + if (resid < good_bytes) > + good_bytes -= resid; > + else > + good_bytes = 0; > + scsi_set_resid(SCpnt, resid); > + } > } > > if (result) { An additional concern: what if the size of the Data-Out buffer is not a multiple of the logical block size? Shouldn't we round down (good_bytes - resid) instead of rounding up resid? Thanks, Bart.
Re: [PATCH v3 00/16] lpfc: Add NVME Fabrics support
On 2/15/2017 2:12 AM, Sagi Grimberg wrote: The patches were cut against 1/30 scsi.git tree, misc branch. ** THEY ARE INTENDED FOR THE SCSI.GIT TREE, MISC BRANCH ** The lpfc version in the linux-block.git tree is rather old. I have a recipe for how to get it to a version that syncs with the scsi.git/misc tree so that these patches can apply there as well. Contact me if you would like it. This set does not seem to apply cleanly on nvme-4.11 tree, looks like patch 6 is failing. As I stated - the patches were cut vs the scsi.git tree - where the most recent lpfc is. So I expected merge via the scsi.git tree. Given how old the scsi/lpfc part was on linux-block as of a little while ago, it doesn't surprise me they aren't clean. I'll cut a version vs git.infradead.orig/nvme.git nvme-4.11 and send them privately. Also, can you send your patchset threaded? It usually does so when generating the patches with git format-patch, not sure how this is not the case with your set... It would make my life a bit easier. Well, I am using git format-patch to create them. But have to revert to a manual mailing request as I haven't been able to get git-send-email working past our firewalls. Sorry. -- james
Re: sense handling improvements
On Wed, 2017-02-15 at 16:04 +0100, Christoph Hellwig wrote: > On Wed, Feb 15, 2017 at 09:19:18AM +0100, Hannes Reinecke wrote: > > On 02/14/2017 08:15 PM, Christoph Hellwig wrote: > > > Hi all, > > > > > > this series is on top of the scsi_request changes in Jens' tree and > > > further improves the handling of the sense buffer. > > > > > > > Sorry, but I'm feeling really daft: which scsi_request changes? > > That is the "split scsi passthrough fields out of struct request" > series. > > > To be found in which tree? > > Jens' for-next tree, as mentioned above. Hello Christoph, Are you aware that "split scsi passthrough fields out of struct request" series introduces a new bug, a bug that I have already reported but that has not yet been addressed? See also https://www.spinics.net/lists/raid/msg55494.html. Bart.
Re: [PATCH v4] sd: Check for unaligned partial completion
On Wed, 2017-02-15 at 11:12 +0900, Damien Le Moal wrote: > + resid = round_up(resid, sector_size); > + if (resid < good_bytes) > + good_bytes -= resid; > + else > + good_bytes = 0; > + scsi_set_resid(SCpnt, resid); Hello Damien, If the Data-Out buffer is smaller than a single logical block, can the above code cause resid to exceed the Data-Out buffer size? I think we should avoid to convert a residual overflow into a residual underflow. Additionally, will round_up() work correctly if resid is negative (residual underflow)? How about using the following (untested) code instead of the above? if (resid > 0) { resid = min(good_bytes, round_up(resid, sector_size)); good_bytes -= resid; scsi_set_resid(SCpnt, resid); } Thanks, Bart.
Re: sense handling improvements
On Wed, Feb 15, 2017 at 09:19:18AM +0100, Hannes Reinecke wrote: > On 02/14/2017 08:15 PM, Christoph Hellwig wrote: > > Hi all, > > > > this series is on top of the scsi_request changes in Jens' tree and > > further improves the handling of the sense buffer. > > > Sorry, but I'm feeling really daft: which scsi_request changes? That is the "split scsi passthrough fields out of struct request" series. > To be found in which tree? Jens' for-next tree, as mentioned above. > Have we audited all drivers to _not_ do DMA into the sense buffer? > By first glance some still do, so they'll break horribly when moving the > sense buffer onto the stack ... With the above series the sense buffer is allocate by the driver, and they will always DMA into that.
[PATCH V5 net-next 1/2] qed: Add support for hardware offloaded FCoE.
From: Arun EasiThis adds the backbone required for the various HW initalizations which are necessary for the FCoE driver (qedf) for QLogic FastLinQ 4 line of adapters - FW notification, resource initializations, etc. Signed-off-by: Arun Easi Signed-off-by: Yuval Mintz --- drivers/net/ethernet/qlogic/Kconfig |3 + drivers/net/ethernet/qlogic/qed/Makefile |1 + drivers/net/ethernet/qlogic/qed/qed.h | 11 + drivers/net/ethernet/qlogic/qed/qed_cxt.c | 98 +- drivers/net/ethernet/qlogic/qed/qed_cxt.h |3 + drivers/net/ethernet/qlogic/qed/qed_dcbx.c| 13 +- drivers/net/ethernet/qlogic/qed/qed_dcbx.h|5 +- drivers/net/ethernet/qlogic/qed/qed_dev.c | 205 - drivers/net/ethernet/qlogic/qed/qed_dev_api.h | 42 + drivers/net/ethernet/qlogic/qed/qed_fcoe.c| 1014 + drivers/net/ethernet/qlogic/qed/qed_fcoe.h| 87 ++ drivers/net/ethernet/qlogic/qed/qed_hsi.h | 781 +++- drivers/net/ethernet/qlogic/qed/qed_hw.c |3 + drivers/net/ethernet/qlogic/qed/qed_ll2.c | 25 + drivers/net/ethernet/qlogic/qed/qed_ll2.h |2 +- drivers/net/ethernet/qlogic/qed/qed_main.c|7 + drivers/net/ethernet/qlogic/qed/qed_mcp.c |3 + drivers/net/ethernet/qlogic/qed/qed_mcp.h |1 + drivers/net/ethernet/qlogic/qed/qed_reg_addr.h|8 + drivers/net/ethernet/qlogic/qed/qed_sp.h |4 + drivers/net/ethernet/qlogic/qed/qed_sp_commands.c |3 + include/linux/qed/common_hsi.h| 10 +- include/linux/qed/fcoe_common.h | 715 +++ include/linux/qed/qed_fcoe_if.h | 145 +++ include/linux/qed/qed_if.h| 41 +- 25 files changed, 3211 insertions(+), 19 deletions(-) create mode 100644 drivers/net/ethernet/qlogic/qed/qed_fcoe.c create mode 100644 drivers/net/ethernet/qlogic/qed/qed_fcoe.h create mode 100644 include/linux/qed/fcoe_common.h create mode 100644 include/linux/qed/qed_fcoe_if.h diff --git a/drivers/net/ethernet/qlogic/Kconfig b/drivers/net/ethernet/qlogic/Kconfig index 3cfd105..737b303 100644 --- a/drivers/net/ethernet/qlogic/Kconfig +++ b/drivers/net/ethernet/qlogic/Kconfig @@ -113,4 +113,7 @@ config QED_RDMA config QED_ISCSI bool +config QED_FCOE + bool + endif # NET_VENDOR_QLOGIC diff --git a/drivers/net/ethernet/qlogic/qed/Makefile b/drivers/net/ethernet/qlogic/qed/Makefile index 729e437..e234083 100644 --- a/drivers/net/ethernet/qlogic/qed/Makefile +++ b/drivers/net/ethernet/qlogic/qed/Makefile @@ -7,3 +7,4 @@ qed-$(CONFIG_QED_SRIOV) += qed_sriov.o qed_vf.o qed-$(CONFIG_QED_LL2) += qed_ll2.o qed-$(CONFIG_QED_RDMA) += qed_roce.o qed-$(CONFIG_QED_ISCSI) += qed_iscsi.o qed_ooo.o +qed-$(CONFIG_QED_FCOE) += qed_fcoe.o diff --git a/drivers/net/ethernet/qlogic/qed/qed.h b/drivers/net/ethernet/qlogic/qed/qed.h index 1f61cf3..0e218d0 100644 --- a/drivers/net/ethernet/qlogic/qed/qed.h +++ b/drivers/net/ethernet/qlogic/qed/qed.h @@ -60,6 +60,7 @@ #define QED_WFQ_UNIT 100 #define ISCSI_BDQ_ID(_port_id) (_port_id) +#define FCOE_BDQ_ID(_port_id) ((_port_id) + 2) #define QED_WID_SIZE(1024) #define QED_PF_DEMS_SIZE(4) @@ -167,6 +168,7 @@ struct qed_tunn_update_params { */ enum qed_pci_personality { QED_PCI_ETH, + QED_PCI_FCOE, QED_PCI_ISCSI, QED_PCI_ETH_ROCE, QED_PCI_DEFAULT /* default in shmem */ @@ -204,6 +206,7 @@ enum QED_FEATURE { QED_VF, QED_RDMA_CNQ, QED_VF_L2_QUE, + QED_FCOE_CQ, QED_MAX_FEATURES, }; @@ -221,6 +224,7 @@ enum QED_PORT_MODE { enum qed_dev_cap { QED_DEV_CAP_ETH, + QED_DEV_CAP_FCOE, QED_DEV_CAP_ISCSI, QED_DEV_CAP_ROCE, }; @@ -255,6 +259,10 @@ struct qed_hw_info { u32 part_num[4]; unsigned char hw_mac_addr[ETH_ALEN]; + u64 node_wwn; + u64 port_wwn; + + u16 num_fcoe_conns; struct qed_igu_info *p_igu_info; @@ -410,6 +418,7 @@ struct qed_hwfn { struct qed_ooo_info *p_ooo_info; struct qed_rdma_info*p_rdma_info; struct qed_iscsi_info *p_iscsi_info; + struct qed_fcoe_info*p_fcoe_info; struct qed_pf_paramspf_params; bool b_rdma_enabled_in_prs; @@ -618,11 +627,13 @@ struct qed_dev { u8 protocol; #define IS_QED_ETH_IF(cdev) ((cdev)->protocol == QED_PROTOCOL_ETH) +#define IS_QED_FCOE_IF(cdev)((cdev)->protocol == QED_PROTOCOL_FCOE) /* Callbacks to protocol driver */ union { struct
[PATCH V5 0/2] Add QLogic FastLinQ FCoE (qedf) driver
From: "Dupuis, Chad"Dave, please apply the qed patch to net-next at your earliest convenience. Martin, the qed patch needs to be applied first as the qedf patch is dependent on the FCoE bits in the first qed driver patch. This series introduces the hardware offload FCoE initiator driver for the 41000 Series Converged Network Adapters (579xx chip) by Cavium. The overall driver design includes a common module ('qed') and protocol specific dependent modules ('qedf' for FCoE). This driver uses the kernel components of libfc and libfcoe as is and does not make use of the open-fcoe user space components. Therefore, no changes will need to be made to any open-fcoe components. The 'qed' common module, under drivers/net/ethernet/qlogic/qed/, is enhanced with functionality required for FCoE support. Changes from V4 -> V5 - Fix code alignment, function and variable formatting based on review comments in qed patch Changes from V3 -> V4 - Minor update to banner text in qed_fcoe.c|h files - Fix kbuild robot error on 32-bit systems in qedf - Remove unneeded double memcpy for offloaded ELS commands in qedf Changes from V2 -> V3 - Fix uninitialized variables reported by kbuild robot in qedf - Remove superfluous comments from qedf.h - Introduce new qedf_ctx flag to different stopping I/O for debug purposes. - Don't take lport->disc.disc_mutex when restarting an rport. - Remove extra whitespace in qedf_hsi.h Changes from V1 -> V2 Changes in qed: - Fix compiler warning when CONFIG_DCB is not set. Fixes in qedf: - Add qedf to scsi directory Makefile. - Updates to convert LightL2 and I/O processing kthreads to workqueues. Changes from RFC -> V1 - Squash qedf patches to one patch now that the initial review has taken place - Convert qedf to use hotplug state machine - Return via va_end to match corresponding va_start in logging functions - Convert qedf_ctx offloaded port list to a RCU list so searches do not need to make use of spinlocks. Also eliminates the need to fcport conn_id's. - Use IS_ERR(fp) in qedf_flogi_resp() instead of checking individual FC_EX_* errors. - Remove scsi_block_target when executing TMF request. - Checkpatch fixes in the qed and qedf patches Arun Easi (1): qed: Add support for hardware offloaded FCoE. Dupuis, Chad (1): qedf: Add QLogic FastLinQ offload FCoE driver framework. MAINTAINERS |6 + drivers/net/ethernet/qlogic/Kconfig |3 + drivers/net/ethernet/qlogic/qed/Makefile |1 + drivers/net/ethernet/qlogic/qed/qed.h | 11 + drivers/net/ethernet/qlogic/qed/qed_cxt.c | 98 +- drivers/net/ethernet/qlogic/qed/qed_cxt.h |3 + drivers/net/ethernet/qlogic/qed/qed_dcbx.c| 13 +- drivers/net/ethernet/qlogic/qed/qed_dcbx.h|5 +- drivers/net/ethernet/qlogic/qed/qed_dev.c | 205 +- drivers/net/ethernet/qlogic/qed/qed_dev_api.h | 42 + drivers/net/ethernet/qlogic/qed/qed_fcoe.c| 1014 +++ drivers/net/ethernet/qlogic/qed/qed_fcoe.h| 87 + drivers/net/ethernet/qlogic/qed/qed_hsi.h | 781 - drivers/net/ethernet/qlogic/qed/qed_hw.c |3 + drivers/net/ethernet/qlogic/qed/qed_ll2.c | 25 + drivers/net/ethernet/qlogic/qed/qed_ll2.h |2 +- drivers/net/ethernet/qlogic/qed/qed_main.c|7 + drivers/net/ethernet/qlogic/qed/qed_mcp.c |3 + drivers/net/ethernet/qlogic/qed/qed_mcp.h |1 + drivers/net/ethernet/qlogic/qed/qed_reg_addr.h|8 + drivers/net/ethernet/qlogic/qed/qed_sp.h |4 + drivers/net/ethernet/qlogic/qed/qed_sp_commands.c |3 + drivers/scsi/Kconfig |1 + drivers/scsi/Makefile |1 + drivers/scsi/qedf/Kconfig | 11 + drivers/scsi/qedf/Makefile|5 + drivers/scsi/qedf/qedf.h | 545 drivers/scsi/qedf/qedf_attr.c | 165 + drivers/scsi/qedf/qedf_dbg.c | 195 ++ drivers/scsi/qedf/qedf_dbg.h | 154 + drivers/scsi/qedf/qedf_debugfs.c | 460 +++ drivers/scsi/qedf/qedf_els.c | 949 ++ drivers/scsi/qedf/qedf_fip.c | 269 ++ drivers/scsi/qedf/qedf_hsi.h | 422 +++ drivers/scsi/qedf/qedf_io.c | 2282 ++ drivers/scsi/qedf/qedf_main.c | 3336 + drivers/scsi/qedf/qedf_version.h | 15 + include/linux/qed/common_hsi.h| 10 +- include/linux/qed/fcoe_common.h | 715 + include/linux/qed/qed_fcoe_if.h | 145 + include/linux/qed/qed_if.h| 41 +- 41 files changed, 12027 insertions(+), 19 deletions(-) create mode 100644
RE: [patch] scsi: megaraid_sas: array overflow in megasas_dump_frame()
>-Original Message- >From: Dan Carpenter [mailto:dan.carpen...@oracle.com] >Sent: Tuesday, February 14, 2017 10:09 PM >To: Kashyap Desai; Shivasharan S >Cc: Sumit Saxena; James E.J. Bottomley; Martin K. Petersen; >megaraidlinux@broadcom.com; linux-scsi@vger.kernel.org; kernel- >janit...@vger.kernel.org >Subject: [patch] scsi: megaraid_sas: array overflow in megasas_dump_frame() > >The "sz" variable is in terms of bytes, but we're treating the buffer as an array of >__le32 so we have to divide by 4. > >Fixes: def0eab3af86 ("scsi: megaraid_sas: enhance debug logs in OCR context") >Signed-off-by: Dan Carpenter> >diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c >b/drivers/scsi/megaraid/megaraid_sas_base.c >index dc9f42e135bb..7ac9a9ee9bd4 100644 >--- a/drivers/scsi/megaraid/megaraid_sas_base.c >+++ b/drivers/scsi/megaraid/megaraid_sas_base.c >@@ -2754,7 +2754,7 @@ megasas_dump_frame(void *mpi_request, int sz) > __le32 *mfp = (__le32 *)mpi_request; > > printk(KERN_INFO "IO request frame:\n\t"); >- for (i = 0; i < sz; i++) { >+ for (i = 0; i < sz / sizeof(__le32); i++) { > if (i && ((i % 8) == 0)) > printk("\n\t"); > printk("%08x ", le32_to_cpu(mfp[i])); Patch looks good. In last reply, Acked-by tag was not in proper format. Fixing it now. Sorry for inconvenience. Acked-by: Sumit Saxena
RE: [PATCH] scsi: megaraid_sas: handle dma_addr_t right on 32-bit
>-Original Message- >From: Arnd Bergmann [mailto:a...@arndb.de] >Sent: Wednesday, February 15, 2017 2:52 AM >To: James E.J. Bottomley; Martin K. Petersen >Cc: Arnd Bergmann; Kashyap Desai; Sumit Saxena; Shivasharan S; Tomas Henzl; >Hannes Reinecke; Sasikumar Chandrasekaran; >megaraidlinux@broadcom.com; linux-scsi@vger.kernel.org; linux- >ker...@vger.kernel.org >Subject: [PATCH] scsi: megaraid_sas: handle dma_addr_t right on 32-bit > >When building with a dma_addr_t that is different from pointer size, we get this >warning: > >drivers/scsi/megaraid/megaraid_sas_fusion.c: In function >'megasas_make_prp_nvme': >drivers/scsi/megaraid/megaraid_sas_fusion.c:1654:17: error: cast to pointer >from integer of different size [-Werror=int-to-pointer-cast] > >It's better to not pretend that the dma address is a pointer and instead use a >dma_addr_t consistently. > >Fixes: 33203bc4d61b ("scsi: megaraid_sas: NVME fast path io support") >Signed-off-by: Arnd Bergmann>--- > drivers/scsi/megaraid/megaraid_sas_fusion.c | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > >diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c >b/drivers/scsi/megaraid/megaraid_sas_fusion.c >index 750090119f81..29650ba669da 100644 >--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c >+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c >@@ -1619,7 +1619,8 @@ megasas_make_prp_nvme(struct megasas_instance >*instance, struct scsi_cmnd *scmd, { > int sge_len, offset, num_prp_in_chain = 0; > struct MPI25_IEEE_SGE_CHAIN64 *main_chain_element, *ptr_first_sgl; >- u64 *ptr_sgl, *ptr_sgl_phys; >+ u64 *ptr_sgl; >+ dma_addr_t ptr_sgl_phys; > u64 sge_addr; > u32 page_mask, page_mask_result; > struct scatterlist *sg_scmd; >@@ -1651,14 +1652,14 @@ megasas_make_prp_nvme(struct megasas_instance >*instance, struct scsi_cmnd *scmd, >*/ > page_mask = mr_nvme_pg_size - 1; > ptr_sgl = (u64 *)cmd->sg_frame; >- ptr_sgl_phys = (u64 *)cmd->sg_frame_phys_addr; >+ ptr_sgl_phys = cmd->sg_frame_phys_addr; > memset(ptr_sgl, 0, instance->max_chain_frame_sz); > > /* Build chain frame element which holds all prps except first*/ > main_chain_element = (struct MPI25_IEEE_SGE_CHAIN64 *) > ((u8 *)sgl_ptr + sizeof(struct MPI25_IEEE_SGE_CHAIN64)); > >- main_chain_element->Address = cpu_to_le64((uintptr_t)ptr_sgl_phys); >+ main_chain_element->Address = cpu_to_le64(ptr_sgl_phys); > main_chain_element->NextChainOffset = 0; > main_chain_element->Flags = IEEE_SGE_FLAGS_CHAIN_ELEMENT | > IEEE_SGE_FLAGS_SYSTEM_ADDR | >@@ -1696,16 +1697,15 @@ megasas_make_prp_nvme(struct megasas_instance >*instance, struct scsi_cmnd *scmd, > scmd_printk(KERN_NOTICE, > scmd, "page boundary ptr_sgl: 0x%p\n", > ptr_sgl); >- ptr_sgl_phys++; >- *ptr_sgl = >- cpu_to_le64((uintptr_t)ptr_sgl_phys); >+ ptr_sgl_phys += 8; >+ *ptr_sgl = cpu_to_le64(ptr_sgl_phys); > ptr_sgl++; > num_prp_in_chain++; > } > > *ptr_sgl = cpu_to_le64(sge_addr); > ptr_sgl++; >- ptr_sgl_phys++; >+ ptr_sgl_phys += 8; > num_prp_in_chain++; > > sge_addr += mr_nvme_pg_size; Patch looks good. In last reply, Acked-by tag was not in proper format. Fixing it now. Sorry for inconvenience. Acked-by: Sumit Saxena >-- >2.9.0
Re: [PATCH v3 00/16] lpfc: Add NVME Fabrics support
Hi James, This patch set adds support for NVME over Fabrics FC transport to lpfc The internals of the driver are reworked to support being either: a SCSI initiator; a NVME intiator; both a SCSI initiator and a NVME initiator; or a NVME target. The driver effectively has parallel NVME and SCSI stacks that utilize their own set of resources. They intersect only at the hardware level, mainly in queue create layers and interrupt handling. A few new files are added to support the interfaces of the FC transport LLDD api for NVME fabrics. The patches were cut against 1/30 scsi.git tree, misc branch. ** THEY ARE INTENDED FOR THE SCSI.GIT TREE, MISC BRANCH ** The lpfc version in the linux-block.git tree is rather old. I have a recipe for how to get it to a version that syncs with the scsi.git/misc tree so that these patches can apply there as well. Contact me if you would like it. This set does not seem to apply cleanly on nvme-4.11 tree, looks like patch 6 is failing. Also, can you send your patchset threaded? It usually does so when generating the patches with git format-patch, not sure how this is not the case with your set... It would make my life a bit easier. Thanks.
RE: [PATCH 00/10] mpt3sas: full mq support
> > > Hannes, > > Result I have posted last time is with merge operation enabled in block > layer. If I disable merge operation then I don't see much improvement > with > multiple hw request queues. Here is the result, > > fio results when nr_hw_queues=1, > 4k read when numjobs=24: io=248387MB, bw=1655.1MB/s, iops=423905, > runt=150003msec > > fio results when nr_hw_queues=24, > 4k read when numjobs=24: io=263904MB, bw=1759.4MB/s, iops=450393, > runt=150001msec Hannes - I worked with Sreekanth and also understand pros/cons of Patch #10. " [PATCH 10/10] mpt3sas: scsi-mq interrupt steering" In above patch, can_queue of HBA is divided based on logic CPU, it means we want to mimic as if mpt3sas HBA support multi queue distributing actual resources which is single Submission H/W Queue. This approach badly impact many performance areas. nr_hw_queues = 1 is what I observe as best performance approach since it never throttle IO if sdev->queue_depth is set to HBA queue depth. In case of nr_hw_queues = "CPUs" throttle IO at SCSI level since we never allow more than "updated can_queue" in LLD. Below code bring actual HBA can_queue very low ( Ea on 96 logical core CPU new can_queue goes to 42, if HBA queue depth is 4K). It means we will see lots of IO throttling in scsi mid layer due to shost->can_queue reach the limit very soon if you have jobs with higher QD. if (ioc->shost->nr_hw_queues > 1) { ioc->shost->nr_hw_queues = ioc->msix_vector_count; ioc->shost->can_queue /= ioc->msix_vector_count; } I observe negative performance if I have 8 SSD drives attached to Ventura (latest IT controller). 16 fio jobs at QD=128 gives ~1600K IOPs and the moment I switch to nr_hw_queues = "CPUs", it gave hardly ~850K IOPs. This is mainly because of host_busy stuck at very low ~169 on my setup. May be as Sreekanth mentioned, performance improvement you have observed is due to nomerges=2 is not set and OS will attempt soft back/front merge. I debug live machine and understood we never see parallel instance of "scsi_dispatch_cmd" as we expect due to can_queue is less. If we really has *very* large HBA QD, this patch #10 to expose multiple SQ may be useful. For now, we are looking for updated version of patch which will only keep IT HBA in SQ mode (like we are doing in driver) and add interface to use blk_tag in both scsi.mq and !scsi.mq mode. Sreekanth has already started working on it, but we may need to check full performance test run to post the actual patch. May be we can cherry pick few patches from this series and get blk_tag support to improve performance of later which will not allow use to choose nr_hw_queue to be tunable. Thanks, Kashyap > > Thanks, > Sreekanth
Re: [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > The driver does not unlock the reply queue spin lock after handling SMART > adapter events. Instead it might attempt to unlock an already unlocked > spin lock. > > Fixed by making sure the driver locks the spin lock before freeing it. > > Thank you dan for finding this issue out. > > Fixes: 6223a39fe6fbbeef (scsi: aacraid: Added support for hotplug) > Reported-by: Dan Carpenter> Signed-off-by: Raghava Aditya Renukunta > > Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 16/16] aacraid: Update driver version
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > Updated driver version to 50792 > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > Currently the adapter firmware does not save outstanding I/O's log > information when an IOP reset is triggered. This is problematic when > trying to root cause and debug issues. > > Fixed by adding sync command to trigger I/O log file save in the adapter > firmware before issuing an IOP reset. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 13/16] aacraid: Reorder Adapter status check
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > The driver currently checks the SELF_TEST_FAILED first and then > KERNEL_PANIC next. Under error conditions(boot code failure) both > SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time. > > The driver has the capability to reset the controller on an KERNEL_PANIC > , but not on SELF_TEST_FAILED. > > Fixed by first checking KERNEL_PANIC and then the others. > > Cc: sta...@vger.kernel.org > Fixes: e8b12f0fb835223752 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC > base controller family) > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Apart from the odd comma placement in the 2nd paragraph of the commit message Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family)
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > When the SMART family of controller panic (KERNEL_PANIC) , they do not ^ controllers?^ extra space > honor IOP resets. So better to skip it and directly perform a IWBR reset. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- > drivers/scsi/aacraid/src.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c > index b23c818..5bb9865 100644 > --- a/drivers/scsi/aacraid/src.c > +++ b/drivers/scsi/aacraid/src.c > @@ -714,6 +714,12 @@ static int aac_src_restart_adapter(struct aac_dev *dev, > int bled, u8 reset_type) > pr_err("%s%d: adapter kernel panic'd %x.\n", > dev->name, dev->id, bled); > > + /* > + * WHen there is a BlinkLED, IOP_RESET has not effect ^ When > + */ > + if (bled >= 2 && dev->sa_firmware && (reset_type & HW_IOP_RESET)) ^ No need for the parenthesis > + reset_type &= ~HW_IOP_RESET; > + > dev->a_ops.adapter_enable_int = aac_src_disable_interrupt; > > switch (reset_type) { > Apart from that, Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 11/16] aacraid: Decrease adapter health check interval
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > Currently driver checks the health status of the adapter once every 24 > hours. When that happens the driver becomes dependent on the kernel to > figure out if the adapter is misbehaving. This might take some time > (when the adapter is idle). The driver currently has support to > restart/recover the controller when it fails, and decreasing the time > interval will help. > > Fixed by decreasing check interval from 24 hours to 1 minute > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > When the command thread performs a periodic time sync and the firmware is > going through an assert during that time, the command thread waits for the > response that would never arrive. The SCSI Mid layer's error handler would > eventually reset the controller, but the eh_handler just issues a > "thread stop" to the command thread which is stuck on a semaphore and the > eh_thread would in turn goes to sleep waiting for the command_thread to > respond to the stop which never happens. > > Fixed by allowing SIGTERM for the command thread, and during the eh_reset > call, sends termination signal to the command thread. As a follow-up, the > eh_reset handler would take care of the controller reset. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- This look a bit scary. Can't the kthread be converted to a workqueue so we could call cancel_work_sync()? -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 09/16] aacraid: Reload offlined drives after controller reset
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > During the IOP reset stress testing, it was found that the drives can be > marked offline when the adapter controller crashes and IO's are running > in parallel. When the controller does come back from the reset, the drive > that is marked offline is not exposed. > > Fixed by removing and adding drives that are marked offline. In addition > invoke a scsi host bus rescan to capture any additional configuration > changes. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 08/16] aacraid: Skip wellness sync on controller failure
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > aac_command_thread checks on the health of controller periodically, > using aac_check_health. If the status is an error state KERNEL_PANIC or > anything else. The driver will attempt to restart the adapter, but the > response is not checked in aac_command_thread. This allows the periodic > sync to go thru and lead the driver to a hung state. > > Fixed by terminating the periodic loop(intended per original design), > if the controller is not restored to a healthy state. > > Cc: sta...@vger.kernel.org > Fixes: 3d77d8404478353358 (scsi: aacraid: Added support for periodic wellness > sync) > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 07/16] aacraid: Fix sync fibs time out on controller reset
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > After controller shutdown, all sync fibs time out due to not knowing > about the switch to INT-x mode > > Fixed by replacing aac_src_access_devreg() to aac_set_intx_mode() call. > > Cc: sta...@vger.kernel.org > Fixes: 495c021767bd78c998 (aacraid: MSI-x support) > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 06/16] aacraid: Added sysfs for driver version
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > Added support to retrieve driver version from a new sysfs variable called > driver_version. It makes it easier for the user to figure out the driver > version that is currently running. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Can't this be retrieved via modinfo? -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 05/16] aacraid: Fix memory leak in fib init path
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > aac_fib_map_free frees misaligned fib dma memory, additionally it does not > free up the whole memory. > > Fixed by changing the code to free up the correct and full memory > allocation. > > Cc: sta...@vger.kernel.org > Fixes: e8b12f0fb835223 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC > based controller family) > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- > drivers/scsi/aacraid/commsup.c | 20 +--- > 1 file changed, 9 insertions(+), 11 deletions(-) > > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c > index f7a3bcb..863c98d 100644 > --- a/drivers/scsi/aacraid/commsup.c > +++ b/drivers/scsi/aacraid/commsup.c > @@ -97,8 +97,8 @@ void aac_fib_map_free(struct aac_dev *dev) > { > if (dev->hw_fib_va && dev->max_cmd_size) { > pci_free_consistent(dev->pdev, > - (dev->max_cmd_size * > - (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB)), > + (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) > + * (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB) + 31, > dev->hw_fib_va, dev->hw_fib_pa); Can you please do something like: size_t alloc_size; int numtags; numtags = dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB; alloc_size = (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) * numtags + 31; pci_free_consistent(dev->pdev, alloc_size, dev->hw_fib_va, dev->hw_fib_pa); And please indent correctly. If it indentation doesn't work correctly because you hit the 80 chars limit, that's a sign something should be reconsidered. Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 00/10] mpt3sas: full mq support
On Mon, Feb 13, 2017 at 6:41 PM, Hannes Reineckewrote: > On 02/13/2017 07:15 AM, Sreekanth Reddy wrote: >> On Fri, Feb 10, 2017 at 12:29 PM, Hannes Reinecke wrote: >>> On 02/10/2017 05:43 AM, Sreekanth Reddy wrote: On Thu, Feb 9, 2017 at 6:42 PM, Hannes Reinecke wrote: > On 02/09/2017 02:03 PM, Sreekanth Reddy wrote: >>> [ .. ] >> >> >> Hannes, >> >> I have created a md raid0 with 4 SAS SSD drives using below command, >> #mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdg /dev/sdh >> /dev/sdi /dev/sdj >> >> And here is 'mdadm --detail /dev/md0' command output, >> -- >> /dev/md0: >> Version : 1.2 >> Creation Time : Thu Feb 9 14:38:47 2017 >> Raid Level : raid0 >> Array Size : 780918784 (744.74 GiB 799.66 GB) >>Raid Devices : 4 >> Total Devices : 4 >> Persistence : Superblock is persistent >> >> Update Time : Thu Feb 9 14:38:47 2017 >> State : clean >> Active Devices : 4 >> Working Devices : 4 >> Failed Devices : 0 >> Spare Devices : 0 >> >> Chunk Size : 512K >> >>Name : host_name >>UUID : b63f9da7:b7de9a25:6a46ca00:42214e22 >> Events : 0 >> >> Number Major Minor RaidDevice State >>0 8 960 active sync /dev/sdg >>1 8 1121 active sync /dev/sdh >>2 8 1442 active sync /dev/sdj >>3 8 1283 active sync /dev/sdi >> -- >> >> Then I have used below fio profile to run 4K sequence read operations >> with nr_hw_queues=1 driver and with nr_hw_queues=24 driver (as my >> system has two numa node and each with 12 cpus). >> - >> global] >> ioengine=libaio >> group_reporting >> direct=1 >> rw=read >> bs=4k >> allow_mounted_write=0 >> iodepth=128 >> runtime=150s >> >> [job1] >> filename=/dev/md0 >> - >> >> Here are the fio results when nr_hw_queues=1 (i.e. single request >> queue) with various number of job counts >> 1JOB 4k read : io=213268MB, bw=1421.8MB/s, iops=363975, runt=150001msec >> 2JOBs 4k read : io=309605MB, bw=2064.2MB/s, iops=528389, runt=150001msec >> 4JOBs 4k read : io=281001MB, bw=1873.4MB/s, iops=479569, runt=150002msec >> 8JOBs 4k read : io=236297MB, bw=1575.2MB/s, iops=403236, runt=150016msec >> >> Here are the fio results when nr_hw_queues=24 (i.e. multiple request >> queue) with various number of job counts >> 1JOB 4k read : io=95194MB, bw=649852KB/s, iops=162463, runt=150001msec >> 2JOBs 4k read : io=189343MB, bw=1262.3MB/s, iops=323142, runt=150001msec >> 4JOBs 4k read : io=314832MB, bw=2098.9MB/s, iops=537309, runt=150001msec >> 8JOBs 4k read : io=277015MB, bw=1846.8MB/s, iops=472769, runt=150001msec >> >> Here we can see that on less number of jobs count, single request >> queue (nr_hw_queues=1) is giving more IOPs than multi request >> queues(nr_hw_queues=24). >> >> Can you please share your fio profile, so that I can try same thing on >> my system. >> > Have you tried with the latest git update from Jens for-4.11/block (or > for-4.11/next) branch? I am using below git repo, https://git.kernel.org/cgit/linux/kernel/git/mkp/scsi.git/log/?h=4.11/scsi-queue Today I will try with Jens for-4.11/block. >>> By all means, do. >>> > I've found that using the mq-deadline scheduler has a noticeable > performance boost. > > The fio job I'm using is essentially the same; you just should make sure > to specify a 'numjob=' statement in there. > Otherwise fio will just use a single CPU, which of course leads to > averse effects in the multiqueue case. Yes I am providing 'numjob=' on fio command line as shown below, # fio md_fio_profile --numjobs=8 --output=fio_results.txt >>> Still, it looks as if you'd be using less jobs than you have CPUs. >>> Which means you'll be running into a tag starvation scenario on those >>> CPUs, especially for the small blocksizes. >>> What are the results if you set 'numjobs' to the number of CPUs? >>> >> >> Hannes, >> >> Tried on Jens for-4.11/block kernel repo and also set each block PD's >> scheduler as 'mq-deadline', and here is my results for 4K SR on md0 >> (raid0 with 4 drives). I have 24 CPUs and so tried even with setting >> numjobs=24. >>
Re: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > Arrconf management utility at times sends fibs with AdapterProcessed set > in its fibs. This causes the controller to panic and lockup. > > Fixed by failing the commands that have AdapterProcessed set in its flag. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- > drivers/scsi/aacraid/commsup.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c > index 6220b47..f7a3bcb 100644 > --- a/drivers/scsi/aacraid/commsup.c > +++ b/drivers/scsi/aacraid/commsup.c > @@ -522,6 +522,10 @@ int aac_fib_send(u16 command, struct fib *fibptr, > unsigned long size, > > if (!(hw_fib->header.XferState & cpu_to_le32(HostOwned))) > return -EBUSY; > + > + if (hw_fib->header.XferState & cpu_to_le32(AdapterProcessed)) > + return -EINVAL; > + As far as I can see the fib_xfer_state enum isn't an exported as an official ABI, so it's a good candidate (whole of aacraid.h actually) for the next round of camel case removals. Anyways, this can wait: Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: sense handling improvements
On 02/14/2017 08:15 PM, Christoph Hellwig wrote: > Hi all, > > this series is on top of the scsi_request changes in Jens' tree and > further improves the handling of the sense buffer. > Sorry, but I'm feeling really daft: which scsi_request changes? To be found in which tree? I dimly remember seeing them, but have been unable to find them again. > The first patch prevents any possibily of reusing stale sense codes > in sense headers, and is a bug fix that we should probably get into > the block tree ASAP. > > The rest cleans up handling of the parsed sense data and could go in > either through the block tree, or a SCSI branch on top of the block > tree. > Have we audited all drivers to _not_ do DMA into the sense buffer? By first glance some still do, so they'll break horribly when moving the sense buffer onto the stack ... Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [PATCH 00/10] mpt3sas: full mq support
On 02/15/2017 09:15 AM, Christoph Hellwig wrote: > On Tue, Feb 07, 2017 at 02:19:09PM +0100, Christoph Hellwig wrote: >> Patch 1-7 look fine to me with minor fixups, and I'd love to see >> them go into 4.11. > > Any chance to see a resend of these? > Sure. Will do shortly. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [PATCH 00/10] mpt3sas: full mq support
On Tue, Feb 07, 2017 at 02:19:09PM +0100, Christoph Hellwig wrote: > Patch 1-7 look fine to me with minor fixups, and I'd love to see > them go into 4.11. Any chance to see a resend of these?
Re: [PATCH 03/16] aacraid: Fix for excessive prints on EEH
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > This issue showed up on a kdump debug(single CPU on powerkvm), when EEH > errors rendered the adapter unusable. The driver correctly detected the > issue and attempted to restart the controller, in doing so the driver > attempted to read the status registers of the controller. This triggered > additional eeh errors which continued for a good 6 minutes. > > Fixed by returning without waiting when EEH error is reported. > > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- > drivers/scsi/aacraid/commsup.c | 38 +- > 1 file changed, 37 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c > index 56090f5..6220b47 100644 > --- a/drivers/scsi/aacraid/commsup.c > +++ b/drivers/scsi/aacraid/commsup.c > @@ -461,6 +461,30 @@ int aac_queue_get(struct aac_dev * dev, u32 * index, u32 > qid, struct hw_fib * hw > return 0; > } Please do > +#ifdef CONFIG_EEH > +static inline int aac_check_eeh_failure(struct aac_dev *dev) > +{ > + /* Check for an EEH failure for the given > + * device node. Function eeh_dev_check_failure() > + * returns 0 if there has not been an EEH error > + * otherwise returns a non-zero value. > + * > + * Need to be called before any PCI operation, > + * i.e.,before aac_adapter_check_health() > + */ > + struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev); > + > + if (eeh_dev_check_failure(edev)) { > + /* The EEH mechanisms will handle this > + * error and reset the device if > + * necessary. > + */ > + return 1; > + } > + return 0; > +} #else static inline int aac_check_eeh_failure(struct aac_dev *dev) { return 0; } > +#endif > + [...] > + > +#if defined(CONFIG_EEH) > + if (aac_check_eeh_failure(dev)) > + return -EFAULT; > +#endif > + So the #if defined() blocks become unnecessary. Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 02/16] aacraid: Use correct channel number for raw srb
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > The channel being used for raw srb commands is retrieved from the utility > sent fibs and is converted into physical channel id. The driver does not > need to to do this since the management utility sends the correct channel > id in the first place and in addition the driver sets inaccurate > information in the cmd sent to the firmware and gets an invalid response. > > Fixed by using channel id from srb command. > > Cc: sta...@vger.kernel.org > Fixes: 423400e64d377c0 ("scsi: aacraid: Include HBA direct interface") > Signed-off-by: Raghava Aditya Renukunta >> Reviewed-by: David Carroll > --- Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 01/16] aacraid: Fix camel case
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote: > Replaced camel case with snake case for init supported options. > > Suggested-by: Johannes Thumshirn> Signed-off-by: Raghava Aditya Renukunta > > Reviewed-by: David Carroll > --- \o/ Bonus-points-awarded-by: Johannes Thumshirn Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 2/6] sd: improve TUR handling in sd_check_events
On 02/14/2017 08:15 PM, Christoph Hellwig wrote: > Remove bogus evaluations of retval and sshdr when the device is offline, > and fix a possible NULL pointer dereference by allocating the 8 byte > sized sense header on stack. > > Signed-off-by: Christoph Hellwig> --- > drivers/scsi/sd.c | 25 +++-- > 1 file changed, 11 insertions(+), 14 deletions(-) > Reviewed-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [PATCH 1/6] scsi: always zero sshdr in scsi_normalize_sense
On 02/14/2017 08:15 PM, Christoph Hellwig wrote: > This gives us a clear state even if a command didn't return sense data. > > Signed-off-by: Christoph Hellwig> --- > drivers/scsi/scsi_common.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/scsi_common.c b/drivers/scsi/scsi_common.c > index b1383a71400e..a75673bb82b3 100644 > --- a/drivers/scsi/scsi_common.c > +++ b/drivers/scsi/scsi_common.c > @@ -137,11 +137,11 @@ EXPORT_SYMBOL(int_to_scsilun); > bool scsi_normalize_sense(const u8 *sense_buffer, int sb_len, > struct scsi_sense_hdr *sshdr) > { > + memset(sshdr, 0, sizeof(struct scsi_sense_hdr)); > + > if (!sense_buffer || !sb_len) > return false; > > - memset(sshdr, 0, sizeof(struct scsi_sense_hdr)); > - > sshdr->response_code = (sense_buffer[0] & 0x7f); > > if (!scsi_sense_valid(sshdr)) > Reviewed-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)