[PATCH] [RESEND] qla2xxx: prevent board_disable from running during EEH

2015-06-26 Thread Mauricio Faria de Oliveira
Commit f3ddac1918fe963bcbf8d407a3a3c0881b47248b ([SCSI] qla2xxx: Disable adapter when we encounter a PCI disconnect.) has introduced code that disables the board, releasing some resources, when reading 0x. In case this happens when there is an EEH, this read will trigger EEH detection

Re: [dm-devel] IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]

2015-10-29 Thread Mauricio Faria de Oliveira
need to track down other mails (btw, thanks for the detailed patch header but it enabled me to be skeptical of your request to revert): You're welcome. If it's been useful for rejecting this patch and getting a better one later, it's worth it. :) Kind regards, -- Mauricio Faria de Oliveira IBM

[PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()

2016-06-07 Thread Mauricio Faria de Oliveira
):0713 SCSI layer issued Device Reset (0, 0) return x2002 <...> lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002 <...> lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002 <...> lpfc 0006:01:00.4: 4:(0):31

[PATCH 2/2] lpfc: add option for lpfc_fcp_io_sched (LPFC_FCP_SCHED_BY_CPU_CORE)

2016-06-01 Thread Mauricio Faria de Oliveira
on next-20160601. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/lpfc/lpfc_attr.c | 8 +-- drivers/scsi/lpfc/lpfc_hw4.h | 1 + drivers/scsi/lpfc/lpfc_init.c | 54 ++- drivers/scsi/lpfc/lpfc_scsi.c | 3 ++- 4

[PATCH 1/2] lpfc: support for CPU phys_id and core_id on PowerPC64

2016-06-01 Thread Mauricio Faria de Oliveira
on some systems). While in there, include the CPU number in the debug message, which helps reading it on systems with many CPUs. This depends on commit 'powerpc: export cpu_to_core_id()' (submitted to the linuxppc-dev mailing list). Tested on next-20160601 w/ commit. Signed-off-by: Mauricio

[PATCH 0/2] lpfc: PowerPC64 topology information, and per-core scheduling

2016-06-01 Thread Mauricio Faria de Oliveira
(topology information), which has server processors with many cores/threads and per-core caches. Although the series include bits for PowerPC64, the per-core scheduling patch is architecture independent. Tested on next-20160601 (with an extra commit for patch 1/2, see commit msg). Mauricio Faria de

Re: [PATCH 1/2] lpfc: support for CPU phys_id and core_id on PowerPC64

2016-06-22 Thread Mauricio Faria de Oliveira
technical problems, for example. Thanks for the review/comments (Christoph too), -- Mauricio Faria de Oliveira IBM Linux Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo inf

Re: block: don't check request size in blk_cloned_rq_check_limits()

2016-06-16 Thread Mauricio Faria de Oliveira
closely. -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH 0/2] lpfc: PowerPC64 topology information, and per-core scheduling

2016-06-21 Thread Mauricio Faria de Oliveira
On 06/01/2016 05:43 PM, Mauricio Faria de Oliveira wrote: Tested on next-20160601 (with an extra commit for patch 1/2, see commit msg). FYI, that commit has been accepted into powerpc next [1]. [1] https://git.kernel.org/powerpc/c/f8ab481066e7246e4b272233aa -- Mauricio Faria de Oliveira IBM

Re: [PATCH 1/2] lpfc: support for CPU phys_id and core_id on PowerPC64

2016-06-21 Thread Mauricio Faria de Oliveira
where ppc64/le usually runs, on which it would be easier to adapt this relatively small change than moving forward w/ blk-mq/scsi-mq, for example -- even if the latter is clearly a superior approach. [1] http://lists.infradead.org/pipermail/linux-nvme/2016-June/005012.html -- Mauricio Faria de

Re: [PATCH] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()

2016-07-25 Thread Mauricio Faria de Oliveira
, and it happens in normal scenarios (eg SCSI EH), it seems appropriate. Thanks, -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash

2017-01-25 Thread Mauricio Faria de Oliveira
, please feel free to change the sign-off line as appropriate here. Thanks, -- Mauricio Faria de Oliveira IBM Linux Technology Center

[PATCH v2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash

2017-01-25 Thread Mauricio Faria de Oliveira
xxx] qla2x00_abort_isp+0xef/0x690 [qla2xxx] qla2x00_do_dpc+0x36c/0x880 [qla2xxx] kthread+0x10c/0x140 Note: this patch is a slight change of the original patch sent by Bart, submitted by request of mkp. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Reported

Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash

2017-01-24 Thread Mauricio Faria de Oliveira
qla2xxx_eh_abort(GET_CMD_SP(sp)); + qla2xxx_eh_abort(scmd); spin_lock_irqsave(&ha->hardware_lock, flags); } req->outstanding_cmds[cnt] = NULL; Signed-off-by: Mauricio Faria de Olive

[PATCH] lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put()

2016-11-23 Thread Mauricio Faria de Oliveira
et_from_kernel_thread+0x5c/0xbc <...> Cc: sta...@vger.kernel.org # v4.8 Fixes: 22466da5b4b7 ("lpfc: Fix possible NULL pointer dereference") Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/lpfc/lpfc_sli.c | 14 -- 1 file

Re: [PATCH] lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put()

2016-11-23 Thread Mauricio Faria de Oliveira
Due credit; an oversight. On 11/23/2016 10:33 AM, Mauricio Faria de Oliveira wrote: Reported-by: Harsha Thyagaraja <hathy...@in.ibm.com> Cc: sta...@vger.kernel.org # v4.8 Fixes: 22466da5b4b7 ("lpfc: Fix possible NULL pointer dereference") Signed-off-by: Mauricio Faria de

Re: [PATCH] lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put()

2016-11-23 Thread Mauricio Faria de Oliveira
On 11/23/2016 12:12 PM, Johannes Thumshirn wrote: Looks good and sorry for the bug, Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Thanks for the quick review. Not a problem! This problem turned out to be a good learning exercise. :) -- Mauricio Faria de Oliveira IBM Linux Tech

[PATCH] qla2xxx: do not abort all commands in the adapter during EEH recovery

2016-11-14 Thread Mauricio Faria de Oliveira
orry for this oversight.) With it applied, both PCI device remove and EEH recovery work fine. Fixes: 1535aa75a3d8 ("scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove") Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>

[PATCH RESEND] block: allow WRITE_SAME commands with the SG_IO ioctl

2016-12-15 Thread Mauricio Faria de Oliveira
or 17096824 Links: [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52 [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device') Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Signed-off-by: Brahadamba

[PATCH] scsi: do not requeue requests unaligned with device sector size

2016-12-19 Thread Mauricio Faria de Oliveira
[sda] tag#0 8 sectors total, 4096 bytes done. [...] sd 0:0:0:0: tag#0 0 sectors total, 0 bytes done. Apologies for the ridiculously long commit message with description and test-cases, but this problem has been relatively difficult to reproduce and understand, so I thought the documentation/instr

Re: [PATCH] scsi: do not requeue requests unaligned with device sector size

2016-12-21 Thread Mauricio Faria de Oliveira
f how the I/O is being broken up into frames at the transport level and at which offset the transfer was interrupted. Christoph, Hannes, Martin, Thank you all for your comments and pointers to the documentation/spec. I'll carry it on with the HBA and storage folks. cheers, -- Mauricio F

[PATCH] scsi: ses: don't get power status of SES device slot on probe

2017-03-29 Thread Mauricio Faria de Oliveira
that more properly, set the initial power state value to '-1' (i.e., uninitialized) instead of '1' (power 'on'), and check for it in that callback which may do a direct access to the field value _if_ a callback function is not defined. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux
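
The "uninitialized" sentinel described in this message can be illustrated with a small user-space sketch. Everything here (struct slot, slot_init(), slot_power_status(), fake_get_power_on()) is a hypothetical name chosen for illustration; it is not the enclosure/ses driver code, only the general idea of starting at -1 and checking for it on read, under the assumption that the optional callback updates the field itself.

```c
#include <stddef.h>

/* Hypothetical names throughout; not the kernel enclosure/ses API. */
#define POWER_UNINITIALIZED (-1)

struct slot;
typedef void (*get_power_fn)(struct slot *s);

struct slot {
    int power_status;       /* -1 = unknown, 0 = off, 1 = on */
    get_power_fn get_power; /* optional callback, may be NULL */
};

/* Probe time: do not query the device, only mark the value as unknown. */
static void slot_init(struct slot *s, get_power_fn cb)
{
    s->power_status = POWER_UNINITIALIZED;
    s->get_power = cb;
}

/*
 * Read time: let the callback (if any) refresh the field, then access
 * the field directly. A value still at -1 means nobody could provide
 * the status, so the caller sees "unknown" instead of a bogus "on".
 */
static int slot_power_status(struct slot *s)
{
    if (s->get_power)
        s->get_power(s);
    return s->power_status;
}

/* Example callback (an assumption for the demo): reports "on". */
static void fake_get_power_on(struct slot *s)
{
    s->power_status = 1;
}
```

With an initial value of 1, a slot whose status was never read would be indistinguishable from one that is powered on; the sentinel keeps those two cases apart.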

[PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Mauricio Faria de Oliveira
wp=rw |-+- policy='service-time 0' prio=0 status=active | `- 2:2:7:0 sdaf 65:240 active undef running `-+- policy='service-time 0' prio=0 status=enabled `- 1:2:7:0 sdh 8:112 active undef running Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.i

Re: [PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Mauricio Faria de Oliveira
This is the PATCH v2. Sorry for the wrong subject line. On 04/11/2017 11:46 AM, Mauricio Faria de Oliveira wrote: Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Acked-by: Brian King <brk...@linux.vnet.ibm.com> --- v2: - use the scsi_cmd local variable

Re: [PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Mauricio Faria de Oliveira
o back to the ipr_cmd to get the pointer, so could be: Thanks for catching that oversight. -- Mauricio Faria de Oliveira IBM Linux Technology Center

[PATCH 2/4] scsi: scsi_dh_alua: create alua_rtpg_print() for alua_rtpg() sdev_printk

2017-04-10 Thread Mauricio Faria de Oliveira
), in which case the 'valid_states' information is not printed. That is for the following patch too. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/device_handler/scsi_dh_alua.c | 43 ++ 1 file changed, 32 insertions(+), 11 del

[PATCH 3/4] scsi: scsi_dh_alua: print changes to RTPG state of other PGs too

2017-04-10 Thread Mauricio Faria de Oliveira
ciated to this port group. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/device_handler/scsi_dh_alua.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_

[PATCH 0/4] scsi: scsi_dh_alua: handle target port unavailable state

2017-04-10 Thread Mauricio Faria de Oliveira
Mauricio Faria de Oliveira (4): scsi: scsi_dh_alua: allow I/O in the target port unavailable state scsi: scsi_dh_alua: create alua_rtpg_print() for alua_rtpg() sdev_printk scsi: scsi_dh_alua: print changes to RTPG state of other PGs too scsi: scsi_dh_alua: do not print target port g

[PATCH 1/4] scsi: scsi_dh_alua: allow I/O in the target port unavailable state

2017-04-10 Thread Mauricio Faria de Oliveira
>state can be updated properly (and further SCSI IO error messages then silenced through alua_prep_fn()). Once a path checker eventually detects an active state again, the port group state will be updated by the path activation call, alua_activate(), as it schedules an alua_rtpg() check. Signed

[PATCH 4/4] scsi: scsi_dh_alua: do not print target port group state if it remains unavailable

2017-04-10 Thread Mauricio Faria de Oliveira
. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/device_handler/scsi_dh_alua.c | 26 -- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_h

Re: [PATCH 0/4] scsi: scsi_dh_alua: handle target port unavailable state

2017-04-10 Thread Mauricio Faria de Oliveira
On 04/10/2017 10:17 PM, Mauricio Faria de Oliveira wrote: For documentation purposes, I'll reply to this cover letter with the analysis of such cases of this problem, and the accompanying messages from kernel logs. Here it goes, for anyone interested. Scenario: 4 LUNs, 2 target port groups

Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown

2017-04-04 Thread Mauricio Faria de Oliveira
Hi Martin and Junichi, On 04/03/2017 11:10 PM, Junichi Nomura wrote: On 04/04/17 06:53, Mauricio Faria de Oliveira wrote: On 03/28/2017 11:29 PM, Junichi Nomura wrote: Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications"), "rmmod lpfc" sta

Re: [PATCH] scsi: ses: don't get power status of SES device slot on probe

2017-04-05 Thread Mauricio Faria de Oliveira
d making this patch a one-line. :- ) cheers, -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [lpfc 05/19] Fix driver unload/reload operation.

2017-04-05 Thread Mauricio Faria de Oliveira
FO, LOG_INIT, "2821 initialize iocb list %d.\n", phba->cfg_iocb_cnt*1024); cheers, -- Mauricio Faria de Oliveira IBM Linux Technology Center

[PATCH v2] scsi: ses: don't get power status of SES device slot on probe

2017-04-05 Thread Mauricio Faria de Oliveira
ower state value to '-1' (i.e., uninitialized) instead of '1' (power 'on'), and check for it in that callback which may do a direct access to the field value _if_ a callback function is not defined. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Fixes: 0802488

Re: [PATCH] scsi: ses: don't get power status of SES device slot on probe

2017-04-05 Thread Mauricio Faria de Oliveira
On 04/05/2017 11:41 AM, Dan Williams wrote: On Wed, Apr 5, 2017 at 6:13 AM, Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> wrote: 1) imagine .get_power_status couldn't update the 'power_status' field (it's a bit unlikely with the in-tree ses driver, but in th

Re: [PATCH v2] scsi: ses: don't get power status of SES device slot on probe

2017-04-05 Thread Mauricio Faria de Oliveira
On 04/05/2017 01:23 PM, Song Liu wrote: Reviewed-by: Song Liu <songliubrav...@fb.com> Thanks for reviewing, Song Liu. It's good to know this patch doesn't break anything for you. cheers, -- Mauricio Faria de Oliveira IBM Linux Technology Center

[PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-10 Thread Mauricio Faria de Oliveira
wp=rw |-+- policy='service-time 0' prio=0 status=active | `- 2:2:7:0 sdaf 65:240 active undef running `-+- policy='service-time 0' prio=0 status=enabled `- 1:2:7:0 sdh 8:112 active undef running Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>

Re: [PATCH v2 10/11] lpfc: Add missing memory barrier

2017-03-03 Thread Mauricio Faria de Oliveira
received partially updated WQE data. Add the memory barrier after updating the WQE memory. Reviewed-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Martin, may you please flag this patch for stable? Thank you, -- Mauricio Faria de Oliveira IBM Linux Technology Center
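
The ordering problem mentioned here (the consumer seeing a partially updated entry) can be sketched in user space with C11 atomics. This is only an analogue under stated assumptions: the kernel uses its own barrier primitives, and struct work_queue, post_wqe(), WQ_DEPTH and WQE_WORDS are hypothetical names, with an atomic variable standing in for the device doorbell.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define WQ_DEPTH  8   /* hypothetical queue depth */
#define WQE_WORDS 16  /* hypothetical entry size in 32-bit words */

struct work_queue {
    uint32_t entries[WQ_DEPTH][WQE_WORDS];
    _Atomic uint32_t doorbell; /* stands in for the device doorbell */
};

/*
 * Copy the entry into queue memory, then issue a release fence before
 * publishing it through the doorbell. Without the fence, the doorbell
 * store could become visible before all of the entry's words are.
 * The caller is assumed to pass idx < WQ_DEPTH.
 */
static void post_wqe(struct work_queue *wq, uint32_t idx,
                     const uint32_t *wqe)
{
    memcpy(wq->entries[idx], wqe, sizeof(wq->entries[idx]));
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&wq->doorbell, idx + 1, memory_order_relaxed);
}
```

A consumer that reads the doorbell with acquire ordering and then reads the entry is guaranteed to see the complete entry, which is the property the barrier is meant to restore.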

Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-03 Thread Mauricio Faria de Oliveira
Hi Martin and James, On 02/12/2017 07:52 PM, James Smart wrote: Correct WQ creation for pagesize Reviewed-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Please flag this patch for stable. This patch resolves a serious problem on IBM Power systems at least. An (appa

Re: [PATCHv3 1/6] scsi_error: count medium access timeout only once per EH run

2017-03-13 Thread Mauricio Faria de Oliveira
4dac35e..6a4f75a 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -106,6 +106,7 @@ struct scsi_disk { unsigned rc_basis: 2; unsigned zoned: 2; unsigned urswrz : 1; + unsigned medium_access_reset : 1; }; #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev) -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCHv3 1/6] scsi_error: count medium access timeout only once per EH run

2017-03-13 Thread Mauricio Faria de Oliveira
On 03/13/2017 11:48 AM, Hannes Reinecke wrote: This is assuming that we're always running on a scsi_disk, and that scsi_disk is the only one implementing 'eh_action'. Neither of which is necessarily true. Ah, OK. Thanks for explaining. -- Mauricio Faria de Oliveira IBM Linux Technology

Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-03-06 Thread Mauricio Faria de Oliveira
On 02/12/2017 07:49 PM, Anton Blanchard wrote: We see lpfc devices regularly fail during kexec. Fix this by adding a shutdown method which mirrors the remove method. Reviewed-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Tested-by: Mauricio Faria de Oliveira

Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-03-07 Thread Mauricio Faria de Oliveira
present/ask for consideration too. I think I should have included this in the tested-by tag email, for documentation/evidence: no regression observed in system shutdown path. Thanks, -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Mauricio Faria de Oliveira
; I missed checking the right tree. Thanks for the pointers. -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH v3 01/16] lpfc: Correct WQ creation for pagesize

2017-03-07 Thread Mauricio Faria de Oliveira
e if they see fit/required. [1] http://www.spinics.net/lists/linux-scsi/msg105886.html cheers, -- Mauricio Faria de Oliveira IBM Linux Technology Center

[PATCH] lpfc: fix double free of bound CQ/WQ ring pointer

2017-04-03 Thread Mauricio Faria de Oliveira
idx]->pring = pring; commit 85e8a23936ab ("scsi: lpfc: Add shutdown method for kexec") made this more likely as lpfc_pci_remove_one() is called on driver shutdown (e.g., modprobe -r / rmmod). (this patch is partially based on a different patch suggested by Johannes, thus adding a Suggested-by ta

Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown

2017-04-03 Thread Mauricio Faria de Oliveira
t sent ([PATCH] lpfc: fix double free of bound CQ/WQ ring pointer) resolves it? I don't have a setup to test it handy right now. cheers, -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH v2 1/4] scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states

2017-07-11 Thread Mauricio Faria de Oliveira
go through that function. (and it occurred to me that the state-change check of patch 3 can be done there, simpler.) cheers, -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH v2 1/4] scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states

2017-07-11 Thread Mauricio Faria de Oliveira
On 07/11/2017 12:32 PM, Mauricio Faria de Oliveira wrote: Also, it seems the Unavailable/Standby states would not be logged without a recheck from alua_check_sense(), since the only callers of alua_rtpg_queue() are alua_activate() and alua_check[_sense]() Well, actually it does get logged

[PATCH v2 4/4] scsi: scsi_dh_alua: add sdev_dbg() to track alua_rtpg_work()

2017-07-10 Thread Mauricio Faria de Oliveira
Insert sdev_dbg() calls in the function path which may queue alua_rtpg_work() past initialization, for debugging purposes: - alua_activate() - alua_check_sense() - alua_rtpg_queue() Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/device_h

[PATCH v2 0/4] scsi_dh_alua: fix stuck I/O after unavailable/standby states

2017-07-10 Thread Mauricio Faria de Oliveira
in unavailable/standby are not logged - only changes are. Patch 4 adds few sdev_dbg() calls to track the path to alua_rtpg_work() Tested on v4.12+ (commit b4b8cbf679c4). Mauricio Faria de Oliveira (4): scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states scsi

[PATCH v2 2/4] scsi: scsi_dh_alua: print changes to RTPG state of all PGs

2017-07-10 Thread Mauricio Faria de Oliveira
for the current PG. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- v2: - use lockdep_assert_held() instead of documenting locking conventions (Bart Van Assche <bart.vanass...@sandisk.com>) - define two functions (with/without supported states information)

[PATCH v2 3/4] scsi: scsi_dh_alua: do not print RTPG state if it remains unavailable/standby

2017-07-10 Thread Mauricio Faria de Oliveira
scheduled in alua_check_sense() to update PG state. So, do not print such message if unavailable/standby state remains (i.e., the PG did not transition to/from such states). All other cases continue to be printed. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
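
The rule stated in this message (stay silent only when the port group remains in unavailable or standby, print everything else) reduces to a small predicate. The sketch below is an illustration with hypothetical names (enum pg_state, is_quiet_state(), should_print_state()); it is not the scsi_dh_alua code.

```c
#include <stdbool.h>

/* Hypothetical, simplified set of port group states. */
enum pg_state {
    PG_OPTIMIZED,
    PG_NONOPTIMIZED,
    PG_STANDBY,
    PG_UNAVAILABLE,
};

/* States in which repeated rechecks would otherwise flood the log. */
static bool is_quiet_state(enum pg_state s)
{
    return s == PG_STANDBY || s == PG_UNAVAILABLE;
}

/*
 * Suppress the message only if the state did not change AND it is one
 * of the quiet states. Any transition, and any report for the other
 * states, is still printed.
 */
static bool should_print_state(enum pg_state old_state,
                               enum pg_state new_state)
{
    return !(old_state == new_state && is_quiet_state(new_state));
}
```

Note that a move from standby to unavailable (or back) counts as a transition in this sketch and is printed.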

[PATCH v2 1/4] scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states

2017-07-10 Thread Mauricio Faria de Oliveira
ated on path activation (alua_activate(), as it schedules a recheck), thus I/O requests are no longer failed. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Reported-by: Naresh Bannoth <nbann...@in.ibm.com> --- v2: - also add support for standby state

Re: [PATCH v2 0/4] scsi_dh_alua: fix stuck I/O after unavailable/standby states

2017-07-10 Thread Mauricio Faria de Oliveira
On 07/10/2017 07:47 PM, Mauricio Faria de Oliveira wrote: This patchset addresses that problem, and adds a few improvements to the logging of PG state changes. Here are some kernel log snippets with the patchset, if that helps. The 2 port groups temporarily gone into unavailable state

Re: [PATCH] test-case

2018-02-01 Thread Mauricio Faria de Oliveira
On 01/31/2018 08:50 PM, Bart Van Assche wrote: I think it would be useful to have some variant of the above code in the kernel tree. Are you familiar with the fault injection framework (see also and Documentation/fault-injection/fault-injection.txt)? No, not yet. That's very interesting.

Re: [PATCH] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-02-01 Thread Mauricio Faria de Oliveira
On 01/31/2018 08:59 PM, Bart Van Assche wrote: On Wed, 2018-01-31 at 17:48 -0200, Mauricio Faria de Oliveira wrote: On 01/31/2018 05:06 PM, Bart Van Assche wrote: Sorry but I think this patch introduces new race conditions. Have you Can you detail the race conditions? As far as I can see

Re: [PATCH] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-01-31 Thread Mauricio Faria de Oliveira
Bart, Thanks for reviewing. On 01/31/2018 05:06 PM, Bart Van Assche wrote: Sorry but I think this patch introduces new race conditions. Have you Can you detail the race conditions? As far as I can see, the only race condition would be when an error handler is invoked very close in time to

[PATCH] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-01-31 Thread Mauricio Faria de Oliveira
reset and target reset handlers do not cause oopses, but print a misleading message of host reset in progress, thus fix those too. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 28 1 file changed, 2

test results

2018-02-01 Thread Mauricio Faria de Oliveira
The test-case results with PATCH v2. scsih_abort() = Without patch: [ 362.669743] setting logging_level(0x1000) [ 362.705074] mpt3sas_cm0: skip free_smid/scsi_done scmd(c01fd4f2bd40) [ 363.956579] sd 16:0:1:0: [sdf] Synchronizing SCSI cache [ 363.956844]

[PATCH v2] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-02-01 Thread Mauricio Faria de Oliveira
elp, so still go for the changes. Also, this might help to prevent similar errors in the future, in case code changes and possibly tries to access freed stuff. Note the fix in scsih_host_reset() is still important anyway. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- v2:

[PATCH] test-case

2018-01-31 Thread Mauricio Faria de Oliveira
This patch can be verified with this simple test-case, which inserts a wait loop at the bottom of 'scsih_shutdown()' and forces SCSI commands to timeout (skip 'scmd->scsi_done()'). It abuses the 'ioc->logging_level' parameter to do that, with: - 0x1000: wait loop on scsih_shutdown() and skip

Re: [PATCH v2] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-02-14 Thread Mauricio Faria de Oliveira
s already used in many other points in the code, for the same reasons (exit early before the code attempts to use stuff that might be released). Thanks again, -- Mauricio Faria de Oliveira IBM Linux Technology Center

Re: [PATCH v2] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-02-15 Thread Mauricio Faria de Oliveira
Hi Sreekanth, On 02/15/2018 03:48 AM, Sreekanth Reddy wrote: During the shutdown time, I don't want the outstanding IOs to timeout due to disabling of interrupts and go the TM path. So I wanted to clear out all the Outstanding IOs in the shutdown path itself instead of clearing them in TM path.

[PATCH v2 1/2] scsi: mpt3sas: fix oops in error handlers after shutdown/unload

2018-02-16 Thread Mauricio Faria de Oliveira
for the changes. Also, this might help to prevent similar errors in the future, in case code changes and possibly tries to access freed stuff. Note the fix in scsih_host_reset() is still important anyway. Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> ---

[PATCH v3 0/2] scsi: mpt3sas: prevent oops in the shutdown/unload path

2018-02-16 Thread Mauricio Faria de Oliveira
. Mauricio Faria de Oliveira (2): scsi: mpt3sas: fix oops in error handlers after shutdown/unload scsi: mpt3sas: wait for and flush running commands on shutdown/unload drivers/scsi/mpt3sas/mpt3sas_base.c | 8 drivers/scsi/mpt3sas/mpt3sas_base.h | 3 +++ drivers/scsi/mpt3sas

[PATCH 2/2] scsi: mpt3sas: wait for and flush running commands on shutdown/unload

2018-02-16 Thread Mauricio Faria de Oliveira
.com> Tested-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> Signed-off-by: Sreekanth Reddy <sreekanth.re...@broadcom.com> [mauricfo: introduced something in commit message.] Signed-off-by: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com> --- drivers/

Re: [PATCH v3 0/2] scsi: mpt3sas: prevent oops in the shutdown/unload path

2018-03-05 Thread Mauricio Faria de Oliveira
Martin, James, On 02/22/2018 01:07 AM, Martin K. Petersen wrote: The first patch prevents the SCSI error handlers to run once the shutdown/unload path starts. This avoids an oops at least in the host reset handler, on kernels with a recent patch, and also in the abort handler on kernels