LOAN AVAILABLE @3 %
LOAN AVAILABLE @3 % Rely mohammedanis...@gmail.com
Re: [PATCH v4 00/11] Enhance libsas hotplug feature
On 2017/9/6 21:22, Christoph Hellwig wrote: On Wed, Sep 06, 2017 at 02:07:57PM +0100, John Garry wrote: Regardless of the fate of the rest of the patches in this series, I think patches 1,2,3,4,11/11 can be taken in isolation (subject to review, of course). It would save maintaining them out-of-tree. I did a quick review of those and they all look fine to me. I'll try to find some time to review the real changes in the next days. . Thank you very much and I'm looking forward to your suggestions of the real changes.
Re: [PATCH v3 1/5] scsi: ufs: add Hisilicon ufs driver code
On Tue, Aug 29, 2017 at 10:41 AM, Li Weiwrote: itel(host, UFS_ARESET, PERRSTDIS3_OFFSET); > + > + /* disable lp_reset_n */ > + ufs_sys_ctrl_set_bits(host, BIT_SYSCTRL_LP_RESET_N, RESET_CTRL_EN); > + mdelay(1); > + > + if (gpio_is_valid(host->reset_gpio)) > + gpio_direction_output(host->reset_gpio, 1); > + > + ufs_sys_ctrl_writel(host, MASK_UFS_DEVICE_RESET | > BIT_UFS_DEVICE_RESET, > + UFS_DEVICE_RESET_CTRL); > + > + mdelay(20); Could those mdelay() be turned into msleep() functions? > +static int ufs_hisi_get_resource(struct ufs_hisi_host *host) > +{ > + struct resource *mem_res; > + struct device_node *np = NULL; > + struct device *dev = host->hba->dev; > + struct platform_device *pdev = to_platform_device(dev); > + > + /* get resource of ufs sys ctrl */ > + mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 1); > + host->ufs_sys_ctrl = devm_ioremap_resource(dev, mem_res); > + if (IS_ERR(host->ufs_sys_ctrl)) > + return PTR_ERR(host->ufs_sys_ctrl); > + > + np = of_find_compatible_node(NULL, NULL, "hisilicon,hi3660-crgctrl"); It's generally not a good idea to look up one device by its "compatible" string. What is the "crgctrl"? Does it have a proper DT binding? Maybe there should be a driver for it, or you could make it a "syscon" device and look it up by phandle instead. > diff --git a/drivers/scsi/ufs/ufs-hisi.h b/drivers/scsi/ufs/ufs-hisi.h > new file mode 100644 > index ..52430a2aca90 > --- /dev/null > +++ b/drivers/scsi/ufs/ufs-hisi.h If the header is only used in one file, you don't need it, just move the definitions into the other file. Arnd
Re: [PATCH 1/2] bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
On Wed, Sep 06, 2017 at 08:07:43AM -0600, Jens Axboe wrote: > On 09/06/2017 07:44 AM, Christoph Hellwig wrote: > > From: Benjamin Block> > > > Since we split the scsi_request out of struct request bsg fails to > > provide a reply-buffer for the drivers. This was done via the pointer > > for sense-data, that is not preallocated anymore. > > Confused, this is already in master, was included before 4.13 was > finalized. > I think this is the backport for 4.11 and 4.12 as requested by Greg. I just send a mail as answer that I would do it, but I guess Christoph already had something in the pipe. Or? I'll take a look. Beste Grüße / Best regards, - Benjamin Block -- Linux on z Systems Development / IBM Systems & Technology Group IBM Deutschland Research & Development GmbH Vorsitz. AufsR.: Martina Koederitz /Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
Re: [PATCH 1/2] bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
On Wed, Sep 06, 2017 at 04:15:09PM +0200, Benjamin Block wrote: > On Wed, Sep 06, 2017 at 08:07:43AM -0600, Jens Axboe wrote: > > On 09/06/2017 07:44 AM, Christoph Hellwig wrote: > > > From: Benjamin Block> > > > > > Since we split the scsi_request out of struct request bsg fails to > > > provide a reply-buffer for the drivers. This was done via the pointer > > > for sense-data, that is not preallocated anymore. > > > > Confused, this is already in master, was included before 4.13 was > > finalized. > > > > I think this is the backport for 4.11 and 4.12 as requested by Greg. I > just send a mail as answer that I would do it, but I guess Christoph > already had something in the pipe. > > Or? I'll take a look. No, I sent the wrong patch - this was in my queue before the patches I wanted to send.
Re: [PATCH 1/2] bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
On 09/06/2017 07:44 AM, Christoph Hellwig wrote: > From: Benjamin Block> > Since we split the scsi_request out of struct request bsg fails to > provide a reply-buffer for the drivers. This was done via the pointer > for sense-data, that is not preallocated anymore. Confused, this is already in master, was included before 4.13 was finalized. -- Jens Axboe
two small bsg fixes V2
Two fixups for the recent bsg-lib fixes, should go into 4.13 stable as well.
[PATCH 1/2] bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
From: Benjamin BlockSince we split the scsi_request out of struct request bsg fails to provide a reply-buffer for the drivers. This was done via the pointer for sense-data, that is not preallocated anymore. Failing to allocate/assign it results in illegal dereferences because LLDs use this pointer unquestioned. An example panic on s390x, using the zFCP driver, looks like this (I had debugging on, otherwise NULL-pointer dereferences wouldn't even panic on s390x): Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6403 Fault in home space mode while using kernel ASCE. AS:01590007 R3:0024 Oops: 0038 ilc:2 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-bsg-regression+ #3 Hardware name: IBM 2964 N96 702 (z/VM 6.4.0) task: 65cb0100 task.stack: 65cb4000 Krnl PSW : 0704e0018000 03ff801e4156 (zfcp_fc_ct_els_job_handler+0x16/0x58 [zfcp]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0001 5fa9d0d0 5fa9d078 00e16866 03ff0290 6b6b6b6b6b6b6b6b 59f78f00 000f 593a0958 593a0958 60d88800 5ddd4c38 58b50100 0700659cba08 03ff801e8556 659cb9a8 Krnl Code: 03ff801e4146: e3102054lg %r1,80(%r2) 03ff801e414c: 58402040 l %r4,64(%r2) #03ff801e4150: e3502024 lg %r5,32(%r2) >03ff801e4156: 50405004 st %r4,4(%r5) 03ff801e415a: e54c5008 mvhi8(%r5),0 03ff801e4160: e33010280012 lt %r3,40(%r1) 03ff801e4166: a718fffb lhi %r1,-5 03ff801e416a: 1803 lr %r0,%r3 Call Trace: ([<03ff801e8556>] zfcp_fsf_req_complete+0x726/0x768 [zfcp]) [<03ff801ea82a>] zfcp_fsf_reqid_check+0x102/0x180 [zfcp] [<03ff801eb980>] zfcp_qdio_int_resp+0x230/0x278 [zfcp] [<009b91b6>] qdio_kick_handler+0x2ae/0x2c8 [<009b9e3e>] __tiqdio_inbound_processing+0x406/0xc10 [<001684c2>] tasklet_action+0x15a/0x1d8 [<00bd28ec>] __do_softirq+0x3ec/0x848 [<001675a4>] irq_exit+0x74/0xf8 [<0010dd6a>] do_IRQ+0xba/0xf0 [<00bd19e8>] io_int_handler+0x104/0x2d4 [<001033b6>] enabled_wait+0xb6/0x188 ([<0010339e>] enabled_wait+0x9e/0x188) [<0010396a>] arch_cpu_idle+0x32/0x50 [<00bd0112>] default_idle_call+0x52/0x68 [<001cd0fa>] do_idle+0x102/0x188 [<001cd41e>] cpu_startup_entry+0x3e/0x48 [<00118c64>] smp_start_secondary+0x11c/0x130 [<00bd2016>] restart_int_handler+0x62/0x78 [<>] (null) INFO: lockdep is turned off. Last Breaking-Event-Address: [<03ff801e41d6>] zfcp_fc_ct_job_handler+0x3e/0x48 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt This patch moves bsg-lib to allocate and setup struct bsg_job ahead of time, including the allocation of a buffer for the reply-data. This means, struct bsg_job is not allocated separately anymore, but as part of struct request allocation - similar to struct scsi_cmd. Reflect this in the function names that used to handle creation/destruction of struct bsg_job. Reported-by: Steffen Maier Suggested-by: Christoph Hellwig Reviewed-by: Christoph Hellwig Signed-off-by: Benjamin Block Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request") Cc: #4.11+ Signed-off-by: Jens Axboe --- block/bsg-lib.c | 74 + include/linux/blkdev.h | 1 - include/linux/bsg-lib.h | 2 ++ 3 files changed, 46 insertions(+), 31 deletions(-) diff --git a/block/bsg-lib.c b/block/bsg-lib.c index 4752dbc3dc49..c82408c7cc3c 100644 --- a/block/bsg-lib.c +++ b/block/bsg-lib.c @@ -29,26 +29,25 @@ #include /** - * bsg_destroy_job - routine to teardown/delete a bsg job + * bsg_teardown_job - routine to teardown a bsg job * @job: bsg_job that is to be torn down */ -static void bsg_destroy_job(struct kref *kref) +static void bsg_teardown_job(struct kref *kref) { struct bsg_job *job = container_of(kref, struct bsg_job, kref); struct request *rq = job->req; - blk_end_request_all(rq, BLK_STS_OK); - put_device(job->dev); /* release reference for the request */ kfree(job->request_payload.sg_list); kfree(job->reply_payload.sg_list); - kfree(job); + + blk_end_request_all(rq, BLK_STS_OK); } void bsg_job_put(struct bsg_job *job) { - kref_put(>kref, bsg_destroy_job); + kref_put(>kref, bsg_teardown_job); }
[PATCH 2/2] scsi_transport_fc: fix NULL pointer dereference in fc_bsg_job_timeout
bsg-lib now embeddeds the job structure into the request, and req->special can't be used anymore. Signed-off-by: Christoph HellwigCc: sta...@vger.kernel.org --- drivers/scsi/scsi_transport_fc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c index 3c6bc0081fcb..ba9d70f8a6a1 100644 --- a/drivers/scsi/scsi_transport_fc.c +++ b/drivers/scsi/scsi_transport_fc.c @@ -3571,7 +3571,7 @@ fc_vport_sched_delete(struct work_struct *work) static enum blk_eh_timer_return fc_bsg_job_timeout(struct request *req) { - struct bsg_job *job = (void *) req->special; + struct bsg_job *job = blk_mq_rq_to_pdu(req); struct Scsi_Host *shost = fc_bsg_to_shost(job); struct fc_rport *rport = fc_bsg_to_rport(job); struct fc_internal *i = to_fc_internal(shost->transportt); -- 2.11.0
Re: [PATCH v4 03/11] libsas: remove unused port_gone_completion and DISCE_PORT_GONE
Looks good, Reviewed-by: Christoph Hellwig
Re: [PATCH v4 00/11] Enhance libsas hotplug feature
On Wed, Sep 06, 2017 at 02:07:57PM +0100, John Garry wrote: > Regardless of the fate of the rest of the patches in this series, I think > patches 1,2,3,4,11/11 can be taken in isolation (subject to review, of > course). It would save maintaining them out-of-tree. I did a quick review of those and they all look fine to me. I'll try to find some time to review the real changes in the next days.
Re: [PATCH v4 04/11] libsas: rename notify_port_event() for consistency
Looks fine, Reviewed-by: Christoph Hellwig
Re: [PATCH v4 11/11] libsas: add event to defer list tail instead of head when draining
Looks fine, Reviewed-by: Christoph Hellwig
Re: [PATCH v4 02/11] libsas: remove the numbering for each event enum
Looks fine, Reviewed-by: Christoph Hellwig
Re: [PATCH v4 01/11] libsas: kill useless ha_event and do some cleanup
Looks good, Reviewed-by: Christoph Hellwig
Re: [PATCH v4 00/11] Enhance libsas hotplug feature
On 06/09/2017 10:15, Jason Yan wrote: Hello all, Yijing Wang handed over this topic to me. We are working on it the last two months. We have tested the patchset for a long time. Here is the new version. Now the libsas hotplug has some issues, Dan Williams report a similar bug here before https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg39187.html The issues we have found 1. if LLDD burst reports lots of phy-up/phy-down sas events, some events may lost because a same sas events is pending now, finally libsas topo may different the hardware. 2. receive a phy down sas event, libsas call sas_deform_port to remove devices, it would first delete the sas port, then put a destruction discovery event in a new work, and queue it at the tail of workqueue, once the sas port be deleted, its children device will be deleted too, when the destruction work start, it will found the target device has been removed, and report a sysfs warnning. 3. since a hotplug process will be divided into several works, if a phy up sas event insert into phydown works, like destruction work ---> PORTE_BYTES_DMAED (sas_form_port) >PHYE_LOSS_OF_SIGNAL the hot remove flow would broken by PORTE_BYTES_DMAED event, it's not we expected, and issues would occur. v3->v4: -get rid of unused ha event and do some cleanup -use dynamic alloced work and support shutting down the phy if active event reached the threshold -use flush_workqueue instead of wait-completion to process discover events synchronously -direct call probe and destruct function -other small code improvements v2->v3: some code improvements suggested by Johannes and John, split v2 patch 2 into several small patches. v1->v2: some code improvements suggested by John Garry Jason Yan (10): libsas: kill useless ha_event and do some cleanup libsas: remove the numbering for each event enum libsas: remove unused port_gone_completion and DISCE_PORT_GONE libsas: rename notify_port_event() for consistency libsas: Use dynamic alloced work to avoid sas event lost libsas: shut down the PHY if events reached the threshold libsas: make the event threshold configurable libsas: Use new workqueue to run sas event and disco event libsas: libsas: use flush_workqueue to process disco events synchronously libsas: direct call probe and destruct chenxiang (1): libsas: add event to defer list tail instead of head when draining Regardless of the fate of the rest of the patches in this series, I think patches 1,2,3,4,11/11 can be taken in isolation (subject to review, of course). It would save maintaining them out-of-tree. John drivers/scsi/aic94xx/aic94xx_hwi.c| 3 - drivers/scsi/hisi_sas/hisi_sas_main.c | 7 ++- drivers/scsi/libsas/sas_ata.c | 1 - drivers/scsi/libsas/sas_discover.c| 36 +++- drivers/scsi/libsas/sas_dump.c| 10 drivers/scsi/libsas/sas_dump.h| 1 - drivers/scsi/libsas/sas_event.c | 97 +++- drivers/scsi/libsas/sas_expander.c| 2 +- drivers/scsi/libsas/sas_init.c| 101 +- drivers/scsi/libsas/sas_internal.h| 7 +++ drivers/scsi/libsas/sas_phy.c | 73 drivers/scsi/libsas/sas_port.c| 25 + include/scsi/libsas.h | 81 --- include/scsi/scsi_transport_sas.h | 1 + 14 files changed, 270 insertions(+), 175 deletions(-)
Re: [PATCH v4 04/11] libsas: rename notify_port_event() for consistency
Reviewed-by: Johannes Thumshirn-- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH v4 03/11] libsas: remove unused port_gone_completion and DISCE_PORT_GONE
Reviewed-by: Johannes Thumshirn-- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH v4 02/11] libsas: remove the numbering for each event enum
I guess it boils down to personal preference, but Reviewed-by: Johannes Thumshirn-- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH v4 01/11] libsas: kill useless ha_event and do some cleanup
Looks good, Reviewed-by: Johannes Thumshirn-- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 1/2] scsi_transport_fc: fix NULL pointer dereference in fc_bsg_job_timeout
On Wed, Sep 06, 2017 at 06:59:39PM +0800, Ming Lei wrote: > On Wed, Sep 6, 2017 at 6:11 PM, Christoph Hellwigwrote: > > bsg-lib now embeddeds the job structure into the request, and req->special > > can't be used anymore. > > > > Signed-off-by: Christoph Hellwig > > Cc: sta...@vger.kernel.org > > --- > > drivers/scsi/scsi_transport_fc.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/scsi/scsi_transport_fc.c > > b/drivers/scsi/scsi_transport_fc.c > > index 3c6bc0081fcb..d8de46806a1e 100644 > > --- a/drivers/scsi/scsi_transport_fc.c > > +++ b/drivers/scsi/scsi_transport_fc.c > > @@ -3571,7 +3571,7 @@ fc_vport_sched_delete(struct work_struct *work) > > static enum blk_eh_timer_return > > fc_bsg_job_timeout(struct request *req) > > { > > - struct bsg_job *job = (void *) req->special; > > + struct bsg_job *job = blk_mq_rq_to_pdu(req->special); > > still req->special? Meh, sent out before the rebase finished - I'll fix it up.
Re: [PATCH 1/2] scsi_transport_fc: fix NULL pointer dereference in fc_bsg_job_timeout
On Wed, Sep 6, 2017 at 6:11 PM, Christoph Hellwigwrote: > bsg-lib now embeddeds the job structure into the request, and req->special > can't be used anymore. > > Signed-off-by: Christoph Hellwig > Cc: sta...@vger.kernel.org > --- > drivers/scsi/scsi_transport_fc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/scsi_transport_fc.c > b/drivers/scsi/scsi_transport_fc.c > index 3c6bc0081fcb..d8de46806a1e 100644 > --- a/drivers/scsi/scsi_transport_fc.c > +++ b/drivers/scsi/scsi_transport_fc.c > @@ -3571,7 +3571,7 @@ fc_vport_sched_delete(struct work_struct *work) > static enum blk_eh_timer_return > fc_bsg_job_timeout(struct request *req) > { > - struct bsg_job *job = (void *) req->special; > + struct bsg_job *job = blk_mq_rq_to_pdu(req->special); still req->special? -- Ming Lei
[PATCH 1/2] scsi_transport_fc: fix NULL pointer dereference in fc_bsg_job_timeout
bsg-lib now embeddeds the job structure into the request, and req->special can't be used anymore. Signed-off-by: Christoph HellwigCc: sta...@vger.kernel.org --- drivers/scsi/scsi_transport_fc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c index 3c6bc0081fcb..d8de46806a1e 100644 --- a/drivers/scsi/scsi_transport_fc.c +++ b/drivers/scsi/scsi_transport_fc.c @@ -3571,7 +3571,7 @@ fc_vport_sched_delete(struct work_struct *work) static enum blk_eh_timer_return fc_bsg_job_timeout(struct request *req) { - struct bsg_job *job = (void *) req->special; + struct bsg_job *job = blk_mq_rq_to_pdu(req->special); struct Scsi_Host *shost = fc_bsg_to_shost(job); struct fc_rport *rport = fc_bsg_to_rport(job); struct fc_internal *i = to_fc_internal(shost->transportt); -- 2.11.0
two small bsg fixes
Two fixups for the recent bsg-lib fixups, should go into 4.13 stable as well.
[PATCH 2/2] bsg-lib: don't free job in bsg_prepare_job
The job structure is allocated as part of the request, so we should not free it in the error path of bsg_prepare_job. Signed-off-by: Christoph HellwigCc: sta...@vger.kernel.org --- block/bsg-lib.c | 1 - 1 file changed, 1 deletion(-) diff --git a/block/bsg-lib.c b/block/bsg-lib.c index c82408c7cc3c..dbddff8174e5 100644 --- a/block/bsg-lib.c +++ b/block/bsg-lib.c @@ -154,7 +154,6 @@ static int bsg_prepare_job(struct device *dev, struct request *req) failjob_rls_rqst_payload: kfree(job->request_payload.sg_list); failjob_rls_job: - kfree(job); return -ENOMEM; } -- 2.11.0
Re: [PATCH RESEND] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, Sep 06, 2017 at 11:54:15AM +0200, Stefano Brivio wrote: > Thanks for your feedback! > > I considered doing something similar, but there are different error > coded which are set when we reach the label out_free_mbsx. I checked all > of them (and I hope I didn't miss any), but they all looked correct, > and in quite a few cases different than -EIO (e.g. -ENODEV). > > So I think always returning -EIO in those cases is not what we want. We still could pre-assign the rc value: rc = -EIO; rc = foo() if (rc) goto err_handler; rc = bar() if (rc) goto err_handler; rc = -ENODEV; if (rc) goto somewhere_else; But let's not complicate things and get this one queued up. > > Because as this patch shows there's always a chance to miss an 'rc = -EIO'. > > > > Out of curiosity, do you know what's the value of rc in the failure case? > > Yes, MBXERR_ERROR (mentioned in patch subject -- sorry, I could have > repeated it in the message perhaps). Ah ok I somehow missed it, sorry. Byte, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
[PATCH] scsi: ufs: fix wrong command type of UTRD for UFSHCI v2.1
Since the command type of UTRD in UFS 2.1 specification is the same with UFS 2.0. And it assumes the future UFS specification will follow the same definition. Signed-off-by: kehuanlin--- drivers/scsi/ufs/ufshcd.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 5bc9dc1..c33a2f8 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2195,10 +2195,11 @@ static int ufshcd_comp_devman_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) u32 upiu_flags; int ret = 0; - if (hba->ufs_version == UFSHCI_VERSION_20) - lrbp->command_type = UTP_CMD_TYPE_UFS_STORAGE; - else + if ((hba->ufs_version == UFSHCI_VERSION_10) || + (hba->ufs_version == UFSHCI_VERSION_11)) lrbp->command_type = UTP_CMD_TYPE_DEV_MANAGE; + else + lrbp->command_type = UTP_CMD_TYPE_UFS_STORAGE; ufshcd_prepare_req_desc_hdr(lrbp, _flags, DMA_NONE); if (hba->dev_cmd.type == DEV_CMD_TYPE_QUERY) @@ -,10 +2223,11 @@ static int ufshcd_comp_scsi_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) u32 upiu_flags; int ret = 0; - if (hba->ufs_version == UFSHCI_VERSION_20) - lrbp->command_type = UTP_CMD_TYPE_UFS_STORAGE; - else + if ((hba->ufs_version == UFSHCI_VERSION_10) || + (hba->ufs_version == UFSHCI_VERSION_11)) lrbp->command_type = UTP_CMD_TYPE_SCSI; + else + lrbp->command_type = UTP_CMD_TYPE_UFS_STORAGE; if (likely(lrbp->cmd)) { ufshcd_prepare_req_desc_hdr(lrbp, _flags, -- 2.7.4
Re: [PATCH RESEND] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, 6 Sep 2017 11:30:34 +0200 Johannes Thumshirnwrote: > On Wed, Sep 06, 2017 at 11:02:56AM +0200, Stefano Brivio wrote: > > Internal error codes happen to be positive, thus the PCI driver > > core won't treat them as failure, but we do. This would cause a > > crash later on as lpfc_pci_remove_one() is called (e.g. as > > shutdown function). > > > > Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") > > Signed-off-by: Stefano Brivio > > --- > > This seems to have been ignored. Re-sending as suggested by Johannes. > > > > drivers/scsi/lpfc/lpfc_init.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c > > index 491aa95eb0f6..38cc2b5bb5a2 100644 > > --- a/drivers/scsi/lpfc/lpfc_init.c > > +++ b/drivers/scsi/lpfc/lpfc_init.c > > @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) > > "Extents and RPI headers enabled.\n"); > > } > > mempool_free(mboxq, phba->mbox_mem_pool); > > + rc = -EIO; > > goto out_free_bsmbx; > > } > > > > -- > > 2.9.4 > > The patch looks good, but there are lots of > if (rc) { > mempool_free(mboxq, phba->mbox_mem_pool); > rc = -EIO; > goto out_free_bsmbx; > } > > in lpfc_sli4_driver_resource_setup(). Shouldn't out_free_bsmbx take care of it > all so we only have: > if (rc) > goto out_free_bsmbx; Thanks for your feedback! I considered doing something similar, but there are different error coded which are set when we reach the label out_free_mbsx. I checked all of them (and I hope I didn't miss any), but they all looked correct, and in quite a few cases different than -EIO (e.g. -ENODEV). So I think always returning -EIO in those cases is not what we want. > Because as this patch shows there's always a chance to miss an 'rc = -EIO'. > > Out of curiosity, do you know what's the value of rc in the failure case? Yes, MBXERR_ERROR (mentioned in patch subject -- sorry, I could have repeated it in the message perhaps). -- Stefano
Re: [PATCH RESEND] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, Sep 06, 2017 at 11:02:56AM +0200, Stefano Brivio wrote: > Internal error codes happen to be positive, thus the PCI driver > core won't treat them as failure, but we do. This would cause a > crash later on as lpfc_pci_remove_one() is called (e.g. as > shutdown function). > > Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") > Signed-off-by: Stefano Brivio> --- > This seems to have been ignored. Re-sending as suggested by Johannes. > > drivers/scsi/lpfc/lpfc_init.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c > index 491aa95eb0f6..38cc2b5bb5a2 100644 > --- a/drivers/scsi/lpfc/lpfc_init.c > +++ b/drivers/scsi/lpfc/lpfc_init.c > @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) > "Extents and RPI headers enabled.\n"); > } > mempool_free(mboxq, phba->mbox_mem_pool); > + rc = -EIO; > goto out_free_bsmbx; > } > > -- > 2.9.4 The patch looks good, but there are lots of if (rc) { mempool_free(mboxq, phba->mbox_mem_pool); rc = -EIO; goto out_free_bsmbx; } in lpfc_sli4_driver_resource_setup(). Shouldn't out_free_bsmbx take care of it all so we only have: if (rc) goto out_free_bsmbx; Because as this patch shows there's always a chance to miss an 'rc = -EIO'. Out of curiosity, do you know what's the value of rc in the failure case? Anyways: Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
[PATCH v4 01/11] libsas: kill useless ha_event and do some cleanup
The ha_event now has only one event HAE_RESET, and this event does nothing. Kill it and do some cleanup. This is a preparation for enhance libsas hotplug feature in the next patches. Signed-off-by: Jason YanSigned-off-by: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- drivers/scsi/aic94xx/aic94xx_hwi.c| 3 --- drivers/scsi/hisi_sas/hisi_sas_main.c | 1 - drivers/scsi/libsas/sas_dump.c| 10 -- drivers/scsi/libsas/sas_dump.h| 1 - drivers/scsi/libsas/sas_event.c | 20 drivers/scsi/libsas/sas_init.c| 12 include/scsi/libsas.h | 21 - 7 files changed, 68 deletions(-) diff --git a/drivers/scsi/aic94xx/aic94xx_hwi.c b/drivers/scsi/aic94xx/aic94xx_hwi.c index f2671a8..ec09438 100644 --- a/drivers/scsi/aic94xx/aic94xx_hwi.c +++ b/drivers/scsi/aic94xx/aic94xx_hwi.c @@ -721,11 +721,8 @@ int asd_init_hw(struct asd_ha_struct *asd_ha) */ static void asd_chip_reset(struct asd_ha_struct *asd_ha) { - struct sas_ha_struct *sas_ha = _ha->sas_ha; - ASD_DPRINTK("chip reset for %s\n", pci_name(asd_ha->pcidev)); asd_chip_hardrst(asd_ha); - sas_ha->notify_ha_event(sas_ha, HAE_RESET); } /* -- Done List Routines -- */ diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 4022c3f..5e47abb 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -990,7 +990,6 @@ static int hisi_sas_controller_reset(struct hisi_hba *hisi_hba) hisi_sas_release_tasks(hisi_hba); spin_unlock_irqrestore(_hba->lock, flags); - sas_ha->notify_ha_event(sas_ha, HAE_RESET); dev_dbg(dev, "controller reset successful!\n"); } else return -1; diff --git a/drivers/scsi/libsas/sas_dump.c b/drivers/scsi/libsas/sas_dump.c index cd6f99c..7e5d262 100644 --- a/drivers/scsi/libsas/sas_dump.c +++ b/drivers/scsi/libsas/sas_dump.c @@ -24,10 +24,6 @@ #include "sas_dump.h" -static const char *sas_hae_str[] = { - [0] = "HAE_RESET", -}; - static const char *sas_porte_str[] = { [0] = "PORTE_BYTES_DMAED", [1] = "PORTE_BROADCAST_RCVD", @@ -53,12 +49,6 @@ void sas_dprint_phye(int phyid, enum phy_event pe) SAS_DPRINTK("phy%d: phy event: %s\n", phyid, sas_phye_str[pe]); } -void sas_dprint_hae(struct sas_ha_struct *sas_ha, enum ha_event he) -{ - SAS_DPRINTK("ha %s: %s event\n", dev_name(sas_ha->dev), - sas_hae_str[he]); -} - void sas_dump_port(struct asd_sas_port *port) { SAS_DPRINTK("port%d: class:0x%x\n", port->id, port->class); diff --git a/drivers/scsi/libsas/sas_dump.h b/drivers/scsi/libsas/sas_dump.h index 800e4c6..6aaee6b 100644 --- a/drivers/scsi/libsas/sas_dump.h +++ b/drivers/scsi/libsas/sas_dump.h @@ -26,5 +26,4 @@ void sas_dprint_porte(int phyid, enum port_event pe); void sas_dprint_phye(int phyid, enum phy_event pe); -void sas_dprint_hae(struct sas_ha_struct *sas_ha, enum ha_event he); void sas_dump_port(struct asd_sas_port *port); diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c index c0d0d97..70c4653 100644 --- a/drivers/scsi/libsas/sas_event.c +++ b/drivers/scsi/libsas/sas_event.c @@ -124,14 +124,6 @@ void sas_enable_revalidation(struct sas_ha_struct *ha) mutex_unlock(>disco_mutex); } -static int notify_ha_event(struct sas_ha_struct *sas_ha, enum ha_event event) -{ - BUG_ON(event >= HA_NUM_EVENTS); - - return sas_queue_event(event, _ha->pending, - _ha->ha_events[event].work, sas_ha); -} - static int notify_port_event(struct asd_sas_phy *phy, enum port_event event) { struct sas_ha_struct *ha = phy->ha; @@ -154,18 +146,6 @@ int sas_notify_phy_event(struct asd_sas_phy *phy, enum phy_event event) int sas_init_events(struct sas_ha_struct *sas_ha) { - static const work_func_t sas_ha_event_fns[HA_NUM_EVENTS] = { - [HAE_RESET] = sas_hae_reset, - }; - - int i; - - for (i = 0; i < HA_NUM_EVENTS; i++) { - INIT_SAS_WORK(_ha->ha_events[i].work, sas_ha_event_fns[i]); - sas_ha->ha_events[i].ha = sas_ha; - } - - sas_ha->notify_ha_event = notify_ha_event; sas_ha->notify_port_event = notify_port_event; sas_ha->notify_phy_event = sas_notify_phy_event; diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c index 64e9cdd..d3f5b57 100644 --- a/drivers/scsi/libsas/sas_init.c +++ b/drivers/scsi/libsas/sas_init.c @@ -106,17 +106,6 @@ void sas_hash_addr(u8 *hashed, const u8 *sas_addr) hashed[2] = r & 0xFF; } - -/* -- HA events -- */ -
[PATCH v4 09/11] libsas: libsas: use flush_workqueue to process disco events synchronously
Use flush_workqueue to insure the disco and revalidate events processed synchronously. Signed-off-by: Jason YanCC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- drivers/scsi/libsas/sas_port.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c index 9326628..64722f4 100644 --- a/drivers/scsi/libsas/sas_port.c +++ b/drivers/scsi/libsas/sas_port.c @@ -192,6 +192,7 @@ static void sas_form_port(struct asd_sas_phy *phy) si->dft->lldd_port_formed(phy); sas_discover_event(phy->port, DISCE_DISCOVER_DOMAIN); + flush_workqueue(sas_ha->disco_q); } /** @@ -277,6 +278,9 @@ void sas_porte_broadcast_rcvd(struct work_struct *work) SAS_DPRINTK("broadcast received: %d\n", prim); sas_discover_event(phy->port, DISCE_REVALIDATE_DOMAIN); + + if (phy->port) + flush_workqueue(phy->port->ha->disco_q); } void sas_porte_link_reset_err(struct work_struct *work) -- 2.5.0
[PATCH v4 03/11] libsas: remove unused port_gone_completion and DISCE_PORT_GONE
No one uses the port_gone_completion in struct asd_sas_port and DISCE_PORT_GONE in enum disover_event, clean them out. Signed-off-by: Jason YanCC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- include/scsi/libsas.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h index ccf3b48..99f32b5 100644 --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -81,7 +81,6 @@ enum phy_event { enum discover_event { DISCE_DISCOVER_DOMAIN = 0U, DISCE_REVALIDATE_DOMAIN, - DISCE_PORT_GONE, DISCE_PROBE, DISCE_SUSPEND, DISCE_RESUME, @@ -256,8 +255,6 @@ struct sas_discovery { /* The port struct is Class:RW, driver:RO */ struct asd_sas_port { /* private: */ - struct completion port_gone_completion; - struct sas_discovery disc; struct domain_device *port_dev; spinlock_t dev_list_lock; -- 2.5.0
[PATCH v4 07/11] libsas: make the event threshold configurable
Add a sysfs attr that LLDD can configure it for every host. We made a example in hisi_sas. Other LLDDs using libsas can implement it if they want. Suggested-by: Hannes ReineckeSigned-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- drivers/scsi/hisi_sas/hisi_sas_main.c | 6 ++ drivers/scsi/libsas/sas_init.c| 27 +++ include/scsi/libsas.h | 1 + 3 files changed, 34 insertions(+) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 5e47abb..9eceed1 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -1470,6 +1470,11 @@ EXPORT_SYMBOL_GPL(hisi_sas_rescan_topology); struct scsi_transport_template *hisi_sas_stt; EXPORT_SYMBOL_GPL(hisi_sas_stt); +struct device_attribute *host_attrs[] = { +_attr_phy_event_threshold, +NULL, +}; + static struct scsi_host_template _hisi_sas_sht = { .module = THIS_MODULE, .name = DRV_NAME, @@ -1489,6 +1494,7 @@ static struct scsi_host_template _hisi_sas_sht = { .eh_bus_reset_handler = sas_eh_bus_reset_handler, .target_destroy = sas_target_destroy, .ioctl = sas_ioctl, + .shost_attrs= host_attrs, }; struct scsi_host_template *hisi_sas_sht = &_hisi_sas_sht; EXPORT_SYMBOL_GPL(hisi_sas_sht); diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c index b1e03d5..e2d98a8 100644 --- a/drivers/scsi/libsas/sas_init.c +++ b/drivers/scsi/libsas/sas_init.c @@ -537,6 +537,33 @@ static struct sas_function_template sft = { .smp_handler = sas_smp_handler, }; +static inline ssize_t phy_event_threshold_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct Scsi_Host *shost = class_to_shost(dev); + struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost); + + return scnprintf(buf, PAGE_SIZE, "%u\n", sha->event_thres); +} + +static inline ssize_t phy_event_threshold_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct Scsi_Host *shost = class_to_shost(dev); + struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost); + + sha->event_thres = simple_strtol(buf, NULL, 10); + + return count; +} + +DEVICE_ATTR(phy_event_threshold, + S_IRUGO|S_IWUSR, + phy_event_threshold_show, + phy_event_threshold_store); +EXPORT_SYMBOL_GPL(dev_attr_phy_event_threshold); + struct scsi_transport_template * sas_domain_attach_transport(struct sas_domain_function_template *dft) { diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h index 2fa0b13..08c1c32 100644 --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -679,6 +679,7 @@ extern int sas_bios_param(struct scsi_device *, sector_t capacity, int *hsc); extern struct scsi_transport_template * sas_domain_attach_transport(struct sas_domain_function_template *); +extern struct device_attribute dev_attr_phy_event_threshold; int sas_discover_root_expander(struct domain_device *); -- 2.5.0
[PATCH v4 06/11] libsas: shut down the PHY if events reached the threshold
If the PHY burst too many events, we will alloc a lot of events for the worker. This may leads to memory exhaustion. Dan Williams suggested to shut down the PHY if the events reached the threshold, because in this case the PHY may have gone into some erroneous state. Users can re-enable the PHY by sysfs if they want. We cannot use the fixed memory pool because if we run out of events, the shut down event and loss of signal event will lost too. The events still need to be allocated and processed in this case. Suggested-by: Dan WilliamsSigned-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl --- drivers/scsi/libsas/sas_init.c | 21 - drivers/scsi/libsas/sas_phy.c | 31 ++- include/scsi/libsas.h | 6 ++ 3 files changed, 56 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c index 85c278a..b1e03d5 100644 --- a/drivers/scsi/libsas/sas_init.c +++ b/drivers/scsi/libsas/sas_init.c @@ -122,6 +122,8 @@ int sas_register_ha(struct sas_ha_struct *sas_ha) INIT_LIST_HEAD(_ha->defer_q); INIT_LIST_HEAD(_ha->eh_dev_q); + sas_ha->event_thres = SAS_PHY_SHUTDOWN_THRES; + error = sas_register_phys(sas_ha); if (error) { printk(KERN_NOTICE "couldn't register sas phys:%d\n", error); @@ -556,14 +558,31 @@ EXPORT_SYMBOL_GPL(sas_domain_attach_transport); struct asd_sas_event *sas_alloc_event(struct asd_sas_phy *phy) { + struct asd_sas_event *event; gfp_t flags = in_interrupt() ? GFP_ATOMIC : GFP_KERNEL; - return kmem_cache_zalloc(sas_event_cache, flags); + event = kmem_cache_zalloc(sas_event_cache, flags); + if (!event) + return NULL; + + atomic_inc(>event_nr); + if (atomic_read(>event_nr) > phy->ha->event_thres && + !phy->in_shutdown) { + phy->in_shutdown = 1; + sas_printk("The phy%02d bursting events, shut it down.\n", + phy->id); + sas_notify_phy_event(phy, PHYE_SHUTDOWN); + } + + return event; } void sas_free_event(struct asd_sas_event *event) { + struct asd_sas_phy *phy = event->phy; + kmem_cache_free(sas_event_cache, event); + atomic_dec(>event_nr); } /* -- SAS Class register/unregister -- */ diff --git a/drivers/scsi/libsas/sas_phy.c b/drivers/scsi/libsas/sas_phy.c index 59f8292..3df1eec 100644 --- a/drivers/scsi/libsas/sas_phy.c +++ b/drivers/scsi/libsas/sas_phy.c @@ -35,6 +35,7 @@ static void sas_phye_loss_of_signal(struct work_struct *work) struct asd_sas_event *ev = to_asd_sas_event(work); struct asd_sas_phy *phy = ev->phy; + phy->in_shutdown = 0; phy->error = 0; sas_deform_port(phy, 1); } @@ -44,6 +45,7 @@ static void sas_phye_oob_done(struct work_struct *work) struct asd_sas_event *ev = to_asd_sas_event(work); struct asd_sas_phy *phy = ev->phy; + phy->in_shutdown = 0; phy->error = 0; } @@ -105,6 +107,32 @@ static void sas_phye_resume_timeout(struct work_struct *work) } +static void sas_phye_shutdown(struct work_struct *work) +{ + struct asd_sas_event *ev = to_asd_sas_event(work); + struct asd_sas_phy *phy = ev->phy; + struct sas_ha_struct *sas_ha = phy->ha; + struct sas_internal *i = + to_sas_internal(sas_ha->core.shost->transportt); + + if (phy->enabled && i->dft->lldd_control_phy) { + int ret; + + phy->error = 0; + phy->enabled = 0; + ret = i->dft->lldd_control_phy(phy, PHY_FUNC_DISABLE, NULL); + if (ret) + sas_printk("lldd disable phy%02d returned %d\n", + phy->id, ret); + + } else if (!i->dft->lldd_control_phy) + sas_printk("lldd does not support phy%02d control\n", phy->id); + else + sas_printk("phy%02d is not enabled, cannot shutdown\n", + phy->id); + +} + /* -- Phy class registration -- */ int sas_register_phys(struct sas_ha_struct *sas_ha) @@ -116,6 +144,7 @@ int sas_register_phys(struct sas_ha_struct *sas_ha) struct asd_sas_phy *phy = sas_ha->sas_phy[i]; phy->error = 0; + atomic_set(>event_nr, 0); INIT_LIST_HEAD(>port_phy_el); phy->port = NULL; @@ -151,5 +180,5 @@ const work_func_t sas_phy_event_fns[PHY_NUM_EVENTS] = { [PHYE_OOB_ERROR] = sas_phye_oob_error, [PHYE_SPINUP_HOLD] = sas_phye_spinup_hold, [PHYE_RESUME_TIMEOUT] = sas_phye_resume_timeout, - + [PHYE_SHUTDOWN] = sas_phye_shutdown, }; diff --git
[PATCH v4 00/11] Enhance libsas hotplug feature
Hello all, Yijing Wang handed over this topic to me. We are working on it the last two months. We have tested the patchset for a long time. Here is the new version. Now the libsas hotplug has some issues, Dan Williams report a similar bug here before https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg39187.html The issues we have found 1. if LLDD burst reports lots of phy-up/phy-down sas events, some events may lost because a same sas events is pending now, finally libsas topo may different the hardware. 2. receive a phy down sas event, libsas call sas_deform_port to remove devices, it would first delete the sas port, then put a destruction discovery event in a new work, and queue it at the tail of workqueue, once the sas port be deleted, its children device will be deleted too, when the destruction work start, it will found the target device has been removed, and report a sysfs warnning. 3. since a hotplug process will be divided into several works, if a phy up sas event insert into phydown works, like destruction work ---> PORTE_BYTES_DMAED (sas_form_port) >PHYE_LOSS_OF_SIGNAL the hot remove flow would broken by PORTE_BYTES_DMAED event, it's not we expected, and issues would occur. v3->v4: -get rid of unused ha event and do some cleanup -use dynamic alloced work and support shutting down the phy if active event reached the threshold -use flush_workqueue instead of wait-completion to process discover events synchronously -direct call probe and destruct function -other small code improvements v2->v3: some code improvements suggested by Johannes and John, split v2 patch 2 into several small patches. v1->v2: some code improvements suggested by John Garry Jason Yan (10): libsas: kill useless ha_event and do some cleanup libsas: remove the numbering for each event enum libsas: remove unused port_gone_completion and DISCE_PORT_GONE libsas: rename notify_port_event() for consistency libsas: Use dynamic alloced work to avoid sas event lost libsas: shut down the PHY if events reached the threshold libsas: make the event threshold configurable libsas: Use new workqueue to run sas event and disco event libsas: libsas: use flush_workqueue to process disco events synchronously libsas: direct call probe and destruct chenxiang (1): libsas: add event to defer list tail instead of head when draining drivers/scsi/aic94xx/aic94xx_hwi.c| 3 - drivers/scsi/hisi_sas/hisi_sas_main.c | 7 ++- drivers/scsi/libsas/sas_ata.c | 1 - drivers/scsi/libsas/sas_discover.c| 36 +++- drivers/scsi/libsas/sas_dump.c| 10 drivers/scsi/libsas/sas_dump.h| 1 - drivers/scsi/libsas/sas_event.c | 97 +++- drivers/scsi/libsas/sas_expander.c| 2 +- drivers/scsi/libsas/sas_init.c| 101 +- drivers/scsi/libsas/sas_internal.h| 7 +++ drivers/scsi/libsas/sas_phy.c | 73 drivers/scsi/libsas/sas_port.c| 25 + include/scsi/libsas.h | 81 --- include/scsi/scsi_transport_sas.h | 1 + 14 files changed, 270 insertions(+), 175 deletions(-) -- 2.5.0
[PATCH v4 10/11] libsas: direct call probe and destruct
In commit 87c8331f ([SCSI] libsas: prevent domain rediscovery competing with ata error handling) introduced disco mutex to prevent rediscovery competing with ata error handling and put the whole revalidation in the mutex. But the rphy add/remove needs to wait for the error handling which also grabs the disco mutex. This may leads to dead lock.So the probe and destruct event were introduce to do the rphy add/remove asynchronously and out of the lock. The asynchronously processed workers makes the whole discovery process not atomic, the other events may interrupt the process. For example, if a loss of signal event inserted before the probe event, the sas_deform_port() is called and the port will be deleted. And sas_port_delete() may run before the destruct event, but the port-x:x is the top parent of end device or expander. This leads to a kernel WARNING such as: [ 82.042979] sysfs group 'power' not found for kobject 'phy-1:0:22' [ 82.042983] [ cut here ] [ 82.042986] WARNING: CPU: 54 PID: 1714 at fs/sysfs/group.c:237 sysfs_remove_group+0x94/0xa0 [ 82.043059] Call trace: [ 82.043082] [] sysfs_remove_group+0x94/0xa0 [ 82.043085] [] dpm_sysfs_remove+0x60/0x70 [ 82.043086] [] device_del+0x138/0x308 [ 82.043089] [] sas_phy_delete+0x38/0x60 [ 82.043091] [] do_sas_phy_delete+0x6c/0x80 [ 82.043093] [] device_for_each_child+0x58/0xa0 [ 82.043095] [] sas_remove_children+0x40/0x50 [ 82.043100] [] sas_destruct_devices+0x64/0xa0 [ 82.043102] [] process_one_work+0x1fc/0x4b0 [ 82.043104] [] worker_thread+0x50/0x490 [ 82.043105] [] kthread+0xfc/0x128 [ 82.043107] [] ret_from_fork+0x10/0x50 Make probe and destruct a direct call in the disco and revalidate function, but put them outside the lock. The whole discovery or revalidate won't be interrupted by other events. And the DISCE_PROBE and DISCE_DESTRUCT event are deleted as a result of the direct call. Introduce a new list to destruct the sas_port and put the port delete after the destruct. This makes sure the right order of destroying the sysfs kobject and fix the warning above. Signed-off-by: Jason YanCC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- drivers/scsi/libsas/sas_ata.c | 1 - drivers/scsi/libsas/sas_discover.c | 34 -- drivers/scsi/libsas/sas_expander.c | 2 +- drivers/scsi/libsas/sas_internal.h | 1 + drivers/scsi/libsas/sas_port.c | 3 +++ include/scsi/libsas.h | 3 +-- include/scsi/scsi_transport_sas.h | 1 + 7 files changed, 27 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c index 87f5e694..dbe8c5e 100644 --- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -729,7 +729,6 @@ int sas_discover_sata(struct domain_device *dev) if (res) return res; - sas_discover_event(dev->port, DISCE_PROBE); return 0; } diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c index 14f714d..d5f5b58 100644 --- a/drivers/scsi/libsas/sas_discover.c +++ b/drivers/scsi/libsas/sas_discover.c @@ -212,13 +212,9 @@ void sas_notify_lldd_dev_gone(struct domain_device *dev) } } -static void sas_probe_devices(struct work_struct *work) +static void sas_probe_devices(struct asd_sas_port *port) { struct domain_device *dev, *n; - struct sas_discovery_event *ev = to_sas_discovery_event(work); - struct asd_sas_port *port = ev->port; - - clear_bit(DISCE_PROBE, >disc.pending); /* devices must be domain members before link recovery and probe */ list_for_each_entry(dev, >disco_list, disco_list_node) { @@ -294,7 +290,6 @@ int sas_discover_end_dev(struct domain_device *dev) res = sas_notify_lldd_dev_found(dev); if (res) return res; - sas_discover_event(dev->port, DISCE_PROBE); return 0; } @@ -353,13 +348,9 @@ static void sas_unregister_common_dev(struct asd_sas_port *port, struct domain_d sas_put_device(dev); } -static void sas_destruct_devices(struct work_struct *work) +void sas_destruct_devices(struct asd_sas_port *port) { struct domain_device *dev, *n; - struct sas_discovery_event *ev = to_sas_discovery_event(work); - struct asd_sas_port *port = ev->port; - - clear_bit(DISCE_DESTRUCT, >disc.pending); list_for_each_entry_safe(dev, n, >destroy_list, disco_list_node) { list_del_init(>disco_list_node); @@ -370,6 +361,16 @@ static void sas_destruct_devices(struct work_struct *work) } } +void sas_destruct_ports(struct asd_sas_port *port) +{ + struct sas_port *sas_port, *p; + + list_for_each_entry_safe(sas_port, p, >sas_port_del_list,
[PATCH v4 08/11] libsas: Use new workqueue to run sas event and disco event
Now all libsas works are queued to scsi host workqueue, include sas event work post by LLDD and sas discovery work, and a sas hotplug flow may be divided into several works, e.g libsas receive a PORTE_BYTES_DMAED event, currently we process it as following steps: sas_form_port --- run in work in shost workq sas_discover_domain --- run in another work in shost workq ... sas_probe_devices --- run in new work in shost workq We found during hot-add a device, libsas may need run several works in same workqueue to add device in system, the process is not atomic, it may interrupt by other sas event works, like PHYE_LOSS_OF_SIGNAL. This patch is preparation of execute libsas sas event in sync. We need to use different workqueue to run sas event and disco event. Otherwise the work will be blocked for waiting another chained work in the same workqueue. Signed-off-by: Yijing WangCC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams Signed-off-by: Jason Yan --- drivers/scsi/libsas/sas_discover.c | 2 +- drivers/scsi/libsas/sas_event.c| 4 ++-- drivers/scsi/libsas/sas_init.c | 18 ++ include/scsi/libsas.h | 3 +++ 4 files changed, 24 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c index 60de662..14f714d 100644 --- a/drivers/scsi/libsas/sas_discover.c +++ b/drivers/scsi/libsas/sas_discover.c @@ -534,7 +534,7 @@ static void sas_chain_work(struct sas_ha_struct *ha, struct sas_work *sw) * workqueue, or known to be submitted from a context that is * not racing against draining */ - scsi_queue_work(ha->core.shost, >work); + queue_work(ha->disco_q, >work); } static void sas_chain_event(int event, unsigned long *pending, diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c index 35412d9..c120657 100644 --- a/drivers/scsi/libsas/sas_event.c +++ b/drivers/scsi/libsas/sas_event.c @@ -40,7 +40,7 @@ int sas_queue_work(struct sas_ha_struct *ha, struct sas_work *sw) if (list_empty(>drain_node)) list_add(>drain_node, >defer_q); } else - rc = scsi_queue_work(ha->core.shost, >work); + rc = queue_work(ha->event_q, >work); return rc; } @@ -61,7 +61,7 @@ static int sas_queue_event(int event, struct sas_work *work, void __sas_drain_work(struct sas_ha_struct *ha) { - struct workqueue_struct *wq = ha->core.shost->work_q; + struct workqueue_struct *wq = ha->event_q; struct sas_work *sw, *_sw; int ret; diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c index e2d98a8..b49c46f 100644 --- a/drivers/scsi/libsas/sas_init.c +++ b/drivers/scsi/libsas/sas_init.c @@ -109,6 +109,7 @@ void sas_hash_addr(u8 *hashed, const u8 *sas_addr) int sas_register_ha(struct sas_ha_struct *sas_ha) { + char name[64]; int error = 0; mutex_init(_ha->disco_mutex); @@ -142,10 +143,24 @@ int sas_register_ha(struct sas_ha_struct *sas_ha) goto Undo_ports; } + error = -ENOMEM; + snprintf(name, sizeof(name), "%s_event_q", dev_name(sas_ha->dev)); + sas_ha->event_q = create_singlethread_workqueue(name); + if (!sas_ha->event_q) + goto Undo_ports; + + snprintf(name, sizeof(name), "%s_disco_q", dev_name(sas_ha->dev)); + sas_ha->disco_q = create_singlethread_workqueue(name); + if (!sas_ha->disco_q) + goto Undo_event_q; + INIT_LIST_HEAD(_ha->eh_done_q); INIT_LIST_HEAD(_ha->eh_ata_q); return 0; + +Undo_event_q: + destroy_workqueue(sas_ha->event_q); Undo_ports: sas_unregister_ports(sas_ha); Undo_phys: @@ -176,6 +191,9 @@ int sas_unregister_ha(struct sas_ha_struct *sas_ha) __sas_drain_work(sas_ha); mutex_unlock(_ha->drain_mutex); + destroy_workqueue(sas_ha->disco_q); + destroy_workqueue(sas_ha->event_q); + return 0; } diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h index 08c1c32..d1ab157 100644 --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -388,6 +388,9 @@ struct sas_ha_struct { struct device *dev; /* should be set */ struct module *lldd_module; /* should be set */ + struct workqueue_struct *event_q; + struct workqueue_struct *disco_q; + u8 *sas_addr; /* must be set */ u8 hashed_sas_addr[HASHED_SAS_ADDR_SIZE]; -- 2.5.0
[PATCH v4 11/11] libsas: add event to defer list tail instead of head when draining
From: chenxiangEvents will be added to defer_q list when setting ha->status to SAS_HA_DRAINING. Events will be called after drain workqueue. Those events are added to the head of list, but they are scanned one by one from the head to the tail, which will cause those events be called in the reverse order of being added. So change list_add to list_add_tail in function sas_queue_work. Signed-off-by: chenxiang Signed-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- drivers/scsi/libsas/sas_event.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c index c120657..b124198 100644 --- a/drivers/scsi/libsas/sas_event.c +++ b/drivers/scsi/libsas/sas_event.c @@ -38,7 +38,7 @@ int sas_queue_work(struct sas_ha_struct *ha, struct sas_work *sw) if (test_bit(SAS_HA_DRAINING, >state)) { /* add it to the defer list, if not already pending */ if (list_empty(>drain_node)) - list_add(>drain_node, >defer_q); + list_add_tail(>drain_node, >defer_q); } else rc = queue_work(ha->event_q, >work); -- 2.5.0
[PATCH v4 05/11] libsas: Use dynamic alloced work to avoid sas event lost
Now libsas hotplug work is static, every sas event type has its own static work, LLDD driver queues the hotplug work into shost->work_q. If LLDD driver burst posts lots hotplug events to libsas, the hotplug events may pending in the workqueue like shost->work_q new work[PORTE_BYTES_DMAED] --> |[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing |<---wait worker to process>| In this case, a new PORTE_BYTES_DMAED event coming, libsas try to queue it to shost->work_q, but this work is already pending, so it would be lost. Finally, libsas delete the related sas port and sas devices, but LLDD driver expect libsas add the sas port and devices(last sas event). This patch use dynamic allocated work to avoid this issue. Signed-off-by: Yijing WangCC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams Signed-off-by: Jason Yan --- drivers/scsi/libsas/sas_event.c| 75 +- drivers/scsi/libsas/sas_init.c | 27 -- drivers/scsi/libsas/sas_internal.h | 6 +++ drivers/scsi/libsas/sas_phy.c | 44 +- drivers/scsi/libsas/sas_port.c | 18 - include/scsi/libsas.h | 16 +--- 6 files changed, 115 insertions(+), 71 deletions(-) diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c index 3e225ef..35412d9 100644 --- a/drivers/scsi/libsas/sas_event.c +++ b/drivers/scsi/libsas/sas_event.c @@ -29,7 +29,8 @@ int sas_queue_work(struct sas_ha_struct *ha, struct sas_work *sw) { - int rc = 0; + /* it's added to the defer_q when draining so return succeed */ + int rc = 1; if (!test_bit(SAS_HA_REGISTERED, >state)) return 0; @@ -44,19 +45,15 @@ int sas_queue_work(struct sas_ha_struct *ha, struct sas_work *sw) return rc; } -static int sas_queue_event(int event, unsigned long *pending, - struct sas_work *work, +static int sas_queue_event(int event, struct sas_work *work, struct sas_ha_struct *ha) { - int rc = 0; + unsigned long flags; + int rc; - if (!test_and_set_bit(event, pending)) { - unsigned long flags; - - spin_lock_irqsave(>lock, flags); - rc = sas_queue_work(ha, work); - spin_unlock_irqrestore(>lock, flags); - } + spin_lock_irqsave(>lock, flags); + rc = sas_queue_work(ha, work); + spin_unlock_irqrestore(>lock, flags); return rc; } @@ -66,6 +63,7 @@ void __sas_drain_work(struct sas_ha_struct *ha) { struct workqueue_struct *wq = ha->core.shost->work_q; struct sas_work *sw, *_sw; + int ret; set_bit(SAS_HA_DRAINING, >state); /* flush submitters */ @@ -78,7 +76,11 @@ void __sas_drain_work(struct sas_ha_struct *ha) clear_bit(SAS_HA_DRAINING, >state); list_for_each_entry_safe(sw, _sw, >defer_q, drain_node) { list_del_init(>drain_node); - sas_queue_work(ha, sw); + ret = sas_queue_work(ha, sw); + if (ret != 1) { + struct asd_sas_event *ev = to_asd_sas_event(>work); + sas_free_event(ev); + } } spin_unlock_irq(>lock); } @@ -119,29 +121,68 @@ void sas_enable_revalidation(struct sas_ha_struct *ha) if (!test_and_clear_bit(ev, >pending)) continue; - sas_queue_event(ev, >pending, >disc_work[ev].work, ha); + sas_queue_event(ev, >disc_work[ev].work, ha); } mutex_unlock(>disco_mutex); } + +static void sas_port_event_worker(struct work_struct *work) +{ + struct asd_sas_event *ev = to_asd_sas_event(work); + + sas_port_event_fns[ev->event](work); + sas_free_event(ev); +} + +static void sas_phy_event_worker(struct work_struct *work) +{ + struct asd_sas_event *ev = to_asd_sas_event(work); + + sas_phy_event_fns[ev->event](work); + sas_free_event(ev); +} + static int sas_notify_port_event(struct asd_sas_phy *phy, enum port_event event) { + struct asd_sas_event *ev; struct sas_ha_struct *ha = phy->ha; + int ret; BUG_ON(event >= PORT_NUM_EVENTS); - return sas_queue_event(event, >port_events_pending, - >port_events[event].work, ha); + ev = sas_alloc_event(phy); + if (!ev) + return -ENOMEM; + + INIT_SAS_EVENT(ev, sas_port_event_worker, phy, event); + + ret = sas_queue_event(event, >work, ha); + if (ret != 1) + sas_free_event(ev); + + return ret; } int sas_notify_phy_event(struct
[PATCH v4 04/11] libsas: rename notify_port_event() for consistency
Rename function notify_port_event() to sas_notify_port_event(), which will be consistent with sas_notify_phy_event(). Signed-off-by: Jason YanCC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl CC: Dan Williams --- drivers/scsi/libsas/sas_event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c index 70c4653..3e225ef 100644 --- a/drivers/scsi/libsas/sas_event.c +++ b/drivers/scsi/libsas/sas_event.c @@ -124,7 +124,7 @@ void sas_enable_revalidation(struct sas_ha_struct *ha) mutex_unlock(>disco_mutex); } -static int notify_port_event(struct asd_sas_phy *phy, enum port_event event) +static int sas_notify_port_event(struct asd_sas_phy *phy, enum port_event event) { struct sas_ha_struct *ha = phy->ha; @@ -146,7 +146,7 @@ int sas_notify_phy_event(struct asd_sas_phy *phy, enum phy_event event) int sas_init_events(struct sas_ha_struct *sas_ha) { - sas_ha->notify_port_event = notify_port_event; + sas_ha->notify_port_event = sas_notify_port_event; sas_ha->notify_phy_event = sas_notify_phy_event; return 0; -- 2.5.0
[PATCH v4 02/11] libsas: remove the numbering for each event enum
Numbering for each event enum makes no sense. Remove the numbering so that we don't have to calculate the number by hand every time. Signed-off-by: Jason YanCC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl --- include/scsi/libsas.h | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h index e536597..ccf3b48 100644 --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -62,31 +62,31 @@ enum sas_phy_type { */ enum port_event { PORTE_BYTES_DMAED = 0U, - PORTE_BROADCAST_RCVD = 1, - PORTE_LINK_RESET_ERR = 2, - PORTE_TIMER_EVENT = 3, - PORTE_HARD_RESET = 4, - PORT_NUM_EVENTS = 5, + PORTE_BROADCAST_RCVD, + PORTE_LINK_RESET_ERR, + PORTE_TIMER_EVENT, + PORTE_HARD_RESET, + PORT_NUM_EVENTS, }; enum phy_event { PHYE_LOSS_OF_SIGNAL = 0U, - PHYE_OOB_DONE = 1, - PHYE_OOB_ERROR= 2, - PHYE_SPINUP_HOLD = 3, /* hot plug SATA, no COMWAKE sent */ - PHYE_RESUME_TIMEOUT = 4, - PHY_NUM_EVENTS= 5, + PHYE_OOB_DONE, + PHYE_OOB_ERROR, + PHYE_SPINUP_HOLD, /* hot plug SATA, no COMWAKE sent */ + PHYE_RESUME_TIMEOUT, + PHY_NUM_EVENTS, }; enum discover_event { DISCE_DISCOVER_DOMAIN = 0U, - DISCE_REVALIDATE_DOMAIN = 1, - DISCE_PORT_GONE = 2, - DISCE_PROBE = 3, - DISCE_SUSPEND = 4, - DISCE_RESUME= 5, - DISCE_DESTRUCT = 6, - DISC_NUM_EVENTS = 7, + DISCE_REVALIDATE_DOMAIN, + DISCE_PORT_GONE, + DISCE_PROBE, + DISCE_SUSPEND, + DISCE_RESUME, + DISCE_DESTRUCT, + DISC_NUM_EVENTS, }; /* -- Expander Devices -- */ -- 2.5.0
[PATCH RESEND] lpfc: Don't return internal MBXERR_ERROR code from probe function
Internal error codes happen to be positive, thus the PCI driver core won't treat them as failure, but we do. This would cause a crash later on as lpfc_pci_remove_one() is called (e.g. as shutdown function). Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") Signed-off-by: Stefano Brivio--- This seems to have been ignored. Re-sending as suggested by Johannes. drivers/scsi/lpfc/lpfc_init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 491aa95eb0f6..38cc2b5bb5a2 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) "Extents and RPI headers enabled.\n"); } mempool_free(mboxq, phba->mbox_mem_pool); + rc = -EIO; goto out_free_bsmbx; } -- 2.9.4
Re: [PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, Sep 06, 2017 at 10:47:38AM +0200, Stefano Brivio wrote: > The original submission is archived at > https://marc.info/?l=linux-scsi=150392554622786=2. Before I cause > any confusion... do you want me to re-submit this with the same > subject? As v2 with a comment? [PATCH RESEND] should be OK -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, 6 Sep 2017 10:42:35 +0200 Johannes Thumshirnwrote: > On Wed, Sep 06, 2017 at 10:32:22AM +0200, Stefano Brivio wrote: > > > > I didn't get feedback about this patch. Is there any issue with the > > submission? > > > > I think it actually fixes a quite critical issue, if initialization > > fails we have crashes on reboot like the one reported below [1], and > > should perhaps also be queued for -stable. > > It seems to have slipped trough the cracks, can you please re-submit? The original submission is archived at https://marc.info/?l=linux-scsi=150392554622786=2. Before I cause any confusion... do you want me to re-submit this with the same subject? As v2 with a comment? -- Stefano
Re: [PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, Sep 06, 2017 at 10:32:22AM +0200, Stefano Brivio wrote: > > I didn't get feedback about this patch. Is there any issue with the > submission? > > I think it actually fixes a quite critical issue, if initialization > fails we have crashes on reboot like the one reported below [1], and > should perhaps also be queued for -stable. It seems to have slipped trough the cracks, can you please re-submit? Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
Hi, On Mon, 28 Aug 2017 15:05:23 +0200 Stefano Briviowrote: > Internal error codes happen to be positive, thus the PCI driver > core won't treat them as failure, but we do. This would cause a > crash later on as lpfc_pci_remove_one() is called (e.g. as > shutdown function). > > Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") > Signed-off-by: Stefano Brivio > --- > drivers/scsi/lpfc/lpfc_init.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c > index 491aa95eb0f6..38cc2b5bb5a2 100644 > --- a/drivers/scsi/lpfc/lpfc_init.c > +++ b/drivers/scsi/lpfc/lpfc_init.c > @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) > "Extents and RPI headers enabled.\n"); > } > mempool_free(mboxq, phba->mbox_mem_pool); > + rc = -EIO; > goto out_free_bsmbx; > } > I didn't get feedback about this patch. Is there any issue with the submission? I think it actually fixes a quite critical issue, if initialization fails we have crashes on reboot like the one reported below [1], and should perhaps also be queued for -stable. [1] [ 568.638555] BUG: unable to handle kernel NULL pointer dereference at 07c0 [ 568.679154] IP: lpfc_pci_remove_one+0x20/0x890 [lpfc] [ 568.704062] PGD 0 [ 568.704063] [ 568.721714] Oops: [#1] SMP [ 568.736895] Modules linked in: fuse vfat msdos fat binfmt_misc xfs fcoe libfcoe libfc rpcrdma ib_isert iscsi_target_mod ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core intel_rapl x86_pkg_temp_thermal intel_powerclamp intel_cstate intel_uncore intel_rapl_perf sg ipmi_si hpilo ipmi_devintf pcspkr ipmi_msghandler lpc_ich ioatdma shpchp dca pcc_cpufreq acpi_cpufreq ext4 jbd2 mbcache loop nfsv3 nfs_acl nfs lockd grace fscache sd_mod 8021q garp stp llc mrp mgag200 ata_generic i2c_algo_bit pata_acpi drm_kms_helper syscopyarea sfc sysfillrect sysimgblt fb_sys_fops ttm tg3 mtd crct10dif_pclmul lpfc crc32_pclmul ata_piix drm ptp hpsa serio_raw scsi_transport_fc be2net ghash_clmulni_intel libata i2c_core scsi_transport_sas pps_core [ 569.082589] mdio wmi sunrpc xts lrw gf128mul mcryptd dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy libcrc32c crc32c_intel async_pq async_xor xor async_tx raid6_pq raid1 raid0 iscsi_ibft iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs edd ctr(E) [ 569.254181] CPU: 9 PID: 46361 Comm: reboot Tainted: GE 4.11.0-22.el7a.x86_64 #1 [ 569.301048] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 12/14/2012 [ 569.332878] task: 880801161680 task.stack: c9000514 [ 569.362693] RIP: 0010:lpfc_pci_remove_one+0x20/0x890 [lpfc] [ 569.390275] RSP: 0018:c90005143d18 EFLAGS: 00010296 [ 569.416040] RAX: RBX: 8804bd9cf000 RCX: [ 569.449874] RDX: RSI: 0246 RDI: 8804bd9cf000 [ 569.484402] RBP: c90005143d78 R08: 8804bd9cf0b8 R09: 0006 [ 569.519146] R10: 0020 R11: ea0020975b80 R12: 8804bd9cf000 [ 569.553408] R13: a057a2a0 R14: 8804bd9cf100 R15: fee1dead [ 569.588474] FS: 7f99fe258880() GS:88083f4c() knlGS: [ 569.628288] CS: 0010 DS: ES: CR0: 80050033 [ 569.656916] CR2: 07c0 CR3: 000825d91000 CR4: 000406e0 [ 569.693185] Call Trace: [ 569.705044] pci_device_shutdown+0x36/0x70 [ 569.725702] device_shutdown+0xdf/0x190 [ 569.744657] kernel_restart_prepare+0x36/0x40 [ 569.767099] kernel_restart+0x12/0x60 [ 569.784916] SYSC_reboot+0x1f3/0x220 [ 569.802649] ? __alloc_fd+0x46/0x170 [ 569.819539] ? vfs_writev+0x3c/0x50 [ 569.836206] ? do_writev+0x61/0xf0 [ 569.852730] SyS_reboot+0xe/0x10 [ 569.868375] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 569.890588] RIP: 0033:0x7f99fd065a56 [ 569.907894] RSP: 002b:7ffc8c5e2b58 EFLAGS: 0202 ORIG_RAX: 00a9 [ 569.944700] RAX: ffda RBX: ff91 RCX: 7f99fd065a56 [ 569.978659] RDX: 01234567 RSI: 28121969 RDI: fee1dead [ 570.013272] RBP: R08: 5583f9de32a0 R09: 7ffc8c5e2220 [ 570.048039] R10: 0024 R11: 0202 R12: 5583f9d6ef13 [ 570.082565] R13: 7ffc8c5e2e20 R14: R15: [ 570.116914] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 89 fb 48 83 ec 38 48 8b 87 38 01 00 00 <4c> 8b b0 c0 07 00 00 48 89 45 c0 45 0f b6 86 00 08 00 00 41 80