Re: [BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag
On 2017/12/13 12:42, James Bottomley wrote: On Wed, 2017-12-13 at 11:18 +0800, Jia-Ju Bai wrote: The driver may sleep under a spinlock. The function call paths are: qlt_handle_abts_recv_work (acquire the spinlock) qlt_response_pkt_all_vps qlt_response_pkt qlt_handle_cmd_for_atio qlt_get_tag percpu_ida_alloc --> may sleep qla82xx_msix_rsp_q (acquire the spinlock) qla24xx_process_response_queue qlt_handle_abts_recv qlt_response_pkt_all_vps qlt_response_pkt qlt_handle_cmd_for_atio qlt_get_tag percpu_ida_alloc --> may sleep-in-atomic qla24xx_intr_handler (acquire the spinlock) qla24xx_process_response_queue qlt_handle_abts_recv qlt_response_pkt qlt_handle_cmd_for_atio qlt_get_tag percpu_ida_alloc --> may sleep I do not find a good way to fix it, so I only report. This possible bug is found by my static analysis tool (DSAC) and checked by my code review. The report is incorrect: percpu_ida_alloc with state==TASK_RUNNING is atomic (and interrupt) safe which appears to be the case here. James Thanks for your reply :) I have checked the definition of percpu_ida_alloc, and I think you are right. Sorry for my incorrect bug report. Thanks, Jia-Ju Bai
Re: [PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()
On 12/12/2017 05:57 PM, Bart Van Assche wrote: > On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote: >> From: Hannes Reinecke>> >> When a device is probed asynchronously del_gendisk() might be called >> before the async probing was run, causing del_gendisk() to crash >> due to uninitialized sysfs objects. >> >> Signed-off-by: Hannes Reinecke >> --- >> block/genhd.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/block/genhd.c b/block/genhd.c >> index c2223f1..cc40d95 100644 >> --- a/block/genhd.c >> +++ b/block/genhd.c >> @@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk) >> struct disk_part_iter piter; >> struct hd_struct *part; >> >> +if (!(disk->flags & GENHD_FL_UP)) >> +return; >> + >> blk_integrity_del(disk); >> disk_del_events(disk); > > Hello Hannes, > > Thank you for having published your approach for increasing disk probing > concurrency. Your approach looks interesting to me. However, I don't think > that patches 1/4..3/4 are sufficient to avoid races between e.g. > device_add_disk() and del_gendisk(). As far as I know no locks are held > around the device_add_disk() and del_gendisk() calls. Does that mean that > del_gendisk() can call e.g. blk_integrity_del() before device_add_disk() has > called blk_integrity_add()? > In principle, yes. However, the overall idea here is that device_add_disk() and del_gendisk() are enclosed within upper layer procedures, which themselves provide additional locking. In our case the sd driver provided synchronisation guarantees ensuring that device_add_disk() and del_gendisk() doesn't run concurrently. if one is really concerned we could convert disk->flags to a bitmask, and use atomic bitmask modification; that should avoid any concurrency issues. However, I'm not sure if it's worth it. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag
On Wed, 2017-12-13 at 11:18 +0800, Jia-Ju Bai wrote: > The driver may sleep under a spinlock. > The function call paths are: > qlt_handle_abts_recv_work (acquire the spinlock) > qlt_response_pkt_all_vps > qlt_response_pkt > qlt_handle_cmd_for_atio > qlt_get_tag > percpu_ida_alloc --> may sleep > > qla82xx_msix_rsp_q (acquire the spinlock) > qla24xx_process_response_queue > qlt_handle_abts_recv > qlt_response_pkt_all_vps > qlt_response_pkt > qlt_handle_cmd_for_atio > qlt_get_tag > percpu_ida_alloc --> may sleep-in-atomic > > qla24xx_intr_handler (acquire the spinlock) > qla24xx_process_response_queue > qlt_handle_abts_recv > qlt_response_pkt > qlt_handle_cmd_for_atio > qlt_get_tag > percpu_ida_alloc --> may sleep > > I do not find a good way to fix it, so I only report. > This possible bug is found by my static analysis tool (DSAC) and > checked by my code review. The report is incorrect: percpu_ida_alloc with state==TASK_RUNNING is atomic (and interrupt) safe which appears to be the case here. James
[BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag
The driver may sleep under a spinlock. The function call paths are: qlt_handle_abts_recv_work (acquire the spinlock) qlt_response_pkt_all_vps qlt_response_pkt qlt_handle_cmd_for_atio qlt_get_tag percpu_ida_alloc --> may sleep qla82xx_msix_rsp_q (acquire the spinlock) qla24xx_process_response_queue qlt_handle_abts_recv qlt_response_pkt_all_vps qlt_response_pkt qlt_handle_cmd_for_atio qlt_get_tag percpu_ida_alloc --> may sleep-in-atomic qla24xx_intr_handler (acquire the spinlock) qla24xx_process_response_queue qlt_handle_abts_recv qlt_response_pkt qlt_handle_cmd_for_atio qlt_get_tag percpu_ida_alloc --> may sleep I do not find a good way to fix it, so I only report. This possible bug is found by my static analysis tool (DSAC) and checked by my code review. Thanks, Jia-Ju Bai
[PATCH] scsi: mpt3sas: validate device handle before ata pending flag
Commit ffb584565894 ("scsi: mpt3sas: fix hang on ata passthrough commands") introduced ata_command_pending bit-flag to ensure sending only one outstanding pass through ATA command to target device. However if target device become irresponsive with ata_command_pending, we won't be able to remove it timely, since we try to dispatch a SYNCHRONIZE_CACHE command and bail out with SAM_STAT_BUSY repeatedly. Move the invalid device handle checking a bit earlier than the ata_command_pending. Signed-off-by: Yuanyuan Zhong--- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index b258f21..32fc0d8 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -4758,6 +4758,16 @@ static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending) return 0; } + sas_target_priv_data = sas_device_priv_data->sas_target; + + /* invalid device handle */ + handle = sas_target_priv_data->handle; + if (handle == MPT3SAS_INVALID_DEVICE_HANDLE) { + scmd->result = DID_NO_CONNECT << 16; + scmd->scsi_done(scmd); + return 0; + } + /* * Bug work around for firmware SATL handling. The loop * is based on atomic operations and ensures consistency @@ -4771,17 +4781,6 @@ static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending) } } while (_scsih_set_satl_pending(scmd, true)); - sas_target_priv_data = sas_device_priv_data->sas_target; - - /* invalid device handle */ - handle = sas_target_priv_data->handle; - if (handle == MPT3SAS_INVALID_DEVICE_HANDLE) { - scmd->result = DID_NO_CONNECT << 16; - scmd->scsi_done(scmd); - return 0; - } - - /* host recovery or link resets sent via IOCTLs */ if (ioc->shost_recovery || ioc->ioc_link_reset_in_progress) return SCSI_MLQUEUE_HOST_BUSY; -- 1.9.1
Re: [bug report] A race between device_resume and removing disk
Ping...Does anyone has some idea about this issue? 在 2017/11/16 11:54, chenxiang (M) 写道: Hi all, When debugging suspend and resume of hisi_sas, I find a issue: use commands (echo freeze > /sys/power/state) to suspend, after 5s system will be resumed as i enable TEST_DEVICES. But if I plug one disks during suspend, system will be blocked all the time and it seems that there is a deadlock. I suspect that when device resume, it will acquire dev->mutex (device_lock(dev)) during almost the whole device resume process, but if remove a disk, it also tries to acquire dev->mutex when device_del. So it causes a deadlock. Do you have any idea about this issue? Follows are part of the log of normal print and exception print. In my environment, there are 11 disks on expander (5 SATA disks and 6 SAS disks). regards, shawn Normal log of suspend as follows: [ 189.789974] PM: suspend entry (s2idle) [ 189.798045] PM: Syncing filesystems ... done. [ 189.935324] Freezing user space processes ... (elapsed 0.009 seconds) done. [ 189.955044] OOM killer disabled. [ 189.959111] Freezing remaining freezable tasks ... (elapsed 0.012 seconds) done. [ 190.065399] sd 0:0:10:0: [sdk] Synchronizing SCSI cache [ 190.090948] sd 0:0:10:0: [sdk] Stopping disk [ 190.113157] sd 0:0:9:0: [sdj] Synchronizing SCSI cache [ 190.139116] sd 0:0:8:0: [sdi] Synchronizing SCSI cache [ 190.160252] sd 0:0:7:0: [sdh] Synchronizing SCSI cache [ 190.175968] sd 0:0:7:0: [sdh] Stopping disk [ 191.401325] sd 0:0:5:0: [sdf] Synchronizing SCSI cache [ 191.429529] sd 0:0:3:0: [sdd] Synchronizing SCSI cache [ 191.452532] sd 0:0:2:0: [sdc] Synchronizing SCSI cache [ 191.752248] sd 0:0:2:0: [sdc] Stopping disk [ 192.045536] sd 0:0:1:0: [sdb] Synchronizing SCSI cache [ 192.072078] sd 0:0:1:0: [sdb] Stopping disk [ 192.744386] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 192.760704] sd 0:0:0:0: [sda] Stopping disk [ 193.397060] hisi_sas_v3_suspend [ 194.587783] PM: suspend devices took 4.604 seconds [ 194.594688] PM: suspend debug: Waiting for 5 second(s). [ 199.687272] hisi_sas_v3_resume [ 199.740433] hisi_sas_v3_hw :74:02.0: phyup: phy0 link_rate=8 [ 199.747979] hisi_sas_v3_hw :74:02.0: phyup: phy1 link_rate=8 [ 199.754995] hisi_sas_v3_hw :74:02.0: phyup: phy2 link_rate=8 [ 199.787405] hisi_sas_v3_hw :74:02.0: phyup: phy3 link_rate=8 [ 199.820411] sas: sas_form_port: phy0 belongs to port0 already(4)! [ 199.831008] sas: sas_form_port: phy1 belongs to port0 already(4)! [ 199.839963] sas: sas_form_port: phy2 belongs to port0 already(4)! [ 199.849237] sas: sas_form_port: phy3 belongs to port0 already(4)! [ 200.366863] ata7: SATA link down (SStatus 0 SControl 300) [ 200.379220] ata6: SATA link down (SStatus 0 SControl 300) [ 200.828384] sas: broadcast received: 0 [ 200.838602] sas: REVALIDATING DOMAIN on port 0, pid:1454 [ 200.862876] sd 0:0:0:0: [sda] Starting disk [ 200.881098] sd 0:0:1:0: [sdb] Starting disk [ 200.891951] sd 0:0:2:0: [sdc] Starting disk [ 200.949758] sd 0:0:7:0: [sdh] Starting disk [ 200.990573] sd 0:0:10:0: [sdk] Starting disk [ 201.114636] sas: Expander phy change count has changed [ 201.188709] sas: ex 500e004aaa1f phy16 originated BROADCAST(CHANGE) [ 201.197145] sas: phy16 part of wide port with phy17 [ 201.211154] sas: ex 500e004aaa1f phy 0x10 broadcast flutter [ 201.222900] sas: ex 500e004aaa1f phy17 originated BROADCAST(CHANGE) [ 201.231045] sas: phy17 part of wide port with phy16 [ 201.244352] sas: ex 500e004aaa1f phy 0x11 broadcast flutter [ 201.258344] sas: ex 500e004aaa1f phy18 originated BROADCAST(CHANGE) [ 201.268344] sas: phy18 part of wide port with phy16 [ 201.283102] sas: ex 500e004aaa1f phy 0x12 broadcast flutter [ 201.294168] sas: ex 500e004aaa1f phy19 originated BROADCAST(CHANGE) [ 201.302386] sas: phy19 part of wide port with phy16 [ 201.315813] sas: ex 500e004aaa1f phy 0x13 broadcast flutter [ 201.346615] sas: done REVALIDATING DOMAIN on port 0, pid:1454, res 0x0 [ 203.640641] PM: resume devices took 3.984 seconds [ 203.648996] OOM killer enabled. [ 203.655016] Restarting tasks ... done. [ 203.690494] PM: suspend exit The log of removing disk when suspend is as follows: [44750.653222] PM: suspend entry (s2idle) [44750.658991] PM: Syncing filesystems ... done. [44750.699868] Freezing user space processes ... (elapsed 0.015 seconds) done. [44750.725734] OOM killer disabled. [44750.729711] Freezing remaining freezable tasks ... (elapsed 0.018 seconds) done. [44750.820672] sd 0:0:13:0: [sdk] Synchronizing SCSI cache [44750.844056] sd 0:0:13:0: [sdk] Stopping disk [44750.917433] sd 0:0:9:0: [sdj] Synchronizing SCSI cache [44750.949032] sd 0:0:8:0: [sdi] Synchronizing SCSI cache [44750.972730] sd 0:0:7:0: [sdh] Synchronizing SCSI cache [44750.989813] sd 0:0:7:0: [sdh] Stopping disk [44752.207539] sd 0:0:5:0: [sdf] Synchronizing SCSI cache [44752.239773] sd 0:0:3:0: [sdd] Synchronizing SCSI
[PATCH 8/8] target/iscsi: Improve 16 size determinations
From: Markus ElfringDate: Tue, 12 Dec 2017 22:06:00 +0100 Replace the specification of data types by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_erl2.c | 2 +- drivers/target/iscsi/iscsi_target_login.c| 6 +++--- drivers/target/iscsi/iscsi_target_parameters.c | 10 +- drivers/target/iscsi/iscsi_target_seq_pdu_list.c | 11 ++- drivers/target/iscsi/iscsi_target_tpg.c | 4 ++-- 5 files changed, 17 insertions(+), 16 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_erl2.c b/drivers/target/iscsi/iscsi_target_erl2.c index 87c27e8d4f49..80b8297f4b05 100644 --- a/drivers/target/iscsi/iscsi_target_erl2.c +++ b/drivers/target/iscsi/iscsi_target_erl2.c @@ -324,7 +324,7 @@ int iscsit_prepare_cmds_for_reallegiance(struct iscsi_conn *conn) * (struct iscsi_cmd->cr) so we need to allocate this before preparing the * connection's command list for connection recovery. */ - cr = kzalloc(sizeof(struct iscsi_conn_recovery), GFP_KERNEL); + cr = kzalloc(sizeof(*cr), GFP_KERNEL); if (!cr) return -1; diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index cfaf564825e0..095651e48f36 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -46,7 +46,7 @@ static struct iscsi_login *iscsi_login_init_conn(struct iscsi_conn *conn) { struct iscsi_login *login; - login = kzalloc(sizeof(struct iscsi_login), GFP_KERNEL); + login = kzalloc(sizeof(*login), GFP_KERNEL); if (!login) return NULL; @@ -294,7 +294,7 @@ static int iscsi_login_zero_tsih_s1( struct iscsi_login_req *pdu = (struct iscsi_login_req *)buf; int ret; - sess = kzalloc(sizeof(struct iscsi_session), GFP_KERNEL); + sess = kzalloc(sizeof(*sess), GFP_KERNEL); if (!sess) { iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, ISCSI_LOGIN_STATUS_NO_RESOURCES); @@ -1244,7 +1244,7 @@ static int __iscsi_target_login_thread(struct iscsi_np *np) } spin_unlock_bh(>np_thread_lock); - conn = kzalloc(sizeof(struct iscsi_conn), GFP_KERNEL); + conn = kzalloc(sizeof(*conn), GFP_KERNEL); if (!conn) { /* Get another socket */ return 1; diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c index a49b0b2a4f6c..110e81a0c97d 100644 --- a/drivers/target/iscsi/iscsi_target_parameters.c +++ b/drivers/target/iscsi/iscsi_target_parameters.c @@ -129,7 +129,7 @@ static struct iscsi_param *iscsi_set_default_param(struct iscsi_param_list *para { struct iscsi_param *param; - param = kzalloc(sizeof(struct iscsi_param), GFP_KERNEL); + param = kzalloc(sizeof(*param), GFP_KERNEL); if (!param) goto out; @@ -199,7 +199,7 @@ int iscsi_create_default_params(struct iscsi_param_list **param_list_ptr) struct iscsi_param *param; struct iscsi_param_list *pl; - pl = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL); + pl = kzalloc(sizeof(*pl), GFP_KERNEL); if (!pl) return -ENOMEM; @@ -568,7 +568,7 @@ int iscsi_copy_param_list( struct iscsi_param *new_param = NULL; struct iscsi_param_list *param_list; - param_list = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL); + param_list = kzalloc(sizeof(*param_list), GFP_KERNEL); if (!param_list) return -ENOMEM; @@ -583,7 +583,7 @@ int iscsi_copy_param_list( continue; } - new_param = kzalloc(sizeof(struct iscsi_param), GFP_KERNEL); + new_param = kzalloc(sizeof(*new_param), GFP_KERNEL); if (!new_param) goto err_out; @@ -714,7 +714,7 @@ static int iscsi_add_notunderstood_response( return -1; } - extra_response = kzalloc(sizeof(struct iscsi_extra_response), GFP_KERNEL); + extra_response = kzalloc(sizeof(*extra_response), GFP_KERNEL); if (!extra_response) return -ENOMEM; diff --git a/drivers/target/iscsi/iscsi_target_seq_pdu_list.c b/drivers/target/iscsi/iscsi_target_seq_pdu_list.c index 3a6e619bb30e..0fd914d92f4f 100644 --- a/drivers/target/iscsi/iscsi_target_seq_pdu_list.c +++ b/drivers/target/iscsi/iscsi_target_seq_pdu_list.c @@ -137,7 +137,8 @@ static int iscsit_randomize_pdu_lists( seq_count++;
[PATCH 7/8] target/iscsi: Delete an unnecessary variable initialisation in iscsi_set_default_param()
From: Markus ElfringDate: Tue, 12 Dec 2017 21:21:31 +0100 The local variable "param" will be reassigned by a following statement. Thus omit the explicit initialisation at the beginning. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_parameters.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c index 151269b8816d..a49b0b2a4f6c 100644 --- a/drivers/target/iscsi/iscsi_target_parameters.c +++ b/drivers/target/iscsi/iscsi_target_parameters.c @@ -127,7 +127,7 @@ static struct iscsi_param *iscsi_set_default_param(struct iscsi_param_list *para char *name, char *value, u8 phase, u8 scope, u8 sender, u16 type_range, u8 use) { - struct iscsi_param *param = NULL; + struct iscsi_param *param; param = kzalloc(sizeof(struct iscsi_param), GFP_KERNEL); if (!param) -- 2.15.1
[PATCH 6/8] target/iscsi: Delete an unnecessary variable initialisation in iscsi_create_default_params()
From: Markus ElfringDate: Tue, 12 Dec 2017 21:18:39 +0100 The variable "param" will eventually be set to an appropriate pointer a bit later. Thus omit the explicit initialisation at the beginning. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_parameters.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c index 25a3a9550488..151269b8816d 100644 --- a/drivers/target/iscsi/iscsi_target_parameters.c +++ b/drivers/target/iscsi/iscsi_target_parameters.c @@ -196,7 +196,7 @@ static struct iscsi_param *iscsi_set_default_param(struct iscsi_param_list *para /* #warning Add extension keys */ int iscsi_create_default_params(struct iscsi_param_list **param_list_ptr) { - struct iscsi_param *param = NULL; + struct iscsi_param *param; struct iscsi_param_list *pl; pl = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL); -- 2.15.1
[PATCH 5/8] target/iscsi: Delete an unnecessary variable initialisation in iscsi_copy_param_list()
From: Markus ElfringDate: Tue, 12 Dec 2017 21:15:55 +0100 The variable "param_list" will be reassigned by a following statement. Thus omit the explicit initialisation at the beginning. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_parameters.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c index 06310b2c4e26..25a3a9550488 100644 --- a/drivers/target/iscsi/iscsi_target_parameters.c +++ b/drivers/target/iscsi/iscsi_target_parameters.c @@ -566,7 +566,7 @@ int iscsi_copy_param_list( { struct iscsi_param *param = NULL; struct iscsi_param *new_param = NULL; - struct iscsi_param_list *param_list = NULL; + struct iscsi_param_list *param_list; param_list = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL); if (!param_list) -- 2.15.1
[PATCH 4/8] target/iscsi: Delete an unnecessary variable initialisation in iscsit_allocate_ooo_cmdsn()
From: Markus ElfringDate: Tue, 12 Dec 2017 21:13:49 +0100 The local variable "ooo_cmdsn" will be reassigned by a following statement. Thus omit the explicit initialisation at the beginning. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_erl1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_erl1.c b/drivers/target/iscsi/iscsi_target_erl1.c index ff3e08b6d4e1..c4c550128161 100644 --- a/drivers/target/iscsi/iscsi_target_erl1.c +++ b/drivers/target/iscsi/iscsi_target_erl1.c @@ -782,7 +782,7 @@ int iscsit_recover_dataout_sequence( static struct iscsi_ooo_cmdsn *iscsit_allocate_ooo_cmdsn(void) { - struct iscsi_ooo_cmdsn *ooo_cmdsn = NULL; + struct iscsi_ooo_cmdsn *ooo_cmdsn; ooo_cmdsn = kmem_cache_zalloc(lio_ooo_cache, GFP_ATOMIC); if (!ooo_cmdsn) -- 2.15.1
[PATCH 3/8] target/iscsi: Delete 36 error messages for a failed memory allocation
From: Markus ElfringDate: Tue, 12 Dec 2017 21:07:16 +0100 Omit extra messages for a memory allocation failure in these functions. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target.c | 2 -- drivers/target/iscsi/iscsi_target_auth.c | 17 +++-- drivers/target/iscsi/iscsi_target_datain_values.c | 6 ++-- drivers/target/iscsi/iscsi_target_erl1.c | 12 +++ drivers/target/iscsi/iscsi_target_erl2.c | 6 ++-- drivers/target/iscsi/iscsi_target_login.c | 23 +++-- drivers/target/iscsi/iscsi_target_nego.c | 4 +-- drivers/target/iscsi/iscsi_target_parameters.c| 42 +++ drivers/target/iscsi/iscsi_target_seq_pdu_list.c | 23 + drivers/target/iscsi/iscsi_target_tpg.c | 9 ++--- drivers/target/iscsi/iscsi_target_util.c | 11 +++--- 11 files changed, 46 insertions(+), 109 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 9eb10d34682c..f3c6ea556ea8 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -812,7 +812,6 @@ int iscsit_add_reject( cmd->buf_ptr = kmemdup(buf, ISCSI_HDR_LEN, GFP_KERNEL); if (!cmd->buf_ptr) { - pr_err("Unable to allocate memory for cmd->buf_ptr\n"); iscsit_free_cmd(cmd, false); return -1; } @@ -849,7 +848,6 @@ static int iscsit_add_reject_from_cmd( cmd->buf_ptr = kmemdup(buf, ISCSI_HDR_LEN, GFP_KERNEL); if (!cmd->buf_ptr) { - pr_err("Unable to allocate memory for cmd->buf_ptr\n"); iscsit_free_cmd(cmd, false); return -1; } diff --git a/drivers/target/iscsi/iscsi_target_auth.c b/drivers/target/iscsi/iscsi_target_auth.c index d837fcbdbaf2..3a17343f43ed 100644 --- a/drivers/target/iscsi/iscsi_target_auth.c +++ b/drivers/target/iscsi/iscsi_target_auth.c @@ -80,10 +80,9 @@ static int chap_check_algorithm(const char *a_str) char *tmp, *orig, *token; tmp = kstrdup(a_str, GFP_KERNEL); - if (!tmp) { - pr_err("Memory allocation failed for CHAP_A temporary buffer\n"); + if (!tmp) return CHAP_DIGEST_UNKNOWN; - } + orig = tmp; token = strsep(, "="); @@ -198,16 +197,12 @@ static int chap_server_compute_md5( int auth_ret = -1, ret, challenge_len; challenge = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL); - if (!challenge) { - pr_err("Unable to allocate challenge buffer\n"); + if (!challenge) goto exit; - } challenge_binhex = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL); - if (!challenge_binhex) { - pr_err("Unable to allocate challenge_binhex buffer\n"); + if (!challenge_binhex) goto free_challenge; - } memset(chap_n, 0, MAX_CHAP_N_SIZE); @@ -257,10 +252,8 @@ static int chap_server_compute_md5( } desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL); - if (!desc) { - pr_err("Unable to allocate struct shash_desc\n"); + if (!desc) goto free_shash; - } desc->tfm = tfm; desc->flags = 0; diff --git a/drivers/target/iscsi/iscsi_target_datain_values.c b/drivers/target/iscsi/iscsi_target_datain_values.c index 173ddd93c757..c591165f9b1b 100644 --- a/drivers/target/iscsi/iscsi_target_datain_values.c +++ b/drivers/target/iscsi/iscsi_target_datain_values.c @@ -30,11 +30,9 @@ struct iscsi_datain_req *iscsit_allocate_datain_req(void) struct iscsi_datain_req *dr; dr = kmem_cache_zalloc(lio_dr_cache, GFP_ATOMIC); - if (!dr) { - pr_err("Unable to allocate memory for" - " struct iscsi_datain_req\n"); + if (!dr) return NULL; - } + INIT_LIST_HEAD(>cmd_datain_node); return dr; diff --git a/drivers/target/iscsi/iscsi_target_erl1.c b/drivers/target/iscsi/iscsi_target_erl1.c index 5efa42b939a1..ff3e08b6d4e1 100644 --- a/drivers/target/iscsi/iscsi_target_erl1.c +++ b/drivers/target/iscsi/iscsi_target_erl1.c @@ -59,11 +59,9 @@ int iscsit_dump_data_payload( length = min(buf_len, OFFLOAD_BUF_SIZE); buf = kzalloc(length, GFP_ATOMIC); - if (!buf) { - pr_err("Unable to allocate %u bytes for offload" - " buffer.\n", length); + if (!buf) return -1; - } + memset(, 0, sizeof(struct kvec)); while (offset < buf_len) { @@ -787,11 +785,9 @@ static struct iscsi_ooo_cmdsn *iscsit_allocate_ooo_cmdsn(void) struct iscsi_ooo_cmdsn *ooo_cmdsn = NULL; ooo_cmdsn =
[PATCH 2/8] target/iscsi: Move resetting of seven variables in chap_server_compute_md5()
From: Markus ElfringDate: Tue, 12 Dec 2017 19:43:47 +0100 Move the resetting of these array variables so that this operation will be performed only if memory allocations succeeded before in this function. Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_auth.c | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_auth.c b/drivers/target/iscsi/iscsi_target_auth.c index 94b011fe74e8..d837fcbdbaf2 100644 --- a/drivers/target/iscsi/iscsi_target_auth.c +++ b/drivers/target/iscsi/iscsi_target_auth.c @@ -197,14 +197,6 @@ static int chap_server_compute_md5( struct shash_desc *desc; int auth_ret = -1, ret, challenge_len; - memset(identifier, 0, 10); - memset(chap_n, 0, MAX_CHAP_N_SIZE); - memset(chap_r, 0, MAX_RESPONSE_LENGTH); - memset(digest, 0, MD5_SIGNATURE_SIZE); - memset(response, 0, MD5_SIGNATURE_SIZE * 2 + 2); - memset(client_digest, 0, MD5_SIGNATURE_SIZE); - memset(server_digest, 0, MD5_SIGNATURE_SIZE); - challenge = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL); if (!challenge) { pr_err("Unable to allocate challenge buffer\n"); @@ -216,6 +208,9 @@ static int chap_server_compute_md5( pr_err("Unable to allocate challenge_binhex buffer\n"); goto free_challenge; } + + memset(chap_n, 0, MAX_CHAP_N_SIZE); + /* * Extract CHAP_N. */ @@ -236,6 +231,8 @@ static int chap_server_compute_md5( goto free_challenge_binhex; } pr_debug("[server] Got CHAP_N=%s\n", chap_n); + memset(chap_r, 0, MAX_RESPONSE_LENGTH); + /* * Extract CHAP_R. */ @@ -250,6 +247,7 @@ static int chap_server_compute_md5( } pr_debug("[server] Got CHAP_R=%s\n", chap_r); + memset(client_digest, 0, MD5_SIGNATURE_SIZE); chap_string_to_hex(client_digest, chap_r, strlen(chap_r)); tfm = crypto_alloc_shash("md5", 0, 0); @@ -286,6 +284,7 @@ static int chap_server_compute_md5( goto free_desc; } + memset(server_digest, 0, MD5_SIGNATURE_SIZE); ret = crypto_shash_finup(desc, chap->challenge, CHAP_CHALLENGE_LENGTH, server_digest); if (ret < 0) { @@ -293,6 +292,7 @@ static int chap_server_compute_md5( goto free_desc; } + memset(response, 0, MD5_SIGNATURE_SIZE * 2 + 2); chap_binaryhex_to_asciihex(response, server_digest, MD5_SIGNATURE_SIZE); pr_debug("[server] MD5 Server Digest: %s\n", response); @@ -310,6 +310,9 @@ static int chap_server_compute_md5( auth_ret = 0; goto free_desc; } + + memset(identifier, 0, ARRAY_SIZE(identifier)); + /* * Get CHAP_I. */ @@ -393,6 +396,9 @@ static int chap_server_compute_md5( " password_mutual\n"); goto free_desc; } + + memset(digest, 0, MD5_SIGNATURE_SIZE); + /* * Convert received challenge to binary hex. */ -- 2.15.1
[PATCH 1/8] target/iscsi: Less function calls in chap_server_compute_md5() after error detection
From: Markus ElfringDate: Tue, 12 Dec 2017 18:00:41 +0100 The functions "crypto_free_shash", "kfree" and "kzfree" were called in a few cases by the chap_server_compute_md5() function during error handling even if the passed variable contained a null pointer. * Adjust jump targets according to the Linux coding style convention. * Delete initialisations for the variables "challenge", "challenge_binhex", "desc" and "tfm" at the beginning which became unnecessary with this refactoring. Fixes: 69110e3cedbb8aad1c70d91ed58a9f4f0ed9eec6 ("iscsi-target: Use shash and ahash") Fixes: e48354ce078c079996f89d715dfa44814b4eba01 ("iscsi-target: Add iSCSI fabric support for target v4.1") Signed-off-by: Markus Elfring --- drivers/target/iscsi/iscsi_target_auth.c | 71 +--- 1 file changed, 37 insertions(+), 34 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_auth.c b/drivers/target/iscsi/iscsi_target_auth.c index f9bc8ec6fb6b..94b011fe74e8 100644 --- a/drivers/target/iscsi/iscsi_target_auth.c +++ b/drivers/target/iscsi/iscsi_target_auth.c @@ -186,15 +186,15 @@ static int chap_server_compute_md5( unsigned char id_as_uchar; unsigned char digest[MD5_SIGNATURE_SIZE]; unsigned char type, response[MD5_SIGNATURE_SIZE * 2 + 2]; - unsigned char identifier[10], *challenge = NULL; - unsigned char *challenge_binhex = NULL; + unsigned char identifier[10], *challenge; + unsigned char *challenge_binhex; unsigned char client_digest[MD5_SIGNATURE_SIZE]; unsigned char server_digest[MD5_SIGNATURE_SIZE]; unsigned char chap_n[MAX_CHAP_N_SIZE], chap_r[MAX_RESPONSE_LENGTH]; size_t compare_len; struct iscsi_chap *chap = conn->auth_protocol; - struct crypto_shash *tfm = NULL; - struct shash_desc *desc = NULL; + struct crypto_shash *tfm; + struct shash_desc *desc; int auth_ret = -1, ret, challenge_len; memset(identifier, 0, 10); @@ -208,13 +208,13 @@ static int chap_server_compute_md5( challenge = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL); if (!challenge) { pr_err("Unable to allocate challenge buffer\n"); - goto out; + goto exit; } challenge_binhex = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL); if (!challenge_binhex) { pr_err("Unable to allocate challenge_binhex buffer\n"); - goto out; + goto free_challenge; } /* * Extract CHAP_N. @@ -222,18 +222,18 @@ static int chap_server_compute_md5( if (extract_param(nr_in_ptr, "CHAP_N", MAX_CHAP_N_SIZE, chap_n, ) < 0) { pr_err("Could not find CHAP_N.\n"); - goto out; + goto free_challenge_binhex; } if (type == HEX) { pr_err("Could not find CHAP_N.\n"); - goto out; + goto free_challenge_binhex; } /* Include the terminating NULL in the compare */ compare_len = strlen(auth->userid) + 1; if (strncmp(chap_n, auth->userid, compare_len) != 0) { pr_err("CHAP_N values do not match!\n"); - goto out; + goto free_challenge_binhex; } pr_debug("[server] Got CHAP_N=%s\n", chap_n); /* @@ -242,11 +242,11 @@ static int chap_server_compute_md5( if (extract_param(nr_in_ptr, "CHAP_R", MAX_RESPONSE_LENGTH, chap_r, ) < 0) { pr_err("Could not find CHAP_R.\n"); - goto out; + goto free_challenge_binhex; } if (type != HEX) { pr_err("Could not find CHAP_R.\n"); - goto out; + goto free_challenge_binhex; } pr_debug("[server] Got CHAP_R=%s\n", chap_r); @@ -254,15 +254,14 @@ static int chap_server_compute_md5( tfm = crypto_alloc_shash("md5", 0, 0); if (IS_ERR(tfm)) { - tfm = NULL; pr_err("Unable to allocate struct crypto_shash\n"); - goto out; + goto free_challenge_binhex; } desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL); if (!desc) { pr_err("Unable to allocate struct shash_desc\n"); - goto out; + goto free_shash; } desc->tfm = tfm; @@ -271,27 +270,27 @@ static int chap_server_compute_md5( ret = crypto_shash_init(desc); if (ret < 0) { pr_err("crypto_shash_init() failed\n"); - goto out; + goto free_desc; } ret = crypto_shash_update(desc, >id, 1); if (ret < 0) { pr_err("crypto_shash_update() failed for id\n"); - goto out; + goto
[PATCH 0/8] target-iSCSI: Adjustments for several function implementations
From: Markus ElfringDate: Tue, 12 Dec 2017 22:22:11 +0100 Some update suggestions were taken into account from static source code analysis. Markus Elfring (8): Less function calls in chap_server_compute_md5() after error detection Move resetting of seven variables in chap_server_compute_md5() Delete 36 error messages for a failed memory allocation Delete an unnecessary variable initialisation in iscsit_allocate_ooo_cmdsn() Delete an unnecessary variable initialisation in iscsi_copy_param_list() Delete an unnecessary variable initialisation in iscsi_create_default_params() Delete an unnecessary variable initialisation in iscsi_set_default_param() Improve 16 size determinations drivers/target/iscsi/iscsi_target.c | 2 - drivers/target/iscsi/iscsi_target_auth.c | 110 +++--- drivers/target/iscsi/iscsi_target_datain_values.c | 6 +- drivers/target/iscsi/iscsi_target_erl1.c | 14 +-- drivers/target/iscsi/iscsi_target_erl2.c | 8 +- drivers/target/iscsi/iscsi_target_login.c | 29 ++ drivers/target/iscsi/iscsi_target_nego.c | 4 +- drivers/target/iscsi/iscsi_target_parameters.c| 58 +--- drivers/target/iscsi/iscsi_target_seq_pdu_list.c | 34 +++ drivers/target/iscsi/iscsi_target_tpg.c | 13 +-- drivers/target/iscsi/iscsi_target_util.c | 11 +-- 11 files changed, 118 insertions(+), 171 deletions(-) -- 2.15.1
Re: [PATCH 4/4] scsi_debug: add resp_write_scat function
On 2017-12-12 02:40 PM, Bart Van Assche wrote: On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote: -static const struct opcode_info_t vl_iarr[1] = { /* VARIABLE LENGTH */ +static const struct opcode_info_t vl_iarr[2] = { /* VARIABLE LENGTH */ Please leave out the array size and let the compiler determine the array size. I like the "belts and braces" approach. Especially if offsetting an array element by one position would silently destroy the parser (which is not the case with vl_iarr but is the case with a few other arrays). {0, 0x7f, 0xb, F_SA_HIGH | F_D_OUT | FF_DIRECT_IO, resp_write_dt0, - NULL, {32, 0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0xb, 0xfa, - 0, 0xff, 0xff, 0xff, 0xff} },/* WRITE(32) */ + NULL, {32, 0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0xb, 0xfa, + 0, 0xff, 0xff, 0xff, 0xff} }, /* WRITE(32) */ Shouldn't this change have been included in the patch that fixes the group number mask? git add --patch was used to break up a monolithic patch into 4 parts. However in the above case it refused to "split" the group_number part (shown) from the renaming of service actions (not shown above). At that point I swear at git and move on. If folks think that hours are spent testing a kernel with 1/4 and 2/4 applied while 3/4 and 4/4 have not been applied then I think they are dreaming. IMO The split up is eye candy for reviewers. static const struct opcode_info_t maint_in_iarr[2] = { @@ -518,9 +523,10 @@ static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = { {6, 0x1, 0, 0xf, 0xf7, 0xc7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} }, {1, 0x9e, 0x10, F_SA_LOW | F_D_IN, resp_readcap16, sa_in_iarr, {16, 0x10, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, -0xff, 0xff, 0xff, 0x1, 0xc7} },/* READ CAPACITY(16) */ - {0, 0, 0, F_INV_OP | FF_RESPOND, NULL, NULL, /* SA OUT */ - {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} }, +0xff, 0xff, 0xff, 0x1, 0xc7} },/* SA_IN(16), READ CAPACITY(16) */ Shouldn't the above change be folded into one of the other patches? I considered the patch adding resp_write_scat to the parsing table and the splitting SDEB_I_SERV_ACT_IN/OUT to its 12 and 16 byte cdb components as very closely related, so I put them together. @@ -529,9 +535,9 @@ static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = { {0, 0x2f, 0, F_D_OUT_MAYBE | FF_DIRECT_IO, NULL, NULL, /* VERIFY(10) */ {10, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xc7, 0, 0, 0, 0, 0, 0} }, - {1, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0, - vl_iarr, {32, 0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0x9, 0xfe, 0, - 0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */ + {2, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0, + vl_iarr, {32, 0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x9, 0xfe, 0, + 0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */ Have you considered to use ARRAY_SIZE(vl_iarr) instead of hard-coding the array size? No. It could be done with the advantage of making the code a bit safer for someone who changes it, but it obfuscates the number elements in the bucket list (array) making it harder for the reader of the code (maybe). + if (cmd[0] == VARIABLE_LENGTH_CMD) { + is_16 = false; + wrprotect = (cmd[10] >> 5) & 0x7; + lbdof = get_unaligned_be16(cmd + 12); + num_lrd = get_unaligned_be16(cmd + 16); + bt_len = get_unaligned_be32(cmd + 28); + check_prot = false; + } else {/* that leaves WRITE SCATTERED(16) */ + is_16 = true; + wrprotect = (cmd[2] >> 5) & 0x7; + lbdof = get_unaligned_be16(cmd + 4); + num_lrd = get_unaligned_be16(cmd + 8); + bt_len = get_unaligned_be32(cmd + 10); + check_prot = true; + } It's not clear to me why check_prot is set to false for WRITE SCATTERED(32) and set to true for WRITE SCATTERED(16)? Me neither. I was just following what the code for WRITE(16 and 32) does. My guess is that the code in question is written by Martin P. who is effectively co-maintainer of this driver having written all the DIF and DIX stuff. Martin ... ? Doug Gilbert
[PATCH] scsi: aacraid: fix io drop during the reset
"FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to indicate command timeout, using the same flag in reset handler causes the command to timeout and the IO's were droped. defined a new flag "FIB_CONTEXT_FLAG_EH_RESET" to make sure IO is properly handled in eh_reset handler. Signed-off-by: Prasad B Munirathnam--- drivers/scsi/aacraid/aacraid.h | 1 + drivers/scsi/aacraid/linit.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h index 403a639..b0b290f 100644 --- a/drivers/scsi/aacraid/aacraid.h +++ b/drivers/scsi/aacraid/aacraid.h @@ -1724,6 +1724,7 @@ struct aac_dev #define FIB_CONTEXT_FLAG_NATIVE_HBA(0x0010) #define FIB_CONTEXT_FLAG_NATIVE_HBA_TMF(0x0020) #define FIB_CONTEXT_FLAG_SCSI_CMD (0x0040) +#define FIB_CONTEXT_FLAG_EH_RESET (0x0080) /* * Define the command values diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index 3677bef..03af5bd 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -1034,7 +1034,7 @@ static int aac_eh_bus_reset(struct scsi_cmnd* cmd) info = >hba_map[bus][cid]; if (bus >= AAC_MAX_BUSES || cid >= AAC_MAX_TARGETS || info->devtype != AAC_DEVTYPE_NATIVE_RAW) { - fib->flags |= FIB_CONTEXT_FLAG_TIMED_OUT; + fib->flags |= FIB_CONTEXT_FLAG_EH_RESET; cmd->SCp.phase = AAC_OWNER_ERROR_HANDLER; } } -- 2.7.4
Re: [PATCH 4/4] scsi_debug: add resp_write_scat function
On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote: > -static const struct opcode_info_t vl_iarr[1] = { /* VARIABLE LENGTH */ > +static const struct opcode_info_t vl_iarr[2] = { /* VARIABLE LENGTH */ Please leave out the array size and let the compiler determine the array size. > {0, 0x7f, 0xb, F_SA_HIGH | F_D_OUT | FF_DIRECT_IO, resp_write_dt0, > - NULL, {32, 0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0xb, 0xfa, > -0, 0xff, 0xff, 0xff, 0xff} },/* WRITE(32) */ > + NULL, {32, 0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0xb, 0xfa, > + 0, 0xff, 0xff, 0xff, 0xff} }, /* WRITE(32) */ Shouldn't this change have been included in the patch that fixes the group number mask? > static const struct opcode_info_t maint_in_iarr[2] = { > @@ -518,9 +523,10 @@ static const struct opcode_info_t > opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = { > {6, 0x1, 0, 0xf, 0xf7, 0xc7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} }, > {1, 0x9e, 0x10, F_SA_LOW | F_D_IN, resp_readcap16, sa_in_iarr, > {16, 0x10, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, > - 0xff, 0xff, 0xff, 0x1, 0xc7} },/* READ CAPACITY(16) */ > - {0, 0, 0, F_INV_OP | FF_RESPOND, NULL, NULL, /* SA OUT */ > - {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} }, > + 0xff, 0xff, 0xff, 0x1, 0xc7} },/* SA_IN(16), READ CAPACITY(16) */ Shouldn't the above change be folded into one of the other patches? > @@ -529,9 +535,9 @@ static const struct opcode_info_t > opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = { > {0, 0x2f, 0, F_D_OUT_MAYBE | FF_DIRECT_IO, NULL, NULL, /* VERIFY(10) */ > {10, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xc7, >0, 0, 0, 0, 0, 0} }, > - {1, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0, > - vl_iarr, {32, 0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0x9, 0xfe, 0, > - 0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */ > + {2, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0, > + vl_iarr, {32, 0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x9, 0xfe, 0, > + 0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */ Have you considered to use ARRAY_SIZE(vl_iarr) instead of hard-coding the array size? > + if (cmd[0] == VARIABLE_LENGTH_CMD) { > + is_16 = false; > + wrprotect = (cmd[10] >> 5) & 0x7; > + lbdof = get_unaligned_be16(cmd + 12); > + num_lrd = get_unaligned_be16(cmd + 16); > + bt_len = get_unaligned_be32(cmd + 28); > + check_prot = false; > + } else {/* that leaves WRITE SCATTERED(16) */ > + is_16 = true; > + wrprotect = (cmd[2] >> 5) & 0x7; > + lbdof = get_unaligned_be16(cmd + 4); > + num_lrd = get_unaligned_be16(cmd + 8); > + bt_len = get_unaligned_be32(cmd + 10); > + check_prot = true; > + } It's not clear to me why check_prot is set to false for WRITE SCATTERED(32) and set to true for WRITE SCATTERED(16)? Bart.
Re: [PATCH 2/4] scsi_debug: fix group_number mask 0x1f->0x3f
On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote: > Various cdb masks incorrectly assumed the GROUP NUMBER field was > 5 bits long. It is actually 6 bits long. Correct. Reviewed-by: Bart Van Assche
Re: [PATCH 3/4] scsi_debug: expand do_device_access to take sg offset
On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote: > WRITE SCATTERED needs to take several "bites" out of the data-out buffer. > Expand the do_device_access() function to take a sg_skip argument. Reviewed-by: Bart Van Assche
Re: [PATCH 1/4] scsi_debug: tab, kstrto changes, requested by checkpatch.pl
On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote: > Some of my development tools tend to add spaces (my preference) rather > than tabs (kernel convention). Running unexpand to clean these spaces > up found more of them than checkpatch.pl did. Then checkpatch.pl > complained about other style violations in those newly tabbed lines. Reviewed-by: Bart Van Assche
Re: [PATCH] scsi_dh_alua: skip RTPG for devices only supporting active/optimized
On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote: > On 12/12/2017 01:00 AM, Bart Van Assche wrote: > > On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote: > > > @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, > > > struct alua_port_group *pg) > > > retval = submit_rtpg(sdev, buff, bufflen, _hdr, pg->flags); > > > > > > if (retval) { > > > + /* > > > + * If the target only supports active/optimized there's > > > + * not much we can do; it's not that we can switch paths > > > + * or somesuch. > > > + * So ignore any errors to avoid spurious failures during > > > + * path failover. > > > + */ > > > + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) { > > > + sdev_printk(KERN_INFO, sdev, > > > + "%s: ignoring rtpg result %d\n", > > > + ALUA_DH_NAME, retval); > > > + kfree(buff); > > > + return SCSI_DH_OK; > > > + } > > > > Hello Hannes, > > > > Sorry but this change looks weird to me. If an RTPG fails, shouldn't > > alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target > > supports? Are you perhaps trying to implement a work-around for an array > > that does not respond to RTPG during path transitions? > > > > Yes, precisely. > > Thing is: if an array is only supporting active/optimized the entire > device-handler stuff is basically a no-op as we can't switch paths anyway. > So it doesn't really matter if the RTPG fails; in fact, we could just > short-circuit the entire logic. I did not do that to allow for a state > modification (ie arrays _might_ decide to announce additional states > eventually, and only starting off with active/optimized as an initial > state set). > > But if we return SCSI_DH_IO here the multipath logic will not attempt to > switch paths, and failover will not work. Hello Hannes, I would appreciate it if it would be mentioned more clearly in the comment above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code is a workaround for non-standard array behavior. I'm afraid that otherwise people who will read that code will be puzzled about why that code has been added. Thanks, Bart.
[PATCH v3] Use blist_flags_t consistently
Use the type blist_flags_t for all variables that represent blacklist flags. Additionally, suppress recently introduced sparse warnings related to blacklist flags. Fixes: commit c6b54164508a ("scsi: Use 'blist_flags_t' for scsi_devinfo flags") Signed-off-by: Bart Van AsscheReviewed-by: Christoph Hellwig Cc: Hannes Reinecke Cc: Johannes Thumshirn --- Changes compared to v2: - Changed (__force unsigned int)sdev->sdev_bflags & BIT(i) into sdev->sdev_bflags & (__force blist_flags_t)BIT(i). Changes compared to v1: - Changed (__force u32) casts into (__force unsigned int). - Left out one cast, namely the one in front of scsi_default_dev_flags. drivers/scsi/scsi_devinfo.c | 6 ++ drivers/scsi/scsi_scan.c | 13 +++-- drivers/scsi/scsi_sysfs.c | 5 +++-- drivers/scsi/scsi_transport_spi.c | 12 +++- 4 files changed, 19 insertions(+), 17 deletions(-) diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c index a15eb4d15114..700988e2769c 100644 --- a/drivers/scsi/scsi_devinfo.c +++ b/drivers/scsi/scsi_devinfo.c @@ -382,10 +382,8 @@ int scsi_dev_info_list_add_keyed(int compatible, char *vendor, char *model, model, compatible); if (strflags) - devinfo->flags = simple_strtoul(strflags, NULL, 0); - else - devinfo->flags = flags; - + flags = (__force blist_flags_t)simple_strtoul(strflags, NULL, 0); + devinfo->flags = flags; devinfo->compatible = compatible; if (compatible) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index be5e919db0e8..0880d975eed3 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -770,7 +770,7 @@ static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result, * SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized **/ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result, - int *bflags, int async) + blist_flags_t *bflags, int async) { int ret; @@ -1049,14 +1049,15 @@ static unsigned char *scsi_inq_str(unsigned char *buf, unsigned char *inq, * - SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized **/ static int scsi_probe_and_add_lun(struct scsi_target *starget, - u64 lun, int *bflagsp, + u64 lun, blist_flags_t *bflagsp, struct scsi_device **sdevp, enum scsi_scan_mode rescan, void *hostdata) { struct scsi_device *sdev; unsigned char *result; - int bflags, res = SCSI_SCAN_NO_RESPONSE, result_len = 256; + blist_flags_t bflags; + int res = SCSI_SCAN_NO_RESPONSE, result_len = 256; struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); /* @@ -1201,7 +1202,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget, * Modifies sdevscan->lun. **/ static void scsi_sequential_lun_scan(struct scsi_target *starget, -int bflags, int scsi_level, +blist_flags_t bflags, int scsi_level, enum scsi_scan_mode rescan) { uint max_dev_lun; @@ -1292,7 +1293,7 @@ static void scsi_sequential_lun_scan(struct scsi_target *starget, * 0: scan completed (or no memory, so further scanning is futile) * 1: could not scan with REPORT LUN **/ -static int scsi_report_lun_scan(struct scsi_target *starget, int bflags, +static int scsi_report_lun_scan(struct scsi_target *starget, blist_flags_t bflags, enum scsi_scan_mode rescan) { unsigned char scsi_cmd[MAX_COMMAND_SIZE]; @@ -1538,7 +1539,7 @@ static void __scsi_scan_target(struct device *parent, unsigned int channel, unsigned int id, u64 lun, enum scsi_scan_mode rescan) { struct Scsi_Host *shost = dev_to_shost(parent); - int bflags = 0; + blist_flags_t bflags = 0; int res; struct scsi_target *starget; diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index cbc0fe2c5485..c15ca457dd23 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -967,7 +967,8 @@ sdev_show_wwid(struct device *dev, struct device_attribute *attr, } static DEVICE_ATTR(wwid, S_IRUGO, sdev_show_wwid, NULL); -#define BLIST_FLAG_NAME(name) [ilog2(BLIST_##name)] = #name +#define BLIST_FLAG_NAME(name) \ + [ilog2((__force unsigned int)BLIST_##name)] = #name static const char *const sdev_bflags_name[] = { #include "scsi_devinfo_tbl.c" }; @@ -984,7 +985,7 @@ sdev_show_blacklist(struct device *dev, struct device_attribute *attr, for (i = 0; i <
Re: [GIT PULL] SCSI fixes for 4.15-rc3
Linus, > The commit you point to _is_ the probnlem. It does: > > struct bfad_im_port_s *im_port = shost->hostdata[0]; > > which is utter bullshit crap. Sorry, wrong commit: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=fixes=48d83282db077f93b2cf40de120f4d6f29eb293b -- Martin K. Petersen Oracle Linux Engineering
Re: [GIT PULL] SCSI fixes for 4.15-rc3
On Tue, 2017-12-12 at 09:32 -0800, Linus Torvalds wrote: > On Tue, Dec 12, 2017 at 9:22 AM, Martin K. Petersen >wrote: > > > > > > Arnd and Johannes fixed this up right away: > > The commit you point to _is_ the probnlem. It does: > > struct bfad_im_port_s *im_port = shost->hostdata[0]; > > which is utter bullshit crap. > > Notice? It's assigning a pointer (im_port), from an integer value > ("hostdata[0]" is "unsigned long"). > > The code is garbage. he means this one: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?h=fixes=48d83282db077f93b2cf40de120f4d6f29eb293b Which properly encapsulates the reference in a function which does the correct conversions. James
Re: [GIT PULL] SCSI fixes for 4.15-rc3
On Tue, Dec 12, 2017 at 9:22 AM, Martin K. Petersenwrote: > > Arnd and Johannes fixed this up right away: The commit you point to _is_ the probnlem. It does: struct bfad_im_port_s *im_port = shost->hostdata[0]; which is utter bullshit crap. Notice? It's assigning a pointer (im_port), from an integer value ("hostdata[0]" is "unsigned long"). The code is garbage. Linus
Re: [GIT PULL] SCSI fixes for 4.15-rc3
On Tue, 2017-12-12 at 12:22 -0500, Martin K. Petersen wrote: > Linus, > > > > > This is utter shite, and doesn't even compile cleanly. > > > > Sure, it's "just" a warning, and the code works. But no, I'm not > > pulling crap like this. If you save a pointer in an integer > > "hostdata[0]" field, then you damn well do the proper casts or > > helper > > functions or something, you don't just ignore the compiler when it > > very reasonably warns about it. > > > > What the hell is going on? Nobody compiled this stuff at all? Or > > nobody cares about new build warnings? > > Arnd and Johannes fixed this up right away: > > https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/? > h=fixes=45349821ab3a8d378b8f37e52c6fe1aa1b870c47 Yes, I have that in the current testing batch, how about I push the combined thing on wednesday. James
Re: [GIT PULL] SCSI fixes for 4.15-rc3
Linus, > This is utter shite, and doesn't even compile cleanly. > > Sure, it's "just" a warning, and the code works. But no, I'm not > pulling crap like this. If you save a pointer in an integer > "hostdata[0]" field, then you damn well do the proper casts or helper > functions or something, you don't just ignore the compiler when it > very reasonably warns about it. > > What the hell is going on? Nobody compiled this stuff at all? Or > nobody cares about new build warnings? Arnd and Johannes fixed this up right away: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=fixes=45349821ab3a8d378b8f37e52c6fe1aa1b870c47 -- Martin K. Petersen Oracle Linux Engineering
Re: [GIT PULL] SCSI fixes for 4.15-rc3
On Tue, Dec 12, 2017 at 8:21 AM, James Bottomleywrote: > > The most important one is the bfa fix because it's easy to oops the > kernel with this driver, a regression in the new timespec conversion in > aacraid and a regression in the Fibre Channel ELS handling patch. The > other three are a theoretical problem with termination in the > vendor/host matching code and a use after free in lpfc. No, this is just complete garbage. > Johannes Thumshirn (1): > scsi: bfa: fix access to bfad_im_port_s This is utter shite, and doesn't even compile cleanly. Sure, it's "just" a warning, and the code works. But no, I'm not pulling crap like this. If you save a pointer in an integer "hostdata[0]" field, then you damn well do the proper casts or helper functions or something, you don't just ignore the compiler when it very reasonably warns about it. What the hell is going on? Nobody compiled this stuff at all? Or nobody cares about new build warnings? That is not acceptable. Linus
Re: [PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()
On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote: > From: Hannes Reinecke> > When a device is probed asynchronously del_gendisk() might be called > before the async probing was run, causing del_gendisk() to crash > due to uninitialized sysfs objects. > > Signed-off-by: Hannes Reinecke > --- > block/genhd.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/block/genhd.c b/block/genhd.c > index c2223f1..cc40d95 100644 > --- a/block/genhd.c > +++ b/block/genhd.c > @@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk) > struct disk_part_iter piter; > struct hd_struct *part; > > + if (!(disk->flags & GENHD_FL_UP)) > + return; > + > blk_integrity_del(disk); > disk_del_events(disk); Hello Hannes, Thank you for having published your approach for increasing disk probing concurrency. Your approach looks interesting to me. However, I don't think that patches 1/4..3/4 are sufficient to avoid races between e.g. device_add_disk() and del_gendisk(). As far as I know no locks are held around the device_add_disk() and del_gendisk() calls. Does that mean that del_gendisk() can call e.g. blk_integrity_del() before device_add_disk() has called blk_integrity_add()? Bart.
Re: [PATCH v2] Use blist_flags_t consistently
On Tue, 2017-12-12 at 08:52 +0100, Christoph Hellwig wrote: > On Mon, Dec 11, 2017 at 04:08:03PM -0800, Bart Van Assche wrote: > > + if (!((__force unsigned int)sdev->sdev_bflags & BIT(i))) > > I'd case the right side argument to __force blist_flags_t here for > purely esthetic reasons. > > Except for that this looks fine to me: > > Reviewed-by: Christoph HellwigHello Christoph, Thanks for the review. I will make the proposed change and repost this patch. Bart.
[GIT PULL] SCSI fixes for 4.15-rc3
The most important one is the bfa fix because it's easy to oops the kernel with this driver, a regression in the new timespec conversion in aacraid and a regression in the Fibre Channel ELS handling patch. The other three are a theoretical problem with termination in the vendor/host matching code and a use after free in lpfc. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes The short changelog is: Arnd Bergmann (1): scsi: aacraid: address UBSAN warning regression Dan Carpenter (1): scsi: lpfc: Use after free in lpfc_rq_buf_free() Johannes Thumshirn (1): scsi: bfa: fix access to bfad_im_port_s Martin Wilck (3): scsi: scsi_devinfo: cleanly zero-pad devinfo strings scsi: scsi_devinfo: handle non-terminated strings scsi: libfc: fix ELS request handling With the diffstat: drivers/scsi/aacraid/commsup.c | 8 ++-- drivers/scsi/bfa/bfad_bsg.c| 6 -- drivers/scsi/libfc/fc_lport.c | 4 drivers/scsi/lpfc/lpfc_mem.c | 2 +- drivers/scsi/scsi_devinfo.c| 27 ++- 5 files changed, 25 insertions(+), 22 deletions(-) And full diff below James --- diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index bec9f3193f60..80a8cb26cdea 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -2482,8 +2482,8 @@ int aac_command_thread(void *data) /* Synchronize our watches */ if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > now.tv_nsec) && (now.tv_nsec > (NSEC_PER_SEC / HZ))) - difference = (((NSEC_PER_SEC - now.tv_nsec) * HZ) - + NSEC_PER_SEC / 2) / NSEC_PER_SEC; + difference = HZ + HZ / 2 - +now.tv_nsec / (NSEC_PER_SEC / HZ); else { if (now.tv_nsec > NSEC_PER_SEC / 2) ++now.tv_sec; @@ -2507,6 +2507,10 @@ int aac_command_thread(void *data) if (kthread_should_stop()) break; + /* +* we probably want usleep_range() here instead of the +* jiffies computation +*/ schedule_timeout(difference); if (kthread_should_stop()) diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c index 72ca2a2e08e2..09ef68c8225f 100644 --- a/drivers/scsi/bfa/bfad_bsg.c +++ b/drivers/scsi/bfa/bfad_bsg.c @@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job) struct fc_bsg_request *bsg_request = job->request; struct fc_bsg_reply *bsg_reply = job->reply; uint32_t vendor_cmd = bsg_request->rqst_data.h_vendor.vendor_cmd[0]; - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job)); + struct Scsi_Host *shost = fc_bsg_to_shost(job); + struct bfad_im_port_s *im_port = shost->hostdata[0]; struct bfad_s *bfad = im_port->bfad; void *payload_kbuf; int rc = -EINVAL; @@ -3350,7 +3351,8 @@ int bfad_im_bsg_els_ct_request(struct bsg_job *job) { struct bfa_bsg_data *bsg_data; - struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job)); + struct Scsi_Host *shost = fc_bsg_to_shost(job); + struct bfad_im_port_s *im_port = shost->hostdata[0]; struct bfad_s *bfad = im_port->bfad; bfa_bsg_fcpt_t *bsg_fcpt; struct bfad_fcxp*drv_fcxp; diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c index 5da46052e179..21be672679fb 100644 --- a/drivers/scsi/libfc/fc_lport.c +++ b/drivers/scsi/libfc/fc_lport.c @@ -904,10 +904,14 @@ static void fc_lport_recv_els_req(struct fc_lport *lport, case ELS_FLOGI: if (!lport->point_to_multipoint) fc_lport_recv_flogi_req(lport, fp); + else + fc_rport_recv_req(lport, fp); break; case ELS_LOGO: if (fc_frame_sid(fp) == FC_FID_FLOGI) fc_lport_recv_logo_req(lport, fp); + else + fc_rport_recv_req(lport, fp); break; case ELS_RSCN: lport->tt.disc_recv_req(lport, fp); diff --git a/drivers/scsi/lpfc/lpfc_mem.c b/drivers/scsi/lpfc/lpfc_mem.c index 56faeb049b4a..87c08ff37ddd 100644 --- a/drivers/scsi/lpfc/lpfc_mem.c +++ b/drivers/scsi/lpfc/lpfc_mem.c @@ -753,12 +753,12 @@ lpfc_rq_buf_free(struct lpfc_hba *phba, struct lpfc_dmabuf *mp) drqe.address_hi = putPaddrHigh(rqb_entry->dbuf.phys); rc = lpfc_sli4_rq_put(rqb_entry->hrq, rqb_entry->drq, , ); if (rc < 0) { -
[PATCH] arcmsr: Fix possible sleep-in-atomic bugs in arcmsr_queue_command
From: Jia-Ju BaiThe driver may sleep under a spinlock, and the function call paths are: arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaA_stop_bgrb arcmsr_hbaA_wait_msgint_ready msleep --> may sleep arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaB_stop_bgrb arcmsr_hbaB_wait_msgint_ready msleep --> may sleep arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaC_stop_bgrb arcmsr_hbaC_wait_msgint_ready msleep --> may sleep arcmsr_queue_command(acquire the spinlock) arcmsr_queue_command_lck arcmsr_handle_virtual_command arcmsr_iop_message_xfer arcmsr_iop_parking arcmsr_stop_adapter_bgrb arcmsr_hbaD_stop_bgrb arcmsr_hbaD_wait_msgint_ready msleep --> may sleep To fix them, msleep is replaced with mdelay. These bugs are found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai --- drivers/scsi/arcmsr/arcmsr_hba.c |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index af032c4..31d94bd 100644 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -347,7 +347,7 @@ static uint8_t arcmsr_hbaA_wait_msgint_ready(struct AdapterControlBlock *acb) >outbound_intstatus); return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; @@ -367,7 +367,7 @@ static uint8_t arcmsr_hbaB_wait_msgint_ready(struct AdapterControlBlock *acb) reg->drv2iop_doorbell); return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; @@ -385,7 +385,7 @@ static uint8_t arcmsr_hbaC_wait_msgint_ready(struct AdapterControlBlock *pACB) >outbound_doorbell_clear); /*clear interrupt*/ return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; @@ -403,7 +403,7 @@ static bool arcmsr_hbaD_wait_msgint_ready(struct AdapterControlBlock *pACB) reg->outbound_doorbell); return true; } - msleep(10); + mdelay(10); } /* max 20 seconds */ return false; } -- 1.7.9.5
[PATCH 2/2] target: cxgbit: Combine substrings for 11 messages
From: Markus ElfringDate: Tue, 12 Dec 2017 12:57:47 +0100 The script "checkpatch.pl" pointed information out like the following. WARNING: quoted string split across lines Thus fix the affected source code places. Signed-off-by: Markus Elfring --- drivers/target/iscsi/cxgbit/cxgbit_target.c | 50 + 1 file changed, 16 insertions(+), 34 deletions(-) diff --git a/drivers/target/iscsi/cxgbit/cxgbit_target.c b/drivers/target/iscsi/cxgbit/cxgbit_target.c index 9163152b6d0b..e4a0dbc8fb9d 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_target.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_target.c @@ -881,9 +881,7 @@ cxgbit_handle_immediate_data(struct iscsi_cmd *cmd, struct iscsi_scsi_req *hdr, if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) { pr_err("ImmediateData CRC32C DataDigest error\n"); if (!conn->sess->sess_ops->ErrorRecoveryLevel) { - pr_err("Unable to recover from" - " Immediate Data digest failure while" - " in ERL=0.\n"); + pr_err("Unable to recover from Immediate Data digest failure while in ERL=0.\n"); iscsit_reject_cmd(cmd, ISCSI_REASON_DATA_DIGEST_ERROR, (unsigned char *)hdr); return IMMEDIATE_DATA_CANNOT_RECOVER; @@ -1054,19 +1052,14 @@ static int cxgbit_handle_iscsi_dataout(struct cxgbit_sock *csk) } if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) { - pr_err("ITT: 0x%08x, Offset: %u, Length: %u," - " DataSN: 0x%08x\n", - hdr->itt, hdr->offset, data_len, - hdr->datasn); - + pr_err("ITT: 0x%08x, Offset: %u, Length: %u, DataSN: 0x%08x\n", + hdr->itt, hdr->offset, data_len, hdr->datasn); dcrc_err = true; goto check_payload; } - pr_debug("DataOut data_len: %u, " - "write_data_done: %u, data_length: %u\n", - data_len, cmd->write_data_done, - cmd->se_cmd.data_length); + pr_debug("DataOut data_len: %u, write_data_done: %u, data_length: %u\n", +data_len, cmd->write_data_done, cmd->se_cmd.data_length); if (!(pdu_cb->flags & PDUCBF_RX_DATA_DDPD)) { u32 skip = data_offset % PAGE_SIZE; @@ -1102,9 +1095,7 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, struct iscsi_cmd *cmd) if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) { if (!conn->sess->sess_ops->ErrorRecoveryLevel) { - pr_err("Unable to recover from" - " NOPOUT Ping DataCRC failure while in" - " ERL=0.\n"); + pr_err("Unable to recover from NOPOUT Ping DataCRC failure while in ERL=0.\n"); ret = -1; goto out; } else { @@ -1112,9 +1103,8 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, struct iscsi_cmd *cmd) * drop this PDU and let the * initiator plug the CmdSN gap. */ - pr_info("Dropping NOPOUT" - " Command CmdSN: 0x%08x due to" - " DataCRC error.\n", hdr->cmdsn); + pr_info("Dropping NOPOUT Command CmdSN: 0x%08x due to DataCRC error.\n", + hdr->cmdsn); ret = 0; goto out; } @@ -1139,9 +1129,7 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, struct iscsi_cmd *cmd) */ cmd->buf_ptr = ping_data; cmd->buf_ptr_size = payload_length; - - pr_debug("Got %u bytes of NOPOUT ping" - " data.\n", payload_length); + pr_debug("Got %u bytes of NOPOUT ping data.\n", payload_length); pr_debug("Ping Data: \"%s\"\n", ping_data); } @@ -1168,18 +1156,15 @@ cxgbit_handle_text_cmd(struct cxgbit_sock *csk, struct iscsi_cmd *cmd) if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) { if (!conn->sess->sess_ops->ErrorRecoveryLevel) { - pr_err("Unable to recover from" - " Text Data digest failure while in" - " ERL=0.\n"); + pr_err("Unable to recover from Text Data digest failure while in ERL=0.\n"); goto reject; } else { /* * drop this PDU and let the * initiator plug the CmdSN gap. */ - pr_info("Dropping Text" -
[PATCH 1/2] target: cxgbit: Delete an error message for a failed memory allocation in two functions
From: Markus ElfringDate: Tue, 12 Dec 2017 11:34:51 +0100 Omit an extra message for a memory allocation failure in these functions. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/target/iscsi/cxgbit/cxgbit_target.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/target/iscsi/cxgbit/cxgbit_target.c b/drivers/target/iscsi/cxgbit/cxgbit_target.c index 514986b57c2d..9163152b6d0b 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_target.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_target.c @@ -1126,8 +1126,6 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, struct iscsi_cmd *cmd) if (payload_length && hdr->ttt == cpu_to_be32(0x)) { ping_data = kzalloc(payload_length + 1, GFP_KERNEL); if (!ping_data) { - pr_err("Unable to allocate memory for" - " NOPOUT ping data.\n"); ret = -1; goto out; } @@ -1188,11 +1186,9 @@ cxgbit_handle_text_cmd(struct cxgbit_sock *csk, struct iscsi_cmd *cmd) if (payload_length) { text_in = kzalloc(payload_length, GFP_KERNEL); - if (!text_in) { - pr_err("Unable to allocate text_in of payload_length: %u\n", - payload_length); + if (!text_in) return -ENOMEM; - } + skb_copy_bits(csk->skb, pdu_cb->doffset, text_in, payload_length); -- 2.15.1
[PATCH 0/2] target/iscsi/cxgbit: Adjustments for seven function implementations
From: Markus ElfringDate: Tue, 12 Dec 2017 13:12:11 +0100 Two update suggestions were taken into account from static source code analysis. Markus Elfring (2): Delete an error message for a failed memory allocation in two functions Combine substrings for 11 messages drivers/target/iscsi/cxgbit/cxgbit_target.c | 58 + 1 file changed, 18 insertions(+), 40 deletions(-) -- 2.15.1
[PATCH 4/4] scsi: arcmsr: simplify all arcmsr_hbaX_get_config routine by call a new get_adapter_config function
From: Ching Huangsimplify all arcmsr_hbaX_get_config routine by call a new get_adapter_config function Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index b7a56e8..95c9f08 100755 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -2956,75 +2956,66 @@ static int arcmsr_queue_command_lck(struct scsi_cmnd *cmd, static DEF_SCSI_QCMD(arcmsr_queue_command) -static bool arcmsr_hbaA_get_config(struct AdapterControlBlock *acb) +static void arcmsr_get_adapter_config(struct AdapterControlBlock *pACB, uint32_t *rwbuffer) { - struct MessageUnit_A __iomem *reg = acb->pmuA; - char *acb_firm_model = acb->firm_model; - char *acb_firm_version = acb->firm_version; - char *acb_device_map = acb->device_map; - char __iomem *iop_firm_model = (char __iomem *)(>message_rwbuffer[15]); - char __iomem *iop_firm_version = (char __iomem *)(>message_rwbuffer[17]); - char __iomem *iop_device_map = (char __iomem *)(>message_rwbuffer[21]); int count; - arcmsr_wait_firmware_ready(acb); - writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0); - if (!arcmsr_hbaA_wait_msgint_ready(acb)) { - printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \ - miscellaneous data' timeout \n", acb->host->host_no); - return false; - } - count = 8; - while (count){ - *acb_firm_model = readb(iop_firm_model); + uint32_t *acb_firm_model = (uint32_t *)pACB->firm_model; + uint32_t *acb_firm_version = (uint32_t *)pACB->firm_version; + uint32_t *acb_device_map = (uint32_t *)pACB->device_map; + uint32_t *firm_model = [15]; + uint32_t *firm_version = [17]; + uint32_t *device_map = [21]; + + count = 2; + while (count) { + *acb_firm_model = readl(firm_model); acb_firm_model++; - iop_firm_model++; + firm_model++; count--; } - - count = 16; - while (count){ - *acb_firm_version = readb(iop_firm_version); + count = 4; + while (count) { + *acb_firm_version = readl(firm_version); acb_firm_version++; - iop_firm_version++; + firm_version++; count--; } - - count=16; - while(count){ - *acb_device_map = readb(iop_device_map); + count = 4; + while (count) { + *acb_device_map = readl(device_map); acb_device_map++; - iop_device_map++; + device_map++; count--; } + pACB->signature = readl([0]); + pACB->firm_request_len = readl([1]); + pACB->firm_numbers_queue = readl([2]); + pACB->firm_sdram_size = readl([3]); + pACB->firm_hd_channels = readl([4]); + pACB->firm_cfg_version = readl([25]); pr_notice("Areca RAID Controller%d: Model %s, F/W %s\n", - acb->host->host_no, - acb->firm_model, - acb->firm_version); - acb->signature = readl(>message_rwbuffer[0]); - acb->firm_request_len = readl(>message_rwbuffer[1]); - acb->firm_numbers_queue = readl(>message_rwbuffer[2]); - acb->firm_sdram_size = readl(>message_rwbuffer[3]); - acb->firm_hd_channels = readl(>message_rwbuffer[4]); - acb->firm_cfg_version = readl(>message_rwbuffer[25]); /*firm_cfg_version,25,100-103*/ + pACB->host->host_no, + pACB->firm_model, + pACB->firm_version); +} + +static bool arcmsr_hbaA_get_config(struct AdapterControlBlock *acb) +{ + struct MessageUnit_A __iomem *reg = acb->pmuA; + + arcmsr_wait_firmware_ready(acb); + writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0); + if (!arcmsr_hbaA_wait_msgint_ready(acb)) { + printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \ + miscellaneous data' timeout \n", acb->host->host_no); + return false; + } + arcmsr_get_adapter_config(acb, reg->message_rwbuffer); return true; } static bool arcmsr_hbaB_get_config(struct AdapterControlBlock *acb) { struct MessageUnit_B *reg = acb->pmuB; - char *acb_firm_model = acb->firm_model; - char *acb_firm_version = acb->firm_version; - char *acb_device_map = acb->device_map; - char __iomem *iop_firm_model; - /*firm_model,15,60-67*/ - char __iomem *iop_firm_version; - /*firm_version,17,68-83*/ - char __iomem *iop_device_map; - /*firm_version,21,84-99*/ - int count; - - iop_firm_model = (char __iomem *)(>message_rwbuffer[15]); /*firm_model,15,60-67*/ - iop_firm_version = (char __iomem *)(>message_rwbuffer[17]);
[PATCH 3/4] scsi: arcmsr: simplify arcmsr_hbaE_get_config function
From: Ching Huangsimplify arcmsr_hbaE_get_config function Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index dfaea8f..b7a56e8 100755 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -3205,16 +3205,14 @@ static bool arcmsr_hbaE_get_config(struct AdapterControlBlock *pACB) struct MessageUnit_E __iomem *reg = pACB->pmuE; char __iomem *iop_firm_model = (char __iomem *)(>msgcode_rwbuffer[15]); char __iomem *iop_firm_version = (char __iomem *)(>msgcode_rwbuffer[17]); - uint32_t intmask_org, Index, firmware_state = 0, read_doorbell; + uint32_t intmask_org; int count; /* disable all outbound interrupt */ intmask_org = readl(>host_int_mask); /* disable outbound message0 int */ writel(intmask_org | ARCMSR_HBEMU_ALL_INTMASKENABLE, >host_int_mask); /* wait firmware ready */ - do { - firmware_state = readl(>outbound_msgaddr1); - } while ((firmware_state & ARCMSR_HBEMU_MESSAGE_FIRMWARE_OK) == 0); + arcmsr_wait_firmware_ready(pACB); mdelay(20); /* post "get config" instruction */ writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0); @@ -3222,17 +3220,7 @@ static bool arcmsr_hbaE_get_config(struct AdapterControlBlock *pACB) pACB->out_doorbell ^= ARCMSR_HBEMU_DRV2IOP_MESSAGE_CMD_DONE; writel(pACB->out_doorbell, >iobound_doorbell); /* wait message ready */ - for (Index = 0; Index < 2000; Index++) { - read_doorbell = readl(>iobound_doorbell); - if ((read_doorbell ^ pACB->in_doorbell) & ARCMSR_HBEMU_IOP2DRV_MESSAGE_CMD_DONE) { - writel(0, >host_int_status); - pACB->in_doorbell = read_doorbell; - break; - } - mdelay(1); - } /*max 1 seconds*/ - - if (Index >= 2000) { + if (!arcmsr_hbaE_wait_msgint_ready(pACB)) { pr_notice("arcmsr%d: wait get adapter firmware " "miscellaneous data timeout\n", pACB->host->host_no); return false;
[PATCH 2/4] scsi: arcmsr: waiting for iop firmware ready before issue get_config command to iop
From: Ching Huangwaiting for iop firmware ready before issue get_config command to iop for adapter type A and D Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index 9b587ca..dfaea8f 100755 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -2966,6 +2966,7 @@ static bool arcmsr_hbaA_get_config(struct AdapterControlBlock *acb) char __iomem *iop_firm_version = (char __iomem *)(>message_rwbuffer[17]); char __iomem *iop_device_map = (char __iomem *)(>message_rwbuffer[21]); int count; + arcmsr_wait_firmware_ready(acb); writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0); if (!arcmsr_hbaA_wait_msgint_ready(acb)) { printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \ @@ -3149,6 +3150,7 @@ static bool arcmsr_hbaD_get_config(struct AdapterControlBlock *acb) writel(ARCMSR_ARC1214_IOP2DRV_MESSAGE_CMD_DONE, acb->pmuD->outbound_doorbell);/*clear interrupt*/ } + arcmsr_wait_firmware_ready(acb); /* post "get config" instruction */ writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, reg->inbound_msgaddr0); /* wait message ready */
[PATCH 1/4] scsi: arcmsr: simplify arcmsr_hbaC_get_config function
From: Ching Huangsimplify arcmsr_hbaC_get_config function Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index e4258b6..9b587ca 100755 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -3082,7 +3082,7 @@ static bool arcmsr_hbaB_get_config(struct AdapterControlBlock *acb) static bool arcmsr_hbaC_get_config(struct AdapterControlBlock *pACB) { - uint32_t intmask_org, Index, firmware_state = 0; + uint32_t intmask_org; struct MessageUnit_C __iomem *reg = pACB->pmuC; char *acb_firm_model = pACB->firm_model; char *acb_firm_version = pACB->firm_version; @@ -3093,21 +3093,12 @@ static bool arcmsr_hbaC_get_config(struct AdapterControlBlock *pACB) intmask_org = readl(>host_int_mask); /* disable outbound message0 int */ writel(intmask_org|ARCMSR_HBCMU_ALL_INTMASKENABLE, >host_int_mask); /* wait firmware ready */ - do { - firmware_state = readl(>outbound_msgaddr1); - } while ((firmware_state & ARCMSR_HBCMU_MESSAGE_FIRMWARE_OK) == 0); + arcmsr_wait_firmware_ready(pACB); /* post "get config" instruction */ writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0); writel(ARCMSR_HBCMU_DRV2IOP_MESSAGE_CMD_DONE, >inbound_doorbell); /* wait message ready */ - for (Index = 0; Index < 2000; Index++) { - if (readl(>outbound_doorbell) & ARCMSR_HBCMU_IOP2DRV_MESSAGE_CMD_DONE) { - writel(ARCMSR_HBCMU_IOP2DRV_MESSAGE_CMD_DONE_DOORBELL_CLEAR, >outbound_doorbell_clear);/*clear interrupt*/ - break; - } - udelay(10); - } /*max 1 seconds*/ - if (Index >= 2000) { + if (!arcmsr_hbaC_wait_msgint_ready(pACB)) { printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \ miscellaneous data' timeout \n", pACB->host->host_no); return false;
[PATCH 0/4] scsi: arcmsr: simplify hba_get_config routine
From: Ching HuangThese patches are apply to Martin's 4.16/scsi-queue. patch 1: simplify arcmsr_hbaC_get_config function. patch 2: wait iop firmware ready before issue get_config command to iop. patch 3: simplify arcmsr_hbaE_get_config function. patch 4: simplify all arcmsr_hbaX_get_config routine by call a new get_adapter_config function ---
[mainline] rcu stalls on CPU when unbinding mpt3sas driver
Hi, Off late we are seeing cpu stalls messages while mpt3sas driver unbind on powerpc machine for both mainline and linux-next kernels Machine Type: Power 8 Bare-metal Kernel version: 4.15.0-rc2 config: attached. test: driver unbind $ echo -n 0001:03:00.0 > /sys/bus/pci/drivers/mpt3sas/unbind mpt3sas_cm0: removing handle(0x000a), sas_addr(0x500304801f080d00) mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(0) mpt3sas_cm0: removing enclosure level(0x), connector name( ) mpt3sas_cm0: removing handle(0x000b), sas_addr(0x500304801f080d01) mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(1) mpt3sas_cm0: removing enclosure level(0x), connector name( ) mpt3sas_cm0: removing handle(0x000c), sas_addr(0x500304801f080d02) mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(2) mpt3sas_cm0: removing enclosure level(0x), connector name( ) mpt3sas_cm0: removing handle(0x000d), sas_addr(0x500304801f080d03) mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(3) mpt3sas_cm0: removing enclosure level(0x), connector name( ) mpt3sas_cm0: removing handle(0x000e), sas_addr(0x500304801f080d04) mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(4) mpt3sas_cm0: removing enclosure level(0x), connector name( ) mpt3sas_cm0: removing handle(0x000f), sas_addr(0x500304801f080d3d) mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(12) mpt3sas_cm0: removing enclosure level(0x), connector name( ) sd 16:0:0:0: [sdb] Synchronizing SCSI cache sd 16:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 16:0:1:0: [sdc] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 16:0:1:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00 sd 16:0:2:0: [sdd] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 16:0:2:0: [sdd] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00 sd 16:0:3:0: [sde] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 16:0:3:0: [sde] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00 sd 16:0:4:0: [sdf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 16:0:4:0: [sdf] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00 few minutes after above command was executed, machine is flooded with rcu stalls messages. INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 86-... } 44191221 jiffies s: 3445 root: 0x20/. blocking rcu_node structures: l=1:80-95:0x40/. Task dump for CPU 86: sh R running task10384 18136 1 0x00042086 Call Trace: [c07792d47370] [c07933667200] 0xc07933667200 (unreliable) INFO: rcu_sched self-detected stall on CPU 86-: (50420459 ticks this GP) idle=0ae/141/0 softirq=11962/11962 fqs=24724293 (t=50420460 jiffies g=80217 c=80216 q=36817447) NMI backtrace for cpu 86 CPU: 86 PID: 18136 Comm: sh Not tainted 4.15.0-rc2-autotest #1 Call Trace: [c07792d46f20] [c099b83c] dump_stack+0xb0/0xf4 (unreliable) [c07792d46f60] [c09a43e4] nmi_cpu_backtrace+0x1a4/0x210 [c07792d46ff0] [c09a462c] nmi_trigger_cpumask_backtrace+0x1dc/0x220 [c07792d47090] [c002c7d0] arch_trigger_cpumask_backtrace+0x20/0x40 [c07792d470b0] [c017496c] rcu_dump_cpu_stacks+0xf4/0x158 [c07792d47100] [c0173cb0] rcu_check_callbacks+0x8f0/0xb00 [c07792d47230] [c017c25c] update_process_times+0x3c/0x90 [c07792d47260] [c01921e4] tick_sched_handle.isra.13+0x44/0x80 [c07792d47280] [c0192278] tick_sched_timer+0x58/0xb0 [c07792d472c0] [c017cd58] __hrtimer_run_queues+0xf8/0x330 [c07792d47340] [c017da74] hrtimer_interrupt+0xe4/0x280 [c07792d47400] [c0022660] __timer_interrupt+0x90/0x270 [c07792d47450] [c0022d30] timer_interrupt+0xa0/0xe0 [c07792d47480] [c0009238] decrementer_common+0x158/0x160 --- interrupt: 901 at replay_interrupt_return+0x0/0x4 LR = arch_local_irq_restore+0x74/0x90 [c07792d47770] [c03fb3185000] 0xc03fb3185000 (unreliable) [c07792d47790] [c09bb658] _raw_spin_unlock_irqrestore+0x38/0x60 [c07792d477b0] [c066f274] scsi_remove_target+0x204/0x270 [c07792d47820] [dfc72604] sas_rphy_remove+0x94/0xa0 [scsi_transport_sas] [c07792d47850] [dfc745bc] sas_port_delete+0x4c/0x238 [scsi_transport_sas] [c07792d478b0] [d00010e82990] mpt3sas_transport_port_remove+0x2d0/0x310 [mpt3sas] [c07792d47950] [d00010e71ba0] _scsih_remove_device+0x100/0x2a0 [mpt3sas] [c07792d47a10] [d00010e774d4] mpt3sas_device_remove_by_sas_address.part.44+0xb4/0x160 [mpt3sas] [c07792d47a70] [d00010e77614]
[PATCH 4/4] sd: use async_probe cookie to avoid deadlocks
With the current design we're waiting for all async probes to finish when removing any sd device. This might lead to a livelock where the 'remove' call is blocking for any probe calls to finish, and the probe calls are waiting for a response, which will never be processes as the thread handling the responses is waiting for the remove call to finish. Which is completely pointless as we only _really_ care for the probe on _this_ device to be completed; any other probing can happily continue for all we care. So save the async probing cookie in the structure and only wait if this specific probe is still active. Signed-off-by: Hannes Reinecke--- drivers/scsi/sd.c | 6 -- drivers/scsi/sd.h | 3 +++ 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index abbab17..7bf20ca 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3416,7 +3416,8 @@ static int sd_probe(struct device *dev) dev_set_drvdata(dev, sdkp); get_device(>dev); /* prevent release before async_schedule */ - async_schedule_domain(sd_probe_async, sdkp, _sd_probe_domain); + sdkp->async_probe = async_schedule_domain(sd_probe_async, sdkp, + _sd_probe_domain); return 0; @@ -3454,7 +3455,8 @@ static int sd_remove(struct device *dev) scsi_autopm_get_device(sdkp->device); async_synchronize_full_domain(_sd_pm_domain); - async_synchronize_full_domain(_sd_probe_domain); + async_synchronize_cookie_domain(sdkp->async_probe, + _sd_probe_domain); device_del(>dev); del_gendisk(sdkp->disk); sd_shutdown(dev); diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h index 320de75..d8aff29 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -2,6 +2,8 @@ #ifndef _SCSI_DISK_H #define _SCSI_DISK_H +#include + /* * More than enough for everybody ;) The huge number of majors * is a leftover from 16bit dev_t days, we don't really need that @@ -73,6 +75,7 @@ struct scsi_disk { struct device dev; struct gendisk *disk; struct opal_dev *opal_dev; + async_cookie_t async_probe; #ifdef CONFIG_BLK_DEV_ZONED unsigned intnr_zones; unsigned intzone_blocks; -- 1.8.5.6
[PATCH 3/4] scsi: add missing get_device() return value checks
We need to validate that get_device() succeeded, otherwise we'll end up working with invalid devices. Signed-off-by: Hannes Reinecke--- drivers/scsi/scsi_scan.c | 14 -- drivers/scsi/sd.c| 8 +--- 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index be5e919..18edfd7 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -245,6 +245,10 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget, INIT_WORK(>requeue_work, scsi_requeue_run_queue); sdev->sdev_gendev.parent = get_device(>dev); + if (!sdev->sdev_gendev.parent) { + kfree(sdev); + return NULL; + } sdev->sdev_target = starget; /* usually NULL and set by ->slave_alloc instead */ @@ -366,7 +370,8 @@ static struct scsi_target *__scsi_find_target(struct device *parent, } } if (found_starget) - get_device(_starget->dev); + if (!get_device(_starget->dev)) + found_starget = NULL; return found_starget; } @@ -436,6 +441,10 @@ static struct scsi_target *scsi_alloc_target(struct device *parent, device_initialize(dev); kref_init(>reap_ref); dev->parent = get_device(parent); + if (!dev->parent) { + kfree(starget); + return NULL; + } dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id); dev->bus = _bus_type; dev->type = _target_type; @@ -469,7 +478,8 @@ static struct scsi_target *scsi_alloc_target(struct device *parent, return NULL; } } - get_device(dev); + /* No good way to recover here; keep fingers crossed */ + WARN_ON(!get_device(dev)); return starget; diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 228b0b62..abbab17 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -619,10 +619,12 @@ static struct scsi_disk *scsi_disk_get(struct gendisk *disk) if (disk->private_data) { sdkp = scsi_disk(disk); - if (scsi_device_get(sdkp->device) == 0) - get_device(>dev); - else + if (scsi_device_get(sdkp->device)) + sdkp = NULL; + else if (!get_device(>dev)) { + scsi_device_put(sdkp->device); sdkp = NULL; + } } mutex_unlock(_ref_mutex); return sdkp; -- 1.8.5.6
[PATCH 2/4] sd: Check if parent is still present before proceeding with probing
When a new SCSI disk is created we should be checking if the parent device is actually present before proceeding with probing. Signed-off-by: Hannes Reinecke--- drivers/scsi/sd.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index ab75ebd..228b0b62 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3399,6 +3399,10 @@ static int sd_probe(struct device *dev) } device_initialize(>dev); + + if (!get_device(dev)) + goto out_free_index; + sdkp->dev.parent = dev; sdkp->dev.class = _disk_class; dev_set_name(>dev, "%s", dev_name(dev)); @@ -3407,7 +3411,6 @@ static int sd_probe(struct device *dev) if (error) goto out_free_index; - get_device(dev); dev_set_drvdata(dev, sdkp); get_device(>dev); /* prevent release before async_schedule */ -- 1.8.5.6
[PATCH 0/4] sd async probing
Hi all, here's a patchset to facilitate full async scsi disk probing. The current approach has the problem of calling async_synchronize_full_domain() during sd_remove(), which is required to wait for _all_ current asynchronous probes to complete. When running under iSER sd_remove() is called from the RX thread directly, ie sd_remove() has to complete before RX processing can continue. But as probing has to issue I/O any response will be stuck in the RX queue, resulting in a deadlock. This patchset breaks up the synchronization point into a per-device cookie, thus removing the global synchronisation and breaking the deadlock. But during testing some more deficiencies up and down the stack had been revealed, so we'll need the entire patchset to make it work. As usual, comments and reviews are welcome. Hannes Reinecke (4): block: check for GENHD_FL_UP in del_gendisk() sd: Check if parent is still present before proceeding with probing scsi: add missing get_device() return value checks sd: use async_probe cookie to avoid deadlocks block/genhd.c| 3 +++ drivers/scsi/scsi_scan.c | 14 -- drivers/scsi/sd.c| 19 +-- drivers/scsi/sd.h| 3 +++ 4 files changed, 31 insertions(+), 8 deletions(-) -- 1.8.5.6
[PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()
From: Hannes ReineckeWhen a device is probed asynchronously del_gendisk() might be called before the async probing was run, causing del_gendisk() to crash due to uninitialized sysfs objects. Signed-off-by: Hannes Reinecke --- block/genhd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/genhd.c b/block/genhd.c index c2223f1..cc40d95 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk) struct disk_part_iter piter; struct hd_struct *part; + if (!(disk->flags & GENHD_FL_UP)) + return; + blk_integrity_del(disk); disk_del_events(disk); -- 1.8.5.6
[PATCH] esas2r: Fix a possible sleep-in-atomic bug in esas2r_wait_request
The driver may sleep in the interrupt handler. The function call path is: esas2r_adapter_tasklet (interrupt handler) esas2r_do_tasklet_tasks esas2r_handle_chip_rst_during_tasklet esas2r_init_adapter_hw esas2r_init_msgs esas2r_wait_request schedule_timeout_interruptible --> may sleep To fix it, schedule_timeout_uninterruptible is replaced with mdelay. This bug is found by my static analysis tool(DSAC) and checked by my code review. Signed-off-by: Jia-Ju Bai--- drivers/scsi/esas2r/esas2r_main.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/esas2r/esas2r_main.c b/drivers/scsi/esas2r/esas2r_main.c index 4eb1430..4cd8f79 100644 --- a/drivers/scsi/esas2r/esas2r_main.c +++ b/drivers/scsi/esas2r/esas2r_main.c @@ -1307,7 +1307,7 @@ void esas2r_wait_request(struct esas2r_adapter *a, struct esas2r_request *rq) if (rq->req_stat != RS_STARTED) break; - schedule_timeout_interruptible(msecs_to_jiffies(100)); + mdelay(100); if ((jiffies_to_msecs(jiffies) - starttime) > timeout) { esas2r_hdebug("request TMO"); -- 1.7.9.5
[PATCH] esas2r: Fix a possible sleep-in-atomic bug in esas2r_flash_access
The driver may sleep in the interrupt handler. The function call path is: esas2r_adapter_tasklet (interrupt handler) esas2r_do_tasklet_tasks esas2r_handle_chip_rst_during_tasklet esas2r_init_adapter_hw esas2r_nvram_read_direct esas2r_read_flash_block esas2r_flash_access schedule_timeout_interruptible --> may sleep To fix it, schedule_timeout_uninterruptible is replaced with mdelay. This bug is found by my static analysis tool(DSAC) and checked by my code review. Signed-off-by: Jia-Ju Bai--- drivers/scsi/esas2r/esas2r_flash.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/esas2r/esas2r_flash.c b/drivers/scsi/esas2r/esas2r_flash.c index 7bd376d..9b3da4c 100644 --- a/drivers/scsi/esas2r/esas2r_flash.c +++ b/drivers/scsi/esas2r/esas2r_flash.c @@ -965,7 +965,7 @@ static bool esas2r_flash_access(struct esas2r_adapter *a, u32 function) break; } - schedule_timeout_interruptible(msecs_to_jiffies(100)); + mdelay(100); if ((jiffies_to_msecs(jiffies) - starttime) > timeout) { /* -- 1.7.9.5
[BUG] drivers/scsi/esas2r: a possible sleep-in-atomic bug in esas2r_nvram_read_direct
The driver may sleep in the interrupt handler. The function call path is: esas2r_adapter_tasklet (interrupt handler) esas2r_do_tasklet_tasks esas2r_handle_chip_rst_during_tasklet esas2r_init_adapter_hw esas2r_nvram_read_direct down_interruptible --> may sleep I do not find a good way to fix it, so I only report. This possible bug is found by my static analysis tool (DSAC) and checked by my code review. Thanks, Jia-Ju Bai
[PATCH] esas2r: Fix possible sleep-in-atomic bugs in esas2r_check_adapter
The driver may sleep in the interrupt handler. The function call path is: esas2r_adapter_tasklet (interrupt handler) esas2r_do_tasklet_tasks esas2r_handle_chip_rst_during_tasklet esas2r_check_adapter schedule_timeout_interruptible To fix it, schedule_timeout_uninterruptible is replaced with mdelay. This bug is found by my static analysis tool(DSAC) and checked by my code review. Signed-off-by: Jia-Ju Bai--- drivers/scsi/esas2r/esas2r_init.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/esas2r/esas2r_init.c b/drivers/scsi/esas2r/esas2r_init.c index 5b14dd2..0b9f547 100644 --- a/drivers/scsi/esas2r/esas2r_init.c +++ b/drivers/scsi/esas2r/esas2r_init.c @@ -1068,7 +1068,7 @@ bool esas2r_check_adapter(struct esas2r_adapter *a) break; } - schedule_timeout_interruptible(msecs_to_jiffies(100)); + mdelay(100); if ((jiffies_to_msecs(jiffies) - starttime) > 18) { esas2r_hdebug("FW ready TMO"); @@ -1091,7 +1091,7 @@ bool esas2r_check_adapter(struct esas2r_adapter *a) break; } - schedule_timeout_interruptible(msecs_to_jiffies(50)); + mdelay(50); if ((jiffies_to_msecs(jiffies) - starttime) > 3000) { esas2r_hdebug("timeout waiting for interface down"); @@ -1180,7 +1180,7 @@ bool esas2r_check_adapter(struct esas2r_adapter *a) break; } - schedule_timeout_interruptible(msecs_to_jiffies(100)); + mdelay(100); if ((jiffies_to_msecs(jiffies) - starttime) > 3000) { esas2r_hdebug( -- 1.7.9.5
Re: [PATCH] sd: Increase SCSI disk probing concurrency
On 12/12/2017 01:16 AM, Bart Van Assche wrote: > On Sun, 2017-12-10 at 15:44 +0100, Hannes Reinecke wrote: >> You know what, I have been working on a similar patch for quite some >> time now; [ ... ] > > That's very interesting :-) > >> However, in doing so I have encountered several issues which have been >> exposed by that; the most trivial one being that del_gendisk() doesn't >> check if GENHD_FL_UP is set, so it'll crash if sd_remove is called >> before sd_async_probe() is run. > > Have you noticed that my patch adds a call to sd_wait_for_probing() before > the call to del_gendisk()? That should be sufficient to prevent the crash you > described. > Not necessarily; see below. >> The other one is an imbalance between sd_probe and sd_remove; >> when sd_probe_async() is called _after_ scsi_device_remove_device() >> (which it will as the synchronization is only after device_del() has >> been called) it will trip over non-existent sysfs directories in add_disk(). >> So one need to short-circuit sd_probe_async() for devices in SDEV_CANCEL >> or SDEV_DEL. > > Same comment here - I think the sd_wait_for_probing() call I inserted in > sd_remove() should prevent this scenario. > No, it doesn't. sd_remove is called via __scsi_device_remove_device() -> device_del() -> sd_remove() with this piece of code: bsg_unregister_queue(sdev->request_queue); device_unregister(>sdev_dev); transport_remove_device(dev); device_del(dev); ie by the time device_del() we've already called device_unregister(), which unfortunately is the sysfs parent kobj for the gendisk. So when you're calling sd_wait_for_probing() in sd_remove() it'll call sd_async_probe(), and add_disk() will be called with a NULL parent object. >> However, I'm still waiting for a final confirmation on the latter issue, >> hence I haven't posted the patchset. >> If there's interest I can post them, though. > > The big question here is how you would like to proceed - should we start from > your patch series or start from my patch? Since I have not encountered any of > the issues you described with my patch, does that mean that the approach I > followed results in more reliable operation? > Not necessarily. The issues I've described got reported by a big CDN provider after several days worth of hammering. It's not something I could reproduce here in my puny lab. So just because you haven't seen the race doesn't mean there is none :-) I'll be posting the patches shortly. Cheers, Hannes -- Dr. Hannes ReineckezSeries & Storage h...@suse.com +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)