date:20171212

Re: [BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag

2017-12-12 Thread Jia-Ju Bai



On 2017/12/13 12:42, James Bottomley wrote:

On Wed, 2017-12-13 at 11:18 +0800, Jia-Ju Bai wrote:

The driver may sleep under a spinlock.
The function call paths are:
qlt_handle_abts_recv_work (acquire the spinlock)
qlt_response_pkt_all_vps
  qlt_response_pkt
qlt_handle_cmd_for_atio
  qlt_get_tag
percpu_ida_alloc --> may sleep

qla82xx_msix_rsp_q (acquire the spinlock)
qla24xx_process_response_queue
  qlt_handle_abts_recv
qlt_response_pkt_all_vps
  qlt_response_pkt
qlt_handle_cmd_for_atio
  qlt_get_tag
percpu_ida_alloc --> may sleep-in-atomic

qla24xx_intr_handler (acquire the spinlock)
qla24xx_process_response_queue
  qlt_handle_abts_recv
qlt_response_pkt
  qlt_handle_cmd_for_atio
qlt_get_tag
  percpu_ida_alloc --> may sleep

I do not find a good way to fix it, so I only report.
This possible bug is found by my static analysis tool (DSAC) and
checked by my code review.

The report is incorrect: percpu_ida_alloc with state==TASK_RUNNING is
atomic (and interrupt) safe which appears to be the case here.

James



Thanks for your reply :)
I have checked the definition of percpu_ida_alloc, and I think you are 
right.

Sorry for my incorrect bug report.


Thanks,
Jia-Ju Bai

Re: [PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()

2017-12-12 Thread Hannes Reinecke

On 12/12/2017 05:57 PM, Bart Van Assche wrote:
> On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote:
>> From: Hannes Reinecke 
>>
>> When a device is probed asynchronously del_gendisk() might be called
>> before the async probing was run, causing del_gendisk() to crash
>> due to uninitialized sysfs objects.
>>
>> Signed-off-by: Hannes Reinecke 
>> ---
>>  block/genhd.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/block/genhd.c b/block/genhd.c
>> index c2223f1..cc40d95 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk)
>>  struct disk_part_iter piter;
>>  struct hd_struct *part;
>>  
>> +if (!(disk->flags & GENHD_FL_UP))
>> +return;
>> +
>>  blk_integrity_del(disk);
>>  disk_del_events(disk);
> 
> Hello Hannes,
> 
> Thank you for having published your approach for increasing disk probing
> concurrency. Your approach looks interesting to me. However, I don't think
> that patches 1/4..3/4 are sufficient to avoid races between e.g.
> device_add_disk() and del_gendisk(). As far as I know no locks are held
> around the device_add_disk() and del_gendisk() calls. Does that mean that
> del_gendisk() can call e.g. blk_integrity_del() before device_add_disk() has
> called blk_integrity_add()?
> 
In principle, yes.
However, the overall idea here is that  device_add_disk() and
del_gendisk() are enclosed within upper layer procedures, which
themselves provide additional locking.
In our case the sd driver provided synchronisation guarantees ensuring
that device_add_disk() and del_gendisk() doesn't run concurrently.

if one is really concerned we could convert disk->flags to a bitmask,
and use atomic bitmask modification; that should avoid any concurrency
issues.
However, I'm not sure if it's worth it.

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag

2017-12-12 Thread James Bottomley

On Wed, 2017-12-13 at 11:18 +0800, Jia-Ju Bai wrote:
> The driver may sleep under a spinlock.
> The function call paths are:
> qlt_handle_abts_recv_work (acquire the spinlock)
>    qlt_response_pkt_all_vps
>  qlt_response_pkt
>    qlt_handle_cmd_for_atio
>  qlt_get_tag
>    percpu_ida_alloc --> may sleep
> 
> qla82xx_msix_rsp_q (acquire the spinlock)
>    qla24xx_process_response_queue
>  qlt_handle_abts_recv
>    qlt_response_pkt_all_vps
>  qlt_response_pkt
>    qlt_handle_cmd_for_atio
>  qlt_get_tag
>    percpu_ida_alloc --> may sleep-in-atomic
> 
> qla24xx_intr_handler (acquire the spinlock)
>    qla24xx_process_response_queue
>  qlt_handle_abts_recv
>    qlt_response_pkt
>  qlt_handle_cmd_for_atio
>    qlt_get_tag
>  percpu_ida_alloc --> may sleep
> 
> I do not find a good way to fix it, so I only report.
> This possible bug is found by my static analysis tool (DSAC) and
> checked by my code review.

The report is incorrect: percpu_ida_alloc with state==TASK_RUNNING is
atomic (and interrupt) safe which appears to be the case here.

James

[BUG] scsi/qla2xxx: a possible sleep-in-atomic bug in qlt_get_tag

2017-12-12 Thread Jia-Ju Bai


The driver may sleep under a spinlock.
The function call paths are:
qlt_handle_abts_recv_work (acquire the spinlock)
  qlt_response_pkt_all_vps
qlt_response_pkt
  qlt_handle_cmd_for_atio
qlt_get_tag
  percpu_ida_alloc --> may sleep

qla82xx_msix_rsp_q (acquire the spinlock)
  qla24xx_process_response_queue
qlt_handle_abts_recv
  qlt_response_pkt_all_vps
qlt_response_pkt
  qlt_handle_cmd_for_atio
qlt_get_tag
  percpu_ida_alloc --> may sleep-in-atomic

qla24xx_intr_handler (acquire the spinlock)
  qla24xx_process_response_queue
qlt_handle_abts_recv
  qlt_response_pkt
qlt_handle_cmd_for_atio
  qlt_get_tag
percpu_ida_alloc --> may sleep

I do not find a good way to fix it, so I only report.
This possible bug is found by my static analysis tool (DSAC) and checked 
by my code review.



Thanks,
Jia-Ju Bai

[PATCH] scsi: mpt3sas: validate device handle before ata pending flag

2017-12-12 Thread Yuanyuan Zhong

Commit ffb584565894 ("scsi: mpt3sas: fix hang on ata passthrough
commands") introduced ata_command_pending bit-flag to ensure sending
only one outstanding pass through ATA command to target device.

However if target device become irresponsive with ata_command_pending,
we won't be able to remove it timely, since we try to dispatch a
SYNCHRONIZE_CACHE command and bail out with SAM_STAT_BUSY repeatedly.

Move the invalid device handle checking a bit earlier than the
ata_command_pending.

Signed-off-by: Yuanyuan Zhong 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b258f21..32fc0d8 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4758,6 +4758,16 @@ static int _scsih_set_satl_pending(struct scsi_cmnd 
*scmd, bool pending)
return 0;
}
 
+   sas_target_priv_data = sas_device_priv_data->sas_target;
+
+   /* invalid device handle */
+   handle = sas_target_priv_data->handle;
+   if (handle == MPT3SAS_INVALID_DEVICE_HANDLE) {
+   scmd->result = DID_NO_CONNECT << 16;
+   scmd->scsi_done(scmd);
+   return 0;
+   }
+
/*
 * Bug work around for firmware SATL handling.  The loop
 * is based on atomic operations and ensures consistency
@@ -4771,17 +4781,6 @@ static int _scsih_set_satl_pending(struct scsi_cmnd 
*scmd, bool pending)
}
} while (_scsih_set_satl_pending(scmd, true));
 
-   sas_target_priv_data = sas_device_priv_data->sas_target;
-
-   /* invalid device handle */
-   handle = sas_target_priv_data->handle;
-   if (handle == MPT3SAS_INVALID_DEVICE_HANDLE) {
-   scmd->result = DID_NO_CONNECT << 16;
-   scmd->scsi_done(scmd);
-   return 0;
-   }
-
-
/* host recovery or link resets sent via IOCTLs */
if (ioc->shost_recovery || ioc->ioc_link_reset_in_progress)
return SCSI_MLQUEUE_HOST_BUSY;
-- 
1.9.1

Re: [bug report] A race between device_resume and removing disk

2017-12-12 Thread chenxiang (M)


Ping...Does anyone has some idea about this issue?

在 2017/11/16 11:54, chenxiang (M) 写道:

Hi all,

When debugging suspend and resume of hisi_sas, I find a issue: use 
commands (echo freeze > /sys/power/state) to

suspend, after 5s system will be resumed as i enable TEST_DEVICES. But if
I plug one disks during suspend, system will be blocked all the time 
and it seems that there is a deadlock.


I suspect that when device resume, it will acquire dev->mutex 
(device_lock(dev)) during almost the
whole device resume process, but if remove a disk, it also tries to 
acquire dev->mutex when device_del.

So it causes a deadlock. Do you have any idea about this issue?

Follows are part of the log of normal print and exception print. In my 
environment, there are 11 disks on expander

(5 SATA disks and 6 SAS disks).

regards,
shawn


Normal log of suspend as follows:

[  189.789974] PM: suspend entry (s2idle)
[  189.798045] PM: Syncing filesystems ... done.
[  189.935324] Freezing user space processes ... (elapsed 0.009 
seconds) done.

[  189.955044] OOM killer disabled.
[  189.959111] Freezing remaining freezable tasks ... (elapsed 0.012 
seconds) done.

[  190.065399] sd 0:0:10:0: [sdk] Synchronizing SCSI cache
[  190.090948] sd 0:0:10:0: [sdk] Stopping disk
[  190.113157] sd 0:0:9:0: [sdj] Synchronizing SCSI cache
[  190.139116] sd 0:0:8:0: [sdi] Synchronizing SCSI cache
[  190.160252] sd 0:0:7:0: [sdh] Synchronizing SCSI cache
[  190.175968] sd 0:0:7:0: [sdh] Stopping disk
[  191.401325] sd 0:0:5:0: [sdf] Synchronizing SCSI cache
[  191.429529] sd 0:0:3:0: [sdd] Synchronizing SCSI cache
[  191.452532] sd 0:0:2:0: [sdc] Synchronizing SCSI cache
[  191.752248] sd 0:0:2:0: [sdc] Stopping disk
[  192.045536] sd 0:0:1:0: [sdb] Synchronizing SCSI cache
[  192.072078] sd 0:0:1:0: [sdb] Stopping disk
[  192.744386] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  192.760704] sd 0:0:0:0: [sda] Stopping disk
[  193.397060] hisi_sas_v3_suspend
[  194.587783] PM: suspend devices took 4.604 seconds
[  194.594688] PM: suspend debug: Waiting for 5 second(s).
[  199.687272] hisi_sas_v3_resume
[  199.740433] hisi_sas_v3_hw :74:02.0: phyup: phy0 link_rate=8
[  199.747979] hisi_sas_v3_hw :74:02.0: phyup: phy1 link_rate=8
[  199.754995] hisi_sas_v3_hw :74:02.0: phyup: phy2 link_rate=8
[  199.787405] hisi_sas_v3_hw :74:02.0: phyup: phy3 link_rate=8
[  199.820411] sas: sas_form_port: phy0 belongs to port0 already(4)!
[  199.831008] sas: sas_form_port: phy1 belongs to port0 already(4)!
[  199.839963] sas: sas_form_port: phy2 belongs to port0 already(4)!
[  199.849237] sas: sas_form_port: phy3 belongs to port0 already(4)!
[  200.366863] ata7: SATA link down (SStatus 0 SControl 300)
[  200.379220] ata6: SATA link down (SStatus 0 SControl 300)
[  200.828384] sas: broadcast received: 0
[  200.838602] sas: REVALIDATING DOMAIN on port 0, pid:1454
[  200.862876] sd 0:0:0:0: [sda] Starting disk
[  200.881098] sd 0:0:1:0: [sdb] Starting disk
[  200.891951] sd 0:0:2:0: [sdc] Starting disk
[  200.949758] sd 0:0:7:0: [sdh] Starting disk
[  200.990573] sd 0:0:10:0: [sdk] Starting disk
[  201.114636] sas: Expander phy change count has changed
[  201.188709] sas: ex 500e004aaa1f phy16 originated 
BROADCAST(CHANGE)

[  201.197145] sas: phy16 part of wide port with phy17
[  201.211154] sas: ex 500e004aaa1f phy 0x10 broadcast flutter
[  201.222900] sas: ex 500e004aaa1f phy17 originated 
BROADCAST(CHANGE)

[  201.231045] sas: phy17 part of wide port with phy16
[  201.244352] sas: ex 500e004aaa1f phy 0x11 broadcast flutter
[  201.258344] sas: ex 500e004aaa1f phy18 originated 
BROADCAST(CHANGE)

[  201.268344] sas: phy18 part of wide port with phy16
[  201.283102] sas: ex 500e004aaa1f phy 0x12 broadcast flutter
[  201.294168] sas: ex 500e004aaa1f phy19 originated 
BROADCAST(CHANGE)

[  201.302386] sas: phy19 part of wide port with phy16
[  201.315813] sas: ex 500e004aaa1f phy 0x13 broadcast flutter
[  201.346615] sas: done REVALIDATING DOMAIN on port 0, pid:1454, res 0x0
[  203.640641] PM: resume devices took 3.984 seconds
[  203.648996] OOM killer enabled.
[  203.655016] Restarting tasks ... done.
[  203.690494] PM: suspend exit

The log of removing disk when suspend is as follows:

[44750.653222] PM: suspend entry (s2idle)
[44750.658991] PM: Syncing filesystems ... done.
[44750.699868] Freezing user space processes ... (elapsed 0.015 
seconds) done.

[44750.725734] OOM killer disabled.
[44750.729711] Freezing remaining freezable tasks ... (elapsed 0.018 
seconds) done.

[44750.820672] sd 0:0:13:0: [sdk] Synchronizing SCSI cache
[44750.844056] sd 0:0:13:0: [sdk] Stopping disk
[44750.917433] sd 0:0:9:0: [sdj] Synchronizing SCSI cache
[44750.949032] sd 0:0:8:0: [sdi] Synchronizing SCSI cache
[44750.972730] sd 0:0:7:0: [sdh] Synchronizing SCSI cache
[44750.989813] sd 0:0:7:0: [sdh] Stopping disk
[44752.207539] sd 0:0:5:0: [sdf] Synchronizing SCSI cache
[44752.239773] sd 0:0:3:0: [sdd] Synchronizing SCSI

[PATCH 8/8] target/iscsi: Improve 16 size determinations

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 22:06:00 +0100

Replace the specification of data types by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_erl2.c |  2 +-
 drivers/target/iscsi/iscsi_target_login.c|  6 +++---
 drivers/target/iscsi/iscsi_target_parameters.c   | 10 +-
 drivers/target/iscsi/iscsi_target_seq_pdu_list.c | 11 ++-
 drivers/target/iscsi/iscsi_target_tpg.c  |  4 ++--
 5 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_erl2.c 
b/drivers/target/iscsi/iscsi_target_erl2.c
index 87c27e8d4f49..80b8297f4b05 100644
--- a/drivers/target/iscsi/iscsi_target_erl2.c
+++ b/drivers/target/iscsi/iscsi_target_erl2.c
@@ -324,7 +324,7 @@ int iscsit_prepare_cmds_for_reallegiance(struct iscsi_conn 
*conn)
 * (struct iscsi_cmd->cr) so we need to allocate this before preparing 
the
 * connection's command list for connection recovery.
 */
-   cr = kzalloc(sizeof(struct iscsi_conn_recovery), GFP_KERNEL);
+   cr = kzalloc(sizeof(*cr), GFP_KERNEL);
if (!cr)
return -1;
 
diff --git a/drivers/target/iscsi/iscsi_target_login.c 
b/drivers/target/iscsi/iscsi_target_login.c
index cfaf564825e0..095651e48f36 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -46,7 +46,7 @@ static struct iscsi_login *iscsi_login_init_conn(struct 
iscsi_conn *conn)
 {
struct iscsi_login *login;
 
-   login = kzalloc(sizeof(struct iscsi_login), GFP_KERNEL);
+   login = kzalloc(sizeof(*login), GFP_KERNEL);
if (!login)
return NULL;
 
@@ -294,7 +294,7 @@ static int iscsi_login_zero_tsih_s1(
struct iscsi_login_req *pdu = (struct iscsi_login_req *)buf;
int ret;
 
-   sess = kzalloc(sizeof(struct iscsi_session), GFP_KERNEL);
+   sess = kzalloc(sizeof(*sess), GFP_KERNEL);
if (!sess) {
iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR,
ISCSI_LOGIN_STATUS_NO_RESOURCES);
@@ -1244,7 +1244,7 @@ static int __iscsi_target_login_thread(struct iscsi_np 
*np)
}
spin_unlock_bh(>np_thread_lock);
 
-   conn = kzalloc(sizeof(struct iscsi_conn), GFP_KERNEL);
+   conn = kzalloc(sizeof(*conn), GFP_KERNEL);
if (!conn) {
/* Get another socket */
return 1;
diff --git a/drivers/target/iscsi/iscsi_target_parameters.c 
b/drivers/target/iscsi/iscsi_target_parameters.c
index a49b0b2a4f6c..110e81a0c97d 100644
--- a/drivers/target/iscsi/iscsi_target_parameters.c
+++ b/drivers/target/iscsi/iscsi_target_parameters.c
@@ -129,7 +129,7 @@ static struct iscsi_param *iscsi_set_default_param(struct 
iscsi_param_list *para
 {
struct iscsi_param *param;
 
-   param = kzalloc(sizeof(struct iscsi_param), GFP_KERNEL);
+   param = kzalloc(sizeof(*param), GFP_KERNEL);
if (!param)
goto out;
 
@@ -199,7 +199,7 @@ int iscsi_create_default_params(struct iscsi_param_list 
**param_list_ptr)
struct iscsi_param *param;
struct iscsi_param_list *pl;
 
-   pl = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL);
+   pl = kzalloc(sizeof(*pl), GFP_KERNEL);
if (!pl)
return -ENOMEM;
 
@@ -568,7 +568,7 @@ int iscsi_copy_param_list(
struct iscsi_param *new_param = NULL;
struct iscsi_param_list *param_list;
 
-   param_list = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL);
+   param_list = kzalloc(sizeof(*param_list), GFP_KERNEL);
if (!param_list)
return -ENOMEM;
 
@@ -583,7 +583,7 @@ int iscsi_copy_param_list(
continue;
}
 
-   new_param = kzalloc(sizeof(struct iscsi_param), GFP_KERNEL);
+   new_param = kzalloc(sizeof(*new_param), GFP_KERNEL);
if (!new_param)
goto err_out;
 
@@ -714,7 +714,7 @@ static int iscsi_add_notunderstood_response(
return -1;
}
 
-   extra_response = kzalloc(sizeof(struct iscsi_extra_response), 
GFP_KERNEL);
+   extra_response = kzalloc(sizeof(*extra_response), GFP_KERNEL);
if (!extra_response)
return -ENOMEM;
 
diff --git a/drivers/target/iscsi/iscsi_target_seq_pdu_list.c 
b/drivers/target/iscsi/iscsi_target_seq_pdu_list.c
index 3a6e619bb30e..0fd914d92f4f 100644
--- a/drivers/target/iscsi/iscsi_target_seq_pdu_list.c
+++ b/drivers/target/iscsi/iscsi_target_seq_pdu_list.c
@@ -137,7 +137,8 @@ static int iscsit_randomize_pdu_lists(
seq_count++;

[PATCH 7/8] target/iscsi: Delete an unnecessary variable initialisation in iscsi_set_default_param()

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 21:21:31 +0100

The local variable "param" will be reassigned by a following statement.
Thus omit the explicit initialisation at the beginning.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_parameters.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target_parameters.c 
b/drivers/target/iscsi/iscsi_target_parameters.c
index 151269b8816d..a49b0b2a4f6c 100644
--- a/drivers/target/iscsi/iscsi_target_parameters.c
+++ b/drivers/target/iscsi/iscsi_target_parameters.c
@@ -127,7 +127,7 @@ static struct iscsi_param *iscsi_set_default_param(struct 
iscsi_param_list *para
char *name, char *value, u8 phase, u8 scope, u8 sender,
u16 type_range, u8 use)
 {
-   struct iscsi_param *param = NULL;
+   struct iscsi_param *param;
 
param = kzalloc(sizeof(struct iscsi_param), GFP_KERNEL);
if (!param)
-- 
2.15.1

[PATCH 6/8] target/iscsi: Delete an unnecessary variable initialisation in iscsi_create_default_params()

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 21:18:39 +0100

The variable "param" will eventually be set to an appropriate pointer
a bit later. Thus omit the explicit initialisation at the beginning.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_parameters.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target_parameters.c 
b/drivers/target/iscsi/iscsi_target_parameters.c
index 25a3a9550488..151269b8816d 100644
--- a/drivers/target/iscsi/iscsi_target_parameters.c
+++ b/drivers/target/iscsi/iscsi_target_parameters.c
@@ -196,7 +196,7 @@ static struct iscsi_param *iscsi_set_default_param(struct 
iscsi_param_list *para
 /* #warning Add extension keys */
 int iscsi_create_default_params(struct iscsi_param_list **param_list_ptr)
 {
-   struct iscsi_param *param = NULL;
+   struct iscsi_param *param;
struct iscsi_param_list *pl;
 
pl = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL);
-- 
2.15.1

[PATCH 5/8] target/iscsi: Delete an unnecessary variable initialisation in iscsi_copy_param_list()

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 21:15:55 +0100

The variable "param_list" will be reassigned by a following statement.
Thus omit the explicit initialisation at the beginning.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_parameters.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target_parameters.c 
b/drivers/target/iscsi/iscsi_target_parameters.c
index 06310b2c4e26..25a3a9550488 100644
--- a/drivers/target/iscsi/iscsi_target_parameters.c
+++ b/drivers/target/iscsi/iscsi_target_parameters.c
@@ -566,7 +566,7 @@ int iscsi_copy_param_list(
 {
struct iscsi_param *param = NULL;
struct iscsi_param *new_param = NULL;
-   struct iscsi_param_list *param_list = NULL;
+   struct iscsi_param_list *param_list;
 
param_list = kzalloc(sizeof(struct iscsi_param_list), GFP_KERNEL);
if (!param_list)
-- 
2.15.1

[PATCH 4/8] target/iscsi: Delete an unnecessary variable initialisation in iscsit_allocate_ooo_cmdsn()

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 21:13:49 +0100

The local variable "ooo_cmdsn" will be reassigned by a following statement.
Thus omit the explicit initialisation at the beginning.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_erl1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target_erl1.c 
b/drivers/target/iscsi/iscsi_target_erl1.c
index ff3e08b6d4e1..c4c550128161 100644
--- a/drivers/target/iscsi/iscsi_target_erl1.c
+++ b/drivers/target/iscsi/iscsi_target_erl1.c
@@ -782,7 +782,7 @@ int iscsit_recover_dataout_sequence(
 
 static struct iscsi_ooo_cmdsn *iscsit_allocate_ooo_cmdsn(void)
 {
-   struct iscsi_ooo_cmdsn *ooo_cmdsn = NULL;
+   struct iscsi_ooo_cmdsn *ooo_cmdsn;
 
ooo_cmdsn = kmem_cache_zalloc(lio_ooo_cache, GFP_ATOMIC);
if (!ooo_cmdsn)
-- 
2.15.1

[PATCH 3/8] target/iscsi: Delete 36 error messages for a failed memory allocation

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 21:07:16 +0100

Omit extra messages for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target.c   |  2 --
 drivers/target/iscsi/iscsi_target_auth.c  | 17 +++--
 drivers/target/iscsi/iscsi_target_datain_values.c |  6 ++--
 drivers/target/iscsi/iscsi_target_erl1.c  | 12 +++
 drivers/target/iscsi/iscsi_target_erl2.c  |  6 ++--
 drivers/target/iscsi/iscsi_target_login.c | 23 +++--
 drivers/target/iscsi/iscsi_target_nego.c  |  4 +--
 drivers/target/iscsi/iscsi_target_parameters.c| 42 +++
 drivers/target/iscsi/iscsi_target_seq_pdu_list.c  | 23 +
 drivers/target/iscsi/iscsi_target_tpg.c   |  9 ++---
 drivers/target/iscsi/iscsi_target_util.c  | 11 +++---
 11 files changed, 46 insertions(+), 109 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c 
b/drivers/target/iscsi/iscsi_target.c
index 9eb10d34682c..f3c6ea556ea8 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -812,7 +812,6 @@ int iscsit_add_reject(
 
cmd->buf_ptr = kmemdup(buf, ISCSI_HDR_LEN, GFP_KERNEL);
if (!cmd->buf_ptr) {
-   pr_err("Unable to allocate memory for cmd->buf_ptr\n");
iscsit_free_cmd(cmd, false);
return -1;
}
@@ -849,7 +848,6 @@ static int iscsit_add_reject_from_cmd(
 
cmd->buf_ptr = kmemdup(buf, ISCSI_HDR_LEN, GFP_KERNEL);
if (!cmd->buf_ptr) {
-   pr_err("Unable to allocate memory for cmd->buf_ptr\n");
iscsit_free_cmd(cmd, false);
return -1;
}
diff --git a/drivers/target/iscsi/iscsi_target_auth.c 
b/drivers/target/iscsi/iscsi_target_auth.c
index d837fcbdbaf2..3a17343f43ed 100644
--- a/drivers/target/iscsi/iscsi_target_auth.c
+++ b/drivers/target/iscsi/iscsi_target_auth.c
@@ -80,10 +80,9 @@ static int chap_check_algorithm(const char *a_str)
char *tmp, *orig, *token;
 
tmp = kstrdup(a_str, GFP_KERNEL);
-   if (!tmp) {
-   pr_err("Memory allocation failed for CHAP_A temporary 
buffer\n");
+   if (!tmp)
return CHAP_DIGEST_UNKNOWN;
-   }
+
orig = tmp;
 
token = strsep(, "=");
@@ -198,16 +197,12 @@ static int chap_server_compute_md5(
int auth_ret = -1, ret, challenge_len;
 
challenge = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL);
-   if (!challenge) {
-   pr_err("Unable to allocate challenge buffer\n");
+   if (!challenge)
goto exit;
-   }
 
challenge_binhex = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL);
-   if (!challenge_binhex) {
-   pr_err("Unable to allocate challenge_binhex buffer\n");
+   if (!challenge_binhex)
goto free_challenge;
-   }
 
memset(chap_n, 0, MAX_CHAP_N_SIZE);
 
@@ -257,10 +252,8 @@ static int chap_server_compute_md5(
}
 
desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL);
-   if (!desc) {
-   pr_err("Unable to allocate struct shash_desc\n");
+   if (!desc)
goto free_shash;
-   }
 
desc->tfm = tfm;
desc->flags = 0;
diff --git a/drivers/target/iscsi/iscsi_target_datain_values.c 
b/drivers/target/iscsi/iscsi_target_datain_values.c
index 173ddd93c757..c591165f9b1b 100644
--- a/drivers/target/iscsi/iscsi_target_datain_values.c
+++ b/drivers/target/iscsi/iscsi_target_datain_values.c
@@ -30,11 +30,9 @@ struct iscsi_datain_req *iscsit_allocate_datain_req(void)
struct iscsi_datain_req *dr;
 
dr = kmem_cache_zalloc(lio_dr_cache, GFP_ATOMIC);
-   if (!dr) {
-   pr_err("Unable to allocate memory for"
-   " struct iscsi_datain_req\n");
+   if (!dr)
return NULL;
-   }
+
INIT_LIST_HEAD(>cmd_datain_node);
 
return dr;
diff --git a/drivers/target/iscsi/iscsi_target_erl1.c 
b/drivers/target/iscsi/iscsi_target_erl1.c
index 5efa42b939a1..ff3e08b6d4e1 100644
--- a/drivers/target/iscsi/iscsi_target_erl1.c
+++ b/drivers/target/iscsi/iscsi_target_erl1.c
@@ -59,11 +59,9 @@ int iscsit_dump_data_payload(
length = min(buf_len, OFFLOAD_BUF_SIZE);
 
buf = kzalloc(length, GFP_ATOMIC);
-   if (!buf) {
-   pr_err("Unable to allocate %u bytes for offload"
-   " buffer.\n", length);
+   if (!buf)
return -1;
-   }
+
memset(, 0, sizeof(struct kvec));
 
while (offset < buf_len) {
@@ -787,11 +785,9 @@ static struct iscsi_ooo_cmdsn 
*iscsit_allocate_ooo_cmdsn(void)
struct iscsi_ooo_cmdsn *ooo_cmdsn = NULL;
 
ooo_cmdsn =

[PATCH 2/8] target/iscsi: Move resetting of seven variables in chap_server_compute_md5()

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 19:43:47 +0100

Move the resetting of these array variables so that this operation will
be performed only if memory allocations succeeded before in this function.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_auth.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_auth.c 
b/drivers/target/iscsi/iscsi_target_auth.c
index 94b011fe74e8..d837fcbdbaf2 100644
--- a/drivers/target/iscsi/iscsi_target_auth.c
+++ b/drivers/target/iscsi/iscsi_target_auth.c
@@ -197,14 +197,6 @@ static int chap_server_compute_md5(
struct shash_desc *desc;
int auth_ret = -1, ret, challenge_len;
 
-   memset(identifier, 0, 10);
-   memset(chap_n, 0, MAX_CHAP_N_SIZE);
-   memset(chap_r, 0, MAX_RESPONSE_LENGTH);
-   memset(digest, 0, MD5_SIGNATURE_SIZE);
-   memset(response, 0, MD5_SIGNATURE_SIZE * 2 + 2);
-   memset(client_digest, 0, MD5_SIGNATURE_SIZE);
-   memset(server_digest, 0, MD5_SIGNATURE_SIZE);
-
challenge = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL);
if (!challenge) {
pr_err("Unable to allocate challenge buffer\n");
@@ -216,6 +208,9 @@ static int chap_server_compute_md5(
pr_err("Unable to allocate challenge_binhex buffer\n");
goto free_challenge;
}
+
+   memset(chap_n, 0, MAX_CHAP_N_SIZE);
+
/*
 * Extract CHAP_N.
 */
@@ -236,6 +231,8 @@ static int chap_server_compute_md5(
goto free_challenge_binhex;
}
pr_debug("[server] Got CHAP_N=%s\n", chap_n);
+   memset(chap_r, 0, MAX_RESPONSE_LENGTH);
+
/*
 * Extract CHAP_R.
 */
@@ -250,6 +247,7 @@ static int chap_server_compute_md5(
}
 
pr_debug("[server] Got CHAP_R=%s\n", chap_r);
+   memset(client_digest, 0, MD5_SIGNATURE_SIZE);
chap_string_to_hex(client_digest, chap_r, strlen(chap_r));
 
tfm = crypto_alloc_shash("md5", 0, 0);
@@ -286,6 +284,7 @@ static int chap_server_compute_md5(
goto free_desc;
}
 
+   memset(server_digest, 0, MD5_SIGNATURE_SIZE);
ret = crypto_shash_finup(desc, chap->challenge,
 CHAP_CHALLENGE_LENGTH, server_digest);
if (ret < 0) {
@@ -293,6 +292,7 @@ static int chap_server_compute_md5(
goto free_desc;
}
 
+   memset(response, 0, MD5_SIGNATURE_SIZE * 2 + 2);
chap_binaryhex_to_asciihex(response, server_digest, MD5_SIGNATURE_SIZE);
pr_debug("[server] MD5 Server Digest: %s\n", response);
 
@@ -310,6 +310,9 @@ static int chap_server_compute_md5(
auth_ret = 0;
goto free_desc;
}
+
+   memset(identifier, 0, ARRAY_SIZE(identifier));
+
/*
 * Get CHAP_I.
 */
@@ -393,6 +396,9 @@ static int chap_server_compute_md5(
" password_mutual\n");
goto free_desc;
}
+
+   memset(digest, 0, MD5_SIGNATURE_SIZE);
+
/*
 * Convert received challenge to binary hex.
 */
-- 
2.15.1

[PATCH 1/8] target/iscsi: Less function calls in chap_server_compute_md5() after error detection

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 18:00:41 +0100

The functions "crypto_free_shash", "kfree" and "kzfree" were called
in a few cases by the chap_server_compute_md5() function during error
handling even if the passed variable contained a null pointer.

* Adjust jump targets according to the Linux coding style convention.

* Delete initialisations for the variables "challenge", "challenge_binhex",
  "desc" and "tfm" at the beginning which became unnecessary
  with this refactoring.

Fixes: 69110e3cedbb8aad1c70d91ed58a9f4f0ed9eec6 ("iscsi-target: Use shash and 
ahash")
Fixes: e48354ce078c079996f89d715dfa44814b4eba01 ("iscsi-target: Add iSCSI 
fabric support for target v4.1")

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/iscsi_target_auth.c | 71 +---
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_auth.c 
b/drivers/target/iscsi/iscsi_target_auth.c
index f9bc8ec6fb6b..94b011fe74e8 100644
--- a/drivers/target/iscsi/iscsi_target_auth.c
+++ b/drivers/target/iscsi/iscsi_target_auth.c
@@ -186,15 +186,15 @@ static int chap_server_compute_md5(
unsigned char id_as_uchar;
unsigned char digest[MD5_SIGNATURE_SIZE];
unsigned char type, response[MD5_SIGNATURE_SIZE * 2 + 2];
-   unsigned char identifier[10], *challenge = NULL;
-   unsigned char *challenge_binhex = NULL;
+   unsigned char identifier[10], *challenge;
+   unsigned char *challenge_binhex;
unsigned char client_digest[MD5_SIGNATURE_SIZE];
unsigned char server_digest[MD5_SIGNATURE_SIZE];
unsigned char chap_n[MAX_CHAP_N_SIZE], chap_r[MAX_RESPONSE_LENGTH];
size_t compare_len;
struct iscsi_chap *chap = conn->auth_protocol;
-   struct crypto_shash *tfm = NULL;
-   struct shash_desc *desc = NULL;
+   struct crypto_shash *tfm;
+   struct shash_desc *desc;
int auth_ret = -1, ret, challenge_len;
 
memset(identifier, 0, 10);
@@ -208,13 +208,13 @@ static int chap_server_compute_md5(
challenge = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL);
if (!challenge) {
pr_err("Unable to allocate challenge buffer\n");
-   goto out;
+   goto exit;
}
 
challenge_binhex = kzalloc(CHAP_CHALLENGE_STR_LEN, GFP_KERNEL);
if (!challenge_binhex) {
pr_err("Unable to allocate challenge_binhex buffer\n");
-   goto out;
+   goto free_challenge;
}
/*
 * Extract CHAP_N.
@@ -222,18 +222,18 @@ static int chap_server_compute_md5(
if (extract_param(nr_in_ptr, "CHAP_N", MAX_CHAP_N_SIZE, chap_n,
) < 0) {
pr_err("Could not find CHAP_N.\n");
-   goto out;
+   goto free_challenge_binhex;
}
if (type == HEX) {
pr_err("Could not find CHAP_N.\n");
-   goto out;
+   goto free_challenge_binhex;
}
 
/* Include the terminating NULL in the compare */
compare_len = strlen(auth->userid) + 1;
if (strncmp(chap_n, auth->userid, compare_len) != 0) {
pr_err("CHAP_N values do not match!\n");
-   goto out;
+   goto free_challenge_binhex;
}
pr_debug("[server] Got CHAP_N=%s\n", chap_n);
/*
@@ -242,11 +242,11 @@ static int chap_server_compute_md5(
if (extract_param(nr_in_ptr, "CHAP_R", MAX_RESPONSE_LENGTH, chap_r,
) < 0) {
pr_err("Could not find CHAP_R.\n");
-   goto out;
+   goto free_challenge_binhex;
}
if (type != HEX) {
pr_err("Could not find CHAP_R.\n");
-   goto out;
+   goto free_challenge_binhex;
}
 
pr_debug("[server] Got CHAP_R=%s\n", chap_r);
@@ -254,15 +254,14 @@ static int chap_server_compute_md5(
 
tfm = crypto_alloc_shash("md5", 0, 0);
if (IS_ERR(tfm)) {
-   tfm = NULL;
pr_err("Unable to allocate struct crypto_shash\n");
-   goto out;
+   goto free_challenge_binhex;
}
 
desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL);
if (!desc) {
pr_err("Unable to allocate struct shash_desc\n");
-   goto out;
+   goto free_shash;
}
 
desc->tfm = tfm;
@@ -271,27 +270,27 @@ static int chap_server_compute_md5(
ret = crypto_shash_init(desc);
if (ret < 0) {
pr_err("crypto_shash_init() failed\n");
-   goto out;
+   goto free_desc;
}
 
ret = crypto_shash_update(desc, >id, 1);
if (ret < 0) {
pr_err("crypto_shash_update() failed for id\n");
-   goto out;
+   goto

[PATCH 0/8] target-iSCSI: Adjustments for several function implementations

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 22:22:11 +0100

Some update suggestions were taken into account
from static source code analysis.

Markus Elfring (8):
  Less function calls in chap_server_compute_md5() after error detection
  Move resetting of seven variables in chap_server_compute_md5()
  Delete 36 error messages for a failed memory allocation
  Delete an unnecessary variable initialisation in iscsit_allocate_ooo_cmdsn()
  Delete an unnecessary variable initialisation in iscsi_copy_param_list()
  Delete an unnecessary variable initialisation in iscsi_create_default_params()
  Delete an unnecessary variable initialisation in iscsi_set_default_param()
  Improve 16 size determinations

 drivers/target/iscsi/iscsi_target.c   |   2 -
 drivers/target/iscsi/iscsi_target_auth.c  | 110 +++---
 drivers/target/iscsi/iscsi_target_datain_values.c |   6 +-
 drivers/target/iscsi/iscsi_target_erl1.c  |  14 +--
 drivers/target/iscsi/iscsi_target_erl2.c  |   8 +-
 drivers/target/iscsi/iscsi_target_login.c |  29 ++
 drivers/target/iscsi/iscsi_target_nego.c  |   4 +-
 drivers/target/iscsi/iscsi_target_parameters.c|  58 +---
 drivers/target/iscsi/iscsi_target_seq_pdu_list.c  |  34 +++
 drivers/target/iscsi/iscsi_target_tpg.c   |  13 +--
 drivers/target/iscsi/iscsi_target_util.c  |  11 +--
 11 files changed, 118 insertions(+), 171 deletions(-)

-- 
2.15.1

Re: [PATCH 4/4] scsi_debug: add resp_write_scat function

2017-12-12 Thread Douglas Gilbert


On 2017-12-12 02:40 PM, Bart Van Assche wrote:

On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote:

-static const struct opcode_info_t vl_iarr[1] = {   /* VARIABLE LENGTH */
+static const struct opcode_info_t vl_iarr[2] = {   /* VARIABLE LENGTH */


Please leave out the array size and let the compiler determine the array size.


I like the "belts and braces" approach. Especially if offsetting an
array element by one position would silently destroy the parser
(which is not the case with vl_iarr but is the case with a few
other arrays).




{0, 0x7f, 0xb, F_SA_HIGH | F_D_OUT | FF_DIRECT_IO, resp_write_dt0,
-   NULL, {32,  0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0xb, 0xfa,
-  0, 0xff, 0xff, 0xff, 0xff} },/* WRITE(32) */
+   NULL, {32,  0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0xb, 0xfa,
+   0, 0xff, 0xff, 0xff, 0xff} },   /* WRITE(32) */


Shouldn't this change have been included in the patch that fixes the group
number mask?


git add --patch

was used to break up a monolithic patch into 4 parts. However in
the above case it refused to "split" the group_number part (shown)
from the renaming of service actions (not shown above). At that
point I swear at git and move on.

If folks think that hours are spent testing a kernel with 1/4 and
2/4 applied while 3/4 and 4/4 have not been applied then I think
they are dreaming. IMO The split up is eye candy for reviewers.


  static const struct opcode_info_t maint_in_iarr[2] = {
@@ -518,9 +523,10 @@ static const struct opcode_info_t 
opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
{6,  0x1, 0, 0xf, 0xf7, 0xc7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
{1, 0x9e, 0x10, F_SA_LOW | F_D_IN, resp_readcap16, sa_in_iarr,
{16,  0x10, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
-0xff, 0xff, 0xff, 0x1, 0xc7} },/* READ CAPACITY(16) */
-   {0, 0, 0, F_INV_OP | FF_RESPOND, NULL, NULL, /* SA OUT */
-   {0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
+0xff, 0xff, 0xff, 0x1, 0xc7} },/* SA_IN(16), READ CAPACITY(16) */


Shouldn't the above change be folded into one of the other patches?


I considered the patch adding resp_write_scat to the parsing table
and the splitting SDEB_I_SERV_ACT_IN/OUT to its 12 and 16 byte cdb
components as very closely related, so I put them together.


@@ -529,9 +535,9 @@ static const struct opcode_info_t 
opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
{0, 0x2f, 0, F_D_OUT_MAYBE | FF_DIRECT_IO, NULL, NULL, /* VERIFY(10) */
{10,  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xc7,
 0, 0, 0, 0, 0, 0} },
-   {1, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0,
-   vl_iarr, {32,  0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0x9, 0xfe, 0,
- 0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */
+   {2, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0,
+   vl_iarr, {32,  0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x9, 0xfe, 0,
+   0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */


Have you considered to use ARRAY_SIZE(vl_iarr) instead of hard-coding the array
size?


No. It could be done with the advantage of making the code
a bit safer for someone who changes it, but it obfuscates
the number elements in the bucket list (array) making it
harder for the reader of the code (maybe).


+   if (cmd[0] == VARIABLE_LENGTH_CMD) {
+   is_16 = false;
+   wrprotect = (cmd[10] >> 5) & 0x7;
+   lbdof = get_unaligned_be16(cmd + 12);
+   num_lrd = get_unaligned_be16(cmd + 16);
+   bt_len = get_unaligned_be32(cmd + 28);
+   check_prot = false;
+   } else {/* that leaves WRITE SCATTERED(16) */
+   is_16 = true;
+   wrprotect = (cmd[2] >> 5) & 0x7;
+   lbdof = get_unaligned_be16(cmd + 4);
+   num_lrd = get_unaligned_be16(cmd + 8);
+   bt_len = get_unaligned_be32(cmd + 10);
+   check_prot = true;
+   }


It's not clear to me why check_prot is set to false for WRITE SCATTERED(32)
and set to true for WRITE SCATTERED(16)?


Me neither. I was just following what the code for WRITE(16 and 32)
does. My guess is that the code in question is written by Martin P.
who is effectively co-maintainer of this driver having written all
the DIF and DIX stuff. Martin ... ?

Doug Gilbert

[PATCH] scsi: aacraid: fix io drop during the reset

2017-12-12 Thread Prasad B Munirathnam

"FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to
indicate command timeout, using the same flag in reset handler
causes the command to timeout and the IO's were droped.

defined a new flag "FIB_CONTEXT_FLAG_EH_RESET" to make sure IO
is properly handled in eh_reset handler.

Signed-off-by: Prasad B Munirathnam 
---
 drivers/scsi/aacraid/aacraid.h | 1 +
 drivers/scsi/aacraid/linit.c   | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index 403a639..b0b290f 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -1724,6 +1724,7 @@ struct aac_dev
 #define FIB_CONTEXT_FLAG_NATIVE_HBA(0x0010)
 #define FIB_CONTEXT_FLAG_NATIVE_HBA_TMF(0x0020)
 #define FIB_CONTEXT_FLAG_SCSI_CMD  (0x0040)
+#define FIB_CONTEXT_FLAG_EH_RESET  (0x0080)
 
 /*
  * Define the command values
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 3677bef..03af5bd 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1034,7 +1034,7 @@ static int aac_eh_bus_reset(struct scsi_cmnd* cmd)
info = >hba_map[bus][cid];
if (bus >= AAC_MAX_BUSES || cid >= AAC_MAX_TARGETS ||
info->devtype != AAC_DEVTYPE_NATIVE_RAW) {
-   fib->flags |= FIB_CONTEXT_FLAG_TIMED_OUT;
+   fib->flags |= FIB_CONTEXT_FLAG_EH_RESET;
cmd->SCp.phase = AAC_OWNER_ERROR_HANDLER;
}
}
-- 
2.7.4

Re: [PATCH 4/4] scsi_debug: add resp_write_scat function

2017-12-12 Thread Bart Van Assche

On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote:
> -static const struct opcode_info_t vl_iarr[1] = { /* VARIABLE LENGTH */
> +static const struct opcode_info_t vl_iarr[2] = { /* VARIABLE LENGTH */

Please leave out the array size and let the compiler determine the array size.

>   {0, 0x7f, 0xb, F_SA_HIGH | F_D_OUT | FF_DIRECT_IO, resp_write_dt0,
> - NULL, {32,  0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0xb, 0xfa,
> -0, 0xff, 0xff, 0xff, 0xff} },/* WRITE(32) */
> + NULL, {32,  0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0xb, 0xfa,
> + 0, 0xff, 0xff, 0xff, 0xff} },   /* WRITE(32) */

Shouldn't this change have been included in the patch that fixes the group
number mask?
 
>  static const struct opcode_info_t maint_in_iarr[2] = {
> @@ -518,9 +523,10 @@ static const struct opcode_info_t 
> opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
>   {6,  0x1, 0, 0xf, 0xf7, 0xc7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
>   {1, 0x9e, 0x10, F_SA_LOW | F_D_IN, resp_readcap16, sa_in_iarr,
>   {16,  0x10, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> -  0xff, 0xff, 0xff, 0x1, 0xc7} },/* READ CAPACITY(16) */
> - {0, 0, 0, F_INV_OP | FF_RESPOND, NULL, NULL, /* SA OUT */
> - {0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
> +  0xff, 0xff, 0xff, 0x1, 0xc7} },/* SA_IN(16), READ CAPACITY(16) */

Shouldn't the above change be folded into one of the other patches?

> @@ -529,9 +535,9 @@ static const struct opcode_info_t 
> opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
>   {0, 0x2f, 0, F_D_OUT_MAYBE | FF_DIRECT_IO, NULL, NULL, /* VERIFY(10) */
>   {10,  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xc7,
>0, 0, 0, 0, 0, 0} },
> - {1, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0,
> - vl_iarr, {32,  0xc7, 0, 0, 0, 0, 0x1f, 0x18, 0x0, 0x9, 0xfe, 0,
> -   0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */
> + {2, 0x7f, 0x9, F_SA_HIGH | F_D_IN | FF_DIRECT_IO, resp_read_dt0,
> + vl_iarr, {32,  0xc7, 0, 0, 0, 0, 0x3f, 0x18, 0x0, 0x9, 0xfe, 0,
> + 0xff, 0xff, 0xff, 0xff} },/* VARIABLE LENGTH, READ(32) */

Have you considered to use ARRAY_SIZE(vl_iarr) instead of hard-coding the array
size?

> + if (cmd[0] == VARIABLE_LENGTH_CMD) {
> + is_16 = false;
> + wrprotect = (cmd[10] >> 5) & 0x7;
> + lbdof = get_unaligned_be16(cmd + 12);
> + num_lrd = get_unaligned_be16(cmd + 16);
> + bt_len = get_unaligned_be32(cmd + 28);
> + check_prot = false;
> + } else {/* that leaves WRITE SCATTERED(16) */
> + is_16 = true;
> + wrprotect = (cmd[2] >> 5) & 0x7;
> + lbdof = get_unaligned_be16(cmd + 4);
> + num_lrd = get_unaligned_be16(cmd + 8);
> + bt_len = get_unaligned_be32(cmd + 10);
> + check_prot = true;
> + }

It's not clear to me why check_prot is set to false for WRITE SCATTERED(32)
and set to true for WRITE SCATTERED(16)?

Bart.

Re: [PATCH 2/4] scsi_debug: fix group_number mask 0x1f->0x3f

2017-12-12 Thread Bart Van Assche

On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote:
> Various cdb masks incorrectly assumed the GROUP NUMBER field was
> 5 bits long. It is actually 6 bits long. Correct.

Reviewed-by: Bart Van Assche

Re: [PATCH 3/4] scsi_debug: expand do_device_access to take sg offset

2017-12-12 Thread Bart Van Assche

On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote:
> WRITE SCATTERED needs to take several "bites" out of the data-out buffer.
> Expand the do_device_access() function to take a sg_skip argument.

Reviewed-by: Bart Van Assche

Re: [PATCH 1/4] scsi_debug: tab, kstrto changes, requested by checkpatch.pl

2017-12-12 Thread Bart Van Assche

On Mon, 2017-12-11 at 20:10 -0500, Douglas Gilbert wrote:
> Some of my development tools tend to add spaces (my preference) rather
> than tabs (kernel convention). Running unexpand to clean these spaces
> up found more of them than checkpatch.pl did. Then checkpatch.pl
> complained about other style violations in those newly tabbed lines.

Reviewed-by: Bart Van Assche

Re: [PATCH] scsi_dh_alua: skip RTPG for devices only supporting active/optimized

2017-12-12 Thread Bart Van Assche

On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote:
> On 12/12/2017 01:00 AM, Bart Van Assche wrote:
> > On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote:
> > > @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, 
> > > struct alua_port_group *pg)
> > >   retval = submit_rtpg(sdev, buff, bufflen, _hdr, pg->flags);
> > >  
> > >   if (retval) {
> > > + /*
> > > +  * If the target only supports active/optimized there's
> > > +  * not much we can do; it's not that we can switch paths
> > > +  * or somesuch.
> > > +  * So ignore any errors to avoid spurious failures during
> > > +  * path failover.
> > > +  */
> > > + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {
> > > + sdev_printk(KERN_INFO, sdev,
> > > + "%s: ignoring rtpg result %d\n",
> > > + ALUA_DH_NAME, retval);
> > > + kfree(buff);
> > > + return SCSI_DH_OK;
> > > + }
> > 
> > Hello Hannes,
> > 
> > Sorry but this change looks weird to me. If an RTPG fails, shouldn't
> > alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target
> > supports? Are you perhaps trying to implement a work-around for an array
> > that does not respond to RTPG during path transitions?
> > 
> 
> Yes, precisely.
> 
> Thing is: if an array is only supporting active/optimized the entire
> device-handler stuff is basically a no-op as we can't switch paths anyway.
> So it doesn't really matter if the RTPG fails; in fact, we could just
> short-circuit the entire logic. I did not do that to allow for a state
> modification (ie arrays _might_ decide to announce additional states
> eventually, and only starting off with active/optimized as an initial
> state set).
> 
> But if we return SCSI_DH_IO here the multipath logic will not attempt to
> switch paths, and failover will not work.

Hello Hannes,

I would appreciate it if it would be mentioned more clearly in the comment
above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code
is a workaround for non-standard array behavior. I'm afraid that otherwise
people who will read that code will be puzzled about why that code has been
added.

Thanks,

Bart.

[PATCH v3] Use blist_flags_t consistently

2017-12-12 Thread Bart Van Assche

Use the type blist_flags_t for all variables that represent blacklist
flags. Additionally, suppress recently introduced sparse warnings related
to blacklist flags.

Fixes: commit c6b54164508a ("scsi: Use 'blist_flags_t' for scsi_devinfo flags")
Signed-off-by: Bart Van Assche 
Reviewed-by: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
---
Changes compared to v2:
- Changed (__force unsigned int)sdev->sdev_bflags & BIT(i) into
  sdev->sdev_bflags & (__force blist_flags_t)BIT(i).

Changes compared to v1:
- Changed (__force u32) casts into (__force unsigned int).
- Left out one cast, namely the one in front of scsi_default_dev_flags.

 drivers/scsi/scsi_devinfo.c   |  6 ++
 drivers/scsi/scsi_scan.c  | 13 +++--
 drivers/scsi/scsi_sysfs.c |  5 +++--
 drivers/scsi/scsi_transport_spi.c | 12 +++-
 4 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c
index a15eb4d15114..700988e2769c 100644
--- a/drivers/scsi/scsi_devinfo.c
+++ b/drivers/scsi/scsi_devinfo.c
@@ -382,10 +382,8 @@ int scsi_dev_info_list_add_keyed(int compatible, char 
*vendor, char *model,
model, compatible);
 
if (strflags)
-   devinfo->flags = simple_strtoul(strflags, NULL, 0);
-   else
-   devinfo->flags = flags;
-
+   flags = (__force blist_flags_t)simple_strtoul(strflags, NULL, 
0);
+   devinfo->flags = flags;
devinfo->compatible = compatible;
 
if (compatible)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index be5e919db0e8..0880d975eed3 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -770,7 +770,7 @@ static int scsi_probe_lun(struct scsi_device *sdev, 
unsigned char *inq_result,
  * SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized
  **/
 static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
-   int *bflags, int async)
+   blist_flags_t *bflags, int async)
 {
int ret;
 
@@ -1049,14 +1049,15 @@ static unsigned char *scsi_inq_str(unsigned char *buf, 
unsigned char *inq,
  *   - SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized
  **/
 static int scsi_probe_and_add_lun(struct scsi_target *starget,
- u64 lun, int *bflagsp,
+ u64 lun, blist_flags_t *bflagsp,
  struct scsi_device **sdevp,
  enum scsi_scan_mode rescan,
  void *hostdata)
 {
struct scsi_device *sdev;
unsigned char *result;
-   int bflags, res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
+   blist_flags_t bflags;
+   int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 
/*
@@ -1201,7 +1202,7 @@ static int scsi_probe_and_add_lun(struct scsi_target 
*starget,
  * Modifies sdevscan->lun.
  **/
 static void scsi_sequential_lun_scan(struct scsi_target *starget,
-int bflags, int scsi_level,
+blist_flags_t bflags, int scsi_level,
 enum scsi_scan_mode rescan)
 {
uint max_dev_lun;
@@ -1292,7 +1293,7 @@ static void scsi_sequential_lun_scan(struct scsi_target 
*starget,
  * 0: scan completed (or no memory, so further scanning is futile)
  * 1: could not scan with REPORT LUN
  **/
-static int scsi_report_lun_scan(struct scsi_target *starget, int bflags,
+static int scsi_report_lun_scan(struct scsi_target *starget, blist_flags_t 
bflags,
enum scsi_scan_mode rescan)
 {
unsigned char scsi_cmd[MAX_COMMAND_SIZE];
@@ -1538,7 +1539,7 @@ static void __scsi_scan_target(struct device *parent, 
unsigned int channel,
unsigned int id, u64 lun, enum scsi_scan_mode rescan)
 {
struct Scsi_Host *shost = dev_to_shost(parent);
-   int bflags = 0;
+   blist_flags_t bflags = 0;
int res;
struct scsi_target *starget;
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index cbc0fe2c5485..c15ca457dd23 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -967,7 +967,8 @@ sdev_show_wwid(struct device *dev, struct device_attribute 
*attr,
 }
 static DEVICE_ATTR(wwid, S_IRUGO, sdev_show_wwid, NULL);
 
-#define BLIST_FLAG_NAME(name) [ilog2(BLIST_##name)] = #name
+#define BLIST_FLAG_NAME(name)  \
+   [ilog2((__force unsigned int)BLIST_##name)] = #name
 static const char *const sdev_bflags_name[] = {
 #include "scsi_devinfo_tbl.c"
 };
@@ -984,7 +985,7 @@ sdev_show_blacklist(struct device *dev, struct 
device_attribute *attr,
for (i = 0; i <

Re: [GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread Martin K. Petersen


Linus,

> The commit you point to _is_ the probnlem. It does:
>
> struct bfad_im_port_s *im_port = shost->hostdata[0];
>
> which is utter bullshit crap.

Sorry, wrong commit:

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=fixes=48d83282db077f93b2cf40de120f4d6f29eb293b

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread James Bottomley

On Tue, 2017-12-12 at 09:32 -0800, Linus Torvalds wrote:
> On Tue, Dec 12, 2017 at 9:22 AM, Martin K. Petersen
>  wrote:
> > 
> > 
> > Arnd and Johannes fixed this up right away:
> 
> The commit you point to _is_ the probnlem. It does:
> 
> struct bfad_im_port_s *im_port = shost->hostdata[0];
> 
> which is utter bullshit crap.
> 
> Notice? It's assigning a pointer (im_port), from an integer value
> ("hostdata[0]" is "unsigned long").
> 
> The code is garbage.

he means this one:

https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?h=fixes=48d83282db077f93b2cf40de120f4d6f29eb293b

Which properly encapsulates the reference in a function which does the correct 
conversions.

James

Re: [GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread Linus Torvalds

On Tue, Dec 12, 2017 at 9:22 AM, Martin K. Petersen
 wrote:
>
> Arnd and Johannes fixed this up right away:

The commit you point to _is_ the probnlem. It does:

struct bfad_im_port_s *im_port = shost->hostdata[0];

which is utter bullshit crap.

Notice? It's assigning a pointer (im_port), from an integer value
("hostdata[0]" is "unsigned long").

The code is garbage.

  Linus

Re: [GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread James Bottomley

On Tue, 2017-12-12 at 12:22 -0500, Martin K. Petersen wrote:
> Linus,
> 
> > 
> > This is utter shite, and doesn't even compile cleanly.
> > 
> > Sure, it's "just" a warning, and the code works. But no, I'm not
> > pulling crap like this. If you save a pointer in an integer
> > "hostdata[0]" field, then you damn well do the proper casts or
> > helper
> > functions or something, you don't just ignore the compiler when it
> > very reasonably warns about it.
> > 
> > What the hell is going on? Nobody compiled this stuff at all? Or
> > nobody cares about new build warnings?
> 
> Arnd and Johannes fixed this up right away:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?
> h=fixes=45349821ab3a8d378b8f37e52c6fe1aa1b870c47

Yes, I have that in the current testing batch, how about I push the
combined thing on wednesday.

James

Re: [GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread Martin K. Petersen


Linus,

> This is utter shite, and doesn't even compile cleanly.
>
> Sure, it's "just" a warning, and the code works. But no, I'm not
> pulling crap like this. If you save a pointer in an integer
> "hostdata[0]" field, then you damn well do the proper casts or helper
> functions or something, you don't just ignore the compiler when it
> very reasonably warns about it.
>
> What the hell is going on? Nobody compiled this stuff at all? Or
> nobody cares about new build warnings?

Arnd and Johannes fixed this up right away:

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=fixes=45349821ab3a8d378b8f37e52c6fe1aa1b870c47

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread Linus Torvalds

On Tue, Dec 12, 2017 at 8:21 AM, James Bottomley
 wrote:
>
> The most important one is the bfa fix because it's easy to oops the
> kernel with this driver, a regression in the new timespec conversion in
> aacraid and a regression in the Fibre Channel ELS handling patch.  The
> other three are a theoretical problem with termination in the
> vendor/host matching code and a use after free in lpfc.

No, this is just complete garbage.

> Johannes Thumshirn (1):
>   scsi: bfa: fix access to bfad_im_port_s

This is utter shite, and doesn't even compile cleanly.

Sure, it's "just" a warning, and the code works. But no, I'm not
pulling crap like this. If you save a pointer in an integer
"hostdata[0]" field, then you damn well do the proper casts or helper
functions or something, you don't just ignore the compiler when it
very reasonably warns about it.

What the hell is going on? Nobody compiled this stuff at all? Or
nobody cares about new build warnings?

That is not acceptable.

  Linus

Re: [PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()

2017-12-12 Thread Bart Van Assche

On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote:
> From: Hannes Reinecke 
> 
> When a device is probed asynchronously del_gendisk() might be called
> before the async probing was run, causing del_gendisk() to crash
> due to uninitialized sysfs objects.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  block/genhd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index c2223f1..cc40d95 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk)
>   struct disk_part_iter piter;
>   struct hd_struct *part;
>  
> + if (!(disk->flags & GENHD_FL_UP))
> + return;
> +
>   blk_integrity_del(disk);
>   disk_del_events(disk);

Hello Hannes,

Thank you for having published your approach for increasing disk probing
concurrency. Your approach looks interesting to me. However, I don't think
that patches 1/4..3/4 are sufficient to avoid races between e.g.
device_add_disk() and del_gendisk(). As far as I know no locks are held
around the device_add_disk() and del_gendisk() calls. Does that mean that
del_gendisk() can call e.g. blk_integrity_del() before device_add_disk() has
called blk_integrity_add()?

Bart.

Re: [PATCH v2] Use blist_flags_t consistently

2017-12-12 Thread Bart Van Assche

On Tue, 2017-12-12 at 08:52 +0100, Christoph Hellwig wrote:
> On Mon, Dec 11, 2017 at 04:08:03PM -0800, Bart Van Assche wrote:
> > +   if (!((__force unsigned int)sdev->sdev_bflags & BIT(i)))
> 
> I'd case the right side argument to  __force blist_flags_t here for
> purely esthetic reasons.
> 
> Except for that this looks fine to me:
> 
> Reviewed-by: Christoph Hellwig 

Hello Christoph,

Thanks for the review. I will make the proposed change and repost this patch.

Bart.

[GIT PULL] SCSI fixes for 4.15-rc3

2017-12-12 Thread James Bottomley

The most important one is the bfa fix because it's easy to oops the
kernel with this driver, a regression in the new timespec conversion in
aacraid and a regression in the Fibre Channel ELS handling patch.  The
other three are a theoretical problem with termination in the
vendor/host matching code and a use after free in lpfc.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Arnd Bergmann (1):
  scsi: aacraid: address UBSAN warning regression

Dan Carpenter (1):
  scsi: lpfc: Use after free in lpfc_rq_buf_free()

Johannes Thumshirn (1):
  scsi: bfa: fix access to bfad_im_port_s

Martin Wilck (3):
  scsi: scsi_devinfo: cleanly zero-pad devinfo strings
  scsi: scsi_devinfo: handle non-terminated strings
  scsi: libfc: fix ELS request handling

With the diffstat:

 drivers/scsi/aacraid/commsup.c |  8 ++--
 drivers/scsi/bfa/bfad_bsg.c|  6 --
 drivers/scsi/libfc/fc_lport.c  |  4 
 drivers/scsi/lpfc/lpfc_mem.c   |  2 +-
 drivers/scsi/scsi_devinfo.c| 27 ++-
 5 files changed, 25 insertions(+), 22 deletions(-)


And full diff below

James

---

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index bec9f3193f60..80a8cb26cdea 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -2482,8 +2482,8 @@ int aac_command_thread(void *data)
/* Synchronize our watches */
if (((NSEC_PER_SEC - (NSEC_PER_SEC / HZ)) > now.tv_nsec)
 && (now.tv_nsec > (NSEC_PER_SEC / HZ)))
-   difference = (((NSEC_PER_SEC - now.tv_nsec) * 
HZ)
- + NSEC_PER_SEC / 2) / NSEC_PER_SEC;
+   difference = HZ + HZ / 2 -
+now.tv_nsec / (NSEC_PER_SEC / HZ);
else {
if (now.tv_nsec > NSEC_PER_SEC / 2)
++now.tv_sec;
@@ -2507,6 +2507,10 @@ int aac_command_thread(void *data)
if (kthread_should_stop())
break;
 
+   /*
+* we probably want usleep_range() here instead of the
+* jiffies computation
+*/
schedule_timeout(difference);
 
if (kthread_should_stop())
diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
index 72ca2a2e08e2..09ef68c8225f 100644
--- a/drivers/scsi/bfa/bfad_bsg.c
+++ b/drivers/scsi/bfa/bfad_bsg.c
@@ -3135,7 +3135,8 @@ bfad_im_bsg_vendor_request(struct bsg_job *job)
struct fc_bsg_request *bsg_request = job->request;
struct fc_bsg_reply *bsg_reply = job->reply;
uint32_t vendor_cmd = bsg_request->rqst_data.h_vendor.vendor_cmd[0];
-   struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job));
+   struct Scsi_Host *shost = fc_bsg_to_shost(job);
+   struct bfad_im_port_s *im_port = shost->hostdata[0];
struct bfad_s *bfad = im_port->bfad;
void *payload_kbuf;
int rc = -EINVAL;
@@ -3350,7 +3351,8 @@ int
 bfad_im_bsg_els_ct_request(struct bsg_job *job)
 {
struct bfa_bsg_data *bsg_data;
-   struct bfad_im_port_s *im_port = shost_priv(fc_bsg_to_shost(job));
+   struct Scsi_Host *shost = fc_bsg_to_shost(job);
+   struct bfad_im_port_s *im_port = shost->hostdata[0];
struct bfad_s *bfad = im_port->bfad;
bfa_bsg_fcpt_t *bsg_fcpt;
struct bfad_fcxp*drv_fcxp;
diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c
index 5da46052e179..21be672679fb 100644
--- a/drivers/scsi/libfc/fc_lport.c
+++ b/drivers/scsi/libfc/fc_lport.c
@@ -904,10 +904,14 @@ static void fc_lport_recv_els_req(struct fc_lport *lport,
case ELS_FLOGI:
if (!lport->point_to_multipoint)
fc_lport_recv_flogi_req(lport, fp);
+   else
+   fc_rport_recv_req(lport, fp);
break;
case ELS_LOGO:
if (fc_frame_sid(fp) == FC_FID_FLOGI)
fc_lport_recv_logo_req(lport, fp);
+   else
+   fc_rport_recv_req(lport, fp);
break;
case ELS_RSCN:
lport->tt.disc_recv_req(lport, fp);
diff --git a/drivers/scsi/lpfc/lpfc_mem.c b/drivers/scsi/lpfc/lpfc_mem.c
index 56faeb049b4a..87c08ff37ddd 100644
--- a/drivers/scsi/lpfc/lpfc_mem.c
+++ b/drivers/scsi/lpfc/lpfc_mem.c
@@ -753,12 +753,12 @@ lpfc_rq_buf_free(struct lpfc_hba *phba, struct 
lpfc_dmabuf *mp)
drqe.address_hi = putPaddrHigh(rqb_entry->dbuf.phys);
rc = lpfc_sli4_rq_put(rqb_entry->hrq, rqb_entry->drq, , );
if (rc < 0) {
-

[PATCH] arcmsr: Fix possible sleep-in-atomic bugs in arcmsr_queue_command

2017-12-12 Thread Jia-Ju Bai

From: Jia-Ju Bai 

The driver may sleep under a spinlock, and the function call paths are:
arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaA_stop_bgrb
  arcmsr_hbaA_wait_msgint_ready
msleep --> may sleep

arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaB_stop_bgrb
  arcmsr_hbaB_wait_msgint_ready
msleep --> may sleep

arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaC_stop_bgrb
  arcmsr_hbaC_wait_msgint_ready
msleep --> may sleep

arcmsr_queue_command(acquire the spinlock)
  arcmsr_queue_command_lck
arcmsr_handle_virtual_command
  arcmsr_iop_message_xfer
arcmsr_iop_parking
  arcmsr_stop_adapter_bgrb
arcmsr_hbaD_stop_bgrb
  arcmsr_hbaD_wait_msgint_ready
msleep --> may sleep

To fix them, msleep is replaced with mdelay.
These bugs are found by my static analysis tool and my code review.

Signed-off-by: Jia-Ju Bai 
---
 drivers/scsi/arcmsr/arcmsr_hba.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index af032c4..31d94bd 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -347,7 +347,7 @@ static uint8_t arcmsr_hbaA_wait_msgint_ready(struct 
AdapterControlBlock *acb)
>outbound_intstatus);
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
 
return false;
@@ -367,7 +367,7 @@ static uint8_t arcmsr_hbaB_wait_msgint_ready(struct 
AdapterControlBlock *acb)
reg->drv2iop_doorbell);
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
 
return false;
@@ -385,7 +385,7 @@ static uint8_t arcmsr_hbaC_wait_msgint_ready(struct 
AdapterControlBlock *pACB)
>outbound_doorbell_clear); /*clear 
interrupt*/
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
 
return false;
@@ -403,7 +403,7 @@ static bool arcmsr_hbaD_wait_msgint_ready(struct 
AdapterControlBlock *pACB)
reg->outbound_doorbell);
return true;
}
-   msleep(10);
+   mdelay(10);
} /* max 20 seconds */
return false;
 }
-- 
1.7.9.5

[PATCH 2/2] target: cxgbit: Combine substrings for 11 messages

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 12:57:47 +0100

The script "checkpatch.pl" pointed information out like the following.

WARNING: quoted string split across lines

Thus fix the affected source code places.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/cxgbit/cxgbit_target.c | 50 +
 1 file changed, 16 insertions(+), 34 deletions(-)

diff --git a/drivers/target/iscsi/cxgbit/cxgbit_target.c 
b/drivers/target/iscsi/cxgbit/cxgbit_target.c
index 9163152b6d0b..e4a0dbc8fb9d 100644
--- a/drivers/target/iscsi/cxgbit/cxgbit_target.c
+++ b/drivers/target/iscsi/cxgbit/cxgbit_target.c
@@ -881,9 +881,7 @@ cxgbit_handle_immediate_data(struct iscsi_cmd *cmd, struct 
iscsi_scsi_req *hdr,
if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) {
pr_err("ImmediateData CRC32C DataDigest error\n");
if (!conn->sess->sess_ops->ErrorRecoveryLevel) {
-   pr_err("Unable to recover from"
-  " Immediate Data digest failure while"
-  " in ERL=0.\n");
+   pr_err("Unable to recover from Immediate Data digest 
failure while in ERL=0.\n");
iscsit_reject_cmd(cmd, ISCSI_REASON_DATA_DIGEST_ERROR,
  (unsigned char *)hdr);
return IMMEDIATE_DATA_CANNOT_RECOVER;
@@ -1054,19 +1052,14 @@ static int cxgbit_handle_iscsi_dataout(struct 
cxgbit_sock *csk)
}
 
if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) {
-   pr_err("ITT: 0x%08x, Offset: %u, Length: %u,"
-  " DataSN: 0x%08x\n",
-  hdr->itt, hdr->offset, data_len,
-  hdr->datasn);
-
+   pr_err("ITT: 0x%08x, Offset: %u, Length: %u, DataSN: 0x%08x\n",
+  hdr->itt, hdr->offset, data_len, hdr->datasn);
dcrc_err = true;
goto check_payload;
}
 
-   pr_debug("DataOut data_len: %u, "
-   "write_data_done: %u, data_length: %u\n",
- data_len,  cmd->write_data_done,
- cmd->se_cmd.data_length);
+   pr_debug("DataOut data_len: %u, write_data_done: %u, data_length: %u\n",
+data_len, cmd->write_data_done, cmd->se_cmd.data_length);
 
if (!(pdu_cb->flags & PDUCBF_RX_DATA_DDPD)) {
u32 skip = data_offset % PAGE_SIZE;
@@ -1102,9 +1095,7 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, 
struct iscsi_cmd *cmd)
 
if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) {
if (!conn->sess->sess_ops->ErrorRecoveryLevel) {
-   pr_err("Unable to recover from"
-  " NOPOUT Ping DataCRC failure while in"
-  " ERL=0.\n");
+   pr_err("Unable to recover from NOPOUT Ping DataCRC 
failure while in ERL=0.\n");
ret = -1;
goto out;
} else {
@@ -1112,9 +1103,8 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, 
struct iscsi_cmd *cmd)
 * drop this PDU and let the
 * initiator plug the CmdSN gap.
 */
-   pr_info("Dropping NOPOUT"
-   " Command CmdSN: 0x%08x due to"
-   " DataCRC error.\n", hdr->cmdsn);
+   pr_info("Dropping NOPOUT Command CmdSN: 0x%08x due to 
DataCRC error.\n",
+   hdr->cmdsn);
ret = 0;
goto out;
}
@@ -1139,9 +1129,7 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, 
struct iscsi_cmd *cmd)
 */
cmd->buf_ptr = ping_data;
cmd->buf_ptr_size = payload_length;
-
-   pr_debug("Got %u bytes of NOPOUT ping"
-   " data.\n", payload_length);
+   pr_debug("Got %u bytes of NOPOUT ping data.\n", payload_length);
pr_debug("Ping Data: \"%s\"\n", ping_data);
}
 
@@ -1168,18 +1156,15 @@ cxgbit_handle_text_cmd(struct cxgbit_sock *csk, struct 
iscsi_cmd *cmd)
 
if (pdu_cb->flags & PDUCBF_RX_DCRC_ERR) {
if (!conn->sess->sess_ops->ErrorRecoveryLevel) {
-   pr_err("Unable to recover from"
-  " Text Data digest failure while in"
-  " ERL=0.\n");
+   pr_err("Unable to recover from Text Data digest failure 
while in ERL=0.\n");
goto reject;
} else {
/*
 * drop this PDU and let the
 * initiator plug the CmdSN gap.
 */
-   pr_info("Dropping Text"
-

[PATCH 1/2] target: cxgbit: Delete an error message for a failed memory allocation in two functions

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 11:34:51 +0100

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/target/iscsi/cxgbit/cxgbit_target.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/target/iscsi/cxgbit/cxgbit_target.c 
b/drivers/target/iscsi/cxgbit/cxgbit_target.c
index 514986b57c2d..9163152b6d0b 100644
--- a/drivers/target/iscsi/cxgbit/cxgbit_target.c
+++ b/drivers/target/iscsi/cxgbit/cxgbit_target.c
@@ -1126,8 +1126,6 @@ static int cxgbit_handle_nop_out(struct cxgbit_sock *csk, 
struct iscsi_cmd *cmd)
if (payload_length && hdr->ttt == cpu_to_be32(0x)) {
ping_data = kzalloc(payload_length + 1, GFP_KERNEL);
if (!ping_data) {
-   pr_err("Unable to allocate memory for"
-   " NOPOUT ping data.\n");
ret = -1;
goto out;
}
@@ -1188,11 +1186,9 @@ cxgbit_handle_text_cmd(struct cxgbit_sock *csk, struct 
iscsi_cmd *cmd)
 
if (payload_length) {
text_in = kzalloc(payload_length, GFP_KERNEL);
-   if (!text_in) {
-   pr_err("Unable to allocate text_in of payload_length: 
%u\n",
-  payload_length);
+   if (!text_in)
return -ENOMEM;
-   }
+
skb_copy_bits(csk->skb, pdu_cb->doffset,
  text_in, payload_length);
 
-- 
2.15.1

[PATCH 0/2] target/iscsi/cxgbit: Adjustments for seven function implementations

2017-12-12 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 12 Dec 2017 13:12:11 +0100

Two update suggestions were taken into account
from static source code analysis.

Markus Elfring (2):
  Delete an error message for a failed memory allocation in two functions
  Combine substrings for 11 messages

 drivers/target/iscsi/cxgbit/cxgbit_target.c | 58 +
 1 file changed, 18 insertions(+), 40 deletions(-)

-- 
2.15.1

[PATCH 4/4] scsi: arcmsr: simplify all arcmsr_hbaX_get_config routine by call a new get_adapter_config function

2017-12-12 Thread Ching Huang

From: Ching Huang 

simplify all arcmsr_hbaX_get_config routine by call a new get_adapter_config 
function

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index b7a56e8..95c9f08 100755
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2956,75 +2956,66 @@ static int arcmsr_queue_command_lck(struct scsi_cmnd 
*cmd,
 
 static DEF_SCSI_QCMD(arcmsr_queue_command)
 
-static bool arcmsr_hbaA_get_config(struct AdapterControlBlock *acb)
+static void arcmsr_get_adapter_config(struct AdapterControlBlock *pACB, 
uint32_t *rwbuffer)
 {
-   struct MessageUnit_A __iomem *reg = acb->pmuA;
-   char *acb_firm_model = acb->firm_model;
-   char *acb_firm_version = acb->firm_version;
-   char *acb_device_map = acb->device_map;
-   char __iomem *iop_firm_model = (char __iomem 
*)(>message_rwbuffer[15]);
-   char __iomem *iop_firm_version = (char __iomem 
*)(>message_rwbuffer[17]);
-   char __iomem *iop_device_map = (char __iomem 
*)(>message_rwbuffer[21]);
int count;
-   arcmsr_wait_firmware_ready(acb);
-   writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0);
-   if (!arcmsr_hbaA_wait_msgint_ready(acb)) {
-   printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \
-   miscellaneous data' timeout \n", acb->host->host_no);
-   return false;
-   }
-   count = 8;
-   while (count){
-   *acb_firm_model = readb(iop_firm_model);
+   uint32_t *acb_firm_model = (uint32_t *)pACB->firm_model;
+   uint32_t *acb_firm_version = (uint32_t *)pACB->firm_version;
+   uint32_t *acb_device_map = (uint32_t *)pACB->device_map;
+   uint32_t *firm_model = [15];
+   uint32_t *firm_version = [17];
+   uint32_t *device_map = [21];
+
+   count = 2;
+   while (count) {
+   *acb_firm_model = readl(firm_model);
acb_firm_model++;
-   iop_firm_model++;
+   firm_model++;
count--;
}
-
-   count = 16;
-   while (count){
-   *acb_firm_version = readb(iop_firm_version);
+   count = 4;
+   while (count) {
+   *acb_firm_version = readl(firm_version);
acb_firm_version++;
-   iop_firm_version++;
+   firm_version++;
count--;
}
-
-   count=16;
-   while(count){
-   *acb_device_map = readb(iop_device_map);
+   count = 4;
+   while (count) {
+   *acb_device_map = readl(device_map);
acb_device_map++;
-   iop_device_map++;
+   device_map++;
count--;
}
+   pACB->signature = readl([0]);
+   pACB->firm_request_len = readl([1]);
+   pACB->firm_numbers_queue = readl([2]);
+   pACB->firm_sdram_size = readl([3]);
+   pACB->firm_hd_channels = readl([4]);
+   pACB->firm_cfg_version = readl([25]);
pr_notice("Areca RAID Controller%d: Model %s, F/W %s\n",
-   acb->host->host_no,
-   acb->firm_model,
-   acb->firm_version);
-   acb->signature = readl(>message_rwbuffer[0]);
-   acb->firm_request_len = readl(>message_rwbuffer[1]);
-   acb->firm_numbers_queue = readl(>message_rwbuffer[2]);
-   acb->firm_sdram_size = readl(>message_rwbuffer[3]);
-   acb->firm_hd_channels = readl(>message_rwbuffer[4]);
-   acb->firm_cfg_version = readl(>message_rwbuffer[25]);  
/*firm_cfg_version,25,100-103*/
+   pACB->host->host_no,
+   pACB->firm_model,
+   pACB->firm_version);
+}
+
+static bool arcmsr_hbaA_get_config(struct AdapterControlBlock *acb)
+{
+   struct MessageUnit_A __iomem *reg = acb->pmuA;
+
+   arcmsr_wait_firmware_ready(acb);
+   writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0);
+   if (!arcmsr_hbaA_wait_msgint_ready(acb)) {
+   printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \
+   miscellaneous data' timeout \n", acb->host->host_no);
+   return false;
+   }
+   arcmsr_get_adapter_config(acb, reg->message_rwbuffer);
return true;
 }
 static bool arcmsr_hbaB_get_config(struct AdapterControlBlock *acb)
 {
struct MessageUnit_B *reg = acb->pmuB;
-   char *acb_firm_model = acb->firm_model;
-   char *acb_firm_version = acb->firm_version;
-   char *acb_device_map = acb->device_map;
-   char __iomem *iop_firm_model;
-   /*firm_model,15,60-67*/
-   char __iomem *iop_firm_version;
-   /*firm_version,17,68-83*/
-   char __iomem *iop_device_map;
-   /*firm_version,21,84-99*/
-   int count;
-
-   iop_firm_model = (char __iomem *)(>message_rwbuffer[15]);  
/*firm_model,15,60-67*/
-   iop_firm_version = (char __iomem *)(>message_rwbuffer[17]);

[PATCH 3/4] scsi: arcmsr: simplify arcmsr_hbaE_get_config function

2017-12-12 Thread Ching Huang

From: Ching Huang 

simplify arcmsr_hbaE_get_config function

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index dfaea8f..b7a56e8 100755
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -3205,16 +3205,14 @@ static bool arcmsr_hbaE_get_config(struct 
AdapterControlBlock *pACB)
struct MessageUnit_E __iomem *reg = pACB->pmuE;
char __iomem *iop_firm_model = (char __iomem 
*)(>msgcode_rwbuffer[15]);
char __iomem *iop_firm_version = (char __iomem 
*)(>msgcode_rwbuffer[17]);
-   uint32_t intmask_org, Index, firmware_state = 0, read_doorbell;
+   uint32_t intmask_org;
int count;
 
/* disable all outbound interrupt */
intmask_org = readl(>host_int_mask); /* disable outbound message0 
int */
writel(intmask_org | ARCMSR_HBEMU_ALL_INTMASKENABLE, 
>host_int_mask);
/* wait firmware ready */
-   do {
-   firmware_state = readl(>outbound_msgaddr1);
-   } while ((firmware_state & ARCMSR_HBEMU_MESSAGE_FIRMWARE_OK) == 0);
+   arcmsr_wait_firmware_ready(pACB);
mdelay(20);
/* post "get config" instruction */
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0);
@@ -3222,17 +3220,7 @@ static bool arcmsr_hbaE_get_config(struct 
AdapterControlBlock *pACB)
pACB->out_doorbell ^= ARCMSR_HBEMU_DRV2IOP_MESSAGE_CMD_DONE;
writel(pACB->out_doorbell, >iobound_doorbell);
/* wait message ready */
-   for (Index = 0; Index < 2000; Index++) {
-   read_doorbell = readl(>iobound_doorbell);
-   if ((read_doorbell ^ pACB->in_doorbell) & 
ARCMSR_HBEMU_IOP2DRV_MESSAGE_CMD_DONE) {
-   writel(0, >host_int_status);
-   pACB->in_doorbell = read_doorbell;
-   break;
-   }
-   mdelay(1);
-   } /*max 1 seconds*/
-
-   if (Index >= 2000) {
+   if (!arcmsr_hbaE_wait_msgint_ready(pACB)) {
pr_notice("arcmsr%d: wait get adapter firmware "
"miscellaneous data timeout\n", pACB->host->host_no);
return false;

[PATCH 2/4] scsi: arcmsr: waiting for iop firmware ready before issue get_config command to iop

2017-12-12 Thread Ching Huang

From: Ching Huang 

waiting for iop firmware ready before issue get_config command to iop for 
adapter type A and D

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 9b587ca..dfaea8f 100755
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2966,6 +2966,7 @@ static bool arcmsr_hbaA_get_config(struct 
AdapterControlBlock *acb)
char __iomem *iop_firm_version = (char __iomem 
*)(>message_rwbuffer[17]);
char __iomem *iop_device_map = (char __iomem 
*)(>message_rwbuffer[21]);
int count;
+   arcmsr_wait_firmware_ready(acb);
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0);
if (!arcmsr_hbaA_wait_msgint_ready(acb)) {
printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \
@@ -3149,6 +3150,7 @@ static bool arcmsr_hbaD_get_config(struct 
AdapterControlBlock *acb)
writel(ARCMSR_ARC1214_IOP2DRV_MESSAGE_CMD_DONE,
acb->pmuD->outbound_doorbell);/*clear interrupt*/
}
+   arcmsr_wait_firmware_ready(acb);
/* post "get config" instruction */
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, reg->inbound_msgaddr0);
/* wait message ready */

[PATCH 1/4] scsi: arcmsr: simplify arcmsr_hbaC_get_config function

2017-12-12 Thread Ching Huang

From: Ching Huang 

simplify arcmsr_hbaC_get_config function

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index e4258b6..9b587ca 100755
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -3082,7 +3082,7 @@ static bool arcmsr_hbaB_get_config(struct 
AdapterControlBlock *acb)
 
 static bool arcmsr_hbaC_get_config(struct AdapterControlBlock *pACB)
 {
-   uint32_t intmask_org, Index, firmware_state = 0;
+   uint32_t intmask_org;
struct MessageUnit_C __iomem *reg = pACB->pmuC;
char *acb_firm_model = pACB->firm_model;
char *acb_firm_version = pACB->firm_version;
@@ -3093,21 +3093,12 @@ static bool arcmsr_hbaC_get_config(struct 
AdapterControlBlock *pACB)
intmask_org = readl(>host_int_mask); /* disable outbound message0 
int */
writel(intmask_org|ARCMSR_HBCMU_ALL_INTMASKENABLE, >host_int_mask);
/* wait firmware ready */
-   do {
-   firmware_state = readl(>outbound_msgaddr1);
-   } while ((firmware_state & ARCMSR_HBCMU_MESSAGE_FIRMWARE_OK) == 0);
+   arcmsr_wait_firmware_ready(pACB);
/* post "get config" instruction */
writel(ARCMSR_INBOUND_MESG0_GET_CONFIG, >inbound_msgaddr0);
writel(ARCMSR_HBCMU_DRV2IOP_MESSAGE_CMD_DONE, >inbound_doorbell);
/* wait message ready */
-   for (Index = 0; Index < 2000; Index++) {
-   if (readl(>outbound_doorbell) & 
ARCMSR_HBCMU_IOP2DRV_MESSAGE_CMD_DONE) {
-   
writel(ARCMSR_HBCMU_IOP2DRV_MESSAGE_CMD_DONE_DOORBELL_CLEAR, 
>outbound_doorbell_clear);/*clear interrupt*/
-   break;
-   }
-   udelay(10);
-   } /*max 1 seconds*/
-   if (Index >= 2000) {
+   if (!arcmsr_hbaC_wait_msgint_ready(pACB)) {
printk(KERN_NOTICE "arcmsr%d: wait 'get adapter firmware \
miscellaneous data' timeout \n", pACB->host->host_no);
return false;

[PATCH 0/4] scsi: arcmsr: simplify hba_get_config routine

2017-12-12 Thread Ching Huang

From: Ching Huang 

These patches are apply to Martin's 4.16/scsi-queue.

patch 1: simplify arcmsr_hbaC_get_config function.

patch 2: wait iop firmware ready before issue get_config command to iop.

patch 3: simplify arcmsr_hbaE_get_config function.

patch 4: simplify all arcmsr_hbaX_get_config routine by call a new 
get_adapter_config function
---

[mainline] rcu stalls on CPU when unbinding mpt3sas driver

2017-12-12 Thread Abdul Haleem

Hi,

Off late we are seeing cpu stalls messages while mpt3sas driver unbind
on powerpc machine for both mainline and linux-next kernels

Machine Type: Power 8 Bare-metal
Kernel version: 4.15.0-rc2
config: attached.
test: driver unbind

$ echo -n 0001:03:00.0 > /sys/bus/pci/drivers/mpt3sas/unbind
mpt3sas_cm0: removing handle(0x000a), sas_addr(0x500304801f080d00)
mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(0)
mpt3sas_cm0: removing enclosure level(0x), connector name( )
mpt3sas_cm0: removing handle(0x000b), sas_addr(0x500304801f080d01)
mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(1)
mpt3sas_cm0: removing enclosure level(0x), connector name( )
mpt3sas_cm0: removing handle(0x000c), sas_addr(0x500304801f080d02)
mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(2)
mpt3sas_cm0: removing enclosure level(0x), connector name( )
mpt3sas_cm0: removing handle(0x000d), sas_addr(0x500304801f080d03)
mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(3)
mpt3sas_cm0: removing enclosure level(0x), connector name( )
mpt3sas_cm0: removing handle(0x000e), sas_addr(0x500304801f080d04)
mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(4)
mpt3sas_cm0: removing enclosure level(0x), connector name( )
mpt3sas_cm0: removing handle(0x000f), sas_addr(0x500304801f080d3d)
mpt3sas_cm0: removing : enclosure logical id(0x500304801f080d3f), slot(12)
mpt3sas_cm0: removing enclosure level(0x), connector name( )
sd 16:0:0:0: [sdb] Synchronizing SCSI cache
sd 16:0:0:0: [sdb] Synchronize Cache(10) failed: Result: 
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
sd 16:0:1:0: [sdc] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT 
driverbyte=DRIVER_OK
sd 16:0:1:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 
00 00 00 00 00 00 00 e5 00
sd 16:0:2:0: [sdd] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT 
driverbyte=DRIVER_OK
sd 16:0:2:0: [sdd] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 
00 00 00 00 00 00 00 e5 00
sd 16:0:3:0: [sde] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT 
driverbyte=DRIVER_OK
sd 16:0:3:0: [sde] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 
00 00 00 00 00 00 00 e5 00
sd 16:0:4:0: [sdf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT 
driverbyte=DRIVER_OK
sd 16:0:4:0: [sdf] tag#0 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 
00 00 00 00 00 00 00 e5 00

few minutes after above command was executed, machine is flooded with rcu 
stalls messages.

INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 86-... } 44191221 
jiffies s: 3445 root: 0x20/.
blocking rcu_node structures: l=1:80-95:0x40/.
Task dump for CPU 86:
sh  R  running task10384 18136  1 0x00042086
Call Trace:
[c07792d47370] [c07933667200] 0xc07933667200 (unreliable)
INFO: rcu_sched self-detected stall on CPU
86-: (50420459 ticks this GP) idle=0ae/141/0 
softirq=11962/11962 fqs=24724293 
 (t=50420460 jiffies g=80217 c=80216 q=36817447)
NMI backtrace for cpu 86
CPU: 86 PID: 18136 Comm: sh Not tainted 4.15.0-rc2-autotest #1
Call Trace:
[c07792d46f20] [c099b83c] dump_stack+0xb0/0xf4 (unreliable)
[c07792d46f60] [c09a43e4] nmi_cpu_backtrace+0x1a4/0x210
[c07792d46ff0] [c09a462c] nmi_trigger_cpumask_backtrace+0x1dc/0x220
[c07792d47090] [c002c7d0] arch_trigger_cpumask_backtrace+0x20/0x40
[c07792d470b0] [c017496c] rcu_dump_cpu_stacks+0xf4/0x158
[c07792d47100] [c0173cb0] rcu_check_callbacks+0x8f0/0xb00
[c07792d47230] [c017c25c] update_process_times+0x3c/0x90
[c07792d47260] [c01921e4] tick_sched_handle.isra.13+0x44/0x80
[c07792d47280] [c0192278] tick_sched_timer+0x58/0xb0
[c07792d472c0] [c017cd58] __hrtimer_run_queues+0xf8/0x330
[c07792d47340] [c017da74] hrtimer_interrupt+0xe4/0x280
[c07792d47400] [c0022660] __timer_interrupt+0x90/0x270
[c07792d47450] [c0022d30] timer_interrupt+0xa0/0xe0
[c07792d47480] [c0009238] decrementer_common+0x158/0x160
--- interrupt: 901 at replay_interrupt_return+0x0/0x4
LR = arch_local_irq_restore+0x74/0x90
[c07792d47770] [c03fb3185000] 0xc03fb3185000 (unreliable)
[c07792d47790] [c09bb658] _raw_spin_unlock_irqrestore+0x38/0x60
[c07792d477b0] [c066f274] scsi_remove_target+0x204/0x270
[c07792d47820] [dfc72604] sas_rphy_remove+0x94/0xa0 
[scsi_transport_sas]
[c07792d47850] [dfc745bc] sas_port_delete+0x4c/0x238 
[scsi_transport_sas]
[c07792d478b0] [d00010e82990] mpt3sas_transport_port_remove+0x2d0/0x310 
[mpt3sas]
[c07792d47950] [d00010e71ba0] _scsih_remove_device+0x100/0x2a0 [mpt3sas]
[c07792d47a10] [d00010e774d4] 
mpt3sas_device_remove_by_sas_address.part.44+0xb4/0x160 [mpt3sas]
[c07792d47a70] [d00010e77614]

[PATCH 4/4] sd: use async_probe cookie to avoid deadlocks

2017-12-12 Thread Hannes Reinecke

With the current design we're waiting for all async probes to
finish when removing any sd device.
This might lead to a livelock where the 'remove' call is blocking
for any probe calls to finish, and the probe calls are waiting for
a response, which will never be processes as the thread handling
the responses is waiting for the remove call to finish.
Which is completely pointless as we only _really_ care for the
probe on _this_ device to be completed; any other probing can
happily continue for all we care.
So save the async probing cookie in the structure and only wait
if this specific probe is still active.

Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/sd.c | 6 --
 drivers/scsi/sd.h | 3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index abbab17..7bf20ca 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3416,7 +3416,8 @@ static int sd_probe(struct device *dev)
dev_set_drvdata(dev, sdkp);
 
get_device(>dev); /* prevent release before async_schedule */
-   async_schedule_domain(sd_probe_async, sdkp, _sd_probe_domain);
+   sdkp->async_probe = async_schedule_domain(sd_probe_async, sdkp,
+ _sd_probe_domain);
 
return 0;
 
@@ -3454,7 +3455,8 @@ static int sd_remove(struct device *dev)
scsi_autopm_get_device(sdkp->device);
 
async_synchronize_full_domain(_sd_pm_domain);
-   async_synchronize_full_domain(_sd_probe_domain);
+   async_synchronize_cookie_domain(sdkp->async_probe,
+   _sd_probe_domain);
device_del(>dev);
del_gendisk(sdkp->disk);
sd_shutdown(dev);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 320de75..d8aff29 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -2,6 +2,8 @@
 #ifndef _SCSI_DISK_H
 #define _SCSI_DISK_H
 
+#include 
+
 /*
  * More than enough for everybody ;)  The huge number of majors
  * is a leftover from 16bit dev_t days, we don't really need that
@@ -73,6 +75,7 @@ struct scsi_disk {
struct device   dev;
struct gendisk  *disk;
struct opal_dev *opal_dev;
+   async_cookie_t  async_probe;
 #ifdef CONFIG_BLK_DEV_ZONED
unsigned intnr_zones;
unsigned intzone_blocks;
-- 
1.8.5.6

[PATCH 3/4] scsi: add missing get_device() return value checks

2017-12-12 Thread Hannes Reinecke

We need to validate that get_device() succeeded, otherwise
we'll end up working with invalid devices.

Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/scsi_scan.c | 14 --
 drivers/scsi/sd.c|  8 +---
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index be5e919..18edfd7 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -245,6 +245,10 @@ static struct scsi_device *scsi_alloc_sdev(struct 
scsi_target *starget,
INIT_WORK(>requeue_work, scsi_requeue_run_queue);
 
sdev->sdev_gendev.parent = get_device(>dev);
+   if (!sdev->sdev_gendev.parent) {
+   kfree(sdev);
+   return NULL;
+   }
sdev->sdev_target = starget;
 
/* usually NULL and set by ->slave_alloc instead */
@@ -366,7 +370,8 @@ static struct scsi_target *__scsi_find_target(struct device 
*parent,
}
}
if (found_starget)
-   get_device(_starget->dev);
+   if (!get_device(_starget->dev))
+   found_starget = NULL;
 
return found_starget;
 }
@@ -436,6 +441,10 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
device_initialize(dev);
kref_init(>reap_ref);
dev->parent = get_device(parent);
+   if (!dev->parent) {
+   kfree(starget);
+   return NULL;
+   }
dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id);
dev->bus = _bus_type;
dev->type = _target_type;
@@ -469,7 +478,8 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
return NULL;
}
}
-   get_device(dev);
+   /* No good way to recover here; keep fingers crossed */
+   WARN_ON(!get_device(dev));
 
return starget;
 
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 228b0b62..abbab17 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -619,10 +619,12 @@ static struct scsi_disk *scsi_disk_get(struct gendisk 
*disk)
 
if (disk->private_data) {
sdkp = scsi_disk(disk);
-   if (scsi_device_get(sdkp->device) == 0)
-   get_device(>dev);
-   else
+   if (scsi_device_get(sdkp->device))
+   sdkp = NULL;
+   else if (!get_device(>dev)) {
+   scsi_device_put(sdkp->device);
sdkp = NULL;
+   }
}
mutex_unlock(_ref_mutex);
return sdkp;
-- 
1.8.5.6

[PATCH 2/4] sd: Check if parent is still present before proceeding with probing

2017-12-12 Thread Hannes Reinecke

When a new SCSI disk is created we should be checking if the parent
device is actually present before proceeding with probing.

Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/sd.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ab75ebd..228b0b62 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3399,6 +3399,10 @@ static int sd_probe(struct device *dev)
}
 
device_initialize(>dev);
+
+   if (!get_device(dev))
+   goto out_free_index;
+
sdkp->dev.parent = dev;
sdkp->dev.class = _disk_class;
dev_set_name(>dev, "%s", dev_name(dev));
@@ -3407,7 +3411,6 @@ static int sd_probe(struct device *dev)
if (error)
goto out_free_index;
 
-   get_device(dev);
dev_set_drvdata(dev, sdkp);
 
get_device(>dev); /* prevent release before async_schedule */
-- 
1.8.5.6

[PATCH 0/4] sd async probing

2017-12-12 Thread Hannes Reinecke

Hi all,

here's a patchset to facilitate full async scsi disk probing.
The current approach has the problem of calling
async_synchronize_full_domain() during sd_remove(), which is required
to wait for _all_ current asynchronous probes to complete.
When running under iSER sd_remove() is called from the RX thread directly,
ie sd_remove() has to complete before RX processing can continue.
But as probing has to issue I/O any response will be stuck in the RX queue,
resulting in a deadlock.

This patchset breaks up the synchronization point into a per-device cookie,
thus removing the global synchronisation and breaking the deadlock.
But during testing some more deficiencies up and down the stack had been
revealed, so we'll need the entire patchset to make it work.

As usual, comments and reviews are welcome.

Hannes Reinecke (4):
  block: check for GENHD_FL_UP in del_gendisk()
  sd: Check if parent is still present before proceeding with probing
  scsi: add missing get_device() return value checks
  sd: use async_probe cookie to avoid deadlocks

 block/genhd.c|  3 +++
 drivers/scsi/scsi_scan.c | 14 --
 drivers/scsi/sd.c| 19 +--
 drivers/scsi/sd.h|  3 +++
 4 files changed, 31 insertions(+), 8 deletions(-)

-- 
1.8.5.6

[PATCH 1/4] block: check for GENHD_FL_UP in del_gendisk()

2017-12-12 Thread Hannes Reinecke

From: Hannes Reinecke 

When a device is probed asynchronously del_gendisk() might be called
before the async probing was run, causing del_gendisk() to crash
due to uninitialized sysfs objects.

Signed-off-by: Hannes Reinecke 
---
 block/genhd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/genhd.c b/block/genhd.c
index c2223f1..cc40d95 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -697,6 +697,9 @@ void del_gendisk(struct gendisk *disk)
struct disk_part_iter piter;
struct hd_struct *part;
 
+   if (!(disk->flags & GENHD_FL_UP))
+   return;
+
blk_integrity_del(disk);
disk_del_events(disk);
 
-- 
1.8.5.6

[PATCH] esas2r: Fix a possible sleep-in-atomic bug in esas2r_wait_request

2017-12-12 Thread Jia-Ju Bai

The driver may sleep in the interrupt handler.
The function call path is:
esas2r_adapter_tasklet (interrupt handler)
  esas2r_do_tasklet_tasks
esas2r_handle_chip_rst_during_tasklet
  esas2r_init_adapter_hw
esas2r_init_msgs
  esas2r_wait_request
schedule_timeout_interruptible --> may sleep

To fix it, schedule_timeout_uninterruptible is replaced with mdelay.

This bug is found by my static analysis tool(DSAC) and checked by my code 
review.

Signed-off-by: Jia-Ju Bai 
---
 drivers/scsi/esas2r/esas2r_main.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/esas2r/esas2r_main.c 
b/drivers/scsi/esas2r/esas2r_main.c
index 4eb1430..4cd8f79 100644
--- a/drivers/scsi/esas2r/esas2r_main.c
+++ b/drivers/scsi/esas2r/esas2r_main.c
@@ -1307,7 +1307,7 @@ void esas2r_wait_request(struct esas2r_adapter *a, struct 
esas2r_request *rq)
if (rq->req_stat != RS_STARTED)
break;
 
-   schedule_timeout_interruptible(msecs_to_jiffies(100));
+   mdelay(100);
 
if ((jiffies_to_msecs(jiffies) - starttime) > timeout) {
esas2r_hdebug("request TMO");
-- 
1.7.9.5

[PATCH] esas2r: Fix a possible sleep-in-atomic bug in esas2r_flash_access

2017-12-12 Thread Jia-Ju Bai

The driver may sleep in the interrupt handler.
The function call path is:
esas2r_adapter_tasklet (interrupt handler)
  esas2r_do_tasklet_tasks
esas2r_handle_chip_rst_during_tasklet
  esas2r_init_adapter_hw
esas2r_nvram_read_direct
  esas2r_read_flash_block
esas2r_flash_access
  schedule_timeout_interruptible --> may sleep

To fix it, schedule_timeout_uninterruptible is replaced with mdelay.

This bug is found by my static analysis tool(DSAC) and checked by my code 
review.

Signed-off-by: Jia-Ju Bai 
---
 drivers/scsi/esas2r/esas2r_flash.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/esas2r/esas2r_flash.c 
b/drivers/scsi/esas2r/esas2r_flash.c
index 7bd376d..9b3da4c 100644
--- a/drivers/scsi/esas2r/esas2r_flash.c
+++ b/drivers/scsi/esas2r/esas2r_flash.c
@@ -965,7 +965,7 @@ static bool esas2r_flash_access(struct esas2r_adapter *a, 
u32 function)
break;
}
 
-   schedule_timeout_interruptible(msecs_to_jiffies(100));
+   mdelay(100);
 
if ((jiffies_to_msecs(jiffies) - starttime) > timeout) {
/*
-- 
1.7.9.5

[BUG] drivers/scsi/esas2r: a possible sleep-in-atomic bug in esas2r_nvram_read_direct

2017-12-12 Thread Jia-Ju Bai


The driver may sleep in the interrupt handler.
The function call path is:
esas2r_adapter_tasklet (interrupt handler)
  esas2r_do_tasklet_tasks
esas2r_handle_chip_rst_during_tasklet
  esas2r_init_adapter_hw
esas2r_nvram_read_direct
  down_interruptible --> may sleep

I do not find a good way to fix it, so I only report.
This possible bug is found by my static analysis tool (DSAC) and checked 
by my code review.



Thanks,
Jia-Ju Bai

[PATCH] esas2r: Fix possible sleep-in-atomic bugs in esas2r_check_adapter

2017-12-12 Thread Jia-Ju Bai

The driver may sleep in the interrupt handler.
The function call path is:
esas2r_adapter_tasklet (interrupt handler)
  esas2r_do_tasklet_tasks
esas2r_handle_chip_rst_during_tasklet
  esas2r_check_adapter
schedule_timeout_interruptible

To fix it, schedule_timeout_uninterruptible is replaced with mdelay.

This bug is found by my static analysis tool(DSAC) and checked by my code 
review.


Signed-off-by: Jia-Ju Bai 
---
 drivers/scsi/esas2r/esas2r_init.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/esas2r/esas2r_init.c 
b/drivers/scsi/esas2r/esas2r_init.c
index 5b14dd2..0b9f547 100644
--- a/drivers/scsi/esas2r/esas2r_init.c
+++ b/drivers/scsi/esas2r/esas2r_init.c
@@ -1068,7 +1068,7 @@ bool esas2r_check_adapter(struct esas2r_adapter *a)
break;
}
 
-   schedule_timeout_interruptible(msecs_to_jiffies(100));
+   mdelay(100);
 
if ((jiffies_to_msecs(jiffies) - starttime) > 18) {
esas2r_hdebug("FW ready TMO");
@@ -1091,7 +1091,7 @@ bool esas2r_check_adapter(struct esas2r_adapter *a)
break;
}
 
-   schedule_timeout_interruptible(msecs_to_jiffies(50));
+   mdelay(50);
 
if ((jiffies_to_msecs(jiffies) - starttime) > 3000) {
esas2r_hdebug("timeout waiting for interface down");
@@ -1180,7 +1180,7 @@ bool esas2r_check_adapter(struct esas2r_adapter *a)
break;
}
 
-   schedule_timeout_interruptible(msecs_to_jiffies(100));
+   mdelay(100);
 
if ((jiffies_to_msecs(jiffies) - starttime) > 3000) {
esas2r_hdebug(
-- 
1.7.9.5

Re: [PATCH] sd: Increase SCSI disk probing concurrency

2017-12-12 Thread Hannes Reinecke

On 12/12/2017 01:16 AM, Bart Van Assche wrote:
> On Sun, 2017-12-10 at 15:44 +0100, Hannes Reinecke wrote:
>> You know what, I have been working on a similar patch for quite some
>> time now; [ ... ]
> 
> That's very interesting :-)
> 
>> However, in doing so I have encountered several issues which have been
>> exposed by that; the most trivial one being that del_gendisk() doesn't
>> check if GENHD_FL_UP is set, so it'll crash if sd_remove is called
>> before sd_async_probe() is run.
> 
> Have you noticed that my patch adds a call to sd_wait_for_probing() before
> the call to del_gendisk()? That should be sufficient to prevent the crash you
> described.
> 
Not necessarily; see below.

>> The other one is an imbalance between sd_probe and sd_remove;
>> when sd_probe_async() is called _after_ scsi_device_remove_device()
>> (which it will as the synchronization is only after device_del() has
>> been called) it will trip over non-existent sysfs directories in add_disk().
>> So one need to short-circuit sd_probe_async() for devices in SDEV_CANCEL
>> or SDEV_DEL.
> 
> Same comment here - I think the sd_wait_for_probing() call I inserted in
> sd_remove() should prevent this scenario.
> 
No, it doesn't.
sd_remove is called via

__scsi_device_remove_device()
-> device_del()
   -> sd_remove()

with this piece of code:

bsg_unregister_queue(sdev->request_queue);
device_unregister(>sdev_dev);
transport_remove_device(dev);
device_del(dev);

ie by the time device_del() we've already called device_unregister(),
which unfortunately is the sysfs parent kobj for the gendisk.
So when you're calling sd_wait_for_probing() in sd_remove() it'll call
sd_async_probe(), and add_disk() will be called with a NULL parent object.

>> However, I'm still waiting for a final confirmation on the latter issue,
>> hence I haven't posted the patchset.
>> If there's interest I can post them, though.
> 
> The big question here is how you would like to proceed - should we start from
> your patch series or start from my patch? Since I have not encountered any of
> the issues you described with my patch, does that mean that the approach I
> followed results in more reliable operation?
> 
Not necessarily.
The issues I've described got reported by a big CDN provider after
several days worth of hammering.
It's not something I could reproduce here in my puny lab.
So just because you haven't seen the race doesn't mean there is none :-)

I'll be posting the patches shortly.

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

52 matches

Mail list logo