date:20180403

Re: [PATCH] scsi: Fix failed request error code

2018-04-03 Thread Hannes Reinecke

On Wed,  4 Apr 2018 15:51:38 +0900
Damien Le Moal  wrote:

> With the introduction of commit e39a97353e53 ("scsi: core: return
> BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"), a command
> that failed with hostbyte=DID_OK and driverbyte=DRIVER_SENSE but
> lacking additional sense information will have a return code set to
> BLK_STS_OK. This results in the request issuer to see successful
> request execution despite the failure. An example of such case is an
> unaligned write on a host managed ZAC disk connected to a SAS HBA
> with a malfunctioning SAT. The unaligned write command gets aborted
> but has no additional sense information.
> 
> sd 10:0:0:0: [sde] tag#3905 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE sd 10:0:0:0: [sde] tag#3905 Sense Key :
> Aborted Command [current] sd 10:0:0:0: [sde] tag#3905 Add. Sense: No
> additional sense information sd 10:0:0:0: [sde] tag#3905 CDB:
> Write(16) 8a 00 00 00 00 00 02 0c 00 01 00 00 00 01 00 00
> print_req_error: I/O error, dev sde, sector 274726920
> 
> In scsi_io_completion(), sense key handling to not change the request
> error code and success being reported to the issuer.
> 
> Fix this by making sure that the error code always indicates an error
> if scsi_io_completion() decide that the action to be taken for a
> failed command is to not retry it and terminate it immediately
> (ACTION_FAIL) .
> 
> Signed-off-by: Damien Le Moal 
> Fixes: e39a97353e53 ("scsi: core: return BLK_STS_OK for DID_OK in
> __scsi_error_from_host_byte()") Cc: Hannes Reinecke 
> Cc: 
> ---
>  drivers/scsi/scsi_lib.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index c84f931388f2..87579bfcc186 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1002,6 +1002,15 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes) scsi_print_command(cmd);
>   }
>   }
> + /*
> +  * The command failed and should not be retried. If
> the host
> +  * byte is DID_OK, then
> __scsi_error_from_host_byte() returned
> +  * BLK_STS_OK and error indicates a success. Make
> sure to not
> +  * use that as the completion code and always return
> an
> +  * I/O error.
> +  */
> + if (error == BLK_STS_OK)
> + error = BLK_STS_IOERR;
>   if (!scsi_end_request(req, error,
> blk_rq_err_bytes(req), 0)) return;
>   /*FALLTHRU*/

That looks wrong.
Shouldn't __scsi_error_from_host_byte() return the correct status here?

Cheers,

Hannes

[PATCH] scsi: Fix failed request error code

2018-04-03 Thread Damien Le Moal

With the introduction of commit e39a97353e53 ("scsi: core: return
BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"), a command that
failed with hostbyte=DID_OK and driverbyte=DRIVER_SENSE but lacking
additional sense information will have a return code set to BLK_STS_OK.
This results in the request issuer to see successful request execution
despite the failure. An example of such case is an unaligned write on a
host managed ZAC disk connected to a SAS HBA with a malfunctioning SAT.
The unaligned write command gets aborted but has no additional sense
information.

sd 10:0:0:0: [sde] tag#3905 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
sd 10:0:0:0: [sde] tag#3905 Sense Key : Aborted Command [current]
sd 10:0:0:0: [sde] tag#3905 Add. Sense: No additional sense information
sd 10:0:0:0: [sde] tag#3905 CDB: Write(16) 8a 00 00 00 00 00 02 0c 00 01 00 00 
00 01 00 00
print_req_error: I/O error, dev sde, sector 274726920

In scsi_io_completion(), sense key handling to not change the request
error code and success being reported to the issuer.

Fix this by making sure that the error code always indicates an error
if scsi_io_completion() decide that the action to be taken for a failed
command is to not retry it and terminate it immediately (ACTION_FAIL) .

Signed-off-by: Damien Le Moal 
Fixes: e39a97353e53 ("scsi: core: return BLK_STS_OK for DID_OK in 
__scsi_error_from_host_byte()")
Cc: Hannes Reinecke 
Cc: 
---
 drivers/scsi/scsi_lib.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c84f931388f2..87579bfcc186 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1002,6 +1002,15 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
scsi_print_command(cmd);
}
}
+   /*
+* The command failed and should not be retried. If the host
+* byte is DID_OK, then __scsi_error_from_host_byte() returned
+* BLK_STS_OK and error indicates a success. Make sure to not
+* use that as the completion code and always return an
+* I/O error.
+*/
+   if (error == BLK_STS_OK)
+   error = BLK_STS_IOERR;
if (!scsi_end_request(req, error, blk_rq_err_bytes(req), 0))
return;
/*FALLTHRU*/
-- 
2.14.3

[PATCH 1/1] scsi:mvsas:fix memory leak

2018-04-03 Thread Xidong Wang

In function mvs_pci_init(), the memory allocated by
scsi_host_alloc() is not released on the error path that mvi,
which holds the return value of mvs_pci_alloc(), is NULL.
This will result in a memory leak bug.

Signed-off-by: Xidong Wang 
---
 drivers/scsi/mvsas/mv_init.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 8c91637..936b8ef 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -370,8 +370,10 @@ static struct mvs_info *mvs_pci_alloc(struct pci_dev *pdev,
mvi = kzalloc(sizeof(*mvi) +
(1L << mvs_chips[ent->driver_data].slot_width) *
sizeof(struct mvs_slot_info), GFP_KERNEL);
-   if (!mvi)
+   if (!mvi) {
+   scsi_host_put(shost);
return NULL;
+   }
 
mvi->pdev = pdev;
mvi->dev = &pdev->dev;
-- 
2.7.4

[PATCH 1/1] scsi: mvsas:fix memory leak

2018-04-03 Thread Xidong Wang

In function mvs_pci_init(), the memory allocated by
scsi_host_alloc() is not released on the error path that mvi,
which holds the return value of mvs_pci_alloc(), is NULL.
This will result in a memory leak bug.

Signed-off-by: Xidong Wang 
---
 drivers/scsi/mvsas/mv_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 8c91637..bde4b50 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -371,6 +371,7 @@ static struct mvs_info *mvs_pci_alloc(struct pci_dev *pdev,
(1L << mvs_chips[ent->driver_data].slot_width) *
sizeof(struct mvs_slot_info), GFP_KERNEL);
if (!mvi)
+   scsi_host_put(shost);
return NULL;
 
mvi->pdev = pdev;
-- 
2.7.4

Re: [PATCH v5] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards

2018-04-03 Thread Finn Thain

On Tue, 3 Apr 2018, Christoph Hellwig wrote:

> > +static void zorro_esp_send_pio_cmd(struct esp *esp, u32 addr, u32 
> > esp_count,
> > +u32 dma_count, int write, u8 cmd)
> > +{
> > +   struct zorro_esp_priv *zep = dev_get_drvdata(esp->dev);
> > +   u8 __iomem *fifo = esp->regs + ESP_FDATA * 16;
> > +   u8 phase = esp->sreg & ESP_STAT_PMASK;
> > +
> > +   cmd &= ~ESP_CMD_DMA;
> > +
> > +   if (write) {
> 
> It seems like this routine would benefit from being split into a read
> and a write one.
> 

I'd prefer to leave this unchanged, for a couple of reasons.

The main reason is that zorro_esp_send_pio_cmd() is supposed to be a copy 
of mac_esp_send_pio_cmd(). Identical code is easy to refactor in order to 
eliminate the duplication, whereas small variations complicate review. 
I've already started work on patches to resolve the duplicated code.

That effort is complicated by the question that Michael has raised about 
the use of certain interrupt flags in the shared code. So I didn't simply 
put a refactoring patch before this one. Merging Michael's driver first 
seemed to make it easier for us to collaborate on this shared code.

Also, the calling convention here is just the send_dma_cmd method from 
struct esp_driver_ops. The change you suggest would lead to this,

static void zorro_esp_send_pio_cmd(struct esp *esp, u32 addr, u32 esp_count,
u32 dma_count, int write, u8 cmd)
{
if (write)
zorro_esp_send_pio_cmd_write(esp, addr, esp_count, dma_count,
cmd);
else
zorro_esp_send_pio_cmd_read(esp, addr, esp_count, dma_count,
cmd);
}

The reader who is trying to understand the call chain starting from the 
core driver learns nothing from this routine. Instead they have to jump 
through another level of indirection to find the actual implementation. 

Given that some esp drivers have multiple implementations of this 
send_dma_cmd method, your version would have three static functions where 
one is sufficient. Drivers like mac_esp that have two implementations of 
the send_dma_cmd method would end up with six functions instead of two.

--

[PATCH] aacraid: Insure command thread is not recursively stopped

2018-04-03 Thread Dave Carroll

If a recursive IOP_RESET is invoked, usually due to the eh_thread handling 
errors after the first reset, be sure we flag that the command thread has 
been stopped to avoid an Oops of the form;

 [ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not tainted 
4.14.0-49.el7a.ppc64le #1
 [ 336.620297] task: c03fd630b800 task.stack: c03fd61a4000
 [ 336.620326] NIP: c0176794 LR: c013038c CTR: c024bc10
 [ 336.620361] REGS: c03fd61a7720 TRAP: 0300 Not tainted 
(4.14.0-49.el7a.ppc64le)
 [ 336.620395] MSR: 90009033  CR: 22084022 
XER: 2004
 [ 336.620435] CFAR: c0130388 DAR:  DSISR: 4000 
SOFTE: 1
 [ 336.620435] GPR00: c013038c c03fd61a79a0 c14c7e00 

 [ 336.620435] GPR04: 000c 000c 90009033 
0477
 [ 336.620435] GPR08: 0477   
c00810f7d940
 [ 336.620435] GPR12: c024bc10 c7a33400 c01708a8 
c03fe3b881d8
 [ 336.620435] GPR16: c03fe3b88060 c03fd61a7d10 f000 
001e
 [ 336.620435] GPR20: 0001 c0ebf1a0 0001 
c03fe3b88000
 [ 336.620435] GPR24: 0003 0002 c03fe3b88840 
c03fe3b887e8
 [ 336.620435] GPR28: c03fe3b88000 c03fc8181788  
c03fc8181700
 [ 336.620750] NIP [c0176794] exit_creds+0x34/0x160
 [ 336.620775] LR [c013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620804] Call Trace:
 [ 336.620817] [c03fd61a79a0] [c03fe3b88000] 0xc03fe3b88000 
(unreliable)
 [ 336.620853] [c03fd61a79d0] [c013038c] 
__put_task_struct+0x8c/0x1f0
 [ 336.620889] [c03fd61a7a00] [c0171418] kthread_stop+0x1e8/0x1f0
 [ 336.620922] [c03fd61a7a40] [c00810f7448c] 
aac_reset_adapter+0x14c/0x8d0 [aacraid]
 [ 336.620959] [c03fd61a7b00] [c00810f60174] 
aac_eh_host_reset+0x84/0x100 [aacraid]
 [ 336.621010] [c03fd61a7b30] [c0864f24] 
scsi_try_host_reset+0x74/0x180
 [ 336.621046] [c03fd61a7bb0] [c0867ac0] 
scsi_eh_ready_devs+0xc00/0x14d0
 [ 336.625165] [c03fd61a7ca0] [c08699e0] 
scsi_error_handler+0x550/0x730
 [ 336.632101] [c03fd61a7dc0] [c0170a08] kthread+0x168/0x1b0
 [ 336.639031] [c03fd61a7e30] [c000b528] 
ret_from_kernel_thread+0x5c/0xb4
 [ 336.645971] Instruction dump:
 [ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78 6000 
6000
 [ 336.657056] 3940 e87f0838 f95f0838 7c0004ac <7d401828> 314a 7d40192d 
40c2fff4
 [ 336.663997] -[ end trace 4640cf8d4945ad95 ]-

So flag when the thread is stopped by setting the thread pointer to NULL.

Signed-off-by: Dave Carroll 
---
 drivers/scsi/aacraid/commsup.c | 4 +++-
 drivers/scsi/aacraid/linit.c   | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 84858d5..0156c96 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1502,9 +1502,10 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
forced, u8 reset_type)
host = aac->scsi_host_ptr;
scsi_block_requests(host);
aac_adapter_disable_int(aac);
-   if (aac->thread->pid != current->pid) {
+   if (aac->thread && aac->thread->pid != current->pid) {
spin_unlock_irq(host->host_lock);
kthread_stop(aac->thread);
+   aac->thread = NULL;
jafo = 1;
}
 
@@ -1591,6 +1592,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
forced, u8 reset_type)
  aac->name);
if (IS_ERR(aac->thread)) {
retval = PTR_ERR(aac->thread);
+   aac->thread = NULL;
goto out;
}
}
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 2664ea0..f24fb94 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1562,6 +1562,7 @@ static void __aac_shutdown(struct aac_dev * aac)
up(&fib->event_wait);
}
kthread_stop(aac->thread);
+   aac->thread = NULL;
}
 
aac_send_shutdown(aac);
-- 
2.8.4

Re: 4.15.14 crash with iscsi target and dvd

2018-04-03 Thread Bart Van Assche

On Sun, 2018-04-01 at 14:27 -0400, Wakko Warner wrote:
> Wakko Warner wrote:
> > Wakko Warner wrote:
> > > I tested 4.14.32 last night with the same oops.  4.9.91 works fine.
> > > From the initiator, if I do cat /dev/sr1 > /dev/null it works.  If I mount
> > > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target
> > > crashes.  I'm using the builtin iscsi target with pscsi.  I can burn from
> > > the initiator with out problems.  I'll test other kernels between 4.9 and
> > > 4.14.
> > 
> > So I've tested 4.x.y where x one of 10 11 12 14 15 and y is the latest patch
> > (except for 4.15 which was 1 behind)
> > Each of these kernels crash within seconds or immediate of doing find -type
> > f | xargs cat > /dev/null from the initiator.
> 
> I tried 4.10.0.  It doesn't completely lockup the system, but the device
> that was used hangs.  So from the initiator, it's /dev/sr1 and from the
> target it's /dev/sr0.  Attempting to read /dev/sr0 after the oops causes the
> process to hang in D state.

Hello Wakko,

Thank you for having narrowed down this further. I think that you encountered
a regression either in the block layer core or in the SCSI core. Unfortunately
the number of changes between kernel versions v4.9 and v4.10 in these two
subsystems is huge. I see two possible ways forward:
- Either that you perform a bisect to identify the patch that introduced this
  regression. However, I'm not sure whether you are familiar with the bisect
  process.
- Or that you identify the command that triggers this crash such that others
  can reproduce this issue without needing access to your setup.

How about reproducing this crash with the below patch applied on top of
kernel v4.15.x? The additional output sent by this patch to the system log
should allow us to reproduce this issue by submitting the same SCSI command
with sg_raw.

Thanks,

Bart.


Subject: [PATCH] Report commands with no physical segments in the system log

---
 drivers/scsi/scsi_lib.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6b6a6705f6e5..74a39db57d49 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1093,8 +1093,10 @@ int scsi_init_io(struct scsi_cmnd *cmd)
bool is_mq = (rq->mq_ctx != NULL);
int error = BLKPREP_KILL;
 
-   if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
+   if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq))) {
+   scsi_print_command(cmd);
goto err_exit;
+   }
 
error = scsi_init_sgtable(rq, &cmd->sdb);
if (error)

Re: Regarding SCSI SANITIZE command support

2018-04-03 Thread Douglas Gilbert


On 2018-04-02 07:10 AM, Mahesh Rajashekhara wrote:

Hello,

I am RAID HBA driver engineer here at Microsemi. We are working on linux driver 
development for Microsemi SAS/SATA RAID HBA controllers.

As per our understanding, while a drive is processing the SANITIZE command:
- The drive should still be exposed to the OS.
- The drive will fail all commands other than REQUEST SENSE


Not sure that the drive should ever knowingly fail INQUIRY and REPORT
LUNS. That is to allow a discovery process to occur. In the INQUIRY
response the peripheral qualifier should be 000b by my reading.

Go look at sbc4r15.pdf, specifically clause 4.11.2 to see all the
commands that should be responded to during a SANITIZE (there are
9 of them).


When we initiate SCSI SANITIZE operation on disk drive,  from the OS dmesg, we 
could see that continuous SCSI READ (10) commands have been sent by the OS 
upper layer drivers
and drive reports ASC: 04 ASCQ: 1B "LOGICAL UNIT NOT READY, SANITIZE IN 
PROGRESS". Since the sanitize operation is in progress, the drive will fail the 
commands.
Although, OS could see drive is sanitizing, we are not sure why does OS upper 
layer driver is sending SCSI READ (10).

During the OS boot, if the drive is sanitizing, OS boot is taking very long 
time and call trace issue has been occurred.

It would be of great help if you could shed some light on this issue.

Snippet of dmesg:

  [ 202.659196] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
  [ 202.659198] blk_update_request: I/O error, dev sdm, sector 0
  [ 202.659201] Buffer I/O error on dev sdm, logical block 0, async page read
  [ 202.659294] sd 0:0:11:0: [sdm] FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
  [ 202.659299] sd 0:0:11:0: [sdm] Sense Key : Not Ready [current]
  [ 202.659303] sd 0:0:11:0: [sdm] Add. Sense: Logical unit not ready, sanitize 
in progress
  [ 202.659308] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
  [ 202.659311] blk_update_request: I/O error, dev sdm, sector 0
  [ 202.659314] Buffer I/O error on dev sdm, logical block 0, async page read
  [ 202.659322] sdm: unable to read partition table
  [ 202.659633] sd 0:0:11:0: [sdm] FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
  [ 202.659638] sd 0:0:11:0: [sdm] Sense Key : Not Ready [current]
  [ 202.659641] sd 0:0:11:0: [sdm] Add. Sense: Logical unit not ready, sanitize 
in progress
  [ 202.659645] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
  [ 202.659648] blk_update_request: I/O error, dev sdm, sector 0
  [ 202.660621] sd 0:0:11:0: [sdm] Spinning up disk...
  [ 204.067155] Buffer I/O error on dev sdm, logical block 488378624, async 
page read
  [ 204.122152] SGI XFS with ACLs, security attributes, no debug enabled
  [ 204.123494] XFS (dm-0): Mounting V5 Filesystem
  [ 204.289817] XFS (dm-0): Ending clean mount
  [ 240.069283] INFO: task systemd-udevd:251 blocked for more than 120 seconds.
  [ 240.069289] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message
  [  203.661324] .
  [ 240.069293] systemd-udevd D
  [ 240.069296] c00c3580 0 251 1 0x0004
  [ 240.069301] 88007e0e7d18 0082 88007e0ceeb0 
88007e0e7fd8
  [ 240.069306] 88007e0e7fd8 88007e0e7fd8 88007e0ceeb0 

  [ 240.069311]  c00c35d0 0001 
c00c3580
  [ 240.069316] Call Trace:
  [ 240.069328] [] schedule+0x29/0x70
  [ 240.069334] [] async_synchronize_cookie_domain+0x85/0x150
  [ 240.069340] [] ? wake_up_atomic_t+0x30/0x30
  [ 240.069346] [] async_synchronize_full+0x17/0x20
  [ 240.069351] [] load_module+0x1fc0/0x29e0
  [ 240.069358] [] ? ddebug_proc_write+0xf0/0xf0
  [ 240.069365] [] SyS_init_module+0xc5/0x110
  [ 240.069371] [] system_call_fastpath+0x16/0x1b


The scsi mid-level and/or the sd driver sends TEST UNIT READY (TUR)
commands and should take note of the NOT READY sense key and should
not send READ commands until that condition clears. If we are talking
about modern SCSI devices then REQUEST SENSE should respond with progress
information.

Doug Gilbert

Re: [PATCH 03/15] mpt3sas: Add sanity checks for scsi tracker before accessing it.

2018-04-03 Thread Bart Van Assche

On Tue, 2018-04-03 at 10:11 +0530, Sreekanth Reddy wrote:
> [SR] No, driver calls _scsih_flush_running_cmds during Host reset time
> and during host reset time driver will set 'ioc->shost_recovery' flag.
> So in the scsih_qcmd() driver will return the incoming SCSI cmds with
> "SCSI_MLQUEUE_HOST_BUSY" whenever 'ioc->shost_recovery' flag is set as
> shown below,
> 
> /* host recovery or link resets sent via IOCTLs */
> if (ioc->shost_recovery || ioc->ioc_link_reset_in_progress)
> return SCSI_MLQUEUE_HOST_BUSY;

The ioc->shost_recovery flag is involved in at least the following race:
* From one context a SCSI command is submitted and scsih_qcmd() gets called.
* At the same time sg_reset is invoked from a shell and triggers a call to
  scsih_host_reset(). That function in turn will call
  mpt3sas_base_hard_reset_handler().

I think this scenario can cause ioc->shost_recovery to be set by the mpt3sas
driver after it has been checked by the scsih_qcmd() function.

Anyway, let's get back to patch 03/15 at the start of this e-mail thread.
That patch looks to me like an incomplete attempt to work around a race
condition in the mpt3sas driver. I don't expect that anyone will trust that
patch without further explanation. Which race condition does that patch
address? And why do the mpt3sas maintainers believe that this patch is
sufficient to address that race condition?

Thanks,

Bart.

Re: [PATCH v3] scsi: Introduce sdev_printk_ratelimited to throttle frequent printk

2018-04-03 Thread Jason Yan



On 2018/4/3 14:04, Wen Yang wrote:

There would be so many same lines printed by frequent printk if one
disk went wrong, like,
[  546.185242] sd 0:1:0:0: rejecting I/O to offline device
[  546.185258] sd 0:1:0:0: rejecting I/O to offline device
[  546.185280] sd 0:1:0:0: rejecting I/O to offline device
[  546.185307] sd 0:1:0:0: rejecting I/O to offline device
[  546.185334] sd 0:1:0:0: rejecting I/O to offline device
[  546.185364] sd 0:1:0:0: rejecting I/O to offline device
[  546.185390] sd 0:1:0:0: rejecting I/O to offline device
[  546.185410] sd 0:1:0:0: rejecting I/O to offline device
For slow serial console, the frequent printk may be blocked for a
long time, and if any spin_lock has been acquired before the printk
like in scsi_request_fn, watchdog could be triggered.

Related disscussion can be found here,
https://bugzilla.kernel.org/show_bug.cgi?id=199003
And Petr brought the idea to throttle the frequent printk, it's
useless to print the same lines frequently after all.

v2: fix some typos
v3: limit the print only for the same device

Suggested-by: Petr Mladek
Suggested-by: Sergey Senozhatsky
Signed-off-by: Wen Yang
Signed-off-by: Jiang Biao
Signed-off-by: Tan Hu
Reviewed-by: Bart Van Assche
CC: BartVanAssche
CC: Petr Mladek
CC: Sergey Senozhatsky
CC: Martin K. Petersen
CC: "James E.J. Bottomley"
CC: Tejun Heo
CC: JasonYan


In my machine it works fine.

Tested-by: Jason Yan

[PATCH v2] scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler()

2018-04-03 Thread Dan Carpenter

Smatch marks skb->data as untrusted so it complains that there is a
potential overflow here:

drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:2111 t4_uld_rx_handler()
error: buffer overflow 'cxgb4i_cplhandlers' 239 <= 255.

In this case, skb->data comes from the hardware or firmware so it's not
going to overflow unless there is a firmware bug.

Signed-off-by: Dan Carpenter 
---
v2: rebase, and re-write commit message

diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c 
b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
index 406e94312d4e..beb146b7c17c 100644
--- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
+++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
@@ -2108,11 +2108,11 @@ static int t4_uld_rx_handler(void *handle, const __be64 
*rsp,
log_debug(1 << CXGBI_DBG_TOE,
"cdev %p, opcode 0x%x(0x%x,0x%x), skb %p.\n",
 cdev, opc, rpl->ot.opcode_tid, ntohl(rpl->ot.opcode_tid), skb);
-   if (cxgb4i_cplhandlers[opc])
-   cxgb4i_cplhandlers[opc](cdev, skb);
-   else {
+   if (opc >= ARRAY_SIZE(cxgb4i_cplhandlers) || !cxgb4i_cplhandlers[opc]) {
pr_err("No handler for opcode 0x%x.\n", opc);
__kfree_skb(skb);
+   } else {
+   cxgb4i_cplhandlers[opc](cdev, skb);
}
return 0;
 nomem:

Re: [PATCH v5] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards

2018-04-03 Thread Finn Thain

On Tue, 3 Apr 2018, Christoph Hellwig wrote:

> 
> > +
> > +   if (zep->zorro3) {
> > +   /* Only Fastlane Z3 for now - add switch for correct struct
> > +* dma_registers size if adding any more
> > +*/
> > +   esp->dma_regs = ioremap_nocache(dmaaddr,
> > +   sizeof(struct fastlane_dma_registers));
> > +   } else
> > +   esp->dma_regs = (void __iomem *)ZTWO_VADDR(dmaaddr);
> 
> doesn't this need a __force (and a comment on why it is safe) to make
> sparse happy?
> 

Sparse is happy with that cast. These are the warnings I get:

  CHECK   /home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:976:33: warning: 
incorrect type in assignment (different address spaces)
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:976:33:expected 
unsigned long *board_base
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:976:33:got void 
[noderef] *
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:1068:28: warning: 
incorrect type in argument 1 (different address spaces)
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:1068:28:expected 
void [noderef] *addr
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:1068:28:got 
unsigned long *board_base
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:1096:28: warning: 
incorrect type in argument 1 (different address spaces)
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:1096:28:expected 
void [noderef] *addr
/home/fthain/src/kernel.org/linux/drivers/scsi/zorro_esp.c:1096:28:got 
unsigned long *board_base

which seems to indicate that board_base should have type void __iomem *.

--

Re: Multi-Actuator SAS HDD First Look

2018-04-03 Thread Christoph Hellwig

On Sat, Mar 31, 2018 at 01:03:46PM +0200, Hannes Reinecke wrote:
> Actually I would propose to have a 'management' LUN at LUN0, who could
> handle all the device-wide commands (eg things like START STOP UNIT,
> firmware update, or even SMART commands), and ignoring them for the
> remaining LUNs.

That is in fact the only workable option at all.  Everything else
completly breaks the scsi architecture.

Re: [PATCH v5] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards

2018-04-03 Thread Christoph Hellwig

Just a few style comments:

> +static int zorro_esp_irq_pending(struct esp *);
> +static int cyber_esp_irq_pending(struct esp *);
> +static int fastlane_esp_irq_pending(struct esp *);

Please avoid forward declarations wherever possible.

> +struct zorro_driver_data {
> + const char *name;
> + unsigned long offset;
> + unsigned long dma_offset;
> + int absolute;   /* offset is absolute address */
> + int scsi_option;
> + int (*irq_pending)(struct esp *esp);
> + void (*dma_invalidate)(struct esp *esp);
> + void (*send_dma_cmd)(struct esp *esp, u32 dma_addr, u32 esp_count,
> +  u32 dma_count, int write, u8 cmd);
> +};

Please use different esp_driver_ops instances for the different board
types.

> +static const struct zorro_driver_data cyberstormI_data = {
> + .name = "CyberStormI",
> + .offset = 0xf400,
> + .dma_offset = 0xf800,
> + .absolute = 0,
> + .scsi_option = 0,

You can remove all the xero initializations on static data.
Also please align the = signs with tabs in struct declarations.

> +static dma_addr_t zorro_esp_map_single(struct esp *esp, void *buf,
> +   size_t sz, int dir)
> +{
> + return dma_map_single(esp->dev, buf, sz, dir);
> +}
> +
> +static int zorro_esp_map_sg(struct esp *esp, struct scatterlist *sg,
> +   int num_sg, int dir)
> +{
> + return dma_map_sg(esp->dev, sg, num_sg, dir);
> +}
> +
> +static void zorro_esp_unmap_single(struct esp *esp, dma_addr_t addr,
> +   size_t sz, int dir)
> +{
> + dma_unmap_single(esp->dev, addr, sz, dir);
> +}
> +
> +static void zorro_esp_unmap_sg(struct esp *esp, struct scatterlist *sg,
> +   int num_sg, int dir)
> +{
> + dma_unmap_sg(esp->dev, sg, num_sg, dir);
> +}

These wrappers seem rather pointless.

> +/* PIO macros as used in mac_esp.c.
> + * Note that addr and fifo arguments are local-scope variables declared
> + * in zorro_esp_send_pio_cmd(), the macros are only used in that function,
> + * and addr and fifo are referenced in each use of the macros so there
> + * is no need to pass them as macro parameters.
> + */

Please use normal Linux comment style:

/*
 * Blah, blah, blah.
 */

> +#define ZORRO_ESP_PIO_LOOP(operands, reg1) \
> + { \
> + asm volatile ( \
> +  "1: moveb " operands "\n" \
> +  "   subqw #1,%1   \n" \
> +  "   jbne 1b   \n" \
> +  : "+a" (addr), "+r" (reg1) \
> +  : "a" (fifo)); \
> + }

What is the point of the curly braces around the asm statement?

> +static void zorro_esp_send_pio_cmd(struct esp *esp, u32 addr, u32 esp_count,
> +  u32 dma_count, int write, u8 cmd)
> +{
> + struct zorro_esp_priv *zep = dev_get_drvdata(esp->dev);
> + u8 __iomem *fifo = esp->regs + ESP_FDATA * 16;
> + u8 phase = esp->sreg & ESP_STAT_PMASK;
> +
> + cmd &= ~ESP_CMD_DMA;
> +
> + if (write) {

It seems like this routine would benefit from being split into a read
and a write one.

> +// Blizzard 1230/60 SCSI-IV DMA

/* ... */

> + /* Use PIO if transferring message bytes to esp->command_block_dma.
> +  * PIO requires a virtual address, so substitute esp->command_block
> +  * for addr.
> +  */

Comment style, again.

> +static struct esp_driver_ops zorro_esp_ops = {

should be marked const.

> + .esp_write8   = zorro_esp_write8,

Just use a space after the '='.

> +
> + if (zep->zorro3) {
> + /* Only Fastlane Z3 for now - add switch for correct struct
> +  * dma_registers size if adding any more
> +  */
> + esp->dma_regs = ioremap_nocache(dmaaddr,
> + sizeof(struct fastlane_dma_registers));
> + } else
> + esp->dma_regs = (void __iomem *)ZTWO_VADDR(dmaaddr);

doesn't this need a __force (and a comment on why it is ѕafe) to make
sparse happy?

Re: [PATCH] scsi: hosts: remove redundant assingment of shost->use_blk_mq

2018-04-03 Thread Johannes Thumshirn

On Mon, 2018-04-02 at 13:53 +, Bart Van Assche wrote:
> A similar patch has already been queued by Martin. See also
> https://patchwork.kernel.org/patch/10313569/.

Ah OK, fine then :-)
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH] scsi: Fix failed request error code

[PATCH] scsi: Fix failed request error code

[PATCH 1/1] scsi:mvsas:fix memory leak

[PATCH 1/1] scsi: mvsas:fix memory leak

Re: [PATCH v5] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards

[PATCH] aacraid: Insure command thread is not recursively stopped

Re: 4.15.14 crash with iscsi target and dvd

Re: Regarding SCSI SANITIZE command support

Re: [PATCH 03/15] mpt3sas: Add sanity checks for scsi tracker before accessing it.

Re: [PATCH v3] scsi: Introduce sdev_printk_ratelimited to throttle frequent printk

[PATCH v2] scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler()

Re: [PATCH v5] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards

Re: Multi-Actuator SAS HDD First Look

Re: [PATCH v5] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards

Re: [PATCH] scsi: hosts: remove redundant assingment of shost->use_blk_mq

15 matches

Site Navigation

Mail list logo

Footer information