date:20170816

Re: [PATCH 1/7] smartpqi: add pqi reset quiesce support

2017-08-16 Thread James Bottomley

On Thu, 2017-08-10 at 13:46 -0500, Don Brace wrote:
> From: Kevin Barnett 

To misspell someone else's company name could be considered unfortunate
... to misspell your own looks like carelessness.

You've done this twice, by the way.

James

> Reviewed-by: Scott Benesh 
> Signed-off-by: Kevin Barnett 
> Signed-off-by: Don Brace

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Martin K. Petersen


Bart,

> When I checked earlier today the ipr patch was not yet in linux-next

That's weird. They were both committed two weeks ago.

They appear to be in there now, though:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/drivers/scsi?ofs=50

-- 
Martin K. Petersen  Oracle Linux Engineering

[PATCH] scsi: sd_zbc: Disable zoned block devices with scsi-mq

2017-08-16 Thread Damien Le Moal

blk-mq may not maintain write requests order at dispatch time. So host
managed drives will not reliably work. Worse, potential reordering of
write requests on requeue may cause the zone write locking code to
deadlock command dispatch to the disk. So for now, until the write
ordering issue is fixed, do not support zoned block devices with
scsi-mq by showing a 0 LBA capacity disk.

Signed-off-by: Damien Le Moal 
---
 drivers/scsi/sd_zbc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 8aa54779aac1..07bd8511102a 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -561,6 +561,19 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp,
 */
return 0;
 
+   /*
+* In the scsi-mq case, write ordering is not guaranteed so
+* host managed drives will not reliably work. Worse, zone write
+* locking can cause dispatch deadlocks. So for now, do not support
+* zoned block devices with scsi-mq by showing a 0 capacity disk.
+*/
+   if (sdkp->disk->queue->mq_ops) {
+   if (sdkp->first_scan)
+   sd_printk(KERN_WARNING, sdkp,
+ "Zoned block devices are not supported with 
scsi-mq\n");
+   ret = -ENODEV;
+   goto err;
+   }
 
/* Get zoned block device characteristics */
ret = sd_zbc_read_zoned_characteristics(sdkp, buf);
-- 
2.13.5

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Bart Van Assche

On Wed, 2017-08-16 at 22:00 -0400, Martin K. Petersen wrote:
> Bart,
> 
> > I'm 99% sure there is nothing wrong with my patch but that my patch
> > uncovered a bug in the ipr driver (drivers/scsi/ipr). See also
> > "WARNING: CPU: 15 PID: 0 at block/blk-mq.c:
> > __blk_mq_run_hw_queue+0x1d8/0x1f0F"
> > (https://marc.info/?l=linux-block=150290686719644).
> 
> But I already have the ipr patch queued. What am I missing?

Hello Martin,

When I checked earlier today the ipr patch was not yet in linux-next ...

Bart.

Re: [PATCH] sd: read unmap block limits even if lbpme=0

2017-08-16 Thread Martin K. Petersen


Tom,

> Some devices may not be decent enough to report lbpme bit properly
> even when they do support unmap and report relevant information in the
> block limits and logical block provisioning VPDs properly (Namely,
> ASMedia ASM1351, a UAS-SATA bridge). One of the reasons is, lbpme=1 is
> not a requirement for "DeleteNotify" in Windows to be activated:

I have to confess I'm wary of interpreting values reported by a device
that messes up setting the single bit flag that enables the feature.

Before I entertain taking your patch I'd like to take a closer look to
make sure everything is gated by sdkp->lbpme.

However, instead of relying on UNMAP, does this bridge support WRITE
SAME with the UNMAP bit? Because in that case you should be able to set
provisioning_mode to WS10 or WS16 and then adjust max_write_same_blocks.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Martin K. Petersen


Bart,

> I'm 99% sure there is nothing wrong with my patch but that my patch
> uncovered a bug in the ipr driver (drivers/scsi/ipr). See also
> "WARNING: CPU: 15 PID: 0 at block/blk-mq.c:
> __blk_mq_run_hw_queue+0x1d8/0x1f0F"
> (https://marc.info/?l=linux-block=150290686719644).

But I already have the ipr patch queued. What am I missing?

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Bart Van Assche

On Wed, 2017-08-16 at 20:52 -0400, Martin K. Petersen wrote:
> Damien,
> 
> > Actually, we can keep that commit as it makes the sq case cleaner
> > anyway. But we need something else to prevent deadlock in the mq
> > case...
> 
> OK, put it back.
> 
> Please send a lockout patch. We can always zap it if we find out what
> went wrong with Bart's patch and queue a "real" fix.

Hello Martin,

I'm 99% sure there is nothing wrong with my patch but that my patch uncovered
a bug in the ipr driver (drivers/scsi/ipr). See also "WARNING: CPU: 15 PID: 0
at block/blk-mq.c: __blk_mq_run_hw_queue+0x1d8/0x1f0F"
(https://marc.info/?l=linux-block=150290686719644).

Bart.

Re: [PATCH] scsi-sysfs: correct errno for host_reset

2017-08-16 Thread Martin K. Petersen


weiping,

> if scsi_host_template->host_reset is NULL, when user "echo adapter >
> /sys/class/scsi_host/hostx/host_reset", -EINVAL will return even
> string compare successfully. It make user confuse.
>
> change to:
> -EINVAL   if string not match "adapter" / "firmware";
> -EOPNOTSUPP   if string match but not implement ->host_reset.

Applied to 4.14/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] ch: add refcounting

2017-08-16 Thread Martin K. Petersen


Hannes,

> struct scsi_changer needs refcounting as the device might be removed
> while the fd is still open.

Applied to 4.14/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: ncr5380: constify pnp_device_id

2017-08-16 Thread Martin K. Petersen


Arvind,

> pnp_device_id are not supposed to change at runtime. All functions
> working with pnp_device_id provided by  work with
> const pnp_device_id. So mark the non-const structs as const.

Applied to 4.14/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Martin K. Petersen


Damien,

> Actually, we can keep that commit as it makes the sq case cleaner
> anyway. But we need something else to prevent deadlock in the mq
> case...

OK, put it back.

Please send a lockout patch. We can always zap it if we find out what
went wrong with Bart's patch and queue a "real" fix.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Damien Le Moal

Martin,

On 8/17/17 09:41, Damien Le Moal wrote:
> Martin,
> 
> On 8/17/17 09:11, Martin K. Petersen wrote:
>>
>> Bart,
>>
>>> For an unknown reason this patch causes the boot process to hang on
>>> PowerPC systems:
>>
>> OK, dropped it from fixes for now.
>>
>> Thanks!
> 
> It means that commit 70e42fd02c46e2aa9ab07b766d418637e3a51de7 "scsi:
> sd_zbc: Write unlock zone from sd_uninit_cmnd()" will need to be
> reverted too as it will not solve the potential deadlock anymore. Bart's
> patch was needed for it to work.
> 
> So we are back to the initial patch I sent that Chirstoph nacked, that
> is, not write locking zones with scsi-mq to avoid deadlocking (blk-mq
> does not dispatch write requests in order anyway, so SMR does not work
> with mq for now).
> 
> Other solutions would be:
> 1) Do not change the zone locking and add a big fat warning for scanned
> zoned block devices that things can get nasty
> 2) Fail initialization of zoned block devices with scsi-mq (do not
> create the block devices)
> 
> (2) may actually be the better solution for now until I get a fix for
> blk-mq.
> 
> Thoughts ?
> 
> Best regards.

Actually, we can keep that commit as it makes the sq case cleaner
anyway. But we need something else to prevent deadlock in the mq case...

-- 
Damien Le Moal,
Western Digital

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Martin K. Petersen


Damien,

> It means that commit 70e42fd02c46e2aa9ab07b766d418637e3a51de7 "scsi:
> sd_zbc: Write unlock zone from sd_uninit_cmnd()" will need to be
> reverted too as it will not solve the potential deadlock anymore. Bart's
> patch was needed for it to work.

OK, also dropped.

> (2) may actually be the better solution for now until I get a fix for
> blk-mq.

Yeah, that's fine with me.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 0/6] Update qedf to version 8.20.5.0.

2017-08-16 Thread Martin K. Petersen


Chad,

> Please apply the following patches at your earliest convenience.

Applied to 4.14/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Damien Le Moal

Martin,

On 8/17/17 09:11, Martin K. Petersen wrote:
> 
> Bart,
> 
>> For an unknown reason this patch causes the boot process to hang on
>> PowerPC systems:
> 
> OK, dropped it from fixes for now.
> 
> Thanks!

It means that commit 70e42fd02c46e2aa9ab07b766d418637e3a51de7 "scsi:
sd_zbc: Write unlock zone from sd_uninit_cmnd()" will need to be
reverted too as it will not solve the potential deadlock anymore. Bart's
patch was needed for it to work.

So we are back to the initial patch I sent that Chirstoph nacked, that
is, not write locking zones with scsi-mq to avoid deadlocking (blk-mq
does not dispatch write requests in order anyway, so SMR does not work
with mq for now).

Other solutions would be:
1) Do not change the zone locking and add a big fat warning for scanned
zoned block devices that things can get nasty
2) Fail initialization of zoned block devices with scsi-mq (do not
create the block devices)

(2) may actually be the better solution for now until I get a fix for
blk-mq.

Thoughts ?

Best regards.

-- 
Damien Le Moal,
Western Digital

Re: [PATCH] scsi: qla2xxx: fix spelling mistake of variable sfp_additonal_info

2017-08-16 Thread Martin K. Petersen


Colin,

> Trivial fix to variable name, sfp_additonal_info should be
> sfp_additional_info (add in missing i).

Applied to 4.14/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: osst: silence underflow warning in osst_verify_frame()

2017-08-16 Thread Martin K. Petersen


Dan,

> If i is negative then it's less than OS_FM_TAB_MAX so we read before
> the start of the STp->header_cache->dat_fm_tab.fm_tab_ent[] array.

Applied to 4.14/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: mpt3sas: fix pr_info message continuation

2017-08-16 Thread Martin K. Petersen


Colin,

> An optional discovery status should be printed with a pr_cont and
> needs a leading space to make it more readable. The final new line
> should also be a pr_cont and the indentation is out by one, so fix
> that too.

Applied to 4.14/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 0/4] ses: simple subenclosure support

2017-08-16 Thread Martin K. Petersen


Hannes,

> some arrays (most notably 3Par) only support simple subenclosures.
> Sadly our ses implementation doesn't handle this properly, so we're
> greeted with error messages like:
>
> scsi 1:0:0:254: Wrong diagnostic page; asked for 2 got 0
> scsi 1:0:0:254: Failed to get diagnostic page 0xffea
> scsi 1:0:0:254: Failed to bind enclosure -19
> ses 1:0:0:254: Attached Enclosure device
>
> This patchset fixes up our ses implementation to work properly
> with simple subenclosures.

Looks OK to me. Applied to 4.14/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 1/4] scsi: Fixup ses page check

2017-08-16 Thread Martin K. Petersen


Hannes,

> The error code from a scsi_execute_req() is a SCSI status, not
> a normal errno. So whenever it returns a value here an error
> occurred and there's no point in looking at the page number.

424f727b9413 scsi: ses: Fix wrong page error

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCHv3 0/6] hpsa legacy support

2017-08-16 Thread Martin K. Petersen


Hannes,

> here's now the third attempt to add support for legacy boards, ie for
> boards previously supported by cciss only.  With this patchset the
> hpsa driver should work with all Smart Array boards if the
> 'hpsa_allow_any' module option is set, rendering the cciss driver
> obsolete.  Consequently I've added a patch to remove the cciss driver
> and make 'hpsa' an alias to 'cciss' and removed the 'hpsa_allow_any'
> module option.

Applied to 4.14/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] scsi: cxgb4i: call neigh_event_send() to update MAC address

2017-08-16 Thread Martin K. Petersen


Varun,

> If nud_state is not valid then call neigh_event_send() to update MAC
> address.

Applied to 4.13/scsi-fixes, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCHv3 2/5] scsi: Export blacklist flags to sysfs

2017-08-16 Thread Martin K. Petersen


Hannes,

> I still would like to keep the above, as the admin can feed blacklist
> flags via the kernel commandline, and we don't do any validity checks on
> that. So we might end up with invalid flags after all.

I suggest you handle this by printing something along the lines of
"INVALID_BIT_SET(42)" for each bad flag.

Also: Brownie points for adding a new sysfs interface to augment the
existing /proc goo for adding entries. Preferably one that takes BLIST
names instead of huge hex numbers.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Martin K. Petersen


Bart,

> For an unknown reason this patch causes the boot process to hang on
> PowerPC systems:

OK, dropped it from fixes for now.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

No I/O errors reported after SATA link hard reset

2017-08-16 Thread Gionatan Danti


Hi list,
some time ago, I had a filesystem corruption on a simple, two disks 
RAID1 MD array. On the affected machine, /var/log/messages shown some 
"failed command: WRITE FPDMA QUEUED" entries, but *no* action (ie: kick 
off disk) was taken by MDRAID. I tracked down the problem to an instable 
power supply (switching power rail/connector solved the problem).


In the latest day I had some spare time and I am now able to regularly 
replicate the problem. Basically, when a short powerloss happens, the 
scsi midlayer logs some failed operations, but does *not* pass these 
errors to higher layer. In other words, no I/O error is returned to the 
calling application. This is the reason why MDRAID did not kick off the 
instable disk on the machine with corrupted filesystem.


To replicated the problem, I wrote a large random file on a small MD 
RAID1 array, pulling off the power of one disk from about 2 seconds. The 
file write operation stopped for some seconds, than recovered. Running 
an array check resulted in a high number of mismatch_cnt sectors. Dmesg 
logged the following lines:


Aug 16 16:04:02 blackhole kernel: ata6.00: exception Emask 0x50 SAct 
0x7fff SErr 0x90a00 action 0xe frozen
Aug 16 16:04:02 blackhole kernel: ata6.00: irq_stat 0x0040, PHY RDY 
changed
Aug 16 16:04:02 blackhole kernel: ata6: SError: { Persist HostInt 
PHYRdyChg 10B8B }
Aug 16 16:04:02 blackhole kernel: ata6.00: failed command: WRITE FPDMA 
QUEUED
Aug 16 16:04:02 blackhole kernel: ata6.00: cmd 
61/00:00:10:82:09/04:00:00:00:00/40 tag 0 ncq 524288 out#012 res 
40/00:d8:10:72:09/00:00:00:00:00/40 Emask 0x50 (ATA bus error)

Aug 16 16:04:02 blackhole kernel: ata6.00: status: { DRDY }
...
Aug 16 16:04:02 blackhole kernel: ata6.00: failed command: WRITE FPDMA 
QUEUED
Aug 16 16:04:02 blackhole kernel: ata6.00: cmd 
61/00:f0:10:7e:09/04:00:00:00:00/40 tag 30 ncq 524288 out#012 
res 40/00:d8:10:72:09/00:00:00:00:00/40 Emask 0x50 (ATA bus error)

Aug 16 16:04:02 blackhole kernel: ata6.00: status: { DRDY }
Aug 16 16:04:02 blackhole kernel: ata6: hard resetting link
Aug 16 16:04:03 blackhole kernel: ata6: SATA link down (SStatus 0 
SControl 310)

Aug 16 16:04:04 blackhole kernel: ata6: hard resetting link
Aug 16 16:04:14 blackhole kernel: ata6: softreset failed (device not 
ready)

Aug 16 16:04:14 blackhole kernel: ata6: hard resetting link
Aug 16 16:04:24 blackhole kernel: ata6: softreset failed (device not 
ready)

Aug 16 16:04:24 blackhole kernel: ata6: hard resetting link
Aug 16 16:04:35 blackhole kernel: ata6: link is slow to respond, please 
be patient (ready=0)
Aug 16 16:04:42 blackhole kernel: ata6: SATA link down (SStatus 0 
SControl 310)

Aug 16 16:04:46 blackhole kernel: ata6: hard resetting link
Aug 16 16:04:46 blackhole kernel: ata3: exception Emask 0x10 SAct 0x0 
SErr 0x40d0202 action 0xe frozen
Aug 16 16:04:46 blackhole kernel: ata3: irq_stat 0x0040, PHY RDY 
changed
Aug 16 16:04:46 blackhole kernel: ata3: SError: { RecovComm Persist 
PHYRdyChg CommWake 10B8B DevExch }

Aug 16 16:04:46 blackhole kernel: ata3: hard resetting link
Aug 16 16:04:51 blackhole kernel: ata3: softreset failed (device not 
ready)
Aug 16 16:04:51 blackhole kernel: ata3: applying PMP SRST workaround and 
retrying
Aug 16 16:04:51 blackhole kernel: ata3: SATA link up 3.0 Gbps (SStatus 
123 SControl 300)

Aug 16 16:04:51 blackhole kernel: ata3.00: configured for UDMA/133
Aug 16 16:04:51 blackhole kernel: ata3: EH complete
Aug 16 16:04:52 blackhole kernel: ata6: softreset failed (device not 
ready)
Aug 16 16:04:52 blackhole kernel: ata6: applying PMP SRST workaround and 
retrying
Aug 16 16:04:52 blackhole kernel: ata6: SATA link up 1.5 Gbps (SStatus 
113 SControl 310)

Aug 16 16:04:52 blackhole kernel: ata6.00: configured for UDMA/133
Aug 16 16:04:52 blackhole kernel: ata6: EH complete

As you can see, while failed SATA operation were logged in dmesg (and 
/var/log/messages), no I/O errors where returned to the upper layer 
(MDRAID) or the calling application. I had to say that I *fully expect* 
some inconsistencies: after all, removing the power wipes the volatile 
disk's DRAM cache, which means data loss. However, I really expected 
some I/O errors to be thrown to the higher layers, causing visible 
reactions (ie: a disks pushed out the array). With no I/O errors 
returned, the higher layer application are effectively blind.


More concerning is the fact that these undetected errors can make their 
way even when the higher application consistently calls sync() and/or 
fsync. In other words, it seems than even acknowledged writes can fail 
in this manner (and this is consistent with the first machine corrupting 
its filesystem due to journal trashing - XFS journal surely uses sync() 
where appropriate). The mechanism seems the following:


- an higher layer application issue sync();
- a write barrier is generated;
- a first FLUSH CACHE command is sent to the disk;
- data are written to the disk's DRAM cache;
- power is lost!

[GIT PULL] SCSI fixes for 4.13-rc4

2017-08-16 Thread James Bottomley

A couple of minor fixes (st, ses) and some bigger driver fixes for
qla2xxx (crash triggered by fw dump) and ipr (lockdep problems with
mq).

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bodo Stroesser (1):
  scsi: st: fix blk_get_queue usage

Brian King (2):
  scsi: ses: Fix wrong page error
  scsi: ipr: Fix scsi-mq lockdep issue

Michael Hernandez (1):
  scsi: qla2xxx: Fix system crash while triggering FW dump

And the diffstat:

 drivers/scsi/ipr.c  | 33 +++--
 drivers/scsi/ipr.h  |  2 ++
 drivers/scsi/qla2xxx/qla_tmpl.c | 12 
 drivers/scsi/ses.c  |  2 +-
 drivers/scsi/st.c   |  4 ++--
 5 files changed, 24 insertions(+), 29 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index b0c68d24db01..da5bdbdcce52 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -3351,6 +3351,16 @@ static void ipr_worker_thread(struct work_struct *work)
return;
}
 
+   if (ioa_cfg->scsi_unblock) {
+   ioa_cfg->scsi_unblock = 0;
+   ioa_cfg->scsi_blocked = 0;
+   spin_unlock_irqrestore(ioa_cfg->host->host_lock, lock_flags);
+   scsi_unblock_requests(ioa_cfg->host);
+   spin_lock_irqsave(ioa_cfg->host->host_lock, lock_flags);
+   if (ioa_cfg->scsi_blocked)
+   scsi_block_requests(ioa_cfg->host);
+   }
+
if (!ioa_cfg->scan_enabled) {
spin_unlock_irqrestore(ioa_cfg->host->host_lock, lock_flags);
return;
@@ -7211,9 +7221,8 @@ static int ipr_ioa_bringdown_done(struct ipr_cmnd 
*ipr_cmd)
ENTER;
if (!ioa_cfg->hrrq[IPR_INIT_HRRQ].removing_ioa) {
ipr_trace;
-   spin_unlock_irq(ioa_cfg->host->host_lock);
-   scsi_unblock_requests(ioa_cfg->host);
-   spin_lock_irq(ioa_cfg->host->host_lock);
+   ioa_cfg->scsi_unblock = 1;
+   schedule_work(_cfg->work_q);
}
 
ioa_cfg->in_reset_reload = 0;
@@ -7287,13 +7296,7 @@ static int ipr_ioa_reset_done(struct ipr_cmnd *ipr_cmd)
list_add_tail(_cmd->queue, _cmd->hrrq->hrrq_free_q);
wake_up_all(_cfg->reset_wait_q);
 
-   spin_unlock(ioa_cfg->host->host_lock);
-   scsi_unblock_requests(ioa_cfg->host);
-   spin_lock(ioa_cfg->host->host_lock);
-
-   if (!ioa_cfg->hrrq[IPR_INIT_HRRQ].allow_cmds)
-   scsi_block_requests(ioa_cfg->host);
-
+   ioa_cfg->scsi_unblock = 1;
schedule_work(_cfg->work_q);
LEAVE;
return IPR_RC_JOB_RETURN;
@@ -9249,8 +9252,11 @@ static void _ipr_initiate_ioa_reset(struct ipr_ioa_cfg 
*ioa_cfg,
spin_unlock(_cfg->hrrq[i]._lock);
}
wmb();
-   if (!ioa_cfg->hrrq[IPR_INIT_HRRQ].removing_ioa)
+   if (!ioa_cfg->hrrq[IPR_INIT_HRRQ].removing_ioa) {
+   ioa_cfg->scsi_unblock = 0;
+   ioa_cfg->scsi_blocked = 1;
scsi_block_requests(ioa_cfg->host);
+   }
 
ipr_cmd = ipr_get_free_ipr_cmnd(ioa_cfg);
ioa_cfg->reset_cmd = ipr_cmd;
@@ -9306,9 +9312,8 @@ static void ipr_initiate_ioa_reset(struct ipr_ioa_cfg 
*ioa_cfg,
wake_up_all(_cfg->reset_wait_q);
 
if (!ioa_cfg->hrrq[IPR_INIT_HRRQ].removing_ioa) {
-   spin_unlock_irq(ioa_cfg->host->host_lock);
-   scsi_unblock_requests(ioa_cfg->host);
-   spin_lock_irq(ioa_cfg->host->host_lock);
+   ioa_cfg->scsi_unblock = 1;
+   schedule_work(_cfg->work_q);
}
return;
} else {
diff --git a/drivers/scsi/ipr.h b/drivers/scsi/ipr.h
index e98a87a65335..c7f0e9e3cd7d 100644
--- a/drivers/scsi/ipr.h
+++ b/drivers/scsi/ipr.h
@@ -1488,6 +1488,8 @@ struct ipr_ioa_cfg {
u8 cfg_locked:1;
u8 clear_isr:1;
u8 probe_done:1;
+   u8 scsi_unblock:1;
+   u8 scsi_blocked:1;
 
u8 revid;
 
diff --git a/drivers/scsi/qla2xxx/qla_tmpl.c b/drivers/scsi/qla2xxx/qla_tmpl.c
index 33142610882f..b18646d6057f 100644
--- a/drivers/scsi/qla2xxx/qla_tmpl.c
+++ b/drivers/scsi/qla2xxx/qla_tmpl.c
@@ -401,9 +401,6 @@ qla27xx_fwdt_entry_t263(struct scsi_qla_host *vha,
for (i = 0; i < vha->hw->max_req_queues; i++) {
struct req_que *req = vha->hw->req_q_map[i];
 
-   if (!test_bit(i, vha->hw->req_qid_map))
-   continue;
-
if (req || !buf) {
length = req ?
req->length : REQUEST_ENTRY_CNT_24XX;
@@ -418,9 +415,6 @@ qla27xx_fwdt_entry_t263(struct scsi_qla_host *vha,

[PATCH] Revert "scsi-mq: Always unprepare before requeuing a request"

2017-08-16 Thread Bart Van Assche

For an unknown reason this patch causes the boot process to hang on
PowerPC systems:

sd 0:2:0:0: [sda] 272646144 512-byte logical blocks: (140 GB/130 GiB)
sd 0:2:0:0: [sda] 4096-byte physical blocks
sd 0:2:0:0: [sda] Write Protect is off
INFO: task swapper/5:1 blocked for more than 120 seconds.
  Not tainted 4.13.0-rc4-next-20170810-autotest #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
swapper/5   D 9936 1  0 0x0800
Call Trace:
[c007f8483a10] [c007f8483a80] 0xc007f8483a80 (unreliable)
[c007f8483be0] [c001b358] __switch_to+0x2e8/0x430
[c007f8483c40] [c09d134c] __schedule+0x38c/0xaf0
[c007f8483d20] [c09d1af0] schedule+0x40/0xb0
[c007f8483d50] [c0110bd4] async_synchronize_cookie_domain+0xd4/0x150
[c007f8483dc0] [c000d8f8] kernel_init+0x28/0x140
[c007f8483e30] [c000bc60] ret_from_kernel_thread+0x5c/0x7c

Hence revert commit 270065e92c31.

Fixes: commit 270065e92c31 ("scsi: scsi-mq: Always unprepare before requeuing a 
request")
Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Damien Le Moal 
Cc: Johannes Thumshirn 
Cc: Michael Ellerman 
Cc: Brian King 
Cc: Abdul Haleem 
Cc: 
---
 drivers/scsi/scsi_lib.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7e92818e6597..e57c80d4648d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -44,8 +44,6 @@ static struct kmem_cache *scsi_sense_cache;
 static struct kmem_cache *scsi_sense_isadma_cache;
 static DEFINE_MUTEX(scsi_sense_cache_mutex);
 
-static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd);
-
 static inline struct kmem_cache *
 scsi_select_sense_cache(bool unchecked_isa_dma)
 {
@@ -142,12 +140,6 @@ static void scsi_mq_requeue_cmd(struct scsi_cmnd *cmd)
 {
struct scsi_device *sdev = cmd->device;
 
-   if (cmd->request->rq_flags & RQF_DONTPREP) {
-   cmd->request->rq_flags &= ~RQF_DONTPREP;
-   scsi_mq_uninit_cmd(cmd);
-   } else {
-   WARN_ON_ONCE(true);
-   }
blk_mq_requeue_request(cmd->request, true);
put_device(>sdev_gendev);
 }
@@ -985,6 +977,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
 * A new command will be prepared and issued.
 */
if (q->mq_ops) {
+   cmd->request->rq_flags &= ~RQF_DONTPREP;
+   scsi_mq_uninit_cmd(cmd);
scsi_mq_requeue_cmd(cmd);
} else {
scsi_release_buffers(cmd);
-- 
2.14.0

[PATCH] scsi: cxgb4i: call neigh_event_send() to update MAC address

2017-08-16 Thread Varun Prakash

If nud_state is not valid then call neigh_event_send()
to update MAC address.

Signed-off-by: Varun Prakash 
---
 drivers/scsi/cxgbi/cxgb4i/cxgb4i.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c 
b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
index a69a9ac..1d02cf9 100644
--- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
+++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
@@ -1635,6 +1635,9 @@ static int init_act_open(struct cxgbi_sock *csk)
goto rel_resource;
}
 
+   if (!(n->nud_state & NUD_VALID))
+   neigh_event_send(n, NULL);
+
csk->atid = cxgb4_alloc_atid(lldi->tids, csk);
if (csk->atid < 0) {
pr_err("%s, NO atid available.\n", ndev->name);
-- 
2.0.2

Re: [PATCH] megaraid_sas: boot hangs while LD is offline

2017-08-16 Thread Hannes Reinecke

On 08/16/2017 03:05 PM, Sumit Saxena wrote:
>> -Original Message-
>> From: Hannes Reinecke [mailto:h...@suse.de]
>> Sent: Tuesday, August 15, 2017 5:29 PM
>> To: Martin K. Petersen
>> Cc: Christoph Hellwig; James Bottomley; Sumit Saxena; Kashyap Desai;
>> megaraidlinux@broadcom.com; linux-scsi@vger.kernel.org; Hannes
>> Reinecke; Hannes Reinecke
>> Subject: [PATCH] megaraid_sas: boot hangs while LD is offline
>>
>> Offline Logical drives (LDs) should not allowed to be visible to the OS,
> as the
>> OS will hang trying to send commands to it.
>> This patch skips offline LDs like it already does for non-system physical
> drives
>> (PDs).
> 
> Hi Hannes,
> 
> You have submitted this patch in past as it was part of SLES kernel but
> missing in upstream and it was concluded to drop this patch then.
> This issue was fixed in firmware 4 years back and many firmware versions
> should be available with the fix so this fix is not required in
> driver(upstream
> as well SLES kernel). Do you see any functional issue without this patch ?
> If not, I would request  to back out this patch from SLES kernel also.
> 
Okay, if you say so.
Will be removing it.

Martin, please drop this patch.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

RE: [PATCH] megaraid_sas: boot hangs while LD is offline

2017-08-16 Thread Sumit Saxena

>-Original Message-
>From: Hannes Reinecke [mailto:h...@suse.de]
>Sent: Tuesday, August 15, 2017 5:29 PM
>To: Martin K. Petersen
>Cc: Christoph Hellwig; James Bottomley; Sumit Saxena; Kashyap Desai;
>megaraidlinux@broadcom.com; linux-scsi@vger.kernel.org; Hannes
>Reinecke; Hannes Reinecke
>Subject: [PATCH] megaraid_sas: boot hangs while LD is offline
>
>Offline Logical drives (LDs) should not allowed to be visible to the OS,
as the
>OS will hang trying to send commands to it.
>This patch skips offline LDs like it already does for non-system physical
drives
>(PDs).

Hi Hannes,

You have submitted this patch in past as it was part of SLES kernel but
missing in upstream and it was concluded to drop this patch then.
This issue was fixed in firmware 4 years back and many firmware versions
should be available with the fix so this fix is not required in
driver(upstream
as well SLES kernel). Do you see any functional issue without this patch ?
If not, I would request  to back out this patch from SLES kernel also.

Thanks,
Sumit

>
>Signed-off-by: Hannes Reinecke 
>---
> drivers/scsi/megaraid/megaraid_sas_base.c | 9 +++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>b/drivers/scsi/megaraid/megaraid_sas_base.c
>index 5202c2f..39b08fc 100644
>--- a/drivers/scsi/megaraid/megaraid_sas_base.c
>+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>@@ -1897,14 +1897,19 @@ static void
>megasas_set_static_target_properties(struct scsi_device *sdev,
>
> static int megasas_slave_configure(struct scsi_device *sdev)  {
>-  u16 pd_index = 0;
>+  u16 pd_index = 0, ld_index;
>   struct megasas_instance *instance;
>   int ret_target_prop = DCMD_FAILED;
>   bool is_target_prop = false;
>
>   instance = megasas_lookup_instance(sdev->host->host_no);
>   if (instance->pd_list_not_supported) {
>-  if (!MEGASAS_IS_LOGICAL(sdev) && sdev->type ==
>TYPE_DISK) {
>+  if (MEGASAS_IS_LOGICAL(sdev)) {
>+  ld_index = ((sdev->channel -
>MEGASAS_MAX_PD_CHANNELS) *
>+  MEGASAS_MAX_DEV_PER_CHANNEL) +
>sdev->id;
>+  if (instance->ld_ids[ld_index] == 0xff)
>+  return -ENXIO;
>+  } else if (sdev->type == TYPE_DISK) {
>   pd_index = (sdev->channel *
>MEGASAS_MAX_DEV_PER_CHANNEL) +
>   sdev->id;
>   if (instance->pd_list[pd_index].driveState !=
>--
>1.8.5.6

RE: [PATCH] megaraid_sas: Fallback to older scanning if no disks are found

2017-08-16 Thread Sumit Saxena

>-Original Message-
>From: Hannes Reinecke [mailto:h...@suse.de]
>Sent: Tuesday, August 15, 2017 5:36 PM
>To: Martin K. Petersen
>Cc: Christoph Hellwig; James Bottomley; Sumit Saxena; Kashyap Desai;
>megaraidlinux@broadcom.com; linux-scsi@vger.kernel.org; Hannes
>Reinecke
>Subject: [PATCH] megaraid_sas: Fallback to older scanning if no disks are
>found
>
>commit 21c9e160a51383d4cb0b882398534b0c95c0cc3b implemented a new
>driver lookup using the MR_DCMD_LD_LIST_QUERY firmware command.
>However, this command might not work properly on older firmware, causing
>the command to return no drives instead of an error.
>This causes a regression on older firmware as the driver will no longer
detect
>any drives.
>This patch checks if MR_DCMD_LD_LIST_QUERY return no drives, and falls
>back to the original method if so.
>
>Signed-off-by: Hannes Reinecke 
>---
> drivers/scsi/megaraid/megaraid_sas_base.c | 29
>+++--
> 1 file changed, 23 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>b/drivers/scsi/megaraid/megaraid_sas_base.c
>index 39b08fc..a1cf2c3 100644
>--- a/drivers/scsi/megaraid/megaraid_sas_base.c
>+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>@@ -4502,7 +4502,6 @@ int megasas_alloc_cmds(struct megasas_instance
>*instance)
>   dev_info(>pdev->dev,
>   "DCMD not supported by firmware - %s %d\n",
>   __func__, __LINE__);
>-  ret = megasas_get_ld_list(instance);
>   break;
>   case DCMD_TIMEOUT:
>   switch (dcmd_timeout_ocr_possible(instance)) { @@ -4530,6
>+4529,14 @@ int megasas_alloc_cmds(struct megasas_instance *instance)
>   break;
>   case DCMD_SUCCESS:
>   tgtid_count = le32_to_cpu(ci->count);
>+  /*
>+   * Some older firmware return '0' if the LD LIST QUERY
>+   * command is not supported.
>+   */
>+  if (tgtid_count == 0) {
>+  ret = DCMD_FAILED;
>+  break;
>+  }

If firmware does not support LD_LIST_QUERY,  "DCMD_FAILED" should be
returned by megasas_issue_blocked_cmd
(function which fires command to firmware) instead of DCMD_SCCESS.
Can you please help with older firmware version which returns
DCMD_SUCCESS(0) if LD_LIST_QUERY is not supported ?

"tgtid_count" may be zero(in case of no logical disks configured) even
when firmware supports LD LIST QUERY
and in that case also driver will fire old command by calling
megasas_get_ld_list() but that's not a problem.

Thanks,
Sumit

>
>   if ((tgtid_count > (instance->fw_supported_vd_count)))
>   break;
>@@ -5146,7 +5153,7 @@ static int megasas_init_fw(struct megasas_instance
>*instance)
>   struct megasas_register_set __iomem *reg_set;
>   struct megasas_ctrl_info *ctrl_info = NULL;
>   unsigned long bar_list;
>-  int i, j, loop, fw_msix_count = 0;
>+  int i, j, loop, fw_msix_count = 0, ret;
>   struct IOV_111 *iovPtr;
>   struct fusion_context *fusion;
>
>@@ -5384,8 +5391,11 @@ static int megasas_init_fw(struct megasas_instance
>*instance)
>   }
>   }
>
>-  if (megasas_ld_list_query(instance,
>-MR_LD_QUERY_TYPE_EXPOSED_TO_HOST))
>+  ret = megasas_ld_list_query(instance,
>+  MR_LD_QUERY_TYPE_EXPOSED_TO_HOST);
>+  if (ret == DCMD_FAILED)
>+  ret = megasas_get_ld_list(instance);
>+  if (ret)
>   goto fail_get_ld_pd_list;
>
>   /*
>@@ -7426,8 +7436,12 @@ static inline void
>megasas_remove_scsi_device(struct scsi_device *sdev)
>   case MR_EVT_LD_DELETED:
>   case MR_EVT_LD_CREATED:
>   if (!instance->requestorId ||
>-  (instance->requestorId &&
>megasas_get_ld_vf_affiliation(instance, 0)))
>+  (instance->requestorId &&
>+   megasas_get_ld_vf_affiliation(instance, 0)))
{
>   dcmd_ret = megasas_ld_list_query(instance,
>MR_LD_QUERY_TYPE_EXPOSED_TO_HOST);
>+  if (dcmd_ret == DCMD_FAILED)
>+  dcmd_ret =
>megasas_get_ld_list(instance);
>+  }
>
>   if (dcmd_ret == DCMD_SUCCESS)
>   doscan = SCAN_VD_CHANNEL;
>@@ -7443,8 +7457,11 @@ static inline void
>megasas_remove_scsi_device(struct scsi_device *sdev)
>   break;
>
>   if (!instance->requestorId ||
>-  (instance->requestorId &&
>megasas_get_ld_vf_affiliation(instance, 0)))
>+  (instance->requestorId &&
>+megasas_get_ld_vf_affiliation(instance, 0))) {
>   dcmd_ret = megasas_ld_list_query(instance,

Re: [PATCH] Revert "scsi: default to scsi-mq"

2017-08-16 Thread Adrian Hunter

On 14/08/17 18:21, Bart Van Assche wrote:
> On Sun, 2017-08-13 at 19:44 +0200, Christoph Hellwig wrote:
>> Defaulting to scsi-mq in 4.13-rc has shown various regressions
>> on setups that we didn't previously consider.  Fixes for them are
>> in progress, but too invasive to make it in this cycle.  So for
>> now revert the commit that defaults to blk-mq for SCSI.  For 4.14
>> we'll plan to try again with these fixes.
> 
> Should a patch for v4.14 perhaps be queued now?
Please consider adding support for runtime pm first.  For people using
scsi/ufs/ufshcd it might be an essential feature.

Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer

2017-08-16 Thread Christoph Hellwig

On Mon, Aug 14, 2017 at 06:32:17PM +0200, Benjamin Block wrote:
> > -   blk_end_request_all(rq, BLK_STS_OK);
> > -
> > put_device(job->dev);   /* release reference for the request */
> >
> > kfree(job->request_payload.sg_list);
> > kfree(job->reply_payload.sg_list);
> > -   kfree(job);
> > +   blk_end_request_all(rq, BLK_STS_OK);
> 
> What is the reason for moving that last line? Just wondering whether
> that might change the behavior somehow, although it doesn't look like it
> from the code.

The job is now allocated as part of the request, so we must fre
it last.  The only change in behavior is that the reference gets dropped
a bit earlier, but once ownership is handed to the block layer
it's not needed, as are the memory allocations for the S/G lists.

> > +{
> > +   struct bsg_job *job = blk_mq_rq_to_pdu(req);
> > +
> > +   memset(job, 0, sizeof(*job));
> > +   job->req = req;
> > +   job->request = job->sreq.cmd;
> 
> That doesn't work with bsg.c if the request submitted by the user is
> bigger than BLK_MAX_CDB. There is code in blk_fill_sgv4_hdr_rq() that
> will reassign the req->cmd point in that case to something else.
> 
> This is maybe wrong in the same vein as my Patch 1 is. If not for the
> legacy code in bsg.c, setting this here, will miss changes to that
> pointer between request-allocation and job-submission.
> 
> So maybe just move this to bsg_create_job().

Yes, this should be in  indeed.

> 
> > +   job->dd_data = job + 1;
> > +   job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
> 
> job->reply_len will be 0 here, won't it? You probably have to pull the
> "job->reply_len = SCSI_SENSE_BUFFERSIZE" here from bsg_create_job().

True.

[PATCH] scsi: qla2xxx: fix spelling mistake of variable sfp_additonal_info

2017-08-16 Thread Colin King

From: Colin Ian King 

Trivial fix to variable name, sfp_additonal_info should be
sfp_additional_info (add in missing i).

Signed-off-by: Colin Ian King 
---
 drivers/scsi/qla2xxx/qla_isr.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index c14fab35fc36..916f685872aa 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -452,7 +452,7 @@ qla83xx_handle_8200_aen(scsi_qla_host_t *vha, uint16_t *mb)
uint16_t peg_fw_state, nw_interface_link_up;
uint16_t nw_interface_signal_detect, sfp_status;
uint16_t htbt_counter, htbt_monitor_enable;
-   uint16_t sfp_additonal_info, sfp_multirate;
+   uint16_t sfp_additional_info, sfp_multirate;
uint16_t sfp_tx_fault, link_speed, dcbx_status;
 
/*
@@ -492,7 +492,7 @@ qla83xx_handle_8200_aen(scsi_qla_host_t *vha, uint16_t *mb)
sfp_status = ((mb[2] & 0x0c00) >> 10);
htbt_counter = ((mb[2] & 0x7000) >> 12);
htbt_monitor_enable = ((mb[2] & 0x8000) >> 15);
-   sfp_additonal_info = (mb[6] & 0x0003);
+   sfp_additional_info = (mb[6] & 0x0003);
sfp_multirate = ((mb[6] & 0x0004) >> 2);
sfp_tx_fault = ((mb[6] & 0x0008) >> 3);
link_speed = ((mb[6] & 0x0070) >> 4);
@@ -507,9 +507,9 @@ qla83xx_handle_8200_aen(scsi_qla_host_t *vha, uint16_t *mb)
sfp_status);
ql_log(ql_log_warn, vha, 0x5067,
"htbt_counter=0x%x, htbt_monitor_enable=0x%x, "
-   "sfp_additonal_info=0x%x, sfp_multirate=0x%x.\n ",
+   "sfp_additional_info=0x%x, sfp_multirate=0x%x.\n ",
htbt_counter, htbt_monitor_enable,
-   sfp_additonal_info, sfp_multirate);
+   sfp_additional_info, sfp_multirate);
ql_log(ql_log_warn, vha, 0x5068,
"sfp_tx_fault=0x%x, link_state=0x%x, "
"dcbx_status=0x%x.\n", sfp_tx_fault, link_speed,
-- 
2.11.0

Re: [PATCH 1/2] nvmet_fc: add defer_req callback for deferment of cmd buffer return

2017-08-16 Thread Christoph Hellwig

On Thu, Aug 10, 2017 at 11:06:04AM +0200, Christoph Hellwig wrote:
> > @@ -463,9 +472,9 @@ static struct nvmet_fc_fcp_iod *
> >  nvmet_fc_alloc_fcp_iod(struct nvmet_fc_tgt_queue *queue)
> >  {
> > static struct nvmet_fc_fcp_iod *fod;
> 
> This isn't new, but is this really supposed to be a static variable,
> that is all instances of this code sharing it use the same?
> 
> After a short code inspection this looks like a nasty bug to me.

James, can you look into this asap?

Re: [PATCH] scsi: ncr5380: constify pnp_device_id

2017-08-16 Thread Finn Thain

On Wed, 16 Aug 2017, Arvind Yadav wrote:

> pnp_device_id are not supposed to change at runtime. All functions
> working with pnp_device_id provided by  work with
> const pnp_device_id. So mark the non-const structs as const.
> 
> Signed-off-by: Arvind Yadav 

Acked-by: Finn Thain 

> ---
>  drivers/scsi/g_NCR5380.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
> index c34fc91..1968d81 100644
> --- a/drivers/scsi/g_NCR5380.c
> +++ b/drivers/scsi/g_NCR5380.c
> @@ -703,7 +703,7 @@ static struct isa_driver generic_NCR5380_isa_driver = {
>  };
>  
>  #ifdef CONFIG_PNP
> -static struct pnp_device_id generic_NCR5380_pnp_ids[] = {
> +static const struct pnp_device_id generic_NCR5380_pnp_ids[] = {
>   { .id = "DTC436e", .driver_data = BOARD_DTC3181E },
>   { .id = "" }
>  };
>

Re: [PATCH 2/2] tests/scsi/0001: Regression test for SCSI device blacklisting

2017-08-16 Thread Omar Sandoval

On Wed, Aug 09, 2017 at 12:50:06PM +0200, Hannes Reinecke wrote:
> SCSI device blacklisting seems to be a tricky subject, with
> lots of potential for messing up the selection algorithm.
> This adds a test for catching regressions here.

I'm waiting to see how the patches end up before applying this, but I'm
glad to see this test :) A few comments below. I've addressed most of
them and pushed it to https://github.com/osandov/blktests/tree/scsi-blacklist,
but the golden output needs to be updated for v3 of your patches. Could
you modify the patch in my branch and resend? Thanks for adding tests!

> Signed-off-by: Hannes Reinecke 
> ---
>  tests/scsi/001 | 69 
> ++
>  tests/scsi/001.out | 10 
>  2 files changed, 79 insertions(+)
>  create mode 100644 tests/scsi/001
>  create mode 100644 tests/scsi/001.out
> 
> diff --git a/tests/scsi/001 b/tests/scsi/001
> new file mode 100644
> index 000..374a458
> --- /dev/null
> +++ b/tests/scsi/001
> @@ -0,0 +1,69 @@
> +#!/bin/bash
> +#
> +# Regression test for scsi device blacklisting

In my experience with xfstests, the single most useful piece of
information to include for a regression test is the commit or patch it
tests. I fixed this to say 'Regression test for patch "scsi_devinfo:
fixup string compare".'

> +# Copyright (C) 2017 Hannes Reinecke, SUSE Linux GmbH
> +#
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +
> +DESCRIPTION="SCSI device blacklisting"
> +
> +QUICK=1
> +
> +CHECK_DMESG=0

I realize that the documentation wasn't clear about what we check for in
dmesg. It's just oopses and warnings and such, so we don't need/want to
disable checking dmesg here. I removed it and improved the
documentation.

> +requires() {
> +if modinfo scsi_debug | grep -q inq_vendor ; then
> + return 0
> +fi
> +return 1
> +}

There's a helper for this, _have_module_param.

> +test() {
> +local inq vendor model host dev blacklist
> +
> +echo "Running ${TEST_NAME}"
> +
> +for inq in \
> + "" \
> + "" \
> + "HITACHI OPEN-V  " \
> + "Scanner " \
> + "Inateck " \
> + "Promise STEX" \
> + "HITAOPEN-V  " \
> + "ABCDScanner " ; do
> + vendor=${inq:0:8}
> + model=${inq:8:16}
> + modprobe scsi_debug inq_vendor="$vendor" inq_product="$model"
> + host=$(lsscsi -H | sed -n 's/.\([0-9]*\).*scsi_debug/\1/p')
> + if [ -z "$host" ] ; then
> + echo "Test failed, scsi_debug could not be loaded"
> + return 1
> + fi
> + dev=$(lsscsi | grep $host | sed -n 's/.*\/dev\/\(sd[a-z]*\).*/\1/p')
> + if [ -z "$dev" ] ; then
> + echo "Test failed, SCSI device not found"
> + rmmod scsi_debug
> + return 1
> + fi

The biggest change I made was improving the existing scsi_debug helpers
so you don't have to hardcode this.

> + vendor=$(cat /sys/block/$dev/device/vendor)
> + model=$(cat /sys/block/$dev/device/model)
> + blacklist=$(cat /sys/block/$dev/device/blacklist)
> + echo "$vendor $model $blacklist"
> + rmmod scsi_debug
> +done
> +echo "Test complete"
> +return 0
> +}
> diff --git a/tests/scsi/001.out b/tests/scsi/001.out
> new file mode 100644
> index 000..64db97c
> --- /dev/null
> +++ b/tests/scsi/001.out
> @@ -0,0 +1,10 @@
> +Running scsi/001
> +  0x0
> +  0x0
> +HITACHI  OPEN-V   0x2
> + Scanner  0x1
> +Inateck   0x0
> +Promise  STEX 0x40
> +HITA OPEN-V   0x0
> +ABCD Scanner  0x0
> +Test complete

As mentioned above, this needs updating for the symbolic constants.

37 matches

Mail list logo