Re: [PATCH v2] target: fix tcm_loop build errors when SCSI=m

2018-08-06 Thread Randy Dunlap
On 08/06/2018 04:29 PM, Randy Dunlap wrote:
> On 08/06/2018 04:26 PM, Bart Van Assche wrote:
>> On Mon, 2018-08-06 at 16:20 -0700, Randy Dunlap wrote:
>>> Fixes: 3703b2c5d041 ("[SCSI] tcm_loop: Add multi-fabric Linux/SCSI LLD
>>>fabric module")
>>
>> From drivers/target/Kconfig on Linus' master branch:
>>
>> menuconfig TARGET_CORE
>>  tristate "Generic Target Core Mod (TCM) and ConfigFS Infrastructure"
>>  depends on SCSI && BLOCK
>>
>> if TARGET_CORE
>> [ ... ]
>> source "drivers/target/loopback/Kconfig"
>> endif
>>
>> In other words, the loopback driver already depends on SCSI. So I doubt that
>> this is a longstanding issue. Did you encounter this with Linus' master 
>> branch
>> or rather with linux-next?
> 
> Darn.  Thanks for the info.
> 
> I encountered it in linux-next.  I'll dig deeper.

For TARGET_CORE, the depends on SCSI has been removed in linux-next.
That allows the kconfig to go sideways.


-- 
~Randy


Re: [PATCH v2] target: fix tcm_loop build errors when SCSI=m

2018-08-06 Thread Bart Van Assche
On Mon, 2018-08-06 at 16:29 -0700, Randy Dunlap wrote:
> On 08/06/2018 04:26 PM, Bart Van Assche wrote:
> > On Mon, 2018-08-06 at 16:20 -0700, Randy Dunlap wrote:
> > > Fixes: 3703b2c5d041 ("[SCSI] tcm_loop: Add multi-fabric Linux/SCSI LLD
> > >fabric module")
> > 
> > From drivers/target/Kconfig on Linus' master branch:
> > 
> > menuconfig TARGET_CORE
> > tristate "Generic Target Core Mod (TCM) and ConfigFS Infrastructure"
> > depends on SCSI && BLOCK
> > 
> > if TARGET_CORE
> > [ ... ]
> > source "drivers/target/loopback/Kconfig"
> > endif
> > 
> > In other words, the loopback driver already depends on SCSI. So I doubt that
> > this is a longstanding issue. Did you encounter this with Linus' master 
> > branch
> > or rather with linux-next?
> 
> Darn.  Thanks for the info.
> 
> I encountered it in linux-next.  I'll dig deeper.

This patch series may be what you are looking for: "[PATCH v2 0/9] block:
Consolidate scsi sense buffer usage" (https://lkml.org/lkml/2018/7/31/829).

Bart.



Re: [PATCH v2] target: fix tcm_loop build errors when SCSI=m

2018-08-06 Thread Randy Dunlap
On 08/06/2018 04:26 PM, Bart Van Assche wrote:
> On Mon, 2018-08-06 at 16:20 -0700, Randy Dunlap wrote:
>> Fixes: 3703b2c5d041 ("[SCSI] tcm_loop: Add multi-fabric Linux/SCSI LLD
>>fabric module")
> 
> From drivers/target/Kconfig on Linus' master branch:
> 
> menuconfig TARGET_CORE
>   tristate "Generic Target Core Mod (TCM) and ConfigFS Infrastructure"
>   depends on SCSI && BLOCK
> 
> if TARGET_CORE
> [ ... ]
> source "drivers/target/loopback/Kconfig"
> endif
> 
> In other words, the loopback driver already depends on SCSI. So I doubt that
> this is a longstanding issue. Did you encounter this with Linus' master branch
> or rather with linux-next?

Darn.  Thanks for the info.

I encountered it in linux-next.  I'll dig deeper.

-- 
~Randy


Re: [PATCH v2] target: fix tcm_loop build errors when SCSI=m

2018-08-06 Thread Bart Van Assche
On Mon, 2018-08-06 at 16:20 -0700, Randy Dunlap wrote:
> Fixes: 3703b2c5d041 ("[SCSI] tcm_loop: Add multi-fabric Linux/SCSI LLD
>fabric module")

>From drivers/target/Kconfig on Linus' master branch:

menuconfig TARGET_CORE
tristate "Generic Target Core Mod (TCM) and ConfigFS Infrastructure"
depends on SCSI && BLOCK

if TARGET_CORE
[ ... ]
source "drivers/target/loopback/Kconfig"
endif

In other words, the loopback driver already depends on SCSI. So I doubt that
this is a longstanding issue. Did you encounter this with Linus' master branch
or rather with linux-next?

Thanks,

Bart.



[PATCH v2] target: fix tcm_loop build errors when SCSI=m

2018-08-06 Thread Randy Dunlap
From: Randy Dunlap 

Fix build errors when CONFIG_SCSI=m and CONFIG_LOOPBACK_TARGET=y
by making LOOPBACK_TARGET depend on SCSI.

drivers/target/loopback/tcm_loop.o: In function `tcm_loop_port_link':
tcm_loop.c:(.text+0x445): undefined reference to `scsi_add_device'
drivers/target/loopback/tcm_loop.o: In function `tcm_loop_driver_remove':
tcm_loop.c:(.text+0x55c): undefined reference to `scsi_remove_host'
tcm_loop.c:(.text+0x564): undefined reference to `scsi_host_put'
drivers/target/loopback/tcm_loop.o: In function `tcm_loop_submission_work':
tcm_loop.c:(.text+0x7c4): undefined reference to `scmd_printk'
drivers/target/loopback/tcm_loop.o: In function `tcm_loop_driver_probe':
tcm_loop.c:(.text+0x7fb): undefined reference to `scsi_host_alloc'
tcm_loop.c:(.text+0x85b): undefined reference to `scsi_add_host_with_dma'
tcm_loop.c:(.text+0x896): undefined reference to `scsi_host_put'
drivers/target/loopback/tcm_loop.o: In function `tcm_loop_port_unlink':
tcm_loop.c:(.text+0x962): undefined reference to `scsi_device_lookup'
tcm_loop.c:(.text+0x972): undefined reference to `scsi_remove_device'
tcm_loop.c:(.text+0x97a): undefined reference to `scsi_device_put'
drivers/target/loopback/tcm_loop.o:(.data+0x210): undefined reference to 
`scsi_change_queue_depth'

Fixes: 3703b2c5d041 ("[SCSI] tcm_loop: Add multi-fabric Linux/SCSI LLD
   fabric module")

Signed-off-by: Randy Dunlap 
Cc: "Nicholas A. Bellinger" 
Cc: linux-scsi@vger.kernel.org
Cc: target-de...@vger.kernel.org
Cc: Bart Van Assche 
Cc: sta...@vger.kernel.org # 2.6.39++
---
v2: add Fixes: and Cc:stable

 drivers/target/loopback/Kconfig |1 +
 1 file changed, 1 insertion(+)

--- linux-next-20180806.orig/drivers/target/loopback/Kconfig
+++ linux-next-20180806/drivers/target/loopback/Kconfig
@@ -1,5 +1,6 @@
 config LOOPBACK_TARGET
tristate "TCM Virtual SAS target and Linux/SCSI LDD fabric loopback 
module"
+   depends on SCSI
help
  Say Y here to enable the TCM Virtual SAS target and Linux/SCSI LLD
  fabric loopback module.




Re: [PATCH] aha1542: convert to DMA mapping API

2018-08-06 Thread Ondrej Zary
On Thursday 02 August 2018 15:11:05 Christoph Hellwig wrote:
> aha1542 is one of the last users of the legacy isa_*_to_bus APIs, which
> also isn't portable enough.  Convert it to the proper DMA mapping API.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/aha1542.c | 112 -
>  1 file changed, 77 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/scsi/aha1542.c b/drivers/scsi/aha1542.c
> index 41add33e3f1f..679b52a7cc69 100644
> --- a/drivers/scsi/aha1542.c
> +++ b/drivers/scsi/aha1542.c
> @@ -58,8 +58,15 @@ struct aha1542_hostdata {
>   int aha1542_last_mbi_used;
>   int aha1542_last_mbo_used;
>   struct scsi_cmnd *int_cmds[AHA1542_MAILBOXES];
> - struct mailbox mb[2 * AHA1542_MAILBOXES];
> - struct ccb ccb[AHA1542_MAILBOXES];
> + struct mailbox *mb;
> + dma_addr_t mb_handle;
> + struct ccb *ccb;
> + dma_addr_t ccb_handle;
> +};
> +
> +struct aha1542_cmd {
> + struct chain *chain;
> + dma_addr_t chain_handle;
>  };
>
>  static inline void aha1542_intr_reset(u16 base)
> @@ -233,6 +240,17 @@ static int aha1542_test_port(struct Scsi_Host *sh)
>   return 1;
>  }
>
> +static void aha1542_free_cmd(struct scsi_cmnd *cmd)
> +{
> + struct aha1542_cmd *acmd = scsi_cmd_priv(cmd);
> + struct device *dev = cmd->device->host->dma_dev;
> +
> + dma_free_coherent(dev, scsi_sg_count(cmd) * sizeof(struct chain),
> +   acmd->chain, acmd->chain_handle);
> + acmd->chain = NULL;
> + scsi_dma_unmap(cmd);
> +}
> +
>  static irqreturn_t aha1542_interrupt(int irq, void *dev_id)
>  {
>   struct Scsi_Host *sh = dev_id;
> @@ -303,7 +321,7 @@ static irqreturn_t aha1542_interrupt(int irq, void
> *dev_id) return IRQ_HANDLED;
>   };
>
> - mbo = (scsi2int(mb[mbi].ccbptr) - (isa_virt_to_bus([0]))) /
> sizeof(struct ccb); + mbo = (scsi2int(mb[mbi].ccbptr) -
> aha1542->ccb_handle) / sizeof(struct ccb); mbistatus = mb[mbi].status;
>   mb[mbi].status = 0;
>   aha1542->aha1542_last_mbi_used = mbi;
> @@ -331,8 +349,7 @@ static irqreturn_t aha1542_interrupt(int irq, void
> *dev_id) return IRQ_HANDLED;
>   }
>   my_done = tmp_cmd->scsi_done;
> - kfree(tmp_cmd->host_scribble);
> - tmp_cmd->host_scribble = NULL;
> + aha1542_free_cmd(tmp_cmd);
>   /* Fetch the sense data, and tuck it away, in the required 
> slot.  The
>  Adaptec automatically fetches it, and there is no guarantee 
> that
>  we will still have it in the cdb when we come back */
> @@ -369,6 +386,7 @@ static irqreturn_t aha1542_interrupt(int irq, void
> *dev_id)
>
>  static int aha1542_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd
> *cmd) {
> + struct aha1542_cmd *acmd = scsi_cmd_priv(cmd);
>   struct aha1542_hostdata *aha1542 = shost_priv(sh);
>   u8 direction;
>   u8 target = cmd->device->id;
> @@ -378,7 +396,6 @@ static int aha1542_queuecommand(struct Scsi_Host *sh,
> struct scsi_cmnd *cmd) int mbo, sg_count;
>   struct mailbox *mb = aha1542->mb;
>   struct ccb *ccb = aha1542->ccb;
> - struct chain *cptr;
>
>   if (*cmd->cmnd == REQUEST_SENSE) {
>   /* Don't do the command - we have the sense data already */
> @@ -398,15 +415,17 @@ static int aha1542_queuecommand(struct Scsi_Host *sh,
> struct scsi_cmnd *cmd) print_hex_dump_bytes("command: ", DUMP_PREFIX_NONE,
> cmd->cmnd, cmd->cmd_len); }
>  #endif
> - if (bufflen) {  /* allocate memory before taking host_lock */
> - sg_count = scsi_sg_count(cmd);
> - cptr = kmalloc_array(sg_count, sizeof(*cptr),
> -  GFP_KERNEL | GFP_DMA);
> - if (!cptr)
> + sg_count = scsi_dma_map(cmd);
> + if (sg_count) {
> + acmd->chain = dma_alloc_coherent(sh->dma_dev,
> + sg_count * sizeof(struct chain),
> + >chain_handle, GFP_DMA);
> + if (!acmd->chain) {
> + scsi_dma_unmap(cmd);
>   return SCSI_MLQUEUE_HOST_BUSY;
> + }
>   } else {
> - sg_count = 0;
> - cptr = NULL;
> + acmd->chain = NULL;
>   }
>
>   /* Use the outgoing mailboxes in a round-robin fashion, because this
> @@ -437,7 +456,8 @@ static int aha1542_queuecommand(struct Scsi_Host *sh,
> struct scsi_cmnd *cmd) shost_printk(KERN_DEBUG, sh, "Sending command (%d
> %p)...", mbo, cmd->scsi_done); #endif
>
> - any2scsi(mb[mbo].ccbptr, isa_virt_to_bus([mbo]));   /* This gets
> trashed for some reason */ +  /* This gets trashed for some reason */
> + any2scsi(mb[mbo].ccbptr, aha1542->ccb_handle + mbo * sizeof(ccb));
>
>   memset([mbo], 0, sizeof(struct ccb));
>
> @@ -456,21 +476,18 @@ static int aha1542_queuecommand(struct Scsi_Host *sh,
> struct scsi_cmnd *cmd) int i;
>
>   

Re: [PATCH v3] mpt3sas: correct reset of smid while clearing scsi tracker

2018-08-06 Thread Bart Van Assche
On Sat, 2018-08-04 at 18:56 +0530, Sreekanth Reddy wrote:
> No Bart, their is no race condition here. Since chain lookup table
> entry is uniquely accessed using smid value. And this smid (which is
> scmd->request->tag +1) is unique for each IO request. And
> _base_get_chain_buffer_tracker() function is called only during IO
> submission time (i.e. before submitting the IO to the IOC firmware)
> and mpt3sas_base_clear_st() function is called in the ISR (i.e during
> IO completion) path. So for a particular IO these two functions called
> at two different instances of time.

Hello Sreekanth,

In mpt3sas_base_get_smid_scsiio() I found the following code:

struct scsiio_tracker *request = scsi_cmd_priv(scmd);
u16 smid = scmd->request->tag + 1;
request->smid = smid;

That confirms what you wrote about smid being unique for each I/O request.
However, this also raises new questions. As far as I can see all code in
the I/O path that accesses the chain_lookup[] array uses index smid - 1.
The patch at the start of this e-mail thread modifies two functions, namely
mpt3sas_remove_dead_ioc_func() and mpt3sas_base_clear_st(). Since the
chain_lookup[] array is indexed with the smid and hence contains request-
private data, how is it even possible that these functions race against
each other? Is there perhaps code that accesses the chain_lookup[] array
and that is called from another context than the I/O completion path?

Is the patch at the start of this e-mail thread the result of a theoretical
concern or of a real failure? In the latter case, which sequence of actions
triggered the failure? The answer to this question should already have been
mentioned in the description of v1 of this patch.

Thanks,

Bart.


Re: [RFC PATCH 3/5] streamline REQ_OP_READ-WRITE access

2018-08-06 Thread Douglas Gilbert

On 2018-08-06 01:41 AM, Damien Le Moal wrote:

On 2018/08/06 13:51, Douglas Gilbert wrote:

Make the two block layer operations most frequently used (REQ_OP_READ
and REQ_OP_WRITE) bypass the switch statements in the submission and
response paths. Assume these two enums are 0 and 1 which allows a
single comparison to select both of them. Check that assumption
at driver start-up.


What about simply moving the F cases at the top of
the switch case list ? Since that gets (usually) compiled as a jump, the effect
should be the same, no ?


Yes, that could be done. The proposed code takes it out of the hands
of the compiler's switch implementation and makes the favouring of
REQ_OP_READ and REQ_OP_WRITE more explicit at the cost of code
simplicity.

Also the leading "if (likely(...))" gives the runtime the option of
not caching the corresponding "else" code (which could be placed in
a separate function), leaving code cache space free for other fast
path functions.


A previous patch proposal by me had sd_init_command() callback going
straight into sd_setup_read_write_cmnd() and then going to a selection
function when the invocation was not for either REQ_OP_READ and
REQ_OP_WRITE. That saves one level of calls on the fast path. A reviewer
didn't like that.

Doug Gilbert


Signed-off-by: Douglas Gilbert 
---
  drivers/scsi/sd.c | 82 +++
  1 file changed, 55 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4b1402633c36..9f047fd3c92d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1227,6 +1227,14 @@ static int sd_init_command(struct scsi_cmnd *cmd)
  {
struct request *rq = cmd->request;
  
+	/*

+* Assumption that REQ_OP_READ and REQ_OP_WRITE are 0 and 1 is
+* checked in init_sd(). The following "if" is done to dispatch
+* REQ_OP_READ and REQ_OP_WRITE quickly.
+*/
+   if (likely(req_op(rq) < 2))
+   return sd_setup_read_write_cmnd(cmd);
+
switch (req_op(rq)) {
case REQ_OP_DISCARD:
switch (scsi_disk(rq->rq_disk)->provisioning_mode) {
@@ -1247,9 +1255,6 @@ static int sd_init_command(struct scsi_cmnd *cmd)
return sd_setup_write_same_cmnd(cmd);
case REQ_OP_FLUSH:
return sd_setup_flush_cmnd(cmd);
-   case REQ_OP_READ:
-   case REQ_OP_WRITE:
-   return sd_setup_read_write_cmnd(cmd);
case REQ_OP_ZONE_REPORT:
return sd_zbc_setup_report_cmnd(cmd);
case REQ_OP_ZONE_RESET:
@@ -1973,30 +1978,12 @@ static int sd_done(struct scsi_cmnd *SCpnt)
int sense_valid = 0;
int sense_deferred = 0;
  
-	switch (req_op(req)) {

-   case REQ_OP_DISCARD:
-   case REQ_OP_WRITE_ZEROES:
-   case REQ_OP_WRITE_SAME:
-   case REQ_OP_ZONE_RESET:
-   if (!result) {
-   good_bytes = blk_rq_bytes(req);
-   scsi_set_resid(SCpnt, 0);
-   } else {
-   good_bytes = 0;
-   scsi_set_resid(SCpnt, blk_rq_bytes(req));
-   }
-   break;
-   case REQ_OP_ZONE_REPORT:
-   if (!result) {
-   good_bytes = scsi_bufflen(SCpnt)
-   - scsi_get_resid(SCpnt);
-   scsi_set_resid(SCpnt, 0);
-   } else {
-   good_bytes = 0;
-   scsi_set_resid(SCpnt, blk_rq_bytes(req));
-   }
-   break;
-   default:
+   /*
+* Assumption that REQ_OP_READ and REQ_OP_WRITE are 0 and 1 is
+* checked in init_sd(). The following "if" is done to expedite
+* REQ_OP_READ and REQ_OP_WRITE.
+*/
+   if (likely(req_op(req) < 2)) {
/*
 * In case of bogus fw or device, we could end up having
 * an unaligned partial completion. Check this here and force
@@ -2011,6 +1998,36 @@ static int sd_done(struct scsi_cmnd *SCpnt)
round_up(resid, sector_size));
scsi_set_resid(SCpnt, resid);
}
+   } else {
+   switch (req_op(req)) {
+   case REQ_OP_DISCARD:
+   case REQ_OP_WRITE_ZEROES:
+   case REQ_OP_WRITE_SAME:
+   case REQ_OP_ZONE_RESET:
+   if (!result) {
+   good_bytes = blk_rq_bytes(req);
+   scsi_set_resid(SCpnt, 0);
+   } else {
+   good_bytes = 0;
+   scsi_set_resid(SCpnt, blk_rq_bytes(req));
+   }
+   break;
+   case REQ_OP_ZONE_REPORT:
+   if (!result) {
+   good_bytes = scsi_bufflen(SCpnt)
+   - scsi_get_resid(SCpnt);

Re: [RFC PATCH 4/5] streamline some logical operations

2018-08-06 Thread Douglas Gilbert

On 2018-08-06 01:41 AM, Damien Le Moal wrote:

On 2018/08/06 13:51, Douglas Gilbert wrote:

Re-arrange some logic to lessen the number of checks. With logical
ANDs put the least likely first, with logical ORs put the most
likely first. Also add conditional hints on the assumed fastpath.

Signed-off-by: Douglas Gilbert 
---
  drivers/scsi/sd.c | 43 ---
  1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 9f047fd3c92d..05014054e357 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1171,9 +1171,9 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
  
  	fua = (rq->cmd_flags & REQ_FUA) ? 0x8 : 0;

dix = scsi_prot_sg_count(cmd);
-   dif = scsi_host_dif_capable(cmd->device->host, sdkp->protection_type);
+   dif = scsi_host_dif_capable(sdp->host, sdkp->protection_type);
  
-	if (write && dix)

+   if (dix && write)
sd_dif_prepare(cmd);
  
  	if (dif || dix)

@@ -1181,19 +1181,27 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd 
*cmd)
else
protect = 0;
  
-	if (protect && sdkp->protection_type == T10_PI_TYPE2_PROTECTION) {

+   if (unlikely(protect &&
+sdkp->protection_type == T10_PI_TYPE2_PROTECTION))
ret = sd_setup_read_write_32_cmnd(cmd, write, lba, nr_blocks,
  protect | fua);
-   } else if (sdp->use_16_for_rw || (nr_blocks > 0x)) {
+   else if (sdp->use_16_for_rw)
ret = sd_setup_read_write_16_cmnd(cmd, write, lba, nr_blocks,
  protect | fua);


So here, without use_16_for_rw being forced on (which is the case for most disks
I think, except ZBC disks which mandate it) or most disks, all read/write to low
LBAs will have to go through a longer chain of if/else if... Is this change
really such a gain in average ? It looks like this will be a loss for the first
small partition at the beginning of the disk.


For comparison I assume:
  protect = 0
  (nr_blocks < 0x100) == 1
  (use_16_for_rw || use_10_for_rw) == 1

So for the use_16_for_rw == 0 and lba <= 0x1f case I count:
  old: 6proposed: 4
comparisons when use_10_for_rw == 1. The old figure could be reduced to
4 by re-ordering the comparisons on its use_10_for_rw "else if" line
(i.e. place use_10_for_rw first rather than third).

I'm not so concerned about the speed of READ(6) ... no SSD should be
using that deprecated command. That said, I think the proposed code
is slightly faster because the old code checks that both
nr_blocks > 0x and nr_blocks > 0xff fail. The proposed code only
checks the latter en route to the READ(6).


Also I think the proposed small decision tree is slightly easier to
follow (and hence maintain) than the existing "flat" if/else chain
which is deceptively complex. So more a code clean-up than a
speed-up :-)

Thanks for the feedback.
Doug Gilbert


BTW the major code speed improvement is not in this patchset but an earlier
patch to convert the opened coded shifts to put_unaligned_be*() calls, IMO.


-   } else if ((nr_blocks > 0xff) || (lba > 0x1f) || sdp->use_10_for_rw
-  || protect) {
-   ret = sd_setup_read_write_10_cmnd(cmd, write, lba, nr_blocks,
- protect | fua);
-   } else {
-   ret = sd_setup_read_write_6_cmnd(cmd, write, lba, nr_blocks,
-protect | fua);
+   else if (likely(nr_blocks < 0x100)) {
+   if (sdp->use_10_for_rw || (lba > 0x1f) || protect)
+   ret = sd_setup_read_write_10_cmnd(cmd, write, lba,
+nr_blocks, protect | fua);
+   else
+   ret = sd_setup_read_write_6_cmnd(cmd, write, lba,
+nr_blocks, protect | fua);
+   } else {/* not already done and nr_blocks > 0xff */
+   if (unlikely(nr_blocks > 0x))
+   ret = sd_setup_read_write_16_cmnd(cmd, write, lba,
+nr_blocks, protect | fua);
+   else
+   ret = sd_setup_read_write_10_cmnd(cmd, write, lba,
+nr_blocks, protect | fua);
}


Re: [RFC PATCH 2/5] break sd_done sense processing out to own function

2018-08-06 Thread Douglas Gilbert

On 2018-08-06 01:41 AM, Damien Le Moal wrote:

On 2018/08/06 13:51, Douglas Gilbert wrote:

Break out the sense key handling in the sd_done() (response) path into
its own function. Note that the sense key only needs to be inspected
when a SCSI status of Check Condition is returned.


It looks like sd_done_sense() is called for any command that has sense data.
Check Condition or not. This should be clarified.


Yes. It is only recent SAM documents that have reduced the SCSI
statuses that yield sense data to just CHECK CONDITION. Also as
the code shows, just setting DRIVER_SENSE in the result (irrespective
of the SCSI status) will cause the sense data to be checked.

So that can be expended.

Doug Gilbert


Other than that, this looks OK to me.




Signed-off-by: Douglas Gilbert 
---

No speed up here, just a clean up. There could possibly be a speed
improvement (not observed in tests) from lessening the cache footprint
of the sd_done() function which is on the fast path.


  drivers/scsi/sd.c | 112 --
  1 file changed, 59 insertions(+), 53 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index b17b8c66881d..4b1402633c36 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1898,6 +1898,62 @@ static unsigned int sd_completed_bytes(struct scsi_cmnd 
*scmd)
return min(good_bytes, transferred);
  }
  
+/* Helper for scsi_done() when there is a sense buffer */

+static int sd_done_sense(struct scsi_cmnd *SCpnt, unsigned int good_bytes,
+struct scsi_sense_hdr *sshdrp)
+{
+   struct request *req = SCpnt->request;
+   struct scsi_disk *sdkp = scsi_disk(SCpnt->request->rq_disk);
+
+   switch (sshdrp->sense_key) {
+   case HARDWARE_ERROR:
+   case MEDIUM_ERROR:
+   return sd_completed_bytes(SCpnt);
+   case RECOVERED_ERROR:
+   return scsi_bufflen(SCpnt);
+   case NO_SENSE:
+   /* This indicates a false check condition, so ignore it.  An
+* unknown amount of data was transferred so treat it as an
+* error.
+*/
+   SCpnt->result = 0;
+   memset(SCpnt->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
+   break;
+   case ABORTED_COMMAND:
+   if (sshdrp->asc == 0x10)  /* DIF: Target detected corruption */
+   good_bytes = sd_completed_bytes(SCpnt);
+   break;
+   case ILLEGAL_REQUEST:
+   switch (sshdrp->asc) {
+   case 0x10:  /* DIX: Host detected corruption */
+   good_bytes = sd_completed_bytes(SCpnt);
+   break;
+   case 0x20:  /* INVALID COMMAND OPCODE */
+   case 0x24:  /* INVALID FIELD IN CDB */
+   switch (SCpnt->cmnd[0]) {
+   case UNMAP:
+   sd_config_discard(sdkp, SD_LBP_DISABLE);
+   break;
+   case WRITE_SAME_16:
+   case WRITE_SAME:
+   if (SCpnt->cmnd[1] & 8) { /* UNMAP */
+   sd_config_discard(sdkp, SD_LBP_DISABLE);
+   } else {
+   sdkp->device->no_write_same = 1;
+   sd_config_write_same(sdkp);
+   req->__data_len = blk_rq_bytes(req);
+   req->rq_flags |= RQF_QUIET;
+   }
+   break;
+   }
+   }
+   break;
+   default:
+   break;
+   }
+   return good_bytes;
+}
+
  /**
   *sd_done - bottom half handler: called when the lower level
   *driver has completed (successfully or otherwise) a scsi command.
@@ -1964,60 +2020,10 @@ static int sd_done(struct scsi_cmnd *SCpnt)
}
sdkp->medium_access_timed_out = 0;
  
-	if (driver_byte(result) != DRIVER_SENSE &&

-   (!sense_valid || sense_deferred))
-   goto out;
+   if (unlikely(driver_byte(result) == DRIVER_SENSE ||
+(sense_valid && !sense_deferred)))
+   good_bytes = sd_done_sense(SCpnt, good_bytes, );
  
-	switch (sshdr.sense_key) {

-   case HARDWARE_ERROR:
-   case MEDIUM_ERROR:
-   good_bytes = sd_completed_bytes(SCpnt);
-   break;
-   case RECOVERED_ERROR:
-   good_bytes = scsi_bufflen(SCpnt);
-   break;
-   case NO_SENSE:
-   /* This indicates a false check condition, so ignore it.  An
-* unknown amount of data was transferred so treat it as an
-* error.
-*/
-   SCpnt->result = 0;
-   memset(SCpnt->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
-   break;
-   

Re: [RFC PATCH 1/5] add tweakable bounds_check flag, now off by default

2018-08-06 Thread Douglas Gilbert

On 2018-08-06 01:41 AM, Damien Le Moal wrote:

Douglas,

On 2018/08/06 13:51, Douglas Gilbert wrote:

Add bounds_check "rw" attribute to the sd driver. It controls whether
each read/write operation submission does an "out of range" bounds check
and a LBA/number_of_blocks alignment bounds check. The mainline kernel
currently does both these checks. This patch changes that default
to bounds_check=false, that is: the two checks are not done.

SBC, SBC-2, SBC-3 and draft SBC-4 are require a device server (i.e. a


"are require" -> "require"


Yes.


SCSI disk) to fail any media command in which the LBA+number_of_blocks
exceeds the capacity of the disk. So why should the sd driver also
check it, given the block layer generated the request and knows the
disk capacity and the disk itself also checks it?


What about SATA drives for non read/write commands when accessible max address
is set lower to a value lower then native max address ? Need to dig again in the
code and the specs, but I think to remember that Linux disk capacity is in that
case set to accessible max address, but some commands may address beyond that
value.


I do not consider this patch complete. You make a good argument why SATA
disks should have the "out of range" test done.

The worst case is that the bounds_check is not done and the device silently
wraps the LBA then overwrites some innocent blocks.

So perhaps struct scsi_host_template and scsi_host need a bounds_check:1
flag so certain "SCSI" hosts (e.g. ATA and USB mass storage but not UAS(P))
could set it by default. So the SATA host (libata ?) could set the bit in
its template.

Would the block layer also want to tweak bounds_check?


The block layer does almost all its block handling in units of 512 byte
blocks whereas SCSI disks may have other logical block sizes (e.g. 4096
byte logical blocks). So when the block sizes are different it is
possible that there is an alignment issue. But if that occurs, it
would be a logic issue in the block layer. Note that the current
mainline code does this check even when it is not needed (e.g. when
the logical block size of the disk is 512 bytes).


Of note here is that there is a special case that is not being handled: ZBC
drives with 512B logical and 4K physical blocks will accept any 512B aligned
read commands (that is any read command), but will require 4K aligned write
commands for sequential write preferred zones (writes to conventional zones can
be 512B aligned). Since there is currently no differentiation between read and
write commands for the alignment check, much less based on the target zone type
of these commands, checks are incomplete for ZBC & ZAC disks in the generic
block layer. These checks could be added to SD though now that a bitmap is
attached to the device request queue to indicate if a zone is conventional or
sequential. For such test to be effective, sdkp->bounds_check would need to be
forced to 1 for host managed zoned disks.


Like the "out of range" case I guess that compliant ZBC & ZAC disks are required
to fail this case and send back an indicative error message. But yes, extra code
could be added to do that. If there are enough ZBC/ZAC special cases (like the
protect code) we could split the submission side into three fast paths on
entry to sd_init_command(): one for protect, one for ZBC/ZAC and one for the
rest (the most common case, I assume). That leaves open the question can a ZBC
disk also have protection information :-)

Doug Gilbert


Signed-off-by: Douglas Gilbert 
---

Should a mechanism be added so this could set/cleared by:
   - a LLD for all disks it controls
   - kernel boot time parameter?

  drivers/scsi/sd.c | 55 +--
  drivers/scsi/sd.h |  1 +
  2 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d1d08f039bdd..b17b8c66881d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -522,6 +522,32 @@ max_write_same_blocks_store(struct device *dev, struct 
device_attribute *attr,
  }
  static DEVICE_ATTR_RW(max_write_same_blocks);
  
+static ssize_t

+bounds_check_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   struct scsi_disk *sdkp = to_scsi_disk(dev);
+
+   return sprintf(buf, "%u\n", sdkp->bounds_check);
+}
+
+static ssize_t
+bounds_check_store(struct device *dev, struct device_attribute *attr,
+  const char *buf, size_t count)
+{
+   struct scsi_disk *sdkp = to_scsi_disk(dev);
+   int err, n;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+
+   err = kstrtoint(buf, 10, );
+   if (!err)
+   sdkp->bounds_check = !!n;
+
+   return err ? err : count;
+}
+static DEVICE_ATTR_RW(bounds_check);
+
  static struct attribute *sd_disk_attrs[] = {
_attr_cache_type.attr,
_attr_FUA.attr,
@@ -535,6 +561,7 @@ static struct attribute *sd_disk_attrs[] = {
_attr_zeroing_mode.attr,