[PATCH 0/3] scsi: More detailed I/O errors

2013-06-05 Thread Hannes Reinecke
As discussed at LSF the SCSI stack already knows about several
error conditions, it's just not well documented.
Also there are some more conditions upper layers might be
interested in.

This patchset improves the documentation of the existing
I/O error codes and adds two more error codes, ENOSPC
for thin provisioning failure and ENODATA for medium error.

Hannes Reinecke (3):
  scsi: Document enhanced error codes
  scsi: Return ENOSPC on thin provisioning failure
  scsi: Return ENODATA on medium error

 drivers/scsi/scsi_error.c | 26 ++
 drivers/scsi/scsi_lib.c   | 21 +
 include/scsi/scsi.h   |  4 
 3 files changed, 47 insertions(+), 4 deletions(-)

-- 
1.7.12.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] scsi: Return ENOSPC on thin provisioning failure

2013-06-05 Thread Hannes Reinecke
When the thin provisioning hard threshold is reached we
should return ENOSPC to inform upper layers about this fact.

Signed-off-by: Hannes Reinecke h...@suse.de
---
 drivers/scsi/scsi_error.c | 12 +++-
 drivers/scsi/scsi_lib.c   |  5 +
 include/scsi/scsi.h   |  2 ++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 443b0e3..bf5e61a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -234,6 +234,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host 
*shost,
  * FAILED
  * NEEDS_RETRY
  * TARGET_ERROR
+ * ALLOC_ERROR
  *
  * Notes:
  * When a deferred error is detected the current command has
@@ -359,11 +360,15 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
return SUCCESS;
 
/* these are not supported */
+   case DATA_PROTECT:
+   if (sshdr.asc == 0x27  sshdr.ascq == 0x07) {
+   /* Thin provisioning hard threshold reached */
+   return ALLOC_ERROR;
+   }
case COPY_ABORTED:
case VOLUME_OVERFLOW:
case MISCOMPARE:
case BLANK_CHECK:
-   case DATA_PROTECT:
return TARGET_ERROR;
 
case MEDIUM_ERROR:
@@ -849,6 +854,7 @@ retry:
case NEEDS_RETRY:
case FAILED:
case TARGET_ERROR:
+   case ALLOC_ERROR:
break;
case ADD_TO_MLQUEUE:
rtn = NEEDS_RETRY;
@@ -1588,6 +1594,10 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 */
set_host_byte(scmd, DID_TARGET_FAILURE);
rtn = SUCCESS;
+   } else if (rtn == ALLOC_ERROR) {
+   /* target hit out-of-space condition */
+   set_host_byte(scmd, DID_ALLOC_FAILURE);
+   rtn = SUCCESS;
}
/* if rtn == FAILED, we have no sense information;
 * returning FAILED will wake the error handler thread
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 12bfa73..209a4d5 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -710,6 +710,7 @@ EXPORT_SYMBOL(scsi_release_buffers);
  * -ENOLINKtemporary transport failure
  * -EREMOTEIO  permanent target failure, do not retry
  * -EBADE  permanent nexus failure, retry on other path
+ * -ENOSPC No write space available
  */
 static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
 {
@@ -727,6 +728,10 @@ static int __scsi_error_from_host_byte(struct scsi_cmnd 
*cmd, int result)
set_host_byte(cmd, DID_OK);
error = -EBADE;
break;
+   case DID_ALLOC_FAILURE:
+   set_host_byte(cmd, DID_OK);
+   error = -ENOSPC;
+   break;
default:
error = -EIO;
break;
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 66216c1..5ead86b 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -452,6 +452,7 @@ static inline int scsi_is_wlun(unsigned int lun)
 * other paths */
 #define DID_NEXUS_FAILURE 0x11  /* Permanent nexus failure, retry on other
 * paths might yield different results */
+#define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
 #define DRIVER_OK   0x00   /* Driver status   */
 
 /*
@@ -482,6 +483,7 @@ static inline int scsi_is_wlun(unsigned int lun)
 #define SCSI_RETURN_NOT_HANDLED   0x2008
 #define FAST_IO_FAIL   0x2009
 #define TARGET_ERROR0x200A
+#define ALLOC_ERROR 0x200B
 
 /*
  * Midlevel queue return values.
-- 
1.7.12.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] scsi: Return ENODATA on medium error

2013-06-05 Thread Hannes Reinecke
When a medium error is detected the SCSI stack should return
ENODATA to the upper layers.

Signed-off-by: Hannes Reinecke h...@suse.de
---
 drivers/scsi/scsi_error.c | 7 ++-
 drivers/scsi/scsi_lib.c   | 5 +
 include/scsi/scsi.h   | 2 ++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index bf5e61a..2ded10a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -235,6 +235,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host 
*shost,
  * NEEDS_RETRY
  * TARGET_ERROR
  * ALLOC_ERROR
+ * MEDIA_FAILURE
  *
  * Notes:
  * When a deferred error is detected the current command has
@@ -375,7 +376,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
sshdr.asc == 0x13 || /* AMNF DATA FIELD */
sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-   return TARGET_ERROR;
+   return MEDIA_FAILURE;
}
return NEEDS_RETRY;
 
@@ -1598,6 +1599,10 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
/* target hit out-of-space condition */
set_host_byte(scmd, DID_ALLOC_FAILURE);
rtn = SUCCESS;
+   } else if (rtn == MEDIA_FAILURE) {
+   /* medium error */
+   set_host_byte(scmd, DID_MEDIUM_ERROR);
+   rtn = SUCCESS;
}
/* if rtn == FAILED, we have no sense information;
 * returning FAILED will wake the error handler thread
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 209a4d5..39d626e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -711,6 +711,7 @@ EXPORT_SYMBOL(scsi_release_buffers);
  * -EREMOTEIO  permanent target failure, do not retry
  * -EBADE  permanent nexus failure, retry on other path
  * -ENOSPC No write space available
+ * -ENODATAMedium error
  */
 static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
 {
@@ -732,6 +733,10 @@ static int __scsi_error_from_host_byte(struct scsi_cmnd 
*cmd, int result)
set_host_byte(cmd, DID_OK);
error = -ENOSPC;
break;
+   case DID_MEDIUM_ERROR:
+   set_host_byte(cmd, DID_OK);
+   error = -ENODATA;
+   break;
default:
error = -EIO;
break;
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5ead86b..c397684 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -453,6 +453,7 @@ static inline int scsi_is_wlun(unsigned int lun)
 #define DID_NEXUS_FAILURE 0x11  /* Permanent nexus failure, retry on other
 * paths might yield different results */
 #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
+#define DID_MEDIUM_ERROR  0x13  /* Medium error */
 #define DRIVER_OK   0x00   /* Driver status   */
 
 /*
@@ -484,6 +485,7 @@ static inline int scsi_is_wlun(unsigned int lun)
 #define FAST_IO_FAIL   0x2009
 #define TARGET_ERROR0x200A
 #define ALLOC_ERROR 0x200B
+#define MEDIA_FAILURE   0x200C
 
 /*
  * Midlevel queue return values.
-- 
1.7.12.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] scsi: Document enhanced error codes

2013-06-05 Thread Hannes Reinecke
Document the various error codes returned on I/O failure.

Signed-off-by: Hannes Reinecke h...@suse.de
---
 drivers/scsi/scsi_error.c |  7 +--
 drivers/scsi/scsi_lib.c   | 11 +++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f43de1e..443b0e3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -229,8 +229,11 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host 
*shost,
  * scsi_check_sense - Examine scsi cmd sense
  * @scmd:  Cmd to have sense checked.
  *
- * Return value:
- * SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR
+ * Possible return values:
+ * SUCCESS
+ * FAILED
+ * NEEDS_RETRY
+ * TARGET_ERROR
  *
  * Notes:
  * When a deferred error is detected the current command has
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..12bfa73 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -700,6 +700,17 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
 }
 EXPORT_SYMBOL(scsi_release_buffers);
 
+/**
+ * __scsi_error_from_host_byte - translate SCSI error code into errno
+ * @cmd:   SCSI command (unused)
+ * @result:scsi error code
+ *
+ * Translate SCSI error code into standard UNIX errno.
+ * Return values:
+ * -ENOLINKtemporary transport failure
+ * -EREMOTEIO  permanent target failure, do not retry
+ * -EBADE  permanent nexus failure, retry on other path
+ */
 static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
 {
int error = 0;
-- 
1.7.12.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 59301] New: mpt2sas0 fails after short time of work

2013-06-05 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=59301

   Summary: mpt2sas0 fails after short time of work
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: 3.5.0-32-generic ubuntu
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: blocking
  Priority: P1
 Component: Other
AssignedTo: scsi_drivers-ot...@kernel-bugs.osdl.org
ReportedBy: pomoz...@gmail.com
Regression: No


Created an attachment (id=103551)
 -- (https://bugzilla.kernel.org/attachment.cgi?id=103551)
Full dmesg log

mpt2sas0 module v16.00.00.00 always fails after boot:

[ 1638.408402] mpt2sas0: SAS host is non-operational 
[ 1638.409803] mpt2sas0: _base_fault_reset_work: Running mpt2sas_dead_ioc
thread success 
[ 1638.412848] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[ 1638.412889] sd 1:0:0:0: [sdb]  
[ 1638.412893] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 1638.413845] mpt2sas0: removing handle(0x000a), sas_addr(0x443322110700)
[ 1638.414519] sd 1:0:1:0: [sdc] Synchronizing SCSI cache
[ 1638.414552] sd 1:0:1:0: [sdc]  
[ 1638.414555] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 1638.414708] mpt2sas0: removing handle(0x0009), sas_addr(0x443322110400)
[ 1638.415060] mpt2sas0: sending diag reset !!
[ 1638.572147] mpt2sas0: diag reset: FAILED

It fails on CentOS 6.4 and Ubuntu Server 12.04.2 LTS
Module v13 (it goes within latest linux distributives) fails same way.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] sg: atomize check and set sdp-exclude in sg_open

2013-06-05 Thread vaughan

Check and set sdp-exclude should be atomic when set in sg_open().

Signed-off-by: Vaughan Cao vaughan@oracle.com
---
 drivers/scsi/sg.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..0ede08f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -245,6 +245,21 @@ static int set_exclude(Sg_device *sdp, char val)
 return val;
 }

+/* Check if we can set exclude and then set, return 1 if success */
+static int try_set_exclude(Sg_device *sdp)
+{
+unsigned long flags;
+
+spin_lock_irqsave(sg_open_exclusive_lock, flags);
+if (sdp-exclude) {
+spin_unlock_irqrestore(sg_open_exclusive_lock, flags);
+return 0;
+}
+sdp-exclude = 1;
+spin_unlock_irqrestore(sg_open_exclusive_lock, flags);
+return 1;
+}
+
 static int sfds_list_empty(Sg_device *sdp)
 {
 unsigned long flags;
@@ -303,7 +318,7 @@ sg_open(struct inode *inode, struct file *filp)
 goto error_out;
 }
 res = wait_event_interruptible(sdp-o_excl_wait,
-   ((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 
: set_exclude(sdp, 1)));
+((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 : 
try_set_exclude(sdp)));

 if (res) {
 retval = res;/* -ERESTARTSYS because signal hit process */
 goto error_out;
--
1.7.11.7

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID-10 keeps aborting

2013-06-05 Thread Bernd Schubert

On 06/05/2013 12:02 PM, Bernd Schubert wrote:

On 06/04/2013 05:39 PM, Joe Lawrence wrote:


Just curious, what type drives were in your RAID and what does
/sys/class/scsi_disk/*/max_write_same_blocks report?  If you have a spare
drive to test, maybe you could try a quick sg_write_same command to see
how the drive reacts?



I just run into the same issue with an ancient system from 2006. Except
that I'm in hurry an need it to stress-test my own work, I can do
anything with it - it is booted via NFS and disks are only used for
development/testing.


(squeeze)fslab1:~# cat /sys/block/md126/queue/write_same_max_bytes
16384



(squeeze)fslab1:~# cat /sys/block/sd[o,n,m,l]/queue/write_same_max_bytes
0
0
0
0



Ah, now I found the reason why it fails, scsi-layer had set
write_same_max_bytes to zero when it detected that it does not support
it, but after reloading the arecal module (arcmsr) I now get:


(squeeze)fslab1:~# cat /sys/block/sd[o,n,m,l]/queue/write_same_max_bytes
33553920
33553920
33553920
33553920


In sd_config_write_same() it sets

if (sdkp-max_ws_blocks == 0)
sdkp-max_ws_blocks = SD_MAX_WS10_BLOCKS;

except when sdkp-device-no_write_same is set.
But only ata_scsi_sdev_config() sets that. And I also don't see any 
driver setting max_ws_blocks, so everything except of libata gets the 
default of SD_MAX_WS10_BLOCKS. This also seems to be consistent with the 
33553920 I see, except that there is somewhere a bit shift.
So no surprise that mptsas and arcmsr (and anything else) devices have 
write_same_max_bytes set.
As the correct handling in the md layer seems to be difficult, can we 
send a fake request at device configuration time to figure out if the 
device really support write-same?



--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/10] qla2xxx: Clean up qla24xx_iidma()

2013-06-05 Thread Bart Van Assche
Remove dead code and simplify a pointer computation.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_bsg.c |9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
index 39719f8..7221521 100644
--- a/drivers/scsi/qla2xxx/qla_bsg.c
+++ b/drivers/scsi/qla2xxx/qla_bsg.c
@@ -1282,14 +1282,7 @@ qla24xx_iidma(struct fc_bsg_job *bsg_job)
return -EINVAL;
}
 
-   port_param = (struct qla_port_param *)((char *)bsg_job-request +
-   sizeof(struct fc_bsg_request));
-   if (!port_param) {
-   ql_log(ql_log_warn, vha, 0x7047,
-   port_param header not provided.\n);
-   return -EINVAL;
-   }
-
+   port_param = (void *)bsg_job-request + sizeof(struct fc_bsg_request);
if (port_param-fc_scsi_addr.dest_type != EXT_DEF_TYPE_WWPN) {
ql_log(ql_log_warn, vha, 0x7048,
Invalid destination type.\n);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/10] qla2xxx: Remove dead code in qla2x00_configure_hba()

2013-06-05 Thread Bart Van Assche
At the end of qla2x00_configure_hba() we know that rval == QLA_SUCCESS.
Hence remove the code that depends on rval != QLA_SUCCESS.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_init.c |8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 3565dfd..c68bb01 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2309,13 +2309,7 @@ qla2x00_configure_hba(scsi_qla_host_t *vha)
Topology - %s, Host Loop address 0x%x.\n,
connect_type, vha-loop_id);
 
-   if (rval) {
-   ql_log(ql_log_warn, vha, 0x2011,
-   %s FAILED\n, __func__);
-   } else {
-   ql_dbg(ql_dbg_disc, vha, 0x2012,
-   %s success\n, __func__);
-   }
+   ql_dbg(ql_dbg_disc, vha, 0x2012, %s success\n, __func__);
 
return(rval);
 }
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/10] qla2xxx: Clean up qla84xx_mgmt_cmd()

2013-06-05 Thread Bart Van Assche
Remove dead code, simplify a pointer computation and move the
ql84_mgmt assignment to just before its first use.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_bsg.c |9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
index 7221521..cf07491 100644
--- a/drivers/scsi/qla2xxx/qla_bsg.c
+++ b/drivers/scsi/qla2xxx/qla_bsg.c
@@ -1084,14 +1084,6 @@ qla84xx_mgmt_cmd(struct fc_bsg_job *bsg_job)
return -EINVAL;
}
 
-   ql84_mgmt = (struct qla_bsg_a84_mgmt *)((char *)bsg_job-request +
-   sizeof(struct fc_bsg_request));
-   if (!ql84_mgmt) {
-   ql_log(ql_log_warn, vha, 0x703b,
-   MGMT header not provided, exiting.\n);
-   return -EINVAL;
-   }
-
mn = dma_pool_alloc(ha-s_dma_pool, GFP_KERNEL, mn_dma);
if (!mn) {
ql_log(ql_log_warn, vha, 0x703c,
@@ -1103,6 +1095,7 @@ qla84xx_mgmt_cmd(struct fc_bsg_job *bsg_job)
mn-entry_type = ACCESS_CHIP_IOCB_TYPE;
mn-entry_count = 1;
 
+   ql84_mgmt = (void *)bsg_job-request + sizeof(struct fc_bsg_request);
switch (ql84_mgmt-mgmt.cmd) {
case QLA84_MGMT_READ_MEM:
case QLA84_MGMT_GET_INFO:
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/10] qla2xxx: Remove two superfluous tests

2013-06-05 Thread Bart Van Assche
Since ha-model_desc is an array comparing it against NULL is
superfluous. Hence remove these tests.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_gs.c |3 +--
 drivers/scsi/qla2xxx/qla_os.c |3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index d0ea8b9..f26442a 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -1370,8 +1370,7 @@ qla2x00_fdmi_rhba(scsi_qla_host_t *vha)
/* Model description. */
eiter = (struct ct_fdmi_hba_attr *) (entries + size);
eiter-type = __constant_cpu_to_be16(FDMI_HBA_MODEL_DESCRIPTION);
-   if (ha-model_desc)
-   strncpy(eiter-a.model_desc, ha-model_desc, 80);
+   strncpy(eiter-a.model_desc, ha-model_desc, 80);
alen = strlen(eiter-a.model_desc);
alen += (alen  3) ? (4 - (alen  3)) : 4;
eiter-len = cpu_to_be16(4 + alen);
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index ad72c1d..1d97626 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -2840,8 +2840,7 @@ skip_dpc:
qla2x00_dfs_setup(base_vha);
 
ql_log(ql_log_info, base_vha, 0x00fb,
-   QLogic %s - %s.\n,
-   ha-model_number, ha-model_desc ? ha-model_desc : );
+   QLogic %s - %s.\n, ha-model_number, ha-model_desc);
ql_log(ql_log_info, base_vha, 0x00fc,
ISP%04X: %s @ %s hdma%c host#=%ld fw=%s.\n,
pdev-device, ha-isp_ops-pci_info_str(base_vha, pci_info),
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/10] qla2xxx: Remove a dead assignment in qla24xx_build_scsi_crc_2_iocbs()

2013-06-05 Thread Bart Van Assche
Since the value of cur_seg is not used and since scsi_prot_sglist()
has no side effects it is safe to remove the statement
cur_seg = scsi_prot_sglist(cmd). Detected by Coverity.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_iocb.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index 15e4080..b589d24 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -1189,7 +1189,6 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, struct 
cmd_type_crc_2 *cmd_pkt,
uint32_t*cur_dsd, *fcp_dl;
scsi_qla_host_t *vha;
struct scsi_cmnd*cmd;
-   struct scatterlist  *cur_seg;
int sgc;
uint32_ttotal_bytes = 0;
uint32_tdata_bytes;
@@ -1396,7 +1395,6 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, struct 
cmd_type_crc_2 *cmd_pkt,
 
if (bundling  tot_prot_dsds) {
/* Walks dif segments */
-   cur_seg = scsi_prot_sglist(cmd);
cmd_pkt-control_flags |=
__constant_cpu_to_le16(CF_DIF_SEG_DESCR_ENABLE);
cur_dsd = (uint32_t *) crc_ctx_pkt-u.bundling.dif_address;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/10] qla2xxx: Remove redundant assignments

2013-06-05 Thread Bart Van Assche
The value of the pointer called nxt is not used after the
nxt = qla24xx_copy_eft(ha, nxt) statement. Hence keep the function
call but remove the assignment.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_dbg.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_dbg.c b/drivers/scsi/qla2xxx/qla_dbg.c
index cfa2a20..c612785 100644
--- a/drivers/scsi/qla2xxx/qla_dbg.c
+++ b/drivers/scsi/qla2xxx/qla_dbg.c
@@ -1468,7 +1468,7 @@ qla25xx_fw_dump(scsi_qla_host_t *vha, int hardware_locked)
 
nxt = qla2xxx_copy_queues(ha, nxt);
 
-   nxt = qla24xx_copy_eft(ha, nxt);
+   qla24xx_copy_eft(ha, nxt);
 
/* Chain entries -- started with MQ. */
nxt_chain = qla25xx_copy_fce(ha, nxt_chain, last_chain);
@@ -1787,7 +1787,7 @@ qla81xx_fw_dump(scsi_qla_host_t *vha, int hardware_locked)
 
nxt = qla2xxx_copy_queues(ha, nxt);
 
-   nxt = qla24xx_copy_eft(ha, nxt);
+   qla24xx_copy_eft(ha, nxt);
 
/* Chain entries -- started with MQ. */
nxt_chain = qla25xx_copy_fce(ha, nxt_chain, last_chain);
@@ -2289,7 +2289,7 @@ qla83xx_fw_dump(scsi_qla_host_t *vha, int hardware_locked)
 copy_queue:
nxt = qla2xxx_copy_queues(ha, nxt);
 
-   nxt = qla24xx_copy_eft(ha, nxt);
+   qla24xx_copy_eft(ha, nxt);
 
/* Chain entries -- started with MQ. */
nxt_chain = qla25xx_copy_fce(ha, nxt_chain, last_chain);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/10] qla2xxx: Help Coverity with analyzing ct_sns_pkt initialization

2013-06-05 Thread Bart Van Assche
Coverity reports Overrunning struct type ct_sns_req of 1228 bytes
by passing it to a function which accesses it at byte offset 8207
for each qla2x00_prep_ct_req(), qla2x00_prep_ct_fdmi_req() and
qla24xx_prep_ct_fm_req() call. Help Coverity to recognize that
these calls do not trigger a buffer overflow by making it explicit
that these three functions initializes both the request and reply
structures. This patch does not change any functionality.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_gs.c |   86 ++---
 1 file changed, 38 insertions(+), 48 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index f26442a..1ad361b 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -99,17 +99,17 @@ qla24xx_prep_ms_iocb(scsi_qla_host_t *vha, uint32_t 
req_size, uint32_t rsp_size)
  * Returns a pointer to the intitialized @ct_req.
  */
 static inline struct ct_sns_req *
-qla2x00_prep_ct_req(struct ct_sns_req *ct_req, uint16_t cmd, uint16_t rsp_size)
+qla2x00_prep_ct_req(struct ct_sns_pkt *p, uint16_t cmd, uint16_t rsp_size)
 {
-   memset(ct_req, 0, sizeof(struct ct_sns_pkt));
+   memset(p, 0, sizeof(struct ct_sns_pkt));
 
-   ct_req-header.revision = 0x01;
-   ct_req-header.gs_type = 0xFC;
-   ct_req-header.gs_subtype = 0x02;
-   ct_req-command = cpu_to_be16(cmd);
-   ct_req-max_rsp_size = cpu_to_be16((rsp_size - 16) / 4);
+   p-p.req.header.revision = 0x01;
+   p-p.req.header.gs_type = 0xFC;
+   p-p.req.header.gs_subtype = 0x02;
+   p-p.req.command = cpu_to_be16(cmd);
+   p-p.req.max_rsp_size = cpu_to_be16((rsp_size - 16) / 4);
 
-   return (ct_req);
+   return p-p.req;
 }
 
 static int
@@ -188,8 +188,7 @@ qla2x00_ga_nxt(scsi_qla_host_t *vha, fc_port_t *fcport)
GA_NXT_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, GA_NXT_CMD,
-   GA_NXT_RSP_SIZE);
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, GA_NXT_CMD, GA_NXT_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
/* Prepare CT arguments -- port_id */
@@ -284,8 +283,7 @@ qla2x00_gid_pt(scsi_qla_host_t *vha, sw_info_t *list)
gid_pt_rsp_size);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, GID_PT_CMD,
-   gid_pt_rsp_size);
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, GID_PT_CMD, gid_pt_rsp_size);
ct_rsp = ha-ct_sns-p.rsp;
 
/* Prepare CT arguments -- port_type */
@@ -359,7 +357,7 @@ qla2x00_gpn_id(scsi_qla_host_t *vha, sw_info_t *list)
GPN_ID_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, GPN_ID_CMD,
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, GPN_ID_CMD,
GPN_ID_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
@@ -421,7 +419,7 @@ qla2x00_gnn_id(scsi_qla_host_t *vha, sw_info_t *list)
GNN_ID_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, GNN_ID_CMD,
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, GNN_ID_CMD,
GNN_ID_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
@@ -495,7 +493,7 @@ qla2x00_rft_id(scsi_qla_host_t *vha)
RFT_ID_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, RFT_ID_CMD,
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, RFT_ID_CMD,
RFT_ID_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
@@ -551,8 +549,7 @@ qla2x00_rff_id(scsi_qla_host_t *vha)
RFF_ID_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, RFF_ID_CMD,
-   RFF_ID_RSP_SIZE);
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, RFF_ID_CMD, RFF_ID_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
/* Prepare CT arguments -- port_id, FC-4 feature, FC-4 type */
@@ -606,8 +603,7 @@ qla2x00_rnn_id(scsi_qla_host_t *vha)
RNN_ID_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, RNN_ID_CMD,
-   RNN_ID_RSP_SIZE);
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, RNN_ID_CMD, RNN_ID_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
/* Prepare CT arguments -- port_id, node_name */
@@ -676,8 +672,7 @@ qla2x00_rsnn_nn(scsi_qla_host_t *vha)
ms_pkt = ha-isp_ops-prep_ms_iocb(vha, 0, RSNN_NN_RSP_SIZE);
 
/* Prepare CT request */
-   ct_req = qla2x00_prep_ct_req(ha-ct_sns-p.req, RSNN_NN_CMD,
-   RSNN_NN_RSP_SIZE);
+   ct_req = qla2x00_prep_ct_req(ha-ct_sns, RSNN_NN_CMD, RSNN_NN_RSP_SIZE);
ct_rsp = ha-ct_sns-p.rsp;
 
/* Prepare CT arguments -- node_name, symbolic node_name, size */
@@ -1262,18 

[PATCH 08/10] qla2xxx: Fix qla2xxx_check_risc_status()

2013-06-05 Thread Bart Van Assche
Change the 'rval' variable from QLA_FUNCTION_TIMEOUT into QLA_SUCCESS
before starting a loop that is only executed if rval is initialized
to QLA_SUCCESS. Coverity reported that loop as dead code.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_isr.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 259d920..bd0e2fa 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2495,6 +2495,7 @@ qla2xxx_check_risc_status(scsi_qla_host_t *vha)
if (rval == QLA_SUCCESS)
goto next_test;
 
+   rval = QLA_SUCCESS;
WRT_REG_DWORD(reg-iobase_window, 0x0003);
for (cnt = 100; (RD_REG_DWORD(reg-iobase_window)  BIT_0) == 0 
rval == QLA_SUCCESS; cnt--) {
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/10] qla2xxx: Remove an unused variable from qla2x00_remove_one()

2013-06-05 Thread Bart Van Assche
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_os.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 1d97626..2176cc9 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -2980,14 +2980,12 @@ qla2x00_remove_one(struct pci_dev *pdev)
set_bit(UNLOADING, base_vha-dpc_flags);
mutex_lock(ha-vport_lock);
while (ha-cur_vport_count) {
-   struct Scsi_Host *scsi_host;
-
spin_lock_irqsave(ha-vport_slock, flags);
 
BUG_ON(base_vha-list.next == ha-vp_list);
/* This assumes first entry in ha-vp_list is always base vha */
vha = list_first_entry(base_vha-list, scsi_qla_host_t, list);
-   scsi_host = scsi_host_get(vha-host);
+   scsi_host_get(vha-host);
 
spin_unlock_irqrestore(ha-vport_slock, flags);
mutex_unlock(ha-vport_lock);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/10] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()

2013-06-05 Thread Bart Van Assche
Avoid that the fcport structure gets leaked if
bsg_job-request-msgcode == FC_BSG_HST_ELS_NOLOGIN, the fcport
allocation succeeds and the !vha-flags.online branch is taken.
This was detected by Coverity. However, Coverity does not recognize
that all qla2x00_process_els() callers specify either
FC_BSG_RPT_ELS or FC_BSG_HST_ELS_NOLOGIN in the field
bsg_job-request-msgcode and that the value of that field is not
modified inside that function. This results in a false positive
report about a possible memory leak in an error path for
bsg_job-request-msgcode values other than the two mentioned
values.  Make it easy for Coverity (and for humans) to recognize
that there is no fcport leak in the error path by changing the
bsg_job-request-msgcode == FC_BSG_HST_ELS_NOLOGIN test into
bsg_job-request-msgcode != FC_BSG_RPT_ELS.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Chad Dupuis chad.dup...@qlogic.com
Cc: Saurav Kashyap saurav.kash...@qlogic.com
---
 drivers/scsi/qla2xxx/qla_bsg.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
index cf07491..f8a2634 100644
--- a/drivers/scsi/qla2xxx/qla_bsg.c
+++ b/drivers/scsi/qla2xxx/qla_bsg.c
@@ -255,6 +255,12 @@ qla2x00_process_els(struct fc_bsg_job *bsg_job)
int rval =  (DRIVER_ERROR  16);
uint16_t nextlid = 0;
 
+   if (!vha-flags.online) {
+   ql_log(ql_log_warn, vha, 0x7005, Host not online.\n);
+   rval = -EIO;
+   goto done;
+   }
+
if (bsg_job-request-msgcode == FC_BSG_RPT_ELS) {
rport = bsg_job-rport;
fcport = *(fc_port_t **) rport-dd_data;
@@ -326,12 +332,6 @@ qla2x00_process_els(struct fc_bsg_job *bsg_job)
NPH_FABRIC_CONTROLLER : NPH_F_PORT;
}
 
-   if (!vha-flags.online) {
-   ql_log(ql_log_warn, vha, 0x7005, Host not online.\n);
-   rval = -EIO;
-   goto done;
-   }
-
req_sg_cnt =
dma_map_sg(ha-pdev-dev, bsg_job-request_payload.sg_list,
bsg_job-request_payload.sg_cnt, DMA_TO_DEVICE);
@@ -399,7 +399,7 @@ done_unmap_sg:
goto done_free_fcport;
 
 done_free_fcport:
-   if (bsg_job-request-msgcode == FC_BSG_HST_ELS_NOLOGIN)
+   if (bsg_job-request-msgcode != FC_BSG_RPT_ELS)
kfree(fcport);
 done:
return rval;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sg: atomize check and set sdp-exclude in sg_open

2013-06-05 Thread Jörn Engel
On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:
 
 Check and set sdp-exclude should be atomic when set in sg_open().

The patch is line-wrapped.  More importantly, it doesn't seem to do
what your description indicates it should do.  And lastly, does this
fix a bug, possibly even one you have a testcase for, or was it found
by code inspection?

 Signed-off-by: Vaughan Cao vaughan@oracle.com
 ---
  drivers/scsi/sg.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
 index 25b5455..0ede08f 100644
 --- a/drivers/scsi/sg.c
 +++ b/drivers/scsi/sg.c
 @@ -245,6 +245,21 @@ static int set_exclude(Sg_device *sdp, char val)
  return val;
  }
 
 +/* Check if we can set exclude and then set, return 1 if success */
 +static int try_set_exclude(Sg_device *sdp)
 +{
 +unsigned long flags;
 +
 +spin_lock_irqsave(sg_open_exclusive_lock, flags);
 +if (sdp-exclude) {
 +spin_unlock_irqrestore(sg_open_exclusive_lock, flags);
 +return 0;
 +}
 +sdp-exclude = 1;
 +spin_unlock_irqrestore(sg_open_exclusive_lock, flags);
 +return 1;
 +}
 +
  static int sfds_list_empty(Sg_device *sdp)
  {
  unsigned long flags;
 @@ -303,7 +318,7 @@ sg_open(struct inode *inode, struct file *filp)
  goto error_out;
  }
  res = wait_event_interruptible(sdp-o_excl_wait,
 -   ((!sfds_list_empty(sdp) || get_exclude(sdp))
 ? 0 : set_exclude(sdp, 1)));
 +((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 :
 try_set_exclude(sdp)));
  if (res) {
  retval = res;/* -ERESTARTSYS because signal hit process */
  goto error_out;
 -- 
 1.7.11.7
 

Jörn

--
Fantasy is more important than knowledge. Knowledge is limited,
while fantasy embraces the whole world.
-- Albert Einstein
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


How to fix this sleep-inside-lock problem?

2013-06-05 Thread Martin Peschke
Hi,

I would like to ask for advice, prior to submitting a patch for our lldd
zfcp, or alternatively for common code.

Someone reported this warning and function call stack:

BUG: sleeping function called from invalid context at kernel/workqueue.c:2752
in_atomic(): 1, irqs_disabled(): 1, pid: 360, name: zfcperp0.0.1700
CPU: 1 Not tainted 3.9.3+ #69
Process zfcperp0.0.1700 (pid: 360, task: 75b7e080, ksp: 
7476bc30)
snip
Call Trace:
([001165de] show_trace+0x106/0x154)
 [001166a0] show_stack+0x74/0xf4
 [006ff646] dump_stack+0xc6/0xd4
 [0017f3a0] __might_sleep+0x128/0x148
 [0015ece8] flush_work+0x54/0x1f8
 [001630de] __cancel_work_timer+0xc6/0x128
 [005067ac] scsi_device_dev_release_usercontext+0x164/0x23c
 [00161816] execute_in_process_context+0x96/0xa8
 [004d33d8] device_release+0x60/0xc0
 [0048af48] kobject_release+0xa8/0x1c4
 [004f4bf2] __scsi_iterate_devices+0xfa/0x130
 [03ff801b307a] zfcp_erp_strategy+0x4da/0x1014 [zfcp]
 [03ff801b3caa] zfcp_erp_thread+0xf6/0x2b0 [zfcp]
 [0016b75a] kthread+0xf2/0xfc
 [0070c9de] kernel_thread_starter+0x6/0xc
 [0070c9d8] kernel_thread_starter+0x0/0xc

Apparently, the ref_count for some scsi_device drops down to zero,
triggering device removal through execute_in_process_context(), while
the lldd error recovery thread iterates through a scsi device list.
Unfortunately, execute_in_process_context() decides to immediately
execute that device removal function, instead of scheduling asynchronous
execution, since it detects process context and thinks it is safe to do
so. But almost all calls to shost_for_each_device() in our lldd are
inside spin_lock_irq, even in thread context. Obviously, schedule()
inside spin_lock_irq sections is a bad idea.

Does it make sense to teach execute_in_process_context() to run a
function asynchronously when irqs are disabled, as inside a
spin_lock_irq section? If so, is the addition of an irqs_disabled()
check sufficient?

Or is it preferable to change the lldd to use __shost_for_each_device()
with shost-host_lock? (hoping that doesn't result in locking order
issues with host_lock inside our lldd erp_lock...)

Other suggestions?

The problem has been introduced when our LUN related data was moved to
the midlayer (commit b62a8d9b45b971a67a0f8413338c230e3117dff5) back in
2010.


Thanks,
Martin

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 1/1] ipr: IOA Status Code(IOASC) update

2013-06-05 Thread Brian King
On 06/04/2013 06:57 PM, wenxi...@linux.vnet.ibm.com wrote:
 The patch updates some new IOA Status Code(IOASC) in ipr driver.
 
 Signed-off-by: Wen Xiong wenxi...@linux.vnet.ibm.com
 ---

Acked-by: Brian King brk...@linux.vnet.ibm.com

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sg: atomize check and set sdp-exclude in sg_open

2013-06-05 Thread vaughan

于 2013年06月05日 21:27, Jörn Engel 写道:

On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:


Check and set sdp-exclude should be atomic when set in sg_open().


The patch is line-wrapped.  More importantly, it doesn't seem to do

It's shorter than the original line, so I just leave it like this...


what your description indicates it should do.  And lastly, does this
fix a bug, possibly even one you have a testcase for, or was it found
by code inspection?
I found it by code inspection. A race condition may happen with the old 
code if two threads are both trying to open the same sg with O_EXCL 
simultaneously. It's possible that they both find fsds list is empty and 
get_exclude(sdp) returns 0, then they both call set_exclude() and break 
out from wait_event_interruptible and resume open. So it's necessary to 
check again with sg_open_exclusive_lock held to ensure only one can set 
sdp-exclude and return 0 to break out from wait_event loop.





Signed-off-by: Vaughan Cao vaughan@oracle.com
---
  drivers/scsi/sg.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 25b5455..0ede08f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -245,6 +245,21 @@ static int set_exclude(Sg_device *sdp, char val)
  return val;
  }

+/* Check if we can set exclude and then set, return 1 if success */
+static int try_set_exclude(Sg_device *sdp)
+{
+unsigned long flags;
+
+spin_lock_irqsave(sg_open_exclusive_lock, flags);
+if (sdp-exclude) {
+spin_unlock_irqrestore(sg_open_exclusive_lock, flags);
+return 0;
+}
+sdp-exclude = 1;
+spin_unlock_irqrestore(sg_open_exclusive_lock, flags);
+return 1;
+}
+
  static int sfds_list_empty(Sg_device *sdp)
  {
  unsigned long flags;
@@ -303,7 +318,7 @@ sg_open(struct inode *inode, struct file *filp)
  goto error_out;
  }
  res = wait_event_interruptible(sdp-o_excl_wait,
-   ((!sfds_list_empty(sdp) || get_exclude(sdp))
? 0 : set_exclude(sdp, 1)));
+((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 :
try_set_exclude(sdp)));
  if (res) {
  retval = res;/* -ERESTARTSYS because signal hit process */
  goto error_out;
--
1.7.11.7



Jörn

--
Fantasy is more important than knowledge. Knowledge is limited,
while fantasy embraces the whole world.
-- Albert Einstein


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] tcm_qla2xxx: Fix residual for underrun commands that fail

2013-06-05 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com

Suppose an initiator sends a DATA IN command with an allocation length
shorter than the FC transfer length -- we get a target message like

TARGET_CORE[qla2xxx]: Expected Transfer Length: 256 does not match SCSI CDB 
Length: 0 for SAM Opcode: 0x12

In that case, the target core adjusts the data_length and sets
se_cmd-residual_count for the underrun.  But now suppose that command
fails and we end up in tcm_qla2xxx_queue_status() -- that function
unconditionally overwrites residual_count with the already adjusted
data_length, and the initiator will burp with a message like

qla2xxx [:00:06.0]-301d:0: Dropped frame(s) detected (0x100 of 0x100 
bytes).

Fix this by adding on to the existing underflow residual count instead.

Signed-off-by: Roland Dreier rol...@purestorage.com
---
 drivers/scsi/qla2xxx/tcm_qla2xxx.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 7a3870f..66b0b26 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -688,8 +688,12 @@ static int tcm_qla2xxx_queue_status(struct se_cmd *se_cmd)
 * For FCP_READ with CHECK_CONDITION status, clear cmd-bufflen
 * for qla_tgt_xmit_response LLD code
 */
+   if (se_cmd-se_cmd_flags  SCF_OVERFLOW_BIT) {
+   se_cmd-se_cmd_flags = ~SCF_OVERFLOW_BIT;
+   se_cmd-residual_count = 0;
+   }
se_cmd-se_cmd_flags |= SCF_UNDERFLOW_BIT;
-   se_cmd-residual_count = se_cmd-data_length;
+   se_cmd-residual_count += se_cmd-data_length;
 
cmd-bufflen = 0;
}
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sg: atomize check and set sdp-exclude in sg_open

2013-06-05 Thread Jörn Engel
On Thu, 6 June 2013 00:16:45 +0800, vaughan wrote:
 于 2013年06月05日 21:27, Jörn Engel 写道:
 On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:
 
 Check and set sdp-exclude should be atomic when set in sg_open().
 
 The patch is line-wrapped.  More importantly, it doesn't seem to do
 It's shorter than the original line, so I just leave it like this...

Sure.  What I meant by line-wrapped is that your mailer mangled the
patch.  Those two lines should have been one:
 -   ((!sfds_list_empty(sdp) || get_exclude(sdp))
 ? 0 : set_exclude(sdp, 1)));

 what your description indicates it should do.  And lastly, does this
 fix a bug, possibly even one you have a testcase for, or was it found
 by code inspection?
 I found it by code inspection. A race condition may happen with the
 old code if two threads are both trying to open the same sg with
 O_EXCL simultaneously. It's possible that they both find fsds list
 is empty and get_exclude(sdp) returns 0, then they both call
 set_exclude() and break out from wait_event_interruptible and resume
 open. So it's necessary to check again with sg_open_exclusive_lock
 held to ensure only one can set sdp-exclude and return 0 to break
 out from wait_event loop.

Makes sense.  And reading the code again, I have to wonder what monkey
came up with the get_exclude/set_exclude functions.

Can I sucker you into a slightly larger cleanup?  I think the entire
get_exclude(sdp)) ? 0 : set_exclude(sdp, 1) should be simplified.
And once you add the try_set_exclude(), set_exclude will only ever do
clear_exclude, so you might as well rename and simplify that as well.

Let no good deed go unpunished.

Jörn

--
It's just what we asked for, but not what we want!
-- anonymous
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID-10 keeps aborting

2013-06-05 Thread Martin K. Petersen
 Bernd == Bernd Schubert bernd.schub...@itwm.fraunhofer.de writes:

Bernd And I also don't see any driver setting max_ws_blocks, so
Bernd everything except of libata gets the default of
Bernd SD_MAX_WS10_BLOCKS. 

Yes. That's intentional. Unless the device provides MAXIMUM WRITE SAME
BLOCKS in the BLOCK LIMITS VPD.


Bernd As the correct handling in the md layer seems to be difficult,
Bernd can we send a fake request at device configuration time to figure
Bernd out if the device really support write-same?

The problem is that WRITE SAME is destructive. And unfortunately a block
count of 0 means write entire device.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: Check if the device support WRITE_SAME_10

2013-06-05 Thread Bernd Schubert
On 06/05/2013 09:14 PM, Martin K. Petersen wrote: Bernd == Bernd 
Schubert bernd.schub...@itwm.fraunhofer.de writes:
 
 Bernd The md layer currently cannot handle failed WRITE_SAME commands
 Bernd and the initial easiest fix is to check if the device supports
 Bernd WRITE_SAME at all. It already tested for WRITE_SAME_16 and this
 Bernd commit adds a test for WRITE_SAME_10.
 
 No go. That'll disable WRITE SAME for drives which don't support
 RSOC. Which means almost all of them.

Ah, sorry, I didn't check the specs.

 
 I propose the following...
 

 --- a/drivers/scsi/sd.c
 +++ b/drivers/scsi/sd.c
 @@ -442,8 +442,15 @@ sd_store_write_same_blocks(struct device *dev, struct 
 device_attribute *attr,
  
   if (max == 0)
   sdp-no_write_same = 1;
 - else if (max = SD_MAX_WS16_BLOCKS)
 - sdkp-max_ws_blocks = max;
 + else
 + sdp-no_write_same = 0;
 +
 + if (sdkp-ws16)
 + sdkp-max_ws_blocks =
 + max_t(unsigned long, max, SD_MAX_WS16_BLOCKS);
 + else
 + sdkp-max_ws_blocks =
 + max_t(unsigned long, max, SD_MAX_WS10_BLOCKS);
  
   sd_config_write_same(sdkp);

Max? Not min_t()?


 @@ -762,16 +769,16 @@ static void sd_config_write_same(struct scsi_disk *sdkp)
* blocks per I/O unless the device explicitly advertises a
* bigger limit.
*/
 - if (sdkp-max_ws_blocks == 0)
 - sdkp-max_ws_blocks = SD_MAX_WS10_BLOCKS;
 -
 - if (sdkp-ws16 || sdkp-max_ws_blocks  SD_MAX_WS10_BLOCKS)
 + if (sdkp-max_ws_blocks  SD_MAX_WS10_BLOCKS)
   blocks = min_not_zero(sdkp-max_ws_blocks,
 (u32)SD_MAX_WS16_BLOCKS);
   else
   blocks = min_not_zero(sdkp-max_ws_blocks,
 (u32)SD_MAX_WS10_BLOCKS);
  
 + if (sdkp-ws16 || sdkp-ws10 || sdkp-device-no_report_opcodes)
 + sdkp-max_ws_blocks = blocks;
 +
  out:
   blk_queue_max_write_same_sectors(q, blocks * (logical_block_size  9));
  }

blk_queue_max_write_same_sectors(q, sdkp-max_ws_blocks * (logical_block_size 
 9)) ?
Otherwise sdkp-max_ws_blocks and the queue might have different values, 
wouldn't they?


I cant't provide a comment about scsi_get_vpd_page, I simply don't know. You 
certainly
know the scsi specs ways better than I do!


Thanks,
Bernd



--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] make dev_loss_tmo checks uniform across drivers

2013-06-05 Thread Shergill, Gurinder
 -Original Message-
 From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
 ow...@vger.kernel.org] On Behalf Of Bart Van Assche
 Sent: Monday, June 03, 2013 11:54 PM
 To: Shergill, Gurinder
 Cc: linux-scsi@vger.kernel.org
 Subject: Re: [PATCH] make dev_loss_tmo checks uniform across drivers
 
 On 06/03/13 20:25, gurinder.sherg...@hp.com wrote:
  The dev_loss_tmo parameter controls how long the FC transport layer
  insulates a loss of a remote port. It is defined to be between 1 and
  SCSI_DEVICE_BLOCK_MAX_TIMEOUT (600). Different drivers do slightly
  different checks when passed this parameter. This patch makes the
  dev_loss_tmo setting uniform across all the drivers by pulling the
  common logic into the transport layer function which is responsible
  for invoking the driver provided callouts.
 
  In addition, this patch also fixes a issue (see below), where
  dev_loss_tmo could end up with a value more than
  SCSI_DEVICE_BLOCK_MAX_TIMEOUT when fast_io_fail_tmo is enabled and
  then disabled.  The change is to never allow dev_loss_tmo to be out of
  its defined range. And, once that is a given, then fast_io_fail_tmo can
 be capped by dev_loss_tmo.
 
 Hello Sunny,
 
 As far as I know some users set fast_io_fail_tmo such that dev_loss_tmo
 can be set to a very large value in order to prevent that the SCSI host
 number changes. Sorry but I'm not sure it's a good idea to remove that
 functionality.

Fair enough. I wasn't aware of such usage of fast_io_fail_tmo. I'll take that 
part out of the patch.

Thanks for the feedback,
Sunny

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] scsi_transport_fc: FC timeout handler

2013-06-05 Thread Jörn Engel
On Sat, 25 May 2013 11:55:58 +0200, Hannes Reinecke wrote:
 
 It should be possible to move the code into scsi_lib and just have
 small hooks for the individual transports to use this.

We have done something a bit tasteless and unconditionally enabled
your new error handling for all scsi drivers.  That solved a rather
serious problem for us and the subset of scsi drivers we actually
exercise seems to be doing fine.  Maybe we could add a hook into
scsi_times_out() that does the unconditional thing if users opt-in by
setting CONFIG_I_MIGHT_BREAK_STUFF or so.

We also have a scsi error injector now that should allow testing your
patches on any hardware, including kvm.  On a trial run the system
eventually rebooted (likely through a hung task timeout, haven't
checked) with the old error handling and lost a single drive with your
new code.  Awesome!

So my vote is for ignoring the remaining bits and pushing forward.
Your code is too useful to wait another half year.

And we should brush up the error injector and send it out as well.
Testing your code on cheap hardware would be nice.

Jörn

--
He who knows others is wise.
He who knows himself is enlightened.
-- Lao Tsu
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] scsi: Document enhanced error codes

2013-06-05 Thread Ren Mingxin

Hi, Hannes:

I have two questions about the comments:

On 06/05/2013 03:10 PM, Hannes Reinecke wrote:

Document the various error codes returned on I/O failure.

Signed-off-by: Hannes Reineckeh...@suse.de
---
  drivers/scsi/scsi_error.c |  7 +--
  drivers/scsi/scsi_lib.c   | 11 +++
  2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f43de1e..443b0e3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -229,8 +229,11 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host 
*shost,
   * scsi_check_sense - Examine scsi cmd sense
   * @scmd: Cmd to have sense checked.
   *
- * Return value:
- * SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR
+ * Possible return values:
+ * SUCCESS
+ * FAILED
+ * NEEDS_RETRY
+ * TARGET_ERROR


This is more likely to be a historical non-update issue -
there is another possible return value 'ADD_TO_MLQUEUE' which may be
returned by the handler check_sense() or the case of this
scsi_check_sense() below, right?

switch (sshdr.sense_key) {
case HARDWARE_ERROR:
if (scmd-device-retry_hwerror)
return ADD_TO_MLQUEUE;



   *
   * Notes:
   *When a deferred error is detected the current command has
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..12bfa73 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -700,6 +700,17 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
  }
  EXPORT_SYMBOL(scsi_release_buffers);

+/**
+ * __scsi_error_from_host_byte - translate SCSI error code into errno
+ * @cmd:   SCSI command (unused)
+ * @result:scsi error code
+ *
+ * Translate SCSI error code into standard UNIX errno.
+ * Return values:
+ * -ENOLINKtemporary transport failure
+ * -EREMOTEIO  permanent target failure, do not retry
+ * -EBADE  permanent nexus failure, retry on other path


Sorry, I'm afraid that I'm not clear why '-EIO' is not listed here...

Perhaps some of them are not necessary to document for some reasons?

Thanks,
Ren


+ */
  static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
  {
int error = 0;


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] scsi: Document enhanced error codes

2013-06-05 Thread Hannes Reinecke

On 06/06/2013 07:49 AM, Ren Mingxin wrote:

Hi, Hannes:

I have two questions about the comments:

On 06/05/2013 03:10 PM, Hannes Reinecke wrote:

Document the various error codes returned on I/O failure.

Signed-off-by: Hannes Reineckeh...@suse.de
---
  drivers/scsi/scsi_error.c |  7 +--
  drivers/scsi/scsi_lib.c   | 11 +++
  2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f43de1e..443b0e3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -229,8 +229,11 @@ static inline void scsi_eh_prt_fail_stats(struct
Scsi_Host *shost,
   * scsi_check_sense - Examine scsi cmd sense
   * @scmd:Cmd to have sense checked.
   *
- * Return value:
- *SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR
+ * Possible return values:
+ *SUCCESS
+ *FAILED
+ *NEEDS_RETRY
+ *TARGET_ERROR


This is more likely to be a historical non-update issue -
there is another possible return value 'ADD_TO_MLQUEUE' which may be
returned by the handler check_sense() or the case of this
scsi_check_sense() below, right?

 switch (sshdr.sense_key) {
 case HARDWARE_ERROR:
 if (scmd-device-retry_hwerror)
 return ADD_TO_MLQUEUE;


Blast. You are correct. Will be fixing it up.



   *
   * Notes:
   *When a deferred error is detected the current command has
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..12bfa73 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -700,6 +700,17 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
  }
  EXPORT_SYMBOL(scsi_release_buffers);

+/**
+ * __scsi_error_from_host_byte - translate SCSI error code into errno
+ * @cmd:SCSI command (unused)
+ * @result:scsi error code
+ *
+ * Translate SCSI error code into standard UNIX errno.
+ * Return values:
+ * -ENOLINKtemporary transport failure
+ * -EREMOTEIOpermanent target failure, do not retry
+ * -EBADEpermanent nexus failure, retry on other path


Sorry, I'm afraid that I'm not clear why '-EIO' is not listed here...

Perhaps some of them are not necessary to document for some reasons?


Hmm. What with EIO being the default I've seen no need to document this.
But yes, you are right, it should be documented.

Cheers,

Hannes

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html