[PATCH 1/2] iscsi_tcp: fix potential lockup with write commands

2007-11-14 Thread michaelc
From: Tony Battersby [EMAIL PROTECTED]

There is a race condition in iscsi_tcp.c that may cause it to forget
that it received a R2T from the target.  This race may cause a data-out
command (such as a write) to lock up.  The race occurs here:

static int
iscsi_send_unsol_pdu(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
{
struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data;
int rc;

if (tcp_ctask-xmstate  XMSTATE_UNS_HDR) {
BUG_ON(!ctask-unsol_count);
tcp_ctask-xmstate = ~XMSTATE_UNS_HDR;  RACE
...

static int
iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask)
{
...
tcp_ctask-xmstate |= XMSTATE_SOL_HDR_INIT;  RACE
...

While iscsi_xmitworker() (called from scsi_queue_work()) is preparing to
send unsolicited data, iscsi_tcp_data_recv() (called from
tcp_read_sock()) interrupts it upon receipt of a R2T from the target.
Both contexts do read-modify-write of tcp_ctask-xmstate.  Usually, gcc
on x86 will make = and |= atomic on UP (not guaranteed of course), but
in this case iscsi_send_unsol_pdu() reads the value of xmstate before
clearing the bit, which causes gcc to read xmstate into a CPU register,
test it, clear the bit, and then store it back to memory.  If the recv
interrupt happens during this sequence, then the XMSTATE_SOL_HDR_INIT
bit set by the recv interrupt will be lost, and the R2T will be
forgotten.

The patch below (against 2.6.24-rc1) converts accesses of xmstate to use
set_bit, clear_bit, and test_bit instead of |= and =.  I have tested
this patch and verified that it fixes the problem.  Another possible
approach would be to hold a lock during most of the rx/tx setup and
post-processing, and drop the lock only for the actual rx/tx.

Signed-off-by: Tony Battersby [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |  139 +++---
 drivers/scsi/iscsi_tcp.h |   34 ++--
 2 files changed, 86 insertions(+), 87 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 4bcf916..57ce225 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -197,7 +197,7 @@ iscsi_tcp_cleanup_ctask(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
if (unlikely(!sc))
return;
 
-   tcp_ctask-xmstate = XMSTATE_IDLE;
+   tcp_ctask-xmstate = XMSTATE_VALUE_IDLE;
tcp_ctask-r2t = NULL;
 }
 
@@ -409,7 +409,7 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
 
tcp_ctask-exp_datasn = r2tsn + 1;
__kfifo_put(tcp_ctask-r2tqueue, (void*)r2t, sizeof(void*));
-   tcp_ctask-xmstate |= XMSTATE_SOL_HDR_INIT;
+   set_bit(XMSTATE_BIT_SOL_HDR_INIT, tcp_ctask-xmstate);
list_move_tail(ctask-running, conn-xmitqueue);
 
scsi_queue_work(session-host, conn-xmitwork);
@@ -1254,7 +1254,7 @@ static void iscsi_set_padding(struct iscsi_tcp_cmd_task 
*tcp_ctask,
 
tcp_ctask-pad_count = ISCSI_PAD_LEN - tcp_ctask-pad_count;
debug_scsi(write padding %d bytes\n, tcp_ctask-pad_count);
-   tcp_ctask-xmstate |= XMSTATE_W_PAD;
+   set_bit(XMSTATE_BIT_W_PAD, tcp_ctask-xmstate);
 }
 
 /**
@@ -1269,7 +1269,7 @@ iscsi_tcp_cmd_init(struct iscsi_cmd_task *ctask)
struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data;
 
BUG_ON(__kfifo_len(tcp_ctask-r2tqueue));
-   tcp_ctask-xmstate = XMSTATE_CMD_HDR_INIT;
+   tcp_ctask-xmstate = 1  XMSTATE_BIT_CMD_HDR_INIT;
 }
 
 /**
@@ -1283,10 +1283,10 @@ iscsi_tcp_cmd_init(struct iscsi_cmd_task *ctask)
  * xmit.
  *
  * Management xmit state machine consists of these states:
- * XMSTATE_IMM_HDR_INIT- calculate digest of PDU Header
- * XMSTATE_IMM_HDR - PDU Header xmit in progress
- * XMSTATE_IMM_DATA- PDU Data xmit in progress
- * XMSTATE_IDLE- management PDU is done
+ * XMSTATE_BIT_IMM_HDR_INIT - calculate digest of PDU Header
+ * XMSTATE_BIT_IMM_HDR  - PDU Header xmit in progress
+ * XMSTATE_BIT_IMM_DATA - PDU Data xmit in progress
+ * XMSTATE_VALUE_IDLE   - management PDU is done
  **/
 static int
 iscsi_tcp_mtask_xmit(struct iscsi_conn *conn, struct iscsi_mgmt_task *mtask)
@@ -1297,12 +1297,12 @@ iscsi_tcp_mtask_xmit(struct iscsi_conn *conn, struct 
iscsi_mgmt_task *mtask)
debug_scsi(mtask deq [cid %d state %x itt 0x%x]\n,
conn-id, tcp_mtask-xmstate, mtask-itt);
 
-   if (tcp_mtask-xmstate  XMSTATE_IMM_HDR_INIT) {
+   if (test_bit(XMSTATE_BIT_IMM_HDR_INIT, tcp_mtask-xmstate)) {
iscsi_buf_init_iov(tcp_mtask-headbuf, (char*)mtask-hdr,
   sizeof(struct iscsi_hdr));
 
if (mtask-data_count) {
-   tcp_mtask-xmstate |= XMSTATE_IMM_DATA;
+   

RE: how to handle QUEUE_FULL/SAM_STAT_TASK_SET_FULL in userspace?

2007-11-14 Thread Moore, Eric
On Wednesday, November 14, 2007 10:23 AM, Chris Friesen wrote: 
  QUEUE_FULL and SAM_STAT_TASK_SET_FULL are not errors.
 
 I consider them errors in the same way that ENOMEM or ENOBUFS 
 (or even 
 EAGAIN) are errors.  There is a shortage of resources and 
 the command 
 could not be completed, please try again later.
 
 Also, the behaviour has changed from 2.6.10 with the 3.01.18 fusion 
 driver, to 2.6.14 with the 3.02.57 fusion driver.
 
 With 2.6.10 our user app never saw SAM_STAT_TASK_SET_FULL.  I 
 suspect it 
 is due to the fact that it's using a queue size of 7, while in 2.6.14 
 it's using a queue size of 32 or 64.
 
 Which kernel version is behaving properly?

You already figured out the problem, I don't understand why your asking
if the kernel verison is behaving properly.   You said between those
driver versions the device queue depth increased from 32 to 64, and that
is exactly what happened.   The reason for the increase is some customer
ask for the increase queue_depth which helps with performance. We are
not going to decrease it back.

 
 I've asked seagate what the queue size should be for that 
 hardware, but 
 haven't heard back yet.
 
  SAM_STAT_TASK_SET_FULL returned for the target that handle 
 the number of
  commands, and QUEUE_FULL returned from hba firmware meaning 
 its can't
  handle the number of commands.  Translated, the commands 
 are retried by
  scsiml.I probably should be calling scsi_track_queue_full which
  would be throttling the command back, however I'm not sure 
 whether it
  matters.
 
 We have a userspace app calling ioctl(...SG_IO...) on /dev/sdX and 
 occasionally getting a status of SAM_STAT_TASK_SET_FULL.  I may be 
 misreading the code, but it doesn't appear that the midlayer 
 is retrying 
 these commands.
 
 If the queue length in 2.6.14 is correct then how do I handle that 
 status code?  Maybe delay a bit then retry a few times?  How 
 much delay? 
How many retries?
 

SAM_STAT_TASK_SET_FULL in /usr/src/linux/scsi/scsi.h, is the same as
QUEUE_FULL.  If you look in scsi_error.c searching for QUEUE_FULL, you
will see that it will translate to ADD_TO_MLQUEUE, which means it will
reposted to the request queue.  Ultimately, calling
scsi_track_queue_full would help by reducing the queue_depth on the fly,
however I'm not sure if that is there in the older kernels your running.
What I suggest you do is write a script to update the queue_depth to the
values youre wanting.

Example
#  echo 32  /sys/class/scsi_device/0:0:0:0/device/queue_depth




-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH resend] scsi boot options: correct option name and tell where to find docs for it

2007-11-14 Thread Randy Dunlap
From: Randy Dunlap [EMAIL PROTECTED]

Minor corrections and additions to 'scsi_logging_level', as pointed out
by Chuck Ebbert.
Also point out the IBM S390-tools 'scsi_logging_level' script.

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
---
 Documentation/kernel-parameters.txt |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- linux-2.6.24-rc2-git4.orig/Documentation/kernel-parameters.txt
+++ linux-2.6.24-rc2-git4/Documentation/kernel-parameters.txt
@@ -1584,7 +1584,13 @@ and is between 256 and 4096 characters. 
Format: vendor:model:flags
(flags are integer value)
 
-   scsi_logging=   [SCSI]
+   scsi_logging_level= [SCSI] a bit mask of logging levels
+   See drivers/scsi/scsi_logging.h for bits.  Also
+   settable via sysctl at dev.scsi.logging_level
+   (/proc/sys/dev/scsi/logging_level).
+   There is also a nice 'scsi_logging_level' script in the
+   S390-tools package, available for download at
+   
http://www-128.ibm.com/developerworks/linux/linux390/s390-tools-1.5.4.html
 
scsi_mod.scan=  [SCSI] sync (default) scans SCSI busses as they are
discovered.  async scans them in kernel threads,

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 32/40] hptiop: avoid buffer overflow when returning sense data

2007-11-14 Thread Greg Kroah-Hartman


-stable review patch.  If anyone has any objections, please let us know.

--
From: HighPoint Linux Team [EMAIL PROTECTED]

patch 0fec02c93f60fb44ba3a24a0d3e4a52521d34d3f in mainline.

avoid buffer overflow when returning sense data.

With current adapter firmware the driver is working but future firmware
updates may return sense data larger than 96 bytes, causing overflow on
scp-sense_buffer and a kernel crash.

This fix should be backported to earlier kernels.

Signed-off-by: HighPoint Linux Team [EMAIL PROTECTED]
Signed-off-by: James Bottomley [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Signed-off-by: Greg Kroah-Hartman [EMAIL PROTECTED]

---
 drivers/scsi/hptiop.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/scsi/hptiop.c
+++ b/drivers/scsi/hptiop.c
@@ -365,8 +365,9 @@ static void hptiop_host_request_callback
scp-result = SAM_STAT_CHECK_CONDITION;
memset(scp-sense_buffer,
0, sizeof(scp-sense_buffer));
-   memcpy(scp-sense_buffer,
-   req-sg_list, le32_to_cpu(req-dataxfer_length));
+   memcpy(scp-sense_buffer, req-sg_list,
+   min(sizeof(scp-sense_buffer),
+   le32_to_cpu(req-dataxfer_length)));
break;
 
default:

-- 
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html