[PATCH 1/2] iscsi_tcp: fix potential lockup with write commands
From: Tony Battersby [EMAIL PROTECTED] There is a race condition in iscsi_tcp.c that may cause it to forget that it received a R2T from the target. This race may cause a data-out command (such as a write) to lock up. The race occurs here: static int iscsi_send_unsol_pdu(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask) { struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data; int rc; if (tcp_ctask-xmstate XMSTATE_UNS_HDR) { BUG_ON(!ctask-unsol_count); tcp_ctask-xmstate = ~XMSTATE_UNS_HDR; RACE ... static int iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask) { ... tcp_ctask-xmstate |= XMSTATE_SOL_HDR_INIT; RACE ... While iscsi_xmitworker() (called from scsi_queue_work()) is preparing to send unsolicited data, iscsi_tcp_data_recv() (called from tcp_read_sock()) interrupts it upon receipt of a R2T from the target. Both contexts do read-modify-write of tcp_ctask-xmstate. Usually, gcc on x86 will make = and |= atomic on UP (not guaranteed of course), but in this case iscsi_send_unsol_pdu() reads the value of xmstate before clearing the bit, which causes gcc to read xmstate into a CPU register, test it, clear the bit, and then store it back to memory. If the recv interrupt happens during this sequence, then the XMSTATE_SOL_HDR_INIT bit set by the recv interrupt will be lost, and the R2T will be forgotten. The patch below (against 2.6.24-rc1) converts accesses of xmstate to use set_bit, clear_bit, and test_bit instead of |= and =. I have tested this patch and verified that it fixes the problem. Another possible approach would be to hold a lock during most of the rx/tx setup and post-processing, and drop the lock only for the actual rx/tx. Signed-off-by: Tony Battersby [EMAIL PROTECTED] Signed-off-by: Mike Christie [EMAIL PROTECTED] --- drivers/scsi/iscsi_tcp.c | 139 +++--- drivers/scsi/iscsi_tcp.h | 34 ++-- 2 files changed, 86 insertions(+), 87 deletions(-) diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index 4bcf916..57ce225 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -197,7 +197,7 @@ iscsi_tcp_cleanup_ctask(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask) if (unlikely(!sc)) return; - tcp_ctask-xmstate = XMSTATE_IDLE; + tcp_ctask-xmstate = XMSTATE_VALUE_IDLE; tcp_ctask-r2t = NULL; } @@ -409,7 +409,7 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask) tcp_ctask-exp_datasn = r2tsn + 1; __kfifo_put(tcp_ctask-r2tqueue, (void*)r2t, sizeof(void*)); - tcp_ctask-xmstate |= XMSTATE_SOL_HDR_INIT; + set_bit(XMSTATE_BIT_SOL_HDR_INIT, tcp_ctask-xmstate); list_move_tail(ctask-running, conn-xmitqueue); scsi_queue_work(session-host, conn-xmitwork); @@ -1254,7 +1254,7 @@ static void iscsi_set_padding(struct iscsi_tcp_cmd_task *tcp_ctask, tcp_ctask-pad_count = ISCSI_PAD_LEN - tcp_ctask-pad_count; debug_scsi(write padding %d bytes\n, tcp_ctask-pad_count); - tcp_ctask-xmstate |= XMSTATE_W_PAD; + set_bit(XMSTATE_BIT_W_PAD, tcp_ctask-xmstate); } /** @@ -1269,7 +1269,7 @@ iscsi_tcp_cmd_init(struct iscsi_cmd_task *ctask) struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data; BUG_ON(__kfifo_len(tcp_ctask-r2tqueue)); - tcp_ctask-xmstate = XMSTATE_CMD_HDR_INIT; + tcp_ctask-xmstate = 1 XMSTATE_BIT_CMD_HDR_INIT; } /** @@ -1283,10 +1283,10 @@ iscsi_tcp_cmd_init(struct iscsi_cmd_task *ctask) * xmit. * * Management xmit state machine consists of these states: - * XMSTATE_IMM_HDR_INIT- calculate digest of PDU Header - * XMSTATE_IMM_HDR - PDU Header xmit in progress - * XMSTATE_IMM_DATA- PDU Data xmit in progress - * XMSTATE_IDLE- management PDU is done + * XMSTATE_BIT_IMM_HDR_INIT - calculate digest of PDU Header + * XMSTATE_BIT_IMM_HDR - PDU Header xmit in progress + * XMSTATE_BIT_IMM_DATA - PDU Data xmit in progress + * XMSTATE_VALUE_IDLE - management PDU is done **/ static int iscsi_tcp_mtask_xmit(struct iscsi_conn *conn, struct iscsi_mgmt_task *mtask) @@ -1297,12 +1297,12 @@ iscsi_tcp_mtask_xmit(struct iscsi_conn *conn, struct iscsi_mgmt_task *mtask) debug_scsi(mtask deq [cid %d state %x itt 0x%x]\n, conn-id, tcp_mtask-xmstate, mtask-itt); - if (tcp_mtask-xmstate XMSTATE_IMM_HDR_INIT) { + if (test_bit(XMSTATE_BIT_IMM_HDR_INIT, tcp_mtask-xmstate)) { iscsi_buf_init_iov(tcp_mtask-headbuf, (char*)mtask-hdr, sizeof(struct iscsi_hdr)); if (mtask-data_count) { - tcp_mtask-xmstate |= XMSTATE_IMM_DATA; +
RE: how to handle QUEUE_FULL/SAM_STAT_TASK_SET_FULL in userspace?
On Wednesday, November 14, 2007 10:23 AM, Chris Friesen wrote: QUEUE_FULL and SAM_STAT_TASK_SET_FULL are not errors. I consider them errors in the same way that ENOMEM or ENOBUFS (or even EAGAIN) are errors. There is a shortage of resources and the command could not be completed, please try again later. Also, the behaviour has changed from 2.6.10 with the 3.01.18 fusion driver, to 2.6.14 with the 3.02.57 fusion driver. With 2.6.10 our user app never saw SAM_STAT_TASK_SET_FULL. I suspect it is due to the fact that it's using a queue size of 7, while in 2.6.14 it's using a queue size of 32 or 64. Which kernel version is behaving properly? You already figured out the problem, I don't understand why your asking if the kernel verison is behaving properly. You said between those driver versions the device queue depth increased from 32 to 64, and that is exactly what happened. The reason for the increase is some customer ask for the increase queue_depth which helps with performance. We are not going to decrease it back. I've asked seagate what the queue size should be for that hardware, but haven't heard back yet. SAM_STAT_TASK_SET_FULL returned for the target that handle the number of commands, and QUEUE_FULL returned from hba firmware meaning its can't handle the number of commands. Translated, the commands are retried by scsiml.I probably should be calling scsi_track_queue_full which would be throttling the command back, however I'm not sure whether it matters. We have a userspace app calling ioctl(...SG_IO...) on /dev/sdX and occasionally getting a status of SAM_STAT_TASK_SET_FULL. I may be misreading the code, but it doesn't appear that the midlayer is retrying these commands. If the queue length in 2.6.14 is correct then how do I handle that status code? Maybe delay a bit then retry a few times? How much delay? How many retries? SAM_STAT_TASK_SET_FULL in /usr/src/linux/scsi/scsi.h, is the same as QUEUE_FULL. If you look in scsi_error.c searching for QUEUE_FULL, you will see that it will translate to ADD_TO_MLQUEUE, which means it will reposted to the request queue. Ultimately, calling scsi_track_queue_full would help by reducing the queue_depth on the fly, however I'm not sure if that is there in the older kernels your running. What I suggest you do is write a script to update the queue_depth to the values youre wanting. Example # echo 32 /sys/class/scsi_device/0:0:0:0/device/queue_depth - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH resend] scsi boot options: correct option name and tell where to find docs for it
From: Randy Dunlap [EMAIL PROTECTED] Minor corrections and additions to 'scsi_logging_level', as pointed out by Chuck Ebbert. Also point out the IBM S390-tools 'scsi_logging_level' script. Signed-off-by: Randy Dunlap [EMAIL PROTECTED] --- Documentation/kernel-parameters.txt |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) --- linux-2.6.24-rc2-git4.orig/Documentation/kernel-parameters.txt +++ linux-2.6.24-rc2-git4/Documentation/kernel-parameters.txt @@ -1584,7 +1584,13 @@ and is between 256 and 4096 characters. Format: vendor:model:flags (flags are integer value) - scsi_logging= [SCSI] + scsi_logging_level= [SCSI] a bit mask of logging levels + See drivers/scsi/scsi_logging.h for bits. Also + settable via sysctl at dev.scsi.logging_level + (/proc/sys/dev/scsi/logging_level). + There is also a nice 'scsi_logging_level' script in the + S390-tools package, available for download at + http://www-128.ibm.com/developerworks/linux/linux390/s390-tools-1.5.4.html scsi_mod.scan= [SCSI] sync (default) scans SCSI busses as they are discovered. async scans them in kernel threads, - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[patch 32/40] hptiop: avoid buffer overflow when returning sense data
-stable review patch. If anyone has any objections, please let us know. -- From: HighPoint Linux Team [EMAIL PROTECTED] patch 0fec02c93f60fb44ba3a24a0d3e4a52521d34d3f in mainline. avoid buffer overflow when returning sense data. With current adapter firmware the driver is working but future firmware updates may return sense data larger than 96 bytes, causing overflow on scp-sense_buffer and a kernel crash. This fix should be backported to earlier kernels. Signed-off-by: HighPoint Linux Team [EMAIL PROTECTED] Signed-off-by: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] Signed-off-by: Greg Kroah-Hartman [EMAIL PROTECTED] --- drivers/scsi/hptiop.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/drivers/scsi/hptiop.c +++ b/drivers/scsi/hptiop.c @@ -365,8 +365,9 @@ static void hptiop_host_request_callback scp-result = SAM_STAT_CHECK_CONDITION; memset(scp-sense_buffer, 0, sizeof(scp-sense_buffer)); - memcpy(scp-sense_buffer, - req-sg_list, le32_to_cpu(req-dataxfer_length)); + memcpy(scp-sense_buffer, req-sg_list, + min(sizeof(scp-sense_buffer), + le32_to_cpu(req-dataxfer_length))); break; default: -- - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html