Re: [PATCH] cnic: Fix ISCSI_KEVENT_IF_DOWN message handling.
Michael Chan wrote:
> When a net device goes down or when the bnx2i driver is unloaded,
> the code was not generating the ISCSI_KEVENT_IF_DOWN message properly
> and this could cause the userspace driver to crash.
>
> This is fixed by sending the message properly in the shutdown path.
> cnic_uio_stop() is also added to send the message when bnx2i is
> unregistering.
>
> Signed-off-by: Michael Chan mc...@broadcom.com
> ---
>  drivers/net/cnic.c | 23 +--
>  1 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/cnic.c b/drivers/net/cnic.c
> index 4d1515f..4869d77 100644
> --- a/drivers/net/cnic.c
> +++ b/drivers/net/cnic.c
> @@ -227,7 +227,7 @@ static int cnic_send_nlmsg(struct cnic_local *cp, u32 type,
>          }
>
>          rcu_read_lock();
> -        ulp_ops = rcu_dereference(cp->ulp_ops[CNIC_ULP_ISCSI]);
> +        ulp_ops = rcu_dereference(cnic_ulp_tbl[CNIC_ULP_ISCSI]);
>          if (ulp_ops)
>                  ulp_ops->iscsi_nl_send_msg(cp->dev, msg_type, buf, len);
>          rcu_read_unlock();
> @@ -319,6 +319,20 @@ static int cnic_abort_prep(struct cnic_sock *csk)
>          return 0;
>  }
>
> +static void cnic_uio_stop(void)
> +{
> +        struct cnic_dev *dev;
> +
> +        read_lock(&cnic_dev_lock);
> +        list_for_each_entry(dev, &cnic_dev_list, list) {
> +                struct cnic_local *cp = dev->cnic_priv;
> +
> +                if (cp->cnic_uinfo)
> +                        cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL);

I don't think you can call this with the cnic_dev_lock held. Read locks have the same sleeping restrictions as a spin_lock, right? If so, the problem is that iscsi_nl_send_msg calls iscsi_offload_mesg, which uses GFP_NOIO and can sleep.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups open-iscsi group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---
Re: [PATCH] Don't kill iscsid if logout from all nodes fail
Erez Zilber wrote:
> If 'iscsiadm -m node --logoutall=all' fails when stopping the open-iscsi
> service, we shouldn't kill iscsid. This solves the following race:
>
> 1. A logout from a node is initiated by the user.
> 2. Before the logout completes, the user runs /etc/init.d/iscsi stop.
>    The 'stop' method logs out from all nodes. When it tries to log out
>    from the node that is already logging out (step #1), it fails because
>    that logout is already in progress. Then, the 'stop' method kills
>    iscsid.
> 3. The logout command from step #1 returns and notifies the (dead)
>    daemon.
>
> Now, running 'iscsiadm -m session' shows a session (which, actually,
> doesn't exist anymore) and the iscsi service is down.
>
> Signed-off-by: Erez Zilber erezzi.l...@gmail.com

Thanks Erez. Merged in a62d1b60856dc3118ab1d07990d43695b336fd69. It should be on kernel.org in a little bit.
Re: [PATCH] cnic: Fix ISCSI_KEVENT_IF_DOWN message handling.
Mike Christie wrote:
> Michael Chan wrote:
>> +static void cnic_uio_stop(void)
>> +{
>> +        struct cnic_dev *dev;
>> +
>> +        read_lock(&cnic_dev_lock);
>> +        list_for_each_entry(dev, &cnic_dev_list, list) {
>> +                struct cnic_local *cp = dev->cnic_priv;
>> +
>> +                if (cp->cnic_uinfo)
>> +                        cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL);
>
> I don't think you can call this with the cnic_dev_lock held. Read locks
> have the same sleeping restrictions as a spin_lock, right? If so, the
> problem is that iscsi_nl_send_msg calls iscsi_offload_mesg, which uses
> GFP_NOIO and can sleep.

In that case, can I send in a patch to change iscsi_offload_mesg() to use GFP_ATOMIC?
Re: [PATCH] cnic: Fix ISCSI_KEVENT_IF_DOWN message handling.
Michael Chan wrote:
> Mike Christie wrote:
>> I don't think you can call this with the cnic_dev_lock held. Read locks
>> have the same sleeping restrictions as a spin_lock, right? If so, the
>> problem is that iscsi_nl_send_msg calls iscsi_offload_mesg, which uses
>> GFP_NOIO and can sleep.
>
> In that case, can I send in a patch to change iscsi_offload_mesg() to
> use GFP_ATOMIC?

Yes, I guess so.
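For anyone following along, here is a rough userspace model of the rule being discussed: an allocation that may sleep must not happen while a spinning lock (spin_lock, or read_lock on an rwlock) is held. This is only a sketch under stated assumptions. Every model_* name is hypothetical and exists purely for illustration; none of this is cnic or libiscsi code, and the "atomic depth" counter is just a stand-in for what the kernel tracks internally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two allocation modes named in the thread. */
enum model_gfp {
    MODEL_GFP_ATOMIC, /* never sleeps */
    MODEL_GFP_NOIO,   /* may sleep, but will not start I/O */
};

/* Hypothetical execution context: counts spinning locks currently held. */
struct model_ctx {
    int atomic_depth;
};

void model_read_lock(struct model_ctx *ctx)
{
    ctx->atomic_depth++;
}

void model_read_unlock(struct model_ctx *ctx)
{
    assert(ctx->atomic_depth > 0);
    ctx->atomic_depth--;
}

bool model_gfp_may_sleep(enum model_gfp gfp)
{
    return gfp == MODEL_GFP_NOIO;
}

/*
 * Returns 0 if an allocation with these flags is legal in the current
 * context, -1 if it could sleep while a spinning lock is held.
 */
int model_alloc_check(const struct model_ctx *ctx, enum model_gfp gfp)
{
    if (ctx->atomic_depth > 0 && model_gfp_may_sleep(gfp))
        return -1;
    return 0;
}

/*
 * Same shape as the cnic_uio_stop() hunk: the message (and therefore its
 * allocation) is produced while the device-list lock is held.
 */
int model_send_if_down_locked(struct model_ctx *ctx, enum model_gfp gfp)
{
    int ret;

    model_read_lock(ctx);
    ret = model_alloc_check(ctx, gfp);
    model_read_unlock(ctx);
    return ret;
}
```

With the lock held, the MODEL_GFP_NOIO case is flagged and the MODEL_GFP_ATOMIC case is not, which is the reasoning behind the GFP_ATOMIC suggestion. Sending the messages after dropping the lock would be another way out, but that is not what was proposed in this thread.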
Re: LUN Reset TMF and R2T
Steven Hayter wrote:
> On 28/07/2009 06:14 pm, Mike Christie wrote:
>> On 07/28/2009 06:53 AM, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> when running my device-reset testcase I've come across this:
>>>
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_eh_device_reset LU Reset [sc 8800731e9480 lun 6]
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set timeout
>>> Jul 28 12:46:08 tyne kernel: session1: mgmtpdu [op 0x2 hdr->itt 0x69 datalen 0]
>>> Jul 28 12:46:08 tyne kernel: connection1:0: mgmtpdu [itt 0x69 task 88007b022800] xmit
>>> Jul 28 12:46:08 tyne kernel: connection1:0: tmf rsp [itt 0x69] response 0 state 1
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_suspend_tx suspend Tx
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc 88006fd20380 itt 0x54 state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc 88006fd20380 lun 6 itt x54] state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc 88007119b880 itt 0x5d state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc 88007119b880 lun 6 itt x5d] state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc 88007116ec80 itt 0x60 state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc 88007116ec80 lun 6 itt x60] state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_tasks failing sc 880079dd8180 itt 0x61 state 3
>>> Jul 28 12:46:08 tyne kernel: session1: fail_scsi_task fail task [sc 880079dd8180 lun 6 itt x61] state 3
>>> Jul 28 12:46:08 tyne kernel: connection1:0: invalid itt 0x5d in R2T hdr
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_start_tx resume Tx
>>> Jul 28 12:46:08 tyne kernel: session1: iscsi_eh_device_reset dev reset result = SUCCESS
>>> Jul 28 12:46:08 tyne kernel: connection1:0: invalid itt 0x60 in R2T hdr
>>> Jul 28 12:46:08 tyne kernel: connection1:0: invalid itt 0x61 in R2T hdr
>>>
>>> As you can see, we're receiving R2Ts for tasks we've just aborted :-(
>>>
>>> Looking closely, I don't _actually_ think we've received them
>>> out-of-order (which would be a violation of the RFC). The problem
>>> seems to be our skb handling (again): We're reading an skb, and call
>>> the handler function once the PDU is ready. However, we're _not_
>>> checking if there is more data to be read from the socket. So it looks
>>> to me as if we're first reading the TMF response, aborting all tasks,
>>> and then continue reading PDUs for tasks which we just aborted.
>>
>> We will definitely do this.
>>
>> You mean the target sends a tmf response that indicates it cleaned up
>> some tasks, then it sends pdus for the tasks that should have been
>> affected by the TMF, right? If so I do not think targets are allowed to
>> do this. In 3.5.1.4 we have:
>>
>> "After the Task Management response indicates Task Management function
>> completion, the initiator will not receive any additional responses
>> from the affected tasks."
>>
>> "additional responses" means scsi response pdus and data-in with
>> status, right? Does it also mean R2Ts? I thought it did, so we will
>> just drop the session when getting all those pdus we thought the target
>> should not be sending.
>>
>> If "additional responses" does not mean R2Ts, then what are we supposed
>> to do? Handle them? Silently drop them? I could not find anything in
>> the RFC.
>>
>> The nasty problem with the code and this scenario is that we
>> preallocate the tasks and itts. Once iscsi_eh_device_reset returns
>> SUCCESS and cleans up the tasks, the scsi layer can start sending us
>> commands. We could then allocate a task/itt that was used before and
>> should have been cleaned up. The target could then send us pdus for the
>> cleaned up task/itt while we are using the task/itt for a new command.
>> Then Kablewly.
>
> It does look confusing. I think RFC 5048, Section 4.1.2, "Clarified
> Multi-Task Abort Semantics", gives guidelines as to what should happen.
> Every way I read it, the target shouldn't be sending R2Ts for tasks
> which are part of the affected task set (those equal or exceeding the
> CmdSN of the reset TMF). But I've been wrong in the past.

Nevermind, found the reason. Totally different story, but we've been the culprit nevertheless. iscsi_xmit_task() runs in a loop, disregarding any TMF state. So we will happily continue sending R2T transfers even though the LU Reset has already finished.

Patch to follow.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries Storage
h...@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
[PATCH 0/2] Update TMF handling
Hi all,

these two patches update the TMF handling to make it more efficient and less error prone.

The first patch is just a minor tweak to allow new TMF tasks as soon as we've received a response for the pending one. Reasoning here is that e.g. a LUN Reset might take quite a while to abort all outstanding tasks, during which time we cannot send any other LUN Reset, even to another LUN. So obviously, allowing another LUN Reset here is the right thing to do. And even if we were sending a LUN Reset to this LUN we wouldn't do any harm, as the SCSI command abort is protected by a lock, so nothing will happen here for consecutive LUN Resets. And of course we're observing the error recovery hierarchy, so an ABORT TASK will be rejected if a LUN Reset is in progress.

The second patch is the more important one, as it fixes an error during LUN Reset handling in the initiator. When sending a LUN Reset during an ongoing R2T transfer, we're suspending Tx and aborting all _SCSI_ tasks. However, once we're done there we're resuming Tx and the R2T transfer will happily continue. So we should rather be checking for ongoing TMF tasks in iscsi_task_xmit and terminate the I/O if the TMF affects the task. Note we're not actually interested in the outcome of the TMF task, as the I/O will be stopped anyway even if the TMF task fails.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries Storage
h...@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
[PATCH 2/2] libiscsi: check for pending TMF during task xmit
iscsi_tcp_task_xmit() doesn't check for pending TMF tasks, so we might
happily continue sending R2T data even though we've already aborted the
command.

Signed-off-by: Hannes Reinecke h...@suse.de
---
 drivers/scsi/libiscsi_tcp.c | 29 +
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 2e0746d..83ddb44 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -1000,6 +1000,30 @@ static struct iscsi_r2t_info *iscsi_tcp_get_curr_r2t(struct iscsi_task *task)
         return r2t;
 }

+static int iscsi_tcp_check_tmf_task(struct iscsi_task *task)
+{
+        struct iscsi_conn *conn = task->conn;
+        struct iscsi_tm *hdr = &conn->tmhdr;
+        unsigned int hdr_lun, task_lun;
+
+        if (hdr->opcode != (ISCSI_OP_SCSI_TMFUNC | ISCSI_OP_IMMEDIATE))
+                return FAILED;
+
+        /* Check for matching LUN */
+        hdr_lun = scsilun_to_int((struct scsi_lun *)hdr->lun);
+        task_lun = scsilun_to_int((struct scsi_lun *)task->lun);
+        if (hdr_lun != task_lun)
+                return FAILED;
+
+        /* Check for matching task */
+        if (ISCSI_TM_FUNC_VALUE(hdr) == ISCSI_TM_FUNC_ABORT_TASK) {
+                if (task->cmdsn != hdr->refcmdsn)
+                        return FAILED;
+        }
+
+        return SUCCESS;
+}
+
 /**
  * iscsi_tcp_task_xmit - xmit normal PDU task
  * @task: iscsi command task
@@ -1032,6 +1056,11 @@ flush:
         if (task->sc->sc_data_direction != DMA_TO_DEVICE)
                 return 0;

+        /* Check for pending TMF */
+        if (conn->tmf_state != TMF_INITIAL &&
+            iscsi_tcp_check_tmf_task(task) == SUCCESS)
+                return 0;
+
         r2t = iscsi_tcp_get_curr_r2t(task);
         if (r2t == NULL) {
                 /* Waiting for more R2Ts to arrive. */
--
1.6.0.2
Re: [PATCH 0/2] Update TMF handling
> The second patch is the more important one, as it fixes an error during
> LUN Reset handling in the initiator. When sending a LUN Reset during an
> ongoing R2T transfer, we're suspending Tx and aborting all _SCSI_
> tasks. However, once we're done there we're resuming Tx and the R2T
> transfer will happily continue. So we should rather be

This should not be happening. When iscsi_suspend_tx returns, the tx thread has stopped, so we know there are no users accessing the task (well, there could be if a target is sending a tmf response then a r2t, but if the target is following the rfc there should not be). So when fail_scsi_tasks calls fail_scsi_task -> iscsi_complete_task (this will cleanup conn->task if this is the same task) -> __iscsi_put_task, this should be the last put on the task, and that should release it, calling iscsi_free_task, which should call cleanup_task to kill any pending r2t handling, and it would remove it from the requeue list.

If we are sending a data-out for a task that has had fail_scsi_task -> iscsi_complete_task -> __iscsi_put_task called for it, then we are in bigger trouble, because the last put should have been called on it and we are accessing a bad task.
Re: [PATCH 0/2] Update TMF handling
Mike Christie wrote:
> This should not be happening. When iscsi_suspend_tx returns, the tx
> thread has stopped, so we know there are no users accessing the task
> (well, there could be if a target is sending a tmf response then a r2t,
> but if the target is following the rfc there should not be). So when
> fail_scsi_tasks calls fail_scsi_task -> iscsi_complete_task (this will
> cleanup conn->task if this is the same task) -> __iscsi_put_task, this
> should be the last put on the task, and that should release it, calling
> iscsi_free_task, which should call cleanup_task to kill any pending r2t
> handling, and it would remove it from the requeue list.
>
> If we are sending a data-out for a task that has had fail_scsi_task ->
> iscsi_complete_task -> __iscsi_put_task called for it, then we are in
> bigger trouble, because the last put should have been called on it and
> we are accessing a bad task.

I fully agree, this is something which shouldn't happen. However, using this patch stops me from receiving invalid R2T PDUs. So I can't be that far off the mark.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries Storage
h...@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
Re: [PATCH] libiscsi: Update queuecommand status return codes
Hannes Reinecke wrote:
> For multipathing we should ensure to return a DID_TRANSPORT_XX result
> code whenever applicable; this will ensure a fast failover to other
> paths if this one is temporarily out of order.
>
> Signed-off-by: Hannes Reinecke h...@suse.de
> ---
>  drivers/scsi/libiscsi.c | 11 ---
>  1 files changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 716cc34..fc10544 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1429,12 +1429,6 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
>          session = cls_session->dd_data;
>          spin_lock(&session->lock);
>
> -        reason = iscsi_session_chkready(cls_session);
> -        if (reason) {
> -                sc->result = reason;
> -                goto fault;
> -        }
> -
>          if (session->state != ISCSI_STATE_LOGGED_IN) {
>                  /*
>                   * to handle the race between when we set the recovery state
> @@ -1444,6 +1438,9 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
>                   */
>                  switch (session->state) {
>                  case ISCSI_STATE_FAILED:
> +                        reason = FAILURE_SESSION_FAILED;
> +                        sc->result = DID_TRANSPORT_DISRUPTED << 16;
> +                        break;

This probably speeds up the failover time by accident, because the retries/allowed counter/check hits zero before the replacement_timeout/recovery_timeout (fast io fail in fc class terms) timer has fired.

The session can be in the failed state for a couple of seconds while we transition the sdevs to blocked. During this time we do not want the retries to be decremented.

>                  case ISCSI_STATE_IN_RECOVERY:
>                          reason = FAILURE_SESSION_IN_RECOVERY;
>                          sc->result = DID_IMM_RETRY << 16;
> @@ -1462,7 +1459,7 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *))
>                          break;
>                  default:
>                          reason = FAILURE_SESSION_FREED;
> -                        sc->result = DID_NO_CONNECT << 16;
> +                        sc->result = DID_TRANSPORT_FAILFAST << 16;

I am not sure why you are changing this one. When are you hitting it? What is the session->state?

>                  }
>                  goto fault;
>          }
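As a side note on the "DID_xxx << 16" lines in the hunks above: the host byte sits in bits 16-23 of the SCSI result word, so the shift is what places the DID_* code there. A minimal sketch, assuming the host byte values below match include/scsi/scsi.h of that era (that is how I remember them, so please double check against your tree). The model_* helpers are made-up names for illustration, not kernel functions.

```c
#include <stdint.h>

/* Host byte codes; values assumed to match the kernel's include/scsi/scsi.h. */
#define DID_NO_CONNECT          0x01
#define DID_IMM_RETRY           0x0c
#define DID_TRANSPORT_DISRUPTED 0x0e
#define DID_TRANSPORT_FAILFAST  0x0f

/* Hypothetical helper: build a result word that carries only a host byte. */
uint32_t model_make_result(uint8_t host)
{
    return (uint32_t)host << 16;
}

/* Hypothetical helper: extract the host byte from a result word. */
uint8_t model_host_byte(uint32_t result)
{
    return (uint8_t)((result >> 16) & 0xff);
}
```

So the ISCSI_STATE_FAILED case in the patch would complete the command with a result of 0x000e0000, and the proposed default case with 0x000f0000 instead of 0x00010000.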
Re: [PATCH 0/2] Update TMF handling
Hannes Reinecke wrote:
> This is the log I'm getting:
>
> Jul 29 10:34:48 tyne kernel: session1: iscsi_eh_device_reset LU Reset [sc 88007b94d080 lun 6]
> Jul 29 10:34:48 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set timeout
> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x3a lun 6 abort transfer
> Jul 29 10:34:48 tyne kernel: session1: mgmtpdu [op 0x2 hdr->itt 0x5d datalen 0]
> Jul 29 10:34:48 tyne kernel: connection1:0: mgmtpdu [itt 0x5d task 88007a01fc00] xmit
> Jul 29 10:34:48 tyne kernel: connection1:0: tmf rsp [itt 0x5d] response 0 state 1
> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x72 lun 6 abort transfer
> Jul 29 10:34:48 tyne kernel: session1: iscsi_suspend_tx suspend Tx
> Jul 29 10:34:48 tyne kernel: session1: iscsi_complete_task task itt 0x72 sc 88007b5bc580 still active
> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x57 lun 6 abort transfer
> Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x59 lun 6 abort transfer
> Jul 29 10:34:48 tyne kernel: session1: Tx suspended!
>
> So we indeed would have continued the R2T task (itt 0x57 and itt 0x59)
> even though we've already received a valid TMF response. So I'm afraid
> it's us ...

Ah, I misunderstood you. I do not think it has to do with the cleanup still leaving r2ts. I am not sure where you are putting printks, but I think it is this:

while (!list_empty(&conn->requeue)) {
        if (conn->session->fast_abort && conn->tmf_state != TMF_INITIAL)
                break;

Once the tmf completes, we will start sending data again.

This sort of lines up with where I think you put your printks. Is "iscsi_suspend_tx suspend Tx" getting printed out before or after the flush_workqueue?

> I really do think part of the problem is that we're setting the SUSPEND
> bit without holding the session lock. We _check_ it under the lock in
> iscsi_xmit(), but setting is done without the lock. Which of course
> causes all sorts of race conditions.

Yeah, we use atomics to set it, then just do an if (conn->suspend_tx) under the lock to test it.
Re: [PATCH 0/2] Update TMF handling
Mike Christie wrote:
> Ah, I misunderstood you. I do not think it has to do with the cleanup
> still leaving r2ts. I am not sure where you are putting printks, but I
> think it is this:
>
> while (!list_empty(&conn->requeue)) {
>         if (conn->session->fast_abort && conn->tmf_state != TMF_INITIAL)
>                 break;
>
> Once the tmf completes, we will start sending data again.

Ooops. I am too sleepy. Ignore that. I am wrong there.
Re: [PATCH 0/2] Update TMF handling
Mike Christie wrote:
> Mike Christie wrote:
>> while (!list_empty(&conn->requeue)) {
>>         if (conn->session->fast_abort && conn->tmf_state != TMF_INITIAL)
>>                 break;
>>
>> Once the tmf completes, we will start sending data again.
>
> Ooops. I am too sleepy. Ignore that. I am wrong there.
>
> I guess if fast_abort is 0 though, we will hit this problem. And we
> will send data-outs when getting tmf responses as well as when we are
> sending the tmf.
>
> I think the problem is wording like in 10.5.1:
>
> "For ABORT TASK SET and CLEAR TASK SET, the issuing initiator MUST
> continue to respond to all valid target transfer tags (received via
> R2T, Text Response, NOP-In, or SCSI Data-In PDUs) related to the
> affected task set, even after issuing the task management request."
>
> I think in some other doc (probably the one Mathew and Ulrich
> mentioned) there is wording about doing similar for abort and lu
> resets. The thing is that I think half of targets want us to respond to
> r2ts and half do not. This is where the fast_abort comes from. If set
> then we reply to r2ts and if not set we do not. I think once we get a
> successful

Fudge. I am really going to bed now. I mean if it is set we do not reply to r2ts. If not set then we reply.
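To pin down the corrected fast_abort semantics from the message above (set: stop answering R2Ts while a TMF is outstanding; not set: keep answering), here is the same rule written as a tiny standalone function. It mirrors the condition in the quoted requeue loop, but it is only a sketch: the enum and function names are hypothetical, and the real tmf_state has more values than the three modelled here.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for conn->tmf_state. */
enum model_tmf_state {
    MODEL_TMF_INITIAL, /* no TMF outstanding */
    MODEL_TMF_QUEUED,  /* TMF sent, no response yet */
    MODEL_TMF_SUCCESS, /* TMF response received */
};

/*
 * Returns true if the initiator should keep answering R2Ts with Data-Out
 * PDUs. With fast_abort set, any outstanding TMF stops the data-outs;
 * with fast_abort clear, R2Ts are answered regardless of TMF state.
 */
bool model_keep_answering_r2t(bool fast_abort, enum model_tmf_state tmf_state)
{
    if (tmf_state == MODEL_TMF_INITIAL)
        return true;
    return !fast_abort;
}
```

This is why the fast_abort=0 case still sends data-outs both while the TMF is in flight and after its response arrives.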
Re: [PATCH 2/2] libiscsi: check for pending TMF during task xmit
Hannes Reinecke wrote:

iscsi_tcp_task_xmit() doesn't check for pending TMF tasks, so we might happily continue sending R2T data even though we've already aborted the command.

Signed-off-by: Hannes Reinecke h...@suse.de

The patch is better than how we stop all r2t processing right now so even if the problem is just us not checking the suspend bit right, I think this patch makes a nice improvement. Some comments.

---
 drivers/scsi/libiscsi_tcp.c | 29 +
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 2e0746d..83ddb44 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -1000,6 +1000,30 @@ static struct iscsi_r2t_info *iscsi_tcp_get_curr_r2t(struct iscsi_task *task)
 	return r2t;
 }
 
+static int iscsi_tcp_check_tmf_task(struct iscsi_task *task)
+{
+	struct iscsi_conn *conn = task->conn;
+	struct iscsi_tm *hdr = &conn->tmhdr;
+	unsigned int hdr_lun, task_lun;
+
+	if (hdr->opcode != (ISCSI_OP_SCSI_TMFUNC | ISCSI_OP_IMMEDIATE))

Could you just mask the opcode off and not assume other bits are set or not set?

	if ((hdr->opcode & ISCSI_OPCODE_MASK) == ISCSI_OP_SCSI_TMFUNC)

+		return FAILED;

Could you not reuse the scsi eh return values here. Just do 0 and a EXXX value.

+
+	/* Check for matching LUN */
+	hdr_lun = scsilun_to_int((struct scsi_lun *)hdr->lun);
+	task_lun = scsilun_to_int((struct scsi_lun *)task->lun);
+	if (hdr_lun != task_lun)
+		return FAILED;
+
+	/* Check for matching task */
+	if (ISCSI_TM_FUNC_VALUE(hdr) == ISCSI_TM_FUNC_ABORT_TASK) {
+		if (task->cmdsn != hdr->refcmdsn)
+			return FAILED;
+	}
+
+	return SUCCESS;
+}
+
 /**
  * iscsi_tcp_task_xmit - xmit normal PDU task
  * @task: iscsi command task
@@ -1032,6 +1056,11 @@ flush:
 	if (task->sc->sc_data_direction != DMA_TO_DEVICE)
 		return 0;
 
+	/* Check for pending TMF */

Could you add a check if fast_abort is set then do this? If it is not set, it means targets want us to respond to r2ts while the tmf is in flight.

+	if (conn->tmf_state != TMF_INITIAL &&
+	    iscsi_tcp_check_tmf_task(task) == SUCCESS)
+		return 0;
+
 	r2t = iscsi_tcp_get_curr_r2t(task);
 	if (r2t == NULL) {
 		/* Waiting for more R2Ts to arrive. */
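The three review comments (mask the opcode, return 0 or a negative errno instead of the SCSI EH codes, and only skip Data-Outs when fast_abort is set) could be combined roughly as below. This is a user-space sketch, not a revised patch: the structs, the constant names without the ISCSI_ prefix, and the function names tmf_covers_task() and skip_data_out() are invented for illustration, and the constant values are assumptions.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative constants; the real ones live in the kernel's iSCSI headers. */
#define OPCODE_MASK        0x3f
#define OP_IMMEDIATE       0x40
#define OP_SCSI_TMFUNC     0x02
#define TM_FUNC_ABORT_TASK 1

/* Simplified stand-ins for the TMF header and the task. */
struct tm_hdr { uint8_t opcode; uint8_t func; unsigned int lun; uint32_t refcmdsn; };
struct task   { unsigned int lun; uint32_t cmdsn; };

/* Returns 0 if the pending TMF covers this task, -EINVAL otherwise. */
int tmf_covers_task(const struct tm_hdr *hdr, const struct task *task)
{
	/* Mask the opcode so the immediate bit does not matter. */
	if ((hdr->opcode & OPCODE_MASK) != OP_SCSI_TMFUNC)
		return -EINVAL;
	if (hdr->lun != task->lun)
		return -EINVAL;
	/* ABORT TASK only affects the one referenced command. */
	if (hdr->func == TM_FUNC_ABORT_TASK && hdr->refcmdsn != task->cmdsn)
		return -EINVAL;
	return 0;
}

/* Nonzero if the Data-Out for this task should be skipped. */
int skip_data_out(int fast_abort, int tmf_pending,
		  const struct tm_hdr *hdr, const struct task *task)
{
	return fast_abort && tmf_pending && tmf_covers_task(hdr, task) == 0;
}
```

Keeping the match test separate from the fast_abort gate means targets that expect R2Ts to be answered during a TMF are unaffected.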
Re: [PATCH 0/2] Update TMF handling
Mike Christie wrote: Mike Christie wrote: Mike Christie wrote: Mike Christie wrote: Mike Christie wrote: Hannes Reinecke wrote: Mike Christie wrote:

The second patch is the more important one, as it fixes an error during LUN Reset handling in the initiator. When sending a LUN Reset during an ongoing R2T transfer, we're suspending Tx and aborting all _SCSI_ tasks. However, once we're done there we're resuming Tx and the R2T transfer will happily continue. So we should rather be

This should not be happening. When iscsi_suspend_tx returns the tx thread has stopped so we know there are no users accessing the task (well, there could be if a target is sending a tmf response then a r2t, but if the target is following the rfc there should not be). So when fail_scsi_tasks calls fail_scsi_task -> iscsi_complete_task (this will cleanup conn->task if this is the same task) -> __iscsi_put_task this should be the last put on the task and that should release it calling iscsi_free_task which should call cleanup_task to kill any pending r2t handling and it would remove it from the requeue list. If we are sending a data-out for a task that has had fail_scsi_task -> iscsi_complete_task -> __iscsi_put_task called for it then we are in bigger trouble because the last put should have been called on it and we are accessing a bad task.
This is the log I'm getting:

Jul 29 10:34:48 tyne kernel: session1: iscsi_eh_device_reset LU Reset [sc 88007b94d080 lun 6]
Jul 29 10:34:48 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set timeout
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x3a lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: session1: mgmtpdu [op 0x2 hdr-itt 0x5d datalen 0]
Jul 29 10:34:48 tyne kernel: connection1:0: mgmtpdu [itt 0x5d task 88007a01fc00] xmit
Jul 29 10:34:48 tyne kernel: connection1:0: tmf rsp [itt 0x5d] response 0 state 1
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x72 lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: session1: iscsi_suspend_tx suspend Tx
Jul 29 10:34:48 tyne kernel: session1: iscsi_complete_task task itt 0x72 sc 88007b5bc580 still active
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x57 lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x59 lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: session1: Tx suspended!

So we indeed would have continued the R2T task (itt 0x57 and itt 0x59) even though we've already received a valid TMF response. So I'm afraid it's us ...

Ah, I misunderstood you. I do not think it has to do with the cleanup still leaving r2ts. I am not sure where you are putting printks, but I think it is this:

	while (!list_empty(&conn->requeue)) {
		if (conn->session->fast_abort && conn->tmf_state != TMF_INITIAL)
			break;

Once the tmf completes, we will start sending data again.

Ooops. I am too sleepy. Ignore that. I am wrong there. I guess if fast_abort is 0 though, we will hit this problem. And we will send data-outs when getting tmf responses as well as when we are sending the tmf.

I think the problem is wording like in 10.5.1:

For ABORT TASK SET and CLEAR TASK SET, the issuing initiator MUST continue to respond to all valid target transfer tags (received via R2T, Text Response, NOP-In, or SCSI Data-In PDUs) related to the affected task set, even after issuing the task management request.
I think in some other doc (probably the one Mathew and Ulrich mentioned) there is wording about doing similar for abort and lu resets. The thing is that I think half of the targets want us to respond to r2ts and half do not. This is where fast_abort comes from. If set then we reply to r2ts and if not set we do not. I think once we get a successful

Fudge. I am really going to bed now. I mean if it is set we do not reply to r2ts. If not set then we reply.

Actually, I think it's a race condition:

drivers/scsi/libiscsi.c:iscsi_eh_device_reset()

	rc = SUCCESS;
	spin_unlock_bh(&session->lock);

	iscsi_suspend_tx(conn);

So the workqueue thread could wedge in after we've unlocked the session lock and start sending data even though we're meant to suspend transmitting here. Will be trying it.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries Storage
h...@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
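The window Hannes describes can be modelled in a few lines. What follows is a deliberately single-threaded sketch: the call to worker_run() stands for the Tx workqueue being scheduled at the worst possible moment, and struct conn, reset_racy() and reset_fixed() are hypothetical names. Setting the suspend flag before dropping the lock is just one possible way to close the window, not necessarily the change that was tried.

```c
/* Simplified connection state; both fields are stand-ins. */
struct conn {
	int suspended;  /* nonzero once Tx must stop */
	int pdus_sent;  /* Data-Out PDUs sent by the worker */
};

/* Tx worker: sends one PDU unless Tx has been suspended. */
static void worker_run(struct conn *c)
{
	if (!c->suspended)
		c->pdus_sent++;
}

/* Order from the thread: unlock first, suspend afterwards. */
int reset_racy(struct conn *c)
{
	/* session lock dropped here */
	worker_run(c);        /* worker wedges in before the suspend */
	c->suspended = 1;     /* too late, a PDU already went out */
	return c->pdus_sent;
}

/* Possible alternative: mark Tx suspended before the lock is dropped. */
int reset_fixed(struct conn *c)
{
	c->suspended = 1;
	/* session lock dropped here */
	worker_run(c);        /* worker sees the flag and sends nothing */
	return c->pdus_sent;
}
```

In the racy order the model sends one PDU after the reset has been decided; in the other order it sends none.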
Re: [PATCH 0/2] Update TMF handling
Hannes Reinecke wrote: Mike Christie wrote: Mike Christie wrote: Mike Christie wrote: Mike Christie wrote: Mike Christie wrote: Hannes Reinecke wrote: Mike Christie wrote:

The second patch is the more important one, as it fixes an error during LUN Reset handling in the initiator. When sending a LUN Reset during an ongoing R2T transfer, we're suspending Tx and aborting all _SCSI_ tasks. However, once we're done there we're resuming Tx and the R2T transfer will happily continue. So we should rather be

This should not be happening. When iscsi_suspend_tx returns the tx thread has stopped so we know there are no users accessing the task (well, there could be if a target is sending a tmf response then a r2t, but if the target is following the rfc there should not be). So when fail_scsi_tasks calls fail_scsi_task -> iscsi_complete_task (this will cleanup conn->task if this is the same task) -> __iscsi_put_task this should be the last put on the task and that should release it calling iscsi_free_task which should call cleanup_task to kill any pending r2t handling and it would remove it from the requeue list. If we are sending a data-out for a task that has had fail_scsi_task -> iscsi_complete_task -> __iscsi_put_task called for it then we are in bigger trouble because the last put should have been called on it and we are accessing a bad task.
This is the log I'm getting:

Jul 29 10:34:48 tyne kernel: session1: iscsi_eh_device_reset LU Reset [sc 88007b94d080 lun 6]
Jul 29 10:34:48 tyne kernel: session1: iscsi_exec_task_mgmt_fn tmf set timeout
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x3a lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: session1: mgmtpdu [op 0x2 hdr-itt 0x5d datalen 0]
Jul 29 10:34:48 tyne kernel: connection1:0: mgmtpdu [itt 0x5d task 88007a01fc00] xmit
Jul 29 10:34:48 tyne kernel: connection1:0: tmf rsp [itt 0x5d] response 0 state 1
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x72 lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: session1: iscsi_suspend_tx suspend Tx
Jul 29 10:34:48 tyne kernel: session1: iscsi_complete_task task itt 0x72 sc 88007b5bc580 still active
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x57 lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: connection1:0: task itt 0x59 lun 6 abort transfer
Jul 29 10:34:48 tyne kernel: session1: Tx suspended!

So we indeed would have continued the R2T task (itt 0x57 and itt 0x59) even though we've already received a valid TMF response. So I'm afraid it's us ...

Ah, I misunderstood you. I do not think it has to do with the cleanup still leaving r2ts. I am not sure where you are putting printks, but I think it is this:

	while (!list_empty(&conn->requeue)) {
		if (conn->session->fast_abort && conn->tmf_state != TMF_INITIAL)
			break;

Once the tmf completes, we will start sending data again.

Ooops. I am too sleepy. Ignore that. I am wrong there. I guess if fast_abort is 0 though, we will hit this problem. And we will send data-outs when getting tmf responses as well as when we are sending the tmf.

I think the problem is wording like in 10.5.1:

For ABORT TASK SET and CLEAR TASK SET, the issuing initiator MUST continue to respond to all valid target transfer tags (received via R2T, Text Response, NOP-In, or SCSI Data-In PDUs) related to the affected task set, even after issuing the task management request.
I think in some other doc (probably the one Mathew and Ulrich mentioned) there is wording about doing similar for abort and lu resets. The thing is that I think half of the targets want us to respond to r2ts and half do not. This is where fast_abort comes from. If set then we reply to r2ts and if not set we do not. I think once we get a successful

Fudge. I am really going to bed now. I mean if it is set we do not reply to r2ts. If not set then we reply.

Actually, I think it's a race condition:

drivers/scsi/libiscsi.c:iscsi_eh_device_reset()

	rc = SUCCESS;
	spin_unlock_bh(&session->lock);

	iscsi_suspend_tx(conn);

So the workqueue thread could wedge in after we've unlocked the session lock and start sending data even though we're meant to suspend transmitting here. Will be trying it.

U, you are right. And we are probably hitting this:

	/* process pending command queue */
	while (!list_empty(&conn->cmdqueue)) {
		if (conn->tmf_state == TMF_QUEUED)
			break;

Once the tmf completes we start sending new commands, because the tmf_state changes. But then if the tmf had cmdsn 10 and that completes then we start sending new commands above (tmf_state == TMF_SUCCESS), iscsi_eh_device_reset will cleanup cmds with sn less than 10 and will cleanup cmds with sn higher than 10. We clean everything up. But cmds with cmdsn 11 are ok and
Kernel panic -not syncing :Fatal Exception ---Process istoid1
I am trying to configure the Linux VTL (open iscsi) using the VMware iSCSI initiator from an ESXi box. It does get configured, but the Linux VTL goes into a panic and freezes when the ESXi starts scanning for devices. Any suggestions please.

The Kernel panic error message is on the Process istoid1 0Kernel panic -not syncing :Fatal Exception

Regards
Raj
Re: Kernel panic -not syncing :Fatal Exception ---Process istoid1
On Wednesday, 29.07.2009, at 19:14 +0530, Raj wrote:

Screenshot of the error message attached:

On Wed, Jul 29, 2009 at 7:11 PM, Raj rajeevman...@gmail.com wrote:

I am trying to configure the Linux VTL (open iscsi) using the VMware iSCSI initiator from an ESXi box. It does get configured, but the Linux VTL goes into a panic and freezes when the ESXi starts scanning for devices. Any suggestions please.

The Kernel panic error message is on the Process istoid1 0Kernel panic -not syncing :Fatal Exception

Regards
Raj

This problem does not seem to be related to the open-iscsi initiator. What is this Linux VTL? Do you have any links?

Thanks,
Arne
Re: [PATCH] iscsi: Use GFP_ATOMIC in iscsi_offload_mesg().
On 07/29/2009 01:49 PM, Michael Chan wrote:

Changing to GFP_ATOMIC because the only caller in cnic/bnx2i may be calling this function while holding spin_lock. This problem was discovered by Mike Christie.

Signed-off-by: Michael Chan mc...@broadcom.com
---
 drivers/scsi/scsi_transport_iscsi.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 783e33c..b47240c 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -990,7 +990,7 @@ int iscsi_offload_mesg(struct Scsi_Host *shost,
 	struct iscsi_uevent *ev;
 	int len = NLMSG_SPACE(sizeof(*ev) + data_size);
 
-	skb = alloc_skb(len, GFP_NOIO);
+	skb = alloc_skb(len, GFP_ATOMIC);
 	if (!skb) {
 		printk(KERN_ERR "can not deliver iscsi offload message:OOM\n");
 		return -ENOMEM;
@@ -1012,7 +1012,7 @@ int iscsi_offload_mesg(struct Scsi_Host *shost,
 
 	memcpy((char *)ev + sizeof(*ev), data, data_size);
 
-	return iscsi_multicast_skb(skb, ISCSI_NL_GRP_UIP, GFP_NOIO);
+	return iscsi_multicast_skb(skb, ISCSI_NL_GRP_UIP, GFP_ATOMIC);
 }
 EXPORT_SYMBOL_GPL(iscsi_offload_mesg);

Using GFP_NOIO and changing the locking is my preference normally, but if the locking changes are going to be a problem, then this is ok with me since we can fail allocations in other parts of the code.

Acked-by: Mike Christie micha...@cs.wisc.edu

Dave Miller probably wants to take this in his tree since it fixes a bug with this patch
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=6d7760a88c25057c2c2243e5dfe2d731064bd31d
Re: [PATCH] Add logging to scsi_transport_iscsi.c
On 07/26/2009 08:48 AM, Erez Zilber wrote:

I've attached a new version. I hope it's better. Whenever possible, there's a dbg statement before and after. For example, if we free the conn object, I can't put a dbg call after it (because conn is already NULL). If you still see specific things that need to be fixed, let me know.

Thanks for the work on this. How about the attached.

- I added a : between the function name and debug output.
- Removed some extra newlines
- Tried to add dbg statements at the top and end of functions that can take a long time or fail in odd ways because they call into the scsi layer like the scanning, blocking, target removal, etc. For functions like allocation, adding, destroying and freeing I tried to just add a dbg statement at the top or end of the function.

The patch was made over the linux-2.6-iscsi tree iscsi branch.

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index b47240c..5d765f5 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -36,6 +36,38 @@
 
 #define ISCSI_TRANSPORT_VERSION "2.0-870"
 
+static int dbg_session;
+module_param_named(debug_session, dbg_session, int,
+		   S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(debug_session,
+		 "Turn on debugging for sessions in scsi_transport_iscsi "
+		 "module. Set to 1 to turn on, and zero to turn off. Default "
+		 "is off.");
+
+static int dbg_conn;
+module_param_named(debug_conn, dbg_conn, int,
+		   S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(debug_conn,
+		 "Turn on debugging for connections in scsi_transport_iscsi "
+		 "module. Set to 1 to turn on, and zero to turn off. Default "
+		 "is off.");
+
+#define ISCSI_DBG_TRANS_SESSION(_session, dbg_fmt, arg...)		\
+	do {								\
+		if (dbg_session)					\
+			iscsi_cls_session_printk(KERN_INFO, _session,	\
+						 "%s: " dbg_fmt,	\
+						 __func__, ##arg);	\
+	} while (0);
+
+#define ISCSI_DBG_TRANS_CONN(_conn, dbg_fmt, arg...)			\
+	do {								\
+		if (dbg_conn)						\
+			iscsi_cls_conn_printk(KERN_INFO, _conn,		\
+					      "%s: " dbg_fmt,		\
+					      __func__, ##arg);		\
+	} while (0);
+
 struct iscsi_internal {
 	struct scsi_transport_template t;
 	struct iscsi_transport *iscsi_transport;
@@ -377,6 +409,7 @@ static void iscsi_session_release(struct device *dev)
 
 	shost = iscsi_session_to_shost(session);
 	scsi_host_put(shost);
+	ISCSI_DBG_TRANS_SESSION(session, "Completing session release\n");
 	kfree(session);
 }
 
@@ -441,6 +474,9 @@ static int iscsi_user_scan_session(struct device *dev, void *data)
 		return 0;
 
 	session = iscsi_dev_to_session(dev);
+
+	ISCSI_DBG_TRANS_SESSION(session, "Scanning session\n");
+
 	shost = iscsi_session_to_shost(session);
 	ihost = shost->shost_data;
 
@@ -448,8 +484,7 @@ static int iscsi_user_scan_session(struct device *dev, void *data)
 	spin_lock_irqsave(&session->lock, flags);
 	if (session->state != ISCSI_SESSION_LOGGED_IN) {
 		spin_unlock_irqrestore(&session->lock, flags);
-		mutex_unlock(&ihost->mutex);
-		return 0;
+		goto user_scan_exit;
 	}
 	id = session->target_id;
 	spin_unlock_irqrestore(&session->lock, flags);
@@ -462,7 +497,10 @@ static int iscsi_user_scan_session(struct device *dev, void *data)
 		scsi_scan_target(&session->dev, 0, id,
 				 scan_data->lun, 1);
 	}
+
+user_scan_exit:
 	mutex_unlock(&ihost->mutex);
+	ISCSI_DBG_TRANS_SESSION(session, "Completed session scan\n");
 	return 0;
 }
 
@@ -522,7 +560,9 @@ static void session_recovery_timedout(struct work_struct *work)
 	if (session->transport->session_recovery_timedout)
 		session->transport->session_recovery_timedout(session);
 
+	ISCSI_DBG_TRANS_SESSION(session, "Unblocking SCSI target\n");
 	scsi_target_unblock(&session->dev);
+
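For readers who want to experiment with the pattern outside the kernel, here is a small user-space sketch of a flag-gated debug macro in the style of ISCSI_DBG_TRANS_SESSION. The names DBG_SESSION, dbg_lines_printed and scan_session() are made up, printf() stands in for iscsi_cls_session_printk(), and the plain int dbg_session stands in for the module parameter. One deliberate difference is that this sketch has no semicolon after while (0), so the macro can be used as the body of an unbraced if/else.

```c
#include <stdio.h>

static int dbg_session;        /* 0 = off, 1 = on; stand-in for the module parameter */
static int dbg_lines_printed;  /* counter so the effect can be observed */

/* Prints "<function>: <message>" only when dbg_session is set. */
#define DBG_SESSION(...)                         \
	do {                                     \
		if (dbg_session) {               \
			printf("%s: ", __func__); \
			printf(__VA_ARGS__);     \
			dbg_lines_printed++;     \
		}                                \
	} while (0)

/* Returns the running count of debug lines printed so far. */
int scan_session(int logged_in)
{
	if (!logged_in)
		DBG_SESSION("session %d not logged in\n", 1);
	else
		DBG_SESSION("scanning session %d\n", 1);
	return dbg_lines_printed;
}
```

With dbg_session left at 0, the calls compile to a cheap test of the flag and print nothing.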
Re: iscsiadm -m iface + routing
Hello,

Could you please send the MIB and snmpwalk output of the EqualLogic? If it supports SMI-S, could you post the MOF files for the same? Or is there any other way (CLI interface) to monitor EqualLogic?

On Tue, Jul 28, 2009 at 11:42 PM, Mike Christie micha...@cs.wisc.edu wrote:

Ulrich Windl wrote:

On 28 Jul 2009 at 0:22, Moi meme wrote:

Hello, I am using a DELL Equallogic at work and I use a SLES10 SP2 (was SP1 before last week-end). Are there known problems with SLES SP2? I didn't notice any problem since the upgrade!

Same here: Only when the network has a problem, I see _many_ messages. The only problem (not iSCSI) is that the links in /dev/disk/by-id are not reliably populated after boot. This may be a multipath/udev feature. As we boot very rarely, I did not put much effort into examining this...

What EQL firmware are you using? On the EQL box if you do a show command it is in there. I was having a similar problem and updated the firmware to 4.1.4 and it has been working for me now. For some reason the udev scsi_id callout would send some commands to the target, and the target would never respond.
Re: undefined reference to `strlcpy' when building iscsid
Hi,

you need to compile sysdeps.c in utils/sysdeps first. Then it will work.

Best regards
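For background, the link error appears when the C library does not provide strlcpy, so some object file has to supply it; presumably that is what building sysdeps.c takes care of. The sketch below shows what such a fallback typically looks like. It is not copied from open-iscsi, and the name compat_strlcpy is hypothetical, chosen so it cannot clash with a libc that already has strlcpy.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most size bytes including the
 * terminating NUL. Returns strlen(src); a return value >= size
 * means the copy was truncated.
 */
size_t compat_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		size_t n = (len >= size) ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Callers can detect truncation by comparing the return value with the buffer size.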
[PATCH 2/2] bnx2i : Fix cid #n not valid issue
* when bnx2i_adapter_ready() fails, connection handle (cid) = 0 is wrongly freed because 'cid' is not yet allocated for the endpoint
* Fix is to initialize bnx2i_ep->ep_iscsi_cid to '-1' in bnx2i_alloc_ep() and not in bnx2i_ep_connect() to avoid releasing invalid 'cid'
* There is already a check in bnx2i_free_iscsi_cid() not to free invalid iscsi connection handle (-1)

Signed-off-by: Anil Veerabhadrappa ani...@broadcom.com
---
 drivers/scsi/bnx2i/bnx2i_iscsi.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index 9535bb6..08d0bfc 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -387,6 +387,7 @@ static struct iscsi_endpoint *bnx2i_alloc_ep(struct bnx2i_hba *hba)
 	bnx2i_ep = ep->dd_data;
 	INIT_LIST_HEAD(&bnx2i_ep->link);
 	bnx2i_ep->state = EP_STATE_IDLE;
+	bnx2i_ep->ep_iscsi_cid = (u16) -1;
 	bnx2i_ep->hba = hba;
 	bnx2i_ep->hba_age = hba->age;
 	hba->ofld_conns_active++;
@@ -1678,8 +1679,6 @@ static struct iscsi_endpoint *bnx2i_ep_connect(struct Scsi_Host *shost,
 		goto net_if_down;
 	}
 
-	bnx2i_ep->state = EP_STATE_IDLE;
-	bnx2i_ep->ep_iscsi_cid = (u16) -1;
 	bnx2i_ep->num_active_cmds = 0;
 	iscsi_cid = bnx2i_alloc_iscsi_cid(hba);
 	if (iscsi_cid == -1) {
--
1.5.4.3
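The idea behind the fix can be shown in isolation: set the handle to an invalid sentinel at allocation time, so that any early error path that releases the endpoint skips the free instead of releasing cid 0. The sketch below is a user-space illustration only; struct endpoint, alloc_ep(), free_cid(), release_ep() and the cids_freed counter are hypothetical stand-ins, not the bnx2i functions.

```c
#include <stdint.h>
#include <stdlib.h>

#define INVALID_CID ((uint16_t)-1)

struct endpoint {
	uint16_t cid;  /* connection handle, INVALID_CID until one is assigned */
};

static int cids_freed;  /* counts real frees so the behaviour is observable */

/* Allocate an endpoint with the handle already marked invalid. */
struct endpoint *alloc_ep(void)
{
	struct endpoint *ep = calloc(1, sizeof(*ep));

	if (ep)
		ep->cid = INVALID_CID;  /* calloc alone would leave cid == 0 */
	return ep;
}

/* Release a handle; the sentinel is silently ignored. */
void free_cid(uint16_t cid)
{
	if (cid == INVALID_CID)
		return;
	cids_freed++;
}

/* Common teardown used by every error path. */
void release_ep(struct endpoint *ep)
{
	free_cid(ep->cid);
	free(ep);
}
```

Without the assignment in alloc_ep(), a failure before a cid is assigned would pass the zeroed field to free_cid() and release handle 0, which may belong to another connection.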