Re: [PATCH] libiscsi: Check for CmdSN window before sending data
Amazingly the RFC says The target MUST silently ignore any non-immediate command outside of this range(...), but it does not say something like the Initiator SHOULD NOT send ... any command out of range... Regards, Ulrich On 21 Jul 2009 at 15:21, Mike Christie wrote: On 07/21/2009 11:35 AM, Boaz Harrosh wrote: On 07/21/2009 03:33 PM, Hannes Reinecke wrote: When sending a PDU we're just increase the CmdSN number without any check for MaxCmdSN. This results in unexpected ping timeouts and even connection stalls. So we should make sure to check CmdSN against MaxCmdSN before transferring a PDU, and just retry until MaxCmdSN has been updated. Signed-off-by: Hannes Reineckeh...@suse.de --- drivers/scsi/libiscsi.c | 13 + 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 21ed45f..ffb1338 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -1273,6 +1273,19 @@ static int iscsi_data_xmit(struct iscsi_conn *conn) goto again; } + if (conn-session-state == ISCSI_STATE_LOGGED_IN + !iscsi_sna_lte(conn-session-cmdsn, + conn-session-max_cmdsn)) { + /* Window closed, wait for MaxCmdSN update */ + ISCSI_DBG_SESSION(conn-session, +target window closed, +cmd %u max %u\n, +conn-session-cmdsn, +conn-session-max_cmdsn); + spin_unlock_bh(conn-session-lock); + return -ENODATA; + } I am not sure if we should be needing this if the target is operating within the RFC (there is one exception but I am not sure if you are hitting it). In 3.2.2.1, I saw this: An iSCSI target MUST be able to handle at least one immediate task management command and one immediate non-task-management iSCSI command per connection at any time. 3.2.2.1 also has: The target MUST silently ignore any non-immediate command outside of this range or non-immediate duplicates within the range. The CmdSN carried by immediate commands may lie outside the ExpCmdSN to MaxCmdSN range. I took this to mean even when the window is closed we can send a nop as immediate. What do you guys think? The initiator will only send 1 TMF as immediate per session at a time and it will only send one nop as a ping marked as immediate at a time. The only exception to use sending more than one non tmf immediate cmd is if the target sends us a nop-in we could have sent two nop-outs marked as immediate (for the nop-out in response to the target's nop-in, 10.18.1 says we have to set the I bit). If we send too many nops marked as immediate we should be getting a reject pdu right? If so then I think we just need the attached patch which adds some code to handle rejected immediate pdus. The patch is made over scsi-rc-fixes and is only compile tested. Are you only seeing this with the one target? Could we confirm with them that they will accept one non tmf immediate command? If I am reading the RFC wrong, then for your patch, we want to move the check to below the check_mgmt label because iscsi_data_xmit can send multiple pdus. You probably just want to move it to iscsi_prep_mgmt_task(). Also I think we want to dequeue a nop as a ping so it does not timeout while the cmd window is closed (the problem would be is if the window was closed and then the connection goes bad - we would not be able to catch that). + Looks good, From the time queuecommand did the check (iscsi_check_cmdsn_window_closed) a management command came in without checking and stuffed an entry into the task queue. Good catch. Just nit picking you could use iscsi_check_cmdsn_window_closed() above just for style optimization. And also you might want to add the below cleanup. One question though. Do we want to return -ENODATA which means we get woken up only at next command submission, or we want goto again to wake up after a while for retry? I'd say the later since it might be the last command in a set and We do the same thing for the goto if rc == -ENODATA and if you return -ENODATA. If the cmd window opens iscsi_update_cmdsn will call queue_work, so the xmit thread is woken up. Or if mem/space opens up in the socket, iscsi_tcp will call queue_work to wake it, or if a new command comes in that will also kick the thread. The only time we would retry immediately is if -EAGAIN is returned from iscsi_data_xmit. The goto again is probably better so it is consistent with the other code. then we get sleepy with queues full. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
Mike Christie wrote: [ .. ] I am not sure if we should be needing this if the target is operating within the RFC (there is one exception but I am not sure if you are hitting it). In 3.2.2.1, I saw this: An iSCSI target MUST be able to handle at least one immediate task management command and one immediate non-task-management iSCSI command per connection at any time. 3.2.2.1 also has: The target MUST silently ignore any non-immediate command outside of this range or non-immediate duplicates within the range. The CmdSN carried by immediate commands may lie outside the ExpCmdSN to MaxCmdSN range. I took this to mean even when the window is closed we can send a nop as immediate. What do you guys think? Totally beside the point. We're not sending NOPs outside the CmdSN range, we're sending _data_ PDUs outside the CmdSN range. Just make it a printk in the above patch and start hitting the target hard. You'll see an amazing number of messages ... There is _nothing_ in the code which checks if the data PDU we're about to send has a CmdSN within the target window. And then we're hitting the quoted text and the target drops the PDU, leading to a nice I/O stall. Which is the I/O stall I'm fighting with since _months_. The initiator will only send 1 TMF as immediate per session at a time and it will only send one nop as a ping marked as immediate at a time. The only exception to use sending more than one non tmf immediate cmd is if the target sends us a nop-in we could have sent two nop-outs marked as immediate (for the nop-out in response to the target's nop-in, 10.18.1 says we have to set the I bit). If we send too many nops marked as immediate we should be getting a reject pdu right? If so then I think we just need the attached patch which adds some code to handle rejected immediate pdus. The patch is made over scsi-rc-fixes and is only compile tested. Are you only seeing this with the one target? Could we confirm with them that they will accept one non tmf immediate command? If I am reading the RFC wrong, then for your patch, we want to move the check to below the check_mgmt label because iscsi_data_xmit can send multiple pdus. You probably just want to move it to iscsi_prep_mgmt_task(). Also I think we want to dequeue a nop as a ping so it does not timeout while the cmd window is closed (the problem would be is if the window was closed and then the connection goes bad - we would not be able to catch that). Yes, you are probably correct in that we'd need to move it into the individual queue loops to be able to transmit as many PDUs as possible. With the original patch we're running into the risk of hitting the same error when enough PDUs are queued. I'll update the patch. + Looks good, From the time queuecommand did the check (iscsi_check_cmdsn_window_closed) a management command came in without checking and stuffed an entry into the task queue. Good catch. No, wrong. The check in queuecommand is by no means relevant to the actual window. We're checking the target window at the time queuecommand is run, but we're _generating_ the CmdSN only much later after we've dequeued the command. And it's quite feasible to flood the xmitqueue with more commands than can be transmitted, so the CmdSN window won't be updated for a long time. In addition we're not incrementing the This allows us to stuff quite a few commands in the xmitqueue. You can easily check this by eg doing a journal replay. That puts out quite a lot of I/O in a short time: Jul 22 12:29:46 esk kernel: [ 2164.102874] connection1:0: cmd target window closed, cmd 5319 max 5318 Jul 22 12:29:46 esk kernel: [ 2164.138952] connection1:0: cmd target window closed, cmd 5339 max 5338 Jul 22 12:29:46 esk kernel: [ 2164.177920] connection1:0: cmd target window closed, cmd 5362 max 5361 Jul 22 12:29:46 esk kernel: [ 2164.213620] connection1:0: cmd target window closed, cmd 5382 max 5381 Jul 22 12:29:46 esk kernel: [ 2164.251724] connection1:0: cmd target window closed, cmd 5402 max 5401 Jul 22 12:29:46 esk kernel: [ 2154.265954] connection2:0: cmd target window closed, cmd 5283 max 5282 Jul 22 12:29:46 esk kernel: [ 2164.298269] connection1:0: cmd target window closed, cmd 5445 max 5444 Jul 22 12:29:46 esk kernel: [ 2164.298380] connection1:0: cmd target window closed, cmd 5446 max 5445 Jul 22 12:29:46 esk kernel: [ 2164.298757] connection1:0: cmd target window closed, cmd 5447 max 5446 Jul 22 12:29:46 esk kernel: [ 2164.299374] connection1:0: cmd target window closed, cmd 5448 max 5447 Jul 22 12:29:46 esk kernel: [ 2164.299971] connection1:0: cmd target window closed, cmd 5449 max 5448 Jul 22 12:29:46 esk kernel: [ 2164.300717] connection1:0: cmd target window closed, cmd 5450 max 5449 Jul 22 12:29:46 esk kernel: [ 2164.301455] connection1:0: cmd target window closed, cmd 5451 max 5450 Jul 22 12:29:46 esk kernel: [ 2164.302187] connection1:0: cmd target window
Race during logout
Mike, I'm seeing from time to time the following scenario (although currently I'm not able to reproduce it): 1. A logout from a node is initiated by the user. 2. Before the logout completes, the user runs /etc/init.d/iscsi stop. The 'stop' method logs out from all nodes. When it tries to logout from the node that is already logging out (step #1), it fails because it is already logging out (see initiator.c::session_logout_task when conn-logout_qtask != NULL). Then, the 'stop' method kills iscsid. 3. The logout command form step #1 returns and notifies the (dead) daemon. I was thinking about something like this to solve that: diff --git a/etc/initd/initd.redhat b/etc/initd/initd.redhat index d68f135..55d35ec 100644 --- a/etc/initd/initd.redhat +++ b/etc/initd/initd.redhat @@ -39,6 +39,11 @@ stop() echo -n $Stopping iSCSI initiator service: sync iscsiadm -m node --logoutall=all + RETVAL=$? + if [ $RETVAL -ne 0 ]; then + echo Could not logout from all nodes, try again later + return $RETVAL + fi killproc iscsid rm -f /var/run/iscsid.pid [ $RETVAL -eq 0 ] rm -f /var/lock/subsys/open-iscsi @@ -76,6 +81,10 @@ case $1 in ;; restart) stop + if [ $RETVAL -ne 0 ]; then + echo Stopping iSCSI initiator service failed, not starting + exit $RETVAL + fi start ;; status) I'm not sure if there are other logout failure scenarios in which we want to shutdown the open-iscsi service anyway. If not, I guess that this fix (and a similar fix for SuSE Debian) should do the work. Thanks, Erez --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
RE: iscsiadm -m iface + routing
Mike or others, Do you or others think that my ping/CHAP issue that I'm having -- aka sessions resetting (logging out) could be related to the more recent ping timeout and maxcmds patch that has been worked on and discussed? I am attempting to get my test systems updated to test this, but I figured that I'd toss it out there to see if they might be related. Thanks, Joe -Original Message- From: open-iscsi@googlegroups.com [mailto:open-is...@googlegroups.com] On Behalf Of Mike Christie Sent: Tuesday, July 14, 2009 6:01 PM To: open-iscsi@googlegroups.com Subject: Re: iscsiadm -m iface + routing On 07/13/2009 09:20 AM, Hoot, Joseph wrote: Mike, Just as an FYI (in case you were most curious about this issue) I've narrowed this issue down to something with CHAP. On my EqualLogic, if I disable CHAP, I can't reproduce this issue. So I did the following. I after upgrading to the latest OEL 5.3 release of the iscsi-initiator, I could still reproduce the problem. Therefore, I did the following: 1) Setup another test environment using the same hardware (physical different hardware, but all same firmware, models, etc..) 2) presented a new test volume from EqualLogic 3) ran the ping test (ping -Ieth2 192.168.0.19 ping -Ieth3 192.168.0.19). 4) I couldn't reproduce the issue. 5) I checked what the difference were-- CHAP the only difference. 6) So I turned on CHAP authentication to the volume. 7) rm -rf /var/lib/iscsi/nodes/* /var/lib/iscsi/send_targets/* 8) rediscovered targets (after modifying /etc/iscsi/iscsid.conf with CHAP information) node.session.auth.authmethod = CHAP node.session.auth.username = mychapuserhere node.session.auth.password = mypasshere 9) ran the same ping test and was able to get iscsi sessions to fail within 2 minutes. 10) I wanted to prove that CHAP was the issue. So I logout out of all iscsi sessions. 11) I disabled CHAP on the EqualLogic 12) rediscovery targets and re-logged in to the sessions (without CHAP authentication) 13) ran the ping tests and couldn't break it after 30 minutes. 14) added CHAP again and was able to break the sessions within 2 minutes. So definitely something odd with CHAP (my guess, either in open-iscsi code or EqualLogic code). I've asked Roger Lopez, from Dell, to attempt to reproduce this in his lab. He has EqualLogic and Oracle VM Servers. Oracle developers that I'm working with don't currently have access to an EqualLogic, but they are attempting to reproduce this with their iSCSI equipment as well. I'm going to setup port mirroring on our switch and run tcpdumps to see what I can get. This is strange because CHAP does not come into play in the normal IO path. When we login we will do authentication with CHAP, but after that succeeds it nevers comes up when doing IO. It would only come up again when the session/connection is dropped/disconnected and we relogin. For the relogin we will do the CHAP authentication again. Maybe some memory error where chap values are overwriting some other memory. There was one recent fix by Dell, where when using CHAP they could segfault iscsid. Here is the updated RPM that I am working on for RHEL 5.4 that has the fix: http://people.redhat.com/mchristi/iscsi/rhel5.4/iscsi-initiator-utils/ --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
Hannes Reinecke wrote: Mike Christie wrote: [ .. ] I am not sure if we should be needing this if the target is operating within the RFC (there is one exception but I am not sure if you are hitting it). In 3.2.2.1, I saw this: An iSCSI target MUST be able to handle at least one immediate task management command and one immediate non-task-management iSCSI command per connection at any time. 3.2.2.1 also has: The target MUST silently ignore any non-immediate command outside of this range or non-immediate duplicates within the range. The CmdSN carried by immediate commands may lie outside the ExpCmdSN to MaxCmdSN range. I took this to mean even when the window is closed we can send a nop as immediate. What do you guys think? Totally beside the point. We're not sending NOPs outside the CmdSN range, we're sending _data_ PDUs outside the CmdSN range. Just make it a printk in the above patch and start hitting the target hard. You'll see an amazing number of messages ... There is _nothing_ in the code which checks if the data PDU we're about to send has a CmdSN within the target window. And then we're hitting the quoted text and the target drops the PDU, leading to a nice I/O stall. Which is the I/O stall I'm fighting with since _months_. The initiator will only send 1 TMF as immediate per session at a time and it will only send one nop as a ping marked as immediate at a time. The only exception to use sending more than one non tmf immediate cmd is if the target sends us a nop-in we could have sent two nop-outs marked as immediate (for the nop-out in response to the target's nop-in, 10.18.1 says we have to set the I bit). If we send too many nops marked as immediate we should be getting a reject pdu right? If so then I think we just need the attached patch which adds some code to handle rejected immediate pdus. The patch is made over scsi-rc-fixes and is only compile tested. Are you only seeing this with the one target? Could we confirm with them that they will accept one non tmf immediate command? If I am reading the RFC wrong, then for your patch, we want to move the check to below the check_mgmt label because iscsi_data_xmit can send multiple pdus. You probably just want to move it to iscsi_prep_mgmt_task(). Also I think we want to dequeue a nop as a ping so it does not timeout while the cmd window is closed (the problem would be is if the window was closed and then the connection goes bad - we would not be able to catch that). Yes, you are probably correct in that we'd need to move it into the individual queue loops to be able to transmit as many PDUs as possible. With the original patch we're running into the risk of hitting the same error when enough PDUs are queued. I'll update the patch. So, this looks far better: diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 21ed45f..8303676 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -1212,6 +1212,13 @@ static int iscsi_xmit_task(struct iscsi_conn *conn) struct iscsi_task *task = conn-task; int rc; + /* Check for target CmdSN window */ + if (conn-session-state == ISCSI_STATE_LOGGED_IN + iscsi_sna_lt(conn-session-max_cmdsn, +conn-session-cmdsn)) + /* Window closed, wait for CmdSN update */ + return -EPROTO; + __iscsi_get_task(task); spin_unlock_bh(conn-session-lock); rc = conn-session-tt-xmit_task(task); There is one issue still outstanding: Even with this patch the iscsi stack feels it necessary to send NOPs: Jul 22 14:03:40 esk kernel: [ 485.274672] connection2:0: running ctask itt 90 rc -71 Jul 22 14:03:40 esk kernel: [ 485.297249] connection2:0: running ctask itt 8 rc -71 Jul 22 14:03:40 esk kernel: [ 547.508673] connection1:0: running ctask itt 46 rc -71 Jul 22 14:03:41 esk kernel: [ 548.155353] connection1:0: running ctask itt 85 rc -71 Jul 22 14:03:45 esk kernel: [ 550.291112] connection2:0: Sending nopout,cmd 283450 max 283512 exp 283449 Jul 22 14:03:45 esk kernel: [ 550.294583] connection2:0: mgmtpdu [itt xa09 p 81007a1ffa40] queued Jul 22 14:03:45 esk kernel: [ 487.595817] connection2:0: mgmtpdu [op 0x0 hdr-itt 0xa09 datalen 0 cmdsn 283450/283450] Jul 22 14:03:45 esk kernel: [ 487.598659] connection2:0: mgmtpdu [itt 0xa09 p 81007a1ffa40] done Jul 22 14:03:45 esk kernel: [ 539.429005] connection2:0: mgmtpdu [itt xa09 p 81007a1ffa40] finished Jul 22 14:03:45 esk kernel: [ 487.603722] connection2:0: running ctask itt 82 rc -71 Jul 22 14:03:45 esk kernel: [ 487.604826] connection2:0: running ctask itt 38 rc -71 And a wireshark trace reveals that indeed there is a network hickup before the NOP is sent during which time simply _nothing_ happens on the line. Well, not between the target and the initiator; can't tell about the overall load. (By way of clarification: 'running ctask
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
On 07/22/2009 05:57 AM, Hannes Reinecke wrote: Mike Christie wrote: [ .. ] I am not sure if we should be needing this if the target is operating within the RFC (there is one exception but I am not sure if you are hitting it). In 3.2.2.1, I saw this: An iSCSI target MUST be able to handle at least one immediate task management command and one immediate non-task-management iSCSI command per connection at any time. 3.2.2.1 also has: The target MUST silently ignore any non-immediate command outside of this range or non-immediate duplicates within the range. The CmdSN carried by immediate commands may lie outside the ExpCmdSN to MaxCmdSN range. I took this to mean even when the window is closed we can send a nop as immediate. What do you guys think? Totally beside the point. I do not get that. In the original mail you wrote: When sending a PDU we're just increase the CmdSN number without any check for MaxCmdSN. This results in unexpected ping timeouts and even connection stalls. I thought you were saying that we got ping timeouts because we sent a nop out side of the window. How does a data-out or scsi cmnd pdu getting sent out of the window cause a ping timeout? The target should just drop the data-out or scsi cmnd pdu and process the nop, right? We're not sending NOPs outside the CmdSN range, we're sending _data_ PDUs outside the CmdSN range. Just make it a printk in the above patch and start hitting the target hard. You'll see an amazing number of messages ... Do you mean data-out pdus or scsi cmd pdus? You do not check the window for a data-out do you? It does not have a higher cmdsn then the cmd that started it (data-out pdus do not have a cmdsn field and are not starting a new command). I accidentally did check the cmdsn for data-outs and it lead to lots of bug reports. I accidentally added the check here http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=77a23c21aaa723f6b0ffc4a701be8c8e5a32346d and then reverted it here: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=e07264071f7f2b02a2973cb28d9fdf5eb8866cc1 Your old patch and new patch are checking the cmdsn window for data-outs and for commands like nops and tmfs that marked immediate which I do not think is right. There is _nothing_ in the code which checks if the data PDU we're about to send has a CmdSN within the target window. And then we're hitting the quoted text and the target drops the PDU, leading to a nice I/O stall. Which is the I/O stall I'm fighting with since _months_. The initiator will only send 1 TMF as immediate per session at a time and it will only send one nop as a ping marked as immediate at a time. The only exception to use sending more than one non tmf immediate cmd is if the target sends us a nop-in we could have sent two nop-outs marked as immediate (for the nop-out in response to the target's nop-in, 10.18.1 says we have to set the I bit). If we send too many nops marked as immediate we should be getting a reject pdu right? If so then I think we just need the attached patch which adds some code to handle rejected immediate pdus. The patch is made over scsi-rc-fixes and is only compile tested. Are you only seeing this with the one target? Could we confirm with them that they will accept one non tmf immediate command? If I am reading the RFC wrong, then for your patch, we want to move the check to below the check_mgmt label because iscsi_data_xmit can send multiple pdus. You probably just want to move it to iscsi_prep_mgmt_task(). Also I think we want to dequeue a nop as a ping so it does not timeout while the cmd window is closed (the problem would be is if the window was closed and then the connection goes bad - we would not be able to catch that). Yes, you are probably correct in that we'd need to move it into the individual queue loops to be able to transmit as many PDUs as possible. With the original patch we're running into the risk of hitting the same error when enough PDUs are queued. I'll update the patch. + Looks good, From the time queuecommand did the check (iscsi_check_cmdsn_window_closed) a management command came in without checking and stuffed an entry into the task queue. Good catch. No, wrong. The check in queuecommand is by no means relevant to the actual window. We're checking the target window at the time queuecommand is run, but we're _generating_ the CmdSN only much later after we've dequeued the command. The check in queuecommand is for scsi cmd pdus and checks the session-queued_cmdsn against the MaxCmdSn. The queued_cmdsn is just a preallocation of CmdSn values. So if the scsi layer sends us X commands, the initiator checks if the window has X spaces open. If it does, then we add the task to the cmdqueue, and increase queued_cmdsn. This is supposed to prevent the scsi layer from sending us too many commands and them sitting in the
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
On 07/22/2009 12:00 PM, Mike Christie wrote: No, wrong. The check in queuecommand is by no means relevant to the actual window. We're checking the target window at the time queuecommand is run, but we're _generating_ the CmdSN only much later after we've dequeued the command. The check in queuecommand is for scsi cmd pdus and checks the session-queued_cmdsn against the MaxCmdSn. The queued_cmdsn is just a preallocation of CmdSn values. So if the scsi layer sends us X commands, the initiator checks if the window has X spaces open. If it does, then we add the task to the cmdqueue, and increase queued_cmdsn. This is supposed to prevent the scsi layer from sending us too many commands and them sitting in the driver cmdqueue and timing out. The cmdsn is then allocated when we put the cmd on the wire. So the Maybe this is causing some confusion. We do not increment the cmdsn value for immediate commands (how we send a tmf or nop) and for data-out pdus. We only increment it for scsi cmd pdus, so the only place I check for the window having space is in queuecommand. If you are worried about a nop or data out increasing the cmdsn value in one thread and a scsi cmd pdu increasing it in another, then it will not happen. We only this one path to worry about. session-CmdSn should always be lower than the MaxCmdSn value (as long as the MaxCmdSn value does not suddenly decrease on us (it cannot do that can it?)), because the session-queued_cmdsn value is always lower than the MaxCmdSn. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: Race during logout
On 07/22/2009 09:09 AM, Erez Zilber wrote: Mike, I'm seeing from time to time the following scenario (although currently I'm not able to reproduce it): No need to replicate it. I know what you are referring to. 1. A logout from a node is initiated by the user. 2. Before the logout completes, the user runs /etc/init.d/iscsi stop. The 'stop' method logs out from all nodes. When it tries to logout from the node that is already logging out (step #1), it fails because it is already logging out (see initiator.c::session_logout_task when conn-logout_qtask != NULL). Then, the 'stop' method kills iscsid. 3. The logout command form step #1 returns and notifies the (dead) daemon. I was thinking about something like this to solve that: diff --git a/etc/initd/initd.redhat b/etc/initd/initd.redhat index d68f135..55d35ec 100644 --- a/etc/initd/initd.redhat +++ b/etc/initd/initd.redhat @@ -39,6 +39,11 @@ stop() echo -n $Stopping iSCSI initiator service: sync iscsiadm -m node --logoutall=all + RETVAL=$? + if [ $RETVAL -ne 0 ]; then + echo Could not logout from all nodes, try again later + return $RETVAL + fi killproc iscsid rm -f /var/run/iscsid.pid [ $RETVAL -eq 0 ] rm -f /var/lock/subsys/open-iscsi @@ -76,6 +81,10 @@ case $1 in ;; restart) stop + if [ $RETVAL -ne 0 ]; then + echo Stopping iSCSI initiator service failed, not starting + exit $RETVAL + fi start ;; status) I'm not sure if there are other logout failure scenarios in which we want to shutdown the open-iscsi service anyway. If not, I guess that this fix (and a similar fix for SuSE Debian) should do the work. This looks ok to me. We could make iscsid more complex and send notifications to multiple iscsiadm requests, but I am thinking this does not come up very much in normal use, so I am ok with the simple fix in the patch. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: undefined reference to strl* when building iscsid
Hey Mike, I figured out that you need to be in the root this aft. The Gentoo ebuild for open-iscsi-2.0-870.3 (which I was updating for open-iscsi-2.0-871) did a cd usr; make. I changed it to just make user and all worked as expected. Thanks for the reply E -Original Message- From: Mike Christie micha...@cs.wisc.edu Date: Tue, 21 Jul 2009 14:22:39 To: open-iscsiopen-iscsi@googlegroups.com Subject: Re: undefined reference to strl* when building iscsid On Jul 20, 4:18 pm, Evan Borgstrom e...@borgstrom.ca wrote: Hi, I'm trying to build open-iscsi-2.0-871 on Gentoo (Kernel 2.6.30 with OpenSUSE Xen patches / GCC 4.3.3 / glibc 2.9). The kernel modules build fine but when trying to build iscsid I get the following errors: cc -O2 -g -Wall -Wstrict-prototypes -I../include -I. -DLinux - DNETLINK_ISCSI=8 -D_GNU_SOURCE -c -o iscsid.o iscsid.c iscsid.c: In function 'main': Are you doing make from the usr dir or from the open-iscsi root dir? You have to do it from the open-iscsi root dir, because some stuff in utils is needed by the stuff in usr. I suck at Makefile stuff so the build system is crap :( --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: Centos 5.2 w open ISCSI
On 07/21/2009 02:33 PM, Nate wrote: I am relatively new to linux, so parden my ignorance. I have installed ISCSI on my Centos box and having some issues with getting ISCSI to connect\mount the drive at boot. I have issued the /etc/init.d/iscsi status after a reboot and see that the ISCSI is up and running -- iscsid (pid 3557 3556) is running. The following has already been done: iscsiadm -m discovery -t sendtargets –p 10.10.1.4:3260(ip changed for the post to protect the innocent) However the drive is not mounted, so I set the login attemps using the following command: vi /etc/iscsi/iscsid.conf and set the node.session.initial_login_retry_max = 60 Upon issuing the reboot command on the server the serve comes up and I do not see the ISCSI volumes. I did some more research and found this article: http://www.cyberciti.biz/tips/rhel-centos-fedora-linux-iscsi-howto.html Step 4. Say to enable ISCSI on startup by issuing the followoing command chkconfig iscsi on This has been done... restarted and same issue Then I looked at the fstab so I edited fstab file (vi /etc/fstab) and added the following line: (came from URL above in step 4) /dev/sdd1 /mnt/iscsi ext3 If you have multiple targets or if the target has multiple portals or if the target does a target for each lun (so you get multiple targets for each lun), then the initiator will setup each target, portal and lun in parallel. This results in the /dev/sdX naming not being persistent between reboots. You want to use udev names (/dev/disk/ or labels). _netdev 0 0 However I think this may be wrong as when I issue the fdisk -l command I see a different device Disk /dev/sda: 541.6 GB, 541693575168 bytes 255 heads, 63 sectors/track, 65857 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk /dev/sda doesn't contain a valid partition table Disk /dev/sdf: 108.9 GB, 108998688768 bytes 255 heads, 63 sectors/track, 13251 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes The other issue is the drives show up 4x each = 8 drives, and should only be two. Any assistance would be greatly appricieated. When you run iscsiadm -m discovery -t sendtargets –p 10.10.1.4 What output did you get. Do you see multiple targets or portals? If so then we are going to set them up in parallel and the /dev/sd names could be differnt for each reboot. After the iscsi service has started, you can run iscsiadm -m session -P 3 to see all the running sessions and what /dev/sdXs we got and what portal they are being accessed from (you should use the /dev/disk/ naming for something like fstab though). And you probably see multple /dev/sdXs because each path gets one. So if you have one lun and the target has two ports you would get two /dev/sdX entries. One for each path. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
On 07/22/2009 12:24 PM, Mike Christie wrote: On 07/22/2009 12:00 PM, Mike Christie wrote: No, wrong. The check in queuecommand is by no means relevant to the actual window. We're checking the target window at the time queuecommand is run, but we're _generating_ the CmdSN only much later after we've dequeued the command. The check in queuecommand is for scsi cmd pdus and checks the session-queued_cmdsn against the MaxCmdSn. The queued_cmdsn is just a preallocation of CmdSn values. So if the scsi layer sends us X commands, the initiator checks if the window has X spaces open. If it does, then we add the task to the cmdqueue, and increase queued_cmdsn. This is supposed to prevent the scsi layer from sending us too many commands and them sitting in the driver cmdqueue and timing out. The cmdsn is then allocated when we put the cmd on the wire. So the Maybe this is causing some confusion. We do not increment the cmdsn value for immediate commands (how we send a tmf or nop) and for data-out pdus. We only increment it for scsi cmd pdus, so the only place I check for the window having space is in queuecommand. If you are worried about a nop or data out increasing the cmdsn value in one thread and a scsi cmd pdu increasing it in another, then it will not happen. We only this one path to worry about. session-CmdSn should always be lower than the MaxCmdSn value (as long as the MaxCmdSn value does not suddenly decrease on us (it cannot do Ah, I guess it does not have a MUST NOT decrease in the RFC. There is this though: - If the PDU MaxCmdSN is greater than the local MaxCmdSN (in Serial Arithmetic Sense), it updates the local MaxCmdSN; otherwise, it is ignored. If the target did send a MaxCmdSN that is smaller than our local then we are not going to update our value. The target would then drop the scsi cmd pdu. The scsi cmd timeout for that cmd will eventually fire and we will go through that. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
On 07/22/2009 12:00 PM, Mike Christie wrote: No, wrong. The check in queuecommand is by no means relevant to the actual window. We're checking the target window at the time queuecommand is run, but we're _generating_ the CmdSN only much later after we've dequeued the command. The check in queuecommand is for scsi cmd pdus and checks the session-queued_cmdsn against the MaxCmdSn. The queued_cmdsn is just a preallocation of CmdSn values. So if the scsi layer sends us X commands, the initiator checks if the window has X spaces open. If it does, then we add the task to the cmdqueue, and increase queued_cmdsn. This is supposed to prevent the scsi layer from sending us too many commands and them sitting in the driver cmdqueue and timing out. The cmdsn is then allocated when we put the cmd on the wire. So the session-CmdSn should always be lower than the MaxCmdSn value (as long as the MaxCmdSn value does not suddenly decrease on us (it cannot do that can it?)), because the session-queued_cmdsn value is always lower than the MaxCmdSn. Here is an example of what I am seeing when I run the code. - Login done and we are in FFP. iscsi_conn_start() { session-queued_cmdsn = session-cmdsn; } (here session-cmdsn is probably 1 and MaxCmdSn we will say is 4) - The scsi layer then sends a scsi command. queuecommand checks queued_cmdsn(1) MaxCmdSn(4). It is ok so it gets incremnted to queued_cmdsn = 2. The queuecommand does queue_work to wake the xmit thread. - Let's say the scsi layer is really quick. And calls the queuecommand again. queuecommand checks queued_cmdsn(2) MaxCmdSn(4). It is ok so it gets incremnted to queued_cmdsn = 3. - Let's say the scsi layer is really quick. And calls the queuecommand again. queuecommand checks queued_cmdsn(3) MaxCmdSn(4). It is ok so gets incremnted to queued_cmdsn = 4. - Let's say the scsi layer is really quick. And calls the queuecommand again. queuecommand checks queued_cmdsn(4) MaxCmdSn(4). Window is not open so queuecommand requeus command. - Now the xmit thread finally wakes up. It takes a task from the cmdqueue, and preps it. This does hdr-cmdsn = session-cmdsn; So here hdr-cmdsn and session-cmdsn is 1; Then we do session-cmdsn++ so the session-cmdsn is now 2; - we loop in the xmit thread and it takes a task from the cmdqueue, and preps it. This does hdr-cmdsn = session-cmdsn; So here hdr-cmdsn and session-cmdsn is 2; Then we do session-cmdsn++ so the session-cmdsn is now 3; - we loop in the xmit thread and it takes a task from the cmdqueue, and preps it. This does hdr-cmdsn = session-cmdsn; So here hdr-cmdsn and session-cmdsn is 3; Then we do session-cmdsn++ so the session-cmdsn is now 4; - We now return from iscsi_data_xmit because there was only 3 cmds queued and there is nothing left on the cmdqueue list. - Now when the the window opens and iscsi_update_cmdsn will update the session-max_cmdsn. At this point we should call something like scsi_run_queue but we don't. We wait for a scsi cmd to complete and then the scsi_done/scsi_io_completion completion path will run the scsi queues again. And so we start again, but with a higher MaxCmdSn. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: [PATCH] libiscsi: Check for CmdSN window before sending data
On 07/22/2009 03:01 PM, Mike Christie wrote: On 07/22/2009 12:00 PM, Mike Christie wrote: No, wrong. The check in queuecommand is by no means relevant to the actual window. We're checking the target window at the time queuecommand is run, but we're _generating_ the CmdSN only much later after we've dequeued the command. The check in queuecommand is for scsi cmd pdus and checks the session-queued_cmdsn against the MaxCmdSn. The queued_cmdsn is just a preallocation of CmdSn values. So if the scsi layer sends us X commands, the initiator checks if the window has X spaces open. If it does, then we add the task to the cmdqueue, and increase queued_cmdsn. This is supposed to prevent the scsi layer from sending us too many commands and them sitting in the driver cmdqueue and timing out. The cmdsn is then allocated when we put the cmd on the wire. So the session-CmdSn should always be lower than the MaxCmdSn value (as long as the MaxCmdSn value does not suddenly decrease on us (it cannot do that can it?)), because the session-queued_cmdsn value is always lower than the MaxCmdSn. Here is an example of what I am seeing when I run the code. - Login done and we are in FFP. iscsi_conn_start() { session-queued_cmdsn = session-cmdsn; } (here session-cmdsn is probably 1 and MaxCmdSn we will say is 4) - The scsi layer then sends a scsi command. queuecommand checks queued_cmdsn(1) MaxCmdSn(4). It is ok so it gets incremnted to queued_cmdsn = 2. The queuecommand does queue_work to wake the xmit thread. - Let's say the scsi layer is really quick. And calls the queuecommand again. queuecommand checks queued_cmdsn(2) MaxCmdSn(4). It is ok so it gets incremnted to queued_cmdsn = 3. - Let's say the scsi layer is really quick. And calls the queuecommand again. queuecommand checks queued_cmdsn(3) MaxCmdSn(4). It is ok so gets incremnted to queued_cmdsn = 4. - Let's say the scsi layer is really quick. And calls the queuecommand again. queuecommand checks queued_cmdsn(4) MaxCmdSn(4). Window is not open so queuecommand requeus command. Oh yeah, I messed up when I took this from my log and tried to write it out nicer below. The window check is for less than or equal to so this would be queued and we would have 4 tasks queued and the 5th would hit the window check. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
Re: [RFC]iscsid to run in conjunction with uIP stack
Sorry for the really late response on this. On 06/25/2009 08:25 AM, Rakesh Ranjan wrote: Li, I have attached the diff against git master, please go through it. I have merged the contents of uip directory and nic related files into one common libuip. So now offload dir looks like offload/drivers offload/libuip Mike, what are your thoughts about linking uip core code int iscsid. Weather it gonna be static or dynamic ? Also it would be very helpful if I think we would want to do it static, because it will make the distro boot and installer stuff easier. you could help on adding additional iface options. Regards Rakesh Ranjan Was the attached patch ok for a starting point? I will cleanup the open-iscsi/user and include stuff and add the uip stuff as is and then put it in a new branch for us to work off of. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---
[PATCH] don't increment cmdsn if cmd is not sent
I found a bug while reviewing that cmdsn code. We increment session-cmdsn at the top of iscsi_prep_scsi_cmd_pdu, but if the prep ecb or prep bidi or init_task calls fails then we leave the session-cmdsn incremented. This moves the cmdsn manipulation to the end of the function when we know is has succeeded. It also adds a session-cmdsn-- in queuecommand for if a driver like bnx2i tries to send a a task from that context but it fails. We do not have to do this in the xmit thread context because that code will retry the same task if the initial call fails. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~--- diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 716cc34..3cef35b 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -301,8 +301,6 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_task *task) hdr-flags = ISCSI_ATTR_SIMPLE; int_to_scsilun(sc-device-lun, (struct scsi_lun *)hdr-lun); memcpy(task-lun, hdr-lun, sizeof(task-lun)); - hdr-cmdsn = task-cmdsn = cpu_to_be32(session-cmdsn); - session-cmdsn++; hdr-exp_statsn = cpu_to_be32(conn-exp_statsn); cmd_len = sc-cmd_len; if (cmd_len ISCSI_CDB_SIZE) @@ -388,6 +386,8 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_task *task) return -EIO; task-state = ISCSI_TASK_RUNNING; + hdr-cmdsn = task-cmdsn = cpu_to_be32(session-cmdsn); + session-cmdsn++; conn-scsicmd_pdus_cnt++; ISCSI_DBG_SESSION(session, iscsi prep [%s cid %d sc %p cdb 0x%x @@ -1497,6 +1497,7 @@ int iscsi_queuecommand(struct scsi_cmnd *sc, void (*done)(struct scsi_cmnd *)) } } if (session-tt-xmit_task(task)) { + session-cmdsn--; reason = FAILURE_SESSION_NOT_READY; goto prepd_reject; }
[PATCH] cnic: Fix ISCSI_KEVENT_IF_DOWN message handling.
When a net device goes down or when the bnx2i driver is unloaded, the code was not generating the ISCSI_KEVENT_IF_DOWN message properly and this could cause the userspace driver to crash. This is fixed by sending the message properly in the shutdown path. cnic_uio_stop() is also added to send the message when bnx2i is unregistering. Signed-off-by: Michael Chan mc...@broadcom.com --- drivers/net/cnic.c | 23 +-- 1 files changed, 21 insertions(+), 2 deletions(-) diff --git a/drivers/net/cnic.c b/drivers/net/cnic.c index 4d1515f..4869d77 100644 --- a/drivers/net/cnic.c +++ b/drivers/net/cnic.c @@ -227,7 +227,7 @@ static int cnic_send_nlmsg(struct cnic_local *cp, u32 type, } rcu_read_lock(); - ulp_ops = rcu_dereference(cp-ulp_ops[CNIC_ULP_ISCSI]); + ulp_ops = rcu_dereference(cnic_ulp_tbl[CNIC_ULP_ISCSI]); if (ulp_ops) ulp_ops-iscsi_nl_send_msg(cp-dev, msg_type, buf, len); rcu_read_unlock(); @@ -319,6 +319,20 @@ static int cnic_abort_prep(struct cnic_sock *csk) return 0; } +static void cnic_uio_stop(void) +{ + struct cnic_dev *dev; + + read_lock(cnic_dev_lock); + list_for_each_entry(dev, cnic_dev_list, list) { + struct cnic_local *cp = dev-cnic_priv; + + if (cp-cnic_uinfo) + cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL); + } + read_unlock(cnic_dev_lock); +} + int cnic_register_driver(int ulp_type, struct cnic_ulp_ops *ulp_ops) { struct cnic_dev *dev; @@ -390,6 +404,9 @@ int cnic_unregister_driver(int ulp_type) } read_unlock(cnic_dev_lock); + if (ulp_type == CNIC_ULP_ISCSI) + cnic_uio_stop(); + rcu_assign_pointer(cnic_ulp_tbl[ulp_type], NULL); mutex_unlock(cnic_lock); @@ -632,7 +649,6 @@ static void cnic_free_resc(struct cnic_dev *dev) int i = 0; if (cp-cnic_uinfo) { - cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL); while (cp-uio_dev != -1 i 15) { msleep(100); i++; @@ -1057,6 +1073,9 @@ static void cnic_ulp_stop(struct cnic_dev *dev) struct cnic_local *cp = dev-cnic_priv; int if_type; + if (cp-cnic_uinfo) + cnic_send_nlmsg(cp, ISCSI_KEVENT_IF_DOWN, NULL); + rcu_read_lock(); for (if_type = 0; if_type MAX_CNIC_ULP_TYPE; if_type++) { struct cnic_ulp_ops *ulp_ops; -- 1.5.6.GIT --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups open-iscsi group. To post to this group, send email to open-iscsi@googlegroups.com To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/open-iscsi -~--~~~~--~~--~--~---