Sure. It's the OVM 2.2 environment that I talked with you about a couple of days ago. Here is the info:

[r...@oim6102506 ~]# uname -r
2.6.18-128.2.1.4.9.el5xen
[r...@oim6102506 ~]# rpm -qa | grep iscsi
iscsi-initiator-utils-6.2.0.871-0.7.el5

I have a tcpdump that I had sent to Don Williams. I'll pull that up shortly.

On Nov 17, 2009, at 10:06 PM, Mike Christie wrote:

> Mike Christie wrote:
>> Hoot, Joseph wrote:
>>> more INLINE below...
>>>
>>> On Nov 17, 2009, at 7:27 PM, Mike Christie wrote:
>>>
>>>> Pasi Kärkkäinen wrote:
>>>>> On Mon, Nov 16, 2009 at 09:39:00PM -0500, Hoot, Joseph wrote:
>>>>>> On Nov 16, 2009, at 8:19 PM, Hoot, Joseph wrote:
>>>>>>
>>>>>>> thanks. That helps. So I know that with the EqualLogic targets, there
>>>>>>> is a "Group IP" which, I believe, responds with an iscsi
>>>>>>> login_redirect.
>>>>>>>
>>>>>>> 1) Could the "Login authentication failed" message be the response
>>>>>>> because of a login redirect messages from the EQL redirect?
>>>>>>>
>>>>>>> and then my next question is more for curiosity sake:
>>>>>>>
>>>>>>> 2) Are there plans in the future to have more than one connection per
>>>>>>> session? and I guess in addition to that, would that mean multiple
>>>>>>> connections to a single volume over the same nic?
>>>>>>>
>>>>>>>
>>>>>> Also Mike, I'm seeing one or two of these every 30-40 minutes if I slam
>>>>>> our EqualLogic with roughly 7-15k IOPS (reads and writes) non stop on 3
>>>>>> volumes. In this type of scenario, would you expect to see timeouts
>>>>>> like this once in awhile? If so, do you think increasing my NOOP
>>>>>> timeouts would assist so we don't get these? maybe set it to 15 seconds
>>>>>> instead of 10?
>>>>>>
>>>>> Equallogic does active loadbalancing (redirects) during operation..
>>>>> dunno about the errors though.
>>>>>
>>>> Oh yeah, forgot about that. Thanks Pasi!
>>>>
>>>> Joseph, look in the EQL target logs for something about the EQL box
>>>> doing load balancing. I think normally we handle the load balancing more
>>>> gracefully, but we might be messing up.
>>>> I think if EQL was load
>>>> balancing in the open-iscsi logs we would see something about getting a
>>>> async iscsi pdu from the target that asks us to logout. Then when we
>>>> relogin the target would redirect us to the optimal path.
>>>
>>> There are two things that the EQL does, I believe-- one thing is async
>>> logout, the other is login_redirect. Unfortunately, from the EQL syslog
>>> side we don't see any errors related to this. It's my understanding,
>>> however, that when a login is initially attempted to the EQL, it hits the
>>> "group ip" or an alias'd IP sitting on a real nic. The group IP looks at
>>> all the interfaces on the EQL and decides, based on some algorithm, which
>>> EQL nic the session should connect to. It then sends the initiator that
>>> made the request a login_redirect, which I thought is basically a "logout
>>> and reconnect" pdu. It would say, for example, "you're can't log into the
>>> group IP, however, you can log into this IP (a real nic) that it would
>>> prefer you be logged into."
>>>
>>> I'm thinking that the "failed login" is actually the result of that attempt
>>> to log into the group IP and it sending a login redirect pdu back to it.
>>>
>>
>> If the target was load balancing us it would:
>>
>> - Send a async logout pdu.
>> - We then send a logout pdu.
>> - When we get the logout response pdu we kill the tcp ip connection
>> - We then create a new tcp connection
>> - We then log in to the portal that was passed into iscsiadm/iscsid (the
>> one in the DB that you see when you run iscsiadm -m node, which is
>> probably what you call the group IP). For this process we send a login
>> pdu. It then sends a login response pdu with the login redirect
>> response. In this response we also get the new IP to log into.
>> - We see that response and kill the tcp connection, and create a new tcp
>> connection to the portal we are being redirected to.
>> - We then log into the portal we were redirected to.
>> We again do this by
>> sending a login pdu. This time the login response pdu should be ok and
>> we are done.
>
> Oh yeah, I meant to also say that this is pretty much the same process
> that happens we do the first login, and if we have to relogin because of
> a connection problem like the nop/ping timeout. The only difference in
> those cases is that we do not get the async logout and we do not do a
> logout by sending a logout pdu. We start at the killing tcp ip
> connection step.
>
> So even if we are not getting load balanced we would be in the same
> place in the open-iscsi code when we are getting the login failed errors.
>
>
> To get back on track solving why we get the nop timeouts then if we are
> not seeing load balancing messages or async logout messages, it could
> be the open-iscsi bug I mentioned in the other mail. If you can send the
> open-iscsi and kernel info I asked for in the other mail, we can start
> down that path.
>
> --
>
> You received this message because you are subscribed to the Google Groups
> "open-iscsi" group.
> To post to this group, send email to open-is...@googlegroups.com.
> To unsubscribe from this group, send email to
> open-iscsi+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/open-iscsi?hl=.
>
>

===========================
Joseph R. Hoot
Lead System Programmer/Analyst
(w) 716-878-4832
(c) 716-759-HOOT
joe.h...@itec.suny.edu
GPG KEY: 7145F633
===========================