On 11/25/2014 12:49 PM, The Lee-Man wrote:
> On Tuesday, November 25, 2014 9:42:59 AM UTC-8, Mike Christie wrote:
> 
>     On 11/24/2014 11:04 AM, The Lee-Man wrote:
>     > Okay, I spent most of the day yesterday playing with the
>     > code in question, i.e. the open-iscsi code that rescans
>     > the session list looking for the current session.
>     >
>     > In particular, I was looking at update_sessions().
>     >
>     > One thing I noticed is that this code only gets executed
>     > if discovery.sendtargets.use_discoveryd is set to Yes for
>     > a particular target, by the way.
> 
>     So how does your interconnect test come into play for this issue? It
>     seems like you should be hitting this issue all the time even when the
>     connection is ok, because that code polls every N seconds.
> 
> 
> When the connections are being dropped and then reconnected, the
> sessions (created by the kernel) are coming and going. And each time
> a connection goes away and comes back, it gets a new session ID,
> and a new symlink in /sys/class/iscsi_session, which of course
> is not cached. (This is my theory, since I don't have a configuration
> which can test this ATM.)

The session and/or connection ID does not change if the connection is
just dropped and iscsid then logs back in. It would only change if
you are removing a session by hand with iscsiadm ... --logout and then
re-adding it with iscsiadm .... --login, or if discoveryd is perhaps
causing sessions to be deleted and then added, either because it gets
different portals back or because of a bug in that code.

Or, are you using eql's multipathing software that might be dynamically
creating/deleting sessions?
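
One way to check whether session IDs really change across a reconnect is
to snapshot the session entries in sysfs before and after the drop and
compare the two. The sketch below is only an illustration of that check,
not part of open-iscsi; the helper names list_sessions and diff_sessions
are made up for this example, and it assumes the sessions show up as
"sessionN" entries under /sys/class/iscsi_session.

```python
import os

# Assumed sysfs location of the iSCSI session entries on Linux.
SYSFS_SESSION_DIR = "/sys/class/iscsi_session"


def list_sessions(sysfs_dir=SYSFS_SESSION_DIR):
    """Return the set of session entry names, e.g. {'session11'}."""
    try:
        names = os.listdir(sysfs_dir)
    except FileNotFoundError:
        # No iSCSI transport loaded, or no such directory: no sessions.
        return set()
    return {name for name in names if name.startswith("session")}


def diff_sessions(before, after):
    """Return (added, removed) session names between two snapshots."""
    return sorted(after - before), sorted(before - after)
```

Take one snapshot, drop and restore the connection, then take a second
snapshot. If diff_sessions() returns two empty lists, the session names
did not change; any names in "added" or "removed" mean sessions were
created or deleted in between.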


> 
> This is backed up by the fact that I see a lot of messages like:
> 

Maybe you should send the entire log.

> Oct 13 13:00:09 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:09 bumble1 iscsid: could not find session info for session11
> Oct 13 13:00:09 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:09 bumble1 iscsid: could not find session info for session11
> Oct 13 13:00:10 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:10 bumble1 iscsid: could not find session info for session5
> Oct 13 13:00:10 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:10 bumble1 iscsid: could not find session info for session5
> Oct 13 13:00:10 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:10 bumble1 iscsid: could not find session info for session10
> Oct 13 13:00:11 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:11 bumble1 iscsid: could not find session info for session11
> Oct 13 13:00:12 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:12 bumble1 iscsid: could not find session info for session11
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session13
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session17
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session14
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session18
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session15
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session16
> Oct 13 13:00:13 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:13 bumble1 iscsid: could not find session info for session19
> Oct 13 13:00:14 bumble1 iscsid: could not read session targetname: 5
> Oct 13 13:00:14 bumble1 iscsid: could not find session info for session20
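
When looking at a full log, it may help to tally how often each session
shows up in these errors, to see whether a few sessions fail repeatedly
or many different ones fail once. The following is a rough sketch of
such a tally; the function name count_missing_sessions is hypothetical,
and the pattern is taken only from the message format quoted above.

```python
import re
from collections import Counter

# Matches the "could not find session info" lines shown in the excerpt.
MISSING_RE = re.compile(
    r"iscsid: could not find session info for (session\d+)"
)


def count_missing_sessions(lines):
    """Count 'could not find session info' messages per session name."""
    counts = Counter()
    for line in lines:
        match = MISSING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Oct 13 13:00:09 bumble1 iscsid: could not read session targetname: 5",
        "Oct 13 13:00:09 bumble1 iscsid: could not find session info for session11",
        "Oct 13 13:00:10 bumble1 iscsid: could not find session info for session5",
        "Oct 13 13:00:11 bumble1 iscsid: could not find session info for session11",
    ]
    for name, hits in count_missing_sessions(sample).most_common():
        print(name, hits)
```

The "could not read session targetname" lines are ignored on purpose,
since they do not name the session.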
> 
> So not only are we doing I/O to discover info about new sessions, but
> existing sessions are going away, causing lots of error I/O output, again
> swamping the time taken to compute things.
> 
> 
>     >
>     > Bottom line: I did not find any way to significantly speed
>     > up the search other than the "cache the last session"
>     > patch I already posted.
> 
>     Can you explain the problem again? I thought originally you were
>     thinking it was due to the sysfs operations taking too long and then
>     compounded by them being repeated. However, I thought for the
>     discoveryd daemon process, sysfs.c is caching that info, so we are
>     not actually reading from sysfs every time.
> 
>     Is the issue just a normal old bad search cpu time type of issue and
>     not really sysfs read/scan operations taking a long time? If so, then
>     the patch makes sense.


Will reply to this once we figure out why you are hitting this problem
in the first place as I would like your patch but also want to figure
out why we cannot read existing sessions in case that is another bug.


> 
> 
> I know sysfs attributes are cached here, after spending a day playing
> with the code. And, as I said, I'm guessing as to the "/sysfs read delays"
> part, since I can't recreate the problem. But I'm sure it is not the sort
> causing the delay, since the supplied patch fixed the problem, and the
> sort is still present.
> 
> Think about it: sorting one or two hundred entries is not going to take
> very long compared to reading a dozen attributes for the new sessions
> since the last discoveryd scan.
> 
> It seems to me like the problem should be related to the discoveryd
> poll time: if that poll time is less than the time it takes to rescan
> sessions, then things will back up. So even if we didn't have
> a problem, for example, with the default rescan time of 30 seconds,
> what happens if we set the poll time to 10 or 5 seconds?
> 
> In simple terms, this patch just caches the last session, and
> sorts it to the front of the list.
> 
> After the patch, if there was still a CPU-bound issue, we could
> scan the cached session first before even building and sorting
> the session list. I will leave that for the next update, if and
> when it's needed.
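
The "cache the last session and sort it to the front" idea described
above can be sketched roughly as follows. This is not the posted patch
and not open-iscsi code; it is a minimal Python illustration, and the
class name SessionFinder, the last_hit field, and the matches predicate
are all invented for the example. The predicate stands in for the
expensive per-session check (reading attributes and comparing them).

```python
class SessionFinder:
    """Search a session list, trying the previously matched session first."""

    def __init__(self):
        # Hypothetical cache: name of the session that matched last time.
        self.last_hit = None

    def find(self, sessions, matches):
        """Return the first session for which matches(session) is true.

        sessions: iterable of session names, e.g. 'session11'
        matches:  callable doing the (possibly expensive) per-session check
        """
        ordered = sorted(sessions)
        if self.last_hit in ordered:
            # Move the cached session to the front so it is checked first.
            ordered.remove(self.last_hit)
            ordered.insert(0, self.last_hit)
        for session in ordered:
            if matches(session):
                self.last_hit = session
                return session
        # Nothing matched, so the cached entry is stale.
        self.last_hit = None
        return None
```

On a repeated search for the same session, the predicate runs once
instead of once per entry ahead of it in the sorted list. The list is
still built and sorted on every call, which matches the description
above; checking the cached session before building the list would be the
further step mentioned as a possible later update.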
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "open-iscsi" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected]
> <mailto:[email protected]>.
> To post to this group, send email to [email protected]
> <mailto:[email protected]>.
> Visit this group at http://groups.google.com/group/open-iscsi.
> For more options, visit https://groups.google.com/d/optout.
