On Friday, November 21, 2014 12:37:59 AM UTC-8, Uli wrote:
>
> >>> The Lee-Man <[email protected] <javascript:>> schrieb am 20.11.2014 
> um 18:07 in 
> Nachricht <[email protected] 
> <javascript:>>: 
> > On Wednesday, November 19, 2014 11:23:10 PM UTC-8, Uli wrote: 
> >> 
> >> >>> The Lee-Man <[email protected] <javascript:>> schrieb am 
> 19.11.2014 
> >> um 19:35 in 
> >> Nachricht <[email protected] 
> <javascript:> 
> >> <javascript:>>: 
> >> > On Tuesday, November 18, 2014 10:55:31 PM UTC-8, Uli wrote: 
> >> >> 
> >> >> >>> Lee Duncan <[email protected] <javascript:>> schrieb am 
> >> 18.11.2014 
> >> >> um 22:35 in 
> >> >> Nachricht <1416346536-18198-1-git-seemail-###-duncan@ 
> >> <javascript:>###>: 
> >> >> > The following patch fixes a problem where the CPU becomes compute 
> >> bound 
> >> >> > when rediscovering targets, when there are hundreds of sessions. 
> >> >> > 
> >> >> > When his occurs, most of the time is spent in the function 
> >> >> > iscsi_sysfs_for_each_session(). This function does a scandir(), 
> >> >> > sorted alphabetically, to get a list of sessions, then scans 
> >> >> > that list looking for a match. When there are hundreds of sesions 
> >> >> > this can take forever. 
> >> >> 
> >> >> 
> >> >> I wonder: What takes forever: reading hundreds of sysfs entries, 
> >> sorting 
> >> >> them, or looking for a match? I guess none of them should take 
> forever 
> >> >> unless the algorithm is really very bad. 
> >> >> 
> >> > 
> >> > The problem is not the sort, since the patch still does a sort of the 
> >> > directory entries. 
> >> > 
> >> > I believe the problem is in processing the sorted data, in 
> >> > iscsi_sysfs_for_each_session(). This function does dozens or more 
> >> > open/read/close cycles on sysfs attributes. Multiply that times 
> hundreds 
> >> of 
> >> > session, and you have tens of thousands I/O operations. This fix 
> ensures 
> >> > that, much of time, this loop is only gone through once. 
> >> 
> >> When the reason for the sort is just to find some extrema (min or max), 
> >> the sort is not needed. 
> > 
> > 
> > Actually, in general, that is not true. In the case where we've cached 
> an 
> > entry, that may or may 
> > not be true, depending on the use case. 
>
> What I was saying: If you just need to remember one entry from a list of 
> entries, there is no need to move entries around (as sorting does). 
>

I continue to understand what you are saying.

What I am trying to say, and evidently not doing very well,
is that the function where this change was made does not know what
sessions the caller wants. This is because this function is called
in different ways.

One of the ways this function is called is the find the current
session. But, again, the function that gets and sorts sessions
doesn't know this. This function is "generic", in that only:

1. gets a list of sessions
2. Sorts them (usually alphabetically, though I believe
    that's arbitrary), and
3. then scans the sessions, one at a time, filling in
   lots of attributes for each one, before seeing the
   the calling function wants this session or not

It's true that this does not seem efficient for the case where
we are just looking for the current session. But my contention
is that it is efficient enough, for the time being.

Could this whole section of code be redesigned? Yes, of course,
and I'm fairly sure it could be made faster. But that would take
a redesign of the places where these search functions are called,
not just the search function itself.

>
> >   
> > 
> >> Also when just needing some extreme entry (min or max) it makes little 
> >> sense to populate some array with all entries just to discard them all, 
> but 
> >> one. I don't know the details, but if all the other stuff isn't needed, 
> it 
> >> shouldn't be read. 
> >> 
> > 
> > It also does not hurt anything to sort the whole array. I am trying to 
> fix 
> > an issue where the CPU become completely compute-bound with hundreds (or 
> > more) sessions coming and going. 
>
> And I tried to understand why it's going to be CPU-bound, and how you came 
> up with your fix.
>

It becomes CPU bound from all slow I/O it's doing. Disc I/O is fast, 
but reads (e.e.g from /sys) are relatively slow. And when you have
more reads to do then time to do them (i.e. with new requests coming in),
things get backed up and the CPU can't keep up.

>
> > 
> > It sounds to me like you would like to redesign some of the code to make 
> it 
> > more efficient. While I believe in efficiency in general, I don't 
> believe 
> > the trade-off is worth the time here, since this is not the issue that 
> is 
> > causing problems. 
>
> I understand, but one quick and dirty fix, then another, and one more, and 
> at some time later you have a real mess. Redesigning code, especially if 
> the existing one shows to have issues, is quite normal to do (even though 
> not popular). 
>

I don't consider this a quick nor dirty fix, but perhaps I'm being picky 
about language.

My point is that this is a simple optimization.

And I did not come up with it. It came from a large SUSE customer.

I do not have the network nor the number of targets to be able
to perform such a test. So even if I did want to redesign this section,
I'd need to build up a target and network farm first.

And, honestly, there are other things that are actually broken that I
would probably spend my time on first.

If you would like to take a stab at redesigning this section (which
assumes you'd be able to test such a change), I would love to
help review such changes. 

>
> > 
> > 
> >> What I saw from your path is that you just cheat on the sort routine. 
> So 
> >> if getting the entries takes all the time (I guess it happens before 
> >> sorting), I don't see how you save a lot of time that way, _unless_ the 
> >> compare routine actually does I/O to compare the values which is bad, 
> >> because every value is compared more than once in a typical sort. 
> >> 
> > 
> > As I said, it is trying to populate hundreds of internal session objects 
> to 
> > find the one we need that takes so much time. 
>
> That why I questioned the sort in the beginning. 
>

The sort is a non-issue. Sorting even hundreds of entries takes very
little time compared to reading tens of hundreds of attributes, trying
to find the "right" entry. 

>
> > 
> > I believe part of the problem making more optimizations here is the fact 
> > that iscsi_sysfs_for_each_session() is called with a compare function 
> > pointer, so the routine does not know if the first entry in the sort 
> list 
> > is going to make the caller happy or not. So in order to make this code 
> > work in all current cases *and* fix the compute-bound problem, a small 
> > change that sorts the last-known session to the top of the list cannot 
> > break any code, and also solves my problem. 
> > 
> > Here are some actual numbers from testing this patch: 
> > 
> > For an idle system (no data or link bounce) with 120 targets the CPU 
> time 
> > improvement is about 3:1 after running 72 hours 
> > 
> > For a system with data and link bounce every 10 minutes the CPU time 
> > improvement is on the order to 100:1 after 24 hours.   
>
> I'm not saying your fix is not effective; I was just wondering whether you 
> really fixed the root of the problem.


Yes. The root of the problem is unneeded /sys I/O. Of course there are lots
of ways to fix any problem, but I believe this is a simple and clean
solution, without doing a total re-write.

>
>
> > The CPU time with the patch is linear with time.  Without the patch, 
> it's 
> > exponential and will break the system to its knees after 1 day of 
> bouncing 
> > links.   
> > Memory use was over 40G additional on the system without the patch. 
> > 
> > Here's sar data from the 2 systems (bumble1 has the patch; bumble2 does 
> > not).  This is with bouncing links after 24+ hours: 
> > 
> > bumble1:~ # sar -u 1 1000 
> > Linux 2.6.32.54-0.23.TDC.1.R.2-default (bumble1)        11/13/14       
> >  _x86_64_ 
> > 
> > 10:52:16        CPU     %user     %nice   %system   %iowait    %steal   
>   
> > %idle 
> > 10:52:17        all      0.00      0.00      0.02      0.00      0.00   
>   
> > 99.98 
> > 10:52:18        all      0.00      0.00      0.07      0.00      0.00   
>   
> > 99.93 
> > 10:52:19        all      0.00      0.00      0.04      0.00      0.00   
>   
> > 99.96 
> > 10:52:20        all      0.00      0.00      0.16      0.00      0.00   
>   
> > 99.84 
> > 10:52:21        all      0.00      0.00      0.04      0.00      0.00   
>   
> > 99.96 
> > 10:52:22        all      0.00      0.00      0.10      0.00      0.00   
>   
> > 99.90 
> > 10:52:23        all      0.05      0.00      0.07      0.00      0.00   
>   
> > 99.89 
> > 
> > 
> > bumble2:~ # sar -u 1 10000 
> > Linux 2.6.32.54-0.23.TDC.1.R.2-default (bumble2)        11/13/14       
> >  _x86_64_ 
> > 
> > 10:52:20        CPU     %user     %nice   %system   %iowait    %steal   
>   
> > %idle 
> > 10:52:21        all     99.53      0.00      0.47      0.00      0.00   
>   
> >  0.00 
> > 10:52:22        all     99.56      0.00      0.44      0.00      0.00   
>   
> >  0.00 
> > 10:52:23        all     99.56      0.00      0.44      0.00      0.00   
>   
> >  0.00 
> > 10:52:24        all     99.28      0.00      0.72      0.00      0.00   
>   
> >  0.00 
> > 10:52:25        all     99.59      0.00      0.41      0.00      0.00   
>   
> >  0.00 
> > 
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.

Reply via email to