Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Willy Tarreau
Hi Nick,

On Tue, Apr 17, 2007 at 06:29:54AM +0200, Nick Piggin wrote:
(...)
> And my scheduler for example cuts down the amount of policy code and
> code size significantly. I haven't looked at Con's ones for a while,
> but I believe they are also much more straightforward than mainline...
> 
> For example, let's say all else is equal between them, then why would
> we go with the O(logN) implementation rather than the O(1)?

Of course, if this is the case, the question will be raised. But as a
general rule, I don't see much potential in an O(1) design for finely tuning
scheduling according to several criteria. With an O(logN) design, you can
adjust scheduling in real time at very low cost; better handling of varying
priorities or of fork() comes to mind.
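
To illustrate the kind of cheap re-tuning an ordered queue allows, here is a
toy sketch in plain C (a binary min-heap stands in for the kernel's tree, and
all names are hypothetical - this is not scheduler code): re-keying a task and
restoring order is an O(logN) operation.

/* Toy sketch: an ordered run queue as a binary min-heap.  "Retuning" a
 * task (new priority, fork penalty, deadline...) is just re-keying it
 * and restoring heap order, which costs O(logN). */
#include <stdio.h>

#define MAXN 64

static double key[MAXN];        /* e.g. a virtual deadline */
static int nr;

static void swap(int a, int b)
{
	double t = key[a]; key[a] = key[b]; key[b] = t;
}

static void sift_up(int i)
{
	while (i > 0 && key[(i - 1) / 2] > key[i]) {
		swap(i, (i - 1) / 2);
		i = (i - 1) / 2;
	}
}

static void sift_down(int i)
{
	for (;;) {
		int l = 2 * i + 1, r = l + 1, m = i;

		if (l < nr && key[l] < key[m]) m = l;
		if (r < nr && key[r] < key[m]) m = r;
		if (m == i) break;
		swap(i, m);
		i = m;
	}
}

static void push(double k) { key[nr] = k; sift_up(nr++); }

static double pop_min(void)
{
	double k = key[0];

	key[0] = key[--nr];
	sift_down(0);
	return k;
}

int main(void)
{
	push(3.0); push(1.0); push(2.5);
	key[0] += 4.0;          /* penalize the head task in place... */
	sift_down(0);           /* ...and restore order in O(logN) */
	printf("next: %.1f\n", pop_min());      /* prints 2.5 */
	return 0;
}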

Regards,
Willy



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Monday 16 April 2007, Con Kolivas wrote:

And I snipped.  Sorry, fellas.

Con's original submission was, to me, quite an improvement.  But I have to say 
it, and no denigration of your efforts is intended, Con: you did 'pull the 
trigger' and get this thing rolling by scratching the itch & drawing 
attention to an ugly lack of user interactivity that had crept into the 2.6 
family.  So from me to Con, a tip of the hat and a deep bow in your 
direction; thank you.  Now you have done what you aimed to do, so please get 
well.

I've now been through most of an amanda session using Ingo's "CFS" and I have 
to say that it is another improvement over your 0.40 that is just as 
obvious as your first patch was against the stock scheduler.  No other 
scheduler yet has allowed full utilization of the cpu while maintaining 
user interactivity as well as this one has; my cpu is running about 5 
degrees F hotter just from this effect alone.  gzip, if the rest of the 
system is between tasks, is consistently showing around 95%, but let 
anything else stick up its hand, like procmail etc., and gzip dutifully 
steps aside, dropping into the 40% range until procmail and spamd are done, 
at which point there is no rest for the wicked and the cpu never gets a 
chance to cool.

There was, just now, a pause of about 2 seconds, while amanda moved a tarball 
from the holding disk area on /dev/hda to the vtapes disk on /dev/hdd, so 
that would have been an I/O bound situation.

This one, Ingo, even without any other patches (I think I did see one go by 
in this thread which I didn't apply), is a definite keeper.  Sweet, even.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
A word to the wise is enough.
-- Miguel de Cervantes


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Davide Libenzi
On Mon, 16 Apr 2007, Pavel Pisa wrote:

> I cannot help but report results with the GAVL
> tree algorithm here as another race competitor.
> I believe that it is a better solution for large priority
> queues than RB-trees and even heap trees. On the other hand,
> it is disputable whether the scheduler needs such
> scalability. The AVL heritage guarantees lower height,
> which results in shorter search times and could
> be profitable for other uses in the kernel.
> 
> The GAVL algorithm is AVL tree based, so it does not suffer,
> as TR does, when priorities have "infinite" granularity. It allows
> use in a generalized case where the tree is not fully balanced,
> which makes it possible to cut the first item without rebalancing.
> In the worst case this degrades the tree by one more level
> (than a non-degraded AVL tree gives), which is still
> considerably better than an RB-tree's maximum.
> 
> http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

Here are the results on my Opteron 252:

Testing N=1
gavl_cfs = 187.20 cycles/loop
CFS = 194.16 cycles/loop
TR  = 314.87 cycles/loop
CFS = 194.15 cycles/loop
gavl_cfs = 187.15 cycles/loop

Testing N=2
gavl_cfs = 268.94 cycles/loop
CFS = 305.53 cycles/loop
TR  = 313.78 cycles/loop
CFS = 289.58 cycles/loop
gavl_cfs = 266.02 cycles/loop

Testing N=4
gavl_cfs = 452.13 cycles/loop
CFS = 518.81 cycles/loop
TR  = 311.54 cycles/loop
CFS = 516.23 cycles/loop
gavl_cfs = 450.73 cycles/loop

Testing N=8
gavl_cfs = 609.29 cycles/loop
CFS = 644.65 cycles/loop
TR  = 308.11 cycles/loop
CFS = 667.01 cycles/loop
gavl_cfs = 592.89 cycles/loop

Testing N=16
gavl_cfs = 686.30 cycles/loop
CFS = 807.41 cycles/loop
TR  = 317.20 cycles/loop
CFS = 810.24 cycles/loop
gavl_cfs = 688.42 cycles/loop

Testing N=32
gavl_cfs = 756.57 cycles/loop
CFS = 852.14 cycles/loop
TR  = 301.22 cycles/loop
CFS = 876.12 cycles/loop
gavl_cfs = 758.46 cycles/loop

Testing N=64
gavl_cfs = 831.97 cycles/loop
CFS = 997.16 cycles/loop
TR  = 304.74 cycles/loop
CFS = 1003.26 cycles/loop
gavl_cfs = 832.83 cycles/loop

Testing N=128
gavl_cfs = 897.33 cycles/loop
CFS = 1030.36 cycles/loop
TR  = 295.65 cycles/loop
CFS = 1035.29 cycles/loop
gavl_cfs = 892.51 cycles/loop

Testing N=256
gavl_cfs = 963.17 cycles/loop
CFS = 1146.04 cycles/loop
TR  = 295.35 cycles/loop
CFS = 1162.04 cycles/loop
gavl_cfs = 966.31 cycles/loop

Testing N=512
gavl_cfs = 1029.82 cycles/loop
CFS = 1218.34 cycles/loop
TR  = 288.78 cycles/loop
CFS = 1257.97 cycles/loop
gavl_cfs = 1029.83 cycles/loop

Testing N=1024
gavl_cfs = 1091.76 cycles/loop
CFS = 1318.47 cycles/loop
TR  = 287.74 cycles/loop
CFS = 1311.72 cycles/loop
gavl_cfs = 1093.29 cycles/loop

Testing N=2048
gavl_cfs = 1153.03 cycles/loop
CFS = 1398.84 cycles/loop
TR  = 286.75 cycles/loop
CFS = 1438.68 cycles/loop
gavl_cfs = 1149.97 cycles/loop


There seems to be some difference from your numbers. This is with:

gcc version 4.1.2

and -O2. But then, an Opteron can behave quite differently from a Duron on 
a bench like this ;)



- Davide




Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread hui
On Sun, Apr 15, 2007 at 09:25:07AM -0700, Arjan van de Ven wrote:
> Now this doesn't mean that people shouldn't be nice to each other, not
> cooperate or steal credits, but I don't get the impression that that is
> happening here. Ingo is taking part in the discussion with a counter
> proposal for discussion *on the mailing list*. What more do you want??

Con should have been CCed from the first moment this was put into motion,
to limit the perception of exclusion. That was mistake number one, and a
big-time failure to understand this dynamic. After all, it was Con's idea. Why
the hell he was excluded from Ingo's development process is baffling to
me and (most likely) to him.

He put a lot of effort into SD, and his experience with scheduling
should still be seriously considered in this development process even if
he doesn't write a single line of code from this moment on.

What should have happened is that our very busy associate at RH by the
name of Ingo Molnar should have leveraged more of Con's and Bill's work
and used them as a proxy for his own ideas. They would have loved to have
contributed more, and our very busy Ingo Molnar would have gotten a lot
of his work and ideas implemented without even opening a single
source file for editing. They would have happily done this work for
Ingo. Ingo could have been used for something else more important, like
making KVM less of a freaking ugly hack, and we all would have benefited
from this.

He could have been working on SystemTap so that you stop losing accounts
to Sun and Solaris 10's DTrace. He could have been working with Riel to
fix your butt-ugly page scanning problem causing horrible contention, via
the Clock/Pro algorithm, etc. He could have been fixing the ugly futex
rwsem mapping problem that's killing -rt and anything that uses POSIX
threads. He could have created a userspace thread control block (TCB)
with Mr. Drepper so that we can turn off preemption in userspace
(userspace per-CPU local storage) and implement a very quick non-kernel-
crossing implementation of priority ceilings (userspace check for priority
and flags at preempt_schedule() in the TCB) so that our -rt POSIX API
doesn't suck donkey shit... Need I say more?

As programmers like Ingo get spread more thinly, he needs super-smart
folks like Bill Irwin and Con to help him out, and he must learn to resist
NIHing other folks' stuff out of some weird fear. When this happens, folks
like Ingo must learn to "facilitate" development in addition to implementing
it with those kinds of folks.

It takes time and practice to learn to entrust folks to do things for him.
Ingo is the best channel for getting new Linux kernel ideas communicated
to Linus. His value goes beyond just code; he is often the
biggest hammer we have in the Linux community for getting stuff into the
kernel. "Facilitation" of others is something that solo programmers must
learn as groups like the Linux kernel community get larger and larger every
year.

Understand? Are we in embarrassing agreement here?

bill



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 01:05, Ingo Molnar wrote:
> * Con Kolivas <[EMAIL PROTECTED]> wrote:
> > 2. Since then I've been thinking/working on a cpu scheduler design
> > that takes away all the guesswork out of scheduling and gives very
> > predictable, as fair as possible, cpu distribution and latency while
> > preserving as solid interactivity as possible within those confines.
>
> yeah. I think you were right on target with this call.

Yay thank goodness :) It's time to fix the damn cpu scheduler once and for 
all. Everyone uses this; it's no minor driver or $bigsmp or $bigram or 
$small_embedded_RT_hardware feature.

> I've applied the 
> sched.c change attached at the bottom of this mail to the CFS patch, if
> you don't mind. (or feel free to suggest some other text instead.)

>   *  2003-09-03   Interactivity tuning by Con Kolivas.
>   *  2004-04-02   Scheduler domains code by Nick Piggin
> + *  2007-04-15   Con Kolivas was dead right: fairness matters! :)

LOL that's awful. I'd prefer something meaningful like "Work begun on 
replacing all interactivity tuning with a fair virtual-deadline design by Con 
Kolivas".

While you're at it, it's worth getting rid of a few slightly pointless name 
changes too. Don't rename SCHED_NORMAL yet again, and don't call all your 
things sched_fair blah_fair __blah_fair and so on. It means that anything 
else is by proxy going to be considered unfair. Leave SCHED_NORMAL as is, 
replace the use of the word _fair with _cfs. I don't really care how many 
copyright notices you put into our already noisy bootup but it's redundant 
since there is no choice; we all get the same cpu scheduler.

> > 1. I tried in vain some time ago to push a working extensible
> > pluggable cpu scheduler framework (based on wli's work) for the linux
> > kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he
> > didn't like it) as being absolutely the wrong approach and that we
> > should never do that. [...]
>
> i partially replied to that point to Will already, and i'd like to make
> it clear again: yes, i rejected plugsched 2-3 years ago (which already
> drifted away from wli's original codebase) and i would still reject it
> today.

No, that was just me being flabbergasted by what appeared to be you posting 
your own plugsched. Note that nowhere in the 40 iterations of RSDL->SD did I 
ask for or suggest plugsched. I said in my first announcement that my aim was to 
create a scheduling policy robust enough for all situations rather than 
fantastic a lot of the time and awful sometimes. There are plenty of people 
ready to throw out arguments for plugsched now, and I don't have the energy to 
continue that fight (I never did, really).

But my question still stands about this comment:

>   case, all of SD's logic could be added via a kernel/sched_sd.c module
>   as well, if Con is interested in such an approach. ]

What exactly would be the purpose of such a module that governs nothing in 
particular? Since, by your own admission, there'll be no pluggable scheduler, it 
has no control over SCHED_NORMAL, and it would require another scheduling policy 
to govern; there is no express way to use such a policy at the moment, and 
people tend to just use the default without great effort. 

> First and foremost, please don't take such rejections too personally - i
> had my own share of rejections (and in fact, as i mentioned it in a
> previous mail, i had a fair number of complete project throwaways:
> 4g:4g, in-kernel Tux, irqrate and many others). I know that they can
> hurt and can demoralize, but if i don't like something it's my job to
> tell that.

Hmm? No, that's not what this is about. Remember dynticks, which was not 
originally my code, but which I tried to bring up to mainline standard and 
fought with for months? You came along with yet another rewrite from scratch, 
and the flaws in the design I was working with were obvious, so I instantly 
bowed down to that and never touched my code again. I didn't ask for credit 
back then, but I obviously brought the requirement for a no-idle-tick 
implementation to the table.

> My view about plugsched: first please take a look at the latest
> plugsched code:
>
>   http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch
>
>   26 files changed, 8951 insertions(+), 1495 deletions(-)
>
> As an experiment i've removed all the add-on schedulers (both the core
> and the include files, only kept the vanilla one) from the plugsched
> patch (and the makefile and kconfig complications, etc), to see the
> 'infrastructure cost', and it still gave:
>
>   12 files changed, 1933 insertions(+), 1479 deletions(-)

I do not see extra code per se as being a bad thing. I've heard it said a few 
times before: "ever notice how, when the correct solution is done, it is a lot 
more code than the quick hack that ultimately fails?". Insert long-winded 
discussion of perfect being the enemy of good here, _but_ I'm not arguing 
perfect versus good, I'm talking about solid code 
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Mon, Apr 16, 2007 at 01:15:27PM +1000, Con Kolivas wrote:
> On Monday 16 April 2007 12:28, Nick Piggin wrote:
> > So, on to something productive, we have 3 candidates for a new scheduler so
> > far. How do we decide which way to go? (and yes, I still think switchable
> > schedulers is wrong and a copout) This is one area where it is virtually
> > impossible to discount any decent design on correctness/performance/etc.
> > and even testing in -mm isn't really enough.
> 
> We're in agreement! YAY!
> 
> Actually this is simpler than that. I'm taking SD out of the picture. It has 
> served its purpose of proving that we need to seriously address all the 
> scheduling issues and did more than a half-decent job at it. Unfortunately I 
> also cannot sit around supporting it forever by myself. My own life is more 
> important, so consider SD not even running the race any more.
> 
> I'm off to continue maintaining permanently-out-of-tree leisurely code at my 
> own pace. What's more, I think I'll just stick to staircase Gen I version blah 
> and shelve SD and try to have fond memories of SD as an intellectual 
> prompting exercise only.

Well I would hope that _if_ we decide to switch schedulers, then you
get a chance to field something (and I hope you will decide to and have
time to), and I hope we don't rush into the decision.

We've had the current scheduler for so many years now that it is much
more important to make sure we take the time to do the right thing
rather than absolutely have to merge a new scheduler right now ;)



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 12:28, Nick Piggin wrote:
> So, on to something productive, we have 3 candidates for a new scheduler so
> far. How do we decide which way to go? (and yes, I still think switchable
> schedulers is wrong and a copout) This is one area where it is virtually
> impossible to discount any decent design on correctness/performance/etc.
> and even testing in -mm isn't really enough.

We're in agreement! YAY!

Actually this is simpler than that. I'm taking SD out of the picture. It has 
served its purpose of proving that we need to seriously address all the 
scheduling issues and did more than a half-decent job at it. Unfortunately I 
also cannot sit around supporting it forever by myself. My own life is more 
important, so consider SD not even running the race any more.

I'm off to continue maintaining permanently-out-of-tree leisurely code at my own 
pace. What's more, I think I'll just stick to staircase Gen I version blah 
and shelve SD and try to have fond memories of SD as an intellectual 
prompting exercise only.

-- 
-ck


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
William Lee Irwin III wrote:
>> One of the reasons I never posted my own code is that it never met its
>> own design goals, which absolutely included switching on the fly. I
>> think Peter Williams may have done something about that.
>> It was my hope
>> to be able to do insmod sched_foo.ko until it became clear that the
>> effort it was intended to assist wasn't going to get even the limited
>> hardware access required, at which point I largely stopped working on
>> it.

On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> I didn't, but some students did.
> In a previous life, I did implement a runtime-configurable CPU 
> scheduling mechanism (implemented on Tru64, Solaris and Linux) that 
> allowed schedulers to be loaded as modules at run time.  This was 
> released commercially on Tru64 and Solaris.  So I know that it can be done.
> I have thought about doing something similar for the SPA schedulers, 
> which differ in only small ways from each other, but I lack the motivation.

Driver models for scheduling are not so far out. AFAICS it's largely a
tug-of-war over design goals, e.g. maintaining per-cpu runqueues and
switching out intra-queue policies vs. switching out whole-system
policies, SMP handling and all. Whether this involves load balancing
depends strongly on e.g. whether you have per-cpu runqueues. A 2.4.x
scheduler module, for instance, would not have a load balancer at all,
as it has only one global runqueue. There are other sorts of policies
wanting significant changes to SMP handling vs. the stock load
balancing.


William Lee Irwin III wrote:
>> I'm not sure what happened there. It wasn't a big enough patch to take
>> hits in this area due to getting overwhelmed by the programming burden
>> like some other efforts of mine. Maybe things started getting ugly once
>> on-the-fly switching entered the picture. My guess is that Peter Williams
>> will have to chime in here, since things have diverged enough from my
>> one-time contribution 4 years ago.

On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> From my POV, the current version of plugsched is considerably simpler 
> than it was when I took the code over from Con as I put considerable 
> effort into minimizing code overlap in the various schedulers.
> I also put considerable effort into minimizing any changes to the load 
> balancing code (something Ingo seems to think is a deficiency) and the 
> result is that plugsched allows "intra run queue" scheduling to be 
> easily modified WITHOUT affecting load balancing.  To my mind scheduling 
> and load balancing are orthogonal and keeping them that way simplifies 
> things.

ISTR rearranging things for Con in such a fashion that it no longer
worked out of the box (though that wasn't the intention; restructuring it
to be more suited to his purposes was), and that's what he worked off of
afterward. I don't remember very well what changed there as I clearly
invested less effort there than the prior versions. Now that I think of
it, that may have been where the sample policy demonstrating scheduling
classes was lost.


On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> As Ingo correctly points out, plugsched does not allow different 
> schedulers to be used per CPU but it would not be difficult to modify it 
> so that they could.  Although I've considered doing this over the years 
> I decided not to as it would just increase the complexity and the amount 
> of work required to keep the patch set going.  About six months ago I 
> decided to reduce the amount of work I was doing on plugsched (as it was 
> obviously never going to be accepted) and now only publish patches 
> against the vanilla kernel's major releases (and the only reason that I 
> kept doing that is that the download figures indicated that about 80 
> users were interested in the experiment).

That's a rather different goal from what I was going on about with it,
so it's all diverged quite a bit. Where I had a significant need for
mucking with the entire concept of how SMP was handled, this is rather
different. At this point I'm questioning the relevance of my own work,
though it was already relatively marginal as it started life as an
attempt at a sort of debug patch to help gang scheduling (which is in
itself a rather marginally relevant feature to most users) code along.


On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> PS I no longer read LKML (due to time constraints) and would appreciate 
> it if I could be CC'd on any e-mails suggesting scheduler changes.
> PPS I'm just happy to see that Ingo has finally accepted that the 
> vanilla scheduler was badly in need of fixing, and I don't really care who 
> fixes it.
> PPPS Different schedulers for different aims (i.e. server or work 
> station) do make a difference.  E.g. the spa_svr scheduler in plugsched 
> does about 1% better on kernbench than the next best scheduler in the bunch.
> PPPPS Con, fairness isn't always best as 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Sun, Apr 15, 2007 at 04:31:54PM -0500, Matt Mackall wrote:
> On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
>  
> > 4) the good thing that happened to I/O, after years of stagnation isn't
> >I/O schedulers. The good thing that happened to I/O is called Jens
> >Axboe. If you care about the I/O subsystem then print that name out 
> >and hang it on the wall. That and only that is what mattered.
> 
> Disagree. Things didn't actually get interesting until Nick showed up
> with AS and got it in-tree to demonstrate the huge amount of room we
> had for improvement. It took several iterations of AS and CFQ (with a
> couple complete rewrites) before CFQ began to look like the winner.
> The resulting time-sliced CFQ was fairly heavily influenced by the
> ideas in AS.

Well to be fair, Jens had just implemented deadline, which got me
interested ;)

Actually, I would still like to be able to deprecate deadline in favour of
AS, because AS has a tunable that you can switch to turn off read
anticipation and revert to deadline behaviour (or something very close to it).

It would have been nice if CFQ were then a layer on top of AS that
implemented priorities (or vice versa). And then AS could be
deprecated and we'd be back to 1 primary scheduler.

Well, CFQ seems to be going in the right direction with that; however,
some large users still find AS faster for some reason...

Anyway, the moral of the story is that I think it would have been nice
if we hadn't proliferated IO schedulers; however, in practice it
isn't easy to just layer features on top of each other, and
keeping deadline helped a lot in being able to debug and examine
performance regressions and actually get code upstream. And this
was true even when it was only globally boot-time switchable.

I'd prefer if we kept a single CPU scheduler in mainline, because I
think that simplifies analysis and focuses testing. I think we can
have one that is good enough for everyone. But if the only other
option for progress is that Linus or Andrew just pull one out of a
hat, then I would rather merge all of them. Yes I think Con's
scheduler should get a fair go, ditto for Ingo's, mine, and anyone
else's.


> > nor was the non-modularity of some piece of code ever an impediment to 
> > competition. May i remind you of the pretty competitive SLAB allocator 
> > landscape, resulting in things like the SLOB allocator, written by 
> > yourself? ;-)
> 
> Thankfully no one came out and said "we don't want to balkanize the
> allocator landscape" when I submitted it or I probably would have just
> dropped it, rather than painfully dragging it along out of tree for
> years. I'm not nearly the glutton for punishment that Con is. :-P

I don't think this is a fault of the people or the code involved.
We just didn't have much collective drive to replace the scheduler,
and even less an idea of how to decide between any two of them.

I've kept nicksched around since 2003 or so and no hard feelings ;)



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Mon, Apr 16, 2007 at 08:52:33AM +1000, Con Kolivas wrote:
> On Monday 16 April 2007 05:00, Jonathan Lundell wrote:
> > On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:
> > > It's a really good thing, and it means that if somebody shows that
> > > your
> > > code is flawed in some way (by, for example, making a patch that
> > > people
> > > claim gets better behaviour or numbers), any *good* programmer that
> > > actually cares about his code will obviously suddenly be very
> > > motivated to
> > > out-do the out-doer!
> >
> > "No one who cannot rejoice in the discovery of his own mistakes
> > deserves to be called a scholar."
> 
> Lovely comment. I realise this is not truly directed at me but, clearly, in the 
> context in which it has been said, people will assume it is directed my way; so while 
> we're all spinning lkml-quality rhetoric, let me have a right of reply.
> 
> One thing I have never tried to do was to ignore bug reports. I'm forever 
> joking that I keep pulling code out of my arse to improve what I've done. 
> RSDL/SD was no exception; heck it had 40 iterations. The reason I could not 
> reply to bug report A with "Oh that is problem B so I'll fix it with code C" 
> was, as I've said many many times over, health related. I did indeed try to 
> fix many of them without spending hours replying to sometimes unpleasant 
> emails. If health wasn't an issue there might have been 1000 iterations of 
> SD.

Well what matters is the code and development. I don't think Ingo's
scheduler is the final word, although I worry that Linus might jump the
gun and merge something "just to give it a test", which we then get
stuck with :P

I don't know how anybody can think Ingo's new scheduler is anything but
a good thing (so long as it has to compete before being merged). And
that's coming from someone who wants *their* scheduler to get merged...
I think mine can compete ;) and if it can't, then I'd rather be using
the scheduler that beats it.


> There was only ever _one_ thing that I was absolutely steadfast on as a 
> concept that I refused to fix that people might claim was "a mistake I did 
> not rejoice in to be a scholar". That was that the _correct_ behaviour for a 
> scheduler is to be fair such that proportional slowdown with load is (using 
> that awful pun) a feature, not a bug.

If something is using more than a fair share of CPU time, over some macro
period, in order to be interactive, then definitely it should get throttled.
I've always maintained (since starting scheduler work) that the 2.6 scheduler
is horrible because it allows these cases where some things can get more CPU
time just by how they behave.

Glad people are starting to come around on that point.


So, on to something productive, we have 3 candidates for a new scheduler so
far. How do we decide which way to go? (and yes, I still think switchable
schedulers is wrong and a copout) This is one area where it is virtually
impossible to discount any decent design on correctness/performance/etc.
and even testing in -mm isn't really enough.



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Peter Williams

William Lee Irwin III wrote:

On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:

2) plugsched did not allow on the fly selection of schedulers, nor did
   it allow a per CPU selection of schedulers. IO schedulers you can 
   change per disk, on the fly, making them much more useful in
   practice. Also, IO schedulers (while definitely not being slow!) are 
   a lot less performance sensitive than CPU schedulers.


One of the reasons I never posted my own code is that it never met its
own design goals, which absolutely included switching on the fly. I
think Peter Williams may have done something about that.


I didn't, but some students did.

In a previous life, I did implement a runtime-configurable CPU 
scheduling mechanism (implemented on Tru64, Solaris and Linux) that 
allowed schedulers to be loaded as modules at run time.  This was 
released commercially on Tru64 and Solaris.  So I know that it can be done.


I have thought about doing something similar for the SPA schedulers, 
which differ in only small ways from each other, but I lack the motivation.



It was my hope
to be able to do insmod sched_foo.ko until it became clear that the
effort it was intended to assist wasn't going to get even the limited
hardware access required, at which point I largely stopped working on
it.


On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:

3) I/O schedulers are pretty damn clean code, and plugsched, at least
   the last version i saw of it, didn't come even close.


I'm not sure what happened there. It wasn't a big enough patch to take
hits in this area due to getting overwhelmed by the programming burden
like some other efforts of mine. Maybe things started getting ugly once
on-the-fly switching entered the picture. My guess is that Peter Williams
will have to chime in here, since things have diverged enough from my
one-time contribution 4 years ago.


From my POV, the current version of plugsched is considerably simpler 
than it was when I took the code over from Con as I put considerable 
effort into minimizing code overlap in the various schedulers.


I also put considerable effort into minimizing any changes to the load 
balancing code (something Ingo seems to think is a deficiency) and the 
result is that plugsched allows "intra run queue" scheduling to be 
easily modified WITHOUT affecting load balancing.  To my mind scheduling 
and load balancing are orthogonal and keeping them that way simplifies 
things.


As Ingo correctly points out, plugsched does not allow different 
schedulers to be used per CPU but it would not be difficult to modify it 
so that they could.  Although I've considered doing this over the years 
I decided not to as it would just increase the complexity and the amount 
of work required to keep the patch set going.  About six months ago I 
decided to reduce the amount of work I was doing on plugsched (as it was 
obviously never going to be accepted) and now only publish patches 
against the vanilla kernel's major releases (and the only reason that I 
kept doing that is that the download figures indicated that about 80 
users were interested in the experiment).


Peter
PS I no longer read LKML (due to time constraints) and would appreciate 
it if I could be CC'd on any e-mails suggesting scheduler changes.
PPS I'm just happy to see that Ingo has finally accepted that the 
vanilla scheduler was badly in need of fixing, and I don't really care who 
fixes it.
PPPS Different schedulers for different aims (i.e. server or work 
station) do make a difference.  E.g. the spa_svr scheduler in plugsched 
does about 1% better on kernbench than the next best scheduler in the bunch.
PPPPS Con, fairness isn't always best, as humans aren't very altruistic 
and we need to give unfair preference to interactive tasks in order to 
stop users flinging their PCs out the window.  But the current 
scheduler doesn't do this very well and is also not very good at 
fairness, so it needs to change.  But the changes need to address 
interactive response and fairness, not just fairness.

--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Sunday 15 April 2007, Mike Galbraith wrote:
>On Sun, 2007-04-15 at 12:58 -0400, Gene Heskett wrote:
>> Chuckle, possibly but then I'm not anything even remotely close to an
>> expert here Con, just reporting what I get.  And I just rebooted to
>> 2.6.21-rc6 + sched-mike-5.patch for grins and giggles, or frowns and
>> profanity as the case may call for.
>
>Erm, that patch is embarrassingly buggy, so profanity should dominate.
>
>   -Mike

Chuckle, ROTFLMAO even.

I didn't run it that long, as I immediately rebuilt and rebooted when I found 
I'd used the wrong patch; in fact I had tested that one and found it 
sub-optimal before I'd built and run Con's -0.40 version.  As for bugs of the 
type that make it to the screen or logs, I didn't see any.  OTOH, my eyesight 
is slowly going downhill, now 20/25.  It was 20/10 30 years ago.  Now that's 
reason for profanity...

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Unix weanies are as bad at this as anyone.
 -- Larry Wall in <[EMAIL PROTECTED]>


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
* William Lee Irwin III <[EMAIL PROTECTED]> wrote:
>> I've been suggesting testing CPU bandwidth allocation as influenced by 
>> nice numbers for a while now for a reason.

On Sun, Apr 15, 2007 at 09:57:48PM +0200, Ingo Molnar wrote:
> Oh I was very much testing "CPU bandwidth allocation as influenced by 
> nice numbers" - it's one of the basic things i do when modifying the 
> scheduler. An automated tool, while nice (all automation is nice) 
> wouldn't necessarily show such bugs though, because here too it needed 
> thousands of running tasks to trigger in practice. Any volunteers? ;)

Worse comes to worst, I might actually get around to doing it myself.
Any more detailed descriptions of the test for a rainy day?
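
For reference, a rough userspace harness in the spirit of the test being
discussed might look like the sketch below (hypothetical and illustrative
only; per Ingo, the real bug needed thousands of running tasks to trigger):
fork one CPU hog per nice level and compare how much work each gets done.

/* Hypothetical harness: one CPU hog per nice level; compare the work
 * each gets done in a fixed interval.  Run it under "taskset -c 0" so
 * the hogs actually compete for a single CPU. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/wait.h>

static volatile sig_atomic_t stop;

static void on_alarm(int sig)
{
	(void)sig;
	stop = 1;
}

int main(void)
{
	int nice_levels[] = { 0, 5, 10 };
	int i, n = sizeof(nice_levels) / sizeof(nice_levels[0]);

	for (i = 0; i < n; i++) {
		if (fork() == 0) {
			unsigned long work = 0;

			setpriority(PRIO_PROCESS, 0, nice_levels[i]);
			signal(SIGALRM, on_alarm);
			alarm(10);              /* measure for 10 seconds */
			while (!stop)
				work++;         /* burn CPU */
			printf("nice %2d: %lu loops\n",
			       nice_levels[i], work);
			exit(0);
		}
	}
	for (i = 0; i < n; i++)
		wait(NULL);
	return 0;
}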


-- wli


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Pavel Pisa
On Sunday 15 April 2007 00:38, Davide Libenzi wrote:
> Haven't looked at the scheduler code yet, but for a similar problem I use
> a time ring. The ring has Ns (a power of 2 is better) slots (where tasks are
> queued - in my case they were some sort of timers), and it has a current
> base index (Ib), a current base time (Tb) and a time granularity (Tg). It
> also has a bitmap with bits telling you which slots contain queued tasks.
> An item (task) that has to be scheduled at time T will be queued in the
> slot:
>
> S = Ib + min((T - Tb) / Tg, Ns - 1);
>
> Items with T longer than Ns*Tg will be scheduled in the relative last slot
> (choosing a proper Ns and Tg can minimize this).
> Queueing is O(1) and de-queueing is O(Ns). You can play with Ns and Tg to
> suit your needs.
> This is a simple bench between time-ring (TR) and CFS queueing:
>
> http://www.xmailserver.org/smart-queue.c
>
> In my box (Dual Opteron 252):
>
> [EMAIL PROTECTED]:~$ ./smart-queue -n 8
> CFS = 142.21 cycles/loop
> TR  = 72.33 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 16
> CFS = 188.74 cycles/loop
> TR  = 83.79 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 32
> CFS = 221.36 cycles/loop
> TR  = 75.93 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 64
> CFS = 242.89 cycles/loop
> TR  = 81.29 cycles/loop
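
For reference, a minimal userspace sketch of the time ring Davide describes
above (Ns, Tg, Ib and Tb as in his text; the slot bitmap is omitted for
brevity, so dequeue scans - all names are illustrative):

/* Minimal sketch of the time-ring queueing described above: an item
 * due at time T lands in slot Ib + min((T - Tb) / Tg, Ns - 1). */
#include <stdio.h>

#define NS 8                    /* number of slots, power of two */

struct item {
	double t;               /* due time */
	struct item *next;
};

static struct item *slot[NS];
static unsigned ib;             /* current base index (Ib) */
static double tb;               /* current base time (Tb) */
static const double tg = 1.0;   /* time granularity (Tg) */

static void enqueue(struct item *it)
{
	unsigned d = (unsigned)((it->t - tb) / tg);

	if (d > NS - 1)
		d = NS - 1;     /* far-future items share the last slot */
	it->next = slot[(ib + d) & (NS - 1)];
	slot[(ib + d) & (NS - 1)] = it;
}

/* Dequeue scans forward from the base slot: O(Ns) worst case
 * (the bitmap in the real design avoids the empty-slot scan). */
static struct item *dequeue(void)
{
	unsigned i;

	for (i = 0; i < NS; i++) {
		unsigned s = (ib + i) & (NS - 1);

		if (slot[s]) {
			struct item *it = slot[s];

			slot[s] = it->next;
			ib = s;         /* advance the base */
			tb += i * tg;
			return it;
		}
	}
	return NULL;
}

int main(void)
{
	struct item a = { 2.5 }, b = { 0.3 };

	enqueue(&a);
	enqueue(&b);
	printf("%.1f %.1f\n", dequeue()->t, dequeue()->t); /* 0.3 2.5 */
	return 0;
}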

Hello all,

I cannot help but report results with the GAVL
tree algorithm here as another race competitor.
I believe that it is a better solution for large priority
queues than RB-trees and even heap trees. On the other hand,
it is disputable whether the scheduler needs such
scalability. The AVL heritage guarantees lower height,
which results in shorter search times and could
be profitable for other uses in the kernel.

The GAVL algorithm is AVL tree based, so it does not suffer,
as TR does, when priorities have "infinite" granularity. It allows
use in a generalized case where the tree is not fully balanced,
which makes it possible to cut the first item without rebalancing.
In the worst case this degrades the tree by one more level
(than a non-degraded AVL tree gives), which is still
considerably better than an RB-tree's maximum.

http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

The description behind the code is here:

http://cmp.felk.cvut.cz/~pisa/ulan/gavl.pdf

The code is part of the much broader uLUt library:

http://cmp.felk.cvut.cz/~pisa/ulan/ulut.pdf
http://sourceforge.net/project/showfiles.php?group_id=118937_id=130840

I have included all required GAVL code directly into smart-queue-v-gavl.c
to provide it for easy testing.
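
As an aside, the "cut the first item without rebalancing" trick is easy to
see on a plain unbalanced BST (a toy sketch, not the GAVL code itself): the
leftmost node has no left child, so unlinking it is pure pointer surgery.
GAVL does the same on an AVL tree and accepts at most one extra level of
height.

/* Toy BST sketch of the leftmost-cut idea: unlinking the minimum node
 * needs no rotations, since it has no left child. */
#include <stdio.h>
#include <stdlib.h>

struct node {
	int key;
	struct node *left, *right;
};

static struct node *insert(struct node *root, int key)
{
	if (!root) {
		struct node *n = calloc(1, sizeof(*n));

		n->key = key;
		return n;
	}
	if (key < root->key)
		root->left = insert(root->left, key);
	else
		root->right = insert(root->right, key);
	return root;
}

/* Unlink and return the minimum node: no rebalancing at all. */
static struct node *pop_min(struct node **root)
{
	struct node **link = root;
	struct node *min;

	while ((*link)->left)
		link = &(*link)->left;
	min = *link;
	*link = min->right;     /* splice the right subtree in */
	return min;
}

int main(void)
{
	struct node *root = NULL;
	int keys[] = { 5, 2, 8, 1, 3 }, i;

	for (i = 0; i < 5; i++)
		root = insert(root, keys[i]);
	while (root) {
		struct node *n = pop_min(&root);

		printf("%d ", n->key);          /* prints 1 2 3 5 8 */
		free(n);
	}
	printf("\n");
	return 0;
}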

These tests were run on my dated little computer, a Duron 600 MHz.
Tests are run twice to suppress run-order influence.

./smart-queue-v-gavl -n 1 -l 200
gavl_cfs = 55.66 cycles/loop
CFS = 88.33 cycles/loop
TR  = 141.78 cycles/loop
CFS = 90.45 cycles/loop
gavl_cfs = 55.38 cycles/loop

./smart-queue-v-gavl -n 2 -l 200
gavl_cfs = 82.85 cycles/loop
CFS = 104.18 cycles/loop
TR  = 145.21 cycles/loop
CFS = 102.74 cycles/loop
gavl_cfs = 82.05 cycles/loop

./smart-queue-v-gavl -n 4 -l 200
gavl_cfs = 137.45 cycles/loop
CFS = 156.47 cycles/loop
TR  = 142.00 cycles/loop
CFS = 152.65 cycles/loop
gavl_cfs = 139.38 cycles/loop

./smart-queue-v-gavl -n 10 -l 200
gavl_cfs = 229.22 cycles/loop   (WORSE)
CFS = 206.26 cycles/loop
TR  = 140.81 cycles/loop
CFS = 208.29 cycles/loop
gavl_cfs = 223.62 cycles/loop   (WORSE)

./smart-queue-v-gavl -n 100 -l 200
gavl_cfs = 257.66 cycles/loop
CFS = 329.68 cycles/loop
TR  = 142.20 cycles/loop
CFS = 319.34 cycles/loop
gavl_cfs = 260.02 cycles/loop

./smart-queue-v-gavl -n 1000 -l 200
gavl_cfs = 258.41 cycles/loop
CFS = 393.04 cycles/loop
TR  = 134.76 cycles/loop
CFS = 392.20 cycles/loop
gavl_cfs = 260.93 cycles/loop

./smart-queue-v-gavl -n 10000 -l 200
gavl_cfs = 259.45 cycles/loop
CFS = 605.89 cycles/loop
TR  = 196.69 cycles/loop
CFS = 622.60 cycles/loop
gavl_cfs = 262.72 cycles/loop

./smart-queue-v-gavl -n 100000 -l 200
gavl_cfs = 258.21 cycles/loop
CFS = 845.62 cycles/loop
TR  = 315.37 cycles/loop
CFS = 860.21 cycles/loop
gavl_cfs = 258.94 cycles/loop

The GAVL code has not been tuned with any "likely"/"unlikely"
constructs. It even carries some overhead from its generic
design which is not necessary for this use: it permanently
keeps a pointer to the last element, ensures
that insertion order is preserved for equal key values,
etc. But it still proves much better scalability than the
kernel's RB-tree code. On the other hand, it does not
encode the color/height in one of the pointers and requires
an additional field for the height.

Should the difference be due to some bug in my testing,
I would be interested in a correction. The test case
is probably oversimplified. I have run many different
tests against the GAVL code in the past to compare it with
different tree and queue implementations, and I have not found
a case with real performance degradation. On the other hand, there
are cases for small item counts where GAVL 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> 2) plugsched did not allow on the fly selection of schedulers, nor did
>it allow a per CPU selection of schedulers. IO schedulers you can 
>change per disk, on the fly, making them much more useful in
>practice. Also, IO schedulers (while definitely not being slow!) are 
>a lot less performance sensitive than CPU schedulers.

One of the reasons I never posted my own code is that it never met its
own design goals, which absolutely included switching on the fly. I
think Peter Williams may have done something about that. It was my hope
to be able to do insmod sched_foo.ko until it became clear that the
effort it was intended to assist wasn't going to get even the limited
hardware access required, at which point I largely stopped working on
it.


On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> 3) I/O schedulers are pretty damn clean code, and plugsched, at least
>the last version i saw of it, didn't come even close.

I'm not sure what happened there. It wasn't a big enough patch to take
hits in this area due to getting overwhelmed by the programming burden
like some other efforts of mine. Maybe things started getting ugly once
on-the-fly switching entered the picture. My guess is that Peter Williams
will have to chime in here, since things have diverged enough from my
one-time contribution 4 years ago.


-- wli


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ismail Dönmez
On Monday 16 April 2007 02:23:08 Arjan van de Ven wrote:
> On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote:
> > Hi,
> >
> > On Friday 13 April 2007 23:21:00 Ingo Molnar wrote:
> > > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
> > > [CFS]
> > >
> > > i'm pleased to announce the first release of the "Modular Scheduler
> > > Core and Completely Fair Scheduler [CFS]" patchset:
> > >
> > >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
> >
> > Tested this on top of Linus' GIT tree, but the system gets very
> > unresponsive during high disk i/o using ext3 as the filesystem; even
> > writing a 300MB file to a USB disk (an iPod, actually) has the same effect.
>
> just to make sure; this exact same workload but with the stock scheduler
> does not have this effect?
>
> if so, then it could well be that the scheduler is too fair for its own
> good (being really fair inevitably ends up not batching as much as one
> should, and batching is needed to get any kind of decent performance out
> of disks nowadays)

Tried make install in kdepim (which made the system sluggish with CFS), and 
the system is just fine (using CFQ).

Regards,
ismail


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Arjan van de Ven
On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote:
> Hi,
> On Friday 13 April 2007 23:21:00 Ingo Molnar wrote:
> > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
> > [CFS]
> >
> > i'm pleased to announce the first release of the "Modular Scheduler Core
> > and Completely Fair Scheduler [CFS]" patchset:
> >
> >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
> 
> Tested this on top of Linus' GIT tree, but the system gets very unresponsive 
> during high disk i/o using ext3 as the filesystem; even writing a 300MB file 
> to a USB disk (an iPod, actually) has the same effect.

just to make sure; this exact same workload but with the stock scheduler
does not have this effect?

if so, then it could well be that the scheduler is too fair for its own
good (being really fair inevitably ends up not batching as much as one
should, and batching is needed to get any kind of decent performance out
of disks nowadays)


-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 05:00, Jonathan Lundell wrote:
> On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:
> > It's a really good thing, and it means that if somebody shows that
> > your
> > code is flawed in some way (by, for example, making a patch that
> > people
> > claim gets better behaviour or numbers), any *good* programmer that
> > actually cares about his code will obviously suddenly be very
> > motivated to
> > out-do the out-doer!
>
> "No one who cannot rejoice in the discovery of his own mistakes
> deserves to be called a scholar."

Lovely comment. I realise this is not truly directed at me but, clearly, in the 
context in which it has been said, people will assume it is directed my way; so while 
we're all spinning lkml-quality rhetoric, let me have a right of reply.

One thing I have never tried to do was to ignore bug reports. I'm forever 
joking that I keep pulling code out of my arse to improve what I've done. 
RSDL/SD was no exception; heck it had 40 iterations. The reason I could not 
reply to bug report A with "Oh that is problem B so I'll fix it with code C" 
was, as I've said many many times over, health related. I did indeed try to 
fix many of them without spending hours replying to sometimes unpleasant 
emails. If health wasn't an issue there might have been 1000 iterations of 
SD.

There was only ever _one_ thing that I was absolutely steadfast on as a 
concept that I refused to fix that people might claim was "a mistake I did 
not rejoice in to be a scholar". That was that the _correct_ behaviour for a 
scheduler is to be fair such that proportional slowdown with load is (using 
that awful pun) a feature, not a bug. Now there are people who will still 
disagree violently with me on that. SD attempted to be a fairness first 
virtual-deadline design. If I failed on that front, then so be it (and at 
least one person certainly has said in lovely warm fuzzy friendly 
communication that I'm a global failure on all fronts with SD). But let me 
point out now that Ingo's shiny new scheduler is a fairness-first 
virtual-deadline design which will have proportional slowdown with load. So 
it will have a very similar feature. I dare anyone to claim that proportional 
slowdown with load is a bug, because I will no longer feel like I'm standing 
alone with a BFG9000 trying to defend my standpoint. Others can take up the 
post at last.

-- 
-ck


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ismail Dönmez
Hi,
On Friday 13 April 2007 23:21:00 Ingo Molnar wrote:
> [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
> [CFS]
>
> i'm pleased to announce the first release of the "Modular Scheduler Core
> and Completely Fair Scheduler [CFS]" patchset:
>
>http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch

Tested this on top of Linus' GIT tree, but the system gets very unresponsive 
during high disk i/o using ext3 as the filesystem; even writing a 300MB file 
to a USB disk (an iPod, actually) has the same effect.

Regards,
ismail




Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Matt Mackall
On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> 
> * Matt Mackall <[EMAIL PROTECTED]> wrote:
> 
> > Look at what happened with I/O scheduling. Opening things up to some 
> > new ideas by making it possible to select your I/O scheduler took us 
> > from 10 years of stagnation to healthy, competitive development, which 
> > gave us a substantially better I/O scheduler.
> 
> actually, 2-3 years ago we already had IO schedulers, and my opinion 
> against plugsched back then (also shared by Nick and Linus) was formed 
> very much with them in mind. There are at least 4 reasons why I/O 
> schedulers are different from CPU schedulers:

...

> 3) I/O schedulers are pretty damn clean code, and plugsched, at least
>the last version i saw of it, didn't come even close.

That's irrelevant. Plugsched was an attempt to get alternative
schedulers exposure in mainline. I know, because I remember
encouraging Bill to pursue it. Not only did you veto plugsched (which
may have been a perfectly reasonable thing to do), but you also vetoed
the whole concept of multiple schedulers in the tree too. "We don't
want to balkanize the scheduling landscape".

And that latter part is what I'm claiming has set us back for years.
It's not a technical argument but a strategic one. And it's just not a
good strategy.
 
> 4) the good thing that happened to I/O, after years of stagnation isn't
>I/O schedulers. The good thing that happened to I/O is called Jens
>Axboe. If you care about the I/O subsystem then print that name out 
>and hang it on the wall. That and only that is what mattered.

Disagree. Things didn't actually get interesting until Nick showed up
with AS and got it in-tree to demonstrate the huge amount of room we
had for improvement. It took several iterations of AS and CFQ (with a
couple complete rewrites) before CFQ began to look like the winner.
The resulting time-sliced CFQ was fairly heavily influenced by the
ideas in AS.

Similarly, things in scheduler land had been pretty damn boring until
Con finally got Andrew to take one of his schedulers for a spin.

> nor was the non-modularity of some piece of code ever an impediment to 
> competition. May i remind you of the pretty competitive SLAB allocator 
> landscape, resulting in things like the SLOB allocator, written by 
> yourself? ;-)

Thankfully no one came out and said "we don't want to balkanize the
allocator landscape" when I submitted it or I probably would have just
dropped it, rather than painfully dragging it along out of tree for
years. I'm not nearly the glutton for punishment that Con is. :-P

-- 
Mathematics is the supreme nostalgia of our time.


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Matt Mackall <[EMAIL PROTECTED]> wrote:

> Look at what happened with I/O scheduling. Opening things up to some 
> new ideas by making it possible to select your I/O scheduler took us 
> from 10 years of stagnation to healthy, competitive development, which 
> gave us a substantially better I/O scheduler.

actually, 2-3 years ago we already had IO schedulers, and my opinion 
against plugsched back then (also shared by Nick and Linus) was formed 
very much with them in mind. There are at least 4 reasons why I/O 
schedulers are different from CPU schedulers:

1) CPUs are a non-persistent resource shared by _all_ tasks and 
   workloads in the system. Disks are _persistent_ resources very much 
   attached to specific workloads. (If tasks had to be 'persistent' to
   the CPU they were started on we'd have much different scheduling
   technology, and there would be much less complexity.) More analogous 
   to CPU schedulers would perhaps be VM/MM schedulers, and those tend 
   to be hard to modularize in a technologically sane way too. (and 
   unlike disks there's no good generic way to attach VM/MM schedulers 
   to particular workloads.) So it's apples to oranges.

   in practice it comes down to having one good scheduler that runs all 
   workloads on a system reasonably well. And given that a very large 
   portion of systems run mixed workloads, the demand for one good 
   scheduler is pretty high. Meanwhile, i can run with mixed IO schedulers 
   just fine.

2) plugsched did not allow on the fly selection of schedulers, nor did
   it allow a per CPU selection of schedulers. IO schedulers you can 
   change per disk, on the fly, making them much more useful in
   practice. Also, IO schedulers (while definitely not being slow!) are 
   a lot less performance sensitive than CPU schedulers.

3) I/O schedulers are pretty damn clean code, and plugsched, at least
   the last version i saw of it, didn't come even close.

4) the good thing that happened to I/O, after years of stagnation isn't
   I/O schedulers. The good thing that happened to I/O is called Jens
   Axboe. If you care about the I/O subsystem then print that name out 
   and hang it on the wall. That and only that is what mattered.

all in all, while there are definitely uses (embedded would like to have 
a smaller/different scheduler, etc.), the technical case for 
modularization for the sake of selectability is a lot lower for CPU 
schedulers than it is for I/O schedulers.

nor was the non-modularity of some piece of code ever an impediment to 
competition. May i remind you of the pretty competitive SLAB allocator 
landscape, resulting in things like the SLOB allocator, written by 
yourself? ;-)

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Matt Mackall
On Sun, Apr 15, 2007 at 05:05:36PM +0200, Ingo Molnar wrote:
> so the rejection was on these grounds, and I still very much stand by 
> that position here and today: I didn't want to see the Linux scheduler 
> landscape balkanized, and I saw no technological reasons for the 
> complication that external modularization brings.

But "balkanization" is a good thing. "Monoculture" is a bad thing.

Look at what happened with I/O scheduling. Opening things up to some
new ideas by making it possible to select your I/O scheduler took us
from 10 years of stagnation to healthy, competitive development, which
gave us a substantially better I/O scheduler.

Look at what's happening right now with TCP congestion algorithms.
Decades of tweaking Reno slightly have now turned into a vibrant
research area with lots of radical alternatives. A winner will
eventually emerge, and it will probably look quite a bit different from
Reno.

Similar things have gone on since the beginning with filesystems on
Linux. Being able to easily compare filesystems head to head has been
immensely valuable in improving our 'core' Linux filesystems.

And what we've had up to now is a scheduler monoculture. Until Andrew
put RSDL in -mm, if people wanted to experiment with other schedulers,
they had to go well off the beaten path to do it. So all the people
who've been hopelessly frustrated with the mainline scheduler go off to
the -ck ghetto, or worse, stick with 2.4.

Whether your motivations have been protectionist or merely
shortsighted, you've stomped pretty heavily on alternative scheduler
development by completely rejecting the whole plugsched concept. If
we'd opened up mainline to a variety of schedulers _3 years ago_, we'd
probably have gotten to where we are today much sooner.

Hopefully, the next time Rik suggests pluggable page replacement
algorithms, folks will actually seriously consider it.

-- 
Mathematics is the supreme nostalgia of our time.


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* William Lee Irwin III <[EMAIL PROTECTED]> wrote:

> On Sun, Apr 15, 2007 at 09:20:46PM +0200, Ingo Molnar wrote:
> > so Linus was right: this was caused by scheduler starvation. I can 
> > see one immediate problem already: the 'nice offset' is not divided 
> > by nr_running as it should be. The patch below should fix this but I 
> > have yet to test it accurately; this change might well render 
> > nice levels unacceptably ineffective under high loads.
> 
> I've been suggesting testing CPU bandwidth allocation as influenced by 
> nice numbers for a while now for a reason.

Oh, I was very much testing "CPU bandwidth allocation as influenced by 
nice numbers" - it's one of the basic things I do when modifying the 
scheduler. An automated tool, while nice (all automation is nice), 
wouldn't necessarily show such bugs though, because here too it needed 
thousands of running tasks to trigger in practice. Any volunteers? ;)
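
For what it's worth, the core of such an automated tool would be small. Here is a minimal user-space sketch (purely illustrative: the nice levels, the 10-second run time and the three-hog setup are arbitrary assumptions, not an existing utility) that forks one CPU hog per nice level and compares how much work each one got done:

/* nice-bw.c: toy measurement of CPU bandwidth vs. nice level.
 * Illustrative sketch only -- all values are arbitrary. */
#include <signal.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t stop;

static void on_alarm(int sig) { (void)sig; stop = 1; }

int main(void)
{
	static const int level[] = { 0, 5, 10 };
	int fds[3][2];

	for (int i = 0; i < 3; i++) {
		pipe(fds[i]);
		if (fork() == 0) {
			unsigned long work = 0;

			signal(SIGALRM, on_alarm);
			setpriority(PRIO_PROCESS, 0, level[i]);
			alarm(10);		/* each hog runs for 10 seconds */
			while (!stop)		/* pure CPU burn */
				work++;
			write(fds[i][1], &work, sizeof(work));
			_exit(0);
		}
	}
	for (int i = 0; i < 3; i++) {
		unsigned long work;

		read(fds[i][0], &work, sizeof(work));
		printf("nice %2d: %lu loops\n", level[i], work);
	}
	while (wait(NULL) > 0)
		;
	return 0;
}

Run against a scheduler under test, ratios that shift between a run with few background tasks and one with thousands would flag exactly the kind of bug being discussed here.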

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> so Linus was right: this was caused by scheduler starvation. I can see 
> one immediate problem already: the 'nice offset' is not divided by 
> nr_running as it should be. The patch below should fix this but I have 
> yet to test it accurately; this change might well render nice 
> levels unacceptably ineffective under high loads.

erm, rather use the updated patch below if you want to try this on a 32-bit 
system. But ... I think you should wait until I have all this re-tested.

Ingo

---
 include/linux/sched.h |    2 +-
 kernel/sched_fair.c   |    4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -839,7 +839,7 @@ struct task_struct {
 
 	s64 wait_runtime;
 	u64 exec_runtime, fair_key;
-	s64 nice_offset, hog_limit;
+	s32 nice_offset, hog_limit;
 
 	unsigned long policy;
 	cpumask_t cpus_allowed;
Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -31,7 +31,9 @@ static void __enqueue_task_fair(struct r
 	int leftmost = 1;
 	long long key;
 
-	key = rq->fair_clock - p->wait_runtime + p->nice_offset;
+	key = rq->fair_clock - p->wait_runtime;
+	if (unlikely(p->nice_offset))
+		key += p->nice_offset / (rq->nr_running + 1);
 
 	p->fair_key = key;
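
As an aside, the effect of the division is easy to see in isolation. The following stand-alone sketch (illustrative only: the fair_clock, wait_runtime and nice_offset values are made up, only the formula comes from the hunk above) shows how the same nice offset shrinks as nr_running grows, and how the +1 in the divisor keeps the expression well-defined even when no other task is runnable at enqueue time:

#include <stdio.h>

/* Mirrors the key computation from the patch above; all inputs
 * are invented for illustration. */
static long long fair_key(long long fair_clock, long long wait_runtime,
			  long long nice_offset, unsigned long nr_running)
{
	long long key = fair_clock - wait_runtime;

	if (nice_offset)
		key += nice_offset / (long long)(nr_running + 1);
	return key;
}

int main(void)
{
	/* same task, same nice offset, rising load */
	for (unsigned long nr = 0; nr <= 1000; nr = nr ? nr * 10 : 1)
		printf("nr_running = %4lu  key = %lld\n",
		       nr, fair_key(1000000LL, 0LL, 512000LL, nr));
	return 0;
}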
 


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
On Sun, Apr 15, 2007 at 09:20:46PM +0200, Ingo Molnar wrote:
> so Linus was right: this was caused by scheduler starvation. I can see 
> one immediate problem already: the 'nice offset' is not divided by 
> nr_running as it should be. The patch below should fix this but I have yet 
> to test it accurately; this change might well render nice levels 
> unacceptably ineffective under high loads.

I've been suggesting testing CPU bandwidth allocation as influenced by
nice numbers for a while now for a reason.


-- wli


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> > to debug this, could you try to apply this add-on as well:
> > 
> >   http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch
> > 
> > with this patch applied you should have a /proc/sched_debug file 
> > that prints all runnable tasks and other interesting info from the 
> > runqueue.
> 
> I don't know if you have seen my mail from yesterday evening (here). I 
> found that changing keventd's priority fixed the problem. You may be 
> interested in the description. I sent it at 21:01 (+0200).

ah, indeed I missed that mail - the response to the patches was quite 
overwhelming (and I naively thought people don't do Linux hacking over 
the weekends anymore ;).

so Linus was right: this was caused by scheduler starvation. I can see 
one immediate problem already: the 'nice offset' is not divided by 
nr_running as it should be. The patch below should fix this but I have yet 
to test it accurately; this change might well render nice levels 
unacceptably ineffective under high loads.

Ingo

---
 kernel/sched_fair.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -31,7 +31,9 @@ static void __enqueue_task_fair(struct r
 	int leftmost = 1;
 	long long key;
 
-	key = rq->fair_clock - p->wait_runtime + p->nice_offset;
+	key = rq->fair_clock - p->wait_runtime;
+	if (unlikely(p->nice_offset))
+		key += p->nice_offset / rq->nr_running;
 
 	p->fair_key = key;
 


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Jonathan Lundell

On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:

> It's a really good thing, and it means that if somebody shows that your
> code is flawed in some way (by, for example, making a patch that people
> claim gets better behaviour or numbers), any *good* programmer that
> actually cares about his code will obviously suddenly be very motivated
> to out-do the out-doer!

"No one who cannot rejoice in the discovery of his own mistakes deserves
to be called a scholar."

--Don Foster, "literary sleuth", on retracting his attribution of "A
Funerall Elegye" to Shakespeare (it's more likely John Ford's work).



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Tim Tassonis

+   printk("Fair Scheduler: Copyright (c) 2007 Red Hat, Inc., Ingo 
Molnar\n");


So that's what all the fuss about the staircase scheduler is all about 
then! At last, I see your point.




   i'd like to give credit to Con Kolivas for the general approach here:
   he has proven via RSDL/SD that 'fair scheduling' is possible and that
   it results in better desktop scheduling. Kudos Con!



How pathetic can you get?

Tim, really looking forward to the CL final where Liverpool will beat 
the shit out of Scum (and there's a lot to be beaten out).




Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Willy Tarreau
Hi Ingo,

On Sun, Apr 15, 2007 at 07:55:55PM +0200, Ingo Molnar wrote:
> 
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
> 
> > Well, since I merged the fair-fork patch, I cannot reproduce (in fact, 
> > bash forks 1000 processes, then progressively execs scheddos, but it 
> > takes some time). So I'm rebuilding right now. But I think that Linus 
> > has an interesting clue about GPM and notification before switching 
> > the terminal. I think it was enabled in console mode. I don't know how 
> > that translates to frozen xterms, but let's attack the problems one at 
> > a time.
> 
> to debug this, could you try to apply this add-on as well:
> 
>   http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch
> 
> with this patch applied you should have a /proc/sched_debug file that 
> prints all runnable tasks and other interesting info from the runqueue. 

I don't know if you have seen my mail from yesterday evening (here). I
found that changing keventd's priority fixed the problem. You may be interested
in the description. I sent it at 21:01 (+0200).

> [ I've refreshed all the patches on the CFS webpage, so if this doesn't 
>   apply cleanly to your current tree then you'll probably have to 
>   refresh one of the patches. ]

Fine, I'll have a look. I already had to rediff the sched-fair-fork
patch last time.

> The output should look like this:
> 
>  Sched Debug Version: v0.01
>  now at 226761724575 nsecs
> 
>  cpu: 0
>    .nr_running         : 3
>    .raw_weighted_load  : 384
>    .nr_switches        : 13666
>    .nr_uninterruptible : 0
>    .next_balance       : 4294947416
>    .curr->pid          : 2179
>    .rq_clock           : 241337421233
>    .fair_clock         : 7503791206
>    .wait_runtime       : 2269918379
> 
>  runnable tasks:
>             task |  PID |   tree-key |   -delta |  waiting | switches
>  ---------------------------------------------------------------------
>  +cat              2179   7501930066   -1861140   1861140          2
>   loop_silent      2149   7503010354    -780852         0        911
>   loop_silent      2148   7503510048    -281158    280753        918

Nice.

> Now, for your workload the list should be considerably larger. If there's 
> starvation going on then the 'switches' field (number of context 
> switches) of one of the tasks would never increase while you have this 
> 'cannot switch consoles' problem.
> 
> Maybe you'll have to unapply the fair-fork patch to make it trigger 
> again. (fair-fork does not fix anything, so it probably just hides a 
> real bug.)
> 
> (I'm meanwhile busy running your scheddos utilities to reproduce it 
> locally as well :)

I discovered I had the frame-buffer enabled (I did not notice it at first
because I do not have the logo and the resolution is the same as text).
It's matroxfb with a G400, if that can help. It may be that it needs
some CPU time that it cannot get to clear the display before switching,
I don't know.

However, I won't try this right now; I'm deep in userland at the moment.

Regards,
Willy



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Linus Torvalds


On Sun, 15 Apr 2007, Mike Galbraith wrote:

> On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote:
> > 
> > He did exactly that and he did it with a patch. Nothing new here. This is 
> > how development on LKML proceeds when you have two or more competing 
> > designs. There's absolutely no need to get upset or hurt your feelings 
> > over it. It's not malicious, it's how we do Linux development.
> 
> Yes.  Exactly.  This is what it's all about, this is what makes it work.

I obviously agree, but I will also add that one of the most motivating 
things there *is* in open source is "personal pride".

It's a really good thing, and it means that if somebody shows that your 
code is flawed in some way (by, for example, making a patch that people 
claim gets better behaviour or numbers), any *good* programmer that 
actually cares about his code will obviously suddenly be very motivated to 
out-do the out-doer!

Does this mean that there will be tension and rivalry? Hell yes. But 
that's kind of the point. Life is a game, and if you aren't in it to win, 
what the heck are you still doing here?

As long as it's reasonably civil (I'm not personally a huge believer in 
being too polite or "politically correct", so I think the "reasonably" is 
more important than the "civil" part!), and as long as the end result is 
judged on TECHNICAL MERIT, it's all good.

We don't want to play politics. But encouraging people's competitive 
feelings? Oh, yes. 

Linus


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 12:58 -0400, Gene Heskett wrote:

> Chuckle, possibly but then I'm not anything even remotely close to an expert 
> here Con, just reporting what I get.  And I just rebooted to 2.6.21-rc6 + 
> sched-mike-5.patch for grins and giggles, or frowns and profanity as the case 
> may call for.

Erm, that patch is embarrassingly buggy, so profanity should dominate.

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> Well, since I merged the fair-fork patch, I cannot reproduce (in fact, 
> bash forks 1000 processes, then progressively execs scheddos, but it 
> takes some time). So I'm rebuilding right now. But I think that Linus 
> has an interesting clue about GPM and notification before switching 
> the terminal. I think it was enabled in console mode. I don't know how 
> that translates to frozen xterms, but let's attack the problems one at 
> a time.

to debug this, could you try to apply this add-on as well:

  http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch

with this patch applied you should have a /proc/sched_debug file that 
prints all runnable tasks and other interesting info from the runqueue. 

[ I've refreshed all the patches on the CFS webpage, so if this doesn't 
  apply cleanly to your current tree then you'll probably have to 
  refresh one of the patches. ]

The output should look like this:

 Sched Debug Version: v0.01
 now at 226761724575 nsecs

 cpu: 0
   .nr_running         : 3
   .raw_weighted_load  : 384
   .nr_switches        : 13666
   .nr_uninterruptible : 0
   .next_balance       : 4294947416
   .curr->pid          : 2179
   .rq_clock           : 241337421233
   .fair_clock         : 7503791206
   .wait_runtime       : 2269918379

 runnable tasks:
            task |  PID |   tree-key |   -delta |  waiting | switches
 ---------------------------------------------------------------------
 +cat              2179   7501930066   -1861140   1861140          2
  loop_silent      2149   7503010354    -780852         0        911
  loop_silent      2148   7503510048    -281158    280753        918

Now, for your workload the list should be considerably larger. If there's 
starvation going on then the 'switches' field (number of context 
switches) of one of the tasks would never increase while you have this 
'cannot switch consoles' problem.

Maybe you'll have to unapply the fair-fork patch to make it trigger 
again. (fair-fork does not fix anything, so it probably just hides a 
real bug.)

(I'm meanwhile busy running your scheddos utilities to reproduce it 
locally as well :)

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote:
> On Sun, 15 Apr 2007, Willy Tarreau wrote:
> > Ingo could have publicly spoken with them about his ideas of killing
> > the O(1) scheduler and replacing it with an rbtree-based one, and using
> > part of Bill's work to speed up development.
> 
> He did exactly that and he did it with a patch. Nothing new here. This is 
> how development on LKML proceeds when you have two or more competing 
> designs. There's absolutely no need to get upset or hurt your feelings 
> over it. It's not malicious, it's how we do Linux development.

Yes.  Exactly.  This is what it's all about, this is what makes it work.

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Sunday 15 April 2007, Con Kolivas wrote:
>On Monday 16 April 2007 01:16, Gene Heskett wrote:
>> On Sunday 15 April 2007, Pekka Enberg wrote:
>> >On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote:
>> >> The perception here is that there is this expectation
>> >> that sections of the Linux kernel are intentionally "churn squatted" to
>> >> prevent any other ideas from creeping in other than those of the owner of
>> >> that subsystem
>> >
>> >Strangely enough, my perception is that Ingo is simply trying to
>> >address the issues Mike's testing discovered in RSDL and SD. It's not
>> >surprising Ingo made it a separate patch set as Con has repeatedly
>> >stated that the "problems" are in fact by design and won't be fixed.
>>
>> I won't get into the middle of this just yet, not having decided which dog
>> I should bet on yet.  I've been running 2.6.21-rc6 + Con's 0.40 patch for
>> about 24 hours, it's been generally usable, but gzip still causes lots of 5
>> to 10+ second lags when it's running.  I'm coming to the conclusion that
>> gzip simply doesn't play well with others...
>
>Actually, Gene, I think you're being bitten here by something I/O bound since
>the cpu usage never tops out. If that's the case and gzip is dumping
>truckloads of writes then you're suffering something that irks me even more
>than the scheduler in Linux, and that's how much writes hurt just about
>everything else. Try your testcase with bzip2 instead (since that won't be
>I/O bound), or drop your dirty ratio to as low as possible, which helps a
>little bit (5% is the minimum)
>
>echo 5 > /proc/sys/vm/dirty_ratio
>
>and finally try the braindead noop i/o scheduler as well.
>
>echo noop > /sys/block/sda/queue/scheduler
>
>(replace sda with your drive obviously).
>
>I'd wager a big one that's what causes your gzip pain. If it wasn't for the 
>fact that I've decided to all but give up ever trying to provide code for 
>mainline again, trying my best to make writes hurt less on Linux would be my 
>next big thing [tm].

Chuckle, possibly, but then I'm not anything even remotely close to an expert 
here, Con; I'm just reporting what I get.  And I just rebooted to 2.6.21-rc6 + 
sched-mike-5.patch for grins and giggles, or frowns and profanity as the case 
may call for.

>Oh and for the others watching (points to vm hackers): I found a bug when
>playing with the dirty ratio code. If you modify it to allow it to drop below
> 5% but still above the minimum in the vm code, stalls happen somewhere in
> the vm where nothing much happens for sometimes 20 or 30 seconds in the
> worst-case scenario. I had to drop a patch in 2.6.19 that allowed the dirty
> ratio to be set ultra low because these stalls were gross.

I think I'd need a bit of tutoring on how to do that.  I recall that one other 
time, several weeks back, I thought I would try one of those famous "echo this 
> /proc/that" ideas that went by on this list, but even though I was root, 
apparently /proc was read-only AFAIWC.

>> Amazing to me, the cpu it's using stays generally below 80%, and often
>> below 60%, even while the kmail composer has a full sentence in its buffer
>> that it still hasn't shown me when I switch to the htop screen to check,
>> and back to the kmail screen to see if it's updated yet.  The screen switch
>> doesn't seem to lag so I don't think renicing X would be helpful.  Those
>> are the obvious lags, and I'll build & reboot to the CFS patch at some
>> point this morning (what's left of it, that is :).  And report in due time
>> of course

And now I wonder if I applied the right patch.  This one feels good ATM, but I 
don't think it's the CFS thingy.  No, I'm sure of it now; none of the patches 
I've saved say a thing about CFS.  Time to backtrack up the list, I guess; 
ignore me for the nonce.


-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Microsoft: Re-inventing square wheels

   -- From a Slashdot.org post


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 01:16, Gene Heskett wrote:
> On Sunday 15 April 2007, Pekka Enberg wrote:
> >On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote:
> >> The perception here is that there is this expectation that
> >> sections of the Linux kernel are intentionally "churn squatted" to
> >> prevent any other ideas from creeping in other than those of the owner of
> >> that subsystem
> >
> >Strangely enough, my perception is that Ingo is simply trying to
> >address the issues Mike's testing discovered in RSDL and SD. It's not
> >surprising Ingo made it a separate patch set as Con has repeatedly
> >stated that the "problems" are in fact by design and won't be fixed.
>
> I won't get into the middle of this just yet, not having decided which dog
> I should bet on yet.  I've been running 2.6.21-rc6 + Con's 0.40 patch for
> about 24 hours, it's been generally usable, but gzip still causes lots of 5
> to 10+ second lags when it's running.  I'm coming to the conclusion that
> gzip simply doesn't play well with others...

Actually, Gene, I think you're being bitten here by something I/O bound, since 
the cpu usage never tops out. If that's the case and gzip is dumping 
truckloads of writes then you're suffering something that irks me even more 
than the scheduler in Linux, and that's how much writes hurt just about 
everything else. Try your testcase with bzip2 instead (since that won't be 
I/O bound), or drop your dirty ratio to as low as possible, which helps a 
little bit (5% is the minimum)

echo 5 > /proc/sys/vm/dirty_ratio

and finally try the braindead noop i/o scheduler as well.

echo noop > /sys/block/sda/queue/scheduler

(replace sda with your drive obviously).

I'd wager a big one that's what causes your gzip pain. If it wasn't for the 
fact that I've decided to all but give up ever trying to provide code for 
mainline again, trying my best to make writes hurt less on Linux would be my 
next big thing [tm]. 

Oh and for the others watching (points to vm hackers): I found a bug when 
playing with the dirty ratio code. If you modify it to allow it to drop below 5% 
but still above the minimum in the vm code, stalls happen somewhere in the vm 
where nothing much happens for sometimes 20 or 30 seconds in the worst-case 
scenario. I had to drop a patch in 2.6.19 that allowed the dirty ratio to be 
set ultra low because these stalls were gross.

> Amazing to me, the cpu it's using stays generally below 80%, and often below
> 60%, even while the kmail composer has a full sentence in its buffer that
> it still hasn't shown me when I switch to the htop screen to check, and
> back to the kmail screen to see if it's updated yet.  The screen switch
> doesn't seem to lag so I don't think renicing X would be helpful.  Those
> are the obvious lags, and I'll build & reboot to the CFS patch at some
> point this morning (what's left of it, that is :).  And report in due time of
> course

-- 
-ck


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Arjan van de Ven

> It outlines the problems with Linux kernel development and questionable
> elitism regarding ownership of certain sections of the kernel code.

I have to step in and disagree here.

Linux is not about who writes the code.

Linux is about getting the best solution for a problem. Who wrote which
line of the code is irrelevant in the big picture.

that often means that multiple implementations happen, and that a
darwinistic process decides that the best solution wins.

This darwinistic process often happens in the form of discussion, and
that discussion can happen with words or with code. In this case it
happened with a code proposal.

To make this specific: it has happened many times to me that when I
solved an issue with code, someone else stepped in and wrote a different
solution (although that was usually for smaller pieces). Was I upset
about that? No! I was happy because my *problem got solved* in the best
possible way.

Now this doesn't mean that people shouldn't be nice to each other or
cooperate, or that stealing credit is fine, but I don't get the impression
that that is happening here. Ingo is taking part in the discussion with a
counter-proposal *on the mailing list*. What more do you want??
If you or anyone else can improve it or do better, take part of this
discussion and show what you mean either in words or in code.

Your qualification of the discussion as an elitist takeover... I disagree
with that. It's a *discussion*. Now if you agree that Ingo's patch is
better technically, you and others should be happy about that because
your problem is getting solved better. If you don't agree that his patch
is better technically, take part in the technical discussion.

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Bernd Eckenfels
In article <[EMAIL PROTECTED]> you wrote:
> A development process like this is likely to exclude smart people from wanting
> to contribute to Linux, and folks should be conscious of these issues.

Nobody is excluded, you can always have a next iteration.

Gruss
Bernd


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
* Willy Tarreau <[EMAIL PROTECTED]> wrote:
>> [...] and using part of Bill's work to speed up development.

On Sun, Apr 15, 2007 at 05:39:33PM +0200, Ingo Molnar wrote:
> ok, let me make this absolutely clear: I didn't use any bit of plugsched 
> - in fact the most difficult bits of the modularization were for areas of 
> sched.c that plugsched never even touched, AFAIK. (the load-balancer for 
> example.)
> Plugsched simply does something else: in essence I modularized scheduling 
> policies that have to cooperate with each other, while plugsched 
> modularized complete schedulers which are compile-time or boot-time 
> selected, with no runtime cooperation between them. (one has to be 
> selected at a time)
> (and I have no trouble at all with crediting Will's work either: a few 
> years ago I used Will's PID rework concepts for an NPTL-related speedup 
> and Will is very much credited for it in today's kernel/pid.c and he 
> continued to contribute to it later on.)
> (the tree-walking bits of sched_fair.c were in fact derived from 
> kernel/hrtimer.c, the rbtree code written by Thomas and me :-)

The extant plugsched patches have nothing to do with CFS; I suspect
what everyone else is going on about is terminological confusion. The
4-year-old sample policy with scheduling classes for the original
plugsched is something you had no way of knowing about, as it was never
publicly posted. There isn't really anything all that interesting going
on here, apart from pointing out that it's been done before.


-- wli


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> Ingo could have publicly spoken with them about his ideas of killing 
> the O(1) scheduler and replacing it with an rbtree-based one, [...]

yes, that's precisely what I did, via a patchset :)

[ I can even tell you when it all started: I was thinking about Mike's
  throttling patches while watching Manchester United beat the crap out
  of AS Roma (7 to 1 end result) on Tuesday evening. I started coding it
  Wednesday morning and sent the patch Friday evening. I very much
  believe in low latency when it comes to development too ;) ]

(if this had been done via a committee then today we'd probably still be 
trying to find a suitable timeslot for the initial conference call where 
we'd discuss the election of a chair who would be tasked with writing up 
an initial document of feature requests, on which we'd take a vote, 
possibly this year already, because the matter is really urgent, you know 
;-)

> [...] and using part of Bill's work to speed up development.

ok, let me make this absolutely clear: I didn't use any bit of plugsched 
- in fact the most difficult bits of the modularization were for areas of 
sched.c that plugsched never even touched, AFAIK. (the load-balancer for 
example.)

Plugsched simply does something else: in essence I modularized scheduling 
policies that have to cooperate with each other, while plugsched 
modularized complete schedulers which are compile-time or boot-time 
selected, with no runtime cooperation between them. (one has to be 
selected at a time)

(and I have no trouble at all with crediting Will's work either: a few 
years ago I used Will's PID rework concepts for an NPTL-related speedup 
and Will is very much credited for it in today's kernel/pid.c and he 
continued to contribute to it later on.)

(the tree-walking bits of sched_fair.c were in fact derived from 
kernel/hrtimer.c, the rbtree code written by Thomas and me :-)
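
For readers who have not looked at kernel/hrtimer.c: the tree walking in question is the standard "walk to a leaf, link, rebalance" rbtree insertion over a sort key. A condensed illustration (kernel context assumed; the fair_task container is invented for this sketch, only the rbtree calls are the real <linux/rbtree.h> API):

#include <linux/rbtree.h>

/* Invented container for illustration; in sched_fair.c the rb_node
 * and the key live in the task structure itself. */
struct fair_task {
	struct rb_node	run_node;
	long long	fair_key;
};

static void enqueue_by_key(struct rb_root *root, struct fair_task *p)
{
	struct rb_node **link = &root->rb_node, *parent = NULL;

	while (*link) {
		struct fair_task *entry;

		parent = *link;
		entry = rb_entry(parent, struct fair_task, run_node);
		if (p->fair_key < entry->fair_key)	/* smaller key: go left */
			link = &parent->rb_left;
		else
			link = &parent->rb_right;
	}
	rb_link_node(&p->run_node, parent, link);	/* link at the leaf */
	rb_insert_color(&p->run_node, root);		/* recolor/rebalance */
}

The leftmost node then holds the smallest key, i.e. the task most entitled to run next.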

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
On Sun, Apr 15, 2007 at 02:45:27PM +0200, Willy Tarreau wrote:
> Now I hope he and Bill will get over this and agree to work on improving
> this scheduler, because I really find it smarter than a dumb O(1). I even
> agree with Mike that we now have a solid basis for future work. But for
> this, maybe a good starting point would be to remove the selfish printk
> at boot, revert useless changes (SCHED_NORMAL->SCHED_FAIR comes to mind)
> and improve the documentation a bit so that people can work together on
> the new design, without feeling like their work will only serve to
> promote X or Y.

While I appreciate people coming to my defense, or at least the good
intentions behind such, my only actual interest in pointing out
4-year-old work is getting some acknowledgment of having done something
relevant at all. Sometimes it has "I told you so" value. At other times
it's merely clarifying what went on when people refer to it since in a
number of cases the patches are no longer extant, so they can't
actually look at it to get an idea of what was or wasn't done. At other
times I'm miffed about not being credited, whether I should've been or
whether dead and buried code has an implementation of the same idea
resurfacing without the author(s) having any knowledge of my prior work.

One should note that in this case, the first work of mine this trips
over (scheduling classes) was never publicly posted as it was only a
part of the original plugsched (an alternate scheduler implementation
devised to demonstrate plugsched's flexibility with respect to
scheduling policies), and a part that was dropped by subsequent
maintainers. The second work of mine this trips over, a virtual deadline
scheduler named "vdls," was also never publicly posted. Both are from
around the same time period, which makes them approximately 4 years dead.
Neither of the codebases is extant, having been lost in a transition
between employers, though various people recall having been sent them
privately, and plugsched survives in a mutated form as maintained by
Peter Williams, who's been very good about acknowledging my original
contribution.

If I care to become a direct participant in scheduler work, I can do so
easily enough.

I'm not entirely sure what this talk of a basis for future work is about. By
and large one should alter the APIs and data structures to fit the
policy being implemented. While the array swapping was nice for
algorithmically improving 2.4.x -style epoch expiry, most algorithms
not based on the 2.4.x scheduler (in however mutated a form) should use
a different queue structure, in fact, one designed around their
policy's specific algorithmic needs. IOW, when one alters the scheduler,
one should also alter the queue data structure appropriately. I'd not
expect the priority queue implementation in cfs to continue to be used
unaltered as it matures, nor would I expect any significant modification
of the scheduler to necessarily use a similar one.

By and large I've been mystified as to why there is such a penchant for
preserving the existing queue structures in the various scheduler
patches floating around. I am now every bit as mystified at the point
of view that seems to be emerging that a change of queue structure is
particularly significant. These are all largely internal changes to
sched.c, and as such, rather small changes in and of themselves. While
they do tend to have user-visible effects, from this point of view
even changing out every line of sched.c is effectively a micropatch.

Something more significant might be altering the schedule() API to
take a mandatory description of the intention of the call to it, or
breaking up schedule() into several different functions to distinguish
between different sorts of uses of it to which one would then respond
differently. Also more significant would be adding a new state beyond
TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, and TASK_RUNNING for some
tasks to respond only to fatal signals, then sweeping TASK_UNINTERRUPTIBLE
users to use the new state and handle those fatal signals. While not
quite as ostentatious in their user-visible effects as SCHED_OTHER
policy affairs, they are tremendously more work than switching out the
implementation of a single C file, and so somewhat more respectable.
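
The "respond only to fatal signals" idea can at least be sketched in outline (everything below is hypothetical: the state bit, the fatal_pending() helper and the wait function are invented to illustrate the shape of such a conversion; only DEFINE_WAIT/prepare_to_wait/finish_wait are the real wait-queue API):

#include <linux/errno.h>
#include <linux/sched.h>
#include <linux/signal.h>
#include <linux/wait.h>

#define TASK_FATALSIGNAL	64	/* invented state bit */

static inline int fatal_pending(struct task_struct *p)
{
	/* simplification: only checks SIGKILL; a real version would also
	 * catch unhandled signals whose default action is fatal */
	return sigismember(&p->pending.signal, SIGKILL);
}

static int wait_event_fatal(wait_queue_head_t *wq, int (*done)(void *), void *arg)
{
	DEFINE_WAIT(wait);
	int ret = 0;

	for (;;) {
		prepare_to_wait(wq, &wait, TASK_FATALSIGNAL);
		if (done(arg))
			break;
		if (fatal_pending(current)) {
			ret = -ERESTARTSYS;
			break;
		}
		schedule();
	}
	finish_wait(wq, &wait);
	return ret;
}

The hard part, as the paragraph above says, is not this helper but teaching the wakeup and signal-delivery paths about the new state and then sweeping the TASK_UNINTERRUPTIBLE call sites one by one.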

Even as scheduling semantics go, these are micropatches. So SCHED_OTHER
changes a little. Where are the gang schedulers? Where are the batch
schedulers (SCHED_BATCH is not truly such)? Where are the isochronous
(frame) schedulers? I suppose there is some CKRM work that actually has
a semantic impact despite being largely devoted to SCHED_OTHER, and
there's some spufs gang scheduling going on, though not all that much.
And to reiterate a point from other threads, even as SCHED_OTHER
patches go, I see precious little verification that things like the
semantics of nice numbers or other sorts of CPU bandwidth allocation
between competing tasks of various natures are staying the same while
other 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Sunday 15 April 2007, Pekka Enberg wrote:
>On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote:
>> The perception here is that there is this expectation that
>> sections of the Linux kernel are intentionally "churn squatted" to prevent
>> any other ideas from creeping in other than those of the owner of that
>> subsystem
>
>Strangely enough, my perception is that Ingo is simply trying to
>address the issues Mike's testing discovered in RSDL and SD. It's not
>surprising Ingo made it a separate patch set as Con has repeatedly
>stated that the "problems" are in fact by design and won't be fixed.

I won't get into the middle of this just yet, not having decided which dog I 
should bet on yet.  I've been running 2.6.21-rc6 + Con's 0.40 patch for about 
24 hours, it's been generally usable, but gzip still causes lots of 5 to 10+ 
second lags when it's running.  I'm coming to the conclusion that gzip simply 
doesn't play well with others...  

Amazing to me, the cpu it's using stays generally below 80%, and often below 
60%, even while the kmail composer has a full sentence in its buffer that it 
still hasn't shown me when I switch to the htop screen to check, and back to 
the kmail screen to see if it's updated yet.  The screen switch doesn't seem 
to lag, so I don't think renicing X would be helpful.  Those are the obvious 
lags, and I'll build & reboot to the CFS patch at some point this morning 
(what's left of it, that is :).  And report in due time of course

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
knot in cables caused data stream to become twisted and kinked


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Con Kolivas <[EMAIL PROTECTED]> wrote:

[ I'm quoting this bit out of order: ]

> 2. Since then I've been thinking/working on a cpu scheduler design 
> that takes away all the guesswork out of scheduling and gives very 
> predictable, as fair as possible, cpu distribution and latency while 
> preserving as solid interactivity as possible within those confines.

yeah. I think you were right on target with this call. I've applied the 
sched.c change attached at the bottom of this mail to the CFS patch, if 
you don't mind. (or feel free to suggest some other text instead.)

> 1. I tried in vain some time ago to push a working extensable 
> pluggable cpu scheduler framework (based on wli's work) for the linux 
> kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he 
> didn't like it) as being absolutely the wrong approach and that we 
> should never do that. [...]

I partially replied to that point to Will already, and I'd like to make 
it clear again: yes, I rejected plugsched 2-3 years ago (which had already 
drifted away from wli's original codebase) and I would still reject it 
today.

First and foremost, please don't take such rejections too personally - I 
had my own share of rejections (and in fact, as I mentioned in a 
previous mail, I had a fair number of complete project throwaways: 
4g:4g, in-kernel Tux, irqrate and many others). I know that they can 
hurt and can demoralize, but if I don't like something it's my job to 
say so.

Can I sum up your argument as: "you rejected plugsched, but then why on 
earth did you modularize portions of the scheduler in CFS? Isn't your 
position thus woefully inconsistent?" (I'm sure you would never put it 
this impolitely, though I guess I can flame myself with impunity ;)

While having an inconsistent position isn't a terminal sin in itself, 
please realize that the scheduler classes code in CFS is quite different 
from plugsched: it was a result of what I saw to be technological 
pressure for _internal modularization_. (This internal/policy 
modularization aspect is something that Will said was present in his 
original plugsched code, but which aspect I didn't see in the plugsched 
patches that I reviewed.)

That possibility never even occurred to me until 3 days ago. You never 
raised it either, AFAIK. No patches to simplify the scheduler that way 
were ever sent. Plugsched doesn't even touch the core load-balancer, for 
example, and most of the time I spent with the modularization was on 
getting the load-balancing details right. So it's really apples to oranges.
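
To make the internal-modularization point concrete: "scheduling classes" boil down to a per-policy table of methods that the core scheduler dispatches through, roughly along these lines (a sketch only; the field names are approximations, not the exact hooks of the CFS patch):

struct rq;
struct sched_class;

struct task_struct {
	const struct sched_class *sched_class;	/* which policy owns this task */
	/* ... */
};

/* Sketch of a per-policy method table (kernel context assumed). */
struct sched_class {
	void (*enqueue_task)(struct rq *rq, struct task_struct *p);
	void (*dequeue_task)(struct rq *rq, struct task_struct *p);
	struct task_struct *(*pick_next_task)(struct rq *rq);
	void (*task_tick)(struct rq *rq, struct task_struct *p);
};

/* The core then calls through the class instead of branching on policy: */
static inline void enqueue_task(struct rq *rq, struct task_struct *p)
{
	p->sched_class->enqueue_task(rq, p);
}

The classes still cooperate through the shared runqueue and load-balancer, which is the runtime cooperation that boot-time-selected whole schedulers never had to provide.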

My view about plugsched: first please take a look at the latest 
plugsched code:

  http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch

  26 files changed, 8951 insertions(+), 1495 deletions(-)

As an experiment I've removed all the add-on schedulers (both the core 
and the include files, keeping only the vanilla one) from the plugsched 
patch (and the makefile and kconfig complications, etc.), to see the 
'infrastructure cost', and it still gave:

  12 files changed, 1933 insertions(+), 1479 deletions(-)

that's the extra complication I didn't like 3 years ago and which I still 
don't like today. What the current plugsched code does is simplify the 
adding of new experimental schedulers, but it doesn't 
really do what I wanted: to simplify the _scheduler itself_. Personally 
I'm still not primarily interested in having a large selection of 
schedulers; I'm mainly interested in a good and maintainable scheduler 
that works for people.

so the rejection was on these grounds, and I still very much stand by 
that position here and today: I didn't want to see the Linux scheduler 
landscape balkanized, and I saw no technological reasons for the 
complication that external modularization brings.

the new scheduling classes code in the CFS patch was not a result of "oh, 
I want to write a new scheduler, let's make schedulers pluggable" kind of 
thinking. That result was just a side-effect of it. (and as you 
correctly noted, the CFS-related modularization is incomplete.)

Btw., the thing that triggered the scheduling classes code wasn't even 
plugsched or RSDL/SD, it was Mike's patches. Mike had an itch and he 
fixed it within the framework of the existing scheduler, and the end 
result behaved quite well when I threw various testloads at it.

But I felt a bit uncomfortable that it added another few hundred lines 
of code to an already complex sched.c. This felt unnatural, so I mailed 
Mike that I'd attempt to clean up these infrastructure aspects of sched.c 
a bit so that it becomes more hackable for him. Thus 3 days ago, 
without having made up my mind about anything, I started this experiment 
(which ended up in the modularization and in the CFS scheduler) to 
simplify the code and to enable Mike to fix such itches in an easier 
way. By your logic Mike should in fact be quite upset about this: if the 
new code works out and proves to be useful then it obsoletes a whole lot

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Pekka J Enberg
On Sun, 15 Apr 2007, Willy Tarreau wrote:
> Ingo could have publicly spoken with them about his ideas of killing
> the O(1) scheduler and replacing it with an rbtree-based one, and using
> part of Bill's work to speed up development.

He did exactly that and he did it with a patch. Nothing new here. This is 
how development on LKML proceeds when you have two or more competing 
designs. There's absolutely no need to get upset or hurt your feelings 
over it. It's not malicious, it's how we do Linux development.


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Esben Nielsen <[EMAIL PROTECTED]> wrote:

> I took a brief look at it. Have you tested priority inheritance?

yeah, you are right, it's broken at the moment; I'll fix it. But the 
good news is that I think PI could become cleaner via scheduling 
classes.

> As far as I can see rt_mutex_setprio doesn't have much effect on 
> SCHED_FAIR/SCHED_BATCH. I am looking for the place where such a task 
> changes scheduler class when boosted in rt_mutex_setprio().

I think via scheduling classes we don't have to do the p->policy and 
p->prio based gymnastics anymore; we can just have a clean look at 
p->sched_class and stack the original scheduling class into 
p->real_sched_class. It would probably also make sense to 'privatize' 
p->prio into the scheduling class. That way PI would be a pure property 
of sched_rt, and the PI scheduler would be driven purely by 
p->rt_priority, not by p->prio. That way all the normal_prio() kind of 
complications and interactions with SCHED_OTHER/SCHED_FAIR would be 
eliminated as well. What do you think?
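
Spelled out, the boost path would then look something like the sketch below (only real_sched_class and the general idea come from the paragraph above; the remaining declarations and the requeue_task() helper are assumptions made to keep the fragment self-contained, not the actual rt_mutex_setprio):

struct rq;
struct sched_class;

struct task_struct {
	const struct sched_class *sched_class;
	const struct sched_class *real_sched_class;
	int rt_priority;
	/* ... */
};

extern const struct sched_class rt_sched_class;		/* assumed to exist */
void requeue_task(struct rq *rq, struct task_struct *p);	/* assumed helper */

void pi_boost_task(struct rq *rq, struct task_struct *p, int rt_priority)
{
	if (p->sched_class != &rt_sched_class) {
		/* stack the original class, run as RT while boosted */
		p->real_sched_class = p->sched_class;
		p->sched_class = &rt_sched_class;
	}
	p->rt_priority = rt_priority;	/* the PI order is driven by this */
	requeue_task(rq, p);
}

void pi_unboost_task(struct rq *rq, struct task_struct *p)
{
	p->sched_class = p->real_sched_class;	/* restore the stacked class */
	p->real_sched_class = NULL;
	requeue_task(rq, p);
}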

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Willy Tarreau
On Sun, Apr 15, 2007 at 01:39:27PM +0300, Pekka Enberg wrote:
> On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote:
> >The perception here is that there is this expectation that
> >sections of the Linux kernel are intentionally "churn squatted" to prevent
> >any other ideas from creeping in other than those of the owner of that
> >subsystem
> 
> Strangely enough, my perception is that Ingo is simply trying to
> address the issues Mike's testing discovered in RSDL and SD. It's not
> surprising Ingo made it a separate patch set as Con has repeatedly
> stated that the "problems" are in fact by design and won't be fixed.

That's not exactly the problem. There are people who work very hard to
try to improve some areas of the kernel. They progress slowly, and
acquire more and more skills. Sometimes they feel like they need to
change some concepts, and propose the changes required for them to go 
further or to develop faster. Those get rejected. So they are constrained 
to work within a delimited perimeter from which it is difficult for them 
to escape.

Then, the same person who rejected their changes comes up with something
shiny, new, better, and which took him far less time. But he sort of
broke the rules, because what was forbidden to the first people is
suddenly permitted. Maybe for very good reasons; I'm not discussing
that. But the good reason should have been valid the first time too.

The fact is that when changes are rejected, we should not simply say
"no", but explain why and define what would be acceptable. Some people
here have excellent teaching skills for this, but most others do not.
Anyway, the rules should be the same for everybody.

Also, there is what can be perceived as marketing here. Con worked
on his idea with conviction; he took time to write some generous
documentation, but he hit a wall where his concept was suboptimal on
a given workload. But at least, all the work was oriented on a technical
basis: design + code + doc.

Then Ingo comes in with something looking amazingly better, with
virtually no documentation, an appealing announcement, and shiny
advertising at boot. All this implemented without the constraints
other people had to respect. It already looks like definitive work
which will be merged as-is without many changes except a few bugfixes.

If those were two companies, the first one would simply have accused
the second one of not having respected contracts and of having employed
heavy marketing to take first place.

People here do not just code for a living; they do it because they
believe in what they are doing, and some of them want a bit of gratitude
for their work. I've met people who were proud to say they implemented
this or that feature in the kernel, so it is something important for
them. And being cited in an email is nothing compared to advertising
at boot time.

When the discussion between Con and Mike concerning the design problems
got blocked, that is where a new discussion should have taken place.
Ingo could have publicly spoken with them about his ideas of killing
the O(1) scheduler and replacing it with an rbtree-based one, and of using
part of Bill's work to speed up development.

It is far easier to resign yourself when people explain what concepts are
wrong and how they intend to proceed than when they suddenly present
something out of nowhere which is already better.

And it's not specific to Ingo (though I think his ability to work that
fast alone makes him tend to practise this more often than others).

Imagine if Con had worked another full week on his scheduler with better
results on Mike's workload, but still not as good as Ingo's, and they both
published at the same time. You certainly can imagine he would have preferred
to be informed first that it was pointless to continue in that direction.

Now I hope he and Bill will get over this and agree to work on improving
this scheduler, because I really find it smarter than a dumb O(1). I even
agree with Mike that we now have a solid basis for future work. But for
this, maybe a good starting point would be to remove the selfish printk
at boot, revert useless changes (SCHED_NORMAL->SCHED_FAIR comes to mind)
and improve the documentation a bit so that people can work together on
the new design, without feeling like their work will only serve to
promote X or Y.

Regards,
Willy



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Esben Nielsen

On Fri, 13 Apr 2007, Ingo Molnar wrote:

> [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
>
> i'm pleased to announce the first release of the "Modular Scheduler Core
> and Completely Fair Scheduler [CFS]" patchset:
>
>   http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
>
> This project is a complete rewrite of the Linux task scheduler. My goal
> is to address various feature requests and to fix deficiencies in the
> vanilla scheduler that were suggested/found in the past few years, both
> for desktop scheduling and for server scheduling workloads.
>
> [...]


I took a brief look at it. Have you tested priority inheritance?
As far as I can see rt_mutex_setprio doesn't have much effect on 
SCHED_FAIR/SCHED_BATCH. I am looking for the place where such a task changes 
scheduler class when boosted in rt_mutex_setprio().


Esben



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Pekka Enberg

On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote:

> The perception here is that there is this expectation that
> sections of the Linux kernel are intentionally "churn squatted" to prevent
> any other ideas from creeping in other than those of the owner of that
> subsystem


Strangely enough, my perception is that Ingo is simply trying to
address the issues Mike's testing discovered in RSDL and SD. It's not
surprising Ingo made it a separate patch set, as Con has repeatedly
stated that the "problems" are in fact by design and won't be fixed.


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread hui
On Sun, Apr 15, 2007 at 10:44:47AM +0200, Ingo Molnar wrote:
> I prefer such early releases to lkml _a lot_ more than any private review 
> process. I released the CFS code about 6 hours after I thought "okay, 
> this looks pretty good" and I spent those final 6 hours on testing it 
> (making sure it doesn't blow up on your box, etc.); in the final 2 hours 
> I showed it to two folks I could reach on IRC (Arjan and Thomas) and worked 
> on various finishing touches. It doesn't get much faster than that, and I 
> definitely didn't want to sit on it even one day longer, because I very 
> much thought that Con and others should definitely see this work!
> 
> And I very much credited (and still credit) Con for the whole fairness 
> angle:
> 
> ||  i'd like to give credit to Con Kolivas for the general approach here:
> ||  he has proven via RSDL/SD that 'fair scheduling' is possible and that
> ||  it results in better desktop scheduling. Kudos Con!
> 
> the 'design consultation' phase you are talking about is _NOW_! :)
> 
> I got the v1 code out to Con, to Mike and to many others ASAP. That's 
> how you are able to comment on this thread and be part of the 
> development process to begin with; in a 'private consultation' setup 
> you'd not have had any opportunity to see _any_ of this.
> 
> In the BSD space there seem to be more 'political' mechanisms for 
> development, but Linux is truly about doing things out in the open, and 
> doing it immediately.

I can't even begin to talk about how screwed up BSD development is. Maybe
another time privately.

Ok, Linux development and inclusiveness can be improved. I'm not trying
to "call you out" (slang for accusing you with the sole intention of calling
you crazy, in a highly confrontational manner). This is discussed publicly
here to bring the issue to light and open a communication channel as a means
to resolve it.

> Okay? ;-)

It's cool. We're still getting to know each other professionally and it's
okay to a certain degree to have a communication disconnect but only as
long as it clears. Your productivity is amazing BTW. But here's the
problem, there's this perception that NIH is the default mentality here
in Linux.

Con feels that this kind of action is intentional and has a malicious
quality to it as a means of "churn squatting" sections of the kernel tree.
The perception here is that there is this expectation that
sections of the Linux kernel are intentionally "churn squatted" to prevent
any other ideas from creeping in other than those of the owner of that
subsystem (VM, scheduling, etc...) because of a lack of modularity in the
kernel. This isn't an API question but possibly a question of general code
quality and of how the maintenance of it can be shared.

This was predicted by folks, and then this perception was *realized* when
you wrote the equivalent kind of code that has technical overlap with SD
(this is just one dry example). To a person that is writing new code for
Linux, having one of the old guard write code equivalent to that of a
newcomer has the effect of displacing that person with regard to both the
code and the responsibility for it. When this happens over and over again
and folks get annoyed by it, it starts seeming that Linux development
is elitist.

I know this because I read Con's IRC chats about these matters all of the
time. This is not just his view but a view held by other kernel folks with
differing opinions. The closing talk at OLS
2006 was highly disturbing in many ways. It went "Christoph is right,
everybody else is wrong", which sends a highly negative message to new
kernel developers that, say, don't work for RH directly or any of the
other mainstream Linux companies. After a while, it starts seeming like
this kind of behavior is completely intentional and that Linux is
full of arrogant bastards.

What I would have done here was to contact Peter Williams, Bill Irwin
and Con about what you're doing and reach a consensus about how
to create something that would be inclusive of all of their ideas.
Discussions can get technically heated but that's ok; the discussion is
happening and it brings down the wall of this perception. Bill and
Con are on oftc.net/#offtopic2. Riel is there as well as Peter Zijlstra.
It might be very useful, it might not be. Folks are all stubborn
about their ideas and hold on to them for dear life. Effective
leaders can deconstruct this hostility and animosity. I don't claim
to be one.

Because of past hostility to something like schedplugin, the hostility
and terseness of responses can be perceived simply as "I'm right,
you're wrong", which is condescending. This affects discussion and
outright destroys a constructive process if it happens continually,
since it reinforces that view of "You're an outsider, we don't care
about you". Nobody is listening to each other at that point, folks get
pissed. Then they think "I'm going to NIH this person with patch
X because he/she did the same here", which is dysfunctional.


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 10:58 +0200, Ingo Molnar wrote:
> * Mike Galbraith <[EMAIL PROTECTED]> wrote:

> > [...] (I know a trivial way to cure that, and this framework makes 
> > that possible without dorking up fairness as a general policy.)
> 
> great! Please send patches so i can add them (once you are happy with 
> the solution) - i think your workload isnt special in any way and could 
> hit other people too.

I'll give it a shot.  (have to read and actually understand your new
code first though, then see if it's really viable)

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Bill Huey <[EMAIL PROTECTED]> wrote:

> On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote:
> > [...]
> > 
> > Demystify what?  The casual observer need only read either your 
> > attempt
> 
> Here's the problem. You're a casual observer and obviously not paying 
> attention.

guys, please calm down. Judging by the number of contributions to 
sched.c the main folks who are not 'observers' here and who thus have an 
unalienable right to be involved in a nasty flamewar about scheduler 
interactivity are Con, Mike, Nick and me ;-) Everyone else is just a 
happy bystander, ok? ;-)

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Mike Galbraith <[EMAIL PROTECTED]> wrote:

> On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote:
> 
> > Well, I'll stop heating the room for now as I get out of ideas about 
> > how to defeat it. I'm convinced. I'm impatient to read about Mike's 
> > feedback with his workload which behaves strangely on RSDL. If it 
> > works OK here, it will be the proof that heuristics should not be 
> > needed.
> 
> You mean the X + mp3 player + audio visualization test?  X+Gforce 
> visualization have problems getting half of my box in the presence of 
> two other heavy cpu using tasks.  Behavior is _much_ better than 
> RSDL/SD, but the synchronous nature of X/client seems to be a problem.
> 
> With this scheduler, renicing X/client does cure it, whereas with SD 
> it did not help one bit. [...]

thanks for testing it! I was quite worried about your setup - two tasks 
using up 50%/50% of CPU time, pitted against a kernel rebuild workload 
seems to be a hard workload to get right.

> [...] (I know a trivial way to cure that, and this framework makes 
> that possible without dorking up fairness as a general policy.)

great! Please send patches so i can add them (once you are happy with 
the solution) - i think your workload isnt special in any way and could 
hit other people too.

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 01:36 -0700, Bill Huey wrote:
> On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote:
> > [...]
> > 
> > Demystify what?   The casual observer need only read either your attempt
> 
> Here's the problem. You're a casual observer and obviously not paying
> attention.
> 
> > at writing a scheduler, or my attempts at fixing the one we have, to see
> > that it was high time for someone with the necessary skills to step in.
> > Now progress can happen, which was _not_ happening before.
> 
> I think that's inaccurate and there are plenty of folks that have that
> technical skill and background. The scheduler code isn't a deep mystery
> and there are plenty of good kernel hackers out here across many
> communities.  Ingo isn't the only person on this planet to have deep
> scheduler knowledge.

Ok, I'm not paying attention, and you can't read.  We're even.
Have a nice life.

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Bill Huey <[EMAIL PROTECTED]> wrote:

> Hello folks,
> 
> I think the main failure I see here is that Con wasn't included in 
> this design or privately in review process. There could have been 
> better co-ownership of the code. This could also have been done openly 
> on lkml [...]

Bill, you come from a BSD background and you are still relatively new to 
Linux development, so i dont at all fault you for misunderstanding this 
situation, and fortunately i have a really easy resolution for your 
worries: i did exactly that! :)

i wrote the first line of code of the CFS patch this week, 8am Wednesday 
morning, and released it to lkml 62 hours later, 22pm on Friday. (I've 
listed the file timestamps of my backup patches further below, for all 
the fine details.)

I prefer such early releases to lkml _alot_ more than any private review 
process. I released the CFS code about 6 hours after i thought "okay, 
this looks pretty good" and i spent those final 6 hours on testing it 
(making sure it doesnt blow up on your box, etc.), in the final 2 hours 
i showed it to two folks i could reach on IRC (Arjan and Thomas) and on 
various finishing touches. It doesnt get much faster than that and i 
definitely didnt want to sit on it even one day longer because i very 
much thought that Con and others should definitely see this work!

And i very much credited (and still credit) Con for the whole fairness 
angle:

||  i'd like to give credit to Con Kolivas for the general approach here:
||  he has proven via RSDL/SD that 'fair scheduling' is possible and that
||  it results in better desktop scheduling. Kudos Con!

the 'design consultation' phase you are talking about is _NOW_! :)

I got the v1 code out to Con, to Mike and to many others ASAP. That's 
how you are able to comment on this thread and be part of the 
development process to begin with, in a 'private consultation' setup 
you'd not have had any opportunity to see _any_ of this.

In the BSD space there seem to be more 'political' mechanisms for 
development, but Linux is truly about doing things out in the open, and 
doing it immediately.

Okay? ;-)

Here's the timestamps of all my backups of the patch, from its humble 4K 
beginnings to the 100K first-cut v1 result:

-rw-rw-r-- 1 mingo mingo 4230 Apr 11 08:47 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 7653 Apr 11 09:12 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 7728 Apr 11 09:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 14416 Apr 11 10:08 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 24211 Apr 11 10:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 27878 Apr 11 10:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 33807 Apr 11 11:05 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 34524 Apr 11 11:09 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 39650 Apr 11 11:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40231 Apr 11 11:34 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40627 Apr 11 11:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40638 Apr 11 11:54 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42733 Apr 11 12:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42817 Apr 11 12:31 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43270 Apr 11 12:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43531 Apr 11 12:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 44331 Apr 11 12:51 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45173 Apr 11 12:56 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45288 Apr 11 12:59 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45368 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45370 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45815 Apr 11 13:14 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45887 Apr 11 13:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45914 Apr 11 13:25 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45850 Apr 11 13:29 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 49196 Apr 11 13:39 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64317 Apr 11 13:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64403 Apr 11 13:52 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:03 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:07 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 68995 Apr 11 14:50 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 69919 Apr 11 15:23 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71065 Apr 11 16:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 70642 Apr 11 16:28 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 72334 Apr 11 16:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71624 Apr 11 17:01 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71854 Apr 11 17:20 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 73571 Apr 11 17:42 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:51 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread hui
On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote:
> [...]
> 
> Demystify what?   The casual observer need only read either your attempt

Here's the problem. You're a casual observer and obviously not paying
attention.

> at writing a scheduler, or my attempts at fixing the one we have, to see
> that it was high time for someone with the necessary skills to step in.
> Now progress can happen, which was _not_ happening before.

I think that's inaccurate and there are plenty of folks that have that
technical skill and background. The scheduler code isn't a deep mystery
and there are plenty of good kernel hackers out here across many
communities.  Ingo isn't the only person on this planet to have deep
scheduler knowledge. Priority heaps are not new and Solaris has had a
pluggable scheduler framework for years.

Con's characterization is something that I'm more prone to believe about
how Linux kernel development works versus your view. I think it's a great
shame to have folks like Bill Irwin and Con waste time trying to
do something right, only to have their ideas attacked, then copied and held
up as the solution for this kind of technical problem, in a complete reversal
of technical opinion as it suits the moment. This is just wrong in so many
ways.

It outlines the problems with Linux kernel development and questionable
elitism regarding ownership of certain sections of the kernel code.

I call it "churn squat", and instances like this only support that view,
one I would much rather turned out to be completely wrong and inaccurate.

bill



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote:

> Well, I'll stop heating the room for now as I get out of ideas about how
> to defeat it. I'm convinced. I'm impatient to read about Mike's feedback
> with his workload which behaves strangely on RSDL. If it works OK here,
> it will be the proof that heuristics should not be needed.

You mean the X + mp3 player + audio visualization test?  X+Gforce
visualization have problems getting half of my box in the presence of
two other heavy cpu using tasks.  Behavior is _much_ better than
RSDL/SD, but the synchronous nature of X/client seems to be a problem.  

With this scheduler, renicing X/client does cure it, whereas with SD it
did not help one bit.  (I know a trivial way to cure that, and this
framework makes that possible without dorking up fairness as a general
policy.)

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 13:27 +1000, Con Kolivas wrote:
> On Saturday 14 April 2007 06:21, Ingo Molnar wrote:
> > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
> > [CFS]
> >
> > i'm pleased to announce the first release of the "Modular Scheduler Core
> > and Completely Fair Scheduler [CFS]" patchset:
> >
> >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
> >
> > This project is a complete rewrite of the Linux task scheduler. My goal
> > is to address various feature requests and to fix deficiencies in the
> > vanilla scheduler that were suggested/found in the past few years, both
> > for desktop scheduling and for server scheduling workloads.
> 
> The casual observer will be completely confused by what on earth has happened 
> here so let me try to demystify things for them.

[...]

Demystify what?   The casual observer need only read either your attempt
at writing a scheduler, or my attempts at fixing the one we have, to see
that it was high time for someone with the necessary skills to step in.
Now progress can happen, which was _not_ happening before.

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ismail Dönmez
Hi,
On Friday 13 April 2007 23:21:00 Ingo Molnar wrote:
> [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
> [CFS]
>
> i'm pleased to announce the first release of the "Modular Scheduler Core
> and Completely Fair Scheduler [CFS]" patchset:
>
>    http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch

Tested this on top of Linus' GIT tree, but the system gets very unresponsive
during high disk I/O using ext3 as the filesystem; even writing a 300MB file
to a USB disk (an iPod, actually) has the same effect.

Regards,
ismail




Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 05:00, Jonathan Lundell wrote:
> On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:
> > It's a really good thing, and it means that if somebody shows that your
> > code is flawed in some way (by, for example, making a patch that people
> > claim gets better behaviour or numbers), any *good* programmer that
> > actually cares about his code will obviously suddenly be very motivated
> > to out-do the out-doer!
>
> No one who cannot rejoice in the discovery of his own mistakes
> deserves to be called a scholar.

Lovely comment. I realise this is not truly directed at me, but clearly, in
the context in which it has been said, people will assume it is directed my
way, so while we're all spinning lkml-quality rhetoric, let me have a right
of reply.

One thing I have never tried to do was to ignore bug reports. I'm forever 
joking that I keep pulling code out of my arse to improve what I've done. 
RSDL/SD was no exception; heck, it had 40 iterations. The reason I could not
reply to bug report A with "Oh, that is problem B so I'll fix it with code C"
was, as I've said many many times over, health related. I did indeed try to
fix many of them without spending hours replying to sometimes unpleasant
emails. If health wasn't an issue there might have been 1000 iterations of
SD.

There was only ever _one_ thing that I was absolutely steadfast on as a
concept, that I refused to fix, and that people might claim was a mistake I
did not rejoice in enough to be a scholar. That was that the _correct_
behaviour for a scheduler is to be fair, such that proportional slowdown
with load is (using that awful pun) a feature, not a bug. Now there are
people who will still disagree violently with me on that. SD attempted to be
a fairness-first virtual-deadline design. If I failed on that front, then so
be it (and at least one person certainly has said, in lovely warm fuzzy
friendly communication, that I'm a global failure on all fronts with SD).
But let me point out now that Ingo's shiny new scheduler is a fairness-first
virtual-deadline design which will have proportional slowdown with load. So
it will have a very similar feature. I dare anyone to claim that proportional
slowdown with load is a bug, because I will no longer feel like I'm standing
alone with a BFG9000 trying to defend my standpoint. Others can take up the
post at last.
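
To make "fairness-first virtual-deadline design" concrete: the common core
of such designs is that each task carries a virtual clock advancing by the
CPU time it consumes divided by its weight, and the task with the smallest
virtual clock runs next. A toy userspace model of the idea (names and
numbers are illustrative, not code from SD or CFS):

#include <stdio.h>

#define NTASK 3

struct vtask { const char *name; double weight, vtime, got; };

int main(void)
{
    struct vtask t[NTASK] = {
        { "nice0-a", 1.0, 0, 0 },
        { "nice0-b", 1.0, 0, 0 },
        { "boosted", 2.0, 0, 0 },          /* double weight => double share */
    };
    const double slice = 1.0;              /* ms of CPU per decision */

    for (int step = 0; step < 4000; step++) {
        struct vtask *next = &t[0];
        for (int i = 1; i < NTASK; i++)    /* pick smallest virtual clock */
            if (t[i].vtime < next->vtime)
                next = &t[i];
        next->got   += slice;
        next->vtime += slice / next->weight;   /* heavier => slower clock */
    }
    for (int i = 0; i < NTASK; i++)        /* expect 1000/1000/2000 ms */
        printf("%s: %.0f ms\n", t[i].name, t[i].got);
    return 0;
}

Each runnable task converges on its weighted proportion of the CPU, and
adding more tasks slows everyone down proportionally - which is exactly the
proportional slowdown with load defended above.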

-- 
-ck


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Arjan van de Ven
On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote:
> Hi,
> On Friday 13 April 2007 23:21:00 Ingo Molnar wrote:
> > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
> > [CFS]
> >
> > i'm pleased to announce the first release of the "Modular Scheduler Core
> > and Completely Fair Scheduler [CFS]" patchset:
> >
> >    http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
>
> Tested this on top of Linus' GIT tree, but the system gets very unresponsive
> during high disk I/O using ext3 as the filesystem; even writing a 300MB file
> to a USB disk (an iPod, actually) has the same effect.

just to make sure; this exact same workload but with the stock scheduler
does not have this effect?

if so, then it could well be that the scheduler is too fair for its own
good (being really fair inevitably ends up not batching as much as one
should, and batching is needed to get any kind of decent performance out
of disks nowadays)


-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ismail Dönmez
On Monday 16 April 2007 02:23:08 Arjan van de Ven wrote:
> On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote:
> > Hi,
> >
> > On Friday 13 April 2007 23:21:00 Ingo Molnar wrote:
> > > [announce] [patch] Modular Scheduler Core and Completely Fair
> > > Scheduler [CFS]
> > >
> > > i'm pleased to announce the first release of the "Modular Scheduler
> > > Core and Completely Fair Scheduler [CFS]" patchset:
> > >
> > >    http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
> >
> > Tested this on top of Linus' GIT tree, but the system gets very
> > unresponsive during high disk I/O using ext3 as the filesystem; even
> > writing a 300MB file to a USB disk (an iPod, actually) has the same
> > effect.
>
> just to make sure; this exact same workload but with the stock scheduler
> does not have this effect?
>
> if so, then it could well be that the scheduler is too fair for its own
> good (being really fair inevitably ends up not batching as much as one
> should, and batching is needed to get any kind of decent performance out
> of disks nowadays)

Tried with a make install in kdepim (which made the system sluggish with
CFS) and the system is just fine (using CFQ).

Regards,
ismail


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> 2) plugsched did not allow on the fly selection of schedulers, nor did
>    it allow a per CPU selection of schedulers. IO schedulers you can
>    change per disk, on the fly, making them much more useful in
>    practice. Also, IO schedulers (while definitely not being slow!) are
>    alot less performance sensitive than CPU schedulers.

One of the reasons I never posted my own code is that it never met its
own design goals, which absolutely included switching on the fly. I
think Peter Williams may have done something about that. It was my hope
to be able to do insmod sched_foo.ko until it became clear that the
effort it was intended to assist wasn't going to get even the limited
hardware access required, at which point I largely stopped working on
it.


On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> 3) I/O schedulers are pretty damn clean code, and plugsched, at least
>    the last version i saw of it, didnt come even close.

I'm not sure what happened there. It wasn't a big enough patch to take
hits in this area due to getting overwhelmed by the programming burden
like some other efforts of mine. Maybe things started getting ugly once
on-the-fly switching entered the picture. My guess is that Peter Williams
will have to chime in here, since things have diverged enough from my
one-time contribution 4 years ago.


-- wli


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Pavel Pisa
On Sunday 15 April 2007 00:38, Davide Libenzi wrote:
> Haven't looked at the scheduler code yet, but for a similar problem I use
> a time ring. The ring has Ns (2 power is better) slots (where tasks are
> queued - in my case they were some sort of timers), and it has a current
> base index (Ib), a current base time (Tb) and a time granularity (Tg). It
> also has a bitmap with bits telling you which slots contain queued tasks.
> An item (task) that has to be scheduled at time T will be queued in the
> slot:
>
> S = Ib + min((T - Tb) / Tg, Ns - 1);
>
> Items with T longer than Ns*Tg will be scheduled in the relative last slot
> (choosing a proper Ns and Tg can minimize this).
> Queueing is O(1) and de-queueing is O(Ns). You can play with Ns and Tg to
> suit your needs.
> This is a simple bench between time-ring (TR) and CFS queueing:
>
> http://www.xmailserver.org/smart-queue.c
>
> In my box (Dual Opteron 252):
>
> [EMAIL PROTECTED]:~$ ./smart-queue -n 8
> CFS = 142.21 cycles/loop
> TR  = 72.33 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 16
> CFS = 188.74 cycles/loop
> TR  = 83.79 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 32
> CFS = 221.36 cycles/loop
> TR  = 75.93 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 64
> CFS = 242.89 cycles/loop
> TR  = 81.29 cycles/loop
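
For concreteness, a minimal, self-contained C sketch of the time ring
described in the quote above (identifiers are illustrative; the real
benchmark code is in the smart-queue.c link):

#include <stdio.h>

#define NS 256                            /* slot count Ns, a power of two */

struct titem { struct titem *next; unsigned long t; };

struct timering {
    struct titem *slot[NS];               /* per-slot item lists */
    unsigned long ib;                     /* current base index Ib */
    unsigned long tb;                     /* current base time  Tb */
    unsigned long tg;                     /* time granularity   Tg */
};

/* S = Ib + min((T - Tb) / Tg, Ns - 1), wrapped around the ring;
 * assumes t >= tb. */
static unsigned long tr_slot(const struct timering *r, unsigned long t)
{
    unsigned long d = (t - r->tb) / r->tg;
    if (d > NS - 1)
        d = NS - 1;                       /* far-future items share the last slot */
    return (r->ib + d) & (NS - 1);
}

static void tr_enqueue(struct timering *r, struct titem *it)   /* O(1) */
{
    unsigned long s = tr_slot(r, it->t);
    it->next = r->slot[s];
    r->slot[s] = it;
}

/* Scan from the base index for the first occupied slot: O(Ns).  The
 * bitmap Davide mentions only speeds up this scan via find-first-bit
 * and is elided here for brevity. */
static struct titem *tr_dequeue(struct timering *r)
{
    for (unsigned long i = 0; i < NS; i++) {
        unsigned long s = (r->ib + i) & (NS - 1);
        if (r->slot[s]) {
            struct titem *it = r->slot[s];
            r->slot[s] = it->next;
            return it;
        }
    }
    return NULL;
}

int main(void)
{
    struct timering r = { .ib = 0, .tb = 0, .tg = 10 };
    struct titem a = { .t = 35 }, b = { .t = 5 };
    tr_enqueue(&r, &a);                   /* lands in slot 3 */
    tr_enqueue(&r, &b);                   /* lands in slot 0 */
    printf("first: t=%lu\n", tr_dequeue(&r)->t);   /* prints t=5 */
    return 0;
}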

Hello all,

I cannot help but report results with the GAVL tree algorithm here as
another competitor in the race. I believe that it is a better solution for
large priority queues than RB-trees and even heap trees. On the other hand,
it is disputable whether the scheduler needs such scalability. The AVL
heritage guarantees a lower height, which results in shorter search times
that could be profitable for other uses in the kernel.

The GAVL algorithm is AVL-tree based, so it does not suffer from the finite
priority granularity that TR does. It allows use in the generalized case
where the tree is not fully balanced. This makes it possible to cut off the
first item without rebalancing, which degrades the tree by at most one more
level (than a non-degraded AVL tree gives) - still considerably better than
an RB-tree's maximum.

http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

The description behind the code is there

http://cmp.felk.cvut.cz/~pisa/ulan/gavl.pdf

The code is part of much more covering uLUt library

http://cmp.felk.cvut.cz/~pisa/ulan/ulut.pdf
http://sourceforge.net/project/showfiles.php?group_id=118937package_id=130840

I have included all required GAVL code directly into smart-queue-v-gavl.c
to provide it for easy testing.
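
For readers who do not want to open the full source, here is a
stripped-down, self-contained sketch of the "cut the first item without
rebalancing" idea, using a plain unbalanced BST with a cached leftmost
pointer (the AVL rebalancing that GAVL performs on insert is deliberately
elided):

#include <stdio.h>

struct node {
    struct node *left, *right, *parent;
    unsigned long key;
};

struct minqueue {
    struct node *root;
    struct node *leftmost;                 /* cached first item: O(1) peek */
};

static void mq_insert(struct minqueue *q, struct node *n)
{
    struct node **link = &q->root, *parent = NULL;

    n->left = n->right = NULL;
    while (*link) {
        parent = *link;
        /* equal keys go right, preserving insertion order */
        link = (n->key < parent->key) ? &parent->left : &parent->right;
    }
    n->parent = parent;
    *link = n;
    if (!q->leftmost || n->key < q->leftmost->key)
        q->leftmost = n;
}

/* Cut the first item without any rebalancing: the leftmost node has no
 * left child, so its right subtree is simply spliced into its place. */
static struct node *mq_pop_min(struct minqueue *q)
{
    struct node *n = q->leftmost, *child;

    if (!n)
        return NULL;
    child = n->right;
    if (child)
        child->parent = n->parent;
    if (n->parent)
        n->parent->left = child;           /* leftmost is always a left child */
    else
        q->root = child;
    if (child) {                           /* find the new first item */
        while (child->left)
            child = child->left;
        q->leftmost = child;
    } else {
        q->leftmost = n->parent;
    }
    return n;
}

int main(void)
{
    struct minqueue q = { NULL, NULL };
    struct node n[4] = { { .key = 3 }, { .key = 1 }, { .key = 2 }, { .key = 5 } };

    for (int i = 0; i < 4; i++)
        mq_insert(&q, &n[i]);
    while (q.leftmost)
        printf("%lu\n", mq_pop_min(&q)->key);   /* prints 1 2 3 5 */
    return 0;
}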

The tests were run on my dated little computer - a 600 MHz Duron.
Each test is run twice to suppress run-order influence.

./smart-queue-v-gavl -n 1 -l 200
gavl_cfs = 55.66 cycles/loop
CFS = 88.33 cycles/loop
TR  = 141.78 cycles/loop
CFS = 90.45 cycles/loop
gavl_cfs = 55.38 cycles/loop

./smart-queue-v-gavl -n 2 -l 200
gavl_cfs = 82.85 cycles/loop
CFS = 104.18 cycles/loop
TR  = 145.21 cycles/loop
CFS = 102.74 cycles/loop
gavl_cfs = 82.05 cycles/loop

./smart-queue-v-gavl -n 4 -l 200
gavl_cfs = 137.45 cycles/loop
CFS = 156.47 cycles/loop
TR  = 142.00 cycles/loop
CFS = 152.65 cycles/loop
gavl_cfs = 139.38 cycles/loop

./smart-queue-v-gavl -n 10 -l 200
gavl_cfs = 229.22 cycles/loop   (WORSE)
CFS = 206.26 cycles/loop
TR  = 140.81 cycles/loop
CFS = 208.29 cycles/loop
gavl_cfs = 223.62 cycles/loop   (WORSE)

./smart-queue-v-gavl -n 100 -l 200
gavl_cfs = 257.66 cycles/loop
CFS = 329.68 cycles/loop
TR  = 142.20 cycles/loop
CFS = 319.34 cycles/loop
gavl_cfs = 260.02 cycles/loop

./smart-queue-v-gavl -n 1000 -l 200
gavl_cfs = 258.41 cycles/loop
CFS = 393.04 cycles/loop
TR  = 134.76 cycles/loop
CFS = 392.20 cycles/loop
gavl_cfs = 260.93 cycles/loop

./smart-queue-v-gavl -n 10000 -l 200
gavl_cfs = 259.45 cycles/loop
CFS = 605.89 cycles/loop
TR  = 196.69 cycles/loop
CFS = 622.60 cycles/loop
gavl_cfs = 262.72 cycles/loop

./smart-queue-v-gavl -n 100000 -l 200
gavl_cfs = 258.21 cycles/loop
CFS = 845.62 cycles/loop
TR  = 315.37 cycles/loop
CFS = 860.21 cycles/loop
gavl_cfs = 258.94 cycles/loop

The GAVL code has not been tuned with any likely/unlikely
constructs. It even carries some overhead from its generic
design which is not necessary for this use - it permanently
keeps a pointer to the last element, ensures that the insertion
order is preserved for equal key values, etc. But it still shows
much better scalability than the kernel's RB-tree code. On the
other hand, it does not encode the color/height in one of the
pointers and requires an additional field for the height.

It may be that the difference is due to some bug in my testing;
in that case I would be interested in a correction. The test
case is probably oversimplified. I have already run many
different tests against the GAVL code in the past to compare it
with different tree and queue implementations and I have not
found a case with real performance degradation. On the other
hand, there are cases with small item counts where GAVL is
sometimes a little worse

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
* William Lee Irwin III [EMAIL PROTECTED] wrote:
> I've been suggesting testing CPU bandwidth allocation as influenced by
> nice numbers for a while now for a reason.

On Sun, Apr 15, 2007 at 09:57:48PM +0200, Ingo Molnar wrote:
> Oh I was very much testing CPU bandwidth allocation as influenced by
> nice numbers - it's one of the basic things i do when modifying the
> scheduler. An automated tool, while nice (all automation is nice)
> wouldnt necessarily show such bugs though, because here too it needed
> thousands of running tasks to trigger in practice. Any volunteers? ;)

Worse comes to worse I might actually get around to doing it myself.
Any more detailed descriptions of the test for a rainy day?
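
One way such a test is commonly structured: start N pure CPU burners at
different nice levels, let them run for a fixed interval, then compare how
much work each got done against the expected per-nice share. A minimal
sketch with illustrative parameters - and note the remark above that the
bug in question needed thousands of running tasks, so the task count would
have to be scaled up well beyond this:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define NPROC 4        /* on SMP, use more tasks than CPUs (or pin them) */

int main(void)
{
    /* per-child work counters, shared between parent and children */
    volatile unsigned long *work =
        mmap(NULL, NPROC * sizeof(*work), PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    int nice_level[NPROC] = { 0, 5, 10, 19 };
    pid_t pid[NPROC];

    if (work == MAP_FAILED)
        return 1;
    for (int i = 0; i < NPROC; i++) {
        pid[i] = fork();
        if (pid[i] == 0) {
            (void)nice(nice_level[i]);
            for (;;)                       /* pure CPU burner */
                work[i]++;
        }
    }
    sleep(30);                             /* measurement interval */
    for (int i = 0; i < NPROC; i++)
        kill(pid[i], SIGKILL);
    while (wait(NULL) > 0)
        ;
    for (int i = 0; i < NPROC; i++)
        printf("nice %2d: %lu iterations\n", nice_level[i], work[i]);
    return 0;
}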


-- wli


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Sunday 15 April 2007, Mike Galbraith wrote:
> On Sun, 2007-04-15 at 12:58 -0400, Gene Heskett wrote:
> > Chuckle, possibly but then I'm not anything even remotely close to an
> > expert here Con, just reporting what I get.  And I just rebooted to
> > 2.6.21-rc6 + sched-mike-5.patch for grins and giggles, or frowns and
> > profanity as the case may call for.
>
> Erm, that patch is embarrassingly buggy, so profanity should dominate.
>
>        -Mike

Chuckle, ROTFLMAO even.

I didn't run it that long, as I immediately rebuilt and rebooted when I found
I'd used the wrong patch; in fact I had tested that one and found it
sub-optimal before I'd built and run Con's -0.40 version.  As for bugs of the
type that make it to the screen or logs, I didn't see any.  OTOH, my eyesight
is slowly going downhill, now 20/25.  It was 20/10 30 years ago.  Now that's
reason for profanity...

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Unix weenies are as bad at this as anyone.
 -- Larry Wall in [EMAIL PROTECTED]


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Peter Williams

William Lee Irwin III wrote:
> On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> > 2) plugsched did not allow on the fly selection of schedulers, nor did
> >    it allow a per CPU selection of schedulers. IO schedulers you can
> >    change per disk, on the fly, making them much more useful in
> >    practice. Also, IO schedulers (while definitely not being slow!) are
> >    alot less performance sensitive than CPU schedulers.
>
> One of the reasons I never posted my own code is that it never met its
> own design goals, which absolutely included switching on the fly. I
> think Peter Williams may have done something about that.

I didn't but some students did.

In a previous life, I did implement a runtime configurable CPU
scheduling mechanism (implemented on True64, Solaris and Linux) that
allowed schedulers to be loaded as modules at run time.  This was
released commercially on True64 and Solaris.  So I know that it can be
done.

I have thought about doing something similar for the SPA schedulers,
which differ in only small ways from each other, but lack motivation.

> It was my hope
> to be able to do insmod sched_foo.ko until it became clear that the
> effort it was intended to assist wasn't going to get even the limited
> hardware access required, at which point I largely stopped working on
> it.
>
> On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> > 3) I/O schedulers are pretty damn clean code, and plugsched, at least
> >    the last version i saw of it, didnt come even close.
>
> I'm not sure what happened there. It wasn't a big enough patch to take
> hits in this area due to getting overwhelmed by the programming burden
> like some other efforts of mine. Maybe things started getting ugly once
> on-the-fly switching entered the picture. My guess is that Peter Williams
> will have to chime in here, since things have diverged enough from my
> one-time contribution 4 years ago.


From my POV, the current version of plugsched is considerably simpler 
than it was when I took the code over from Con as I put considerable 
effort into minimizing code overlap in the various schedulers.


I also put considerable effort into minimizing any changes to the load
balancing code (something Ingo seems to think is a deficiency) and the
result is that plugsched allows intra-runqueue scheduling to be
easily modified WITHOUT affecting load balancing.  To my mind, scheduling
and load balancing are orthogonal and keeping them that way simplifies
things.
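
To illustrate the orthogonality argument with a toy example: if the
per-runqueue policy is a small ops table, a balancer written purely against
those ops never has to change when the policy does. This is a sketch of the
idea only; none of these names come from plugsched itself:

#include <stdio.h>

struct task { int prio; struct task *next; };

struct rq { struct task *head; };          /* toy single-list runqueue */

struct sched_policy {
    const char *name;
    void (*enqueue)(struct rq *, struct task *);
    struct task *(*pick_next)(struct rq *);
};

static struct task *pick_head(struct rq *rq)   /* shared "pop head" */
{
    struct task *p = rq->head;
    if (p)
        rq->head = p->next;
    return p;
}

static void fifo_enqueue(struct rq *rq, struct task *p)    /* append */
{
    struct task **pp = &rq->head;
    while (*pp)
        pp = &(*pp)->next;
    p->next = NULL;
    *pp = p;
}

static void prio_enqueue(struct rq *rq, struct task *p)    /* sorted insert */
{
    struct task **pp = &rq->head;
    while (*pp && (*pp)->prio <= p->prio)
        pp = &(*pp)->next;
    p->next = *pp;
    *pp = p;
}

static const struct sched_policy fifo = { "fifo", fifo_enqueue, pick_head };
static const struct sched_policy prio = { "prio", prio_enqueue, pick_head };

/* A load balancer that only moves tasks via enqueue/pick_next would work
 * unchanged with either policy - the orthogonality argued for above. */
int main(void)
{
    const struct sched_policy *pol = &prio;    /* per-runqueue choice */
    struct rq rq = { NULL };
    struct task a = { 3, NULL }, b = { 1, NULL }, c = { 2, NULL };

    pol->enqueue(&rq, &a);
    pol->enqueue(&rq, &b);
    pol->enqueue(&rq, &c);
    for (struct task *p; (p = pol->pick_next(&rq)) != NULL; )
        printf("%s picked prio %d\n", pol->name, p->prio);
    return 0;
}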


As Ingo correctly points out, plugsched does not allow different 
schedulers to be used per CPU but it would not be difficult to modify it 
so that they could.  Although I've considered doing this over the years 
I decided not to as it would just increase the complexity and the amount 
of work required to keep the patch set going.  About six months ago I 
decided to reduce the amount of work I was doing on plugsched (as it was 
obviously never going to be accepted) and now only publish patches 
against the vanilla kernel's major releases (and the only reason that I 
kept doing that is that the download figures indicated that about 80 
users were interested in the experiment).


Peter
PS I no longer read LKML (due to time constraints) and would appreciate
it if I could be CC'd on any e-mails suggesting scheduler changes.
PPS I'm just happy to see that Ingo has finally accepted that the
vanilla scheduler was badly in need of fixing and don't really care who
fixes it.
PPPS Different schedulers for different aims (i.e. server or work
station) do make a difference.  E.g. the spa_svr scheduler in plugsched
does about 1% better on kernbench than the next best scheduler in the
bunch.
PPPPS Con, fairness isn't always best, as humans aren't very altruistic
and we need to give unfair preference to interactive tasks in order to
stop the users flinging their PCs out the window.  But the current
scheduler doesn't do this very well and is also not very good at
fairness, so it needs to change.  But the changes need to address
interactive response as well as fairness, not just fairness.

--
Peter Williams   [EMAIL PROTECTED]

Learning, n. The kind of ignorance distinguishing the studious.
 -- Ambrose Bierce


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Mon, Apr 16, 2007 at 08:52:33AM +1000, Con Kolivas wrote:
> On Monday 16 April 2007 05:00, Jonathan Lundell wrote:
> > On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:
> > > It's a really good thing, and it means that if somebody shows that your
> > > code is flawed in some way (by, for example, making a patch that people
> > > claim gets better behaviour or numbers), any *good* programmer that
> > > actually cares about his code will obviously suddenly be very motivated
> > > to out-do the out-doer!
> >
> > No one who cannot rejoice in the discovery of his own mistakes
> > deserves to be called a scholar.
>
> Lovely comment. I realise this is not truly directed at me, but clearly, in
> the context in which it has been said, people will assume it is directed my
> way, so while we're all spinning lkml-quality rhetoric, let me have a right
> of reply.
>
> One thing I have never tried to do was to ignore bug reports. I'm forever
> joking that I keep pulling code out of my arse to improve what I've done.
> RSDL/SD was no exception; heck, it had 40 iterations. The reason I could not
> reply to bug report A with "Oh, that is problem B so I'll fix it with code C"
> was, as I've said many many times over, health related. I did indeed try to
> fix many of them without spending hours replying to sometimes unpleasant
> emails. If health wasn't an issue there might have been 1000 iterations of
> SD.

Well what matters is the code and development. I don't think Ingo's
scheduler is the final word, although I worry that Linus might jump the
gun and merge something just to give it a test, which we then get
stuck with :P

I don't know how anybody can think Ingo's new scheduler is anything but
a good thing (so long as it has to compete before being merged). And
that's coming from someone who wants *their* scheduler to get merged...
I think mine can compete ;) and if it can't, then I'd rather be using
the scheduler that beats it.


> There was only ever _one_ thing that I was absolutely steadfast on as a
> concept, that I refused to fix, and that people might claim was a mistake I
> did not rejoice in enough to be a scholar. That was that the _correct_
> behaviour for a scheduler is to be fair, such that proportional slowdown
> with load is (using that awful pun) a feature, not a bug.

If something is using more than a fair share of CPU time, over some macro
period, in order to be interactive, then definitely it should get throttled.
I've always maintained (since starting scheduler work) that the 2.6 scheduler
is horrible because it allows these cases where some things can get more CPU
time just by how they behave.

Glad people are starting to come around on that point.


So, on to something productive, we have 3 candidates for a new scheduler so
far. How do we decide which way to go? (and yes, I still think switchable
schedulers is wrong and a copout) This is one area where it is virtually
impossible to discount any decent design on correctness/performance/etc.
and even testing in -mm isn't really enough.



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Sun, Apr 15, 2007 at 04:31:54PM -0500, Matt Mackall wrote:
> On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
> > 4) the good thing that happened to I/O, after years of stagnation isnt
> >    I/O schedulers. The good thing that happened to I/O is called Jens
> >    Axboe. If you care about the I/O subsystem then print that name out
> >    and hang it on the wall. That and only that is what mattered.
>
> Disagree. Things didn't actually get interesting until Nick showed up
> with AS and got it in-tree to demonstrate the huge amount of room we
> had for improvement. It took several iterations of AS and CFQ (with a
> couple complete rewrites) before CFQ began to look like the winner.
> The resulting time-sliced CFQ was fairly heavily influenced by the
> ideas in AS.

Well to be fair, Jens had just implemented deadline, which got me
interested ;)

Actually, I would still like to be able to deprecate deadline for
AS, because AS has a tunable that you can switch to turn off read
anticipation and revert to deadline behaviour (or very close to).

It would have been nice if CFQ were then a layer on top of AS that
implemented priorities (or vice versa). And then AS could be
deprecated and we'd be back to 1 primary scheduler.

Well CFQ seems to be going in the right direction with that, however
some large users still find AS faster for some reason...

Anyway, moral of the story is that I think it would have been nice
if we hadn't proliferated IO schedulers, however in practice it
isn't easy to just layer features on top of each other, and also
keeping deadline helped a lot to be able to debug and examine
performance regressions and actually get code upstream. And this
was true even when it was globally boottime switchable only.

I'd prefer if we kept a single CPU scheduler in mainline, because I
think that simplifies analysis and focuses testing. I think we can
have one that is good enough for everyone. But if the only other
option for progress is that Linus or Andrew just pull one out of a
hat, then I would rather merge all of them. Yes I think Con's
scheduler should get a fair go, ditto for Ingo's, mine, and anyone
else's.


> > nor was the non-modularity of some piece of code ever an impediment to
> > competition. May i remind you of the pretty competitive SLAB allocator
> > landscape, resulting in things like the SLOB allocator, written by
> > yourself? ;-)
>
> Thankfully no one came out and said "we don't want to balkanize the
> allocator landscape" when I submitted it or I probably would have just
> dropped it, rather than painfully dragging it along out of tree for
> years. I'm not nearly the glutton for punishment that Con is. :-P

I don't think this is a fault of the people or the code involved.
We just didn't have much collective drive to replace the scheduler,
and even less an idea of how to decide between any two of them.

I've kept nicksched around since 2003 or so and no hard feelings ;)



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
William Lee Irwin III wrote:
> One of the reasons I never posted my own code is that it never met its
> own design goals, which absolutely included switching on the fly. I
> think Peter Williams may have done something about that. It was my hope
> to be able to do insmod sched_foo.ko until it became clear that the
> effort it was intended to assist wasn't going to get even the limited
> hardware access required, at which point I largely stopped working on
> it.

On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> I didn't but some students did.
> In a previous life, I did implement a runtime configurable CPU
> scheduling mechanism (implemented on True64, Solaris and Linux) that
> allowed schedulers to be loaded as modules at run time.  This was
> released commercially on True64 and Solaris.  So I know that it can be
> done.
> I have thought about doing something similar for the SPA schedulers,
> which differ in only small ways from each other, but lack motivation.

Driver models for scheduling are not so far out. AFAICS it's largely a
tug-of-war over design goals, e.g. maintaining per-cpu runqueues and
switching out intra-queue policies vs. switching out whole-system
policies, SMP handling and all. Whether this involves load balancing
depends strongly on e.g. whether you have per-cpu runqueues. A 2.4.x
scheduler module, for instance, would not have a load balancer at all,
as it has only one global runqueue. There are other sorts of policies
wanting significant changes to SMP handling vs. the stock load
balancing.


William Lee Irwin III wrote:
> I'm not sure what happened there. It wasn't a big enough patch to take
> hits in this area due to getting overwhelmed by the programming burden
> like some other efforts of mine. Maybe things started getting ugly once
> on-the-fly switching entered the picture. My guess is that Peter Williams
> will have to chime in here, since things have diverged enough from my
> one-time contribution 4 years ago.

On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> From my POV, the current version of plugsched is considerably simpler
> than it was when I took the code over from Con as I put considerable
> effort into minimizing code overlap in the various schedulers.
> I also put considerable effort into minimizing any changes to the load
> balancing code (something Ingo seems to think is a deficiency) and the
> result is that plugsched allows intra-runqueue scheduling to be
> easily modified WITHOUT affecting load balancing.  To my mind, scheduling
> and load balancing are orthogonal and keeping them that way simplifies
> things.

ISTR rearranging things for Con in such a fashion that it no longer
worked out of the box (though that wasn't the intention; restructuring it
to be more suited to his purposes was), and that's what he worked off of
afterward. I don't remember very well what changed there, as I clearly
invested less effort there than in the prior versions. Now that I think of
it, that may have been where the sample policy demonstrating scheduling
classes was lost.


On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> As Ingo correctly points out, plugsched does not allow different
> schedulers to be used per CPU but it would not be difficult to modify it
> so that they could.  Although I've considered doing this over the years
> I decided not to as it would just increase the complexity and the amount
> of work required to keep the patch set going.  About six months ago I
> decided to reduce the amount of work I was doing on plugsched (as it was
> obviously never going to be accepted) and now only publish patches
> against the vanilla kernel's major releases (and the only reason that I
> kept doing that is that the download figures indicated that about 80
> users were interested in the experiment).

That's a rather different goal from what I was going on about with it,
so it's all diverged quite a bit. Where I had a significant need for
mucking with the entire concept of how SMP was handled, this is rather
different. At this point I'm questioning the relevance of my own work,
though it was already relatively marginal as it started life as an
attempt at a sort of debug patch to help gang scheduling (which is in
itself a rather marginally relevant feature to most users) code along.


On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote:
> PS I no longer read LKML (due to time constraints) and would appreciate
> it if I could be CC'd on any e-mails suggesting scheduler changes.
> PPS I'm just happy to see that Ingo has finally accepted that the
> vanilla scheduler was badly in need of fixing and don't really care who
> fixes it.
> PPPS Different schedulers for different aims (i.e. server or work
> station) do make a difference.  E.g. the spa_svr scheduler in plugsched
> does about 1% better on kernbench than the next best scheduler in the
> bunch.
> PPPPS Con, fairness isn't always best, as humans aren't very altruistic
> and we need to give unfair

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 12:28, Nick Piggin wrote:
> So, on to something productive, we have 3 candidates for a new scheduler so
> far. How do we decide which way to go? (and yes, I still think switchable
> schedulers is wrong and a copout) This is one area where it is virtually
> impossible to discount any decent design on correctness/performance/etc.
> and even testing in -mm isn't really enough.

We're in agreement! YAY!

Actually this is simpler than that. I'm taking SD out of the picture. It has
served its purpose of proving that we need to seriously address all the
scheduling issues and did more than a half decent job at it. Unfortunately I
also cannot sit around supporting it forever by myself. My own life is more
important, so consider SD not even running the race any more.

I'm off to continue maintaining permanent-out-of-tree leisurely code at my own
pace. What's more, I think I'll just stick to staircase Gen I version blah
and shelve SD and try to have fond memories of SD as an intellectual
prompting exercise only.

-- 
-ck


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Mon, Apr 16, 2007 at 01:15:27PM +1000, Con Kolivas wrote:
> On Monday 16 April 2007 12:28, Nick Piggin wrote:
> > So, on to something productive, we have 3 candidates for a new scheduler
> > so far. How do we decide which way to go? (and yes, I still think
> > switchable schedulers is wrong and a copout) This is one area where it is
> > virtually impossible to discount any decent design on
> > correctness/performance/etc. and even testing in -mm isn't really enough.
>
> We're in agreement! YAY!
>
> Actually this is simpler than that. I'm taking SD out of the picture. It
> has served its purpose of proving that we need to seriously address all the
> scheduling issues and did more than a half decent job at it. Unfortunately
> I also cannot sit around supporting it forever by myself. My own life is
> more important, so consider SD not even running the race any more.
>
> I'm off to continue maintaining permanent-out-of-tree leisurely code at my
> own pace. What's more, I think I'll just stick to staircase Gen I version
> blah and shelve SD and try to have fond memories of SD as an intellectual
> prompting exercise only.

Well I would hope that _if_ we decide to switch schedulers, then you
get a chance to field something (and I hope you will decide to and have
time to), and I hope we don't rush into the decision.

We've had the current scheduler for so many years now that it is much
more important to make sure we take the time to do the right thing
rather than absolutely have to merge a new scheduler right now ;)



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 01:05, Ingo Molnar wrote:
> * Con Kolivas [EMAIL PROTECTED] wrote:
> > 2. Since then I've been thinking/working on a cpu scheduler design
> > that takes away all the guesswork out of scheduling and gives very
> > predictable, as fair as possible, cpu distribution and latency while
> > preserving as solid interactivity as possible within those confines.
>
> yeah. I think you were right on target with this call.

Yay thank goodness :) It's time to fix the damn cpu scheduler once and for
all. Everyone uses this; it's no minor driver or $bigsmp or $bigram or
$small_embedded_RT_hardware feature.

> I've applied the
> sched.c change attached at the bottom of this mail to the CFS patch, if
> you dont mind. (or feel free to suggest some other text instead.)
>
>    *  2003-09-03   Interactivity tuning by Con Kolivas.
>    *  2004-04-02   Scheduler domains code by Nick Piggin
>  + *  2007-04-15   Con Kolivas was dead right: fairness matters! :)

LOL, that's awful. I'd prefer something meaningful like "Work begun on
replacing all interactivity tuning with a fair virtual-deadline design by Con
Kolivas."

While you're at it, it's worth getting rid of a few slightly pointless name
changes too. Don't rename SCHED_NORMAL yet again, and don't call all your
things sched_fair, blah_fair, __blah_fair and so on. It means that anything
else is by proxy going to be considered unfair. Leave SCHED_NORMAL as is, and
replace the use of the word _fair with _cfs. I don't really care how many
copyright notices you put into our already noisy bootup, but it's redundant
since there is no choice; we all get the same cpu scheduler.

> > 1. I tried in vain some time ago to push a working extensible
> > pluggable cpu scheduler framework (based on wli's work) for the linux
> > kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he
> > didn't like it) as being absolutely the wrong approach and that we
> > should never do that. [...]
>
> i partially replied to that point to Will already, and i'd like to make
> it clear again: yes, i rejected plugsched 2-3 years ago (which already
> drifted away from wli's original codebase) and i would still reject it
> today.

No, that was just me being flabbergasted by what appeared to be you posting
your own plugsched. Note that nowhere in the 40 iterations of rsdl-sd did I
ask/suggest plugsched. I said in my first announcement that my aim was to
create a scheduling policy robust enough for all situations rather than
fantastic a lot of the time and awful sometimes. There are plenty of people
ready to throw out arguments for plugsched now and I don't have the energy
to continue that fight (I never did really).

But my question still stands about this comment:

> case, all of SD's logic could be added via a kernel/sched_sd.c module
> as well, if Con is interested in such an approach. ]

What exactly would be the purpose of such a module that governs nothing in
particular? Since there'll be no pluggable scheduler, by your own admission
it has no control over SCHED_NORMAL, and it would require another scheduling
policy to govern, for which there is no express way to opt in at the moment;
people tend to just use the default without great effort.

> First and foremost, please dont take such rejections too personally - i
> had my own share of rejections (and in fact, as i mentioned it in a
> previous mail, i had a fair number of complete project throwaways:
> 4g:4g, in-kernel Tux, irqrate and many others). I know that they can
> hurt and can demoralize, but if i dont like something it's my job to
> tell that.

Hmm? No, that's not what this is about. Remember dynticks, which was not
originally my code but which I tried to bring up to mainline standard and
fought with for months? You came along with yet another rewrite from
scratch, and the flaws in the design I was working with were obvious, so I
instantly bowed down to that and never touched my code again. I didn't ask
for credit back then, but I obviously brought the requirement for a
no-idle-tick implementation to the table.

 My view about plugsched: first please take a look at the latest
 plugsched code:

   http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch

   26 files changed, 8951 insertions(+), 1495 deletions(-)

 As an experiment i've removed all the add-on schedulers (both the core
 and the include files, only kept the vanilla one) from the plugsched
 patch (and the makefile and kconfig complications, etc), to see the
 'infrastructure cost', and it still gave:

   12 files changed, 1933 insertions(+), 1479 deletions(-)

I do not see extra code per se as being a bad thing. I've heard it said a few 
times before: "ever notice how when the correct solution is done it is a lot 
more code than the quick hack that ultimately fails?". Insert long-winded 
discussion of "perfect is the enemy of good" here, _but_ I'm not arguing 
perfect versus good, I'm talking about solid code versus quick fix. Again, 
none of this comment is directed 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread hui
On Sun, Apr 15, 2007 at 09:25:07AM -0700, Arjan van de Ven wrote:
 Now this doesn't mean that people shouldn't be nice to each other, not
 cooperate or steal credits, but I don't get the impression that that is
 happening here. Ingo is taking part in the discussion with a counter
 proposal for discussion *on the mailing list*. What more do you want??

Con should have been CCed from the first moment this was put into motion
to limit the perception of exclusion. That was mistake number one and a
big-time failure to understand this dynamic. After all, it was Con's idea.
Why the hell he was excluded from Ingo's development process is baffling
to me and him (most likely).

He put in a lot of effort into SD and his experiences with scheduling
should still be seriously considered in this development process even if
he doesn't write a single line of code from this moment on.

What should have happened is that our very busy associate at RH by the
name of Ingo Molnar should have leveraged more of Con's and Bill's work
and used them as a proxy for his own ideas. They would have loved to have
contributed more, and our very busy Ingo Molnar would have gotten a lot
of his work and ideas implemented without him even opening a single
source file for editing. They would have happily done this work for
Ingo. Ingo could have been used for something else more important like
making KVM less of a freaking ugly hack and we all would have benefitted
from this.

He could have been working on SystemTap so that you stop losing accounts
to Sun and Solaris 10's Dtrace. He could have been working with Riel to
fix your butt ugly page scanning problem causing horrible contention via
the Clock/Pro algorithm, etc... He could have been fixing the ugly futex
rwsem mapping problem that's killing -rt and anything that uses Posix
threads. He could have created a userspace thread control block (TCB)
with Mr. Drepper so that we can turn off preemption in userspace
(userspace per CPU local storage) and implement a very quick non-kernel
crossing implementation of priority ceilings (userspace check for priority
and flags at preempt_schedule() in the TCB) so that our -rt Posix API
doesn't suck donkey shit... Need I say more ?
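
(To make that TCB idea concrete: a very rough sketch, entirely
hypothetical -- no such kernel/userspace contract exists, and every name
below is made up:)

    #include <sched.h>

    /*
     * Hypothetical per-thread block shared between userspace and the
     * kernel.  Userspace flips preempt_disable without any syscall;
     * the kernel, instead of preempting, sets preempt_pending and the
     * thread pays for it at the next enable.
     */
    struct user_tcb {
            volatile int preempt_disable;
            volatile int preempt_pending;
    };

    static inline void tcb_preempt_disable(struct user_tcb *tcb)
    {
            tcb->preempt_disable = 1;       /* no kernel crossing */
    }

    static inline void tcb_preempt_enable(struct user_tcb *tcb)
    {
            tcb->preempt_disable = 0;
            if (tcb->preempt_pending) {
                    tcb->preempt_pending = 0;
                    sched_yield();          /* pay deferred preemption */
            }
    }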

As programmers like Ingo get spread more thinly, he needs super smart
folks like Bill Irwin and Con to help him out, and must learn to resist
NIH-ing folks' stuff out of some weird fear. When this happens, folks
like Ingo must learn to facilitate development in addition to
implementing it with those kinds of folks.

This takes time and practice to entrust folks to do things for him.
Ingo is the best method of getting new Linux kernel ideas communicated
to Linus. His value goes beyond just code and is often the
biggest hammer we have in the Linux community to get stuff into the
kernel. Facilitation of others is something that solo programmers must
learn as groups like the Linux kernel get larger and larger every year.

Understand ? Are we in embarrassing agreement here ?

bill

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Davide Libenzi
On Mon, 16 Apr 2007, Pavel Pisa wrote:

 I cannot help myself to not report results with GAVL
 tree algorithm there as an another race competitor.
 I believe that it is a better solution for large priority
 queues than RB-tree and even heap trees. It could be
 disputable if the scheduler needs such scalability on
 the other hand. The AVL heritage guarantees lower height
 which results in shorter search times which could
 be profitable for other uses in kernel.
 
 GAVL algorithm is AVL tree based, so it does not suffer from
 infinite priorities granularity there as TR does. It allows
 use for the generalized case where the tree is not fully balanced.
 This allows cutting the first item without rebalancing.
 This leads to the degradation of the tree by one more level
 (than non degraded AVL gives) in maximum, which is still
 considerably better than RB-trees maximum.
 
 http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

Here are the results on my Opteron 252:

Testing N=1
gavl_cfs = 187.20 cycles/loop
CFS = 194.16 cycles/loop
TR  = 314.87 cycles/loop
CFS = 194.15 cycles/loop
gavl_cfs = 187.15 cycles/loop

Testing N=2
gavl_cfs = 268.94 cycles/loop
CFS = 305.53 cycles/loop
TR  = 313.78 cycles/loop
CFS = 289.58 cycles/loop
gavl_cfs = 266.02 cycles/loop

Testing N=4
gavl_cfs = 452.13 cycles/loop
CFS = 518.81 cycles/loop
TR  = 311.54 cycles/loop
CFS = 516.23 cycles/loop
gavl_cfs = 450.73 cycles/loop

Testing N=8
gavl_cfs = 609.29 cycles/loop
CFS = 644.65 cycles/loop
TR  = 308.11 cycles/loop
CFS = 667.01 cycles/loop
gavl_cfs = 592.89 cycles/loop

Testing N=16
gavl_cfs = 686.30 cycles/loop
CFS = 807.41 cycles/loop
TR  = 317.20 cycles/loop
CFS = 810.24 cycles/loop
gavl_cfs = 688.42 cycles/loop

Testing N=32
gavl_cfs = 756.57 cycles/loop
CFS = 852.14 cycles/loop
TR  = 301.22 cycles/loop
CFS = 876.12 cycles/loop
gavl_cfs = 758.46 cycles/loop

Testing N=64
gavl_cfs = 831.97 cycles/loop
CFS = 997.16 cycles/loop
TR  = 304.74 cycles/loop
CFS = 1003.26 cycles/loop
gavl_cfs = 832.83 cycles/loop

Testing N=128
gavl_cfs = 897.33 cycles/loop
CFS = 1030.36 cycles/loop
TR  = 295.65 cycles/loop
CFS = 1035.29 cycles/loop
gavl_cfs = 892.51 cycles/loop

Testing N=256
gavl_cfs = 963.17 cycles/loop
CFS = 1146.04 cycles/loop
TR  = 295.35 cycles/loop
CFS = 1162.04 cycles/loop
gavl_cfs = 966.31 cycles/loop

Testing N=512
gavl_cfs = 1029.82 cycles/loop
CFS = 1218.34 cycles/loop
TR  = 288.78 cycles/loop
CFS = 1257.97 cycles/loop
gavl_cfs = 1029.83 cycles/loop

Testing N=1024
gavl_cfs = 1091.76 cycles/loop
CFS = 1318.47 cycles/loop
TR  = 287.74 cycles/loop
CFS = 1311.72 cycles/loop
gavl_cfs = 1093.29 cycles/loop

Testing N=2048
gavl_cfs = 1153.03 cycles/loop
CFS = 1398.84 cycles/loop
TR  = 286.75 cycles/loop
CFS = 1438.68 cycles/loop
gavl_cfs = 1149.97 cycles/loop


There seems to be some difference from your numbers. This is with:

gcc version 4.1.2

and -O2. But then, an Opteron can behave quite differently than a Duron on 
a bench like this ;)
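
(For reference, cycles/loop numbers like the above are just TSC-timed
averages over the queue operations. A minimal sketch of the measurement
idea -- not the actual smart-queue-v-gavl.c code:)

    #include <stdint.h>

    static inline uint64_t rdtsc(void)
    {
            uint32_t lo, hi;
            __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
            return ((uint64_t)hi << 32) | lo;
    }

    /*
     * Preload N elements, then time 'loops' insert+pop-first pairs
     * and report the average cost of one pair in cycles.
     */
    static uint64_t bench_pair(void (*insert)(long key),
                               long (*pop_first)(void),
                               long nelems, long loops)
    {
            uint64_t t0, t1;
            long i;

            for (i = 0; i < nelems; i++)
                    insert(i);
            t0 = rdtsc();
            for (i = 0; i < loops; i++)
                    insert(pop_first() + nelems);  /* keep N constant */
            t1 = rdtsc();
            return (t1 - t0) / loops;              /* cycles/loop */
    }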



- Davide


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 13:27 +1000, Con Kolivas wrote:
 On Saturday 14 April 2007 06:21, Ingo Molnar wrote:
  [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler
  [CFS]
 
  i'm pleased to announce the first release of the Modular Scheduler Core
  and Completely Fair Scheduler [CFS] patchset:
 
 http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
 
  This project is a complete rewrite of the Linux task scheduler. My goal
  is to address various feature requests and to fix deficiencies in the
  vanilla scheduler that were suggested/found in the past few years, both
  for desktop scheduling and for server scheduling workloads.
 
 The casual observer will be completely confused by what on earth has happened 
 here so let me try to demystify things for them.

[...]

Demystify what?   The casual observer need only read either your attempt
at writing a scheduler, or my attempts at fixing the one we have, to see
that it was high time for someone with the necessary skills to step in.
Now progress can happen, which was _not_ happening before.

-Mike

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote:

 Well, I'll stop heating the room for now as I get out of ideas about how
 to defeat it. I'm convinced. I'm impatient to read about Mike's feedback
 with his workload which behaves strangely on RSDL. If it works OK here,
 it will be the proof that heuristics should not be needed.

You mean the X + mp3 player + audio visualization test?  X+Gforce
visualization have problems getting half of my box in the presence of
two other heavy cpu using tasks.  Behavior is _much_ better than
RSDL/SD, but the synchronous nature of X/client seems to be a problem.  

With this scheduler, renicing X/client does cure it, whereas with SD it
did not help one bit.  (I know a trivial way to cure that, and this
framework makes that possible without dorking up fairness as a general
policy.)

-Mike

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread hui
On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote:
 [...]
 
 Demystify what?   The casual observer need only read either your attempt

Here's the problem. You're a casual observer and obviously not paying
attention.

 at writing a scheduler, or my attempts at fixing the one we have, to see
 that it was high time for someone with the necessary skills to step in.
 Now progress can happen, which was _not_ happening before.

I think that's inaccurate and there are plenty of folks that have that
technical skill and background. The scheduler code isn't a deep mystery
and there are plenty of good kernel hackers out here across many
communities.  Ingo isn't the only person on this planet to have deep
scheduler knowledge. Priority heaps are not new and Solaris has had a
pluggable scheduler framework for years.

Con's characterization is something that I'm more prone to believe about 
how Linux kernel development works versus your view. I think it's a great 
shame to have folks like Bill Irwin and Con waste time trying to 
do something right only to have their ideas attacked, then copied and held 
up as the solution for this kind of technical problem, in a complete reversal 
of technical opinion as it suits the moment. This is just wrong in so many 
ways.

It outlines the problems with Linux kernel development and questionable
elitism regarding ownership of certain sections of the kernel code.

I call it "churn squatting", and instances like this only support that view, 
which I would rather be completely wrong and inaccurate instead.

bill

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 01:36 -0700, Bill Huey wrote:
 On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote:
  [...]
  
  Demystify what?   The casual observer need only read either your attempt
 
 Here's the problem. You're a casual observer and obviously not paying
 attention.
 
  at writing a scheduler, or my attempts at fixing the one we have, to see
  that it was high time for someone with the necessary skills to step in.
  Now progress can happen, which was _not_ happening before.
 
 I think that's inaccurate and there are plenty of folks that have that
 technical skill and background. The scheduler code isn't a deep mystery
 and there are plenty of good kernel hackers out here across many
 communities.  Ingo isn't the only person on this planet to have deep
 scheduler knowledge.

Ok shrug, I'm not paying attention, and you can't read.  We're even.
Have a nice life.

-Mike

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Bill Huey [EMAIL PROTECTED] wrote:

 Hello folks,
 
 I think the main failure I see here is that Con wasn't included in 
 this design or privately in review process. There could have been 
 better co-ownership of the code. This could also have been done openly 
 on lkml [...]

Bill, you come from a BSD background and you are still relatively new to 
Linux development, so i dont at all fault you for misunderstanding this 
situation, and fortunately i have a really easy resolution for your 
worries: i did exactly that! :)

i wrote the first line of code of the CFS patch this week, 8am Wednesday 
morning, and released it to lkml 62 hours later, 22pm on Friday. (I've 
listed the file timestamps of my backup patches further below, for all 
the fine details.)

I prefer such early releases to lkml _a lot_ more than any private review 
process. I released the CFS code about 6 hours after i thought "okay, 
this looks pretty good" and i spent those final 6 hours on testing it 
(making sure it doesnt blow up on your box, etc.), in the final 2 hours 
i showed it to two folks i could reach on IRC (Arjan and Thomas) and on 
various finishing touches. It doesnt get much faster than that and i 
definitely didnt want to sit on it even one day longer because i very 
much thought that Con and others should definitely see this work!

And i very much credited (and still credit) Con for the whole fairness 
angle:

||  i'd like to give credit to Con Kolivas for the general approach here:
||  he has proven via RSDL/SD that 'fair scheduling' is possible and that
||  it results in better desktop scheduling. Kudos Con!

the 'design consultation' phase you are talking about is _NOW_! :)

I got the v1 code out to Con, to Mike and to many others ASAP. That's 
how you are able to comment on this thread and be part of the 
development process to begin with, in a 'private consultation' setup 
you'd not have had any opportunity to see _any_ of this.

In the BSD space there seem to be more 'political' mechanisms for 
development, but Linux is truly about doing things out in the open, and 
doing it immediately.

Okay? ;-)

Here's the timestamps of all my backups of the patch, from its humble 4K 
beginnings to the 100K first-cut v1 result:

-rw-rw-r-- 1 mingo mingo 4230 Apr 11 08:47 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 7653 Apr 11 09:12 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 7728 Apr 11 09:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 14416 Apr 11 10:08 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 24211 Apr 11 10:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 27878 Apr 11 10:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 33807 Apr 11 11:05 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 34524 Apr 11 11:09 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 39650 Apr 11 11:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40231 Apr 11 11:34 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40627 Apr 11 11:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40638 Apr 11 11:54 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42733 Apr 11 12:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42817 Apr 11 12:31 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43270 Apr 11 12:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43531 Apr 11 12:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 44331 Apr 11 12:51 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45173 Apr 11 12:56 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45288 Apr 11 12:59 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45368 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45370 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45815 Apr 11 13:14 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45887 Apr 11 13:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45914 Apr 11 13:25 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45850 Apr 11 13:29 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 49196 Apr 11 13:39 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64317 Apr 11 13:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64403 Apr 11 13:52 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:03 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:07 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 68995 Apr 11 14:50 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 69919 Apr 11 15:23 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71065 Apr 11 16:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 70642 Apr 11 16:28 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 72334 Apr 11 16:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71624 Apr 11 17:01 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71854 Apr 11 17:20 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 73571 Apr 11 17:42 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:51 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Mike Galbraith [EMAIL PROTECTED] wrote:

 On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote:
 
  Well, I'll stop heating the room for now as I get out of ideas about 
  how to defeat it. I'm convinced. I'm impatient to read about Mike's 
  feedback with his workload which behaves strangely on RSDL. If it 
  works OK here, it will be the proof that heuristics should not be 
  needed.
 
 You mean the X + mp3 player + audio visualization test?  X+Gforce 
 visualization have problems getting half of my box in the presence of 
 two other heavy cpu using tasks.  Behavior is _much_ better than 
 RSDL/SD, but the synchronous nature of X/client seems to be a problem.
 
 With this scheduler, renicing X/client does cure it, whereas with SD 
 it did not help one bit. [...]

thanks for testing it! I was quite worried about your setup - two tasks 
using up 50%/50% of CPU time, pitted against a kernel rebuild workload 
seems to be a hard workload to get right.

 [...] (I know a trivial way to cure that, and this framework makes 
 that possible without dorking up fairness as a general policy.)

great! Please send patches so i can add them (once you are happy with 
the solution) - i think your workload isnt special in any way and could 
hit other people too.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Bill Huey [EMAIL PROTECTED] wrote:

 On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote:
  [...]
  
  Demystify what?  The casual observer need only read either your 
  attempt
 
 Here's the problem. You're a casual observer and obviously not paying 
 attention.

guys, please calm down. Judging by the number of contributions to 
sched.c the main folks who are not 'observers' here and who thus have an 
unalienable right to be involved in a nasty flamewar about scheduler 
interactivity are Con, Mike, Nick and me ;-) Everyone else is just a 
happy bystander, ok? ;-)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 10:58 +0200, Ingo Molnar wrote:
 * Mike Galbraith [EMAIL PROTECTED] wrote:

  [...] (I know a trivial way to cure that, and this framework makes 
  that possible without dorking up fairness as a general policy.)
 
 great! Please send patches so i can add them (once you are happy with 
 the solution) - i think your workload isnt special in any way and could 
 hit other people too.

I'll give it a shot.  (have to read and actually understand your new
code first though, then see if it's really viable)

-Mike

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread hui
On Sun, Apr 15, 2007 at 10:44:47AM +0200, Ingo Molnar wrote:
 I prefer such early releases to lkml _a lot_ more than any private review 
 process. I released the CFS code about 6 hours after i thought "okay, 
 this looks pretty good" and i spent those final 6 hours on testing it 
 (making sure it doesnt blow up on your box, etc.), in the final 2 hours 
 i showed it to two folks i could reach on IRC (Arjan and Thomas) and on 
 various finishing touches. It doesnt get much faster than that and i 
 definitely didnt want to sit on it even one day longer because i very 
 much thought that Con and others should definitely see this work!
 
 And i very much credited (and still credit) Con for the whole fairness 
 angle:
 
 ||  i'd like to give credit to Con Kolivas for the general approach here:
 ||  he has proven via RSDL/SD that 'fair scheduling' is possible and that
 ||  it results in better desktop scheduling. Kudos Con!
 
 the 'design consultation' phase you are talking about is _NOW_! :)
 
 I got the v1 code out to Con, to Mike and to many others ASAP. That's 
 how you are able to comment on this thread and be part of the 
 development process to begin with, in a 'private consultation' setup 
 you'd not have had any opportunity to see _any_ of this.
 
 In the BSD space there seem to be more 'political' mechanisms for 
 development, but Linux is truly about doing things out in the open, and 
 doing it immediately.

I can't even begin to talk about how screwed up BSD development is. Maybe
another time privately.

Ok, Linux development and inclusiveness can be improved. I'm not trying
to call you out (slang for accusing you with the sole intention of calling
you crazy in a highly confrontational manner). This is discussed publicly
here to bring this issue to light, to open a communication channel as a
means to resolve it.

 Okay? ;-)

It's cool. We're still getting to know each other professionally and it's
okay to a certain degree to have a communication disconnect but only as
long as it clears. Your productivity is amazing BTW. But here's the
problem, there's this perception that NIH is the default mentality here
in Linux.

Con feels that this kind of action is intentional and has a malicious
quality to it as a means of churn squatting sections of the kernel tree.
The perception here is that there is this expectation that sections of
the Linux kernel are intentionally churn squatted to prevent any other
ideas from creeping in other than those of the owner of that subsystem
(VM, scheduling, etc...) because of lack of modularity in the kernel.
This isn't an API question but possibly a question of general code
quality and how maintenance of it can be done.

This was predicted by folks and then this perception was *realized* when
you wrote the equivalent kind of code that has technical overlap with SD
(this is just one dry example). To a person that is writing new code for
Linux, having one of the old guard write equivalent code to that of a
newcomer has the effect of displacing that person both with regards to
code and responsibility for it. When this happens over and over again
and folks get annoyed by it, Linux development starts to seem
elitist.

I know this because I heard (read) Con's IRC chats about these matters
all of the time. This is not just his view but a view held by other
kernel folks as well. The closing talk at OLS 2006 was highly disturbing
in many ways. It went "Christoph is right, everybody else is wrong",
which sends a highly negative message to new kernel developers that,
say, don't work for RH directly or any of the other mainstream Linux
companies. After a while, it starts seeming like this kind of behavior
is completely intentional and that Linux is full of arrogant bastards.

What I would have done here was to contact Peter Williams, Bill Irwin
and Con about what you're doing and reach a common consensus about how
to create something that would be inclusive of all of their ideas.
Discussions can get technically heated but that's ok, the discussion is
happening and it brings down the wall of this perception. Bill and
Con are on oftc.net/#offtopic2. Riel is there as well as Peter Zijlstra.
It might be very useful, it might not be. Folks are all stubborn
about their ideas and hold on to them for dear life. Effective
leaders can deconstruct this hostility and animosity. I don't claim
to be one.

Because of past hostility to something like schedplugin, the hostility
and terseness of responses can be perceived simply as "I'm right,
you're wrong", which is condescending. This affects discussion and
outright destroys a constructive process if it happens continually,
since it reinforces that view of "You're an outsider, we don't care
about you". Nobody is listening to each other at that point, folks get
pissed. Then they think "I'm going to NIH this person with patch
X because he/she did the same here", which is dysfunctional.

Oddly enough, sometimes you're the best person 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Pekka Enberg

On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote:

The perception here is that there is this expectation that
sections of the Linux kernel are intentionally churn squatted to prevent
any other ideas from creeping in other than those of the owner of that subsystem


Strangely enough, my perception is that Ingo is simply trying to
address the issues Mike's testing discovered in RDSL and SD. It's not
surprising Ingo made it a separate patch set as Con has repeatedly
stated that the problems are in fact by design and won't be fixed.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Esben Nielsen

On Fri, 13 Apr 2007, Ingo Molnar wrote:


[announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

i'm pleased to announce the first release of the Modular Scheduler Core
and Completely Fair Scheduler [CFS] patchset:

  http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch

This project is a complete rewrite of the Linux task scheduler. My goal
is to address various feature requests and to fix deficiencies in the
vanilla scheduler that were suggested/found in the past few years, both
for desktop scheduling and for server scheduling workloads.

[...]


I took a brief look at it. Have you tested priority inheritance?
As far as I can see rt_mutex_setprio doesn't have much effect on 
SCHED_FAIR/SCHED_BATCH. I am looking for a place where such a task changes 
scheduler class when boosted in rt_mutex_setprio().


Esben

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Willy Tarreau
On Sun, Apr 15, 2007 at 01:39:27PM +0300, Pekka Enberg wrote:
 On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote:
 The perception here is that there is this expectation that
 sections of the Linux kernel are intentionally churn squatted to prevent
 any other ideas from creeping in other than those of the owner of that subsystem
 
 Strangely enough, my perception is that Ingo is simply trying to
 address the issues Mike's testing discovered in RDSL and SD. It's not
 surprising Ingo made it a separate patch set as Con has repeatedly
 stated that the problems are in fact by design and won't be fixed.

That's not exactly the problem. There are people who work very hard to
try to improve some areas of the kernel. They progress slowly, and
acquire more and more skills. Sometimes they feel like they need to
change some concepts and propose those changes which are required for
them to go further, or to develop faster. Those are rejected. So they
are constrained to work in a delimited perimeter from which it is
difficult for them to escape.

Then, the same person who rejected their changes comes with something
shiny new, better and which took him far less time. But he sort of
broke the rules because what was forbidden to the first persons is
suddenly permitted. Maybe for very good reasons, I'm not discussing
that. The good reason should have been valid the first time too.

The fact is that when changes are rejected, we should not simply say
no, but explain why and define what would be acceptable. Some people
here have excellent teaching skills for this, but most others do not.
Anyway, the rules should be the same for everybody.

Also, there is what can be perceived as marketing here. Con worked
on his idea with conviction, he took time to write some generous
documentation, but he hit a wall where his concept was suboptimal on
a given workload. But at least, all the work was oriented on a technical
basis : design + code + doc.

Then, Ingo comes in with something looking amazingly better, with
virtually no documentation, an appealing announcement, and a shiny
advertising at boot. All this implemented without the constraints
other people had to respect. It already looks like definitive work
which will be merged as-is without many changes except a few bugfixes.

If those were two companies, the first one would simply have accused
the second one of not having respected contracts and having employed
heavy marketing to take the first place.

People here do not code for a living, they do it at least because they
believe in what they are doing, and some of them want a bit of gratitude
for their work. I've met people who were proud to say they implement
this or that feature in the kernel, so it is something important for
them. And being cited in an email is nothing compared to advertising
at boot time.

When the discussion was blocked between Con and Mike concerning the
design problems, it is where a new discussion should have taken place.
Ingo could have publicly spoken with them about his ideas of killing
the O(1) scheduler and replacing it with an rbtree-based one, and using
part of Bill's work to speed up development.

It is far easier to resign oneself when people explain what concepts are
wrong and how they intend to proceed than when they suddenly present
something out of nowhere which is already better.

And it's not specific to Ingo (though I think his ability to work that
fast alone makes him tend to practise this more often than others).

Imagine if Con had worked another full week on his scheduler with better
results on Mike's workload, but still not as good as Ingo's, and they both
published at the same time. You certainly can imagine he would have preferred
to be informed first that it was pointless to continue in that direction.

Now I hope he and Bill will get over this and accept to work on improving
this scheduler, because I really find it smarter than a dumb O(1). I even
agree with Mike that we now have a solid basis for future work. But for
this, maybe a good starting point would be to remove the selfish printk
at boot, revert useless changes (SCHED_NORMAL->SCHED_FAIR comes to mind)
and improve the documentation a bit so that people can work together on
the new design, without feeling like their work will only serve to
promote X or Y.

Regards,
Willy

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Esben Nielsen [EMAIL PROTECTED] wrote:

 I took a brief look at it. Have you tested priority inheritance?

yeah, you are right, it's broken at the moment, i'll fix it. But the 
good news is that i think PI could become cleaner via scheduling 
classes.

 As far as I can see rt_mutex_setprio doesn't have much effect on 
 SCHED_FAIR/SCHED_BATCH. I am looking for a place where such a task 
 change scheduler class when boosted in rt_mutex_setprio().

i think via scheduling classes we dont have to do the p->policy and 
p->prio based gymnastics anymore, we can just have a clean look at 
p->sched_class and stack the original scheduling class into 
p->real_sched_class. It would probably also make sense to 'privatize' 
p->prio into the scheduling class. That way PI would be a pure property 
of sched_rt, and the PI scheduler would be driven purely by 
p->rt_priority, not by p->prio. That way all the normal_prio() kind of 
complications and interactions with SCHED_OTHER/SCHED_FAIR would be 
eliminated as well. What do you think?
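
(a rough sketch of that stacking idea, to make it concrete -- this is
hypothetical, not code from the CFS patch; p->real_sched_class is the
assumed new field, and a real version would also have to dequeue and
requeue the task around the class switch:)

    void rt_mutex_setprio(struct task_struct *p, int prio)
    {
            if (rt_prio(prio)) {
                    if (p->sched_class != &rt_sched_class) {
                            /* stack the original class while boosted */
                            p->real_sched_class = p->sched_class;
                            p->sched_class = &rt_sched_class;
                    }
                    p->rt_priority = MAX_RT_PRIO - 1 - prio;
            } else if (p->real_sched_class) {
                    /* boost over: restore the stacked class */
                    p->sched_class = p->real_sched_class;
                    p->real_sched_class = NULL;
            }
    }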

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Pekka J Enberg
On Sun, 15 Apr 2007, Willy Tarreau wrote:
 Ingo could have publicly spoken with them about his ideas of killing
 the O(1) scheduler and replacing it with an rbtree-based one, and using
 part of Bill's work to speed up development.

He did exactly that and he did it with a patch. Nothing new here. This is 
how development on LKML proceeds when you have two or more competing 
designs. There's absolutely no need to get upset or hurt your feelings 
over it. It's not malicious, it's how we do Linux development.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Con Kolivas [EMAIL PROTECTED] wrote:

[ i'm quoting this bit out of order: ]

 2. Since then I've been thinking/working on a cpu scheduler design 
 that takes all the guesswork out of scheduling and gives very 
 predictable, as fair as possible, cpu distribution and latency while 
 preserving as solid interactivity as possible within those confines.

yeah. I think you were right on target with this call. I've applied the 
sched.c change attached at the bottom of this mail to the CFS patch, if 
you dont mind. (or feel free to suggest some other text instead.)

 1. I tried in vain some time ago to push a working extensible 
 pluggable cpu scheduler framework (based on wli's work) for the linux 
 kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he 
 didn't like it) as being absolutely the wrong approach and that we 
 should never do that. [...]

i partially replied to that point to Will already, and i'd like to make 
it clear again: yes, i rejected plugsched 2-3 years ago (which already 
drifted away from wli's original codebase) and i would still reject it 
today.

First and foremost, please dont take such rejections too personally - i 
had my own share of rejections (and in fact, as i mentioned it in a 
previous mail, i had a fair number of complete project throwaways: 
4g:4g, in-kernel Tux, irqrate and many others). I know that they can 
hurt and can demoralize, but if i dont like something it's my job to 
tell that.

Can i sum up your argument as: you rejected plugsched, but then why on 
earth did you modularize portions of the scheduler in CFS? Isnt your 
position thus woefully inconsistent? (i'm sure you would never put it 
this impolitely though, but i guess i can flame myself with impunity ;)

While having an inconsistent position isnt a terminal sin in itself, 
please realize that the scheduler classes code in CFS is quite different 
from plugsched: it was a result of what i saw to be technological 
pressure for _internal modularization_. (This internal/policy 
modularization aspect is something that Will said was present in his 
original plugsched code, but which aspect i didnt see in the plugsched 
patches that i reviewed.)
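
(to illustrate what internal modularization means here, a stripped-down
sketch of such a policy-class interface -- the field names are
approximate, not the exact CFS v1 structure, and sched_class_highest is
an assumed global head of the class list:)

    struct sched_class {
            struct sched_class *next;  /* classes in priority order */
            void (*enqueue_task)(struct rq *rq, struct task_struct *p);
            void (*dequeue_task)(struct rq *rq, struct task_struct *p);
            struct task_struct *(*pick_next_task)(struct rq *rq);
            void (*task_tick)(struct rq *rq, struct task_struct *p);
    };

    /* the core scheduler just asks each class in turn: */
    static struct task_struct *pick_next_task(struct rq *rq)
    {
            struct sched_class *class;
            struct task_struct *p;

            for (class = sched_class_highest; class; class = class->next) {
                    p = class->pick_next_task(rq);
                    if (p)
                            return p;
            }
            return NULL;  /* not reached: the idle class always picks */
    }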

That possibility never even occurred to me until 3 days ago. You never 
raised it either AFAIK. No patches to simplify the scheduler that way 
were ever sent. Plugsched doesnt even touch the core load-balancer for 
example, and most of the time i spent with the modularization was to get 
the load-balancing details right. So it's really apples to oranges.

My view about plugsched: first please take a look at the latest 
plugsched code:

  http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch

  26 files changed, 8951 insertions(+), 1495 deletions(-)

As an experiment i've removed all the add-on schedulers (both the core 
and the include files, only kept the vanilla one) from the plugsched 
patch (and the makefile and kconfig complications, etc), to see the 
'infrastructure cost', and it still gave:

  12 files changed, 1933 insertions(+), 1479 deletions(-)

that's the extra complication i didnt like 3 years ago and which i still 
dont like today. What the current plugsched code does is that it 
simplifies the adding of new experimental schedulers, but it doesnt 
really do what i wanted: to simplify the _scheduler itself_. Personally 
i'm still not primarily interested in having a large selection of 
schedulers, i'm mainly interested in a good and maintainable scheduler 
that works for people.

so the rejection was on these grounds, and i still very much stand by 
that position here and today: i didnt want to see the Linux scheduler 
landscape balkanized and i saw no technological reasons for the 
complication that external modularization brings.

the new scheduling classes code in the CFS patch was not a result of "oh, 
i want to write a new scheduler, lets make schedulers pluggable" kind of 
thinking. That result was just a side-effect of it. (and as you 
correctly noted it, the CFS related modularization is incomplete).

Btw., the thing that triggered the scheduling classes code wasnt even 
plugsched or RSDL/SD, it was Mike's patches. Mike had an itch and he 
fixed it within the framework of the existing scheduler, and the end 
result behaved quite well when i threw various testloads on it.

But i felt a bit uncomfortable that it added another few hundred lines 
of code to an already complex sched.c. This felt unnatural so i mailed 
Mike that i'd attempt to clean these infrastructure aspects of sched.c 
up a bit so that it becomes more hackable to him. Thus 3 days ago, 
without having made up my mind about anything, i started this experiment 
(which ended up in the modularization and in the CFS scheduler) to 
simplify the code and to enable Mike to fix such itches in an easier 
way. By your logic Mike should in fact be quite upset about this: if the 
new code works out and proves to be useful then it obsoletes a whole lot 
of code of 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Sunday 15 April 2007, Pekka Enberg wrote:
On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote:
 The perception here is that there is this expectation that
 sections of the Linux kernel are intentionally churn squatted to prevent
 any other ideas from creeping in other than those of the owner of that subsystem

Strangely enough, my perception is that Ingo is simply trying to
address the issues Mike's testing discovered in RDSL and SD. It's not
surprising Ingo made it a separate patch set as Con has repeatedly
stated that the problems are in fact by design and won't be fixed.

I won't get into the middle of this just yet, not having decided which dog I 
should bet on yet.  I've been running 2.6.21-rc6 + Con's 0.40 patch for about 
24 hours, it's been generally usable, but gzip still causes lots of 5 to 10+ 
second lags when it's running.  I'm coming to the conclusion that gzip simply 
doesn't play well with others...  

Amazing to me, the cpu it's using stays generally below 80%, and often below 
60%, even while the kmail composer has a full sentence in its buffer that it 
still hasn't shown me when I switch to the htop screen to check, and back to 
the kmail screen to see if it's updated yet.  The screen switch doesn't seem 
to lag so I don't think renicing X would be helpful.  Those are the obvious 
lags, and I'll build & reboot to the CFS patch at some point this morning 
(whats left of it that is :).  And report in due time of course

-- 
Cheers, Gene
There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
knot in cables caused data stream to become twisted and kinked
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
On Sun, Apr 15, 2007 at 02:45:27PM +0200, Willy Tarreau wrote:
 Now I hope he and Bill will get over this and accept to work on improving
 this scheduler, because I really find it smarter than a dumb O(1). I even
 agree with Mike that we now have a solid basis for future work. But for
 this, maybe a good starting point would be to remove the selfish printk
 at boot, revert useless changes (SCHED_NORMAL->SCHED_FAIR comes to mind)
 and improve the documentation a bit so that people can work together on
 the new design, without feeling like their work will only serve to
 promote X or Y.

While I appreciate people coming to my defense, or at least the good
intentions behind such, my only actual interest in pointing out
4-year-old work is getting some acknowledgment of having done something
relevant at all. Sometimes it has I told you so value. At other times
it's merely clarifying what went on when people refer to it since in a
number of cases the patches are no longer extant, so they can't
actually look at it to get an idea of what was or wasn't done. At other
times I'm miffed about not being credited, whether I should've been or
whether dead and buried code has an implementation of the same idea
resurfacing without the author(s) having any knowledge of my prior work.

One should note that in this case, the first work of mine this trips
over (scheduling classes) was never publicly posted as it was only a
part of the original plugsched (an alternate scheduler implementation
devised to demonstrate plugsched's flexibility with respect to
scheduling policies), and a part that was dropped by subsequent
maintainers. The second work of mine this trips over, a virtual deadline
scheduler named vdls, was also never publicly posted. Both are from
around the same time period, which makes them approximately 4 years dead.
Neither of the codebases are extant, having been lost in a transition
between employers, though various people recall having been sent them
privately, and plugsched survives in a mutated form as maintained by
Peter Williams, who's been very good about acknowledging my original
contribution.

If I care to become a direct participant in scheduler work, I can do so
easily enough.

I'm not entirely sure what this "basis for future work" is about. By
and large one should alter the API's and data structures to fit the
policy being implemented. While the array swapping was nice for
algorithmically improving 2.4.x-style epoch expiry, most algorithms
not based on the 2.4.x scheduler (in however mutated a form) should use
a different queue structure, in fact, one designed around their
policy's specific algorithmic needs. IOW, when one alters the scheduler,
one should also alter the queue data structure appropriately. I'd not
expect the priority queue implementation in cfs to continue to be used
unaltered as it matures, nor would I expect any significant modification
of the scheduler to necessarily use a similar one.

By and large I've been mystified as to why there is such a penchant for
preserving the existing queue structures in the various scheduler
patches floating around. I am now every bit as mystified at the point
of view that seems to be emerging that a change of queue structure is
particularly significant. These are all largely internal changes to
sched.c, and as such, rather small changes in and of themselves. While
they do tend to have user-visible effects, from this point of view
even changing out every line of sched.c is effectively a micropatch.

Something more significant might be altering the schedule() API to
take a mandatory description of the intention of the call to it, or
breaking up schedule() into several different functions to distinguish
between different sorts of uses of it to which one would then respond
differently. Also more significant would be adding a new state beyond
TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, and TASK_RUNNING for some
tasks to respond only to fatal signals, then sweeping TASK_UNINTERRUPTIBLE
users to use the new state and handle those fatal signals. While not
quite as ostentatious in their user-visible effects as SCHED_OTHER
policy affairs, they are tremendously more work than switching out the
implementation of a single C file, and so somewhat more respectable.
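
(To make that fatal-signals-only wait state concrete: a hypothetical
sketch with made-up names -- nothing like this exists in the tree today,
and done_waiting() is an assumed predicate:)

    /*
     * A sleep state behaving like TASK_UNINTERRUPTIBLE except that a
     * fatal signal (SIGKILL) can still wake the task.
     */
    #define TASK_WAKEKILL   128
    #define TASK_KILLABLE   (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

    /* a converted TASK_UNINTERRUPTIBLE caller would then look like: */
    for (;;) {
            set_current_state(TASK_KILLABLE);
            if (done_waiting(wq))
                    break;
            schedule();
            if (sigismember(&current->pending.signal, SIGKILL))
                    return -ERESTARTSYS;    /* die instead of hanging */
    }
    __set_current_state(TASK_RUNNING);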

Even as scheduling semantics go, these are micropatches. So SCHED_OTHER
changes a little. Where are the gang schedulers? Where are the batch
schedulers (SCHED_BATCH is not truly such)? Where are the isochronous
(frame) schedulers? I suppose there is some CKRM work that actually has
a semantic impact despite being largely devoted to SCHED_OTHER, and
there's some spufs gang scheduling going on, though not all that much.
And to reiterate a point from other threads, even as SCHED_OTHER
patches go, I see precious little verification that things like the
semantics of nice numbers or other sorts of CPU bandwidth allocation
between competing tasks of various natures are staying the same while
other things are 

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Willy Tarreau [EMAIL PROTECTED] wrote:

 Ingo could have publicly spoken with them about his ideas of killing 
 the O(1) scheduler and replacing it with an rbtree-based one, [...]

yes, that's precisely what i did, via a patchset :)

[ I can even tell you when it all started: i was thinking about Mike's
  throttling patches while watching Manchester United beat the crap out
  of AS Roma (7 to 1 end result), Tuesday evening. I started coding it
  Wednesday morning and sent the patch Friday evening. I very much
  believe in low-latency when it comes to development too ;) ]

(if this had been done via a committee then today we'd probably still be 
trying to find a suitable timeslot for the initial conference call where 
we'd discuss the election of a chair who would be tasked with writing up 
an initial document of feature requests, on which we'd take a vote, 
possibly this year already, because the matter is really urgent you know 
;-)

 [...] and using part of Bill's work to speed up development.

ok, let me make this absolutely clear: i didnt use any bit of plugsched 
- in fact the most difficult bits of the modularization was for areas of 
sched.c that plugsched never even touched AFAIK. (the load-balancer for 
example.)

Plugsched simply does something else: in essence i modularized scheduling 
policies that have to cooperate with each other, while plugsched 
modularized complete schedulers which are compile-time or boot-time 
selected, with no runtime cooperation between them. (one has to be 
selected at a time)

(and i have no trouble at all with crediting Will's work either: a few 
years ago i used Will's PID rework concepts for an NPTL related speedup 
and Will is very much credited for it in today's kernel/pid.c and he 
continued to contribute to it later on.)

(the tree walking bits of sched_fair.c were in fact derived from 
kernel/hrtimer.c, the rbtree code written by Thomas and me :-)
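
(for the curious, the tree-walking pattern in question is the usual
include/linux/rbtree.h idiom -- a minimal sketch with illustrative
field names, not sched_fair.c itself:)

    #include <linux/rbtree.h>

    struct fair_task {
            struct rb_node run_node;
            u64 key;                /* e.g. fair-clock based deadline */
    };

    static void enqueue(struct rb_root *root, struct fair_task *t)
    {
            struct rb_node **link = &root->rb_node, *parent = NULL;

            while (*link) {
                    struct fair_task *e;

                    parent = *link;
                    e = rb_entry(parent, struct fair_task, run_node);
                    if (t->key < e->key)
                            link = &parent->rb_left;
                    else
                            link = &parent->rb_right;
            }
            rb_link_node(&t->run_node, parent, link);
            rb_insert_color(&t->run_node, root);
    }

    /* the leftmost node is always the next task to run: */
    static struct fair_task *pick_first(struct rb_root *root)
    {
            struct rb_node *left = rb_first(root);

            return left ? rb_entry(left, struct fair_task, run_node) : NULL;
    }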

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread William Lee Irwin III
* Willy Tarreau [EMAIL PROTECTED] wrote:
 [...] and using part of Bill's work to speed up development.

On Sun, Apr 15, 2007 at 05:39:33PM +0200, Ingo Molnar wrote:
 ok, let me make this absolutely clear: i didnt use any bit of plugsched 
 - in fact the most difficult bits of the modularization was for areas of 
 sched.c that plugsched never even touched AFAIK. (the load-balancer for 
 example.)
 Plugsched simply does something else: i modularized scheduling policies 
 in essence that have to cooperate with each other, while plugsched 
 modularized complete schedulers which are compile-time or boot-time 
 selected, with no runtime cooperation between them. (one has to be 
 selected at a time)
 (and i have no trouble at all with crediting Will's work either: a few 
 years ago i used Will's PID rework concepts for an NPTL related speedup 
 and Will is very much credited for it in today's kernel/pid.c and he 
 continued to contribute to it later on.)
 (the tree walking bits of sched_fair.c were in fact derived from 
 kernel/hrtimer.c, the rbtree code written by Thomas and me :-)

The extant plugsched patches have nothing to do with cfs; I suspect
what everyone else is going on about is terminological confusion. The
4-year-old sample policy with scheduling classes for the original
plugsched is something you had no way of knowing about, as it was never
publicly posted. There isn't really anything all that interesting going
on here, apart from pointing out that it's been done before.


-- wli
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Bernd Eckenfels
In article [EMAIL PROTECTED] you wrote:
 A development process like this is likely to exclude smart people from wanting
 to contribute to Linux and folks should be conscious about these issues.

Nobody is excluded, you can always have a next iteration.

Gruss
Bernd
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Arjan van de Ven

 It outlines the problems with Linux kernel development and questionable
 elitism regarding ownership of certain sections of the kernel code.

I have to step in and disagree here

Linux is not about who writes the code.

Linux is about getting the best solution for a problem. Who wrote which
line of the code is irrelevant in the big picture.

that often means that multiple implementations happen, and that a
darwinistic process decides that the best solution wins.

This darwinistic process often happens in the form of discussion, and
that discussion can happen with words or with code. In this case it
happened with a code proposal.

To make this specific: it has happened many times to me that when I
solved an issue with code, someone else stepped in and wrote a different
solution (although that was usually for smaller pieces). Was I upset
about that? No! I was happy because my *problem got solved* in the best
possible way.

Now this doesn't mean that people shouldn't be nice to each other, not
cooperate or steal credits, but I don't get the impression that that is
happening here. Ingo is taking part in the discussion with a counter
proposal for discussion *on the mailing list*. What more do you want??
If you or anyone else can improve it or do better, take part of this
discussion and show what you mean either in words or in code.

Your qualification of the discussion as an elitist takeover... I disagree
with that. It's a *discussion*. Now if you agree that Ingo's patch is
better technically, you and others should be happy about that because
your problem is getting solved better. If you don't agree that his patch
is better technically, take part in the technical discussion.

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Con Kolivas
On Monday 16 April 2007 01:16, Gene Heskett wrote:
 On Sunday 15 April 2007, Pekka Enberg wrote:
 On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote:
 The perception here is that there is this expectation that
  sections of the Linux kernel are intentionally churn squatted to
  prevent any ideas from creeping in other than those of the owner of that
  subsystem
 
 Strangely enough, my perception is that Ingo is simply trying to
 address the issues Mike's testing discovered in RDSL and SD. It's not
 surprising Ingo made it a separate patch set as Con has repeatedly
 stated that the problems are in fact by design and won't be fixed.

 I won't get into the middle of this just yet, not having decided which dog
 I should bet on yet.  I've been running 2.6.21-rc6 + Con's 0.40 patch for
 about 24 hours, it's been generally usable, but gzip still causes lots of 5
 to 10+ second lags when its running.  I'm coming to the conclusion that
 gzip simply doesn't play well with others...

Actually Gene I think you're being bitten here by something I/O bound since 
the cpu usage never tops out. If that's the case and gzip is dumping 
truckloads of writes then you're suffering something that irks me even more 
than the scheduler in linux, and that's how much writes hurt just about 
everything else. Try your testcase with bzip2 instead (since that won't be 
i/o bound), or drop your dirty ratio to as low as possible, which helps a 
little bit (5% is the minimum)

echo 5 > /proc/sys/vm/dirty_ratio

and finally try the braindead noop i/o scheduler as well.

echo noop > /sys/block/sda/queue/scheduler

(replace sda with your drive obviously).
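
For anyone who'd rather apply both tweaks from a small program than retype
the echoes, a minimal sketch in plain C; it assumes the stock /proc/sys and
/sys/block paths, "sda" is a placeholder just like above, and it needs to
run as root:

#include <stdio.h>

/* write a value into a /proc or /sys tunable, reporting failures */
static int write_knob(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	fputs(val, f);
	return fclose(f);
}

int main(void)
{
	/* 5 is the lowest value the stock vm code accepts */
	write_knob("/proc/sys/vm/dirty_ratio", "5\n");
	/* the noop elevator takes the i/o scheduler out of the picture */
	write_knob("/sys/block/sda/queue/scheduler", "noop\n");
	return 0;
}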

I'd wager a big one that's what causes your gzip pain. If it wasn't for the 
fact that I've decided to all but give up ever trying to provide code for 
mainline again, trying my best to make writes hurt less on linux would be my 
next big thing [tm]. 

Oh and for the others watching (points to vm hackers), I found a bug when 
playing with the dirty ratio code. If you modify it to allow it to drop below 5% 
but still above the minimum in the vm code, stalls happen somewhere in the vm 
where nothing much happens for sometimes 20 or 30 seconds in the worst-case 
scenario. I had to drop a patch in 2.6.19 that allowed the dirty ratio to be 
set ultra low because these stalls were gross.

 Amazing to me, the cpu it's using stays generally below 80%, and often below
 60%, even while the kmail composer has a full sentence in its buffer that
 it still hasn't shown me when I switch to the htop screen to check, and
 back to the kmail screen to see if it's updated yet.  The screen switch
 doesn't seem to lag so I don't think renicing X would be helpful.  Those
 are the obvious lags, and I'll build & reboot to the CFS patch at some
 point this morning (what's left of it that is :).  And report in due time of
 course

-- 
-ck


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Gene Heskett
On Sunday 15 April 2007, Con Kolivas wrote:
On Monday 16 April 2007 01:16, Gene Heskett wrote:
 On Sunday 15 April 2007, Pekka Enberg wrote:
 On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote:
  The perception here is that there is this expectation
  that sections of the Linux kernel are intentionally churn squatted to
  prevent any ideas from creeping in other than those of the owner of
  that subsystem
 
 Strangely enough, my perception is that Ingo is simply trying to
 address the issues Mike's testing discovered in RDSL and SD. It's not
 surprising Ingo made it a separate patch set as Con has repeatedly
 stated that the problems are in fact by design and won't be fixed.

 I won't get into the middle of this just yet, not having decided which dog
 I should bet on yet.  I've been running 2.6.21-rc6 + Con's 0.40 patch for
 about 24 hours, it's been generally usable, but gzip still causes lots of 5
 to 10+ second lags when its running.  I'm coming to the conclusion that
 gzip simply doesn't play well with others...

Actually Gene I think you're being bitten here by something I/O bound since
the cpu usage never tops out. If that's the case and gzip is dumping
truckloads of writes then you're suffering something that irks me even more
than the scheduler in linux, and that's how much writes hurt just about
everything else. Try your testcase with bzip2 instead (since that won't be
i/o bound), or drop your dirty ratio to as low as possible, which helps a
little bit (5% is the minimum)

echo 5 > /proc/sys/vm/dirty_ratio

and finally try the braindead noop i/o scheduler as well.

echo noop > /sys/block/sda/queue/scheduler

(replace sda with your drive obviously).

I'd wager a big one that's what causes your gzip pain. If it wasn't for the
fact that I've decided to all but give up ever trying to provide code for
mainline again, trying my best to make writes hurt less on linux would be my
next big thing [tm].

Chuckle, possibly but then I'm not anything even remotely close to an expert 
here Con, just reporting what I get.  And I just rebooted to 2.6.21-rc6 + 
sched-mike-5.patch for grins and giggles, or frowns and profanity as the case 
may call for.

Oh and for the others watching (points to vm hackers), I found a bug when
playing with the dirty ratio code. If you modify it to allow it to drop below
 5% but still above the minimum in the vm code, stalls happen somewhere in
 the vm where nothing much happens for sometimes 20 or 30 seconds in the
 worst-case scenario. I had to drop a patch in 2.6.19 that allowed the dirty
 ratio to be set ultra low because these stalls were gross.

I think I'd need a bit of tutoring on how to do that.  I recall that one other 
time, several weeks back, I thought I would try one of those famous "echo this 
> /proc/that" ideas that went by on this list, but even though I was root, 
apparently /proc was read-only AFAIWC.

 Amazing to me, the cpu it's using stays generally below 80%, and often
 below 60%, even while the kmail composer has a full sentence in its buffer
 that it still hasn't shown me when I switch to the htop screen to check,
 and back to the kmail screen to see if it's updated yet.  The screen switch
 doesn't seem to lag so I don't think renicing X would be helpful.  Those
 are the obvious lags, and I'll build & reboot to the CFS patch at some
 point this morning (what's left of it that is :).  And report in due time
 of course

And now I wonder if I applied the right patch.  This one feels good ATM, but I 
don't think it's the CFS thingy.  No, I'm sure of it now, none of the patches 
I've saved say a thing about CFS.  Time to backtrack up the list I guess; ignore 
me for the nonce.


-- 
Cheers, Gene
There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
Microsoft: Re-inventing square wheels

   -- From a Slashdot.org post


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Mike Galbraith
On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote:
 On Sun, 15 Apr 2007, Willy Tarreau wrote:
  Ingo could have publicly spoken with them about his ideas of killing
  the O(1) scheduler and replacing it with an rbtree-based one, and using
  part of Bill's work to speed up development.
 
 He did exactly that and he did it with a patch. Nothing new here. This is 
 how development on LKML proceeds when you have two or more competing 
 designs. There's absolutely no need to get upset or hurt your feelings 
 over it. It's not malicious, it's how we do Linux development.

Yes.  Exactly.  This is what it's all about, this is what makes it work.

-Mike



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Ingo Molnar

* Willy Tarreau [EMAIL PROTECTED] wrote:

 Well, since I merged the fair-fork patch, I cannot reproduce (in fact, 
 bash forks 1000 processes, then progressively execs scheddos, but it 
 takes some time). So I'm rebuilding right now. But I think that Linus 
 has an interesting clue about GPM and notification before switching 
 the terminal. I think it was enabled in console mode. I don't know how 
 that translates to frozen xterms, but let's attack the problems one at 
 a time.

to debug this, could you try to apply this add-on as well:

  http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch

with this patch applied you should have a /proc/sched_debug file that 
prints all runnable tasks and other interesting info from the runqueue. 

[ i've refreshed all the patches on the CFS webpage, so if this doesn't 
  apply cleanly to your current tree then you'll probably have to 
  refresh one of the patches.]

The output should look like this:

 Sched Debug Version: v0.01
 now at 226761724575 nsecs

 cpu: 0
   .nr_running: 3
   .raw_weighted_load : 384
   .nr_switches   : 13666
   .nr_uninterruptible: 0
   .next_balance  : 4294947416
   .curr->pid: 2179
   .rq_clock  : 241337421233
   .fair_clock: 7503791206
   .wait_runtime  : 2269918379

 runnable tasks:
             task |  PID |   tree-key |    -delta |  waiting | switches
  ----------------------------------------------------------------------
  +cat             2179   7501930066    -1861140    1861140          2
   loop_silent     2149   7503010354     -780852          0        911
   loop_silent     2148   7503510048     -281158     280753        918

now for your workload the list should be considerably larger. If there's 
starvation going on then the 'switches' field (number of context 
switches) of one of the tasks would never increase while you have this 
'cannot switch consoles' problem.
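
One way to automate that check: a small sketch that snapshots
/proc/sched_debug twice and flags any pid whose switch count did not move
in between. It assumes the v0.01 table layout shown above, so treat the
field parsing as illustrative rather than as a stable interface:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_TASKS 1024

struct snap {
	int pid;
	long switches;
};

/*
 * Pull (pid, switches) pairs out of the runnable-tasks table of
 * /proc/sched_debug. Field order assumed from the v0.01 output above:
 * task, PID, tree-key, -delta, waiting, switches.
 */
static int snapshot(struct snap *s, int max)
{
	FILE *f = fopen("/proc/sched_debug", "r");
	char line[256], name[64];
	long key, delta, waiting;
	int n = 0, in_table = 0;

	if (!f) {
		perror("/proc/sched_debug");
		exit(1);
	}
	while (fgets(line, sizeof(line), f) && n < max) {
		if (strstr(line, "runnable tasks")) {
			in_table = 1;
			continue;
		}
		if (in_table && sscanf(line, "%63s %d %ld %ld %ld %ld",
				       name, &s[n].pid, &key, &delta,
				       &waiting, &s[n].switches) == 6)
			n++;
	}
	fclose(f);
	return n;
}

int main(void)
{
	static struct snap before[MAX_TASKS], after[MAX_TASKS];
	int nb, na, i, j;

	nb = snapshot(before, MAX_TASKS);
	sleep(5);	/* wait while the problem is reproducing */
	na = snapshot(after, MAX_TASKS);

	for (i = 0; i < nb; i++)
		for (j = 0; j < na; j++)
			if (before[i].pid == after[j].pid &&
			    before[i].switches == after[j].switches)
				printf("pid %d: switches stuck at %ld "
				       "(possibly starved)\n",
				       after[j].pid, after[j].switches);
	return 0;
}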

maybe you'll have to unapply the fair-fork patch to make it trigger 
again. (fair-fork does not fix anything, so it probably just hides a 
real bug.)

(i'm meanwhile busy running your scheddos utilities to reproduce it 
locally as well :)

Ingo


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Linus Torvalds


On Sun, 15 Apr 2007, Mike Galbraith wrote:

 On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote:
  
  He did exactly that and he did it with a patch. Nothing new here. This is 
  how development on LKML proceeds when you have two or more competing 
  designs. There's absolutely no need to get upset or hurt your feelings 
  over it. It's not malicious, it's how we do Linux development.
 
 Yes.  Exactly.  This is what it's all about, this is what makes it work.

I obviously agree, but I will also add that one of the most motivating 
things there *is* in open source is personal pride.

It's a really good thing, and it means that if somebody shows that your 
code is flawed in some way (by, for example, making a patch that people 
claim gets better behaviour or numbers), any *good* programmer that 
actually cares about his code will obviously suddenly be very motivated to 
out-do the out-doer!

Does this mean that there will be tension and rivalry? Hell yes. But 
that's kind of the point. Life is a game, and if you aren't in it to win, 
what the heck are you still doing here?

As long as it's reasonably civil (I'm not personally a huge believer in 
being too polite or politically correct, so I think the "reasonably" is 
more important than the "civil" part!), and as long as the end result is 
judged on TECHNICAL MERIT, it's all good.

We don't want to play politics. But encouraging people's competitive 
feelings? Oh, yes. 

Linus

