Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Hi Nick,

On Tue, Apr 17, 2007 at 06:29:54AM +0200, Nick Piggin wrote: (...)
> And my scheduler for example cuts down the amount of policy code and code size significantly. I haven't looked at Con's ones for a while, but I believe they are also much more straightforward than mainline... For example, let's say all else is equal between them, then why would we go with the O(logN) implementation rather than the O(1)?

Of course, if that really were the case, the question would be raised. But as a general rule, I don't see much potential in O(1) designs for finely tuning scheduling according to several criteria. With O(logN), you can adjust scheduling in real time at very low cost. Better handling of varying priorities, or of fork(), comes to mind.

Regards, Willy

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
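To make the trade-off Willy is pointing at concrete, here is a toy Python sketch (not code from any of the schedulers under discussion; all names are made up): an O(1)-style design picks the next task in constant time but can only order tasks by a fixed set of discrete priority levels, while an O(log N) ordered structure can order tasks by an arbitrary, continuously tunable key, which is what makes fine-grained policy adjustment cheap.

```python
import heapq

class O1RunQueue:
    """Toy O(1)-style runqueue: an array of priority levels plus a bitmap
    of non-empty levels. pick_next() is constant time, but tasks can only
    be ordered by their discrete priority level."""
    NLEVELS = 140  # assumption: mimics the 2.6 scheduler's 140 levels

    def __init__(self):
        self.levels = [[] for _ in range(self.NLEVELS)]
        self.bitmap = 0

    def enqueue(self, task, prio):
        self.levels[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        assert self.bitmap, "runqueue empty"
        # index of lowest set bit = highest-priority non-empty level
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.levels[prio].pop(0)
        if not self.levels[prio]:
            self.bitmap &= ~(1 << prio)
        return task

class TreeRunQueue:
    """Toy O(log N)-style runqueue: tasks ordered by an arbitrary
    continuous key (e.g. a virtual deadline), so the policy can be tuned
    per task at enqueue time. A binary heap stands in for a balanced
    tree; both give O(log N) insert and extract-min."""
    def __init__(self):
        self.heap = []

    def enqueue(self, task, key):
        heapq.heappush(self.heap, (key, task))

    def pick_next(self):
        return heapq.heappop(self.heap)[1]
```

The O(1) structure cannot distinguish two tasks at the same level, whereas the tree variant can be fed any key (deadline, fair share, priority-scaled runtime) without changing the pick-next code at all.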
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007, Con Kolivas wrote: And I snipped. Sorry fellas.

Con's original submission was, to me, quite an improvement. I have to say it, and no denigration of your efforts is intended Con, but you did 'pull the trigger' and get this thing rolling by scratching the itch and drawing attention to an ugly lack of user interactivity that had crept into the 2.6 family. So from me to Con, a tip of the hat and a deep bow in your direction; thank you. Now you have done what you aimed to do, so please get well.

I've now been through most of an amanda session using Ingo's "CFS" and I have to say that it is another improvement over your 0.40 that is just as obvious as your first patch was against the stock scheduler. No other scheduler yet has allowed full utilization of the cpu while maintaining user interactivity as well as this one has; my cpu is running about 5 degrees F hotter just from this effect alone. gzip, if the rest of the system is in between tasks, is consistently showing around 95%, but let anything else stick up its hand, like procmail etc., and gzip now dutifully steps aside, dropping into the 40% range until procmail and spamd are done, at which point there is no rest for the wicked and the cpu never gets a chance to cool. There was, just now, a pause of about 2 seconds while amanda moved a tarball from the holding disk area on /dev/hda to the vtapes disk on /dev/hdd, so that would have been an I/O-bound situation.

This one, Ingo, even without any other patches (and I think I did see one go by in this thread which I didn't apply), is a definite keeper. Sweet even. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) A word to the wise is enough. 
-- Miguel de Cervantes
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, 16 Apr 2007, Pavel Pisa wrote:
> I cannot help but report results with the GAVL
> tree algorithm here as another race competitor.
> I believe that it is a better solution for large priority
> queues than the RB-tree, and even than heap trees. It is
> disputable whether the scheduler needs such scalability, on
> the other hand. The AVL heritage guarantees lower height,
> which results in shorter search times, which could
> be profitable for other uses in the kernel.
>
> The GAVL algorithm is AVL tree based, so it does not suffer from
> the "infinite" priorities granularity problem the way TR does. It allows
> use in the generalized case where the tree is not fully balanced.
> This allows cutting the first item without rebalancing.
> This leads to degradation of the tree by at most one more level
> (than a non-degraded AVL gives), which is still
> considerably better than the RB-tree maximum.
>
> http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

Here are the results on my Opteron 252 (all numbers are cycles/loop; the repeated gavl_cfs and CFS columns are repeated runs):

   N    gavl_cfs     CFS        TR       CFS    gavl_cfs
   1      187.20    194.16    314.87    194.15    187.15
   2      268.94    305.53    313.78    289.58    266.02
   4      452.13    518.81    311.54    516.23    450.73
   8      609.29    644.65    308.11    667.01    592.89
  16      686.30    807.41    317.20    810.24    688.42
  32      756.57    852.14    301.22    876.12    758.46
  64      831.97    997.16    304.74   1003.26    832.83
 128      897.33   1030.36    295.65   1035.29    892.51
 256      963.17   1146.04    295.35   1162.04    966.31
 512     1029.82   1218.34    288.78   1257.97   1029.83
1024     1091.76   1318.47    287.74   1311.72   1093.29
2048     1153.03   1398.84    286.75   1438.68   1149.97

There seems to be some difference from your numbers. This is with gcc version 4.1.2 and -O2. But then, an Opteron can behave quite differently from a Duron on a bench like this ;)

- Davide
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 09:25:07AM -0700, Arjan van de Ven wrote: > Now this doesn't mean that people shouldn't be nice to each other, not > cooperate or steal credits, but I don't get the impression that that is > happening here. Ingo is taking part in the discussion with a counter > proposal for discussion *on the mailing list*. What more do you want??

Con should have been CCed from the first moment this was put into motion, to limit the perception of exclusion. That was mistake number one, and a big-time failure to understand this dynamic. After all, it was Con's idea. Why the hell he was excluded from Ingo's development process is baffling to me and him (most likely). He put in a lot of effort into SD, and his experience with scheduling should still be seriously considered in this development process even if he doesn't write a single line of code from this moment on.

What should have happened is that our very busy associate at RH by the name of Ingo Molnar should have leveraged more of Con's and Bill's work and used them as a proxy for his own ideas. They would have loved to have contributed more, and our very busy Ingo Molnar would have gotten a lot of his work and ideas implemented without ever opening a single source file for editing. They would have happily done this work for Ingo. Ingo could have been used for something else more important, like making KVM less of a freaking ugly hack, and we all would have benefited from this. He could have been working on SystemTap so that you stop losing accounts to Sun and Solaris 10's DTrace. He could have been working with Riel to fix your butt-ugly page scanning problem causing horrible contention, via the Clock-Pro algorithm, etc... He could have been fixing the ugly futex rwsem mapping problem that's killing -rt and anything that uses Posix threads. He could have created a userspace thread control block (TCB) with Mr. Drepper so that we can turn off preemption in userspace (userspace per-CPU local storage) and implement a very quick non-kernel-crossing implementation of priority ceilings (userspace check for priority and flags at preempt_schedule() in the TCB) so that our -rt Posix API doesn't suck donkey shit... Need I say more?

As programmers like Ingo get spread more thinly, he needs super smart folks like Bill Irwin and Con to help him out, and he must learn to resist NIHing other folks' stuff out of some weird fear. When this happens, folks like Ingo must learn to "facilitate" development in addition to implementing it with those kinds of folks. This takes time, and practice at entrusting folks to do things for him. Ingo is the best channel for getting new Linux kernel ideas communicated to Linus. His value goes beyond just code; he is often the biggest hammer we have in the Linux community to get stuff into the kernel. "Facilitation" of others is something that solo programmers must learn as groups like the Linux kernel get larger and larger every year. Understand? Are we in embarrassing agreement here?

bill
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 01:05, Ingo Molnar wrote:
> * Con Kolivas <[EMAIL PROTECTED]> wrote:
> > 2. Since then I've been thinking/working on a cpu scheduler design
> > that takes all the guesswork out of scheduling and gives very
> > predictable, as fair as possible, cpu distribution and latency while
> > preserving as solid interactivity as possible within those confines.
>
> yeah. I think you were right on target with this call.

Yay, thank goodness :) It's time to fix the damn cpu scheduler once and for all. Everyone uses this; it's no minor driver or $bigsmp or $bigram or $small_embedded_RT_hardware feature.

> I've applied the sched.c change attached at the bottom of this mail to the CFS patch, if you dont mind. (or feel free to suggest some other text instead.)
> * 2003-09-03 Interactivity tuning by Con Kolivas.
> * 2004-04-02 Scheduler domains code by Nick Piggin
> + * 2007-04-15 Con Kolivas was dead right: fairness matters! :)

LOL that's awful. I'd prefer something meaningful like "Work begun on replacing all interactivity tuning with a fair virtual-deadline design by Con Kolivas".

While you're at it, it's worth getting rid of a few slightly pointless name changes too. Don't rename SCHED_NORMAL yet again, and don't call all your things sched_fair, blah_fair, __blah_fair and so on; it means that anything else is, by proxy, going to be considered unfair. Leave SCHED_NORMAL as is, and replace the use of the word _fair with _cfs. I don't really care how many copyright notices you put into our already noisy bootup, but it's redundant since there is no choice; we all get the same cpu scheduler.

> > 1. I tried in vain some time ago to push a working extensible
> > pluggable cpu scheduler framework (based on wli's work) for the linux
> > kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he
> > didn't like it) as being absolutely the wrong approach and that we
> > should never do that. [...] 
> i partially replied to that point to Will already, and i'd like to make it clear again: yes, i rejected plugsched 2-3 years ago (which had already drifted away from wli's original codebase) and i would still reject it today.

No, that was just me being flabbergasted by what appeared to be you posting your own plugsched. Note that nowhere in the 40 iterations of rsdl->sd did I ask for or suggest plugsched. I said in my first announcement that my aim was to create a scheduling policy robust enough for all situations, rather than fantastic a lot of the time and awful sometimes. There are plenty of people ready to throw out arguments for plugsched now, and I don't have the energy to continue that fight (I never did really). But my question still stands about this comment:

> case, all of SD's logic could be added via a kernel/sched_sd.c module
> as well, if Con is interested in such an approach. ]

What exactly would be the purpose of such a module that governs nothing in particular? Since there'll be no pluggable scheduler, by your own admission, it would have no control over SCHED_NORMAL, and it would require another scheduling policy to govern, which there is no express way to use at the moment; people tend to just use the default without great effort.

> First and foremost, please dont take such rejections too personally - i
> had my own share of rejections (and in fact, as i mentioned it in a
> previous mail, i had a fair number of complete project throwaways:
> 4g:4g, in-kernel Tux, irqrate and many others). I know that they can
> hurt and can demoralize, but if i dont like something it's my job to
> tell that.

Hmm? No, that's not what this is about. Remember dynticks, which was not originally my code but which I tried to bring up to mainline standard, and fought with for months? You came along with yet another rewrite from scratch, and the flaws in the design I was working with were obvious, so I instantly bowed down to that and never touched my code again. 
I didn't ask for credit back then, but I obviously brought the requirement for a no-idle-tick implementation to the table.

> My view about plugsched: first please take a look at the latest plugsched code:
>
> http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch
>
> 26 files changed, 8951 insertions(+), 1495 deletions(-)
>
> As an experiment i've removed all the add-on schedulers (both the core
> and the include files, only kept the vanilla one) from the plugsched
> patch (and the makefile and kconfig complications, etc), to see the
> 'infrastructure cost', and it still gave:
>
> 12 files changed, 1933 insertions(+), 1479 deletions(-)

I do not see extra code per se as being a bad thing. I've heard it said a few times before: "ever notice how when the correct solution is done it is a lot more code than the quick hack that ultimately fails?". Insert long-winded discussion of perfect being the enemy of good here, _but_ I'm not arguing perfect versus good, I'm talking about solid code
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, Apr 16, 2007 at 01:15:27PM +1000, Con Kolivas wrote: > On Monday 16 April 2007 12:28, Nick Piggin wrote: > > So, on to something productive, we have 3 candidates for a new scheduler so > > far. How do we decide which way to go? (and yes, I still think switchable > > schedulers is wrong and a copout) This is one area where it is virtually > > impossible to discount any decent design on correctness/performance/etc. > > and even testing in -mm isn't really enough. > > We're in agreement! YAY! > > Actually this is simpler than that. I'm taking SD out of the picture. It has > served its purpose of proving that we need to seriously address all the > scheduling issues and did more than a half-decent job at it. Unfortunately I > also cannot sit around supporting it forever by myself. My own life is more > important, so consider SD not even running the race any more. > > I'm off to continue maintaining permanently out-of-tree leisurely code at my > own pace. What's more, I think I'll just stick to staircase Gen I version blah > and shelve SD and try to have fond memories of SD as an intellectual > prompting exercise only.

Well, I would hope that _if_ we decide to switch schedulers, then you get a chance to field something (and I hope you will decide to, and have time to), and I hope we don't rush into the decision. We've had the current scheduler for so many years now that it is much more important to make sure we take the time to do the right thing than to absolutely have to merge a new scheduler right now ;)
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 12:28, Nick Piggin wrote: > So, on to something productive, we have 3 candidates for a new scheduler so > far. How do we decide which way to go? (and yes, I still think switchable > schedulers is wrong and a copout) This is one area where it is virtually > impossible to discount any decent design on correctness/performance/etc. > and even testing in -mm isn't really enough.

We're in agreement! YAY!

Actually, this is simpler than that. I'm taking SD out of the picture. It has served its purpose of proving that we need to seriously address all the scheduling issues, and did more than a half-decent job at it. Unfortunately I also cannot sit around supporting it forever by myself. My own life is more important, so consider SD not even running the race any more.

I'm off to continue maintaining permanently out-of-tree leisurely code at my own pace. What's more, I think I'll just stick to staircase Gen I version blah, shelve SD, and try to have fond memories of SD as an intellectual prompting exercise only. -- -ck
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
William Lee Irwin III wrote: >> One of the reasons I never posted my own code is that it never met its >> own design goals, which absolutely included switching on the fly. I >> think Peter Williams may have done something about that. >> It was my hope >> to be able to do insmod sched_foo.ko until it became clear that the >> effort it was intended to assist wasn't going to get even the limited >> hardware access required, at which point I largely stopped working on >> it. On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: > I didn't but some students did. > In a previous life, I did implement a runtime configurable CPU > scheduling mechanism (implemented on Tru64, Solaris and Linux) that > allowed schedulers to be loaded as modules at run time. This was > released commercially on Tru64 and Solaris. So I know that it can be done. > I have thought about doing something similar for the SPA schedulers, > which differ in only small ways from each other, but lack motivation. Driver models for scheduling are not so far out. AFAICS it's largely a tug-of-war over design goals, e.g. maintaining per-cpu runqueues and switching out intra-queue policies vs. switching out whole-system policies, SMP handling and all. Whether this involves load balancing depends strongly on e.g. whether you have per-cpu runqueues. A 2.4.x scheduler module, for instance, would not have a load balancer at all, as it has only one global runqueue. There are other sorts of policies wanting significant changes to SMP handling vs. the stock load balancing. William Lee Irwin III wrote: >> I'm not sure what happened there. It wasn't a big enough patch to take >> hits in this area due to getting overwhelmed by the programming burden >> like some other efforts of mine. Maybe things started getting ugly once >> on-the-fly switching entered the picture. My guess is that Peter Williams >> will have to chime in here, since things have diverged enough from my >> one-time contribution 4 years ago. 
On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: > From my POV, the current version of plugsched is considerably simpler > than it was when I took the code over from Con as I put considerable > effort into minimizing code overlap in the various schedulers. > I also put considerable effort into minimizing any changes to the load > balancing code (something Ingo seems to think is a deficiency) and the > result is that plugsched allows "intra run queue" scheduling to be > easily modified WITHOUT affecting load balancing. To my mind scheduling > and load balancing are orthogonal and keeping them that way simplifies > things. ISTR rearranging things for Con in such a fashion that it no longer worked out of the box (though that wasn't the intention; restructuring it to be more suited to his purposes was) and that's what he worked off of afterward. I don't remember very well what changed there, as I clearly invested less effort there than in the prior versions. Now that I think of it, that may have been where the sample policy demonstrating scheduling classes was lost. On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: > As Ingo correctly points out, plugsched does not allow different > schedulers to be used per CPU but it would not be difficult to modify it > so that they could. Although I've considered doing this over the years > I decided not to as it would just increase the complexity and the amount > of work required to keep the patch set going. About six months ago I > decided to reduce the amount of work I was doing on plugsched (as it was > obviously never going to be accepted) and now only publish patches > against the vanilla kernel's major releases (and the only reason that I > kept doing that is that the download figures indicated that about 80 > users were interested in the experiment). That's a rather different goal from what I was going on about with it, so it's all diverged quite a bit. 
Where I had a significant need for mucking with the entire concept of how SMP was handled, this is rather different. At this point I'm questioning the relevance of my own work, though it was already relatively marginal, as it started life as an attempt at a sort of debug patch to help gang scheduling (which is in itself a rather marginally relevant feature to most users) code along. On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: > PS I no longer read LKML (due to time constraints) and would appreciate > it if I could be CC'd on any e-mails suggesting scheduler changes. > PPS I'm just happy to see that Ingo has finally accepted that the > vanilla scheduler was badly in need of fixing and don't really care who > fixes it. > PPS Different schedulers for different aims (i.e. server or workstation) > do make a difference. E.g. the spa_svr scheduler in plugsched > does about 1% better on kernbench than the next best scheduler in the bunch. > PPPS Con, fairness isn't always best as humans aren't very altruistic and > we need to give unfair preference to interactive tasks in order to stop > the users flinging their PCs out the window.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 04:31:54PM -0500, Matt Mackall wrote: > On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: > > > 4) the good thing that happened to I/O, after years of stagnation isnt > > I/O schedulers. The good thing that happened to I/O is called Jens > > Axboe. If you care about the I/O subsystem then print that name out > > and hang it on the wall. That and only that is what mattered. > > Disagree. Things didn't actually get interesting until Nick showed up > with AS and got it in-tree to demonstrate the huge amount of room we > had for improvement. It took several iterations of AS and CFQ (with a > couple complete rewrites) before CFQ began to look like the winner. > The resulting time-sliced CFQ was fairly heavily influenced by the > ideas in AS.

Well, to be fair, Jens had just implemented deadline, which got me interested ;) Actually, I would still like to be able to deprecate deadline in favour of AS, because AS has a tunable that you can switch to turn off read anticipation and revert to deadline behaviour (or very close to it). It would have been nice if CFQ were then a layer on top of AS that implemented priorities (or vice versa). Then AS could be deprecated and we'd be back to 1 primary scheduler. Well, CFQ seems to be going in the right direction with that, though some large users still find AS faster for some reason... Anyway, the moral of the story is that I think it would have been nice if we hadn't proliferated IO schedulers; however, in practice it isn't easy to just layer features on top of each other, and keeping deadline also helped a lot in debugging and examining performance regressions and actually getting code upstream. And this was true even when it was only globally boot-time switchable. I'd prefer if we kept a single CPU scheduler in mainline, because I think that simplifies analysis and focuses testing. I think we can have one that is good enough for everyone. 
But if the only other option for progress is that Linus or Andrew just pull one out of a hat, then I would rather merge all of them. Yes, I think Con's scheduler should get a fair go, ditto for Ingo's, mine, and anyone else's.

> > nor was the non-modularity of some piece of code ever an impediment to > > competition. May i remind you of the pretty competitive SLAB allocator > > landscape, resulting in things like the SLOB allocator, written by > > yourself? ;-) > > Thankfully no one came out and said "we don't want to balkanize the > allocator landscape" when I submitted it, or I probably would have just > dropped it rather than painfully dragging it along out of tree for > years. I'm not nearly the glutton for punishment that Con is. :-P

I don't think this is a fault of the people or the code involved. We just didn't have much collective drive to replace the scheduler, and even less an idea of how to decide between any two of them. I've kept nicksched around since 2003 or so and no hard feelings ;)
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, Apr 16, 2007 at 08:52:33AM +1000, Con Kolivas wrote: > On Monday 16 April 2007 05:00, Jonathan Lundell wrote: > > On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote: > > > It's a really good thing, and it means that if somebody shows that your > > > code is flawed in some way (by, for example, making a patch that people > > > claim gets better behaviour or numbers), any *good* programmer that > > > actually cares about his code will obviously suddenly be very motivated > > > to out-do the out-doer! > > > > "No one who cannot rejoice in the discovery of his own mistakes > > deserves to be called a scholar." > > Lovely comment. I realise this is not truly directed at me, but clearly, in the > context in which it was said, people will assume it is directed my way, so while > we're all spinning lkml-quality rhetoric, let me have a right of reply. > > One thing I have never tried to do is ignore bug reports. I'm forever > joking that I keep pulling code out of my arse to improve what I've done. > RSDL/SD was no exception; heck, it had 40 iterations. The reason I could not > reply to bug report A with "Oh that is problem B so I'll fix it with code C" > was, as I've said many many times over, health related. I did indeed try to > fix many of them without spending hours replying to sometimes unpleasant > emails. If health wasn't an issue there might have been 1000 iterations of > SD.

Well, what matters is the code and development. I don't think Ingo's scheduler is the final word, although I worry that Linus might jump the gun and merge something "just to give it a test", which we then get stuck with :P I don't know how anybody can think Ingo's new scheduler is anything but a good thing (so long as it has to compete before being merged). And that's coming from someone who wants *their* scheduler to get merged... I think mine can compete ;) and if it can't, then I'd rather be using the scheduler that beats it. 
> There was only ever _one_ thing that I was absolutely steadfast on as a > concept, that I refused to fix, and that people might claim was "a mistake I did > not rejoice in to be a scholar". That was that the _correct_ behaviour for a > scheduler is to be fair, such that proportional slowdown with load is (using > that awful pun) a feature, not a bug.

If something is using more than a fair share of CPU time, over some macro period, in order to be interactive, then it should definitely get throttled. I've always maintained (since starting scheduler work) that the 2.6 scheduler is horrible because it allows these cases where some things can get more CPU time just by how they behave. Glad people are starting to come around on that point.

So, on to something productive: we have 3 candidates for a new scheduler so far. How do we decide which way to go? (and yes, I still think switchable schedulers are wrong and a copout) This is one area where it is virtually impossible to discount any decent design on correctness/performance/etc., and even testing in -mm isn't really enough.
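The "proportional slowdown with load" principle Con and Nick agree on can be shown with a toy model (an illustration of the fairness idea only, not the actual algorithm of CFS, SD, or nicksched): always run the task that has received the least CPU so far. With N CPU-bound tasks, each then gets ~1/N of the CPU, regardless of how any task "behaves".

```python
def fair_schedule(ntasks, ticks):
    """Toy fair scheduler: each tick, run the task that has received the
    least CPU so far (ties broken by task id). Returns per-task totals."""
    received = [0] * ntasks
    for _ in range(ticks):
        runner = min(range(ntasks), key=lambda t: (received[t], t))
        received[runner] += 1
    return received

# With 4 CPU-bound tasks over 1000 ticks, each gets exactly 250: every
# task slows down in proportion to load, and no task can grab extra CPU
# by looking "interactive".
print(fair_schedule(4, 1000))  # → [250, 250, 250, 250]
```

A task that sleeps simply falls behind in `received` and gets scheduled promptly when it wakes, which is the intuition behind why fairness and good interactivity need not conflict.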
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
William Lee Irwin III wrote:
> On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
>> 2) plugsched did not allow on the fly selection of schedulers, nor did
>> it allow a per CPU selection of schedulers. IO schedulers you can change
>> per disk, on the fly, making them much more useful in practice. Also, IO
>> schedulers (while definitely not being slow!) are a lot less performance
>> sensitive than CPU schedulers.
> One of the reasons I never posted my own code is that it never met its
> own design goals, which absolutely included switching on the fly. I
> think Peter Williams may have done something about that.

I didn't but some students did.

In a previous life, I did implement a runtime configurable CPU scheduling mechanism (implemented on Tru64, Solaris and Linux) that allowed schedulers to be loaded as modules at run time. This was released commercially on Tru64 and Solaris. So I know that it can be done.

I have thought about doing something similar for the SPA schedulers, which differ in only small ways from each other, but lack motivation.

> It was my hope to be able to do insmod sched_foo.ko until it became
> clear that the effort it was intended to assist wasn't going to get even
> the limited hardware access required, at which point I largely stopped
> working on it.
> On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote:
>> 3) I/O schedulers are pretty damn clean code, and plugsched, at least
>> the last version i saw of it, didnt come even close.
> I'm not sure what happened there. It wasn't a big enough patch to take
> hits in this area due to getting overwhelmed by the programming burden
> like some other efforts of mine. Maybe things started getting ugly once
> on-the-fly switching entered the picture. My guess is that Peter Williams
> will have to chime in here, since things have diverged enough from my
> one-time contribution 4 years ago. 
From my POV, the current version of plugsched is considerably simpler than it was when I took the code over from Con, as I put considerable effort into minimizing code overlap in the various schedulers. I also put considerable effort into minimizing any changes to the load balancing code (something Ingo seems to think is a deficiency), and the result is that plugsched allows "intra run queue" scheduling to be easily modified WITHOUT affecting load balancing. To my mind scheduling and load balancing are orthogonal, and keeping them that way simplifies things.

As Ingo correctly points out, plugsched does not allow different schedulers to be used per CPU, but it would not be difficult to modify it so that they could. Although I've considered doing this over the years, I decided not to as it would just increase the complexity and the amount of work required to keep the patch set going. About six months ago I decided to reduce the amount of work I was doing on plugsched (as it was obviously never going to be accepted) and now only publish patches against the vanilla kernel's major releases (and the only reason that I kept doing that is that the download figures indicated that about 80 users were interested in the experiment).

Peter

PS I no longer read LKML (due to time constraints) and would appreciate it if I could be CC'd on any e-mails suggesting scheduler changes.
PPS I'm just happy to see that Ingo has finally accepted that the vanilla scheduler was badly in need of fixing and don't really care who fixes it.
PPS Different schedulers for different aims (i.e. server or workstation) do make a difference. E.g. the spa_svr scheduler in plugsched does about 1% better on kernbench than the next best scheduler in the bunch.
PPPS Con, fairness isn't always best as humans aren't very altruistic and we need to give unfair preference to interactive tasks in order to stop the users flinging their PCs out the window. 
But the current scheduler doesn't do this very well, and is also not very good at fairness, so it needs to change. But the changes need to address interactive response as well as fairness, not just fairness.

-- 
Peter Williams [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious." -- Ambrose Bierce

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007, Mike Galbraith wrote:
> On Sun, 2007-04-15 at 12:58 -0400, Gene Heskett wrote:
>> Chuckle, possibly but then I'm not anything even remotely close to an
>> expert here Con, just reporting what I get. And I just rebooted to
>> 2.6.21-rc6 + sched-mike-5.patch for grins and giggles, or frowns and
>> profanity as the case may call for.
>
> Erm, that patch is embarrassingly buggy, so profanity should dominate.
>
> -Mike

Chuckle, ROTFLMAO even. I didn't run it that long, as I immediately rebuilt and rebooted when I found I'd used the wrong patch; in fact I had tested that one and found it sub-optimal before I built and ran Con's -0.40 version. As for bugs of the type that make it to the screen or logs, I didn't see any. OTOH, my eyesight is slowly going downhill, now 20/25. It was 20/10 30 years ago. Now that's reason for profanity...

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author)
Unix weanies are as bad at this as anyone. -- Larry Wall in <[EMAIL PROTECTED]>
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* William Lee Irwin III <[EMAIL PROTECTED]> wrote:
>> I've been suggesting testing CPU bandwidth allocation as influenced by
>> nice numbers for a while now for a reason.

On Sun, Apr 15, 2007 at 09:57:48PM +0200, Ingo Molnar wrote:
> Oh I was very much testing "CPU bandwidth allocation as influenced by
> nice numbers" - it's one of the basic things i do when modifying the
> scheduler. An automated tool, while nice (all automation is nice)
> wouldnt necessarily show such bugs though, because here too it needed
> thousands of running tasks to trigger in practice. Any volunteers? ;)

Worst comes to worst, I might actually get around to doing it myself. Any more detailed descriptions of the test for a rainy day?

-- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007 00:38, Davide Libenzi wrote:
> Haven't looked at the scheduler code yet, but for a similar problem I use
> a time ring. The ring has Ns (2 power is better) slots (where tasks are
> queued - in my case they were som sort of timers), and it has a current
> base index (Ib), a current base time (Tb) and a time granularity (Tg). It
> also has a bitmap with bits telling you which slots contains queued tasks.
> An item (task) that has to be scheduled at time T, will be queued in the
> slot:
>
> S = Ib + min((T - Tb) / Tg, Ns - 1);
>
> Items with T longer than Ns*Tg will be scheduled in the relative last slot
> (chosing a proper Ns and Tg can minimize this).
> Queueing is O(1) and de-queueing is O(Ns). You can play with Ns and Tg to
> suite to your needs.
> This is a simple bench between time-ring (TR) and CFS queueing:
>
> http://www.xmailserver.org/smart-queue.c
>
> In my box (Dual Opteron 252):
>
> [EMAIL PROTECTED]:~$ ./smart-queue -n 8
> CFS = 142.21 cycles/loop
> TR = 72.33 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 16
> CFS = 188.74 cycles/loop
> TR = 83.79 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 32
> CFS = 221.36 cycles/loop
> TR = 75.93 cycles/loop
> [EMAIL PROTECTED]:~$ ./smart-queue -n 64
> CFS = 242.89 cycles/loop
> TR = 81.29 cycles/loop

Hello all,

I cannot help but report results with the GAVL tree algorithm as another competitor in this race. I believe it is a better solution for large priority queues than RB-trees and even heap trees, though it is disputable whether the scheduler needs such scalability. The AVL heritage guarantees a lower tree height, which results in shorter search times and could be profitable for other uses in the kernel.

The GAVL algorithm is AVL-tree based, so it does not suffer from the finite priority granularity that TR does. It is generalized for the case where the tree is not fully balanced, which allows the first item to be cut off without rebalancing. 
In the worst case this degrades the tree by at most one more level than a non-degraded AVL tree gives, which is still considerably better than an RB-tree's maximum.

http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

The description behind the code is here:

http://cmp.felk.cvut.cz/~pisa/ulan/gavl.pdf

The code is part of the much broader uLUt library:

http://cmp.felk.cvut.cz/~pisa/ulan/ulut.pdf
http://sourceforge.net/project/showfiles.php?group_id=118937_id=130840

I have included all required GAVL code directly in smart-queue-v-gavl.c to make it easy to test.

These are tests run on my little dated computer - a Duron 600 MHz. Each test is run twice to suppress run-order influence.

./smart-queue-v-gavl -n 1 -l 200
gavl_cfs = 55.66 cycles/loop
CFS = 88.33 cycles/loop
TR = 141.78 cycles/loop
CFS = 90.45 cycles/loop
gavl_cfs = 55.38 cycles/loop

./smart-queue-v-gavl -n 2 -l 200
gavl_cfs = 82.85 cycles/loop
CFS = 104.18 cycles/loop
TR = 145.21 cycles/loop
CFS = 102.74 cycles/loop
gavl_cfs = 82.05 cycles/loop

./smart-queue-v-gavl -n 4 -l 200
gavl_cfs = 137.45 cycles/loop
CFS = 156.47 cycles/loop
TR = 142.00 cycles/loop
CFS = 152.65 cycles/loop
gavl_cfs = 139.38 cycles/loop

./smart-queue-v-gavl -n 10 -l 200
gavl_cfs = 229.22 cycles/loop (WORSE)
CFS = 206.26 cycles/loop
TR = 140.81 cycles/loop
CFS = 208.29 cycles/loop
gavl_cfs = 223.62 cycles/loop (WORSE)

./smart-queue-v-gavl -n 100 -l 200
gavl_cfs = 257.66 cycles/loop
CFS = 329.68 cycles/loop
TR = 142.20 cycles/loop
CFS = 319.34 cycles/loop
gavl_cfs = 260.02 cycles/loop

./smart-queue-v-gavl -n 1000 -l 200
gavl_cfs = 258.41 cycles/loop
CFS = 393.04 cycles/loop
TR = 134.76 cycles/loop
CFS = 392.20 cycles/loop
gavl_cfs = 260.93 cycles/loop

./smart-queue-v-gavl -n 1 -l 200
gavl_cfs = 259.45 cycles/loop
CFS = 605.89 cycles/loop
TR = 196.69 cycles/loop
CFS = 622.60 cycles/loop
gavl_cfs = 262.72 cycles/loop

./smart-queue-v-gavl -n 10 -l 200
gavl_cfs = 258.21 cycles/loop
CFS = 845.62 cycles/loop
TR = 315.37 cycles/loop
CFS = 860.21 cycles/loop
gavl_cfs = 258.94 cycles/loop

The GAVL code has not been tuned with any "likely"/"unlikely" constructs. It even carries some extra overhead from its generic design that is not necessary for this use - it permanently keeps a pointer to the last element, ensures that insertion order is preserved for equal key values, etc. But it still shows much better scalability than the kernel's RB-tree code. On the other hand, it does not encode the color/height in one of the pointers and requires an additional field for the height.

If the difference is due to some bug in my testing, I would be interested in a correction. The test case is probably oversimplified. I have run many different tests against the GAVL code in the past, comparing it with different tree and queue implementations, and I have not found a case of real performance degradation. On the other hand, there are cases for small item counts where GAVL
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: > 2) plugsched did not allow on the fly selection of schedulers, nor did >it allow a per CPU selection of schedulers. IO schedulers you can >change per disk, on the fly, making them much more useful in >practice. Also, IO schedulers (while definitely not being slow!) are >alot less performance sensitive than CPU schedulers. One of the reasons I never posted my own code is that it never met its own design goals, which absolutely included switching on the fly. I think Peter Williams may have done something about that. It was my hope to be able to do insmod sched_foo.ko until it became clear that the effort it was intended to assist wasn't going to get even the limited hardware access required, at which point I largely stopped working on it. On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: > 3) I/O schedulers are pretty damn clean code, and plugsched, at least >the last version i saw of it, didnt come even close. I'm not sure what happened there. It wasn't a big enough patch to take hits in this area due to getting overwhelmed by the programming burden like some other efforts of mine. Maybe things started getting ugly once on-the-fly switching entered the picture. My guess is that Peter Williams will have to chime in here, since things have diverged enough from my one-time contribution 4 years ago. -- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 02:23:08 Arjan van de Ven wrote: > On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote: > > Hi, > > > > On Friday 13 April 2007 23:21:00 Ingo Molnar wrote: > > > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler > > > [CFS] > > > > > > i'm pleased to announce the first release of the "Modular Scheduler > > > Core and Completely Fair Scheduler [CFS]" patchset: > > > > > >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch > > > > Tested this on top of Linus' GIT tree but the system gets very > > unresponsive during high disk i/o using ext3 as filesystem but even > > writing a 300mb file to a usb disk (iPod actually) has the same affect. > > just to make sure; this exact same workload but with the stock scheduler > does not have this effect? > > if so, then it could well be that the scheduler is too fair for it's own > good (being really fair inevitably ends up not batching as much as one > should, and batching is needed to get any kind of decent performance out > of disks nowadays) Tried with make install in kdepim (which made system sluggish with CFS) and the system is just fine (using CFQ). Regards, ismail
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote: > Hi, > On Friday 13 April 2007 23:21:00 Ingo Molnar wrote: > > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler > > [CFS] > > > > i'm pleased to announce the first release of the "Modular Scheduler Core > > and Completely Fair Scheduler [CFS]" patchset: > > > >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch > > Tested this on top of Linus' GIT tree but the system gets very unresponsive > during high disk i/o using ext3 as filesystem but even writing a 300mb file > to a usb disk (iPod actually) has the same affect. just to make sure; this exact same workload but with the stock scheduler does not have this effect? if so, then it could well be that the scheduler is too fair for its own good (being really fair inevitably ends up not batching as much as one should, and batching is needed to get any kind of decent performance out of disks nowadays) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 05:00, Jonathan Lundell wrote:
> On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:
> > It's a really good thing, and it means that if somebody shows that your
> > code is flawed in some way (by, for example, making a patch that people
> > claim gets better behaviour or numbers), any *good* programmer that
> > actually cares about his code will obviously suddenly be very motivated
> > to out-do the out-doer!
>
> "No one who cannot rejoice in the discovery of his own mistakes
> deserves to be called a scholar."

Lovely comment. I realise this is not truly directed at me, but clearly in the context it has been said people will assume it is directed my way, so while we're all spinning lkml quality rhetoric, let me have a right of reply.

One thing I have never tried to do was to ignore bug reports. I'm forever joking that I keep pulling code out of my arse to improve what I've done. RSDL/SD was no exception; heck, it had 40 iterations. The reason I could not reply to bug report A with "Oh that is problem B so I'll fix it with code C" was, as I've said many, many times over, health-related. I did indeed try to fix many of them without spending hours replying to sometimes unpleasant emails. If health wasn't an issue there might have been 1000 iterations of SD.

There was only ever _one_ thing that I was absolutely steadfast on as a concept that I refused to fix that people might claim was "a mistake I did not rejoice in to be a scholar". That was that the _correct_ behaviour for a scheduler is to be fair, such that proportional slowdown with load is (using that awful pun) a feature, not a bug. Now there are people who will still disagree violently with me on that.

SD attempted to be a fairness-first virtual-deadline design. If I failed on that front, then so be it (and at least one person certainly has said in lovely warm fuzzy friendly communication that I'm a global failure on all fronts with SD). 
But let me point out now that Ingo's shiny new scheduler is a fairness-first virtual-deadline design which will have proportional slowdown with load. So it will have a very similar feature. I dare anyone to claim that proportional slowdown with load is a bug, because I will no longer feel like I'm standing alone with a BFG9000 trying to defend my standpoint. Others can take up the post at last. -- -ck
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Hi, On Friday 13 April 2007 23:21:00 Ingo Molnar wrote: > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler > [CFS] > > i'm pleased to announce the first release of the "Modular Scheduler Core > and Completely Fair Scheduler [CFS]" patchset: > >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch Tested this on top of Linus' GIT tree but the system gets very unresponsive during high disk i/o using ext3 as filesystem, but even writing a 300mb file to a usb disk (iPod actually) has the same effect. Regards, ismail
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: > > * Matt Mackall <[EMAIL PROTECTED]> wrote: > > > Look at what happened with I/O scheduling. Opening things up to some > > new ideas by making it possible to select your I/O scheduler took us > > from 10 years of stagnation to healthy, competitive development, which > > gave us a substantially better I/O scheduler. > > actually, 2-3 years ago we already had IO schedulers, and my opinion > against plugsched back then (also shared by Nick and Linus) was very > much considering them. There are at least 4 reasons why I/O schedulers > are different from CPU schedulers: ... > 3) I/O schedulers are pretty damn clean code, and plugsched, at least >the last version i saw of it, didnt come even close. That's irrelevant. Plugsched was an attempt to get alternative schedulers exposure in mainline. I know, because I remember encouraging Bill to pursue it. Not only did you veto plugsched (which may have been a perfectly reasonable thing to do), but you also vetoed the whole concept of multiple schedulers in the tree too. "We don't want to balkanize the scheduling landscape". And that latter part is what I'm claiming has set us back for years. It's not a technical argument but a strategic one. And it's just not a good strategy. > 4) the good thing that happened to I/O, after years of stagnation isnt >I/O schedulers. The good thing that happened to I/O is called Jens >Axboe. If you care about the I/O subystem then print that name out >and hang it on the wall. That and only that is what mattered. Disagree. Things didn't actually get interesting until Nick showed up with AS and got it in-tree to demonstrate the huge amount of room we had for improvement. It took several iterations of AS and CFQ (with a couple complete rewrites) before CFQ began to look like the winner. The resulting time-sliced CFQ was fairly heavily influenced by the ideas in AS. 
Similarly, things in scheduler land had been pretty damn boring until Con finally got Andrew to take one of his schedulers for a spin. > nor was the non-modularity of some piece of code ever an impediment to > competition. May i remind you of the pretty competitive SLAB allocator > landscape, resulting in things like the SLOB allocator, written by > yourself? ;-) Thankfully no one came out and said "we don't want to balkanize the allocator landscape" when I submitted it or I probably would have just dropped it, rather than painfully dragging it along out of tree for years. I'm not nearly the glutton for punishment that Con is. :-P -- Mathematics is the supreme nostalgia of our time.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Matt Mackall <[EMAIL PROTECTED]> wrote:

> Look at what happened with I/O scheduling. Opening things up to some
> new ideas by making it possible to select your I/O scheduler took us
> from 10 years of stagnation to healthy, competitive development, which
> gave us a substantially better I/O scheduler.

actually, 2-3 years ago we already had IO schedulers, and my opinion against plugsched back then (also shared by Nick and Linus) was formed very much with them in mind. There are at least 4 reasons why I/O schedulers are different from CPU schedulers:

1) CPUs are a non-persistent resource shared by _all_ tasks and workloads in the system. Disks are _persistent_ resources very much attached to specific workloads. (If tasks had to be 'persistent' to the CPU they were started on, we'd have much different scheduling technology, and there would be much less complexity.) More analogous to CPU schedulers would perhaps be VM/MM schedulers, and those tend to be hard to modularize in a technologically sane way too. (And unlike disks, there's no good generic way to attach VM/MM schedulers to particular workloads.) So it's apples to oranges. In practice it comes down to having one good scheduler that runs all workloads on a system reasonably well, and given that a very large portion of systems run mixed workloads, the demand for one good scheduler is pretty high - while I can run with mixed IO schedulers just fine.

2) plugsched did not allow on-the-fly selection of schedulers, nor did it allow a per-CPU selection of schedulers. IO schedulers you can change per disk, on the fly, making them much more useful in practice. Also, IO schedulers (while definitely not being slow!) are a lot less performance sensitive than CPU schedulers.

3) I/O schedulers are pretty damn clean code, and plugsched, at least the last version I saw of it, didn't come even close.

4) the good thing that happened to I/O, after years of stagnation, isn't I/O schedulers. The good thing that happened to I/O is called Jens Axboe. If you care about the I/O subsystem then print that name out and hang it on the wall. That and only that is what mattered.

all in one, while there are definitely uses (embedded would like to have a smaller/different scheduler, etc.), the technical case for modularization for the sake of selectability is a lot lower for CPU schedulers than it is for I/O schedulers. Nor was the non-modularity of some piece of code ever an impediment to competition. May I remind you of the pretty competitive SLAB allocator landscape, resulting in things like the SLOB allocator, written by yourself? ;-)

	Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 05:05:36PM +0200, Ingo Molnar wrote: > so the rejection was on these grounds, and i still very much stand by > that position here and today: i didnt want to see the Linux scheduler > landscape balkanized and i saw no technological reasons for the > complication that external modularization brings. But "balkanization" is a good thing. "Monoculture" is a bad thing. Look at what happened with I/O scheduling. Opening things up to some new ideas by making it possible to select your I/O scheduler took us from 10 years of stagnation to healthy, competitive development, which gave us a substantially better I/O scheduler. Look at what's happening right now with TCP congestion algorithms. We've had decades of tweaking Reno slightly, now turned into a vibrant research area with lots of radical alternatives. A winner will eventually emerge and it will probably look quite a bit different than Reno. Similar things have gone on since the beginning with filesystems on Linux. Being able to easily compare filesystems head to head has been immensely valuable in improving our 'core' Linux filesystems. And what we've had up to now is a scheduler monoculture. Until Andrew put RSDL in -mm, if people wanted to experiment with other schedulers, they had to go well off the beaten path to do it. So all the people who've been hopelessly frustrated with the mainline scheduler go off to the -ck ghetto, or worse, stick with 2.4. Whether your motivations have been protectionist or merely shortsighted, you've stomped pretty heavily on alternative scheduler development by completely rejecting the whole plugsched concept. If we'd opened up mainline to a variety of schedulers _3 years ago_, we'd probably have gotten to where we are today much sooner. Hopefully, the next time Rik suggests pluggable page replacement algorithms, folks will actually seriously consider it. -- Mathematics is the supreme nostalgia of our time. 
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* William Lee Irwin III <[EMAIL PROTECTED]> wrote:

> On Sun, Apr 15, 2007 at 09:20:46PM +0200, Ingo Molnar wrote:
> > so Linus was right: this was caused by scheduler starvation. I can
> > see one immediate problem already: the 'nice offset' is not divided
> > by nr_running as it should. The patch below should fix this but i
> > have yet to test it accurately, this change might as well render
> > nice levels unacceptably ineffective under high loads.
>
> I've been suggesting testing CPU bandwidth allocation as influenced by
> nice numbers for a while now for a reason.

Oh I was very much testing "CPU bandwidth allocation as influenced by nice numbers" - it's one of the basic things I do when modifying the scheduler. An automated tool, while nice (all automation is nice), wouldn't necessarily show such bugs though, because here too it needed thousands of running tasks to trigger in practice. Any volunteers? ;)

	Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> so Linus was right: this was caused by scheduler starvation. I can see
> one immediate problem already: the 'nice offset' is not divided by
> nr_running as it should. The patch below should fix this but i have
> yet to test it accurately, this change might as well render nice
> levels unacceptably ineffective under high loads.

erm, rather the updated patch below if you want to use this on a 32-bit system. But ... i think you should wait until i have all this re-tested.

	Ingo

---
 include/linux/sched.h |    2 +-
 kernel/sched_fair.c   |    4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux/include/linux/sched.h
===
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -839,7 +839,7 @@ struct task_struct {
 	s64 wait_runtime;
 	u64 exec_runtime, fair_key;
-	s64 nice_offset, hog_limit;
+	s32 nice_offset, hog_limit;
 	unsigned long policy;
 	cpumask_t cpus_allowed;

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -31,7 +31,9 @@ static void __enqueue_task_fair(struct r
 	int leftmost = 1;
 	long long key;

-	key = rq->fair_clock - p->wait_runtime + p->nice_offset;
+	key = rq->fair_clock - p->wait_runtime;
+	if (unlikely(p->nice_offset))
+		key += p->nice_offset / (rq->nr_running + 1);

 	p->fair_key = key;
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 09:20:46PM +0200, Ingo Molnar wrote: > so Linus was right: this was caused by scheduler starvation. I can see > one immediate problem already: the 'nice offset' is not divided by > nr_running as it should. The patch below should fix this but i have yet > to test it accurately, this change might as well render nice levels > unacceptably ineffective under high loads. I've been suggesting testing CPU bandwidth allocation as influenced by nice numbers for a while now for a reason. -- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> > to debug this, could you try to apply this add-on as well:
> >
> >   http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch
> >
> > with this patch applied you should have a /proc/sched_debug file
> > that prints all runnable tasks and other interesting info from the
> > runqueue.
>
> I don't know if you have seen my mail from yesterday evening (here). I
> found that changing keventd prio fixed the problem. You may be
> interested in the description. I sent it at 21:01 (+0200).

ah, indeed i missed that mail - the response to the patches was quite overwhelming (and i naively thought people don't do Linux hacking over the weekends anymore ;).

so Linus was right: this was caused by scheduler starvation. I can see one immediate problem already: the 'nice offset' is not divided by nr_running as it should be. The patch below should fix this, but i have yet to test it accurately; this change might as well render nice levels unacceptably ineffective under high loads.

	Ingo

---
 kernel/sched_fair.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -31,7 +31,9 @@ static void __enqueue_task_fair(struct r
 	int leftmost = 1;
 	long long key;

-	key = rq->fair_clock - p->wait_runtime + p->nice_offset;
+	key = rq->fair_clock - p->wait_runtime;
+	if (unlikely(p->nice_offset))
+		key += p->nice_offset / rq->nr_running;

 	p->fair_key = key;
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote: It's a really good thing, and it means that if somebody shows that your code is flawed in some way (by, for example, making a patch that people claim gets better behaviour or numbers), any *good* programmer that actually cares about his code will obviously suddenly be very motivated to out-do the out-doer! "No one who cannot rejoice in the discovery of his own mistakes deserves to be called a scholar." --Don Foster, "literary sleuth", on retracting his attribution of "A Funerall Elegye" to Shakespeare (it's more likely John Ford's work).
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
> + printk("Fair Scheduler: Copyright (c) 2007 Red Hat, Inc., Ingo Molnar\n");

So that's what all the fuss about the staircase scheduler is all about then! At last, I see your point.

> i'd like to give credit to Con Kolivas for the general approach here:
> he has proven via RSDL/SD that 'fair scheduling' is possible and that
> it results in better desktop scheduling. Kudos Con!

How pathetic can you get? Tim, really looking forward to the CL final where Liverpool will beat the shit out of Scum (and there's a lot to be beaten out). 
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Hi Ingo, On Sun, Apr 15, 2007 at 07:55:55PM +0200, Ingo Molnar wrote: > > * Willy Tarreau <[EMAIL PROTECTED]> wrote: > > > Well, since I merged the fair-fork patch, I cannot reproduce (in fact, > > bash forks 1000 processes, then progressively execs scheddos, but it > > takes some time). So I'm rebuilding right now. But I think that Linus > > has an interesting clue about GPM and notification before switching > > the terminal. I think it was enabled in console mode. I don't know how > > that translates to frozen xterms, but let's attack the problems one at > > a time. > > to debug this, could you try to apply this add-on as well: > > http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch > > with this patch applied you should have a /proc/sched_debug file that > prints all runnable tasks and other interesting info from the runqueue. I don't know if you have seen my mail from yesterday evening (here). I found that changing keventd prio fixed the problem. You may be interested in the description. I sent it at 21:01 (+200). > [ i've refreshed all the patches on the CFS webpage, so if this doesnt > apply cleanly to your current tree then you'll probably have to > refresh one of the patches.] Fine, I'll have a look. I already had to rediff the sched-fair-fork patch last time. > The output should look like this: > > Sched Debug Version: v0.01 > now at 226761724575 nsecs > > cpu: 0 >.nr_running: 3 >.raw_weighted_load : 384 >.nr_switches : 13666 >.nr_uninterruptible: 0 >.next_balance : 4294947416 >.curr->pid : 2179 >.rq_clock : 241337421233 >.fair_clock: 7503791206 >.wait_runtime : 2269918379 > > runnable tasks: > task | PID | tree-key | -delta | waiting | switches > - > +cat 2179 7501930066 -18611401861140 2 > loop_silent 2149 7503010354-780852 0 911 > loop_silent 2148 7503510048-281158 280753 918 Nice. > now for your workload the list should be considerably larger. 
If there's > starvation going on then the 'switches' field (number of context > switches) of one of the tasks would never increase while you have this > 'cannot switch consoles' problem. > > maybe you'll have to unapply the fair-fork patch to make it trigger > again. (fair-fork does not fix anything, so it probably just hides a > real bug.) > > (i'm meanwhile busy running your scheddos utilities to reproduce it > locally as well :) I discovered I had the frame-buffer enabled (I did not notice it at first because I do not have the logo and the resolution is the same as text). It's matroxfb with a G400, if that can help. It may be possible that it needs some CPU that it cannot get to clear the display before switching, I don't know. However I won't try this right now, I'm deep in userland at the moment. Regards, Willy
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 15 Apr 2007, Mike Galbraith wrote: > On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote: > > > > He did exactly that and he did it with a patch. Nothing new here. This is > > how development on LKML proceeds when you have two or more competing > > designs. There's absolutely no need to get upset or hurt your feelings > > over it. It's not malicious, it's how we do Linux development. > > Yes. Exactly. This is what it's all about, this is what makes it work. I obviously agree, but I will also add that one of the most motivating things there *is* in open source is "personal pride". It's a really good thing, and it means that if somebody shows that your code is flawed in some way (by, for example, making a patch that people claim gets better behaviour or numbers), any *good* programmer that actually cares about his code will obviously suddenly be very motivated to out-do the out-doer! Does this mean that there will be tension and rivalry? Hell yes. But that's kind of the point. Life is a game, and if you aren't in it to win, what the heck are you still doing here? As long as it's reasonably civil (I'm not personally a huge believer in being too polite or "politically correct", so I think the "reasonably" is more important than the "civil" part!), and as long as the end result is judged on TECHNICAL MERIT, it's all good. We don't want to play politics. But encouraging people's competitive feelings? Oh, yes. Linus
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 12:58 -0400, Gene Heskett wrote: > Chuckle, possibly but then I'm not anything even remotely close to an expert > here Con, just reporting what I get. And I just rebooted to 2.6.21-rc6 + > sched-mike-5.patch for grins and giggles, or frowns and profanity as the case > may call for. Erm, that patch is embarrassingly buggy, so profanity should dominate. -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau <[EMAIL PROTECTED]> wrote: > Well, since I merged the fair-fork patch, I cannot reproduce (in fact, > bash forks 1000 processes, then progressively execs scheddos, but it > takes some time). So I'm rebuilding right now. But I think that Linus > has an interesting clue about GPM and notification before switching > the terminal. I think it was enabled in console mode. I don't know how > that translates to frozen xterms, but let's attack the problems one at > a time. to debug this, could you try to apply this add-on as well: http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch with this patch applied you should have a /proc/sched_debug file that prints all runnable tasks and other interesting info from the runqueue. [ i've refreshed all the patches on the CFS webpage, so if this doesnt apply cleanly to your current tree then you'll probably have to refresh one of the patches.] The output should look like this:

  Sched Debug Version: v0.01
  now at 226761724575 nsecs

  cpu: 0
    .nr_running        : 3
    .raw_weighted_load : 384
    .nr_switches       : 13666
    .nr_uninterruptible: 0
    .next_balance      : 4294947416
    .curr->pid         : 2179
    .rq_clock          : 241337421233
    .fair_clock        : 7503791206
    .wait_runtime      : 2269918379

  runnable tasks:
    task         | PID  | tree-key   | -delta   | waiting | switches
    ----------------------------------------------------------------
    +cat           2179   7501930066   -1861140   1861140         2
     loop_silent   2149   7503010354    -780852         0       911
     loop_silent   2148   7503510048    -281158    280753       918

now for your workload the list should be considerably larger. If there's starvation going on then the 'switches' field (number of context switches) of one of the tasks would never increase while you have this 'cannot switch consoles' problem. maybe you'll have to unapply the fair-fork patch to make it trigger again. (fair-fork does not fix anything, so it probably just hides a real bug.)
(i'm meanwhile busy running your scheddos utilities to reproduce it locally as well :) Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote: > On Sun, 15 Apr 2007, Willy Tarreau wrote: > > Ingo could have publicly spoken with them about his ideas of killing > > the O(1) scheduler and replacing it with an rbtree-based one, and using > > part of Bill's work to speed up development. > > He did exactly that and he did it with a patch. Nothing new here. This is > how development on LKML proceeds when you have two or more competing > designs. There's absolutely no need to get upset or hurt your feelings > over it. It's not malicious, it's how we do Linux development. Yes. Exactly. This is what it's all about, this is what makes it work. -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007, Con Kolivas wrote: >On Monday 16 April 2007 01:16, Gene Heskett wrote: >> On Sunday 15 April 2007, Pekka Enberg wrote: >> >On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote: >> >> The perception here is that there is this expectation >> >> that sections of the Linux kernel are intentionally "churn squatted" to >> >> prevent any other ideas from creeping in other than of the owner of >> >> that subsystem >> > >> >Strangely enough, my perception is that Ingo is simply trying to >> >address the issues Mike's testing discovered in RDSL and SD. It's not >> >surprising Ingo made it a separate patch set as Con has repeatedly >> >stated that the "problems" are in fact by design and won't be fixed. >> >> I won't get into the middle of this just yet, not having decided which dog >> I should bet on yet. I've been running 2.6.21-rc6 + Con's 0.40 patch for >> about 24 hours, it's been generally usable, but gzip still causes lots of 5 >> to 10+ second lags when it's running. I'm coming to the conclusion that >> gzip simply doesn't play well with others... >Actually Gene I think you're being bitten here by something I/O bound since >the cpu usage never tops out. If that's the case and gzip is dumping >truckloads of writes then you're suffering something that irks me even more >than the scheduler in linux, and that's how much writes hurt just about >everything else. Try your testcase with bzip2 instead (since that won't be >i/o bound), or drop your dirty ratio to as low as possible which helps a >little bit (5% is the minimum) > >echo 5 > /proc/sys/vm/dirty_ratio > >and finally try the braindead noop i/o scheduler as well. > >echo noop > /sys/block/sda/queue/scheduler > >(replace sda with your drive obviously). > >I'd wager a big one that's what causes your gzip pain.
If it wasn't for the >fact that I've decided to all but give up ever trying to provide code for >mainline again, trying my best to make writes hurt less on linux would be my >next big thing [tm]. Chuckle, possibly but then I'm not anything even remotely close to an expert here Con, just reporting what I get. And I just rebooted to 2.6.21-rc6 + sched-mike-5.patch for grins and giggles, or frowns and profanity as the case may call for. >Oh and for the others watching, (points to vm hackers) I found a bug when >playing with the dirty ratio code. If you modify it to allow it to drop below > 5% but still above the minimum in the vm code, stalls happen somewhere in > the vm where nothing much happens for sometimes 20 or 30 seconds worst case > scenario. I had to drop a patch in 2.6.19 that allowed the dirty ratio to > be set ultra low because these stalls were gross. I think I'd need a bit of tutoring on how to do that. I recall that one other time, several weeks back, I thought I would try one of those famous echo this >/proc/that ideas that went by on this list, but even though I was root, apparently /proc was read-only AFAIWC. >> Amazing to me, the cpu it's using stays generally below 80%, and often >> below 60%, even while the kmail composer has a full sentence in its buffer >> that it still hasn't shown me when I switch to the htop screen to check, >> and back to the kmail screen to see if it's updated yet. The screen switch >> doesn't seem to lag so I don't think renicing x would be helpful. Those >> are the obvious lags, and I'll build & reboot to the CFS patch at some >> point this morning (what's left of it that is :). And report in due time >> of course And now I wonder if I applied the right patch. This one feels good ATM, but I don't think it's the CFS thingy. No, I'm sure of it now, none of the patches I've saved say a thing about CFS. Backtrack up the list time I guess, ignore me for the nonce.
-- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Microsoft: Re-inventing square wheels -- From a Slashdot.org post
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 01:16, Gene Heskett wrote: > On Sunday 15 April 2007, Pekka Enberg wrote: > >On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote: > >> The perception here is that there is this expectation that > >> sections of the Linux kernel are intentionally "churn squatted" to > >> prevent any other ideas from creeping in other than of the owner of that > >> subsystem > > > >Strangely enough, my perception is that Ingo is simply trying to > >address the issues Mike's testing discovered in RDSL and SD. It's not > >surprising Ingo made it a separate patch set as Con has repeatedly > >stated that the "problems" are in fact by design and won't be fixed. > > I won't get into the middle of this just yet, not having decided which dog > I should bet on yet. I've been running 2.6.21-rc6 + Con's 0.40 patch for > about 24 hours, it's been generally usable, but gzip still causes lots of 5 > to 10+ second lags when it's running. I'm coming to the conclusion that > gzip simply doesn't play well with others... Actually Gene I think you're being bitten here by something I/O bound since the cpu usage never tops out. If that's the case and gzip is dumping truckloads of writes then you're suffering something that irks me even more than the scheduler in linux, and that's how much writes hurt just about everything else. Try your testcase with bzip2 instead (since that won't be i/o bound), or drop your dirty ratio to as low as possible which helps a little bit (5% is the minimum) echo 5 > /proc/sys/vm/dirty_ratio and finally try the braindead noop i/o scheduler as well. echo noop > /sys/block/sda/queue/scheduler (replace sda with your drive obviously). I'd wager a big one that's what causes your gzip pain. If it wasn't for the fact that I've decided to all but give up ever trying to provide code for mainline again, trying my best to make writes hurt less on linux would be my next big thing [tm].
Oh and for the others watching, (points to vm hackers) I found a bug when playing with the dirty ratio code. If you modify it to allow it to drop below 5% but still above the minimum in the vm code, stalls happen somewhere in the vm where nothing much happens for sometimes 20 or 30 seconds worst case scenario. I had to drop a patch in 2.6.19 that allowed the dirty ratio to be set ultra low because these stalls were gross. > Amazing to me, the cpu it's using stays generally below 80%, and often below > 60%, even while the kmail composer has a full sentence in its buffer that > it still hasn't shown me when I switch to the htop screen to check, and > back to the kmail screen to see if it's updated yet. The screen switch > doesn't seem to lag so I don't think renicing x would be helpful. Those > are the obvious lags, and I'll build & reboot to the CFS patch at some > point this morning (what's left of it that is :). And report in due time of > course -- -ck
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
> It outlines the problems with Linux kernel development and questionable > elitism regarding ownership of certain sections of the kernel code. I have to step in and disagree here. Linux is not about who writes the code. Linux is about getting the best solution for a problem. Who wrote which line of the code is irrelevant in the big picture. That often means that multiple implementations happen, and that a darwinistic process decides that the best solution wins. This darwinistic process often happens in the form of discussion, and that discussion can happen with words or with code. In this case it happened with a code proposal. To make this specific: it has happened many times to me that when I solved an issue with code, someone else stepped in and wrote a different solution (although that was usually for smaller pieces). Was I upset about that? No! I was happy because my *problem got solved* in the best possible way. Now this doesn't mean that people shouldn't be nice to each other, not cooperate or steal credit, but I don't get the impression that that is happening here. Ingo is taking part in the discussion with a counter proposal for discussion *on the mailing list*. What more do you want?? If you or anyone else can improve it or do better, take part in this discussion and show what you mean either in words or in code. Your qualification of the discussion as an elitist takeover... I disagree with that. It's a *discussion*. Now if you agree that Ingo's patch is better technically, you and others should be happy about that because your problem is getting solved better. If you don't agree that his patch is better technically, take part in the technical discussion.
-- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
In article <[EMAIL PROTECTED]> you wrote: > A development process like this is likely to exclude smart people from wanting > to contribute to Linux and folks should be conscious about these issues. Nobody is excluded, you can always have a next iteration. Regards, Bernd
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau <[EMAIL PROTECTED]> wrote: >> [...] and using part of Bill's work to speed up development. On Sun, Apr 15, 2007 at 05:39:33PM +0200, Ingo Molnar wrote: > ok, let me make this absolutely clear: i didnt use any bit of plugsched > - in fact the most difficult bits of the modularization was for areas of > sched.c that plugsched never even touched AFAIK. (the load-balancer for > example.) > Plugsched simply does something else: i modularized scheduling policies > in essence that have to cooperate with each other, while plugsched > modularized complete schedulers which are compile-time or boot-time > selected, with no runtime cooperation between them. (one has to be > selected at a time) > (and i have no trouble at all with crediting Will's work either: a few > years ago i used Will's PID rework concepts for an NPTL related speedup > and Will is very much credited for it in today's kernel/pid.c and he > continued to contribute to it later on.) > (the tree walking bits of sched_fair.c were in fact derived from > kernel/hrtimer.c, the rbtree code written by Thomas and me :-) The extant plugsched patches have nothing to do with cfs; I suspect what everyone else is going on about is terminological confusion. The 4-year-old sample policy with scheduling classes for the original plugsched is something you had no way of knowing about, as it was never publicly posted. There isn't really anything all that interesting going on here, apart from pointing out that it's been done before. -- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau <[EMAIL PROTECTED]> wrote: > Ingo could have publicly spoken with them about his ideas of killing > the O(1) scheduler and replacing it with an rbtree-based one, [...] yes, that's precisely what i did, via a patchset :) [ I can even tell you when it all started: i was thinking about Mike's throttling patches while watching Manchester United beat the crap out of AS Roma (7 to 1 end result), Tuesday evening. I started coding it Wednesday morning and sent the patch Friday evening. I very much believe in low-latency when it comes to development too ;) ] (if this had been done via a committee then today we'd probably still be trying to find a suitable timeslot for the initial conference call where we'd discuss the election of a chair who would be tasked with writing up an initial document of feature requests, on which we'd take a vote, possibly this year already, because the matter is really urgent you know ;-) > [...] and using part of Bill's work to speed up development. ok, let me make this absolutely clear: i didnt use any bit of plugsched - in fact the most difficult bits of the modularization was for areas of sched.c that plugsched never even touched AFAIK. (the load-balancer for example.) Plugsched simply does something else: i modularized scheduling policies in essence that have to cooperate with each other, while plugsched modularized complete schedulers which are compile-time or boot-time selected, with no runtime cooperation between them. (one has to be selected at a time) (and i have no trouble at all with crediting Will's work either: a few years ago i used Will's PID rework concepts for an NPTL related speedup and Will is very much credited for it in today's kernel/pid.c and he continued to contribute to it later on.)
(the tree walking bits of sched_fair.c were in fact derived from kernel/hrtimer.c, the rbtree code written by Thomas and me :-) Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 02:45:27PM +0200, Willy Tarreau wrote: > Now I hope he and Bill will get over this and accept to work on improving > this scheduler, because I really find it smarter than a dumb O(1). I even > agree with Mike that we now have a solid basis for future work. But for > this, maybe a good starting point would be to remove the selfish printk > at boot, revert useless changes (SCHED_NORMAL->SCHED_FAIR come to mind) > and improve the documentation a bit so that people can work together on > the new design, without feeling like their work will only serve to > promote X or Y. While I appreciate people coming to my defense, or at least the good intentions behind such, my only actual interest in pointing out 4-year-old work is getting some acknowledgment of having done something relevant at all. Sometimes it has "I told you so" value. At other times it's merely clarifying what went on when people refer to it since in a number of cases the patches are no longer extant, so they can't actually look at it to get an idea of what was or wasn't done. At other times I'm miffed about not being credited, whether I should've been or whether dead and buried code has an implementation of the same idea resurfacing without the author(s) having any knowledge of my prior work. One should note that in this case, the first work of mine this trips over (scheduling classes) was never publicly posted as it was only a part of the original plugsched (an alternate scheduler implementation devised to demonstrate plugsched's flexibility with respect to scheduling policies), and a part that was dropped by subsequent maintainers. The second work of mine this trips over, a virtual deadline scheduler named "vdls," was also never publicly posted. Both are from around the same time period, which makes them approximately 4 years dead.
Neither of the codebases are extant, having been lost in a transition between employers, though various people recall having been sent them privately, and plugsched survives in a mutated form as maintained by Peter Williams, who's been very good about acknowledging my original contribution. If I care to become a direct participant in scheduler work, I can do so easily enough. I'm not entirely sure what this is about a basis for future work. By and large one should alter the API's and data structures to fit the policy being implemented. While the array swapping was nice for algorithmically improving 2.4.x -style epoch expiry, most algorithms not based on the 2.4.x scheduler (in however mutated a form) should use a different queue structure, in fact, one designed around their policy's specific algorithmic needs. IOW, when one alters the scheduler, one should also alter the queue data structure appropriately. I'd not expect the priority queue implementation in cfs to continue to be used unaltered as it matures, nor would I expect any significant modification of the scheduler to necessarily use a similar one. By and large I've been mystified as to why there is such a penchant for preserving the existing queue structures in the various scheduler patches floating around. I am now every bit as mystified at the point of view that seems to be emerging that a change of queue structure is particularly significant. These are all largely internal changes to sched.c, and as such, rather small changes in and of themselves. While they do tend to have user-visible effects, from this point of view even changing out every line of sched.c is effectively a micropatch. Something more significant might be altering the schedule() API to take a mandatory description of the intention of the call to it, or breaking up schedule() into several different functions to distinguish between different sorts of uses of it to which one would then respond differently. 
Also more significant would be adding a new state beyond TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, and TASK_RUNNING for some tasks to respond only to fatal signals, then sweeping TASK_UNINTERRUPTIBLE users to use the new state and handle those fatal signals. While not quite as ostentatious in their user-visible effects as SCHED_OTHER policy affairs, they are tremendously more work than switching out the implementation of a single C file, and so somewhat more respectable. Even as scheduling semantics go, these are micropatches. So SCHED_OTHER changes a little. Where are the gang schedulers? Where are the batch schedulers (SCHED_BATCH is not truly such)? Where are the isochronous (frame) schedulers? I suppose there is some CKRM work that actually has a semantic impact despite being largely devoted to SCHED_OTHER, and there's some spufs gang scheduling going on, though not all that much. And to reiterate a point from other threads, even as SCHED_OTHER patches go, I see precious little verification that things like the semantics of nice numbers or other sorts of CPU bandwidth allocation between competing tasks of various natures are staying the same while other
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007, Pekka Enberg wrote: >On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote: >> The perception here is that there is this expectation that >> sections of the Linux kernel are intentionally "churn squatted" to prevent >> any other ideas from creeping in other than of the owner of that subsystem >Strangely enough, my perception is that Ingo is simply trying to >address the issues Mike's testing discovered in RDSL and SD. It's not >surprising Ingo made it a separate patch set as Con has repeatedly >stated that the "problems" are in fact by design and won't be fixed. I won't get into the middle of this just yet, not having decided which dog I should bet on yet. I've been running 2.6.21-rc6 + Con's 0.40 patch for about 24 hours, it's been generally usable, but gzip still causes lots of 5 to 10+ second lags when it's running. I'm coming to the conclusion that gzip simply doesn't play well with others... Amazing to me, the cpu it's using stays generally below 80%, and often below 60%, even while the kmail composer has a full sentence in its buffer that it still hasn't shown me when I switch to the htop screen to check, and back to the kmail screen to see if it's updated yet. The screen switch doesn't seem to lag so I don't think renicing x would be helpful. Those are the obvious lags, and I'll build & reboot to the CFS patch at some point this morning (what's left of it that is :). And report in due time of course -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) knot in cables caused data stream to become twisted and kinked
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Con Kolivas <[EMAIL PROTECTED]> wrote: [ i'm quoting this bit out of order: ] > 2. Since then I've been thinking/working on a cpu scheduler design > that takes all the guesswork out of scheduling and gives very > predictable, as fair as possible, cpu distribution and latency while > preserving as solid interactivity as possible within those confines. yeah. I think you were right on target with this call. I've applied the sched.c change attached at the bottom of this mail to the CFS patch, if you dont mind. (or feel free to suggest some other text instead.) > 1. I tried in vain some time ago to push a working extensible > pluggable cpu scheduler framework (based on wli's work) for the linux > kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he > didn't like it) as being absolutely the wrong approach and that we > should never do that. [...] i partially replied to that point to Will already, and i'd like to make it clear again: yes, i rejected plugsched 2-3 years ago (which already drifted away from wli's original codebase) and i would still reject it today. First and foremost, please dont take such rejections too personally - i had my own share of rejections (and in fact, as i mentioned it in a previous mail, i had a fair number of complete project throwaways: 4g:4g, in-kernel Tux, irqrate and many others). I know that they can hurt and can demoralize, but if i dont like something it's my job to tell that. Can i sum up your argument as: "you rejected plugsched, but then why on earth did you modularize portions of the scheduler in CFS? Isnt your position thus woefully inconsistent?" (i'm sure you would never put it this impolitely though, but i guess i can flame myself with impunity ;) While having an inconsistent position isnt a terminal sin in itself, please realize that the scheduler classes code in CFS is quite different from plugsched: it was a result of what i saw to be technological pressure for _internal modularization_.
(This internal/policy modularization aspect is something that Will said was present in his original plugsched code, but which aspect i didnt see in the plugsched patches that i reviewed.) That possibility never even occurred to me until 3 days ago. You never raised it either AFAIK. No patches to simplify the scheduler that way were ever sent. Plugsched doesnt even touch the core load-balancer for example, and most of the time i spent with the modularization was to get the load-balancing details right. So it's really apples to oranges. My view about plugsched: first please take a look at the latest plugsched code: http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch 26 files changed, 8951 insertions(+), 1495 deletions(-) As an experiment i've removed all the add-on schedulers (both the core and the include files, only kept the vanilla one) from the plugsched patch (and the makefile and kconfig complications, etc), to see the 'infrastructure cost', and it still gave: 12 files changed, 1933 insertions(+), 1479 deletions(-) that's the extra complication i didnt like 3 years ago and which i still dont like today. What the current plugsched code does is that it simplifies the adding of new experimental schedulers, but it doesnt really do what i wanted: to simplify the _scheduler itself_. Personally i'm still not primarily interested in having a large selection of schedulers, i'm mainly interested in a good and maintainable scheduler that works for people. so the rejection was on these grounds, and i still very much stand by that position here and today: i didnt want to see the Linux scheduler landscape balkanized and i saw no technological reasons for the complication that external modularization brings. the new scheduling classes code in the CFS patch was not a result of "oh, i want to write a new scheduler, lets make schedulers pluggable" kind of thinking. That result was just a side-effect of it.
(and as you correctly noted it, the CFS related modularization is incomplete). Btw., the thing that triggered the scheduling classes code wasnt even plugsched or RSDL/SD, it was Mike's patches. Mike had an itch and he fixed it within the framework of the existing scheduler, and the end result behaved quite well when i threw various testloads on it. But i felt a bit uncomfortable that it added another few hundred lines of code to an already complex sched.c. This felt unnatural so i mailed Mike that i'd attempt to clean these infrastructure aspects of sched.c up a bit so that it becomes more hackable to him. Thus 3 days ago, without having made up my mind about anything, i started this experiment (which ended up in the modularization and in the CFS scheduler) to simplify the code and to enable Mike to fix such itches in an easier way. By your logic Mike should in fact be quite upset about this: if the new code works out and proves to be useful then it obsoletes a whole lot
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 15 Apr 2007, Willy Tarreau wrote: > Ingo could have publicly spoken with them about his ideas of killing > the O(1) scheduler and replacing it with an rbtree-based one, and using > part of Bill's work to speed up development. He did exactly that and he did it with a patch. Nothing new here. This is how development on LKML proceeds when you have two or more competing designs. There's absolutely no need to get upset or hurt your feelings over it. It's not malicious, it's how we do Linux development. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Esben Nielsen <[EMAIL PROTECTED]> wrote: > I took a brief look at it. Have you tested priority inheritance? yeah, you are right, it's broken at the moment, i'll fix it. But the good news is that i think PI could become cleaner via scheduling classes. > As far as I can see rt_mutex_setprio doesn't have much effect on > SCHED_FAIR/SCHED_BATCH. I am looking for a place where such a task > changes scheduler class when boosted in rt_mutex_setprio(). i think via scheduling classes we dont have to do the p->policy and p->prio based gymnastics anymore, we can just have a clean look at p->sched_class and stack the original scheduling class into p->real_sched_class. It would probably also make sense to 'privatize' p->prio into the scheduling class. That way PI would be a pure property of sched_rt, and the PI scheduler would be driven purely by p->rt_priority, not by p->prio. That way all the normal_prio() kind of complications and interactions with SCHED_OTHER/SCHED_FAIR would be eliminated as well. What do you think? Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 01:39:27PM +0300, Pekka Enberg wrote: > On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote: > >The perception here is that there is this expectation that > >sections of the Linux kernel are intentionally "churn squatted" to prevent > >any other ideas from creeping in other than those of the owner of that subsystem > > Strangely enough, my perception is that Ingo is simply trying to > address the issues Mike's testing discovered in RSDL and SD. It's not > surprising Ingo made it a separate patch set as Con has repeatedly > stated that the "problems" are in fact by design and won't be fixed. That's not exactly the problem. There are people who work very hard to try to improve some areas of the kernel. They progress slowly, and acquire more and more skills. Sometimes they feel like they need to change some concepts, and propose those changes which are required for them to go further, or to develop faster. Those are rejected. So they are constrained to work in a delimited perimeter from which it is difficult for them to escape. Then, the same person who rejected their changes comes up with something shiny new, better, and which took him far less time. But he sort of broke the rules, because what was forbidden to the first persons is suddenly permitted. Maybe for very good reasons, I'm not discussing that. But the good reason should have been valid the first time too. The fact is that when changes are rejected, we should not simply say "no", but explain why and define what would be acceptable. Some people here have excellent teaching skills for this, but most others do not. Anyway, the rules should be the same for everybody. Also, there is what can be perceived as marketing here. Con worked on his idea with conviction, he took time to write some generous documentation, but he hit a wall where his concept was suboptimal on a given workload. But at least, all the work was oriented on a technical basis: design + code + doc. 
Then, Ingo comes in with something looking amazingly better, with virtually no documentation, an appealing announcement, and shiny advertising at boot. All this implemented without the constraints other people had to respect. It already looks like definitive work which will be merged as-is without many changes except a few bugfixes. If those were two companies, the first one would simply have accused the second one of not having respected contracts and having employed heavy marketing to take the first place. People here do not code for a living, they do it at least because they believe in what they are doing, and some of them want a bit of gratitude for their work. I've met people who were proud to say they implemented this or that feature in the kernel, so it is something important for them. And being cited in an email is nothing compared to advertising at boot time. When the discussion was blocked between Con and Mike concerning the design problems, that is where a new discussion should have taken place. Ingo could have publicly spoken with them about his ideas of killing the O(1) scheduler and replacing it with an rbtree-based one, and using part of Bill's work to speed up development. It is far easier to resign yourself when people explain what concepts are wrong and how they think they will proceed than when they suddenly present something out of nowhere which is already better. And it's not specific to Ingo (though I think his ability to work that fast alone makes him tend to practise this more often than others). Imagine if Con had worked another full week on his scheduler with better results on Mike's workload, but still not as good as Ingo's, and they had both published at the same time. You certainly can imagine he would have preferred to be informed first that it was pointless to continue in that direction. Now I hope he and Bill will get over this and accept to work on improving this scheduler, because I really find it smarter than a dumb O(1). 
I even agree with Mike that we now have a solid basis for future work. But for this, maybe a good starting point would be to remove the selfish printk at boot, revert useless changes (SCHED_NORMAL->SCHED_FAIR comes to mind) and improve the documentation a bit so that people can work together on the new design, without feeling like their work will only serve to promote X or Y. Regards, Willy
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Fri, 13 Apr 2007, Ingo Molnar wrote: [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] i'm pleased to announce the first release of the "Modular Scheduler Core and Completely Fair Scheduler [CFS]" patchset: http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch This project is a complete rewrite of the Linux task scheduler. My goal is to address various feature requests and to fix deficiencies in the vanilla scheduler that were suggested/found in the past few years, both for desktop scheduling and for server scheduling workloads. [...] I took a brief look at it. Have you tested priority inheritance? As far as I can see rt_mutex_setprio doesn't have much effect on SCHED_FAIR/SCHED_BATCH. I am looking for a place where such a task changes scheduler class when boosted in rt_mutex_setprio(). Esben
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On 4/15/07, hui Bill Huey <[EMAIL PROTECTED]> wrote: The perception here is that there is this expectation that sections of the Linux kernel are intentionally "churn squatted" to prevent any other ideas from creeping in other than those of the owner of that subsystem Strangely enough, my perception is that Ingo is simply trying to address the issues Mike's testing discovered in RSDL and SD. It's not surprising Ingo made it a separate patch set as Con has repeatedly stated that the "problems" are in fact by design and won't be fixed.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 10:44:47AM +0200, Ingo Molnar wrote: > I prefer such early releases to lkml _a lot_ more than any private review > process. I released the CFS code about 6 hours after i thought "okay, > this looks pretty good" and i spent those final 6 hours on testing it > (making sure it doesnt blow up on your box, etc.), in the final 2 hours > i showed it to two folks i could reach on IRC (Arjan and Thomas) and on > various finishing touches. It doesnt get much faster than that and i > definitely didnt want to sit on it even one day longer because i very > much thought that Con and others should definitely see this work! > > And i very much credited (and still credit) Con for the whole fairness > angle: > > || i'd like to give credit to Con Kolivas for the general approach here: > || he has proven via RSDL/SD that 'fair scheduling' is possible and that > || it results in better desktop scheduling. Kudos Con! > > the 'design consultation' phase you are talking about is _NOW_! :) > > I got the v1 code out to Con, to Mike and to many others ASAP. That's > how you are able to comment on this thread and be part of the > development process to begin with, in a 'private consultation' setup > you'd not have had any opportunity to see _any_ of this. > > In the BSD space there seem to be more 'political' mechanisms for > development, but Linux is truly about doing things out in the open, and > doing it immediately. I can't even begin to talk about how screwed up BSD development is. Maybe another time, privately. Ok, Linux development and inclusiveness can be improved. I'm not trying to "call you out" (slang for accusing you with the sole intention of calling you crazy in a highly confrontational manner). This is discussed publicly here to bring this issue to light and open a communication channel as a means to resolve it. > Okay? ;-) It's cool. 
We're still getting to know each other professionally, and it's okay to a certain degree to have a communication disconnect, but only as long as it clears. Your productivity is amazing, BTW. But here's the problem: there's this perception that NIH is the default mentality here in Linux. Con feels that this kind of action is intentional and has a malicious quality to it, as a means of "churn squatting" sections of the kernel tree. The perception here is that there is this expectation that sections of the Linux kernel are intentionally "churn squatted" to prevent any other ideas from creeping in other than those of the owner of that subsystem (VM, scheduling, etc...) because of lack of modularity in the kernel. This isn't an API question but possibly a question of general code quality and of how maintenance of it can happen. This was predicted by folks, and then this perception was *realized* when you wrote the equivalent kind of code that has technical overlap with SD (this is just one dry example). To a person that is writing new code for Linux, having one of the old guard write equivalent code to that of a newcomer has the effect of displacing that person, both with regards to the code and the responsibility that goes with it. When this happens over and over again and folks get annoyed by it, Linux development starts seeming elitist. I know this because I heard (read) Con's IRC chats about these matters all of the time. This is not just his view but the view of other kernel folks with differing opinions. The closing talk at OLS 2006 was highly disturbing in many ways. It went "Christoph is right, everybody else is wrong", which sends a highly negative message to new kernel developers that, say, don't work for RH directly or any of the other mainstream Linux companies. After a while, it starts seeming like this kind of behavior is completely intentional and that Linux is full of arrogant bastards. 
What I would have done here was to contact Peter Williams, Bill Irwin and Con about what you're doing and reach a common consensus about how to create something that would be inclusive of all of their ideas. Discussions can get technically heated but that's ok; the discussion is happening and it brings down the wall of this perception. Bill and Con are on oftc.net/#offtopic2. Riel is there as well as Peter Zijlstra. It might be very useful, it might not be. Folks are all stubborn about their ideas and hold on to them for dear life. Effective leaders can deconstruct this hostility and animosity. I don't claim to be one. Because of past hostility to something like plugsched, the hostility and terseness of responses can be perceived simply as "I'm right, you're wrong", which is condescending. This affects discussion and outright destroys a constructive process if it happens continually, since it reinforces that view of "You're an outsider, we don't care about you". Nobody is listening to each other at that point, and folks get pissed. Then they think "I'm going to NIH this person with patch X because he/she did the same here", which is dysfunctional.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 10:58 +0200, Ingo Molnar wrote: > * Mike Galbraith <[EMAIL PROTECTED]> wrote: > > [...] (I know a trivial way to cure that, and this framework makes > > that possible without dorking up fairness as a general policy.) > > great! Please send patches so i can add them (once you are happy with > the solution) - i think your workload isnt special in any way and could > hit other people too. I'll give it a shot. (have to read and actually understand your new code first though, then see if it's really viable) -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Bill Huey <[EMAIL PROTECTED]> wrote: > On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote: > > [...] > > > > Demystify what? The casual observer need only read either your > > attempt > > Here's the problem. You're a casual observer and obviously not paying > attention. guys, please calm down. Judging by the number of contributions to sched.c the main folks who are not 'observers' here and who thus have an unalienable right to be involved in a nasty flamewar about scheduler interactivity are Con, Mike, Nick and me ;-) Everyone else is just a happy bystander, ok? ;-) Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Mike Galbraith <[EMAIL PROTECTED]> wrote: > On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote: > > > Well, I'll stop heating the room for now as I get out of ideas about > > how to defeat it. I'm convinced. I'm impatient to read about Mike's > > feedback with his workload which behaves strangely on RSDL. If it > > works OK here, it will be the proof that heuristics should not be > > needed. > > You mean the X + mp3 player + audio visualization test? X+Gforce > visualization have problems getting half of my box in the presence of > two other heavy cpu using tasks. Behavior is _much_ better than > RSDL/SD, but the synchronous nature of X/client seems to be a problem. > > With this scheduler, renicing X/client does cure it, whereas with SD > it did not help one bit. [...] thanks for testing it! I was quite worried about your setup - two tasks using up 50%/50% of CPU time, pitted against a kernel rebuild workload seems to be a hard workload to get right. > [...] (I know a trivial way to cure that, and this framework makes > that possible without dorking up fairness as a general policy.) great! Please send patches so i can add them (once you are happy with the solution) - i think your workload isnt special in any way and could hit other people too. Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 01:36 -0700, Bill Huey wrote: > On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote: > > [...] > > > > Demystify what? The casual observer need only read either your attempt > > Here's the problem. You're a casual observer and obviously not paying > attention. > > > at writing a scheduler, or my attempts at fixing the one we have, to see > > that it was high time for someone with the necessary skills to step in. > > Now progress can happen, which was _not_ happening before. > > I think that's inaccurate and there are plenty of folks that have that > technical skill and background. The scheduler code isn't a deep mystery > and there are plenty of good kernel hackers out here across many > communities. Ingo isn't the only person on this planet to have deep > scheduler knowledge. Ok, I'm not paying attention, and you can't read. We're even. Have a nice life. -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Bill Huey <[EMAIL PROTECTED]> wrote: > Hello folks, > > I think the main failure I see here is that Con wasn't included in > this design or privately in the review process. There could have been > better co-ownership of the code. This could also have been done openly > on lkml [...] Bill, you come from a BSD background and you are still relatively new to Linux development, so i dont at all fault you for misunderstanding this situation, and fortunately i have a really easy resolution for your worries: i did exactly that! :) i wrote the first line of code of the CFS patch this week, 8am Wednesday morning, and released it to lkml 62 hours later, 10pm on Friday. (I've listed the file timestamps of my backup patches further below, for all the fine details.) I prefer such early releases to lkml _a lot_ more than any private review process. I released the CFS code about 6 hours after i thought "okay, this looks pretty good" and i spent those final 6 hours on testing it (making sure it doesnt blow up on your box, etc.), in the final 2 hours i showed it to two folks i could reach on IRC (Arjan and Thomas) and on various finishing touches. It doesnt get much faster than that and i definitely didnt want to sit on it even one day longer because i very much thought that Con and others should definitely see this work! And i very much credited (and still credit) Con for the whole fairness angle: || i'd like to give credit to Con Kolivas for the general approach here: || he has proven via RSDL/SD that 'fair scheduling' is possible and that || it results in better desktop scheduling. Kudos Con! the 'design consultation' phase you are talking about is _NOW_! :) I got the v1 code out to Con, to Mike and to many others ASAP. That's how you are able to comment on this thread and be part of the development process to begin with, in a 'private consultation' setup you'd not have had any opportunity to see _any_ of this. 
In the BSD space there seem to be more 'political' mechanisms for development, but Linux is truly about doing things out in the open, and doing it immediately. Okay? ;-) Here's the timestamps of all my backups of the patch, from its humble 4K beginnings to the 100K first-cut v1 result:

-rw-rw-r-- 1 mingo mingo  4230 Apr 11 08:47 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo  7653 Apr 11 09:12 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo  7728 Apr 11 09:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 14416 Apr 11 10:08 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 24211 Apr 11 10:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 27878 Apr 11 10:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 33807 Apr 11 11:05 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 34524 Apr 11 11:09 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 39650 Apr 11 11:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40231 Apr 11 11:34 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40627 Apr 11 11:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40638 Apr 11 11:54 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42733 Apr 11 12:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42817 Apr 11 12:31 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43270 Apr 11 12:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43531 Apr 11 12:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 44331 Apr 11 12:51 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45173 Apr 11 12:56 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45288 Apr 11 12:59 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45368 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45370 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45815 Apr 11 13:14 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45887 Apr 11 13:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45914 Apr 11 13:25 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45850 Apr 11 13:29 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 49196 Apr 11 13:39 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64317 Apr 11 13:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64403 Apr 11 13:52 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:03 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:07 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 68995 Apr 11 14:50 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 69919 Apr 11 15:23 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71065 Apr 11 16:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 70642 Apr 11 16:28 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 72334 Apr 11 16:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71624 Apr 11 17:01 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71854 Apr 11 17:20 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 73571 Apr 11 17:42 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:51
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote: > [...] > > Demystify what? The casual observer need only read either your attempt Here's the problem. You're a casual observer and obviously not paying attention. > at writing a scheduler, or my attempts at fixing the one we have, to see > that it was high time for someone with the necessary skills to step in. > Now progress can happen, which was _not_ happening before. I think that's inaccurate and there are plenty of folks that have that technical skill and background. The scheduler code isn't a deep mystery and there are plenty of good kernel hackers out here across many communities. Ingo isn't the only person on this planet to have deep scheduler knowledge. Priority heaps are not new, and Solaris has had a pluggable scheduler framework for years. Con's characterization is something that I'm more prone to believe about how Linux kernel development works versus your view. I think it's a great shame to have folks like Bill Irwin and Con waste time trying to do something right, only to have their ideas attacked, then copied and held up as the solution for this kind of technical problem, in a complete reversal of technical opinion as it suits the moment. This is just wrong in so many ways. It outlines the problems with Linux kernel development and questionable elitism regarding ownership of certain sections of the kernel code. I call it "churn squatting", and instances like this only support that view, which I would rather be completely wrong and inaccurate instead. bill
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote: > Well, I'll stop heating the room for now as I get out of ideas about how > to defeat it. I'm convinced. I'm impatient to read about Mike's feedback > with his workload which behaves strangely on RSDL. If it works OK here, > it will be the proof that heuristics should not be needed. You mean the X + mp3 player + audio visualization test? X+Gforce visualization have problems getting half of my box in the presence of two other heavy cpu using tasks. Behavior is _much_ better than RSDL/SD, but the synchronous nature of X/client seems to be a problem. With this scheduler, renicing X/client does cure it, whereas with SD it did not help one bit. (I know a trivial way to cure that, and this framework makes that possible without dorking up fairness as a general policy.) -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 13:27 +1000, Con Kolivas wrote: > On Saturday 14 April 2007 06:21, Ingo Molnar wrote: > > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler > > [CFS] > > > > i'm pleased to announce the first release of the "Modular Scheduler Core > > and Completely Fair Scheduler [CFS]" patchset: > > > >http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch > > > > This project is a complete rewrite of the Linux task scheduler. My goal > > is to address various feature requests and to fix deficiencies in the > > vanilla scheduler that were suggested/found in the past few years, both > > for desktop scheduling and for server scheduling workloads. > > The casual observer will be completely confused by what on earth has happened > here so let me try to demystify things for them. [...] Demystify what? The casual observer need only read either your attempt at writing a scheduler, or my attempts at fixing the one we have, to see that it was high time for someone with the necessary skills to step in. Now progress can happen, which was _not_ happening before. -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
Hi, On Friday 13 April 2007 23:21:00 Ingo Molnar wrote: [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] i'm pleased to announce the first release of the Modular Scheduler Core and Completely Fair Scheduler [CFS] patchset: http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch Tested this on top of Linus' GIT tree, but the system gets very unresponsive during high disk i/o using ext3 as the filesystem; even writing a 300mb file to a usb disk (an iPod, actually) has the same effect. Regards, ismail
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 05:00, Jonathan Lundell wrote: On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote: It's a really good thing, and it means that if somebody shows that your code is flawed in some way (by, for example, making a patch that people claim gets better behaviour or numbers), any *good* programmer that actually cares about his code will obviously suddenly be very motivated to out-do the out-doer! No one who cannot rejoice in the discovery of his own mistakes deserves to be called a scholar. Lovely comment. I realise this is not truly directed at me, but clearly, in the context it has been said, people will assume it is directed my way, so while we're all spinning lkml quality rhetoric, let me have a right of reply. One thing I have never tried to do was to ignore bug reports. I'm forever joking that I keep pulling code out of my arse to improve what I've done. RSDL/SD was no exception; heck, it had 40 iterations. The reason I could not reply to bug report A with "Oh, that is problem B, so I'll fix it with code C" was, as I've said many many times over, health related. I did indeed try to fix many of them without spending hours replying to sometimes unpleasant emails. If health wasn't an issue there might have been 1000 iterations of SD. There was only ever _one_ thing that I was absolutely steadfast on as a concept, that I refused to fix, that people might claim was a mistake I did not rejoice in as a scholar should. That was that the _correct_ behaviour for a scheduler is to be fair, such that proportional slowdown with load is (using that awful pun) a feature, not a bug. Now there are people who will still disagree violently with me on that. SD attempted to be a fairness-first virtual-deadline design. If I failed on that front, then so be it (and at least one person certainly has said, in lovely warm fuzzy friendly communication, that I'm a global failure on all fronts with SD). 
But let me point out now that Ingo's shiny new scheduler is a fairness-first virtual-deadline design which will have proportional slowdown with load. So it will have a very similar feature. I dare anyone to claim that proportional slowdown with load is a bug, because I will no longer feel like I'm standing alone with a BFG9000 trying to defend my standpoint. Others can take up the post at last. -- -ck
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote: Hi, On Friday 13 April 2007 23:21:00 Ingo Molnar wrote: [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] i'm pleased to announce the first release of the Modular Scheduler Core and Completely Fair Scheduler [CFS] patchset: http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch Tested this on top of Linus' GIT tree, but the system gets very unresponsive during high disk I/O using ext3 as the filesystem; even writing a 300MB file to a USB disk (an iPod, actually) has the same effect. Just to make sure: this exact same workload, but with the stock scheduler, does not have this effect? If so, then it could well be that the scheduler is too fair for its own good (being really fair inevitably ends up not batching as much as one should, and batching is needed to get any kind of decent performance out of disks nowadays) -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 02:23:08 Arjan van de Ven wrote: On Mon, 2007-04-16 at 01:49 +0300, Ismail Dönmez wrote: Hi, On Friday 13 April 2007 23:21:00 Ingo Molnar wrote: [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] i'm pleased to announce the first release of the Modular Scheduler Core and Completely Fair Scheduler [CFS] patchset: http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch Tested this on top of Linus' GIT tree, but the system gets very unresponsive during high disk I/O using ext3 as the filesystem; even writing a 300MB file to a USB disk (an iPod, actually) has the same effect. Just to make sure: this exact same workload, but with the stock scheduler, does not have this effect? If so, then it could well be that the scheduler is too fair for its own good (being really fair inevitably ends up not batching as much as one should, and batching is needed to get any kind of decent performance out of disks nowadays) Tried with make install in kdepim (which made the system sluggish with CFS) and the system is just fine (using CFQ). Regards, ismail
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: 2) plugsched did not allow on the fly selection of schedulers, nor did it allow a per CPU selection of schedulers. IO schedulers you can change per disk, on the fly, making them much more useful in practice. Also, IO schedulers (while definitely not being slow!) are a lot less performance sensitive than CPU schedulers. One of the reasons I never posted my own code is that it never met its own design goals, which absolutely included switching on the fly. I think Peter Williams may have done something about that. It was my hope to be able to do insmod sched_foo.ko until it became clear that the effort it was intended to assist wasn't going to get even the limited hardware access required, at which point I largely stopped working on it. On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: 3) I/O schedulers are pretty damn clean code, and plugsched, at least the last version i saw of it, didn't come even close. I'm not sure what happened there. It wasn't a big enough patch to take hits in this area due to getting overwhelmed by the programming burden like some other efforts of mine. Maybe things started getting ugly once on-the-fly switching entered the picture. My guess is that Peter Williams will have to chime in here, since things have diverged enough from my one-time contribution 4 years ago. -- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007 00:38, Davide Libenzi wrote: Haven't looked at the scheduler code yet, but for a similar problem I use a time ring. The ring has Ns (a power of 2 is better) slots (where tasks are queued - in my case they were some sort of timers), and it has a current base index (Ib), a current base time (Tb) and a time granularity (Tg). It also has a bitmap with bits telling you which slots contain queued tasks. An item (task) that has to be scheduled at time T will be queued in the slot: S = Ib + min((T - Tb) / Tg, Ns - 1); Items with T longer than Ns*Tg will be scheduled in the relative last slot (choosing a proper Ns and Tg can minimize this). Queueing is O(1) and de-queueing is O(Ns). You can play with Ns and Tg to suit your needs. This is a simple bench between time-ring (TR) and CFS queueing: http://www.xmailserver.org/smart-queue.c In my box (Dual Opteron 252):

[EMAIL PROTECTED]:~$ ./smart-queue -n 8
CFS = 142.21 cycles/loop
TR = 72.33 cycles/loop
[EMAIL PROTECTED]:~$ ./smart-queue -n 16
CFS = 188.74 cycles/loop
TR = 83.79 cycles/loop
[EMAIL PROTECTED]:~$ ./smart-queue -n 32
CFS = 221.36 cycles/loop
TR = 75.93 cycles/loop
[EMAIL PROTECTED]:~$ ./smart-queue -n 64
CFS = 242.89 cycles/loop
TR = 81.29 cycles/loop

Hello all, I cannot help but report results with the GAVL tree algorithm here as another race competitor. I believe that it is a better solution for large priority queues than the RB-tree and even heap trees. It could be disputable whether the scheduler needs such scalability, on the other hand. The AVL heritage guarantees lower height, which results in shorter search times, which could be profitable for other uses in the kernel. The GAVL algorithm is AVL-tree based, so it does not suffer from the finite priority granularity that TR does. It allows use in the generalized case where the tree is not fully balanced. This allows cutting the first item without rebalancing. 
This leads to the degradation of the tree by at most one more level (than a non-degraded AVL gives), which is still considerably better than the RB-tree's maximum. http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c The description behind the code is here: http://cmp.felk.cvut.cz/~pisa/ulan/gavl.pdf The code is part of the much more comprehensive uLUt library: http://cmp.felk.cvut.cz/~pisa/ulan/ulut.pdf http://sourceforge.net/project/showfiles.php?group_id=118937&package_id=130840 I have included all required GAVL code directly in smart-queue-v-gavl.c to provide it for easy testing. These are tests run on my little dated computer - a Duron 600 MHz. Tests are run twice to suppress run-order influence.

./smart-queue-v-gavl -n 1 -l 200
gavl_cfs = 55.66 cycles/loop
CFS = 88.33 cycles/loop
TR = 141.78 cycles/loop
CFS = 90.45 cycles/loop
gavl_cfs = 55.38 cycles/loop

./smart-queue-v-gavl -n 2 -l 200
gavl_cfs = 82.85 cycles/loop
CFS = 104.18 cycles/loop
TR = 145.21 cycles/loop
CFS = 102.74 cycles/loop
gavl_cfs = 82.05 cycles/loop

./smart-queue-v-gavl -n 4 -l 200
gavl_cfs = 137.45 cycles/loop
CFS = 156.47 cycles/loop
TR = 142.00 cycles/loop
CFS = 152.65 cycles/loop
gavl_cfs = 139.38 cycles/loop

./smart-queue-v-gavl -n 10 -l 200
gavl_cfs = 229.22 cycles/loop (WORSE)
CFS = 206.26 cycles/loop
TR = 140.81 cycles/loop
CFS = 208.29 cycles/loop
gavl_cfs = 223.62 cycles/loop (WORSE)

./smart-queue-v-gavl -n 100 -l 200
gavl_cfs = 257.66 cycles/loop
CFS = 329.68 cycles/loop
TR = 142.20 cycles/loop
CFS = 319.34 cycles/loop
gavl_cfs = 260.02 cycles/loop

./smart-queue-v-gavl -n 1000 -l 200
gavl_cfs = 258.41 cycles/loop
CFS = 393.04 cycles/loop
TR = 134.76 cycles/loop
CFS = 392.20 cycles/loop
gavl_cfs = 260.93 cycles/loop

./smart-queue-v-gavl -n 1 -l 200
gavl_cfs = 259.45 cycles/loop
CFS = 605.89 cycles/loop
TR = 196.69 cycles/loop
CFS = 622.60 cycles/loop
gavl_cfs = 262.72 cycles/loop

./smart-queue-v-gavl -n 10 -l 200
gavl_cfs = 258.21 cycles/loop
CFS = 845.62 cycles/loop
TR = 315.37 cycles/loop
CFS = 860.21 cycles/loop
gavl_cfs = 258.94 cycles/loop

The GAVL code has not been tuned with any likely/unlikely constructs. It even carries some overhead from its generic design which is not necessary for this use - it permanently keeps a pointer to the last element, ensures that insertion order is preserved for equal key values, etc. But it still shows much better scalability than the kernel's RB-tree code. On the other hand, it does not encode the color/height in one of the pointers and requires an additional field for the height. Should the difference be due to some bug in my testing, I would be interested in a correction. The test case is probably oversimplified. I have already run more different tests against the GAVL code in the past to compare it with different tree and queue implementations and I have not found a case with real performance degradation. On the other hand, there are cases for small item counts where GAVL is sometimes a little worse
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* William Lee Irwin III [EMAIL PROTECTED] wrote: I've been suggesting testing CPU bandwidth allocation as influenced by nice numbers for a while now, for a reason. On Sun, Apr 15, 2007 at 09:57:48PM +0200, Ingo Molnar wrote: Oh, I was very much testing CPU bandwidth allocation as influenced by nice numbers - it's one of the basic things i do when modifying the scheduler. An automated tool, while nice (all automation is nice) wouldn't necessarily show such bugs though, because here too it needed thousands of running tasks to trigger in practice. Any volunteers? ;) Worst comes to worst I might actually get around to doing it myself. Any more detailed descriptions of the test for a rainy day? -- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007, Mike Galbraith wrote: On Sun, 2007-04-15 at 12:58 -0400, Gene Heskett wrote: Chuckle, possibly, but then I'm not anything even remotely close to an expert here Con, just reporting what I get. And I just rebooted to 2.6.21-rc6 + sched-mike-5.patch for grins and giggles, or frowns and profanity as the case may call for. Erm, that patch is embarrassingly buggy, so profanity should dominate. -Mike Chuckle, ROTFLMAO even. I didn't run it that long, as I immediately rebuilt and rebooted when I found I'd used the wrong patch, and in fact had tested that one and found it sub-optimal before I'd built and run Con's -0.40 version. As for bugs of the type that make it to the screen or logs, I didn't see any. OTOH, my eyesight is slowly going downhill, now 20/25. It was 20/10 30 years ago. Now that's reason for profanity... -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Unix weenies are as bad at this as anyone. -- Larry Wall in [EMAIL PROTECTED]
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
William Lee Irwin III wrote: On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: 2) plugsched did not allow on the fly selection of schedulers, nor did it allow a per CPU selection of schedulers. IO schedulers you can change per disk, on the fly, making them much more useful in practice. Also, IO schedulers (while definitely not being slow!) are a lot less performance sensitive than CPU schedulers. One of the reasons I never posted my own code is that it never met its own design goals, which absolutely included switching on the fly. I think Peter Williams may have done something about that. I didn't, but some students did. In a previous life, I did implement a runtime configurable CPU scheduling mechanism (implemented on Tru64, Solaris and Linux) that allowed schedulers to be loaded as modules at run time. This was released commercially on Tru64 and Solaris, so I know that it can be done. I have thought about doing something similar for the SPA schedulers, which differ in only small ways from each other, but lack motivation. It was my hope to be able to do insmod sched_foo.ko until it became clear that the effort it was intended to assist wasn't going to get even the limited hardware access required, at which point I largely stopped working on it. On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: 3) I/O schedulers are pretty damn clean code, and plugsched, at least the last version i saw of it, didn't come even close. I'm not sure what happened there. It wasn't a big enough patch to take hits in this area due to getting overwhelmed by the programming burden like some other efforts of mine. Maybe things started getting ugly once on-the-fly switching entered the picture. My guess is that Peter Williams will have to chime in here, since things have diverged enough from my one-time contribution 4 years ago. 
From my POV, the current version of plugsched is considerably simpler than it was when I took the code over from Con, as I put considerable effort into minimizing code overlap between the various schedulers. I also put considerable effort into minimizing any changes to the load balancing code (something Ingo seems to think is a deficiency) and the result is that plugsched allows intra-run-queue scheduling to be easily modified WITHOUT affecting load balancing. To my mind scheduling and load balancing are orthogonal, and keeping them that way simplifies things. As Ingo correctly points out, plugsched does not allow different schedulers to be used per CPU, but it would not be difficult to modify it so that they could. Although I've considered doing this over the years, I decided not to, as it would just increase the complexity and the amount of work required to keep the patch set going. About six months ago I decided to reduce the amount of work I was doing on plugsched (as it was obviously never going to be accepted) and now only publish patches against the vanilla kernel's major releases (and the only reason that I kept doing that is that the download figures indicated that about 80 users were interested in the experiment). Peter PS I no longer read LKML (due to time constraints) and would appreciate it if I could be CC'd on any e-mails suggesting scheduler changes. PPS I'm just happy to see that Ingo has finally accepted that the vanilla scheduler was badly in need of fixing and don't really care who fixes it. PPPS Different schedulers for different aims (i.e. server or workstation) do make a difference. E.g. the spa_svr scheduler in plugsched does about 1% better on kernbench than the next best scheduler in the bunch. PPPPS Con, fairness isn't always best, as humans aren't very altruistic and we need to give unfair preference to interactive tasks in order to stop users flinging their PCs out the window. 
But the current scheduler doesn't do this very well, and is also not very good at fairness, so it needs to change. But the changes need to address both interactive response and fairness, not just fairness. -- Peter Williams [EMAIL PROTECTED] Learning, n. The kind of ignorance distinguishing the studious. -- Ambrose Bierce
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, Apr 16, 2007 at 08:52:33AM +1000, Con Kolivas wrote: On Monday 16 April 2007 05:00, Jonathan Lundell wrote: On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote: It's a really good thing, and it means that if somebody shows that your code is flawed in some way (by, for example, making a patch that people claim gets better behaviour or numbers), any *good* programmer that actually cares about his code will obviously suddenly be very motivated to out-do the out-doer! No one who cannot rejoice in the discovery of his own mistakes deserves to be called a scholar. Lovely comment. I realise this is not truly directed at me but clearly in the context it has been said people will assume it is directed my way, so while we're all spinning lkml quality rhetoric, let me have a right of reply. One thing I have never tried to do was to ignore bug reports. I'm forever joking that I keep pulling code out of my arse to improve what I've done. RSDL/SD was no exception; heck it had 40 iterations. The reason I could not reply to bug report A with Oh that is problem B so I'll fix it with code C was, as I've said many many times over, health related. I did indeed try to fix many of them without spending hours replying to sometimes unpleasant emails. If health wasn't an issue there might have been 1000 iterations of SD. Well what matters is the code and development. I don't think Ingo's scheduler is the final word, although I worry that Linus might jump the gun and merge something just to give it a test, which we then get stuck with :P I don't know how anybody can think Ingo's new scheduler is anything but a good thing (so long as it has to compete before being merged). And that's coming from someone who wants *their* scheduler to get merged... I think mine can compete ;) and if it can't, then I'd rather be using the scheduler that beats it. 
There was only ever _one_ thing that I was absolutely steadfast on as a concept that I refused to fix that people might claim was a mistake I did not rejoice in to be a scholar. That was that the _correct_ behaviour for a scheduler is to be fair such that proportional slowdown with load is (using that awful pun) a feature, not a bug. If something is using more than a fair share of CPU time over some macro period in order to be interactive, then it definitely should get throttled. I've always maintained (since starting scheduler work) that the 2.6 scheduler is horrible because it allows cases where some things can get more CPU time just by how they behave. Glad people are starting to come around on that point. So, on to something productive: we have 3 candidates for a new scheduler so far. How do we decide which way to go? (And yes, I still think switchable schedulers are wrong and a cop-out.) This is one area where it is virtually impossible to discount any decent design on correctness/performance/etc., and even testing in -mm isn't really enough. 
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 04:31:54PM -0500, Matt Mackall wrote: On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: 4) the good thing that happened to I/O, after years of stagnation, isn't I/O schedulers. The good thing that happened to I/O is called Jens Axboe. If you care about the I/O subsystem then print that name out and hang it on the wall. That and only that is what mattered. Disagree. Things didn't actually get interesting until Nick showed up with AS and got it in-tree to demonstrate the huge amount of room we had for improvement. It took several iterations of AS and CFQ (with a couple of complete rewrites) before CFQ began to look like the winner. The resulting time-sliced CFQ was fairly heavily influenced by the ideas in AS. Well, to be fair, Jens had just implemented deadline, which got me interested ;) Actually, I would still like to be able to deprecate deadline in favour of AS, because AS has a tunable you can use to turn off read anticipation and revert to deadline behaviour (or very close to it). It would have been nice if CFQ were then a layer on top of AS that implemented priorities (or vice versa). Then AS could be deprecated and we'd be back to 1 primary scheduler. Well, CFQ seems to be going in the right direction with that, however some large users still find AS faster for some reason... Anyway, the moral of the story is that I think it would have been nice if we hadn't proliferated IO schedulers; however, in practice it isn't easy to just layer features on top of each other, and keeping deadline also helped a lot in debugging and examining performance regressions and actually getting code upstream. And this was true even when it was only globally switchable at boot time. I'd prefer if we kept a single CPU scheduler in mainline, because I think that simplifies analysis and focuses testing. I think we can have one that is good enough for everyone. 
But if the only other option for progress is that Linus or Andrew just pull one out of a hat, then I would rather merge all of them. Yes, I think Con's scheduler should get a fair go, ditto for Ingo's, mine, and anyone else's. nor was the non-modularity of some piece of code ever an impediment to competition. May i remind you of the pretty competitive SLAB allocator landscape, resulting in things like the SLOB allocator, written by yourself? ;-) Thankfully no one came out and said "we don't want to balkanize the allocator landscape" when I submitted it, or I probably would have just dropped it, rather than painfully dragging it along out of tree for years. I'm not nearly the glutton for punishment that Con is. :-P I don't think this is a fault of the people or the code involved. We just didn't have much collective drive to replace the scheduler, and even less an idea of how to decide between any two of them. I've kept nicksched around since 2003 or so and no hard feelings ;)
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
William Lee Irwin III wrote: One of the reasons I never posted my own code is that it never met its own design goals, which absolutely included switching on the fly. I think Peter Williams may have done something about that. It was my hope to be able to do insmod sched_foo.ko until it became clear that the effort it was intended to assist wasn't going to get even the limited hardware access required, at which point I largely stopped working on it. On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: I didn't but some students did. In a previous life, I did implement a runtime configurable CPU scheduling mechanism (implemented on Tru64, Solaris and Linux) that allowed schedulers to be loaded as modules at run time. This was released commercially on Tru64 and Solaris. So I know that it can be done. I have thought about doing something similar for the SPA schedulers which differ in only small ways from each other but lack motivation. Driver models for scheduling are not so far out. AFAICS it's largely a tug-of-war over design goals, e.g. maintaining per-cpu runqueues and switching out intra-queue policies vs. switching out whole-system policies, SMP handling and all. Whether this involves load balancing depends strongly on e.g. whether you have per-cpu runqueues. A 2.4.x scheduler module, for instance, would not have a load balancer at all, as it has only one global runqueue. There are other sorts of policies wanting significant changes to SMP handling vs. the stock load balancing. William Lee Irwin III wrote: I'm not sure what happened there. It wasn't a big enough patch to take hits in this area due to getting overwhelmed by the programming burden like some other efforts of mine. Maybe things started getting ugly once on-the-fly switching entered the picture. My guess is that Peter Williams will have to chime in here, since things have diverged enough from my one-time contribution 4 years ago. 
On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: From my POV, the current version of plugsched is considerably simpler than it was when I took the code over from Con as I put considerable effort into minimizing code overlap in the various schedulers. I also put considerable effort into minimizing any changes to the load balancing code (something Ingo seems to think is a deficiency) and the result is that plugsched allows intra run queue scheduling to be easily modified WITHOUT affecting load balancing. To my mind scheduling and load balancing are orthogonal and keeping them that way simplifies things. ISTR rearranging things for Con in such a fashion that it no longer worked out of the box (though that wasn't the intention; restructuring it to be more suited to his purposes was) and that's what he worked off of afterward. I don't remember very well what changed there, as I clearly invested less effort than in the prior versions. Now that I think of it, that may have been where the sample policy demonstrating scheduling classes was lost. On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: As Ingo correctly points out, plugsched does not allow different schedulers to be used per CPU but it would not be difficult to modify it so that they could. Although I've considered doing this over the years I decided not to as it would just increase the complexity and the amount of work required to keep the patch set going. About six months ago I decided to reduce the amount of work I was doing on plugsched (as it was obviously never going to be accepted) and now only publish patches against the vanilla kernel's major releases (and the only reason that I kept doing that is that the download figures indicated that about 80 users were interested in the experiment). That's a rather different goal from what I was going on about with it, so it's all diverged quite a bit. 
Where I had a significant need for mucking with the entire concept of how SMP was handled, this is rather different. At this point I'm questioning the relevance of my own work, though it was already relatively marginal as it started life as an attempt at a sort of debug patch to help gang scheduling (which is in itself a rather marginally relevant feature to most users) code along. On Mon, Apr 16, 2007 at 11:06:56AM +1000, Peter Williams wrote: PS I no longer read LKML (due to time constraints) and would appreciate it if I could be CC'd on any e-mails suggesting scheduler changes. PPS I'm just happy to see that Ingo has finally accepted that the vanilla scheduler was badly in need of fixing and don't really care who fixes it. PPS Different schedulers for different aims (i.e. server or work station) do make a difference. E.g. the spa_svr scheduler in plugsched does about 1% better on kernbench than the next best scheduler in the bunch. PPPS Con, fairness isn't always best as humans aren't very altruistic and we need to give unfair
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 12:28, Nick Piggin wrote: So, on to something productive, we have 3 candidates for a new scheduler so far. How do we decide which way to go? (and yes, I still think switchable schedulers is wrong and a copout) This is one area where it is virtually impossible to discount any decent design on correctness/performance/etc. and even testing in -mm isn't really enough. We're in agreement! YAY! Actually this is simpler than that. I'm taking SD out of the picture. It has served its purpose of proving that we need to seriously address all the scheduling issues, and did more than a half decent job at it. Unfortunately I also cannot sit around supporting it forever by myself. My own life is more important, so consider SD not even running the race any more. I'm off to continue maintaining permanently-out-of-tree leisurely code at my own pace. What's more, I think I'll just stick to staircase Gen I version blah, shelve SD, and try to have fond memories of SD as an intellectual prompting exercise only. -- -ck
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, Apr 16, 2007 at 01:15:27PM +1000, Con Kolivas wrote: On Monday 16 April 2007 12:28, Nick Piggin wrote: So, on to something productive, we have 3 candidates for a new scheduler so far. How do we decide which way to go? (and yes, I still think switchable schedulers is wrong and a copout) This is one area where it is virtually impossible to discount any decent design on correctness/performance/etc. and even testing in -mm isn't really enough. We're in agreement! YAY! Actually this is simpler than that. I'm taking SD out of the picture. It has served its purpose of proving that we need to seriously address all the scheduling issues and did more than a half decent job at it. Unfortunately I also cannot sit around supporting it forever by myself. My own life is more important, so consider SD not even running the race any more. I'm off to continue maintaining permanent-out-of-tree leisurely code at my own pace. What's more is, I think I'll just stick to staircase Gen I version blah and shelve SD and try to have fond memories of SD as an intellectual prompting exercise only. Well, I would hope that _if_ we decide to switch schedulers, then you get a chance to field something (and I hope you will decide to and have time to), and I hope we don't rush into the decision. We've had the current scheduler for so many years now that it is much more important to make sure we take the time to do the right thing rather than absolutely have to merge a new scheduler right now ;)
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 01:05, Ingo Molnar wrote: * Con Kolivas [EMAIL PROTECTED] wrote: 2. Since then I've been thinking/working on a cpu scheduler design that takes all the guesswork out of scheduling and gives very predictable, as fair as possible, cpu distribution and latency while preserving as solid interactivity as possible within those confines. yeah. I think you were right on target with this call. Yay, thank goodness :) It's time to fix the damn cpu scheduler once and for all. Everyone uses this; it's no minor driver or $bigsmp or $bigram or $small_embedded_RT_hardware feature. I've applied the sched.c change attached at the bottom of this mail to the CFS patch, if you don't mind. (or feel free to suggest some other text instead.) * 2003-09-03 Interactivity tuning by Con Kolivas. * 2004-04-02 Scheduler domains code by Nick Piggin + * 2007-04-15 Con Kolivas was dead right: fairness matters! :) LOL, that's awful. I'd prefer something meaningful like "Work begun on replacing all interactivity tuning with a fair virtual-deadline design by Con Kolivas". While you're at it, it's worth getting rid of a few slightly pointless name changes too. Don't rename SCHED_NORMAL yet again, and don't call all your things sched_fair, blah_fair, __blah_fair and so on. It means that anything else is by proxy going to be considered unfair. Leave SCHED_NORMAL as is, and replace the use of the word _fair with _cfs. I don't really care how many copyright notices you put into our already noisy bootup, but it's redundant since there is no choice; we all get the same cpu scheduler. 1. I tried in vain some time ago to push a working extensible pluggable cpu scheduler framework (based on wli's work) for the linux kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he didn't like it) as being absolutely the wrong approach and that we should never do that. [...] 
i partially replied to that point to Will already, and i'd like to make it clear again: yes, i rejected plugsched 2-3 years ago (which already drifted away from wli's original codebase) and i would still reject it today.

No, that was just me being flabbergasted by what appeared to be you posting your own plugsched. Note that nowhere in the 40 iterations of rsdl-sd did I ask for or suggest plugsched. I said in my first announcement my aim was to create a scheduling policy robust enough for all situations rather than fantastic a lot of the time and awful sometimes. There are plenty of people ready to throw out arguments for plugsched now, and I don't have the energy to continue that fight (I never did really). But my question still stands about this comment:

case, all of SD's logic could be added via a kernel/sched_sd.c module as well, if Con is interested in such an approach. ]

What exactly would be the purpose of such a module that governs nothing in particular? Since there'll be no pluggable scheduler by your admission, it has no control over SCHED_NORMAL, and it would require another scheduling policy for it to govern, which there is no express way to use at the moment; people tend to just use the default without great effort.

First and foremost, please dont take such rejections too personally - i had my own share of rejections (and in fact, as i mentioned it in a previous mail, i had a fair number of complete project throwaways: 4g:4g, in-kernel Tux, irqrate and many others). I know that they can hurt and can demoralize, but if i dont like something it's my job to tell that.

Hmm? No, that's not what this is about. Remember dynticks, which was not originally my code, but which I tried to bring up to mainline standard and fought with for months? You came along with yet another rewrite from scratch, and the flaws in the design I was working with were obvious, so I instantly bowed down to that and never touched my code again. 
I didn't ask for credit back then, but obviously brought the requirement for a no-idle-tick implementation to the table.

My view about plugsched: first please take a look at the latest plugsched code: http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch 26 files changed, 8951 insertions(+), 1495 deletions(-) As an experiment i've removed all the add-on schedulers (both the core and the include files, only kept the vanilla one) from the plugsched patch (and the makefile and kconfig complications, etc), to see the 'infrastructure cost', and it still gave: 12 files changed, 1933 insertions(+), 1479 deletions(-)

I do not see extra code per se as being a bad thing. I've heard it said a few times before: 'ever notice how, when the correct solution is done, it is a lot more code than the quick hack that ultimately fails?' Insert long-winded discussion of 'perfect is the enemy of good' here, _but_ I'm not arguing perfect versus good; I'm talking about solid code versus a quick fix. Again, none of this comment is directed
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 09:25:07AM -0700, Arjan van de Ven wrote: Now this doesn't mean that people shouldn't be nice to each other, not cooperate or steal credits, but I don't get the impression that that is happening here. Ingo is taking part in the discussion with a counter-proposal for discussion *on the mailing list*. What more do you want??

Con should have been CCed from the first moment this was put into motion, to limit the perception of exclusion. That was mistake number one, and a big-time failure to understand this dynamic. After all, it was Con's idea. Why the hell he was excluded from Ingo's development process is baffling to me and him (most likely). He put in a lot of effort into SD, and his experiences with scheduling should still be seriously considered in this development process even if he doesn't write a single line of code from this moment on.

What should have happened is that our very busy associate at RH by the name of Ingo Molnar should have leveraged more of Con's and Bill's work and used them as a proxy for his own ideas. They would have loved to have contributed more, and our very busy Ingo Molnar would have gotten a lot of his work and ideas implemented without even opening a single source file for editing. They would have happily done this work for Ingo. Ingo could have been used for something else more important, like making KVM less of a freaking ugly hack, and we all would have benefitted from this. He could have been working on SystemTap so that you stop losing accounts to Sun and Solaris 10's DTrace. He could have been working with Riel to fix your butt-ugly page scanning problem causing horrible contention via the Clock/Pro algorithm, etc... He could have been fixing the ugly futex rwsem mapping problem that's killing -rt and anything that uses Posix threads. He could have created a userspace thread control block (TCB) with Mr. 
Drepper so that we can turn off preemption in userspace (userspace per-CPU local storage) and implement a very quick non-kernel-crossing implementation of priority ceilings (userspace check for priority and flags at preempt_schedule() in the TCB) so that our -rt Posix API doesn't suck donkey shit... Need I say more?

As programmers like Ingo get spread more thinly, he needs super-smart folks like Bill Irwin and Con to help him out, and must learn to resist NIHing folks' stuff out of some weird fear. When this happens, folks like Ingo must learn to facilitate development in addition to implementing it with those kinds of folks. It takes time and practice to entrust folks to do things for him. Ingo is the best method of getting new Linux kernel ideas and communicating them to Linus. His value goes beyond just code, and he is often the biggest hammer we have in the Linux community to get stuff into the kernel. Facilitation of others is something that solo programmers must learn when groups like the Linux kernel get larger and larger every year. Understand? Are we in embarrassing agreement here?

bill
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Mon, 16 Apr 2007, Pavel Pisa wrote: I cannot stop myself from reporting results with the GAVL tree algorithm as another race competitor. I believe that it is a better solution for large priority queues than the RB-tree and even heap trees. On the other hand, it is disputable whether the scheduler needs such scalability. The AVL heritage guarantees a lower height, which results in shorter search times that could be profitable for other uses in the kernel. The GAVL algorithm is AVL-tree based, so it does not suffer from the finite priority granularity that TR does. It allows use in a generalized case where the tree is not fully balanced, which makes it possible to cut the first item without rebalancing. This degrades the tree by at most one more level than a non-degraded AVL tree gives, which is still considerably better than an RB-tree's maximum. http://cmp.felk.cvut.cz/~pisa/linux/smart-queue-v-gavl.c

Here are the results on my Opteron 252 (cycles/loop):

   N     gavl_cfs      CFS       TR      CFS   gavl_cfs
   1       187.20   194.16   314.87   194.15     187.15
   2       268.94   305.53   313.78   289.58     266.02
   4       452.13   518.81   311.54   516.23     450.73
   8       609.29   644.65   308.11   667.01     592.89
   16      686.30   807.41   317.20   810.24     688.42
   32      756.57   852.14   301.22   876.12     758.46
   64      831.97   997.16   304.74  1003.26     832.83
   128     897.33  1030.36   295.65  1035.29     892.51
   256     963.17  1146.04   295.35  1162.04     966.31
   512    1029.82  1218.34   288.78  1257.97    1029.83
   1024   1091.76  1318.47   287.74  1311.72    1093.29
   2048   1153.03  1398.84   286.75  1438.68    1149.97

There seems to be some difference from your numbers. This is with gcc version 4.1.2 and -O2. But then, an Opteron can behave quite differently from a Duron on a bench like this ;)

- Davide
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007, Con Kolivas wrote: And I snipped, Sorry fellas.

Con's original submission was to me, quite an improvement. But I have to say it, and no denigration of your efforts is intended Con, but you did 'pull the trigger' and get this thing rolling by scratching the itch and drawing attention to an ugly lack of user interactivity that had crept into the 2.6 family. So from me to Con, a tip of the hat, and a deep bow in your direction, thank you. Now, you have done what you aimed to do, so please get well.

I've now been through most of an amanda session using Ingo's CFS and I have to say that it is another improvement over your 0.40 that is just as obvious as your first patch was against the stock scheduler. No other scheduler yet has allowed the full utilization of the cpu, and maintained user interactivity as well as this one has; my cpu is running about 5 degrees F hotter just from this effect alone. gzip, if the rest of the system is in between tasks, is consistently showing around 95%, but let anything else stick up its hand, like procmail etc, and gzip now dutifully steps aside, dropping into the 40% range until procmail and spamd are done, at which point there is no rest for the wicked and the cpu never gets a chance to cool. There was, just now, a pause of about 2 seconds, while amanda moved a tarball from the holding disk area on /dev/hda to the vtapes disk on /dev/hdd, so that would have been an I/O bound situation.

This one Ingo, even without any other patches (and I think I did see one go by in this thread which I didn't apply), is a definite keeper. Sweet even.

-- Cheers, Gene

There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) A word to the wise is enough. 
-- Miguel de Cervantes
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 13:27 +1000, Con Kolivas wrote: On Saturday 14 April 2007 06:21, Ingo Molnar wrote: [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] i'm pleased to announce the first release of the Modular Scheduler Core and Completely Fair Scheduler [CFS] patchset: http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch This project is a complete rewrite of the Linux task scheduler. My goal is to address various feature requests and to fix deficiencies in the vanilla scheduler that were suggested/found in the past few years, both for desktop scheduling and for server scheduling workloads.

The casual observer will be completely confused by what on earth has happened here so let me try to demystify things for them. [...]

Demystify what? The casual observer need only read either your attempt at writing a scheduler, or my attempts at fixing the one we have, to see that it was high time for someone with the necessary skills to step in. Now progress can happen, which was _not_ happening before.

-Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote: Well, I'll stop heating the room for now as I get out of ideas about how to defeat it. I'm convinced. I'm impatient to read about Mike's feedback with his workload which behaves strangely on RSDL. If it works OK here, it will be the proof that heuristics should not be needed.

You mean the X + mp3 player + audio visualization test? X+Gforce visualization have problems getting half of my box in the presence of two other heavy cpu using tasks. Behavior is _much_ better than RSDL/SD, but the synchronous nature of X/client seems to be a problem. With this scheduler, renicing X/client does cure it, whereas with SD it did not help one bit. (I know a trivial way to cure that, and this framework makes that possible without dorking up fairness as a general policy.)

-Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote: [...] Demystify what? The casual observer need only read either your attempt

Here's the problem. You're a casual observer and obviously not paying attention.

at writing a scheduler, or my attempts at fixing the one we have, to see that it was high time for someone with the necessary skills to step in. Now progress can happen, which was _not_ happening before.

I think that's inaccurate, and there are plenty of folks that have that technical skill and background. The scheduler code isn't a deep mystery and there are plenty of good kernel hackers out here across many communities. Ingo isn't the only person on this planet to have deep scheduler knowledge. Priority heaps are not new, and Solaris has had a pluggable scheduler framework for years. Con's characterization is something that I'm more prone to believe about how Linux kernel development works versus your view.

I think it's a great shame to have folks like Bill Irwin and Con waste time trying to do something right only to have their ideas attacked, then copied and held up as the solution to this kind of technical problem, in a complete reversal of technical opinion when it suits the moment. This is just wrong in so many ways. It outlines the problems with Linux kernel development and questionable elitism regarding ownership of certain sections of the kernel code. I call it churn squatting, and instances like this only support that view, which I would rather be completely wrong and inaccurate instead.

bill
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 01:36 -0700, Bill Huey wrote: On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote: [...] Demystify what? The casual observer need only read either your attempt Here's the problem. You're a casual observer and obviously not paying attention. at writing a scheduler, or my attempts at fixing the one we have, to see that it was high time for someone with the necessary skills to step in. Now progress can happen, which was _not_ happening before. I think that's inaccurate and there are plenty of folks that have that technical skill and background. The scheduler code isn't a deep mystery and there are plenty of good kernel hackers out here across many communities. Ingo isn't the only person on this planet to have deep scheduler knowledge. Ok shrug, I'm not paying attention, and you can't read. We're even. Have a nice life. -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Bill Huey [EMAIL PROTECTED] wrote: Hello folks, I think the main failure I see here is that Con wasn't included in this design or privately in review process. There could have been better co-ownership of the code. This could also have been done openly on lkml [...] Bill, you come from a BSD background and you are still relatively new to Linux development, so i dont at all fault you for misunderstanding this situation, and fortunately i have a really easy resolution for your worries: i did exactly that! :) i wrote the first line of code of the CFS patch this week, 8am Wednesday morning, and released it to lkml 62 hours later, 22pm on Friday. (I've listed the file timestamps of my backup patches further below, for all the fine details.) I prefer such early releases to lkml _alot_ more than any private review process. I released the CFS code about 6 hours after i thought okay, this looks pretty good and i spent those final 6 hours on testing it (making sure it doesnt blow up on your box, etc.), in the final 2 hours i showed it to two folks i could reach on IRC (Arjan and Thomas) and on various finishing touches. It doesnt get much faster than that and i definitely didnt want to sit on it even one day longer because i very much thought that Con and others should definitely see this work! And i very much credited (and still credit) Con for the whole fairness angle: || i'd like to give credit to Con Kolivas for the general approach here: || he has proven via RSDL/SD that 'fair scheduling' is possible and that || it results in better desktop scheduling. Kudos Con! the 'design consultation' phase you are talking about is _NOW_! :) I got the v1 code out to Con, to Mike and to many others ASAP. That's how you are able to comment on this thread and be part of the development process to begin with, in a 'private consultation' setup you'd not have had any opportunity to see _any_ of this. 
In the BSD space there seem to be more 'political' mechanisms for development, but Linux is truly about doing things out in the open, and doing it immediately. Okay? ;-)

Here's the timestamps of all my backups of the patch, from its humble 4K beginnings to the 100K first-cut v1 result:

-rw-rw-r-- 1 mingo mingo 4230 Apr 11 08:47 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 7653 Apr 11 09:12 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 7728 Apr 11 09:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 14416 Apr 11 10:08 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 24211 Apr 11 10:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 27878 Apr 11 10:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 33807 Apr 11 11:05 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 34524 Apr 11 11:09 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 39650 Apr 11 11:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40231 Apr 11 11:34 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40627 Apr 11 11:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 40638 Apr 11 11:54 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42733 Apr 11 12:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 42817 Apr 11 12:31 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43270 Apr 11 12:41 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 43531 Apr 11 12:48 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 44331 Apr 11 12:51 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45173 Apr 11 12:56 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45288 Apr 11 12:59 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45368 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45370 Apr 11 13:06 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45815 Apr 11 13:14 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45887 Apr 11 13:19 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45914 Apr 11 13:25 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 45850 Apr 11 13:29 
patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 49196 Apr 11 13:39 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64317 Apr 11 13:45 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 64403 Apr 11 13:52 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:03 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 65199 Apr 11 14:07 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 68995 Apr 11 14:50 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 69919 Apr 11 15:23 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71065 Apr 11 16:26 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 70642 Apr 11 16:28 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 72334 Apr 11 16:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71624 Apr 11 17:01 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 71854 Apr 11 17:20 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 73571 Apr 11 17:42 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:49 patches/sched-fair.patch
-rw-rw-r-- 1 mingo mingo 74708 Apr 11 17:51
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Mike Galbraith [EMAIL PROTECTED] wrote: On Sat, 2007-04-14 at 15:01 +0200, Willy Tarreau wrote: Well, I'll stop heating the room for now as I get out of ideas about how to defeat it. I'm convinced. I'm impatient to read about Mike's feedback with his workload which behaves strangely on RSDL. If it works OK here, it will be the proof that heuristics should not be needed. You mean the X + mp3 player + audio visualization test? X+Gforce visualization have problems getting half of my box in the presence of two other heavy cpu using tasks. Behavior is _much_ better than RSDL/SD, but the synchronous nature of X/client seems to be a problem. With this scheduler, renicing X/client does cure it, whereas with SD it did not help one bit. [...] thanks for testing it! I was quite worried about your setup - two tasks using up 50%/50% of CPU time, pitted against a kernel rebuild workload seems to be a hard workload to get right. [...] (I know a trivial way to cure that, and this framework makes that possible without dorking up fairness as a general policy.) great! Please send patches so i can add them (once you are happy with the solution) - i think your workload isnt special in any way and could hit other people too. Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Bill Huey [EMAIL PROTECTED] wrote: On Sun, Apr 15, 2007 at 08:43:04AM +0200, Mike Galbraith wrote: [...] Demystify what? The casual observer need only read either your attempt Here's the problem. You're a casual observer and obviously not paying attention. guys, please calm down. Judging by the number of contributions to sched.c the main folks who are not 'observers' here and who thus have an unalienable right to be involved in a nasty flamewar about scheduler interactivity are Con, Mike, Nick and me ;-) Everyone else is just a happy bystander, ok? ;-) Ingo
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 10:58 +0200, Ingo Molnar wrote: * Mike Galbraith [EMAIL PROTECTED] wrote: [...] (I know a trivial way to cure that, and this framework makes that possible without dorking up fairness as a general policy.) great! Please send patches so i can add them (once you are happy with the solution) - i think your workload isnt special in any way and could hit other people too. I'll give it a shot. (have to read and actually understand your new code first though, then see if it's really viable) -Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 10:44:47AM +0200, Ingo Molnar wrote: I prefer such early releases to lkml _alot_ more than any private review process. I released the CFS code about 6 hours after i thought okay, this looks pretty good and i spent those final 6 hours on testing it (making sure it doesnt blow up on your box, etc.), in the final 2 hours i showed it to two folks i could reach on IRC (Arjan and Thomas) and on various finishing touches. It doesnt get much faster than that and i definitely didnt want to sit on it even one day longer because i very much thought that Con and others should definitely see this work! And i very much credited (and still credit) Con for the whole fairness angle: || i'd like to give credit to Con Kolivas for the general approach here: || he has proven via RSDL/SD that 'fair scheduling' is possible and that || it results in better desktop scheduling. Kudos Con! the 'design consultation' phase you are talking about is _NOW_! :) I got the v1 code out to Con, to Mike and to many others ASAP. That's how you are able to comment on this thread and be part of the development process to begin with, in a 'private consultation' setup you'd not have had any opportunity to see _any_ of this. In the BSD space there seem to be more 'political' mechanisms for development, but Linux is truly about doing things out in the open, and doing it immediately.

I can't even begin to talk about how screwed up BSD development is. Maybe another time privately. Ok, Linux development and inclusiveness can be improved. I'm not trying to call you out (slang for accusing you with the sole intention of calling you crazy in a highly confrontational manner). This is discussed publicly here to bring this issue to light and open a communication channel as a means to resolve it. Okay? ;-) It's cool. We're still getting to know each other professionally, and it's okay to a certain degree to have a communication disconnect, but only as long as it clears. Your productivity is amazing BTW. 
But here's the problem: there's this perception that NIH is the default mentality here in Linux. Con feels that this kind of action is intentional and has a malicious quality to it, as a means of churn-squatting sections of the kernel tree. The perception here is that there is this expectation that sections of the Linux kernel are intentionally churn-squatted to prevent any ideas other than those of the owner of that subsystem (VM, scheduling, etc...) from creeping in, because of lack of modularity in the kernel. This isn't an API question but possibly a question of general code quality and how maintenance of it can happen. This was predicted by folks, and then this perception was *realized* when you wrote the equivalent kind of code that has technical overlap with SD (this is just one dry example).

To a person that is writing new code for Linux, having one of the old guard write equivalent code to that of a newcomer has the effect of displacing that person, both with regards to the code and the responsibility that goes with it. When this happens over and over again and folks get annoyed by it, it starts seeming that Linux development is elitist. I know this because I heard (read) Con's IRC chats about these matters all of the time. This is not just his view but the view of other kernel folks as well. The closing talk at OLS 2006 was highly disturbing in many ways. It went 'Christoph is right, everybody else is wrong', which sends a highly negative message to new kernel developers that, say, don't work for RH directly or any of the other mainstream Linux companies. After a while, it starts seeming like this kind of behavior is completely intentional and that Linux is full of arrogant bastards.

What I would have done here was to contact Peter Williams, Bill Irwin and Con about what you're doing and reach a common consensus about how to create something that would be inclusive of all of their ideas. 
Discussions can get technically heated, but that's ok; the discussion is happening, and it brings down the wall of this perception. Bill and Con are on oftc.net/#offtopic2. Riel is there as well as Peter Zijlstra. It might be very useful, it might not be. Folks are all stubborn about their ideas and hold on to them for dear life. Effective leaders can deconstruct this hostility and animosity. I don't claim to be one.

Because of past hostility to something like plugsched, the hostility and terseness of responses can be perceived simply as 'I'm right, you're wrong', which is condescending. This affects discussion and outright destroys a constructive process if it happens continually, since it reinforces that view of 'You're an outsider, we don't care about you.' Nobody is listening to each other at that point, and folks get pissed. Then they think 'I'm going to NIH this person with patch X because he/she did the same here', which is dysfunctional. Oddly enough, sometimes you're the best person
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote: The perception here is that there is this expectation that sections of the Linux kernel are intentionally churn-squatted to prevent any ideas from creeping in other than those of the owner of that subsystem

Strangely enough, my perception is that Ingo is simply trying to address the issues Mike's testing discovered in RSDL and SD. It's not surprising Ingo made it a separate patch set, as Con has repeatedly stated that the problems are in fact by design and won't be fixed.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Fri, 13 Apr 2007, Ingo Molnar wrote: [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS] i'm pleased to announce the first release of the Modular Scheduler Core and Completely Fair Scheduler [CFS] patchset: http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch This project is a complete rewrite of the Linux task scheduler. My goal is to address various feature requests and to fix deficiencies in the vanilla scheduler that were suggested/found in the past few years, both for desktop scheduling and for server scheduling workloads. [...]

I took a brief look at it. Have you tested priority inheritance? As far as I can see, rt_mutex_setprio doesn't have much effect on SCHED_FAIR/SCHED_BATCH. I am looking for the place where such a task changes scheduling class when boosted in rt_mutex_setprio().

Esben
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 01:39:27PM +0300, Pekka Enberg wrote: On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote: The perception here is that there is this expectation that sections of the Linux kernel are intentionally churn-squatted to prevent any other ideas from creeping in other than those of the owner of that subsystem. Strangely enough, my perception is that Ingo is simply trying to address the issues Mike's testing discovered in RSDL and SD. It's not surprising Ingo made it a separate patch set, as Con has repeatedly stated that the problems are in fact by design and won't be fixed. That's not exactly the problem. There are people who work very hard to try to improve some areas of the kernel. They progress slowly, and acquire more and more skills. Sometimes they feel like they need to change some concepts, and propose those changes which are required for them to go further, or to develop faster. Those are rejected. So they are constrained to work in a delimited perimeter from which it is difficult for them to escape. Then, the same person who rejected their changes comes up with something shiny new, better, and which took him far less time. But he sort of broke the rules, because what was forbidden to the first persons is suddenly permitted. Maybe for very good reasons, I'm not discussing that. But the good reason should have been valid the first time too. The fact is that when changes are rejected, we should not simply say no, but explain why and define what would be acceptable. Some people here have excellent teaching skills for this, but most others do not. Anyway, the rules should be the same for everybody. Also, there is what can be perceived as marketing here. Con worked on his idea with conviction, he took time to write some generous documentation, but he hit a wall where his concept was suboptimal on a given workload. But at least, all the work was oriented on a technical basis: design + code + doc.
Then, Ingo comes in with something looking amazingly better, with virtually no documentation, an appealing announcement, and shiny advertising at boot. All this implemented without the constraints other people had to respect. It already looks like definitive work which will be merged as-is without many changes except a few bugfixes. If those were two companies, the first one would simply have accused the second one of not having respected contracts and of having employed heavy marketing to take the first place. People here do not code for a living, they do it at least because they believe in what they are doing, and some of them want a bit of gratitude for their work. I've met people who were proud to say they implemented this or that feature in the kernel, so it is something important for them. And being cited in an email is nothing compared to advertising at boot time. When the discussion was blocked between Con and Mike concerning the design problems, that is where a new discussion should have taken place. Ingo could have publicly spoken with them about his ideas of killing the O(1) scheduler and replacing it with an rbtree-based one, and of using part of Bill's work to speed up development. It is far easier to resign yourself when people explain what concepts are wrong and how they intend to proceed than when they suddenly present something out of nowhere which is already better. And it's not specific to Ingo (though I think his ability to work that fast alone makes him tend to practise this more often than others). Imagine if Con had worked another full week on his scheduler with better results on Mike's workload, but still not as good as Ingo's, and they both published at the same time. You certainly can imagine he would have preferred to be informed first that it was pointless to continue in that direction. Now I hope he and Bill will get over this and agree to work on improving this scheduler, because I really find it smarter than a dumb O(1).
I even agree with Mike that we now have a solid basis for future work. But for this, maybe a good starting point would be to remove the selfish printk at boot, revert useless changes (the SCHED_NORMAL -> SCHED_FAIR rename comes to mind) and improve the documentation a bit, so that people can work together on the new design without feeling like their work will only serve to promote X or Y. Regards, Willy
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Esben Nielsen [EMAIL PROTECTED] wrote: I took a brief look at it. Have you tested priority inheritance? yeah, you are right, it's broken at the moment, i'll fix it. But the good news is that i think PI could become cleaner via scheduling classes. As far as I can see rt_mutex_setprio doesn't have much effect on SCHED_FAIR/SCHED_BATCH. I am looking for a place where such a task changes scheduling class when boosted in rt_mutex_setprio(). i think via scheduling classes we dont have to do the p->policy and p->prio based gymnastics anymore, we can just have a clean look at p->sched_class and stack the original scheduling class into p->real_sched_class. It would probably also make sense to 'privatize' p->prio into the scheduling class. That way PI would be a pure property of sched_rt, and the PI scheduler would be driven purely by p->rt_priority, not by p->prio. That way all the normal_prio() kind of complications and interactions with SCHED_OTHER/SCHED_FAIR would be eliminated as well. What do you think? Ingo
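The class-stacking idea Ingo sketches here can be modelled in a few lines of user-space C. This is an illustrative toy, not the kernel code: the structure and function names mimic the mail's vocabulary (sched_class, real_sched_class, rt_mutex_setprio), but their shapes are assumptions for the sketch, not the actual API.

```c
#include <stddef.h>

/* Toy model of the proposal: on a priority-inheritance boost, stack the
 * task's original scheduling class into ->real_sched_class and switch
 * ->sched_class to the RT class; on de-boost, restore the stacked class.
 * All names here are illustrative, not the kernel's real definitions. */

struct sched_class { const char *name; };

const struct sched_class fair_sched_class = { "fair" };
const struct sched_class rt_sched_class   = { "rt" };

struct task {
    const struct sched_class *sched_class;      /* class in effect now     */
    const struct sched_class *real_sched_class; /* saved across a PI boost */
    int rt_priority;                            /* drives the RT scheduler */
};

/* Boost: remember the original class, run as RT at the donor's priority. */
void rt_mutex_setprio(struct task *p, int rt_prio)
{
    if (p->sched_class != &rt_sched_class) {
        p->real_sched_class = p->sched_class;
        p->sched_class = &rt_sched_class;
    }
    p->rt_priority = rt_prio;
}

/* De-boost: drop back to the stacked original class. */
void rt_mutex_restore_prio(struct task *p)
{
    if (p->real_sched_class) {
        p->sched_class = p->real_sched_class;
        p->real_sched_class = NULL;
    }
    p->rt_priority = 0;
}
```

The point of the scheme is that the fair classes never need to know about PI at all: the boost is purely a class switch plus an rt_priority, which is what eliminates the normal_prio() interactions Ingo mentions.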
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 15 Apr 2007, Willy Tarreau wrote: Ingo could have publicly spoken with them about his ideas of killing the O(1) scheduler and replacing it with an rbtree-based one, and using part of Bill's work to speed up development. He did exactly that and he did it with a patch. Nothing new here. This is how development on LKML proceeds when you have two or more competing designs. There's absolutely no need to get upset or hurt your feelings over it. It's not malicious, it's how we do Linux development.
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Con Kolivas [EMAIL PROTECTED] wrote: [ i'm quoting this bit out of order: ] 2. Since then I've been thinking/working on a cpu scheduler design that takes all the guesswork out of scheduling and gives very predictable, as fair as possible, cpu distribution and latency while preserving as solid interactivity as possible within those confines. yeah. I think you were right on target with this call. I've applied the sched.c change attached at the bottom of this mail to the CFS patch, if you dont mind. (or feel free to suggest some other text instead.) 1. I tried in vain some time ago to push a working extensible pluggable cpu scheduler framework (based on wli's work) for the linux kernel. It was perma-vetoed by Linus and Ingo (and Nick also said he didn't like it) as being absolutely the wrong approach and that we should never do that. [...] i partially replied to that point to Will already, and i'd like to make it clear again: yes, i rejected plugsched 2-3 years ago (which already drifted away from wli's original codebase) and i would still reject it today. First and foremost, please dont take such rejections too personally - i had my own share of rejections (and in fact, as i mentioned it in a previous mail, i had a fair number of complete project throwaways: 4g:4g, in-kernel Tux, irqrate and many others). I know that they can hurt and can demoralize, but if i dont like something it's my job to say so. Can i sum up your argument as: you rejected plugsched, but then why on earth did you modularize portions of the scheduler in CFS? Isnt your position thus woefully inconsistent? (i'm sure you would never put it this impolitely though, but i guess i can flame myself with impunity ;) While having an inconsistent position isnt a terminal sin in itself, please realize that the scheduler classes code in CFS is quite different from plugsched: it was a result of what i saw to be technological pressure for _internal modularization_.
(This internal/policy modularization aspect is something that Will said was present in his original plugsched code, but which aspect i didnt see in the plugsched patches that i reviewed.) That possibility never even occurred to me until 3 days ago. You never raised it either AFAIK. No patches to simplify the scheduler that way were ever sent. Plugsched doesnt even touch the core load-balancer for example, and most of the time i spent with the modularization was to get the load-balancing details right. So it's really apples to oranges. My view about plugsched: first please take a look at the latest plugsched code: http://downloads.sourceforge.net/cpuse/plugsched-6.5-for-2.6.20.patch 26 files changed, 8951 insertions(+), 1495 deletions(-) As an experiment i've removed all the add-on schedulers (both the core and the include files, only kept the vanilla one) from the plugsched patch (and the makefile and kconfig complications, etc), to see the 'infrastructure cost', and it still gave: 12 files changed, 1933 insertions(+), 1479 deletions(-) that's the extra complication i didnt like 3 years ago and which i still dont like today. What the current plugsched code does is that it simplifies the adding of new experimental schedulers, but it doesnt really do what i wanted: to simplify the _scheduler itself_. Personally i'm still not primarily interested in having a large selection of schedulers, i'm mainly interested in a good and maintainable scheduler that works for people. so the rejection was on these grounds, and i still very much stand by that position here and today: i didnt want to see the Linux scheduler landscape balkanized and i saw no technological reasons for the complication that external modularization brings. the new scheduling classes code in the CFS patch was not a result of oh, i want to write a new scheduler, lets make schedulers pluggable kind of thinking. That result was just a side-effect of it.
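The "internal modularization" being contrasted with plugsched here can be sketched as a user-space toy: cooperating policy classes sit behind a common set of hooks, and the core asks each class in priority order for work. The hook and class names echo the CFS patch's vocabulary, but this is an illustrative sketch, not the patch's actual code.

```c
#include <stddef.h>

/* Internal modularization in miniature: the core scheduler walks a fixed
 * chain of cooperating scheduling classes (RT first, then fair) and runs
 * the first task any class offers.  Policy lives behind the hooks. */

struct task { int id; };

struct sched_class {
    const struct sched_class *next;        /* classes form a priority chain */
    struct task *(*pick_next_task)(void);  /* NULL if nothing runnable here */
};

struct task rt_task = { 1 }, fair_task = { 2 };
int rt_runnable, fair_runnable;

struct task *rt_pick(void)   { return rt_runnable ? &rt_task : NULL; }
struct task *fair_pick(void) { return fair_runnable ? &fair_task : NULL; }

const struct sched_class fair_class = { NULL,        fair_pick };
const struct sched_class rt_class   = { &fair_class, rt_pick };

/* Core picker: the highest class with a runnable task wins. */
struct task *pick_next_task(void)
{
    const struct sched_class *class;
    for (class = &rt_class; class; class = class->next) {
        struct task *p = class->pick_next_task();
        if (p)
            return p;
    }
    return NULL; /* nothing runnable (the kernel would pick the idle task) */
}
```

The contrast with plugsched is visible even at this scale: the classes here cooperate at runtime within one scheduler, rather than being whole schedulers selected one-at-a-time at compile or boot time.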
(and as you correctly noted it, the CFS related modularization is incomplete). Btw., the thing that triggered the scheduling classes code wasnt even plugsched or RSDL/SD, it was Mike's patches. Mike had an itch and he fixed it within the framework of the existing scheduler, and the end result behaved quite well when i threw various testloads on it. But i felt a bit uncomfortable that it added another few hundred lines of code to an already complex sched.c. This felt unnatural so i mailed Mike that i'd attempt to clean these infrastructure aspects of sched.c up a bit so that it becomes more hackable to him. Thus 3 days ago, without having made up my mind about anything, i started this experiment (which ended up in the modularization and in the CFS scheduler) to simplify the code and to enable Mike to fix such itches in an easier way. By your logic Mike should in fact be quite upset about this: if the new code works out and proves to be useful then it obsoletes a whole lot of code of
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007, Pekka Enberg wrote: On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote: The perception here is that there is this expectation that sections of the Linux kernel are intentionally churn-squatted to prevent any other ideas from creeping in other than those of the owner of that subsystem. Strangely enough, my perception is that Ingo is simply trying to address the issues Mike's testing discovered in RSDL and SD. It's not surprising Ingo made it a separate patch set, as Con has repeatedly stated that the problems are in fact by design and won't be fixed. I won't get into the middle of this just yet, not having decided which dog I should bet on yet. I've been running 2.6.21-rc6 + Con's 0.40 patch for about 24 hours, its been generally usable, but gzip still causes lots of 5 to 10+ second lags when its running. I'm coming to the conclusion that gzip simply doesn't play well with others... Amazing to me, the cpu its using stays generally below 80%, and often below 60%, even while the kmail composer has a full sentence in its buffer that it still hasn't shown me when I switch to the htop screen to check, and back to the kmail screen to see if its updated yet. The screen switch doesn't seem to lag so I don't think renicing X would be helpful. Those are the obvious lags, and I'll build and reboot to the CFS patch at some point this morning (whats left of it that is :). And report in due time of course -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) knot in cables caused data stream to become twisted and kinked
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, Apr 15, 2007 at 02:45:27PM +0200, Willy Tarreau wrote: Now I hope he and Bill will get over this and agree to work on improving this scheduler, because I really find it smarter than a dumb O(1). I even agree with Mike that we now have a solid basis for future work. But for this, maybe a good starting point would be to remove the selfish printk at boot, revert useless changes (the SCHED_NORMAL -> SCHED_FAIR rename comes to mind) and improve the documentation a bit so that people can work together on the new design, without feeling like their work will only serve to promote X or Y. While I appreciate people coming to my defense, or at least the good intentions behind such, my only actual interest in pointing out 4-year-old work is getting some acknowledgment of having done something relevant at all. Sometimes it has "I told you so" value. At other times it's merely clarifying what went on when people refer to it, since in a number of cases the patches are no longer extant, so they can't actually look at them to get an idea of what was or wasn't done. At other times I'm miffed about not being credited, whether I should've been or not, or whether dead and buried code has an implementation of the same idea resurfacing without the author(s) having any knowledge of my prior work. One should note that in this case, the first work of mine this trips over (scheduling classes) was never publicly posted, as it was only a part of the original plugsched (an alternate scheduler implementation devised to demonstrate plugsched's flexibility with respect to scheduling policies), and a part that was dropped by subsequent maintainers. The second work of mine this trips over, a virtual deadline scheduler named vdls, was also never publicly posted. Both are from around the same time period, which makes them approximately 4 years dead.
Neither of the codebases are extant, having been lost in a transition between employers, though various people recall having been sent them privately, and plugsched survives in a mutated form as maintained by Peter Williams, who's been very good about acknowledging my original contribution. If I care to become a direct participant in scheduler work, I can do so easily enough. I'm not entirely sure what this is about a basis for future work. By and large one should alter the API's and data structures to fit the policy being implemented. While the array swapping was nice for algorithmically improving 2.4.x -style epoch expiry, most algorithms not based on the 2.4.x scheduler (in however mutated a form) should use a different queue structure, in fact, one designed around their policy's specific algorithmic needs. IOW, when one alters the scheduler, one should also alter the queue data structure appropriately. I'd not expect the priority queue implementation in cfs to continue to be used unaltered as it matures, nor would I expect any significant modification of the scheduler to necessarily use a similar one. By and large I've been mystified as to why there is such a penchant for preserving the existing queue structures in the various scheduler patches floating around. I am now every bit as mystified at the point of view that seems to be emerging that a change of queue structure is particularly significant. These are all largely internal changes to sched.c, and as such, rather small changes in and of themselves. While they do tend to have user-visible effects, from this point of view even changing out every line of sched.c is effectively a micropatch. Something more significant might be altering the schedule() API to take a mandatory description of the intention of the call to it, or breaking up schedule() into several different functions to distinguish between different sorts of uses of it to which one would then respond differently. 
Also more significant would be adding a new state beyond TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, and TASK_RUNNING for some tasks to respond only to fatal signals, then sweeping TASK_UNINTERRUPTIBLE users to use the new state and handle those fatal signals. While not quite as ostentatious in their user-visible effects as SCHED_OTHER policy affairs, they are tremendously more work than switching out the implementation of a single C file, and so somewhat more respectable. Even as scheduling semantics go, these are micropatches. So SCHED_OTHER changes a little. Where are the gang schedulers? Where are the batch schedulers (SCHED_BATCH is not truly such)? Where are the isochronous (frame) schedulers? I suppose there is some CKRM work that actually has a semantic impact despite being largely devoted to SCHED_OTHER, and there's some spufs gang scheduling going on, though not all that much. And to reiterate a point from other threads, even as SCHED_OTHER patches go, I see precious little verification that things like the semantics of nice numbers or other sorts of CPU bandwidth allocation between competing tasks of various natures are staying the same while other things are
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau [EMAIL PROTECTED] wrote: Ingo could have publicly spoken with them about his ideas of killing the O(1) scheduler and replacing it with an rbtree-based one, [...] yes, that's precisely what i did, via a patchset :) [ I can even tell you when it all started: i was thinking about Mike's throttling patches while watching Manchester United beat the crap out of AS Roma (7 to 1 end result), Tuesday evening. I started coding it Wednesday morning and sent the patch Friday evening. I very much believe in low-latency when it comes to development too ;) ] (if this had been done via a committee then today we'd probably still be trying to find a suitable timeslot for the initial conference call where we'd discuss the election of a chair who would be tasked with writing up an initial document of feature requests, on which we'd take a vote, possibly this year already, because the matter is really urgent you know ;-) [...] and using part of Bill's work to speed up development. ok, let me make this absolutely clear: i didnt use any bit of plugsched - in fact the most difficult bits of the modularization were for areas of sched.c that plugsched never even touched AFAIK. (the load-balancer for example.) Plugsched simply does something else: i modularized scheduling policies in essence that have to cooperate with each other, while plugsched modularized complete schedulers which are compile-time or boot-time selected, with no runtime cooperation between them. (one has to be selected at a time) (and i have no trouble at all with crediting Will's work either: a few years ago i used Will's PID rework concepts for an NPTL related speedup and Will is very much credited for it in today's kernel/pid.c and he continued to contribute to it later on.)
(the tree walking bits of sched_fair.c were in fact derived from kernel/hrtimer.c, the rbtree code written by Thomas and me :-) Ingo
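The tree-walking idea behind that remark can be sketched in user-space C. CFS keeps runnable tasks in a red-black tree (the kernel's lib/rbtree.c, shared with hrtimers) ordered by a timeline key, and always runs the leftmost task. A plain unbalanced binary search tree keeps this sketch short, and the key field is an illustrative stand-in for the real accounting, not the patch's actual arithmetic.

```c
#include <stddef.h>

/* Timeline scheduling in miniature: tasks sit in a tree ordered by a
 * key (roughly, how far the task trails the fair clock), and the next
 * task to run is simply the leftmost node.  CFS uses a red-black tree
 * for guaranteed O(log N) operations; an unbalanced BST is used here
 * only for brevity. */

struct node {
    long key;                  /* e.g. fair_clock - accumulated runtime */
    struct node *left, *right;
};

/* Insert in key order. */
struct node *enqueue(struct node *root, struct node *n)
{
    if (!root)
        return n;
    if (n->key < root->key)
        root->left = enqueue(root->left, n);
    else
        root->right = enqueue(root->right, n);
    return root;
}

/* Pick the next task to run: walk to the leftmost (smallest-key) node. */
struct node *leftmost(struct node *root)
{
    while (root && root->left)
        root = root->left;
    return root;
}
```

This is also where the O(log N) vs O(1) trade-off mentioned elsewhere in the thread lives: insertion costs a tree descent, but the ordering lets the policy adjust a task's position cheaply at any time.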
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau [EMAIL PROTECTED] wrote: [...] and using part of Bill's work to speed up development. On Sun, Apr 15, 2007 at 05:39:33PM +0200, Ingo Molnar wrote: ok, let me make this absolutely clear: i didnt use any bit of plugsched - in fact the most difficult bits of the modularization was for areas of sched.c that plugsched never even touched AFAIK. (the load-balancer for example.) Plugsched simply does something else: i modularized scheduling policies in essence that have to cooperate with each other, while plugsched modularized complete schedulers which are compile-time or boot-time selected, with no runtime cooperation between them. (one has to be selected at a time) (and i have no trouble at all with crediting Will's work either: a few years ago i used Will's PID rework concepts for an NPTL related speedup and Will is very much credited for it in today's kernel/pid.c and he continued to contribute to it later on.) (the tree walking bits of sched_fair.c were in fact derived from kernel/hrtimer.c, the rbtree code written by Thomas and me :-) The extant plugsched patches have nothing to do with cfs; I suspect what everyone else is going on about is terminological confusion. The 4-year-old sample policy with scheduling classes for the original plugsched is something you had no way of knowing about, as it was never publicly posted. There isn't really anything all that interesting going on here, apart from pointing out that it's been done before. -- wli
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
In article [EMAIL PROTECTED] you wrote: A development process like this is likely to exclude smart people from wanting to contribute to Linux, and folks should be conscious about these issues. Nobody is excluded, you can always have a next iteration. Gruss Bernd
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
It outlines the problems with Linux kernel development and questionable elitism regarding ownership of certain sections of the kernel code. I have to step in and disagree here. Linux is not about who writes the code. Linux is about getting the best solution for a problem. Who wrote which line of the code is irrelevant in the big picture. That often means that multiple implementations happen, and that a darwinistic process decides that the best solution wins. This darwinistic process often happens in the form of discussion, and that discussion can happen with words or with code. In this case it happened with a code proposal. To make this specific: it has happened many times to me that when I solved an issue with code, someone else stepped in and wrote a different solution (although that was usually for smaller pieces). Was I upset about that? No! I was happy because my *problem got solved* in the best possible way. Now this doesn't mean that people shouldn't be nice to each other, or shouldn't cooperate, or may steal credit, but I don't get the impression that that is happening here. Ingo is taking part in the discussion with a counter proposal for discussion *on the mailing list*. What more do you want?? If you or anyone else can improve it or do better, take part in this discussion and show what you mean, either in words or in code. Your qualification of the discussion as an elitist takeover... I disagree with that. It's a *discussion*. Now if you agree that Ingo's patch is better technically, you and others should be happy about that, because your problem is getting solved better. If you don't agree that his patch is better technically, take part in the technical discussion.
-- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Monday 16 April 2007 01:16, Gene Heskett wrote: On Sunday 15 April 2007, Pekka Enberg wrote: On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote: The perception here is that there is this expectation that sections of the Linux kernel are intentionally churn-squatted to prevent any other ideas from creeping in other than those of the owner of that subsystem. Strangely enough, my perception is that Ingo is simply trying to address the issues Mike's testing discovered in RSDL and SD. It's not surprising Ingo made it a separate patch set, as Con has repeatedly stated that the problems are in fact by design and won't be fixed. I won't get into the middle of this just yet, not having decided which dog I should bet on yet. I've been running 2.6.21-rc6 + Con's 0.40 patch for about 24 hours, its been generally usable, but gzip still causes lots of 5 to 10+ second lags when its running. I'm coming to the conclusion that gzip simply doesn't play well with others... Actually Gene I think you're being bitten here by something I/O bound, since the cpu usage never tops out. If that's the case and gzip is dumping truckloads of writes, then you're suffering something that irks me even more than the scheduler in linux, and that's how much writes hurt just about everything else. Try your testcase with bzip2 instead (since that won't be i/o bound), or drop your dirty ratio to as low as possible, which helps a little bit (5% is the minimum): echo 5 > /proc/sys/vm/dirty_ratio and finally try the braindead noop i/o scheduler as well: echo noop > /sys/block/sda/queue/scheduler (replace sda with your drive obviously). I'd wager a big one that's what causes your gzip pain. If it wasn't for the fact that I've decided to all but give up ever trying to provide code for mainline again, trying my best to make writes hurt less on linux would be my next big thing [tm]. Oh and for the others watching, (points to vm hackers) I found a bug when playing with the dirty ratio code.
If you modify it to allow it to drop below 5% but still above the minimum in the vm code, stalls happen somewhere in the vm where nothing much happens for sometimes 20 or 30 seconds, worst case scenario. I had to drop a patch in 2.6.19 that allowed the dirty ratio to be set ultra low because these stalls were gross. Amazing to me, the cpu its using stays generally below 80%, and often below 60%, even while the kmail composer has a full sentence in its buffer that it still hasn't shown me when I switch to the htop screen to check, and back to the kmail screen to see if its updated yet. The screen switch doesn't seem to lag so I don't think renicing X would be helpful. Those are the obvious lags, and I'll build and reboot to the CFS patch at some point this morning (whats left of it that is :). And report in due time of course -- -ck
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sunday 15 April 2007, Con Kolivas wrote: On Monday 16 April 2007 01:16, Gene Heskett wrote: On Sunday 15 April 2007, Pekka Enberg wrote: On 4/15/07, hui Bill Huey [EMAIL PROTECTED] wrote: The perception here is that there is this expectation that sections of the Linux kernel are intentionally churn-squatted to prevent any other ideas from creeping in other than those of the owner of that subsystem. Strangely enough, my perception is that Ingo is simply trying to address the issues Mike's testing discovered in RSDL and SD. It's not surprising Ingo made it a separate patch set, as Con has repeatedly stated that the problems are in fact by design and won't be fixed. I won't get into the middle of this just yet, not having decided which dog I should bet on yet. I've been running 2.6.21-rc6 + Con's 0.40 patch for about 24 hours, its been generally usable, but gzip still causes lots of 5 to 10+ second lags when its running. I'm coming to the conclusion that gzip simply doesn't play well with others... Actually Gene I think you're being bitten here by something I/O bound, since the cpu usage never tops out. If that's the case and gzip is dumping truckloads of writes, then you're suffering something that irks me even more than the scheduler in linux, and that's how much writes hurt just about everything else. Try your testcase with bzip2 instead (since that won't be i/o bound), or drop your dirty ratio to as low as possible, which helps a little bit (5% is the minimum): echo 5 > /proc/sys/vm/dirty_ratio and finally try the braindead noop i/o scheduler as well: echo noop > /sys/block/sda/queue/scheduler (replace sda with your drive obviously). I'd wager a big one that's what causes your gzip pain. If it wasn't for the fact that I've decided to all but give up ever trying to provide code for mainline again, trying my best to make writes hurt less on linux would be my next big thing [tm].
Chuckle, possibly, but then I'm not anything even remotely close to an expert here Con, just reporting what I get. And I just rebooted to 2.6.21-rc6 + sched-mike-5.patch for grins and giggles, or frowns and profanity as the case may call for. Oh and for the others watching, (points to vm hackers) I found a bug when playing with the dirty ratio code. If you modify it to allow it to drop below 5% but still above the minimum in the vm code, stalls happen somewhere in the vm where nothing much happens for sometimes 20 or 30 seconds, worst case scenario. I had to drop a patch in 2.6.19 that allowed the dirty ratio to be set ultra low because these stalls were gross. I think I'd need a bit of tutoring on how to do that. I recall that one other time, several weeks back, I thought I would try one of those famous echo this > /proc/that ideas that went by on this list, but even though I was root, apparently /proc was read-only AFAIWC. Amazing to me, the cpu its using stays generally below 80%, and often below 60%, even while the kmail composer has a full sentence in its buffer that it still hasn't shown me when I switch to the htop screen to check, and back to the kmail screen to see if its updated yet. The screen switch doesn't seem to lag so I don't think renicing X would be helpful. Those are the obvious lags, and I'll build and reboot to the CFS patch at some point this morning (whats left of it that is :). And report in due time of course And now I wonder if I applied the right patch. This one feels good ATM, but I don't think its the CFS thingy. No, I'm sure of it now, none of the patches I've saved say a thing about CFS. Backtrack up the list I guess; ignore me for the nonce. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order.
  -Ed Howdershelt (Author)
Microsoft: Re-inventing square wheels
  -- From a Slashdot.org post
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
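[Archive note: Con's two tuning suggestions from the message above can be
collected into a small script. This is a hedged sketch, not part of the
original mail: the writability guards, the `dev` variable, and the idea of
printing the current value first are assumptions, and both writes require
root.]

```shell
#!/bin/sh
# Hypothetical helper collecting the two tunables Con suggests.
# The guards and the device name are assumptions; writes need root.

dev=sda   # replace with your drive, as Con notes

# Show the current setting first (the file is world-readable).
cur_ratio=$(cat /proc/sys/vm/dirty_ratio 2>/dev/null || echo unknown)
echo "vm.dirty_ratio is currently: $cur_ratio"

# Drop the dirty ratio to its 5% floor so writeback kicks in sooner:
if [ -w /proc/sys/vm/dirty_ratio ]; then
    echo 5 > /proc/sys/vm/dirty_ratio
fi

# Switch the disk to the no-op elevator to take I/O scheduling
# decisions out of the picture while testing:
if [ -w "/sys/block/$dev/queue/scheduler" ]; then
    echo noop > "/sys/block/$dev/queue/scheduler"
fi
```

Run once before repeating the gzip test case; reboot (or echo the old
values back) to undo.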
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote:
> On Sun, 15 Apr 2007, Willy Tarreau wrote:
> > Ingo could have publicly spoken with them about his ideas of killing
> > the O(1) scheduler and replacing it with an rbtree-based one, and
> > using part of Bill's work to speed up development.
>
> He did exactly that, and he did it with a patch. Nothing new here.
> This is how development on LKML proceeds when you have two or more
> competing designs. There's absolutely no need to get upset or hurt
> your feelings over it. It's not malicious, it's how we do Linux
> development.

Yes. Exactly. This is what it's all about, this is what makes it work.

	-Mike
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> Well, since I merged the fair-fork patch, I cannot reproduce (in
> fact, bash forks 1000 processes, then progressively execs scheddos,
> but it takes some time). So I'm rebuilding right now. But I think
> that Linus has an interesting clue about GPM and notification before
> switching the terminal. I think it was enabled in console mode. I
> don't know how that translates to frozen xterms, but let's attack the
> problems one at a time.

to debug this, could you try to apply this add-on as well:

  http://redhat.com/~mingo/cfs-scheduler/sched-fair-print.patch

with this patch applied you should have a /proc/sched_debug file that
prints all runnable tasks and other interesting info from the runqueue.

[ i've refreshed all the patches on the CFS webpage, so if this doesnt
  apply cleanly to your current tree then you'll probably have to
  refresh one of the patches. ]

The output should look like this:

  Sched Debug Version: v0.01
  now at 226761724575 nsecs

  cpu: 0
    .nr_running         : 3
    .raw_weighted_load  : 384
    .nr_switches        : 13666
    .nr_uninterruptible : 0
    .next_balance       : 4294947416
    .curr->pid          : 2179
    .rq_clock           : 241337421233
    .fair_clock         : 7503791206
    .wait_runtime       : 2269918379

  runnable tasks:
            task |  PID |   tree-key |    -delta | waiting | switches
  -------------------------------------------------------------------
           + cat | 2179 | 7501930066 |  -1861140 | 1861140 |        2
     loop_silent | 2149 | 7503010354 |   -780852 |       0 |      911
     loop_silent | 2148 | 7503510048 |   -281158 |  280753 |      918

now for your workload the list should be considerably larger. If
there's starvation going on then the 'switches' field (number of
context switches) of one of the tasks would never increase while you
have this 'cannot switch consoles' problem.

maybe you'll have to unapply the fair-fork patch to make it trigger
again. (fair-fork does not fix anything, so it probably just hides a
real bug.)
(i'm meanwhile busy running your scheddos utilities to reproduce it
locally as well :)

	Ingo
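[Archive note: the "watch the 'switches' field" check Ingo describes can
be scripted. The helper below is a hypothetical sketch, not from the
mail: the `snapshot` function and the field positions are assumptions
based on the v0.01 sample quoted above, and a canned file stands in for
/proc/sched_debug so the demo runs anywhere.]

```shell
#!/bin/sh
# Hypothetical helper: pull "PID switches" pairs out of a
# sched_debug-style task table, so two snapshots taken while the
# console is stuck can be diffed. Field layout assumed from v0.01.
snapshot() {
    # task rows have a numeric PID in column 2; switches is the
    # last column
    awk '$2 ~ /^[0-9]+$/ && NF >= 6 { print $2, $NF }' "$1"
}

# Demo with a canned snapshot standing in for /proc/sched_debug:
cat > /tmp/sched_debug.sample <<'EOF'
        task   PID     tree-key    -delta  waiting  switches
         cat  2179   7501930066  -1861140  1861140         2
 loop_silent  2149   7503010354   -780852        0       911
EOF
snapshot /tmp/sched_debug.sample
```

Taking two such snapshots a few seconds apart and comparing them: a PID
whose switches count never moves is the starved task Ingo is looking
for.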
Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
On Sun, 15 Apr 2007, Mike Galbraith wrote:
> On Sun, 2007-04-15 at 16:08 +0300, Pekka J Enberg wrote:
> > He did exactly that, and he did it with a patch. Nothing new here.
> > This is how development on LKML proceeds when you have two or more
> > competing designs. There's absolutely no need to get upset or hurt
> > your feelings over it. It's not malicious, it's how we do Linux
> > development.
>
> Yes. Exactly. This is what it's all about, this is what makes it
> work.

I obviously agree, but I will also add that one of the most motivating
things there *is* in open source is personal pride.

It's a really good thing, and it means that if somebody shows that your
code is flawed in some way (by, for example, making a patch that people
claim gets better behaviour or numbers), any *good* programmer that
actually cares about his code will obviously suddenly be very motivated
to out-do the out-doer!

Does this mean that there will be tension and rivalry? Hell yes. But
that's kind of the point. Life is a game, and if you aren't in it to
win, what the heck are you still doing here?

As long as it's reasonably civil (I'm not personally a huge believer in
being too polite or politically correct, so I think the "reasonably" is
more important than the "civil" part!), and as long as the end result
is judged on TECHNICAL MERIT, it's all good.

We don't want to play politics. But encouraging people's competitive
feelings? Oh, yes.

		Linus