[RFC] Extend Linux to support proportional-share scheduling

2007-04-20 Thread Tong Li
This patch extends the existing Linux scheduler with support for proportional-share scheduling (as a new KConfig option). http://www.cs.duke.edu/~tongli/linux/linux-2.6.19.2-trio.patch It uses a scheduling algorithm, called Distributed Weighted Round-Robin (DWRR), which retains the existing
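
For readers new to the term, proportional-share scheduling means each task is entitled to CPU time in proportion to its weight. The following is a minimal sketch of that entitlement only; it is not DWRR and is not taken from the patch, and the function name `proportional_share` is hypothetical.

```c
#include <stddef.h>

/*
 * Illustrative only: the fraction of total CPU time that task i is
 * entitled to under proportional sharing, i.e. w_i / sum(w).
 * This is the definition of the goal, not the DWRR algorithm.
 */
double proportional_share(const unsigned *weights, size_t n, size_t i)
{
    unsigned long total = 0;

    for (size_t k = 0; k < n; k++)
        total += weights[k];

    /* With no weight in the system, nobody is entitled to anything. */
    return total ? (double)weights[i] / (double)total : 0.0;
}
```

With weights {1, 1, 2}, the third task is entitled to half of the CPU time and the other two to a quarter each.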

Re: [git] CFS-devel, group scheduler, fixes

2007-09-19 Thread Tong Li
Signed-off-by: Tong Li <[EMAIL PROTECTED]> --- --- linux-2.6-sched-devel-orig/kernel/sched.c 2007-09-15 22:00:48.0 -0700 +++ linux-2.6-sched-devel/kernel/sched.c 2007-09-18 22:10:52.0 -0700 @@ -1033,9 +1033,6 @@ void set_task_cpu(struct task_struct *p, if (p

Re: [git] CFS-devel, group scheduler, fixes

2007-09-19 Thread Tong Li
On Wed, 19 Sep 2007, Mike Galbraith wrote: On Wed, 2007-09-19 at 09:51 +0200, Mike Galbraith wrote: The scenario which was previously cured was this: taskset -c 1 nice -n 0 ./massive_intr 2 taskset -c 1 nice -n 5 ./massive_intr 2 click link

Re: [git] CFS-devel, group scheduler, fixes

2007-09-19 Thread Tong Li
On Wed, 19 Sep 2007, Siddha, Suresh B wrote: On Tue, Sep 18, 2007 at 11:03:59PM -0700, Tong Li wrote: This patch attempts to improve CFS's SMP global fairness based on the new virtual time design. Removed vruntime adjustment in set_task_cpu() as it skews global fairness. Modified

Re: [git] CFS-devel, group scheduler, fixes

2007-09-21 Thread Tong Li
)) return sync_vruntime(cfs_rq); cfs_rq->curr can be NULL even if cfs_rq->nr_running is non-zero (e.g., when an RT task is running). We only want to call sync_vruntime when cfs_rq->nr_running is 0. This fixed the large latency problem (at least in my tests). Signed-off-by: Tong Li <[EMAIL PROTECTED
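
The distinction drawn in this message can be sketched as follows. This is a simplified illustration, assuming stand-in structures: `cfs_rq_sketch`, `sched_entity_sketch`, and both helper functions are hypothetical and are not the kernel's definitions or the code in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sched_entity_sketch {
    unsigned long long vruntime;
};

struct cfs_rq_sketch {
    unsigned long nr_running;           /* runnable CFS tasks on this queue */
    struct sched_entity_sketch *curr;   /* running CFS entity, or NULL */
};

/*
 * Fragile test: curr is also NULL while a non-CFS task (e.g. an RT task)
 * occupies the CPU, even though CFS tasks are still queued.
 */
bool queue_idle_by_curr(const struct cfs_rq_sketch *rq)
{
    return rq->curr == NULL;
}

/* Test described in the message: the queue is idle only if it holds no tasks. */
bool queue_idle_by_count(const struct cfs_rq_sketch *rq)
{
    return rq->nr_running == 0;
}
```

For a queue with `nr_running == 2` and `curr == NULL`, the first test wrongly reports an idle queue while the second does not, which is the case the message says should not trigger sync_vruntime.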

Re: [git] CFS-devel, group scheduler, fixes

2007-09-24 Thread Tong Li
On Sun, 23 Sep 2007, Mike Galbraith wrote: On Sat, 2007-09-22 at 12:01 +0200, Mike Galbraith wrote: On Fri, 2007-09-21 at 20:27 -0700, Tong Li wrote: Mike, Could you try this patch to see if it solves the latency problem? No, but it helps some when running two un-pinned busy loops, one

Re: [git] CFS-devel, group scheduler, fixes

2007-09-24 Thread Tong Li
On Mon, 24 Sep 2007, Peter Zijlstra wrote: On Mon, 24 Sep 2007 13:22:14 +0200 Mike Galbraith <[EMAIL PROTECTED]> wrote: On Mon, 2007-09-24 at 12:42 +0200, Mike Galbraith wrote: On Mon, 2007-09-24 at 12:24 +0200, Peter Zijlstra wrote: how about something like: s64 delta = (s64)(vruntime -
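
The quoted suggestion is cut off in this listing, so the sketch below does not attempt to reconstruct it. It only illustrates the general idiom the visible fragment appears to use: subtracting two unsigned 64-bit counters and reading the result as signed, so that the comparison stays meaningful across wraparound. The function name `vruntime_delta` is hypothetical.

```c
#include <stdint.h>

/*
 * Illustrative only: signed distance from b to a for two free-running
 * 64-bit counters. The unsigned subtraction wraps modulo 2^64; the cast
 * to int64_t assumes a two's-complement target, which holds on the
 * platforms Linux supports. The result is meaningful as long as the true
 * distance is less than 2^63.
 */
int64_t vruntime_delta(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b);
}
```

A positive result means `a` is ahead of `b`; for example, a counter that has wrapped to 2 is still 3 ticks ahead of one sitting at `UINT64_MAX`.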

Re: [ANNOUNCE/RFC] Really Simple Really Fair Scheduler

2007-09-02 Thread Tong Li
I like this patch since it's really simple. CFS does provide a nice infrastructure to enable new algorithmic changes/extensions. My only concern was the O(log N) complexity under heavy load, but I'm willing to agree that it's OK in the common case. Some comments on the code: * Ingo Molnar

[RFC] scheduler: improve SMP fairness in CFS

2007-07-23 Thread Tong Li
is, the closer the system behavior is to the default CFS without the patch. Any comments and suggestions would be highly appreciated. Thanks, tong Signed-off-by: Tong Li <[EMAIL PROTECTED]> --- include/linux/sched.h |5 kernel/sched.c | 577

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-24 Thread Tong Li
On Mon, 23 Jul 2007, Chris Snook wrote: This patch is massive overkill. Maybe you're not seeing the overhead on your 8-way box, but I bet we'd see it on a 4096-way NUMA box with a partially-RT workload. Do you have any data justifying the need for this patch? Doing anything globally is

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-25 Thread Tong Li
On Wed, 25 Jul 2007, Ingo Molnar wrote: they now nicely converge to the expected 80% long-term CPU usage. so, could you please try the patch below, does it work for you too? Thanks for the patch. It doesn't work well on my 8-way box. Here's the output at two different times. It's also

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-26 Thread Tong Li
Jul 2007, Li, Tong N wrote: On Thu, 2007-07-26 at 23:31 +0200, Ingo Molnar wrote: * Tong Li <[EMAIL PROTECTED]> wrote: you need to measure it over longer periods of time. It's not worth balancing for such a thing in any high-frequency manner. (we'd trash the cache constantly migrating tasks back

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-26 Thread Tong Li
On Wed, 25 Jul 2007, Ingo Molnar wrote: * Tong Li <[EMAIL PROTECTED]> wrote: Thanks for the patch. It doesn't work well on my 8-way box. Here's the output at two different times. It's also changing all the time. you need to measure it over longer periods of time. It's not worth balancing

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-27 Thread Tong Li
On Fri, 27 Jul 2007, Chris Snook wrote: Tong Li wrote: I'd like to clarify that I'm not trying to push this particular code to the kernel. I'm a researcher. My intent was to point out that we have a problem in the scheduler and my dwrr algorithm can potentially help fix it. The patch itself

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-28 Thread Tong Li
On Fri, 27 Jul 2007, Chris Snook wrote: I don't think that achieving a constant error bound is always a good thing. We all know that fairness has overhead. If I have 3 threads and 2 processors, and I have a choice between fairly giving each thread 1.0 billion cycles during the next second,

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-28 Thread Tong Li
On Fri, 27 Jul 2007, Chris Snook wrote: Bill Huey (hui) wrote: You have to consider the target for this kind of code. There are applications where you need something that falls within a constant error bound. According to the numbers, the current CFS rebalancing logic doesn't achieve that to
