Re: [patch] sched: fix broken smt/mc optimizations with CFS
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

> > Seems this didn't get merged? Latest git as of today still has the
> > code as it was before this patch.
>
> This is a must fix for .23 and Ingo previously mentioned that he will
> push it for .23

yep, it's queued up and i will send it with the next batch. (this is
the most important scheduler fix we have at the moment - there are 3
other, smaller items queued up as well)

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On Tue, Sep 04, 2007 at 07:35:21PM -0400, Chuck Ebbert wrote:
> On 08/28/2007 06:27 PM, Siddha, Suresh B wrote:
> > Try to fix MC/HT scheduler optimization breakage again, without
> > breaking the FUZZ logic.
> >
> > First fix the check
> > 	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
> > with this
> > 	if (*imbalance < busiest_load_per_task)
> >
> > as the current check is always false for nice 0 tasks (as
> > SCHED_LOAD_SCALE_FUZZ is the same as busiest_load_per_task for
> > nice 0 tasks).
> >
> > With the above change, imbalance was getting reset to 0 in the
> > corner case condition, making the FUZZ logic fail. Fix it by not
> > corrupting the imbalance; change the imbalance only when it finds
> > that the HT/MC optimization is needed.
> >
> > Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
> > ---
> >
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 9fe473a..03e5e8d 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -2511,7 +2511,7 @@ group_next:
> >  	 * a think about bumping its value to force at least one task to be
> >  	 * moved
> >  	 */
> > -	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
> > +	if (*imbalance < busiest_load_per_task) {
> >  		unsigned long tmp, pwr_now, pwr_move;
> >  		unsigned int imbn;
> >
> > @@ -2563,10 +2563,8 @@ small_imbalance:
> >  		pwr_move /= SCHED_LOAD_SCALE;
> >
> >  		/* Move if we gain throughput */
> > -		if (pwr_move <= pwr_now)
> > -			goto out_balanced;
> > -
> > -		*imbalance = busiest_load_per_task;
> > +		if (pwr_move > pwr_now)
> > +			*imbalance = busiest_load_per_task;
> >  	}
> >
> >  	return busiest;
>
> Seems this didn't get merged? Latest git as of today still has the
> code as it was before this patch.

This is a must fix for .23 and Ingo previously mentioned that he will
push it for .23

Ingo?
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On 08/28/2007 06:27 PM, Siddha, Suresh B wrote:
> On Mon, Aug 27, 2007 at 12:31:03PM -0700, Siddha, Suresh B wrote:
> > Essentially I observed that nice 0 tasks still end up on two cores
> > of the same package, without getting spread out to two different
> > packages. This behavior is the same without this fix and this fix
> > doesn't help in any way.
>
> Ingo, the appended patch seems to fix the issue and as far as I can
> test, seems ok to me.
>
> This is a quick fix for .23. Peter Williams and myself plan to look at
> code cleanups in this area (HT/MC optimizations) post .23
>
> BTW, with this fix, do you want to retain the current FUZZ value?
>
> thanks,
> suresh
> --
>
> Try to fix the MC/HT scheduler optimization breakage again, without
> breaking the FUZZ logic.
>
> First fix the check
> 	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
> with this
> 	if (*imbalance < busiest_load_per_task)
>
> as the current check is always false for nice 0 tasks (as
> SCHED_LOAD_SCALE_FUZZ is the same as busiest_load_per_task for
> nice 0 tasks).
>
> With the above change, imbalance was getting reset to 0 in the corner
> case condition, making the FUZZ logic fail. Fix it by not corrupting
> the imbalance; change the imbalance only when it finds that the HT/MC
> optimization is needed.
>
> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
> ---
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9fe473a..03e5e8d 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2511,7 +2511,7 @@ group_next:
>  	 * a think about bumping its value to force at least one task to be
>  	 * moved
>  	 */
> -	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
> +	if (*imbalance < busiest_load_per_task) {
>  		unsigned long tmp, pwr_now, pwr_move;
>  		unsigned int imbn;
>
> @@ -2563,10 +2563,8 @@ small_imbalance:
>  		pwr_move /= SCHED_LOAD_SCALE;
>
>  		/* Move if we gain throughput */
> -		if (pwr_move <= pwr_now)
> -			goto out_balanced;
> -
> -		*imbalance = busiest_load_per_task;
> +		if (pwr_move > pwr_now)
> +			*imbalance = busiest_load_per_task;
>  	}
>
>  	return busiest;

Seems this didn't get merged? Latest git as of today still has the code
as it was before this patch.
Re: [patch] sched: fix broken smt/mc optimizations with CFS
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

> On Mon, Aug 27, 2007 at 12:31:03PM -0700, Siddha, Suresh B wrote:
> > Essentially I observed that nice 0 tasks still end up on two cores
> > of the same package, without getting spread out to two different
> > packages. This behavior is the same without this fix and this fix
> > doesn't help in any way.
>
> Ingo, the appended patch seems to fix the issue and as far as I can
> test, seems ok to me.

thanks! I've queued your fix up for .23 merge. I've done a quick test
and it indeed seems to work well.

> This is a quick fix for .23. Peter Williams and myself plan to look at
> code cleanups in this area (HT/MC optimizations) post .23
>
> BTW, with this fix, do you want to retain the current FUZZ value?

what value would you suggest? I was thinking about using
busiest_rq->curr->load.weight instead, to always keep rotating tasks.

	Ingo
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On Mon, Aug 27, 2007 at 12:31:03PM -0700, Siddha, Suresh B wrote:
> Essentially I observed that nice 0 tasks still end up on two cores of
> the same package, without getting spread out to two different packages.
> This behavior is the same without this fix and this fix doesn't help
> in any way.

Ingo, the appended patch seems to fix the issue and as far as I can
test, seems ok to me.

This is a quick fix for .23. Peter Williams and myself plan to look at
code cleanups in this area (HT/MC optimizations) post .23

BTW, with this fix, do you want to retain the current FUZZ value?

thanks,
suresh
--

Try to fix the MC/HT scheduler optimization breakage again, without
breaking the FUZZ logic.

First fix the check
	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
with this
	if (*imbalance < busiest_load_per_task)

as the current check is always false for nice 0 tasks (as
SCHED_LOAD_SCALE_FUZZ is the same as busiest_load_per_task for nice 0
tasks).

With the above change, imbalance was getting reset to 0 in the corner
case condition, making the FUZZ logic fail. Fix it by not corrupting
the imbalance; change the imbalance only when it finds that the HT/MC
optimization is needed.

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
---

diff --git a/kernel/sched.c b/kernel/sched.c
index 9fe473a..03e5e8d 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2511,7 +2511,7 @@ group_next:
 	 * a think about bumping its value to force at least one task to be
 	 * moved
 	 */
-	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
+	if (*imbalance < busiest_load_per_task) {
 		unsigned long tmp, pwr_now, pwr_move;
 		unsigned int imbn;

@@ -2563,10 +2563,8 @@ small_imbalance:
 		pwr_move /= SCHED_LOAD_SCALE;

 		/* Move if we gain throughput */
-		if (pwr_move <= pwr_now)
-			goto out_balanced;
-
-		*imbalance = busiest_load_per_task;
+		if (pwr_move > pwr_now)
+			*imbalance = busiest_load_per_task;
 	}

 	return busiest;
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On Mon, Aug 27, 2007 at 09:23:24PM +0200, Ingo Molnar wrote:
>
> * Siddha, Suresh B <[EMAIL PROTECTED]> wrote:
>
> > > -	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
> > > +	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
> >
> > Ingo, this is still broken. This condition is always false for
> > nice-0 tasks..
>
> yes - negative reniced tasks are not spread out via this - and positive
> reniced tasks are spread out more easily.

Or the opposite? Essentially I observed that nice 0 tasks still end up
on two cores of the same package, without getting spread out to two
different packages. This behavior is the same without this fix and this
fix doesn't help in any way.

thanks,
suresh
Re: [patch] sched: fix broken smt/mc optimizations with CFS
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

> > -	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
> > +	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
>
> Ingo, this is still broken. This condition is always false for nice-0
> tasks..

yes - negative reniced tasks are not spread out via this - and positive
reniced tasks are spread out more easily.

	Ingo
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On Thu, Aug 23, 2007 at 02:13:41PM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > [...] So how about the patch below instead?
>
> the right patch attached.
>
> Subject: sched: fix broken SMT/MC optimizations
> From: "Siddha, Suresh B" <[EMAIL PROTECTED]>
>
> On a four package system with HT - HT load balancing optimizations
> were broken. For example, if two tasks end up running on two logical
> threads of one of the packages, the scheduler is not able to pull one
> of the tasks to a completely idle package.
>
> In this scenario, for nice-0 tasks, the imbalance calculated by the
> scheduler will be 512 and find_busiest_queue() will return 0 (as each
> cpu's load is 1024 > imbalance and has only one task running).
>
> Similarly MC scheduler optimizations also get fixed with this patch.
>
> [ [EMAIL PROTECTED]: restored fair balancing by increasing the fuzz and
>   adding it back to the power decision, without the /2 factor. ]
>
> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>
>  include/linux/sched.h |    2 +-
>  kernel/sched.c        |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux/include/linux/sched.h
> ===================================================================
> --- linux.orig/include/linux/sched.h
> +++ linux/include/linux/sched.h
> @@ -681,7 +681,7 @@ enum cpu_idle_type {
>  #define SCHED_LOAD_SHIFT	10
>  #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)
>
> -#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 1)
> +#define SCHED_LOAD_SCALE_FUZZ	SCHED_LOAD_SCALE
>
>  #ifdef CONFIG_SMP
>  #define SD_LOAD_BALANCE		1	/* Do load balancing on this domain. */
>
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -2517,7 +2517,7 @@ group_next:
>  	 * a think about bumping its value to force at least one task to be
>  	 * moved
>  	 */
> -	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
> +	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {

Ingo, this is still broken. This condition is always false for nice-0
tasks..

thanks,
suresh
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On 8/23/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> with no patch, or with my patch below each gets ~66% of CPU time,
> long-term:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2290 mingo     20   0  2736  528  252 R   67  0.0   3:22.95 bash
>  2291 mingo     20   0  2736  532  256 R   67  0.0   3:18.94 bash
>  2292 mingo     20   0  2736  532  256 R   66  0.0   3:19.83 bash

I just witnessed another scheduling "bug" that might have been a
feature. I use the current 2.6.23-rc3-mm1 kernel without any additional
patches.

I have a 2x2218 Opteron system using the ondemand cpufreq governor; one
CPU was at its max of 2600 MHz, the other was at 1000 MHz. On this
system there were three processes (all niced) running, but they all
ended up on one CPU package, so that the distribution was 100-50-50 and
the other CPU was still idle.

So while the 100-50-50 distribution on one CPU might be fixed by your
patch, I am interested in whether the behavior that the second CPU
remained idle was intended. On one hand it made perfect sense: even if
one 50% task were migrated, it would only get 1000 MHz of CPU before
the ondemand governor kicked in, instead of 50% of 2600 MHz == 1300 MHz.
A quick grep did not show me any references to cpufreq or governors in
kernel/sched*, so I would expect that the scheduler cannot predict that
the CPU will power up if a task is migrated there.

Part of my config:

CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
...
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_NOTIFIERS=y
...
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100
...
# CONFIG_SCHED_DEBUG is not set
# CONFIG_SCHEDSTATS is not set

My testcase is not reproducible as it happened, but I could try to
recreate this if it is necessary.

(I was running the screen saver from electricsheep.org and the three
niced tasks were three of its render threads)

Torsten
Re: [patch] sched: fix broken smt/mc optimizations with CFS
On Thu, 2007-08-23 at 14:13 +0200, Ingo Molnar wrote:
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > [...] So how about the patch below instead?
>
> the right patch attached.
>
> Subject: sched: fix broken SMT/MC optimizations
> From: "Siddha, Suresh B" <[EMAIL PROTECTED]>
>
> On a four package system with HT - HT load balancing optimizations
> were broken. For example, if two tasks end up running on two logical
> threads of one of the packages, the scheduler is not able to pull one
> of the tasks to a completely idle package.
>
> In this scenario, for nice-0 tasks, the imbalance calculated by the
> scheduler will be 512 and find_busiest_queue() will return 0 (as each
> cpu's load is 1024 > imbalance and has only one task running).

Is there an upper bound on the number of tasks that can migrate during
a new idle balance? The reason I'm asking is that I've been seeing some
large latency in the new idle path into load_balance(). I'm not sure if
lots of tasks are getting migrated or if it's just the iteration over
tasks in the rb-tree ..

Daniel
Re: [patch] sched: fix broken smt/mc optimizations with CFS
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> [...] So how about the patch below instead?

the right patch attached.

-->
Subject: sched: fix broken SMT/MC optimizations
From: "Siddha, Suresh B" <[EMAIL PROTECTED]>

On a four package system with HT - HT load balancing optimizations were
broken. For example, if two tasks end up running on two logical threads
of one of the packages, the scheduler is not able to pull one of the
tasks to a completely idle package.

In this scenario, for nice-0 tasks, the imbalance calculated by the
scheduler will be 512 and find_busiest_queue() will return 0 (as each
cpu's load is 1024 > imbalance and has only one task running).

Similarly MC scheduler optimizations also get fixed with this patch.

[ [EMAIL PROTECTED]: restored fair balancing by increasing the fuzz and
  adding it back to the power decision, without the /2 factor. ]

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---

 include/linux/sched.h |    2 +-
 kernel/sched.c        |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -681,7 +681,7 @@ enum cpu_idle_type {
 #define SCHED_LOAD_SHIFT	10
 #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)

-#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 1)
+#define SCHED_LOAD_SCALE_FUZZ	SCHED_LOAD_SCALE

 #ifdef CONFIG_SMP
 #define SD_LOAD_BALANCE		1	/* Do load balancing on this domain. */

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2517,7 +2517,7 @@ group_next:
 	 * a think about bumping its value to force at least one task to be
 	 * moved
 	 */
-	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
+	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
 		unsigned long tmp, pwr_now, pwr_move;
 		unsigned int imbn;
Re: [patch] sched: fix broken smt/mc optimizations with CFS
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

>  	 * a think about bumping its value to force at least one task to be
>  	 * moved
>  	 */
> -	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
> +	if (*imbalance < busiest_load_per_task) {
>  		unsigned long tmp, pwr_now, pwr_move;

hm, found a problem: this removes the 'fuzz' from balancing, which is a
slight over-balancing to perturb CPU-bound tasks to be distributed in a
fairer manner between CPUs. So how about the patch below instead?

a good testcase for this is to start 3 CPU-bound tasks on a 2-core box:

  for ((i=0; i<3; i++)); do while :; do :; done & done

with your patch applied two of the loops stick to one core, getting 50%
each - the third loop sticks to the other core, getting 100% CPU time:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3093 root      20   0  2736  528  252 R  100  0.0   0:23.81 bash
 3094 root      20   0  2736  532  256 R   50  0.0   0:11.95 bash
 3095 root      20   0  2736  532  256 R   50  0.0   0:11.95 bash

with no patch, or with my patch below, each gets ~66% of CPU time,
long-term:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2290 mingo     20   0  2736  528  252 R   67  0.0   3:22.95 bash
 2291 mingo     20   0  2736  532  256 R   67  0.0   3:18.94 bash
 2292 mingo     20   0  2736  532  256 R   66  0.0   3:19.83 bash

the breakage wasn't caused by the fuzz, it was caused by the /2 - the
patch below should fix this for real.

	Ingo

-->
Subject: sched: fix broken SMT/MC optimizations
From: "Siddha, Suresh B" <[EMAIL PROTECTED]>

On a four package system with HT - HT load balancing optimizations were
broken. For example, if two tasks end up running on two logical threads
of one of the packages, the scheduler is not able to pull one of the
tasks to a completely idle package.

In this scenario, for nice-0 tasks, the imbalance calculated by the
scheduler will be 512 and find_busiest_queue() will return 0 (as each
cpu's load is 1024 > imbalance and has only one task running).

Similarly MC scheduler optimizations also get fixed with this patch.

[ [EMAIL PROTECTED]: restored fair balancing by increasing the fuzz and
  adding it back to the power decision, without the /2 factor. ]

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---

 include/linux/sched.h |    2 +-
 kernel/sched.c        |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -681,7 +681,7 @@ enum cpu_idle_type {
 #define SCHED_LOAD_SHIFT	10
 #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)

-#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 1)
+#define SCHED_LOAD_SCALE_FUZZ	SCHED_LOAD_SCALE

 #ifdef CONFIG_SMP
 #define SD_LOAD_BALANCE		1	/* Do load balancing on this domain. */

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2517,7 +2517,7 @@ group_next:
 	 * a think about bumping its value to force at least one task to be
 	 * moved
 	 */
-	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
+	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task) {
 		unsigned long tmp, pwr_now, pwr_move;
 		unsigned int imbn;
Re: [patch] sched: fix broken smt/mc optimizations with CFS
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:

> Ingo, let me know if there are any side effects of this change. Thanks.
> ---
>
> On a four package system with HT - HT load balancing optimizations
> were broken. For example, if two tasks end up running on two logical
> threads of one of the packages, the scheduler is not able to pull one
> of the tasks to a completely idle package.

thanks, i've applied your fix to my queue.

	Ingo
[patch] sched: fix broken smt/mc optimizations with CFS
Ingo, let me know if there are any side effects of this change. Thanks.
---

On a four package system with HT - HT load balancing optimizations were
broken. For example, if two tasks end up running on two logical threads
of one of the packages, the scheduler is not able to pull one of the
tasks to a completely idle package.

In this scenario, for nice-0 tasks, the imbalance calculated by the
scheduler will be 512 and find_busiest_queue() will return 0 (as each
cpu's load is 1024 > imbalance and has only one task running).

Similarly MC scheduler optimizations also get fixed with this patch.

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
---

diff --git a/kernel/sched.c b/kernel/sched.c
index 45e17b8..c5ac710 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2494,7 +2494,7 @@ group_next:
 	 * a think about bumping its value to force at least one task to be
 	 * moved
 	 */
-	if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task/2) {
+	if (*imbalance < busiest_load_per_task) {
 		unsigned long tmp, pwr_now, pwr_move;
 		unsigned int imbn;