Re: [LKP] [sched/fair] 6c8116c914: stress-ng.mmapfork.ops_per_sec -38.0% regression

2020-07-10 Thread Vincent Guittot
On Fri, 10 Jul 2020 at 14:08, Tao Zhou  wrote:
>
> Hi Vincent,
>
> Sorry for replying so late.
>
> On Tue, Jun 30, 2020 at 04:22:10PM +0200, Vincent Guittot wrote:
> > Hi Tao,
> >
> > On Tue, 30 Jun 2020 at 11:41, Tao Zhou  wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Jun 30, 2020 at 09:43:11AM +0200, Vincent Guittot wrote:
> > > > Hi Tao,
> > > >
> > > > On Monday 15 June 2020 at 16:14:01 (+0800), Xing Zhengjun wrote:
> > > > >
> > > > >
> > > > > On 6/15/2020 1:18 PM, Tao Zhou wrote:
> > > >
> > > > ...
> > > >
> > > > > I applied the patch based on v5.7; the regression still exists.
> > > >
> > > >
> > > > Could you try the patch below? This patch is not a real fix because
> > > > it impacts the performance of other benchmarks but it will at least
> > > > narrow down your problem.
> > > >
> > > >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 9f78eb76f6fb..a4d8614b1854 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -8915,9 +8915,9 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
> > > >  * and consider staying local.
> > > >  */
> > > >
> > > > -   if ((sd->flags & SD_NUMA) &&
> > > > -   ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
> > > > -   return NULL;
> > > > +// if ((sd->flags & SD_NUMA) &&
> > > > +// ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
> > > > +// return NULL;
> > >
> > > Just narrowing it down to the fork (wakeup) path that may lead to the problem, /me thinks.
> >
> > The perf regression seems to be fixed with this patch on my setup.
> > According to the statistics that I have on the use case, groups are
> > overloaded but the load is quite low, and this low load level hits this
> > NUMA-specific condition.
>
> My box has 1 socket, 4 cores, 2 threads per core, i.e. 2x4 CPUs.
> (x86_64 Intel(R) Core(TM) i7-6700HQ)

The change above only applies to NUMA systems, which doesn't seem to be
the case for your setup.

>
> stress-ng.mmapfork
>
> v5.8-rc4:
>
> stress-ng: info:  [7158] dispatching hogs: 8 mmapfork
> stress-ng: info:  [7158] successful run completed in 1.09s
> stress-ng: info:  [7158] stressor       bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
> stress-ng: info:  [7158]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
> stress-ng: info:  [7158] mmapfork             32      1.09      2.48      6.01        29.36         3.77
> stress-ng: info:  [7158] for a 1.09s run time:
> stress-ng: info:  [7158]   8.73s available CPU time
> stress-ng: info:  [7158]   2.52s user time   ( 28.86%)
> stress-ng: info:  [7158]   6.07s system time ( 69.52%)
> stress-ng: info:  [7158]   8.59s total time  ( 98.38%)
> stress-ng: info:  [7158] load average: 0.52 0.26 0.10
>
> v5.8-rc4 w/ above patch:
>
> stress-ng: info:  [5126] dispatching hogs: 8 mmapfork
> stress-ng: info:  [5126] successful run completed in 1.07s
> stress-ng: info:  [5126] stressor       bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
> stress-ng: info:  [5126]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
> stress-ng: info:  [5126] mmapfork             32      1.07      2.45      5.96        29.88         3.80
> stress-ng: info:  [5126] for a 1.07s run time:
> stress-ng: info:  [5126]   8.58s available CPU time
> stress-ng: info:  [5126]   2.49s user time   ( 29.02%)
> stress-ng: info:  [5126]   6.02s system time ( 70.17%)
> stress-ng: info:  [5126]   8.51s total time  ( 99.19%)
> stress-ng: info:  [5126] load average: 0.31 0.22 0.09
>
> No obvious changes.

Yeah, the problem happens on systems with several NUMA nodes.

>
> And I traced it and also tried to find the task_h_load = 0 case after the patch you sent.
>
> I used the command:
>
> trace-cmd record -e sched -e irq -e cpu_idle -e cpu_frequency -e timer cgexec -g cpu:A \
> stress-ng --timeout 1 --times --verify --metrics-brief --sequential 8 --class scheduler \
> --exclude (all exclude but mmapfork)
>
> <...>-26132 [000]  6571.361156: bprint: task_h_load: cfs_rq->h_load:119, p->load_avg:26, cfs_rq->load_avg:14487
> <...>-26132 [000]  6571.361156: bprint: load_balance: detach_task migrate_load: task_h_load orginal: 0
>
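Those traced values already explain the zero: if I read task_h_load() in this
kernel right, it scales p->load_avg by cfs_rq->h_load and divides by the
cfs_rq load average (plus one), so 26 * 119 / (14487 + 1) truncates to 0 with
integer division. A tiny standalone sketch of that arithmetic, illustrative
only and not the kernel helper itself:

	#include <stdio.h>

	int main(void)
	{
		unsigned long h_load = 119;          /* cfs_rq->h_load from the trace */
		unsigned long p_load_avg = 26;       /* p->load_avg from the trace */
		unsigned long cfs_load_avg = 14487;  /* cfs_rq->load_avg from the trace */

		/* same shape as task_h_load(): p_load_avg * h_load / (cfs_rq load + 1) */
		printf("%lu\n", p_load_avg * h_load / (cfs_load_avg + 1));  /* prints 0 */
		return 0;
	}
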
> If the cgroup hierarchy has three levels (tried first), I can not find the task_h_load = 0 case.

Your system is not large enough to hit the problem.

> A group se's weight is related to the task_group's shares.
> A task's weight is just its own weight.
>
> if (!tg->parent) {
>         load = cpu_rq(cpu)->load.weight;
> } else {
>         load = tg->parent->cfs_rq[cpu]->h_load;
>         load *= tg->cfs_rq[cpu]->shares;
>         load /= tg->parent->cfs_rq[cpu]->load.weight + 
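
To illustrate the per-level propagation described above, here is a standalone
sketch with made-up shares and weights (not the kernel source): each level
scales the parent's h_load by this group's share of the parent's load, so a
couple of levels with small shares can shrink a task's effective load toward
zero.

	#include <stdio.h>

	int main(void)
	{
		unsigned long h_load = 1024;                      /* root level: ~load.weight */
		unsigned long shares[2] = { 1024, 1024 };         /* hypothetical group shares */
		unsigned long parent_weight[2] = { 8192, 4096 };  /* hypothetical parent load.weight */

		for (int level = 0; level < 2; level++) {
			/* same shape as the quoted code (the "+ 1" to avoid div-by-zero is assumed) */
			h_load = h_load * shares[level] / (parent_weight[level] + 1);
			printf("level %d: h_load = %lu\n", level + 1, h_load);
		}
		return 0;
	}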

Re: [LKP] [sched/fair] 6c8116c914: stress-ng.mmapfork.ops_per_sec -38.0% regression

2020-06-30 Thread Vincent Guittot
Hi Tao,

On Tue, 30 Jun 2020 at 11:41, Tao Zhou  wrote:
>
> Hi,
>
> On Tue, Jun 30, 2020 at 09:43:11AM +0200, Vincent Guittot wrote:
> > Hi Tao,
> >
> > On Monday 15 June 2020 at 16:14:01 (+0800), Xing Zhengjun wrote:
> > >
> > >
> > > On 6/15/2020 1:18 PM, Tao Zhou wrote:
> >
> > ...
> >
> > > I applied the patch based on v5.7; the regression still exists.
> >
> >
> > Could you try the patch below? This patch is not a real fix because it
> > impacts the performance of other benchmarks but it will at least narrow
> > down your problem.
> >
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 9f78eb76f6fb..a4d8614b1854 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8915,9 +8915,9 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
> >  * and consider staying local.
> >  */
> >
> > -   if ((sd->flags & SD_NUMA) &&
> > -   ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
> > -   return NULL;
> > +// if ((sd->flags & SD_NUMA) &&
> > +// ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
> > +// return NULL;
>
> Just narrowing it down to the fork (wakeup) path that may lead to the problem, /me thinks.

The perf regression seems to be fixed with this patch on my setup.
According to the statistics that I have on the use case, groups are
overloaded but the load is quite low, and this low load level hits this
NUMA-specific condition.
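
To illustrate with made-up numbers (mirroring the condition that the patch
above comments out): when the absolute avg_load values are small, the fixed
NUMA imbalance allowance easily pushes the idlest group past the local one,
so find_idlest_group() returns NULL and the fork stays on the local, already
busy, node. The values below are invented, only the shape of the check is
taken from the hunk above:

	#include <stdio.h>

	int main(void)
	{
		unsigned long local_avg_load  = 60;  /* hypothetical, low absolute load */
		unsigned long idlest_avg_load = 40;  /* hypothetical */
		unsigned long imbalance       = 25;  /* hypothetical NUMA imbalance allowance */

		/* mirrors: (idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load */
		if (idlest_avg_load + imbalance >= local_avg_load)
			printf("SD_NUMA case: return NULL, task stays local\n");
		else
			printf("pick the idlest group\n");
		return 0;
	}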

> Some days ago, I tried this patch:
>
>   https://lore.kernel.org/lkml/20200616164801.18644-1-peter.pu...@linaro.org/
>
> ---
>  kernel/sched/fair.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 02f323b85b6d..abcbdf80ee75 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8662,8 +8662,14 @@ static bool update_pick_idlest(struct sched_group *idlest,
>
> case group_has_spare:
> /* Select group with most idle CPUs */
> -   if (idlest_sgs->idle_cpus >= sgs->idle_cpus)
> +   if (idlest_sgs->idle_cpus > sgs->idle_cpus)
> return false;
> +
> +   /* Select group with lowest group_util */
> +   if (idlest_sgs->idle_cpus == sgs->idle_cpus &&
> +   idlest_sgs->group_util <= sgs->group_util)
> +   return false;
> +
> break;
> }
>
> --
>
> This patch is related to wake up slow path and group type is full_busy.

I tried it but haven't seen any impact on the mmapfork test results.

> What I tried and saw improved:
>
> $> sysbench threads --threads=16 run
>
> The total number of events (higher is better):
>
> v5.8-rc1  : 34020   34494   33561
> v5.8-rc1+patch: 35466   36184   36260
>
> $> perf bench -f simple sched pipe -l 400
>
> v5.8-rc1  : 16.203   16.238   16.150
> v5.8-rc1+patch: 15.757   15.930   15.819
>
> I also saw some regressions in other workloads (don't know much about them).
> So, I suggest testing this patch with this stress-ng.mmapfork case. I didn't do
> this yet.
>
> Another patch I want to mention here is this (merged to v5.7 now):
>
>   commit 68f7b5cc83 ("sched/cfs: change initial value of runnable_avg")
>
> And this regression happened based on v5.7. This patch is related to the fork
> wakeup path for the overloaded group type. Absolutely worth trying then.
>
> Finally, I have given a patch that seems unrelated to the fork wakeup path,
> but I also tried it on some benchmarks and did not see any improvement there.
> I give this changed patch here (I realized that the full_busy type checks idle
> CPUs first, but I'm not sure). Maybe no need to try it.
>
> Index: core.bak/kernel/sched/fair.c
> ===
> --- core.bak.orig/kernel/sched/fair.c
> +++ core.bak/kernel/sched/fair.c
> @@ -9226,17 +9226,20 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
> goto out_balanced;
>
> if (busiest->group_weight > 1 &&
> -   local->idle_cpus <= (busiest->idle_cpus + 1))
> -   /*
> -* If the busiest group is not overloaded
> -* and there is no imbalance between this and busiest
> -* group wrt idle CPUs, it is balanced. The imbalance
> -* becomes significant if the diff is greater than 1
> -* otherwise we might end up to just move the imbalance
> -* on another group. Of course this applies only if
> -* there is more than 1 CPU per group.
> -*/
> -   goto out_balanced;
> +   local->idle_cpus <= (busiest->idle_cpus + 1)) {
> +   if (local->group_type == group_has_spare) {
> + 

Re: [LKP] [sched/fair] 6c8116c914: stress-ng.mmapfork.ops_per_sec -38.0% regression

2020-06-30 Thread Vincent Guittot
Hi Tao,

On Monday 15 June 2020 at 16:14:01 (+0800), Xing Zhengjun wrote:
> 
> 
> On 6/15/2020 1:18 PM, Tao Zhou wrote:

...

> I applied the patch based on v5.7; the regression still exists.


Could you try the patch below? This patch is not a real fix because it
impacts the performance of other benchmarks, but it will at least narrow
down your problem.


diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9f78eb76f6fb..a4d8614b1854 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8915,9 +8915,9 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
 * and consider staying local.
 */

-   if ((sd->flags & SD_NUMA) &&
-   ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
-   return NULL;
+// if ((sd->flags & SD_NUMA) &&
+// ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
+// return NULL;

/*
 * If the local group is less loaded than the selected

--


> =
> tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/disk/sc_pid_max/testtime/class/cpufreq_governor/ucode:
> 
> lkp-bdw-ep6/stress-ng/debian-x86_64-20191114.cgz/x86_64-rhel-7.6/gcc-7/100%/1HDD/4194304/1s/scheduler/performance/0xb38
> 
> commit:
>   e94f80f6c49020008e6fa0f3d4b806b8595d17d8
>   6c8116c914b65be5e4d6f66d69c8142eb0648c22
>   v5.7
>   c7e6d37f60da32f808140b1b7dabcc3cde73c4cc  (Tao's patch)
> 
> e94f80f6c4902000 6c8116c914b65be5e4d6f66d69c                        v5.7 c7e6d37f60da32f808140b1b7da
> ---------------- --------------------------- --------------------------- ---------------------------
>          %stddev      %change %stddev      %change %stddev      %change %stddev
>               \            |       \            |       \            |       \
>     819250 ±  5%     -10.1%     736616 ±  8%     +41.2%    1156877 ±  3%     +43.6%    1176246 ±  5%  stress-ng.futex.ops
>     818985 ±  5%     -10.1%     736460 ±  8%     +41.2%    1156215 ±  3%     +43.6%    1176055 ±  5%  stress-ng.futex.ops_per_sec
>       1551 ±  3%      -3.4%       1498 ±  5%      -4.6%       1480 ±  5%     -14.3%       1329 ± 11%  stress-ng.inotify.ops
>       1547 ±  3%      -3.5%       1492 ±  5%      -4.8%       1472 ±  5%     -14.3%       1326 ± 11%  stress-ng.inotify.ops_per_sec
>      11292 ±  8%      -2.8%      10974 ±  8%      -9.4%      10225 ±  6%     -10.1%      10146 ±  6%  stress-ng.kill.ops
>      11317 ±  8%      -2.6%      11023 ±  8%      -9.1%      10285 ±  5%     -10.3%      10154 ±  6%  stress-ng.kill.ops_per_sec
>      28.20 ±  4%     -35.4%      18.22           -33.4%      18.77           -27.7%      20.40 ±  9%  stress-ng.mmapfork.ops_per_sec
>    2999012 ± 21%     -10.1%    2696954 ± 22%     -88.5%         37 ± 11%     -87.8%     364932        stress-ng.tee.ops_per_sec
>       7882 ±  3%      -5.4%       7458 ±  4%      -2.0%       7724 ±  3%      -2.2%       7709 ±  4%  stress-ng.vforkmany.ops
>       7804 ±  3%      -5.2%       7400 ±  4%      -2.0%       7647 ±  3%      -2.1%       7636 ±  4%  stress-ng.vforkmany.ops_per_sec
>   46745421 ±  3%      -8.1%   42938569 ±  3%      -5.2%   44312072 ±  4%      -2.3%   45648193        stress-ng.yield.ops
>   46734472 ±  3%      -8.1%   42926316 ±  3%      -5.2%   44290338 ±  4%      -2.4%   45627571        stress-ng.yield.ops_per_sec
> 
> 
> 

...

> -- 
> Zhengjun Xing


Re: [LKP] [sched/fair] 6c8116c914: stress-ng.mmapfork.ops_per_sec -38.0% regression

2020-06-15 Thread Xing Zhengjun




On 6/15/2020 1:18 PM, Tao Zhou wrote:

Hi,

On Fri, Jun 12, 2020 at 03:59:31PM +0800, Xing Zhengjun wrote:

Hi,

I tested the regression; it still exists in v5.7. If you have any fix
for it, please send it to me and I can verify it. Thanks.


When the busiest group is group_fully_busy and the local group is <= group_fully_busy,
the metric used is:

   local group        busiest group      metric used
   group_fully_busy   group_fully_busy   avg_load
   group_has_spare    group_fully_busy   idle cpu / task num

In find_busiest_group(), under this condition:

 'if (busiest->group_type != group_overloaded) {'

the busiest type is group_fully_busy and the local type is <= group_fully_busy.
In this branch it checks idle CPUs and task numbers and can go to out_balanced.
That is to say, group_fully_busy is not treated differently from group_has_spare
(whose case is handled in calculate_imbalance()).

When the local group and the busiest group are both group_fully_busy, avg_load
needs to be used as the metric (in calculate_imbalance()). So here is the change below:


diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cbcb2f71599b..0afbea39dd5a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9219,24 +9219,26 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
  */
 goto out_balanced;
  
-   if (busiest->group_weight > 1 &&
-   local->idle_cpus <= (busiest->idle_cpus + 1))
-   /*
-* If the busiest group is not overloaded
-* and there is no imbalance between this and busiest
-* group wrt idle CPUs, it is balanced. The imbalance
-* becomes significant if the diff is greater than 1
-* otherwise we might end up to just move the imbalance
-* on another group. Of course this applies only if
-* there is more than 1 CPU per group.
-*/
-   goto out_balanced;
+   if (local->group_type == group_has_spare) {
+   if (busiest->group_weight > 1 &&
+   local->idle_cpus <= (busiest->idle_cpus + 1))
+   /*
+* If the busiest group is not overloaded
+* and there is no imbalance between this and busiest
+* group wrt idle CPUs, it is balanced. The imbalance
+* becomes significant if the diff is greater than 1
+* otherwise we might end up to just move the imbalance
+* on another group. Of course this applies only if
+* there is more than 1 CPU per group.
+*/
+   goto out_balanced;
  
-   if (busiest->sum_h_nr_running == 1)
-   /*
-* busiest doesn't have any tasks waiting to run
-*/
-   goto out_balanced;
+   if (busiest->sum_h_nr_running == 1)
+   /*
+* busiest doesn't have any tasks waiting to run
+*/
+   goto out_balanced;
+   }
 }
  
  force_balance:


In fact, I don't know whether this change can help, or whether it is even correct.
Not tested, not compiled. If it is wrong, just ignore it.

Thanks


I applied the patch based on v5.7; the regression still exists.
=
tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/disk/sc_pid_max/testtime/class/cpufreq_governor/ucode:

lkp-bdw-ep6/stress-ng/debian-x86_64-20191114.cgz/x86_64-rhel-7.6/gcc-7/100%/1HDD/4194304/1s/scheduler/performance/0xb38

commit:
  e94f80f6c49020008e6fa0f3d4b806b8595d17d8
  6c8116c914b65be5e4d6f66d69c8142eb0648c22
  v5.7
  c7e6d37f60da32f808140b1b7dabcc3cde73c4cc  (Tao's patch)

e94f80f6c4902000 6c8116c914b65be5e4d6f66d69c                        v5.7 c7e6d37f60da32f808140b1b7da
---------------- --------------------------- --------------------------- ---------------------------
         %stddev      %change %stddev      %change %stddev      %change %stddev
              \            |       \            |       \            |       \
    819250 ±  5%     -10.1%     736616 ±  8%     +41.2%    1156877 ±  3%     +43.6%    1176246 ±  5%  stress-ng.futex.ops
    818985 ±  5%     -10.1%     736460 ±  8%     +41.2%    1156215 ±  3%     +43.6%    1176055 ±  5%  stress-ng.futex.ops_per_sec
      1551 ±  3%      -3.4%       1498 

Re: [LKP] [sched/fair] 6c8116c914: stress-ng.mmapfork.ops_per_sec -38.0% regression

2020-06-12 Thread Xing Zhengjun

Hi,

  I tested the regression; it still exists in v5.7. If you have any fix
for it, please send it to me and I can verify it. Thanks.


=
tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/disk/sc_pid_max/testtime/class/cpufreq_governor/ucode:

lkp-bdw-ep6/stress-ng/debian-x86_64-20191114.cgz/x86_64-rhel-7.6/gcc-7/100%/1HDD/4194304/1s/scheduler/performance/0xb38

commit:
  e94f80f6c49020008e6fa0f3d4b806b8595d17d8
  6c8116c914b65be5e4d6f66d69c8142eb0648c22
  v5.7-rc3
  v5.7

e94f80f6c4902000 6c8116c914b65be5e4d6f66d69c                    v5.7-rc3                        v5.7
---------------- --------------------------- --------------------------- ---------------------------
         %stddev      %change %stddev      %change %stddev      %change %stddev
              \            |       \            |       \            |       \
     21398 ±  7%      +6.5%      22781 ±  2%     -14.5%      18287 ±  4%      -5.5%      20231 ± 14%  stress-ng.clone.ops
    819250 ±  5%     -10.1%     736616 ±  8%     +34.2%    1099410 ±  5%     +41.2%    1156877 ±  3%  stress-ng.futex.ops
    818985 ±  5%     -10.1%     736460 ±  8%     +34.2%    1099487 ±  5%     +41.2%    1156215 ±  3%  stress-ng.futex.ops_per_sec
      1551 ±  3%      -3.4%       1498 ±  5%      -9.5%       1404 ±  2%      -4.6%       1480 ±  5%  stress-ng.inotify.ops
      1547 ±  3%      -3.5%       1492 ±  5%      -9.5%       1400 ±  2%      -4.8%       1472 ±  5%  stress-ng.inotify.ops_per_sec
     11292 ±  8%      -2.8%      10974 ±  8%      +1.9%      11505 ± 13%      -9.4%      10225 ±  6%  stress-ng.kill.ops
     28.20 ±  4%     -35.4%      18.22           -33.5%      18.75           -33.4%      18.77        stress-ng.mmapfork.ops_per_sec
   1932318           +1.5%    1961688 ±  2%     -22.8%    1492231 ±  2%      +4.0%    2010509 ±  3%  stress-ng.softlockup.ops
   1931679 ±  2%      +1.5%    1961143 ±  2%     -22.8%    1491939 ±  2%      +4.0%    2009585 ±  3%  stress-ng.softlockup.ops_per_sec
  18607406 ±  6%     -12.9%   16210450 ± 21%     -12.7%   16238693 ± 14%      -8.0%   17120880 ± 13%  stress-ng.switch.ops
  18604406 ±  6%     -12.9%   16208270 ± 21%     -12.7%   16237956 ± 14%      -8.0%   17115273 ± 13%  stress-ng.switch.ops_per_sec
   2999012 ± 21%     -10.1%    2696954 ± 22%      -9.1%    2725653 ± 21%     -88.5%         37 ± 11%  stress-ng.tee.ops_per_sec
      7882 ±  3%      -5.4%       7458 ±  4%      -4.0%       7566 ±  4%      -2.0%       7724 ±  3%  stress-ng.vforkmany.ops
      7804 ±  3%      -5.2%       7400 ±  4%      -3.8%       7504 ±  4%      -2.0%       7647 ±  3%  stress-ng.vforkmany.ops_per_sec
  46745421 ±  3%      -8.1%   42938569 ±  3%      -7.8%   43078233 ±  3%      -5.2%   44312072 ±  4%  stress-ng.yield.ops
  46734472 ±  3%      -8.1%   42926316 ±  3%      -7.8%   43067447 ±  3%      -5.2%   44290338 ±  4%  stress-ng.yield.ops_per_sec



On 4/27/2020 8:46 PM, Vincent Guittot wrote:

On Mon, 27 Apr 2020 at 13:35, Hillf Danton  wrote:



On Mon, 27 Apr 2020 11:03:58 +0200 Vincent Guittot wrote: 

On Sun, 26 Apr 2020 at 14:42, Hillf Danton wrote:


On 4/21/2020 8:47 AM, kernel test robot wrote:


Greeting,

FYI, we noticed a 56.4% improvement of stress-ng.fifo.ops_per_sec due to commit:


commit: 6c8116c914b65be5e4d6f66d69c8142eb0648c22 ("sched/fair: Fix condition of avg_load calculation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: stress-ng
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with following parameters:

 nr_threads: 100%
 disk: 1HDD
 testtime: 1s
 class: scheduler
 cpufreq_governor: performance
 ucode: 0xb38
 sc_pid_max: 4194304



We need to handle group_fully_busy in a different way from
group_overloaded, as pushing tasks does not help the load balance
in the former case.


Have you tested this patch for the use case above? Do you have figures?


No, I am looking for a box with 88 threads. Likely to get access to one in
as early as three weeks.


--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8744,30 +8744,20 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)

 switch (local_sgs.group_type) {
 case group_overloaded:
-   case group_fully_busy:
-   /*
-* When comparing groups across NUMA domains, it's possible for
-* the local domain to be very lightly loaded relative to the
-* remote domains but "imbalance" skews the comparison making
-* remote CPUs look much more favourable. When considering
-* cross-domain, add imbalance to the load on the remote node
-* and consider staying local.
-*/
-
-   if ((sd->flags & SD_NUMA) &&
-   ((idlest_sgs.avg_load + imbalance) >= local_sgs.avg_load))
+