Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
Resending - previously I forgot to send the message in plain text mode, sorry.

Hi Mel,

Thanks a lot for looking into this! Our results are mostly in line with what you see. We observe big gains (20-50%) when the system is loaded to 1/3 of the maximum capacity, and mixed results at full load - some workloads benefit from the patch at full load, others do not, but the performance changes at full load are mostly within the noise of the results (+/-5%). Overall, we think this patch is helpful.

Jirka

On Mon, Sep 21, 2020 at 1:02 PM Mel Gorman wrote:
>
> On Thu, Sep 10, 2020 at 11:50:04PM +0200, Jirka Hladky wrote:
> > 1) Thread stability has improved a lot. We see far fewer thread
> > migrations. Check for example this heatmap based on the mpstat results
> > collected while running the sp.C test from the NAS benchmark. sp.C was run
> > with 16 threads on a dual-socket AMD 7351 server with 8 NUMA nodes:
> >
> > 5.9 vanilla
> > https://drive.google.com/file/d/10rojhTSQUSu-1aGQi-srr99SmVQ3Llgo/view?usp=sharing
> > 5.9 with the patch
> > https://drive.google.com/file/d/1eZdTBWvWMNvdvXqitwAlcKZ7gb-ySQUl/view?usp=sharing
> >
> > The heatmaps are generated from the mpstat output (there are 5
> > different runs in one picture). We collect CPU utilization every 5
> > seconds. A lighter color corresponds to lower CPU utilization and
> > means that the thread may have run on different CPUs during those
> > 5 seconds. Solid straight lines, on the other hand, mean that the thread
> > was running on the same CPU the whole time. The difference is striking.
> >
> > We see significantly fewer thread migrations across many different
> > tests - NAS, SPECjbb2005, SPECjvm2008.
>
> For all three, I see that the point where it's better or worse depends on
> overall activity. I looked at heatmaps for a variety of workloads and
> visually the bigger differences tend to be when utilisation is relatively
> low (e.g. one third of CPUs active).
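The heatmap methodology described above (per-CPU utilization sampled every 5 seconds, with solid rows indicating a thread pinned to one CPU and scattered cells indicating migrations) can be sketched roughly as follows. This is an illustrative stand-in rendering an ASCII heatmap, not the actual plotting script used to produce the linked images:

```python
# Sketch of the heatmap idea: one dict of {cpu_id: utilization 0..100}
# per 5-second mpstat sample; render one row per CPU over time.
# Solid dark rows = a thread stayed on one CPU; alternating light/dark
# cells = the thread migrated between CPUs.

def utilization_heatmap(samples):
    """samples: list of {cpu_id: utilization}, one dict per interval.
    Returns an ASCII heatmap, one line per CPU."""
    shades = " .:-=+*#%@"  # light -> dark, 10 buckets of 10% each
    cpus = sorted({cpu for s in samples for cpu in s})
    lines = []
    for cpu in cpus:
        row = ""
        for s in samples:
            util = s.get(cpu, 0.0)
            row += shades[min(int(util / 10), len(shades) - 1)]
        lines.append(f"cpu{cpu:3d} |{row}|")
    return "\n".join(lines)

# A thread stable on CPU 0 vs. a thread bouncing between CPUs 1 and 2:
samples = [
    {0: 100, 1: 90, 2: 0},
    {0: 100, 1: 0, 2: 90},
    {0: 100, 1: 90, 2: 0},
    {0: 100, 1: 0, 2: 90},
]
print(utilization_heatmap(samples))
```

The stable thread prints as a solid row of `@`, while the migrating thread alternates dark and blank cells, which is the visual difference described in the heatmap links.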
> > 2) We also see a performance improvement in terms of runtime, especially
> > in low-load scenarios (the number of threads being roughly equal to 2*
> > the number of NUMA nodes). It fixes the performance drop which we have
> > seen since the 5.7 kernel, discussed for example here:
> > https://lore.kernel.org/lkml/CAE4VaGB7+sR1nf3Ux8W=hgn46gnxryr0ubwju0oynk7h00y...@mail.gmail.com/
>
> This would make some sense given the original intent about allowing
> imbalances that was later revised significantly. It depends on when
> memory throughput is more important, so the impact varies with the level
> of utilisation. For example, on a 2-socket machine running specjbb 2005
> I see
>
> specjbb
>                          5.9.0-rc4              5.9.0-rc4
>                            vanilla        dstbalance-v1r1
> Hmean     tput-1      46425.00 (   0.00%)    43394.00 *  -6.53%*
> Hmean     tput-2      98416.00 (   0.00%)    96031.00 *  -2.42%*
> Hmean     tput-3     150184.00 (   0.00%)   148783.00 *  -0.93%*
> Hmean     tput-4     200683.00 (   0.00%)   197906.00 *  -1.38%*
> Hmean     tput-5     236305.00 (   0.00%)   245549.00 *   3.91%*
> Hmean     tput-6     281559.00 (   0.00%)   285692.00 *   1.47%*
> Hmean     tput-7     338558.00 (   0.00%)   334467.00 *  -1.21%*
> Hmean     tput-8     340745.00 (   0.00%)   372501.00 *   9.32%*
> Hmean     tput-9     424343.00 (   0.00%)   413006.00 *  -2.67%*
> Hmean     tput-10    421854.00 (   0.00%)   434261.00 *   2.94%*
> Hmean     tput-11    493256.00 (   0.00%)   485330.00 *  -1.61%*
> Hmean     tput-12    549573.00 (   0.00%)   529959.00 *  -3.57%*
> Hmean     tput-13    593183.00 (   0.00%)   555010.00 *  -6.44%*
> Hmean     tput-14    588252.00 (   0.00%)   599166.00 *   1.86%*
> Hmean     tput-15    623065.00 (   0.00%)   642713.00 *   3.15%*
> Hmean     tput-16    703924.00 (   0.00%)   660758.00 *  -6.13%*
> Hmean     tput-17    666023.00 (   0.00%)   697675.00 *   4.75%*
> Hmean     tput-18    761502.00 (   0.00%)   758360.00 *  -0.41%*
> Hmean     tput-19    796088.00 (   0.00%)   798368.00 *   0.29%*
> Hmean     tput-20    733564.00 (   0.00%)   823086.00 *  12.20%*
> Hmean     tput-21    840980.00 (   0.00%)   856711.00 *   1.87%*
> Hmean     tput-22    804285.00 (   0.00%)   872238.00 *   8.45%*
> Hmean     tput-23    795208.00 (   0.00%)   889374.00 *  11.84%*
> Hmean     tput-24    848619.00 (   0.00%)   966783.00 *  13.92%*
> Hmean     tput-25    750848.00 (   0.00%)   903790.00 *  20.37%*
> Hmean     tput-26    780523.00 (   0.00%)   962254.00 *  23.28%*
> Hmean     tput-27   1042245.00 (   0.00%)   991544.00 *  -4.86%*
> Hmean     tput-28   1090580.00 (   0.00%)  1035926.00 *  -5.01%*
> Hmean     tput-29    999483.00 (   0.00%)  1082948.00 *   8.35%*
> Hmean     tput-30   1098663.00 (   0.00%)  1113427.00 *   1.34%*
> Hmean     tput-31   1125671.00 (   0.00%)  1134175.00 *   0.76%*
> Hmean     tput-32    968167.00 (   0.00%)  1250286.00 *  29.14%*
> Hmean     tput-33   1077676.00 (   0.00%)  1060893.00 *  -1.56%*
> Hmean     tput-34   1090538.00 (   0.00%)  1090933.00 *   0.04%*
Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
On Thu, Sep 10, 2020 at 11:50:04PM +0200, Jirka Hladky wrote:
> 1) Thread stability has improved a lot. We see far fewer thread
> migrations. Check for example this heatmap based on the mpstat results
> collected while running the sp.C test from the NAS benchmark. sp.C was run
> with 16 threads on a dual-socket AMD 7351 server with 8 NUMA nodes:
>
> 5.9 vanilla
> https://drive.google.com/file/d/10rojhTSQUSu-1aGQi-srr99SmVQ3Llgo/view?usp=sharing
> 5.9 with the patch
> https://drive.google.com/file/d/1eZdTBWvWMNvdvXqitwAlcKZ7gb-ySQUl/view?usp=sharing
>
> The heatmaps are generated from the mpstat output (there are 5
> different runs in one picture). We collect CPU utilization every 5
> seconds. A lighter color corresponds to lower CPU utilization and
> means that the thread may have run on different CPUs during those
> 5 seconds. Solid straight lines, on the other hand, mean that the thread
> was running on the same CPU the whole time. The difference is striking.
>
> We see significantly fewer thread migrations across many different
> tests - NAS, SPECjbb2005, SPECjvm2008.

For all three, I see that the point where it's better or worse depends on
overall activity. I looked at heatmaps for a variety of workloads and
visually the bigger differences tend to be when utilisation is relatively
low (e.g. one third of CPUs active).

> 2) We also see a performance improvement in terms of runtime, especially
> in low-load scenarios (the number of threads being roughly equal to 2*
> the number of NUMA nodes). It fixes the performance drop which we have
> seen since the 5.7 kernel, discussed for example here:
> https://lore.kernel.org/lkml/CAE4VaGB7+sR1nf3Ux8W=hgn46gnxryr0ubwju0oynk7h00y...@mail.gmail.com/

This would make some sense given the original intent about allowing
imbalances that was later revised significantly. It depends on when
memory throughput is more important, so the impact varies with the level
of utilisation.
For example, on a 2-socket machine running specjbb 2005 I see

specjbb
                         5.9.0-rc4              5.9.0-rc4
                           vanilla        dstbalance-v1r1
Hmean     tput-1      46425.00 (   0.00%)    43394.00 *  -6.53%*
Hmean     tput-2      98416.00 (   0.00%)    96031.00 *  -2.42%*
Hmean     tput-3     150184.00 (   0.00%)   148783.00 *  -0.93%*
Hmean     tput-4     200683.00 (   0.00%)   197906.00 *  -1.38%*
Hmean     tput-5     236305.00 (   0.00%)   245549.00 *   3.91%*
Hmean     tput-6     281559.00 (   0.00%)   285692.00 *   1.47%*
Hmean     tput-7     338558.00 (   0.00%)   334467.00 *  -1.21%*
Hmean     tput-8     340745.00 (   0.00%)   372501.00 *   9.32%*
Hmean     tput-9     424343.00 (   0.00%)   413006.00 *  -2.67%*
Hmean     tput-10    421854.00 (   0.00%)   434261.00 *   2.94%*
Hmean     tput-11    493256.00 (   0.00%)   485330.00 *  -1.61%*
Hmean     tput-12    549573.00 (   0.00%)   529959.00 *  -3.57%*
Hmean     tput-13    593183.00 (   0.00%)   555010.00 *  -6.44%*
Hmean     tput-14    588252.00 (   0.00%)   599166.00 *   1.86%*
Hmean     tput-15    623065.00 (   0.00%)   642713.00 *   3.15%*
Hmean     tput-16    703924.00 (   0.00%)   660758.00 *  -6.13%*
Hmean     tput-17    666023.00 (   0.00%)   697675.00 *   4.75%*
Hmean     tput-18    761502.00 (   0.00%)   758360.00 *  -0.41%*
Hmean     tput-19    796088.00 (   0.00%)   798368.00 *   0.29%*
Hmean     tput-20    733564.00 (   0.00%)   823086.00 *  12.20%*
Hmean     tput-21    840980.00 (   0.00%)   856711.00 *   1.87%*
Hmean     tput-22    804285.00 (   0.00%)   872238.00 *   8.45%*
Hmean     tput-23    795208.00 (   0.00%)   889374.00 *  11.84%*
Hmean     tput-24    848619.00 (   0.00%)   966783.00 *  13.92%*
Hmean     tput-25    750848.00 (   0.00%)   903790.00 *  20.37%*
Hmean     tput-26    780523.00 (   0.00%)   962254.00 *  23.28%*
Hmean     tput-27   1042245.00 (   0.00%)   991544.00 *  -4.86%*
Hmean     tput-28   1090580.00 (   0.00%)  1035926.00 *  -5.01%*
Hmean     tput-29    999483.00 (   0.00%)  1082948.00 *   8.35%*
Hmean     tput-30   1098663.00 (   0.00%)  1113427.00 *   1.34%*
Hmean     tput-31   1125671.00 (   0.00%)  1134175.00 *   0.76%*
Hmean     tput-32    968167.00 (   0.00%)  1250286.00 *  29.14%*
Hmean     tput-33   1077676.00 (   0.00%)  1060893.00 *  -1.56%*
Hmean     tput-34   1090538.00 (   0.00%)  1090933.00 *   0.04%*
Hmean     tput-35    967058.00 (   0.00%)  1107421.00 *  14.51%*
Hmean     tput-36   1051745.00 (   0.00%)  1210663.00 *  15.11%*
Hmean     tput-37   1019465.00 (   0.00%)  1351446.00 *  32.56%*
Hmean     tput-38   1083102.00 (   0.00%)  1064541.00 *  -1.71%*
Hmean     tput-39   1232990.00 (   0.00%)  1303623.00 *   5.73%*
Hmean     tput-40   1175542.00 (   0.00%)  1340943.00 *  14.07%*
Hmean     tput-41   1127826.00 (   0.00%)  1339492.00 *  18.77%*
Hmean     tput-42   1198313.00 (   0.00%)  1411023.00 *  17.75%*
Hmean     tput-43   1163733.00 (   0.00%)  1228253.00 *   5.54%*
Hmean     tput-44   1305562.00 (   0.00%)  1357886.00 *   4.01%*
Hmean     tput-45   1326752.00 (   0.00%)
Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
Hi Hillf and Mel,

thanks a lot for bringing this to my attention. We have tested the proposed patch and we are getting excellent results so far!

1) Thread stability has improved a lot. We see far fewer thread migrations. Check for example this heatmap based on the mpstat results collected while running the sp.C test from the NAS benchmark. sp.C was run with 16 threads on a dual-socket AMD 7351 server with 8 NUMA nodes:

5.9 vanilla
https://drive.google.com/file/d/10rojhTSQUSu-1aGQi-srr99SmVQ3Llgo/view?usp=sharing
5.9 with the patch
https://drive.google.com/file/d/1eZdTBWvWMNvdvXqitwAlcKZ7gb-ySQUl/view?usp=sharing

The heatmaps are generated from the mpstat output (there are 5 different runs in one picture). We collect CPU utilization every 5 seconds. A lighter color corresponds to lower CPU utilization and means that the thread may have run on different CPUs during those 5 seconds. Solid straight lines, on the other hand, mean that the thread was running on the same CPU the whole time. The difference is striking.

We see significantly fewer thread migrations across many different tests - NAS, SPECjbb2005, SPECjvm2008.

2) We also see a performance improvement in terms of runtime, especially in low-load scenarios (the number of threads being roughly equal to 2* the number of NUMA nodes). It fixes the performance drop which we have seen since the 5.7 kernel, discussed for example here:
https://lore.kernel.org/lkml/CAE4VaGB7+sR1nf3Ux8W=hgn46gnxryr0ubwju0oynk7h00y...@mail.gmail.com/

The biggest improvements are for the NAS and SPECjvm2008 benchmarks (typically between 20-50%). SPECjbb2005 also shows improvements, around 10%. This is again on NUMA servers at low utilization. You can find snapshots of some results here:
https://drive.google.com/drive/folders/1k3Gb-vlI7UjPZZcLkoL2W2VB_zqxIJ3_?usp=sharing

@Mel - could you please test the proposed patch?
I know you have good coverage for the inter-process communication benchmarks, which may show different behavior than the benchmarks we use. I hope that fewer thread migrations might show positive effects for these tests as well. Please give it a try.

Thanks a lot!
Jirka

On Tue, Sep 8, 2020 at 3:07 AM Hillf Danton wrote:
>
>
> On Mon, 7 Sep 2020 18:01:06 +0530 Srikar Dronamraju wrote:
> >
> > On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> > > > > Something is wrong. In find_busiest_group(), we are checking if src has
> > > > > higher load, however, in task_numa_find_cpu(), we are checking if dst
> > > > > will have higher load after balancing. It seems it is not sensible to
> > > > > check src.
> > > > > It may cause a wrong imbalance value. For example, if
> > > > > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> > > > > src_running = env->src_stats.nr_running - 1 results in 1,
> > > > > the current code computes imbalance as 0 since src_running is smaller
> > > > > than 2.
> > > > > This is inconsistent with the load balancer.
> >
> > Hi Srikar and Barry
> >
> > I have observed behaviour similar to what Barry Song has documented with a
> > simple ebizzy with fewer threads on a 2 node system
>
> Thanks for your testing.
>
> > ebizzy -t 6 -S 100
> >
> > We see a couple of ebizzy threads moving back and forth between the 2 nodes
> > because of the NUMA balancer and the load balancer trying to do the exact
> > opposite.
> >
> > However, with Barry's patch, a couple of tests regress heavily (any NUMA
> > workload that has shared NUMA faults). For example:
> > perf bench numa mem --no-data_rand_walk -p 1 -t 6 -G 0 -P 3072 -T 0 -l 50 -c
> >
> > I also don't understand the rationale behind checking for dst_running in the
> > NUMA balancer path. This almost means no NUMA balancing in lightly loaded
> > scenarios.
> >
> > So I agree with Mel that we should probably test more scenarios before
> > we accept this patch.
>
> Take a look at Jirka's work [1] please if you have any plan to do some
> more tests.
>
> [1] https://lore.kernel.org/lkml/CAE4VaGB7+sR1nf3Ux8W=hgn46gnxryr0ubwju0oynk7h00y...@mail.gmail.com/
>
> > > > It checks the conditions if the move was to happen. Have you evaluated
> > > > this for a NUMA balancing load and confirmed it a) balances properly and
> > > > b) does not increase the scan rate trying to "fix" the problem?
> > >
> > > I think the original code was trying to check if the NUMA migration
> > > would lead to a new imbalance in the load balancer. In case src is A,
> > > dst is B, and both of them have nr_running as 2: A moves one task to B,
> > > then A will have 1 and B will have 3. In the load balancer, A will try
> > > to pull a task from B since B's nr_running is larger than min_imbalance.
> > > But the code is saying imbalance=0 by finding A's nr_running is smaller
> > > than min_imbalance.
> > >
> > > Will share more test data if you need.
> > >
> > > > --
> > > > Mel Gorman
> > > > SUSE Labs
> > >
> > > Thanks
> > > Barry
> >
> > --
> > Thanks and Regards
> > Srikar Dronamraju
Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
> >
> > On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> > > Something is wrong. In find_busiest_group(), we are checking if src has
> > > higher load, however, in task_numa_find_cpu(), we are checking if dst
> > > will have higher load after balancing. It seems it is not sensible to
> > > check src.
> > > It may cause a wrong imbalance value. For example, if
> > > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> > > src_running = env->src_stats.nr_running - 1 results in 1,
> > > the current code computes imbalance as 0 since src_running is smaller
> > > than 2.
> > > This is inconsistent with the load balancer.

I have observed behaviour similar to what Barry Song has documented with a
simple ebizzy with fewer threads on a 2 node system:

ebizzy -t 6 -S 100

We see a couple of ebizzy threads moving back and forth between the 2 nodes
because of the NUMA balancer and the load balancer trying to do the exact
opposite.

However, with Barry's patch, a couple of tests regress heavily (any NUMA
workload that has shared NUMA faults). For example:

perf bench numa mem --no-data_rand_walk -p 1 -t 6 -G 0 -P 3072 -T 0 -l 50 -c

I also don't understand the rationale behind checking for dst_running in the
NUMA balancer path. This almost means no NUMA balancing in lightly loaded
scenarios.

So I agree with Mel that we should probably test more scenarios before
we accept this patch.

> >
> > It checks the conditions if the move was to happen. Have you evaluated
> > this for a NUMA balancing load and confirmed it a) balances properly and
> > b) does not increase the scan rate trying to "fix" the problem?
>
> I think the original code was trying to check if the NUMA migration
> would lead to a new imbalance in the load balancer. In case src is A,
> dst is B, and both of them have nr_running as 2: A moves one task to B,
> then A will have 1 and B will have 3. In the load balancer, A will try
> to pull a task from B since B's nr_running is larger than min_imbalance.
> But the code is saying imbalance=0 by finding A's nr_running is smaller
> than min_imbalance.
>
> Will share more test data if you need.
>
> > --
> > Mel Gorman
> > SUSE Labs
>
> Thanks
> Barry

--
Thanks and Regards
Srikar Dronamraju
Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
On Mon, Sep 07, 2020 at 09:44:03AM +, Song Bao Hua (Barry Song) wrote:
>
> > -----Original Message-----
> > From: Mel Gorman [mailto:mgor...@suse.de]
> > Sent: Monday, September 7, 2020 9:27 PM
> > To: Song Bao Hua (Barry Song)
> > Cc: mi...@redhat.com; pet...@infradead.org; juri.le...@redhat.com;
> > vincent.guit...@linaro.org; dietmar.eggem...@arm.com;
> > bseg...@google.com; linux-kernel@vger.kernel.org; Linuxarm
> > ; Mel Gorman ;
> > Peter Zijlstra ; Valentin Schneider
> > ; Phil Auld ; Hillf Danton
> > ; Ingo Molnar
> > Subject: Re: [PATCH] sched/fair: use dst group while checking imbalance for
> > NUMA balancer
> >
> > On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> > > Something is wrong. In find_busiest_group(), we are checking if src has
> > > higher load, however, in task_numa_find_cpu(), we are checking if dst
> > > will have higher load after balancing. It seems it is not sensible to
> > > check src.
> > > It may cause a wrong imbalance value. For example, if
> > > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> > > src_running = env->src_stats.nr_running - 1 results in 1,
> > > the current code computes imbalance as 0 since src_running is smaller
> > > than 2.
> > > This is inconsistent with the load balancer.
> > >
> >
> > It checks the conditions if the move was to happen. Have you evaluated
> > this for a NUMA balancing load and confirmed it a) balances properly and
> > b) does not increase the scan rate trying to "fix" the problem?
>
> I think the original code was trying to check if the NUMA migration
> would lead to a new imbalance in the load balancer. In case src is A,
> dst is B, and both of them have nr_running as 2: A moves one task to B,
> then A will have 1 and B will have 3. In the load balancer, A will try
> to pull a task from B since B's nr_running is larger than min_imbalance.
> But the code is saying imbalance=0 by finding A's nr_running is smaller
> than min_imbalance.
>
> Will share more test data if you need.
>

Include the test description, data and details of the system you used to
evaluate the patch. I ask because the load/NUMA reconciliation took a long
time to cover all the corner cases and it's very easy to reintroduce major
regressions. At least one of those corner cases was trying to balance in
the wrong direction because in some cases NUMA balancing will try to allow
a small imbalance if it makes sense from a locality point of view. Another
corner case was that if that small imbalance is too large or applied at the
wrong time, it regresses overall even though the locality is good, because
of memory bandwidth limitations. This is obviously far from ideal, but it
does mean that it's an area that needs data backing up the changes.

--
Mel Gorman
SUSE Labs
RE: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
> -----Original Message-----
> From: Mel Gorman [mailto:mgor...@suse.de]
> Sent: Monday, September 7, 2020 9:27 PM
> To: Song Bao Hua (Barry Song)
> Cc: mi...@redhat.com; pet...@infradead.org; juri.le...@redhat.com;
> vincent.guit...@linaro.org; dietmar.eggem...@arm.com;
> bseg...@google.com; linux-kernel@vger.kernel.org; Linuxarm
> ; Mel Gorman ;
> Peter Zijlstra ; Valentin Schneider
> ; Phil Auld ; Hillf Danton
> ; Ingo Molnar
> Subject: Re: [PATCH] sched/fair: use dst group while checking imbalance for
> NUMA balancer
>
> On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> > Something is wrong. In find_busiest_group(), we are checking if src has
> > higher load, however, in task_numa_find_cpu(), we are checking if dst
> > will have higher load after balancing. It seems it is not sensible to
> > check src.
> > It may cause a wrong imbalance value. For example, if
> > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> > src_running = env->src_stats.nr_running - 1 results in 1,
> > the current code computes imbalance as 0 since src_running is smaller
> > than 2.
> > This is inconsistent with the load balancer.
> >
>
> It checks the conditions if the move was to happen. Have you evaluated
> this for a NUMA balancing load and confirmed it a) balances properly and
> b) does not increase the scan rate trying to "fix" the problem?

I think the original code was trying to check if the NUMA migration
would lead to a new imbalance in the load balancer. In case src is A,
dst is B, and both of them have nr_running as 2: A moves one task to B,
then A will have 1 and B will have 3. In the load balancer, A will try
to pull a task from B since B's nr_running is larger than min_imbalance.
But the code is saying imbalance=0 by finding A's nr_running is smaller
than min_imbalance.

Will share more test data if you need.

> --
> Mel Gorman
> SUSE Labs

Thanks
Barry
Re: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote:
> Something is wrong. In find_busiest_group(), we are checking if src has
> higher load, however, in task_numa_find_cpu(), we are checking if dst
> will have higher load after balancing. It seems it is not sensible to
> check src.
> It may cause a wrong imbalance value. For example, if
> dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
> src_running = env->src_stats.nr_running - 1 results in 1,
> the current code computes imbalance as 0 since src_running is smaller
> than 2.
> This is inconsistent with the load balancer.
>

It checks the conditions if the move was to happen. Have you evaluated
this for a NUMA balancing load and confirmed it a) balances properly and
b) does not increase the scan rate trying to "fix" the problem?

--
Mel Gorman
SUSE Labs
[PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer
Something is wrong. In find_busiest_group(), we are checking if src has
higher load; however, in task_numa_find_cpu(), we are checking if dst
will have higher load after balancing. It seems it is not sensible to
check src.
It may cause a wrong imbalance value. For example, if
dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and
src_running = env->src_stats.nr_running - 1 results in 1,
the current code computes imbalance as 0 since src_running is smaller
than 2.
This is inconsistent with the load balancer.

Fixes: fb86f5b211 ("sched/numa: Use similar logic to the load balancer for moving between domains with spare capacity")
Cc: Mel Gorman
Cc: Peter Zijlstra
Cc: Vincent Guittot
Cc: Juri Lelli
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Phil Auld
Cc: Hillf Danton
Cc: Ingo Molnar
Signed-off-by: Barry Song
---
 kernel/sched/fair.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1a68a05..90cfee7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1548,7 +1548,7 @@ struct task_numa_env {
 static unsigned long cpu_load(struct rq *rq);
 static unsigned long cpu_util(int cpu);
 
-static inline long adjust_numa_imbalance(int imbalance, int src_nr_running);
+static inline long adjust_numa_imbalance(int imbalance, int nr_running);
 
 static inline enum numa_type
 numa_classify(unsigned int imbalance_pct,
@@ -1925,7 +1925,7 @@ static void task_numa_find_cpu(struct task_numa_env *env,
 	src_running = env->src_stats.nr_running - 1;
 	dst_running = env->dst_stats.nr_running + 1;
 	imbalance = max(0, dst_running - src_running);
-	imbalance = adjust_numa_imbalance(imbalance, src_running);
+	imbalance = adjust_numa_imbalance(imbalance, dst_running);
 
 	/* Use idle CPU if there is no imbalance */
 	if (!imbalance) {
@@ -8957,7 +8957,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	}
 }
 
-static inline long adjust_numa_imbalance(int imbalance, int src_nr_running)
+static inline long adjust_numa_imbalance(int imbalance, int nr_running)
 {
 	unsigned int imbalance_min;
 
@@ -8966,7 +8966,7 @@ static inline long adjust_numa_imbalance(int imbalance, int src_nr_running)
 	 * tasks that remain local when the source domain is almost idle.
 	 */
 	imbalance_min = 2;
-	if (src_nr_running <= imbalance_min)
+	if (nr_running <= imbalance_min)
 		return 0;
 
 	return imbalance;
-- 
2.7.4