On 8 December 2016 at 15:09, Matt Fleming <m...@codeblueprint.co.uk> wrote:
> On Mon, 05 Dec, at 01:35:46PM, Matt Fleming wrote:
>> On Mon, 05 Dec, at 10:27:36AM, Vincent Guittot wrote:
>> >
>> > Hi Matt,
>> >
>> > Thanks for the results.
>> >
>> > During the review, Morten pointed out that the test condition
>> > (100*this_avg_load < imbalance_scale*min_avg_load) makes more sense than
>> > (100*min_avg_load > imbalance_scale*this_avg_load), but I see lower
>> > performance with that change. Could you run tests with the change below on
>> > top of the patchset?
>> >
>> > ---
>> >  kernel/sched/fair.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index e8d1ae7..0129fbb 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -5514,7 +5514,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> >     if (!idlest ||
>> >         (min_runnable_load > (this_runnable_load + imbalance)) ||
>> >         ((this_runnable_load < (min_runnable_load + imbalance)) &&
>> > -                   (100*min_avg_load > imbalance_scale*this_avg_load)))
>> > +                   (100*this_avg_load < imbalance_scale*min_avg_load)))
>> >             return NULL;
>> >     return idlest;
>> >  }
>>
>> Queued for testing.
>
> Most of the machines didn't notice the difference with this new patch.
> However, I did come across one test that showed a negative change,

OK, so you don't see a performance impact between the two test
conditions except on the machine below.
So IMHO we should use the new test condition, which tries to keep the
task local if there is no other CPU that is obviously less loaded.

I'm going to prepare a new version of the patchset and address all comments.

Thanks
Vincent

>
>
> hackbench-thread-pipes
>                         4.9.0-rc6             4.9.0-rc6             4.9.0-rc6             4.9.0-rc6
>                         tip-sched      fix-fig-for-fork               fix-sig   fix-fig-for-fork-v2
> Amean    1       0.1266 (  0.00%)      0.1269 ( -0.23%)      0.1287 ( -1.69%)      0.1357 ( -7.22%)
> Amean    4       0.4989 (  0.00%)      0.5174 ( -3.72%)      0.5251 ( -5.27%)      0.5117 ( -2.58%)
> Amean    7       0.8510 (  0.00%)      0.8517 ( -0.08%)      0.8964 ( -5.34%)      0.8801 ( -3.42%)
> Amean    12      1.0699 (  0.00%)      1.0484 (  2.00%)      1.0147 (  5.15%)      1.0759 ( -0.56%)
> Amean    21      1.2816 (  0.00%)      1.2140 (  5.27%)      1.1879 (  7.31%)      1.2414 (  3.13%)
> Amean    30      1.4440 (  0.00%)      1.4003 (  3.03%)      1.3969 (  3.26%)      1.4057 (  2.65%)
> Amean    48      1.5773 (  0.00%)      1.5983 ( -1.33%)      1.3984 ( 11.34%)      1.5624 (  0.94%)
> Amean    79      2.2343 (  0.00%)      2.3066 ( -3.24%)      2.0053 ( 10.25%)      2.2630 ( -1.29%)
> Amean    96      2.6736 (  0.00%)      2.4313 (  9.06%)      2.4181 (  9.55%)      2.4717 (  7.55%)
>
>             4.9.0-rc6         4.9.0-rc6   4.9.0-rc6            4.9.0-rc6
>             tip-sched  fix-fig-for-fork     fix-sig  fix-fig-for-fork-v2
> User          129.53      128.64      127.70      131.00
> System       1784.54     1744.21     1654.08     1744.00
> Elapsed        92.07       90.44       86.95       91.00
>
> Looking at the 48- and 79-group rows for the mean, there's a noticeable
> drop-off in performance of ~10%, which should be outside the noise
> for this test. This is a 2-socket, 4-NUMA-node (yes, really), 24-CPU
> AMD Opteron machine circa 2010.
>
> Given the age of this machine, I don't think it's worth holding up the
> patch.