Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-02-04 Thread Xunlei Pang
Hi Peter, Steve,

Thanks for sharing all your valuable feedback.
I'll keep it in mind.

Regards,
Xunlei

On 30 January 2015 at 03:23, Peter Zijlstra  wrote:
> On Fri, Jan 30, 2015 at 12:42:47AM +0800, Xunlei Pang wrote:
>> On 27 January 2015 at 22:56, Steven Rostedt  wrote:
>> > On Tue, 27 Jan 2015 15:21:36 +0100
>> > Peter Zijlstra  wrote:
>> >
>> >> On Mon, Jan 19, 2015 at 04:49:40AM +, Xunlei Pang wrote:
>> >> > In find_lowest_rq(), if we can't find a wake_affine cpu from
>> >> > sched_domain, then we can actually determine a cache hot cpu
>> >> > instead of simply calling "cpumask_any(lowest_mask)" which
>> >> > always returns the first cpu in the mask.
>> >> >
> >> >> > So, we can determine the cache hot cpu during the iteration of
>> >> > sched_domain() in passing.
>> >>
>> >> Steve, I'm not getting this. Why are we using WAKE_AFFINE here?
>> >>
>> >
>> > It originated from Gregory Haskins topology patches. See
>> >  6e1254d2c41215da27025add8900ed187bca121d
>>
>> Hi Peter, Steve,
>>
>> I think the responsiveness is the most important feature for RT tasks,
>> so I think:
>> response latency > cache > SMT in significance.
>
> No, deterministic execution time is the utmost important feature. And
> for that SMT utterly blows. So much so in fact that rule #1 for -rt work
> is to disable SMT on your hardware.
>
> The same argument can be made for shared caches. If your !rt workload
> blows away the cache of the rt workload, you lose.
>
>> I was wondering if we can take the cpuidle state into account, as the
>> current find_idlest_cpu() does for CFS?
>> cpupri_find() can easily be modified to indicate the CPUPRI_IDLE case,
>> so that we can select an optimal idle cpu to improve RT tasks'
>> responsiveness. For other cases (mostly non-idle cpus), I think we can
>> rely on the existing sched_domain iteration to select a cache-hot cpu
>> without caring too much about SMT.
>
> your patch calls something 'cache-hot' when crossing large numa domains,
> don't you think that's somewhat stretching the definition of hot?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-01-29 Thread Peter Zijlstra
On Fri, Jan 30, 2015 at 12:42:47AM +0800, Xunlei Pang wrote:
> On 27 January 2015 at 22:56, Steven Rostedt  wrote:
> > On Tue, 27 Jan 2015 15:21:36 +0100
> > Peter Zijlstra  wrote:
> >
> >> On Mon, Jan 19, 2015 at 04:49:40AM +, Xunlei Pang wrote:
> >> > In find_lowest_rq(), if we can't find a wake_affine cpu from
> >> > sched_domain, then we can actually determine a cache hot cpu
> >> > instead of simply calling "cpumask_any(lowest_mask)" which
> >> > always returns the first cpu in the mask.
> >> >
> >> > So, we can determine the cache hot cpu during the iteration of
> >> > sched_domain() in passing.
> >>
> >> Steve, I'm not getting this. Why are we using WAKE_AFFINE here?
> >>
> >
> > It originated from Gregory Haskins topology patches. See
> >  6e1254d2c41215da27025add8900ed187bca121d
> 
> Hi Peter, Steve,
> 
> I think the responsiveness is the most important feature for RT tasks,
> so I think:
> response latency > cache > SMT in significance.

No, deterministic execution time is the utmost important feature. And
for that SMT utterly blows. So much so in fact that rule #1 for -rt work
is to disable SMT on your hardware.

The same argument can be made for shared caches. If your !rt workload
blows away the cache of the rt workload, you lose.

> I was wondering if we can take the cpuidle state into account, as the
> current find_idlest_cpu() does for CFS?
> cpupri_find() can easily be modified to indicate the CPUPRI_IDLE case,
> so that we can select an optimal idle cpu to improve RT tasks'
> responsiveness. For other cases (mostly non-idle cpus), I think we can
> rely on the existing sched_domain iteration to select a cache-hot cpu
> without caring too much about SMT.

your patch calls something 'cache-hot' when crossing large numa domains,
don't you think that's somewhat stretching the definition of hot?


Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-01-29 Thread Steven Rostedt
On Fri, 30 Jan 2015 00:42:47 +0800
Xunlei Pang  wrote:

 
> I think the responsiveness is the most important feature for RT tasks,
> so I think:
> response latency > cache > SMT in significance.

Unfortunately, sometimes cache affects response latency.

> 
> I was wondering if we can take the cpuidle state into account, as the
> current find_idlest_cpu() does for CFS?
> cpupri_find() can easily be modified to indicate the CPUPRI_IDLE case,
> so that we can select an optimal idle cpu to improve RT tasks'
> responsiveness. For other cases (mostly non-idle cpus),

Even if that idle cpu happens to be on another NUMA node?

-- Steve

> I think we can rely on the existing sched_domain iteration to select
> a cache-hot cpu without caring too much about SMT.




Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-01-29 Thread Xunlei Pang
On 27 January 2015 at 22:56, Steven Rostedt  wrote:
> On Tue, 27 Jan 2015 15:21:36 +0100
> Peter Zijlstra  wrote:
>
>> On Mon, Jan 19, 2015 at 04:49:40AM +, Xunlei Pang wrote:
>> > In find_lowest_rq(), if we can't find a wake_affine cpu from
>> > sched_domain, then we can actually determine a cache hot cpu
>> > instead of simply calling "cpumask_any(lowest_mask)" which
>> > always returns the first cpu in the mask.
>> >
>> > So, we can determine the cache hot cpu during the iteration of
>> > sched_domain() in passing.
>>
>> Steve, I'm not getting this. Why are we using WAKE_AFFINE here?
>>
>
> It originated from Gregory Haskins topology patches. See
>  6e1254d2c41215da27025add8900ed187bca121d

Hi Peter, Steve,

I think the responsiveness is the most important feature for RT tasks,
so I think:
response latency > cache > SMT in significance.

I was wondering if we can take the cpuidle state into account, as the
current find_idlest_cpu() does for CFS?
cpupri_find() can easily be modified to indicate the CPUPRI_IDLE case,
so that we can select an optimal idle cpu to improve RT tasks'
responsiveness. For other cases (mostly non-idle cpus), I think we can
rely on the existing sched_domain iteration to select a cache-hot cpu
without caring too much about SMT.

Any comments on this?

Thanks,
Xunlei
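
The idle-CPU selection proposed above could look roughly like the following user-space sketch. It is only an illustration under stated assumptions: struct cpu_state, its exit_latency_us field, and pick_shallowest_idle() are hypothetical names, not the kernel's cpupri or cpuidle interfaces, and the candidate set is modeled as a 64-bit word. The idea shown is to prefer, among idle candidates, the CPU whose idle state is cheapest to leave.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU 64 /* hypothetical stand-in for nr_cpu_ids */

struct cpu_state {
    int idle;                 /* nonzero if the CPU is currently idle */
    unsigned exit_latency_us; /* assumed wakeup cost of its idle state */
};

/*
 * Among the CPUs set in candidates, return the idle CPU with the
 * smallest exit latency, or NO_CPU if none of them is idle.
 * Ties go to the lowest-numbered CPU.
 */
static int pick_shallowest_idle(const struct cpu_state *cpus, int nr_cpus,
                                uint64_t candidates)
{
    int best = NO_CPU;

    assert(nr_cpus >= 0 && nr_cpus <= NO_CPU);

    for (int cpu = 0; cpu < nr_cpus; cpu++) {
        if (!((candidates >> cpu) & 1) || !cpus[cpu].idle)
            continue;
        if (best == NO_CPU ||
            cpus[cpu].exit_latency_us < cpus[best].exit_latency_us)
            best = cpu;
    }
    return best;
}
```

Note that this sketch looks only at idle-state depth and ignores topology entirely, so it says nothing about how far the chosen CPU is from the task's current CPU.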


Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-01-27 Thread Peter Zijlstra
On Tue, Jan 27, 2015 at 09:56:26AM -0500, Steven Rostedt wrote:
> On Tue, 27 Jan 2015 15:21:36 +0100
> Peter Zijlstra  wrote:
> 
> > On Mon, Jan 19, 2015 at 04:49:40AM +, Xunlei Pang wrote:
> > > In find_lowest_rq(), if we can't find a wake_affine cpu from
> > > sched_domain, then we can actually determine a cache hot cpu
> > > instead of simply calling "cpumask_any(lowest_mask)" which
> > > always returns the first cpu in the mask.
> > > 
> > > So, we can determine the cache hot cpu during the iteration of
> > > sched_domain() in passing.
> > 
> > Steve, I'm not getting this. Why are we using WAKE_AFFINE here?
> > 
> 
> It originated from Gregory Haskins topology patches. See 
>  6e1254d2c41215da27025add8900ed187bca121d

Indeed so; it seems an arbitrary choice.

And the proposed patch seems like a convoluted way to simply remove the
->flags & SD_WAKE_AFFINE test.

Of course, the entire domain loop there assumes a lower domain is
better; yay for SMT being such a good counter example ;-)

Of course, if we remove it here, we should do so for deadline too.
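
What a domain walk without the flag test amounts to can be sketched in user space as follows. This is an illustrative model, not a proposed patch: cpumasks are modeled as 64-bit words, walk_without_flag_test() is a hypothetical name, __builtin_ctzll is a GCC/Clang builtin, and the checks the real function performs before and after the loop are left out. No claim is made that it matches the posted patch in every case.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU 64 /* hypothetical stand-in for nr_cpu_ids */

/* Lowest-numbered CPU in mask, or NO_CPU if mask is empty. */
static int first_cpu(uint64_t mask)
{
    return mask ? __builtin_ctzll(mask) : NO_CPU;
}

/*
 * Domain walk with no SD_WAKE_AFFINE check: every level is treated alike.
 * span[0] is the smallest domain. this_cpu is -1 when the current CPU
 * is not an acceptable target.
 */
static int walk_without_flag_test(const uint64_t *span, int nr_levels,
                                  uint64_t lowest_mask, int this_cpu)
{
    assert(this_cpu >= -1 && this_cpu < NO_CPU);

    for (int lvl = 0; lvl < nr_levels; lvl++) {
        /* Prefer the current CPU when it sits in this domain. */
        if (this_cpu != -1 && ((span[lvl] >> this_cpu) & 1))
            return this_cpu;

        int cpu = first_cpu(lowest_mask & span[lvl]);

        if (cpu < NO_CPU)
            return cpu;
    }
    return NO_CPU;
}
```

Because the walk starts at the smallest domain and returns on the first hit, it still encodes the assumption that a lower domain is better, which is exactly where SMT is the counter example.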



Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-01-27 Thread Steven Rostedt
On Tue, 27 Jan 2015 15:21:36 +0100
Peter Zijlstra  wrote:

> On Mon, Jan 19, 2015 at 04:49:40AM +, Xunlei Pang wrote:
> > In find_lowest_rq(), if we can't find a wake_affine cpu from
> > sched_domain, then we can actually determine a cache hot cpu
> > instead of simply calling "cpumask_any(lowest_mask)" which
> > always returns the first cpu in the mask.
> > 
> > So, we can determine the cache hot cpu during the iteration of
> > sched_domain() in passing.
> 
> Steve, I'm not getting this. Why are we using WAKE_AFFINE here?
> 

It originated from Gregory Haskins topology patches. See 
 6e1254d2c41215da27025add8900ed187bca121d

-- Steve


Re: [PATCH 5/5] sched/rt: Optimize find_lowest_rq() to select a cache hot cpu

2015-01-27 Thread Peter Zijlstra
On Mon, Jan 19, 2015 at 04:49:40AM +, Xunlei Pang wrote:
> In find_lowest_rq(), if we can't find a wake_affine cpu from
> sched_domain, then we can actually determine a cache hot cpu
> instead of simply calling "cpumask_any(lowest_mask)" which
> always returns the first cpu in the mask.
> 
> So, we can determine the cache hot cpu during the iteration of
> sched_domain() in passing.

Steve, I'm not getting this. Why are we using WAKE_AFFINE here?



> Signed-off-by: Xunlei Pang 
> ---
>  kernel/sched/rt.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index d28cfa4..e6a42e6 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1535,6 +1535,7 @@ static int find_lowest_rq(struct task_struct *task)
>  	struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
>  	int this_cpu = smp_processor_id();
>  	int cpu      = task_cpu(task);
> +	int cachehot_cpu = nr_cpu_ids;
>  
>  	/* Make sure the mask is initialized first */
>  	if (unlikely(!lowest_mask))
> @@ -1566,8 +1567,12 @@ static int find_lowest_rq(struct task_struct *task)
>  
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
> +		if (cachehot_cpu >= nr_cpu_ids)
> +			cachehot_cpu = cpumask_first_and(lowest_mask,
> +						sched_domain_span(sd));
> +
>  		if (sd->flags & SD_WAKE_AFFINE) {
> -			int best_cpu;
> +			int wakeaffine_cpu;
>  
>  			/*
>  			 * "this_cpu" is cheaper to preempt than a
> @@ -1579,16 +1584,20 @@ static int find_lowest_rq(struct task_struct *task)
>  				return this_cpu;
>  			}
>  
> -			best_cpu = cpumask_first_and(lowest_mask,
> +			wakeaffine_cpu = cpumask_first_and(lowest_mask,
>  						     sched_domain_span(sd));
> -			if (best_cpu < nr_cpu_ids) {
> +			if (wakeaffine_cpu < nr_cpu_ids) {
>  				rcu_read_unlock();
> -				return best_cpu;
> +				return wakeaffine_cpu;
>  			}
>  		}
>  	}
>  	rcu_read_unlock();
>  
> +	/* most likely cache-hot */
> +	if (cachehot_cpu < nr_cpu_ids)
> +		return cachehot_cpu;
> +
>  	/*
>  	 * And finally, if there were no matches within the domains
>  	 * just give the caller *something* to work with from the compatible
> -- 
> 1.9.1
> 
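
For readers who want to experiment with the selection logic in the quoted patch, the following is a simplified user-space model of it. It is a sketch under assumptions, not the kernel function: cpumasks are modeled as 64-bit words, struct domain, MODEL_WAKE_AFFINE and pick_cpu_patched() are hypothetical names, __builtin_ctzll is a GCC/Clang builtin, and the this_cpu fallback that follows the loop in the real code is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU 64              /* hypothetical stand-in for nr_cpu_ids */
#define MODEL_WAKE_AFFINE 0x1u /* hypothetical stand-in for SD_WAKE_AFFINE */

struct domain {
    uint64_t span;  /* CPUs covered by this domain level */
    unsigned flags; /* MODEL_WAKE_AFFINE or 0 */
};

/* Lowest-numbered CPU in mask, or NO_CPU if mask is empty. */
static int first_cpu(uint64_t mask)
{
    return mask ? __builtin_ctzll(mask) : NO_CPU;
}

/*
 * Model of the patched walk. doms[0] is the smallest domain.
 * this_cpu is -1 when the current CPU is not an acceptable target.
 */
static int pick_cpu_patched(const struct domain *doms, int nr_doms,
                            uint64_t lowest_mask, int this_cpu)
{
    int cachehot_cpu = NO_CPU;

    assert(this_cpu >= -1 && this_cpu < NO_CPU);

    for (int i = 0; i < nr_doms; i++) {
        /* Remember the first candidate seen in the smallest domain. */
        if (cachehot_cpu >= NO_CPU)
            cachehot_cpu = first_cpu(lowest_mask & doms[i].span);

        if (doms[i].flags & MODEL_WAKE_AFFINE) {
            if (this_cpu != -1 && ((doms[i].span >> this_cpu) & 1))
                return this_cpu;

            int wakeaffine_cpu = first_cpu(lowest_mask & doms[i].span);

            if (wakeaffine_cpu < NO_CPU)
                return wakeaffine_cpu;
        }
    }

    /* No wake-affine match: fall back to the remembered candidate. */
    if (cachehot_cpu < NO_CPU)
        return cachehot_cpu;

    /* Last resort, modeling cpumask_any() as "first CPU in the mask". */
    return first_cpu(lowest_mask);
}
```

With two domains { 0x0C, 0 } and { 0x0F, 0 } (no wake-affine level) and lowest_mask 0x9, the model returns CPU 3, the candidate inside the smallest domain, whereas a plain "first CPU in the mask" fallback would return CPU 0.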