Re: [PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-29 Thread Peter Zijlstra
On Fri, 2010-01-29 at 12:23 +1100, Benjamin Herrenschmidt wrote:
> On machines that don't have SMT, I would like to avoid calling
> arch_scale_smt_power() at all if possible (in addition to not compiling
> it in if SMT is not enabled in .config).
>
> Now, I must say I'm utterly confused by how the domains are set up and I
> haven't quite managed to sort it out... it looks to me that
> SD_SHARE_CPUPOWER is always going to be set on all CPUs when the config
> option is set (though each CPU will have its own domain), or am I
> misguided? I.e. is there any sense in having at least a fast exit path
> out of arch_scale_smt_power() for non-SMT CPUs?

The sched_domain creation code is a f'ing stink pile that hurts
everybody's brain.

The AMD magny-cours people sort of cleaned it up a bit but didn't go
nearly far enough. Doing so is somewhere on my todo list, but sadly that
thing is way larger than my spare time.

Now SD_SHARE_CPUPOWER _should_ only be set for SMT domains, because only
SMT siblings share cpupower.

SD_SHARE_PKG_RESOURCES _should_ be set for both SMT and MC, because they
all share the same cache domain.

Whether it all works out that way in practice on powerpc is another
question entirely ;-)
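
For reference, a minimal sketch of the kind of fast exit Ben is asking
about, under the assumption that on a non-SMT CPU the sibling domain spans
only the CPU itself (the helper name here is illustrative, not from the
posted patch):

/* Hypothetical fast-exit guard: if this is not really an SMT domain, or
 * it spans only one hardware thread, there is nothing to rebalance, so
 * hand back the unscaled smt_gain and skip the Power7-specific logic.
 */
static inline int smt_power_fast_exit(struct sched_domain *sd,
                                      unsigned long *gain)
{
        if (!(sd->flags & SD_SHARE_CPUPOWER) ||
            cpumask_weight(sched_domain_span(sd)) <= 1) {
                *gain = sd->smt_gain;   /* weight 1: divide by 1, do nothing */
                return 1;
        }
        return 0;
}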

That said, I'm still not entirely convinced I like this usage of
cpupower; it's supposed to be a normalization scale for load-balancing,
not a placement hook.

I'd be much happier with an SD_GROUP_ORDER or something like that, which
works together with SD_PREFER_SIBLING to pack active tasks onto cpus in
ascending group order.



Re: [PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-29 Thread Gabriel Paubert
On Thu, Jan 28, 2010 at 05:20:55PM -0600, Joel Schopp wrote:
> On Power7 processors running in SMT4 mode with 2, 3, or 4 idle threads
> there is a performance benefit to idling the higher-numbered threads in
> the core.
 

Really 2, 3, or 4? When you have 4 idle threads out of 4, performance
becomes a minor concern, no? ;-)

Gabriel


Re: [PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-29 Thread Joel Schopp

Gabriel Paubert wrote:
> On Thu, Jan 28, 2010 at 05:20:55PM -0600, Joel Schopp wrote:
>> On Power7 processors running in SMT4 mode with 2, 3, or 4 idle threads
>> there is a performance benefit to idling the higher-numbered threads in
>> the core.
>
> Really 2, 3, or 4? When you have 4 idle threads out of 4, performance
> becomes a minor concern, no? ;-)
>
> Gabriel

Yes, but going from 4 idle threads to 3 you want to keep the slanted
weights.  If you ignored the 4-idle case you'd place tasks wrong and then
have to correct it after the fact.



Re: [PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-29 Thread Joel Schopp



Peter Zijlstra wrote:
> That said, I'm still not entirely convinced I like this usage of
> cpupower; it's supposed to be a normalization scale for load-balancing,
> not a placement hook.
Even if you do a placement hook you'll need to address it in the load
balancing as well.  Consider a single 4-thread SMT core with 4 running
tasks.  If 2 of them exit, the remaining 2 will need to be load balanced
within the core in a way that takes into account the dynamic nature of
the thread power.  This patch does that.
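
To make the effect concrete, here is a small standalone sketch (not kernel
code; it simply mirrors the arithmetic of the posted patch, assuming the
default smt_gain of 1178 noted in the patch, an SMT4 core, and 2 idle
siblings) showing how far apart the per-thread powers end up:

#include <stdio.h>

int main(void)
{
        unsigned long smt_gain = 1178;  /* scheduler default noted in the patch */

        /* threads 0,1: add 75 % to thread power */
        unsigned long high = smt_gain + (smt_gain >> 1) + (smt_gain >> 2);
        /* threads 2,3: keep only 25 % of thread power */
        unsigned long low = smt_gain >> 2;

        /* final step of the patch: scale down by the 4 SMT threads */
        printf("threads 0,1 -> %lu\n", high >> 2);      /* 515 */
        printf("threads 2,3 -> %lu\n", low >> 2);       /*  73 */
        return 0;
}

With 2 tasks left on the core, the balancer sees threads 0,1 as far
stronger than threads 2,3 and packs the remaining work there, which is
exactly the placement the patch is after.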

> I'd be much happier with an SD_GROUP_ORDER or something like that, which
> works together with SD_PREFER_SIBLING to pack active tasks onto cpus in
> ascending group order.

I don't see this load-balancing patch as mutually exclusive with a patch
to fix placement.  But even if it were a mutually exclusive solution,
there is no reason we can't fix things now with this patch and then later
take it out when it's fixed another way.  This patch series is
straightforward, non-intrusive, and without it the scheduler is broken
on this processor.


Re: [PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-29 Thread Joel Schopp

Benjamin Herrenschmidt wrote:
> On Thu, 2010-01-28 at 17:24 -0600, Joel Schopp wrote:
>> On Power7 processors running in SMT4 mode with 2, 3, or 4 idle threads
>> there is a performance benefit to idling the higher-numbered threads in
>> the core.
>>
>> This patch implements arch_scale_smt_power to dynamically update smt
>> thread power in these idle cases in order to prefer threads 0,1 over
>> threads 2,3 within a core.
>
> Almost there :-) Joel, Peter, can you help me figure something out tho?
>
> On machines that don't have SMT, I would like to avoid calling
> arch_scale_smt_power() at all if possible (in addition to not compiling
> it in if SMT is not enabled in .config).
>
> Now, I must say I'm utterly confused by how the domains are set up and I
> haven't quite managed to sort it out... it looks to me that
> SD_SHARE_CPUPOWER is always going to be set on all CPUs when the config
> option is set (though each CPU will have its own domain), or am I
> misguided? I.e. is there any sense in having at least a fast exit path
> out of arch_scale_smt_power() for non-SMT CPUs?
>
> Joel, can you look at compiling it out when SMT is not set? We don't
> want to bloat SMP kernels for 32-bit non-SMT embedded platforms.

I can wrap the powerpc definition of arch_scale_smt_power() in an #ifdef;
if it's not there the scheduler uses the default, which is the same as
what it uses if SMT isn't compiled in.
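
A sketch of that wrapping (the choice of CONFIG_SCHED_SMT as the guard is
an assumption here; when the arch version is compiled out, the scheduler's
weak default applies, i.e. smt_gain divided by the number of siblings):

#ifdef CONFIG_SCHED_SMT
unsigned long arch_scale_smt_power(struct sched_domain *sd, int cpu)
{
        unsigned long weight = cpumask_weight(sched_domain_span(sd));
        unsigned long smt_gain = sd->smt_gain;

        /* ... Power7 idle-thread adjustments as in the posted patch ... */

        return smt_gain / weight;
}
#endif /* CONFIG_SCHED_SMT */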





[PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-28 Thread Joel Schopp
On Power7 processors running in SMT4 mode with 2, 3, or 4 idle threads
there is a performance benefit to idling the higher-numbered threads in
the core.

This patch implements arch_scale_smt_power to dynamically update smt
thread power in these idle cases in order to prefer threads 0,1 over
threads 2,3 within a core.


Signed-off-by: Joel Schopp jsch...@austin.ibm.com
---
Version 2 addresses style and optimization, same basic functionality
Version 3 adds a comment
Index: linux-2.6.git/arch/powerpc/kernel/smp.c
===================================================================
--- linux-2.6.git.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6.git/arch/powerpc/kernel/smp.c
@@ -620,3 +620,59 @@ void __cpu_die(unsigned int cpu)
         smp_ops->cpu_die(cpu);
 }
 #endif
+
+unsigned long arch_scale_smt_power(struct sched_domain *sd, int cpu)
+{
+        int sibling;
+        int idle_count = 0;
+        int thread;
+
+        /* Setup the default weight and smt_gain used by most cpus for SMT
+         * Power.  Doing this right away covers the default case and can be
+         * used by cpus that modify it dynamically.
+         */
+        struct cpumask *sibling_map = sched_domain_span(sd);
+        unsigned long weight = cpumask_weight(sibling_map);
+        unsigned long smt_gain = sd->smt_gain;
+
+
+        if (cpu_has_feature(CPU_FTR_ASYNC_SMT4) && weight == 4) {
+                for_each_cpu(sibling, sibling_map) {
+                        if (idle_cpu(sibling))
+                                idle_count++;
+                }
+
+                /* the following section attempts to tweak cpu power based
+                 * on current idleness of the threads dynamically at runtime
+                 */
+                if (idle_count > 1) {
+                        thread = cpu_thread_in_core(cpu);
+                        if (thread < 2) {
+                                /* add 75 % to thread power */
+                                smt_gain += (smt_gain >> 1) + (smt_gain >> 2);
+                        } else {
+                                /* subtract 75 % from thread power */
+                                smt_gain = smt_gain >> 2;
+                        }
+                }
+        }
+
+        /* default smt gain is 1178, weight is # of SMT threads */
+        switch (weight) {
+        case 1:
+                /* divide by 1, do nothing */
+                break;
+        case 2:
+                smt_gain = smt_gain >> 1;
+                break;
+        case 4:
+                smt_gain = smt_gain >> 2;
+                break;
+        default:
+                smt_gain /= weight;
+                break;
+        }
+
+        return smt_gain;
+
+}
Index: linux-2.6.git/arch/powerpc/include/asm/cputable.h
===================================================================
--- linux-2.6.git.orig/arch/powerpc/include/asm/cputable.h
+++ linux-2.6.git/arch/powerpc/include/asm/cputable.h
@@ -195,6 +195,7 @@ extern const char *powerpc_base_platform
 #define CPU_FTR_SAO                LONG_ASM_CONST(0x0020)
 #define CPU_FTR_CP_USE_DCBTZ   LONG_ASM_CONST(0x0040)
 #define CPU_FTR_UNALIGNED_LD_STD   LONG_ASM_CONST(0x0080)
+#define CPU_FTR_ASYNC_SMT4 LONG_ASM_CONST(0x0100)
 
 #ifndef __ASSEMBLY__
 
@@ -409,7 +410,7 @@ extern const char *powerpc_base_platform
CPU_FTR_MMCRA | CPU_FTR_SMT | \
CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
-   CPU_FTR_DSCR | CPU_FTR_SAO)
+   CPU_FTR_DSCR | CPU_FTR_SAO | CPU_FTR_ASYNC_SMT4)
 #define CPU_FTRS_CELL  (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \




[PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-28 Thread Joel Schopp
On Power7 processors running in SMT4 mode with 2, 3, or 4 idle threads
there is a performance benefit to idling the higher-numbered threads in
the core.

This patch implements arch_scale_smt_power to dynamically update smt
thread power in these idle cases in order to prefer threads 0,1 over
threads 2,3 within a core.


Signed-off-by: Joel Schopp jsch...@austin.ibm.com
---
Version 2 addresses style and optimization, same basic functionality
Version 3 adds a comment, resent due to mailing format error
Index: linux-2.6.git/arch/powerpc/kernel/smp.c
===================================================================
--- linux-2.6.git.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6.git/arch/powerpc/kernel/smp.c
@@ -620,3 +620,59 @@ void __cpu_die(unsigned int cpu)
         smp_ops->cpu_die(cpu);
 }
 #endif
+
+unsigned long arch_scale_smt_power(struct sched_domain *sd, int cpu)
+{
+        int sibling;
+        int idle_count = 0;
+        int thread;
+
+        /* Setup the default weight and smt_gain used by most cpus for SMT
+         * Power.  Doing this right away covers the default case and can be
+         * used by cpus that modify it dynamically.
+         */
+        struct cpumask *sibling_map = sched_domain_span(sd);
+        unsigned long weight = cpumask_weight(sibling_map);
+        unsigned long smt_gain = sd->smt_gain;
+
+
+        if (cpu_has_feature(CPU_FTR_ASYNC_SMT4) && weight == 4) {
+                for_each_cpu(sibling, sibling_map) {
+                        if (idle_cpu(sibling))
+                                idle_count++;
+                }
+
+                /* the following section attempts to tweak cpu power based
+                 * on current idleness of the threads dynamically at runtime
+                 */
+                if (idle_count > 1) {
+                        thread = cpu_thread_in_core(cpu);
+                        if (thread < 2) {
+                                /* add 75 % to thread power */
+                                smt_gain += (smt_gain >> 1) + (smt_gain >> 2);
+                        } else {
+                                /* subtract 75 % from thread power */
+                                smt_gain = smt_gain >> 2;
+                        }
+                }
+        }
+
+        /* default smt gain is 1178, weight is # of SMT threads */
+        switch (weight) {
+        case 1:
+                /* divide by 1, do nothing */
+                break;
+        case 2:
+                smt_gain = smt_gain >> 1;
+                break;
+        case 4:
+                smt_gain = smt_gain >> 2;
+                break;
+        default:
+                smt_gain /= weight;
+                break;
+        }
+
+        return smt_gain;
+
+}
Index: linux-2.6.git/arch/powerpc/include/asm/cputable.h
===================================================================
--- linux-2.6.git.orig/arch/powerpc/include/asm/cputable.h
+++ linux-2.6.git/arch/powerpc/include/asm/cputable.h
@@ -195,6 +195,7 @@ extern const char *powerpc_base_platform
 #define CPU_FTR_SAO                LONG_ASM_CONST(0x0020)
 #define CPU_FTR_CP_USE_DCBTZ   LONG_ASM_CONST(0x0040)
 #define CPU_FTR_UNALIGNED_LD_STD   LONG_ASM_CONST(0x0080)
+#define CPU_FTR_ASYNC_SMT4 LONG_ASM_CONST(0x0100)
 
 #ifndef __ASSEMBLY__
 
@@ -409,7 +410,7 @@ extern const char *powerpc_base_platform
CPU_FTR_MMCRA | CPU_FTR_SMT | \
CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
-   CPU_FTR_DSCR | CPU_FTR_SAO)
+   CPU_FTR_DSCR | CPU_FTR_SAO | CPU_FTR_ASYNC_SMT4)
 #define CPU_FTRS_CELL  (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \





Re: [PATCHv3 2/2] powerpc: implement arch_scale_smt_power for Power7

2010-01-28 Thread Benjamin Herrenschmidt
On Thu, 2010-01-28 at 17:24 -0600, Joel Schopp wrote:
> On Power7 processors running in SMT4 mode with 2, 3, or 4 idle threads
> there is a performance benefit to idling the higher-numbered threads in
> the core.
>
> This patch implements arch_scale_smt_power to dynamically update smt
> thread power in these idle cases in order to prefer threads 0,1 over
> threads 2,3 within a core.

Almost there :-) Joel, Peter, can you help me figure something out tho?

On machines that don't have SMT, I would like to avoid calling
arch_scale_smt_power() at all if possible (in addition to not compiling
it in if SMT is not enabled in .config).

Now, I must say I'm utterly confused by how the domains are set up and I
haven't quite managed to sort it out... it looks to me that
SD_SHARE_CPUPOWER is always going to be set on all CPUs when the config
option is set (though each CPU will have its own domain), or am I
misguided? I.e. is there any sense in having at least a fast exit path
out of arch_scale_smt_power() for non-SMT CPUs?

Joel, can you look at compiling it out when SMT is not set? We don't
want to bloat SMP kernels for 32-bit non-SMT embedded platforms.

Oh, and one minor nit:

> Signed-off-by: Joel Schopp jsch...@austin.ibm.com
> ---
> Version 2 addresses style and optimization, same basic functionality
> Version 3 adds a comment, resent due to mailing format error
> Index: linux-2.6.git/arch/powerpc/kernel/smp.c
> ===================================================================
> --- linux-2.6.git.orig/arch/powerpc/kernel/smp.c
> +++ linux-2.6.git/arch/powerpc/kernel/smp.c
> @@ -620,3 +620,59 @@ void __cpu_die(unsigned int cpu)
>          smp_ops->cpu_die(cpu);
>  }
>  #endif

 ^^^ Please add the /* CONFIG_CPU_HOTPLUG */ (or whatever it is) that's
missing after that #endif :-)
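
For concreteness, the requested annotation would look something like this
(assuming the guard in question is CONFIG_HOTPLUG_CPU; the exact symbol is
whatever actually opens that block):

 #endif /* CONFIG_HOTPLUG_CPU */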

Cheers,
Ben.

