Hi Srikar, On Tue, Jul 14, 2020 at 10:06:15AM +0530, Srikar Dronamraju wrote: > A new sched_domain_topology_level was added just for Power9. However the > same can be achieved by merging powerpc_topology with power9_topology > and makes the code more simpler especially when adding a new sched > domain. > > Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org> > Cc: Michael Ellerman <micha...@au1.ibm.com> > Cc: Nick Piggin <npig...@au1.ibm.com> > Cc: Oliver OHalloran <olive...@au1.ibm.com> > Cc: Nathan Lynch <nath...@linux.ibm.com> > Cc: Michael Neuling <mi...@linux.ibm.com> > Cc: Anton Blanchard <an...@au1.ibm.com> > Cc: Gautham R Shenoy <e...@linux.vnet.ibm.com> > Cc: Vaidyanathan Srinivasan <sva...@linux.ibm.com> > Signed-off-by: Srikar Dronamraju <sri...@linux.vnet.ibm.com> > --- > arch/powerpc/kernel/smp.c | 33 ++++++++++----------------------- > 1 file changed, 10 insertions(+), 23 deletions(-) > > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c > index 680c0edcc59d..069ea4b21c6d 100644 > --- a/arch/powerpc/kernel/smp.c > +++ b/arch/powerpc/kernel/smp.c > @@ -1315,7 +1315,7 @@ int setup_profiling_timer(unsigned int multiplier) > } > > #ifdef CONFIG_SCHED_SMT > -/* cpumask of CPUs with asymetric SMT dependancy */ > +/* cpumask of CPUs with asymmetric SMT dependency */ > static int powerpc_smt_flags(void) > { > int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES; > @@ -1328,14 +1328,6 @@ static int powerpc_smt_flags(void) > } > #endif > > -static struct sched_domain_topology_level powerpc_topology[] = { > -#ifdef CONFIG_SCHED_SMT > - { cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) }, > -#endif > - { cpu_cpu_mask, SD_INIT_NAME(DIE) }, > - { NULL, }, > -}; > - > /* > * P9 has a slightly odd architecture where pairs of cores share an L2 cache. > * This topology makes it *much* cheaper to migrate tasks between adjacent > cores > @@ -1353,7 +1345,13 @@ static int powerpc_shared_cache_flags(void) > */ > static const struct cpumask *shared_cache_mask(int cpu) > { > - return cpu_l2_cache_mask(cpu); > + if (shared_caches) > + return cpu_l2_cache_mask(cpu); > + > + if (has_big_cores) > + return cpu_smallcore_mask(cpu); > + > + return cpu_smt_mask(cpu); > } > > #ifdef CONFIG_SCHED_SMT > @@ -1363,7 +1361,7 @@ static const struct cpumask *smallcore_smt_mask(int cpu) > } > #endif > > -static struct sched_domain_topology_level power9_topology[] = { > +static struct sched_domain_topology_level powerpc_topology[] = {
> #ifdef CONFIG_SCHED_SMT > { cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) }, > #endif > @@ -1388,21 +1386,10 @@ void __init smp_cpus_done(unsigned int max_cpus) > #ifdef CONFIG_SCHED_SMT > if (has_big_cores) { > pr_info("Big cores detected but using small core scheduling\n"); I> - power9_topology[0].mask = smallcore_smt_mask; > powerpc_topology[0].mask = smallcore_smt_mask; > } > #endif > - /* > - * If any CPU detects that it's sharing a cache with another CPU then > - * use the deeper topology that is aware of this sharing. > - */ > - if (shared_caches) { > - pr_info("Using shared cache scheduler topology\n"); > - set_sched_topology(power9_topology); > - } else { > - pr_info("Using standard scheduler topology\n"); > - set_sched_topology(powerpc_topology); Ok, so we will go with the three level topology by default (SMT, CACHE, DIE) and will rely on the sched-domain creation code to degenerate CACHE domain in case SMT and CACHE have the same set of CPUs (POWER8 for eg). >From a cleanup perspective this is better, since we won't have to worry about defining multiple topology structures, but from a performance point of view, wouldn't we now pay an extra penalty of degenerating the CACHE domains on POWER8 kind of systems, each time when a CPU comes online ? Do we know how bad it is ? If the degeneration takes a few extra microseconds, that should be ok I suppose. -- Thanks and Regards gautham.