Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling

2020-07-24 Thread Gautham R Shenoy
On Wed, Jul 22, 2020 at 12:27:47PM +0530, Srikar Dronamraju wrote:
> * Gautham R Shenoy  [2020-07-22 11:51:14]:
> 
> > Hi Srikar,
> > 
> > > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > > index 72f16dc0cb26..57468877499a 100644
> > > --- a/arch/powerpc/kernel/smp.c
> > > +++ b/arch/powerpc/kernel/smp.c
> > > @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
> > >   if (!l2_cache)
> > >   return false;
> > > 
> > > + cpumask_set_cpu(cpu, mask_fn(cpu));
> > 
> > 
> > Ok, we need to do this because "cpu" is not yet set in the
> > cpu_online_mask. Prior to your patch the "cpu" was getting set in
> > cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
> > the patch.
> > 
> 
> Right.
> 
> > 
> > >   for_each_cpu(i, cpu_online_mask) {
> > >   /*
> > >* when updating the marks the current CPU has not been marked
> > > @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
> > >* add it to it's own thread sibling mask.
> > >*/
> > >   cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> > > + cpumask_set_cpu(cpu, cpu_core_mask(cpu));
> 
> Note: Above, we are explicitly setting the cpu_core_mask.

You are right. I missed this.

> 
> > > 
> > >   for (i = first_thread; i < first_thread + threads_per_core; i++)
> > >   if (cpu_online(i))
> > >   set_cpus_related(i, cpu, cpu_sibling_mask);
> > > 
> > >   add_cpu_to_smallcore_masks(cpu);
> > > - /*
> > > -  * Copy the thread sibling mask into the cache sibling mask
> > > -  * and mark any CPUs that share an L2 with this CPU.
> > > -  */
> > > - for_each_cpu(i, cpu_sibling_mask(cpu))
> > > - set_cpus_related(cpu, i, cpu_l2_cache_mask);
> > >   update_mask_by_l2(cpu, cpu_l2_cache_mask);
> > > 
> > > - /*
> > > -  * Copy the cache sibling mask into core sibling mask and mark
> > > -  * any CPUs on the same chip as this CPU.
> > > -  */
> > > - for_each_cpu(i, cpu_l2_cache_mask(cpu))
> > > - set_cpus_related(cpu, i, cpu_core_mask);
> > > + if (pkg_id == -1) {
> > 
> > I suppose this "if" condition is an optimization, since if pkg_id != -1,
> > we anyway set these CPUs in the cpu_core_mask below.
> > 
> > However...
> 
> This is not just an optimization.
> The removed hunk would only work if cpu_l2_cache_mask is a superset of
> cpu_sibling_mask (this was the previous assumption, which we want to break).
> If cpu_sibling_mask is bigger than cpu_l2_cache_mask and pkg_id is -1,
> then setting only the cpu_l2_cache_mask CPUs in cpu_core_mask will result
> in a broken topology.
> 
> > 
> > > + struct cpumask *(*mask)(int) = cpu_sibling_mask;
> > > +
> > > + /*
> > > +  * Copy the sibling mask into core sibling mask and
> > > +  * mark any CPUs on the same chip as this CPU.
> > > +  */
> > > + if (shared_caches)
> > > + mask = cpu_l2_cache_mask;
> > > +
> > > + for_each_cpu(i, mask(cpu))
> > > + set_cpus_related(cpu, i, cpu_core_mask);
> > > 
> > > - if (pkg_id == -1)
> > >   return;
> > > + }
> > 
> > 
> > ... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
> > "cpu" in the cpu_core_mask(cpu) in the for-loop below ?
> > 
> > 
> 
> As noted above, we are setting it explicitly before. So we don't miss the
> cpu, and hence this is no different from before.


Fair enough.

> 
> > --
> > Thanks and Regards
> > gautham.
> 
> -- 
> Thanks and Regards
> Srikar Dronamraju


Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling

2020-07-22 Thread Srikar Dronamraju
* Gautham R Shenoy  [2020-07-22 11:51:14]:

> Hi Srikar,
> 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 72f16dc0cb26..57468877499a 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
> > if (!l2_cache)
> > return false;
> > 
> > +   cpumask_set_cpu(cpu, mask_fn(cpu));
> 
> 
> Ok, we need to do this because "cpu" is not yet set in the
> cpu_online_mask. Prior to your patch the "cpu" was getting set in
> cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
> the patch.
> 

Right.

> 
> > for_each_cpu(i, cpu_online_mask) {
> > /*
> >  * when updating the marks the current CPU has not been marked
> > @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
> >  * add it to it's own thread sibling mask.
> >  */
> > cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> > +   cpumask_set_cpu(cpu, cpu_core_mask(cpu));

Note: Above, we are explicitly setting the cpu_core_mask.

> > 
> > for (i = first_thread; i < first_thread + threads_per_core; i++)
> > if (cpu_online(i))
> > set_cpus_related(i, cpu, cpu_sibling_mask);
> > 
> > add_cpu_to_smallcore_masks(cpu);
> > -   /*
> > -* Copy the thread sibling mask into the cache sibling mask
> > -* and mark any CPUs that share an L2 with this CPU.
> > -*/
> > -   for_each_cpu(i, cpu_sibling_mask(cpu))
> > -   set_cpus_related(cpu, i, cpu_l2_cache_mask);
> > update_mask_by_l2(cpu, cpu_l2_cache_mask);
> > 
> > -   /*
> > -* Copy the cache sibling mask into core sibling mask and mark
> > -* any CPUs on the same chip as this CPU.
> > -*/
> > -   for_each_cpu(i, cpu_l2_cache_mask(cpu))
> > -   set_cpus_related(cpu, i, cpu_core_mask);
> > +   if (pkg_id == -1) {
> 
> I suppose this "if" condition is an optimization, since if pkg_id != -1,
> we anyway set these CPUs in the cpu_core_mask below.
> 
> However...

This is not just an optimization.
The removed hunk would only work if cpu_l2_cache_mask is a superset of
cpu_sibling_mask (this was the previous assumption, which we want to break).
If cpu_sibling_mask is bigger than cpu_l2_cache_mask and pkg_id is -1,
then setting only the cpu_l2_cache_mask CPUs in cpu_core_mask will result
in a broken topology.
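
To make that concrete, here is a toy model (userspace C with made-up
CPU numbers and mask sizes, not the kernel code) of the pkg_id == -1
case:

    #include <stdio.h>

    int main(void)
    {
            /* Hypothetical topology: an SMT8 core whose 8 threads are
             * siblings, but where (purely for illustration) only
             * threads 0-3 share an L2, i.e. shared_caches == 0 and the
             * L2 mask is NOT a superset of the sibling mask. */
            unsigned int sibling_mask = 0xff; /* CPUs 0-7 */
            unsigned int l2_mask = 0x0f;      /* CPUs 0-3 */
            int shared_caches = 0;

            /* Removed hunk: core mask built only from the L2 mask. */
            unsigned int old_core_mask = l2_mask;

            /* This patch: use the sibling mask unless shared_caches. */
            unsigned int new_core_mask =
                    shared_caches ? l2_mask : sibling_mask;

            printf("old cpu_core_mask = 0x%02x (threads 4-7 lost)\n",
                   old_core_mask);
            printf("new cpu_core_mask = 0x%02x\n", new_core_mask);
            return 0;
    }

With shared_caches clear, the old chain silently drops threads 4-7
from cpu_core_mask, which is the breakage described above.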

> 
> > +   struct cpumask *(*mask)(int) = cpu_sibling_mask;
> > +
> > +   /*
> > +* Copy the sibling mask into core sibling mask and
> > +* mark any CPUs on the same chip as this CPU.
> > +*/
> > +   if (shared_caches)
> > +   mask = cpu_l2_cache_mask;
> > +
> > +   for_each_cpu(i, mask(cpu))
> > +   set_cpus_related(cpu, i, cpu_core_mask);
> > 
> > -   if (pkg_id == -1)
> > return;
> > +   }
> 
> 
> ... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
> "cpu" in the cpu_core_mask(cpu) in the for-loop below ?
> 
> 

As noted above, we are setting it explicitly before. So we don't miss the
cpu, and hence this is no different from before.

> --
> Thanks and Regards
> gautham.

-- 
Thanks and Regards
Srikar Dronamraju


Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling

2020-07-22 Thread Gautham R Shenoy
Hi Srikar,

On Tue, Jul 21, 2020 at 05:08:09PM +0530, Srikar Dronamraju wrote:
> Current code assumes that the cpumask of cpus sharing an l2-cache will
> always be a superset of cpu_sibling_mask.
> 
> Let's stop that assumption. cpu_l2_cache_mask is a superset of
> cpu_sibling_mask if and only if shared_caches is set.
> 
> Cc: linuxppc-dev 
> Cc: LKML 
> Cc: Michael Ellerman 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Valentin Schneider 
> Cc: Nick Piggin 
> Cc: Oliver OHalloran 
> Cc: Nathan Lynch 
> Cc: Michael Neuling 
> Cc: Anton Blanchard 
> Cc: Gautham R Shenoy 
> Cc: Vaidyanathan Srinivasan 
> Cc: Jordan Niethe 
> Signed-off-by: Srikar Dronamraju 
> ---
> Changelog v1 -> v2:
> powerpc/smp: Dont assume l2-cache to be superset of sibling
>   Set cpumask after verifying l2-cache. (Gautham)
> 
>  arch/powerpc/kernel/smp.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 72f16dc0cb26..57468877499a 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
>   if (!l2_cache)
>   return false;
> 
> + cpumask_set_cpu(cpu, mask_fn(cpu));


Ok, we need to do this because "cpu" is not yet set in the
cpu_online_mask. Prior to your patch the "cpu" was getting set in
cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
the patch.
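
To convince myself, I modelled this with a toy example (userspace C,
made-up CPU numbers and L2 ids, not the kernel code):

    #include <stdio.h>

    int main(void)
    {
            /* CPUs 0-3 are online; CPU 4 is the one coming up; all
             * five hypothetically share L2 id 0. */
            int l2_id[8] = { 0, 0, 0, 0, 0, -1, -1, -1 };
            unsigned int online_mask = 0x0f; /* CPU 4 not yet online */
            int cpu = 4;
            unsigned int l2_mask = 0;        /* cpu_l2_cache_mask(cpu) */

            /* Equivalent of the new cpumask_set_cpu(cpu, mask_fn(cpu)):
             * drop this line and "cpu" never enters its own mask,
             * because the loop below only walks the online mask,
             * which does not yet contain "cpu". */
            l2_mask |= 1u << cpu;

            for (int i = 0; i < 8; i++)
                    if ((online_mask & (1u << i)) && l2_id[i] == l2_id[cpu])
                            l2_mask |= 1u << i;

            printf("cpu_l2_cache_mask(%d) = 0x%02x\n", cpu, l2_mask);
            return 0;
    }

This prints 0x1f; with the explicit set removed it prints 0x0f, i.e.
"cpu" is missing from its own mask.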


>   for_each_cpu(i, cpu_online_mask) {
>   /*
>* when updating the marks the current CPU has not been marked
> @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
>* add it to it's own thread sibling mask.
>*/
>   cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> + cpumask_set_cpu(cpu, cpu_core_mask(cpu));
> 
>   for (i = first_thread; i < first_thread + threads_per_core; i++)
>   if (cpu_online(i))
>   set_cpus_related(i, cpu, cpu_sibling_mask);
> 
>   add_cpu_to_smallcore_masks(cpu);
> - /*
> -  * Copy the thread sibling mask into the cache sibling mask
> -  * and mark any CPUs that share an L2 with this CPU.
> -  */
> - for_each_cpu(i, cpu_sibling_mask(cpu))
> - set_cpus_related(cpu, i, cpu_l2_cache_mask);
>   update_mask_by_l2(cpu, cpu_l2_cache_mask);
> 
> - /*
> -  * Copy the cache sibling mask into core sibling mask and mark
> -  * any CPUs on the same chip as this CPU.
> -  */
> - for_each_cpu(i, cpu_l2_cache_mask(cpu))
> - set_cpus_related(cpu, i, cpu_core_mask);
> + if (pkg_id == -1) {

I suppose this "if" condition is an optimization, since if pkg_id != -1,
we anyway set these CPUs in the cpu_core_mask below.

However...

> + struct cpumask *(*mask)(int) = cpu_sibling_mask;
> +
> + /*
> +  * Copy the sibling mask into core sibling mask and
> +  * mark any CPUs on the same chip as this CPU.
> +  */
> + if (shared_caches)
> + mask = cpu_l2_cache_mask;
> +
> + for_each_cpu(i, mask(cpu))
> + set_cpus_related(cpu, i, cpu_core_mask);
> 
> - if (pkg_id == -1)
>   return;
> + }


... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
"cpu" in the cpu_core_mask(cpu) in the for-loop below ?


> 
>   for_each_cpu(i, cpu_online_mask)
>   if (get_physical_package_id(i) == pkg_id)


Before this patch, "cpu" was unconditionally getting set in
cpu_core_mask(cpu), because it was already set in
cpu_l2_cache_mask(cpu), and we unconditionally set every CPU in
cpu_l2_cache_mask(cpu) in cpu_core_mask(cpu).
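
That is, paraphrasing the removed hunks, the pre-patch chain was:

    for_each_cpu(i, cpu_sibling_mask(cpu))
            set_cpus_related(cpu, i, cpu_l2_cache_mask);
    /* "cpu" is in its own sibling mask, so the loop above sets
     * "cpu" in cpu_l2_cache_mask(cpu) ... */

    for_each_cpu(i, cpu_l2_cache_mask(cpu))
            set_cpus_related(cpu, i, cpu_core_mask);
    /* ... and the loop above then sets "cpu" in
     * cpu_core_mask(cpu). */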

What am I missing ?

> -- 
> 2.17.1
>

--
Thanks and Regards
gautham.