Re: [PATCH v3 0/5] Extend Parsing "ibm, thread-groups" for Shared-L2 information

2020-12-15 Thread Michael Ellerman
On Thu, 10 Dec 2020 16:08:54 +0530, Gautham R. Shenoy wrote:
> This is the v3 of the patchset to extend parsing of "ibm,thread-groups" property
> to discover the Shared-L2 cache information.
> 
> The previous versions can be found here :
> 
> v2 : 
> https://lore.kernel.org/linuxppc-dev/1607533700-5546-1-git-send-email-...@linux.vnet.ibm.com/T/#m043ea15d3832658527fca94765202b9cbefd330d
> 
> [...]

Applied to powerpc/next.

[1/5] powerpc/smp: Parse ibm,thread-groups with multiple properties
  https://git.kernel.org/powerpc/c/790a1662d3a26fe9fa5f691386d8fde6bb8b0dc2
[2/5] powerpc/smp: Rename cpu_l1_cache_map as thread_group_l1_cache_map
  https://git.kernel.org/powerpc/c/1fdc1d6632ff3f6813a2f15b65586bde8fe0f0ba
[3/5] powerpc/smp: Rename init_thread_group_l1_cache_map() to make it generic
  https://git.kernel.org/powerpc/c/fbd2b672e91d276b9fa5a729e4a823ba29fa2692
[4/5] powerpc/smp: Add support detecting thread-groups sharing L2 cache
  https://git.kernel.org/powerpc/c/9538abee18cca70ffd03cef56027388b0c5084cc
[5/5] powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cache
  https://git.kernel.org/powerpc/c/0be47634db0baa9e91c7e635e7e73355d6a5cf43

cheers


[PATCH v3 0/5] Extend Parsing "ibm, thread-groups" for Shared-L2 information

2020-12-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Hi,

This is the v3 of the patchset to extend parsing of "ibm,thread-groups" property
to discover the Shared-L2 cache information.

The previous versions can be found here :

v2 : 
https://lore.kernel.org/linuxppc-dev/1607533700-5546-1-git-send-email-...@linux.vnet.ibm.com/T/#m043ea15d3832658527fca94765202b9cbefd330d

v1 : 
https://lore.kernel.org/linuxppc-dev/1607057327-29822-1-git-send-email-...@linux.vnet.ibm.com/T/#m0fabffa1ea1a2807b362f25c849bb19415216520


Changes from v2-->v3:
 * Fixed the build errors reported by the Kernel Test Robot for Patches 4 and 5.

Changes from v1-->v2:
Incorporate the review comments from Srikar and
fix a build error on !PPC64 configs reported by the kernel bot.

 * Split Patch 1 into three patches
   * The first patch ensures that parse_thread_groups() is made generic
     to support more than one property.
   * The second patch renames cpu_l1_cache_map as
     thread_group_l1_cache_map for consistency. No functional impact.
   * The third patch makes init_thread_group_l1_cache_map()
     generic. No functional impact.

 * Patch 2 (Now patch 4): Incorporates the review comments from Srikar,
   simplifying the changes to update_mask_by_l2()

 * Patch 3 (Now patch 5): Fixes build errors for 32-bit configs
   reported by the kernel build bot.

Description of the Patchset
===========================
The "ibm,thread-groups" device-tree property is an array that is used
to indicate if groups of threads within a core share certain
properties. It provides details of which property is being shared by
which groups of threads. This array can encode information about
multiple properties being shared by different thread-groups within the
core.

Example: Suppose,
"ibm,thread-groups" = [1,2,4,8,10,12,14,9,11,13,15,2,2,4,8,10,12,14,9,11,13,15]

This can be decomposed into two consecutive arrays:

a) [1,2,4,8,10,12,14,9,11,13,15]
b) [2,2,4,8,10,12,14,9,11,13,15]

wherein,

a) provides information of Property "1" being shared by "2" groups,
   each with "4" threads. The "ibm,ppc-interrupt-server#s" of the
   first group is {8,10,12,14} and the "ibm,ppc-interrupt-server#s" of
   the second group is {9,11,13,15}. Property "1" indicates that the
   threads in each group share the L1 cache, translation cache and
   Instruction Data flow.

b) provides information of Property "2" being shared by "2" groups,
   each group with "4" threads. The "ibm,ppc-interrupt-server#s" of
   the first group is {8,10,12,14} and the
   "ibm,ppc-interrupt-server#s" of the second group is
   {9,11,13,15}. Property "2" indicates that the threads in each group
   share the L2-cache.
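
The decomposition described above can be sketched in a few lines of
userspace C. This is only an illustration of the array layout
(property, number of groups, threads per group, then the thread ids);
struct tg_block and decode_thread_groups() are hypothetical names made
up for this sketch and are not the kernel's parse_thread_groups().

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded block of the array (hypothetical type, for illustration). */
struct tg_block {
	uint32_t property;          /* e.g. 1 = L1 etc., 2 = L2 */
	uint32_t nr_groups;
	uint32_t threads_per_group;
	const uint32_t *threads;    /* nr_groups * threads_per_group ids */
};

/*
 * Split arr[0..len) into consecutive blocks.
 * Returns the number of blocks decoded, or -1 if the array is
 * malformed or holds more than max_blocks blocks.
 */
static int decode_thread_groups(const uint32_t *arr, size_t len,
				struct tg_block *out, int max_blocks)
{
	size_t i = 0;
	int n = 0;

	while (i < len) {
		size_t total;

		/* Each block needs a three-word header. */
		if (len - i < 3 || n == max_blocks)
			return -1;

		out[n].property = arr[i];
		out[n].nr_groups = arr[i + 1];
		out[n].threads_per_group = arr[i + 2];
		total = (size_t)arr[i + 1] * arr[i + 2];

		/* The thread list must fit in what is left of the array. */
		if (total > len - i - 3)
			return -1;

		out[n].threads = &arr[i + 3];
		i += 3 + total;
		n++;
	}
	return n;
}
```

Run on the 22-element example array, this yields two blocks, one for
Property "1" and one for Property "2", each with 2 groups of 4 threads.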
   
The existing code assumes that the "ibm,thread-groups" encodes
information about only one property. Hence even on platforms which
encode information about multiple properties being shared by the
corresponding groups of threads, the current code will only pick the
first one. (In the above example, it will only consider
[1,2,4,8,10,12,14,9,11,13,15] but not [2,2,4,8,10,12,14,9,11,13,15]).

Furthermore, currently on platforms where groups of threads share L2
cache, we incorrectly create an extra CACHE level sched-domain that
maps to all the threads of the core.

For example, if "ibm,thread-groups" is
 0001 0002 0004 0000
 0002 0004 0006 0001
 0003 0005 0007 0002
 0002 0004 0000 0002
 0004 0006 0001 0003
 0005 0007

then the sub-array
[0002 0002 0004
 0000 0002 0004 0006
 0001 0003 0005 0007]
indicates that L2 (Property "2") is shared only between the threads of a
single group. There are "2" groups of threads, each containing "4"
threads. The groups are {0,2,4,6} and {1,3,5,7}.
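
As a rough illustration of how the L2 siblings of a thread follow from
such an array, here is a minimal userspace sketch. sibling_mask() is a
hypothetical helper written for this example; the kernel uses cpumasks
rather than a single 64-bit word, and the sketch assumes thread ids
below 64.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a bitmask of the threads that share `property` with `cpu`.
 * Returns 0 if the property or the thread is not found, if the array
 * is malformed, or if a thread id does not fit in 64 bits.
 */
static uint64_t sibling_mask(const uint32_t *arr, size_t len,
			     uint32_t property, uint32_t cpu)
{
	size_t i = 0;

	while (len - i >= 3) {
		uint32_t nr_groups = arr[i + 1];
		uint32_t per_group = arr[i + 2];
		size_t total = (size_t)nr_groups * per_group;
		const uint32_t *threads = &arr[i + 3];

		if (total > len - i - 3)
			return 0;

		if (arr[i] == property) {
			/* Look for the group that contains `cpu`. */
			for (uint32_t g = 0; g < nr_groups; g++) {
				const uint32_t *grp = threads + (size_t)g * per_group;
				uint64_t mask = 0;
				int found = 0;

				for (uint32_t t = 0; t < per_group; t++) {
					if (grp[t] >= 64)
						return 0;
					mask |= UINT64_C(1) << grp[t];
					if (grp[t] == cpu)
						found = 1;
				}
				if (found)
					return mask;
			}
			return 0;
		}
		/* Not the property we want: skip to the next block. */
		i += 3 + total;
	}
	return 0;
}
```

With groups {0,2,4,6} and {1,3,5,7} for Property "2", the mask for
thread 0 is 0x55 and the mask for thread 1 is 0xaa, so neither covers
all eight threads of the core.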

However, the sched-domain hierarchy for CPUs 0 and 1 is:
CPU0 attaching sched-domain(s):
domain-0: span=0,2,4,6 level=SMT
domain-1: span=0-7 level=CACHE
domain-2: span=0-15,24-39,48-55 level=MC
domain-3: span=0-55 level=DIE

CPU1 attaching sched-domain(s):
domain-0: span=1,3,5,7 level=SMT
domain-1: span=0-7 level=CACHE
domain-2: span=0-15,24-39,48-55 level=MC
domain-3: span=0-55 level=DIE

where the CACHE domain reports that L2 is shared across the entire
core, which is incorrect on such platforms.

This patchset remedies these issues by extending the parsing support
for "ibm,thread-groups" to discover information about multiple
properties being shared by the corresponding groups of threads. In
particular, we can now detect if the groups of threads within a core
share the L2-cache. On such platforms, we populate the
cpu_l2_cache_mask of every CPU with the core-siblings which share L2
with the CPU, as specified by the "ibm,thread-groups" property
array.

With the patchset, the sched-domain hierarchy is corre