On 04/04/2020 at 20:28, Baptiste Jonglez wrote:
> Hi,
>
> Following up on a thread from 2018 [1], we still have issue with counting
> the number of CPU, cores and threads on "exotic" machines.
>
> I gave a try at hwloc on all farm machines, below is the result where:
>
> M = machine
> N = node (NUMA node)
> P = package
> C = core
> T = pu (thread)
>
>
> gcc10 M1 N4 P2 C24 T24
> gcc13 M1 N2 P2 C4 T4
> gcc14 M1 N P2 C8 T8
> gcc22 M1 N P C2 T2
> gcc23 M1 N P C2 T2
> gcc45 M1 N P C4 T4
> gcc67 M1 N P1 C4 T8
> gcc70 M1 N P1 C1 T2
> gcc110 M1 N2 P16 C16 T64
> gcc112 M1 N4 P4 C20 T160
> gcc113 M1 N P C8 T8
> gcc114 M1 N P C8 T8
> gcc115 M1 N P C8 T8
> gcc116 M1 N P C8 T8
> gcc117 M1 N P4 C8 T8
> gcc120 M1 N2 P2 C16 T32
> gcc121 M1 N2 P2 C16 T32
> gcc122 M1 N2 P2 C16 T32
> gcc123 M1 N2 P2 C16 T32
> gcc135 M1 N2 P2 C32 T128
> gcc202 M1 N1 P1 C8 T64
> gcc203 M1 N2 P4 C4 T32
>
> Can someone with experience with each kind of machine make sense of this
> data, and determine which field we should use for "CPU" (sockets), "cores"
> and "thread" in https://cfarm.tetaneutral.net/machines/list/ ?
>
> From a first look, cores and threads seem to be correctly detected.
> To get the number of CPU sockets, "NUMA node" seems rather unreliable
> compared to "package", but both sometimes give strange results
> (e.g. gcc110).

Hello

Old POWER machines (e.g. gcc110 POWER7) are known to report strange or
invalid topology information, for instance by reporting one CPU package per
core. I don't remember the reason but IBM developers didn't want to fix
their firmware because it would break something else. Things are reported
correctly on modern POWER8/9 afaik.

Some ARM platforms (gcc117,118) have a similar issue but things are
improving now that vendors are implementing the PPTT ACPI table (once you
use a recent Linux kernel).

Using NUMA nodes for CPU sockets is indeed unreliable these days. Most
vendors can expose multiple NUMA nodes per CPU package.

Brice (the main hwloc developer)

>
> Maybe we should just stop trying to determine the number of CPU sockets
> except on x86 systems? Does anybody need this kind of data?
>
>
> The data above was obtained with "hwloc-calc -N $type all", the full command
> is:
>
> # echo M$(hwloc-calc -N machine all) N$(hwloc-calc -N numanode all
> 2>/dev/null) P$(hwloc-calc -N package all 2>/dev/null) C$(hwloc-calc -N core
> all) T$(hwloc-calc -N pu all)
>
> Thanks,
> Baptiste
>
> [1]
> https://lists.tetaneutral.net/pipermail/cfarm-users/2018-November/000424.html
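
For anyone who wants to reproduce the table on another machine, the quoted one-liner could be wrapped in a small shell function. This is only a sketch, not what was actually run on the farm: the names HWLOC_CALC, count_objects and topology_summary are made up for illustration, and unlike the quoted command it silences errors for every object type rather than only for numanode and package.

```shell
#!/bin/sh
# Command used to query hwloc. HWLOC_CALC is a hypothetical override
# (not an hwloc feature) so another binary or a stub can be swapped in.
HWLOC_CALC="${HWLOC_CALC:-hwloc-calc}"

# Print how many objects of the given type exist in the whole machine,
# or print nothing if the query fails for that type.
count_objects() {
    "$HWLOC_CALC" -N "$1" all 2>/dev/null
}

# Print one summary line in the same "M N P C T" layout as the table.
# A field with no number means the query for that type returned nothing.
topology_summary() {
    printf 'M%s N%s P%s C%s T%s\n' \
        "$(count_objects machine)" \
        "$(count_objects numanode)" \
        "$(count_objects package)" \
        "$(count_objects core)" \
        "$(count_objects pu)"
}
```

After sourcing this, calling `topology_summary` prints a single row shaped like the ones in the table. As noted above, the P field should still be read with care on old POWER and some ARM machines.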
_______________________________________________ cfarm-users mailing list [email protected] https://lists.tetaneutral.net/listinfo/cfarm-users
