Hello
This is a kernel bug affecting 12-core AMD Bulldozer/Piledriver (62xx/63xx)
processors. hwloc is just complaining about buggy L3 information. lstopo
should report one L3 above each set of 6 cores below each NUMA node.
Instead you get strange L3s with 2, 4 or 6 cores.
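One quick way to see the bad L3 split on an affected node is to filter lstopo's console output for the L3 lines (a minimal sketch; it assumes lstopo is in PATH, hence the guard):

```shell
# Print only the L3 lines from lstopo's text output. On a healthy node
# each L3 should cover one group of 6 cores; on a buggy node you will
# see L3s covering 2 or 4 cores instead.
# ("|| true" keeps the sketch from failing where no L3 line is listed.)
if command -v lstopo >/dev/null 2>&1; then
    lstopo --of console | grep "L3" || true
else
    echo "lstopo not installed"
fi
```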
If you're not binding tasks b
Hello,
To check whether kstat is able to report the psrset definitions, I
defined a set consisting of 2 CPUs (psrset -c 1-2), i.e. CPU1 and CPU2. The
remaining CPUs (CPU0, CPU3..CPU23) were left outside the set.
On the machine, we can execute the "kstat" command and receive (among
1000s of lines) the follow
Thanks, I copied useful information from this thread and some links to
https://github.com/open-mpi/hwloc/issues/143
However, not sure I'll have time to look at this in the near future :/
Brice
On 07/01/2016 09:03, Matthias Reich wrote:
> Hello,
>
> To check whether kstat is able to rep
Brice,
Thanks for the information! It’s good to know it wasn’t a flaw in the upgrade.
This bug must have been introduced in kernel 3.x. I ran lstopo on one of our
servers that still runs CentOS 6.5, and it correctly reports one L3 cache for
every 6 cores, as shown below.
We have 75 servers with the e
Hello
Good to know, thanks.
There are two ways to work around the issue:
* run "lstopo foo.xml" on a node that doesn't have the bug, then export
HWLOC_XMLFILE=foo.xml and HWLOC_THISSYSTEM=1 on the buggy nodes (that's
what you call a "map" below). This works with very old hwloc releases.
* export HWLOC_CO
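The first workaround above can be sketched as a pair of shell steps ("foo.xml" is just the example filename from the thread; the guard around lstopo is only so the sketch runs where the tool is absent):

```shell
# Step 1: on a node whose L3 caches are reported correctly, export the
# full topology as XML.
if command -v lstopo >/dev/null 2>&1; then
    lstopo foo.xml            # writes the topology out as XML
fi

# Step 2: copy foo.xml to each buggy node, then before running MPI or
# other hwloc-based tools there:
export HWLOC_XMLFILE=foo.xml   # tell hwloc to read topology from this file
export HWLOC_THISSYSTEM=1      # ...and to trust it as the local topology
```

With both variables set, hwloc consumers on the buggy node see the corrected topology instead of the kernel's broken L3 information.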
Thanks for the help. I successfully created the XML from a good machine and
used it on the buggy machine. Both lstopo and hwloc-info report correctly and I
no longer get the error when running MPI.
David
> On Jan 7, 2016, at 10:29 AM, Brice Goglin wrote:
>
> Hello
>
> Good to know, thanks.