Thanks for the help. I successfully created the XML from a good machine and
used it on the buggy machine. Both lstopo and hwloc-info now report the
topology correctly, and I no longer get the error when running MPI.
David
> On Jan 7, 2016, at 10:29 AM, Brice Goglin wrote:
>
> Hello
>
> Good to know, thanks.
Hello
Good to know, thanks.
There are two ways to work around the issue:
* Run "lstopo foo.xml" on a node that doesn't have the bug, then export
HWLOC_XMLFILE=foo.xml and HWLOC_THISSYSTEM=1 on the buggy nodes (that's
what you call a "map" below). This works with very old hwloc releases too.
* export HWLOC_CO
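The first workaround can be sketched as follows (a minimal sketch, assuming hwloc is installed on both nodes; the file path is a placeholder):

```shell
# Step 1 (on a node WITHOUT the bug): export its topology to XML.
#   lstopo good-node.xml
#
# Step 2 (on each buggy node, after copying good-node.xml over):
# point hwloc at the XML and tell it to trust it as the local topology.
export HWLOC_XMLFILE=/path/to/good-node.xml   # load topology from this XML
export HWLOC_THISSYSTEM=1                     # treat it as this machine's topology

# Step 3: launch MPI as usual; hwloc now uses the imported topology
# instead of the buggy kernel-reported one.
```

Both variables must be set in the environment of the MPI processes themselves, so they typically belong in the job script or shell profile on the buggy nodes.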
Brice,
Thanks for the information! It’s good to know it wasn’t a flaw in the upgrade.
This bug must have been introduced in kernel 3.x. I ran lstopo on one of our
servers that still runs CentOS 6.5, and it correctly reports one L3 cache for
every 6 cores, as shown below.
We have 75 servers with the e
Thanks, I copied useful information from this thread and some links to
https://github.com/open-mpi/hwloc/issues/143
However, not sure I'll have time to look at this in the near future :/
Brice
On 07/01/2016 at 09:03, Matthias Reich wrote:
> Hello,
>
> To check whether kstat is able to rep
Hello,
To check whether kstat is able to report the psrset definitions, I
defined a set consisting of 2 CPUs (psrset -c 1-2), namely CPU1 and CPU2.
The remaining CPUs (CPU0, CPU3..CPU23) were left outside the set.
On the machine, we can execute the "kstat" command and receive (among
1000s of lines) the follow
Hello
This is a kernel bug for 12-core AMD Bulldozer/Piledriver (62xx/63xx)
processors. hwloc is just complaining about buggy L3 information. lstopo
should report one L3 above each set of 6 cores below each NUMA node.
Instead you get strange L3s covering 2, 4, or 6 cores.
If you're not binding tasks b