This node is an IBM "Power 750 Express server", described in detail at
http://www.redbooks.ibm.com/redpapers/pdfs/redp4638.pdf
Notably, it is a quad-socket chassis that can take 6-core or 8-core
processors.
However, lstopo is reporting 8 sockets of 4 cores each.
This discrepancy led me to recall the following from an email sent to
me by a colleague:
A surprise
to me is that the login nodes provide the appearance of having 32
cpu's, but those are in fact only 8 cores with 4 hyper-threads,
and they are in fact VM's on top of one socket of a compute node.
So, I am not certain what I should expect lstopo to report.
I suppose it is accurately reporting the virtual node's configuration.
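To make the two accountings concrete, here is a minimal Python sketch of
the arithmetic. The dictionary names and the helper logical_cpus are
hypothetical. The value of one hardware thread per core in the lstopo
view is an assumption inferred from the 32 processors listed in
/proc/cpuinfo, not something lstopo was confirmed to report.

```python
# Two descriptions of the same login node (illustrative values only).
# colleague_view: one socket, 8 cores, 4 hardware threads per core.
# lstopo_view: 8 sockets of 4 cores; threads_per_core=1 is an ASSUMPTION.
colleague_view = {"sockets": 1, "cores_per_socket": 8, "threads_per_core": 4}
lstopo_view = {"sockets": 8, "cores_per_socket": 4, "threads_per_core": 1}


def logical_cpus(view):
    """Multiply out the hierarchy to get the number of logical CPUs."""
    return view["sockets"] * view["cores_per_socket"] * view["threads_per_core"]


if __name__ == "__main__":
    for name, view in (("colleague", colleague_view), ("lstopo", lstopo_view)):
        print(name, logical_cpus(view))
```

Both views multiply out to the same number of logical CPUs, so the
disagreement is only about how those CPUs are grouped into sockets and
cores.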
I bring this up because it may very well be related to the assertion
failures.
My guess is that some part of hwloc has seen past the "virtual"
configuration to the "physical" one, and that the assertion failure
reflects the resulting inconsistency. But that is just a guess. Let me
know how I might help
debug this failure.
-Paul
On 1/31/2012 7:12 PM, Paul H. Hargrove wrote:
The problem I reported below also exists in hwloc-1.4.1.
Additionally, I can reproduce the SEGVs with xlc that Chris Samuel
reported in
http://www.open-mpi.org/community/lists/hwloc-devel/2012/01/2738.php
-Paul
On 1/31/2012 5:56 PM, Paul H. Hargrove wrote:
When running "make check" in hwloc-1.3.1 on a Linux/POWER7 system I see:
lt-linux-libnuma:
/users/phh1/OMPI/hwloc-1.3.1-linux-ppc64-gcc//hwloc-1.3.1/tests/linux-libnuma.c:53:
main: Assertion `hwloc_bitmap_isequal(set, set2)' failed.
/bin/sh: line 5: 21415 Aborted ${dir}$tst
FAIL: linux-libnuma
I've reproduced that failure with 4 different compilers (3 gcc's and
an xlc).
The xlc-built hwloc-1.3.1 also fails an additional test:
lt-glibc-sched:
/users/phh1/OMPI/hwloc-1.3.1-linux-ppc64-xlc-11.1//hwloc-1.3.1/tests/glibc-sched.c:43:
main: Assertion `!err' failed.
/bin/sh: line 5: 7077 Aborted ${dir}$tst
FAIL: glibc-sched
The contents of /proc/cpuinfo are:
processor : 0
cpu : POWER7 (architected), altivec supported
clock : 3550.000000MHz
revision : 2.0 (pvr 003f 0200)
[30 more of the same]
processor : 31
cpu : POWER7 (architected), altivec supported
clock : 3550.000000MHz
revision : 2.0 (pvr 003f 0200)
timebase : 512000000
platform : pSeries
model : IBM,8233-E8B
machine : CHRP IBM,8233-E8B
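For anyone wanting to cross-check the logical CPU count independently of
hwloc, a minimal Python sketch that counts the "processor" entries in
text of the above form might look like the following. The function name
processor_ids and the abbreviated SAMPLE string are hypothetical. SAMPLE
holds only the first and last entries shown above, not the full file.

```python
import re


def processor_ids(cpuinfo_text):
    """Return the integer IDs found on 'processor : N' lines."""
    pattern = re.compile(r"^processor[ \t]*:[ \t]*(\d+)[ \t]*$", re.MULTILINE)
    return [int(m.group(1)) for m in pattern.finditer(cpuinfo_text)]


# Abbreviated sample: only the two entries quoted in full in this message.
SAMPLE = """\
processor : 0
cpu : POWER7 (architected), altivec supported
clock : 3550.000000MHz
revision : 2.0 (pvr 003f 0200)

processor : 31
cpu : POWER7 (architected), altivec supported
clock : 3550.000000MHz
revision : 2.0 (pvr 003f 0200)

timebase : 512000000
platform : pSeries
"""

if __name__ == "__main__":
    ids = processor_ids(SAMPLE)
    # Number of entries present, and the count implied by the highest ID.
    print(len(ids), max(ids) + 1)
```

On the node itself, the same function could be given the contents of
/proc/cpuinfo, for example processor_ids(open("/proc/cpuinfo").read()).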
Let me know of any other h/w or s/w info I can report.
-Paul
--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900