On 2/1/2012 5:20 AM, Brice Goglin wrote:
On 01/02/2012 03:49, Christopher Samuel wrote:
With XLC and 1.3.1 and 1.4 I get plenty of warnings (compile logs for
both attached) whilst compiling and then 4 failures in make check
(accompanied with segmentation faults):

samuel@tambo:~/HWLOC/hwloc-1.3.1>  grep -B1 FAIL: log
/bin/sh: line 1:  5267 Segmentation fault      ${dir}$tst
FAIL: hwloc_bind
/bin/sh: line 1:  5285 Segmentation fault      ${dir}$tst
FAIL: hwloc_get_last_cpu_location
/bin/sh: line 1:  5335 Segmentation fault      ${dir}$tst
FAIL: hwloc_is_thissystem
/bin/sh: line 1:  5481 Segmentation fault      ${dir}$tst
FAIL: glibc-sched
All these tests involved binding, which is likely broken (see below).


"/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28:
1506-1385 (W) The attribute "pure" is not a valid type attribute.
   CC     traversal.lo

The pure attribute is placed before the function name. I'll move it after
the name; XLC doesn't seem to warn in that case.


"distances.c", line 62.42: 1506-404 (W) restrict can only qualify a
pointer type.
"distances.c", line 84.50: 1506-404 (W) restrict can only qualify a
pointer type.
"distances.c", line 226.40: 1506-404 (W) restrict can only qualify a
pointer type.

XLC may be wrong here, topology_t is typedef'ed to a pointer...


I've seen this sort of thing before where "pointerness" was ignored when "inside" the typedef. Since this is only a warning, and a missing "restrict" should not impact correctness, I vote to ignore this.




"topology-linux.c", line 303.33: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 303.27: 1506-098 (E) Missing argument(s).
"topology-linux.c", line 391.32: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 391.26: 1506-098 (E) Missing argument(s).
"topology-linux.c", line 715.40: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 715.34: 1506-098 (E) Missing argument(s).
"topology-linux.c", line 807.40: 1506-280 (W) Function argument
assignment between types "unsigned int" and "struct {...}*" is not allowed.
"topology-linux.c", line 807.34: 1506-098 (E) Missing argument(s).

This looks very bad. It means something broke the already very complex
sched_setaffinity detection code.
Does XLC redefine its own sched_setaffinity functions? Can you find the
relevant header file and send it?
PGI had similar problems at some point. That's very annoying.
This explains why binding tests broke.

I cannot find any instances within the /opt/apps/ibm tree on this machine:
$ find /opt/apps/ibm -name \*.h|xargs grep affi
find: `/opt/apps/ibm/vac/11.1/lap/license': Permission denied
find: `/opt/apps/ibm/essl/5.1/lap/license': Permission denied
find: `/opt/apps/ibm/xlf/13.1/lap/license': Permission denied
/opt/apps/ibm/xlsmp/2.1/include/omp.h: ibm_sched_affinity= 1000/* AFFINITY scheduling type. This is an IBM extension. */
$ find /opt/apps/ibm -name \*.h|xargs grep cpu_set_t
find: `/opt/apps/ibm/vac/11.1/lap/license': Permission denied
find: `/opt/apps/ibm/essl/5.1/lap/license': Permission denied
find: `/opt/apps/ibm/xlf/13.1/lap/license': Permission denied


The generated config.h contains:
#define HWLOC_HAVE_OLD_SCHED_SETAFFINITY 1
#define HWLOC_HAVE_SCHED_SETAFFINITY 1

The "OLD" sched_setaffinity is the 2-argument version, but /usr/include/sched.h contains the 3-argument version:
extern int sched_setaffinity (__pid_t __pid, size_t __cpusetsize,
                              __const cpu_set_t *__cpuset) __THROW;

So, it would appear that configure has wrongly set "HWLOC_HAVE_OLD_SCHED_SETAFFINITY".
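
To make the consequence concrete, here is a minimal sketch of how code typically selects between the two call forms from that config.h macro. It is not hwloc's actual source: rebind_to_current_mask is a hypothetical helper, and only the macro name and the two prototypes are taken from the discussion above. When the macro is wrongly defined on a system whose sched.h declares the 3-argument version, the 2-argument branch is compiled, which matches the "Missing argument(s)" diagnostics in topology-linux.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc exposes cpu_set_t and the CPU_* macros under this */
#endif
#include <sched.h>

/* Hypothetical helper (illustration only): read the calling thread's
 * affinity mask and apply it again, i.e. a binding that changes nothing.
 * Returns 0 on success, -1 on failure. */
static int rebind_to_current_mask(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;

#ifdef HWLOC_HAVE_OLD_SCHED_SETAFFINITY
    /* Old 2-argument prototype; only valid on systems that declare it. */
    return sched_setaffinity(0, &set);
#else
    /* 3-argument prototype, as declared in /usr/include/sched.h here. */
    return sched_setaffinity(0, sizeof(set), &set);
#endif
}
```

With the macro left undefined, calling rebind_to_current_mask() from a test program should return 0 on a Linux system using the 3-argument prototype.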

Examining config.log I find
configure:9046: checking for old prototype of sched_setaffinity
configure:9064: xlc -c   conftest.c >&5
"conftest.c", line 82.19: 1506-236 (W) Macro name _GNU_SOURCE has been redefined.
"conftest.c", line 82.19: 1506-358 (I) "_GNU_SOURCE" is defined on line 25 of conftest.c.
"conftest.c", line 89.23: 1506-280 (W) Function argument assignment between types "unsigned long" and "void*" is not allowed.
"conftest.c", line 89.19: 1506-098 (E) Missing argument(s).
configure:9064: $? = 0
configure:9068: result: yes

This is WRONG.
The compiler reported an error, "(E) Missing argument(s)", and yet exited with $? = 0, so configure treated the test compile as a success and answered "yes".
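
For illustration only, a minimal C sketch of a stricter success test: treat the compile as failed if the exit status is non-zero or if the diagnostics contain an "(E)" tag like the one in the config.log excerpt. compile_failed is a hypothetical helper, not part of configure or hwloc, and matching on the "(E)" tag is an assumption based solely on this log.

```c
#include <string.h>

/* Hypothetical helper (illustration only): decide whether a test compile
 * should count as failed. A non-zero exit status always counts as failure.
 * Assumption: error-level diagnostics carry an " (E) " tag, as in the
 * config.log excerpt quoted in this message. */
static int compile_failed(int exit_status, const char *diagnostics)
{
    if (exit_status != 0)
        return 1;
    return strstr(diagnostics, " (E) ") != NULL;
}
```

Given the log line "conftest.c", line 89.19: 1506-098 (E) Missing argument(s). and an exit status of 0, this helper reports a failure, whereas a log containing only (W) or (I) lines does not.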

I am looking at xlc docs to see if there is some compiler flag to be set.

-Paul

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
