On 02/02/2012 00:12, Paul H. Hargrove wrote:
>
>
> On 2/1/2012 5:20 AM, Brice Goglin wrote:
>> On 01/02/2012 03:49, Christopher Samuel wrote:
>>> With XLC and 1.3.1 and 1.4 I get plenty of warnings (compile logs for
>>> both attached) whilst compiling and then 4 failures in make check
>>> (accompanied with segmentation faults):
>>>
>>> samuel@tambo:~/HWLOC/hwloc-1.3.1> grep -B1 FAIL: log
>>> /bin/sh: line 1: 5267 Segmentation fault ${dir}$tst
>>> FAIL: hwloc_bind
>>> /bin/sh: line 1: 5285 Segmentation fault ${dir}$tst
>>> FAIL: hwloc_get_last_cpu_location
>>> /bin/sh: line 1: 5335 Segmentation fault ${dir}$tst
>>> FAIL: hwloc_is_thissystem
>>> /bin/sh: line 1: 5481 Segmentation fault ${dir}$tst
>>> FAIL: glibc-sched
>> All these tests involve binding, which is likely broken (see below).
>>
>>
>> "/vlsci/VLSCI/samuel/HWLOC/hwloc-1.3.1/include/hwloc.h", line 1203.28:
>> 1506-1385 (W) The attribute "pure" is not a valid type attribute.
>> CC traversal.lo
>>
>> Attribute pure is before the function name, I'll move it after, XLC
>> doesn't seem to warn in this case.
>>
>>
>> "distances.c", line 62.42: 1506-404 (W) restrict can only qualify a
>> pointer type.
>> "distances.c", line 84.50: 1506-404 (W) restrict can only qualify a
>> pointer type.
>> "distances.c", line 226.40: 1506-404 (W) restrict can only qualify a
>> pointer type.
>>
>> XLC may be wrong here, topology_t is typedef'ed to a pointer...
>
>
> I've seen this sort of thing before where "pointerness" was ignored
> when "inside" the typedef.
> Since this is only a warning, and a missing "restrict" should not
> impact correctness, I vote to ignore this.
>
>
>>
>>
>> "topology-linux.c", line 303.33: 1506-280 (W) Function argument
>> assignment between types "unsigned int" and "struct {...}*" is not
>> allowed.
>> "topology-linux.c", line 303.27: 1506-098 (E) Missing argument(s).
>> "topology-linux.c", line 391.32: 1506-280 (W) Function argument
>> assignment between types "unsigned int" and "struct {...}*" is not
>> allowed.
>> "topology-linux.c", line 391.26: 1506-098 (E) Missing argument(s).
>> "topology-linux.c", line 715.40: 1506-280 (W) Function argument
>> assignment between types "unsigned int" and "struct {...}*" is not
>> allowed.
>> "topology-linux.c", line 715.34: 1506-098 (E) Missing argument(s).
>> "topology-linux.c", line 807.40: 1506-280 (W) Function argument
>> assignment between types "unsigned int" and "struct {...}*" is not
>> allowed.
>> "topology-linux.c", line 807.34: 1506-098 (E) Missing argument(s).
>>
>> This looks very bad. It means something screwed up the already very
>> complex sched_setaffinity detection code.
>> Does XLC redefine its own sched_setaffinity functions? Can you find the
>> relevant header file and send it?
>> PGI had similar problems at some point. That's very annoying.
>> This explains why binding tests broke.
>
> I cannot find any instances within the /opt/apps/ibm tree on this
> machine:
>> $ find /opt/apps/ibm -name \*.h|xargs grep affi
>> find: `/opt/apps/ibm/vac/11.1/lap/license': Permission denied
>> find: `/opt/apps/ibm/essl/5.1/lap/license': Permission denied
>> find: `/opt/apps/ibm/xlf/13.1/lap/license': Permission denied
>> /opt/apps/ibm/xlsmp/2.1/include/omp.h: ibm_sched_affinity= 1000/*
>> AFFINITY scheduling type. This is an IBM extension. */
>> $ find /opt/apps/ibm -name \*.h|xargs grep cpu_set_t
>> find: `/opt/apps/ibm/vac/11.1/lap/license': Permission denied
>> find: `/opt/apps/ibm/essl/5.1/lap/license': Permission denied
>> find: `/opt/apps/ibm/xlf/13.1/lap/license': Permission denied
>
>
> The generated config.h contains:
>> #define HWLOC_HAVE_OLD_SCHED_SETAFFINITY 1
>> #define HWLOC_HAVE_SCHED_SETAFFINITY 1
>
> The "OLD" sched_setaffinity is the 2-argument version, but
> /usr/include/sched.h contains the 3-argument version:
>> extern int sched_setaffinity (__pid_t __pid, size_t __cpusetsize,
>> __const cpu_set_t *__cpuset) __THROW;
>
> So, it would appear that configure has wrongly set
> "HWLOC_HAVE_OLD_SCHED_SETAFFINITY".
>
> Examining config.log I find
>> configure:9046: checking for old prototype of sched_setaffinity
>> configure:9064: xlc -c conftest.c >&5
>> "conftest.c", line 82.19: 1506-236 (W) Macro name _GNU_SOURCE has
>> been redefined.
>> "conftest.c", line 82.19: 1506-358 (I) "_GNU_SOURCE" is defined on
>> line 25 of conftest.c.
>> "conftest.c", line 89.23: 1506-280 (W) Function argument assignment
>> between types "unsigned long" and "void*" is not allowed.
>> "conftest.c", line 89.19: 1506-098 (E) Missing argument(s).
>> configure:9064: $? = 0
>> configure:9068: result: yes
>
> This is WRONG.
> The compiler has reported an error: "(E) Missing argument(s)" and yet
> exited with $? = 0
>
> I am looking at xlc docs to see if there is some compiler flag to be set.

Thanks for the debugging. This makes my last mail to Christopher useless then :)

Brice
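
To illustrate the failure Paul describes, where the compiler prints an "(E)" diagnostic yet exits with status 0, here is a rough sketch of a check that looks at the diagnostics as well as the exit status. This is an assumption-laden example, not what hwloc's configure does: `compile_really_succeeded` is a hypothetical helper, and it assumes xlc tags error messages with "(E)" after the message number, exactly as in the config.log excerpt quoted above.

```sh
# Hypothetical helper, not part of hwloc's configure script.
#   $1 = exit status reported by the compiler
#   $2 = file holding the compiler's diagnostics
# Succeeds only if the status is 0 AND no diagnostic carries the "(E)" tag
# (assumed format: "<number>-<number> (E) <message>", as in the quoted log).
compile_really_succeeded() {
    status=$1
    log=$2

    # A non-zero exit status is always a failure.
    [ "$status" -eq 0 ] || return 1

    # Exit status 0 is not trusted on its own: scan for error-level messages.
    if grep -q -E '[0-9]+-[0-9]+ \(E\) ' "$log"; then
        return 1
    fi
    return 0
}
```

A probe could then run the compiler with its output redirected to a log file and pass `$?` plus that file to the helper. With the conftest.c output quoted above, the helper reports failure even though `$?` is 0, so the "old prototype" check would not answer "yes".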