[hwloc-users] glibc struggling with get_nprocs and get_nprocs_conf
Hello,

For information, glibc is struggling with the question of the precise meaning of get_nprocs, get_nprocs_conf, _SC_NPROCESSORS_CONF, and _SC_NPROCESSORS_ONLN:

https://sourceware.org/pipermail/libc-alpha/2022-February/136177.html

Samuel
___ hwloc-users mailing list hwloc-users@lists.open-mpi.org https://lists.open-mpi.org/mailman/listinfo/hwloc-users
Re: [hwloc-users] hwloc cant detect hardware topology error.
Yogesh Sharma, on Sat 18 Jul 2020 15:59:57 +0530, wrote:
> i am new to ubuntu. can you give me a moment and help me get through command
> lines here

It is really just the same as you tried, but with the libhwloc-dev package instead of hwloc.

Samuel
Re: [hwloc-users] hwloc cant detect hardware topology error.
Hello,

Yogesh Sharma, on Sat 18 Jul 2020 15:02:30 +0530, wrote:
> i tried sudo apt -get hwloc=1.11.3

It is the libhwloc-dev package that you need to downgrade. But 1.11 is old, and the software really needs to be ported to the hwloc 2 API.

Samuel
Re: [hwloc-users] One more silly warning squash
Balaji, Pavan, on Tue 02 Jun 2020 09:31:29 +, wrote:
> > On Jun 1, 2020, at 4:11 AM, Balaji, Pavan via hwloc-users wrote:
> >> On Jun 1, 2020, at 4:10 AM, Balaji, Pavan wrote:
> >>> On Jun 1, 2020, at 4:06 AM, Samuel Thibault wrote:
> >>> could you check whether the attached patch avoids the warning?
> >>> (we should really not need a cast to const char*)
> >>
> >> The attached patch is basically the same as what we are using, isn't it?
> >> It does avoid the warning.
> >
> > Oh, sorry, I see now that you skipped the extra cast in that case. Let me
> > try it out and get back to you.
>
> I've verified that the patch works.

Ok, I pushed the fix to master, thanks!

Samuel
Re: [hwloc-users] One more silly warning squash
Balaji, Pavan, on Mon 01 Jun 2020 09:10:21 +, wrote:
> > On Jun 1, 2020, at 4:06 AM, Samuel Thibault wrote:
> > could you check whether the attached patch avoids the warning?
> > (we should really not need a cast to const char*)
>
> The attached patch is basically the same as what we are using, isn't it?

Yes, but without the cast, which a compiler should really not require :)

> It does avoid the warning.

Ok, thanks.

Samuel
Re: [hwloc-users] One more silly warning squash
Hello,

Balaji, Pavan via hwloc-users, on Mon 01 Jun 2020 03:39:02 +, wrote:
> We are seeing some warnings with the Intel compiler with hwloc (listed
> below). The warnings seem to be somewhat silly because there already is a
> cast to "char *" from the string literal,

Well, I'd agree with icc that casting a string literal to (char*) is in general a bad idea :)

> but it seems to expect a cast to "const char *" before casting to "char *".
> We are maintaining the below patch to workaround it. Can you either
> integrate this or a better fix for the warning?
>
> https://github.com/pmodels/hwloc/commit/fb27dc6e21bac14754d1b50b57f752e37d475704

I fixed them except:

> CC topology-hardwired.lo
> traversal.c(598): warning #3179: deprecated conversion of string literal to char* (should be const char*)
> const char *quote = strchr(info->value, ' ') ? "\"" : "";
> ^

Which is more silly (these are already const char* in essence) and just seems to me like icc's limited analysis. Our version of icc doesn't get these, could you check whether the attached patch avoids the warning? (we should really not need a cast to const char*)

Samuel

diff --git a/hwloc/traversal.c b/hwloc/traversal.c
index 4062a19d..14549422 100644
--- a/hwloc/traversal.c
+++ b/hwloc/traversal.c
@@ -654,7 +654,11 @@ hwloc_obj_attr_snprintf(char * __hwloc_restrict string, size_t size, hwloc_obj_t
     unsigned i;
     for(i=0; i<obj->infos_count; i++) {
       struct hwloc_info_s *info = &obj->infos[i];
-      const char *quote = strchr(info->value, ' ') ? "\"" : "";
+      const char *quote;
+      if (strchr(info->value, ' '))
+        quote = "\"";
+      else
+        quote = "";
       res = hwloc_snprintf(tmp, tmplen, "%s%s=%s%s%s",
                            prefix,
                            info->name,
Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
Brice Goglin, on Sun 12 Nov 2017 05:19:37 +0100, wrote:
> That's likely what's happening. Each set_area() may be creating a new "virtual
> memory area". The kernel tries to merge them with neighbors if they go to the
> same NUMA node. Otherwise it creates a new VMA.

Mmmm, that sucks. Ideally we'd have a way to ask the kernel not to strictly bind the memory, but just to allocate on a given memory node, and just hope that the allocation will not go away (e.g. due to swapping), which thus doesn't need a VMA to record the information. As you describe below, first-touch achieves that but it's not necessarily so convenient.

> I can't find the exact limit but it's something like 64k so I guess
> you're exhausting that.

It's sysctl vm.max_map_count

>     Question 2 : Is there a better way of achieving the result I'm looking for
>     (such as a call to membind with a stride of some kind to say put N pages in
>     a row on each domain in alternation).
>
> Unfortunately, the interleave policy doesn't have a stride argument. It's one
> page on node 0, one page on node 1, etc.
>
> The only idea I have is to use the first-touch policy: Make sure your buffer
> isn't in physical memory yet, and have a thread on node 0 read the "0" pages,
> and another thread on node 1 read the "1" page.

Or "next-touch" if that was to ever get merged into mainline Linux :)

Samuel
Re: [hwloc-users] linkspeed in hwloc_obj_attr_u::hwloc_pcidev_attr_s struct while traversing topology
Hello,

TEJASWI k, on Fri 13 Oct 2017 14:44:53 +0530, wrote:
> Thanks I could get the linkspeed when i tried with root user.
> But is there no other way?

See Brice's answer :)

> And what is the reason behind this limitation?

Ask Linux people, not us :) I can only guess that they are afraid of exposing too much config information, and thus only whitelist the first part of the PCI configuration space.

Samuel
Re: [hwloc-users] linkspeed in hwloc_obj_attr_u::hwloc_pcidev_attr_s struct while traversing topology
Hello,

TEJASWI k, on Fri 13 Oct 2017 14:23:00 +0530, wrote:
> All the other details I am able to query but linkspeed
> (pciObj->attr->bridge.upstream.pci.linkspeed) is always 0.
> Do I need to enable any other flag to get linkspeed or am I going wrong
> somewhere?

You need to run as root for hwloc to be able to read the linkspeed from Linux.

Samuel
[hwloc-users] process names in lstopo --ps
Hello,

The other day I modified the output of lstopo --ps to contain the end of the cmdline instead of the beginning, because with module systems, spack, etc. the paths to application binaries get longer and longer, and eventually the actual name of the binary goes away on the right. But conversely, now it's the end of the options passed to the application which shows up, and when there are a lot of them, the application name goes away again, on the left.

Would it be fine with people to just get rid of the path leading to the application binary, and only show the file name of the binary in the output of lstopo --ps? (Basically, taking /proc/$pid/comm instead of trying to find the right part of /proc/$pid/cmdline to be displayed.)

Samuel
Re: [OMPI users] Message reception not getting pipelined with TCP
Gilles Gouaillardet, on Fri 21 Jul 2017 10:57:36 +0900, wrote:
> if you are fine with using more memory, and your application should not
> generate too much unexpected messages, then you can bump the eager_limit
> for example
>
> mpirun --mca btl_tcp_eager_limit $((8*1024*1024+128)) ...

Thanks for the workaround! Normally we shouldn't have many unexpected messages; the memory consumption would be concerning, though.

Samuel
___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users
Re: [OMPI users] Message reception not getting pipelined with TCP
Hello,

George Bosilca, on Thu 20 Jul 2017 19:05:34 -0500, wrote:
> Can you reproduce the same behavior after the first batch of messages ?

Yes, putting a loop around the whole series of communications, even with a 1-second pause in between, gets the same behavior repeated.

> Assuming the times showed on the left of your messages are correct, the first
> MPI seems to deliver the entire set of messages significantly faster than the
> second.

The second log was with mpich2.

Samuel
[OMPI users] Message reception not getting pipelined with TCP
Hello,

We are getting a strong performance issue, which is due to a missing pipelining behavior from OpenMPI when running over TCP. I have attached a test case. Basically what it does is

  if (myrank == 0) {
    for (i = 0; i < N; i++)
      MPI_Isend(...);
  } else {
    for (i = 0; i < N; i++)
      MPI_Irecv(...);
  }
  for (i = 0; i < N; i++)
    MPI_Wait(...);

with corresponding printfs. And the result is:

  0.182620: Isend 0 begin
  0.182761: Isend 0 end
  0.182766: Isend 1 begin
  0.182782: Isend 1 end
  ...
  0.183911: Isend 49 begin
  0.183915: Isend 49 end
  0.199028: Irecv 0 begin
  0.199068: Irecv 0 end
  0.199070: Irecv 1 begin
  0.199072: Irecv 1 end
  ...
  0.199187: Irecv 49 begin
  0.199188: Irecv 49 end
  0.233948: Isend 0 done!
  0.269895: Isend 1 done!
  ...
  1.982475: Isend 49 done!
  1.984065: Irecv 0 done!
  1.984078: Irecv 1 done!
  ...
  1.984131: Irecv 49 done!

i.e. almost two seconds elapse between the start of the application and the completion of the first Irecv, and then all the other Irecvs complete immediately too, i.e. it seems the communications were all grouped together.

This is really bad, because in our real use case, we trigger computations after each MPI_Wait call, and we use several messages so as to pipeline things: the first computation can start as soon as one message gets received, and is thus overlapped with further receptions.

This problem is only with openmpi on TCP, I'm not getting this behavior with openmpi on IB, and I'm not getting it either with mpich or madmpi:

  0.182168: Isend 0 begin
  0.182235: Isend 0 end
  0.182237: Isend 1 begin
  0.182242: Isend 1 end
  ...
  0.182842: Isend 49 begin
  0.182844: Isend 49 end
  0.200505: Irecv 0 begin
  0.200564: Irecv 0 end
  0.200567: Irecv 1 begin
  0.200569: Irecv 1 end
  ...
  0.201233: Irecv 49 begin
  0.201234: Irecv 49 end
  0.269511: Isend 0 done!
  0.273154: Irecv 0 done!
  0.341054: Isend 1 done!
  0.344507: Irecv 1 done!
  ...
  3.767726: Isend 49 done!
  3.770637: Irecv 49 done!

There we do have pipelined reception.

Is there a way to get the second, pipelined behavior with openmpi on TCP?
Samuel

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/utsname.h>
#include <mpi.h>

/* run with mpirun --map-by node */

#define SIZE (8*1024*1024)
#define N 50
//#define DEBUG

int main(int argc, char *argv[]) {
  char *c[N];
  int rank;
  int i, repeat, flag;
  MPI_Request request[N];
  MPI_Status status;
  int done[N] = { 0 };
  char *actions[2] = { "Isend", "Irecv" };
  int ret;
  double start;
  struct utsname name;

  uname(&name);
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  fprintf(stderr,"I'm %d on %s\n", rank, name.nodename);
  MPI_Barrier(MPI_COMM_WORLD);
  start = MPI_Wtime();

  for (i = 0; i < N; i++) {
    c[i] = calloc(1,SIZE);
    c[i][0] = i;
    c[i][SIZE-1] = i;
  }

  if (rank == 0) {
    for (i = 0; i < N; i++) {
      fprintf(stderr,"%f: Isend %d begin\n", MPI_Wtime() - start, i);
      ret = MPI_Isend(c[i], SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &request[i]);
      //ret = MPI_Issend(c[i], SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &request[i]);
      //ret = MPI_Send(c[i], SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
      assert(ret == MPI_SUCCESS);
      fprintf(stderr,"%f: Isend %d end\n", MPI_Wtime() - start, i);
    }
  } else {
    for (i = 0; i < N; i++) {
      fprintf(stderr,"%f: Irecv %d begin\n", MPI_Wtime() - start, i);
      ret = MPI_Irecv(c[i], SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &request[i]);
      assert(ret == MPI_SUCCESS);
      fprintf(stderr,"%f: Irecv %d end\n", MPI_Wtime() - start, i);
    }
  }

  //if (rank) {
#if 0
  do {
    repeat = 0;
    for (i = 0; i < N; i++) {
      if (!done[i]) {
        repeat = 1;
#ifdef DEBUG
        fprintf(stderr,"%f: %s Test %d begin\n", MPI_Wtime() - start, actions[rank], i);
#endif
        ret = MPI_Test(&request[i], &done[i], &status);
        assert(ret == MPI_SUCCESS);
#ifdef DEBUG
        fprintf(stderr,"%f: %s Test %d end\n", MPI_Wtime() - start, actions[rank], i);
#endif
        if (done[i]) {
          fprintf(stderr,"%f: %s %d done!\n", MPI_Wtime() - start, actions[rank], i);
          if (rank) {
            assert(c[i][0] == i);
            assert(c[i][SIZE-1] == i);
          }
        }
      }
    }
  } while(repeat);
#elif 0
  repeat = N;
  do {
    ret = MPI_Testany(N, request, &i, &flag, &status);
    assert(ret == MPI_SUCCESS);
    if (flag) {
      fprintf(stderr,"%f: %s %d done!\n", MPI_Wtime() - start, actions[rank], i);
      if (rank) {
        assert(c[i][0] == i);
        assert(c[i][SIZE-1] == i);
      }
      repeat--;
    }
  } while (repeat);
#elif 0
  for (i = 0; i < N; i++) {
    do {
#ifdef DEBUG
      fprintf(stderr,"%f: %s Test %d begin\n", MPI_Wtime() - start, actions[rank], i);
#endif
      ret = MPI_Test(&request[i], &flag, &status);
      assert(ret == MPI_SUCCESS);
#ifdef DEBUG
      fprintf(stderr,"%f: %s Test %d end\n", MPI_Wtime() - start, actions[rank], i);
#endif
    } while(!flag);
    fprintf(stderr,"%f: %s %d done!\n", MPI_Wtime() - start, actions[rank], i);
    if (rank) {
      assert(c[i][0] == i);
      assert(c[i][SIZE-1] == i);
    }
  }
#else
  for (i = 0; i < N; i++) {
    ret = MPI_Wait(&request[i], &status);
    assert(ret == MPI_SUCCESS);
    fprintf(stderr,"%f: %s %d done!\n", MPI_Wtime() - start, actions[rank], i);
    if (rank) {
Re: [hwloc-users] ? Finding cache & pci info on SPARC/Solaris 11.3
Hello,

Maureen Chew, on Thu 08 Jun 2017 10:51:56 -0400, wrote:
> Should finding cache & pci info work?

AFAWK, there is no user-available way to get cache information on Solaris, so it's not implemented in hwloc. Concerning pci, you need libpciaccess to get PCI information.

Samuel
___ hwloc-users mailing list hwloc-users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
Re: [hwloc-users] Building hwloc for X11 on Mac OS X
Hello,

Gunter, David O, on Thu 04 May 2017 20:44:16 +, wrote:
> launching lstopo always produces the text-based output. I cannot seem
> to get the X-display features to work. And yes, I am able to launch
> xterms and other X11-based apps correctly.

Do you have the DISPLAY environment variable set? lstopo uses it to determine whether it should run the X11 output or not.

Samuel
Re: [hwloc-users] Building hwloc for a Cray/KNL system
Hello,

Gunter, David O, on Fri 27 Jan 2017 18:05:44 +, wrote:
> $ aprun -n 1 -L 193 ~hwloc-tt/bin/lstopo-no-graphics

Does aprun give you allocation of all cores? By default lstopo only shows the allocated cores. To see all of them, use the --whole-system option.

Samuel
Re: [hwloc-users] Issue running hwloc on Xeon-Phi Coprocessor uOS
Hello,

Jacob Peter Caswell, on Mon 16 Jan 2017 11:53:56 -0600, wrote:
> x86_64-k1om-linux-ld: i386:x86-64 architecture of input file `.libs/support.o'
> is incompatible with k1om output

Did you run make clean before reconfiguring and running make again?

Samuel
Re: [hwloc-users] hwloc on Zynq
Hello,

Alberto Ortiz, on Mon 12 Dec 2016 18:03:23 +0100, wrote:
> These gpios are included to the PS by looking into the device tree, and located
> in /sys/class.
> I know hwloc is able to find PCI devices, but i would like to know if hwloc is
> able to detect other type of I/O like the ones i've just mentioned

hwloc currently doesn't have support for gpios, but we could add it if there is enough information about them in /sys/class. What does it look like? On my LIME2 box, I only have /sys/class/gpio/gpiochip0 without much information since it's an integrated device. Could you send us a tarball of your /sys/class/gpio?

Samuel
Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.11.3 released
Brice Goglin, on Tue 26 Apr 2016 15:45:49 +0200, wrote:
> The Hardware Locality (hwloc) team is pleased to announce the release
> of v1.11.3:

I'm getting one testsuite issue:

FAIL: 16-2gr2gr2n2c+misc.xml

(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x77346d8e in __GI___strdup (s=0x0) at strdup.c:41
#2  0x004032ee in hwloc_utils_userdata_import_cb (topology=0x62a520, obj=0x639c00, name=0x639330 "normal:MyName0", buffer=0x0, length=0) at ../../utils/hwloc/misc.h:312
#3  0x77bb48e1 in hwloc__xml_import_userdata (topology=0x62a520, obj=0x639c00, state=0x7fffd2f0) at topology-xml.c:624
#4  0x77bb519e in hwloc__xml_import_object (topology=0x62a520, data=0x6399d0, obj=0x639c00, state=0x7fffd3e0) at topology-xml.c:766
#5  0x77bb5b27 in hwloc_look_xml (backend=0x6398e0) at topology-xml.c:1021
#6  0x77b9d962 in hwloc_discover (topology=0x62a520) at topology.c:2499
#7  0x77b9e974 in hwloc_topology_load (topology=0x62a520) at topology.c:2994
#8  0x004054e7 in main (argc=0, argv=0x7fffd728) at lstopo.c:734

312       u->buffer = strdup(buffer);
(gdb) p buffer
$1 = (const void *) 0x0

624       topology->userdata_import_cb(topology, obj, fakename, buffer, length);
(gdb) p buffer
$2 = 0x0

so it looks like

617       ret = state->global->get_content(state, &buffer, reallength);

didn't actually fill buffer, but

(gdb) p name
$13 = 0x64ff4c "MyName0"
(gdb) p encoded
$10 = 0
(gdb) p length
$11 = 0
(gdb) p reallength
$12 = 0

so maybe that's "expected" :)

I'll be using the attached patch in Debian.

Samuel

diff --git a/src/topology-xml.c b/src/topology-xml.c
index 220afd1..35fb19e 100644
--- a/src/topology-xml.c
+++ b/src/topology-xml.c
@@ -612,7 +612,7 @@ hwloc__xml_import_userdata(hwloc_topology_t topology __hwloc_attribute_unused, h
     return -1;
   } else if (topology->userdata_not_decoded) {
-    char *buffer, *fakename;
+    char *buffer = "", *fakename;
     size_t reallength = encoded ? BASE64_ENCODED_LENGTH(length) : length;
     ret = state->global->get_content(state, &buffer, reallength);
     if (ret < 0)
Re: [hwloc-users] Selecting real cores vs HT cores
Jeff Squyres (jsquyres), on Thu 11 Dec 2014 21:12:27 +, wrote:
> When the BIOS is set to enable hyper threading, then several resources on the
> core are split when the machine is booted up (e.g., some of the queue depths
> for various processing units in the core are half the length that they are
> when hyperthreading is disabled in the BIOS).

Perhaps some queues get divided, but most of the resources (such as cache, TLB, etc.) are completely available when using only one hyperthread, like they would be with HT disabled.

Samuel
Re: [hwloc-users] Processor numbering in Ivy-bridge
Vishwanath Venkatesan, on Mon 29 Sep 2014 13:38:35 -0700, wrote:
> I was trying to use HWLOC on Ivybridge. I found that there is some
> inconsistency in the core numbering.
>
> In the attached image (generated from running lstopo (hwloc - 1.9.1), we can
> see that cores 6,7 do not exist although, PU#6 and PU#7 does exist.

I am not very surprised. Those are physical numbers, which BIOS & such determine in various ways, and which may not be contiguous. If you are looking for a contiguous numbering, you need to have a look at the logical numbers, obtained from lstopo -l.

Samuel
Re: [hwloc-users] hwloc-ls graphical output
Dennis Jacobfeuerborn, on Thu 25 Sep 2014 02:01:48 +0200, wrote:
> The question I guess is how does the command determine the availability
> of png as an output? Both cairo and libpng are installed.

It depends on the backends which were built into cairo.

Samuel
Re: [hwloc-users] BGQ question.
Biddiscombe, John A., on Tue 25 Mar 2014 08:56:02 +, wrote:
> Looking at /proc/cpuinfo on the io node itself, I see only 60 cores listed. I
> wonder if they’ve reserved one socket of 4 cores for IO purposes

That's possible, yes.

> and in fact hwloc is seeing the correct information.

At least it provides the correct information according to the content of /proc and /sys.

Samuel
Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.8.1 released
Brice Goglin, on Thu 13 Feb 2014 23:18:04 +0100, wrote:
> IIRC, Windows warnings are function pointer casts that should be OK.

IIRC too.

Samuel
Re: [hwloc-users] Using hwloc to map GPU layout on system
Brock Palen, on Thu 06 Feb 2014 21:31:42 +0100, wrote:
> GPU L#3 "nvml2"
> GPU L#5 "nvml3"
> GPU L#7 "nvml0"
> GPU L#9 "nvml1"
>
> Is the L# always going to be in the oder I would expect? Because then I
> already have my map then.

No, L# is just following the machine topology. CUDA numbering does not necessarily follow that (e.g. if a slow GPU is somewhere in the middle).

Samuel
Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64
Brice Goglin, on Tue 28 Jan 2014 12:46:24 +0100, wrote:
> 42: xchg %ebx,%rbx
>
> I guess having both ebx and rbx on these lines isn't OK. On Linux, I get
> rsi instead of ebx, no problem.
>
> Samuel, any idea?

Mmm, IIRC, "unsigned long" on windows may not be 64bit but 32bit? Perhaps we could rather include stdint.h and use uintptr_t or uint64_t there (so that any other unix with a 32bit unsigned long is fixed too), and in the case of windows, include windows.h and use DWORDLONG.

Samuel
Re: [hwloc-users] How to build hwloc static to link into a shared lib on Linux
Erik Schnetter, on Sat 18 Jan 2014 07:29:37 +0100, wrote:
> You probably need to set CFLAGS in addition to CXXFLAGS.

Yes, CXXFLAGS is for C++ files. hwloc doesn't have any :) It's CFLAGS which is for C.

That being said, I wonder what gain you will get: all the probing functions will still get pulled in, and for Linux that will be most of hwloc. Be sure to explicitly disable PCI and such at configure time to at least avoid including these probing functions.

Samuel
Re: [hwloc-users] [windows] hwloc_get_proc_cpubind issue, even with current process handle as 2nd parameter
Samuel Thibault, on Mon 06 Jan 2014 18:07:59 +0100, wrote:
> Eloi Gaudry, on Mon 06 Jan 2014 17:16:53 +0100, wrote:
> > the PID of the process. I was assuming that casting this member to a HANDLE
> > object would allow me to use hwloc_get_proc_cpubind,

Let me fix my typos:

No, PIDs are mere numbers, they have nothing to do with HANDLEs. More interestingly, PID values are valid across the whole system, while HANDLE values are only valid within a given process. You have to use OpenProcess() to create a HANDLE from a PID value.

Samuel
Re: [hwloc-users] [windows] hwloc_get_proc_cpubind issue, even with current process handle as 2nd parameter
Eloi Gaudry, on Mon 06 Jan 2014 17:16:53 +0100, wrote:
> the PID of the process. I was assuming that casting this member to a HANDLE
> object would allow me to use hwloc_get_proc_cpubind,

No, PIDs are mere numbers, they have nothing to do with HANDLES. More interestingly, PID values are valid along the whole systems, while HANDLE values are only valid with a given process. You have to use OpenProcess, to create a HANDLE from a PID.

Samuel
Re: [hwloc-users] [windows] hwloc_get_proc_cpubind issue, even with current process handle as 2nd parameter
Eloi Gaudry, on Mon 06 Jan 2014 16:37:55 +0100, wrote:
> AFAIK, the issue seems related to the GetAffinityMask call inside
> hwloc_win_get_proc_cpubind : it always returns 0.

So it's really the win32 layer which does not like seeing GetAffinityMask called. Just to make sure: you are using at least Windows XP, right?

Samuel
Re: [hwloc-users] [windows] hwloc_get_proc_cpubind issue, even with current process handle as 2nd parameter
Eloi Gaudry, on Mon 06 Jan 2014 16:04:27 +0100, wrote:
> On Windows, hwloc_get_cpubind and hwloc_set_cpubind works correctly but I
> cannot use hwloc_get_proc_cpubind or hwloc_set_proc_cpubind using the current
> process handle as 2nd parameter (no matter what the last one is).
>
> Any clue on this ?

Not really, it should just work. Do GetProcessAffinityMask() or SetProcessAffinityMask() work if you call them the same way?

Do you perhaps have more than 64 processors? We still haven't found access to such a system in order to implement the use of Get/SetProcessGroupAffinity.

Samuel
Re: [hwloc-users] Hwloc and Electric Fence (libefence).
cesse...@free.fr, on Tue 29 Jan 2013 19:12:32 +0100, wrote:
> It was a very stupid question indeed !

Well, no it's not stupid :)

Zero-allocs can indeed be frowned upon. Some algorithms like doing them, but some others actually bug out when allocating 0 bytes.

Samuel
Re: [hwloc-users] hwloc tutorial material
Kenneth A. Lloyd, on Mon 21 Jan 2013 22:46:37 +0100, wrote:
> Thanks for making this tutorial available. Using hwloc 1.7, how far down
> into, say, NVIDIA cards can the architecture be reflected? Global memory
> size? SMX cores? None of the above?

None of the above for now. Both are available in the cuda svn branch, however.

Samuel
[hwloc-users] AIX test? Re: Hardware locality (hwloc) v1.6rc2 released
Hello,

Brice Goglin, on Tue 20 Nov 2012 15:26:37 +0100, wrote:
> I just released 1.6rc2 (mirrors will update soon).

It seems fine in my tests. Can somebody test on AIX?

Samuel
Re: [hwloc-users] Windows api threading functions equivalent to hwloc?
Andrew Somorjai, on Tue 20 Nov 2012 01:39:47 +0100, wrote:
> "CreateThread() and WaitForMultipleObjects() are not in hwloc since they have
> nothing to do with topologies."
>
> I thought hwloc was also for threading?

It can bind your threads, yes, but the way to create the thread is yours, it can be CreateThread, or OpenMP, etc...

> "DWORD_PTR m_id = 0;
> DWORD_PTR m_mask = 1 << i;
>
> m_threads[i] = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)threadMain, (LPVOID)i, NULL, &m_id);
> SetThreadAffinityMask(m_threads[i], m_mask);
>
> This will likely be something such as:
>
> hwloc_bitmap_t bitmap = hwloc_bitmap_alloc();
> hwloc_bitmap_only(bitmap, i);
> hwloc_set_thread_cpubind(topology, m_threads[i], bitmap, 0);
> hwloc_bitmap_free(bitmap);"
>
> How would I pass a function like threadMain in the above CreateThread
> function into the thread itself. Someone told me to use this library for this
> purpose so I wasn't sure what it was made for.

You should indeed use hwloc to replace the SetThreadAffinityMask, but keep your CreateThread.

> How would I create an array m_threads and pass it
> into hwloc_set_thread_cpubind. I would still need this part then correct?
>
> m_threads[i] = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)threadMain, (LPVOID)i, NULL, &m_id);

Yes, something like:

m_threads[i] = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)threadMain, (LPVOID)i, NULL, &m_id);
hwloc_bitmap_t bitmap = hwloc_bitmap_alloc();
hwloc_bitmap_only(bitmap, i);
hwloc_set_thread_cpubind(topology, m_threads[i], bitmap, 0);
hwloc_bitmap_free(bitmap);

> I would like to be independent of windows.h by the way, not using windows
> api calls is the motivation for all of this.

Ah, then you may want to also use the pthread-win32 package, which is meant to replace CreateThread, and use pthread_getw32threadhandle_np in the windows case to convert from pthread-win32's pthread_t into a HANDLE for hwloc.

Samuel
Re: [hwloc-users] Windows api threading functions equivalent to hwloc?
Brice Goglin, on Mon 19 Nov 2012 21:09:33 +0100, wrote:
> hwloc_bitmap_t bitmap = hwloc_bitmap_alloc();
> hwloc_bitmap_only(bitmap, i);
> hwloc_set_thread_cpubind(topology, m_threads[i], bitmap, 0);
> hwloc_bitmap_free(bitmap);

Or perhaps

hwloc_set_thread_cpubind(topology, m_threads[i], hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, i)->cpuset, 0);

if you want to get the core numbers in logical order rather than physical order (or use HWLOC_OBJ_PU if it's the hardware threads that you want to get).

> To get the number of processors with hwloc, use something like:
> hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
> or
> hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
> Then it depends if you want real cores (the former) or hardware threads (the
> latter).

Samuel
Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.6rc1 released
Hello,

Brice Goglin, on Tue 13 Nov 2012 13:45:28 +0100, wrote:
> The Hardware Locality (hwloc) team is pleased to announce the first
> release candidate for v1.6:

I'm getting an odd failure in hwloc_pci_backend:

lt-hwloc_pci_backend: hwloc-1.6rc1/tests/hwloc_pci_backend.c:68: main: Assertion `!nb' failed.

It seems that even with flags == 0, pci stuff gets loaded from the xml output. It happens on only one of our machines, hannibal. I wonder what is special there.

Samuel
Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups
Brice Goglin, on Mon 05 Nov 2012 23:23:42 +0100, wrote:
> top can also sort by the last used CPU. Type f to enter the config menu,
> hilight the "last cpu" line, and hit 's' to make it the sort column.

With older versions of top, type F, then j, then space.

Samuel
Re: [hwloc-users] How do I access CPUModel info string
Olivier Cessenat, on Sat 27 Oct 2012 19:10:55 +0200, wrote:
> Just in case, I also provide the output of sysctl hw:

Thanks. There is indeed no package information (hw.packages), that's why hwloc does not include any socket object.

Brice wrote:
> One way to solve this problem (which may also occur on old Linux
> distribs) would be to store the CPU model in the machine object. But
> we'll have to make sure all processors in the machine are indeed of the
> same model. On MacOSX, it looks like sysctl reports a single socket
> description anyway, so no problem.

So we have to resort to that, now committed.

Samuel
Re: [hwloc-users] How do I access CPUModel info string
Robin Scher, on Thu 25 Oct 2012 23:57:38 +0200, wrote:
> ; eax = 0x80000002 --> eax, ebx, ecx, edx: get processor name string (part 1)
> mov eax,0x80000002
> cpuid

Oh, this is indeed *exactly* the model name string. I only knew about the vendor_id string.

> I don't know if that would work on Win64, though.

It should: cpuid is not a privileged instruction.

> Do you think those could be added to hwloc?

Yes: we already use cpuid for the x86 backend. That will only work on x86 hosts of course.

Brice, that actually brings another piece to the plugin engine: on Windows ideally we should still get the topology from the OS, but take the cpu string from the x86 backend...

Samuel
Re: [hwloc-users] How do I access CPUModel info string
Robin Scher, on Thu 25 Oct 2012 23:39:46 +0200, wrote:
> Is there a way to get this string (e.g. "Intel(R) Core(TM) i7 CPU M 620 @
> 2.67GHz") consistently on Windows, Linux, OS-X and Solaris?

Currently, no. hwloc itself does not have a table of such strings, and each OS has its own table.

Samuel
Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware
Sebastian Kuzminsky, on Sat 06 Oct 2012 00:55:57 +0200, wrote:
> binding to CPU0
> could not bind to CPU0: Resource deadlock avoided

Mmm, from what I read in the freebsd kernel:

/*
 * Create a set in the space provided in 'set' with the provided parameters.
 * The set is returned with a single ref. May return EDEADLK if the set
 * will have no valid cpu based on restrictions from the parent.
 */
_cpuset_create(struct cpuset *set, struct cpuset *parent, const cpuset_t *mask, cpusetid_t id)
{
	if (!CPU_OVERLAP(&parent->cs_mask, mask))
		return (EDEADLK);

Could it be that due to administration rules lstopo is not allowed to bind on cpu 0-9? In that case the x86 backend can not detect anything there.

Samuel
Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware
Sebastian Kuzminsky, le Wed 03 Oct 2012 17:24:55 +0200, a écrit : > So that's an improvement over the svn trunk > yesterday, but it's not all the way fixed yet! Ok. Apparently hwloc can't bind itself to procs 0-9 for some reason. I have added debug to the trunk, could you try it again (no need for the config.log any more, but I still need --enable-debug). Samuel
Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware
Hello, Sebastian Kuzminsky, le Wed 03 Oct 2012 01:08:46 +0200, a écrit : > Here you go (the list server rejected it because it was too big, but this > compressed version should make it through). Thanks! There were two bugs which resulted in cpuid not being properly compiled. I have fixed them in the trunk, could you try again? Samuel
Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware
Hello, Sebastian Kuzminsky, le Tue 02 Oct 2012 23:47:05 +0200, a écrit : > I've attached the output from both platforms. On freebsd, could you pass --enable-debug to ./configure and rerun lstopo, to get more debugging information? Samuel
Re: [hwloc-users] Solaris and hwloc
Jeff Squyres, le Thu 13 Sep 2012 17:10:00 +0200, a écrit : > After a little more thought, I'm also thinking that having a "it's ok if > binding fails" CLI flag is a bad idea. If the user really wants something to > run without binding, then you can just do that in the shell: > > - > hwloc-bind ...whatever... my_executable > if test "$?" != "0"; then > # run without binding > my_executable > fi Well, I find this a bit tedious. Other than that, I agree. Samuel
Re: [hwloc-users] Solaris and hwloc
Jeff Squyres, le Thu 13 Sep 2012 00:46:33 +0200, a écrit : > On Sep 12, 2012, at 6:44 PM, Samuel Thibault wrote: > > >> Anyone have an opinion? I'm 60/40 in favor of not letting it run, under > >> the rationale that the user asked for something that we can't deliver, so > >> we shouldn't continue. > > > > Well, it depends on the situation. The binding might only be an > > optimization, and failing just because of that is not nice. When it's an > > administration decision, it's different, but then one would use cgroups > > & such instead. > > > How about adding a flag to make it fail if it doesn't bind? Now I understand why Brice mentioned a --strict flag :) Samuel
Re: [hwloc-users] Solaris and hwloc
Jeff Squyres, le Thu 13 Sep 2012 00:45:56 +0200, a écrit : > On Sep 12, 2012, at 6:42 PM, Samuel Thibault wrote: > > No, we have it, but not all solaris systems have it. > > > Ah, I see. So if Siegmar had done "hwloc-bind socket:0 ..." -- assuming his > system has lgrp support -- that should work. right? Rather node:0, but yes. Samuel
Re: [hwloc-users] Solaris and hwloc
I forgot to answer this: Jeff Squyres, le Wed 12 Sep 2012 16:16:57 +0200, a écrit : > Sidenote: if hwloc-bind fails to bind, should we still launch the child > process? Well, it's up to you to decide :) Samuel
Re: [hwloc-users] Solaris and hwloc
Jeff Squyres, le Wed 12 Sep 2012 16:16:57 +0200, a écrit : > He seems to get an hwloc error any time he tries to bind to more than 1 PU. > Is that expected on Solaris? Without lgrp support, unfortunately yes: the processor_bind Solaris interface only permits binding to one processor. With lgrp support, one should be able to bind oneself to sets of whole NUMA nodes. I don't know of any interface which would provide a granularity between one processor and one NUMA node. Samuel
Re: [hwloc-users] lstopo and GPus
Brice Goglin, le Tue 28 Aug 2012 14:43:53 +0200, a écrit : > > $ lstopo > > Socket #0 > > Socket #1 > > PCI... > > (connected to socket #1) > > > > vs > > > > $ lstopo > > Socket #0 > > Socket #1 > > PCI... > > (connected to both sockets) > > Fortunately, this won't occur in most cases (including Gabriele's > machines) because there's a NUMAnode object above each socket. Oops, I actually meant NUMAnode above > Both the socket and the PCI bus are drawn inside the NUMA box, so > things appear OK in graphics to. Indeed, if the PCI bus was connected to one NUMAnode/socket only, it would be drawn inside, which is not the case. > Gabriele, assuming you have a dual Xeon X56xx Westmere machine, there > are plenty of such platforms where the GPU is indeed connected to both > sockets. Or it could be a buggy BIOS. Agreed. Samuel
Re: [hwloc-users] lstopo and GPus
Gabriele Fatigati, le Tue 28 Aug 2012 14:19:44 +0200, a écrit : > I'm using hwloc 1.5. I would to see how GPUs are connected with the processor > socket using lstopo command. About the connection with the socket, there is indeed no real graphical difference between "connected to socket #1" and "connected to all sockets". You can use the text output for that: $ lstopo Socket #0 Socket #1 PCI... (connected to socket #1) vs $ lstopo Socket #0 Socket #1 PCI... (connected to both sockets) Samuel
Re: [hwloc-users] possible concurrency issue with reading /proc data on Linux
Vlad, le Sat 21 Apr 2012 23:37:11 +0200, a écrit : > 433 /* take the number of links as a good estimate for the number of tids */ > 434 if (fstat(dirfd(taskdir), ) == 0) > 435max_tids = sb.st_nlink; > > "taskdir" here is /proc//task, correct? In which case the threads will be > doing readdir() on the same DIR stream... No, each thread opens its own DIR in hwloc_linux_foreach_proc_tid. Samuel
Re: [hwloc-users] Problems on SMP with 48 cores
Samuel Thibault, le Thu 15 Mar 2012 07:42:40 +0100, a écrit : > Brice Goglin, le Wed 14 Mar 2012 22:32:07 +0100, a écrit : > > We debugged this in private emails with Hartmut. His 48-core platform is > > now detected properly. Everything got fixed with a patch > > functionnally-identical to what Samuel sent earlier. > > Is the 32bit-on-64bit build fixed too? It'd also be good to test 32-on-32, where there would be two groups, because binding on groups has not been implemented at all due to lacking access to a machine with several groups. Samuel
Re: [hwloc-users] Problems on SMP with 48 cores
Hartmut Kaiser, le Wed 14 Mar 2012 08:52:59 -0500, a écrit : > > > Le 14/03/2012 09:39, Brice Goglin a écrit : > > > Le 13/03/2012 19:08, Hartmut Kaiser a écrit : > > >>> - hwloc_bitmap_from_ith_ulong(obj->cpuset, > > GroupMask[i].Group, > > >>> GroupMask[i].Mask); > > >>> + hwloc_bitmap_from_ith_ulong(obj->cpuset, > > 2*GroupMask[i].Group, > > >>> GroupMask[i].Mask & 0xfff); > > > There's a missing 'f' above. > > > Here's another almost untested patch, with additional debug printf. > > > Please remove the previous one and apply this one instead. > > > > Grrr, I failed to fix the missing f. New patch attached. > > Your patch relies on two symbols which I'm not able to resolve: > hwloc_debug_bitmap_2args and hwloc_debug_2args. If I comment those the > picture has changed (see attached), but still no overall luck Here is a fixed patch concerning the debugging statements. Samuel Index: src/topology-windows.c === --- src/topology-windows.c (révision 4385) +++ src/topology-windows.c (copie de travail) @@ -532,7 +532,9 @@ obj = hwloc_alloc_setup_object(type, id); obj->cpuset = hwloc_bitmap_alloc(); hwloc_debug("%s#%u mask %lx\n", hwloc_obj_type_string(type), id, procInfo[i].ProcessorMask); - hwloc_bitmap_from_ulong(obj->cpuset, procInfo[i].ProcessorMask); + hwloc_bitmap_from_ulong(obj->cpuset, procInfo[i].ProcessorMask & 0x); + hwloc_bitmap_from_ith_ulong(obj->cpuset, i, procInfo[i].ProcessorMask >> 32); + hwloc_debug_2args_bitmap("%s#%u bitmap %s\n", hwloc_obj_type_string(type), id, obj->cpuset); switch (type) { case HWLOC_OBJ_NODE: @@ -634,7 +636,9 @@ mask = procInfo->Group.GroupInfo[id].ActiveProcessorMask; hwloc_debug("group %u %d cpus mask %lx\n", id, procInfo->Group.GroupInfo[id].ActiveProcessorCount, mask); - hwloc_bitmap_from_ith_ulong(obj->cpuset, id, mask); + hwloc_bitmap_from_ith_ulong(obj->cpuset, 2*id, mask & 0x); + hwloc_bitmap_from_ith_ulong(obj->cpuset, 2*id+1, mask >> 32); + hwloc_debug_2args_bitmap("group %u %d bitmap %s\n", id, 
procInfo->Group.GroupInfo[id].ActiveProcessorCount, obj->cpuset); hwloc_insert_object_by_cpuset(topology, obj); } continue; @@ -648,8 +652,10 @@ obj->cpuset = hwloc_bitmap_alloc(); for (i = 0; i < num; i++) { hwloc_debug("%s#%u %d: mask %d:%lx\n", hwloc_obj_type_string(type), id, i, GroupMask[i].Group, GroupMask[i].Mask); - hwloc_bitmap_from_ith_ulong(obj->cpuset, GroupMask[i].Group, GroupMask[i].Mask); + hwloc_bitmap_from_ith_ulong(obj->cpuset, 2*GroupMask[i].Group, GroupMask[i].Mask & 0xfff); + hwloc_bitmap_from_ith_ulong(obj->cpuset, 2*GroupMask[i].Group+1, GroupMask[i].Mask >> 32); } + hwloc_debug("%s#%u bitmap %lx\n", hwloc_obj_type_string(type), id, obj->cpuset); switch (type) { case HWLOC_OBJ_NODE:
Re: [hwloc-users] V1.4.1: Windows x64 import library broken
Hartmut Kaiser, le Mon 12 Mar 2012 23:05:44 +0100, a écrit : > The import library libhwloc.lib distributed with the Windows x64 binaries is > broken in V1.4.1 (even if it was ok in V1.4). The library internally refers > to libhwloc-4.dll (instead of libhwloc-5.dll). While it is not a problem to > generate a correct import library from the supplied definition file, it > would be good to be able to use the supplied binaries as is. I've uploaded 1.4.1.1 windows builds, whose only change is that. Thanks for the report, Samuel
Re: [hwloc-users] Problems on SMP with 48 cores
Brice Goglin, le Tue 13 Mar 2012 18:55:29 +0100, a écrit : > Le 13/03/2012 17:04, Hartmut Kaiser a écrit : > >>> But the problems I was seeing were not MSVC specific. It's a > >>> proliferation of arcane (non-POSIX) function use (like strcasecmp, > >>> etc.) missing use of HAVE_UNISTD_H, HAVE_STRINGS_H to wrap > >>> non-standard headers, unsafe mixing of > >>> int32<->int64 data types, reliance on int (and other types) having a > >>> certain bit-size, totally unsafe shift operations, wide use of > >>> (non-C-standard) gcc extensions, etc. Should I go on? > > More investigation shows that the code currently assumes group (and > > processor) masks to be 32 bit, which is not true on 64 bit systems. For > > instance this (topology-windows.c: line 643): > > > > hwloc_bitmap_from_ith_ulong(obj->cpuset, GroupMask[i].Group, > > GroupMask[i].Mask); > > Try applying something like the patch below. Totally untested obviously, > but we'll see if that starts improving lstopo. That won't work on 32bit systems, where the mask is 32bit only and thus >> 32 is undefined. He will probably be able to provide me with an account on such windows system, let's just wait for that. Samuel
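The shift problem mentioned above can be sidestepped by widening the mask before shifting. The following is only a sketch under stated assumptions, not the patch that was eventually applied: split_mask is a hypothetical helper, and uintptr_t stands in for the pointer-sized mask type that Windows uses. Shifting a 32-bit value by 32 is undefined in C, whereas converting to uint64_t first makes the shift well defined on both 32-bit and 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a pointer-sized processor mask into two
 * 32-bit words. The mask is widened to 64 bits before shifting, so the
 * ">> 32" is well defined even when the native mask is only 32 bits wide
 * (in which case the high word is simply 0). */
void split_mask(uintptr_t native_mask, uint32_t *low, uint32_t *high)
{
    assert(low != NULL && high != NULL);
    uint64_t wide = (uint64_t)native_mask;
    *low = (uint32_t)(wide & 0xffffffffu);
    *high = (uint32_t)(wide >> 32);
}
```

With such a helper, the two words could be handed to hwloc_bitmap_from_ith_ulong() at indexes 2*group and 2*group+1, as the patches in this thread do. That indexing presumably relies on unsigned long being 32 bits wide on Windows, which is an assumption worth checking on other platforms.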
Re: [hwloc-users] Problems on SMP with 48 cores
Samuel Thibault, le Tue 13 Mar 2012 13:33:05 +0100, a écrit : > > I tried to recompile the library using MSVC which would allow me to debug > > the issue, but after several hours of tweaking I gave up. As it turns out > > the code base is everything but portable, which is really unfortunate for a > > library which is supposed to be cross platform. > > I'm afraid to have to answer that MSVC does everything but respecting > standards, even when they are more that 10 years old. The hwloc code > compiles as such on a variety of unix compilers, and we didn't need many > tweaks for that. The mingw toolchain saves a lot of such concerns, so I > can only advise to use it. Just to make it clear: patches for making hwloc compile with MSVC are welcome and will be happily applied, I'm just very reluctant to spend time on writing them while the mingw build just works. Samuel
Re: [hwloc-users] creation and destruction of bound threads
Albert Solernou, le Mon 30 Jan 2012 12:37:31 +0100, a écrit : > I am working on a threaded code, and want to bind threads to cores. However, > the process creates and destroys the threads, so here is the question: > What happens if I enter on a threaded part of the code, bind "thread X" to > a core, return to a serial part and then thread again? Can I expect to find > thread X bound to the core I bound it previously? It depends on what actually creates the threads. For instance, most implementations of OpenMP reuse the same kernel threads, without actually destroying them. But nothing in the standard asserts that, so you'd probably prefer to re-bind just to be sure. Samuel
Re: [hwloc-users] Bogus files in 64bit Windows binary distribution (1.4rc1)
Hartmut Kaiser, le Fri 20 Jan 2012 00:43:32 +0100, a écrit : > > Hartmut Kaiser, le Thu 19 Jan 2012 22:48:50 +0100, a écrit : > > > We are using hwloc with VS2010 and were happy to realize that after > > > the (for > > > us) totally broken Windows binary distribution in V1.3 > > > > Broken? How so? It worked for me. > > Try it, the autoconf/config.h has settings not compatible with VC++, for > instance: > > /* Maybe before gcc 2.95 too */ > #if !defined(HWLOC_HAVE_ATTRIBUTE_UNUSED) && defined(__GNUC__) > # define HWLOC_HAVE_ATTRIBUTE_UNUSED 1 > #else > # define HWLOC_HAVE_ATTRIBUTE_UNUSED 1 > #endif > #if HWLOC_HAVE_ATTRIBUTE_UNUSED > # define __hwloc_attribute_unused __attribute__((__unused__)) > #else > # define __hwloc_attribute_unused > #endif > > etc. This essentially always defines __hwloc_attribute_unused to expand to > the __attribute__() (from hwloc-win64-build-1.3.1.zip). Ok, so the problem is not actually in the binaries, but the headers :) This was also reported in another case and already fixed for the next 1.3 release. Samuel
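To illustrate the header problem quoted above: both branches of the first #if define the feature macro to 1, so the GCC-only attribute is used with every compiler. A corrected guard could look like the sketch below. The EXAMPLE_/example_ names and count_visit() are invented for this illustration, and this is not claimed to be the exact fix that went into hwloc.

```c
#include <assert.h>
#include <stddef.h>

/* Only claim support for __attribute__((__unused__)) on GNU-compatible
 * compilers; other compilers (such as MSVC) get an empty expansion. */
#if defined(__GNUC__)
# define EXAMPLE_HAVE_ATTRIBUTE_UNUSED 1
#else
# define EXAMPLE_HAVE_ATTRIBUTE_UNUSED 0
#endif

#if EXAMPLE_HAVE_ATTRIBUTE_UNUSED
# define example_attribute_unused __attribute__((__unused__))
#else
# define example_attribute_unused
#endif

/* Hypothetical callback with a parameter it does not need: the macro
 * silences the unused-parameter warning where the attribute exists and
 * expands to nothing elsewhere. */
int count_visit(int *counter, void *cookie example_attribute_unused)
{
    assert(counter != NULL);
    return ++*counter;
}
```

The key difference from the quoted header is the 0 in the #else branch, which lets the second #if actually select the empty definition.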
Re: [hwloc-users] hwloc_get_last_cpu_location and hwloc_get_cpubind
Marc-André Hermanns, le Tue 17 Jan 2012 11:47:43 +0100, a écrit : > It seems now that it has the whole system in the cpuset. How can I > really infer the PU this process was run on? I would have expected the > cpuset to have only 1 element per level to indicate the path from > machine to PU. That is what is expected, yes (though only at the PU level, since only that one is completely included in the cpuset, you would need "intersects" to get the path). and that's what I get on my machine: € ./test This system has 7 levels Cpuset: 0x0040 Number of objects at depth 0: 0 Number of objects at depth 1: 0 Number of objects at depth 2: 0 Number of objects at depth 3: 0 Number of objects at depth 4: 0 Number of objects at depth 5: 0 Number of objects at depth 6: 1 > Evidently my understanding of this functionality is still > not correct. No, it's completely correct, it just seems there's an odd thing somewhere. Could you run through strace so we can check what the kernel returns? Samuel
Re: [hwloc-users] Compiling hwloc into a static library on Windows and Linux
Andrew Helwer, le Fri 13 Jan 2012 18:16:16 +0100, a écrit : > libhwloc.lib(traversal.o) : error LNK2019: unresolved external symbol > __ms_vsnpr > intf referenced in function snprintf Do you also link msvcrt in? mingw needs it for almost everything. Samuel
Re: [hwloc-users] Compiling hwloc into a static library on Windows and Linux
Andrew Helwer, le Fri 13 Jan 2012 01:35:27 +0100, a écrit : > It fails with the following: > > *** Warning: linker path does not have real file for library -lgdi32. Ah, that's a dark bug in libtool. > gcc -I/cygdrive/c/hwloc-asdf/include -I/cygdrive/c/hwloc-asdf/include > -I/cygdriv > e/c/hwloc-asdf/includedolib.c -o dolib > ./dolib "/cygdrive/c/Program Files (x86)/Microsoft Visual Studio > 10.0/VC/bin/lib > " X86 .libs/libhwloc.def libhwloc- .libs/libhwloc.lib > The system cannot find the path specified. > "/cygdrive/c/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin/lib" > /machi > ne:X86 /def:.libs/libhwloc.def /name:libhwloc- /out:.libs/libhwloc.lib failed > Makefile:758: recipe for target `.libs/libhwloc.lib' failed Well, AIUI, you don't actually need the shared version, so you can as well pass --disable-shared to ./configure to just get rid of this bug. That said, isn't the just-uploaded-to-hwloc-website win64 build enough for you? It contains the libhwloc.a static build in lib/ Samuel
Re: [hwloc-users] Compiling hwloc into a static library on Windows and Linux
Andrew Helwer, le Tue 10 Jan 2012 02:08:46 +0100, a écrit : > the Visual Studio compiler runs into a lot of issues. What kind of issues for instance? Samuel
Re: [hwloc-users] Compiling hwloc into a static library on Windows and Linux
Hello, Andrew Helwer, le Thu 12 Jan 2012 02:11:58 +0100, a écrit : > If I run the command manually, it can't find the libhwloc.def file. Which is > reasonable, as it does not appear to exist in the .lib directory. Am I > missing something? In principle the .def file is generated by the linker. Could you run make V=1 to get the command lines, and check that HWLOC_HAVE_WINDOWS is 1 in ./include/hwloc/autogen/config.h ? At worst, I believe you can just copy the libhwloc.def contained in the 32bit build of the exact same version of hwloc, it should be compatible. Thanks, Samuel
Re: [hwloc-users] Compiling hwloc into a static library on Windows and Linux
Andrew Helwer, le Tue 10 Jan 2012 02:08:46 +0100, a écrit : > First of all, is Windows 64-bit supported? There is only a 32-bit release on > the downloads page. I have never tried to build a 64bit binary, but there is little reason it should fail. > However, when I specify the --enable-embedded-mode flag in configure in Linux, > no libraries are built at all - the specified prefix directory contains only > empty directories. But the library is built, it's just not installed because projects often prefer to link the library in, or something similar. If you want to install libhwloc.a, simply fetch it from src/.libs/ > I've managed to compile a working static library on Linux using the headers > generated by configure, I'm not sure I understand. Doesn't passing --enable-static to ./configure already generate a static library? > but am having a lot of difficulty doing the same on Windows - the > Visual Studio compiler runs into a lot of issues. Is there a simple > way to do this? I have to say I know basically nothing about what Visual Studio expects from a static library. Samuel
Re: [hwloc-users] GPU/NIC/CPU locality
Stefan Eilemann, le Tue 29 Nov 2011 11:40:18 +0100, a écrit : > Maybe I'm missing something, but I don't see any PCI-related output with > lstopo. You are probably missing the libpci-devel package. Samuel
Re: [hwloc-users] Process and thread binding
Gabriele Fatigati, le Mon 12 Sep 2011 15:50:45 +0200, a écrit : > thanks very much for your explanations. But I don't understand why a process > inherits core bound of his threads On Linux, there is no such thing as "process binding", only "thread binding". hwloc emulates the former by using the latter. Samuel
Re: [hwloc-users] Re : Re : hwloc topology check initializing
Gabriele Fatigati, le Sat 03 Sep 2011 16:09:11 +0200, a écrit : > What about hwloc_topology check()? > > What types of check does? Mostly that the hwloc library itself didn't do anything wrong. Samuel
Re: [hwloc-users] Numa availability
Brice Goglin, le Sun 28 Aug 2011 12:36:31 +0200, a écrit : > > Is there a hwloc routine to check this? > > get_nbobjs_by_type(topology, HWLOC_OBJ_NODE) tells how many NUMA node > objects exist. > If you get >1, the machine is NUMA. > If the non-NUMA case, I think you can get 0 or 1 depending on whether > the OS is NUMA-aware or not (not sure we should remove this possible > difference). The useful difference is that 0 means we don't know, while 1 means we do know there is only one node. Samuel
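The three cases described above (0, 1, more than 1) can be written down as a small decision function. This is a sketch only: classify_numa and the enum are hypothetical names, and the count is assumed to come from a call such as hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE) as in the quoted message.

```c
#include <assert.h>

enum numa_status { NUMA_UNKNOWN, NUMA_SINGLE_NODE, NUMA_MULTIPLE_NODES };

/* Hypothetical helper: interpret the number of NUMA node objects.
 * 0 means the information is not available, 1 means the machine is
 * known to have a single node, and more than 1 means it is NUMA.
 * Negative values are treated as "unknown" as well. */
enum numa_status classify_numa(int node_count)
{
    if (node_count <= 0)
        return NUMA_UNKNOWN;
    if (node_count == 1)
        return NUMA_SINGLE_NODE;
    return NUMA_MULTIPLE_NODES;
}

/* Human-readable label for each status, for logging. */
const char *numa_status_name(enum numa_status status)
{
    switch (status) {
    case NUMA_UNKNOWN:        return "unknown";
    case NUMA_SINGLE_NODE:    return "single node";
    case NUMA_MULTIPLE_NODES: return "multiple nodes";
    }
    assert(!"unexpected numa_status value");
    return "invalid";
}
```

Keeping "unknown" separate from "single node" preserves exactly the distinction made above between not knowing and knowing that there is only one node.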
Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.2.1rc3 released
Brice Goglin, le Tue 16 Aug 2011 19:49:10 +0200, a écrit : > hwloc 1.2.1 *rc3* is out (web mirrors will update shortly). It fixes > hwloc_get_last_cpu_location() for Linux threads. Apart from that, > nothing important. Let's hope this one will become the final 1.2.1 > within a couple days. Since the Bordeaux university machines are mostly down, I won't be able to perform all the usual tests before Thursday afternoon. Samuel
Re: [hwloc-users] Magny Cours L3 cache issue
Wheeler, Kyle Bruce, le Tue 16 Aug 2011 16:52:54 +0200, a écrit : > hwloc-gather-topology doesn't seem to work on my compute nodes... not sure > why. It doesn't report any failures, but it doesn't create the tarball either > (just spits out more lstopo output). Maybe try to replace /bin/sh with /bin/bash in the script? Samuel
Re: [hwloc-users] Get CPU associated to a thread
Hello, PULVERAIL Sébastien, le Fri 12 Aug 2011 13:59:46 +0200, a écrit : > Does a such function exist ? See hwloc_get_last_cpu_location() Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Thu 11 Aug 2011 18:26:28 +0200, a écrit : > Gabriele Fatigati, le Thu 11 Aug 2011 18:05:25 +0200, a écrit : > > char* bitmap_string=(char*)malloc(256); > > > > hwloc_bitmap_t set = hwloc_bitmap_alloc(); > > > > hwloc_linux_get_tid_cpubind(, tid, set); > > with gettid() works well. Well in that case you can use the more portable hwloc_get_cpubind(topology, set, HWLOC_CPUBIND_THREAD); which will also work on non-Linux. Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Thu 11 Aug 2011 18:05:25 +0200, a écrit : > char* bitmap_string=(char*)malloc(256); > > hwloc_bitmap_t set = hwloc_bitmap_alloc(); > > hwloc_linux_get_tid_cpubind(, tid, set); Where does "tid" come from? hwloc_linux_get_tid_cpubind() only takes Linux tids (as in gettid()), not OpenMP thread IDs. Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Thu 11 Aug 2011 10:32:23 +0200, a écrit : > I'm using hwloc-1.3a1r3606. Now hwloc_get_last_cpu_location() works well: > > thread 0 bind: 0x0008 as core number 3 > thread 1 bind: 0x0800 as core number 11 Good. > but hwloc_linux_get_tid_cpubind() has still some problems because after > binding > one thread on just one core it give me: > > thread 0 bind: 0x0008 as core number 3 > thread 1 bind: "0x00ff" as all available cores!! How do you use it exactly? Samuel
Re: [hwloc-users] hwloc get cpubind function
Samuel Thibault, le Wed 10 Aug 2011 16:24:39 +0200, a écrit : > Gabriele Fatigati, le Wed 10 Aug 2011 16:13:27 +0200, a écrit : > > there is something wrong. I'm using two thread, the first one is bound on > > HWLOC_OBJ_PU number 2, the second one on HWLOC_OBJ_PU number 10, > > It seems that hwloc_linux_get_tid_last_cpu_location erroneously assume > that /proc/self/stat points to its own thread state indeed, we need to > fix that. This should now be fixed in the trunk and the v1.2 branch. You can either upgrade from svn, or wait for this night's snapshot. Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Wed 10 Aug 2011 16:13:27 +0200, a écrit : > there is something wrong. I'm using two thread, the first one is bound on > HWLOC_OBJ_PU number 2, the second one on HWLOC_OBJ_PU number 10, Indeed, it seems that hwloc_linux_get_tid_last_cpu_location erroneously assumes that /proc/self/stat points to its own thread state; we need to fix that. Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Wed 10 Aug 2011 15:41:19 +0200, a écrit : > hwloc_cpuset_t set = hwloc_bitmap_alloc(); > > int return_value = hwloc_get_last_cpu_location(topology, set, > HWLOC_CPUBIND_THREAD); > > printf( " bitmap_string: %s \n", bitmap_string[0]); > > give me: > > 0x0800 > > converted in binary: > > 1000 > > So, CPU 0 I suppose, Do you mean linear 0 or physical 0? cpusets are always physical, 0x800 means CPU with physical number 11. Samuel
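Reading a physical CPU number out of a cpuset mask, as done above, just means finding which bit is set. The sketch below shows that arithmetic on a plain unsigned long; lowest_set_bit and singleton_index are hypothetical helpers, and real code would rather query the hwloc bitmap API than decode the printed string.

```c
#include <assert.h>

/* Return the index of the lowest set bit in `mask`, or -1 if no bit is
 * set. Bit 0 is the least significant bit. */
int lowest_set_bit(unsigned long mask)
{
    int index = 0;
    if (mask == 0)
        return -1;
    while ((mask & 1UL) == 0) {
        mask >>= 1;
        index++;
    }
    return index;
}

/* Same, but for masks that must contain exactly one CPU, such as the
 * result of a "last CPU location" query for a single thread. */
int singleton_index(unsigned long mask)
{
    assert(mask != 0 && (mask & (mask - 1)) == 0);
    return lowest_set_bit(mask);
}
```

For the mask 0x800 discussed above this yields 11, and for 0x8 it yields 3, matching the physical numbers given in this thread.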
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Wed 10 Aug 2011 15:29:43 +0200, a écrit : > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_MACHINE, 0); > > int return_value = hwloc_get_last_cpu_location(topology, core->cpuset, > HWLOC_CPUBIND_THREAD); > > and now in "core->cpuset" I get the new cpuset bitmap, where process/threads > runs. Is it right? Err, yes, but why use core->cpuset?? Giving it as parameter to hwloc_get_last_cpu_location will only overwrite its content with the content returned by hwloc_get_last_cpu_location (which is forbidden, see the documentation of the cpuset field). Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Wed 10 Aug 2011 09:35:19 +0200, a écrit : > these lines, doesn't works: > > set = hwloc_bitmap_alloc(); > hwloc_get_cpubind(topology, , 0); > > hwloc_get_cpubind() crash, because I have to pass set, not i suppose. Right, of course. > I think hwloc_get_last_cpu_location() is used coupled with > hwloc_get_cpubind()? Well, they don't _have_ to. They provide different information. It just happens that get_last_cpu_location usually returns an index within what get_cpubind returns ("always", if the binding is strict). Samuel
Re: [hwloc-users] hwloc get cpubind function
Gabriele Fatigati, le Tue 09 Aug 2011 18:14:55 +0200, a écrit : > hwloc_get_cpubind() function, return, according to the manual, "current > process > or thread binding". What does it means? The cpuset to which the current process or thread (according to flags) was last bound to. That is, the converse of set_cpubind(). > It return cpu index where process/ thread runs? No, hwloc_get_last_cpu_location() does that. > If yes, which cpuset I have to use in function arguments? get_cpubind returns a cpuset, you just provide one you have allocated the way you prefer. > Could you give me a little example to use it? It is really just the converse of hwloc_set_cpubind(), so for instance: set = hwloc_bitmap_alloc(); hwloc_get_cpubind(topology, , 0) Samuel
Re: [hwloc-users] Difference between HWLOC_OBJ_CORE and HWLOC_OBJ_PU
Gabriele Fatigati, le Tue 09 Aug 2011 17:04:04 +0200, a écrit : > >There is no difference concerning the cpuset. > > It means they have the same logical index? Since there is exactly one pu per core and they'll be sorted the same, yes, by construction they will have the same logical index. Samuel
Re: [hwloc-users] Thread core affinity
Gabriele Fatigati, le Thu 04 Aug 2011 16:56:22 +0200, a écrit : > L#0 and L#1 are physically near because hwloc consider shared caches map when > build topology? Yes. That's the whole point of sorting objects topologically first, and numbering them afterwards. See the glossary entry for "logical index": “The ordering is based on topology first, and then on OS CPU numbers” I.e. OS CPU numbers are only used when no topology information (shared cache etc.) provides any better sorting. > Because if not, i don't know how hwloc understand the physical > proximity of cores :( Physical proximity of cores does not mean logical proximity. cores can be next one to the other, and still share no cache at all. Forget the expression "physical proximity", it does not provide any interesting information. What matters is logical proximity. And that's *precisely* what logical indexes express. Samuel
Re: [hwloc-users] Thread core affinity
Gabriele Fatigati, le Thu 04 Aug 2011 16:35:36 +0200, a écrit : > so physical OS index 0 and 1 are not true are physically near on the die. They quite often aren't. See the updated glossary of the documentation: "The index that the operating system (OS) uses to identify the object. This may be completely arbitrary, non-unique, non-contiguous, not representative of proximity, and may depend on the BIOS configuration." > Considering that, how I can use cache locality and cache sharing by cores if I > don't know where my threads will physically bound? By using logical indexes, not physical indexes. And almost all hwloc functions use logical indexes, not physical indexes. > If L#0 and L#1 where I bind my threads are physically far, may give me bad > performance. L#0 and L#1 are physically near, that's precisely the whole point of hwloc: it provides you with *logical* indexes which express proximity, instead of the P#0 and P#1 physical/OS indexes, which are quite often simply arbitrary. Samuel
Re: [hwloc-users] Thread core affinity
Gabriele Fatigati, le Thu 04 Aug 2011 15:52:09 +0200, a écrit : > how the topology gave by lstopo is built? In particolar, how the logical index > P# are initialized? P# are not logical indexes, they are physical indexes, as displayed in /proc/cpuinfo & such. The logical indexes, L#, displayed when passing the -l option to lstopo, are numbered simply linearly, after having sorted the PUs according to topology. Samuel
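The "sort by topology, then number linearly" idea described above can be illustrated with a toy model. This is a deliberately simplified sketch, not hwloc's actual algorithm: struct pu, its fields and assign_logical_indexes are invented here, and only socket and core are used as sort keys, with the OS index as the tie-breaker.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy description of a processing unit (hypothetical, not an hwloc type). */
struct pu {
    unsigned os_index;      /* physical index, P# in lstopo */
    unsigned socket;        /* socket the PU belongs to */
    unsigned core;          /* core the PU belongs to */
    unsigned logical_index; /* filled in below, L# in lstopo */
};

/* Order PUs by topology first (socket, then core), and fall back to the
 * OS index only when the topology does not distinguish them. */
static int compare_by_topology(const void *a, const void *b)
{
    const struct pu *x = a;
    const struct pu *y = b;
    if (x->socket != y->socket)
        return x->socket < y->socket ? -1 : 1;
    if (x->core != y->core)
        return x->core < y->core ? -1 : 1;
    if (x->os_index != y->os_index)
        return x->os_index < y->os_index ? -1 : 1;
    return 0;
}

/* Sort the array in place and number the PUs linearly in sorted order. */
void assign_logical_indexes(struct pu *pus, size_t count)
{
    if (count == 0)
        return;
    assert(pus != NULL);
    qsort(pus, count, sizeof(*pus), compare_by_topology);
    for (size_t i = 0; i < count; i++)
        pus[i].logical_index = (unsigned)i;
}
```

With two hyperthreaded cores whose OS indexes are interleaved (P#0 and P#2 on one core, P#1 and P#3 on the other), the PUs sharing a core end up with adjacent logical indexes even though their physical indexes are not adjacent.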
Re: [hwloc-users] Thread core affinity
Hello, Gabriele Fatigati, le Mon 01 Aug 2011 12:32:44 +0200, a écrit : > So, are not physically near. I aspect that with Hyperthreading, and 2 hardware > threads each core, PU P#0 and PU P#1 are on the same core. Since these are P#0 and 1, they may not be indeed (physical indexes). That's the whole problem of the indexes provided by operating systems. Fortunately, > If is it not true, > using in a OMP PARALLEL region with 2 software threads: > > $ pragma omp paralle num_threads(2) > > tid= omp_get_thread_num(); > > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid); > hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset); > hwloc_bitmap_singlify(set); > > hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD); > > > > i would bind thread 0 on PU P#0 and thread 1 on PU P#1, supposing are > physically near. No, because hwloc functions do not use physical, but logical indexes, which it computes according to the topology. Use lstopo --top to check the actual binding being used. Samuel
Re: [hwloc-users] Multiple thread binding
Gabriele Fatigati, le Tue 02 Aug 2011 17:22:31 +0200, a écrit : > and in this way are equivalent? > > #pragma omp parallel num_threads(1) > { > hwloc_obj_t core = hwloc_get_obj_by_type(*topology, HWLOC_OBJ_PU, 0); > hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset); > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD | > HWLOC_CPUBIND_STRICT); > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD | > HWLOC_CPUBIND_NOMEMBIND); > } Since the first call does not have NOMEMBIND, it might bind the memory on some OSes, and since the second call does not have the strict flag, the thread will in the end not be strictly bound. Samuel
Re: [hwloc-users] Multiple thread binding
Gabriele Fatigati, le Tue 02 Aug 2011 17:13:15 +0200, a écrit : > $pragma omp parallel num_thread(1) > { > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD | > HWLOC_CPUBIND_STRICT > | HWLOC_CPUBIND_NOMEMBIND); > } > > is equivalent to? > > $pragma omp parallel num_thread(1) > { > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD); > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_STRICT); > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_NOMEMBIND); > > } As I said, no. The latter will perform the three operations one after the other, piling the effect of each of them, which is different from specifying all the flags at the same time. For instance, in the first case, only the current thread will be bound, while in the second case, the second and third calls will bind the whole process! (since there is no THREAD flag). > You said HWLOC_CPUBIND_STRICT bind process and memory. I should have said "potentially memory too". And it's not the STRICT flag which does this, it's the absence of NOMEMBIND which does this. > Why also the memory? Because some OS do this too. Samuel
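The difference between one call with all flags and three calls with one flag each can be made concrete with a toy model of how a single call interprets its flags. Everything below is hypothetical: the EX_ constants and describe_request() only mimic the behaviour described in this thread (no THREAD flag means the whole process, no NOMEMBIND flag means memory may be bound too) and are not the hwloc definitions.

```c
#include <assert.h>

/* Hypothetical flag values for this illustration only. */
enum {
    EX_BIND_THREAD    = 1 << 1,
    EX_BIND_STRICT    = 1 << 2,
    EX_BIND_NOMEMBIND = 1 << 3,
    EX_BIND_ALL_FLAGS = EX_BIND_THREAD | EX_BIND_STRICT | EX_BIND_NOMEMBIND
};

/* What one binding call would do, given only the flags of that call. */
struct bind_request {
    int whole_process;   /* 1 if the call targets the whole process */
    int strict;          /* 1 if strict binding is requested */
    int may_bind_memory; /* 1 if the OS may bind memory as a side effect */
};

/* Each call is interpreted on its own: flags from earlier calls are not
 * remembered, which is why they must be OR'ed together in a single call. */
struct bind_request describe_request(int flags)
{
    struct bind_request request;
    assert((flags & ~EX_BIND_ALL_FLAGS) == 0);
    request.whole_process = (flags & EX_BIND_THREAD) == 0;
    request.strict = (flags & EX_BIND_STRICT) != 0;
    request.may_bind_memory = (flags & EX_BIND_NOMEMBIND) == 0;
    return request;
}
```

In this model, the combined call describes a strict, thread-only binding that leaves memory alone, whereas a call with only the strict flag describes a process-wide binding that may also bind memory.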
Re: [hwloc-users] Multiple thread binding
Gabriele Fatigati, le Tue 02 Aug 2011 16:23:12 +0200, a écrit : > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD | HWLOC_CPUBIND_STRICT > | HWLOC_CPUBIND_NOMEMBIND); > > is it possible do multiple call to hwloc_set_cpubind passing each flag per > time? > > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD); > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_STRICT); > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_NOMEMBIND); > > or only the last have effect? Err, it will simply do the three operations, i.e. first bind the current thread and memory, then strictly bind the whole process and memory, and eventually bind the process but not memory (but memory will still be bound, since it was by the second call). Samuel
Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.2.1rc1 released
Hello, Hendryk Bockelmann, le Tue 02 Aug 2011 10:54:54 +0200, a écrit : > I will test hwloc-1.2.1rc1r3567.tar.gz in the next days on our POWER6 > cluster running AIX6.1 and report the results to you resp. to the list Maybe rather wait for next nightly snapshot, as I've just fixed a bug with xml test which will probably hit you. Samuel
Re: [hwloc-users] Thread core affinity
Gabriele Fatigati, le Mon 01 Aug 2011 14:48:11 +0200, a écrit : > so, if I inderstand well, PU P# numbers are not the same specified as > HWLOC_OBJ_PU flag? They are, in the os_index (aka physical index) field. Samuel
Re: [hwloc-users] Thread core affinity
Gabriele Fatigati, le Fri 29 Jul 2011 13:34:29 +0200, a écrit : > I forgot to tell you these code block is inside a parallel OpenMP region. This > is the complete code: > > #pragma omp parallel num_threads(6) > { > int tid = omp_get_thread_num(); > > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid); > > and other code block is: > > #pragma omp parallel num_threads(6) > { > int tid = omp_get_thread_num(); > > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid); Ok, so it depends on whether you want to put your OpenMP threads on separate cores (then use the first code, which distributes among cores), or whether you're ok with letting them share a core (then use the second code, which distributes among hardware threads). Maybe try running lstopo --top to see the result. Samuel
Re: [hwloc-users] Thread core affinity
Gabriele Fatigati, le Fri 29 Jul 2011 13:24:17 +0200, a écrit : > yhanks for yout quick reply! > > But i have a litte doubt. in a non SMT machine, Is it better use this: > > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid); > hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset); > hwloc_bitmap_singlify(set); > hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD); > > or: > > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid); > hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset); > hwloc_bitmap_singlify(set); > hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD); > > because work in the same way( i suppose). They'll both work about the same way on SMT too, since in the end each one picks only one hardware thread. Whether you assign your threads to cores or to hardware threads (PUs) then depends on your application: do you want to let its threads share a core or not? Samuel
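To see where the two variants send each thread on an SMT machine, here is a toy model that needs no hwloc at all. It assumes two hardware threads per core and PU logical indexes numbered core by core; core_cpuset(), singlify(), bind_by_core(), bind_by_pu() and core_of() are hypothetical helpers for this sketch, with singlify() standing in for hwloc_bitmap_singlify() by keeping the lowest set bit.

```c
#include <stdint.h>

/* Assumption for this sketch: every core has PUS_PER_CORE hardware threads,
 * and PU logical indexes are numbered core by core (core 0 owns PUs 0 and 1). */
#define PUS_PER_CORE 2

/* Bitmask of the PUs belonging to the core with this logical index. */
static uint64_t core_cpuset(unsigned core)
{
    uint64_t pus_of_one_core = (UINT64_C(1) << PUS_PER_CORE) - 1;
    return pus_of_one_core << (core * PUS_PER_CORE);
}

/* Reduce a set to a single PU; this model keeps the lowest set bit. */
static uint64_t singlify(uint64_t set)
{
    return set & (~set + 1);
}

/* First variant: thread tid goes to core tid, then to one PU of that core. */
static uint64_t bind_by_core(unsigned tid)
{
    return singlify(core_cpuset(tid));
}

/* Second variant: thread tid goes straight to PU tid. */
static uint64_t bind_by_pu(unsigned tid)
{
    return singlify(UINT64_C(1) << tid);
}

/* Logical index of the core owning the single PU in set. */
static unsigned core_of(uint64_t set)
{
    unsigned pu = 0;
    while (pu < 63 && (set >> pu) != 1)
        pu++;
    return pu / PUS_PER_CORE;
}
```

Under these assumptions, threads 0 and 1 land on cores 0 and 1 with the first variant, but both land on core 0 with the second one, which is the "share a core or not" choice described above.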
Re: [hwloc-users] Thread core affinity
Hello, Gabriele Fatigati, le Fri 29 Jul 2011 12:43:47 +0200, a écrit : > I'm so confused. I see couples of cores with the same core id! ( Core#8 for > example) How is it possible? That's because they are on different sockets. These are physical IDs (not logical IDs), and are thus not guaranteed to be unique. > 2) logical Core id and Physical core id maybe differents. If i want to be sure > that id 0 and id 1 are physically near, i have to use core id or PU id? PU ids > are ever physically near? Using core or thread IDs does not matter. What matters is that you take the proper kind of ID. Physical IDs will in general never give you any indication of proximity. What you want is logical IDs, which hwloc numbers so that they reflect proximity. Using adjacent logical IDs (be it for cores or threads) will give you adjacent cores/threads. > 3) Binding a thread on a core, what's the difference between hwloc_set_cpubind > () and hwloc_set_thread_cpubind()? More in depth, my code example works well > with: > > hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD); > > and crash with: > > hwloc_set_thread_cpubind(topology, tid, set, HWLOC_CPUBIND_THREAD); Note that tid is a hwloc_thread_t, i.e. a pthread_t on Unix systems. It is not a (Linux-specific) tid. If what you have is a (Linux-specific) tid, use the Linux-specific function, hwloc_linux_set_tid_cpubind. Samuel
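The two kinds of thread identifier mentioned in the last answer can be obtained as follows. This is a minimal, Linux-only sketch that does not call hwloc; current_pthread_handle() and current_linux_tid() are hypothetical helper names. Note that an OpenMP thread number such as the result of omp_get_thread_num() is neither of these.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The kind of handle that hwloc_set_thread_cpubind() expects on Unix
 * systems: a pthread_t, which is an opaque library-level value. */
static pthread_t current_pthread_handle(void)
{
    return pthread_self();
}

/* The Linux kernel thread ID, the kind of value meant for
 * hwloc_linux_set_tid_cpubind(). Linux-only. */
static pid_t current_linux_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

The two values are not interchangeable: a pthread_t should only be compared with pthread_equal(), while the kernel tid is a plain positive integer.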
Re: [hwloc-users] hwloc 1.2 compilation problems
Carl Smith, le Tue 12 Jul 2011 02:46:27 +0200, a écrit : > > is it perhaps the presence of -L/usr/local/lib which makes the linking > > fail? I've commited something that might help. > > Perhaps. Your latest change does work on this AIX system. Thanks > for persisting. Great! I've backported to the 1.2 branch. Samuel
Re: [hwloc-users] hwloc 1.2 compilation problems
Carl Smith, le Fri 08 Jul 2011 03:51:07 +0200, a écrit : > > Alright, I give up trying to use autoconf high-end macros, here is > > another, low-level try. > > Alas, I think this one comes full circle: it's deciding on ncurses, > then failing the link step. Uh. That's not consistent: "checking curses support using curses.h and -lncurses... yes" means that ./configure was able to compile and link the following with -lncurses: #include #include int main(void) { NULL, 0, 0, 0, 0, 0, 0, 0, 0, 0); } but then it fails at the lstopo-text link step, which does the same?! Is it perhaps the presence of -L/usr/local/lib which makes the linking fail? I've committed something that might help. Samuel
Re: [hwloc-users] hwloc 1.2 compilation problems
Carl Smith, le Fri 08 Jul 2011 01:01:53 +0200, a écrit : > > Oops, I hadn't realized that AC_CHECK_HEADERS checks for all of them. > > I've rewritten it quite a bit, in an actually more straightforward way, > > could you test it? > > Sure - still no joy. It's still selecting ncurses. Ow, AC_SEARCH_LIBS is actually not using ac_includes_default. Alright, I give up trying to use autoconf high-end macros, here is another, low-level try. Samuel
Re: [hwloc-users] hwloc 1.2 compilation problems
Samuel Thibault, le Tue 21 Jun 2011 02:10:22 +0200, a écrit : > Carl Smith, le Tue 21 Jun 2011 02:07:09 +0200, a écrit : > > > Ah, ok. So what fails to link is > > > > > > /* cc test.c -o test -lncurses */ > > > #include > > > #include > > > int main(void) { > > > } > > > > > > is that right? > > > > Yes, and > > > > > /* cc test.c -I/usr/include/ncurses -o test -lncurses */ > > > > does not fail. > > Ok, then good, I'll simply include term.h when checking -lfoocurses, to > make it fail with ncurses on your AIX box (but succeed with curses right > after that) I've done so in svn, could you check? Samuel
Re: [hwloc-users] Patch to disable GCC __builtin_ operations
Josh Hursey, le Thu 09 Jun 2011 14:52:39 +0200, a écrit : > The odd thing about this environment is that the head node seems to > have a slightly different setup than the compute nodes (not sure why > exactly, but that's what it is). So hwloc is configured and runs > correctly on the head node, but when it is asked to run on the compute > nodes it segvs at the call site of the __builtin_ functions. Could you post a disassembly of the site? > I suspect that the ABI compatibility of the libc interface is what is > enabling the remainder of the code to work in both environments, and > that the __builtin_ functions bypass that ABI to put in system > specific code that (for whatever reason) does not match on the compute > nodes. But the odd thing is that there shouldn't be any ABI things here, it's meant to be inlined. Samuel