Re: [hwloc-devel] Negative values for die_id and physical_package_id
Brice Goglin, on Wed 26 May 2021 14:13:02 +0200, wrote:
> os_index is already *unsigned* in the API (did you mean signed?). We cannot
> change the obj->os_index back to signed now, it would break existing users.

Mmm, it wouldn't break the ABI, only printf formats using %u?

Samuel
___
hwloc-devel mailing list
hwloc-devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-devel
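To illustrate the point about printf formats, here is a minimal sketch in plain C. The two structs and both helper functions are hypothetical stand-ins, not hwloc's real types. It assumes only that an int and an unsigned occupy the same storage, which is why the layout (and hence the ABI) would stay the same while the printed text differs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for an object field, not hwloc's real structures. */
struct obj_with_unsigned { unsigned os_index; };
struct obj_with_signed   { int os_index; };

/* Prints an index the way code written for a signed field would. */
static void format_signed(char *buf, size_t len, int os_index)
{
  assert(buf != NULL && len > 0);
  snprintf(buf, len, "%d", os_index);
}

/* Prints the same value the way code written for an unsigned field would.
 * Converting -1 to unsigned is well defined and yields UINT_MAX. */
static void format_unsigned(char *buf, size_t len, int os_index)
{
  assert(buf != NULL && len > 0);
  snprintf(buf, len, "%u", (unsigned) os_index);
}
```

Calling both helpers with -1 (the kind of negative ID discussed in this thread) gives "-1" from the first and a large positive number from the second, although both structs have the same size.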
Re: [hwloc-devel] Strange CPU topology numbering on dual socket ARM server with 2×ThunderX2 CN9975
Jirka Hladky, on Fri 06 Sep 2019 16:52:30 +0200, wrote:
> The trouble is that other Linux tools (like ps) are using the physical
> numbering.

Yes, that is why hwloc provides both, and hwloc-calc can be used to
convert between them for instance.

Samuel
Re: [hwloc-devel] Strange CPU topology numbering on dual socket ARM server with 2×ThunderX2 CN9975
Brice Goglin, on Fri 06 Sep 2019 16:07:13 +0200, wrote:
> physical_package_id don't have to be between 0 and N-1,

Which is the very reason for the logical IDs that hwloc provides :)

Samuel
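The idea of logical IDs can be sketched as follows. This is an illustration only: hwloc derives its logical order from the topology tree, whereas this sketch simply sorts the physical IDs, and all function names and sample ID values are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two ints for qsort, avoiding overflow. */
static int compare_ids(const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Sorts the physical IDs in place; afterwards the logical index of an ID
 * is simply its position in the array, i.e. a value between 0 and N-1. */
static void build_logical_order(int *physical_ids, size_t count)
{
  if (count == 0)
    return;
  assert(physical_ids != NULL);
  qsort(physical_ids, count, sizeof *physical_ids, compare_ids);
}

/* Returns the logical index of a physical ID, or -1 if it is unknown. */
static int logical_from_physical(const int *sorted_ids, size_t count,
                                 int physical_id)
{
  for (size_t i = 0; i < count; i++)
    if (sorted_ids[i] == physical_id)
      return (int) i;
  return -1;
}
```

With hypothetical physical package IDs such as 3180, 36 and 1200, the logical indexes come out as 2, 0 and 1, whatever values the firmware happened to report.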
[hwloc-devel] hwloc2 in Debian?
Hello,

The last missing bits for having hwloc2 in Debian are almost there:

- librsb has been updated in unstable to the fixed 1.2.0.8
- openmpi 4 has been uploaded to experimental

So once openmpi 4 is in unstable and these two are ready to move to
testing, AFAICT there is nothing that prevents putting hwloc2 in
Debian.

Does anybody see any other red flag that we should take care of?

Samuel
Re: [hwloc-devel] hwloc_distances_add conflicting declaration
Marco Atzeri, on Sun 30 Sep 2018 20:02:59 +0200, wrote:
> also adding a HWLOC_DECLSPEC on the first case distances.c:347
> does not solve the issue as the two declaration are not the same.
>
> Suggestion ?

Perhaps use hwloc_uint64_t instead of uint64_t in hwloc/distances.c?

Samuel
[hwloc-devel] Homebrew bumped to hwloc 2
Hello,

For information, Homebrew bumped its hwloc version to 2:

https://github.com/Homebrew/homebrew-core/pull/23721

Samuel
Re: [hwloc-devel] [SCM] open-mpi/hwloc branch master updated. 14e727976867931a2eb74f2630b0ce9137182874
Brice Goglin, on Mon 05 Feb 2018 14:25:58 +0100, wrote:
> configure only looks for CL/cl_ext.h before enabling the OpenCL backend.
> Did it enable OpenCL on your machine?

Possibly not, we just happen to have had StarPU build errors when
including opencl.h.

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0rc2 released
BTW, I have noticed that ia64 eventually tried to build beta1. It failed
in the shmem.c test:

https://buildd.debian.org/status/fetch.php?pkg=hwloc=ia64=2.0.0~beta1-4=1517101408=0

I don't have access to a porterbox to check what happened more
precisely. The rc2 build might be attempted within a few days, depending
on how fast the porters manage to unlock more package builds.

Samuel
Re: [hwloc-devel] No opencl osdev for NVidia GPU devices
Brice Goglin, on Mon 08 Jan 2018 15:41:02 +0100, wrote:
> Do we want to see OpenCL CPU devices too?

I believe we don't want them.

Samuel
Re: [hwloc-devel] get_last_cpu_location for process (Was: Hardware locality (hwloc) v2.0.0-beta1 released)
Brice Goglin, on Fri 29 Dec 2017 11:15:09 +0100, wrote:
> I couldn't test since binding doesn't seem to work in my qemu (always
> goes to PU #0), even when using qemu-x86_64 on x86_64. Is this fixed
> with your patches sent to qemu-devel yesterday?

My get/setaffinity patches shouldn't fix anything in that regard since
they are only for big/little endian conversion.

> Also sched_getcpu() isn't implemented in my qemu,

My getcpu patch fixes that.

Samuel
Re: [hwloc-devel] get_last_cpu_location for process (Was: Hardware locality (hwloc) v2.0.0-beta1 released)
Brice Goglin, on Thu 28 Dec 2017 18:47:29 +0100, wrote:
> On 28/12/2017 at 16:18, Samuel Thibault wrote:
> > Samuel Thibault, on Thu 28 Dec 2017 15:08:30 +0100, wrote:
> >> Samuel Thibault, on Wed 20 Dec 2017 18:32:48 +0100, wrote:
> >>> I have uploaded it to debian experimental, so when it passes NEW,
> >>> various arch test results will show up on
> >>>
> >>> https://buildd.debian.org/status/package.php?p=hwloc=experimental
> >>>
> >>> so you can check the results on odd systems :)
> >> FI, the failure on m68k is due to a bug in qemu's linux-user emulation,
> >> which I'm currently fixing.
> > There is however an issue with the hwloc_get_last_cpu_location test when
> > run inside qemu's linux-user emulation, because in that case qemu
> > introduces a thread for its own purposes in addition to the normal
> > thread, and then the test looks like this:
>
> Can you clarify what this qemu linux-user emulation does? Is it
> emulating each process of a "fake-VM" inside a dedicated process on the
> host?

Yes. qemu-i386 /crossroot/bin/bash will run an i386-based bash as a
normal process, only emulating the CPU part; all system calls are made
normally (with a bit of data translation).

> Any idea when this could be useful beside (I guess) cross-building
> platforms?

Mostly that :)

> we can add a hwloc env var to disable process-wide asserts.

So I would set it for all Debian builds? (we don't have a fixed set of
archs which are built inside qemu-user)

Samuel
[hwloc-devel] get_last_cpu_location for process (Was: Hardware locality (hwloc) v2.0.0-beta1 released)
Samuel Thibault, on Thu 28 Dec 2017 15:08:30 +0100, wrote:
> Samuel Thibault, on Wed 20 Dec 2017 18:32:48 +0100, wrote:
> > I have uploaded it to debian experimental, so when it passes NEW,
> > various arch test results will show up on
> >
> > https://buildd.debian.org/status/package.php?p=hwloc=experimental
> >
> > so you can check the results on odd systems :)
>
> FI, the failure on m68k is due to a bug in qemu's linux-user emulation,
> which I'm currently fixing.

There is however an issue with the hwloc_get_last_cpu_location test when
run inside qemu's linux-user emulation, because in that case qemu
introduces a thread for its own purposes in addition to the normal
thread, and then the test looks like this:

I'm tid 15573
trying 0x0003 1
setaffinity 15573 3 gave 0
setaffinity 15606 3 gave 0
getting last location for 15573
got 0
getting last location for 15606
got 2
got 0x0005
hwloc_get_last_cpu_location: hwloc_get_last_cpu_location.c:38: check: Assertion `hwloc_bitmap_isincluded(last, set)' failed.

I.e. when trying check(set, HWLOC_CPUBIND_PROCESS);, the
hwloc_set_cpubind() call does bind the two threads of the process, and
then looks for the CPU locations of the two threads, but probably thread
15606 didn't actually run in between, and thus the last CPU location is
still with the old binding, and that fails the assertion.

Of course, in the Debian package I could patch over this test to ignore
the failure, possibly by blacklisting architectures which are known to
be built inside qemu, but it could pose a problem more generally. Perhaps
we should use the attached patch, to try to check inclusion only from
the result of the current-thread-only method?

Samuel

diff --git a/tests/hwloc/hwloc_get_last_cpu_location.c b/tests/hwloc/hwloc_get_last_cpu_location.c
index 03ab103b..15f2fb00 100644
--- a/tests/hwloc/hwloc_get_last_cpu_location.c
+++ b/tests/hwloc/hwloc_get_last_cpu_location.c
@@ -27,6 +27,13 @@ static int check(hwloc_const_cpuset_t set, int flags)
   ret = hwloc_get_last_cpu_location(topology, last, flags);
   assert(!ret);
   assert(!hwloc_bitmap_iszero(last));
+
+  if (support->cpubind->get_thisthread_last_cpu_location)
+    ret = hwloc_get_last_cpu_location(topology, last, HWLOC_CPUBIND_THREAD);
+  else
+    ret = hwloc_get_last_cpu_location(topology, last, 0);
+  assert(!ret);
+  assert(!hwloc_bitmap_iszero(last));
   assert(hwloc_bitmap_isincluded(last, set));
 
   hwloc_bitmap_free(last);
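The failing assertion can be reproduced with plain bit masks. The sketch below is not hwloc code: mask_isincluded is a hypothetical plain-integer analogue of hwloc_bitmap_isincluded, and union_of_last_cpus is an assumed helper that mimics collecting the last CPU of every thread in the process. The values are taken from the log above (binding to mask 0x3, last CPUs 0 and 2).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Builds a CPU mask from the last CPU index reported for each thread. */
static unsigned long union_of_last_cpus(const unsigned *last_cpu,
                                        size_t nthreads)
{
  unsigned long mask = 0;
  for (size_t i = 0; i < nthreads; i++) {
    assert(last_cpu[i] < sizeof(mask) * CHAR_BIT);
    mask |= 1UL << last_cpu[i];
  }
  return mask;
}

/* True when every bit set in `sub` is also set in `super`. */
static bool mask_isincluded(unsigned long sub, unsigned long super)
{
  return (sub & ~super) == 0;
}
```

With both threads taken into account the mask is 0x5, which is not included in 0x3, so the process-wide check fails. Restricting the query to the current thread, as the patch proposes, gives 0x1, which is included.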
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Samuel Thibault, on Wed 20 Dec 2017 18:32:48 +0100, wrote:
> I have uploaded it to debian experimental, so when it passes NEW,
> various arch test results will show up on
>
> https://buildd.debian.org/status/package.php?p=hwloc=experimental
>
> so you can check the results on odd systems :)

FYI, the failure on m68k is due to a bug in qemu's linux-user emulation,
which I'm currently fixing.

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
In the end, I'm wondering what we will do for the Debian packages: a
separate libhwloc2-dev package (which is a pain for various reasons) or
not. It depends on whether we have rdependencies ready when we really
want hwloc2 in Debian. FYI, here are the rdeps:

Package: gridengine
Package: htop
Package: librsb
Package: mpich
Package: openmpi
Package: pocl
Package: slurm-llnl
Package: starpu
Package: trafficserver

For small fixups like field changes etc. maintainers will probably be
fine with patches, but for more involved changes such as memory nodes,
it'll probably take more time since maintainers may not be happy to
backport heavy changes.

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Brice Goglin, on Fri 22 Dec 2017 12:35:35 +0100, wrote:
> That won't work. You can have memory attached at different levels of the
> hierarchy (things like HBM inside a die, normal memory attached to a
> package, and slow memory attached to the memory interconnect). The
> notion of NUMA node and proximity domain is changing. It's not a set of
> CPU+memory anymore. Things are moving towards the separation of "memory
> initiator" (CPUs) and "memory target" (memory banks, possibly behind
> memory-side caches). And those targets can be attached to different things.

Alright. So hwloc might be a lever there to push people into thinking
that way :)

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
BTW, I find differing information on the soname that hwloc2 will have:

https://github.com/open-mpi/hwloc/wiki/Upgrading-to-v2.0-API

mentions version 6, but VERSION uses 12:0:0 (and thus the soname uses
12).

Samuel
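For readers unfamiliar with how 12:0:0 leads to soname 12: libtool version information has the form current:revision:age, and on Linux-style ELF platforms the soname major number is current minus age. The sketch below only shows that arithmetic; the function name is made up for the example and other platforms use different rules.

```c
#include <assert.h>
#include <stdio.h>

/* Parses a libtool-style "current:revision:age" string and returns the
 * Linux soname major number (current - age), or -1 on malformed input. */
static int soname_major(const char *version_info)
{
  unsigned current, revision, age;

  assert(version_info != NULL);
  if (sscanf(version_info, "%u:%u:%u", &current, &revision, &age) != 3)
    return -1;
  if (age > current)
    return -1; /* age may not exceed current */
  return (int) (current - age);
}
```

For "12:0:0" this gives 12, matching the soname mentioned above; a value such as "6:0:1" would give 5.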
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Hello,

Brice Goglin, on Tue 19 Dec 2017 11:48:39 +0100, wrote:
> + Memory, I/O and Misc objects are now stored in dedicated children lists,
> not in the usual children list that is now only used for CPU-side objects.
> - hwloc_get_next_child() may still be used to iterate over these 4 lists
> of children at once.

I hadn't realized this before: so the NUMA-related hierarchy level can
not be easily obtained with hwloc_get_type_depth and such, that's really
a concern. For instance in slurm-llnl one can find

if (hwloc_get_type_depth(topology, HWLOC_OBJ_NODE) >
    hwloc_get_type_depth(topology, HWLOC_OBJ_SOCKET)) {

and probably others are doing this too, e.g. looking up from a CPU to
find the NUMA level becomes very different from looking up from a CPU to
find the L3 level etc.

Instead of moving these objects to another place which is very different
to find, can't we rather create another type of object, e.g.
HWLOC_OBJ_MEMORY, to represent the different kinds of memories that can
be found in a given NUMA level, and keep HWLOC_OBJ_NODE as it is?

I'm really afraid that otherwise this change will hurt a lot of people
and remain a pain for programming things for a long time.

Samuel
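The difference between the two lookups can be sketched with a toy tree. The struct below is a hypothetical, heavily simplified mirror of a topology object (its field names are chosen for the illustration and it is not hwloc's real structure); it only assumes what the announcement says, namely that memory objects hang off a separate children list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified topology object for illustration only. */
struct node {
  const char *type;
  struct node *parent;
  struct node *memory_first_child; /* memory children, kept apart */
};

/* Classic lookup: walk up the parent chain until the type matches. */
static struct node *find_ancestor_by_type(struct node *obj, const char *type)
{
  for (obj = obj->parent; obj != NULL; obj = obj->parent)
    if (strcmp(obj->type, type) == 0)
      return obj;
  return NULL;
}

/* Memory lookup: walk up until some object carries a memory child. */
static struct node *find_local_numa(struct node *obj)
{
  for (; obj != NULL; obj = obj->parent)
    if (obj->memory_first_child != NULL)
      return obj->memory_first_child;
  return NULL;
}
```

In a tree where a Package has a NUMANode as memory child and an L3Cache above the PU, the classic walk finds the L3Cache but never meets the NUMANode, which is the concern raised above; the second function shows the different kind of walk that is needed instead.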
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Samuel Thibault, on Wed 20 Dec 2017 18:26:37 +0100, wrote:
> Brice Goglin, on Wed 20 Dec 2017 18:16:34 +0100, wrote:
> > On 20/12/2017 at 18:06, Samuel Thibault wrote:
> > > It has only one NUMA node, thus triggering the code I patched over.
> >
> > Well, this has been working fine for a while, since that's my daily
> > development machine and all our jenkins slaves.
> >
> > Can you give the usually requested details about the OS, kernel,
> > hwloc-gather-topology? hwloc-gather-cpuid if the x86 backend is involved?
>
> Your commit 301c0f94e0a54823bfd530c36b5f9c9d9862332b seems to have fixed
> it.
>
> It's Debian Buster, kernel 4.14.0, and attached gathers.

Mmm, it seems the x86 backend gets triggered somehow: this is the first
hwloc_topology_reconnect call:

#0 hwloc_topology_reconnect (topology=topology@entry=0x5577f060, flags=flags@entry=0) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:2910
#1 0x77bc93e2 in hwloc_x86_discover (backend=0x5577f890) at /home/samy/recherche/hwloc/hwloc/hwloc/topology-x86.c:1264
#2 0x77ba1595 in hwloc_discover (topology=0x5577f060) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:3008
#3 hwloc_topology_load (topology=0x5577f060) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:3584
#4 0x84f6 in main (argc=, argv=) at /home/samy/recherche/hwloc/hwloc/utils/lstopo/lstopo.c:995

and then the second:

#0 hwloc_topology_reconnect (topology=0x5577f060, flags=0) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:2910
#1 0x77ba1707 in hwloc_discover (topology=0x5577f060) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:3103
#2 hwloc_topology_load (topology=0x5577f060) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:3584
#3 0x84f6 in main (argc=, argv=) at /home/samy/recherche/hwloc/hwloc/utils/lstopo/lstopo.c:995

and a third:

#0 hwloc_topology_reconnect (topology=0x5577f060, flags=0) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:2910
#1 0x77ba1789 in hwloc_discover (topology=0x5577f060) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:3149
#2 hwloc_topology_load (topology=0x5577f060) at /home/samy/recherche/hwloc/hwloc/hwloc/topology.c:3584
#3 0x84f6 in main (argc=, argv=) at /home/samy/recherche/hwloc/hwloc/utils/lstopo/lstopo.c:995

(these are all with git 220ee3eb926ca6bfa175d9700ab56d14a554cea4)

I have attached the cpuid/ directory.

Samuel

cpuid.tgz
Description: application/gtar-compressed
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
I have uploaded it to debian experimental, so when it passes NEW,
various arch test results will show up on

https://buildd.debian.org/status/package.php?p=hwloc=experimental

so you can check the results on odd systems :)

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Brice Goglin, on Wed 20 Dec 2017 18:16:34 +0100, wrote:
> On 20/12/2017 at 18:06, Samuel Thibault wrote:
> > It has only one NUMA node, thus triggering the code I patched over.
>
> Well, this has been working fine for a while, since that's my daily
> development machine and all our jenkins slaves.
>
> Can you give the usually requested details about the OS, kernel,
> hwloc-gather-topology? hwloc-gather-cpuid if the x86 backend is involved?

Your commit 301c0f94e0a54823bfd530c36b5f9c9d9862332b seems to have fixed
it.

It's Debian Buster, kernel 4.14.0, and attached gathers.

Samuel

Machine (P#0 total=8022768KB DMIProductName="HP EliteBook 820 G2" DMIProductVersion=A3008C410003 DMIBoardVendor=Hewlett-Packard DMIBoardName=225A DMIBoardVersion="KBC Version 96.54" DMIBoardAssetTag= DMIChassisVendor=Hewlett-Packard DMIChassisType=10 DMIChassisVersion= DMIChassisAssetTag=5CG5201YZY DMIBIOSVendor=Hewlett-Packard DMIBIOSVersion="M71 Ver. 01.04" DMIBIOSDate=02/24/2015 DMISysVendor=Hewlett-Packard Backend=Linux LinuxCgroup=/ hwlocVersion=2.0.0a1-git ProcessName=lstopo-no-graphics)
  Package L#0 (P#0 total=8022768KB CPUModel="Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz" CPUVendor=GenuineIntel CPUFamilyNumber=6 CPUModelNumber=61 CPUStepping=4)
    NUMANode L#0 (P#0 local=8022768KB total=8022768KB)
    L3Cache L#0 (size=3072KB linesize=64 ways=12 Inclusive=1)
      L2Cache L#0 (size=256KB linesize=64 ways=8 Inclusive=0)
        L1dCache L#0 (size=32KB linesize=64 ways=8 Inclusive=0)
          L1iCache L#0 (size=32KB linesize=64 ways=8 Inclusive=0)
            Core L#0 (P#0)
              PU L#0 (P#0)
              PU L#1 (P#1)
      L2Cache L#1 (size=256KB linesize=64 ways=8 Inclusive=0)
        L1dCache L#1 (size=32KB linesize=64 ways=8 Inclusive=0)
          L1iCache L#1 (size=32KB linesize=64 ways=8 Inclusive=0)
            Core L#1 (P#1)
              PU L#2 (P#2)
              PU L#3 (P#3)
  HostBridge L#0 (buses=:[00-03])
    PCI L#0 (busid=:00:02.0 id=8086:1616 class=0300(VGA) PCIVendor="Intel Corporation" PCIDevice="HD Graphics 5500")
    PCI L#1 (busid=:00:19.0 id=8086:15a2 class=0200(Ethernet) PCIVendor="Intel Corporation" PCIDevice="Ethernet Connection (3) I218-LM")
      Network L#0 (Address=48:0f:cf:28:82:c3) "enp0s25"
    PCIBridge L#1 (busid=:00:1c.3 id=8086:9c96 class=0604(PCIBridge) buses=:[03-03] PCIVendor="Intel Corporation" PCIDevice="Wildcat Point-LP PCI Express Root Port #4")
      PCI L#2 (busid=:03:00.0 id=8086:095a class=0280(Network) PCIVendor="Intel Corporation" PCIDevice="Wireless 7265")
        Network L#1 (Address=34:02:86:2c:6a:19) "wlo1"
    PCI L#3 (busid=:00:1f.2 id=8086:9c83 class=0106(SATA) PCIVendor="Intel Corporation" PCIDevice="Wildcat Point-LP SATA Controller [AHCI Mode]")
      Block(Disk) L#2 (Size=250059096 SectorSize=512 LinuxDeviceID=8:0 Model=MTFDDAK256MBF-1AN15ABHA Revision=M6T3 SerialNumber=14380F25F377) "sda"
depth 0: 1 Machine (type #0)
depth 1: 1 Package (type #1)
depth 2: 1 L3Cache (type #6)
depth 3: 2 L2Cache (type #5)
depth 4: 2 L1dCache (type #4)
depth 5: 2 L1iCache (type #9)
depth 6: 2 Core (type #2)
depth 7: 4 PU (type #3)
Special depth -3: 1 NUMANode (type #13)
Special depth -4: 2 Bridge (type #14)
Special depth -5: 4 PCIDev (type #15)
Special depth -6: 3 OSDev (type #16)
Topology not from this system

gather.xml
Description: XML document

gather.tar.bz2
Description: Binary data
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Brice Goglin, on Wed 20 Dec 2017 17:53:54 +0100, wrote:
> On 20/12/2017 at 17:49, Samuel Thibault wrote:
> > Samuel Thibault, on Wed 20 Dec 2017 13:57:45 +0100, wrote:
> >> Brice Goglin, on Tue 19 Dec 2017 11:48:39 +0100, wrote:
> >>> The Hardware Locality (hwloc) team is pleased to announce the first
> >>> beta release for v2.0.0:
> >>>
> >>>    http://www.open-mpi.org/projects/hwloc/
> >> I tried to build the Debian package, there are a few failures in the
> >> testsuite:
> >>
> >> FAIL: test-lstopo.sh
> >> FAIL: hwloc_bind
> >> FAIL: hwloc_get_last_cpu_location
> >> FAIL: hwloc_get_area_memlocation
> >> FAIL: hwloc_object_userdata
> >> FAIL: hwloc_backends
> >> FAIL: hwloc_pci_backend
> >> FAIL: hwloc_is_thissystem
> >> FAIL: hwloc_topology_diff
> >> FAIL: hwloc_topology_abi
> >> FAIL: hwloc_obj_infos
> >> FAIL: glibc-sched
> >> ../.././config/test-driver: line 107: 27886 Segmentation fault "$@" > $log_file 2>&1
> >> FAIL: hwloc-hello
> >> ../.././config/test-driver: line 107: 27905 Segmentation fault "$@" > $log_file 2>&1
> >> FAIL: hwloc-hello-cpp
> >>
> >> This is running inside a Debian Buster system.
> > It seems to be fixed by the attached patch.
> >
>
> I can't reproduce the issue, what's specific about your system?

It has only one NUMA node, thus triggering the code I patched over.

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Samuel Thibault, on Wed 20 Dec 2017 13:57:45 +0100, wrote:
> Brice Goglin, on Tue 19 Dec 2017 11:48:39 +0100, wrote:
> > The Hardware Locality (hwloc) team is pleased to announce the first
> > beta release for v2.0.0:
> >
> >    http://www.open-mpi.org/projects/hwloc/
>
> I tried to build the Debian package, there are a few failures in the
> testsuite:
>
> FAIL: test-lstopo.sh
> FAIL: hwloc_bind
> FAIL: hwloc_get_last_cpu_location
> FAIL: hwloc_get_area_memlocation
> FAIL: hwloc_object_userdata
> FAIL: hwloc_backends
> FAIL: hwloc_pci_backend
> FAIL: hwloc_is_thissystem
> FAIL: hwloc_topology_diff
> FAIL: hwloc_topology_abi
> FAIL: hwloc_obj_infos
> FAIL: glibc-sched
> ../.././config/test-driver: line 107: 27886 Segmentation fault "$@" > $log_file 2>&1
> FAIL: hwloc-hello
> ../.././config/test-driver: line 107: 27905 Segmentation fault "$@" > $log_file 2>&1
> FAIL: hwloc-hello-cpp
>
> This is running inside a Debian Buster system.

It seems to be fixed by the attached patch.

Samuel

diff --git a/hwloc/topology.c b/hwloc/topology.c
index d827d5f5..e0bf7beb 100644
--- a/hwloc/topology.c
+++ b/hwloc/topology.c
@@ -1,7 +1,7 @@
 /*
  * Copyright © 2009 CNRS
  * Copyright © 2009-2017 Inria. All rights reserved.
- * Copyright © 2009-2012 Université Bordeaux
+ * Copyright © 2009-2012, 2017 Université Bordeaux
  * Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
  * See COPYING in top-level directory.
  */
@@ -1596,6 +1596,7 @@ hwloc__insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t root
    */
 #endif
 
+  topology->modified = 1;
   if (hwloc_obj_type_is_memory(obj->type)) {
     if (!root) {
       root = hwloc__find_insert_memory_parent(topology, obj, report_error);
@@ -3044,6 +3045,7 @@ next_cpubackend:
       memcpy(&node->attr->numanode, &topology->machine_memory, sizeof(topology->machine_memory));
       memset(&topology->machine_memory, 0, sizeof(topology->machine_memory));
       hwloc_insert_object_by_cpuset(topology, node);
+      hwloc_topology_reconnect(topology, 0);
     } else {
       /* if we're sure we found all NUMA nodes without their sizes (x86 backend?),
        * we could split topology->total_memory in all of them.
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Brice Goglin, on Tue 19 Dec 2017 11:48:39 +0100, wrote:
> The Hardware Locality (hwloc) team is pleased to announce the first
> beta release for v2.0.0:
>
>    http://www.open-mpi.org/projects/hwloc/

I tried to build the Debian package, there are a few failures in the
testsuite:

FAIL: test-lstopo.sh
FAIL: hwloc_bind
FAIL: hwloc_get_last_cpu_location
FAIL: hwloc_get_area_memlocation
FAIL: hwloc_object_userdata
FAIL: hwloc_backends
FAIL: hwloc_pci_backend
FAIL: hwloc_is_thissystem
FAIL: hwloc_topology_diff
FAIL: hwloc_topology_abi
FAIL: hwloc_obj_infos
FAIL: glibc-sched
../.././config/test-driver: line 107: 27886 Segmentation fault "$@" > $log_file 2>&1
FAIL: hwloc-hello
../.././config/test-driver: line 107: 27905 Segmentation fault "$@" > $log_file 2>&1
FAIL: hwloc-hello-cpp

This is running inside a Debian Buster system.

Samuel
Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v2.0.0-beta1 released
Hello,

Brice Goglin, on Tue 19 Dec 2017 11:48:39 +0100, wrote:
> The Hardware Locality (hwloc) team is pleased to announce the first
> beta release for v2.0.0:
>
>    http://www.open-mpi.org/projects/hwloc/

The tarball doesn't contain a netloc/ directory. This is not a problem
for ./configure && make && make install, but it prevents running
./autogen.sh

Samuel
Re: [hwloc-devel] No opencl osdev for NVidia GPU devices
Brice Goglin, on Wed 27 Sep 2017 20:39:47 +0200, wrote:
> On 27/09/2017 18:58, Samuel Thibault wrote:
> > Isn't it better to show OpenCL at the root rather then not at all?
>
> As you want.
> If there's a need for these objects without any topology information,
> that's fine with me.

Well, some people were surprised not to see OpenCL devices while
compilation found the support for it etc. People will perhaps be
surprised that there is no topology information for them, but I find
that less surprising.

Samuel
Re: [hwloc-devel] [SCM] open-mpi/hwloc branch master updated. 37eb93c7dfeca1a0ce84474bac9d2f234bcbacd4
Brice Goglin, on Tue 29 Aug 2017 18:54:03 +0200, wrote:
> Contrary to lstopo, hwloc-ps has no problem with long command-lines.

Right.

> What's the point of shortening to comm here?

Well, for coherency only, if you prefer long command-lines there, I'm
fine with it.

Samuel
Re: [hwloc-devel] hwloc.m4: minor english fixes
Jeff Squyres (jsquyres), on Wed 08 Feb 2017 15:19:58 +0000, wrote:
> 1. You reverted an actual grammar fix: "support" -> "supported".

Oops, I missed that part, sorry.

> 2. I don't think that "likely" is bad to have. Like I said above, the test
> itself is just a switch/case test based on a hard-coded list of OSs. The
> test does not *actually* test to see if the system supports binding.

No, but as of now there is just no way that binding can be supported
without knowing anything about the OS. There is simply no standard way
of binding a thread as of now. I don't see a reason why we should let
users lose time trying to determine whether it actually works or not
while it will just never work with the current codebase, and I doubt we
will ever see a really standard OS way of binding threads. If that ever
happens, we can still change the phrasing here, while letting the user
be unsure about the current state means making him lose time.

Samuel
___
hwloc-devel mailing list
hwloc-devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-devel
Re: [hwloc-devel] hwloc.m4: minor english fixes
git...@open-mpi.org, on Tue 07 Feb 2017 09:15:01 -0600, wrote:
> commit 96a1a1b4d9f4d34e6b26ed4a665a739fd449131a
> Author: Jeff Squyres
> Date: Tue Feb 7 10:13:27 2017 -0500
>
> hwloc.m4: minor english fixes
>
> Signed-off-by: Jeff Squyres
>
> @@ -262,7 +262,7 @@ EOF])
> AC_MSG_WARN([*** hwloc does not support this system.])
> AC_MSG_WARN([*** hwloc will *attempt* to build (but it may not work).])
> AC_MSG_WARN([*** hwloc run-time results may be reduced to showing just one processor,])
> -AC_MSG_WARN([*** and binding will not be support.])
> +AC_MSG_WARN([*** and binding will likely not be supported.])

Well, it's not really "likely": unsupported systems really won't have
binding support. For getting the number of processors we have a couple
of more or less generic ways, but for binding we don't have any.

Samuel
Re: [hwloc-devel] Three patches for MSVC/ICL builds on Windows.
Brice Goglin, on Tue 05 Apr 2016 10:39:29 +0200, wrote:
> On 05/04/2016 10:26, Samuel Thibault wrote:
> > The bug here is that that HWLOC_CHECK_DECL assumed that availability
> > of the function was tested before, i.e.
> >> conftest.c(96) : fatal error C1083: Cannot open include file: 'sched.h': No
> >> such file or directory
> > was unexpected.
>
> Adding a check for sched.h availability before CHECK_DECL() might be
> enough for Jonathan's case. I am not sure I want to change this m4 code
> in v1.11.3 since it has been working fine for years.

Well, we can as well just use AC_CHECK_DECL in v1.11.3, it'll just get
the same result as what the code currently expects, and not the bug.

Samuel
Re: [hwloc-devel] hwloc-1.11 failure with pgi compiler
Paul Hargrove, on Tue 28 Jul 2015 16:47:37 -0700, wrote:
> : "=a" (*eax), "=r" (*ebx), "=c" (*ecx), "=d" (*edx)
>
> 5f5c5: 87 d3 xchg %edx,%ebx

Ouch. That I call "buggy" indeed :)

Thanks for the tests, that's good to know.

Samuel
Re: [hwloc-devel] hwloc-1.11 failure with pgi compiler
Paul Hargrove, on Tue 28 Jul 2015 15:00:36 -0700, wrote:
> Well, for the compiler that accepted the "=r" form and then generated code
> that SEGV'd I would say "buggy".

I would like to see the generated code before saying anything, since
it's so easy to write bogus inline assembly and be completely unable
to see the issue before seeing the bogus generated code :)

Samuel
Re: [hwloc-devel] hwloc-1.11 failure with pgi compiler
Paul Hargrove, on Tue 21 Jul 2015 16:15:24 -0700, wrote:
> I am glad you asked me to test widely, because I did find 2 compilers that
> rejected my version with "=r" and one that generated bad code for that case.

What kind of bad code was it generating? Perhaps it was due to not using
an early clobber ("=&r" instead of just "=r")?

Samuel
Re: [hwloc-devel] hwloc-1.11 failure with pgi compiler
Brice Goglin, on Tue 28 Jul 2015 16:13:49 +0200, wrote:
> and your commit is slightly different: (s/xchg/mov/ and removed last line).

xchg is spurious here, mov is enough. I didn't remove the last line, I
just kept the original source, which uses +a instead of =a and a.

> FWIW, in master we don't have multiple inlining anymore (there's a
> wrapper function calling this inline asm).

You mean the cpuid_or_from_dump function?

Samuel
Re: [hwloc-devel] hwloc-1.11 failure with pgi compiler
Hello,

Paul Hargrove, on Mon 20 Jul 2015 23:12:10 -0700, wrote:
> I believe the following inline x86 asm is correct and more robust than the
> existing code that pgi appears to reject:

Indeed, in the 32bit case, we don't need to shuffle between 32 and 64bit
values, so it's simpler to just use a register. It's surprising that
letting the compiler decide the register fails more than just specifying
SD, but since wide testing shows that, then let's go with it.

I'm however afraid that this code has again posed a problem, even if we
do test its compilation in configure.ac. I'm wondering: instead of
insisting on inlining this function, we should perhaps just put it in a
separate .c file, which we try to compile from configure.ac exactly the
same way as it will be for libhwloc.so?

Samuel
Re: [hwloc-devel] [PATCH] utils/hwloc/Makefile.am: fix install-man race condition
Peter Korsgaard, on Tue 12 May 2015 16:09:55 +0200, wrote:
> Make install contains a race condition in utils/hwloc, as both
> install-exec-hook (through intall-exec) and install-data trigger
> install-man:

I'm surprised: isn't make supposed to handle this kind of dependency
concurrency?

Samuel
Re: [hwloc-devel] upcoming feature removal
Brice Goglin, le Mon 03 Nov 2014 11:49:02 +0100, a écrit : > * kerrighed support (single-system image): planned for removal since > 2012, see https://github.com/open-mpi/hwloc/issues/73 Right, Kerrighed is mostly discontinued. > I am also considering this change that shouldn't break existing programs: > * always create a NUMA node even if the machine isn't NUMA That makes sense to me, yes. > If we're going to break the ABI anyway (removing custom will break the > ABI), we could break it even more. Yes, sure, a 2.0 is the opportunity to break the ABI/API a bit. > * don't put I/O objects in "normal" children since it confuses programs > consulting the children list. rather place them under a dedicated child > pointer special objects such as Misc may go there as well. Indeed. > * replace hwloc_topology_ignore_type_keep_structure() with a flavor that > does not create asymmetric topologies. only remove entire levels that > don't add any hierarchy. don't remove single objects within levels in > case of asymmetric topologies (restricted by cgroup etc). Agreed. > * remove obj->os_level: pretty much unused and undocumented, can go in a > string attribute if really useful Indeed. This was introduced for AIX, and later re-used by x86, but applications most probably won't really make use of it. I don't think it's even useful to put it in an attribute. > Changes requested by some users but that I am not sure what to do yet: > * stop having 4 cpusets and 3 nodesets per object and just have 1 cpuset > and 1 nodeset depending on topology flags (only allowed, or only online, > etc). possibly with ways to switch between modes at runtime Right, I can understand it scares users to get all this information . Having to choose at runtime however poses the problem of applications which would want the various information, to deal with both online and allowed, lstopo is such an example, because they wouldn't like to have to build the topology several times to get the various information. 
What I guess would work fine is to have only the cpuset and nodeset fields, have a flag to decide between cpuset/nodeset being the complete sets or just what is covered by PU objects (the latter probably being the default), and provide the allowed and online cpusets/nodesets another way: we don't need them in all objects anyway, since one can always AND the allowed or online set with the object cpuset to get what one wants. Considering the netloc support, I guess the information should be stored in the machine object. > * stop having a CACHE type + data/instruction/unified + depth, and just > have one type for each of them, such as HWLOC_OBJ_CACHE_L1d. the > advantage is that you can switch (type) without special-casing the CACHE > subtypes. One drawback is that there are many subtypes in existing > machines (at least L1[id], L2[idu], L3[idu], L4u). Yes, and we don't know when it will end. We had only L1 in the past, then L1-L2, then L1-L3, then L1-L4, ... :) It would be a pity to abandon providing applications with a level-agnostic way to deal with caches. Separating instruction cache from data caches seems reasonable to me, however. > Also the "Group" type still needs special-casing because of multiple > nested groups in large NUMA machines. Which kind of special-casing do we see in the wild? I would usually consider groups as something applications can't really take into account more precisely than the mere topology division. Samuel
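Editor's sketch of the "AND with the object cpuset" idea from the message above. This is not hwloc code: cpumask_t, struct machine_sets and usable_pus() are hypothetical names, and a 64-bit integer stands in for a cpuset to keep the example self-contained. With real hwloc bitmaps, the same step would be a call to hwloc_bitmap_and().

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpuset: one bit per PU, at most 64 PUs.
 * hwloc itself uses hwloc_bitmap_t, which has no such size limit. */
typedef uint64_t cpumask_t;

/* Machine-wide sets, stored once (e.g. in the machine object)
 * instead of being duplicated in every object. */
struct machine_sets {
    cpumask_t allowed; /* PUs the process is allowed to use */
    cpumask_t online;  /* PUs that are currently online */
};

/* PUs of an object that are both allowed and online,
 * derived on demand from the object's single cpuset. */
static cpumask_t usable_pus(cpumask_t obj_cpuset, const struct machine_sets *m)
{
    return obj_cpuset & m->allowed & m->online;
}
```

An application such as lstopo could then compute the allowed, online, or combined view of any object from one topology, without rebuilding it per mode.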
Re: [hwloc-devel] Migrate Trac tickets -> Github issues
Jeff Squyres (jsquyres), le Fri 12 Sep 2014 10:44:03 +, a écrit : > I did a test import of hwloc's Trac tickets to githib -- what do you think? > > https://github.com/ompiteam/hwloc-test-ticket-import/issues It looks good to me. I have unwatched the github hwloc project. Samuel
Re: [hwloc-devel] Interesting warning
Hello,

Ralph Castain, le Wed 10 Sep 2014 17:41:17 -0700, a écrit :
> Just got this from Clang 3.4.2 on Linux x86-64:
>
> In file included from topology-x86.c:23:
> /home/common/openmpi/svn-trunk/opal/mca/hwloc/hwloc191/hwloc/include/private/cpuid-x86.h:67:3: warning: extension used [-Wlanguage-extension-token]
>   asm(
>   ^
> 1 warning generated.
>
> Guess it doesn't like that assembler in there

Could you try the attached patch?

Samuel

diff --git a/include/private/cpuid-x86.h b/include/private/cpuid-x86.h
index f00a97f..1abf172 100644
--- a/include/private/cpuid-x86.h
+++ b/include/private/cpuid-x86.h
@@ -16,7 +16,7 @@ static __hwloc_inline int hwloc_have_x86_cpuid(void)
 {
   int ret;
   unsigned tmp, tmp2;
-  asm(
+  __asm__(
       "mov $0,%0\n\t" /* Not supported a priori */
       "pushfl \n\t" /* Save flags */
@@ -64,7 +64,7 @@ static __hwloc_inline void hwloc_x86_cpuid(unsigned *eax, unsigned *ebx, unsigne
    * use them :/ */
 #ifdef HWLOC_X86_64_ARCH
   hwloc_uint64_t sav_rbx;
-  asm(
+  __asm__(
   "mov %%rbx,%2\n\t"
   "cpuid\n\t"
   "xchg %2,%%rbx\n\t"
@@ -73,7 +73,7 @@ static __hwloc_inline void hwloc_x86_cpuid(unsigned *eax, unsigned *ebx, unsigne
   "+c" (*ecx), "=" (*edx));
 #elif defined(HWLOC_X86_32_ARCH)
   unsigned long sav_ebx;
-  asm(
+  __asm__(
   "mov %%ebx,%2\n\t"
   "cpuid\n\t"
   "xchg %2,%%ebx\n\t"
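Editor's sketch of why the patch above switches spelling. As far as I know, plain asm is a GNU extension keyword that GCC and Clang flag or reject in strict ISO modes, while the reserved-namespace spelling __asm__ is accepted in both modes. The functions below are made up for illustration and assume a GCC-compatible compiler; the empty asm statement is architecture independent.

```c
/* Compiler barrier written with the reserved-namespace keyword.
 * The asm template is empty, so no instruction is emitted; the
 * "memory" clobber only prevents the compiler from reordering
 * memory accesses across this point. */
static void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Trivial user of the barrier, so there is something to call. */
static int add_with_barrier(int a, int b)
{
    int sum = a + b;
    compiler_barrier();
    return sum;
}
```

Building this with -std=c99 -pedantic should stay quiet, whereas the same code written with asm( would not, which matches the warning quoted in the report.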
Re: [hwloc-devel] hwloc-1.9.1 failure on FreeBSD64
Balaji, Pavan, le Thu 04 Sep 2014 14:39:38 +, a écrit : > /home/autotest/balaji/hwloc/hwloc-1.9.1/include/private/misc.h:360:3: error: > implicit declaration of function 'strncasecmp' > [-Werror=implicit-function-declaration] Uh, that's odd, we explicitly test for HWLOC_HAVE_DECL_STRNCASECMP. Could you post the config.log? Thanks, Samuel
Re: [hwloc-devel] hwloc with Xen system support - v2
Brice Goglin, le Wed 29 Jan 2014 16:04:54 +0100, a écrit : > We may want to make inputbuffer and outputbuffer generic enough (void* + > length) so that the model works for other architectures one day? Probably, yes. > Xen will know that they correspond to inputbuffer=one-register and > outputbuffer=four-registers when running on x86. It's actually already two-registers on x86, see cpuid calls which put cachenum in ecx. Perhaps ebx and edx would be used someday, who knows, so perhaps pass the four registers already. Samuel
Re: [hwloc-devel] hwloc with Xen system support - v2
Hello, Brice Goglin, le Tue 07 Jan 2014 12:54:45 +0100, a écrit : > I currently have a crazy idea for getting at the cache information. > topology-x86.c has a lot of cpuid knowledge, and I have a proposed new > hypercall which executes cpuid on a specific PU. Would it be possible (or > indeed sensible) to parametrise the code in topology-x86.c to take a few > function pointers for get/set binding information, and for the cpuid call > itself? > > > I don't see why we couldn't do that. Yep, it should just work. > Can you post an example of what the Xen cpuid hypercall prototype > would be, so that I see how I need to change the x86 backend? Well, it will probably just take a cpu number and eax,ecx values (or even all register values, in case future cpuid calls use all registers as input) as parameters, and return eax,ebx,ecx,edx. Compared to what we have already in the x86 backend, we are essentially missing passing the cpu number, since in our case we were assuming that the code was already running on the cpu itself. We might however want to have a way to restrict discovery to caches only. Samuel
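Editor's sketch of what "parametrising the cpuid call" could look like. Everything here is hypothetical (struct cpuid_regs, cpuid_fn, fake_cpuid, query_leaf); it is neither the Xen hypercall prototype nor the hwloc x86 backend. It only illustrates passing a PU number plus all four registers through a function pointer, as suggested above.

```c
#include <stddef.h>

/* Hypothetical register block: eax/ecx carry the inputs today,
 * but all four registers are passed in case future leaves need them. */
struct cpuid_regs {
    unsigned eax, ebx, ecx, edx;
};

/* Hypothetical callback: execute cpuid on PU `pu`, updating `regs`
 * in place. Returns 0 on success, -1 on failure. */
typedef int (*cpuid_fn)(void *ctx, unsigned pu, struct cpuid_regs *regs);

/* Fake backend used only for illustration: it reports the PU number
 * in ebx and the sum of the two inputs in edx. */
static int fake_cpuid(void *ctx, unsigned pu, struct cpuid_regs *regs)
{
    (void)ctx;
    regs->ebx = pu;
    regs->edx = regs->eax + regs->ecx;
    return 0;
}

/* Discovery code sees only the callback, so the same code could run
 * natively (bind to the PU, then cpuid) or through a hypervisor call. */
static int query_leaf(cpuid_fn fn, void *ctx, unsigned pu,
                      unsigned leaf, unsigned subleaf,
                      struct cpuid_regs *out)
{
    out->eax = leaf;
    out->ebx = 0;
    out->ecx = subleaf;
    out->edx = 0;
    return fn(ctx, pu, out);
}
```

A native backend would bind to the PU inside its callback before executing cpuid, which is the step the existing code assumes has already happened.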
Re: [hwloc-devel] Creating a topology generation method for Xen
Andrew Cooper, le Thu 26 Dec 2013 23:31:36 +0100, a écrit : > On 26/12/2013 21:43, Samuel Thibault wrote: > > Andrew Cooper, le Thu 26 Dec 2013 22:17:38 +0100, a écrit : > >> I believe can make a topology-xen.c without too much trouble. It likely > >> wants to checked before an os-specific hook (Xen dom0's come in at least > >> Linux, FreeBSD, NetBSD flavours which have mainstream support) > >> Are there any hints/suggestion/information about how to go about > >> integrating this? > > Yes, you can probably play with plugin priorities for that. See for > > instance what happens with the pci plugins. > > Are there any hints on exactly what I have to tweak to get > topology-xen.c picked up properly? Things happen in config/hwloc.m4, where you have to specify in hwloc_components that you want to build a new plugin. You can probably use the xml plugin as an example. Samuel
Re: [hwloc-devel] Creating a topology generation method for Xen
Samuel Thibault, le Thu 26 Dec 2013 22:43:35 +0100, a écrit : > Andrew Cooper, le Thu 26 Dec 2013 22:17:38 +0100, a écrit : > > I believe can make a topology-xen.c without too much trouble. It likely > > wants to checked before an os-specific hook (Xen dom0's come in at least > > Linux, FreeBSD, NetBSD flavours which have mainstream support) > > Are there any hints/suggestion/information about how to go about > > integrating this? > > Yes, you can probably play with plugin priorities for that. See for > instance what happens with the pci plugins. (against the linux plugin. You can also see how the bgq plugin goes before the linux plugin). Samuel
Re: [hwloc-devel] Creating a topology generation method for Xen
Hello, Andrew Cooper, le Thu 26 Dec 2013 22:17:38 +0100, a écrit : > I believe can make a topology-xen.c without too much trouble. It likely > wants to checked before an os-specific hook (Xen dom0's come in at least > Linux, FreeBSD, NetBSD flavours which have mainstream support) > Are there any hints/suggestion/information about how to go about > integrating this? Yes, you can probably play with plugin priorities for that. See for instance what happens with the pci plugins. > What is the policy with regards to linking against > new libraries by default (or perhaps by an --enable-xen configure > option)? By default we usually link against anything which is there, so linking against libxenctrl is fine. IIRC hypercalls through libxenctrl are reserved to root? We'd like to let normal users be able to get the topology... Samuel
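Editor's sketch of "playing with plugin priorities". The struct, the names and the numeric values below are illustrative assumptions, not hwloc's actual component descriptors; the point is only that a backend with a higher priority is tried before the OS-specific one.

```c
#include <stdlib.h>

/* Hypothetical component descriptor. */
struct component {
    const char *name;
    unsigned priority; /* assumption: higher priority is tried first */
};

/* qsort comparator: descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct component *ca = a;
    const struct component *cb = b;
    return (ca->priority < cb->priority) - (ca->priority > cb->priority);
}

/* Order components so that discovery can walk the array front to back. */
static void order_components(struct component *c, size_t n)
{
    qsort(c, n, sizeof(*c), by_priority_desc);
}
```

With a Xen entry given a priority above the Linux, FreeBSD and NetBSD entries, it would be consulted first whichever dom0 flavour is running.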
Re: [hwloc-devel] hwloc-1.8 patch
Pavan Balaji, le Fri 06 Dec 2013 00:34:30 +0100, a écrit : > Would you consider the following patch for hwloc-1.8 that we embed in the > mpich version of hwloc? The commit log has the description. Please ignore > the extra white-space piece of the commit. This is now committed in master and 1.8, thanks! Samuel
Re: [hwloc-devel] Relationship between Cario and X11
Jeff Squyres (jsquyres), le Fri 01 Nov 2013 18:03:55 +0100, a écrit : > On Nov 1, 2013, at 11:54 AM, Samuel Thibault <samuel.thiba...@inria.fr> wrote: > > > We could avoid Xutil.h and keysym.h by disabling the case KeyPress part, > > but I'd rather not: people will wonder why they don't have keyboard > > shortcut, and finding out from ./configure output will not be easy. > > When one has Xlib.h, having Xutil.h and keysym.h is not really far > > anyway. > > > Ok -- so you're saying we *require* all 3, right? With the current source code, yes. Samuel
Re: [hwloc-devel] Relationship between Cario and X11
Jeff Squyres (jsquyres), le Fri 01 Nov 2013 16:33:41 +0100, a écrit : > There's some funny m4 logic in the CHECK_HEADERS for X11. Let me make sure I > understand the intent: > > - X11/Xlib.h: this file is required for X11 support > - X11/Xutil.h X11/keysym.h: these files are optional for X11 support (i.e., > we can still build X11 support without them, but if we have them, there's > extra X11 goodies that can be used) > > Is that correct? Or do we *require* all 3 header files for X11 support? We could avoid Xutil.h and keysym.h by disabling the case KeyPress part, but I'd rather not: people will wonder why they don't have keyboard shortcut, and finding out from ./configure output will not be easy. When one has Xlib.h, having Xutil.h and keysym.h is not really far anyway. Samuel
Re: [hwloc-devel] Relationship between Cario and X11
Jeff Squyres (jsquyres), le Fri 01 Nov 2013 16:01:41 +0100, a écrit : > It looks like this logic isn't quite correct, anyway -- the X11 checks are > embedded in the Cairo and GL sections. Should they be moved out to be > independent of Cairo and GL (and therefore only once, and include the > AC_DEFINE for HWLOC_HAVE_X11)? Probably, yes. Samuel
Re: [hwloc-devel] Relationship between Cario and X11
Jeff Squyres (jsquyres), le Fri 01 Nov 2013 15:12:31 +0100, a écrit : > Cool. Does the following patch look ok? If so, I'll commit to master and > v1.7: Err, no, we really need to have HWLOC_HAVE_X11 defined when X11 is available, otherwise we won't get the graphical lstopo. Samuel
Re: [hwloc-devel] Relationship between Cario and X11
Hello, Jeff Squyres (jsquyres), le Fri 01 Nov 2013 14:59:03 +0100, a écrit : > I notice that we have an explicit dependency between Cairo and X11 in > configure: > > Is there any reason for this? I think if there was any it's now gone. > Indeed, I manually disabled this extra check in configure, and I can still > seem to use Cairo in lstopo (e.g., generate PDFs and PNGs). So the source code is already fine with it, good! > Are there some platforms where linking Cairo depends on X11? Possibly, but I believe it is hidden, or at least all handled by pkg-config, and so we don't care. Of course we need X11 for our x11 backend (which also happens to be using cairo). Samuel
Re: [hwloc-devel] nightly snapshot tarballs now available
Samuel Thibault, le Thu 03 Oct 2013 18:01:25 +0200, a écrit : > I have automatically imported the previous debian uploads. I don't > think we really need to keep the detailed commits; the diffs between > releases have been small enough. And that lets us switch to another layout, which is nicer to handle with git-buildpackage. Samuel
Re: [hwloc-devel] nightly snapshot tarballs now available
Jeff Squyres (jsquyres), le Thu 03 Oct 2013 12:38:05 +0200, a écrit : > On Oct 2, 2013, at 12:47 PM, Brice Goglin wrote: > > I can do it (assuming the "hwloc-svn-conversion" git tree it uptodate). > > But I need somebody to create the ompi/hwloc-debian repo on github and > > give us credentials. > > Done. I have automatically imported the previous debian uploads. I don't think we really need to keep the detailed commits; the diffs between releases have been small enough. Samuel
Re: [hwloc-devel] git / nightly builds
Jeff Squyres (jsquyres), le Fri 27 Sep 2013 15:36:33 +0200, a écrit : > a) The last SVN nightly snapshot on the v1.7 branch was named >hwloc-1.7.3rc1r5779.tar.bz2. > b) The first git nightly snapshot on the v1.7 branch will be named >hwloc-1.7.2-4-g3a6f84c.tar.bz2. > > Note "1.7.3rc1..." vs. "1.7.2...". I.e., the git name will say "we're X > commits beyond the 1.7.2 tag", but the old SVN name was "we're at this > *upcoming* version". I'm fine with it. Samuel
Re: [hwloc-devel] Git testing of hwloc
Jeff Squyres (jsquyres), le Sat 07 Sep 2013 00:04:13 +0200, a écrit : > What are your github IDs? sthibaul Samuel
Re: [hwloc-devel] hwloc-distrib - please add the option to distribute the jobs in the reverse direction
Brice Goglin, le Thu 29 Aug 2013 09:58:17 +0200, a écrit : > Anyway, reversing the loop just move the core you don't want to the end of the > list. But if you use the entire list, you end up using the exact same cores. He wants that, yes. Samuel
Re: [hwloc-devel] upcoming cleaning of headers and doc sections
Brice Goglin, le Thu 18 Jul 2013 14:10:28 +0200, a écrit : > * only put the prototypes in hwloc.h and keep the inline code somewhere else > * if some sections are obviously less important, keep these out of > hwloc.h (just like the ones in hwloc/helper.h currently) I'd say these two. Samuel
Re: [hwloc-devel] lstopo --top
Jiri Hladky, le Thu 20 Jun 2013 22:08:03 +0200, a écrit : > lstopo has obviously some logic how to sort the data inserted > by hwloc_topology_insert_misc_object_by_cpuset. Could be data displayed in the > same order as inserted? hwloc_topology_insert_misc_object_by_parent probably does that, you just need to replace the cpuset with an hwloc object. Samuel
Re: [hwloc-devel] lstopo --top
Hello, Jiri Hladky, le Tue 18 Jun 2013 17:18:15 +0200, a écrit : > I would like to check the possibilities to visualize the results to the output > similar to lstopo --top (see the attachment). I would like to write a simple > utility which will > * parse the above file > * foreach timestep create an output similar to lstopo --top output showing, > where each job was running It should be easy to do: create a program which - detects the topology as usual - for each of these lines: PID #CPU #CPU #CPU #CPU PID #CPU #CPU #CPU call hwloc_topology_insert_misc_object_by_cpuset(topology, cpuset, PID) - export the topology as xml file. and then for each job output, run it and run lstopo on the generated xml file. Samuel
Re: [hwloc-devel] plugins inside plugin broken, as expected
Brice Goglin, le Mon 03 Jun 2013 19:50:26 +0200, a écrit : > Le 03/06/2013 10:52, Samuel Thibault a écrit : > > Brice Goglin, le Mon 03 Jun 2013 10:46:49 +0200, a écrit : > >> hwloc/bitmap.h is the biggest problem, plugins should be allowed to use > >> all of them but there are many of them. Splitting hwloc-bitmap.so > >> out of hwloc.so would be an easy way to solve this. The bitmap API is > >> totally independent from the hwloc core anyway. > > Having a libhwloc-plugin-helper.so for most functions is probably the > > sanest way indeed. > > If both plugins and the core libhwloc use these functions, is there a > way to avoid having to pass both -lhwloc and -lhwloc-helper when linking > normal hwloc applications? If only plugins and the core use these functions, the application does not have to use -lhwloc-helper at all. If the application uses them (e.g. bitmap functions), then it would have to use -lhwloc-helper, but we can probably as well simply provide the symbols in both libhwloc-helper and libhwloc, so the application only needs -lhwloc. We can probably do that for the helpers we know for sure they have no state, such as bitmap functions. Samuel
Re: [hwloc-devel] plugins inside plugin broken, as expected
Brice Goglin, le Mon 03 Jun 2013 10:46:49 +0200, a écrit : > hwloc/bitmap.h is the biggest problem, plugins should be allowed to use > all of them but there are many of them. Splitting hwloc-bitmap.so > out of hwloc.so would be an easy way to solve this. The bitmap API is > totally independent from the hwloc core anyway. Having a libhwloc-plugin-helper.so for most functions is probably the sanest way indeed. Samuel
Re: [hwloc-devel] hwloc embedding vs libltdl
Jeff Squyres (jsquyres), le Wed 08 May 2013 02:21:02 +0200, a écrit : > On May 7, 2013, at 6:25 PM, Brice Goglin wrote: > > > I don't have anything against this. What was the reason for not using > > the default/system libltdl in OMPI? libtool was buggy when you started > > using it? > > > I neglected to answer this. > > Yes, plus libltdl grew new functionality that we needed (global/local symbol > visibility). > > We might be getting to the point soon where we can rely on the installed > libltdl to be new enough everywhere, but we haven't had that conversation. We could already check that the installed version is new enough for our needs. Samuel
Re: [hwloc-devel] hwloc-1.7 Warnings on FreeBSD
Pavan Balaji, le Fri 03 May 2013 06:45:10 +0200, a écrit : > -Wbad-function-cast'. > > lstopo-draw.c:437: warning: cast from function call of type 'double' to > non-matching type 'unsigned int' I'm not sure I understand what one is supposed to do here. double->float->unsigned is less precise than double->unsigned. I don't know of any standard function that does the double->unsigned conversion, hence the plain cast, which should already do exactly what we want anyway... Samuel
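Editor's sketch of one common way to keep -Wbad-function-cast quiet without changing the conversion. To my knowledge the warning only fires when the operand of the cast is itself a function call, so storing the result in a variable first performs the same truncation. scaled_width() is a made-up stand-in for whatever lstopo-draw.c calls on that line.

```c
/* Stand-in for a function returning double (hypothetical). */
static double scaled_width(double width, double factor)
{
    return width * factor;
}

static unsigned width_in_pixels(double width, double factor)
{
    /* Writing (unsigned) scaled_width(width, factor) directly is the
     * pattern the warning reports. Going through a double variable
     * keeps the direct double->unsigned conversion, with no detour
     * through float and no loss of precision beyond the truncation. */
    double w = scaled_width(width, factor);
    return (unsigned) w;
}
```

This does not change behaviour at all, which fits the point made above that the plain cast already did the right thing.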
Re: [hwloc-devel] RPATH issues when building in Fedora 18
Paul Hargrove, le Wed 24 Apr 2013 08:06:03 +0200, a écrit : > In my testing on Fedora 17, the patch below applied to hwloc-1.7 produces an > accurate sys_lib_dlsearch_path_spec > > --- config/libtool.m4~ 2013-04-07 16:29:21.0 -0700 > +++ config/libtool.m4 2013-04-23 22:43:52.88200 -0700 > @@ -2669,10 +2669,10 @@ > # before this can be enabled. > hardcode_into_libs=yes > > - # Append ld.so.conf contents to the search path > - if test -f /etc/ld.so.conf; then > - lt_ld_extra=`awk '/^include / { system(sprintf("cd /etc; cat %s 2>/dev/ > null", \[$]2)); skip = 1; } { if (!skip) print \[$]0; skip = 0; }' < /etc/ > ld.so.conf | $SED -e 's/#.*//;/^[ ]*hwcap[ ]/d;s/[:, ]/ /g;s/= > [^=]*$//;s/=[^= ]* / /g;s/"//g;/^$/d' | tr '\n' ' '` > - sys_lib_dlsearch_path_spec="/lib /usr/lib $lt_ld_extra" > + # Extract search path from ldconfig > + ldconfig_search_path=`/sbin/ldconfig -N -X -v 2>/dev/null|$SED > 's,^\(/.*\):\ > ( (.*)\)\?$,\1,p;d'|tr '\012' ' '` It looks better to use ldconfig's output than parsing its configuration files indeed (notably at least since configuration files now have include statements...) Samuel
Re: [hwloc-devel] Compiling hwloc 1.7 with NV support
Hello, Jiri Hladky, le Sat 20 Apr 2013 00:57:18 +0200, a écrit : > topology-gl.c: In function 'hwloc_gl_query_devices': > topology-gl.c:91:41: error: 'NV_CTRL_PCI_DOMAIN' undeclared (first use in this > > Indeed, there is no NV_CTRL_PCI_DOMAIN MACRO defined in NVCtrl header files: > > grep NV_CTRL_PCI_DOMAIN /usr/include/NVCtrl/NVCtrl* Which version of nvctrl do you have? I have it in Debian in version 304.88-1, at least. I guess older versions don't have it and we should check against it. > yum whatprovides "*/cuda_runtime_api.h" > > but without any luck. So it seems I can get rpm only with PCI and GL support. > > What's your opinion on it? Do you know what other Linux distros are doing? In debian cuda_runtime_api.h is provided by the non-free nvidia-cuda-dev package. Is cuda really packaged in RedHat? Samuel
Re: [hwloc-devel] Hardware locality (hwloc) v1.7rc1 released
Samuel Thibault, le Fri 05 Apr 2013 14:08:16 +0200, a écrit : > Samuel Thibault, le Fri 05 Apr 2013 09:11:31 +0200, a écrit : > > Brice Goglin, le Thu 04 Apr 2013 18:02:33 +0200, a écrit : > > > I haven't seen any problem on various Linux distribs, several BSDs, some > > > Solaris, and AIX 6.1. > > > > Same for me (including hp-ux). > > Ah, mic devices don't seem to get detected on a MIC cluster in Japan, > I'll dig a bit more. Ok, it was simply missing pci devel headers on that machine, installing them by hand and reconfiguring made the MIC device show up. Samuel
Re: [hwloc-devel] Hardware locality (hwloc) v1.7rc1 released
Samuel Thibault, le Fri 05 Apr 2013 09:11:31 +0200, a écrit : > Brice Goglin, le Thu 04 Apr 2013 18:02:33 +0200, a écrit : > > I haven't seen any problem on various Linux distribs, several BSDs, some > > Solaris, and AIX 6.1. > > Same for me (including hp-ux). Ah, mic devices don't seem to get detected on a MIC cluster in Japan, I'll dig a bit more. Samuel
Re: [hwloc-devel] Hardware locality (hwloc) v1.7rc1 released
Brice Goglin, le Thu 04 Apr 2013 18:02:33 +0200, a écrit : > I haven't seen any problem on various Linux distribs, several BSDs, some > Solaris, and AIX 6.1. Same for me (including hp-ux). Samuel
Re: [hwloc-devel] v1.7
Hello, I've just realized that this was never actually settled. I've updated my previous text to use the current syntax: Samuel Thibault, le Mon 07 Jan 2013 15:05:55 +0100, a écrit : > Brice Goglin, le Mon 31 Dec 2012 10:05:41 +0100, a écrit : > > + The HWLOC_COMPONENTS may now start with '-' to only define a list of > > components to exclude. > > I'm finding it not intuitive and not generic enough [...] > > It means that > > HWLOC_COMPONENTS=-cuda,opencl > > disables cuda *and* opencl, while intuition would have told me that it > disables cuda but enables opencl. > > Also, one would for instance want to be able to do this: > > HWLOC_COMPONENTS=x86,-cuda,-opencl,nvml > > To be able to enable x86 before the default linux, but disable cuda and > opencl, but enable nvml, as well as all the other usual plugins (without > having to know the list, which is important for future extensions). I thought we agreed that it would be useful to be able to do it, and using '-' instead of '^' was meant to avoid confusion with Open-MPI which has the previous behavior. Samuel
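Editor's sketch of the per-entry semantics argued for above, where '-' applies only to the name it prefixes. This is a hypothetical parser, not hwloc's implementation: struct component_list, parse_components() and the size limits are invented for the example.

```c
#include <string.h>

#define MAX_NAMES 16
#define NAME_LEN  32

/* Hypothetical result of parsing a component list. */
struct component_list {
    char enabled[MAX_NAMES][NAME_LEN];  /* explicit enables, in order */
    char excluded[MAX_NAMES][NAME_LEN]; /* names prefixed with '-' */
    int nr_enabled;
    int nr_excluded;
};

/* Split `spec` on commas. A name starting with '-' is an exclusion;
 * any other name is an explicit enable. Empty entries are ignored.
 * Returns 0 on success, -1 if a name is too long or a list is full. */
static int parse_components(const char *spec, struct component_list *out)
{
    out->nr_enabled = 0;
    out->nr_excluded = 0;

    while (*spec) {
        size_t len = strcspn(spec, ",");
        size_t skip = (len > 0 && spec[0] == '-') ? 1 : 0;
        const char *name = spec + skip;
        size_t name_len = len - skip;

        if (name_len > 0) {
            char (*dst)[NAME_LEN] = skip ? out->excluded : out->enabled;
            int *nr = skip ? &out->nr_excluded : &out->nr_enabled;

            if (name_len >= NAME_LEN || *nr >= MAX_NAMES)
                return -1;
            memcpy(dst[*nr], name, name_len);
            dst[*nr][name_len] = '\0';
            (*nr)++;
        }

        spec += len;
        if (*spec == ',')
            spec++;
    }
    return 0;
}
```

Under this reading, "-cuda,opencl" excludes cuda and explicitly enables opencl, and components that are not mentioned keep their default treatment.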
Re: [hwloc-devel] libpci: GPL
Christopher Samuel, le Tue 19 Feb 2013 05:30:40 +0100, a écrit : > On 06/02/13 10:29, Samuel Thibault wrote: > > > Right. If hwloc was strictly requiring a GPL library to be able > > to run, providing it under BSD would be questionable: > > I don't think that's true, the BSD license is compatible with the GPL > so it's not an issue. It's not an issue, but it's still questionable: if it needs some GPL code to be able to run, what is the point of providing it under BSD, since it'll always have to be GPL-ed on link? Samuel
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r5299 - in branches/bgq: config include/private src
Brice Goglin, le Fri 08 Feb 2013 13:34:01 +0100, a écrit : > Le 08/02/2013 13:26, Samuel Thibault a écrit : > > Brice Goglin, le Fri 08 Feb 2013 13:23:33 +0100, a écrit : > >> Le 08/02/2013 12:52, Samuel Thibault a écrit : > >>> svn-commit-mai...@open-mpi.org, le Fri 08 Feb 2013 12:02:18 +0100, a > >>> écrit : > >>>> Everything is hardwired in the backend, all nodes are the same. > >>> Would it be possible to check some file in /proc or /sys to identify the > >>> machine, to make sure we are not lying? > >> There's no such filesystems on BlueGene compute nodes. The CNK kernel > >> redirects I/O call to the I/O node (which runs Linux). > >> > >> I can check that uname reports "CNK" and "BGQ" if you're afraid of > >> people running hwloc/bgq on non-bgq machines. > > My concern is rather than within a few years, new BGQ machines get built > > with a differing architecture. Ideally we would find some > > identifier or version which will for sure change when IBM changes the > > architecture. > > If a new BlueGene is built, it won't be called Q again, they still have > 23 letters to use :) Ok, so we can use that as identifier :) Samuel
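Editor's sketch of the uname check discussed in this thread, using POSIX uname(). Which utsname fields carry "CNK" and "BGQ" is an assumption here (sysname and machine); the thread only says that uname reports both strings. The function names are hypothetical.

```c
#include <string.h>
#include <sys/utsname.h>

/* Pure helper so the comparison can be exercised on any machine.
 * Assumption: "CNK" appears in sysname and "BGQ" in machine. */
static int looks_like_bgq(const char *sysname, const char *machine)
{
    return strcmp(sysname, "CNK") == 0 && strcmp(machine, "BGQ") == 0;
}

/* Returns 1 when uname() matches, 0 otherwise (including on failure),
 * so a hardwired backend could refuse to run elsewhere. */
static int running_on_bgq(void)
{
    struct utsname u;

    if (uname(&u) < 0)
        return 0;
    return looks_like_bgq(u.sysname, u.machine);
}
```

Since a successor machine would carry a different letter, a check of this kind would also stop the hardwired topology from being reported on it.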
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r5299 - in branches/bgq: config include/private src
Brice Goglin, le Fri 08 Feb 2013 13:23:33 +0100, a écrit : > Le 08/02/2013 12:52, Samuel Thibault a écrit : > > svn-commit-mai...@open-mpi.org, le Fri 08 Feb 2013 12:02:18 +0100, a écrit : > >> Everything is hardwired in the backend, all nodes are the same. > > Would it be possible to check some file in /proc or /sys to identify the > > machine, to make sure we are not lying? > > There's no such filesystems on BlueGene compute nodes. The CNK kernel > redirects I/O call to the I/O node (which runs Linux). > > I can check that uname reports "CNK" and "BGQ" if you're afraid of > people running hwloc/bgq on non-bgq machines. My concern is rather that within a few years, new BGQ machines may get built with a different architecture. Ideally we would find some identifier or version which will for sure change when IBM changes the architecture. Samuel
Re: [hwloc-devel] libpci: GPL
Brice Goglin, le Wed 06 Feb 2013 07:11:03 +0100, a écrit : > Any idea why it doesn't find your nvidia card? Actually it is due to my use of bumblebee, which puts the graphic card into sleep when not running CUDA. Running lstopo through optirun brings the card back. It actually makes sense, I have to do the same with all applications using it. Samuel
Re: [hwloc-devel] libpci: GPL
Jeff Squyres (jsquyres), le Wed 06 Feb 2013 17:33:03 +0100, a écrit : > On Feb 6, 2013, at 8:21 AM, Samuel Thibault <samuel.thiba...@inria.fr> wrote: > > >> - if found, and if --enable-gpl-taint was specified, use it. STOP. > > > > Options of this kind are questionable: nothing guarantees that libpci is > > GPL. The system might have a BSD replacement for libpci with the exact same > > API... > > Do we know if this has happened? I don't think it has. On another level, as previously mentioned, an option of this kind would mean that we track GPL-ness, while we can't really promise that we can check the licence of the libraries we are linking with. Samuel
Re: [hwloc-devel] libpci: GPL
Brice Goglin, le Wed 06 Feb 2013 16:03:21 +0100, a écrit : > I am not sure yet if we should add a > --disable-gpl or --enable-gpl, Jeff Squyres (jsquyres), le Wed 06 Feb 2013 16:11:55 +0100, a écrit : > - if found, and if --enable-gpl-taint was specified, use it. STOP. Options of this kind are questionable: nothing guarantees that libpci is GPL. The system might have a BSD replacement for libpci with the exact same API... Samuel
Re: [hwloc-devel] libpci: GPL
Brice Goglin, le Wed 06 Feb 2013 07:11:03 +0100, a écrit : > Any idea why it doesn't find your nvidia card? Well, actually it does, but vendor id & co are all 0x > By the way, the contamination should be limited to the libpci plugin > when plugins are enabled. Right. Samuel
Re: [hwloc-devel] libpci: GPL
Samuel Thibault, le Wed 06 Feb 2013 01:55:18 +0100, a écrit : > Jeff Squyres (jsquyres), le Wed 06 Feb 2013 01:41:21 +0100, a écrit : > > On Feb 5, 2013, at 3:50 PM, Samuel Thibault <samuel.thiba...@inria.fr> > > wrote: > > > > > Jeff Squyres (jsquyres), le Tue 05 Feb 2013 22:52:01 +0100, a écrit : > > >> It was just pointed out to me that libpci is licensed under the GPL (not > > >> the LGPL). > > > > > > I'm told that we could use libpciaccess instead, which is BSD. > > > > That would be great -- is it easily available? > > Yes. I've made a quick port Here is a quick version (I haven't taken the time to handle the hwloc.m4 stuff). Samuel Index: src/topology-libpci.c === --- src/topology-libpci.c (révision 5235) +++ src/topology-libpci.c (copie de travail) @@ -1,7 +1,7 @@ /* * Copyright © 2009 CNRS * Copyright © 2009-2012 Inria. All rights reserved. - * Copyright © 2009-2011 Université Bordeaux 1 + * Copyright © 2009-2011, 2013 Université Bordeaux 1 * See COPYING in top-level directory. 
*/ @@ -14,7 +14,6 @@ #include #include -#include #include #include #include @@ -22,8 +21,92 @@ #include #include -#define CONFIG_SPACE_CACHESIZE 256 +#ifdef HWLOC_HAVE_LIBPCIACCESS +#include +#undef HWLOC_HAVE_LIBPCI +#endif +#ifdef HWLOC_HAVE_LIBPCI +#include +#endif + +#ifndef PCI_HEADER_TYPE +#define PCI_HEADER_TYPE 0x0e +#endif +#ifndef PCI_HEADER_TYPE_BRIDGE +#define PCI_HEADER_TYPE_BRIDGE 1 +#endif + +#ifndef PCI_CLASS_DEVICE +#define PCI_CLASS_DEVICE 0x0a +#endif +#ifndef PCI_CLASS_BRIDGE_PCI +#define PCI_CLASS_BRIDGE_PCI 0x0604 +#endif + +#ifndef PCI_REVISION_ID +#define PCI_REVISION_ID 0x08 +#endif + +#ifndef PCI_SUBSYSTEM_VENDOR_ID +#define PCI_SUBSYSTEM_VENDOR_ID 0x2c +#endif +#ifndef PCI_SUBSYSTEM_ID +#define PCI_SUBSYSTEM_ID 0x2e +#endif + +#ifndef PCI_PRIMARY_BUS +#define PCI_PRIMARY_BUS 0x18 +#endif +#ifndef PCI_SECONDARY_BUS +#define PCI_SECONDARY_BUS 0x19 +#endif +#ifndef PCI_SUBORDINATE_BUS +#define PCI_SUBORDINATE_BUS 0x1a +#endif + +#ifndef PCI_EXP_LNKSTA +#define PCI_EXP_LNKSTA 18 +#endif + +#ifndef PCI_EXP_LNKSTA_SPEED +#define PCI_EXP_LNKSTA_SPEED 0x000f +#endif +#ifndef PCI_EXP_LNKSTA_WIDTH +#define PCI_EXP_LNKSTA_WIDTH 0x03f0 +#endif + +#ifndef PCI_CAP_ID_EXP +#define PCI_CAP_ID_EXP 0x10 +#endif + +#ifndef PCI_CAP_NORMAL +#define PCI_CAP_NORMAL 1 +#endif + +#ifndef PCI_STATUS +#define PCI_STATUS 0x06 +#endif + +#ifndef PCI_CAPABILITY_LIST +#define PCI_CAPABILITY_LIST 0x34 +#endif + +#ifndef PCI_STATUS_CAP_LIST +#define PCI_STATUS_CAP_LIST 0x10 +#endif + +#ifndef PCI_CAP_LIST_ID +#define PCI_CAP_LIST_ID 0 +#endif + +#ifndef PCI_CAP_LIST_NEXT +#define PCI_CAP_LIST_NEXT 1 +#endif + +#define CONFIG_SPACE_CACHESIZE_TRY 256 +#define CONFIG_SPACE_CACHESIZE 64 + static void hwloc_pci_traverse_print_cb(void * cbdata __hwloc_attribute_unused, struct hwloc_obj *pcidev, int depth __hwloc_attribute_unused) @@ -290,6 +373,7 @@ return parent; } +#ifdef HWLOC_HAVE_LIBPCI /* Avoid letting libpci call exit(1) when no PCI bus is available. 
*/ static jmp_buf err_buf; static void @@ -308,15 +392,59 @@ hwloc_pci_warning(char *msg __hwloc_attribute_unused, ...) { } +#endif +#ifndef HWLOC_HAVE_PCI_FIND_CAP +static unsigned +hwloc_pci_find_cap(const unsigned char *config, size_t config_size, unsigned cap) +{ + unsigned char seen[256] = { 0 }; + unsigned char ptr; + + if (!(config[PCI_STATUS] & PCI_STATUS_CAP_LIST)) +return 0; + + for (ptr = config[PCI_CAPABILITY_LIST] & ~3; + ptr; + ptr = config[ptr + PCI_CAP_LIST_NEXT] & ~3) { +unsigned char id; + +if (ptr >= config_size) + return 0; + +/* Looped around! */ +if (seen[ptr]) + return 0; +seen[ptr] = 1; + +id = config[ptr + PCI_CAP_LIST_ID]; +if (id == cap) + return ptr; +if (id == 0xff) + break; + +if (ptr + PCI_CAP_LIST_NEXT >= config_size) + return 0; + } + return 0; +} +#endif + static int hwloc_look_libpci(struct hwloc_backend *backend) { struct hwloc_topology *topology = backend->topology; + struct hwloc_obj fakehostbridge; /* temporary object covering the whole PCI hierarchy until its complete */ + unsigned current_hostbridge; +#ifdef HWLOC_HAVE_LIBPCIACCESS + int ret; + struct pci_device_iterator *iter; + struct pci_device *pcidev; +#endif +#ifdef HWLOC_HAVE_LIBPCI struct pci_access *pciaccess; struct pci_dev *pcidev; - struct hwloc_obj fakehostbridge; /* temporary object covering the whole PCI hierarchy until its complete */ - unsigned current_hostbridge; +#endif if (!(hwloc_topology_get_flags(topology) & (HWLOC_TOPOLOGY_FLAG_IO_DEVICES|HWLOC_TOPOLOGY_FLAG_WHOLE_IO))) return 0; @@ -331,6 +459,16 @@ hwloc_debug("%s", "\nScanning PCI buses...\n"); +#ifdef
Re: [hwloc-devel] libpci: GPL
Jeff Squyres (jsquyres), le Wed 06 Feb 2013 01:41:21 +0100, a écrit : > On Feb 5, 2013, at 3:50 PM, Samuel Thibault <samuel.thiba...@inria.fr> wrote: > > > Jeff Squyres (jsquyres), le Tue 05 Feb 2013 22:52:01 +0100, a écrit : > >> It was just pointed out to me that libpci is licensed under the GPL (not > >> the LGPL). > > > > I'm told that we could use libpciaccess instead, which is BSD. > > That would be great -- is it easily available? Yes. I've made a quick port, it does work. libpciaccess is however a bit buggy (it doesn't find my nvidia card for instance), and does not support finding capabilities (but we can do this by hand). > Do we still want to offer a non-default configure switch (with > appropriate big, flashing banner in configure that says "YOU WILL BE > GPL!")? Samuel
Re: [hwloc-devel] libpci: GPL
Jeff Squyres (jsquyres), le Tue 05 Feb 2013 22:52:01 +0100, a écrit : > It was just pointed out to me that libpci is licensed under the GPL (not the > LGPL). I'm told that we could use libpciaccess instead, which is BSD. Samuel
Re: [hwloc-devel] libpci: GPL
Pavan Balaji, le Tue 05 Feb 2013 23:53:54 +0100, a écrit : > I checked libnuma, which seems to be LGPL (phew!), but didn't look at > the remaining libraries hwloc uses. The base of hwloc needs libm/libc (LGPL), plugin support needs libltdl (LGPL) and libdl (LGPL). Samuel
Re: [hwloc-devel] [mpich-core] libpci: GPL
Pavan Balaji, le Wed 06 Feb 2013 00:07:10 +0100, a écrit : > > On 02/05/2013 04:52 PM US Central Time, Pavan Balaji wrote: > > If libpci was disabled by default, would hwloc still come under the same > > GPL issue? > > I realized that wasn't very clear. Let me rephrase -- if libpci was > disabled (either by default or a configure argument), would hwloc still > be considered "tainted". I'd think not. What matters is the resulting binary. If libpci is linked in, the result is GPL. If it isn't, the result remains BSD. The former however does not change the fact that the source code is provided under BSD. It's the person who does the compilation who needs to be aware of the licence of the result. We don't provide binaries linked against libpci, so we don't strictly have to care. Samuel
Re: [hwloc-devel] libpci: GPL
Jeff Squyres (jsquyres), le Tue 05 Feb 2013 22:52:01 +0100, a écrit : > The complaint to me was that hwloc needs to be clearer about this in its > documentation. > > Does this sound right? It makes sense that we warn about this, yes, so people know they might want to pass --disable-pci. Samuel
Re: [hwloc-devel] v1.7
Jeff Squyres (jsquyres), le Mon 07 Jan 2013 20:01:44 +0100, a écrit : > So if you don't know the list of available components, is it not possible to > specify *only* foo and bar should be used? foo,bar,stop will do it. Samuel
Re: [hwloc-devel] v1.7
Brice Goglin, le Mon 07 Jan 2013 19:11:02 +0100, a écrit : > BTW, if we change the hwloc syntax, we may want to not use ^ to avoid > confusion with OMPI. ~ and ! could work but some shells may not like them? How about '-'? I doubt anybody would want a plugin name starting with it. Samuel
Re: [hwloc-devel] v1.7
Jeff Squyres (jsquyres), le Mon 07 Jan 2013 19:19:15 +0100, a écrit : > On Jan 7, 2013, at 12:59 PM, Samuel Thibault <samuel.thiba...@inria.fr> > > Because I may not know *everything* that I want. Who knows what > > proprietary plugin I need to use to discover CPUs, while I know that for > > GPUs I can use cuda, but I don't want to use nvml. > > > >> Taking your example: HWLOC_COMPONENTS=foo,^bar,^baz,yow > >> Is the same as: HWLOC_COMPONENTS=foo,yow > > > > No, because there is also the implicit "and the default plugins" after > > that. > > So you're really saying "not bar and baz, but I do want everything else." I'm also saying "foo and yow before everything else", which as Brice mentioned, does matter. > - if foo doesn't load / isn't used, it's an error > - don't load bar > - don't load baz > - if yow doesn't load / isn't used, it's an error > - try to load all other components, but don't warn/error if they don't load / > aren't used We don't imply erroring out. Components never error out, they just don't discover anything :) What we however have is the ordering. Samuel
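The semantics discussed here (explicit names first and in order, `^name` exclusions, then the implicit defaults unless `stop` is given) can be sketched as below. This is only an illustration of the proposal under discussion, not hwloc's actual parser: `resolve_components`, `name_in`, the size limits and the default list passed in are all hypothetical.

```c
#include <string.h>

#define MAX_COMPONENTS 16
#define MAX_NAME 32

/* Return 1 if `name` is among the first `n` entries of `names`. */
static int
name_in(char names[][MAX_NAME], int n, const char *name)
{
  int i;
  for (i = 0; i < n; i++)
    if (!strcmp(names[i], name))
      return 1;
  return 0;
}

/* Hypothetical resolver: fill `out` with the ordered component list
 * described by `spec`, and return the number of entries.
 *  - "name"  enables the component, in the order given;
 *  - "^name" excludes the component from the implicit defaults;
 *  - "stop"  (anywhere in the list) drops the implicit defaults. */
static int
resolve_components(const char *spec, const char *const *defaults,
                   int ndefaults, char out[][MAX_NAME])
{
  char excluded[MAX_COMPONENTS][MAX_NAME];
  int nexcluded = 0, nout = 0, stop = 0, i;
  const char *p = spec;

  while (*p) {
    size_t len = strcspn(p, ",");
    char token[MAX_NAME];

    /* Empty or over-long tokens are silently skipped. */
    if (len > 0 && len < MAX_NAME) {
      memcpy(token, p, len);
      token[len] = '\0';
      if (!strcmp(token, "stop")) {
        stop = 1;
      } else if (token[0] == '^') {
        if (nexcluded < MAX_COMPONENTS)
          strcpy(excluded[nexcluded++], token + 1);
      } else if (nout < MAX_COMPONENTS && !name_in(out, nout, token)) {
        strcpy(out[nout++], token);
      }
    }
    p += len;
    if (*p == ',')
      p++;
  }

  /* Implicit "and the default plugins" after the explicit ones. */
  if (!stop)
    for (i = 0; i < ndefaults; i++)
      if (nout < MAX_COMPONENTS
          && strlen(defaults[i]) < MAX_NAME
          && !name_in(excluded, nexcluded, defaults[i])
          && !name_in(out, nout, defaults[i]))
        strcpy(out[nout++], defaults[i]);

  return nout;
}
```

With hypothetical defaults `bar`, `baz`, `linux`, the string `foo,^bar,^baz,yow` resolves to `foo`, `yow`, `linux`, whereas `foo,bar,stop` resolves to just `foo`, `bar`. Nothing errors out if a named component does not exist, which matches the point that components simply discover nothing.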
Re: [hwloc-devel] v1.7
Brice Goglin, le Mon 07 Jan 2013 17:33:47 +0100, a écrit : > Ideally, we could even have a OS device for each OpenCL platform, each > containing OS devices for devices of the platform. But I'd rather keep a > single level to match other OS devices. In most cases the platform object wouldn't bring much information to the topology anyway. Or worse, make it impossible, e.g. when the NVIDIA OpenCL platform drives several GPUs on different PCI cards or even NUMA nodes. Samuel
Re: [hwloc-devel] v1.7
Hello, Brice Goglin, le Mon 31 Dec 2012 10:05:41 +0100, a écrit : > + The HWLOC_COMPONENTS may now start with '^' to only define a list of > components to exclude. I find it neither intuitive nor generic enough, and I wonder how that didn't affect Open-MPI, which IIUC uses this convention. It means that HWLOC_COMPONENTS=^cuda,opencl disables cuda *and* opencl, while intuition would have told me that it disables cuda but enables opencl. Also, one would for instance want to be able to do this: HWLOC_COMPONENTS=x86,^cuda,^opencl,nvml That is, to enable x86 before the default linux, disable cuda and opencl, but enable nvml as well as all the other usual plugins (without having to know the list, which is important for future extensions). Samuel
Re: [hwloc-devel] v1.7
Brice Goglin, le Mon 31 Dec 2012 10:05:41 +0100, a écrit : > - They add OS devices such as opencl0p0, I see that platform 0 device 3 would be called opencl3p0. I find it counterintuitive, and would have rather called it opencl0d3, along the line of sda3, eth0:3, socket:2.core:0, etc. What do people think? Samuel
Re: [hwloc-devel] plugins update
Brice Goglin, le Thu 08 Nov 2012 13:48:39 +0100, a écrit : > Did anybody at least look at the documentation links below? Yep, it looks nice! Samuel
Re: [hwloc-devel] [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware
Sebastian Kuzminsky wrote: > Maybe lstopo should expand its cpuset to be fully inclusive at startup? I'll > be happy to test patches if you want. Brice Goglin, le Thu 11 Oct 2012 18:13:53 +0200, a écrit : > Is the cpuset-modification a root-only operation on FreeBSD? If so lstopo > wouldn't be able to expand the cpuset at startup. > > lstopo has a --whole-system option to ignore such limitations. Unfortunately > the x86 backend that hwloc uses on FreeBSD requires that we bind to each > individual core to get its locality information, so that won't help unless > lstopo can indeed remove the cpuset first. Indeed. Also, we probably want to save the current cpuset before modifying it, in order to be able to restore it. I don't think we want to see libhwloc drop the current cpuset, even if only under whole-system flag condition. Samuel
Re: [hwloc-devel] merging plugins?
Brice Goglin, le Tue 25 Sep 2012 11:08:04 +0200, a écrit : > >> We have the "core_xml" component (generic xml support) and "xml_libxml" > >> + "xml_nolibxml" backends behind that. I am fine with removing the > >> "core_" prefix, but I wonder if we should keep the "xml_" prefix for the > >> latter. > > I'd say we should keep it. Just like I wanted to use core_linux_x86 (as > > opposed to core_linux) > > Keep which one? "core" or "core" and "xml" ? xml. > > Well, that still looks hardcoded to me. Actually, a simple way would > > be to order all plugins in just one list by priorities. When loading a > > plugin, one checks whether the exclusion point of the plugin was > > already filled or not, and loads the plugin accordingly > > The good thing about this is that XML and synthetic can set exclusion > flags on OS+PCI+ADDITIONAL. > But we obviously don't want cuda, ... to set the ADDITIONAL exclusion > flag. So setting an exclusion flag would mean "I don't want any plugin of > this type to be enabled" (different from "I don't want any plugin with > this exclusion flag to be set"). Right, like in Debian packages, there are separate "provides" and "conflicts". Usually one both provides and conflicts, but one can do either separately. Samuel
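The "provides"/"conflicts" idea with a single priority-ordered list can be sketched as follows. This is a hypothetical model of the scheme being discussed, not hwloc code: the `EXCL_*` flags, `struct plugin`, the priorities and the function names are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical exclusion points. */
enum {
  EXCL_OS         = 1u << 0,
  EXCL_PCI        = 1u << 1,
  EXCL_ADDITIONAL = 1u << 2
};

struct plugin {
  const char *name;
  unsigned priority;   /* higher is tried first */
  unsigned provides;   /* what kind of plugin this is */
  unsigned conflicts;  /* kinds that must not be enabled after it */
  int enabled;
};

/* qsort comparator: descending priority. */
static int
by_priority_desc(const void *a, const void *b)
{
  const struct plugin *pa = a, *pb = b;
  return (pa->priority < pb->priority) - (pa->priority > pb->priority);
}

/* Walk plugins by priority; enable each one unless an earlier plugin
 * declared a conflict with what it provides. Returns the number enabled. */
static unsigned
enable_plugins(struct plugin *plugins, size_t n)
{
  unsigned excluded = 0, count = 0;
  size_t i;

  qsort(plugins, n, sizeof(*plugins), by_priority_desc);
  for (i = 0; i < n; i++) {
    if (plugins[i].provides & excluded) {
      plugins[i].enabled = 0;
    } else {
      plugins[i].enabled = 1;
      excluded |= plugins[i].conflicts;
      count++;
    }
  }
  return count;
}

/* Look a plugin up by name after enable_plugins() has run. */
static int
is_enabled(const struct plugin *plugins, size_t n, const char *name)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (!strcmp(plugins[i].name, name))
      return plugins[i].enabled;
  return 0;
}
```

In this model an XML-like plugin would provide OS and conflict with OS, PCI and ADDITIONAL, so nothing else runs after it, while a cuda-like plugin would provide ADDITIONAL and conflict with nothing, so several such plugins can coexist.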
Re: [hwloc-devel] merging plugins?
Brice Goglin, le Tue 25 Sep 2012 10:34:29 +0200, a écrit : > I am also going to add a hwloc_ prefix to plugin filenames because we > obviously can't create a libpci.so (libtool even warns about this). And it makes things clearer, I believe. > XML backends could be hwlocxml_ (not hwloc_xml_) to make it clear that > they are not normal hwloc_ plugins. Well, from the detection point of view they are: they simply replace all other detection plugins (i.e. provide "cpu" and "pci" exclusion points, but not "bind"), and third parties could want to do the same with their own plugin. Samuel
Re: [hwloc-devel] merging plugins?
Brice Goglin, le Tue 25 Sep 2012 07:41:48 +0200, a écrit : > * Your HWLOC_PLUGINS variable is not about loading plugins, it's about > enabling core components. It could also be to use another PCI detection plugin than libpci. Samuel
Re: [hwloc-devel] merging plugins?
Hello, Brice Goglin, le Mon 24 Sep 2012 22:04:14 +0200, a écrit : > 1) A rework of the backend infrastructure to make the core much more > readable (basically all changes in *.[ch] files). That looks nicer indeed. > 2) Plugin support One thing that doesn't seem implemented yet is to choose another OS core plugin, e.g. to use x86 detection on Linux instead of /proc or /sys detection. This will be the same kind of thing with likwid- or servet-based OS core plugins. I got the x86 detection code enabled with the attached code, which should be reproducible on other OSes which support CPU binding. How does it look? Samuel Index: src/Makefile.am === --- src/Makefile.am (révision 4846) +++ src/Makefile.am (copie de travail) @@ -75,6 +75,11 @@ if HWLOC_HAVE_LINUX sources += topology-linux.c +plugins_LTLIBRARIES += core_linuxx86.la +core_linuxx86_la_SOURCES = topology-linux-x86.c +core_linuxx86_la_CPPFLAGS = $(AM_CPPFLAGS) -DHWLOC_BUILD_PLUGIN +core_linuxx86_la_CFLAGS = $(AM_CFLAGS) +core_linuxx86_la_LDFLAGS = $(plugins_ldflags) endif HWLOC_HAVE_LINUX if HWLOC_HAVE_AIX Index: src/topology-linux-x86.c === --- src/topology-linux-x86.c(révision 0) +++ src/topology-linux-x86.c(révision 0) @@ -0,0 +1,62 @@ +/* + * Copyright © 2012 Université Bordeaux 1 + * See COPYING in top-level directory. 
+ */ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#if defined(HWLOC_HAVE_CPUID) +static int +hwloc_look_linux_x86(struct hwloc_topology *topology) +{ + unsigned nbprocs = hwloc_fallback_nbprocessors(topology); + + hwloc_alloc_obj_cpusets(topology->levels[0][0]); + hwloc_setup_pu_level(topology, nbprocs); + hwloc_set_linux_hooks(topology); + hwloc_look_x86(topology, nbprocs); + + return 1; +} + +static int +hwloc_linux_x86_component_instantiate(struct hwloc_topology *topology, + struct hwloc_core_component *component, + const void *_data1, + const void *_data2 __hwloc_attribute_unused, + const void *_data3 __hwloc_attribute_unused) +{ + struct hwloc_backend *backend = _hwloc_linux_component_instantiate(topology, component, _data1, _data2, _data3); + if (backend) { +backend->discover = hwloc_look_linux_x86; +hwloc_backend_enable(topology, backend); +return 0; + } + return -1; +} + +static struct hwloc_core_component hwloc_linuxx86_core_component = { + HWLOC_CORE_COMPONENT_TYPE_OS, + "linuxx86", + hwloc_linux_x86_component_instantiate, + hwloc_set_linux_hooks, + 10, + NULL +}; + +const struct hwloc_component hwloc_core_linuxx86_component = { + HWLOC_COMPONENT_ABI, + HWLOC_COMPONENT_TYPE_CORE, + 0, + _linuxx86_core_component +}; +#endif Index: src/topology-linux.c === --- src/topology-linux.c(révision 4846) +++ src/topology-linux.c(copie de travail) @@ -3341,8 +3341,8 @@ return 1; } -static void -hwloc_set_linuxfs_hooks(struct hwloc_topology *topology) +void +hwloc_set_linux_hooks(struct hwloc_topology *topology) { topology->set_thisthread_cpubind = hwloc_linux_set_thisthread_cpubind; topology->get_thisthread_cpubind = hwloc_linux_get_thisthread_cpubind; @@ -3647,7 +3647,7 @@ return res; } -static int +int hwloc_linux_backend_notify_new_object(struct hwloc_topology *topology, struct hwloc_obj *obj) { struct hwloc_linux_backend_data_s *data = topology->backend->private_data; @@ -3678,7 +3678,7 @@ return res; } -static 
int +int hwloc_linux_backend_get_obj_cpuset(struct hwloc_topology *topology, struct hwloc_obj *obj, hwloc_bitmap_t cpuset) { @@ -3713,7 +3713,7 @@ return -1; } -static void +void hwloc_linux_backend_disable(struct hwloc_topology *topology __hwloc_attribute_unused, struct hwloc_backend *backend) { @@ -3726,8 +3726,8 @@ free(data); } -static int -hwloc_linux_component_instantiate(struct hwloc_topology *topology, +struct hwloc_backend * +_hwloc_linux_component_instantiate(struct hwloc_topology *topology, struct hwloc_core_component *component, const void *_data1, const void *_data2 __hwloc_attribute_unused, @@ -3774,14 +3774,29 @@ #endif data->root_fd = root; - hwloc_backend_enable(topology, backend); - return 0; + return backend; out_with_data: free(data); out_with_backend: free(backend); out: + return NULL; +} + +static int +hwloc_linux_component_instantiate(struct hwloc_topology *topology, + struct hwloc_core_component *component, + const void
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4815 - branches/components/src
Jeff Squyres, le Thu 06 Sep 2012 15:46:29 +0200, a écrit : > On Sep 6, 2012, at 7:46 AM, Jeff Squyres wrote: > > (sorry; I forgot to ping Shiqing yesterday -- I just did so now to get a > > confirmation of what you found) > > > Shiqing confirms that DSOs are disabled by default on Windows. However, he > says that they do actually work if you enable them. Mmm, I wonder how. Actually now I remember, it wasn't Darwin that was a problem without -no-undefined, but Windows, which requires it for shared libraries, and thus requires -lhwloc. Samuel
Re: [hwloc-devel] -lhwloc in components.
Jeff Squyres, le Wed 05 Sep 2012 17:06:00 +0200, a écrit : > On Sep 5, 2012, at 10:21 AM, Samuel Thibault wrote: > > > So ltdl does not help for that matter? > > No. It's not really an ltdl issue. ltdl is just a portable wrapper around > OS-specific dlopen-like mechanisms. I understand that, but dlopen is usually used for plugins, and plugins usually need such kind of calling back into what loaded the plugin. > > One way would be to pass to the component a structure with all the > > useful function pointers (using #define to keep the same source code). > > We thought about this in OMPI and decided it would be a nightmare in the > source code. The source code shouldn't need to be modified: #define hwloc_foo_bar(arg1, arg2) hwloc_funcs->foo_bar(arg1, arg2) Samuel
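A minimal sketch of the function-pointer-table approach, assuming a hypothetical `struct core_funcs` and a hypothetical `plugin_init` entry point (none of these names come from hwloc). Both sides are shown in one file for brevity; in practice the plugin half would live in a separate shared object that never links against the core.

```c
/* Hypothetical table of core entry points handed to each plugin. */
struct core_funcs {
  int (*foo_bar)(int arg1, int arg2);
};

/* ---- core side ---- */

/* Stand-in for a real core function; here it just adds its arguments. */
static int
core_foo_bar(int arg1, int arg2)
{
  return arg1 + arg2;
}

static const struct core_funcs core_table = { core_foo_bar };

/* ---- plugin side ---- */

/* Filled in when the core loads the plugin. */
static const struct core_funcs *hwloc_funcs;

/* The macro from the message: plugin source keeps calling
 * hwloc_foo_bar() as if it were linked against the core. */
#define hwloc_foo_bar(arg1, arg2) hwloc_funcs->foo_bar(arg1, arg2)

/* Hypothetical entry point called by the core right after dlopen(). */
static void
plugin_init(const struct core_funcs *funcs)
{
  hwloc_funcs = funcs;
}

/* Ordinary plugin code, unchanged apart from the macro above. */
static int
plugin_discover(void)
{
  return hwloc_foo_bar(2, 3);
}
```

The core would call `plugin_init(&core_table)` once after loading the plugin; from then on the plugin has no undefined symbols to resolve against the core, so it needs neither -lhwloc nor a platform that tolerates undefined symbols.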
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4815 - branches/components/src
Brice Goglin, le Wed 05 Sep 2012 16:13:31 +0200, a écrit : > The problem I was trying to fix below is that linking hwloc plugins on > Darwin failed because plugins referred to hwloc-core symbols. Nothing on > the libtool command-line said where to find those symbols (I don't > understand why it worked on other platforms). Because on other platforms, undefined symbols are allowed. > I added -lhwloc as a way to tell the linker "those symbols are there". > I didn't think it would statically link libhwloc inside the plugins, > and it doesn't seem to do so (from what I see in objdump). Is this > what you mean? No, he means that it'll also make the loader load libhwloc.so. Even if the application linked libhwloc.a statically. > It's really a problem when linking, not about loading But it has effects on loading. Samuel
Re: [hwloc-devel] backends and plugins
Brice Goglin, le Wed 22 Aug 2012 07:52:07 +0200, a écrit : > Le 21/08/2012 21:18, Samuel Thibault a écrit : > > Brice Goglin, le Tue 21 Aug 2012 18:49:48 +0200, a écrit : > >> 1) We load plugins and list existing components once per topology. We > >> should do it only once per process. But that requires some locking in > >> case multiple topologies are loaded simultaneously, which means we need > >> thread-safety. Do we want pthread_mutex() for this? > > I'd say so. We can test in configure.ac whether -lpthread is really > > needed for that (it is not on most systems, which optimizes things away > > in non-libpthread cases). > > So pthread_mutex() is always available, at least with -lpthread, on all > platforms we support? IIRC yes. For windows that needs a recent enough version of mingw, but that should be fine. Samuel
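Loading the component list once per process behind a mutex could look like the sketch below. It is only an illustration under stated assumptions: `components_init`, `load_all_components` and the counters are hypothetical names, and the loader body is a placeholder. `pthread_once()` would be another standard way to obtain the same effect.

```c
#include <pthread.h>

static pthread_mutex_t components_mutex = PTHREAD_MUTEX_INITIALIZER;
static int components_loaded = 0;
static unsigned load_count = 0;  /* only here to make the effect visible */

/* Hypothetical placeholder for "dlopen the plugins and list components". */
static void
load_all_components(void)
{
  load_count++;
}

/* Called at the start of every topology load; the actual work is done
 * by the first caller only, even if several threads race here. */
static void
components_init(void)
{
  pthread_mutex_lock(&components_mutex);
  if (!components_loaded) {
    load_all_components();
    components_loaded = 1;
  }
  pthread_mutex_unlock(&components_mutex);
}
```

Because the flag is only read and written with the mutex held, two topologies loaded simultaneously from different threads cannot both run the loader.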
Re: [hwloc-devel] hwloc_bitmap_or
Pavan Balaji, le Sat 21 Jul 2012 04:40:22 +0200, a écrit : > In hwloc_bitmap_or(), is the resultant bitmap allowed to be the same as one > of the input bitmaps? It seems to work correctly in practice, but the API > doesn't seem to explicitly guarantee it. We actually use it a lot in the core. I'll add comments in the doc. Samuel
Re: [hwloc-devel] XML string filtering
Brice Goglin, le Fri 06 Jul 2012 14:50:46 +0200, a écrit : > I could just use isprint() to check every character > before export and only keep those between 32 and 127. Why not just take 32-126? I don't see what isprint would bring. > But what about \n, > \t, \r, \f which come before? Do we want to allow them? I'd say so if they are allowed in the XML standard (I don't know whether they are). Samuel
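For reference, XML 1.0 permits tab, line feed and carriage return among the control characters, but not form feed. A filter along the lines discussed (keep 32-126 plus the permitted whitespace controls) might look like this sketch; `xml_char_ok` and `xml_filter_string` are hypothetical names, not hwloc functions, and dropping rather than escaping the rejected bytes is an assumption.

```c
#include <stddef.h>

/* Keep printable ASCII (32-126) plus the control characters that
 * XML 1.0 allows: tab, line feed and carriage return. */
static int
xml_char_ok(unsigned char c)
{
  return (c >= 32 && c <= 126) || c == '\t' || c == '\n' || c == '\r';
}

/* Copy `src` into `dst`, dropping every byte rejected by xml_char_ok().
 * `dst` must have room for strlen(src) + 1 bytes.
 * Returns the number of bytes kept, not counting the terminating NUL. */
static size_t
xml_filter_string(char *dst, const char *src)
{
  size_t kept = 0;

  for (; *src; src++)
    if (xml_char_ok((unsigned char) *src))
      dst[kept++] = *src;
  dst[kept] = '\0';
  return kept;
}
```

An explicit range test avoids isprint(), whose result depends on the current locale.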
Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4554 - trunk/utils
Brice Goglin, le Fri 29 Jun 2012 22:29:20 +0200, a écrit : > Arg, I was using the console output (hwloc-nox). Ah, you should be able to use -.txt then, it colorizes there too. Samuel