Re: [hwloc-users] [EXTERNAL] Re: How to show all cores in lstopo output?

2024-06-25 Thread Brice Goglin
You may also hit the 'f' key from the graphical X11 output to toggle factorizing of cores and collapsing of PCI devices (those shortcuts are shown in the terminal text output while the graphical window is running). Brice Le 25/06/2024 à 21:34, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND

Re: [hwloc-users] MI300A support

2024-02-08 Thread Brice Goglin
Hello I don't have access to a MI300A but I worked with AMD several months ago to solve a very similar issue. It was caused by a buggy ACPI HMAT in the BIOS. Try setting HWLOC_USE_NUMA_DISTANCES=0 in the environment to disable the hwloc code that uses this HMAT info. If the warning goes away
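
A minimal sketch of the workaround suggested above, assuming a hwloc tool is launched from a Python script. The helper name `hwloc_env_without_hmat` is hypothetical; only the variable HWLOC_USE_NUMA_DISTANCES=0 comes from the message.

```python
import os


def hwloc_env_without_hmat(base_env=None):
    """Return a copy of the environment with HWLOC_USE_NUMA_DISTANCES=0.

    base_env defaults to the current process environment. The helper
    name is hypothetical; the variable and its value are the ones
    quoted in the message above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HWLOC_USE_NUMA_DISTANCES"] = "0"
    return env
```

The returned dictionary can then be passed as `env=` to `subprocess.run(["lstopo"], ...)` to check whether the warning disappears, without changing the environment of the calling script.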

Re: [hwloc-users] Support for Intel's hybrid architecture - can I restrict hwloc-distrib to P cores only?

2023-11-24 Thread Brice Goglin
On 24/11/2023 at 08:51, John Hearns wrote: Good question.  Maybe not an answer referring to hwloc. When managing a large NUMA machine, SGI UV, I ran the OS processes in a boot cpuset which was restricted to (AFAIR) the first 8 CPUs. On Intel architectures with E and P cores could we think of r

Re: [hwloc-users] Support for Intel's hybrid architecture - can I restrict hwloc-distrib to P cores only?

2023-11-24 Thread Brice Goglin
Le 23/11/2023 à 19:29, Jirka Hladky a écrit : Hi Brice, I have a question about the hwloc's support for Intel's hybrid architectures, like in Alder Lake CPUs: https://en.wikipedia.org/wiki/Alder_Lake There are P (performance) and E (efficiency) cores. Is hwloc able to detect which core is wh

Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Brice Goglin
ell account for sundry testing, mostly of build procedures. Is there anything I could do to get hwloc to work? Regards, Max --- On Wed, Aug 02, 2023 at 03:12:27PM +0200, Brice Goglin wrote: Hello There's something wrong in this machine. It exposes 4 cores (number 0 to 3) and no NUMA nod

Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Brice Goglin
PU L#0 (P#0) PU L#1 (P#1) HostBridge PCI 00:03.0 (Other) Block(Disk) "sda" PCI 00:04.0 (Ethernet) Net "ens4" PCI 00:05.0 (Other) (from which I conclude my build procedure is correct). At the suggestion of Brice Goglin (in response t

Re: [hwloc-users] Memory-Binding causes storing to swap before main memory is filled

2022-09-26 Thread Brice Goglin
Hello If I understand correctly, your allocations are going to swap as soon as you use more than half of your available local RAM? I don't see any reason for this unless some additional limit is preventing you from using more. Is there any chance your job was allocated with less memory (for i

Re: [hwloc-users] Problems with binding memory

2022-03-02 Thread Brice Goglin
On 02/03/2022 at 12:31, Mike wrote: Hello, Can you display both masks before set_area_membind and after get_area_membind and send the entire output of all processes and threads? If you can prefix the line with the PID, it'd help a lot :) What do you mean with output of all process

Re: [hwloc-users] Problems with binding memory

2022-03-02 Thread Brice Goglin
Le 02/03/2022 à 11:38, Mike a écrit : Hello, If you print the set that is built before calling set_area_membind, you should only see 4 bits in there, right? (since threadcount=4 in your code) I'd say 0xf for rank0, 0xf0 for rank1, etc. set_area_membind() will translate that
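
The expected masks quoted above (0xf for rank0, 0xf0 for rank1 with threadcount=4) can be reproduced with plain integer arithmetic. This is only an illustration of the bit pattern, assuming each rank owns `threadcount` consecutive PUs; it does not call hwloc, and the function name is hypothetical.

```python
def rank_mask(rank, threadcount):
    """Bitmask with `threadcount` consecutive bits set for a given rank.

    Assumption (not stated in full in the thread): rank r uses PUs
    r*threadcount .. (r+1)*threadcount - 1.
    """
    block = (1 << threadcount) - 1       # threadcount low bits set
    return block << (rank * threadcount)  # shift to this rank's slot


for r in range(3):
    print(f"rank{r}: {rank_mask(r, 4):#x}")
```

Printing the set in this form before calling set_area_membind makes it easy to see whether a rank's mask contains more bits than intended.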

Re: [hwloc-users] Problems with binding memory

2022-03-02 Thread Brice Goglin
what threadcount means in your code. Are you calling the allocate function multiple times with many different ranks? (MPI ranks?) Brice Mike On Wed, 2 March 2022 at 09:53, Brice Goglin mailto:brice.gog...@inria.fr>> wrote: On 02/03/2022 at 09:39, Mike wrote:

Re: [hwloc-users] Problems with binding memory

2022-03-02 Thread Brice Goglin
Le 02/03/2022 à 09:39, Mike a écrit : Hello, Please run "lstopo -.synthetic" to compress the output a lot. I will be able to reuse it from here and understand your binding mask. Package:2 [NUMANode(memory=270369247232)] L3Cache:8(size=33554432) L2Cache:8(size=524288) L1dCache:1(size=32

Re: [hwloc-users] Problems with binding memory

2022-03-01 Thread Brice Goglin
Le 01/03/2022 à 17:34, Mike a écrit : Hello, Usually you would rather allocate and bind at the same time so that the memory doesn't need to be migrated when bound. However, if you do not touch the memory after allocation, pages are not actually physically allocated, hence there'

Re: [hwloc-users] Problems with binding memory

2022-03-01 Thread Brice Goglin
Le 01/03/2022 à 15:17, Mike a écrit : Dear list, I have a program that utilizes Openmpi + multithreading and I want the freedom to decide on which hardware cores my threads should run. By using hwloc_set_cpubind() that already works, so now I also want to bind memory to the hardware cores.

Re: [hwloc-users] [OMPI users] hwloc error

2021-08-23 Thread Brice Goglin
.open-mpi.org; Brice Goglin *Subject:* RE: [OMPI users] hwloc error Hello Brice Thanks for your reply. I forgot to mention that my machine is a Windows one and not Linux. I did download the new version of hwloc. Could you brief me on the steps for installing it? Are the steps similar to this?

Re: [hwloc-users] Build an OS-X Universal version

2021-03-23 Thread Brice Goglin
Le 23/03/2021 à 08:08, Brice Goglin a écrit : > Le 23/03/2021 à 02:28, ro...@uberware.net a écrit : >> Hi. I'm trying to build hwloc on OS-X Big Sur on an M1. Ultimate plan is >> to build it as a universal binary. Right now, I cannot even get the git >> master to a

Re: [hwloc-users] Build an OS-X Universal version

2021-03-23 Thread Brice Goglin
Le 23/03/2021 à 02:28, ro...@uberware.net a écrit : > Hi. I'm trying to build hwloc on OS-X Big Sur on an M1. Ultimate plan is > to build it as a universal binary. Right now, I cannot even get the git > master to autogen. This is what I get: > > robin@Robins-Mac-mini hwloc % ./autogen.sh > autoreco

[hwloc-users] getting the latest snapshot version string

2021-03-11 Thread Brice Goglin
Hello The "latest_snapshot.txt" files on the website were broken (for years). Things are now fixed and improved. And they are also explicitly documented on the main web page. If you want the version string of the latest release or release candidate, read https://www.open-mpi.org/software/hwloc/cu

Re: [hwloc-users] Netloc questions

2021-02-16 Thread Brice Goglin
Hello Kevin There is some very experimental support for Cray networks as well as Intel OmniPath. But the entire subproject has been unmaintained for a while and I don't expect anybody to revive it anytime soon unfortunately. Brice Le 16/02/2021 à 17:00, ke...@continuum-dynamics.com a écrit : >

Re: [hwloc-users] [Bug] Topology incorrect when CPU 0 offline

2021-02-05 Thread Brice Goglin
Hello I am not sure we ever tested this because offlining cpu0 was impossible in Linux until recently. I knew things would change because arm kernel devs were modifying Linux to allow it. Looks like it matters to x86 too, now. I'll take a look. Brice Le 5 février 2021 20:43:33 GMT+01:00, "Clay

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-02 Thread Brice Goglin
Le 02/10/2020 à 01:59, Jirka Hladky a écrit : > > I'll see if I can make things case-insensitive in the tools (not > in the C API). > > Yes, it would be a nice improvement.  Currently, there is a mismatch > between different commands.  hwloc-info supports both bandwidth and > Bandwidth, bu

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-01 Thread Brice Goglin
good performance measurement tool isn't easy. I see people sending patches for adding some assembly because this corner case on this processor isn't well optimized by GCC :/ I am not sure we want to put this inside hwloc. Brice > > > On Thu, Oct 1, 2020 at 7:28 PM Brice Goglin

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-01 Thread Brice Goglin
ist to > hwloc-info -h? I could add the default ones, but I'll need to specify that additional user-given attributes may exist. Thanks for the feedback. Brice > > hwloc-info --best-memattr bandwidth > hwloc-info --best-memattr latency > > Thanks a lot! > Jirka >

Re: [hwloc-users] hwloc Python3 Bindings - Correctly Grab number cores available

2020-08-31 Thread Brice Goglin
If you don't care about the overhead, tell python to use the output of shell command "hwloc-calc -N pu all". Brice Le 31/08/2020 à 18:38, Brock Palen a écrit : > Thanks, > > yeah I was looking for an API that would take into consideration most > cases, like I find with hwloc-bind --get   where I
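
A short sketch of the suggestion above: run `hwloc-calc -N pu all` from Python and parse its output. It assumes `hwloc-calc` is on the PATH and prints a single integer; the two function names are hypothetical.

```python
import subprocess


def parse_pu_count(output):
    """Convert the text printed by `hwloc-calc -N pu all` to an int."""
    return int(output.strip())


def count_pus():
    """Ask hwloc-calc how many PUs are available to this process.

    Assumes hwloc-calc is installed and on the PATH; raises
    FileNotFoundError or CalledProcessError otherwise.
    """
    out = subprocess.check_output(["hwloc-calc", "-N", "pu", "all"], text=True)
    return parse_pu_count(out)
```

Spawning a process on every call has the overhead mentioned in the message, so the result is best computed once and cached.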

Re: [hwloc-users] hwloc Python3 Bindings - Correctly Grab number cores available

2020-08-31 Thread Brice Goglin
Le 31/08/2020 à 18:19, Guy Streeter a écrit : > As I said, cgroups doesn't limit the group to a number of cores, it > limits processing time, either as an absolute amount or as a share of > what is available. > A docker process can be restricted to a set of cores, but that is done > with cpu affin

Re: [hwloc-users] hwloc 1.11.13 incorrect PCI locality information Xeon Platinum 9242

2020-08-30 Thread Brice Goglin
Hello Do you know which lstopo is correct here? Do you have a way to know if the IB interface is indeed connected to first NUMA node of 2nd package, or to 2nd NUMA node of 1st package? Benchmarking IB bandwidth when memory/cores are in NUMA node #1 vs #2 would be nice. The warning/fixup was added

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-07-23 Thread Brice Goglin
07/2020 at 19:32, Jon Dart wrote: > That was it - the older DLL was in the path. Thanks for looking into it. > > --Jon > > On 7/22/2020 6:02 AM, Brice Goglin wrote: >> >> Hello Jon >> >> Sorry for the delay. I finally got some time to look at this. I can only

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-07-22 Thread Brice Goglin
Le 01/07/2020 à 15:55, Jon Dart a écrit : > On 6/30/2020 4:00 PM, Brice Goglin wrote: >> >> Hello >> >> We don't have many windows-specific changes in 2.1 except some late >> MSVC-related changes added after rc1. Can you try 2.1.0rc1 instead of >> 2.1.

Re: [hwloc-users] Error occurred in topology.c line 940

2020-07-20 Thread Brice Goglin
Hello It looks like your hardware and/or OS is reporting buggy information. We'd need more details to debug this. Can you open a GitHub issue at https://github.com/open-mpi/hwloc/issues/new ? This page lists what information you need to provide for debugging. It looks like you're using hwloc inside a

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-06-30 Thread Brice Goglin
Hello We don't have many windows-specific changes in 2.1 except some late MSVC-related changes added after rc1. Can you try 2.1.0rc1 instead of 2.1.0? It's not visible on the download page but it's actually available, for instance at https://download.open-mpi.org/release/hwloc/v2.1/hwloc-win64-bui

Re: [hwloc-users] Unused function

2020-05-29 Thread Brice Goglin
Oh sure, I thought we fixed this a while ago. I pushed it to master. Do you need it in 2.2 only or also in earlier stable series? Brice On 29/05/2020 at 05:32, Balaji, Pavan via hwloc-users wrote: > Hello, > > We are maintaining this patch for hwloc internally in mpich. Can this be > upstreamed? >

Re: [hwloc-users] Multi-Node Topologies in hwloc 2.0+

2020-05-11 Thread Brice Goglin
Hello Stephen There's no equivalent in hwloc 2.x unfortunately, even with netloc. "custom" caused too many issues for core maintenance (mostly because of cpusets being different between machines) while use cases were very rare. Brice Le 12/05/2020 à 08:01, Herbein, Stephen via hwloc-users a é

[hwloc-users] heterogeneous memory in hwloc

2020-03-19 Thread Brice Goglin
Hello Several people asked recently how hwloc exposes heterogeneous memory and how to recognize which NUMA node is which kind of memory. Short answer is that it's currently ugly but we're working on it for hwloc 2.3. I put all details in this wiki page: https://github.com/open-mpi/hwloc/wiki/He

Re: [hwloc-users] PCI to NUMA node mapping.

2020-02-03 Thread Brice Goglin
Hello Liam dmidecode is usually reserved to root only because it uses SMBIOS or whatever hardware/ACPI/... tables. Those tables are read by the Linux kernel and exported to non-root users in sysfs: $ cat /sys/bus/pci/devices/:ae:0c.6/numa_node 1 However this file isn't that good because som
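
The sysfs lookup mentioned above can be sketched as follows. The function name and the `sysfs_root` parameter are hypothetical (the parameter only exists to make the helper testable), and the PCI address is supplied by the caller. On Linux the `numa_node` file contains -1 when the kernel has no locality information, which the sketch maps to `None`.

```python
import os


def pci_numa_node(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Read the NUMA node the Linux kernel reports for a PCI device.

    pci_address is the sysfs directory name of the device
    (domain:bus:device.function). Returns None when the kernel
    reports a negative value, meaning the locality is unknown.
    """
    path = os.path.join(sysfs_root, pci_address, "numa_node")
    with open(path) as f:
        node = int(f.read().strip())
    return None if node < 0 else node
```

As the message notes, the value in this file is not always reliable, so it is worth comparing it against what lstopo reports.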

Re: [hwloc-users] disabling ucx over omnipath

2019-11-15 Thread Brice Goglin
Oops wrong list, sorry :) Le 15/11/2019 à 10:49, Brice Goglin a écrit : > Hello > > We have a platform with an old MLX4 partition and another OPA partition. > We want a single OMPI installation working for both kinds of nodes. When > we enable UCX in OMPI for MLX4, UCX ends up be

[hwloc-users] disabling ucx over omnipath

2019-11-15 Thread Brice Goglin
Hello We have a platform with an old MLX4 partition and another OPA partition. We want a single OMPI installation working for both kinds of nodes. When we enable UCX in OMPI for MLX4, UCX ends up being used on the OPA partition too, and the performance is poor (3GB/s instead of 10). The problem se

Re: [hwloc-users] Embedded hwloc and Name Mangling Convention

2019-10-10 Thread Brice Goglin
symbols and avoid > altogether the prefixed ones? > > Thank you, > > Sam > >> On Oct 10, 2019, at 10:30 AM, Brice Goglin > <mailto:brice.gog...@inria.fr>> wrote: >> >> Le 10/10/2019 à 17:38, Gutierrez, Samuel K. via hwloc-users a écrit : >>> Good mor

Re: [hwloc-users] Embedded hwloc and Name Mangling Convention

2019-10-10 Thread Brice Goglin
Le 10/10/2019 à 17:38, Gutierrez, Samuel K. via hwloc-users a écrit : > Good morning, > > I have a question about expected name mangling behavior when using > HWLOC_SET_SYMBOL_PREFIX in hwloc v2.1.0 (and perhaps other versions). > > Say, for example, I do the following in a project embedding hwloc

Re: [hwloc-users] Netloc feature suggestion

2019-08-19 Thread Brice Goglin
Hello Indeed we would like to expose this kind of info but Netloc is unfortunately undermanpowered these days. The code in git master is outdated. We have a big rework in a branch but it still needs quite a lot of polishing before being merged. The API is still mostly-scotch-oriented (i.e. for pro

Re: [hwloc-users] Hang with SunOS

2019-07-08 Thread Brice Goglin
Hello It may be similar to https://github.com/open-mpi/hwloc/issues/290 but we weren't able to find the exact issue unfortunately :/ Setting HWLOC_COMPONENTS=-x86 in the environment would disable that code path, causing the topology to be possibly not as precise. Brice Le 08/07/2019 à 20:43,

Re: [hwloc-users] hwloc LDFLAGS in embedded builds

2019-05-24 Thread Brice Goglin
Thanks Pavan, I am pushing this. Brice Le 25/05/2019 à 08:19, Balaji, Pavan via hwloc-users a écrit : > Folks, > > We ran into an issue with the hwloc integration for MPICH. On Mac OS, hwloc > detects that OpenCL is available, but the corresponding LDFLAGS are not > exported upstream to MPIC

Re: [hwloc-users] Build warnings with hwloc-2.0.3

2019-03-18 Thread Brice Goglin
alaji, Pavan via hwloc-users a écrit : > Brice, all, > > Any update on this? Are you guys planning on fixing these? > > -- Pavan > >> On Feb 25, 2019, at 7:33 AM, Balaji, Pavan via hwloc-users >> wrote: >> >> Hi Brice, >> >>> On Feb 25, 2019

Re: [hwloc-users] Build warnings with hwloc-2.0.3

2019-02-25 Thread Brice Goglin
Hello Pavan, Are you sure you're not passing -Wstack-usage? My Ubuntu 18.04 with latest gcc-7 (7.3.0-27ubuntu1~18.04) doesn't show any of those warnings. It looks like all these warnings are caused by C99 variable-length arrays (except 2 that I don't understand). I know the kernel devs stopped us

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
parent_hugepage/enabled > [always] madvise never > > is set already, so I'm not really sure what should go in there to disable it. > > JB > > -Original Message- > From: Brice Goglin > Sent: 29 January 2019 15:29 > To: Biddiscombe, John A. ; Hardware local

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
it. > > Problem seems to be solved for now. Thank you very much for your insights and > suggestions/help. > > JB > > -Original Message- > From: Brice Goglin > Sent: 29 January 2019 10:35 > To: Biddiscombe, John A. ; Hardware locality user list > > Subj

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
0 0 0 0 0 0 0 0 0 > 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > 0 0 0 0 0 > > On the 8 numa node machine it sometimes gives the right answer even with 512 > pages. > > Still baffled > > JB > > -Original Message- > Fro

Re: [hwloc-users] unusual memory binding results

2019-01-28 Thread Brice Goglin
1-1-1-1-1 > 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1- > which is correct because the '-' is a negative status. I will run again and > see if it's -14 or -2 > > JB > > > -Original Message- > From: Brice

Re: [hwloc-users] unusual memory binding results

2019-01-28 Thread Brice Goglin
y next since I can > see the memory contents hold the correct CPU ID of the thread that touched > the memory, so either the syscall is wrong, or the kernel is doing something > else. I welcome any suggestions on what might be wrong. > > Thanks for trying to help. > > JB > >

Re: [hwloc-users] unusual memory binding results

2019-01-26 Thread Brice Goglin
Le 25/01/2019 à 23:16, Biddiscombe, John A. a écrit : >> move_pages() returning 0 with -14 in the status array? As opposed to >> move_pages() returning -1 with errno set to 14, which would definitely be a >> bug in hwloc. > I think it was move_pages returning zero with -14 in the status array, an
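
To make the distinction above concrete: on Linux, move_pages() fills a per-page status array in which non-negative entries are NUMA node numbers and negative entries are negated errno values. The sketch below only decodes such entries; it does not call move_pages(), and the function name is hypothetical.

```python
import errno


def describe_page_status(status):
    """Decode one entry of a move_pages()-style status array.

    Non-negative values are NUMA node numbers; negative values are
    negated errno codes (for example -14 is -EFAULT on Linux).
    """
    if status >= 0:
        return f"on NUMA node {status}"
    name = errno.errorcode.get(-status, "unknown errno")
    return f"error {name} ({status})"


for entry in (0, 1, -errno.EFAULT, -errno.ENOENT):
    print(describe_page_status(entry))
```

A per-page -EFAULT or -ENOENT with a zero return value typically means the page is not mapped or not present yet, which differs from the call itself failing with -1 and errno set.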

Re: [hwloc-users] unusual memory binding results

2019-01-25 Thread Brice Goglin
Le 25/01/2019 à 14:17, Biddiscombe, John A. a écrit : > Dear List/Brice > > I experimented with disabling the memory touch on threads except for > N=1,2,3,4 etc and found a problem in hwloc, which is that the function > hwloc_get_area_memlocation was returning '0' when the status of the memory

Re: [hwloc-users] unusual memory binding results

2019-01-21 Thread Brice Goglin
Le 21/01/2019 à 17:08, Biddiscombe, John A. a écrit : > Dear list, > > I'm allocating a matrix of size (say) 2048*2048 on a node with 2 numa domains > and initializing the matrix by using 2 threads, one pinned on each numa > domain - with the idea that I can create tiles of memory bound to each

Re: [hwloc-users] mem bind

2018-12-21 Thread Brice Goglin
Hello That's not how current operating systems work, hence hwloc cannot do it. Usually you can bind a process virtual memory to a specific part of the physical memory (a NUMA node is basically a big static range), but the reverse isn't allowed by any OS I know. If you can tweak the hardware, you

Re: [hwloc-users] Travis CI unit tests failing with HW "operating system" error

2018-09-13 Thread Brice Goglin
> team if you want them to upgrade :-) > > Jeff > > On Thu, Sep 13, 2018 at 8:42 AM, Brice Goglin <mailto:brice.gog...@inria.fr>> wrote: > > This is actually just a warning. Usually it causes the topology to > be wrong (like a missing object), but it shouldn'

Re: [hwloc-users] Travis CI unit tests failing with HW "operating system" error

2018-09-13 Thread Brice Goglin
This is actually just a warning. Usually it causes the topology to be wrong (like a missing object), but it shouldn't prevent the program from working. Are you sure your programs are failing because of hwloc? Do you have a way to run lstopo on that node? By the way, you shouldn't use hwloc 2.0.0rc

Re: [hwloc-users] How to get pid in hwloc?

2018-09-04 Thread Brice Goglin
Hello The only public portability layer we have for PIDs is hwloc_pid_t when passed to things like set_proc_cpubind(). But we don't have a portable getpid() or printf(). You'll have to use getpid() and printf("%ld", (long)pid) on Unix. On Windows, hwloc_pid_t is a HANDLE, you don't want to print

Re: [hwloc-users] conflicts of multiple hwloc libraries

2018-09-01 Thread Brice Goglin
This was also addressed offline while the mailing list was (again) broken. Some symbols weren't renamed in old releases. This was fixed a couple months ago. It will be in 2.0.2 and 1.11.11 (to be released on Monday Sept 3rd). Brice On 30/08/2018 at 06:31, Junchao Zhang wrote: > Hi, >    My progra

Re: [hwloc-users] Question about hwloc_bitmap_singlify

2018-08-28 Thread Brice Goglin
Hello If you bind a thread to a newset that contains 4 PUs (4 bits), the operating system scheduler is free to run that thread on any of these PUs. It means it may run it on one PU, then migrate it to the other PU, then migrate it back, etc. If these PUs do not share all caches, you will see a
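
The effect of singlifying a set can be illustrated on a plain integer mask: keep only one of the set bits so the scheduler has a single PU to choose from. This is a pure-Python illustration that keeps the lowest set bit; it is not the hwloc API, and the function name is hypothetical.

```python
def singlify(mask):
    """Keep only the lowest set bit of an integer bitmask.

    Illustrative stand-in for reducing a multi-PU set to one PU;
    an empty mask (0) stays empty.
    """
    return mask & -mask


four_pus = 0xf0  # PUs 4-7 allowed
print(f"{four_pus:#x} -> {singlify(four_pus):#x}")
```

Binding to the reduced mask prevents the migrations described above, at the cost of not letting the scheduler move the thread to an idle PU.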

Re: [hwloc-users] How to combine bitmaps on MPI ranks?

2018-08-28 Thread Brice Goglin
This question was addressed offline while the mailing lists were offline. We had things like hwloc_bitmap_set_ith_ulong() and hwloc_bitmap_from_ith_ulong() for packing/unpacking but they weren't very convenient unless you know multiple ulongs are actually needed to store the bitmap. We added new
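
The packing idea behind the per-ulong accessors mentioned above can be sketched with Python integers: split a bitmap into fixed-width words that can be sent as an array (for example over MPI), then reassemble it. The 64-bit word size and both function names are assumptions for illustration; this does not use hwloc.

```python
WORD_BITS = 64  # assumed width of an unsigned long


def to_ulongs(mask):
    """Split an integer bitmap into 64-bit words, least significant first."""
    words = []
    while mask:
        words.append(mask & ((1 << WORD_BITS) - 1))
        mask >>= WORD_BITS
    return words or [0]


def from_ulongs(words):
    """Rebuild the integer bitmap from its 64-bit words."""
    mask = 0
    for i, word in enumerate(words):
        mask |= word << (i * WORD_BITS)
    return mask
```

The inconvenience noted in the message shows up here too: the receiver has to know how many words to expect before it can unpack the bitmap.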

Re: [hwloc-users] Please help interpreting reported topology - possible bug?

2018-05-17 Thread Brice Goglin
Hello Hartmut The mailing list address changed a while ago, there's an additional "lists." in the domain name. Regarding your question, I would assume you are running in a cgroup with the second NUMA node disallowed (while all the corresponding cores are allowed). lstopo with --whole-system woul

Re: [hwloc-users] Netloc integration with hwloc

2018-04-04 Thread Brice Goglin
On 04/04/2018 at 16:49, Madhu, Kavitha Tiptur wrote: > > — I tried building older netloc with hwloc 2.0 and it throws compiler errors. > Note that netloc was cloned from its git repo. My guess is that the "map" part that joins netloc's info about the fabric with hwloc's info about the nodes do

Re: [hwloc-users] Netloc integration with hwloc

2018-04-03 Thread Brice Goglin
à 01:36, Balaji, Pavan a écrit : > Brice, > > We want to use both hwloc and netloc in mpich. What are our options here? > Move back to hwloc-1.x? That’d be a bummer because we already invested a lot > of effort to migrate to hwloc-2.x. > > — Pavan > > Sent from my iP

Re: [hwloc-users] Netloc integration with hwloc

2018-04-03 Thread Brice Goglin
embedded mode? > > >> On Mar 30, 2018, at 1:34 PM, Brice Goglin wrote: >> >> Hello >> >> In 2.0, netloc is still highly experimental. Hopefully, a large rework >> will be merged in git master next month for being released in hwloc 2.1. >> >>

Re: [hwloc-users] Netloc integration with hwloc

2018-03-30 Thread Brice Goglin
Hello In 2.0, netloc is still highly experimental. Hopefully, a large rework will be merged in git master next month for being released in hwloc 2.1. Most of the API from the old standalone netloc was made private when integrated in hwloc because there wasn't any actual user. The API was quite la

[hwloc-users] libhwloc soname change in 2.0.1rc1

2018-03-21 Thread Brice Goglin
Hello In case you missed the announcement yesterday, hwloc 2.0.1rc1 changes the library soname from 12:0:0 to 15:0:0. On Linux, it means that we'll now build libhwloc.so.15 instead of libhwloc.so.12. That means any application built for hwloc 2.0.0 will need to be recompiled against 2.0.1. I should h
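
The mapping from the libtool version triple to the file name quoted above can be checked with a few lines of arithmetic. This sketch assumes the usual libtool convention on Linux, where the triple is current:revision:age and the soname major number is current minus age; the function names are hypothetical.

```python
def soname_major(version_info):
    """Major number of the Linux soname for a libtool current:revision:age."""
    current, _revision, age = (int(part) for part in version_info.split(":"))
    return current - age


def linux_soname(library, version_info):
    """Build the Linux soname, e.g. libhwloc.so.15 for 15:0:0."""
    return f"{library}.so.{soname_major(version_info)}"


for triple in ("12:0:0", "15:0:0"):
    print(triple, "->", linux_soname("libhwloc", triple))
```

Because the major number changes, the dynamic linker treats the two builds as incompatible libraries, which is why applications must be recompiled.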

Re: [hwloc-users] NUMA, io and miscellaneous object depths

2018-03-14 Thread Brice Goglin
to objects at the depth or above in Hydra previously. As >> you pointed out, the functionality makes no sense with NUMA/IO objects >> possibly being at different depths or for objects. >> >>> On Mar 14, 2018, at 3:00 PM, Brice Goglin wrote: >>> >>>

Re: [hwloc-users] NUMA, io and miscellaneous object depths

2018-03-14 Thread Brice Goglin
Hello I can fix the documentation to say that the function always succeeds and returns the virtual depth for NUMA/IO/Misc. I don't understand your third sentence. If by "actual depth", you mean the depth of a (normal) parent where NUMA are attached (for instance the depth of Package if NUMAs are a

[hwloc-users] call for testing on KNL

2018-02-09 Thread Brice Goglin
Hello As you may know, hwloc only discovers KNL MCDRAM Cache details if hwloc-dump-hwdata ran as root earlier. There's an issue with that tool in 2.0, which was supposed to be a feature: we fixed the matching of SMBIOS strings, and now it appears some vendors don't match anymore because they didn'

Re: [hwloc-users] Machine nodes in hwloc topology

2018-02-05 Thread Brice Goglin
ase let me know. Brice Le 05/02/2018 à 23:19, Madhu, Kavitha Tiptur a écrit : > Hi > > Thanks for the response. Could you also confirm if hwloc topology > object would have only machine node? > > Thanks, > Kavitha > > > >> On Feb 5, 2018, at 4:14 PM, B

Re: [hwloc-users] Machine nodes in hwloc topology

2018-02-05 Thread Brice Goglin
Hello, Oops, sorry, this sentence is obsolete, I am removing it from the doc right now. We don't support the assembly of multiple machines in a single hwloc topology anymore. For the record, this feature was a very small corner case and it had important limitations (you couldn't bind things or us

[hwloc-users] need help for testing new Mac OS support

2018-01-26 Thread Brice Goglin
Hello I need people running Mac OS to test some patches before releasing them in 2.0rc2 (which is likely delayed to Monday). Just build this tarball, run lstopo, and report any difference with older lstopo outputs: https://ci.inria.fr/hwloc/job/zbgoglin-0-tarball/lastSuccessfulBuild/artifact/hwl

Re: [hwloc-users] Puzzled by the number of cores on i5-7500

2018-01-25 Thread Brice Goglin
It looks like our Mac OS X backend doesn't properly handle processors that support hyperthreading without actually having hyperthreads enabled in hardware. Your processor has 4 cores without HT but it's based on a processor with up to 8 cores and 16 threads. Our current code uses the latter and ther

Re: [hwloc-users] hwloc-2.0rc1 failure on Solaris

2018-01-25 Thread Brice Goglin
It is actually easy to fix, we just need to move hwloc's #include before what base64.c actually #include's. That'll be fixed in rc2 too. Brice Le 25/01/2018 à 10:56, Brice Goglin a écrit : > Like the error below? > > This code hasn't changed recently. Did yo

Re: [hwloc-users] hwloc-2.0rc1 failure on Solaris

2018-01-25 Thread Brice Goglin
Like the error below? This code hasn't changed recently. Did you ever build with these flags before? I am not sure I'll have time to fix yet another header craziness before rc2. Brice   CC   base64.lo In file included from /builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/priv

Re: [hwloc-users] Puzzled by the number of cores on i5-7500

2018-01-24 Thread Brice Goglin
The output of sysctl -a would also help. Brice Le 25/01/2018 à 07:34, Brice Goglin a écrit : > Hello > > I don't see anything obvious. Can you rebuild with --enable-debug and > report the full lstopo output? > > Brice > > > > Le 25/01/2018 à 07:14,

Re: [hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Brice Goglin
f9 > https://github.com/pmodels/hwloc/commit/9bf3ff256511ea4092928438f5718904875e65e1 > > The first one is definitely not usable as-is, since that breaks standalone > builds. But I'm interested in hearing about any better solution that you > might have. > > Thanks, >

Re: [hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Brice Goglin
Thanks, I am fixing this for rc2 tomorrow. Brice Le 24/01/2018 à 22:59, Balaji, Pavan a écrit : > Folks, > > I'm seeing these warnings on the mac os when building hwloc-2.0rc1 with clang: > > 8< > CC lstopo-lstopo.o > lstopo.c: In function 'usage': > lstopo.c:425:7: warning: "CAI

Re: [hwloc-users] OFED requirements for netloc

2018-01-24 Thread Brice Goglin
system that segfaults, and 1.6.6 on the one that succeeds.  > And that the first looks to be the standard OFED release and the 1.6.6 > version a mellanox release of OFED. > > Craig. > > On Tue, 23 Jan 2018 at 17:10 Brice Goglin <mailto:brice.gog...@inria.fr>> wrote:

Re: [hwloc-users] Tags for pre-releases

2018-01-23 Thread Brice Goglin
Hello I didn't know you use submodule. I just pushed tag "hwloc-2.0.0rc1" and I'll try to remember pushing one for each future rc. If I don't, please remind me. I am not going to push all the previous ones because there are just too many of them. If you need some specific ones, please let me know

Re: [hwloc-users] OFED requirements for netloc

2018-01-22 Thread Brice Goglin
Hello, If the output isn't too big, could you put the files gathered by netloc_ib_gather_raw online so that we look at them and try to reproduce the crash? Thanks Brice Le 23/01/2018 à 03:54, Craig West a écrit : > Hi, > > I can't find the version requirements for netloc. I've tried it on an

Re: [hwloc-users] AMD EPYC topology

2017-12-29 Thread Brice Goglin
Le 29/12/2017 à 23:15, Bill Broadley a écrit : > > > Very interesting, I was running parallel finite element code and was seeing > great performance compared to Intel in most cases, but on larger runs it was > 20x > slower. This would explain it. > > Do you know which commit, or anything else t

Re: [hwloc-users] AMD EPYC topology

2017-12-24 Thread Brice Goglin
Hello Make sure you use a very recent Linux kernel. There was a bug regarding L3 caches on 24-core Epyc processors which has been fixed in 4.14 and backported in 4.13.x (and maybe in distro kernels too). However, that would likely not cause huge performance difference unless your application hea

Re: [hwloc-users] pianofish process management GUI

2017-12-20 Thread Brice Goglin
Hello This sounds very interesting, likely what we want for https://github.com/open-mpi/hwloc/issues/54 I couldn't test it yet (dnf on Fedora25 says nothing provides python2-hwloc even though that package is installed in version 2.3.1). Do you plan to put python-hwloc (and maybe pianofish too) in pypi? I

Re: [hwloc-users] How are processor groups under Windows reported?

2017-11-29 Thread Brice Goglin
thanks. Brice > > Thank you, > > David > > On 29/11/2017 13:35, Brice Goglin wrote: >> Hello >> >> We only add hwloc Group objects when necessary. On your system, each >> processor group contains a single NUMA node, so these Groups would not >> real

Re: [hwloc-users] How are processor groups under Windows reported?

2017-11-29 Thread Brice Goglin
Hello We only add hwloc Group objects when necessary. On your system, each processor group contains a single NUMA node, so these Groups would not really bring additional information about the hierarchy of resources. If you had a bigger system with, let's say, 4 NUMA nodes, with 2 of them in each p

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] Dual socket AMD Epyc error

2017-11-22 Thread Brice Goglin
Brice Le 28/10/2017 09:31, Brice Goglin a écrit : > Hello, > The Linux kernel reports incorrect L3 information. > Unfortunately, your old kernel seems to already contain patches for > supporting the L3 on this hardware. I found two candidate patches for > further fixing this, one is

[hwloc-users] RFCs about latest API changes

2017-11-19 Thread Brice Goglin
m/open-mpi/hwloc/pull/277 Make all depths *signed* ints https://github.com/open-mpi/hwloc/pull/276 Remove the "System" object type https://github.com/open-mpi/hwloc/pull/275 Move local_memory to NUMA node specific attrs https://github.com/open-mpi/hwloc/pull/274 Brice Le 26/

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-15 Thread Brice Goglin
I hoped for. > > Thanks again > > JB > > > From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of Brice > Goglin [brice.gog...@inria.fr] > Sent: 13 November 2017 15:32 > To: Hardware locality user list > Subject

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-13 Thread Brice Goglin
> > aha. thanks. I knew I'd seen a function for that, but couldn't remember what > it was. > > Cheers > > JB > ____ > From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of Brice > Goglin [bri

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-13 Thread Brice Goglin
-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of > Samuel Thibault [samuel.thiba...@inria.fr] > Sent: 12 November 2017 10:48 > To: Hardware locality user list > Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset > > Brice Goglin, on dim. 12 nov. 201

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-11 Thread Brice Goglin
Le 12/11/2017 00:14, Biddiscombe, John A. a écrit : > I'm allocating some large matrices, from 10k squared elements up to > 40k squared per node. > I'm also using membind to place pages of the matrix memory across numa > nodes so that the matrix might be bound according to the kind of > pattern a

Re: [hwloc-users] HWLOC_VERSION

2017-10-30 Thread Brice Goglin
Hello It should have been 0x00010b03 but unfortunately I forgot to increase it (and again in 1.11.6). I need to add this to my release-TODO-list. The upcoming 1.11.9 will have the proper HWLOC_API_VERSION (0x00010b06 unless we add something) so that people can at least check for these features in
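
For readers wondering how values such as 0x00010b03 and 0x00010b06 map to release numbers, here is a minimal sketch. It assumes the encoding (X<<16)+(Y<<8)+Z described in the hwloc documentation for HWLOC_API_VERSION, where X.Y.Z is the last release that modified the API (so the low byte need not match the release actually installed). The function names are hypothetical helpers, not part of hwloc, and Python is used only for brevity; C code would normally compare the macro directly in a preprocessor conditional.

```python
def decode_api_version(value):
    """Split an HWLOC_API_VERSION-style integer into (major, minor, release).

    Assumes the (X << 16) + (Y << 8) + Z layout; these helpers are
    illustrative and are not provided by hwloc itself.
    """
    major = (value >> 16) & 0xFFFF
    minor = (value >> 8) & 0xFF
    release = value & 0xFF
    return (major, minor, release)


def encode_api_version(major, minor, release):
    """Build the integer form from its three components."""
    return (major << 16) | (minor << 8) | release


def has_api_level(value, major, minor, release):
    """Return True if 'value' is at least the API level major.minor.release."""
    return value >= encode_api_version(major, minor, release)


if __name__ == "__main__":
    for v in (0x00010B03, 0x00010B06):
        print(hex(v), decode_api_version(v))
```

Under that assumption, a feature introduced together with the 0x00010b06 API level can be tested with has_api_level(version, 1, 11, 6), which mirrors a check like "#if HWLOC_API_VERSION >= 0x00010b06" in C.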

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] Dual socket AMD Epyc error

2017-10-28 Thread Brice Goglin
Hello, The Linux kernel reports incorrect L3 information. Unfortunately, your old kernel seems to already contain patches for supporting the L3 on this hardware. I found two candidate patches for further fixing this, one is in 4.10 (cleanup of the above patch) and the other will only be in 4.14. I

[hwloc-users] new memory model and API

2017-10-26 Thread Brice Goglin
Hello I finally merged the new memory model in master (mainly for properly supporting KNL-like heterogeneous memory). This was the main and last big change for hwloc 2.0. I still need to fix some caveats (and lstopo needs to better display NUMA nodes) but that part of the API should be ready. Now

Re: [hwloc-users] linkspeed in hwloc_obj_attr_u::hwloc_pcidev_attr_s struct while traversing topology

2017-10-13 Thread Brice Goglin
Hello On Linux, reading the PCI linkspeed unfortunately requires root privileges (except for the uplink above NVIDIA GPUs where we have another way to find it). The only way to work around this is to dump the topology as XML as root and then reload it at runtime (e.g. with HWLOC_XMLFILE) :/ Brice Le 13

Re: [hwloc-users] Why do I get such little information back about GPU's on my system

2017-07-07 Thread Brice Goglin
:01:00.0 3D controller: NVIDIA Corporation GP100GL (rev a1) > > But the only devices returned by hwloc are named "cardX" (same as what > lstopo shows) and have osdev.type of HWLOC_OBJ_OSDEV_GPU and we see no > devices of type HWLOC_OBJ_OSDEV_COPROC > > Sorry, I

Re: [hwloc-users] Why do I get such little information back about GPU's on my system

2017-07-07 Thread Brice Goglin
Le 07/07/2017 21:51, David Solt a écrit : > Oh, Geoff Paulsen will be there at Open MPI meeting and he can help > with the discussion. We tried searching for > > // Iterate over each osdevice and identify the GPU's on each socket. > while ((obj = hwloc_get_next_osdev(machine_topology, obj)) !

Re: [hwloc-users] Why do I get such little information back about GPU's on my system

2017-07-07 Thread Brice Goglin
Le 07/07/2017 20:38, David Solt a écrit : > We are using the hwloc api to identify GPUs on our cluster. While we > are able to "discover" the GPUs, other information about them does not > appear to be getting filled in. See below for example: > (gdb) p *obj->attr > $20 = { > cache = { > size

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-30 Thread Brice Goglin
(P#0) + L3 L#0 (6144KB) > L2 L#0 (2048KB) + L1i L#0 (64KB) > ... > > These nodes are the only one in our entire cluster to cause zombie > processes using torque/moab. I have a feeling that they are related. > We use hwloc/1.10.0. > > Not sure if this helps at all, but yo

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-30 Thread Brice Goglin
Le 30/06/2017 22:08, fabricio a écrit : > Em 30-06-2017 16:21, Brice Goglin escreveu: >> Yes, it's possible but very easy. Before we go that way: >> Can you also pass HWLOC_COMPONENTS_VERBOSE=1 in the environment and send

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-28 Thread Brice Goglin
Hello We've seen this issue many times (it's specific to 12-core Opterons), but I am surprised it still occurs with such a recent kernel. AMD was supposed to fix the kernel in early 2016 but I forgot to check whether something was actually pushed. Anyway, you can likely ignore the issue as docume

Re: [hwloc-users] ? Finding cache & pci info on SPARC/Solaris 11.3

2017-06-09 Thread Brice Goglin
Thanks a lot for the input. I opened https://github.com/open-mpi/hwloc/issues/243 I have access to a T5 but this will need investigation to actually find where to get the info from. Feel free to comment on the issue if you find more. I am going to modify Pg.pm to better understand where Caches come fr
