Re: [hwloc-users] MI300A support

2024-02-08 Thread Brice Goglin
Hello, I don't have access to an MI300A, but I worked with AMD several months ago to solve a very similar issue. It was caused by a buggy ACPI HMAT in the BIOS. Try setting HWLOC_USE_NUMA_DISTANCES=0 in the environment to disable the hwloc code that uses this HMAT info. If the warning goes away

Re: [hwloc-users] Support for Intel's hybrid architecture - can I restrict hwloc-distrib to P cores only?

2023-11-24 Thread Brice Goglin
On 24/11/2023 at 08:51, John Hearns wrote: Good question.  Maybe not an answer referring to hwloc. When managing a large NUMA machine, SGI UV, I ran the OS processes in a boot cpuset which was restricted to (AFAIR) the first 8 CPUs. On Intel architectures with E and P cores could we think of

Re: [hwloc-users] Support for Intel's hybrid architecture - can I restrict hwloc-distrib to P cores only?

2023-11-24 Thread Brice Goglin
On 23/11/2023 at 19:29, Jirka Hladky wrote: Hi Brice, I have a question about hwloc's support for Intel's hybrid architectures, like in Alder Lake CPUs: https://en.wikipedia.org/wiki/Alder_Lake There are P (performance) and E (efficiency) cores. Is hwloc able to detect which core is

Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Brice Goglin
ll account for sundry testing, mostly of build procedures. Is there anything I could do to get hwloc to work? Regards, Max --- On Wed, Aug 02, 2023 at 03:12:27PM +0200, Brice Goglin wrote: Hello There's something wrong in this machine. It exposes 4 cores (numbered 0 to 3) and no NUMA node, but say

Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Brice Goglin
) PU L#1 (P#1) HostBridge PCI 00:03.0 (Other) Block(Disk) "sda" PCI 00:04.0 (Ethernet) Net "ens4" PCI 00:05.0 (Other) (from which I conclude my build procedure is correct). At the suggestion of Brice Goglin (in response to my post of th

Re: [hwloc-users] Problems with binding memory

2022-03-02 Thread Brice Goglin
On 02/03/2022 at 09:39, Mike wrote: Hello, Please run "lstopo -.synthetic" to compress the output a lot. I will be able to reuse it from here and understand your binding mask. Package:2 [NUMANode(memory=270369247232)] L3Cache:8(size=33554432) L2Cache:8(size=524288)

Re: [hwloc-users] Problems with binding memory

2022-03-01 Thread Brice Goglin
On 01/03/2022 at 17:34, Mike wrote: Hello, Usually you would rather allocate and bind at the same time so that the memory doesn't need to be migrated when bound. However, if you do not touch the memory after allocation, pages are not actually physically allocated, hence

Re: [hwloc-users] Problems with binding memory

2022-03-01 Thread Brice Goglin
On 01/03/2022 at 15:17, Mike wrote: Dear list, I have a program that utilizes Open MPI + multithreading and I want the freedom to decide on which hardware cores my threads should run. By using hwloc_set_cpubind() that already works, so now I also want to bind memory to the hardware cores.

Re: [hwloc-users] [OMPI users] hwloc error

2021-08-23 Thread Brice Goglin
-mpi.org; Brice Goglin *Subject:* RE: [OMPI users] hwloc error Hello Brice Thanks for your reply. I forgot to mention that my machine is a Windows one and not Linux. I did download the new version of hwloc. Could you brief me on the steps for installing it? Are the steps similar to this? cd

Re: [hwloc-users] Build an OS-X Universal version

2021-03-23 Thread Brice Goglin
On 23/03/2021 at 08:08, Brice Goglin wrote: > On 23/03/2021 at 02:28, ro...@uberware.net wrote: >> Hi. I'm trying to build hwloc on OS-X Big Sur on an M1. Ultimate plan is >> to build it as a universal binary. Right now, I cannot even get the git >> master to autoge

Re: [hwloc-users] Build an OS-X Universal version

2021-03-23 Thread Brice Goglin
On 23/03/2021 at 02:28, ro...@uberware.net wrote: > Hi. I'm trying to build hwloc on OS-X Big Sur on an M1. Ultimate plan is > to build it as a universal binary. Right now, I cannot even get the git > master to autogen. This is what I get: > > robin@Robins-Mac-mini hwloc % ./autogen.sh >

[hwloc-users] getting the latest snapshot version string

2021-03-11 Thread Brice Goglin
Hello The "latest_snapshot.txt" files on the website were broken (for years). Things are now fixed and improved. And they are also explicitly documented on the main web page. If you want the version string of the latest release or release candidate, read

Re: [hwloc-users] Netloc questions

2021-02-16 Thread Brice Goglin
Hello Kevin There is some very experimental support for Cray networks as well as Intel OmniPath. But the entire subproject has been unmaintained for a while and I don't expect anybody to revive it anytime soon unfortunately. Brice On 16/02/2021 at 17:00, ke...@continuum-dynamics.com wrote: >

Re: [hwloc-users] [Bug] Topology incorrect when CPU 0 offline

2021-02-05 Thread Brice Goglin
Hello I am not sure we ever tested this because offlining cpu0 was impossible in Linux until recently. I knew things would change because arm kernel devs were modifying Linux to allow it. Looks like it matters to x86 too, now. I'll take a look. Brice On 5 February 2021 at 20:43:33 GMT+01:00,

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-02 Thread Brice Goglin
On 02/10/2020 at 01:59, Jirka Hladky wrote: > > I'll see if I can make things case-insensitive in the tools (not > in the C API). > > Yes, it would be a nice improvement.  Currently, there is a mismatch > between different commands.  hwloc-info supports both bandwidth and > Bandwidth,

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-01 Thread Brice Goglin
n't easy. I see people sending patches for adding some assembly because this corner case on this processor isn't well optimized by GCC :/ I am not sure we want to put this inside hwloc. Brice > > > On Thu, Oct 1, 2020 at 7:28 PM Brice Goglin <mailto:brice.gog...@inria.fr>> wrote: >

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-01 Thread Brice Goglin
d add the default ones, but I'll need to specify that additional user-given attributes may exist. Thanks for the feedback. Brice > > hwloc-info --best-memattr bandwidth > hwloc-info --best-memattr latency > > Thanks a lot! > Jirka > > > On Thu, Oct 1, 2020 at 12:45

Re: [hwloc-users] hwloc Python3 Bindings - Correctly Grab number cores available

2020-08-31 Thread Brice Goglin
If you don't care about the overhead, tell python to use the output of the shell command "hwloc-calc -N pu all". Brice On 31/08/2020 at 18:38, Brock Palen wrote: > Thanks, > > yeah I was looking for an API that would take into consideration most > cases, like I find with hwloc-bind --get   where

Re: [hwloc-users] hwloc Python3 Bindings - Correctly Grab number cores available

2020-08-31 Thread Brice Goglin
On 31/08/2020 at 18:19, Guy Streeter wrote: > As I said, cgroups doesn't limit the group to a number of cores, it > limits processing time, either as an absolute amount or as a share of > what is available. > A docker process can be restricted to a set of cores, but that is done > with cpu

Re: [hwloc-users] hwloc 1.11.13 incorrect PCI locality information Xeon Platinum 9242

2020-08-30 Thread Brice Goglin
Hello Do you know which lstopo is correct here? Do you have a way to know if the IB interface is indeed connected to first NUMA node of 2nd package, or to 2nd NUMA node of 1st package? Benchmarking IB bandwidth when memory/cores are in NUMA node #1 vs #2 would be nice. The warning/fixup was

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-07-23 Thread Brice Goglin
, Jon Dart wrote: > That was it - the older DLL was in the path. Thanks for looking into it. > > --Jon > > On 7/22/2020 6:02 AM, Brice Goglin wrote: >> >> Hello Jon >> >> Sorry for the delay. I finally got some time to look at this. I can only >> reproduc

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-07-22 Thread Brice Goglin
On 01/07/2020 at 15:55, Jon Dart wrote: > On 6/30/2020 4:00 PM, Brice Goglin wrote: >> >> Hello >> >> We don't have many windows-specific changes in 2.1 except some late >> MSVC-related changes added after rc1. Can you try 2.1.0rc1 instead of >> 2.1.0

Re: [hwloc-users] Error occurred in topology.c line 940

2020-07-20 Thread Brice Goglin
Hello It looks like your hardware and/or OS is reporting buggy information. We'd need more details to debug this. Can you open a GitHub issue at https://github.com/open-mpi/hwloc/issues/new ? This page lists what information you need to provide for debugging. It looks like you're using hwloc inside

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-06-30 Thread Brice Goglin
Hello We don't have many windows-specific changes in 2.1 except some late MSVC-related changes added after rc1. Can you try 2.1.0rc1 instead of 2.1.0? It's not visible on the download page but it's actually available, for instance at

Re: [hwloc-users] Unused function

2020-05-29 Thread Brice Goglin
Oh sure, I thought we fixed this a while ago. I pushed it to master. Do you need it in 2.2 only or also in earlier stable series? Brice On 29/05/2020 at 05:32, Balaji, Pavan via hwloc-users wrote: > Hello, > > We are maintaining this patch for hwloc internally in mpich. Can this be > upstreamed?

Re: [hwloc-users] Multi-Node Topologies in hwloc 2.0+

2020-05-12 Thread Brice Goglin
Hello Stephen There's no equivalent in hwloc 2.x unfortunately, even with netloc. "custom" caused too many issues for core maintenance (mostly because of cpusets being different between machines) while use cases were very rare. Brice On 12/05/2020 at 08:01, Herbein, Stephen via hwloc-users wrote

[hwloc-users] heterogeneous memory in hwloc

2020-03-19 Thread Brice Goglin
Hello Several people asked recently how hwloc exposes heterogeneous memory and how to recognize which NUMA node is which kind of memory. The short answer is that it's currently ugly but we're working on it for hwloc 2.3. I put all the details in this wiki page :

Re: [hwloc-users] PCI to NUMA node mapping.

2020-02-03 Thread Brice Goglin
Hello Liam dmidecode is usually reserved for root only because it uses SMBIOS or whatever hardware/ACPI/... tables. Those tables are read by the Linux kernel and exported to non-root users in sysfs: $ cat /sys/bus/pci/devices/:ae:0c.6/numa_node 1 However this file isn't that good because

Re: [hwloc-users] disabling ucx over omnipath

2019-11-15 Thread Brice Goglin
Oops wrong list, sorry :) On 15/11/2019 at 10:49, Brice Goglin wrote: > Hello > > We have a platform with an old MLX4 partition and another OPA partition. > We want a single OMPI installation working for both kinds of nodes. When > we enable UCX in OMPI for MLX4, UCX end

[hwloc-users] disabling ucx over omnipath

2019-11-15 Thread Brice Goglin
Hello We have a platform with an old MLX4 partition and another OPA partition. We want a single OMPI installation working for both kinds of nodes. When we enable UCX in OMPI for MLX4, UCX ends up being used on the OPA partition too, and the performance is poor (3GB/s instead of 10). The problem

Re: [hwloc-users] Embedded hwloc and Name Mangling Convention

2019-10-10 Thread Brice Goglin
On 10/10/2019 at 17:38, Gutierrez, Samuel K. via hwloc-users wrote: > Good morning, > > I have a question about expected name mangling behavior when using > HWLOC_SET_SYMBOL_PREFIX in hwloc v2.1.0 (and perhaps other versions). > > Say, for example, I do the following in a project embedding

Re: [hwloc-users] Netloc feature suggestion

2019-08-19 Thread Brice Goglin
Hello Indeed we would like to expose this kind of info but Netloc is unfortunately undermanned these days. The code in git master is outdated. We have a big rework in a branch but it still needs quite a lot of polishing before being merged. The API is still mostly Scotch-oriented (i.e. for

Re: [hwloc-users] Hang with SunOS

2019-07-08 Thread Brice Goglin
Hello It may be similar to https://github.com/open-mpi/hwloc/issues/290 but we weren't able to find the exact issue unfortunately :/ Setting HWLOC_COMPONENTS=-x86 in the environment would disable that code path, causing the topology to be possibly not as precise. Brice On 08/07/2019 at 20:43,

Re: [hwloc-users] Build warnings with hwloc-2.0.3

2019-03-18 Thread Brice Goglin
, Pavan via hwloc-users wrote: > Brice, all, > > Any update on this? Are you guys planning on fixing these? > > -- Pavan > >> On Feb 25, 2019, at 7:33 AM, Balaji, Pavan via hwloc-users >> wrote: >> >> Hi Brice, >> >>> On Feb 25, 2019,

Re: [hwloc-users] Build warnings with hwloc-2.0.3

2019-02-25 Thread Brice Goglin
Hello Pavan, Are you sure you're not passing -Wstack-usage? My Ubuntu 18.04 with latest gcc-7 (7.3.0-27ubuntu1~18.04) doesn't show any of those warnings. It looks like all these warnings are caused by C99 variable-length arrays (except 2 that I don't understand). I know the kernel devs stopped

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
t_hugepage/enabled > [always] madvise never > > is set already, so I'm not really sure what should go in there to disable it. > > JB > > -Original Message- > From: Brice Goglin > Sent: 29 January 2019 15:29 > To: Biddiscombe, John A. ; Hardware locality user l

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
> > Problem seems to be solved for now. Thank you very much for your insights and > suggestions/help. > > JB > > -Original Message- > From: Brice Goglin > Sent: 29 January 2019 10:35 > To: Biddiscombe, John A. ; Hardware locality user list > > Subject:

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
0 > 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > 0 0 0 0 0 > > On the 8 numa node machine it sometimes gives the right answer even with 512 > pages. > > Still baffled > > JB > > -Original Message- > From: hwloc-users

Re: [hwloc-users] unusual memory binding results

2019-01-28 Thread Brice Goglin
1-1-1-1-1 > 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1- > which is correct because the '-' is a negative status. I will run again and > see if it's -14 or -2 > > JB > > > -Original Message- > From: Brice Goglin > S

Re: [hwloc-users] unusual memory binding results

2019-01-28 Thread Brice Goglin
can > see the memory contents hold the correct CPU ID of the thread that touched > the memory, so either the syscall is wrong, or the kernel is doing something > else. I welcome any suggestions on what might be wrong. > > Thanks for trying to help. > > JB > > -Original

Re: [hwloc-users] unusual memory binding results

2019-01-25 Thread Brice Goglin
On 25/01/2019 at 14:17, Biddiscombe, John A. wrote: > Dear List/Brice > > I experimented with disabling the memory touch on threads except for > N=1,2,3,4 etc and found a problem in hwloc, which is that the function > hwloc_get_area_memlocation was returning '0' when the status of the memory

Re: [hwloc-users] unusual memory binding results

2019-01-21 Thread Brice Goglin
On 21/01/2019 at 17:08, Biddiscombe, John A. wrote: > Dear list, > > I'm allocating a matrix of size (say) 2048*2048 on a node with 2 numa domains > and initializing the matrix by using 2 threads, one pinned on each numa > domain - with the idea that I can create tiles of memory bound to each

Re: [hwloc-users] mem bind

2018-12-21 Thread Brice Goglin
Hello That's not how current operating systems work, hence hwloc cannot do it. Usually you can bind a process virtual memory to a specific part of the physical memory (a NUMA node is basically a big static range), but the reverse isn't allowed by any OS I know. If you can tweak the hardware, you

Re: [hwloc-users] Travis CI unit tests failing with HW "operating system" error

2018-09-13 Thread Brice Goglin
This is actually just a warning. Usually it causes the topology to be wrong (like a missing object), but it shouldn't prevent the program from working. Are you sure your programs are failing because of hwloc? Do you have a way to run lstopo on that node? By the way, you shouldn't use hwloc

Re: [hwloc-users] How to get pid in hwloc?

2018-09-04 Thread Brice Goglin
Hello The only public portability layer we have for PIDs is hwloc_pid_t when passed to things like set_proc_cpubind(). But we don't have a portable getpid() or printf(). You'll have to use getpid() and printf("%ld", (long)pid) on Unix. On Windows, hwloc_pid_t is a HANDLE, you don't want to print

Re: [hwloc-users] conflicts of multiple hwloc libraries

2018-09-01 Thread Brice Goglin
This was also addressed offline while the mailing list was (again) broken. Some symbols weren't renamed in old releases. This was fixed a couple of months ago. It will be in 2.0.2 and 1.11.11 (to be released on Monday Sept 3rd). Brice On 30/08/2018 at 06:31, Junchao Zhang wrote: > Hi, >    My

Re: [hwloc-users] Question about hwloc_bitmap_singlify

2018-08-28 Thread Brice Goglin
Hello If you bind a thread to a newset that contains 4 PUs (4 bits), the operating system scheduler is free to run that thread on any of these PUs. It means it may run on it on one PU, then migrate it to the other PU, then migrate it back, etc. If these PUs do not share all caches, you will see a

Re: [hwloc-users] How to combine bitmaps on MPI ranks?

2018-08-28 Thread Brice Goglin
This question was addressed offline while the mailing lists were offline. We had things like hwloc_bitmap_set_ith_ulong() and hwloc_bitmap_from_ith_ulong() for packing/unpacking but they weren't very convenient unless you know multiple ulongs are actually needed to store the bitmap. We added new

Re: [hwloc-users] Please help interpreting reported topology - possible bug?

2018-05-17 Thread Brice Goglin
Hello Hartmut The mailing list address changed a while ago, there's an additional "lists." in the domain name. Regarding your question, I would assume you are running in a cgroup with the second NUMA node disallowed (while all the corresponding cores are allowed). lstopo with --whole-system

Re: [hwloc-users] Netloc integration with hwloc

2018-04-04 Thread Brice Goglin
On 04/04/2018 at 16:49, Madhu, Kavitha Tiptur wrote: > > — I tried building older netloc with hwloc 2.0 and it throws compiler errors. > Note that netloc was cloned from its git repo. My guess is that the "map" part that joins netloc's info about the fabric with hwloc's info about the nodes

Re: [hwloc-users] Netloc integration with hwloc

2018-04-03 Thread Brice Goglin
 : > Brice, > > We want to use both hwloc and netloc in mpich. What are our options here? > Move back to hwloc-1.x? That’d be a bummer because we already invested a lot > of effort to migrate to hwloc-2.x. > > — Pavan > > Sent from my iPhone > >> On A

Re: [hwloc-users] Netloc integration with hwloc

2018-04-03 Thread Brice Goglin
dded mode? > > >> On Mar 30, 2018, at 1:34 PM, Brice Goglin <brice.gog...@inria.fr> wrote: >> >> Hello >> >> In 2.0, netloc is still highly experimental. Hopefully, a large rework >> will be merged in git master next month for being released in hwloc 2.1. >

Re: [hwloc-users] Netloc integration with hwloc

2018-03-30 Thread Brice Goglin
Hello In 2.0, netloc is still highly experimental. Hopefully, a large rework will be merged in git master next month for being released in hwloc 2.1. Most of the API from the old standalone netloc was made private when integrated in hwloc because there wasn't any actual user. The API was quite

[hwloc-users] libhwloc soname change in 2.0.1rc1

2018-03-21 Thread Brice Goglin
Hello In case you missed the announce yesterday, hwloc 2.0.1rc1 changes the library soname from 12:0:0 to 15:0:0. On Linux, it means that we'll now build libhwloc.so.15 instead of libhwloc.so.12. That means any application built for hwloc 2.0.0 will need to be recompiled against 2.0.1. I should

Re: [hwloc-users] NUMA, io and miscellaneous object depths

2018-03-14 Thread Brice Goglin
processes to objects at the depth or above in Hydra previously. As >> you pointed out, the functionality makes no sense with NUMA/IO objects >> possibly being at different depths or for objects. >> >>> On Mar 14, 2018, at 3:00 PM, Brice Goglin <brice.gog...@inria.fr> wrot

Re: [hwloc-users] NUMA, io and miscellaneous object depths

2018-03-14 Thread Brice Goglin
Hello I can fix the documentation to say that the function always succeeds and returns the virtual depth for NUMA/IO/Misc. I don't understand your third sentence. If by "actual depth", you mean the depth of a (normal) parent where NUMA are attached (for instance the depth of Package if NUMAs are

Re: [hwloc-users] Machine nodes in hwloc topology

2018-02-05 Thread Brice Goglin
et me know. Brice On 05/02/2018 at 23:19, Madhu, Kavitha Tiptur wrote: > Hi > > Thanks for the response. Could you also confirm if hwloc topology > object would have only machine node? > > Thanks, > Kavitha > > > >> On Feb 5, 2018, at 4:14 PM, Brice Gogl

Re: [hwloc-users] Machine nodes in hwloc topology

2018-02-05 Thread Brice Goglin
Hello, Oops, sorry, this sentence is obsolete, I am removing it from the doc right now. We don't support the assembly of multiple machines in a single hwloc topology anymore. For the record, this feature was a very small corner case and it had important limitations (you couldn't bind things or

[hwloc-users] need help for testing new Mac OS support

2018-01-26 Thread Brice Goglin
Hello I need people running Mac OS to test some patches before releasing them in 2.0rc2 (which is likely delayed to Monday). Just build this tarball, run lstopo, and report any difference with older lstopo outputs:

Re: [hwloc-users] Puzzled by the number of cores on i5-7500

2018-01-25 Thread Brice Goglin
It looks like our Mac OS X backend doesn't properly handle processors that support hyperthreading without actually having hyperthreads enabled in hardware. Your processor has 4-core without HT but it's based on a processor with up to 8 cores and 16 threads. Our current code uses the latter and

Re: [hwloc-users] hwloc-2.0rc1 failure on Solaris

2018-01-25 Thread Brice Goglin
It is actually easy to fix, we just need to move hwloc's #include before what base64.c actually #include's. That'll be fixed in rc2 too. Brice On 25/01/2018 at 10:56, Brice Goglin wrote: > Like the error below? > > This code hasn't changed recently. Did you ever build with th

Re: [hwloc-users] hwloc-2.0rc1 failure on Solaris

2018-01-25 Thread Brice Goglin
Like the error below? This code hasn't changed recently. Did you ever build with these flags before? I am not sure I'll have time to fix yet another header craziness before rc2. Brice   CC   base64.lo In file included from

Re: [hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Brice Goglin
tps://github.com/pmodels/hwloc/commit/9bf3ff256511ea4092928438f5718904875e65e1 > > The first one is definitely not usable as-is, since that breaks standalone > builds. But I'm interested in hearing about any better solution that you > might have. > > Thanks, > > -- Pava

Re: [hwloc-users] OFED requirements for netloc

2018-01-24 Thread Brice Goglin
that seg faults, and 1.6.6 on the one that succeeds.  > And that the first looks to be the standard OFED release and the 1.6.6 > version a mellanox release of OFED. > > Craig. > > On Tue, 23 Jan 2018 at 17:10 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.

Re: [hwloc-users] Tags for pre-releases

2018-01-23 Thread Brice Goglin
Hello I didn't know you use submodule. I just pushed tag "hwloc-2.0.0rc1" and I'll try to remember pushing one for each future rc. If I don't, please remind me. I am not going to push all the previous ones because there are just too many of them. If you need some specific ones, please let me

Re: [hwloc-users] OFED requirements for netloc

2018-01-22 Thread Brice Goglin
Hello, If the output isn't too big, could you put the files gathered by netloc_ib_gather_raw online so that we can look at them and try to reproduce the crash? Thanks Brice On 23/01/2018 at 03:54, Craig West wrote: > Hi, > > I can't find the version requirements for netloc. I've tried it on an

Re: [hwloc-users] AMD EPYC topology

2017-12-29 Thread Brice Goglin
On 29/12/2017 at 23:15, Bill Broadley wrote: > > > Very interesting, I was running parallel finite element code and was seeing > great performance compared to Intel in most cases, but on larger runs it was > 20x > slower. This would explain it. > > Do you know which commit, or anything else

Re: [hwloc-users] AMD EPYC topology

2017-12-24 Thread Brice Goglin
Hello Make sure you use a very recent Linux kernel. There was a bug regarding L3 caches on 24-core Epyc processors which has been fixed in 4.14 and backported in 4.13.x (and maybe in distro kernels too). However, that would likely not cause huge performance difference unless your application

Re: [hwloc-users] How are processor groups under Windows reported?

2017-11-29 Thread Brice Goglin
Hello We only add hwloc Group objects when necessary. On your system, each processor group contains a single NUMA node, so these Groups would not really bring additional information about the hierarchy of resources. If you had a bigger system with, let's say, 4 NUMA nodes, with 2 of them in each

[hwloc-users] RFCs about latest API changes

2017-11-19 Thread Brice Goglin
/hwloc/pull/277 Make all depths *signed* ints https://github.com/open-mpi/hwloc/pull/276 Remove the "System" object type https://github.com/open-mpi/hwloc/pull/275 Move local_memory to NUMA node specific attrs https://github.com/open-mpi/hwloc/pull/274 Brice On 26/10/2017 17

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-13 Thread Brice Goglin
ce > > aha. thanks. I knew I'd seen a function for that, but couldn't remember what > it was. > > Cheers > > JB > ____ > From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of Brice > Goglin [brice.gog...@inria

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-13 Thread Brice Goglin
loc-users-boun...@lists.open-mpi.org] on behalf of > Samuel Thibault [samuel.thiba...@inria.fr] > Sent: 12 November 2017 10:48 > To: Hardware locality user list > Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset > > Brice Goglin, on Sun. 12 Nov. 2017 05:19:37

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-11 Thread Brice Goglin
On 12/11/2017 at 00:14, Biddiscombe, John A. wrote: > I'm allocating some large matrices, from 10k squared elements up to > 40k squared per node. > I'm also using membind to place pages of the matrix memory across numa > nodes so that the matrix might be bound according to the kind of > pattern

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] Dual socket AMD Epyc error

2017-10-28 Thread Brice Goglin
Hello, The Linux kernel reports incorrect L3 information. Unfortunately, your old kernel seems to already contain patches for supporting the L3 on this hardware. I found two candidate patches for further fixing this, one is in 4.10 (cleanup of the above patch) and the other will only be in 4.14. I

[hwloc-users] new memory model and API

2017-10-26 Thread Brice Goglin
Hello I finally merged the new memory model in master (mainly for properly supporting KNL-like heterogeneous memory). This was the main and last big change for hwloc 2.0. I still need to fix some caveats (and lstopo needs to better display NUMA nodes) but that part of the API should be ready.

Re: [hwloc-users] linkspeed in hwloc_obj_attr_u::hwloc_pcidev_attr_s struct while traversing topology

2017-10-13 Thread Brice Goglin
Hello On Linux, the PCI linkspeed requires root privileges unfortunately (except for the uplink above NVIDIA GPUs where we have another way to find it). The only way to workaround this is to dump the topology as XML as root and then reload it at runtime (e.g. with HWLOC_XMLFILE) :/ Brice Le

Re: [hwloc-users] Why do I get such little information back about GPU's on my system

2017-07-07 Thread Brice Goglin
On 07/07/2017 at 20:38, David Solt wrote: > We are using the hwloc api to identify GPUs on our cluster. While we > are able to "discover" the GPUs, other information about them does not > appear to be getting filled in. See below for example:
 > (gdb) p *obj->attr > $20 = { > cache = { >

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-30 Thread Brice Goglin
> L2 L#0 (2048KB) + L1i L#0 (64KB) > ... > > These nodes are the only one in our entire cluster to cause zombie > processes using torque/moab. I have a feeling that they are related. > We use hwloc/1.10.0. > > Not sure if this helps at all, but you are definitely not alone :)

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-30 Thread Brice Goglin
On 30/06/2017 at 22:08, fabricio wrote: > On 30-06-2017 16:21, Brice Goglin wrote: >> Yes, it's possible but very easy. Before we go that way: >> Can you also pass HWLOC_COMPONENTS_VERBOSE=1 in the environment and send >&g

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-28 Thread Brice Goglin
Hello We've seen this issue many times (it's specific to 12-core opterons), but I am surprised it still occurs with such a recent kernel. AMD was supposed to fix the kernel in early 2016 but I forgot checking whether something was actually pushed. Anyway, you can likely ignore the issue as

Re: [hwloc-users] ? Finding cache & pci info on SPARC/Solaris 11.3

2017-06-09 Thread Brice Goglin
Thanks a lot for the input. I opened https://github.com/open-mpi/hwloc/issues/243 I have access to a T5 but this will need investigation to actually find where to get the info from. Feel free to comment the issue if you find more. I am going to modify Pg.pm to better understand where Caches come

Re: [hwloc-users] ? Finding cache & pci info on SPARC/Solaris 11.3

2017-06-08 Thread Brice Goglin
On 08/06/2017 at 16:58, Samuel Thibault wrote: > Hello, > > Maureen Chew, on Thu. 08 June 2017 10:51:56 -0400, wrote: >> Should finding cache & pci info work? > AFAWK, there is no user-available way to get cache information on > Solaris, so it's not implemented in hwloc. And even if prtpicl

Re: [hwloc-users] NetLoc subnets Problem

2017-02-22 Thread Brice Goglin
options should I use to give > ./configure script information about Scotch? > > Best regards, > Mikhail > > 2017-02-20 11:50 GMT+03:00 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>>: > > Inside the tarball that you downloaded, there'

Re: [hwloc-users] NetLoc subnets Problem

2017-02-20 Thread Brice Goglin
n't > found any information about it in docs and readme > > > > 2017-02-19 20:52 GMT+03:00 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>>: > > The only publicly-installed netloc API is currently specific to > the scotch partitioner

Re: [hwloc-users] NetLoc subnets Problem

2017-02-19 Thread Brice Goglin
The only publicly-installed netloc API is currently specific to the scotch partitioner for process placement. It takes a network topology and a communication pattern between a set of process and it generates a topology-aware placement for these processes. This API only gets installed if you have

Re: [hwloc-users] NetLoc subnets Problem

2017-02-17 Thread Brice Goglin
at we need in the hwloc development snapshot (the I/O discovery changed significantly in hwloc 2.0). Brice On 17/02/2017 at 10:26, Михаил Халилов wrote: > I ran ibstat on head node it gives information in attach. > > 2017-02-17 12:16 GMT+03:00 Brice Goglin <brice.gog...@inria.fr > <

Re: [hwloc-users] NetLoc subnets Problem

2017-02-17 Thread Brice Goglin
files in attach. I run netloc_ib_gather_raw with these parameters > netloc_ib_gather_raw /home/halilov/mycluster-data/ > --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --verbose --sudo > > 2017-02-17 11:55 GMT+03:00 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>

Re: [hwloc-users] NetLoc subnets Problem

2017-02-17 Thread Brice Goglin
ов wrote: > I installed the nightly tarball, but it still isn't working. Attached is > info from ibnetdiscover and ibroute. Maybe it will help... > What could be the problem? > > Best regards, > Mikhail Khalilov > > 2017-02-17 9:53 GMT+03:00 Brice Goglin <brice.gog...@inr

Re: [hwloc-users] NetLoc subnets Problem

2017-02-16 Thread Brice Goglin
Hello As indicated on the netloc webpages, the netloc development now occurs inside the hwloc git tree. netloc v0.5 is obsolete even if hwloc 2.0 isn't released yet. If you want to use a development snapshot, take hwloc nightly tarballs from https://ci.inria.fr/hwloc/job/master-0-tarball/ or
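For reference, a nightly tarball builds with the usual autotools steps. This is a sketch: the exact tarball name changes with each snapshot, so `hwloc-master-*.tar.gz` below is a placeholder for whatever file you download from the Jenkins page above.

```shell
# After downloading a snapshot tarball from the Jenkins page above:
tar xzf hwloc-master-*.tar.gz        # placeholder name; match the file you downloaded
cd hwloc-master-*/

# Standard autotools build; any writable prefix works.
./configure --prefix=$HOME/opt/hwloc
make
make install
```

Running `$HOME/opt/hwloc/bin/lstopo` afterwards is a quick way to confirm the snapshot works on your machine.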

Re: [hwloc-users] CPUSET shading using xml output of lstopo

2017-02-03 Thread Brice Goglin
Le 03/02/2017 23:01, James Elliott a écrit : > On 2/3/17, Brice Goglin <brice.gog...@inria.fr> wrote: >> What do you mean with shaded? Red or green? Red means unavailable. >> Requires --whole-system everywhere. Green means that's where the >> process is bound. But XML do

Re: [hwloc-users] CPUSET shading using xml output of lstopo

2017-02-03 Thread Brice Goglin
Le 03/02/2017 21:57, James Elliott a écrit : > Brice, > > Thanks for your comments. I have worked with this some, but this is > not working. > > My goal is to generate images of the cpusets in use when I run a > parallel code using mpirun, aprun, srun, etc... The compute nodes > lack the mojo

Re: [hwloc-users] CPUSET shading using xml output of lstopo

2017-01-31 Thread Brice Goglin
do not shade/highlight the tasksets. > > I'll drop the args that are redundant and try the exact form you list. > > James > > On 1/31/2017 10:52 PM, Brice Goglin wrote: >> Le 01/02/2017 00:19, James Elliott a écrit : >>> Hi, >>> >>> I seem to be st
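A sketch of the workflow discussed in this thread (not necessarily what James ended up doing): export the topology to XML on the graphics-less compute node, then render the XML as an image on a workstation. Per the caveats above, red shading of unavailable resources requires `--whole-system` at both steps, and an XML export may not carry the current process binding (the green shading).

```shell
# On the compute node (no graphical support needed), inside the job:
lstopo --whole-system node0.xml

# On a workstation where lstopo was built with graphical output:
lstopo --whole-system --input node0.xml node0.png
```

lstopo picks the output format from the file extension, so the same command produces XML on the node and PNG on the workstation.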

Re: [hwloc-users] Building hwloc on Cray with /opt/cray/craype/2.5.4/bin/cc

2017-01-05 Thread Brice Goglin
ll > have the link complaining about recompiling with -fPIE and linking > with -pie, but I should be able to handle that) > > I tried all available cray cc (2.2.1 and 2.5.6) and they behave the same. > > I'll see how to report the bug to Cray and may ask for a new compiler > installation. >
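A hypothetical workaround while waiting on a Cray-side fix: pass the PIE flags explicitly through configure so that compilation and linking agree. Flag handling varies across craype versions, so treat this as a starting point rather than a known-good recipe.

```shell
# Force consistent PIE compilation and linking with the Cray wrapper:
./configure CC=cc CFLAGS="-fPIE" LDFLAGS="-pie"
make
```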

Re: [hwloc-users] Reporting an operating system warning

2017-01-03 Thread Brice Goglin
getting that to > work on the GPU, and that might still work on my current kernel (4.8), > even if I get warnings like the one reported. > > > johannes. > > 2017-01-03 15:15 GMT+09:00 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>>: > >

Re: [hwloc-users] Reporting an operating system warning

2017-01-02 Thread Brice Goglin
Hello Johannes I think there are two bugs here. First one is that each "dual-core compute unit" is reported as a single core with two hardware threads. That's a kernel bug that appeared in 4.6. There's a fix at https://lkml.org/lkml/2016/11/29/852 but I don't think it has been applied yet. The

Re: [hwloc-users] memory binding on Knights Landing

2016-09-08 Thread Brice Goglin
Hello It's not a feature. This should work fine. Random guess: do you have NUMA headers on your build machine? (package libnuma-dev or numactl-devel) (hwloc-info --support also reports whether membinding is supported or not) Brice Le 08/09/2016 16:34, Dave Love a écrit : > I'm somewhat confused

Re: [hwloc-users] Topology Error

2016-05-09 Thread Brice Goglin
Le 09/05/2016 23:58, Mehmet Belgin a écrit : > Greetings! > > We've been receiving this error for a while on our 64-core Interlagos > AMD machines: > > > > * hwloc has encountered what looks like an error from the

Re: [hwloc-users] hwloc_alloc_membind with HWLOC_MEMBIND_BYNODESET

2016-05-09 Thread Brice Goglin
Hello Hugo, Can you send your code and a description of the machine so that I can try to reproduce? By the way, BYNODESET is also available in 1.11.3. Brice Le 09/05/2016 16:18, Hugo Brunie a écrit : > Hello, > > When I try to use hwloc_alloc_membind with HWLOC_MEMBIND_BYNODESET > I obtain NULL

Re: [hwloc-users] HWLOC_get_membind: problem in getting right(specific) NODESET where data is allocated

2016-04-24 Thread Brice Goglin
0x7f352e515000 Variable:= bound to > nodeset 0x0004 which contains: > [2nd Q] node #2 (OS index 2) with 8471621632 bytes of memory > > in case of [3rd Q] > An error occurred, error no:= -1, and a segmentation fault happened. > > Thanks.! > > > On Sun, Apr 24, 2016 at 4:0

Re: [hwloc-users] HWLOC_get_membind: problem in getting right(specific) NODESET where data is allocated

2016-04-24 Thread Brice Goglin
mbind_nodeset(topology, size, cset_available, > HWLOC_MEMBIND_INTERLEAVE, HWLOC_MEMBIND_MIGRATE); > but I did get it working here as well. > > > *Can you please comment on this..? * > > Thank you very much in advance..!! > - Raju > > On Mon, Mar 21, 2
