Re: [hwloc-users] hwloc problem on SGI machine

2014-01-10 Thread Brice Goglin
Le 11/01/2014 00:27, Jeff Squyres (jsquyres) a écrit : > Jeff Becker (CC'ed) reported to me a failure with hwloc 1.7.2 (in OMPI > trunk). I had him verify this with a standalone hwloc 1.7.2, and then had > him try standalone hwloc 1.8 as well -- all got the same failure. > > Here's what he's see

Re: [hwloc-users] hwloc problem on SGI machine

2014-01-11 Thread Brice Goglin
Le 11/01/2014 01:58, Chris Samuel a écrit : > On Sat, 11 Jan 2014 11:54:17 AM Chris Samuel wrote: > >> We've got both an older Altix XE cluster and a UV10 (both running RHEL) I >> can test on if it's useful? > Forgot I already had both 1.7.2 and 1.8 built for both - all fine (RHEL6.4). > This was

Re: [hwloc-users] hwloc errors on program startup

2014-01-17 Thread Brice Goglin
Hello, Linux says socket 0 contains processors 0-7 and socket 1 contains 8-15, while NUMA node 0 contains processors 0-3+8-11 and NUMA node 1 contains processors 4-7+12-15. Given what I read about Opteron 6320 online, the problem is that NUMA 0 should be replaced with two NUMA nodes with processors
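The inconsistency described above can be checked with plain bitmasks. This is a minimal sketch, not hwloc's actual detection code: the `proc_mask` type and both helper names are made up for illustration, and it assumes at most 64 processors. Two sets conflict when they overlap but neither contains the other, so they cannot be nested in a tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per logical processor: bit i set means processor i is in the set.
 * Hypothetical type for this sketch, not an hwloc type. */
typedef uint64_t proc_mask;

/* Build a mask covering processors first..last inclusive (last < 64). */
proc_mask mask_range(unsigned first, unsigned last)
{
    assert(first <= last && last < 64);
    unsigned width = last - first + 1;
    proc_mask bits = (width == 64) ? ~(proc_mask)0
                                   : (((proc_mask)1 << width) - 1);
    return bits << first;
}

/* True if the two sets overlap but neither one contains the other. */
bool masks_conflict(proc_mask a, proc_mask b)
{
    proc_mask common = a & b;
    return common != 0 && common != a && common != b;
}
```

With the values from the message, socket 0 is `mask_range(0, 7)` and NUMA node 0 is `mask_range(0, 3) | mask_range(8, 11)`; `masks_conflict()` returns true for that pair, and false for socket 0 against socket 1.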

Re: [hwloc-users] How to build hwloc static to link into a shared lib on Linux

2014-01-18 Thread Brice Goglin
Maybe try to disable some dependencies such as pci in hwloc (--disable-pci), I wouldn't be surprised if there were issues there. If that helps, please let us know what was enabled before (libpciaccess (default), or libpci/pciutils (--enable-libpci)). Brice Le 18/01/2014 07:23, Robin Scher a écr

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
Hello, The CPUModel attribute should be only in Socket or machine/root objects. At least, that's what I documented and what I seem to see in the code. Did you actually see any other place? So it may just mean that the CPUModel is not available on your machine? Or maybe the code below is buggy som

[hwloc-users] CPU info on ARM

2014-01-28 Thread Brice Goglin
Hello, Is anybody familiar with ARM CPUs? I am adding more CPU information because Intel needs more: CPUVendor=GenuineIntel CPUModel=Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz CPUModelNumber=45 CPUFamilyNumber=6 Would something similar be useful for ARM? What are the fields below from /proc/cpuinf

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
Le 28/01/2014 09:46, Robin Scher a écrit : > Hi, thanks for responding. > > The CPUModel is definitely available on this machine. A 32 bit process > on the same machine correctly finds the model name using code that > calls the cpuid inline assembly to get it, and the machine itself is a > VM runn

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
Le 28/01/2014 09:57, Brice Goglin a écrit : > I will debug a bit more to see if it's actually a 64bit cpuid problem > on windows. The x86 backend is entirely disabled in the 64bit windows build because configure fails to compile the cpuid assembly (in my mingw64 with gcc 4.7). It

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
Le 28/01/2014 13:00, Samuel Thibault a écrit : > Brice Goglin, le Tue 28 Jan 2014 12:46:24 +0100, a écrit : >> 42: xchg %ebx,%rbx >> >> I guess having both ebx and rbx on these lines isn't OK. On Linux, I get >> rsi instead of ebx, no problem. >> >> S

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
Le 28/01/2014 14:31, Brice Goglin a écrit : > Le 28/01/2014 13:00, Samuel Thibault a écrit : >> Brice Goglin, le Tue 28 Jan 2014 12:46:24 +0100, a écrit : >>> 42: xchg %ebx,%rbx >>> >>> I guess having both ebx and rbx on these lines isn't OK. On Linux,

Re: [hwloc-users] CPU info on ARM

2014-01-28 Thread Brice Goglin
models executing in the same SMP system)." >> >> He passed the question on to another ARM guy, asking for further detail. >> I'll pass on what he says. >> >> >> >> On Jan 28, 2014, at 3:39 AM, Brice Goglin wrote: >> >>> Hello,

Re: [hwloc-users] Finding closest host bridge

2014-01-28 Thread Brice Goglin
The bridge cannot be "not connected to anything". All objects have a parent (and are a child of that parent) except the very-top root object. Theoretically, the bridge could be connected anywhere. In practice it's connected to a NUMA node, a root object, or (rarely) a group of numa nodes. The prob
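Since every object except the root has a parent, the closest non-I/O object above a bridge can be found by walking up the parent links. The sketch below uses a toy tree, not hwloc's own types: `toy_obj`, `toy_kind` and the function names are hypothetical. In a real program, hwloc's `hwloc_get_non_io_ancestor_obj()` helper is probably what you want.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified object kinds; not hwloc_obj_type_t. */
enum toy_kind { TOY_MACHINE, TOY_NUMANODE, TOY_BRIDGE, TOY_PCI_DEVICE };

struct toy_obj {
    enum toy_kind kind;
    struct toy_obj *parent;   /* NULL only for the root object */
};

/* Bridges and PCI devices are the I/O objects of this toy model. */
int toy_is_io(const struct toy_obj *obj)
{
    return obj->kind == TOY_BRIDGE || obj->kind == TOY_PCI_DEVICE;
}

/* Walk up the parent links until a non-I/O object is reached. */
const struct toy_obj *toy_non_io_ancestor(const struct toy_obj *obj)
{
    assert(obj != NULL);
    while (obj != NULL && toy_is_io(obj))
        obj = obj->parent;
    return obj;
}
```

For a device below a PCI bridge below a host bridge attached to a NUMA node, the walk stops at the NUMA node. If the host bridge hangs directly off the root, it stops at the root.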

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-29 Thread Brice Goglin
en-mpi.org/community/lists/hwloc-devel/2014/01/4043.php Le 29/01/2014 06:50, Robin Scher a écrit : > Hi Brice > > This works great now. Thank you for your help! > -robin > > Robin Scher > ro...@uberware.net > +1 (213) 448-0443 > > > > On Jan 28, 2014, at 7:4

Re: [hwloc-users] misleading cache size on AMD Opteron 6348?

2014-01-31 Thread Brice Goglin
Hello, Your BIOS reports invalid L3 cache information. On these processors, each L3 is shared by 6 cores: it covers the 6 cores of an entire half-socket NUMA node. But the BIOS says that some L3s are shared by 4 cores, others by 6 cores. And worse, it says that some L3 is shared by some cores from a

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-05 Thread Brice Goglin
Hello Brock, Some people reported the same issue in the past and that's why we added the "nvml" objects. CUDA reorders devices by "performance". Batch-schedulers are somehow supposed to use "nvml" for managing GPUs without actually using them with CUDA directly. And the "nvml" order is the "normal

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-06 Thread Brice Goglin
r following the PCI bus order? We may want to talk to NVIDIA to get a clarification about all this. Brice > > Brock Palen > www.umich.edu/~brockp > CAEN Advanced Computing > XSEDE Campus Champion > bro...@umich.edu > (734)936-1985 > > > > On Feb 5, 2014, at 1:19 A

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-07 Thread Brice Goglin
GPU L#3 "nvml2" > GPU L#5 "nvml3" > GPU L#7 "nvml0" > GPU L#9 "nvml1" > > Is the L# always going to be in the order I would expect? Because then I > already have my map then. Brice > > Brock P

Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.8.1 released

2014-02-13 Thread Brice Goglin
Le 13/02/2014 22:25, Jiri Hladky a écrit : > Hi Brice, > > when compiling hwloc-1.8.1 I have seen these warnings. Could you > please check them? fread() warnings come from fread() on kernel sysfs files, so it's very unlikely that we read totally buggy data from there. One day we'll fix this, maybe

Re: [hwloc-users] BGQ question.

2014-03-25 Thread Brice Goglin
Le 25/03/2014 07:51, Biddiscombe, John A. a écrit : > > I'm compiling hwloc using clang (bgclang++11 from ANL) to run on IO > nodes of a BGQ. It seems to have compiled ok, and when I run lstopo, I > get an output like this (below), which looks reasonable, but there are > 15 sockets instead of 16. I

Re: [hwloc-users] BGQ question.

2014-03-25 Thread Brice Goglin
x=/gpfs/bbp.cscs.ch/home/biddisco/apps/clang/hwloc-1.8.1 > >should I rerun with something set? > >Thanks > >JB > > >From: hwloc-users [mailto:hwloc-users-boun...@open-mpi.org] On Behalf >Of Brice Goglin >Sent: 25 March 2014 08:04 >To: Hardware locality user list

Re: [hwloc-users] BGQ question.

2014-03-25 Thread Brice Goglin
ere we > are trying to customise the IO. > > > > JB > > > > *From:*Brice Goglin [mailto:brice.gog...@inria.fr] > *Sent:* 25 March 2014 08:43 > *To:* Hardware locality user list; Biddiscombe, John A. > *Subject:* Re: [hwloc-users] BGQ question. > >

Re: [hwloc-users] BGQ question.

2014-03-26 Thread Brice Goglin
Le 26/03/2014 01:00, Christopher Samuel a écrit : > On 26/03/14 01:34, Biddiscombe, John A. wrote: > > > If I compile on the login node, but run lstopo on the ION, I get > > this (wrong, below) > > If you build this with GCC (the standard system one, not the > cross-compiler for BGQ) does it still

Re: [hwloc-users] distributing across cores with hwloc-distrib

2014-03-30 Thread Brice Goglin
Hello, This is the main corner case of hwloc-distrib. It can return objects only, not groups of objects. The distrib algorithm is: 1) start at the root, where there are M children, and you have to distribute N processes 2) if there are no children, or if N is 1, return the entire object 3) split

Re: [hwloc-users] distributing across cores with hwloc-distrib

2014-03-30 Thread Brice Goglin
hat this is considered a corner case. Could you > please consider fixing this? > > Thanks, > Tim > > Brice Goglin wrote: >> Hello, >> >> This is the main corner case of hwloc-distrib. It can return objects >> only, not groups of objects. The distrib algorit

Re: [hwloc-users] distributing across cores with hwloc-distrib

2014-03-30 Thread Brice Goglin
> > On Sun, Mar 30, 2014 at 05:32:38PM +0200, Brice Goglin wrote: >> Don't worry, binding multithreaded processes is not a corner case. I was >> rather talking about the general "distributing fewer processes than there >> are objects and returning cpusets as large as po

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.9 released

2014-04-01 Thread Brice Goglin
Le 01/04/2014 10:43, Jiri Hladky a écrit : > Hi Brice, > > I see some compiler warnings when building rpm package for Fedora: > > topology-windows.c: In function 'hwloc_win_get_VirtualAllocExNumaProc': > topology-windows.c:338:30: warning: assignment from incompatible > pointer type [enabled by def

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.9 released

2014-04-01 Thread Brice Goglin
oc-gui package) is still much > lower compared to lstopo-no-graphics > B) Compile it without libXNVCtrl but it will reduce the functionality. > > Is there any 3rd option? I guess not. It seems like A) is the best > choice for Fedora. > > Any ideas on that? > > Thanks!

Re: [hwloc-users] misleading cache size on AMD Opteron 6348?

2014-04-01 Thread Brice Goglin
has latest version. If I should check some BIOS information, > I have access to hardware. Tell me what variables from SMBIOS you want > to see? > > > On Fri, Jan 31, 2014 at 1:07 PM, Brice Goglin <mailto:brice.gog...@inria.fr>> wrote: > > Hello, > > Your BI

Re: [hwloc-users] problem with open mpi

2014-04-16 Thread Brice Goglin
Hello, This list is for hwloc users (hwloc is an Open MPI subproject). You likely want Open MPI users instead: us...@open-mpi.org Brice Le 16/04/2014 18:44, flavienne sayou a écrit : > Hello, > I am Flavienne and I am a master student. > I wrote a script which have to backup sequentials applicatio

Re: [hwloc-users] divide by zero error?

2014-04-29 Thread Brice Goglin
Please run "hwloc-gather-topology simics" and send the resulting simics.tar.bz2 that it will create. However, I assume that the simulator returns buggy x86 cpuid information, so we'll see if we want/can easily workaround the bug or just let simics developers fix it. Brice Le 29/04/2014 01:15, Fri

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
Aside from the BIOS config, are you sure that you have the exact same BIOS *version* in each node? (you can check in /sys/class/dmi/id/bios_*) Same Linux kernel too? Also, recently we've seen somebody fix such problems by unplugging and replugging some CPUs on the motherboard. Seems crazy but it happene
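To compare BIOS versions across nodes, the sysfs files mentioned above can be read with a few lines of C. This is only a sketch, assuming a Linux system that exposes `bios_vendor`, `bios_version` and `bios_date` under /sys/class/dmi/id/. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of a text file into buf, without the trailing
 * newline. Returns 0 on success, -1 if the file cannot be read. */
int read_first_line(const char *path, char *buf, size_t size)
{
    assert(path != NULL && buf != NULL && size > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)size, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */
    return 0;
}

/* Print the BIOS identification files, one per line. */
void print_bios_info(void)
{
    static const char *const files[] = {
        "/sys/class/dmi/id/bios_vendor",
        "/sys/class/dmi/id/bios_version",
        "/sys/class/dmi/id/bios_date",
    };
    char line[256];
    for (size_t i = 0; i < sizeof files / sizeof files[0]; i++) {
        if (read_first_line(files[i], line, sizeof line) == 0)
            printf("%s: %s\n", files[i], line);
        else
            printf("%s: (unreadable)\n", files[i]);
    }
}
```

Running `print_bios_info()` on each node and diffing the output should make a version mismatch obvious.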

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
> Thanks much, > > Craig > > > On Wednesday, May 28, 2014 1:39 PM, Brice Goglin > wrote: > > > Aside from the BIOS config, are you sure that you have the exact same > BIOS *version* in each node? (can check in /sys/class/dmi/id/bios_*) > Same Linux kernel too?

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
Le 28/05/2014 14:57, Craig Kapfer a écrit : > > > Hmm ... the slurm config defines that all nodes have 4 sockets with 16 > cores per socket (which corresponds to the hardware--all nodes are the > same). Slurm node config is as follows: > > NodeName=n[001-008] RealMemory=258452 Sockets=4 CoresPerS

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
Le 28/05/2014 15:46, Craig Kapfer a écrit : > Wait, I'm sorry, I must be missing something, please bear with me! > > By the way, your discussion of groups 1 and 2 below is wrong. > Group 2 doesn't say that NUMA node == socket, and it doesn't > report 8 sockets of 8 cores each. It report

Re: [hwloc-users] divide by zero error?

2014-06-08 Thread Brice Goglin
> > Thanks, > > Andrew > >> -Original Message- >> From: Brice Goglin [mailto:brice.gog...@inria.fr] >> Sent: Monday, May 5, 2014 1:03 PM >> To: Friedley, Andrew >> Subject: Re: [hwloc-users] divide by zero error? >> >> Thanks. >

Re: [hwloc-users] misleading cache size on AMD Opteron 6348?

2014-06-11 Thread Brice Goglin
amples of 6348 (all characteristics are same). > > > On Tue, Apr 1, 2014 at 6:59 PM, Yury Vorobyov <mailto:teupol...@gmail.com>> wrote: > > The BIOS has latest version. If I should check some BIOS > information, I have access to hardware. Tell me wh

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-09 Thread Brice Goglin
Hello, A quick look at the Open MPI source code suggests that it's manipulating XML topologies in these lines. Please go into your hwloc-1.9 build directory, and run "tests/xmlbuffer" (you may have to build it first by running "make xmlbuffer -C tests"). If it works, try running "make check". Also

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-09 Thread Brice Goglin
Le 09/07/2014 23:30, Nick Papior Andersen a écrit : > Dear Brice > > Here are my findings (apologies for not doing make check on before-hand!) > > 2014-07-09 20:42 GMT+00:00 Brice Goglin <mailto:brice.gog...@inria.fr>>: > > Hello, > > A quick look in Op

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-11 Thread Brice Goglin
4 23:42, Nick Papior Andersen a écrit : > Dear Brice > > > 2014-07-09 21:34 GMT+00:00 Brice Goglin <mailto:brice.gog...@inria.fr>>: > > Le 09/07/2014 23:30, Nick Papior Andersen a écrit : >> Dear Brice >> >> Here are my findings (apologies for not

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-12 Thread Brice Goglin
This commit should fix it. https://github.com/open-mpi/hwloc/commit/f46c983df58a41ec8f994f30f57154bd78392de8.patch Brice Le 09/07/2014 23:42, Nick Papior Andersen a écrit : > Dear Brice > > > 2014-07-09 21:34 GMT+00:00 Brice Goglin <mailto:brice.gog...@inria.fr>>: >

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] hwloc error

2014-08-15 Thread Brice Goglin
Hello, Your platform reports buggy L3 cache locality information. This is very common on AMD 62xx and 63xx platforms unfortunately. You have 8 L3 caches (one per 6-core NUMA node, two per socket), but the platform reports 11 L3 caches instead: Sockets 1, 2 and 4 report one L3 above 2 cores, one L3

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] hwloc error

2014-08-15 Thread Brice Goglin
Le 15/08/2014 14:59, Andrej Prsa a écrit : > Hi Brice, > >> Your kernel looks recent enough, can you try upgrading your BIOS ? You >> have version 3.0b and there's a 3.5 version at >> http://www.supermicro.com/aplus/motherboard/opteron6000/sr56x0/h8qg6-f.cfm > Flashing bios is not the easiest optio

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE]Re: hwloc error

2014-08-17 Thread Brice Goglin
Le 16/08/2014 18:37, Andrej Prsa a écrit : > Hi Brice, > >> Your kernel looks recent enough, can you try upgrading your BIOS ? You >> have version 3.0b and there's a 3.5 version at >> http://www.supermicro.com/aplus/motherboard/opteron6000/sr56x0/h8qg6-f.cfm > For completeness, I just tried updatin

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin
Le 19/08/2014 18:38, Aulwes, Rob a écrit : > Hi, > > I'm trying to write a custom C++ allocator that wraps hwloc calls. > I've tried using various hwloc_alloc* functions to set the memory > bindings, but when I call hwloc_get_area_membind_nodeset to verify, I > don't get the same policy I passed t

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin
* sizeof (T)); > hwloc_set_area_membind_nodeset(_topo, p, cnt * sizeof (T), > > mem_nodeset, HWLOC_MEMBIND_NEXTTOUCH, 0); > > where > > mem_nodeset = hwloc_topology_get_complete_nodeset(_topo); > > Thanks,Rob > > From: Brice Goglin

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin
ould like to try 'replicate'. > > From: Brice Goglin mailto:brice.gog...@inria.fr>> > Reply-To: Hardware locality user list <mailto:hwloc-us...@open-mpi.org>> > Date: Tue, 19 Aug 2014 18:55:57 +0200 > To: Hardware locality user list <mailto:hwloc-us...

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin
any doc? > > Thanks for the help! Rob > > From: Brice Goglin mailto:brice.gog...@inria.fr>> > Reply-To: Hardware locality user list <mailto:hwloc-us...@open-mpi.org>> > Date: Tue, 19 Aug 2014 19:03:56 +0200 > To: Hardware locality user list <mailto:hw

Re: [hwloc-users] setting memory bindings

2014-09-02 Thread Brice Goglin
thout the STRICT flag. And I'll see if I add a good example somewhere. Brice Le 19/08/2014 19:00, Aulwes, Rob a écrit : > nope, no error. is there a way to find out what policies are > supported? I would like to try 'replicate'. > > From: Brice Goglin mailto:brice.gog.

Re: [hwloc-users] setting memory bindings

2014-09-04 Thread Brice Goglin
I added a new doc/examples/ repository to better show how to use bitmaps, cpu and memory binding etc. https://github.com/open-mpi/hwloc/tree/master/doc/examples If you see anything missing, don't hesitate to ask. Brice Le 19/08/2014 19:10, Aulwes, Rob a écrit : > ok, in the meantime, is th

Re: [hwloc-users] hwloc error with "node interleaving" disabled

2014-09-05 Thread Brice Goglin
Hello You sent the test.output file instead of test.tar.bz2 so I can't check for sure. Anyway I guess this is yet another buggy AMD platform with Magny-Cours/Interlagos/Abu Dhabi Opterons (61xx, 62xx or 63xx). Sometimes upgrading the BIOS/kernel helps. Sometimes not. Some L3 caches will be missi

Re: [hwloc-users] hwloc error with "node interleaving" disabled

2014-09-05 Thread Brice Goglin
Don't be sorry, I used "yet another" to complain about all these buggy AMD platforms, and not to complain about their owners ;) Bug reports are always welcome, that's why the big warning says you should report it. Also these warnings vary a little bit with the platform and processor model so i

Re: [hwloc-users] setting memory bindings

2014-09-15 Thread Brice Goglin
t_numanode_obj_by_os_index? > > Thanks,Rob > > > *From:* hwloc-users [hwloc-users-boun...@open-mpi.org] on behalf of > Brice Goglin [brice.gog...@inria.fr] > *Sent:* Thursday, September 04, 2014 6:25 AM > *To:* hwloc-us...@open-mpi.org > *Subject:* Re: [hwloc-users] setting

Re: [hwloc-users] problem with X11 using Solaris

2014-09-17 Thread Brice Goglin
Can you send the output of configure, the generated config.log and your unmodified Xutil.h? My solaris/openindiana doesn't have that problem. thanks Brice Le 16/09/2014 14:43, Siegmar Gross a écrit : > Hi, > > today I installed hwloc-1.9.1 on my machines (Solaris 10 Sparc (tyr), > Solaris 10 x86

Re: [hwloc-users] more detailed errors

2014-09-17 Thread Brice Goglin
What is errno after load() fails? Brice On 17 September 2014 17:43:13 UTC+02:00, "Aulwes, Rob" wrote: >Hi, > >A call to hwloc_topology_load is failing, but all that is returned is >-1. Are there error reporting routines that can be called to get more >details about the error? The doc for hwlo

Re: [hwloc-users] more detailed errors

2014-09-17 Thread Brice Goglin
$ errno 24 EMFILE 24 Too many open files Ohoh that's a new one :) Can you do a strace of the program and send the output? If the file is big, you can send it to me in a private mail. Brice Le 17/09/2014 18:14, Aulwes, Rob a écrit : > ERRNO = 24. > > From: Brice Goglin ma
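Inside the program, the same lookup can be done with strerror(). The helper below is a sketch with a hypothetical name. It assumes errno was saved immediately after the failing call, and whether errno is meaningful after a failed hwloc_topology_load() depends on which underlying call failed.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format "<what> failed: <message> (errno N)" into buf and return buf.
 * err should be a copy of errno taken right after the failing call,
 * before any other library call can overwrite it. */
char *describe_failure(const char *what, int err, char *buf, size_t size)
{
    assert(what != NULL && buf != NULL && size > 0);
    snprintf(buf, size, "%s failed: %s (errno %d)", what, strerror(err), err);
    return buf;
}
```

A typical use would be `int err = errno;` on the line right after the call that returned -1, followed by `describe_failure("hwloc_topology_load", err, buf, sizeof buf)`.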

Re: [hwloc-users] problem with X11 using Solaris

2014-09-18 Thread Brice Goglin
Thanks, I just pushed a fix. Can you verify that this tarball enables X automatically and properly? https://ci.inria.fr/hwloc/job/master-0-tarball/lastSuccessfulBuild/artifact/hwloc-master-20140918.1131.git005a7e8.tar.gz I am looking at the warnings and make check failures you sent. Brice Le 1

Re: [hwloc-users] hwloc-ls graphical output

2014-09-24 Thread Brice Goglin
Hello Are there any graphical formats in lstopo -h ? If so maybe Cairo can export to png etc but it can't draw an X11 window? Check whether X11/Xlib.h and X11/Xutil.h are available. Brice On 24 September 2014 18:08:31 UTC+02:00, Dennis Jacobfeuerborn wrote: >Hi, >I just compiled hwloc for Cen

Re: [hwloc-users] hwloc-ls graphical output

2014-09-25 Thread Brice Goglin
Le 25/09/2014 02:22, Dennis Jacobfeuerborn a écrit : > So I just recompiled again but using version 1.4.3 and the graphical > output options reappeared. I also tried version 1.5.2 and this version > will not show the graphical output options anymore so it seems something > has changed between 1.4 a

Re: [hwloc-users] binding to thread

2014-09-29 Thread Brice Goglin
Le 29/09/2014 19:01, Aulwes, Rob a écrit : > Hi, > > I'm trying to allocate and bind memory on the same NUMA domain as the > calling thread. The code I use is as follows. > > /* retrieve the single PU where the current thread actually > runs within this process binding */ > > > i

Re: [hwloc-users] Processor numbering in Ivy-bridge

2014-09-29 Thread Brice Goglin
Yes. Most of locality info comes from /sys/... on Linux. Brice Le 29/09/2014 22:59, Vishwanath Venkatesan a écrit : > Thanks for the quick response, yes lstopo -l does make the numbers > contiguous. > Another question I had was, how does hwloc populate the information > that certain cpus share a p

Re: [hwloc-users] hwloc-ls graphical output

2014-10-01 Thread Brice Goglin
Dennis, Did you have an opinion about this? I am going to release the final hwloc v1.10 soon. So if there's something to fix, I'd rather know it quickly. thanks Brice Le 25/09/2014 07:47, Brice Goglin a écrit : > Le 25/09/2014 02:22, Dennis Jacobfeuerborn a écrit : >> So I ju

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-08 Thread Brice Goglin
Le 08/10/2014 01:52, Jiri Hladky a écrit : > 2) I have also some trouble with symlinks. The trouble is this: > > * when installed with ./configure && make && make install > then hwloc-ls is symlink to lstopo-no-graphics and man pages > { lstopo-no-graphics.1, hwloc-ls.1 } are symlinks to

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-08 Thread Brice Goglin
Le 08/10/2014 01:52, Jiri Hladky a écrit : > Hi Brice, > > glad to see the new version is out! :-) > > I have bumped into couple of minor problems when building new RPM for > Fedora: > > 1) desktop file > desktop-file-validate hwloc-ls.desktop.back > hwloc-ls.desktop.back: error: file contains key

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-09 Thread Brice Goglin
Le 09/10/2014 00:55, Jiri Hladky a écrit : > > * if building without cairo/X11 support, lstopo and lstopo.1 are > symlinks. Packagers can choose to ignore lstopo and lstopo.1. > lstopo.desktop isn't installed. > > > Could you please make (in the next version) > lstopo-no-graphics.1 > a

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-09 Thread Brice Goglin
Le 09/10/2014 00:49, Jiri Hladky a écrit : > Hi Brice, > > this sounds perfectly reasonable to me. I will make the arrangements > on packing side. > > Perhaps you could add this in README file? > The README file is autogenerated from the huge doxygen text, which is really for users, not for packa

[hwloc-users] engineer position on hwloc+netloc

2014-10-27 Thread Brice Goglin
Hello, There's an R&D engineer position opening in my research team at Inria Bordeaux (France) for developing hwloc and netloc software. All details available at http://runtime.bordeaux.inria.fr/goglin/201410-Engineer-hwloc+netloc.en.pdf or French version http://runtime.bordeaux.inria.fr/goglin/

[hwloc-users] engineer position on hwloc+netloc

2014-10-30 Thread Brice Goglin
Hello, There's an R&D engineer position opening in my research team at Inria Bordeaux (France) for developing hwloc and netloc software (both Open MPI subprojects). All details available at http://runtime.bordeaux.inria.fr/goglin/201410-Engineer-hwloc+netloc.en.pdf or French version http://runt

Re: [hwloc-users] lstopo error

2014-11-18 Thread Brice Goglin
Le 18/11/2014 14:46, Diego Regueira a écrit : > Hi, I'm getting an error from the lstopo command. > Please, check the attachments. > > Thanks Hello, It's a very common problem on AMD platforms unfortunately. http://www.open-mpi.org/projects/hwloc/doc/v1.10.0/a00028.php#faq_os_error In your case,

Re: [hwloc-users] hwloc - "symbol already defined" error building with optimizations (-O3) on 32bit ubuntu

2014-11-20 Thread Brice Goglin
Hello, Thanks, I can reproduce the problem on Debian with -O3 -m32. The issue is that -O3 makes gcc inline more. We have function A call B multiple times, and B calls C which contains asm with a label. So in the end A contains the asm label from C multiple times. Google says we should use local lab

Re: [hwloc-users] hwloc - "symbol already defined" error building with optimizations (-O3) on 32bit ubuntu

2014-11-21 Thread Brice Goglin
mas.vando...@gmail.com> > > > On Wed, Nov 19, 2014 at 10:42 PM, Brice Goglin <mailto:brice.gog...@inria.fr>> wrote: > > Hello, > Thanks, I can reproduce the problem on Debian with -O3 -m32. > The issue is that -O3 makes gcc inline more. We have functio

Re: [hwloc-users] hwloc - "symbol already defined" error building with optimizations (-O3) on 32bit ubuntu

2014-11-21 Thread Brice Goglin
> Makefile:615: recipe for target 'check-recursive' failed > make: *** [check-recursive] Error 1 > > I attached the output of all of the steps and the logs. Let me know if > you need something else. > > Thanks! > > Thomas Van Doren > thomas.vando...@gmail.com <mailto:thoma

Re: [hwloc-users] hwloc - "symbol already defined" error building with optimizations (-O3) on 32bit ubuntu

2014-11-25 Thread Brice Goglin
Le 21/11/2014 01:57, Thomas Van Doren a écrit : > Hi Brice > > Thank you for the quick response! That patch fixes the build issue and > hwloc works as expected (make check has 1 failure on 32bit, but that > also happens on master so I didn't worry about it). This was an overzealous assertion in th

Re: [hwloc-users] Selecting real cores vs HT cores

2014-12-11 Thread Brice Goglin
Le 11/12/2014 21:51, Brock Palen a écrit : > When a system has HT enabled is one core presented the real one and one the > fake partner? Or is that not the case? > > If wanting to test behavior without messing with the bios how do I select > just the 'real cores' if this is the case? > > I a

[hwloc-users] wrong os_index on AIX -> please test

2014-12-17 Thread Brice Goglin
Hello I am seeing assert failures on AIX 6.1 because our PU os_index is off by one. They go from -1 to 62 instead of 0 to 63. We have a comment saying /* It seems logical processors are numbered from 1 here, while the * bindprocessor functions numbers them from 0... */ This contradicts

Re: [hwloc-users] HWLoc error mesg

2014-12-20 Thread Brice Goglin
Hello, As explained in another mail, this is yet another case of buggy AMD L3 cache information reported by the hardware. The only way to *fix* this is to tell your machine vendor to fix the L3 cache information. The only thing we can do is remove the hwloc warning (if you don't care about cache or NUMA aff

Re: [hwloc-users] Hwloc on windows does not show pci devices

2015-01-06 Thread Brice Goglin
Hello We don't have PCI support on Windows unfortunately. And on non-Linux platforms, you would have PCI devices without their locality, not really useful. The hwloc I/O doc says: "Note that I/O discovery requires significant help from the operating system. The pciaccess library (the development

Re: [hwloc-users] PCI devices topology

2015-01-08 Thread Brice Goglin
Hello, hwloc_topology_init(&topology); hwloc_topology_set_flags(topology, HWLOC_TOPOLOGY_FLAG_IO_DEVICES); hwloc_topology_load(topology); Then you can use hwloc_get_next_pcidev() to iterate over the entire list of PCI devices. If you want to know whether it's connected to a specific NUMA node, start

Re: [hwloc-users] PCI devices topology

2015-01-09 Thread Brice Goglin
t's enough for "comparing distances". Brice Le 09/01/2015 10:30, Pradeep Kiruvale a écrit : > Hi Brice, > > Thanks for the reply. Is it possible to get the distance matrix for > each cpu and the pci device from these hwloc apis? > > Regards, > Pradee

Re: [hwloc-users] hwloc error when starting slurmd on 48 core FreeBSD 10.1 system

2015-01-17 Thread Brice Goglin
Hello This is a widespread problem with AMD machines. Buggy platform reporting invalid L3 cache information in this case. Upgrading the BIOS may help. Anyway, I guess Slurm doesn't care much about L3 cache affinity, so you can ignore the error by setting HWLOC_HIDE_ERRORS=1 in the environment. More

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] hwloc-gather-topology

2015-02-17 Thread Brice Goglin
Hello This is a widespread problem with AMD machines. Buggy platforms reporting invalid L3 cache information in this case. Upgrading the BIOS may help. If your program doesn't care about cache affinity, you can hide/ignore the message by setting HWLOC_HIDE_ERRORS=1 in the environment. More detail
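If changing the job's environment is inconvenient, the variable can also be set from inside the program. This is a sketch under two assumptions: hwloc reads HWLOC_HIDE_ERRORS when the topology is loaded, so the call has to happen before that, and setenv() is available (POSIX). The function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#include <assert.h>
#include <stdlib.h>

/* Set HWLOC_HIDE_ERRORS=1 for the current process.
 * Call this before the hwloc topology is loaded.
 * Returns 0 on success, -1 on failure (same convention as setenv). */
int hide_hwloc_errors(void)
{
    /* Last argument 1: overwrite any value already in the environment. */
    int rc = setenv("HWLOC_HIDE_ERRORS", "1", 1);
    if (rc == 0)
        assert(getenv("HWLOC_HIDE_ERRORS") != NULL);
    return rc;
}
```

This only hides the message. The topology that hwloc builds from the buggy information is unchanged.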

Re: [hwloc-users] hwloc has encountered what looks like an error from the operating system

2015-02-23 Thread Brice Goglin
Hello, This is yet another example of buggy AMD topology information unfortunately. See http://www.open-mpi.org/projects/hwloc/doc/v1.10.1/a00028.php#faq_os_error In your case, NUMA and processor package/socket information are conflicting because NUMA information is buggy. Upgrading the BIOS may

Re: [hwloc-users] lstopo on Kaveri

2015-03-27 Thread Brice Goglin
Hello, That's an interesting question: Even if the GPU is physically-located inside the die, it is exposed as a "virtual" PCI device (vendor number 1002 and model number 130f), and that's how we detect it, and that's how the driver configures it. Many components of the CPU die are configured throu

Re: [hwloc-users] linking libcudart and libnvml only to the plugins

2015-06-02 Thread Brice Goglin
Le 02/06/2015 23:27, Fabricio Cannini a écrit : > Hello there > > Is there a way to link 'libcudart.so' and 'libnvidia-ml.so' solely to > their respective plugin .so files, not the main libraries/executables? > > This is the './configure' line i'm using: > > ./configure --enable-shared --enable-sta

Re: [hwloc-users] linking libcudart and libnvml only to the plugins

2015-06-03 Thread Brice Goglin
Le 04/06/2015 00:00, Fabricio Cannini a écrit : > Hi Brice, thanks for answering. > > Strangely, xml_libxml and pci works fine as plugins, but nvml and cuda > not. I had no trouble making the 'pci' and 'xml_libxml' plugins link > to their respective libraries, leaving 'libhwloc.so' alone, but no >

Re: [hwloc-users] linking libcudart and libnvml only to the plugins

2015-06-03 Thread Brice Goglin
Le 04/06/2015 00:53, Fabricio Cannini a écrit : > On 03-06-2015 19:45, Brice Goglin wrote: >> Le 04/06/2015 00:00, Fabricio Cannini a écrit : >>> Hi Brice, thanks for answering. >>> >>> Strangely, xml_libxml and pci works fine as plugins, but nvml and cuda >

Re: [hwloc-users] linking libcudart and libnvml only to the plugins

2015-06-03 Thread Brice Goglin
Le 04/06/2015 01:02, Fabricio Cannini a écrit : > LDFLAGS = -L/usr/local/cuda-6.5/lib64 -lcudart -L/usr/lib64/nvidia -lnvidia-ml Does this line come from your environment? hwloc isn't supposed to set LDFLAGS unless it comes from the environment. I guess that's where your problem comes from. Bric

Re: [hwloc-users] linking libcudart and libnvml only to the plugins

2015-06-03 Thread Brice Goglin
On 04/06/2015 01:17, Fabricio Cannini wrote: > On 03-06-2015 20:10, Brice Goglin wrote: >> On 04/06/2015 01:02, Fabricio Cannini wrote: >>> LDFLAGS = -L/usr/local/cuda-6.5/lib64 -lcudart -L/usr/lib64/nvidia >>> -lnvidia-ml >> >> Does this line come from

Re: [hwloc-users] Only one CUDA device showing up

2015-06-04 Thread Brice Goglin
CUDA releases before 4.0 didn't support this attribute, and the #ifdef no longer works on recent CUDA releases. I'll fix that, thanks. Interesting to know that NUMAScale machines use PCI domains. Brice On 04/06/2015 14:13, Imre Kerr wrote: > Hi, > Never mind, I figured it out. hwloc_cudart_ge

Re: [hwloc-users] Difficulty embedding hwloc 1.11.0

2015-07-07 Thread Brice Goglin
Hello I don't see any significant change in v1.11 regarding embedding, especially with respect to CONFIGURE_DEPENDENCIES. Does v1.10 work when running autogen with the same versions of automake/libtool/autoconf? I am using 1.14.1/2.4.2/2.69 here. If you enter hwloc-1.11.0/tests/embedded, does ".

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Brice Goglin
Hello The 3.13 kernel reports invalid L3 cache information in sysfs. 0x3f0 is not possible on this processor, it should be either 0x3f or 0xfc (there's exactly one L3 per NUMA node, with the same 6 cores in them). Can you check whether the BIOS is also the same on these machines? (see files in /s
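The overlap described above can be checked by hand on the hexadecimal masks. The following is a minimal Python sketch, not hwloc code: the function names are made up for illustration, and the NUMA node mask 0x3f is an assumption taken from the message's statement that each L3 covers the same 6 cores as its NUMA node. It flags a cache mask that intersects a node mask without either one containing the other.

```python
def cpus_in_mask(mask_hex):
    """Return the set of CPU indexes set in a hex cpumask such as '0x3f0'."""
    # sysfs masks may be split into comma-separated 32-bit groups.
    value = int(mask_hex.replace(",", ""), 16)
    return {bit for bit in range(value.bit_length()) if (value >> bit) & 1}


def conflicts(cache_mask, node_mask):
    """True when the two CPU sets overlap but neither includes the other."""
    cache = cpus_in_mask(cache_mask)
    node = cpus_in_mask(node_mask)
    overlap = cache & node
    return bool(overlap) and not (cache <= node or node <= cache)


if __name__ == "__main__":
    # 0x3f0 is the L3 mask reported by the 3.13 kernel; 0x3f is the assumed node mask.
    print(sorted(cpus_in_mask("0x3f0")))
    print(conflicts("0x3f0", "0x3f"))
    print(conflicts("0x3f", "0x3f"))
```

With these inputs, 0x3f0 covers CPUs 4-9 while the node covers CPUs 0-5, so the two sets share CPUs 4 and 5 without inclusion, which is the kind of inconsistency the message points at.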

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Brice Goglin
09/07/2015 16:26, Åke Sandgren wrote: > Yes the BIOS is the same. > > Anything else I should check? > > On 07/09/2015 04:10 PM, Brice Goglin wrote: >> Hello >> >> The 3.13 kernel reports invalid L3 cache information in sysfs. 0x3f0 is >> not possible on this

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Brice Goglin
, Åke Sandgren wrote: > Attached tar file with data from both systems. See Readme file for > kernel versions > > On 07/09/2015 07:54 PM, Brice Goglin wrote: >> Can you send the output of this command on both nodes? >> cat /sys/devices/system/cpu/cpu{?,??}/cache/index3/sha

Re: [hwloc-users] Finding hwloc'c HWLOC_OBJ_MISC objects

2015-07-14 Thread Brice Goglin
Hello In 1.11, they are attached to root. In theory they should be attached to NUMA nodes, so you would iterate under those. However their locality information isn't easy to find/trust (are we sure "DIMM A3" is in the first NUMA node?) so we just attach to root for now. It's not clear we'll fix that anyt

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-08-24 Thread Brice Goglin
Hello, hwloc 1.7 is very old, I am surprised CentOS 7 doesn't have anything more recent, maybe not in "standard" packages? Anyway, this is a very common error on AMD 6200 and 6300 machines. See http://www.open-mpi.org/projects/hwloc/doc/v1.11.0/a00030.php#faq_os_error Assuming your kernel isn't to

[hwloc-users] hwloc tutorial @ EuroMPI - Sept 21st

2015-08-26 Thread Brice Goglin
gether :) Brice Forwarded Message Subject: EuroMPI 2015 Call for Participation - Early deadline Sept 1st Date: Wed, 26 Aug 2015 10:41:39 +0200 From: Brice Goglin To: Open MPI Users EuroMPI 2015 Call for participation EuroMPI 2015 in-cooperation status with ACM and SIG

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
ir respective next releases. > > Ondrej > >> On Monday, August 24, 2015 15:32:12 Brice Goglin wrote: >> Hello, >> >> hwloc 1.7 is very old, I am surprised CentOS 7 doesn't have anything >> more recent, maybe not in "standard" packages? >> >

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
Hello This bug is about L3 cache locality only, everything else should be fine, including cache sizes. Few applications use that locality information, so I assume it doesn't matter for PETSc scaling. We can work around the bug by loading an XML topology. There's no easy way to build that correct XM

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
P#0 cpuset > 0x003f) without inclusion! > * Error occurred in topology.c line 981 > * > .. > > So if you can afford the time, I appreciate it very much! > > Fabian > > > > On 10/27/2015 09:52 AM, Brice Goglin wrote: >> Hello >> >> This bug is

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
he same poor and random speedups. > > I tried to check the xml file by myself via > xmllint --valid leo_brice.xml --loaddtd /usr/local/share/hwloc/hwloc.dtd > > However xmllint complains about hwloc.dtd itself > /usr/local/share/hwloc/hwloc.dtd:8: parser error : StartTag: invalid &

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
wrote: > On 10/27/2015 03:42 PM, Brice Goglin wrote: >> I guess the problem is that your OMPI uses an old hwloc internally. That >> one may be too old to understand recent XML exports. >> Try replacing "Package" with "Socket" everywhere in the XML file. >
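The suggested "Package" to "Socket" replacement is a plain text substitution over the exported file. Here is a minimal Python sketch of it; the function names, the file paths, and the sample XML fragment are illustrative assumptions and not taken from the thread.

```python
def package_to_socket(xml_text):
    """Replace every occurrence of "Package" with "Socket", as suggested."""
    return xml_text.replace("Package", "Socket")


def convert_file(src_path, dst_path):
    """Write a copy of the XML export with the replacement applied."""
    with open(src_path, encoding="utf-8") as src:
        converted = package_to_socket(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)


if __name__ == "__main__":
    # Illustrative fragment only; a real export contains many more attributes.
    sample = '<object type="Package" os_index="0"/>'
    print(package_to_socket(sample))
```

Writing to a separate output file keeps the original export untouched, so the newer hwloc can still load it if needed.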
