Re: [hwloc-users] object intersection without inclusion

2016-02-10 Thread Brice Goglin
Hello compute-0-12 reports totally buggy NUMA information: $ cat compute-0-12/sys/devices/system/node/node*/cpumap ,00ff ,ff00ff00 ,00ff , $ cat compute-0-0/sys/devices/system/node/node*/cpumap ,00ff ,ff00 ,00ff
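For readers comparing such cpumap files by hand, here is a minimal Python sketch that decodes one into CPU numbers. It assumes the usual Linux sysfs layout (comma-separated 32-bit hexadecimal words, most significant word first) and treats an empty word as zero, since the excerpt above appears abbreviated. The function names are illustrative and not part of hwloc.

```python
def parse_cpumap(text):
    """Turn a sysfs cpumap string such as '00000000,0000ff00' into an int mask."""
    value = 0
    for word in text.strip().split(","):
        # Each comma-separated word carries 32 bits; an empty word counts as 0.
        value = (value << 32) | int(word or "0", 16)
    return value


def cpus_in(mask):
    """List the CPU numbers whose bit is set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
```

Calling `cpus_in(parse_cpumap(...))` on each node's cpumap gives CPU lists that are easier to compare across nodes than raw hex strings.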

Re: [hwloc-users] lstopo hangs for centos 7

2016-02-03 Thread Brice Goglin
05:45, Jianjun Wen wrote: > Confirmed! > This patch fixes the problem. > > Thanks a lot! > Jianjun > > On Tue, Feb 2, 2016 at 9:05 AM, Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> wrote: > > Does this patch help? > >

Re: [hwloc-users] lstopo hangs for centos 7

2016-02-01 Thread Brice Goglin
Thanks for the debugging. I guess VMware doesn't properly emulate the CPUID instruction. Please do: 1) take a tarball from git master at https://ci.inria.fr/hwloc/job/master-0-tarball/ and build it 2) export HWLOC_COMPONENTS=-x86 in your terminal 3) do utils/hwloc/hwloc-gather-cpuid 4) tar cfj

Re: [hwloc-users] lstopo hangs for centos 7

2016-01-31 Thread Brice Goglin
Hello Thanks for the report. I have never seen this issue. I have CentOS 7 VMs (kvm), lstopo works fine. Did you try this in similar VMs in the past? When you say "latest hwloc", do you mean "build latest tarball" (1.11.2) or "installed latest centos package" (1.7)? First thing to check: run

Re: [hwloc-users] hwloc error after upgrading from Centos 6.5 to Centos 7 on Supermicro with AMD Opteron 6344

2016-01-07 Thread Brice Goglin
L2Cache L#2 (size=2048KB linesize=64 ways=16) > L1iCache L#2 (size=64KB linesize=64 ways=2) > L1dCache L#4 (size=16KB linesize=64 ways=4) > Core L#4 (P#4) > PU L#4 (P#4) > L1dCache L#5 (size=16KB linesiz

Re: [hwloc-users] error from the operating system - Solaris 11.3 - SOLVED

2016-01-07 Thread Brice Goglin
Thanks, I copied useful information from this thread and some links to https://github.com/open-mpi/hwloc/issues/143 However, not sure I'll have time to look at this in the near future :/ Brice On 07/01/2016 09:03, Matthias Reich wrote: > Hello, > > To check whether kstat is able to

Re: [hwloc-users] hwloc error after upgrading from Centos 6.5 to Centos 7 on Supermicro with AMD Opteron 6344

2016-01-07 Thread Brice Goglin
Hello This is a kernel bug for 12-core AMD Bulldozer/Piledriver (62xx/63xx) processors. hwloc is just complaining about buggy L3 information. lstopo should report one L3 above each set of 6 cores below each NUMA node. Instead you get strange L3s with 2, 4 or 6 cores. If you're not binding tasks

Re: [hwloc-users] error from the operating system - Solaris 11.3 - SOLVED

2016-01-05 Thread Brice Goglin
Hello So processor sets are not taken into account when Solaris reports topology information in kstat etc. Do you know if hwloc can query processor sets from the C interface? If so, we could apply the processor set mask to hwloc object cpusets during discovery to avoid your error. Brice Le

Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.11.2 released

2015-12-19 Thread Brice Goglin
Applied, thanks! On 19/12/2015 06:52, Marco Atzeri wrote: > On 19/12/2015 00:38, Brice Goglin wrote: >> >> >> On 18/12/2015 12:14, Marco Atzeri wrote: >>> attached minor patch to solve a false "make check" failure >>> on platform wh

Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.11.2 released

2015-12-18 Thread Brice Goglin
On 18/12/2015 12:14, Marco Atzeri wrote: > attached minor patch to solve a false "make check" failure > on platform where EXEEXT is not empty. > > Tested on CYGWIN platforms. > > Regards > Marco > --- origsrc/hwloc-1.11.2/utils/hwloc/test-hwloc-assembler.sh.in > 2015-06-14

Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.11.2 released

2015-12-18 Thread Brice Goglin
Hello Release announcements are sent to the hwloc-announce mailing list only. Yes, your AMD bug is covered. You should pass HWLOC_COMPONENTS=x86 in the environment to work around your Linux kernel bug. Regards Brice On 18/12/2015 12:26, Fabian Wein wrote: > Somehow I missed the announcement?! >>>

Re: [hwloc-users] Assembling multiple node XMLs

2015-10-30 Thread Brice Goglin
Hello Can you have a startup script set HWLOC_XMLFILE=/common/path/${hostname}.xml in the system-wide environment? Brice On 30/10/2015 13:57, Andrej Prsa wrote: > Hi Brice, > >> When you assemble multiple nodes' topologies into a single one, the >> resulting topology cannot be used for
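The per-host HWLOC_XMLFILE idea can be sketched in Python as follows. This is only an illustration under stated assumptions: `/common/path` is the placeholder directory from the message, the helper name `hwloc_env` is hypothetical, and the sketch builds an environment for a child process rather than changing the system-wide environment.

```python
import os
import socket


def hwloc_env(xml_dir="/common/path", hostname=None, base_env=None):
    """Return a copy of the environment with HWLOC_XMLFILE set for this host."""
    env = dict(os.environ if base_env is None else base_env)
    host = hostname or socket.gethostname()
    # One XML file per node, named after the node's hostname.
    env["HWLOC_XMLFILE"] = f"{xml_dir}/{host}.xml"
    return env
```

The returned dictionary could then be passed as `env=` to `subprocess.run` when launching a program that uses hwloc.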

Re: [hwloc-users] Assembling multiple node XMLs

2015-10-30 Thread Brice Goglin
will be removed in 2.0. Brice On 30/10/2015 02:13, Andrej Prsa wrote: > Hi all, > > I have a 6-node cluster with the buggy L3 H8QG6 AMD boards. Brice > Goglin recently provided a fix to Fabian Wein and I applied the same > fix (by diffing Fabian's original and Brice's fixed XML and then

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-29 Thread Brice Goglin
thread and the wrong list? Yeah, OpenMPI specific issues should go to OpenMPI list (hwloc is a subproject of the OpenMPI consortium, but the software projects are pretty much independent). Brice > I have a feeling that I'm quite close but just cannot reach it :( > > Thanks, > > Fa

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
wrote: > On 10/27/2015 03:42 PM, Brice Goglin wrote: >> I guess the problem is that your OMPI uses an old hwloc internally. That >> one may be too old to understand recent XML exports. >> Try replacing "Package" with "Socket" everywhere in the XML file. >
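The "Package" to "Socket" replacement suggested in the quoted text can be done with a few lines of Python. This is a sketch of a plain text substitution, exactly as the message describes; the helper names and the sample element are made up for illustration, and no claim is made about what else an older hwloc may reject.

```python
def package_to_socket(xml_text):
    """Replace every occurrence of "Package" with "Socket" in the XML text."""
    return xml_text.replace("Package", "Socket")


def convert_file(src_path, dst_path):
    """Write a converted copy of src_path to dst_path, leaving the original intact."""
    with open(src_path, encoding="utf-8") as src:
        converted = package_to_socket(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)


# Illustrative input only, not taken from a real export.
sample = '<object type="Package" os_index="0"/>'
```

Writing to a separate output file keeps the original export available for a newer hwloc.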

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
or and random speedups. > > I tried to check the xml file by myself via > xmllint --valid leo_brice.xml --loaddtd /usr/local/share/hwloc/hwloc.dtd > > However xmllint complains about hwloc.dtd itself > /usr/local/share/hwloc/hwloc.dtd:8: parser error : StartTag: invalid > elemen

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
puset > 0x003f) without inclusion! > * Error occurred in topology.c line 981 > * > .. > > So if you can afford the time, I appreciate it very much! > > Fabian > > > > On 10/27/2015 09:52 AM, Brice Goglin wrote: >> Hello >> >> This bug is about

Re: [hwloc-users] hwloc error for AMD Opteron 6300 processor family

2015-10-27 Thread Brice Goglin
ir respective next releases. > > Ondrej > >> On Monday, August 24, 2015 15:32:12 Brice Goglin wrote: >> Hello, >> >> hwloc 1.7 is very old, I am surprised CentOS 7 doesn't have anything >> more recent, maybe not in "standard" packages? >> >> An

[hwloc-users] hwloc tutorial @ EuroMPI - Sept 21st

2015-08-26 Thread Brice Goglin
Forwarded message Subject: EuroMPI 2015 Call for Participation - Early deadline Sept 1st Date: Wed, 26 Aug 2015 10:41:39 +0200 From: Brice Goglin <brice.gog...@inria.fr> To: Open MPI Users <us...@open-mpi.org> EuroMPI 2015 Call for participation

Re: [hwloc-users] Finding hwloc'c HWLOC_OBJ_MISC objects

2015-07-14 Thread Brice Goglin
Hello In 1.11, they are attached to root. In theory they should be attached to NUMA nodes, so you iterate under those. However their locality information isn't easy to find/trust (are we sure "DIMM A3" is in the first NUMA node?) so we just attach to root for now. It's not clear we'll fix that

Re: [hwloc-users] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Brice Goglin
09/07/2015 16:26, Åke Sandgren wrote: > Yes the BIOS is the same. > > Anything else I should check? > > On 07/09/2015 04:10 PM, Brice Goglin wrote: >> Hello >> >> The 3.13 kernel reports invalid L3 cache information in sysfs. 0x3f0 is >> not possible on t

Re: [hwloc-users] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Brice Goglin
Hello The 3.13 kernel reports invalid L3 cache information in sysfs. 0x3f0 is not possible on this processor, it should be either 0x3f or 0xfc (there's exactly one L3 per NUMA node, with the same 6 cores in them). Can you check whether the BIOS is also the same on these machines? (see files in

Re: [hwloc-users] Difficulty embedding hwloc 1.11.0

2015-07-07 Thread Brice Goglin
Hello I don't see any significant change in v1.11 regarding embedding, especially with respect to CONFIGURE_DEPENDENCIES. Does v1.10 work when running autogen with the same versions of automake/libtool/autoconf? I am using 1.14.1/2.4.2/2.69 here. If you enter hwloc-1.11.0/tests/embedded, does

Re: [hwloc-users] linking libcudart and libnvml only to the plugins

2015-06-03 Thread Brice Goglin
On 02/06/2015 23:27, Fabricio Cannini wrote: > Hello there > > Is there a way to link 'libcudart.so' and 'libnvidia-ml.so' solely to > their respective plugin .so files, not the main libraries/executables? > > This is the './configure' line I'm using: > > ./configure --enable-shared

Re: [hwloc-users] lstopo on Kaveri

2015-03-27 Thread Brice Goglin
Hello, That's an interesting question: Even if the GPU is physically-located inside the die, it is exposed as a "virtual" PCI device (vendor number 1002 and model number 130f), and that's how we detect it, and that's how the driver configures it. Many components of the CPU die are configured

Re: [hwloc-users] hwloc has encountered what looks like an error from the operating system

2015-02-23 Thread Brice Goglin
Hello, This is yet another example of buggy AMD topology information unfortunately. See http://www.open-mpi.org/projects/hwloc/doc/v1.10.1/a00028.php#faq_os_error In your case, NUMA and processor package/socket information are conflicting because NUMA information is buggy. Upgrading the BIOS may

Re: [hwloc-users] hwloc-gather-topology

2015-02-17 Thread Brice Goglin
Hello This is a widespread problem with AMD machines. Buggy platforms report invalid L3 cache information in this case. Upgrading the BIOS may help. If your program doesn't care about cache affinity, you can hide/ignore the message by setting HWLOC_HIDE_ERRORS=1 in the environment. More

Re: [hwloc-users] PCI devices topology

2015-01-09 Thread Brice Goglin
t's enough for "comparing distances". Brice On 09/01/2015 10:30, Pradeep Kiruvale wrote: > Hi Brice, > > Thanks for the reply. Is it possible to get the distance matrix for > each cpu and the pci device from these hwloc apis? > > Regards, > Pradeep >

Re: [hwloc-users] PCI devices topology

2015-01-08 Thread Brice Goglin
Hello, hwloc_topology_init(&topology); hwloc_topology_set_flags(topology, HWLOC_TOPOLOGY_FLAG_IO_DEVICES); hwloc_topology_load(topology); Then you can use hwloc_get_next_pcidev() to iterate over the entire list of PCI devices. If you want to know whether it's connected to a specific NUMA node, start from

Re: [hwloc-users] Hwloc on windows does not show pci devices

2015-01-06 Thread Brice Goglin
Hello We don't have PCI support on Windows unfortunately. And on non-Linux platforms, you would have PCI devices without their locality, not really useful. The hwloc I/O doc says: "Note that I/O discovery requires significant help from the operating system. The pciaccess library (the development

[hwloc-users] wrong os_index on AIX -> please test

2014-12-17 Thread Brice Goglin
Hello I am seeing assert failures on AIX 6.1 because our PU os_index is off by one. They go from -1 to 62 instead of 0 to 63. We have a comment saying /* It seems logical processors are numbered from 1 here, while the * bindprocessor functions numbers them from 0... */ This

Re: [hwloc-users] Selecting real cores vs HT cores

2014-12-11 Thread Brice Goglin
On 11/12/2014 21:51, Brock Palen wrote: > When a system has HT enabled is one core presented the real one and one the > fake partner? Or is that not the case? > > If wanting to test behavior without messing with the bios how do I select > just the 'real cores' if this is the case? > > I

Re: [hwloc-users] hwloc - "symbol already defined" error building with optimizations (-O3) on 32bit ubuntu

2014-11-21 Thread Brice Goglin
of all of the steps and the logs. Let me know if > you need something else. > > Thanks! > > Thomas Van Doren > thomas.vando...@gmail.com <mailto:thomas.vando...@gmail.com> > > > On Thu, Nov 20, 2014 at 10:01 PM, Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...

Re: [hwloc-users] hwloc - "symbol already defined" error building with optimizations (-O3) on 32bit ubuntu

2014-11-20 Thread Brice Goglin
Hello, Thanks, I can reproduce the problem on Debian with -O3 -m32. The issue is that -O3 makes gcc inline more. We have function A call B multiple times, and B calls C which contains asm with a label. So in the end A contains the asm label from C multiple times. Google says we should use local

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-09 Thread Brice Goglin
On 09/10/2014 00:49, Jiri Hladky wrote: > Hi Brice, > > this sounds perfectly reasonable to me. I will make the arrangements > on packing side. > > Perhaps you could add this in README file? > The README file is autogenerated from the huge doxygen text, which is really for users, not for

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-09 Thread Brice Goglin
On 09/10/2014 00:55, Jiri Hladky wrote: > > * if building without cairo/X11 support, lstopo and lstopo.1 are > symlinks. Packagers can choose to ignore lstopo and lstopo.1. > lstopo.desktop isn't installed. > > > Could you please make (in the next version) > lstopo-no-graphics.1 >

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-08 Thread Brice Goglin
On 08/10/2014 01:52, Jiri Hladky wrote: > Hi Brice, > > glad to see the new version is out! :-) > > I have bumped into a couple of minor problems when building new RPM for > Fedora: > > 1) desktop file > desktop-file-validate hwloc-ls.desktop.back > hwloc-ls.desktop.back: error: file contains

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.10.0 release

2014-10-08 Thread Brice Goglin
On 08/10/2014 01:52, Jiri Hladky wrote: > 2) I have also some trouble with symlinks. The trouble is this: > > * when installed with ./configure && make && make install > then hwloc-ls is symlink to lstopo-no-graphics and man pages > { lstopo-no-graphics.1, hwloc-ls.1 } are symlinks to

Re: [hwloc-users] hwloc-ls graphical output

2014-10-01 Thread Brice Goglin
Dennis, Did you have an opinion about this? I am going to release the final hwloc v1.10 soon. So if there's something to fix, I'd rather know it quickly. thanks Brice On 25/09/2014 07:47, Brice Goglin wrote: > On 25/09/2014 02:22, Dennis Jacobfeuerborn wrote: >> So I just recompi

Re: [hwloc-users] Processor numbering in Ivy-bridge

2014-09-29 Thread Brice Goglin
Yes. Most of the locality info comes from /sys/... on Linux. Brice On 29/09/2014 22:59, Vishwanath Venkatesan wrote: > Thanks for the quick response, yes lstopo -l does make the numbers > contiguous. > Another question I had was, how does hwloc populate the information > that certain cpus share a

Re: [hwloc-users] binding to thread

2014-09-29 Thread Brice Goglin
On 29/09/2014 19:01, Aulwes, Rob wrote: > Hi, > > I'm trying to allocate and bind memory on the same NUMA domain as the > calling thread. The code I use is as follows. > > /* retrieve the single PU where the current thread actually > runs within this process binding */ > > >

Re: [hwloc-users] hwloc-ls graphical output

2014-09-25 Thread Brice Goglin
On 25/09/2014 02:22, Dennis Jacobfeuerborn wrote: > So I just recompiled again but using version 1.4.3 and the graphical > output options reappeared. I also tried version 1.5.2 and this version > will not show the graphical output options anymore so it seems something > has changed between 1.4

Re: [hwloc-users] problem with X11 using Solaris

2014-09-18 Thread Brice Goglin
Thanks, I just pushed a fix. Can you verify that this tarball enables X automatically and properly? https://ci.inria.fr/hwloc/job/master-0-tarball/lastSuccessfulBuild/artifact/hwloc-master-20140918.1131.git005a7e8.tar.gz I am looking at the warnings and make check failures you sent. Brice Le

Re: [hwloc-users] problem with X11 using Solaris

2014-09-17 Thread Brice Goglin
Can you send the output of configure, the generated config.log and your unmodified Xutil.h? My solaris/openindiana doesn't have that problem. thanks Brice On 16/09/2014 14:43, Siegmar Gross wrote: > Hi, > > today I installed hwloc-1.9.1 on my machines (Solaris 10 Sparc (tyr), > Solaris 10

Re: [hwloc-users] setting memory bindings

2014-09-15 Thread Brice Goglin
anode_obj_by_os_index? > > Thanks,Rob > > > *From:* hwloc-users [hwloc-users-boun...@open-mpi.org] on behalf of > Brice Goglin [brice.gog...@inria.fr] > *Sent:* Thursday, September 04, 2014 6:25 AM > *To:* hwloc-us...@open-mpi.org > *Subject:* Re: [hwloc-users] setting memo

Re: [hwloc-users] hwloc error with "node interleaving" disabled

2014-09-05 Thread Brice Goglin
Don't be sorry, I used "yet another" to complain about all these buggy AMD platforms, and not to complain about their owners ;) Bug reports are always welcome, that's why the big warning says you should report it. Also these warnings vary a little bit with the platform and processor model so

Re: [hwloc-users] hwloc error with "node interleaving" disabled

2014-09-05 Thread Brice Goglin
Hello You sent the test.output file instead of test.tar.bz2 so I can't check for sure. Anyway I guess this is yet another buggy AMD platform with Magny-Cours/Interlagos/Abu Dhabi Opterons (61xx, 62xx or 63xx). Sometimes upgrading the BIOS/kernel helps. Sometimes not. Some L3 caches will be

Re: [hwloc-users] setting memory bindings

2014-09-04 Thread Brice Goglin
I added a new doc/examples/ directory to better show how to use bitmaps, cpu and memory binding etc. https://github.com/open-mpi/hwloc/tree/master/doc/examples If you see anything missing, don't hesitate to ask. Brice On 19/08/2014 19:10, Aulwes, Rob wrote: > ok, in the meantime, is

Re: [hwloc-users] setting memory bindings

2014-09-02 Thread Brice Goglin
the STRICT flag. And I'll see if I add a good example somewhere. Brice On 19/08/2014 19:00, Aulwes, Rob wrote: > nope, no error. is there a way to find out what policies are > supported? I would like to try 'replicate'. > > From: Brice Goglin <brice.gog...@inria.fr <mail

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin
hanks for the help! Rob > > From: Brice Goglin <brice.gog...@inria.fr <mailto:brice.gog...@inria.fr>> > Reply-To: Hardware locality user list <hwloc-us...@open-mpi.org > <mailto:hwloc-us...@open-mpi.org>> > Date: Tue, 19 Aug 2014 19:03:56 +0200 > To: Hardwa

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin
On 19/08/2014 18:38, Aulwes, Rob wrote: > Hi, > > I'm trying to write a custom C++ allocator that wraps hwloc calls. > I've tried using various hwloc_alloc* functions to set the memory > bindings, but when I call hwloc_get_area_membind_nodeset to verify, I > don't get the same policy I passed

Re: [hwloc-users] Re: hwloc error

2014-08-17 Thread Brice Goglin
On 16/08/2014 18:37, Andrej Prsa wrote: > Hi Brice, > >> Your kernel looks recent enough, can you try upgrading your BIOS? You >> have version 3.0b and there's a 3.5 version at >> http://www.supermicro.com/aplus/motherboard/opteron6000/sr56x0/h8qg6-f.cfm > For completeness, I just tried

Re: [hwloc-users] hwloc error

2014-08-15 Thread Brice Goglin
Hello, Your platform reports buggy L3 cache locality information. This is very common on AMD 62xx and 63xx platforms unfortunately. You have 8 L3 caches (one per 6-core NUMA node, two per socket), but the platform reports 11 L3 caches instead: Sockets 1, 2 and 4 report one L3 above 2 cores, one

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-11 Thread Brice Goglin
Andersen wrote: > Dear Brice > > > 2014-07-09 21:34 GMT+00:00 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>>: > > On 09/07/2014 23:30, Nick Papior Andersen wrote: >> Dear Brice >> >> Here are my findings (apolo

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-09 Thread Brice Goglin
On 09/07/2014 23:30, Nick Papior Andersen wrote: > Dear Brice > > Here are my findings (apologies for not doing make check on before-hand!) > > 2014-07-09 20:42 GMT+00:00 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>>: > > Hel

Re: [hwloc-users] hwloc 1.9 and openmpi using intel compiler

2014-07-09 Thread Brice Goglin
Hello, A quick look in the Open MPI source code seems to say that it's manipulating XML topologies in these lines. Please go into your hwloc-1.9 build directory, and run "tests/xmlbuffer" (you may have to build it first by running "make xmlbuffer -C tests"). If it works, try running "make check".

Re: [hwloc-users] misleading cache size on AMD Opteron 6348?

2014-06-11 Thread Brice Goglin
engineering samples of 6348 (all characteristics are same). > > > On Tue, Apr 1, 2014 at 6:59 PM, Yury Vorobyov <teupol...@gmail.com > <mailto:teupol...@gmail.com>> wrote: > > The BIOS has latest version. If I should check some BIOS >

Re: [hwloc-users] divide by zero error?

2014-06-08 Thread Brice Goglin
> > Thanks, > > Andrew > >> -Original Message- >> From: Brice Goglin [mailto:brice.gog...@inria.fr] >> Sent: Monday, May 5, 2014 1:03 PM >> To: Friedley, Andrew >> Subject: Re: [hwloc-users] divide by zero error? >> >> Thanks. >

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
On 28/05/2014 14:57, Craig Kapfer wrote: > > > Hmm ... the slurm config defines that all nodes have 4 sockets with 16 > cores per socket (which corresponds to the hardware--all nodes are the > same). Slurm node config is as follows: > > NodeName=n[001-008] RealMemory=258452 Sockets=4

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
Thanks much, > > Craig > > > On Wednesday, May 28, 2014 1:39 PM, Brice Goglin > <brice.gog...@inria.fr> wrote: > > > Aside of the BIOS config, are you sure that you have the exact same > BIOS *version* in each node? (can check in /sys/class/dmi/id/bio

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
Aside of the BIOS config, are you sure that you have the exact same BIOS *version* in each node? (can check in /sys/class/dmi/id/bios_*) Same Linux kernel too? Also, recently we've seen somebody fix such problems by unplugging and replugging some CPUs on the motherboard. Seems crazy but it
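A small Python sketch can collect the `/sys/class/dmi/id/bios_*` files mentioned above so that the output of several nodes can be compared. The function names are illustrative, and the sketch assumes the files are readable by the calling user; unreadable files are simply skipped.

```python
import glob
import os


def read_bios_info(dmi_dir="/sys/class/dmi/id"):
    """Return {file name: contents} for every bios_* file in dmi_dir."""
    info = {}
    for path in sorted(glob.glob(os.path.join(dmi_dir, "bios_*"))):
        try:
            with open(path) as handle:
                info[os.path.basename(path)] = handle.read().strip()
        except OSError:
            # Skip entries that cannot be read with the current permissions.
            continue
    return info


def differing_keys(first, second):
    """Return the sorted keys whose values differ between two nodes' results."""
    return sorted(k for k in set(first) | set(second) if first.get(k) != second.get(k))
```

Running `read_bios_info()` on each node and feeding two results to `differing_keys` shows which BIOS fields, if any, do not match.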

Re: [hwloc-users] divide by zero error?

2014-04-29 Thread Brice Goglin
Please run "hwloc-gather-topology simics" and send the resulting simics.tar.bz2 that it will create. However, I assume that the simulator returns buggy x86 cpuid information, so we'll see if we want/can easily workaround the bug or just let simics developers fix it. Brice Le 29/04/2014 01:15,

Re: [hwloc-users] problem with open mpi

2014-04-16 Thread Brice Goglin
Hello, This list is for hwloc users (hwloc is an Open MPI subproject). You likely want Open MPI users instead: us...@open-mpi.org Brice On 16/04/2014 18:44, flavienne sayou wrote: > Hello, > I am Flavienne and I am a master student. > I wrote a script which have to backup sequentials

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.9 released

2014-04-01 Thread Brice Goglin
lower compared to lstopo-no-graphics > B) Compile it without libXNVCtrl but it will reduce the functionality. > > Is there any 3rd option? I guess not. It seems like A) is the best > choice for Fedora. > > Any ideas on that? > > Thanks! > Jirka > > > > > On Tue

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.9 released

2014-04-01 Thread Brice Goglin
On 01/04/2014 10:43, Jiri Hladky wrote: > Hi Brice, > > I see some compiler warnings when building rpm package for Fedora: > > topology-windows.c: In function 'hwloc_win_get_VirtualAllocExNumaProc': > topology-windows.c:338:30: warning: assignment from incompatible > pointer type [enabled by

Re: [hwloc-users] distributing across cores with hwloc-distrib

2014-03-30 Thread Brice Goglin
> On Sun, Mar 30, 2014 at 05:32:38PM +0200, Brice Goglin wrote: >> Don't worry, binding multithreaded processes is not a corner case. I was >> rather talking about the general "distributing less processes than there >> are object and returning cpusets as large as possible"

Re: [hwloc-users] distributing across cores with hwloc-distrib

2014-03-30 Thread Brice Goglin
sidered a corner case. Could you > please consider fixing this? > > Thanks, > Tim > > Brice Goglin wrote: >> Hello, >> >> This is the main corner case of hwloc-distrib. It can return objects >> only, not groups of objects. The distrib algorithm is: >>

Re: [hwloc-users] distributing across cores with hwloc-distrib

2014-03-30 Thread Brice Goglin
Hello, This is the main corner case of hwloc-distrib. It can return objects only, not groups of objects. The distrib algorithm is: 1) start at the root, where there are M children, and you have to distribute N processes 2) if there are no children, or if N is 1, return the entire object 3) split
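The recursive scheme outlined above can be sketched in Python. Steps 1 and 2 follow the message; step 3 is cut off in this excerpt, so the split rule below (share the N processes among the children in proportion to how many PUs each child contains) is an assumption made for illustration and may differ from what hwloc-distrib actually does. The topology is modelled as nested lists with integer PU numbers as leaves; `leaves` and `distrib` are hypothetical helper names.

```python
def leaves(obj):
    """Return the PU numbers below obj (an int leaf or a list of children)."""
    if isinstance(obj, int):
        return [obj]
    return [pu for child in obj for pu in leaves(child)]


def distrib(obj, n):
    """Return one PU list per process for n >= 1 processes placed below obj."""
    # Step 2: no children, or a single process, gets the entire object.
    if isinstance(obj, int) or n == 1:
        return [leaves(obj)]
    # Step 3 (assumed): split n among children proportionally to their PU count.
    total = len(leaves(obj))
    result = []
    seen = 0   # PUs covered by the children visited so far
    given = 0  # processes handed out so far
    for child in obj:
        seen += len(leaves(child))
        upto = seen * n // total
        share = upto - given
        given = upto
        if share > 0:
            result.extend(distrib(child, share))
    return result
```

With this sketch, `distrib([0, 1, 2, 3], 2)` returns two single PUs rather than two pairs, which illustrates the corner case described in the message: each process ends up on one object, never on a group of sibling objects.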

Re: [hwloc-users] BGQ question.

2014-03-26 Thread Brice Goglin
On 26/03/2014 01:00, Christopher Samuel wrote: > On 26/03/14 01:34, Biddiscombe, John A. wrote: > > > If I compile on the login node, but run lstopo on the ION, I get > > this (wrong, below) > > If you build this with GCC (the standard system one, not the > cross-compiler for BGQ) does it still

Re: [hwloc-users] BGQ question.

2014-03-25 Thread Brice Goglin
ere we > are trying to customise the IO. > > > > JB > > > > *From:*Brice Goglin [mailto:brice.gog...@inria.fr] > *Sent:* 25 March 2014 08:43 > *To:* Hardware locality user list; Biddiscombe, John A. > *Subject:* Re: [hwloc-users] BGQ question. >

Re: [hwloc-users] BGQ question.

2014-03-25 Thread Brice Goglin
./configure >--prefix=/gpfs/bbp.cscs.ch/home/biddisco/apps/clang/hwloc-1.8.1 > >should I rerun with something set? > >Thanks > >JB > > >From: hwloc-users [mailto:hwloc-users-boun...@open-mpi.org] On Behalf >Of Brice Goglin >Sent: 25 March 2014 08:04 >To:

Re: [hwloc-users] BGQ question.

2014-03-25 Thread Brice Goglin
On 25/03/2014 07:51, Biddiscombe, John A. wrote: > > I'm compiling hwloc using clang (bgclang++11 from ANL) to run on IO > nodes of a BGQ. It seems to have compiled ok, and when I run lstopo, I > get an output like this (below), which looks reasonable, but there are > 15 sockets instead of 16.

Re: [hwloc-users] [hwloc-announce] Hardware Locality (hwloc) v1.8.1 released

2014-02-13 Thread Brice Goglin
On 13/02/2014 22:25, Jiri Hladky wrote: > Hi Brice, > > when compiling hwloc-1.8.1 I have seen these warnings. Could you > please check them? fread() warnings come from fread() on kernel sysfs files, so it's very unlikely that we read totally buggy data from there. One day we'll fix this,

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-07 Thread Brice Goglin
"nvml2" > GPU L#5 "nvml3" > GPU L#7 "nvml0" > GPU L#9 "nvml1" > > Is the L# always going to be in the order I would expect? Because then I > already have my map then. Brice > > Brock Palen >

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-06 Thread Brice Goglin
lowing the PCI bus order? We may want to talk to NVIDIA to get a clarification about all this. Brice > > Brock Palen > www.umich.edu/~brockp > CAEN Advanced Computing > XSEDE Campus Champion > bro...@umich.edu > (734)936-1985 > > > > On Feb 5, 2014, at 1:19 AM, Brice

Re: [hwloc-users] Using hwloc to map GPU layout on system

2014-02-05 Thread Brice Goglin
Hello Brock, Some people reported the same issue in the past and that's why we added the "nvml" objects. CUDA reorders devices by "performance". Batch-schedulers are somehow supposed to use "nvml" for managing GPUs without actually using them with CUDA directly. And the "nvml" order is the

Re: [hwloc-users] misleading cache size on AMD Opteron 6348?

2014-01-31 Thread Brice Goglin
Hello, Your BIOS reports invalid L3 cache information. On these processors, the L3 is shared by 6 cores, it covers 6 cores of an entire half-socket NUMA node. But the BIOS says that some L3 are shared between 4 cores, others by 6 cores. And worse it says that some L3 is shared by some cores from

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-29 Thread Brice Goglin
en-mpi.org/community/lists/hwloc-devel/2014/01/4043.php On 29/01/2014 06:50, Robin Scher wrote: > Hi Brice > > This works great now. Thank you for your help! > -robin > > Robin Scher > ro...@uberware.net > +1 (213) 448-0443 > > > > On Jan 28, 2014, at 7:47

Re: [hwloc-users] Finding closest host bridge

2014-01-28 Thread Brice Goglin
The bridge cannot be "not connected to anything". All objects have a parent (and are a child of that parent) except the very-top root object. Theoretically, the bridge could be connected anywhere. In practice it's connected to a NUMA node, a root object, or (rarely) a group of numa nodes. The

Re: [hwloc-users] CPU info on ARM

2014-01-28 Thread Brice Goglin
; models executing in the same SMP system)." >> >> He passed the question on to another ARM guy, asking for further detail. >> I'll pass on what he says. >> >> >> >> On Jan 28, 2014, at 3:39 AM, Brice Goglin <brice.gog...@inria.fr>

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
On 28/01/2014 14:31, Brice Goglin wrote: > On 28/01/2014 13:00, Samuel Thibault wrote: >> Brice Goglin wrote on Tue 28 Jan 2014 12:46:24 +0100: >>> 42: xchg %ebx,%rbx >>> >>> I guess having both ebx and rbx on these lines isn't OK. On Linux, I ge

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
On 28/01/2014 13:00, Samuel Thibault wrote: > Brice Goglin wrote on Tue 28 Jan 2014 12:46:24 +0100: >> 42: xchg %ebx,%rbx >> >> I guess having both ebx and rbx on these lines isn't OK. On Linux, I get >> rsi instead of ebx, no problem. >> >> Samuel

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
On 28/01/2014 09:57, Brice Goglin wrote: > I will debug a bit more to see if it's actually a 64bit cpuid problem > on windows. The x86 backend is entirely disabled in the 64bit windows build because configure fails to compile the cpuid assembly (in my mingw64 with gcc 4.7). It says

Re: [hwloc-users] Having trouble getting CPU Model string on Windows 7 x64

2014-01-28 Thread Brice Goglin
On 28/01/2014 09:46, Robin Scher wrote: > Hi, thanks for responding. > > The CPUModel is definitely available on this machine. A 32 bit process > on the same machine correctly finds the model name using code that > calls the cpuid inline assembly to get it, and the machine itself is a > VM

[hwloc-users] CPU info on ARM

2014-01-28 Thread Brice Goglin
Hello, Is anybody familiar with ARM CPUs? I am adding more CPU information because Intel needs more: CPUVendor=GenuineIntel CPUModel=Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz CPUModelNumber=45 CPUFamilyNumber=6 Would something similar be useful for ARM? What are the fields below from

Re: [hwloc-users] How to build hwloc static to link into a shared lib on Linux

2014-01-18 Thread Brice Goglin
Maybe try to disable some dependencies such as pci in hwloc (--disable-pci), I wouldn't be surprised if there were issues there. If that helps, please let us know what was enabled before (libpciaccess (default), or libpci/pciutils (--enable-libpci)). Brice Le 18/01/2014 07:23, Robin Scher a

Re: [hwloc-users] hwloc errors on program startup

2014-01-17 Thread Brice Goglin
Hello, Linux says socket 0 contains processors 0-7 and socket 1 contains 8-15, while NUMA node 0 contains processors 0-3+8-11 and NUMA node 1 contains processors 4-7+12-15. Given what I read about Opteron 6320 online, the problem is that NUMA 0 should be replaced with two NUMA nodes with
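
The conflict described above (sets of processors that overlap while neither contains the other) can be checked by hand. A minimal sketch in Python using plain sets; this only illustrates the idea and is not hwloc's own code, the helper names are hypothetical, and the lists are written with commas (as in Linux cpulist files) where the message uses "+":

```python
def parse_cpulist(text):
    """Parse a list such as '0-3,8-11' into a set of processor indexes."""
    cpus = set()
    for part in text.split(","):
        first, _, last = part.partition("-")
        # A single index such as '5' has no '-', so 'last' is empty.
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def intersects_without_inclusion(a, b):
    """True if a and b overlap but neither set contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


# Values reported in the message above.
sockets = {0: parse_cpulist("0-7"), 1: parse_cpulist("8-15")}
numa_nodes = {0: parse_cpulist("0-3,8-11"), 1: parse_cpulist("4-7,12-15")}

# Every (socket, NUMA node) pair that cannot be nested in a tree.
conflicts = [
    (s, n)
    for s, scpus in sockets.items()
    for n, ncpus in numa_nodes.items()
    if intersects_without_inclusion(scpus, ncpus)
]
```

With these values every socket conflicts with every NUMA node, which is why no consistent hierarchy can be built from what Linux reports.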

Re: [hwloc-users] hwloc problem on SGI machine

2014-01-11 Thread Brice Goglin
On 11/01/2014 01:58, Chris Samuel wrote: > On Sat, 11 Jan 2014 11:54:17 AM Chris Samuel wrote: > >> We've got both an older Altix XE cluster and a UV10 (both running RHEL) I >> can test on if it's useful? > Forgot I already had both 1.7.2 and 1.8 built for both - all fine (RHEL6.4). > This

Re: [hwloc-users] hwloc problem on SGI machine

2014-01-10 Thread Brice Goglin
On 11/01/2014 00:27, Jeff Squyres (jsquyres) wrote: > Jeff Becker (CC'ed) reported to me a failure with hwloc 1.7.2 (in OMPI > trunk). I had him verify this with a standalone hwloc 1.7.2, and then had > him try standalone hwloc 1.8 as well -- all got the same failure. > > Here's what he's

Re: [hwloc-users] [windows] build from source using visual studio

2014-01-08 Thread Brice Goglin
dress(0x07FF7E1B > [c:\windows\system32\PSAPI.DLL], "QueryWorkingSetEx") called from " > XXX\bin\LIBHWLOC-5.DLL" at address 0x69E9419E and returned > 0x07FF7E1B2E60 by thread 1. > > 00:00:00.625: First chance exception 0xC094 (Integer Div

Re: [hwloc-users] windows PCI locality (was; DELL 8 core machine + Quadro K5000 GPU Card...)

2013-11-19 Thread Brice Goglin
data[0] = 0; > > DEVPROPTYPE type; > > DEVPROPKEY key = DEVPKEY_Numa_Proximity_Domain; > > > > lastError = 0; > > > > ret = SetupDiGetDeviceProperty(hNvDevInfo, > , , , (PBYTE)[0], 20*sizeof(int), NULL

Re: [hwloc-users] Regarding the Dell 8 core machine with GPUs

2013-11-18 Thread Brice Goglin
't contain the kernel version ("uname -a" would be more useful) but I don't need this information anymore anyway. Looks like I am ready to release the final hwloc v1.8 now :) Brice On 18/11/2013 04:17, Solibakke Per Bjarte wrote: > Dear Brice Goglin > > Sorry, there mus

Re: [hwloc-users] DELL 8 core machine + Quadro K5000 GPU Card...

2013-11-18 Thread Brice Goglin
On 18/11/2013 02:14, Solibakke Per Bjarte wrote: > Hello > > I recently got access to a very interesting and powerful machine: Dell > 8 core + GPU Quadro K5000 (96 cores). > A total of 1536 cores in the original machine configuration. Hello, GPU cores are not real cores so I am not sure your

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.8rc1 released

2013-11-09 Thread Brice Goglin
index1 object:index2 is easy to write, I'd vote for not making the code too complex. Brice > > Thanks a lot! > Jirka > > > On Wed, Nov 6, 2013 at 3:06 PM, Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> wrote: > > The Hardware Locality

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE]Re: [OMPI users] SIGSEGV in opal_hwlock152_hwlock_bitmap_or.A // Bug in 'hwlock" ?

2013-11-04 Thread Brice Goglin
losed". Brice On 04/11/2013 22:33, Paul Kapinos wrote: > Hello again, > I'm not allowed to publish to Hardware locality user list so I omit it > now. > > On 11/04/13 14:19, Brice Goglin wrote: >> On 04/11/2013 11:44, Paul Kapinos wrote: >>> Hello all, >>

Re: [hwloc-users] CPU binding

2013-10-03 Thread Brice Goglin
On 03/10/2013 02:56, Panos Labropoulos wrote: > Hello, > > > I initially posted this at us...@open-mpi.org . > > We seem to be unable to set the cpu binding on a cluster consisting > of Dell M420/M610 systems: > > [jallan@hpc21 ~]$ cat report-bindings.sh #!/bin/sh

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.7.2rc1 released

2013-08-29 Thread Brice Goglin
put in 1.7.2 ? (see also my other email I sent to > you 2 minutes ago). > > Jirka > > > > > On Thu, Aug 29, 2013 at 11:32 AM, Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> wrote: > > The Hardware Locality (hwloc) te

Re: [hwloc-users] Open-mpi + hwloc ...

2013-06-21 Thread Brice Goglin
Hello, hwloc can only tell where CPUs/devices are, and place programs on the right CPUs. hwloc isn't going to convert your parallel program into a GPU program. If you want to use NVIDIA GPUs, you have to rewrite your program using CUDA, OpenCL, or a high-level heterogeneous language. Brice Le

Re: [hwloc-users] hwloc on Xeon Phi

2013-06-18 Thread Brice Goglin
On 18/06/2013 08:52, pinak panigrahi wrote: > Hi, how do I use hwloc on Intel Xeon Phi. I have written codes that > use it for Sandybridge. Hello, If you really mean "inside the Xeon Phi", it should just work and report all available Phi cores. If you mean managing the Phi internal topology

Re: [hwloc-users] Windows binaries miss lib file

2013-05-20 Thread Brice Goglin
now. All earlier releases (except v0.9) were already OK. Final v1.7.1 expected today or Wednesday. Brice On 20/05/2013 18:45, Brice Goglin wrote: > Thanks, there was indeed an issue on the machine that builds the Windows > zipballs. I am fixing this. Should be fixed in 1.7.1. If a
