Re: [hwloc-users] howloc with scalemp

2010-04-07 Thread Brice Goglin
Brock Palen wrote: > has anyone done work with hwloc on scalemp systems? They provide > their own tool numabind, but we are looking for a more generic > solution to process placement and control that works well inside our > MPI library (openMPI in most cases). > > Any input on this would be

Re: [hwloc-users] howloc with scalemp

2010-04-07 Thread Brice Goglin
Brock Palen wrote: > [brockp@nyx0809 INTEL]$ lstopo - > System(79GB) > Misc0 > Node#0(10GB) + Socket#1 + L3(8192KB) > L2(256KB) + L1(32KB) + Core#0 + P#0 > L2(256KB) + L1(32KB) + Core#1 + P#1 > L2(256KB) + L1(32KB) + Core#2 + P#2 > L2(256KB) + L1(32KB) + Core#3 + P#3

Re: [hwloc-users] Creating a D wrapper around hwloc

2010-04-16 Thread Brice Goglin
Jim Burnes wrote: > I can make these available to D in several different ways, but I need > to know the true intent of marking them as "static __inline". > > 1. Are they marked that way simply to increase performance? > No. > 2. Are they marked that way to avoid some sort of thread safety

Re: [hwloc-users] hwloc RPM spec file

2010-04-26 Thread Brice Goglin
On 23/04/2010 18:09, Jirka Hladky wrote: > Hello, > > I have written hwloc RPM spec file. It's attached. > > Thanks > Jirka > > Thanks Jirka, but don't you need some BuildRequires such as the following? libX11-devel libxml2-devel cairo-devel ncurses-devel Tony (CCed) also worked on RPMs for

Re: [hwloc-users] hwloc on systems with more than 64 cpus?

2010-05-16 Thread Brice Goglin
No, there is no such limit. If you have 128 cores, the cpuset string will be 0xffffffff,0xffffffff,0xffffffff,0xffffffff. As long as you have fewer than 1024 cores, everything should work fine. For more than 1024, you'll need to rebuild with a manual change in the source code, or wait for hwloc 1.1.
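As a rough illustration of the string layout described above (comma-separated 32-bit hexadecimal words, most significant word first), here is a minimal C sketch that builds the mask for PUs 0 to n-1. The function name full_cpuset_string is made up for this example and is not part of hwloc; real code would normally let hwloc print its own bitmaps (for instance with hwloc_bitmap_snprintf) rather than format them by hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an hwloc function): write the mask covering
 * PUs 0..npus-1 as comma-separated 32-bit hex words, most significant
 * word first. Returns 0 on success, -1 if npus is 0 or buf is too small. */
int full_cpuset_string(unsigned npus, char *buf, size_t len)
{
    assert(buf != NULL);
    if (npus == 0 || len == 0)
        return -1;

    unsigned nwords = (npus + 31) / 32;            /* number of 32-bit words */
    unsigned topbits = npus - 32 * (nwords - 1);   /* bits set in the top word, 1..32 */
    size_t used = 0;

    for (unsigned w = 0; w < nwords; w++) {
        unsigned long word = 0xffffffffUL;
        if (w == 0 && topbits < 32)
            word = (1UL << topbits) - 1;           /* partially filled top word */

        int n = snprintf(buf + used, len - used, "%s0x%08lx",
                         w ? "," : "", word);
        if (n < 0 || (size_t)n >= len - used)
            return -1;                             /* output would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

With a 64-byte buffer, a call with npus = 128 produces 0xffffffff,0xffffffff,0xffffffff,0xffffffff, and npus = 40 produces 0x000000ff,0xffffffff. The sketch only covers a contiguous set starting at PU 0; arbitrary sets are better handled by the library's bitmap routines.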

Re: [hwloc-users] Getting a graphics view for anon graphic system...

2010-06-09 Thread Brice Goglin
Le 09/06/2010 21:41, Jeff Squyres a écrit : > On Jun 6, 2010, at 4:03 PM, Olivier Cessenat wrote: > > >> What you write is clear to computer scientists, but I failed to figure >> out what it meant. Sorry, it is clear now ! >> > FWIW, there's a section about "output formats" in the

Re: [hwloc-users] hwloc on cray

2010-06-23 Thread Brice Goglin
Hello Norman, I don't think anybody ever tried. But we have an entry in the TODO list saying "port to cray catamount" :) If anybody wants to port hwloc on cray, we'd be happy to help. Getting us an access on a Cray machine might also help :) Brice Le 23/06/2010 04:05, Norman Lo a écrit : >

Re: [hwloc-users] hwloc sockets support on solaris

2010-06-23 Thread Brice Goglin
I see this in the solaris binding core: if (hwloc_cpuset_weight(hwloc_set) != 1) { errno = EXDEV; return -1; } OMPI doesn't get this error ? Brice Le 23/06/2010 21:56, Terry Dontje a écrit : > Does hwloc think it supports binding processes to sockets or multiple > cpus? I am

Re: [hwloc-users] hwloc sockets support on solaris

2010-06-23 Thread Brice Goglin
Le 23/06/2010 22:27, Jeff Squyres a écrit : > Hm. We should be. Here's the hwloc plugin code for setting CPU affinity > (it's static because it's invoked by function pointer): > > static int module_set(opal_paffinity_base_cpu_set_t mask) > { > int i, ret = OPAL_SUCCESS; > hwloc_cpuset_t

Re: [hwloc-users] Getting a graphics view for anon graphic system...

2010-07-02 Thread Brice Goglin
On 09/06/2010 21:52, Brice Goglin wrote: > On 09/06/2010 21:41, Jeff Squyres wrote: > >> On Jun 6, 2010, at 4:03 PM, Olivier Cessenat wrote: >> >> >> >>> What you write is clear to computer scientists, but I failed to figure

Re: [hwloc-users] hwloc_set/get_thread_cpubind

2010-07-15 Thread Brice Goglin
Le 14/07/2010 20:28, Αλέξανδρος Παπαδογιαννάκης a écrit : > hwloc_set_thread_cpubind and hwloc_get_thread_cpubind are missing from the > html documentation > http://www.open-mpi.org/projects/hwloc/doc/v1.0.1/group__hwlocality__binding.php > > It may be

Re: [hwloc-users] xmlbuffer test failure

2010-11-05 Thread Brice Goglin
Looks like there's something specific to your machine. Can you send the XML output of lstopo ? thanks Brice Le 05/11/2010 05:41, ryuuta a écrit : > Hi, > > I'd like to report the failure of the one of the tests run by 'make > check': > > exported to buffer 0x8546408 length 3070 > re-exported

Re: [hwloc-users] xmlbuffer test failure

2010-11-05 Thread Brice Goglin
Here's the patch :) On 05/11/2010 08:10, Brice Goglin wrote: > Interesting, you don't have any hugepage information, it's probably > disabled in the kernel. Can you apply the attached patch and check that > the XML output only contains a single "page_type" line and th

Re: [hwloc-users] hwloc@SC10

2010-11-12 Thread Brice Goglin
> Drop by the Cisco booth for the exact schedule; we're right next to the main > SciNet NOC. > > See you there! > > > > On Nov 8, 2010, at 11:22 AM, Brice Goglin wrote: > > >> Hello, >> For those of you going to SC10 @ New Orleans next week, you should kno

Re: [hwloc-users] Identifying NIC in a topology using HWLOC

2010-12-27 Thread Brice Goglin
Hello Saktheesh, NICs do not appear in the topology yet. This is under development in the libpci branch. You can take a look at https://svn.open-mpi.org/svn/hwloc/branches/libpci and tell us what you think of the interface. If you're talking about infiniband NICs, hwloc/openfabrics-verbs.h

Re: [hwloc-users] some questions about hwloc

2011-01-28 Thread Brice Goglin
On 28/01/2011 15:32, guillaume Arnal wrote: > Hi everyone, > > I'm a beginner in using hwloc and I have some questions. > > First: I'm looking for a way to find which core is being used by the > current thread. (maybe with hwloc_get_thread_cpubind ??) > > Second: is there a way to set the number of

Re: [hwloc-users] Problem getting cpuset of MPI task

2011-02-09 Thread Brice Goglin
Le 09/02/2011 16:53, Hendryk Bockelmann a écrit : > Since I am new to hwloc there might be a misunderstanding from my > side, but I have a problem getting the cpuset of MPI tasks. I just > want to run a simple MPI program to see on which cores (or CPUs in > case of hyperthreading or SMT) the tasks

Re: [hwloc-users] hwloc-ps output - how to verify process binding on the core level?

2011-02-14 Thread Brice Goglin
Le 14/02/2011 07:43, Siew Yin Chan a écrit : > >> >> > > No. Each hwloc-bind command in the mpirun above doesn't know that > there are other hwloc-bind instances on the same machine. All of > them bind their process to all cores in the first socket. > > => Agree. For socket:0.core:0-3

Re: [hwloc-users] on using hwloc_get_area_membind_nodeset

2011-07-05 Thread Brice Goglin
Le 05/07/2011 20:13, Alfredo Buttari a écrit : > Hi all, > if I understand correctly this routine can tell on which NUMA node(s) > a specific memory area resides, is this correct? > Will this routine work on any memory area allocated with any > allocation routine other than those provided by

Re: [hwloc-users] on using hwloc_get_area_membind_nodeset

2011-07-06 Thread Brice Goglin
be I can get away with a single call to get_mempolicy (no need to > check for all the pages in the memory area). > Thanks again > > best regards > alfredo > > > On Tue, Jul 5, 2011 at 8:34 PM, Brice Goglin <brice.gog...@inria.fr> wrote: >> Hello, >> >>

Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
Le 01/08/2011 12:16, Gabriele Fatigati a écrit : > Hi, > > reading a hwloc-v1.2-a4 manual, on page 15, i look an example > with 4-socket 2-core machine with hyperthreading. > > Core id's are not exclusive as said before. PU's id are exclusive but > not physically sequential (I suppose) > > PU P#0

Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
On 01/08/2011 14:47, Gabriele Fatigati wrote: > Hi Brice, > > so, if I understand well, PU P# numbers are not the same specified > as HWLOC_OBJ_PU flag? > > 2011/8/1 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> > > On 01/08/2011 12:

Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
> "P#" prefix means "physical index". > > But from the hwloc manual, page 58: > > > HWLOC_OBJ_PU: Processing Unit, or (Logical) Processor.. > > > but it is in conflict with what you said :( > > > 2011/8/1 Brice Goglin <brice.gog...@inria.

Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
ine, > PU# are sequential (page 17), and in a non NUMA machine are not > sequential? ( page 16) > > 2011/8/1 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> > > You're confusing object types with index types. > > PU is an object

Re: [hwloc-users] hwloc varning flag

2011-08-13 Thread Brice Goglin
I think I am seeing this too on a custom program, so probably not your application's fault. Brice On 13/08/2011 10:37, Gabriele Fatigati wrote: > > > Dear hwloc users and developers, > > I'm using hwloc 1.2 stable version Intel 11 compiled and checking my > little application with valgrind

Re: [hwloc-users] hwloc varning flag

2011-08-14 Thread Brice Goglin
FWIW, it's a "bug" in valgrind. The manpage of mbind does not exactly match the kernel requirements on mbind parameters. And valgrind fails at respecting the manpage anyway. See https://bugs.kde.org/show_bug.cgi?id=280083 for the mess... Brice On 13/08/2011 15:07, Br

[hwloc-users] Re : hwloc varning flag

2011-08-15 Thread Brice Goglin
No, it just means that valgrind could not properly check how hwloc uses mbind. But I checked the hwloc code again, things look ok, and the kernel is happy with our mbind parameters. Brice - Reply message - From: "Gabriele Fatigati" <g.fatig...@cineca.it> To: "Bri

[hwloc-users] Re : lstopo on multiple machines

2011-08-16 Thread Brice Goglin
Hello Seb, Hwloc only looks at the local machine, there's no support for multinode topology detection so far. We are considering adding it but we don't know yet what users want to do with it, if it should be in the core or not, automatic or not. Your feedback is welcome. Brice - Reply

Re: [hwloc-users] Bind current thread to a specific cpu

2011-08-18 Thread Brice Goglin
Are you talking about logical ids (the ones given by hwloc) or physical/OS ids (the ones given by the OS and possibly in strange order)? You should avoid using physical ids, but... If logical, you can use hwloc_get_obj_by_type() to get the corresponding object, then use its ->cpuset. If physical, you

Re: [hwloc-users] Numa availability

2011-08-28 Thread Brice Goglin
Le 28/08/2011 12:14, Gabriele Fatigati a écrit : > Dear hwloc users, > > what happens if I use hwloc on a non-NUMA machine? I suppose memory > binding has no sense because there is not a memory locality concept. > And regards execution binding? are there some difference on a non-NUMA > machine?

Re: [hwloc-users] hwloc_get_last_cpu_location and PU

2011-08-29 Thread Brice Goglin
Yes Brice On 29/08/2011 16:15, Gabriele Fatigati wrote: > Dear hwloc users, > > hwloc_get_last_cpu_location() returns the last CPU where the process/thread > ran. On an SMT machine, does it return the PU where the process/thread ran? > > Thanks a lot. > > -- > Ing. Gabriele Fatigati > > HPC specialist > >

[hwloc-users] Re : hwloc topology check initializing

2011-09-03 Thread Brice Goglin
Assign NULL to the topology when declaring the variable. It will be changed into something else when init() is called. Brice - Reply message - From: "Gabriele Fatigati" To: "Hardware locality user list" Subject: [hwloc-users] hwloc

[hwloc-users] Re : Re : hwloc topology check initializing

2011-09-03 Thread Brice Goglin
nitializing Date: Sat., Sept. 3, 2011 15:26 Hi Brice, but it works only if the user assigns NULL to topology. Does hwloc_topology_init() not check the argument passed? Is there no way to check whether the topology is initialized? Thanks. 2011/9/3 Brice Goglin <brice.gog...@inria.fr>

Re: [hwloc-users] Process and thread binding

2011-09-12 Thread Brice Goglin
Le 12/09/2011 12:52, Gabriele Fatigati a écrit : > Dear hwloc users, > > I'm binding process in a NUMA node and also associated OpenMP threads. > I've noted that, if I bind execution of all on different cores in > the same NUMA node , it works well. > > If I bind process in NUMA node 0 for

Re: [hwloc-users] Process and thread binding

2011-09-12 Thread Brice Goglin
Le 12/09/2011 13:29, Gabriele Fatigati a écrit : > Hi Birce, > > I'm so confused.. > > I'm binding MPI processes with set_cpu_bind and it works well. The > problem is when I try to bind process and threads. > > It seem that thread process influence bind of main thread. > > And from hwloc manual:

Re: [hwloc-users] Process and thread binding

2011-09-12 Thread Brice Goglin
On 12/09/2011 13:58, Gabriele Fatigati wrote: > Hi Brice, > > but the manual does not say that get_cpubind() returns the > logical OR of the binding of all threads... I always understood that it > returns the binding of the caller, where the caller can be a process or a > thread.. A process is a

Re: [hwloc-users] Process and thread binding

2011-09-12 Thread Brice Goglin
Le 12/09/2011 14:17, Gabriele Fatigati a écrit : > Mm, and why? In a hybrid code ( MPI + OpenMP), my idea is to bind a > single MPI process in one core, and his threads in other cores. > Otherwise I have all threads that runs on a single core. > The usual way to do that is to first bind the

Re: [hwloc-users] hwloc set membind function

2011-09-22 Thread Brice Goglin
On 22/09/2011 12:20, Gabriele Fatigati wrote: > NUMA node(s) near the specified cpuset. > > What does "nodes near the specified cpuset" mean? The node where the > specified cpuset lives? > Set the default memory binding policy of the current process or thread > to prefer the The node near

Re: [hwloc-users] hwloc set membind function

2011-09-25 Thread Brice Goglin
Le 25/09/2011 11:14, Gabriele Fatigati a écrit : > > I report my questions in a different way (in the first question i did > a mistake): > > > 1) I don't understand the means of set_membind() function. Why I > should to allocate in a node "near" my cpuset and not in my local node > (where thread

Re: [hwloc-users] hwloc set membind function

2011-09-25 Thread Brice Goglin
Le 25/09/2011 12:19, Gabriele Fatigati a écrit : > Hi Brice, > > >The flag says "when the first touch occurs and the physical memory is > allocated for real, don't allocate on the local node (default), but > >rather allocate where specified by set_membind". > > If is it already allocated for real,

Re: [hwloc-users] hwloc set membind function

2011-09-25 Thread Brice Goglin
On 25/09/2011 20:27, Gabriele Fatigati wrote: > if(tid==0){ > > set_membind(HWLOC_MEMBIND_BIND, node 0) > malloc(array)... > > } > > if (tid==1){ > set_membind(HWLOC_MEMBIND_BIND, node 1) > > for(i...) > array(i) > } > > end parallel region > > > array is allocated on node 1, not node 0 as I

Re: [hwloc-users] hwloc set membind function

2011-09-25 Thread Brice Goglin
Le 25/09/2011 20:57, Gabriele Fatigati a écrit : > after done this, memory is allocated not in a local node of thread > that does set_membind and malloc, but in node of thread that touches > it. And I don't understand this behaviour :( Memory is allocated when first-touched. If there's no

Re: [hwloc-users] hwloc set membind function

2011-09-25 Thread Brice Goglin
ble? I just said "you have to touch right after malloc." Brice > > 2011/9/25 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> > > Le 25/09/2011 20:57, Gabriele Fatigati a écrit : > > after done this, memory is allocated not

Re: [hwloc-users] How to combine hwloc-bind and mpirun

2011-11-10 Thread Brice Goglin
Le 10/11/2011 13:13, Rafael R. Pappalardo a écrit : > I am trying to send a MPI job to selected cores on a 64 cores machine. With > taskset I use: > > mpirun -np 8 taskset -c 1,3,5,7,9,11,13,15 program > > but if I substitute taskset by hwloc-bind doing > > mpirun -np 8 hwloc-bind core:1 core:3

Re: [hwloc-users] GPU/NIC/CPU locality

2011-11-29 Thread Brice Goglin
Hello Stefan, hwloc 1.3 already has support for PCI device detection. These new objects contain a "class" field that can help you know if it's a NIC/GPU/... However it's hard to know which PCI device is eth0 or eth1, so we also try to add some OS device inside PCI device. If you're using Linux,

Re: [hwloc-users] GPU/NIC/CPU locality

2011-11-29 Thread Brice Goglin
> Hwloc optional build support status (more details can be found above): > > Probe / display PCI devices: yes > Graphical output (Cairo): yes > XML output: full "XML output" should be "XML input/output" or "XML support". > Memory support: binding, set policy,

Re: [hwloc-users] GPU/NIC/CPU locality

2011-11-30 Thread Brice Goglin
On 30/11/2011 08:44, Stefan Eilemann wrote: > Let me know if I can help. We would be quite interested in this feature. You can help by asking the relevant people for help :) * ask the OpenCL board to add a device query property that tells us the locality of a device. If they return the BusID

Re: [hwloc-users] hwloc download link broken

2012-01-03 Thread Brice Goglin
On 03/01/2012 05:32, gareth.willi...@csiro.au wrote: > > On the page: http://www.open-mpi.org/projects/hwloc/ the 'download > page' link: http://www.open-mpi.org/software/hwloc/v1.3.1/ is broken. > > > > But http://www.open-mpi.org/software/hwloc/v1.3/ works so my work is > not stalled :) > >

Re: [hwloc-users] Memory replication on a linux NUMA server

2012-01-05 Thread Brice Goglin
Hello François, Replicate is not supported on Linux (and that is not going to change soon unfortunately). For now you should replicate manually. Best wishes to you too! Brice Le 05/01/2012 11:33, François Galea a écrit : > Hello, > > I am working on a Linux amd64 NUMA server running SUSE

[hwloc-users] removing old cpuset API?

2012-01-19 Thread Brice Goglin
Dear hwloc users, The cpuset API (hwloc_cpuset_*) was replaced by the bitmap API (hwloc_bitmap_*) in v1.1.0, back in December 2010. We kept backward compatibility by #defin'ing the old API on top of the new one. So you may still use the old API in your application (but you would get "deprecated"

[hwloc-users] hwloc and HTX device ?

2012-01-27 Thread Brice Goglin
Hello, I'd like to see what hwloc reports on AMD machines with a HTX card (hypertransport expansion card). The most widely known case would likely be a 3-5-year-old AMD cluster with Pathscale Infinipath network cards. But I think there are also some accelerators such as clearspeed, and the

Re: [hwloc-users] PCI devices in the topology

2012-02-10 Thread Brice Goglin
On 10/02/2012 21:16, Jeff Squyres wrote: > When PCI devices are put into the tree, do they potentially make other > objects be at different depths? > > For example, http://www.open-mpi.org/projects/hwloc/devel09-pci.png has a PCI > bridge hanging off a socket. Are the cores on sockets P0

Re: [hwloc-users] PCI devices in the topology

2012-02-10 Thread Brice Goglin
Le 10/02/2012 21:46, Jeff Squyres a écrit : > On Feb 10, 2012, at 3:37 PM, Brice Goglin wrote: > >> All objects of the same type are *always* at the same depth (for caches >> and groups, replace "same type" with "same type and same level" so that >> L1

Re: [hwloc-users] receive 0x0 from hwloc_cuda_get_device_cpuset

2012-02-16 Thread Brice Goglin
Le 16/02/2012 15:26, Albert Solernou a écrit : > Is there anything easy that the administrators of the cluster could > do? How could I persuade them that this is an easy task to do? They could upgrade the BIOS. But your machine is old and people didn't care much about I/O affinity in Intel

Re: [hwloc-users] bind process to built cpuset

2012-02-20 Thread Brice Goglin
Le 20/02/2012 17:41, Albert Solernou a écrit : > Hi, > I'd like to bind a process to a cpuset, so that when it spawns on > several threads, those are trapped on that cpuset. > > In order to do so, I want to define my own cpuset. Let's say I want it > to include HWLOC_OBJ_CORE 2 and 5. How can I

Re: [hwloc-users] bind process to built cpuset

2012-02-20 Thread Brice Goglin
Le 20/02/2012 19:06, Brice Goglin a écrit : > Le 20/02/2012 17:41, Albert Solernou a écrit : >> Hi, >> I'd like to bind a process to a cpuset, so that when it spawns on >> several threads, those are trapped on that cpuset. >> >> In order to do so, I want to defin

Re: [hwloc-users] bind process to built cpuset

2012-02-21 Thread Brice Goglin
t > > On Tue 21 Feb 2012 09:46:46 GMT, Albert Solernou wrote: >> Thank you very much, Brice! >> >> Best, >> Albert >> >> On Mon 20 Feb 2012 18:09:55 GMT, Brice Goglin wrote: >>> Le 20/02/2012 19:06, Brice Goglin a écrit : >>>>

Re: [hwloc-users] receive 0x0 from hwloc_cuda_get_device_cpuset

2012-02-21 Thread Brice Goglin
Le 21/02/2012 15:42, Albert Solernou a écrit : > Hi, > I have several questions in order to fix this issue from the machine > side. > > 1) I realised that on this machine neither libcpuset nor cpuset-utils > are installed. Could this be related to the problem? No, Linux "cpuset" are very

Re: [hwloc-users] Problems on SMP with 48 cores

2012-03-13 Thread Brice Goglin
Le 13/03/2012 17:04, Hartmut Kaiser a écrit : >>> But the problems I was seeing were not MSVC specific. It's a >>> proliferation of arcane (non-POSIX) function use (like strcasecmp, >>> etc.) missing use of HAVE_UNISTD_H, HAVE_STRINGS_H to wrap >>> non-standard headers, unsafe mixing of >>>

Re: [hwloc-users] Problems on SMP with 48 cores

2012-03-13 Thread Brice Goglin
Le 13/03/2012 17:04, Hartmut Kaiser a écrit : >>> But the problems I was seeing were not MSVC specific. It's a >>> proliferation of arcane (non-POSIX) function use (like strcasecmp, >>> etc.) missing use of HAVE_UNISTD_H, HAVE_STRINGS_H to wrap >>> non-standard headers, unsafe mixing of >>>

Re: [hwloc-users] Problems on SMP with 48 cores

2012-03-13 Thread Brice Goglin
Le 13/03/2012 18:57, Samuel Thibault a écrit : > Brice Goglin, le Tue 13 Mar 2012 18:55:29 +0100, a écrit : >> Le 13/03/2012 17:04, Hartmut Kaiser a écrit : >>>>> But the problems I was seeing were not MSVC specific. It's a >>>>> proliferation of arcane (n

Re: [hwloc-users] Problems on SMP with 48 cores

2012-03-14 Thread Brice Goglin
Le 13/03/2012 19:08, Hartmut Kaiser a écrit : >> - hwloc_bitmap_from_ith_ulong(obj->cpuset, GroupMask[i].Group, >> GroupMask[i].Mask); >> + hwloc_bitmap_from_ith_ulong(obj->cpuset, 2*GroupMask[i].Group, >> GroupMask[i].Mask & 0xfff); There's a missing 'f' above. Here's

Re: [hwloc-users] Problems on SMP with 48 cores

2012-03-14 Thread Brice Goglin
We debugged this in private emails with Hartmut. His 48-core platform is now detected properly. Everything got fixed with a patch functionally identical to what Samuel sent earlier. There's a bit of work before we can commit the fix, but Windows support for more than 32 cores will be officially

Re: [hwloc-users] Using distances

2012-04-21 Thread Brice Goglin
On 21/04/2012 12:23, Jeffrey Squyres wrote: I'm trying to use hwloc distances in Open MPI (e.g., find the distance from each OpenFabrics device to the PU(s) where this process is bound), and I'm a bit confused by the distances documentation. If I have a WHOLE_SYSTEM topology, and I know that

Re: [hwloc-users] Using distances

2012-04-21 Thread Brice Goglin
On 21/04/2012 13:15, Jeffrey Squyres wrote: On Apr 21, 2012, at 7:09 AM, Brice Goglin wrote: I assume you have the entire distance (latency) matrix between all NUMA nodes as usually reported by the BIOS. const struct hwloc_distance_s *distances = hwloc_get_whole_distance_matrix_by_type

Re: [hwloc-users] possible concurrency issue with reading /proc data on Linux

2012-04-21 Thread Brice Goglin
On 21/04/2012 23:08, Vlad wrote: Greetings, I use hwloc-1.4.1 stable on Red Hat 5 and am seeing a possible concurrency issue not covered by the "Thread Safety" guidelines: - I start a small number (4) of threads, each of which does some work and periodically executes

Re: [hwloc-users] possible concurrency issue with reading /proc data on Linux

2012-04-21 Thread Brice Goglin
On 21/04/2012 23:36, Vlad wrote: Will try this within a day or two. At the moment I am simply using a retry loop on ENOSYS and usually no more than one retry is needed. Ok thanks. You are probably correct. I was thinking of this code from

Re: [hwloc-users] possible concurrency issue with reading /proc data on Linux

2012-04-23 Thread Brice Goglin
run out of retries I default to hwloc_get_last_cpu_location(... HWLOC_CPUBIND_THREAD) -- since presumably that can't fail and the result is technically valid given hwloc_get_last_cpu_location() semantics (it reads state that's inherently transient). On Apr 23, 2012, at 7:53 AM, Brice Goglin

Re: [hwloc-users] hwloc_get_last_cpu_location on AIX

2012-05-08 Thread Brice Goglin
Le 08/05/2012 14:33, Hendryk Bockelmann a écrit : > Hello, > > I just ran into trouble using hwloc_get_last_cpu_location on our > POWER6 cluster with AIX6.1 > My plan is to find out if the binding of the job-scheduler was correct > for MPI-tasks and OpenMP-threads. This is what I want to use: > >

Re: [hwloc-users] hwloc - Build problem.

2012-05-20 Thread Brice Goglin
Hello Anatoly, You likely need to add libxml2.a to your link command-line. And some others may be missing later. Instead of linking with src/.libs/libhwloc.a, you should run "make install" and use libhwloc.a from there (use --prefix= to tell configure where to install). Once hwloc is installed,

Re: [hwloc-users] hwloc_get_last_cpu_location on AIX

2012-05-29 Thread Brice Goglin
ing at get_last_cpu_location() for entire processes instead of individual threads. Brice Le 08/05/2012 14:41, Brice Goglin a écrit : > Le 08/05/2012 14:33, Hendryk Bockelmann a écrit : >> Hello, >> >> I just ran into trouble using hwloc_get_last_cpu_location on our >> PO

Re: [hwloc-users] Understanding hwloc-ps output

2012-05-30 Thread Brice Goglin
the OMPI v1.6 SVN branch) > > > On May 30, 2012, at 9:54 AM, Brice Goglin wrote: > >> Hello Youri, >> When using openmpi 1.4.4 with --np 2 --bind-to-core --bycore it reports the >> following: >>> [hostname:03339] [[17125,0],0] odls:default:fork binding child

Re: [hwloc-users] Hwloc error.

2012-05-30 Thread Brice Goglin
Le 30/05/2012 17:22, Samuel Thibault a écrit : > Hello, > > John Hanks, le Wed 30 May 2012 17:03:47 +0200, a écrit : >> * Hwloc has encountered what looks like an error from the operating system. >> * >> * object intersection without inclusion! >> * Error occurred in topology.c line 594 > There is

Re: [hwloc-users] Hwloc error.

2012-05-30 Thread Brice Goglin
We don't need any other info on the hwloc side. And we thank you for testing the big hwloc warning code :) For HP: * If you're lucky, the BIOS may talk about the number of NUMA nodes (either on the usual messages during boot, or in the BIOS configuration menu). See if it says 2 on the broken node

Re: [hwloc-users] anyone seen problems with PCI on RHEL 6?

2012-07-03 Thread Brice Goglin
I think I remember a similar report but I can't find it in the archives. RHEL bugzilla found https://bugzilla.redhat.com/show_bug.cgi?id=740630 which is solved in pciutils >= 3.1.4-11 Which pciutils do you have? Brice Le 03/07/2012 01:48, Carl Smith a écrit : > I happened to run "lstopo

Re: [hwloc-users] hwloc_get_latency() failures and confusion

2012-08-06 Thread Brice Goglin
Le 06/08/2012 23:47, Wheeler, Kyle Bruce a écrit : > Hello, > > I'm failing to understand what hwloc (v1.5) is doing. I'm trying to use > hwloc_get_latency() to determine the distance between two cores. > > The two cores are on different sockets. According to libnuma's numactl, the > latency

Re: [hwloc-users] [EXTERNAL] Re: hwloc_get_latency() failures and confusion

2012-08-06 Thread Brice Goglin
Le 07/08/2012 00:36, Wheeler, Kyle Bruce a écrit : > A, that's key! The documentation currently says "Look at ancestor > objects from the bottom to the top until one of them contains a > distance matrix that matches the objects exactly", which suggests to > me that it will traverse the object

Re: [hwloc-users] lstopo and GPus

2012-08-28 Thread Brice Goglin
Hello, For now, you have to look at PCI ids. NVIDIA GPUs have "10de:" as vendor/device ids, that's what is shown in your boxes on the right. We should have better GPU support in the future. Right now, we only use what Linux knows, and it knows pretty much nothing about NVIDIA GPUs because of

Re: [hwloc-users] lstopo and GPus

2012-08-28 Thread Brice Goglin
On 28/08/2012 14:23, Samuel Thibault wrote: > Gabriele Fatigati, on Tue 28 Aug 2012 14:19:44 +0200, wrote: >> I'm using hwloc 1.5. I would like to see how GPUs are connected with the processor >> socket using lstopo command. > About connection with the socket, there is indeed no real graphical >

Re: [hwloc-users] Thread binding problem

2012-09-05 Thread Brice Goglin
Hello Gabriele, The only limit that I would think of is the available physical memory on each NUMA node (numactl -H will tell you how much of each NUMA node memory is still available). malloc usually only fails (it returns NULL?) when there is no *virtual* memory left, that's different. If you

Re: [hwloc-users] Thread binding problem

2012-09-05 Thread Brice Goglin
upported > -1 with errno set to EXDEV if the binding cannot be enforced > > > Any other binding failure reason? The memory available is enough. > > 2012/9/5 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> > > Hello Gabriele, > > The on

Re: [hwloc-users] Thread binding problem

2012-09-05 Thread Brice Goglin
osed that these two case > was the two unique possibly. > > From the hwloc documentation: > > -1 with errno set to ENOSYS if the action is not supported > -1 with errno set to EXDEV if the binding cannot be enforced > > > Any other binding failure reason?

Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
Le 06/09/2012 09:56, Gabriele Fatigati a écrit : > Hi Brice, hi Jeff, > > >Can you add some printf inside hwloc_linux_set_area_membind() in > src/topology-linux.c to see if ENOMEM comes from the mbind >syscall or > not? > > I added printf inside that function, but ENOMEM does not come from there.

Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
we're talking about 1,6MB only here. So there's still something else eating all the memory. /proc/meminfo (MemFree) and numactl -H should again help. Brice > > > > 2012/9/6 Brice Goglin <brice.gog...@inria.fr > <mailto:brice.gog...@inria.fr>> > > Le 06/09/2012 1

Re: [hwloc-users] Solaris and hwloc

2012-09-13 Thread Brice Goglin
Le 13/09/2012 00:26, Jeff Squyres a écrit : > On Sep 12, 2012, at 10:30 AM, Samuel Thibault wrote: > >>> Sidenote: if hwloc-bind fails to bind, should we still launch the child >>> process? >> Well, it's up to you to decide :) > > Anyone have an opinion? I'm 60/40 in favor of not letting it run,

Re: [hwloc-users] Solaris and hwloc

2012-09-13 Thread Brice Goglin
(resending because the formatting was bad) Le 13/09/2012 00:26, Jeff Squyres a écrit : > On Sep 12, 2012, at 10:30 AM, Samuel Thibault wrote: > >>> Sidenote: if hwloc-bind fails to bind, should we still launch the child >>> process? >> Well, it's up to you to decide :) > > Anyone have an

Re: [hwloc-users] Solaris and hwloc

2012-09-13 Thread Brice Goglin
If the user really wants something to > run without binding, then you can just do that in the shell: > > - > hwloc-bind ...whatever... my_executable > if test "$?" != "0"; then > # run without binding > my_executable > fi > - > > My

Re: [hwloc-users] Questions to lstopo and hwloc-bind

2012-09-14 Thread Brice Goglin
Le 14/09/2012 07:48, Siegmar Gross a écrit : > I have installed hwloc-1.5 on our systems and get the following output > when I run "lstopo" on a Sun Server M4000 (two quad-core processors with > two hardware-threads each). > > rs0 fd1026 101 lstopo > Machine (32GB) + NUMANode L#0 (P#1 32GB) >

Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware

2012-10-02 Thread Brice Goglin
Le 02/10/2012 23:45, Sebastian Kuzminsky a écrit : > Hi folks, I just discovered hwloc and it's really cool. Very useful, > so thanks! > > I'm trying to understand the hardware layout of a computer I'm working > with, an HP Proliant DL360p G8 server with two Intel E5-2690 processors. > > I'm

Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware

2012-10-03 Thread Brice Goglin
Le 03/10/2012 17:23, Sebastian Kuzminsky a écrit : > On Tue, Oct 2, 2012 at 5:14 PM, Samuel Thibault > > wrote: > > There were two bugs which resulted into cpuid not being properly > compiled. I have fixed them in the trunk, could

Re: [hwloc-users] How do I access CPUModel info string

2012-10-25 Thread Brice Goglin
Le 25/10/2012 23:42, Samuel Thibault a écrit : > Robin Scher, le Thu 25 Oct 2012 23:39:46 +0200, a écrit : >> Is there a way to get this string (e.g. "Intel(R) Core(TM) i7 CPU M 620 @ >> 2.67GHz") consistently on Windows, Linux, OS-X and Solaris? > Currently, no. > > hwloc itself does not have a

Re: [hwloc-users] How do I access CPUModel info string

2012-10-25 Thread Brice Goglin
Le 25/10/2012 23:57, Robin Scher a écrit : > On OS-X, you can get this string from the sysctlbyname() call: > > const char *name = "machdep.cpu.brand_string"; > char buffer[ 64 ]; > size_t size = 64; > if( !sysctlbyname( name, buffer, &size, NULL, 0 ) ) > memcpy( cpu_model,
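For a quick check outside of C, the same string can be read from the shell. This is a rough sketch under stated assumptions: it covers only OS X (via `sysctl -n` with the `machdep.cpu.brand_string` key from the message above) and Linux (via the "model name" line of /proc/cpuinfo, as found on x86). The function names `model_from_cpuinfo` and `cpu_model` are made up for this example and are not hwloc interfaces.

```shell
#!/bin/sh
# Extract the first "model name" value from a cpuinfo-style file.
# (Hypothetical helper; the file argument defaults to /proc/cpuinfo.)
model_from_cpuinfo() {
    sed -n 's/^model name[[:space:]]*:[[:space:]]*//p' "${1:-/proc/cpuinfo}" | head -n 1
}

# Print the CPU model string on the two platforms this sketch handles.
cpu_model() {
    case "$(uname -s)" in
        Darwin) sysctl -n machdep.cpu.brand_string ;;
        Linux)  model_from_cpuinfo ;;
        *)      echo "cpu_model: unsupported OS" >&2; return 1 ;;
    esac
}

cpu_model || echo "CPU model not available"
```

Windows and Solaris need different mechanisms and are deliberately left out here.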

Re: [hwloc-users] How do I access CPUModel info string

2012-10-26 Thread Brice Goglin
Le 26/10/2012 05:22, Robin Scher a écrit : > I would love to get this by my next release, say in the next 3-6 > months. Is that something that would be possible? Is there anything I > can do to help? We'll have a v1.6 release before the end of the year, and hopefully a first release candidate by

Re: [hwloc-users] How do I access CPUModel info string

2012-10-27 Thread Brice Goglin
int the number of sockets. of > http://www.open-mpi.org/projects/hwloc/doc/v1.5.1/ > ] > I see objects type 1,2,4 and 6 only. > > So, will there be another (non socket hwloc object based) way to get > CPUModel or will it find sockets as on Linux ? > > Thanks. > > Olivier Ce

Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-02 Thread Brice Goglin
Le 02/11/2012 21:03, Brock Palen a écrit : > This isn't a hwloc problem exactly, but maybe you can shed some insight. > > We have some 4 socket 10 core = 40 core nodes, HT off: > > depth 0: 1 Machine (type #1) > depth 1: 4 NUMANodes (type #2) > depth 2:4 Sockets (type #3) >

Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-02 Thread Brice Goglin
Le 02/11/2012 21:22, Brice Goglin a écrit : > hwloc-bind --get-last-cpu-location --pid <pid> should give the same > info but it seems broken on my machine right now, going to debug. Actually, that works fine once you try it on a non-multithreaded program that uses all cores :) So you can u

Re: [hwloc-users] Strange binding issue on 40 core nodes and cgroups

2012-11-05 Thread Brice Goglin
Le 05/11/2012 22:57, Brock Palen a écrit : > Ok more information (had to build newer hwloc) My job today only 2 processes > are running at half speed and they indeed are sharing the same core: > > [root@nyx7000 ~]# for x in `cat /tmp/pids `; do echo -n "$x "; hwloc-bind >
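A loop of the kind quoted above, printing where each process last ran, might look like the sketch below. It is not a reconstruction of the truncated command: the function name `last_locations`, the one-PID-per-line file format, and the optional second argument (which lets a stand-in replace `hwloc-bind`) are all assumptions for illustration. The `--get-last-cpu-location --pid` options are the ones mentioned earlier in this thread.

```shell
#!/bin/sh
# last_locations PIDFILE [TOOL]
# For each PID listed in PIDFILE (one per line), print the PID followed by
# the output of "TOOL --get-last-cpu-location --pid PID".
# TOOL defaults to hwloc-bind. (Hypothetical helper for illustration.)
last_locations() {
    pidfile=$1
    tool=${2:-hwloc-bind}
    while read -r pid; do
        [ -n "$pid" ] || continue        # skip blank lines
        printf '%s ' "$pid"
        "$tool" --get-last-cpu-location --pid "$pid"
    done < "$pidfile"
}

# Example (not executed here):
#   last_locations /tmp/pids
```

Two PIDs reporting the same location in successive runs is a hint that they share a core, which is the symptom described in the message.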

[hwloc-users] hwloc@SC12

2012-11-07 Thread Brice Goglin
Hello, If you're attending SC12, feel free to come to the Inria booth (#1209) and say hello. Samuel and I will be there, happy to meet people in real life. Brice

Re: [hwloc-users] [hwloc-announce] Hardware locality (hwloc) v1.6rc1 released

2012-11-15 Thread Brice Goglin
Thanks, that was an old bug on a somewhat rare XML case on a NUIOA machine. Looks like adding new test cases is indeed useful :) Brice Le 15/11/2012 13:14, Samuel Thibault a écrit : > Hello, > > Brice Goglin, le Tue 13 Nov 2012 13:45:28 +0100, a écrit : >> The Hardware Locali

Re: [hwloc-users] How do I access CPUModel info string

2012-11-18 Thread Brice Goglin
Le 26/10/2012 09:39, Brice Goglin a écrit : > Le 26/10/2012 05:22, Robin Scher a écrit : >> I would love to get this by my next release, say in the next 3-6 >> months. Is that something that would be possible? Is there anything I >> can do to help? > > We'll have a
