Re: [hwloc-devel] Stability of /sys/devices/system/cpu/cpu0/cache/index*/ interface

2011-10-12 Thread Jiri Hladky
Thanks Samuel!

I completely agree with you and I have passed this information to haveged
upstream.

Jirka

On Wed, Oct 5, 2011 at 12:01 AM, Samuel Thibault
wrote:

> Jiri Hladky, on Tue 04 Oct 2011 23:53:02 +0200, wrote:
> > Since hwloc also relies on /sys/devices/system/cpu/cpu0/cache/, I'm
> > wondering if you have had any thoughts or issues on that.
>
> On the linux-kernel mailing list it is often said that the content of
> /sys is part of the ABI and thus must not be broken, so I believe we
> can assume it is stable.
>
> Samuel
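
For reference, a small standalone sketch (plain C, not hwloc code) of reading
this interface; the attribute names used here (level, type, size,
coherency_line_size) are those exposed under index*/ by recent Linux kernels:

    #include <stdio.h>

    /* Print a few attributes of cpu0's first cache (index0). */
    static void print_attr(const char *name)
    {
        char path[256], buf[64];
        FILE *f;
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu0/cache/index0/%s", name);
        f = fopen(path, "r");
        if (f) {
            if (fgets(buf, sizeof(buf), f))
                printf("%s: %s", name, buf);   /* sysfs values end with \n */
            fclose(f);
        }
    }

    int main(void)
    {
        print_attr("level");                   /* e.g. 1 */
        print_attr("type");                    /* Data, Instruction or Unified */
        print_attr("size");                    /* e.g. 32K */
        print_attr("coherency_line_size");     /* cache line size in bytes */
        return 0;
    }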


[hwloc-devel] beyond v1.3

2011-10-12 Thread Brice Goglin
I just released the final v1.3 (the official announcement mail is coming later).

I will likely merge the custom branch into trunk in the next few days so
that the multinode topology support gets wider testing.

The other big thing on my TODO list for v1.4 is the throughput distance
matrix. This one is a bit annoying because the existing distance
interface isn't generic enough to make this clean. We'll see.

Brice
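
To make the constraint concrete: a hedged sketch of what the current
distances interface exposes, assuming hwloc_get_whole_distance_matrix_by_depth()
and struct hwloc_distances_s as in the 1.x headers. The structure carries a
single relative latency array, with no room for a second (throughput) matrix:

    #include <hwloc.h>
    #include <stdio.h>

    static void print_numa_latencies(hwloc_topology_t topo)
    {
        int depth = hwloc_get_type_depth(topo, HWLOC_OBJ_NODE);
        const struct hwloc_distances_s *d;
        unsigned i, j;
        if (depth < 0)
            return;                      /* no NUMA nodes in this topology */
        d = hwloc_get_whole_distance_matrix_by_depth(topo, (unsigned) depth);
        if (!d)
            return;                      /* no distance information */
        for (i = 0; i < d->nbobjs; i++)
            for (j = 0; j < d->nbobjs; j++)
                printf("node %u -> node %u: relative latency %f\n",
                       i, j, d->latency[i * d->nbobjs + j]);
    }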



[hwloc-devel] hwloc + OMPI issue

2011-10-12 Thread Jeff Squyres
One of the OMPI devs found a problem after I upgraded the OMPI SVN trunk to the 
hwloc 1.2.2ompi version last week, and I think I am only now beginning to 
understand it.

Brief reminder of our strategy:

- on each compute node, OMPI launches a local "orted" helper daemon
- this orted fork/exec's the local MPI processes

To avoid the penalty of each MPI process invoking hwloc discovery more-or-less 
simultaneously upon startup (which, as we've seen on this list before, can be 
painful when core counts are large), we have the orted do the hwloc discovery, 
serialize the result into XML, and send it to each of its local processes.  The 
local processes receive this XML, load it into hwloc, and run from there.
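
For illustration, a minimal sketch of this strategy, assuming the hwloc 1.x
XML-buffer API (hwloc_topology_export_xmlbuffer() on the orted side,
hwloc_topology_set_xmlbuffer() in the processes); error handling and the
actual orted-to-process transport are omitted:

    /* orted side: discover once, serialize to an XML buffer */
    hwloc_topology_t topo;
    char *xmlbuf;
    int buflen;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);            /* the one expensive discovery */
    hwloc_topology_export_xmlbuffer(topo, &xmlbuf, &buflen);
    /* ... send xmlbuf (buflen bytes) to each local MPI process ... */

    /* MPI process side: load the received XML instead of rediscovering */
    hwloc_topology_t ptopo;
    hwloc_topology_init(&ptopo);
    hwloc_topology_set_xmlbuffer(ptopo, xmlbuf, buflen);
    hwloc_topology_load(ptopo);           /* cheap: only parses the XML */

As the rest of this mail explains, a topology loaded this way ends up with
is_thissystem set to 0.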

However, it looks like the resulting loaded-from-XML topology->is_thissystem is 
set to 0, and therefore functions like hwloc_get_cpubind() actually get wired 
up to dontget_thisproc_cpubind() (instead of the proper Linux backend, for 
example).

How do we avoid this?  We need working hwloc functions after loading up an XML 
topology string.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [hwloc-devel] hwloc + OMPI issue

2011-10-12 Thread Brice Goglin
On 12/10/2011 22:56, Jeff Squyres wrote:
> One of the OMPI devs found a problem after I upgraded the OMPI SVN trunk to 
> the hwloc 1.2.2ompi version last week, and I think I am only now beginning to 
> understand it.
>
> Brief reminder of our strategy:
>
> - on each compute node, OMPI launches a local "orted" helper daemon
> - this orted fork/exec's the local MPI processes
>
> To avoid the penalty of each MPI process invoking hwloc discovery 
> more-or-less simultaneously upon startup (which, as we've seen on this list 
> before, can be painful when core counts are large), we have the orted do the 
> hwloc discovery, serialize the result into XML, and send it to each of its 
> local processes.  The local processes receive this XML, load it into hwloc, 
> and run from there.
>
> However, it looks like the resulting loaded-from-XML topology->is_thissystem 
> is set to 0, and therefore functions like hwloc_get_cpubind() actually get 
> wired up to dontget_thisproc_cpubind() (instead of the proper Linux backend, 
> for example).
>
> How do we avoid this?  We need working hwloc functions after loading up an 
> XML topology string.

export HWLOC_THISSYSTEM=1
or
hwloc_topology_set_flags(HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM) between
init() and load()

Brice
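
In context, a minimal sketch of the second option, with the flag set between
hwloc_topology_init() and hwloc_topology_load(). The names xmlbuf/buflen stand
for the XML received from the orted; the topology argument, elided in the
answer above, is spelled out here:

    hwloc_topology_t topo;
    hwloc_cpuset_t set;
    hwloc_topology_init(&topo);
    /* assert that the XML describes the machine we are running on */
    hwloc_topology_set_flags(topo, HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM);
    hwloc_topology_set_xmlbuffer(topo, xmlbuf, buflen);
    hwloc_topology_load(topo);
    /* binding calls are now wired to the real OS backend, e.g.: */
    set = hwloc_bitmap_alloc();
    hwloc_get_cpubind(topo, set, 0);
    hwloc_bitmap_free(set);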



Re: [hwloc-devel] hwloc + OMPI issue

2011-10-12 Thread Ralph Castain
Thanks! I'll add the latter to our code.

Ralph

On Oct 12, 2011, at 3:11 PM, Brice Goglin wrote:

> On 12/10/2011 22:56, Jeff Squyres wrote:
>> One of the OMPI devs found a problem after I upgraded the OMPI SVN trunk to 
>> the hwloc 1.2.2ompi version last week, and I think I am only now beginning 
>> to understand it.
>> 
>> Brief reminder of our strategy:
>> 
>> - on each compute node, OMPI launches a local "orted" helper daemon
>> - this orted fork/exec's the local MPI processes
>> 
>> To avoid the penalty of each MPI process invoking hwloc discovery 
>> more-or-less simultaneously upon startup (which, as we've seen on this list 
>> before, can be painful when core counts are large), we have the orted do the 
>> hwloc discovery, serialize the result into XML, and send it to each of its 
>> local processes.  The local processes receive this XML, load it into hwloc, 
>> and run from there.
>> 
>> However, it looks like the resulting loaded-from-XML topology->is_thissystem 
>> is set to 0, and therefore functions like hwloc_get_cpubind() actually get 
>> wired up to dontget_thisproc_cpubind() (instead of the proper Linux backend, 
>> for example).
>> 
>> How do we avoid this?  We need working hwloc functions after loading up an 
>> XML topology string.
> 
> export HWLOC_THISSYSTEM=1
> or
> hwloc_topology_set_flags(HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM) between
> init() and load()
> 
> Brice
> 