That's pretty much what I had in mind too - will have to play with it a bit 
until we find the best solution, but it shouldn't be all that hard.
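
For reference, here is a rough sketch of the init-time array and lookup that Brice outlines in the quoted message below. It is only a model under stated assumptions, not hwloc or OMPI code: node_t, dev_entry_t, non_io_ancestor(), build_table() and device_near() are hypothetical names, the is_io flag stands in for hwloc's "cpuset is NULL" test, and a 64-bit mask (one bit per core) stands in for an hwloc cpuset. In real code the device walk would presumably use hwloc_get_next_pcidev(), the locality would come from hwloc_get_non_io_ancestor_obj(), and the containment test would be hwloc_bitmap_isincluded().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an hwloc object. I/O objects (bridges, PCI
 * devices) have no cpuset of their own; regular objects do. */
typedef struct node {
    const struct node *parent;
    int is_io;          /* 1 for bridges and PCI devices, 0 otherwise */
    uint64_t cpuset;    /* one bit per core; only meaningful if !is_io */
    const char *name;
} node_t;

/* One entry of the array built at init: a device plus its locality. */
typedef struct {
    const node_t *dev;
    uint64_t locality;  /* cpuset of the first non-I/O ancestor */
} dev_entry_t;

/* Walk the parent links until a non-I/O object is found (the same idea
 * as hwloc_get_non_io_ancestor_obj()). Returns NULL if there is none. */
static const node_t *non_io_ancestor(const node_t *obj)
{
    while (obj != NULL && obj->is_io)
        obj = obj->parent;
    return obj;
}

/* Init step: record each interesting device together with its locality.
 * Returns the number of entries stored (at most cap). */
static size_t build_table(const node_t *const *devs, size_t ndevs,
                          dev_entry_t *table, size_t cap)
{
    size_t n = 0;
    assert(table != NULL || cap == 0);
    for (size_t i = 0; i < ndevs && n < cap; i++) {
        const node_t *anc = non_io_ancestor(devs[i]);
        if (anc == NULL)
            continue;   /* no locality information for this device */
        table[n].dev = devs[i];
        table[n].locality = anc->cpuset;
        n++;
    }
    return n;
}

/* Lookup step: return the first device whose locality contains every
 * core in core_cpuset, or NULL if no entry qualifies. */
static const node_t *device_near(const dev_entry_t *table, size_t n,
                                 uint64_t core_cpuset)
{
    for (size_t i = 0; i < n; i++)
        if ((core_cpuset & ~table[i].locality) == 0)
            return table[i].dev;
    return NULL;
}
```

The table is built once, so each lookup is a linear scan over a handful of entries. If several devices share a locality, this returns the first match; picking among them (by device type, say) would need an extra filter.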

On Feb 9, 2012, at 2:23 PM, Brice Goglin wrote:

> Here's what I would do:
> During init, walk the list of hwloc PCI devices (hwloc_get_next_pcidev()) and 
> keep an array of pointers to the interesting ones + their locality (the 
> hwloc cpuset of the parent non-IO object).
> When you want the I/O device near a core, walk the array and find one whose 
> locality contains your core hwloc cpuset.
> 
> If you need help, feel free to contact me offline.
> 
> Brice
> 
> 
> 
> On 09/02/2012 22:14, Ralph Castain wrote:
>> 
>> Hmmm….guess we'll have to play with it. Our need is to start with a core or 
>> some similar object, and quickly determine the closest IO device of a 
>> certain type. We wound up having to write "summarizer" code to parse the 
>> hwloc tree into a more OMPI-usable form, so we can always do that with the 
>> IO tree as well if necessary.
>> 
>> 
>> On Feb 9, 2012, at 2:09 PM, Brice Goglin wrote:
>> 
>>> That doesn't really work with the hwloc model unfortunately. Also, when you 
>>> get to smaller objects (cores, threads, ...) there are multiple "closest" 
>>> objects at each depth.
>>> 
>>> We have one "closest" object at some depth (usually Machine or NUMA node). 
>>> If you need something higher, you just walk the parent links. If you need 
>>> something smaller, you look at children.
>>> 
>>> Also, each I/O device isn't directly attached to such a closest object. 
>>> It's usually attached under some bridge objects. There's a tree of hwloc 
>>> PCI bus objects exactly like you have a tree of hwloc 
>>> sockets/cores/threads/etc. At the top of the I/O tree, one (bridge) object 
>>> is attached to a regular object as explained earlier. So, when you have a 
>>> random hwloc PCI object, you get its locality by walking up its parent link 
>>> until you find a non-I/O object (one whose cpuset isn't NULL). 
>>> hwloc/helper.h gives you hwloc_get_non_io_ancestor_obj() to do that.
>>> 
>>> Brice
>>> 
>>> 
>>> 
>>> On 09/02/2012 14:34, Ralph Castain wrote:
>>>> 
>>>> Ah, okay - in that case, having the I/O device attached to the "closest" 
>>>> object at each depth would be ideal from an OMPI perspective.
>>>> 
>>>> On Feb 9, 2012, at 6:30 AM, Brice Goglin wrote:
>>>> 
>>>>> The BIOS usually tells you which NUMA location is close to each 
>>>>> host-to-PCI bridge. So the answer is yes.
>>>>> Brice
>>>>> 
>>>>> 
>>>>> Ralph Castain <r...@open-mpi.org> wrote:
>>>>> I'm not sure I understand this comment. A PCI device is attached to the 
>>>>> node, not to any specific location within the node, isn't it? Can you 
>>>>> really say that a PCI device is "attached" to a specific NUMA location, 
>>>>> for example?
>>>>> 
>>>>> 
>>>>> On Feb 9, 2012, at 6:15 AM, Jeff Squyres wrote:
>>>>> 
>>>>>> That doesn't seem too attractive from an OMPI perspective, though.  We'd 
>>>>>> want to know where the PCI devices are actually rooted.
>>>>> 
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel