Hello
For hwloc, we indeed need something to help applications bind tasks
and/or memory buffers near the network device. On Linux, the "name" of
the kernel object ("hfi1_0", "usnic2", etc) is often enough because we
can walk from it to PCI objects under sysfs, and then get locality
information from there. But it's not that easy:
* Some proprietary drivers don't expose anything in sysfs, hence there's
no such name. I assume you'll have at least some sort of "device index".
* It's not clear we'll get such a name on non-Linux systems.
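
To make the sysfs walk concrete, here is a minimal sketch of going from a kernel object name to the PCI bus ID behind it. It is only an illustration, not what hwloc actually does internally. The function name, the default "infiniband" class directory, and the sysfs_root parameter are assumptions made for the example; other drivers may register under a different class, and drivers that expose nothing in sysfs will simply yield None.

```python
import os
import re

# PCI bus IDs look like "0000:81:00.0" (domain:bus:device.function).
PCI_BUSID_RE = re.compile(r"[0-9a-fA-F]{4,}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def pci_busid_from_kernel_name(name, subsystem="infiniband", sysfs_root="/sys"):
    """Return the PCI bus ID behind a kernel device name, or None.

    Hypothetical helper: 'subsystem' is the sysfs class directory the
    driver is assumed to register under (an assumption, not a given).
    """
    device_link = os.path.join(sysfs_root, "class", subsystem, name, "device")
    if not os.path.exists(device_link):
        # No sysfs entry: proprietary driver, wrong class, or not Linux.
        return None

    # The "device" entry is a symlink into /sys/devices/...; resolve it.
    path = os.path.realpath(device_link)

    # Walk upward until a path component looks like a PCI bus ID.
    while True:
        base = os.path.basename(path)
        if PCI_BUSID_RE.fullmatch(base):
            return base
        parent = os.path.dirname(path)
        if parent == path:
            return None  # reached the filesystem root without a match
        path = parent


if __name__ == "__main__":
    # Example call; the result depends entirely on the local machine.
    print(pci_busid_from_kernel_name("hfi1_0"))
```

Walking upward rather than taking the symlink target directly is a design choice for the sketch: it still finds a PCI ancestor when the kernel object hangs off an intermediate (non-PCI) device.
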
Another solution is to give us the PCI bus ID, which should work for all
physical hardware, though not for bonded devices and the like.
Having both the Linux kernel name and the PCI bus ID would be nice.
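
For reference, this is roughly what a consumer could do on Linux once it has a PCI bus ID: read the numa_node and local_cpulist attributes of the PCI device in sysfs. Again a sketch under assumptions rather than hwloc's real code; the function names and the sysfs_root parameter are made up for the example, and the bus ID in the usage line is just a placeholder.

```python
import os


def parse_cpulist(text):
    """Expand a sysfs cpulist such as '0-3,8-11' into a list of CPU indices."""
    cpus = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # empty list, e.g. attribute present but blank
        first, _, last = chunk.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def pci_locality(busid, sysfs_root="/sys"):
    """Return NUMA node and local CPUs for a PCI bus ID (hypothetical helper).

    Values are None when the attribute cannot be read. A numa_node of -1
    means the kernel has no NUMA information for the device.
    """
    dev = os.path.join(sysfs_root, "bus", "pci", "devices", busid)

    def read_attr(attr):
        try:
            with open(os.path.join(dev, attr)) as f:
                return f.read().strip()
        except OSError:
            return None

    numa = read_attr("numa_node")
    cpulist = read_attr("local_cpulist")
    return {
        "numa_node": int(numa) if numa is not None else None,
        "local_cpus": parse_cpulist(cpulist) if cpulist is not None else None,
    }


if __name__ == "__main__":
    # Placeholder bus ID; substitute one reported for the actual device.
    print(pci_locality("0000:81:00.0"))
```

An application could then bind its tasks to the returned CPUs, or allocate buffers on the returned NUMA node, using whatever binding interface it prefers.
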
Brice
On 04/05/2018 at 14:28, Jeff Squyres (jsquyres) wrote:
> 2. Another topic that comes up not infrequently is the ability to correlate a
> fabric/domain/endpoint to some other corresponding Linux entity, such as an
> IP interface and/or PCI device (if relevant). This obviously doesn't work
> for fabrics/domains/endpoints that represent emulation devices, may be tricky
> for bonded devices, ...etc. But there are many providers that create
> fabrics/domains/endpoints that directly correlate with a specific Linux
> device. Tools like hwloc (and therefore Open MPI) could definitely use this
> information for determining locality, especially where short message latency
> matters.
>
> Some sort of optional fabric/domain/endpoint correlation to a Linux device
> would be genuinely useful.
>
> I honestly haven't given a ton of thought to either of these other than "that
> would be useful"; apologies if this is somewhat half-baked.
>
>
>> On May 3, 2018, at 4:45 PM, Hefty, Sean <[email protected]> wrote:
>>
>> There has been a long outstanding set of requests to obtain HW specific data
>> from libfabric. A side discussion brought this topic up again, so I'd like
>> to at least put it on the agenda as a possible feature for 1.7. As a point
>> of reference, Cisco has implemented a set of provider specific ops to
>> retrieve device specific data. It's fairly simple, and details are here:
>>
>> https://github.com/cisco/usnic_tools/blob/master/usnic_devinfo.c
>>
>> This feature would obviously only apply to providers that are directly
>> associated with some sort of HW device.
>>
>> What I would like to start to collect is a list of what sort of attributes
>> would be desirable to report, or what applications or users could make use
>> of.
>>
>> - Sean
>> _______________________________________________
>> ofiwg mailing list
>> [email protected]
>> http://lists.openfabrics.org/mailman/listinfo/ofiwg
>