On 02/12/2010 22:25, Bernd Kallies wrote:
>
>> Do you have any feel for if there are particular bottlenecks in hwloc /
>> lstopo that make it take so long? I wonder if we should just attack those
>> (if possible)...? Samuel and Brice have done all the work in the guts of
>> the API, so they might know offhand if there are places that can be
>> optimized or not...
>>
> Hmm. I did no profiling. The machines in question have 64 NUMA nodes
> with 16 logical CPUs, each. The topology depth is 10. So parsing
> of /sys/devices/system/node/* and evaluating the distance matrix to
> fiddle out the topology tree should be quite expensive. But I guess this
> statement is trivial and does not help very much.
>
We should really encourage people to use XML in such cases. Setting
HWLOC_XMLFILE=/path/to/exported/file.xml in the environment should just
work (as long as you regenerate the XML file after major hwloc releases
or OS upgrades).

Maybe we should add a dedicated section about this in the documentation?
Something like "Speeding up hwloc on large nodes"? And maybe even
encourage distro packagers to create an XML export file under /var/lib,
with advice to add HWLOC_XMLFILE to /etc/environment if they care about
hwloc/HPC?

Anyway Bernd, can you export an XML file on this nice machine, reload it,
and see how long it takes? I hope all the bottlenecks are in the Linux
backend parsing /sys and /proc, not in the actual hwloc core.

By the way, we're not the only project with little scalability problems
on very large machines: https://lkml.org/lkml/2010/12/3/19 :)

Brice
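
PS: a rough sketch of how the export/reload timing could be measured,
written in Python. It assumes lstopo is on the PATH and that giving it an
output filename ending in .xml exports the topology as XML. The helper
name run_timed and the file names in the comments are made up for
illustration; nothing here is part of hwloc itself.

```python
import os
import subprocess
import time


def run_timed(cmd, env_changes=None):
    """Run cmd and return (elapsed_seconds, stdout).

    env_changes maps variable names to values; a value of None removes
    the variable from the child's environment.
    """
    env = dict(os.environ)
    for name, value in (env_changes or {}).items():
        if value is None:
            env.pop(name, None)
        else:
            env[name] = value
    start = time.perf_counter()
    result = subprocess.run(cmd, env=env, check=True,
                            capture_output=True, text=True)
    return time.perf_counter() - start, result.stdout


# Hypothetical usage on the large machine (assumes lstopo is installed and
# that the output files do not exist yet):
#
#   # 1. Native discovery (parses /sys and /proc), exporting to XML.
#   t_native, _ = run_timed(["lstopo", "topology.xml"],
#                           {"HWLOC_XMLFILE": None})
#
#   # 2. Same command, but hwloc loads the topology from the exported XML.
#   t_xml, _ = run_timed(["lstopo", "reloaded.xml"],
#                        {"HWLOC_XMLFILE": os.path.abspath("topology.xml")})
#
#   print(f"native: {t_native:.2f}s  from XML: {t_xml:.2f}s")
```

If the second run is much faster, the time is going into the Linux
backend rather than the hwloc core. A single run of each is only a crude
measurement, so repeating both a few times would give a fairer picture.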