On 03/12/2010 21:42, Bernd Kallies wrote:
>> We should really encourage people to use XML in such cases. Setting
>> HWLOC_XMLFILE=/path/to/exported/file.xml in the environment should just
>> work (as long as you update the XML file after major hwloc releases or
>> OS updates). Maybe we should add a dedicated section about this in the
>> documentation? Something like "Speeding up hwloc on large nodes"? And
>> maybe even encourage distro packagers to create an XML export file
>> under /var/lib, with advice to add HWLOC_XMLFILE to /etc/environment
>> if they care about hwloc/HPC?
>>
>> Anyway Bernd, can you export an XML on this nice machine and reload it
>> and see how long it takes? I hope all the bottlenecks are in the Linux
>> backend parsing /sys and /proc, not in the actual hwloc core.
>>
> I'm not sure if I understood. From my point of view it makes no sense to
> create an XML representation of the topology with hwloc, and then read
> this XML in again to be able to dive into it to figure out something.
> When there is an API that provides direct access to parts of the
> topology once it is constructed, then the XML thing is useless
> additional work.
>

Don't see the XML as a way to represent the topology and traverse it.
Just see it as a cache that you can read much faster than /proc and
/sys. Once you load the XML, you get the usual hwloc API.

> But this would not help us in many
> of our use cases. We have to analyze topologies that do not represent a
> whole machine. We analyze topologies that are bound to cpusets. We do
> this e.g. to construct pinning schemes for MPI applications that run
> inside of batch jobs, which get their cpusets created on the fly
> depending on their resource requests and current load of the machine.
>

Right, if the cpuset changes, caching in XML is useless (unless we
implement a way to restrict a given topology in the future).

> The
> problem here is rather whether every task running on a node should read
> the topology and figure out on which CPU it should pin itself, or
> whether one master task on the node does this and communicates the
> result to the others. But this is outside of hwloc.
>

Well, having hundreds of processes read /proc and /sys at the same time
is another reason to use XML. The master can read the topology once and
pass it to all other processes through XML files or XML buffers over a
socket. I assume that's what Open MPI will do in the near future.

Brice
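
To make the "XML buffers over a socket" idea concrete, here is a minimal sketch in Python of the transport part only. It does not call hwloc: PLACEHOLDER_XML is a made-up stand-in for a real export, and the helper names (send_buffer, recv_buffer, distribute) are hypothetical. In a real setup the master would presumably get the buffer from hwloc_topology_export_xmlbuffer() and each worker would hand the received bytes to hwloc_topology_set_xmlbuffer() before loading the topology. The framing (a 4-byte length prefix) is an assumption of this sketch, not something hwloc prescribes.

```python
import socket
import struct

# Made-up stand-in for an exported topology. This is NOT real hwloc output;
# a real master would export the buffer through the hwloc C API instead.
PLACEHOLDER_XML = b"<topology><!-- placeholder for an hwloc XML export --></topology>"


def send_buffer(sock, payload):
    """Send a 4-byte big-endian length prefix followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes, looping because recv() may return less."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed before the full buffer arrived")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_buffer(sock):
    """Read one length-prefixed buffer, as written by send_buffer()."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def distribute(xml, worker_count):
    """Master sends `xml` once per worker; returns what each worker received.

    Both ends live in one process here to keep the sketch self-contained,
    which only works because the payload fits in the socket buffer. Real
    workers would be separate processes holding their own socket end.
    """
    received = []
    for _ in range(worker_count):
        master_end, worker_end = socket.socketpair()
        with master_end, worker_end:
            send_buffer(master_end, xml)
            received.append(recv_buffer(worker_end))
    return received


if __name__ == "__main__":
    copies = distribute(PLACEHOLDER_XML, 4)
    print(len(copies), "workers received", len(copies[0]), "bytes each")
```

The point of the design is that only the master touches /proc and /sys; the workers rebuild the same topology from the bytes they receive and then use the normal hwloc API on it.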