Hello
Indeed, we would like to expose this kind of information, but Netloc is unfortunately short on manpower these days. The code in git master is outdated. We have a big rework in a branch, but it still needs quite a lot of polishing before being merged.

The API is still mostly Scotch-oriented (i.e. for process placement using communication graphs), because that is pretty much the only clear user request we have received in recent years (most people said "we want netloc" but never gave any idea of what API they actually needed). Of course, there will be a way to say "I want the entire machine" or "only my allocated nodes". The non-Scotch API for exposing topology details has been made private until we better understand what users want. Your request would definitely help there.

Brice

On 19/08/2019 at 09:31, Rigel Falcao do Couto Alves wrote:
> Thanks John and Jeff for the replies.
>
> Indeed, we are using Slurm on our cluster; so, for now, I can stick with reading the network topology description file at runtime, as explained here:
>
> https://slurm.schedmd.com/topology.conf.html
>
> But given that the idea of the project is to produce a library that can be distributed to anyone in the world, it would still be worth having a way to gather such information on the go -- as I can already do with /hwloc/'s topology information. Starting simple is no problem, i.e. supporting only single-path hierarchies in the beginning.
>
> The additional /switch/ information (coming from /netloc/) would then be added to the graphical output of our tools, allowing users to visually analyse how resource placement (both /intra/- and /inter/-node) affects their applications.
>
> ------------------------------------------------------------------------
> *From:* hwloc-users <hwloc-users-boun...@lists.open-mpi.org> on behalf of John Hearns via hwloc-users <hwloc-users@lists.open-mpi.org>
> *Sent:* Friday, 16 August 2019 07:16
> *To:* Hardware locality user list
> *Cc:* John Hearns
> *Subject:* Re: [hwloc-users] Netloc feature suggestion
>
> Hi Rigel. This is very interesting.
> First though I should say - most batch systems have built-in node grouping utilities.
> PBSPro has bladesets - I think they are called placement groups now. I used these when running CFD codes in a Formula 1 team.
> The systems administrator has to set these up manually, using knowledge of the switch topology.
> In PBSPro, jobs would then 'prefer' to run within the smallest bladeset which could accommodate them.
> So you define bladesets for (say) 8/16/24/48-node jobs.
>
> https://pbspro.atlassian.net/wiki/spaces/PD/pages/455180289/Finer+grained+node+grouping
>
> Similarly for Slurm:
> https://slurm.schedmd.com/topology.html
>
> On Wed, 14 Aug 2019 at 18:53, Rigel Falcao do Couto Alves <rigel.al...@tu-dresden.de> wrote:
>
> Hi,
>
> I am doing a PhD in performance analysis of highly parallel CFD codes and would like to suggest a feature for Netloc: from the topic /Build Scotch sub-architectures/ (at https://www.open-mpi.org/projects/hwloc/doc/v2.0.3/a00329.php), create a function version of /netloc_get_resources/ which could retrieve, at runtime, the network details of the available cluster resources (i.e. the nodes allocated to the job).
> I am mostly interested in how many switches (the gray circles in the figure below) need to be traversed for any pair of allocated nodes to communicate with each other:
>
> [removed 200kB image]
>
> For example, suppose my job is running on 4 nodes of the cluster, illustrated by the numbers above. All I would love to get from Netloc - at runtime - is some sort of classification of the nodes, like:
>
> 1: aa
> 2: ab
> 3: ba
> 4: ca
>
> The difference between nodes 1 and 2 is in the last digit, which means their MPI communications only need to traverse 1 switch; however, between either of them and nodes 3 or 4, the difference starts at the second-last digit, which means their communications need to traverse 2 switches. More digits may be prepended to the string as necessary, e.g. if the central gray circle in the figure above is connected to another switch, which in turn leads to another part of the cluster's structure (with its own switches, nodes, etc.). For me, it is at the present moment irrelevant whether e.g. nodes 1 and 2 are physically - or logically - consecutive to each other: /a/, /b/, /c/ etc. would be just arbitrary identifiers.
>
> I would then use this data to plot the process placement, using open-source tools developed here at the University of Dresden (Germany); i.e. Scotch is not an option for me. The results of my study will be open-source as well, and I can gladly share them with you once the thesis is finished.
>
> I hope I have clearly explained what I have in mind; please let me know if there are any questions. Finally, it is important that this feature is part of Netloc's API (as it is supposed to be integrated with the tools we develop here), works at runtime, and does not require root privileges (as those tools are used by our cluster's customers in their everyday job submissions).
>
> Kind regards,
>
> --
> Dipl.-Ing. Rigel Alves
> researcher
>
> Technische Universität Dresden
> Center for Information Services and High Performance Computing (ZIH)
> Zellescher Weg 12 A 218, 01069 Dresden | Germany
>
> 📞 +49 (351) 463.42418
> 🌐 https://tu-dresden.de/zih/die-einrichtung/struktur/rigel-alves
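As a concrete reading of the labelling scheme quoted above, here is a minimal sketch in C; the label strings and the switch_distance() helper are purely illustrative assumptions, not part of any existing Netloc API. Following the message's rule, the number of switches to traverse is taken to be the number of label positions from the first differing character to the end.

#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given two hierarchical node labels of equal length
 * (as in the example "aa", "ab", "ba", "ca"), return the number of switches
 * an MPI message would traverse, per the rule described in the message:
 * a difference in the last character means 1 switch, a difference starting
 * at the second-to-last character means 2 switches, and so on.
 * Identical labels (same node) yield 0. */
static int switch_distance(const char *a, const char *b)
{
    size_t len = strlen(a);
    if (len != strlen(b))
        return -1;                 /* labels are assumed to have equal length */
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            return (int)(len - i); /* first differing position decides the distance */
    return 0;                      /* same label, same node */
}

int main(void)
{
    /* Labels for the four example nodes from the message. */
    const char *labels[] = { "aa", "ab", "ba", "ca" };
    const int nnodes = 4;

    for (int i = 0; i < nnodes; i++)
        for (int j = i + 1; j < nnodes; j++)
            printf("nodes %d and %d: %d switch(es)\n",
                   i + 1, j + 1, switch_distance(labels[i], labels[j]));
    return 0;
}

For the four example labels this prints 1 switch for the pair (1, 2) and 2 switches for every other pair, matching the description in the quoted message.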
_______________________________________________
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users