I ran ibstat on the head node; its output is attached.

2017-02-17 12:16 GMT+03:00 Brice Goglin <brice.gog...@inria.fr>:
> For some reason, lstopo didn't find any InfiniBand information on the head
> node. I guess running lstopo won't show any "mlx4_0" or "ib0" object. Is
> the InfiniBand service really running on that machine?
>
> Brice
>
>
> On 17/02/2017 10:04, Михаил Халилов wrote:
>
> All files are attached. I ran netloc_ib_gather_raw with these parameters:
> netloc_ib_gather_raw /home/halilov/mycluster-data/
> --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --verbose --sudo
>
> 2017-02-17 11:55 GMT+03:00 Brice Goglin <brice.gog...@inria.fr>:
>
>> Please copy-paste the exact command line of your "netloc_ib_gather_raw"
>> and all the messages it printed. And also send the output of the hwloc
>> directory it created (it will contain the lstopo XML output of the node
>> where you ran the command).
>>
>> Brice
>>
>>
>> On 17/02/2017 09:51, Михаил Халилов wrote:
>>
>> I installed the nightly tarball, but it still isn't working. The output
>> of ibnetdiscover and ibroute is attached. Maybe it will help...
>> What could be the problem?
>>
>> Best regards,
>> Mikhail Khalilov
>>
>> 2017-02-17 9:53 GMT+03:00 Brice Goglin <brice.gog...@inria.fr>:
>>
>>> Hello
>>>
>>> As indicated on the netloc webpages, the netloc development now occurs
>>> inside the hwloc git tree. netloc v0.5 is obsolete even if hwloc 2.0
>>> isn't released yet.
>>>
>>> If you want to use a development snapshot, take hwloc nightly tarballs
>>> from https://ci.inria.fr/hwloc/job/master-0-tarball/ or
>>> https://www.open-mpi.org/software/hwloc/nightly/master/
>>>
>>> Regards
>>> Brice
>>>
>>>
>>> On 16/02/2017 19:15, miharuli...@gmail.com wrote:
>>> > I downloaded the gzipped tarball from the Open MPI site here:
>>> > https://www.open-mpi.org/software/netloc/v0.5/
>>> >
>>> > There are three identical machines in my cluster, but the third node
>>> > is broken right now, so I tried on two machines. They are all
>>> > connected by an InfiniBand switch, and when I use ibnetdiscover or
>>> > ibroute, it works perfectly...
>>> >
>>> > Sent from my iPad
>>> >
>>> >> On 16 Feb 2017, at 18:40, Cyril Bordage <cyril.bord...@inria.fr>
>>> >> wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> What version did you use?
>>> >>
>>> >> I pushed some commits on master on the ompi repository. With this
>>> >> version it seems to work.
>>> >> You have two machines because you tried netloc on these two?
>>> >>
>>> >>
>>> >> Cyril.
>>> >>
>>> >>> On 15/02/2017 at 22:44, miharulidze wrote:
>>> >>> Hi!
>>> >>>
>>> >>> I'm trying to use the NetLoc tool to detect my cluster topology.
>>> >>>
>>> >>> I have a 2-node cluster with AMD processors, connected by
>>> >>> InfiniBand. I also installed the latest versions of the hwloc and
>>> >>> netloc tools.
>>> >>>
>>> >>> I followed the netloc instructions, and when I tried to use
>>> >>> netloc_ib_gather_raw as root, I received this message:
>>> >>>
>>> >>> root:$ netloc_ib_gather_raw
>>> >>> --out-dir=/home/halilov/mycluster-data/result/
>>> >>> --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --sudo
>>> >>>
>>> >>> Found 0 subnets in hwloc directory:
>>> >>>
>>> >>>
>>> >>> There are two files in /home/halilov/mycluster-data/hwloc/
>>> >>> generated by hwloc: head.xml and node01.xml
>>> >>>
>>> >>> P.S. An archive with the .xml files is attached.
>>> >>>
>>> >>>
>>> >>> Best regards,
>>> >>> Mikhail Khalilov
>>> >>>
>>> >>>
>>> >>> _______________________________________________
>>> >>> hwloc-users mailing list
>>> >>> hwloc-users@lists.open-mpi.org
>>> >>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
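
For what it's worth, Brice's point about lstopo not showing any "mlx4_0" or "ib0" object can also be checked against the saved XML files directly. Below is a minimal Python sketch, not part of hwloc or netloc. It assumes the lstopo XML stores device names in the `name` attribute of `<object>` elements; the function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for an lstopo XML export.
# Assumption: device names live in the "name" attribute of <object> elements.
SAMPLE_XML = """<topology>
  <object type="Machine">
    <object type="PCIDev">
      <object type="OSDev" name="mlx4_0"/>
      <object type="OSDev" name="ib0"/>
    </object>
  </object>
</topology>"""


def find_device_names(xml_text, wanted=("mlx4_0", "ib0")):
    """Return the wanted device names that appear in the XML, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter("object"):
        name = elem.get("name")
        if name in wanted and name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    # An empty list would match the "no InfiniBand information" symptom.
    print(find_device_names(SAMPLE_XML))
```

To try it on real data, read head.xml or node01.xml into a string and pass it in place of SAMPLE_XML.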
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 1
        Firmware version: 2.7.626
        Hardware version: b0
        Node GUID: 0x0002c903004a9972
        System image GUID: 0x0002c903004a9975
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 1
                LMC: 0
                SM lid: 1
                Capability mask: 0x0251086a
                Port GUID: 0x0002c903004a9973
                Link layer: IB
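
In case it helps when comparing nodes, here is a small Python sketch that reads ibstat text like the output above and reports whether the port is up. It is only an illustration: the function names are hypothetical, it assumes a single-port adapter (with several ports, later "key: value" lines would overwrite earlier ones), and the sample is a shortened copy of the attached output.

```python
# Shortened copy of the ibstat output from the head node.
SAMPLE_IBSTAT = """CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 1
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Link layer: IB
"""


def parse_ibstat(text):
    """Collect 'key: value' lines into a dict; lines without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def port_is_up(fields):
    """True when the logical state is Active and the physical state is LinkUp."""
    return (fields.get("State") == "Active"
            and fields.get("Physical state") == "LinkUp")


if __name__ == "__main__":
    info = parse_ibstat(SAMPLE_IBSTAT)
    print(info["CA type"], "port up:", port_is_up(info))
```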