Hello, thank you for your comment.

Only the frontend was updated, directly via install.sh, from OFED 2.4.3 to
OFED 3.1.1.0, which includes Open MPI 1.8.8. The compute nodes still have the
older OFED 2.4 with Open MPI 1.6.4.

My question: is it possible to update OFED directly on the compute nodes by
running the OFED install.sh there, or is it recommended to add the rolls and
update the nodes that way?

Regards,
Sebastian

On 20 Nov 2016 03:15, "Gilles Gouaillardet" <gilles.gouaillar...@gmail.com>
wrote:

> Sebastian,
>
> The error message is pretty self-explanatory:
> /usr/mpi/gcc/openmpi-1.8.8/bin/orted is missing on your compute nodes.
>
> It seems you are using /usr/mpi/gcc/openmpi-1.8.8/bin/mpirun on your
> frontend node
> (e.g. the node on which mpirun is invoked)
> but Open MPI was not updated on some nodes listed in your nodes8
> machinefile.
>
> You likely want to contact your sysadmin and figure this out.
>
> Cheers,
>
> Gilles
>
> On Sat, Nov 19, 2016 at 4:22 PM, Sebastian Antunez N.
> <antunez.sebast...@gmail.com> wrote:
> > Hello Guys
> >
> > I have an HPC cluster and I updated OFED, firmware, etc.
> >
> > After the reboot, running mpirun -machinefile nodes8 -n 128
> > /home/HPL/run_hpl/xhpl shows the following error:
> >
> > bash: /usr/mpi/gcc/openmpi-1.8.8/bin/orted: No such file or directory
> > bash: /usr/mpi/gcc/openmpi-1.8.8/bin/orted: No such file or directory
> > bash: /usr/mpi/gcc/openmpi-1.8.8/bin/orted: No such file or directory
> > --------------------------------------------------------------------------
> > ORTE was unable to reliably start one or more daemons.
> > This usually is caused by:
> >
> > * not finding the required libraries and/or binaries on
> >   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> >   settings, or configure OMPI with --enable-orterun-prefix-by-default
> >
> > * lack of authority to execute on one or more specified nodes.
> >   Please verify your allocation and authorities.
> >
> > * the inability to write startup files into /tmp
> >   (--tmpdir/orte_tmpdir_base).
> >   Please check with your sys admin to determine the correct location to use.
> >
> > * compilation of the orted with dynamic libraries when static are required
> >   (e.g., on Cray). Please check your configure cmd line and consider using
> >   one of the contrib/platform definitions for your system type.
> >
> > * an inability to create a connection back to mpirun due to a
> >   lack of common network interfaces and/or no route found between
> >   them. Please check network connectivity (including firewalls
> >   and network routing requirements).
> >
> > Before the update I had version 1.6.4 and the cluster did not show errors
> > when I ran mpirun.
> >
> > I changed the environment variables but the error persists.
> >
> > Could you comment on how to resolve the issue?
> >
> > Regards
> >
> > Sebastian Antunez
> >
> > _______________________________________________
> > users mailing list
> > users@lists.open-mpi.org
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users
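
To see which nodes in the machinefile are actually missing orted, something along these lines might help. It is only a rough sketch, not a tested tool: it assumes passwordless ssh from the frontend to every node, a machinefile with one host per line (optionally followed by fields such as slots=N), and the orted path from the error message above. The function names are made up for illustration.

```python
import subprocess

# Path copied from the error message; adjust if your install prefix differs.
ORTED = "/usr/mpi/gcc/openmpi-1.8.8/bin/orted"


def parse_machinefile(text):
    """Return the unique host names of a machinefile, in order of appearance."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        host = line.split()[0]  # ignore trailing fields such as "slots=16"
        if host not in hosts:
            hosts.append(host)
    return hosts


def check_command(host, path=ORTED):
    """Build the ssh command that tests whether `path` is executable on `host`."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, "test", "-x", path]


def hosts_missing_orted(hosts, path=ORTED, run=subprocess.run):
    """Return hosts where the check fails (file missing, or ssh itself failed)."""
    return [h for h in hosts if run(check_command(h, path)).returncode != 0]
```

From the frontend it could be used as hosts_missing_orted(parse_machinefile(open("nodes8").read())). Note that a node that is unreachable over ssh is reported the same way as a node without orted, so any host in the result needs a closer look.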