Hooray!

On Dec 30, 2010, at 9:57 AM, Michael Di Domenico wrote:

> I think I take it all back. I just tried it again and it seems to
> work now. I'm not sure what I changed (between my first and this
> msg), but it does appear to work now.
>
> On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
> <mdidomeni...@gmail.com> wrote:
>> Yes, that's true, error messages help. I was hoping there was some
>> documentation to see what I've done wrong. I can't easily cut and
>> paste errors from my cluster.
>>
>> Here's a snippet (hand typed) of the error message, but it does look
>> like a rank communications error
>>
>> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
>> contact information is unknown in file rml_oob_send.c at line 145.
>> *** MPI_INIT failure message (snipped) ***
>> orte_grpcomm_modex failed
>> --> Returned "A message is attempting to be sent to a process whose
>> contact information is unknown" (-117) instead of "Success" (0)
>>
>> This msg repeats for each rank, and ultimately hangs the srun, which I
>> have to Ctrl-C and terminate
>>
>> I have MpiPorts defined in my slurm config, and running srun with
>> --resv-ports does show the SLURM_RESV_PORTS environment variable
>> getting passed to the shell
>>
>>
>> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> I'm not sure there is any documentation yet - not much clamor for it. :-/
>>>
>>> It would really help if you included the error message. Otherwise, all I
>>> can do is guess, which wastes both of our time :-(
>>>
>>> My best guess is that the port reservation didn't get passed down to the
>>> MPI procs properly - but that's just a guess.
>>>
>>>
>>> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>>>
>>>> Can anyone point me towards the most recent documentation for using
>>>> srun and openmpi?
>>>>
>>>> I followed what I found on the web with enabling the MpiPorts config
>>>> in slurm and using the --resv-ports switch, but I'm getting an error
>>>> from openmpi during setup.
>>>>
>>>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>>>
>>>> I'm sure I'm missing a step.
>>>>
>>>> Thanks
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
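
For anyone who hits the same modex error, one way to test the guess that the port reservation is not reaching the MPI procs is to print SLURM_RESV_PORTS from inside each task. This is only a rough sketch, not anything from the Open MPI or Slurm docs. It assumes the variable holds a comma-separated list of ports or low-high ranges (for example "12000-12003"); that format, the helper names, and the script name below are assumptions made for illustration.

```python
import os


def parse_resv_ports(value):
    """Expand a string such as "12000-12002,12005" into a list of ports.

    The comma/range format is an assumption about SLURM_RESV_PORTS.
    """
    ports = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            ports.extend(range(int(low), int(high) + 1))
        else:
            ports.append(int(part))
    return ports


def describe_reservation(environ=os.environ):
    """Return a one-line summary of the reservation seen by this task."""
    value = environ.get("SLURM_RESV_PORTS")
    rank = environ.get("SLURM_PROCID", "?")  # task rank set by srun
    if not value:
        return f"task {rank}: SLURM_RESV_PORTS is not set"
    ports = parse_resv_ports(value)
    return f"task {rank}: {len(ports)} reserved port(s): {value}"


if __name__ == "__main__":
    print(describe_reservation())
```

Saved as, say, check_resv_ports.py (a hypothetical name), it could be launched with "srun --resv-ports python check_resv_ports.py". Any task that reports the variable as not set would point at the reservation being lost before the MPI procs start.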