Hooray!

On Dec 30, 2010, at 9:57 AM, Michael Di Domenico wrote:

> I think I take it all back.  I just tried it again and it seems to
> work now.  I'm not sure what I changed (between my first and this
> msg), but it does appear to work now.
> 
> On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
> <mdidomeni...@gmail.com> wrote:
>> Yes that's true, error messages help.  I was hoping there was some
>> documentation to see what I've done wrong.  I can't easily cut and
>> paste errors from my cluster.
>> 
>> Here's a snippet (hand typed) of the error message, but it does look
>> like a rank communications error
>> 
>> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
>> contact information is unknown in file rml_oob_send.c at line 145.
>> *** MPI_INIT failure message (snipped) ***
>> orte_grpcomm_modex failed
>> --> Returned "A message is attempting to be sent to a process whose
>> contact information is unknown" (-117) instead of "Success" (0)
>> 
>> This msg repeats for each rank, and ultimately hangs the srun, which I
>> have to Ctrl-C and terminate
>> 
>> I have MpiPorts defined in my slurm config and running srun with
>> --resv-ports does show the SLURM_RESV_PORTS environment variable
>> getting passed to the shell
>> 
>> 
>> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> I'm not sure there is any documentation yet - not much clamor for it. :-/
>>> 
>>> It would really help if you included the error message. Otherwise, all I 
>>> can do is guess, which wastes both of our time :-(
>>> 
>>> My best guess is that the port reservation didn't get passed down to the 
>>> MPI procs properly - but that's just a guess.
>>> 
>>> 
>>> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>>> 
>>>> Can anyone point me towards the most recent documentation for using
>>>> srun and openmpi?
>>>> 
>>>> I followed what I found on the web with enabling the MpiPorts config
>>>> in slurm and using the --resv-ports switch, but I'm getting an error
>>>> from openmpi during setup.
>>>> 
>>>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>>> 
>>>> I'm sure I'm missing a step.
>>>> 
>>>> Thanks
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> 
>>> 
>> 
> 

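One way to test the guess made in the thread (that the port reservation did not reach the MPI procs) is to have every task report what it sees in SLURM_RESV_PORTS. The following is a minimal sketch, not something taken from the thread: the function names and the script name are hypothetical, and the value format (a range such as "12000-12003", possibly comma-separated) is an assumption that should be checked against your Slurm version.

```python
import os


def parse_resv_ports(value):
    """Expand a port list such as "12000-12003" or "12000,12002-12003".

    The format is an assumption; verify it against your Slurm version.
    """
    ports = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            ports.extend(range(int(low), int(high) + 1))
        else:
            ports.append(int(part))
    return ports


def describe_reservation(environ=os.environ):
    """Return a one-line summary of the reservation visible to this process."""
    value = environ.get("SLURM_RESV_PORTS")
    if not value:
        return "SLURM_RESV_PORTS is not set in this process"
    ports = parse_resv_ports(value)
    return "SLURM_RESV_PORTS=%s (%d ports)" % (value, len(ports))


if __name__ == "__main__":
    print(describe_reservation())
```

Saved as, say, check_ports.py (a hypothetical name), it could be launched with something like "srun --resv-ports -n 4 python check_ports.py". If any task reports the variable as unset, the reservation is not reaching that task.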
