Blah -- this is a segv that occurred while trying to print a help
message. The help message you should have gotten was:

-----
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_sds_base_select failed
  --> Returned value ?? instead of ORTE_SUCCESS
-----

I'll look into why this happened (segv instead of printing the message).
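
For illustration only: the backtrace in the quoted message below shows
strlen() called from vfprintf(), which is the pattern typically seen when a
"%s" conversion is handed a NULL or otherwise invalid pointer. I have not
confirmed that this is the cause here. The following is a minimal sketch of
a defensive guard under that assumption; safe_str() and format_help() are
hypothetical names for this example, not Open MPI functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replace a NULL string argument with a visible
 * placeholder, so a "%s" conversion never dereferences a NULL pointer. */
static const char *safe_str(const char *s)
{
    return (s != NULL) ? s : "(null)";
}

/* Hypothetical stand-in for a help-message formatter. Both string
 * arguments are guarded before they reach the printf-family call.
 * Returns the snprintf() result: the length the full message needs. */
static int format_help(char *buf, size_t len,
                       const char *what, const char *value)
{
    return snprintf(buf, len,
                    "  %s failed\n"
                    "  --> Returned value %s instead of ORTE_SUCCESS\n",
                    safe_str(what), safe_str(value));
}
```

With this guard, a call such as format_help(buf, sizeof(buf),
"orte_sds_base_select", NULL) produces a message containing "(null)"
instead of passing a NULL pointer to a "%s" conversion.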

However, the real issue is why you got this error in the first place.
What version of OMPI were you running (a nightly tarball, an rc tarball,
etc.)?  What run-time environment were you using -- a batch scheduler or
simple rsh/ssh?  Can you send the information listed in
http://www.open-mpi.org/community/help/ ?



On Thu, 2005-10-20 at 22:43 -0500, Troy Benjegerdes wrote:
> Anyone know what's up here?
> 
> troy@opteron1:~$ mpirun -np 2 hostname
> [opteron1.scl.ameslab.gov:01865] [NO-NAME] ORTE_ERROR_LOG: Not found in
> file ../../../ompi-svn_v1.0/orte/runtime/orte_init_stage1.c at line 212
> Segmentation fault
> troy@opteron1:~$ gdb
> -bash: gdb: command not found
> troy@opteron1:~$ gdb mpirun
> GNU gdb 6.3-debian
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you
> are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for
> details.
> This GDB was configured as "x86_64-linux"...Using host libthread_db
> library "/lib/libthread_db.so.1".
> 
> (gdb) run -np 2 hostname
> Starting program: /usr/local/bin/mpirun -np 2 hostname
> [Thread debugging using libthread_db enabled]
> [New Thread 46912509168352 (LWP 7636)]
> [opteron1.scl.ameslab.gov:07636] [NO-NAME] ORTE_ERROR_LOG: Not found in
> file ../../../ompi-svn_v1.0/orte/runtime/orte_init_stage1.c at line 212
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 46912509168352 (LWP 7636)]
> 0x00002aaaab3279d0 in strlen () from /lib/libc.so.6
> (gdb) bt
> #0  0x00002aaaab3279d0 in strlen () from /lib/libc.so.6
> #1  0x00002aaaab2fa158 in vfprintf () from /lib/libc.so.6
> #2  0x00002aaaab31931d in vasprintf () from /lib/libc.so.6
> #3  0x00002aaaab50b150 in output () from /usr/local/lib/libopal.so.0
> #4  0x00002aaaab50ae14 in opal_show_help () from
> /usr/local/lib/libopal.so.0
> #5  0x00002aaaaabd2a8d in orte_init_stage1 () from
> /usr/local/lib/liborte.so.0
> #6  0x00002aaaaabd594a in orte_system_init () from
> /usr/local/lib/liborte.so.0
> #7  0x00002aaaaabd2969 in orte_init () from /usr/local/lib/liborte.so.0
> #8  0x00000000004021d3 in orterun (argc=4, argv=0x7fffffd242a8)
>     at ../../../../ompi-svn_v1.0/orte/tools/orterun/orterun.c:294
> #9  0x0000000000401f93 in main (argc=4, argv=0x7fffffd242a8)
>     at ../../../../ompi-svn_v1.0/orte/tools/orterun/main.c:13
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/
