Rongliang Chen <[email protected]> writes:

> Hi Jed,
>
> I have not found a way to "dump core on selected ranks" yet, and I will
> keep looking.

Ask the administrators at your facility.  There are a few common ways,
but I'm not going to play a guessing game on the mailing list.
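
For illustration only, one such way is a small launcher that adjusts the core-file limit per rank and then execs the real program. This is a sketch under assumptions, not a recommendation for your machine: it assumes Open MPI (which exports OMPI_COMM_WORLD_RANK to each process), a nonzero hard core limit on the compute nodes, and a batch system that does not override limits afterwards. The names CORE_RANKS and core_wrapper.py are hypothetical.

```python
import os
import resource
import sys

# Hypothetical selection: the ranks that should be allowed to write core files.
CORE_RANKS = {0, 3}


def rank_from_env(env):
    """Return the rank set by Open MPI's launcher, or None if it is absent."""
    value = env.get("OMPI_COMM_WORLD_RANK")
    return int(value) if value is not None else None


def core_soft_limit(rank, selected, hard):
    """Soft RLIMIT_CORE to use: the hard limit on selected ranks, else 0."""
    return hard if rank in selected else 0


def apply_core_limit(rank, selected):
    """Set the soft core limit for this process and return the new value."""
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    new_soft = core_soft_limit(rank, selected, hard)
    resource.setrlimit(resource.RLIMIT_CORE, (new_soft, hard))
    return new_soft


# Run only when a command to launch was given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    apply_core_limit(rank_from_env(os.environ), CORE_RANKS)
    # Resource limits are inherited across exec, so the application keeps them.
    os.execvp(sys.argv[1], sys.argv[1:])
```

It would be used as something like "mpiexec -n 14 python3 core_wrapper.py ./app -your_options". If the hard limit is 0, this cannot enable core dumps, which is one more reason to ask the administrators.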

> I ran my code with the option "-on_error_attach_debugger" and got the
> following message:
>
> --------------------------------------------------------------------------
> An MPI process has executed an operation involving a call to the
> "fork()" system call to create a child process.  Open MPI is currently
> operating in a condition that could result in memory corruption or
> other system errors; your MPI job may hang, crash, or produce silent
> data corruption.  The use of fork() (or system() or other calls that
> create child processes) is strongly discouraged.
>
> The process that invoked fork was:
>
>    Local host:          node1529 (PID 3701)
>    MPI_COMM_WORLD rank: 0
>
> If you are *absolutely sure* that your application will successfully
> and correctly survive a call to fork(), you may disable this warning
> by setting the mpi_warn_on_fork MCA parameter to 0.
> --------------------------------------------------------------------------
> [node1529:03700] 13 more processes have sent help message 
> help-mpi-runtime.txt / mpi_init:warn-fork
> [node1529:03700] Set MCA parameter "orte_base_help_aggregate" to 0 to 
> see all help / error messages
> --------------------------------------------------------------------------
>
> Is this message useful for debugging?

This only indicates a possible technical problem with attaching a
debugger in your environment.  To learn anything about the crash, you
have to actually attach the debugger and poke around (stack trace, etc.).

Can you create an interactive session and run your job from there?
