I believe that this issue has been fixed for the upcoming v1.3 series;
it will not be fixed in the v1.2 series (we made extensive overhauls
to the underlying run-time system for v1.3 which would be
extraordinarily difficult to port back to the v1.2 series).
On Sep 30, 2008, at 9:35 AM, André Gaul wrote:
Hey all!
Last week I observed strange behaviour in Open MPI when using
MPI_Comm_spawn() to create new MPI processes: the child processes are
started, but after the child's call to MPI_Init() no output to stdout
gets redirected to the stdout of the parent/mpirun process. Before the
call to MPI_Init() the child's stdout is redirected correctly.
I tried this with several Open MPI versions on different architectures
(1.2.7 on Debian i686, 1.2.2 on SuSE 10.3 x86_64) and wrote some dummy
code to demonstrate the behaviour:
/* parent.c */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    printf("[parent] now spawn\n");
    MPI_Comm everyone;
    MPI_Comm_spawn("./child", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                   MPI_COMM_SELF, &everyone, MPI_ERRCODES_IGNORE);
    printf("[parent] finished spawning\n");
    /* keep the parent alive; see child.c */
    while (1);
    MPI_Finalize();
    return 0;
}
/* child.c */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    /* stdout does not get redirected!
     * (even sometimes (!) without the while (1); loop
     * in parent.c)
     */
    printf("[child] initialized MPI\n");
    MPI_Finalize();
    return 0;
}
Output is:
% mpicc -o parent parent.c && mpicc -o child child.c && mpirun ./parent
[parent] now spawn
[parent] finished spawning
Without the while (1); loop in parent.c the output sometimes (!)
remains the same as above and sometimes is:
% mpicc -o parent parent.c && mpicc -o child child.c && mpirun ./parent
[parent] now spawn
[parent] finished spawning
[child] initialized MPI
The child process definitely runs past the MPI_Init() call in every
situation described here, so I think the problem has to be the stdout
redirection.
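One way to check a claim like this independently of stdout forwarding
is to have the child append a marker line to a file right after
MPI_Init(). The following is only a minimal sketch in plain C: the
helper name trace_to_file and the file name used with it are
hypothetical and are not part of the test programs above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the original parent.c/child.c):
 * append one line to the file at `path`, bypassing stdout entirely.
 * The file is closed before returning, so the line is handed to the
 * OS even if the process is killed shortly afterwards.
 * Returns 0 on success, -1 on failure. */
int trace_to_file(const char *path, const char *msg) {
    assert(path != NULL && msg != NULL);

    FILE *f = fopen(path, "a");   /* append mode: earlier markers are kept */
    if (f == NULL) {
        return -1;
    }

    int ok = (fprintf(f, "%s\n", msg) >= 0);
    if (fclose(f) != 0) {         /* fclose flushes; a failure here means data loss */
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

In child.c this could be called as
trace_to_file("child.trace", "[child] initialized MPI") directly after
MPI_Init(); if the line shows up in the file but not on the terminal,
the child ran and only the forwarding of its stdout is missing.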
A similar (or the same?) bug is reported here:
https://svn.open-mpi.org/trac/ompi/ticket/1120 . As rhc states in a
comment there, it does not work on remote nodes either. I don't know
which release was supposed to fix the bug, so I can't say whether this
is a known or a new problem. Perhaps one of the developers could take a
look at it.
Thanks!!
bye,
André Gaul
--
Jeff Squyres
Cisco Systems