Bill --
Check out http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork.
To my knowledge, RHEL4 has not yet received a hotfix that will allow
fork() with OpenFabrics verbs applications when memory is still
registered in the parent.
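Until such a fix is available, one way to sidestep the problem in a case like your sample is to avoid fork() altogether and do the copy in-process instead of through system("cp"). The following is only a minimal sketch under that assumption: copy_file is a hypothetical helper written for illustration, not part of Open MPI or libibverbs, and it only covers the simple file-copy case from your test program.

```c
#include <stdio.h>

/* Hypothetical helper (not an Open MPI or libibverbs API):
 * copy src to dst using plain stdio, so no child process is forked.
 * Returns 0 on success, -1 on any error. */
static int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    int rc = 0;

    /* Copy in fixed-size chunks until EOF or a read error. */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;

    fclose(in);
    /* fclose() flushes buffered data, so its result matters for dst. */
    if (fclose(out) != 0)
        rc = -1;

    return rc;
}
```

In the sample, the system("cp test.dat test2.dat") line would then become sysstatus = copy_file("test.dat", "test2.dat"). This does not help when an arbitrary external command really has to be run.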
On Aug 6, 2007, at 7:53 AM, Bill Wichser wrote:
We have run across an issue, probably related more to openib than
to openmpi, but we don't know how to resolve it.
Linux kernel - 2.6.9-55.0.2.ELsmp x86_64
libibverbs-1.0.4-7
openmpi - it doesn't matter - 1.1.5 and 1.2.3 both fail.
When the sample code is run across IB nodes, using the IB
interface, the receive just hangs whenever a system() call is
issued. Removing the system() call removes the hang. Running
across the nodes over TCP removes the hang. Running on a single
node removes the hang. Only when using the IB interface do we have
this hang.
So the simple solution is "don't do this" but apparently something
deeper is involved and who knows where it will pop up again.
Thanks,
Bill
ps - sample code compiled using mpicc, built with gcc. You'll need
a test.dat file for the system("cp") command.
#include <stdio.h>
#include <stdlib.h>   /* system() */
#include <mpi.h>
#include <unistd.h>   /* sleep() */

char All[4840];
int ThisTask;
int NTask;

int main(int argc, char **argv)
{
    int task;
    int nothing = 0;   /* dummy payload; the value is never inspected */
    MPI_Status status;
    int errorFlag = 0;
    int sysstatus;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &ThisTask);
    MPI_Comm_size(MPI_COMM_WORLD, &NTask);

#if 1
    /* system() forks a child; this is what triggers the hang over IB */
    if (ThisTask == 0) {
        printf("Task %d cmd run\n", ThisTask);
        sysstatus = system("cp test.dat test2.dat");
        printf("Task %d cmd status %d\n", ThisTask, sysstatus);
    }
#else
    if (ThisTask == 0) {
        sleep(60);
    }
#endif

    if (ThisTask == 0) {
        printf("Task 0 Wait Loop START\n");
        for (task = 1; task < NTask; task++) {
            printf("Task %d Recv START\n", task);
            MPI_Recv(&nothing, sizeof(nothing), MPI_BYTE, task, 0,
                     MPI_COMM_WORLD, &status);
            printf("Task %d Recv END\n", task);
        }
        printf("Task 0 Wait Loop END\n");
    }
    else {
        printf("Task %d Send START\n", ThisTask);
        MPI_Send(&nothing, sizeof(nothing), MPI_BYTE, 0, 0,
                 MPI_COMM_WORLD);
        printf("Task %d Send END\n", ThisTask);
    }

    printf("Task %d Finalize START\n", ThisTask);
    MPI_Finalize(); /* clean up & finalize MPI */
    printf("Task %d Finalize END\n", ThisTask);
    return 0;
}
--
Jeff Squyres
Cisco Systems