Further to the posting below, I can report that the test program (attached - this time correctly) is chewing up CPU time on both compute nodes for as long as I care to let it continue. It would appear that the processes are stuck in MPI_Recv, which is the next call after the print statements in the test program.
Has anyone else seen this behavior, or can anyone give me a hint on how to troubleshoot it?
Cheers and thanks
Ole Nielsen

Output:
nielso@alamba:~/sandpit/pypar/source$ mpirun --hostfile /etc/mpihosts --host node17,node18 --npernode 2 a.out
Number of processes = 4
Test repeated 3 times for reliability
I am process 2 on node node18
P2: Waiting to receive from to P1
I am process 0 on node node17
Run 1 of 3
P0: Sending to P1
I am process 1 on node node17
P1: Waiting to receive from to P0
I am process 3 on node node18
P3: Waiting to receive from to P2
P0: Waiting to receive from P3
P1: Sending to to P2
P1: Waiting to receive from to P0
P2: Sending to to P3
P0: Received from to P3
Run 2 of 3
P0: Sending to P1
P3: Sending to to P0
P3: Waiting to receive from to P2
P2: Waiting to receive from to P1
P1: Sending to to P2
P0: Waiting to receive from P3

On Mon, Sep 19, 2011 at 11:04 AM, Ole Nielsen <ole.moller.niel...@gmail.com> wrote:
>
> Hi all
>
> We have been using Open MPI for many years with Ubuntu on our 20-node
> cluster. Each node has 2 quad cores, so we usually run up to 8 processes on
> each node, up to a maximum of 160 processes.
>
> However, we just upgraded the cluster to Ubuntu 11.04 with Open MPI 1.4.3
> and have come across a strange behavior where MPI programs run perfectly
> well when confined to one node but hang during communication across
> multiple nodes. We have no idea why and would like some help in debugging
> this. A small MPI test program is attached and typical output shown below.
>
> Hope someone can help us
> Cheers and thanks
> Ole Nielsen
>
> -------------------- Test output across two nodes (This one hangs)
> --------------------------
> nielso@alamba:~/sandpit/pypar/source$ mpirun --hostfile /etc/mpihosts
> --host node17,node18 --npernode 2 a.out
> Number of processes = 4
> Test repeated 3 times for reliability
> I am process 1 on node node17
> P1: Waiting to receive from to P0
> I am process 0 on node node17
> Run 1 of 3
> P0: Sending to P1
> I am process 2 on node node18
> P2: Waiting to receive from to P1
> I am process 3 on node node18
> P3: Waiting to receive from to P2
> P1: Sending to to P2
>
>
> -------------------- Test output within one node (This one is OK)
> --------------------------
> nielso@alamba:~/sandpit/pypar/source$ mpirun --hostfile /etc/mpihosts
> --host node17 --npernode 4 a.out
> Number of processes = 4
> Test repeated 3 times for reliability
> I am process 2 on node node17
> P2: Waiting to receive from to P1
> I am process 0 on node node17
> Run 1 of 3
> P0: Sending to P1
> I am process 1 on node node17
> P1: Waiting to receive from to P0
> I am process 3 on node node17
> P3: Waiting to receive from to P2
> P1: Sending to to P2
> P2: Sending to to P3
> P1: Waiting to receive from to P0
> P2: Waiting to receive from to P1
> P3: Sending to to P0
> P0: Received from to P3
> Run 2 of 3
> P0: Sending to P1
> P3: Waiting to receive from to P2
> P1: Sending to to P2
> P2: Sending to to P3
> P1: Waiting to receive from to P0
> P3: Sending to to P0
> P2: Waiting to receive from to P1
> P0: Received from to P3
> Run 3 of 3
> P0: Sending to P1
> P3: Waiting to receive from to P2
> P1: Sending to to P2
> P2: Sending to to P3
> P1: Done
> P2: Done
> P3: Sending to to P0
> P0: Received from to P3
> P0: Done
> P3: Done
/* Simple MPI communication test. Ole Moller Nielsen - 2011 */

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define M 5000  /* Data size */

int main(int argc, char **argv)
{
  int repeats = 3, msgid = 0;
  int myid, procs;
  int j, k;
  double A[M];

  int namelen;
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  MPI_Status stat;

  /* Initialize */
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &procs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  MPI_Get_processor_name(processor_name, &namelen);

  if (myid == 0) {
    printf("Number of processes = %d\n", procs);
    printf("Test repeated %d times for reliability\n", repeats);
  }

  if (procs < 2) {
    printf("Program needs at least two processors - aborting\n");
    MPI_Abort(MPI_COMM_WORLD, 999);
  }

  /* Create the data */
  for (j = 0; j < M; j++) {
    A[j] = rand();
  }

  /* Synchronize */
  MPI_Barrier(MPI_COMM_WORLD);
  printf("I am process %d on node %s\n", myid, processor_name);

  /* Pass msg circularly a number of times */
  for (k = 0; k < repeats; k++) {
    if (myid == 0) {
      printf("Run %d of %d\n", k+1, repeats);
    }

    /* Communicate */
    if (myid == 0) {
      printf("P%i: Sending to P%i\n", myid, 1);
      MPI_Send(&A[0], M, MPI_DOUBLE, 1, msgid, MPI_COMM_WORLD);

      printf("P%i: Waiting to receive from P%i\n", myid, procs-1);
      MPI_Recv(&A[0], M, MPI_DOUBLE, procs-1, msgid, MPI_COMM_WORLD, &stat);
      printf("P%i: Received from to P%i\n", myid, procs-1);
    } else {
      printf("P%i: Waiting to receive from to P%i\n", myid, myid-1);
      MPI_Recv(&A[0], M, MPI_DOUBLE, myid-1, msgid, MPI_COMM_WORLD, &stat);

      printf("P%i: Sending to to P%i\n", myid, (myid+1)%procs);
      MPI_Send(&A[0], M, MPI_DOUBLE, (myid+1)%procs, msgid, MPI_COMM_WORLD);
    }
  }

  printf("P%i: Done\n", myid);

  MPI_Finalize();
  return 0;
}
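
For anyone who wants to reproduce this, the program can be built with the Open MPI C wrapper compiler and launched the same way as in the output above; the source filename below is just illustrative:

  mpicc mpitest.c -o a.out    # mpitest.c is a placeholder name for the attached file
  mpirun --hostfile /etc/mpihosts --host node17,node18 --npernode 2 a.out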