Adding to that - can you also post how you compiled the program? Show us the output too.
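For reference, the kind of transcript being asked for looks roughly like this. A minimal sketch only: it assumes LAM/MPI is installed under /opt/lam-7.0.6 (as shown later in the thread) and uses LAM's C++ wrapper compiler `mpiCC`; the `-showme` flag is the LAM wrapper-compiler convention for printing the underlying compile line.

```shell
# Compile with LAM's wrapper compiler, not plain gcc/g++, so that the
# matching mpi.h and libmpi from the same LAM install are picked up.
/opt/lam-7.0.6/bin/mpiCC -o hello++ Hello.cc

# Confirm which wrapper is on the PATH and what it expands to.
which mpiCC
mpiCC -showme

# Run from a directory that is exported to all nodes (e.g. under /home),
# since by default only /home is NFS-exported to the compute nodes.
mpirun N ./hello++
```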
Thanks,

Bernard

> -----Original Message-----
> From: Michael Edwards [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 04, 2006 7:47
> To: Michelle Chu
> Cc: Bernard Li; [email protected]
> Subject: Re: [Oscar-users] MPI ranking and size problem
>
> Did you run the program from your home directory, or from the /opt/
> directory where it lives? For LAM to work properly, the program needs
> to be available on all the compute nodes. By default, /home is the
> only directory exported by NFS.
>
> I am still confused by the MPI_INIT error even if that is the problem.
> Did you compile the program using gcc or mpicc?
>
> On 4/4/06, Michelle Chu <[EMAIL PROTECTED]> wrote:
> >
> > Bernard,
> >
> > Here are some outputs...
> >
> > [EMAIL PROTECTED] ~]$ lamnodes
> > n0    athena.cs.xxx.edu:1:origin,this_node
> > n1    oscarnode1.cs.xxx.edu:1:
> > n2    oscarnode2.cs.xxx.edu:1:
> > n3    oscarnode3.cs.xxx.edu:1:
> > n4    oscarnode4.cs.xxx.edu:1:
> > n5    oscarnode5.cs.xxx.edu:1:
> > n6    oscarnode6.cs.xxx.edu:1:
> > n7    oscarnode7.cs.xxx.edu:1:
> > n8    oscarnode8.cs.xxx.edu:1:
> >
> > **************************************************************
> > [EMAIL PROTECTED] ~]$ vi lamtest.output
> >
> > wall clock time = 0.000074
> > Process 1 of 8 on oscarnode8.cs.xxx.edu
> >
> > --> MPI C++ bindings test:
> >
> > Hello World! I am 1 of 8
> > Hello World! I am 0 of 8
> > Hello World! I am 6 of 8
> > Hello World! I am 4 of 8
> > Hello World! I am 2 of 8
> > Hello World! I am 5 of 8
> > Hello World! I am 7 of 8
> > Hello World! I am 3 of 8
> >
> > --> MPI Fortran bindings test:
> >
> > Hello World! I am 0 of 8
> > Hello World! I am 1 of 8
> > Hello World! I am 4 of 8
> > Hello World! I am 6 of 8
> > Hello World! I am 2 of 8
> > Hello World! I am 7 of 8
> > Hello World! I am 5 of 8
> > Hello World! I am 3 of 8
> >
> > LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> >
> > LAM/MPI test complete
> > Unless there are errors above, test completed successfully.
> >
> > **************************************************************
> > [EMAIL PROTECTED] examples]$ mpirun N hello++
> >
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > Hello World! I am 0 of 1
> > -----------------------------------------------------------------------------
> > It seems that [at least] one of the processes that was started with
> > mpirun did not invoke MPI_INIT before quitting (it is possible that
> > more than one process did not invoke MPI_INIT -- mpirun was only
> > notified of the first one, which was on node n0).
> >
> > mpirun can *only* be used with MPI programs (i.e., programs that
> > invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> > to run non-MPI programs over the lambooted nodes.
> > -----------------------------------------------------------------------------
> > **************************************************************
> > Attached is the output of: lamboot -d hostfile
> >
> > What I did was:
> > 1) lamboot -d hostfile
> > 2) mpirun N hello++
> >
> > hostfile lists the host name of the headnode and all eight cluster
> > nodes as:
> >
> > athena.cs.xxx.edu
> > oscarnode1.cs.xxx.edu
> > .....
> > oscarnode8.cs.xxx.edu
> >
> > Thanks,
> >
> > Michelle
> >
> > On 4/4/06, Bernard Li <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi Michelle:
> > >
> > > I just tested your code on a 2-node cluster (including the headnode)
> > > and got the following result:
> > >
> > > $ mpirun N a.out
> > > Hello World! I am 1 of 2
> > > Hello World! I am 0 of 2
> > >
> > > So it seems fine (you had a space between " and 'mpi.h', but I
> > > fixed that).
> > >
> > > Can you show us the output of "lamnodes" after you have
> > > successfully booted the nodes?
> > > Also, post the output of your LAM/MPI OSCAR tests
> > > (/home/oscartst/lam/lamtest.out).
> > >
> > > Cheers,
> > >
> > > Bernard
> > >
> > > ________________________________
> > > From: [EMAIL PROTECTED] on behalf of Michelle Chu
> > > Sent: Mon 03/04/2006 21:02
> > > To: Michael Edwards
> > > Cc: [email protected]
> > > Subject: Re: [Oscar-users] MPI ranking and size problem
> > >
> > > Michael,
> > >
> > > The OSCAR version is 4.2. The OS on the client nodes is Red Hat,
> > > installed from the client image generated during the OSCAR
> > > installation. The cluster testing step at the end of the
> > > installation passed, except for the Ganglia part. Thank you very
> > > much for your help.
> > >
> > > Michelle
> > >
> > > Here is the code for Hello.cc:
> > >
> > > **************************************************************
> > > #include <iostream.h>
> > > // modified to reference the master mpi.h file, to meet the MPI
> > > // standard spec.
> > > #include " mpi.h"
> > >
> > > int
> > > main(int argc, char *argv[])
> > > {
> > >     MPI::Init(argc, argv);
> > >
> > >     int rank = MPI::COMM_WORLD.Get_rank();
> > >     int size = MPI::COMM_WORLD.Get_size();
> > >
> > >     cout << "Hello World! I am " << rank << " of " << size << endl;
> > >
> > >     MPI::Finalize();
> > > }
> > > **************************************************************
> > >
> > > On 4/3/06, Michael Edwards <[EMAIL PROTECTED]> wrote:
> > > > What version of OSCAR are you using, and on what platform?
> > > >
> > > > Also, could you send us a copy of hello++.cpp? It looks like
> > > > there are some errors there. And did all the OSCAR tests pass?
> > > >
> > > > LAM appears to be working correctly, on the surface anyway.
> > > >
> > > > On 4/3/06, Michelle Chu <[EMAIL PROTECTED]> wrote:
> > > > > Hello there,
> > > > >
> > > > > When I mpirun a simple hello MPI program on all my eight nodes
> > > > > as the following.
> > > > > I get a sequence of "Hello World! I am 0 of 1" instead of 1 of
> > > > > 8, 2 of 8, 3 of 8. Also, a problem with MPI_INIT. Thank you for
> > > > > your help.
> > > > >
> > > > > which mpicc
> > > > > /opt/lam-7.0.6/bin/mpicc
> > > > >
> > > > > lamboot -v my_hostfile
> > > > >
> > > > > my_hostfile is:
> > > > > **************************************
> > > > > athena.cs.xxx.edu
> > > > > oscarnode1.cs.xxx.edu
> > > > > oscarnode2.cs.xxx.edu
> > > > > oscarnode3.cs.xxx.edu
> > > > > oscarnode4.cs.xxx.edu
> > > > > oscarnode5.cs.xxx.edu
> > > > > oscarnode6.cs.xxx.edu
> > > > > oscarnode7.cs.xxx.edu
> > > > > oscarnode8.cs.xxx.edu
> > > > >
> > > > > **************************************************************
> > > > > LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
> > > > >
> > > > > n-1<16365> ssi:boot:base:linear: booting n0 (athena.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n1 (oscarnode1.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n2 (oscarnode2.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n3 (oscarnode3.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n4 (oscarnode4.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n5 (oscarnode5.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n6 (oscarnode6.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n7 (oscarnode7.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: booting n8 (oscarnode8.cs.xxx.edu)
> > > > > n-1<16365> ssi:boot:base:linear: finished
> > > > >
> > > > > mpirun N hello++
> > > > > **************************************************************
> > > > > Hello World! I am 0 of 1
> > > > > Hello World! I am 0 of 1
> > > > > Hello World! I am 0 of 1
> > > > > Hello World! I am 0 of 1
> > > > > Hello World! I am 0 of 1
> > > > > Hello World! I am 0 of 1
> > > > > -----------------------------------------------------------------------------
> > > > > It seems that [at least] one of the processes that was started
> > > > > with mpirun did not invoke MPI_INIT before quitting (it is
> > > > > possible that more than one process did not invoke MPI_INIT --
> > > > > mpirun was only notified of the first one, which was on node n0).
> > > > >
> > > > > mpirun can *only* be used with MPI programs (i.e., programs that
> > > > > invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec"
> > > > > program to run non-MPI programs over the lambooted nodes.
> > > > > -----------------------------------------------------------------------------
> > > > > Hello World! I am 0 of 1
> > > > > **************************************************************

_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users
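For anyone landing on this thread later: below is a cleaned-up sketch of the Hello.cc posted above, not the exact file from the thread. It uses the standard <iostream> header (rather than the pre-standard <iostream.h>), qualifies cout/endl, and drops the stray space inside the include. It assumes the MPI-2 C++ bindings that LAM 7.0.6 provides.

```cpp
// Hypothetical cleanup of the Hello.cc from the thread.
#include <iostream>   // standard header; <iostream.h> is pre-standard
#include "mpi.h"      // no stray space in the file name

int main(int argc, char *argv[])
{
    MPI::Init(argc, argv);

    // Rank and size come from the world communicator; if size is 1 on
    // every process, the processes were started as independent
    // singletons rather than as one MPI job.
    int rank = MPI::COMM_WORLD.Get_rank();
    int size = MPI::COMM_WORLD.Get_size();

    std::cout << "Hello World! I am " << rank << " of " << size
              << std::endl;

    MPI::Finalize();
    return 0;
}
```

Build it with the wrapper compiler from the same install whose mpirun you launch with (here /opt/lam-7.0.6/bin/mpiCC). A plausible cause of the "0 of 1" symptom, consistent with Michael's gcc-vs-mpicc question, is a binary compiled against one MPI installation but launched by another's mpirun; note (as an aside, not something established in this thread) that OSCAR installs more than one MPI and selects a default via its switcher tool, so it is worth checking that `which mpiCC` and `which mpirun` resolve into the same tree.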
