Hello,
After installing OSCAR 4 on an RH-EL-AS-3 cluster, one of my major MPI programs is not running correctly. Here are the details; thanks in advance for any help:
In short, the program just sits there waiting and does nothing, although it should normally produce a lot of output.
In detail: we have a 28-node cluster, including the master node, and each node has 2 CPUs.
Originally, I was running LAM-6.5.9 on Redhat 7.2, using the PGI FORTRAN compiler and the GNU C compiler. The command I used to run the program was:
"mpirun -O -x CYANALIB c0,1,2,3,4,5,6,7,8,9,10,11,12 My_Program"
It ran fine: when I ran "gstat -a -1", I would see 6 nodes running at about 100% CPU time, since each had two copies running.
Now I am using OSCAR 4 (LAM-7.0.6) on RH-EL-AS-3 with all GNU compilers (C and FORTRAN), and I have recompiled my program. With the same command, it starts, then just sits there doing nothing. "gstat -a -1" shows only 6 nodes running at about 50% CPU time, which suggests that only one copy is running on each node. "mpitask" shows everything running.
Does anyone have any idea?
Regards,
Chen
===========================================
Yu Chen
Howard Hughes Medical Institute
Chemistry Building, Rm 182
University of Maryland at Baltimore County
1000 Hilltop Circle
Baltimore, MD 21250
phone: (410)455-6347 (primary)
(410)455-2718 (secondary)
fax: (410)455-1174
email: [EMAIL PROTECTED]
===========================================
