Hello,

After installing OSCAR 4 on an RH-EL-AS-3 cluster, one of my major MPI programs is not running correctly. The details are below; thanks in advance for any help.

In short, the program just sits there, waiting and doing nothing, whereas normally it should produce a lot of output.

In detail, we have a 28-node cluster, including the master node; each node has 2 CPUs.

Originally, I was running LAM-6.5.9 on Red Hat 7.2, using the PGI Fortran compiler and the GNU C compiler. The command used to run the program was:
"mpirun -O -x CYANALIB c0,1,2,3,4,5,6,7,8,9,10,11,12 My_Program"
It ran fine: when I ran "gstat -a -1", I would see 6 nodes running at about 100% CPU time, since each had two copies running.


Now I am using OSCAR 4 (LAM-7.0.6) on RH-EL-AS-3 with all GNU compilers (C and Fortran), and I have recompiled my program. With the same command, it starts, then just sits there, doing nothing. "gstat -a -1" shows only 6 nodes running at about 50% CPU time, which looks like only one copy running on each node. "mpitask" shows everything running.
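
To make the placement I expect concrete, here is a rough sketch (not part of my program) of how I understand the "c" CPU list to map onto nodes. It assumes CPUs are numbered in node order and that every node contributes the same number of CPUs; the real mapping depends on the CPU counts LAM was booted with, and the function name and defaults are only illustrative.

```python
from collections import Counter


def copies_per_node(cpu_spec, cpus_per_node=2):
    """Count how many copies land on each node for a spec like 'c0,1,2'.

    Assumption: CPU k belongs to node k // cpus_per_node, i.e. every node
    was booted with the same CPU count. Treat this only as a sanity check,
    not as a description of what mpirun actually does.
    """
    if not cpu_spec.startswith("c"):
        raise ValueError("expected a CPU list starting with 'c'")
    cpus = [int(tok) for tok in cpu_spec[1:].split(",")]
    return Counter(cpu // cpus_per_node for cpu in cpus)


if __name__ == "__main__":
    spec = "c0,1,2,3,4,5,6,7,8,9,10,11,12"
    for per_node in (2, 1):
        placement = copies_per_node(spec, cpus_per_node=per_node)
        print(f"cpus_per_node={per_node}:")
        for node, copies in sorted(placement.items()):
            print(f"  n{node}: {copies}")
```

Under that assumption, two CPUs per node puts two copies on each of the first six nodes, while one CPU per node puts every copy on a different node, so the CPU counts LAM sees on the new installation may be worth comparing against the old one.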

Does anyone have any idea what might be going on?

Regards
Chen

===========================================
Yu Chen
Howard Hughes Medical Institute
Chemistry Building, Rm 182
University of Maryland at Baltimore County
1000 Hilltop Circle
Baltimore, MD 21250

phone:  (410)455-6347 (primary)
        (410)455-2718 (secondary)
fax:    (410)455-1174
email:  [EMAIL PROTECTED]
===========================================


_______________________________________________
Oscar-users mailing list
https://lists.sourceforge.net/lists/listinfo/oscar-users
