On Thu, Apr 09, 2009 at 08:15:07PM +0500, amjad ali wrote:
>
> Hello All,
>
> On my 4-node Beowulf Cluster, when I run my PDE solver code (compiled
> with mpif90 of openmpi-installed-with-gfortran) with -np 4 launched
> only on the Head Node (without providing -machinefile), it gives me
> correct results. ONLY one problem is there: when I monitor RAM
> behavior it gets filling at a constant speed throughout the RUN of
> program, till I get final result. So Why the usage of RAM is
> constantly increasing (although this is not the case with relevant
> serial code of the same problem/algorithm/method).
>
> Secondly when I launch the same compiled code on 4 nodes (with
> -machinefile option). Then I do not get correct result. The
> convergence gets very much slow down after few iterations, ultimately
> resulting in NaN values of problem variables.
>
> I would be very grateful for having comments for the remedy of above
> two difficulties/confusions.

Without the code there is not much this group can do.

The constant increase in RAM sounds like a memory leak or a natural
result of the program's structure. You may need to run a debugger to
see what part of your code is triggering the memory activity.

Getting different results locally and distributed sounds like a bug in
your code. Look for uninitialized data and code that depends on side
effects.

Compiler flags can help. Start by dialing optimization down:
    -O0 -g
and also look at
    -pedantic -Wall -fbounds-check -fno-range-check -Wsurprising
(man gfortran). Perhaps first verify that the N hosts have the same
runtime libs and hardware that the local host has.

Different results when N changes may also be a natural result of your
code. Algebra tells us that simple operations such as addition and
multiplication are commutative and associative, but real floating point
arithmetic is not associative: changing the order in which terms are
combined can change the rounded result, and some algorithms prove to be
unstable under such changes.

IEEE arithmetic and IEEE exception settings control when and how libs
return NaN etc. Since this is sometimes managed via environment
variables, look there as well.

-- 
T o m  M i t c h e l l
Found me a new hat, now what?

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
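
As a footnote to the floating point remark above, here is a minimal
sketch of how the grouping of a sum can change its result. It is in
Python rather than Fortran only so it can be pasted and run anywhere.
The names naive_sum and partial_sums, and the four sample values, are
made up for illustration. It does not use MPI; it only mimics ranks
that each sum a contiguous chunk before the partial sums are combined.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation, as in a simple DO loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def partial_sums(values, nprocs):
    """Mimic a block-decomposed reduction (hypothetical, no MPI involved):
    each of `nprocs` ranks sums its own contiguous chunk, then the
    per-rank partial sums are added together in rank order."""
    chunk = (len(values) + nprocs - 1) // nprocs
    partials = [naive_sum(values[i:i + chunk])
                for i in range(0, len(values), chunk)]
    return naive_sum(partials)


# Sample data chosen so that rounding is visible in double precision.
values = [1.0e16, 1.0, -1.0e16, 1.0]

serial = partial_sums(values, 1)   # one rank: plain left-to-right sum
two_way = partial_sums(values, 2)  # two ranks: different grouping
exact = math.fsum(values)          # correctly rounded reference sum

print("1 rank :", serial)
print("2 ranks:", two_way)
print("fsum   :", exact)
```

With these values the one-rank ordering gives 1.0, the two-rank
grouping gives 0.0, and the correctly rounded sum is 2.0. An iterative
solver can amplify a difference of this kind from one iteration to the
next, although that alone would not normally explain NaN values.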