Dear All,
The combination of some astute observations
by J. Guyer and D. Wheeler's trilinos sanity test
(posted on this thread) has resolved the problem
I have been reporting (at annoying length) with
running the mesh1D example in parallel.
Sheepishly, I must now reveal the awful truth:
my mesh1D example was running properly in
parallel mode all along. Yes, those long harangues
you endured were for naught. My sincere apologies.
As an aid to the unwary FiPyers of the future,
here are three things to consider if problems arise
in parallel FiPy.
1) The Wheeler Test. Because it probes the
Trilinos/MPI infrastructure, this simple bit of
FiPy-independent code will let you know whether
that side of your software environment is
functioning properly.
Here is a simple script that will help, similar to
the one Dr. Wheeler suggested earlier:
===============================
#!/usr/bin/env python
import sys
from PyTrilinos import Epetra

# Create the Epetra communicator once, then query it for
# the total process count and this process's rank.
comm = Epetra.PyComm()
Nproc = comm.NumProc()
Pid = comm.MyPID()
sys.stdout.write("MyPID = %d; total procs = %d\n" % (Pid, Nproc))
===============================
Launch that test program with mpirun:
mpirun -np 2 simpleTestProg.py
and in parallel mode you should see (the two
lines may appear in either order)
MyPID = 0; total procs = 2
MyPID = 1; total procs = 2
If you see only MyPID = 0 with total procs = 1,
you are probably running in serial mode.
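If you capture the test output to a file, a few lines of plain Python can check it for you. This is a hypothetical helper of my own (not part of FiPy, Trilinos, or MPI); it just counts the distinct MyPID values in the captured text:

```python
import re

def count_ranks(output):
    """Return the set of MyPID values found in the test output."""
    return {int(m.group(1))
            for m in re.finditer(r"MyPID = (\d+);", output)}

# Two sample outputs: a genuine 2-process run, and the
# serial-fallback case where only rank 0 ever reports.
parallel = "MyPID = 0; total procs = 2\nMyPID = 1; total procs = 2\n"
serial = "MyPID = 0; total procs = 1\n"

print(len(count_ranks(parallel)))  # -> 2 (two distinct ranks: parallel)
print(len(count_ranks(serial)))    # -> 1 (one rank: serial fallback)
```

More than one distinct rank means the MPI side is genuinely fanning out; a single rank means you are back in serial mode.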
2) A somewhat more devious pitfall you will want
to be aware of has to do with the way TSVViewer
works. Here is Dr. Guyer's explanation:
TSVViewer outputs var.getGlobalValue(). If you're
writing to a file, it does this only on process 0, but
for stdout it does it on all processes. It works that
way, if for no other reason, because the doctest
results have to make sense, regardless of the
number of processes.
So, to see a single copy of the combined results
of a parallel operation using TSVViewer, you will
need to write those results to a file. [BTW, simply
"tee"-ing stdout to a file will still give you the
output from all processes; this is what tripped me
up in performing this test.]
3) Finally, for completeness, you may wish to
perform a simple sanity check on mpi4py, similar to the
Wheeler Test. Here is a suggestion.
=======================
#!/usr/bin/env python
import sys
from mpi4py import MPI

# Query the world communicator for the total process
# count and this process's rank.
Nproc = MPI.COMM_WORLD.Get_size()
Pid = MPI.COMM_WORLD.Get_rank()
sys.stdout.write("MyPID = %d; total procs = %d\n" % (Pid, Nproc))
=======================
Launch this with mpirun, as before, and in parallel
mode you should see results similar to those for
the trilinos test above.
Again, my apologies for occupying so much
bandwidth on the list, for what turned out to
be such a small matter. I appreciate the time
and patience Drs. Wheeler and Guyer expended
untangling this (non-)problem. And
thanks as well to Igor for kicking this thread
off in the first place and for his suggestions
and timing results(!) -- a significant speed
increase seems achingly close: I can see
it now, just over the rainbow...