Thanks for the ideas. I'm sorry that it's been painful for you (it's worse for us). I'll try to incorporate your ideas back into the docs; I believe Jon has already done so to some degree. We are seeing very good speed-ups, so keep trying.
On Mon, May 10, 2010 at 5:29 PM, jtg <[email protected]> wrote:
> Dear All,
>
> The combination of some astute observations by J. Guyer
> and D. Wheeler's trilinos sanity test (posted on this
> thread) has resolved the problem I have been reporting
> (at annoying length) with running the mesh1D example in
> parallel.
>
> Sheepishly, I must now reveal the awful truth: my mesh1D
> example was running properly in parallel mode all along.
> Yes, those long harangues you endured were for naught.
> My sincere apologies.
>
> As an aid to the unwary FiPyers of the future, here are
> three things to consider if problems arise in parallel
> FiPy.
>
> 1) The Wheeler Test. Because it probes the trilinos/MPI
> infrastructure, this simple bit of fipy-independent code
> will let you know whether that side of your software
> environment is functioning properly.
>
> Here is a simple script that will help, similar to the
> one Dr. Wheeler suggested earlier:
>
> ===============================
> #!/usr/bin/env python
>
> import sys
>
> from PyTrilinos import Epetra
>
> comm = Epetra.PyComm()
> Nproc = comm.NumProc()
> Pid = comm.MyPID()
>
> TstMsg = "MyPID = %d; total procs = %d\n"
>
> sys.stdout.write(TstMsg % (Pid, Nproc))
> ===============================
>
> Launch that test program with mpirun
>
> mpirun -np 2 simpleTestProg.py
>
> and in parallel mode you should see
>
> MyPID = 0; total procs = 2
> MyPID = 1; total procs = 2
>
> If there is only PID 0, you are likely running in
> serial mode.
>
> 2) A somewhat more devious pitfall you will want to be
> aware of has to do with the way TSVViewer works. Here
> is Dr. Guyer's explanation:
>
> TSVViewer outputs var.getGlobalValue(). If you're
> writing to a file, it does this only on process 0, but
> for stdout it does it on all processes. It works that
> way, if for no other reason, because the doctest
> results have to make sense, regardless of the number
> of processes.
> So, to see one copy of the combined results of a
> parallel operation using TSVViewer, you will need to
> write those results to a file. [BTW, simply "tee"ing
> the result to a file will still give you the output
> from all processes; this is what tripped me up in
> performing this test.]
>
> 3) Finally, for completeness, you may wish to perform
> a simple sanity check on mpi4py, similar to the
> Wheeler Test. Here is a suggestion.
>
> =======================
> #!/usr/bin/env python
>
> import sys
>
> from mpi4py import MPI
>
> Nproc = MPI.COMM_WORLD.Get_size()
> Pid = MPI.COMM_WORLD.Get_rank()
>
> TstMsg = "MyPID = %d; total procs = %d\n"
>
> sys.stdout.write(TstMsg % (Pid, Nproc))
> =======================
>
> Launch this with mpirun, as before, and in parallel
> mode you should see results similar to those for the
> trilinos test above.
>
> Again, my apologies for occupying so much bandwidth on
> the list for what turned out to be such a small
> matter. I appreciate the time and patience Drs.
> Wheeler and Guyer expended untangling this
> (non-)problem. And thanks as well to Igor for kicking
> this thread off in the first place and for his
> suggestions and timing results(!) -- a significant
> speed increase seems achingly close: I can see it now,
> just over the rainbow...

--
Daniel Wheeler
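For readers who want to see the file-vs-stdout behavior from point 2 without an MPI installation, here is a minimal, FiPy-independent sketch of the gating logic Dr. Guyer describes. The helper `write_tsv` and its signature are hypothetical illustrations, not FiPy's actual TSVViewer API; the loop merely simulates two ranks.

```python
import sys


def write_tsv(values, rank, filename=None):
    """Hypothetical helper mimicking the behavior described above:
    writing to a file happens only on process 0, while writing to
    stdout happens on every process."""
    row = "\t".join(str(v) for v in values) + "\n"
    if filename is not None:
        # File output: only the root process writes, so the file
        # ends up holding exactly one copy of the results.
        if rank == 0:
            with open(filename, "w") as f:
                f.write(row)
    else:
        # stdout output: every process writes, which is why piping
        # through "tee" still captures one copy per process.
        sys.stdout.write(row)


# Simulate two ranks; only the rank-0 call touches the file.
for rank in (0, 1):
    write_tsv([0.0, 0.5, 1.0], rank, filename="results.tsv")
```

Run under a real MPI launcher, the same pattern yields one line in the file but one line on the terminal per process when no filename is given.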
