Re: [DOLFIN-dev] Parallelization and PXMLMesh

Anders Logg Sat, 20 Dec 2008 12:24:19 -0800

On Sat, Dec 20, 2008 at 12:58:04PM +0100, Martin Sandve Alnæs wrote:
> On Fri, Dec 19, 2008 at 9:08 PM, Anders Logg <[email protected]> wrote:
> > On Fri, Dec 19, 2008 at 01:01:50PM +0100, Martin Sandve Alnæs wrote:
> >> On Fri, Dec 19, 2008 at 11:54 AM, Anders Logg <[email protected]> wrote:
> >> > On Fri, Dec 19, 2008 at 09:23:43AM +0100, Martin Sandve Alnæs wrote:
> >> >> On Fri, Dec 19, 2008 at 9:12 AM, Anders Logg <[email protected]> wrote:
> >> >> > On Fri, Dec 19, 2008 at 09:06:43AM +0100, Martin Sandve Alnæs wrote:
> >> >> >> On Fri, Dec 19, 2008 at 8:54 AM, Anders Logg <[email protected]> wrote:
> >> >> >> > On Fri, Dec 19, 2008 at 12:09:50PM +0900, Evan Lezar wrote:
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Thu, Dec 18, 2008 at 4:25 AM, Johan Hake <[email protected]> 
> >> >> >> >> wrote:
> >> >> >> >>
> >> >> >> >>     On Wednesday 17 December 2008 20:19:52 Anders Logg wrote:
> >> >> >> >>     > On Wed, Dec 17, 2008 at 08:13:03PM +0100, Johan Hake wrote:
> >> >> >> >>     > > On Wednesday 17 December 2008 19:20:11 Anders Logg wrote:
> >> >> >> >>     > > > Ola and I have now finished up the first round of 
> >> >> >> >> getting DOLFIN to
> >> >> >> >>     > > > run in parallel. In short, we can now parse meshes from 
> >> >> >> >> file in
> >> >> >> >>     > > > parallel and partition meshes in parallel (using 
> >> >> >> >> ParMETIS).
> >> >> >> >>     > > >
> >> >> >> >>     > > > We reused some good ideas that Niclas Jansson had 
> >> >> >> >> implemented in
> >> >> >> >>     > > > PXMLMesh before, but have also made some significant 
> >> >> >> >> changes as
> >> >> >> >>     > > > follows:
> >> >> >> >>     > > >
> >> >> >> >>     > > > 1. The XML reader does not handle any partitioning.
> >> >> >> >>     > > >
> >> >> >> >>     > > > 2. The XML reader just reads in a chunk of the mesh 
> >> >> >> >> data on each
> >> >> >> >>     > > > processor (in parallel) and stores that into a 
> >> >> >> >> LocalMeshData object
> >> >> >> >>     > > > (one for each processor). The data is just partitioned 
> >> >> >> >> in blocks so
> >> >> >> >>     > > > the vertices and cells may be completely unrelated.
> >> >> >> >>     > > >
> >> >> >> >>     > > > 3. The partitioning takes place in 
> >> >> >> >> MeshPartitioning::partition,
> >> >> >> >>     > > > which gets a LocalMeshData object on each processor. It 
> >> >> >> >> then calls
> >> >> >> >>     > > > ParMETIS to compute a partition (in parallel) and then 
> >> >> >> >> redistributes
> >> >> >> >>     > > > the data accordingly. Finally, a mesh is built on each 
> >> >> >> >> processor
> >> >> >> >>     using
> >> >> >> >>     > > > the local data.
> >> >> >> >>     > > >
> >> >> >> >>     > > > 4. All direct MPI calls (except one which should be 
> >> >> >> >> removed) have
> >> >> >> >>     been
> >> >> >> >>     > > > removed from the code. Instead, we mostly rely on
> >> >> >> >>     > > > dolfin::MPI::distribute which handles most cases of 
> >> >> >> >> parallel
> >> >> >> >>     > > > communication and works with STL data structures.
> >> >> >> >>     > > >
> >> >> >> >>     > > > 5. There is just one ParMETIS call (no initial geometric
> >> >> >> >>     > > > partitioning). It seemed like an unnecessary step, or 
> >> >> >> >> are there good
> >> >> >> >>     > > > reasons to perform the partitioning in two steps?
> >> >> >> >>     > > >
> >> >> >> >>     > > > For testing, go to sandbox/passembly, build and then run
> >> >> >> >>     > > >
> >> >> >> >>     > > >   mpirun -n 4 ./demo
> >> >> >> >>     > > >   ./plot_partitions 4
> >> >> >> >>     > >
> >> >> >> >>     > > Looks beautiful!
> >> >> >> >>     > >
> >> >> >> >>     > > I threw a 3D mesh of 160K vertices onto it, and it was 
> >> >> >> >> partitioned
> >> >> >> >>     nicely
> >> >> >> >>     > > in some 10 s, on my 2 core laptop.
> >> >> >> >>     > >
> >> >> >> >>     > > Johan
> >> >> >> >>     >
> >> >> >> >>     > Nice, in particular since we haven't run any 3D test cases 
> >> >> >> >> ourselves,
> >> >> >> >>     > just a tiny mesh of the unit square... :-)
> >> >> >> >>
> >> >> >> >>     Yes I thought so too ;)
> >> >> >> >>
> >> >> >> >>     Johan
> >> >> >> >>     _______________________________________________
> >> >> >> >>     DOLFIN-dev mailing list
> >> >> >> >>     [email protected]
> >> >> >> >>     http://www.fenics.org/mailman/listinfo/dolfin-dev
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> While we are on the topic of parallelisation - I have some 
> >> >> >> >> comments.
> >> >> >> >>
> >> >> >> >> A couple of months ago I was trying to parallelise some of my own 
> >> >> >> >> code and ran
> >> >> >> >> into some problems with the communicators used for PETSc and 
> >> >> >> >> SLEPc problems - I
> >> >> >> >> sort of got them to work, but never committed my changes because 
> >> >> >> >> they were a
> >> >> >> >> little ungainly and I felt I needed to spend some more time on 
> >> >> >> >> them.
> >> >> >> >>
> >> >> >> >> One thing that I did notice is that the user does not have much 
> >> >> >> >> control over
> >> >> >> >> which parts of the process run in parallel - with the number of 
> >> >> >> >> MPI processes
> >> >> >> >> deciding whether or not a parallel implementation should be used. 
> >> >> >> >>  In my case
> >> >> >> >> what I wanted to do was perform a frequency sweep and for each 
> >> >> >> >> frequency point
> >> >> >> >> perform the same calculation (an eigenvalue problem) for that 
> >> >> >> >> frequency.  My
> >> >> >> >> intention was to distribute the frequency sweep over a number of 
> >> >> >> >> processors and
> >> >> >> >> then handle the assembly and solution of each system separately.  
> >> >> >> >> This is not
> >> >> >> >> possible as the code is now since the assembly and eigenvalue 
> >> >> >> >> solvers all try
> >> >> >> >> to run in parallel as soon as the application is run as an MPI 
> >> >> >> >> program.
> >> >> >> >>
> >> >> >> >> I know that this is not very descriptive, but does anyone else 
> >> >> >> >> have thoughts on
> >> >> >> >> the matter?  I will put together something a little more concrete 
> >> >> >> >> as soon as I
> >> >> >> >> have a chance (I am travelling around quite a bit at the moment 
> >> >> >> >> so it is
> >> >> >> >> difficult for me to focus).
> >> >> >> >
> >> >> >> > We're just getting started. Everything needs to be configurable. At
> >> >> >> > the moment, we assume that all parallel computation should be 
> >> >> >> > split in
> >> >> >> > MPI::num_processes().
> >> >> >> >
> >> >> >> > We can either add optional parameters to all parallel calls or add
> >> >> >> > global options to control how many processes are used. But I 
> >> >> >> > suggest
> >> >> >>
> >> >> >> Global options wouldn't solve anything (the point here is variable
> >> >> >> number of processes during the application lifetime), and it's 
> >> >> >> probably
> >> >> >> enough with a single parameter, the communicator, perhaps permanently
> >> >> >> attached to each FunctionSpace (dofmap must be distributed over
> >> >> >> the processes in a communicator to be usable). Calls to 
> >> >> >> MPI::num_processes()
> >> >> >> would be replaced by V.communicator().num_processes() or something.
> >> >> >> I get the waiting argument though. When stuff works, a search for 
> >> >> >> MPI::*
> >> >> >> should reveal the places where "V.communicator()." should replace 
> >> >> >> "MPI::",
> >> >> >> likely covering most places that need to be updated.
> >> >> >>
> >> >> >> Martin
> >> >> >
> >> >> > Sounds good.
> >> >> >
> >> >> > Speaking of the communicator, I'd like to add access to a default
> >> >> > communicator in the MPI class which could be accessed by
> >> >> > MPI::communicator() but couldn't figure out how to do this (I suspect
> >> >> > it's trivial).
> >> >> >
> >> >> > Then we could avoid including mpi.h in MeshPartitioning.cpp (look for
> >> >> > the two first FIXMEs in that file). If anyone knows how to do this,
> >> >> > let me know.
> >> >>
> >> >> You mean something like
> >> >>
> >> >> class MPI {
> >> >> public:
> >> >>
> >> >>   static dolfin::Communicator & communicator()
> >> >>   { return *mainCommunicator; }
> >> >>
> >> >> private:
> >> >>   static shared_ptr<Communicator> mainCommunicator;
> >> >> };
> >> >>
> >> >> ?
> >> >>
> >> >> This can then be initialized the first time MPI::communicator()
> >> >> is called, to be a communicator including all MPI processes.
> >> >>
> >> >> Epetra has a "serial communicator", which is probably needed
> >> >> as well when MPI is not in use, instead of adding #ifdefs all
> >> >> places MPI is used.
> >> >>
> >> >> Martin
> >> >
> >> > Yes, something like that. But what would the Communicator class look
> >> > like? If you know how to do this, feel free to just add it.
> >> >
> >>
> >> Basically, it looks like class MPI can be renamed to Communicator.
> >> All its functions must be made non-static, and it should have a member
> >> variable to hold the communicator ID. All occurrences of MPI_COMM_WORLD
> >> in MPI.cpp should be replaced with the communicator ID variable.
> >> And all other places in the code where MPI_COMM_WORLD is needed,
> >> a communicator must be used instead (i.e. distribute).
> >>
> >> The communicator design will also affect the linear algebra backend,
> >> depending on how those libraries do this... The persons who design
> >> this should probably take a look at how both PETSc and Epetra
> >> handle this to get ideas.
> >>
> >> But I don't have time to really get involved with this now.
> >> It's not like this is a simple thing to do on the side ;)
> >>
> >> Martin
> >
> > I don't understand the point. The functions in MPI are static by
> > nature, how many processes, current process number etc.
> 
> Those functions depend on MPI_COMM_WORLD, which
> is the group of all processes. If you want to run one part
> of the application on a subset of the processes, you wouldn't
> want to use the current MPI::gather etc, since it works with
> all processes. Instead you can define a communicator for
> that subset of processes, and "how many processes",
> "current process number" etc is defined within that context.
> 
> > But we do need something extra (the communicator) which can be shared
> > between objects that should share communication pattern.
> 
> A "communicator" is an MPI term, and doesn't really involve
> anything I would associate with a "communication pattern",
> it's just a named group of processes.
> 
> > We'll get to this in time, unless someone has time to think about it
> > now, and provide the implementation... ;-)
> 
> Since current MPI ~ future Communicator the change
> should probably be fairly easy to do later on.
> 
> Martin


Yes, and then we can decide whether we want the Communicator to be just a
handle to be passed to functions in dolfin::MPI or whether it should also
provide functions itself. I imagine we might do something like this:

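class Communicator;

class MPI
{
public:
  // Operates on the given communicator, or on the default one (all processes)
  static void foo(args, const Communicator& communicator=default_communicator);
};

class Communicator
{
public:
  // Convenience wrapper: forwards to the static function with this communicator
  void foo(args) { MPI::foo(args, *this); }
};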
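
To make the idea concrete, here is a rough, self-contained sketch of how the
two classes might fit together. It is only an illustration under loud
assumptions: it makes no real MPI calls, a communicator is modelled as a plain
list of global process ranks, the default communicator pretends there are four
processes, and the names sketch::MPI, default_communicator() and local_rank()
are hypothetical, not existing DOLFIN API. The default communicator is created
the first time it is asked for, as Martin suggested.

```cpp
// Hypothetical sketch only: no real MPI calls are made. A communicator is
// modelled as a list of global process ranks so the example runs anywhere.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch
{

class Communicator;

class MPI
{
public:
  // Communicator containing all processes (stand-in for MPI_COMM_WORLD),
  // created lazily on first use.
  static const Communicator& default_communicator();

  // Static functions take the communicator as an optional last argument.
  static std::size_t num_processes(
    const Communicator& comm = default_communicator());

  // Rank of a global process within comm, or -1 if it is not a member.
  static int local_rank(int global_rank,
                        const Communicator& comm = default_communicator());
};

class Communicator
{
public:
  explicit Communicator(std::vector<int> ranks) : _ranks(std::move(ranks))
  {
    assert(!_ranks.empty()); // a communicator needs at least one process
  }

  // Convenience wrappers forwarding to the static functions in MPI.
  std::size_t num_processes() const { return MPI::num_processes(*this); }
  int local_rank(int global_rank) const
  { return MPI::local_rank(global_rank, *this); }

  const std::vector<int>& ranks() const { return _ranks; }

private:
  std::vector<int> _ranks; // global ranks of the member processes
};

inline const Communicator& MPI::default_communicator()
{
  // Assumption for the sketch: pretend the program runs on 4 processes.
  static const Communicator world(std::vector<int>{0, 1, 2, 3});
  return world;
}

inline std::size_t MPI::num_processes(const Communicator& comm)
{
  return comm.ranks().size();
}

inline int MPI::local_rank(int global_rank, const Communicator& comm)
{
  const std::vector<int>& r = comm.ranks();
  const auto it = std::find(r.begin(), r.end(), global_rank);
  return it == r.end() ? -1 : static_cast<int>(it - r.begin());
}

} // namespace sketch
```

With this layout, sketch::MPI::num_processes() keeps working unchanged for
code that uses all processes, while a subset such as
sketch::Communicator sub(std::vector<int>{2, 3}) answers "how many processes"
and "current process number" within its own group. In a real implementation
the rank list would presumably be replaced by an MPI_Comm handle, with
MPI_Comm_size and MPI_Comm_rank doing the work.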

-- 
Anders
