On Fri, Dec 19, 2008 at 09:23:43AM +0100, Martin Sandve Alnæs wrote:
> On Fri, Dec 19, 2008 at 9:12 AM, Anders Logg <[email protected]> wrote:
> > On Fri, Dec 19, 2008 at 09:06:43AM +0100, Martin Sandve Alnæs wrote:
> >> On Fri, Dec 19, 2008 at 8:54 AM, Anders Logg <[email protected]> wrote:
> >> > On Fri, Dec 19, 2008 at 12:09:50PM +0900, Evan Lezar wrote:
> >> >>
> >> >> On Thu, Dec 18, 2008 at 4:25 AM, Johan Hake <[email protected]> wrote:
> >> >>
> >> >> On Wednesday 17 December 2008 20:19:52 Anders Logg wrote:
> >> >> > On Wed, Dec 17, 2008 at 08:13:03PM +0100, Johan Hake wrote:
> >> >> > > On Wednesday 17 December 2008 19:20:11 Anders Logg wrote:
> >> >> > > > Ola and I have now finished up the first round of getting DOLFIN
> >> >> > > > to run in parallel. In short, we can now parse meshes from file
> >> >> > > > in parallel and partition meshes in parallel (using ParMETIS).
> >> >> > > >
> >> >> > > > We reused some good ideas that Niclas Jansson had implemented in
> >> >> > > > PXMLMesh before, but have also made some significant changes as
> >> >> > > > follows:
> >> >> > > >
> >> >> > > > 1. The XML reader does not handle any partitioning.
> >> >> > > >
> >> >> > > > 2. The XML reader just reads in a chunk of the mesh data on each
> >> >> > > > processor (in parallel) and stores that into a LocalMeshData
> >> >> > > > object (one for each processor). The data is just partitioned in
> >> >> > > > blocks so the vertices and cells may be completely unrelated.
> >> >> > > >
> >> >> > > > 3. The partitioning takes place in MeshPartitioning::partition,
> >> >> > > > which gets a LocalMeshData object on each processor. It then
> >> >> > > > calls ParMETIS to compute a partition (in parallel) and then
> >> >> > > > redistributes the data accordingly. Finally, a mesh is built on
> >> >> > > > each processor using the local data.
> >> >> > > >
> >> >> > > > 4. All direct MPI calls (except one which should be removed)
> >> >> > > > have been removed from the code. Instead, we mostly rely on
> >> >> > > > dolfin::MPI::distribute which handles most cases of parallel
> >> >> > > > communication and works with STL data structures.
> >> >> > > >
> >> >> > > > 5. There is just one ParMETIS call (no initial geometric
> >> >> > > > partitioning). It seemed like an unnecessary step, or are there
> >> >> > > > good reasons to perform the partitioning in two steps?
> >> >> > > >
> >> >> > > > For testing, go to sandbox/passembly, build and then run
> >> >> > > >
> >> >> > > > mpirun -n 4 ./demo
> >> >> > > > ./plot_partitions 4
> >> >> > >
> >> >> > > Looks beautiful!
> >> >> > >
> >> >> > > I threw a 3D mesh of 160K vertices onto it, and it was partitioned
> >> >> > > nicely in some 10 s, on my 2 core laptop.
> >> >> > >
> >> >> > > Johan
> >> >> >
> >> >> > Nice, in particular since we haven't run any 3D test cases
> >> >> > ourselves, just a tiny mesh of the unit square... :-)
> >> >>
> >> >> Yes I thought so too ;)
> >> >>
> >> >> Johan
> >> >>
> >> >> While we are on the topic of parallelisation - I have some comments.
> >> >>
> >> >> A couple of months ago I was trying to parallelise some of my own
> >> >> code and ran into some problems with the communicators used for
> >> >> PETSc and SLEPc problems - I sort of got them to work, but never
> >> >> committed my changes because they were a little ungainly and I felt
> >> >> I needed to spend some more time on them.
> >> >>
> >> >> One thing that I did notice is that the user does not have much
> >> >> control over which parts of the process run in parallel - with the
> >> >> number of MPI processes deciding whether or not a parallel
> >> >> implementation should be used. In my case what I wanted to do was
> >> >> perform a frequency sweep and for each frequency point perform the
> >> >> same calculation (an eigenvalue problem) for that frequency. My
> >> >> intention was to distribute the frequency sweep over a number of
> >> >> processors and then handle the assembly and solution of each system
> >> >> separately. This is not possible as the code is now since the
> >> >> assembly and eigenvalue solvers all try to run in parallel as soon
> >> >> as the application is run as an MPI program.
> >> >>
> >> >> I know that this is not very descriptive, but does anyone else have
> >> >> thoughts on the matter? I will put together something a little more
> >> >> concrete as soon as I have a chance (I am travelling around quite a
> >> >> bit at the moment so it is difficult for me to focus).
> >> >
> >> > We're just getting started. Everything needs to be configurable. At
> >> > the moment, we assume that all parallel computation should be split
> >> > in MPI::num_processes().
> >> >
> >> > We can either add optional parameters to all parallel calls or add
> >> > global options to control how many processes are used. But I suggest
> >>
> >> Global options wouldn't solve anything (the point here is a variable
> >> number of processes during the application lifetime), and a single
> >> parameter is probably enough: the communicator, perhaps permanently
> >> attached to each FunctionSpace (a dofmap must be distributed over the
> >> processes in a communicator to be usable). Calls to
> >> MPI::num_processes() would be replaced by
> >> V.communicator().num_processes() or something. I get the waiting
> >> argument though. When stuff works, a search for MPI::* should reveal
> >> the places where "V.communicator()." should replace "MPI::", likely
> >> covering most places that need to be updated.
> >>
> >> Martin
> >
> > Sounds good.
> >
> > Speaking of the communicator, I'd like to add a default communicator
> > to the MPI class, accessed by MPI::communicator(), but couldn't
> > figure out how to do this (I suspect it's trivial).
> >
> > Then we could avoid including mpi.h in MeshPartitioning.cpp (look for
> > the first two FIXMEs in that file). If anyone knows how to do this,
> > let me know.
>
> You mean something like
>
> class MPI {
> public:
>
>   static dolfin::Communicator & communicator()
>   { return *mainCommunicator; }
>
> private:
>   static shared_ptr<Communicator> mainCommunicator;
> };
>
> ?
>
> This can then be initialized the first time MPI::communicator() is
> called, to be a communicator including all MPI processes.
>
> Epetra has a "serial communicator", which is probably needed as well
> when MPI is not in use, instead of adding #ifdefs in all the places
> MPI is used.
>
> Martin
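As an aside on Evan's use case: carving the processes into independent
groups is plain MPI, independent of anything we do in DOLFIN. A rough
sketch, with a made-up group assignment purely for illustration (none of
this is existing DOLFIN code), could look like this:

#include <mpi.h>
#include <cstdio>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);

  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Hypothetical frequency sweep: give each process (or small group of
  // processes) its own sub-communicator and let it handle a subset of
  // the frequency points independently of the others.
  const int num_groups = size;          // here: one group per process
  const int color = rank % num_groups;  // group this process belongs to

  MPI_Comm group_comm;
  MPI_Comm_split(MPI_COMM_WORLD, color, rank, &group_comm);

  int group_rank = 0, group_size = 1;
  MPI_Comm_rank(group_comm, &group_rank);
  MPI_Comm_size(group_comm, &group_size);
  std::printf("world rank %d -> group %d (rank %d of %d)\n",
              rank, color, group_rank, group_size);

  // Assembly and the eigenvalue solve for the frequencies owned by this
  // group would then use group_comm instead of MPI_COMM_WORLD, e.g. by
  // handing it to PETSc/SLEPc (and, eventually, to DOLFIN).

  MPI_Comm_free(&group_comm);
  MPI_Finalize();
  return 0;
}

The missing piece on our side is a way to hand such a communicator to the
assembler and the solvers instead of having them assume MPI_COMM_WORLD,
which is what the communicator-on-FunctionSpace idea above is getting at.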
Yes, something like that. But what would the Communicator class look like?
If you know how to do this, feel free to just add it.

--
Anders
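For what it's worth, a minimal sketch of what such a Communicator class
might look like, only under the assumptions discussed above (a lazily
created world communicator plus a serial fallback when MPI is not
enabled). The names Communicator, num_processes, process_number,
main_communicator and the HAS_MPI macro are placeholders here, not
existing DOLFIN API:

// Sketch only -- not existing DOLFIN code.
#include <boost/shared_ptr.hpp>

#ifdef HAS_MPI
#include <mpi.h>
#endif

namespace dolfin
{

  // Thin wrapper around MPI_Comm so that files like MeshPartitioning.cpp
  // do not need to include mpi.h directly.
  class Communicator
  {
  public:

#ifdef HAS_MPI
    explicit Communicator(MPI_Comm comm = MPI_COMM_WORLD) : _comm(comm) {}

    // Number of processes in this communicator
    unsigned int num_processes() const
    { int size = 1; MPI_Comm_size(_comm, &size); return size; }

    // Rank of this process within the communicator
    unsigned int process_number() const
    { int rank = 0; MPI_Comm_rank(_comm, &rank); return rank; }

  private:
    MPI_Comm _comm;
#else
    // Serial fallback (cf. Epetra's serial communicator): one process, rank 0
    unsigned int num_processes() const { return 1; }
    unsigned int process_number() const { return 0; }
#endif
  };

  class MPI
  {
  public:

    // Default communicator spanning all processes, created on first use
    static Communicator& communicator()
    {
      if (!main_communicator)
        main_communicator.reset(new Communicator());
      return *main_communicator;
    }

  private:
    static boost::shared_ptr<Communicator> main_communicator;
  };

}

// Definition of the static member (would live in the .cpp file):
boost::shared_ptr<dolfin::Communicator> dolfin::MPI::main_communicator;

A FunctionSpace (or anything else) could then hold a Communicator,
defaulting to MPI::communicator(), which would make the
V.communicator().num_processes() pattern Martin mentions possible.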
