On Fri, Aug 06, 2010 at 08:13:13PM +0100, Garth N. Wells wrote:
> On Fri, 2010-08-06 at 21:06 +0200, Anders Logg wrote:
> > On Fri, Aug 06, 2010 at 07:55:54PM +0100, Garth N. Wells wrote:
> > > On Fri, 2010-08-06 at 20:53 +0200, Anders Logg wrote:
> > > > On Fri, Aug 06, 2010 at 07:51:18PM +0100, Garth N. Wells wrote:
> > > > > On Fri, 2010-08-06 at 20:36 +0200, Anders Logg wrote:
> > > > > > On Fri, Aug 06, 2010 at 04:55:44PM +0100, Garth N. Wells wrote:
> > > > > > > On Fri, 2010-08-06 at 08:42 -0700, Johan Hake wrote:
> > > > > > > > On Friday August 6 2010 08:16:26 you wrote:
> > > > > > > > > ------------------------------------------------------------
> > > > > > > > > revno: 4896
> > > > > > > > > committer: Garth N. Wells <gn...@cam.ac.uk>
> > > > > > > > > branch nick: dolfin-all
> > > > > > > > > timestamp: Fri 2010-08-06 16:13:29 +0100
> > > > > > > > > message:
> > > > > > > > >   Add simple Stokes solver for parallel testing.
> > > > > > > > >
> > > > > > > > >   Other Stokes demos don't run in parallel because
> > > > > > > > >   MeshFunction io is not supported in parallel.
> > > > > > > >
> > > > > > > > Does anyone have an overview of what is needed for this to be
> > > > > > > > fixed? I couldn't find a blueprint on it.
> > > > > > > >
> > > > > > >
> > > > > > > Here it is:
> > > > > > >
> > > > > > > https://blueprints.launchpad.net/dolfin/+spec/parallel-io
> > > > > > >
> > > > > > > > I am interested in getting this fixed :)
> > > > > > > >
> > > > > > >
> > > > > > > Me too! We need to look at all the io since much of it is broken
> > > > > > > in parallel.
> > > > > > >
> > > > > > > We need to settle on how to handle XML data. I favour (and I know
> > > > > > > Niclas Janson does too) the VTK approach in which we have a
> > > > > > > 'master file' that points to other XML files which contain
> > > > > > > portions of the vector/mesh, etc. Process zero can read the
> > > > > > > 'master file' and then instruct the other processes on which
> > > > > > > file(s) they should read in.
> > > > > >
> > > > > > This only works if the data is already partitioned. Most of our
> > > > > > demos assume that we have the mesh in one single file which is
> > > > > > then partitioned on the fly.
> > > > > >
> > > > >
> > > > > The approach does work for data which is not partitioned. Just like
> > > > > with VTK, one can read the 'master file' or the individual files.
> > > > >
> > > > > > The initial plan was to support two different ways of reading data
> > > > > > in parallel:
> > > > > >
> > > > > > 1. One file and automatic partitioning
> > > > > >
> > > > > > DOLFIN gets one file "mesh.xml", each process reads one part of it
> > > > > > (just skipping other parts of the file), then the mesh is
> > > > > > partitioned and redistributed.
> > > > > >
> > > > > > 2. Several files and no partitioning
> > > > > >
> > > > > > DOLFIN gets multiple files and each process reads one part. In this
> > > > > > case, the mesh and all associated data is already partitioned. This
> > > > > > should be very easy to fix since everything that is needed is
> > > > > > already in place; we just need to fix the logic. In particular, the
> > > > > > data section of each local mesh contains all auxiliary parallel
> > > > > > data.
> > > > > >
> > > > > > This can be handled in two different ways. Either a user specifies
> > > > > > the name of the file as "mesh*.xml", in which case DOLFIN appends
> > > > > > say
> > > > > >
> > > > > >   "_%d" % MPI::process_number()
> > > > > >
> > > > > > on each local process.
> > > > > >
> > > > > > The other way is to have a master file which lists all the other
> > > > > > files. In this case, I don't see a need for process 0 to take any
> > > > > > kind of responsibility for communicating file names. It would work
> > > > > > fine for each process to read the master file and then check which
> > > > > > file it should use. Each process could also check that the total
> > > > > > number of processes matches the number of partitions in the file.
> > > > > > We could let process 0 handle the parsing of the master file and
> > > > > > then communicate the file names but maybe that is an extra
> > > > > > complication.
> > > > > >
> > > > >
> > > > > This fails when the number of files differs from the number of
> > > > > processes. It's very important to support m files on n processes.
> > > > > We've discussed this at length before.
> > > >
> > > > I don't remember. Can you remind me of what the reasons are?
> > > >
> > >
> > > I perform a simulation using m processes, and write the result to m
> > > files. Later I want to use the result in another computation using
> > > n processors.
> >
> > I assume you did your first simulation (with m processors) starting
> > from one big file?
> >
>
> What do you mean? The first simulation might not read in any file.

I assumed you had an input for your mesh with some nontrivial
geometry. The only other possibilities I see are either that you use
one of the builtin meshes or that you have some custom program that
generates your mesh. Or is it that you have done mesh refinement?

> > Can't you just restart from that file when you later want to run with
> > n processors? It would not be much extra work, and maybe it would even
> > be faster considering all the extra communication.
> >
>
> The communication cost is negligible when reading a vector just once and
> distributing.
>
> It will be very easy to read m files on n processes, so I don't get why
> we would wish to prevent it?

Yes, it would be easy but we would need to redistribute it before the
call to ParMETIS.

Another "problem" is if m > n. Then we would have to decide which
processes read multiple files and how many.

> Also, we can't rely on the data that we read in being suitably
> partitioned.

In my thinking, that is one of the main points of being able to read
in multiple files: you have already done the work of the partitioning,
have a ready-made partition, and just want to run the simulation
again, or perhaps another simulation on the same mesh. In other words,
to allow computing in parallel without needing to go through the
partitioning step.

--
Anders
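
For the m > n case mentioned above, one possible policy (only a sketch, not anything that exists in DOLFIN) is to give each process a contiguous block of file indices, with the first few processes taking one extra file when m is not divisible by n. The names files_for_rank and local_filename below are hypothetical helpers invented for this illustration; local_filename simply mimics the "mesh*.xml" plus "_%d" naming idea from the thread.

```python
def files_for_rank(num_files, num_procs, rank):
    """Return the file indices that process `rank` would read.

    Hypothetical helper: contiguous block assignment of m files to
    n processes. This is one possible policy, not DOLFIN behaviour.
    """
    if num_procs <= 0 or not 0 <= rank < num_procs:
        raise ValueError("invalid rank or process count")
    base, extra = divmod(num_files, num_procs)
    # The first `extra` ranks read one file more than the others.
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return list(range(start, start + count))


def local_filename(pattern, index):
    """Expand "mesh*.xml" to "mesh_<index>.xml" (assumed naming scheme)."""
    return pattern.replace("*", "_%d" % index, 1)


if __name__ == "__main__":
    # Example: m = 5 files read by n = 2 processes.
    for rank in range(2):
        names = [local_filename("mesh*.xml", i)
                 for i in files_for_rank(5, 2, rank)]
        print(rank, names)
```

With m < n the same rule leaves the trailing processes with an empty list, so in either case the data would still need to be redistributed before partitioning, as noted above.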
_______________________________________________
Mailing list: https://launchpad.net/~dolfin
Post to     : dolfin@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dolfin
More help   : https://help.launchpad.net/ListHelp