There are some things to think about here:

- Telescope can have some overheads wrt duplicating operators, which maybe you don't want for GAMG
- The current way we use it with MG is to have a separate PCMG for each communicator (e.g. for a 5-level method with one agglomeration step, 3 levels of GMG on 128 ranks, telescope on the coarse grid, 3 levels of GMG on 8 ranks). I'm not sure if this is also what you'd want for GAMG.
- Logging is problematic when working with subcommunicators

On a related note, a feature that came up in a session today at PASC17 suggests a proposal that might be helpful: an additional mode for PetscSubcomm (which is used by Telescope). This would accept a size K (usually corresponding to a node or rack or some other topological unit of a cluster) and do "blockwise interlaced" agglomeration, splitting the communicator into chunks of size K and choosing an equally spaced subset of these chunks to be in the sub-communicator.

On Tue, Jun 27, 2017 at 2:35 PM, Matthew Knepley <[email protected]> wrote:

> On Tue, Jun 27, 2017 at 6:36 AM, Mark Adams <[email protected]> wrote:
>
>> In talking with Garth, this will not work.
>>
>> I/we am now thinking that we should replace the MG object with Telescope.
>> Telescope seems to be designed to be a superset of MG. Telescope does the
>> processor reduction, and GAMG does as well, so we would have to reconcile
>> this. Does this sound like a good idea? Am I missing anything important?
>>
>
> I don't think "replace" is the right word. Telescope only does process
> reduction. It does not do control flow for solvers,
> or restriction/prolongation. You can see telescope interacting with MG here
>
>   https://arxiv.org/abs/1604.07163
>
> I think more of this should be "default", in that the options are turned
> on if you are running GMG on a large number of procs.
>
> I also think GAMG should reuse the telescope code for doing reduction, but
> I am not sure how hard this is. Mark?
>
>   Thanks,
>
>      Matt
>
>
>> Mark
>>
>> On Tue, Jun 27, 2017 at 4:48 AM, Mark Adams <[email protected]> wrote:
>>
>>> Parallel coarse grid solvers are a bit broken at large scale where you
>>> don't want to use all processors on the coarse grid. The ideal thing might
>>> be to create a sub communicator, but it's not clear how to integrate this
>>> in (eg, check if the sub communicator exists before calling the coarse grid
>>> solver and convert if necessary). A bit messy. It would be nice if a
>>> parallel direct solver would not redistribute the matrix, but then it would
>>> be asking too much for it to reorder also, so we could have a crappy
>>> ordering. So maybe the first option would be best long term.
>>>
>>> I see we have MUMPS and PaStiX. Do either of these not redistribute if
>>> asked?
>>>
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> http://www.caam.rice.edu/~mk51/
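
For concreteness, the "blockwise interlaced" agglomeration proposed at the top of this message could be sketched roughly as follows. This is only an illustration of the rank selection, not PetscSubcomm code: the function name `blockwise_interlaced_ranks` and its parameters are hypothetical, and the sketch assumes the communicator size is a multiple of K and that "equally spaced" means taking every (n_chunks / n_selected)-th chunk, starting from the first.

```python
def blockwise_interlaced_ranks(comm_size, chunk_size, n_selected_chunks):
    """Return the parent-communicator ranks placed in the sub-communicator.

    Hypothetical helper, not part of PETSc. The parent communicator is
    split into contiguous chunks of `chunk_size` ranks (e.g. one chunk
    per node), and `n_selected_chunks` equally spaced chunks are kept.
    """
    # Assumption of this sketch: chunks tile the communicator exactly.
    if chunk_size <= 0 or comm_size % chunk_size != 0:
        raise ValueError("comm_size must be a positive multiple of chunk_size")
    n_chunks = comm_size // chunk_size
    if not 1 <= n_selected_chunks <= n_chunks:
        raise ValueError("n_selected_chunks must be between 1 and the chunk count")

    # Equally spaced chunk indices, always including chunk 0.
    selected = [(j * n_chunks) // n_selected_chunks
                for j in range(n_selected_chunks)]

    # Every rank of a selected chunk joins the sub-communicator.
    ranks = []
    for chunk in selected:
        ranks.extend(range(chunk * chunk_size, (chunk + 1) * chunk_size))
    return ranks


if __name__ == "__main__":
    # 16 ranks, chunks of 4 (say, 4 ranks per node), keep 2 of the 4 chunks.
    print(blockwise_interlaced_ranks(16, 4, 2))
```

With 16 ranks, K = 4 and two chunks kept, this selects chunks 0 and 2, i.e. ranks 0-3 and 8-11. In an actual implementation the membership test (rank in the returned list or not) would presumably become the color passed to something like MPI_Comm_split, but how that would fit into PetscSubcomm is left open here.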
