Mark,

   I think there is a misunderstanding here. With GASM an individual block 
problem is __solved__ (via a parallel KSP) in parallel by several processes; 
with ASM each block is "owned" by and solved on a single process. 

   With both, the "block" can come from any unknowns on any processes. You can 
have, for example, a block that comes from a region snaking across several 
processes if you like (or if it makes sense due to coupling in the matrix). 

   By default, if you use ASM it will create one non-overlapping block defined 
by all unknowns owned by a single process and then extend it by "one level" 
(defined by the nonzero structure of the matrix) to get overlap. If you use 
multiple blocks per process, it defines the non-overlapping blocks within a 
single process's unknowns and extends each of them to have overlap (again by 
the nonzero structure of the matrix). The default is simple because the user 
need only indicate the number of blocks per process. The drawback is of course 
that it depends on the process layout, number of processes, etc., and does not 
take into account particular "coupling information" that the user may know 
about their problem.
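
   As a rough illustration of the default described above, here is a minimal 
Python sketch, not PETSc code. It splits one process's owned rows into blocks 
and extends each block by one level using the nonzero pattern. The helper 
names, the contiguous equal-size split, and the tridiagonal test matrix are all 
assumptions made for the example; PETSc may choose the blocks differently.

```python
def default_blocks(start, end, nblocks):
    """Split the owned range [start, end) into nblocks contiguous,
    non-overlapping blocks of nearly equal size (an assumed layout)."""
    n = end - start
    blocks = []
    lo = start
    for b in range(nblocks):
        size = n // nblocks + (1 if b < n % nblocks else 0)
        blocks.append(list(range(lo, lo + size)))
        lo += size
    return blocks


def extend_one_level(block, row_cols):
    """Add every column coupled to a row of the block (one level of overlap).

    row_cols maps a row index to the column indices of its nonzeros.
    """
    extended = set(block)
    for row in block:
        extended.update(row_cols[row])
    return sorted(extended)


# Hypothetical matrix: 1-D Laplacian (tridiagonal pattern) on 8 unknowns.
N = 8
row_cols = {i: [j for j in (i - 1, i, i + 1) if 0 <= j < N] for i in range(N)}

# Pretend this process owns rows 4..7 and two blocks per process are requested.
blocks = default_blocks(4, 8, 2)
overlapping = [extend_one_level(b, row_cols) for b in blocks]
```

   Here the first block [4, 5] grows to [3, 4, 5, 6]: the overlap reaches row 3, 
which is owned by a neighboring process, even though the non-overlapping block 
was defined entirely within this process's unknowns.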

  If the user wishes to define the blocks themselves, that is also possible 
with PCASMSetLocalSubdomains(). Each process provides 1 or more index sets 
for the subdomains it will solve on. Note that the index sets can contain any 
unknowns in the entire problem, so the blocks do not have to "line up" with the 
parallel decomposition at all. Of course, how to determine and provide good 
subdomains may not always be clear.
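
   To make the "do not have to line up" point concrete, here is a small Python 
sketch (again not PETSc code) that reports which ranks own the unknowns in a 
user-defined block. The ownership layout, the block contents, and the helper 
name ranks_touched are hypothetical.

```python
from bisect import bisect_right


def ranks_touched(block, ownership):
    """Return the sorted ranks that own at least one unknown of the block.

    ownership[r] is the first global index owned by rank r, and
    ownership[-1] is the global problem size.
    """
    return sorted({bisect_right(ownership, i) - 1 for i in block})


# Hypothetical layout: 12 unknowns, three ranks owning 0-3, 4-7 and 8-11.
ownership = [0, 4, 8, 12]

# Hypothetical user-defined blocks given as global indices.
user_blocks = {
    "aligned": [4, 5, 6, 7],       # exactly the unknowns of rank 1
    "snaking": [2, 3, 4, 9, 10],   # unknowns drawn from ranks 0, 1 and 2
}
touched = {name: ranks_touched(idx, ownership)
           for name, idx in user_blocks.items()}
```

   With ASM the "snaking" block would still be solved on one process, so its 
rows have to be gathered from all three ranks when the subdomain matrices are 
built.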

  I see in GAMG you have PCGAMGSetUseASMAggs, which sadly does not have an 
explanation in the users manual and sadly does not have a matching options 
database name. It is -pc_gamg_use_agg_gasm; following the rule of dropping the 
word "Set", using all lower case, and putting _ between words, the option 
should be -pc_gamg_use_asm_aggs. 
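
   The naming rule can be written down as a few lines of Python. This is only 
a sketch of the convention as stated here: the function name option_name is 
made up, and the setter name is split into words by hand because acronyms such 
as PCGAMG cannot be split mechanically.

```python
def option_name(words):
    """Build an options database name from the words of a setter name:
    drop "Set", lower-case everything, join with underscores."""
    kept = [w.lower() for w in words if w.lower() != "set"]
    return "-" + "_".join(kept)


# PCGAMGSetUseASMAggs, split into its words by hand.
option = option_name(["PC", "GAMG", "Set", "Use", "ASM", "Aggs"])
```

   For this setter the sketch yields -pc_gamg_use_asm_aggs.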
 

   In addition to this one, you could also have an option that uses the aggs 
but uses PCASM to manage the solves instead of GASM; it would likely be less 
buggy and more efficient.

  Please tell me exactly what example you tried to run with what options and I 
will debug it. Note that ALL functionality that is included in PETSc should 
have tests that test that functionality; then we will find out immediately when 
it is broken instead of two years later when it is much harder to debug. If 
this -pc_gamg_use_agg_gasm had had a test we wouldn't be in this mess now. 
(Jed's damn code reviews sure don't pick up this stuff).

   Barry






> On Jun 22, 2016, at 5:20 PM, Mark Adams <[email protected]> wrote:
> 
> 
> 
> On Wed, Jun 22, 2016 at 8:06 PM, Barry Smith <[email protected]> wrote:
> 
>    I suggest focusing on asm.
> 
> OK, I will switch gasm to asm, this does not work anyway.
>  
> Having blocks that span multiple processes seems like overkill for a 
> smoother?
> 
> No, because it is a pain to have the math convolved with the parallel 
> decompositions strategy (ie, I can't tell an application how to partition 
> their problem). If an aggregate spans processor boundaries, which is fine and 
> needed, and let's say we have a pretty uniform problem, then if the block 
> gets split up, H is small in part of the domain and convergence could suffer 
> along processor boundaries.  And having the math change as the parallel 
> decomposition changes is annoying. 
>  
> (Major league overkill) in fact doesn't one want multiple blocks per process, 
> ie. pretty small blocks.
> 
> No, it is just doing what would be done in serial.  If the cost of moving the 
> data across the processor is a problem then that is a tradeoff to consider.
> 
> And I think you are misunderstanding me.  There are lots of blocks per 
> process (the aggregates are say 3^D in size).  And many of the 
> aggregates/blocks along the processor boundary will be split between 
> processors, resulting in small blocks and a weak ASM PC on processor boundaries.
> 
> I can understand ASM not being general and not letting blocks span processor 
> boundaries, but I don't think the extra matrix communication costs are a big 
> deal (done just once) and the vector communication costs are not bad, it 
> probably does not include (too many) new processors to communicate with.
> 
> 
>    Barry
> 
> > On Jun 22, 2016, at 7:51 AM, Mark Adams <[email protected]> wrote:
> >
> > I'm trying to get block smoothers to work for gamg.  We (Garth) tried this 
> > and got this error:
> >
> >
> >  - Another option is use '-pc_gamg_use_agg_gasm true' and use 
> > '-mg_levels_pc_type gasm'.
> >
> >
> > Running in parallel, I get
> >
> >      ** Max-trans not allowed because matrix is distributed
> >  ----
> >
> > First, what is the difference between asm and gasm?
> >
> > Second, I need to fix this to get block smoothers. This used to work.  Did 
> > we lose the capability to have blocks that span processor subdomains?
> >
> > gamg only aggregates across processor subdomains within one layer, so maybe 
> > I could use one layer of overlap in some way?
> >
> > Thanks,
> > Mark
> >
> 
> 
