AMG setup is fairly expensive, roughly on the order of one solve. What you have looks OK here. If you have a very hard problem you will want to coarsen more slowly (higher threshold), which will increase setup costs. The setup costs are symbolic (graph work) and numeric (like a factorization). As you noticed:
MatPtAPNumeric 4 1.0 1.3035e-01 1.0 4.64e+07 1.0 0.0e+00 0.0e+00 0.0e+00 3 5 0 0 0 3 5 0 0 0 356

This is the numeric part. The symbolic part will get amortized if the grid does not change, and the numeric part will get amortized if the operator does not change (linear).

Mark

On Thu, Jun 4, 2015 at 5:47 PM, Justin Chang <[email protected]> wrote:

> Thank you Matt and Mark for the clarification. Matt, if you recall our
> discussion about calculating the arithmetic intensity from the earlier
> threads, it seems GAMG now has a myriad of all these additional vector and
> matrix operations that were not present in the CG/Jacobi case. Running with
> the command line options you and Mark suggested, I now have these
> additional operations to deal with:
>
> VecMDot
> VecAXPBYCZ
> VecMAXPY
> VecSetRandom
> VecNormalize
> MatMultAdd
> MatMultTranspose
> MatSolve
> MatConvert
> MatScale
> MatResidual
> MatCoarsen
> MatAXPY
> MatMatMult
> MatMatMultSym
> MatMatMultNum
> MatPtAP
> MatPtAPSymbolic
> MatPtAPNumeric
> MatTrnMatMult
> MatTrnMatMultSym
> MatTrnMatMultNum
> MatGetSymTrans
> KSPGMRESOrthog
> PCGAMGGraph_AGG
> PCGAMGCoarse_AGG
> PCGAMGProl_AGG
> PCGAMGPOpt_AGG
> GAMG: createProl and all of its associated events.
> GAMG: partLevel
> PCSetUpOnBlocks
>
> Attached is the output from -log_summary showing the exact counts for the
> case I am running.
>
> I have the following questions:
>
> 1) For the Vec operations VecMDot and VecMAXPY, it seems the estimation of
> total bytes transferred (TBT) relies on knowing how many vectors there are.
> Is there a way to figure this out? Or at least with gamg what would it be,
> three vectors?
>
> 2) It seems there are a lot of matrix manipulations and multiplications.
> Is it safe to say that the size and number of non zeroes is the same? Or
> will it change?
>
> 3) If I follow the TBT tabulation as in that paper you pointed me to,
> would MatMultTranspose follow the same formula if the Jacobian is symmetric?
>
> 4) How do I calculate anything that requires the multiplication of at
> least two matrices?
>
> 5) More importantly, are any of the above calculations necessary? Because
> log_summary seems to indicate that MatMult() has the greatest amount of
> workload and number of calls. My only hesitation is how much traffic
> MatMatMults may take (assuming I go off of the same assumptions as in that
> paper).
>
> 6) And/or, are there any other functions that I missed that might be
> important to calculate as well?
>
> Thanks,
> Justin
>
> On Thu, Jun 4, 2015 at 11:33 AM, Mark Adams <[email protected]> wrote:
>
>> On Thu, Jun 4, 2015 at 12:29 PM, Matthew Knepley <[email protected]>
>> wrote:
>>
>>> On Thu, Jun 4, 2015 at 10:31 AM, Justin Chang <[email protected]>
>>> wrote:
>>>
>>>> Yeah I saw his recommendation and am trying it out. But I am not sure
>>>> what most of those parameters mean. For instance:
>>>>
>>>> 1) What does -pc_gamg_agg_nsmooths refer to?
>>>>
>>>
>>> This is always 1 (it's the definition of smoothed aggregation). Mark
>>> allows 0 to support unsmoothed aggregation, which may be
>>> better for easy problems on extremely large machines.
>>>
>>>
>>>> 2) Does increase in the threshold of -pc_gamg_threshold translate to
>>>> making the coarsening faster?
>>>>
>>>
>>> Yes, I believe so (easy to check).
>>>
>>
>> Other way around.
>>
>
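
As a sanity check on the MatPtAPNumeric row at the top of this message, here is a minimal Python sketch that pulls the call count, time, and flops out of the row and recomputes the flop rate. It assumes the usual -log_summary event layout (name, count, ratio, time, ratio, flops, ratio, ..., Mflop/s last) and ratios of 1.0, as in this row. The function names are made up for illustration, and amortized_time is only a toy model of the amortization point with made-up inputs, not anything PETSc reports.

```python
# Row copied verbatim from the -log_summary output quoted in this message.
ROW = ("MatPtAPNumeric 4 1.0 1.3035e-01 1.0 4.64e+07 1.0 "
       "0.0e+00 0.0e+00 0.0e+00 3 5 0 0 0 3 5 0 0 0 356")


def parse_event_row(row):
    """Pick a few fields out of one event row.

    Assumed column order: name, count, ratio, time, ratio, flops, ratio,
    messages, avg length, reductions, global %T %F %M %L %R,
    stage %T %F %M %L %R, Mflop/s.
    """
    tok = row.split()
    return {
        "event": tok[0],
        "count": int(tok[1]),
        "time_s": float(tok[3]),
        "flops": float(tok[5]),
        "pct_time": int(tok[10]),
        "pct_flops": int(tok[11]),
        "mflops_reported": float(tok[-1]),
    }


def mflops(ev):
    """Flop rate in Mflop/s recomputed from the flops and time columns."""
    return ev["flops"] / ev["time_s"] / 1.0e6


def time_per_call(ev):
    """Average time per call of this event, in seconds."""
    return ev["time_s"] / ev["count"]


def amortized_time(setup_s, solve_s, n_solves):
    """Toy model (an assumption): one setup reused across n_solves solves."""
    return solve_s + setup_s / n_solves


ev = parse_event_row(ROW)
print(ev["event"], "calls:", ev["count"])
print("time per call (s):", time_per_call(ev))
print("recomputed Mflop/s:", round(mflops(ev)),
      "reported:", ev["mflops_reported"])
```

The recomputed rate rounds to the 356 in the last column, so the row is self-consistent. To play with the amortization argument, amortized_time(setup_s, solve_s, n_solves) shows how the per-solve share of a reused setup shrinks as n_solves grows; the inputs there are whatever you measure, none come from this log.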
