Dear Kev,

To give an additional perspective on this topic: The assembly functions of deal.II (when using FEValues and the approach shown in step-1 to step-30 at least) are not optimized for performance also when compiled in release mode, but rather for readability. On an average case with linear or quadratic polynomials, I would expect a factor of 10 in performance gain by using one of the following approaches:

- You could express the three nested loop of a typical matrix assembly routine as a dense matrix-matrix multiplication. This requires you to rearrange some computations, especially when you work with systems of equations, but the gain comes from the fact that matrix-matrix multiplication is typically mapped to BLAS functions, which are one of the most optimized algorithms in the world (as LINPACK uses it as the dominating algorithm). The step-65 tutorial program explains this for a symmetric problem, but it is not hard to do for non-symmetric problems.

- You could express the code that computes the local matrix such that FEEvaluation (from the matrix-free framework) can compute the system matrix by applying the local operator to all unit vectors. You could also go completely matrix-free in the sense of not computing a matrix at all, but using more optimized strategies to compute the local matrices and residuals could be a middle ground when you don't want to fiddle with iterative solvers beyond PETSc's (great) default choices. The matrix-free framework uses computer hardware slightly less efficiently than BLAS-3, but it can leverage the structure in shape functions and hence lower complexity, often quite significantly.

Obviously, both the above options require some work, and lead to not-so-nice code. It depends on your use case if that is worth it. Of course, all the other suggestions, like not assembling the matrix in every step (possibly via SUNDIALS) or gathering precise timings to take the right action, should also be in your list of considerations.

Best,
Martin


On 09.12.22 22:07, Marc Fehling wrote:
Hello Kev,

just to make sure: did you compile the deal.II library and your code in Optimized mode/Release mode <https://www.dealii.org/current/readme.html#configuration>?

Marc

On Thursday, December 8, 2022 at 2:35:22 PM UTC-7 Wolfgang Bangerth wrote:


    Kev,

    > I've done some layman's benchmarking of the individual "steps"
    (setup,
    > assembly, solve, ...) in my current version of the code. It
    looks as if the
    > assembly takes several orders of magnitude (~100 at least)
    longer than the
    > solving part.
    >
    > My question is now: What is the best strategy to speed up
    assembly, is there
    > any experience with this? I've read different approaches and am
    confused
    > what's promising for small-scale problems. So far I'm considering:
    >
    > 1) Using a matrix-free approach rather than PETSc - this seems
    to be a win in
    > most cases, would however consider  rewriting large parts of the
    code and I am
    > not sure if I will gain a lot given my small system size.
    >
    > 2) Only assemble the Jacobian every few steps, but the residual
    in every step.
    > This is probably easier to implement. I know from experience
    with my problem
    > that I pretty quickly land in a situation where I need only one
    or two Newton
    > steps to find the solution to my nonlinear equation, so there
    saving will be
    > small at best.

    Like Bruno, it seems rather unlikely to me that assembly would
    take 100x times
    longer than other things, and I would suggest you use something like
    TimerOutput to time individual sections to narrow down where the
    issue lies.

    Beyond this, only updating the Jacobian is indeed a fairly usual
    strategy. One
    can of course implement this by hand, but it's quite cumbersome to
    implement
    optimal algorithms to determine when updating is necessary, and I
    would
    encourage you to take a look at step-77 to see how one can solve
    nonlinear
    problems efficiently through interfaces to advanced libraries such
    as SUNDIALS:
    https://dealii.org/developer/doxygen/deal.II/step_77.html

    I don't remember if KINSOL has ways to solve a sequence of
    nonlinear system,
    re-using the Jacobian between systems. But if it does not, you can
    always just
    store a pointer to the last Jacobian matrix and whenever you start
    solving a
    linear system for the first time, copy the stored matrix over.

    In any case, I think the first step should be to look at (i)
    whether assembly
    really takes that long for you, (ii) understand why it takes that
    long. If
    step-77 is any indication, then assembling the Jacobian should
    really not take
    that much longer than actually solving linear systems.

    Best
    W.


-- ------------------------------------------------------------------------

    Wolfgang Bangerth email: [email protected]
    www: http://www.math.colostate.edu/~bangerth/


--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/5bb618cf-52ce-4871-944b-aeb56d1b4c50n%40googlegroups.com <https://groups.google.com/d/msgid/dealii/5bb618cf-52ce-4871-944b-aeb56d1b4c50n%40googlegroups.com?utm_medium=email&utm_source=footer>.

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
--- You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dealii/cd83f257-098a-dfeb-840b-945f2bd7c65e%40gmail.com.

Reply via email to