My idea was to observe timing behaviour similar to step-55, as the preconditioner template I have followed most closely emulates the implementation in step-55. The number of degrees of freedom is not significantly different from those observed at cycles 4 and 5. When I run step-55 with 2 processes and the same version of deal.II, I get the performance for cycles 4 and 5 shown below. This indicated to me that for 2 MPI processes and this number of DoFs, the solution time should be quite low. Once I had reasonable solution times, I was planning to scale the code to a larger number of degrees of freedom.

Ah, I see -- so the question you're really asking is why it takes 493 seconds to solve a problem with 215,000 unknowns. That's likely because you do 19 outer iterations, and in each one you call a direct solver that decomposes the same matrix anew -- 19 factorizations in total.
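
To put rough numbers on this (assuming the factorizations dominate the run time, which is typical for sparse direct solvers): 493 s over 19 factorizations is about 26 s per factorization. If you decomposed the matrix only once and re-applied the stored factors, you would expect to pay roughly one factorization plus 19 comparatively cheap triangular solves.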


Prof. Bangerth, would there be a way to do this when using SparseDirectMUMPS? From reading the documentation, I only see a solve() function. The alternative would be to use SparseILU. Do you recommend using SparseILU instead?

I don't recall the exact interface of SparseDirectMUMPS from past releases. SparseDirectUMFPACK allows you to compute a decomposition only once, and then apply it repeatedly in vmult(). The interface of SparseDirectMUMPS that's in the current developer version also allows you to do that. If you can switch to the current developer version (or check whether 9.7 can do that as well), you may want to try that.
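
As a minimal sketch of that pattern with SparseDirectUMFPACK (the wrapper class below is illustrative, not taken from your code):

#include <deal.II/lac/sparse_direct.h>
#include <deal.II/lac/sparse_matrix.h>
#include <deal.II/lac/vector.h>

using namespace dealii;

// Wrap a one-time factorization so that the outer iteration can apply
// A^{-1} repeatedly without ever re-decomposing the matrix.
class DirectInverse
{
public:
  // Expensive: computes and stores the LU decomposition. Call this
  // once, right after assembling the matrix.
  void initialize(const SparseMatrix<double> &A)
  {
    factorization.initialize(A);
  }

  // Cheap: only forward/backward substitution with the stored
  // factors. This is what the outer solver calls in every iteration.
  void vmult(Vector<double> &dst, const Vector<double> &src) const
  {
    factorization.vmult(dst, src);
  }

private:
  SparseDirectUMFPACK factorization;
};

The important point is that initialize() lives outside the outer loop, so the 19 outer iterations only ever trigger the cheap vmult() calls.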

SparseILU works sometimes, but it typically does not scale well to large problems. (Sparse direct solvers often do not either, but at least they don't require endless fiddling with settings.)
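
If you do want to try SparseILU, the setup follows the same once-only pattern; here is a sketch (the tolerance and iteration limit are placeholders):

#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/solver_control.h>
#include <deal.II/lac/sparse_ilu.h>
#include <deal.II/lac/sparse_matrix.h>
#include <deal.II/lac/vector.h>

using namespace dealii;

void solve_with_ilu(const SparseMatrix<double> &A,
                    Vector<double>             &solution,
                    const Vector<double>       &rhs)
{
  // Build the incomplete factorization once...
  SparseILU<double> ilu;
  ilu.initialize(A, SparseILU<double>::AdditionalData());

  // ...and reuse it for every solve with this matrix.
  SolverControl            control(1000, 1e-8 * rhs.l2_norm());
  SolverCG<Vector<double>> cg(control);
  cg.solve(A, solution, rhs, ilu);
}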


Since there is clearly an inefficiency when using DirectInverseMatrix objects as my preconditioner, I switched to using InverseMatrix as my inner solver, which uses CG with an AMG preconditioner as shown in the code below:
[...]
I don't seem to observe any improvement in performance. From my observations, the second CG solve with the (1,1) block takes around 70 iterations to converge, which accounts for the bulk of the computation time. I would most likely have to improve the preconditioner precAs here, which might bring down the iteration count and speed things up. Do you think this would be the right way to approach this problem?

I can't see where you use the AMG preconditioners, but the same applies: You should only set them up once and then re-use many times. That is, the preconditioners need to live *outside* the place where you solve the inner (1,1) block.
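
Schematically, and using the Trilinos wrappers only for illustration (the same pattern carries over to the PETSc wrappers that step-55 uses; the InverseMatrix class below follows the idea of step-22 and is otherwise made up):

#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/solver_control.h>
#include <deal.II/lac/trilinos_precondition.h>
#include <deal.II/lac/trilinos_sparse_matrix.h>
#include <deal.II/lac/trilinos_vector.h>

using namespace dealii;

// Inner solver that only *references* a matrix and a preconditioner
// that were built elsewhere. Constructing this object is cheap; no
// AMG hierarchy is rebuilt here.
template <class MatrixType, class PreconditionerType>
class InverseMatrix
{
public:
  InverseMatrix(const MatrixType &m, const PreconditionerType &p)
    : matrix(m)
    , preconditioner(p)
  {}

  void vmult(TrilinosWrappers::MPI::Vector       &dst,
             const TrilinosWrappers::MPI::Vector &src) const
  {
    SolverControl control(src.size(), 1e-8 * src.l2_norm());
    SolverCG<TrilinosWrappers::MPI::Vector> cg(control);
    dst = 0;
    cg.solve(matrix, dst, src, preconditioner);
  }

private:
  const MatrixType         &matrix;
  const PreconditionerType &preconditioner;
};

// Once, right after assembly:
//   TrilinosWrappers::PreconditionAMG amg;
//   amg.initialize(system_matrix.block(1, 1));
//   InverseMatrix<TrilinosWrappers::SparseMatrix,
//                 TrilinosWrappers::PreconditionAMG>
//     A11_inverse(system_matrix.block(1, 1), amg);
//
// The outer iteration then only ever calls A11_inverse.vmult(...).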

Perhaps as a general rule: people spend whole PhD theses developing good parallel solvers and preconditioners. In industry, consultants are paid thousands or tens of thousands of dollars to figure out good solvers and preconditioners. You should expect that figuring this out is a long learning process that involves developing the skills to set up block preconditioners in the right places, and to find ways to time the right places. This is not going to be an easy process, nor is there a magic bullet that the good people on this mailing list can offer that will magically make it work for you.

Best
 W.
