* Check how the run time of both the direct and iterative solvers
  changes as you increase the number of unknowns. (E.g., start with a
  10x10x10 mesh, then try a 20x20x20, ... mesh.)

As suggested, I have added more tests, all of which were run in parallel with MPI:
(*in the following, I only show the time spent in the solve*)

Solve times in seconds; each case was run as "mpirun -np <procs> ./xxxx".

Case  Cells     Procs  cg       MUMPS (with symmetric setting)  MUMPS (without symmetric setting)
1.1   10x10x10  2      0.00621  0.0932                          0.122
1.2   10x10x10  6      0.00479  0.0774                          0.169
2.1   20x20x20  2      0.0884   3.71                            6.44
2.2   20x20x20  6      0.087    2.16                            4.29
3.1   30x30x30  2      0.39     26.2                            50.8
3.2   30x30x30  6      0.372    23.4                            43.8

I have to admit that I find the CG times too small to be credible. The last case should have about 200,000 unknowns. It seems implausible to me that you can solve that in 0.4 seconds on 2 processors. What preconditioner do you use, and do you include the time to build the preconditioner in this time?
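
To put a number on the problem size, here is a minimal sketch of how the unknown count depends on the element. The thread does not say which element or how many solution components are used, so the degree/component combinations below are assumptions, and `lagrange_dofs` is a hypothetical helper, not a deal.II function. It simply counts the support points of continuous Lagrange elements on a uniform hexahedral mesh of a cube.

```python
def lagrange_dofs(cells_per_dir: int, degree: int, components: int = 1) -> int:
    """Unknowns for continuous Lagrange elements of the given degree on a
    uniform cells_per_dir^3 hexahedral mesh (assumed setup, see lead-in)."""
    # Each direction has cells_per_dir * degree + 1 distinct support points.
    points_per_dir = cells_per_dir * degree + 1
    return components * points_per_dir ** 3


# Assumed combinations: (polynomial degree, number of solution components).
for degree, components in [(1, 1), (1, 3), (2, 1), (2, 3)]:
    n_dofs = lagrange_dofs(30, degree, components)
    print(f"degree={degree} components={components} unknowns={n_dofs}")
```

Under these assumptions, a scalar quadratic element gives 226,981 unknowns on the 30x30x30 mesh, which is in line with the "about 200,000" figure, whereas a scalar linear element would give only 29,791.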

My rule of thumb has always been that to solve a problem with 100,000 unknowns on one processor, it takes about a minute. If you have a fast processor, then maybe you can get that done in 20 or 30 seconds, so the times you quote for MUMPS seem not out of the ordinary to me.
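
One way to read the timings quoted above is to estimate how the run time grows with problem size. The sketch below fits an exponent p in "time ~ size^p" between consecutive mesh sizes, using the -np 2 numbers from this thread. It uses the cell count as a proxy for the number of unknowns, which is an assumption (the two are roughly proportional on large uniform meshes); `growth_exponents` is a hypothetical helper name.

```python
import math

# Cell counts for the 10x10x10, 20x20x20 and 30x30x30 meshes.
cells = [10**3, 20**3, 30**3]

# Solve times in seconds, copied from the -np 2 runs quoted in this thread.
timings = {
    "cg": [0.00621, 0.0884, 0.39],
    "mumps_symmetric": [0.0932, 3.71, 26.2],
    "mumps_nonsymmetric": [0.122, 6.44, 50.8],
}


def growth_exponents(sizes, times):
    """Return p in time ~ size**p for each pair of consecutive measurements."""
    exponents = []
    for i in range(len(sizes) - 1):
        time_ratio = times[i + 1] / times[i]
        size_ratio = sizes[i + 1] / sizes[i]
        exponents.append(math.log(time_ratio) / math.log(size_ratio))
    return exponents


for solver, times in timings.items():
    formatted = ", ".join(f"{p:.2f}" for p in growth_exponents(cells, times))
    print(f"{solver}: {formatted}")
```

With these numbers the fitted exponents for CG come out noticeably smaller than those for MUMPS. Three data points are too few to draw firm conclusions, so this is only a rough consistency check.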


*I have another question:*
*For a problem with millions of unknowns, the same Dirichlet boundary conditions, and several different right-hand sides (e.g. rhs1, rhs2, ..., rhs8), how can I speed up the solution process, perhaps with an iterative solver?* For a small number of unknowns, I think I could use a parallel direct solver, which can reuse the factorization of the system matrix for rhs2 through rhs8 after solving with rhs1. But for a problem with millions of unknowns, I probably have to use an iterative solver for efficiency. So what solver or technique should I use to speed up the solution of such a multiple-load-case problem?

Bruno already gave the correct answer: Build an expensive preconditioner because you only need to build it once. Of course, the best preconditioner is an LU decomposition of the matrix, which is what a direct solver computes.
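
The "factor once, reuse for every right-hand side" idea can be sketched as follows. This is a toy illustration in plain Python, not how MUMPS or deal.II do it: it uses a dense Cholesky factorization (the symmetric positive definite counterpart of LU) of a small 1D Laplacian, and the matrix, the eight right-hand sides, and the helper names `cholesky` and `solve_with_factor` are all assumptions made for the example.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A must be SPD)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_with_factor(L, b):
    """Solve L L^T x = b with one forward and one backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Toy system matrix: 1D Laplacian tridiag(-1, 2, -1).
n = 6
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]

# Eight made-up right-hand sides standing in for rhs1, ..., rhs8.
rhs_list = [[float((i + k) % 3 + 1) for i in range(n)] for k in range(8)]

L = cholesky(A)  # the expensive step, done exactly once
solutions = [solve_with_factor(L, b) for b in rhs_list]  # cheap per rhs
```

The factorization is computed once and each additional right-hand side only costs two triangular solves, which is why a direct solver is attractive for multiple load cases as long as the factorization fits in memory.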

But you should expect that, fundamentally, solving N problems with an iterative solver requires N times as many operations as solving one (once you have built the preconditioner). There are "block" variants of solvers such as GMRES or CG that can be more efficient because they organize these operations better, for example through vectorization or by grouping communication, but they still have to do N times as many operations.
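
A minimal sketch of this cost structure: the preconditioner is built once, but every right-hand side still needs its own full CG solve. The example uses a Jacobi (diagonal) preconditioner on a small tridiagonal SPD matrix purely for brevity. The matrix, the right-hand sides, and the names `matvec` and `pcg` are assumptions made for the illustration, and a real application would use a much stronger preconditioner.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(diag, x):
    """y = A x for A = tridiag(-1, diag[i], -1)."""
    y = [d * xi for d, xi in zip(diag, x)]
    for i in range(len(x) - 1):
        y[i] -= x[i + 1]
        y[i + 1] -= x[i]
    return y


def pcg(diag, inv_diag, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG; returns (solution, number of iterations)."""
    x = [0.0] * len(b)
    r = list(b)
    z = [m * ri for m, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for iteration in range(1, max_iter + 1):
        Ap = matvec(diag, p)  # one matrix-vector product per iteration
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    raise RuntimeError("CG did not converge")


n = 50
diag = [2.5 + (i % 5) for i in range(n)]  # strictly diagonally dominant, SPD
inv_diag = [1.0 / d for d in diag]        # preconditioner: built once

# Eight made-up right-hand sides standing in for rhs1, ..., rhs8.
rhs_list = [[math.sin((k + 1) * (i + 1)) for i in range(n)] for k in range(8)]

results = [pcg(diag, inv_diag, b) for b in rhs_list]
iterations = [its for _, its in results]
total_matvecs = sum(iterations)  # work adds up over all right-hand sides
print(iterations, total_matvecs)
```

Reusing `inv_diag` saves the setup cost, but the total number of matrix-vector products is the sum over all solves, so it grows in proportion to the number of right-hand sides.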

Best
 W.

--
------------------------------------------------------------------------
Wolfgang Bangerth          email:                 bange...@colostate.edu
                           www: http://www.math.colostate.edu/~bangerth/
