Thanks to Jan, Miro, and Garth for this informative discussion.
It seems to me that part of the philosophy behind FEniCS is to
allow users fine-grained control of the algorithms while
providing good default values. I suspect that a large portion
of FEniCS programs solve SPD problems with the default LU
solver, which, it seems from this discussion, is not really
a good default. Would it not be better to choose another default,
e.g., if linear_solver is not set, default to mumps? That is
not only a lot faster, but also avoids the totally unexpected
behavior that telling the solver the matrix is symmetric slows
it down considerably.
On 03/01/2015 11:50 AM, Jan Blechta wrote:
On Sun, 1 Mar 2015 17:48:42 +0000
Jan Blechta <[email protected]> wrote:
For the sake of objectivity, it should be noted that PETSc LU and
Cholesky perform so badly in DOLFIN because reordering is not employed,
see
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFactorSetMatOrderingType.html
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatOrderingType.html
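For reference, a fill-reducing ordering can be requested through
PETSc's options database; this is only an untested sketch (option name
from the manual pages above), and it takes effect only if the solver
object reads the global options:

from dolfin import PETScOptions

# Ask PETSc's LU/Cholesky factorization for a fill-reducing ordering
# (here nested dissection, "nd") instead of the natural ordering.
PETScOptions.set("pc_factor_mat_ordering_type", "nd")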
Also note that in DOLFIN we somehow manipulate MUMPS and SuperLU_dist
ordering, see
https://bitbucket.org/fenics-project/dolfin/src/21ba7f92a2acb2f45351d47eca6658a24e7affc2/dolfin/common/SubSystemsManager.cpp?at=master#cl-191
to work around some problems with SCOTCH and/or ParMETIS and/or 64-bit
PetscInt. MUMPS' automatic choice, ICNTL(7) == 7, may perform better.
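For anyone who wants to experiment with that, the MUMPS ordering choice
can be overridden through the same options database; again only an
untested sketch, using the option name from PETSc's MUMPS interface:

from dolfin import PETScOptions

# Let MUMPS choose its ordering automatically, ICNTL(7) = 7, instead of
# the ordering set in DOLFIN's SubSystemsManager workaround.
PETScOptions.set("mat_mumps_icntl_7", 7)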
The SubSystemsManager hack is a little bit unexpectable because the
code is only
I mean "unpredictable".
Jan
executed if DOLFIN takes care of PETSc initialization.
Jan
On Sun, 01 Mar 2015 11:15:19 -0600
Douglas N Arnold <[email protected]> wrote:
I did some testing to reproduce Miro's results. Here are times for
a symmetric Poisson-like problem on an N x N x N mesh of the unit
cube with CG1 elements. If I use the MUMPS solver, setting
solver.parameters['symmetric'] to True, and so using Cholesky,
gives a modest memory saving and a solve that is about 25-30% faster.
If I use PETSc instead of MUMPS, then setting symmetric equal to
True results in a modest *increase* in memory and a *huge increase*
in solver time. For example, on a 40 x 40 x 40 mesh the PETSc LU
solver takes 37 seconds while the PETSc Cholesky solver takes 198
seconds. PETSc takes 5 times longer than MUMPS for the LU solve and
a whopping 39 times longer for the Cholesky solve.
As Garth indicated, if you want to solve large systems with a direct
solver, don't use PETSc's built-in one (and, if you must, don't tell
PETSc that your matrix is symmetric).
Here are the actual timings:
Solution of Poisson problem, CG1 on N x N x N mesh of the unit cube
(times in seconds; memory as reported by memory_usage)

            symmetric=False        symmetric=True
 N  solver   time    memory        time    memory

MUMPS
20  mumps     0.3    942976         0.2    937212
22  mumps     0.4    967744         0.4    947736
24  mumps     0.6    992196         0.5    964740
26  mumps     1.0   1032872         0.8    992284
28  mumps     1.3   1078404         1.1   1025716
30  mumps     1.9   1141312         1.4   1065532
32  mumps     2.4   1186708         1.8   1095176
34  mumps     3.1   1262252         2.4   1141552
36  mumps     4.8   1374788         3.4   1221876
38  mumps     5.4   1456412         3.9   1269188
40  mumps     7.3   1582028         5.1   1351760

PETSC
20  petsc     0.6    950972         2.1    950656
22  petsc     1.1    976664         3.8    979100
24  petsc     1.8   1007896         6.4   1015736
26  petsc     2.9   1098964        11.0   1067084
28  petsc     4.5   1149148        17.8   1136260
30  petsc     6.7   1242468        28.0   1222572
32  petsc     9.8   1322416        43.0   1322780
34  petsc    13.9   1333408        64.8   1457924
36  petsc    19.7   1440184        95.9   1621312
38  petsc    27.3   1669508       139.4   1826692
40  petsc    37.3   1723864       198.2   2076156
and here is the program that generated them:
# sample call: python ch.py 50 'mumps' True
from dolfin import *
import sys

# Mesh resolution, linear solver backend, and whether to tell the
# solver that the matrix is symmetric (which selects Cholesky).
N = int(sys.argv[1])
method = sys.argv[2]
symmetric = (sys.argv[3] == 'True')

mesh = UnitCubeMesh(N, N, N)
V = FunctionSpace(mesh, 'CG', 1)

# Symmetric positive definite Poisson-like problem.
u = TrialFunction(V)
v = TestFunction(V)
a = (dot(grad(u), grad(v)) + u*v)*dx
L = v*dx

u = Function(V)
problem = LinearVariationalProblem(a, L, u)
solver = LinearVariationalSolver(problem)
solver.parameters['linear_solver'] = method
solver.parameters['symmetric'] = symmetric

# Time solver.solve() (assembly + linear solve) and report memory usage.
timer = Timer('solver')
timer.start()
solver.solve()
tim = timer.stop()
mem = memory_usage(as_string=False)
print "{:6} {:8} {:1} {:7.1f} {:9}".\
    format(N, method, symmetric, tim, mem[1])
On 03/01/2015 08:58 AM, Garth N. Wells wrote:
The PETSc built-in direct solver is slow for large systems. It is
really there for cases where LU is needed as part of another
algorithm, e.g. as the coarse-level solve in multigrid.
If you want to solve large systems, use one of the specialised
direct solvers.
Garth
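To see which specialised direct solvers a given DOLFIN build actually
provides, they can be listed from Python (a small sketch; the output
depends on how DOLFIN and PETSc were configured):

from dolfin import list_lu_solver_methods

# Print the direct (LU) solver backends available in this build,
# e.g. mumps, superlu_dist, pastix, umfpack, petsc.
list_lu_solver_methods()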
On Sunday, 1 March 2015, Miro Kuchta <[email protected]> wrote:
Hi,
please consider the attached script. Following this
<https://bitbucket.org/fenics-project/dolfin/pull-request/2/use-cholesky-rather-than-lu-decomposition/diff#chg-dolfin/fem/LinearVariationalSolver.cpp>
discussion, if method is mumps, petsc or pastix and
we have symmetric=True, the linear system is solved with a
Cholesky factorization (is this so?). While testing different
method/symmetry combinations I noticed that PETSc's own symmetric
solver is easily 10 times slower than mumps (I don't have pastix
to compare against). Can anyone else reproduce
this? Thanks.
Regards, Miro
--
Garth N. Wells
Department of Engineering, University of Cambridge
http://www.eng.cam.ac.uk/~gnw20
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics