> On Jan 11, 2017, at 9:21 PM, Matthew Knepley <knep...@gmail.com> wrote:
>
> On Wed, Jan 11, 2017 at 8:31 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>
>    Thanks, this is very useful information. It means that
>
>    1) the approximate Sp is actually a very good approximation to the true
>    Schur complement S, since using Sp^-1 to precondition S gives iteration
>    counts from 8 to 13.
>
>    2) using ilu(0) as a preconditioner for Sp is not good, since replacing
>    Sp^-1 with ilu(0) of Sp gives absurd iteration counts. This is actually
>    not very surprising, since ilu(0) is generally "not so good" for
>    elasticity.
>
>    So the next step is to try using
>
>       -fieldsplit_FE_split_ksp_monitor -fieldsplit_FE_split_pc_type gamg
>
>    The one open question is whether any options should be passed to GAMG to
>    tell it that the underlying problem comes from "elasticity"; that is,
>    something about the near null space.
>
>    Mark Adams, since the GAMG is coming from inside another preconditioner,
>    it may not be easy for the user to attach the near null space to that
>    inner matrix. Would it make sense for there to be a GAMG command line
>    option to indicate that it is a 3D elasticity problem, so that GAMG could
>    set up the near null space for itself? Or does that not make sense?
>
> We could do that if somehow we knew the problem geometry, which is the
> origin of Mark's PCSetCoordinates() interface.
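For concreteness, attaching the near null space by hand is already possible once the split solvers exist, since PETSc provides MatNullSpaceCreateRigidBody() to build the rigid-body modes from nodal coordinates. The following is only a sketch, not code from this thread; it assumes a Vec named coords holding the interleaved (x,y,z) coordinates of the FE_split nodes, and that subksp[1] is the Schur-complement solve:

    /* Sketch: tell GAMG about 3D elasticity by attaching the 6 rigid-body
       modes as a near null space on Sp, the assembled matrix it coarsens.
       "coords" is an assumed Vec of interleaved nodal coordinates. */
    KSP          *subksp;
    PetscInt      nsplits;
    Mat           S, Sp;
    MatNullSpace  nearnull;

    PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);    /* call after KSPSetUp() */
    KSPGetOperators(subksp[1], &S, &Sp);             /* Sp preconditions S    */
    MatNullSpaceCreateRigidBody(coords, &nearnull);  /* 6 modes in 3D         */
    MatSetNearNullSpace(Sp, nearnull);
    MatNullSpaceDestroy(&nearnull);
    PetscFree(subksp);

The open question is whether GAMG (or fieldsplit) could do the equivalent from a command line option, without the user reaching into the inner matrix like this.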
Ah, so conveying Mat coordinates down to sub matrices?

>    Matt

   Barry

> > On Jan 11, 2017, at 7:47 PM, David Knezevic <david.kneze...@akselos.com> wrote:
> >
> > I've attached the two log files. Using cholesky for "FE_split" seems to have helped a lot!
> >
> > David
> >
> > --
> > David J. Knezevic | CTO
> > Akselos | 210 Broadway, #201 | Cambridge, MA | 02139
> > Phone: +1-617-599-4755
> >
> > On Wed, Jan 11, 2017 at 8:32 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> >    Can you please run with all the monitoring on, so we can see the convergence of all the inner solvers?
> >
> >       -fieldsplit_FE_split_ksp_monitor
> >
> >    Then run again with
> >
> >       -fieldsplit_FE_split_ksp_monitor -fieldsplit_FE_split_pc_type cholesky
> >
> >    and send both sets of results.
> >
> >    Barry
> >
> > > On Jan 11, 2017, at 6:32 PM, David Knezevic <david.kneze...@akselos.com> wrote:
> > >
> > > On Wed, Jan 11, 2017 at 5:52 PM, Dave May <dave.mayhe...@gmail.com> wrote:
> > > so I gather that I'll have to look into a user-defined approximation to S.
> > >
> > > Where does the 2x2 block system come from?
> > > Maybe someone on the list knows the right approximation to use for S.
> > >
> > > The model is 3D linear elasticity using a finite element discretization. I applied substructuring to part of the system to "condense" it, and that results in the small A00 block. The A11 block is just standard 3D elasticity; no substructuring was applied there. There are constraints to connect the degrees of freedom on the interface of the substructured and non-substructured regions.
> > >
> > > If anyone has suggestions for a good way to precondition this type of system, I'd be most appreciative!
> > >
> > > Thanks,
> > > David
> > >
> > > -----------------------------------------
> > >
> > > 0 KSP Residual norm 5.405528187695e+04
> > > 1 KSP Residual norm 2.187814910803e+02
> > > 2 KSP Residual norm 1.019051577515e-01
> > > 3 KSP Residual norm 4.370464012859e-04
> > > KSP Object: 1 MPI processes
> > >   type: cg
> > >   maximum iterations=1000
> > >   tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
> > >   left preconditioning
> > >   using nonzero initial guess
> > >   using PRECONDITIONED norm type for convergence test
> > > PC Object: 1 MPI processes
> > >   type: fieldsplit
> > >     FieldSplit with Schur preconditioner, factorization FULL
> > >     Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse
> > >     Split info:
> > >     Split number 0 Defined by IS
> > >     Split number 1 Defined by IS
> > >     KSP solver for A00 block
> > >       KSP Object: (fieldsplit_RB_split_) 1 MPI processes
> > >         type: preonly
> > >         maximum iterations=10000, initial guess is zero
> > >         tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> > >         left preconditioning
> > >         using NONE norm type for convergence test
> > >       PC Object: (fieldsplit_RB_split_) 1 MPI processes
> > >         type: cholesky
> > >           Cholesky: out-of-place factorization
> > >           tolerance for zero pivot 2.22045e-14
> > >           matrix ordering: natural
> > >           factor fill ratio given 0., needed 0.
> > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > package used to perform factorization: mumps > > > total: nonzeros=3042, allocated nonzeros=3042 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 2 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > > > 0 > > > ICNTL(13) (efficiency control): > > > 0 > > > ICNTL(14) (percentage of estimated workspace > > > increase): 20 > > > ICNTL(18) (input mat struct): > > > 0 > > > ICNTL(19) (Shur complement info): > > > 0 > > > ICNTL(20) (rhs sparse pattern): > > > 0 > > > ICNTL(21) (solution struct): > > > 0 > > > ICNTL(22) (in-core/out-of-core facility): > > > 0 > > > ICNTL(23) (max size of memory can be allocated > > > locally):0 > > > ICNTL(24) (detection of null pivot rows): > > > 0 > > > ICNTL(25) (computation of a null space basis): > > > 0 > > > ICNTL(26) (Schur options for rhs or solution): > > > 0 > > > ICNTL(27) (experimental parameter): > > > -24 > > > ICNTL(28) (use parallel or sequential ordering): > > > 1 > > > ICNTL(29) (parallel ordering): > > > 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): > > > 0 > > > ICNTL(31) (factors is discarded in the solve phase): > > > 0 > > > ICNTL(33) (compute determinant): > > > 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): > > > 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0. > > > CNTL(4) (value of static pivoting): -1. > > > CNTL(5) (fixation for null pivots): 0. > > > RINFO(1) (local estimated flops for the elimination > > > after analysis): > > > [0] 29394. > > > RINFO(2) (local estimated flops for the assembly > > > after factorization): > > > [0] 1092. > > > RINFO(3) (local estimated flops for the elimination > > > after factorization): > > > [0] 29394. > > > INFO(15) (estimated size of (in MB) MUMPS internal > > > data for running numerical factorization): > > > [0] 1 > > > INFO(16) (size of (in MB) MUMPS internal data used > > > during numerical factorization): > > > [0] 1 > > > INFO(23) (num of pivots eliminated on this processor > > > after factorization): > > > [0] 324 > > > RINFOG(1) (global estimated flops for the elimination > > > after analysis): 29394. > > > RINFOG(2) (global estimated flops for the assembly > > > after factorization): 1092. > > > RINFOG(3) (global estimated flops for the elimination > > > after factorization): 29394. 
> > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > > > (0.,0.)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all > > > processors after analysis): 3888 > > > INFOG(4) (estimated integer workspace for factors on > > > all processors after analysis): 2067 > > > INFOG(5) (estimated maximum front size in the > > > complete tree): 12 > > > INFOG(6) (number of nodes in the complete tree): 53 > > > INFOG(7) (ordering option effectively use after > > > analysis): 2 > > > INFOG(8) (structural symmetry in percent of the > > > permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the > > > matrix factors after factorization): 3888 > > > INFOG(10) (total integer space store the matrix > > > factors after factorization): 2067 > > > INFOG(11) (order of largest frontal matrix after > > > factorization): 12 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after > > > factorization): 0 > > > INFOG(14) (number of memory compress after > > > factorization): 0 > > > INFOG(15) (number of steps of iterative refinement > > > after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS > > > internal data for factorization after analysis: value on the most memory > > > consuming processor): 1 > > > INFOG(17) (estimated size of all MUMPS internal data > > > for factorization after analysis: sum over all processors): 1 > > > INFOG(18) (size of all MUMPS internal data allocated > > > during factorization: value on the most memory consuming processor): 1 > > > INFOG(19) (size of all MUMPS internal data allocated > > > during factorization: sum over all processors): 1 > > > INFOG(20) (estimated number of entries in the > > > factors): 3042 > > > INFOG(21) (size in MB of memory effectively used > > > during factorization - value on the most memory consuming processor): 1 > > > INFOG(22) (size in MB of memory effectively used > > > during factorization - sum over all processors): 1 > > > INFOG(23) (after analysis: value of ICNTL(6) > > > effectively used): 5 > > > INFOG(24) (after analysis: value of ICNTL(12) > > > effectively used): 1 > > > INFOG(25) (after factorization: number of pivots > > > modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots > > > encountered): 0 > > > INFOG(29) (after factorization: effective number of > > > entries in the factors (sum over all processors)): 3042 > > > INFOG(30, 31) (after solution: size in Mbytes of > > > memory used during solution phase): 0, 0 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): -2 > > > INFOG(34) (exponent of the determinant if determinant > > > is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_RB_split_) 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > total: nonzeros=5760, allocated nonzeros=5760 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 108 nodes, limit used is 5 > > > KSP solver for S = A11 - A10 inv(A00) A01 > > > KSP Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: cg > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > > left preconditioning > > > using PRECONDITIONED norm type for convergence test > > > PC Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: bjacobi > > > block Jacobi: number of blocks = 1 > > > Local solve is same for all blocks, in the following KSP and PC > > > objects: > > > KSP Object: (fieldsplit_FE_split_sub_) 1 MPI > > > processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_FE_split_sub_) 1 MPI > > > processes > > > type: ilu > > > ILU: out-of-place factorization > > > 0 levels of fill > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 1., needed 1. > > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > package used to perform factorization: petsc > > > total: nonzeros=1037052, allocated nonzeros=1037052 > > > total number of mallocs used during MatSetValues > > > calls =0 > > > using I-node routines: found 9489 nodes, limit used > > > is 5 > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > total: nonzeros=1037052, allocated nonzeros=1037052 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9489 nodes, limit used is 5 > > > linear system matrix followed by preconditioner matrix: > > > Mat Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: schurcomplement > > > rows=28476, cols=28476 > > > Schur complement A11 - A10 inv(A00) A01 > > > A11 > > > Mat Object: (fieldsplit_FE_split_) > > > 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > total: nonzeros=1017054, allocated nonzeros=1017054 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9492 nodes, limit used is 5 > > > A10 > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=28476, cols=324 > > > total: nonzeros=936, allocated nonzeros=936 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 5717 nodes, limit used is 5 > > > KSP of A00 > > > KSP Object: (fieldsplit_RB_split_) > > > 1 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, > > > divergence=10000. > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_RB_split_) > > > 1 MPI processes > > > type: cholesky > > > Cholesky: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0., needed 0. 
> > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > package used to perform factorization: mumps > > > total: nonzeros=3042, allocated nonzeros=3042 > > > total number of mallocs used during MatSetValues > > > calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 2 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > > > 0 > > > ICNTL(13) (efficiency control): > > > 0 > > > ICNTL(14) (percentage of estimated workspace > > > increase): 20 > > > ICNTL(18) (input mat struct): > > > 0 > > > ICNTL(19) (Shur complement info): > > > 0 > > > ICNTL(20) (rhs sparse pattern): > > > 0 > > > ICNTL(21) (solution struct): > > > 0 > > > ICNTL(22) (in-core/out-of-core facility): > > > 0 > > > ICNTL(23) (max size of memory can be > > > allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): > > > 0 > > > ICNTL(25) (computation of a null space > > > basis): 0 > > > ICNTL(26) (Schur options for rhs or > > > solution): 0 > > > ICNTL(27) (experimental parameter): > > > -24 > > > ICNTL(28) (use parallel or sequential > > > ordering): 1 > > > ICNTL(29) (parallel ordering): > > > 0 > > > ICNTL(30) (user-specified set of entries in > > > inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve > > > phase): 0 > > > ICNTL(33) (compute determinant): > > > 0 > > > CNTL(1) (relative pivoting threshold): > > > 0.01 > > > CNTL(2) (stopping criterion of refinement): > > > 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0. > > > CNTL(4) (value of static pivoting): > > > -1. > > > CNTL(5) (fixation for null pivots): 0. > > > RINFO(1) (local estimated flops for the > > > elimination after analysis): > > > [0] 29394. > > > RINFO(2) (local estimated flops for the > > > assembly after factorization): > > > [0] 1092. > > > RINFO(3) (local estimated flops for the > > > elimination after factorization): > > > [0] 29394. > > > INFO(15) (estimated size of (in MB) MUMPS > > > internal data for running numerical factorization): > > > [0] 1 > > > INFO(16) (size of (in MB) MUMPS internal data > > > used during numerical factorization): > > > [0] 1 > > > INFO(23) (num of pivots eliminated on this > > > processor after factorization): > > > [0] 324 > > > RINFOG(1) (global estimated flops for the > > > elimination after analysis): 29394. > > > RINFOG(2) (global estimated flops for the > > > assembly after factorization): 1092. > > > RINFOG(3) (global estimated flops for the > > > elimination after factorization): 29394. 
> > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > > > (determinant): (0.,0.)*(2^0) > > > INFOG(3) (estimated real workspace for > > > factors on all processors after analysis): 3888 > > > INFOG(4) (estimated integer workspace for > > > factors on all processors after analysis): 2067 > > > INFOG(5) (estimated maximum front size in the > > > complete tree): 12 > > > INFOG(6) (number of nodes in the complete > > > tree): 53 > > > INFOG(7) (ordering option effectively use > > > after analysis): 2 > > > INFOG(8) (structural symmetry in percent of > > > the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to > > > store the matrix factors after factorization): 3888 > > > INFOG(10) (total integer space store the > > > matrix factors after factorization): 2067 > > > INFOG(11) (order of largest frontal matrix > > > after factorization): 12 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after > > > factorization): 0 > > > INFOG(14) (number of memory compress after > > > factorization): 0 > > > INFOG(15) (number of steps of iterative > > > refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all > > > MUMPS internal data for factorization after analysis: value on the most > > > memory consuming processor): 1 > > > INFOG(17) (estimated size of all MUMPS > > > internal data for factorization after analysis: sum over all processors): > > > 1 > > > INFOG(18) (size of all MUMPS internal data > > > allocated during factorization: value on the most memory consuming > > > processor): 1 > > > INFOG(19) (size of all MUMPS internal data > > > allocated during factorization: sum over all processors): 1 > > > INFOG(20) (estimated number of entries in the > > > factors): 3042 > > > INFOG(21) (size in MB of memory effectively > > > used during factorization - value on the most memory consuming > > > processor): 1 > > > INFOG(22) (size in MB of memory effectively > > > used during factorization - sum over all processors): 1 > > > INFOG(23) (after analysis: value of ICNTL(6) > > > effectively used): 5 > > > INFOG(24) (after analysis: value of ICNTL(12) > > > effectively used): 1 > > > INFOG(25) (after factorization: number of > > > pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of > > > null pivots encountered): 0 > > > INFOG(29) (after factorization: effective > > > number of entries in the factors (sum over all processors)): 3042 > > > INFOG(30, 31) (after solution: size in Mbytes > > > of memory used during solution phase): 0, 0 > > > INFOG(32) (after analysis: type of analysis > > > done): 1 > > > INFOG(33) (value used for ICNTL(8)): -2 > > > INFOG(34) (exponent of the determinant if > > > determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_RB_split_) > > > 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > total: nonzeros=5760, allocated nonzeros=5760 > > > total number of mallocs used during MatSetValues calls > > > =0 > > > using I-node routines: found 108 nodes, limit used is > > > 5 > > > A01 > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=324, cols=28476 > > > total: nonzeros=936, allocated nonzeros=936 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 67 nodes, limit used is 5 > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > total: nonzeros=1037052, allocated nonzeros=1037052 > > > total number of mallocs used 
during MatSetValues calls =0 > > > using I-node routines: found 9489 nodes, limit used is 5 > > > linear system matrix = precond matrix: > > > Mat Object: () 1 MPI processes > > > type: seqaij > > > rows=28800, cols=28800 > > > total: nonzeros=1024686, allocated nonzeros=1024794 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9600 nodes, limit used is 5 > > > > > > ---------------------------------------------- PETSc Performance Summary: > > > ---------------------------------------------- > > > > > > /home/dknez/akselos-dev/scrbe/build/bin/fe_solver-opt_real on a > > > arch-linux2-c-opt named david-Lenovo with 1 processor, by dknez Wed Jan > > > 11 17:22:10 2017 > > > Using Petsc Release Version 3.7.3, unknown > > > > > > Max Max/Min Avg Total > > > Time (sec): 9.638e+01 1.00000 9.638e+01 > > > Objects: 2.030e+02 1.00000 2.030e+02 > > > Flops: 1.732e+11 1.00000 1.732e+11 1.732e+11 > > > Flops/sec: 1.797e+09 1.00000 1.797e+09 1.797e+09 > > > MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 > > > MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 > > > MPI Reductions: 0.000e+00 0.00000 > > > > > > Flop counting convention: 1 flop = 1 real number operation of type > > > (multiply/divide/add/subtract) > > > e.g., VecAXPY() for real vectors of length N > > > --> 2N flops > > > and VecAXPY() for complex vectors of length N > > > --> 8N flops > > > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > > > --- -- Message Lengths -- -- Reductions -- > > > Avg %Total Avg %Total counts > > > %Total Avg %Total counts %Total > > > 0: Main Stage: 9.6379e+01 100.0% 1.7318e+11 100.0% 0.000e+00 > > > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > ------------------------------------------------------------------------------------------------------------------------ > > > See the 'Profiling' chapter of the users' manual for details on > > > interpreting output. > > > Phase summary info: > > > Count: number of times phase was executed > > > Time and Flops: Max - maximum over all processors > > > Ratio - ratio of maximum to minimum over all processors > > > Mess: number of messages sent > > > Avg. len: average message length (bytes) > > > Reduct: number of global reductions > > > Global: entire computation > > > Stage: stages of a computation. Set stages with PetscLogStagePush() > > > and PetscLogStagePop(). 
> > > %T - percent time in this phase %F - percent flops in this > > > phase > > > %M - percent messages in this phase %L - percent message > > > lengths in this phase > > > %R - percent reductions in this phase > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > > Event Count Time (sec) Flops > > > --- Global --- --- Stage --- Total > > > Max Ratio Max Ratio Max Ratio Mess Avg len > > > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > --- Event Stage 0: Main Stage > > > > > > VecDot 42 1.0 2.2411e-05 1.0 8.53e+03 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 380 > > > VecTDot 77761 1.0 1.4294e+00 1.0 4.43e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 1 3 0 0 0 1 3 0 0 0 3098 > > > VecNorm 38894 1.0 9.1002e-01 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 1 1 0 0 0 1 1 0 0 0 2434 > > > VecScale 38882 1.0 3.7314e-01 1.0 1.11e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 1 0 0 0 0 1 0 0 0 2967 > > > VecCopy 38908 1.0 2.1655e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecSet 77887 1.0 3.2034e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecAXPY 77777 1.0 1.8382e+00 1.0 4.43e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 2 3 0 0 0 2 3 0 0 0 2409 > > > VecAYPX 38875 1.0 1.2884e+00 1.0 2.21e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 1 1 0 0 0 1 1 0 0 0 1718 > > > VecAssemblyBegin 68 1.0 1.9407e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecAssemblyEnd 68 1.0 2.6941e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecScatterBegin 48 1.0 4.6349e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatMult 38891 1.0 4.3045e+01 1.0 8.03e+10 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 45 46 0 0 0 45 46 0 0 0 1866 > > > MatMultAdd 38889 1.0 3.5360e+01 1.0 7.91e+10 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 37 46 0 0 0 37 46 0 0 0 2236 > > > MatSolve 77769 1.0 4.8780e+01 1.0 7.95e+10 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 51 46 0 0 0 51 46 0 0 0 1631 > > > MatLUFactorNum 1 1.0 1.9575e-02 1.0 2.49e+07 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1274 > > > MatCholFctrSym 1 1.0 9.4891e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatCholFctrNum 1 1.0 3.7885e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatILUFactorSym 1 1.0 4.1780e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatConvert 1 1.0 3.0041e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatScale 2 1.0 2.7180e-05 1.0 2.53e+04 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 930 > > > MatAssemblyBegin 32 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatAssemblyEnd 32 1.0 1.2032e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetRow 114978 1.0 5.9254e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetRowIJ 2 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetSubMatrice 6 1.0 1.5707e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetOrdering 2 1.0 3.2425e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 
0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatZeroEntries 6 1.0 3.0580e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatView 7 1.0 3.5119e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatAXPY 1 1.0 1.9384e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatMatMult 1 1.0 2.7120e-03 1.0 3.16e+05 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 117 > > > MatMatMultSym 1 1.0 1.8010e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatMatMultNum 1 1.0 6.1703e-04 1.0 3.16e+05 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 513 > > > KSPSetUp 4 1.0 9.8944e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSolve 1 1.0 9.3380e+01 1.0 1.73e+11 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 97100 0 0 0 97100 0 0 0 1855 > > > PCSetUp 4 1.0 6.6326e-02 1.0 2.53e+07 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 381 > > > PCSetUpOnBlocks 5 1.0 2.4082e-02 1.0 2.49e+07 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1036 > > > PCApply 5 1.0 9.3376e+01 1.0 1.73e+11 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 97100 0 0 0 97100 0 0 0 1855 > > > KSPSolve_FS_0 5 1.0 7.0214e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSolve_FS_Schu 5 1.0 9.3372e+01 1.0 1.73e+11 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 97100 0 0 0 97100 0 0 0 1855 > > > KSPSolve_FS_Low 5 1.0 2.1377e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > Memory usage is given in bytes: > > > > > > Object Type Creations Destructions Memory Descendants' > > > Mem. > > > Reports information only for process 0. > > > > > > --- Event Stage 0: Main Stage > > > > > > Vector 92 92 9698040 0. > > > Vector Scatter 24 24 15936 0. > > > Index Set 51 51 537876 0. > > > IS L to G Mapping 3 3 240408 0. > > > Matrix 16 16 77377776 0. > > > Krylov Solver 6 6 7888 0. > > > Preconditioner 6 6 6288 0. > > > Viewer 1 0 0 0. > > > Distributed Mesh 1 1 4624 0. > > > Star Forest Bipartite Graph 2 2 1616 0. > > > Discrete System 1 1 872 0. > > > ======================================================================================================================== > > > Average time to get PetscTime(): 0. 
> > > #PETSc Option Table entries: > > > -ksp_monitor > > > -ksp_view > > > -log_view > > > #End of PETSc Option Table entries > > > Compiled without FORTRAN kernels > > > Compiled with full precision matrices (default) > > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > > > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > > > Configure options: --with-shared-libraries=1 --with-debugging=0 > > > --download-suitesparse --download-blacs --download-ptscotch=yes > > > --with-blas-lapack-dir=/opt/intel/system_studio_2015.2.050/mkl > > > --CXXFLAGS=-Wl,--no-as-needed --download-scalapack --download-mumps > > > --download-metis > > > --prefix=/home/dknez/software/libmesh_install/opt_real/petsc > > > --download-hypre --download-ml > > > ----------------------------------------- > > > Libraries compiled on Wed Sep 21 17:38:52 2016 on david-Lenovo > > > Machine characteristics: > > > Linux-4.4.0-38-generic-x86_64-with-Ubuntu-16.04-xenial > > > Using PETSc directory: /home/dknez/software/petsc-src > > > Using PETSc arch: arch-linux2-c-opt > > > ----------------------------------------- > > > > > > Using C compiler: mpicc -fPIC -Wall -Wwrite-strings > > > -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > ${COPTFLAGS} ${CFLAGS} > > > Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 > > > -Wno-unused-dummy-argument -g -O ${FOPTFLAGS} ${FFLAGS} > > > ----------------------------------------- > > > > > > Using include paths: > > > -I/home/dknez/software/petsc-src/arch-linux2-c-opt/include > > > -I/home/dknez/software/petsc-src/include > > > -I/home/dknez/software/petsc-src/include > > > -I/home/dknez/software/petsc-src/arch-linux2-c-opt/include > > > -I/home/dknez/software/libmesh_install/opt_real/petsc/include > > > -I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent > > > -I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include > > > -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi > > > ----------------------------------------- > > > > > > Using C linker: mpicc > > > Using Fortran linker: mpif90 > > > Using libraries: > > > -Wl,-rpath,/home/dknez/software/petsc-src/arch-linux2-c-opt/lib > > > -L/home/dknez/software/petsc-src/arch-linux2-c-opt/lib -lpetsc > > > -Wl,-rpath,/home/dknez/software/libmesh_install/opt_real/petsc/lib > > > -L/home/dknez/software/libmesh_install/opt_real/petsc/lib -lcmumps > > > -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lmetis -lHYPRE > > > -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib > > > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 > > > -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > > > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > > > -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -lscalapack -lml -lmpi_cxx > > > -lstdc++ -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd > > > -lsuitesparseconfig > > > -Wl,-rpath,/opt/intel/system_studio_2015.2.050/mkl/lib/intel64 > > > -L/opt/intel/system_studio_2015.2.050/mkl/lib/intel64 -lmkl_intel_lp64 > > > -lmkl_sequential -lmkl_core -lpthread -lm -lhwloc -lptesmumps -lptscotch > > > -lptscotcherr -lscotch -lscotcherr -lX11 -lm -lmpi_usempif08 > > > -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -lm -lgfortran -lm > > > -lquadmath -lm -lmpi_cxx -lstdc++ -lrt -lm -lpthread -lz > > > -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib > > > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 > > > -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > 
> > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu
> > > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu
> > > -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/usr/lib/openmpi/lib -lmpi
> > > -lgcc_s -lpthread -ldl
> > > -----------------------------------------
> > >
> > > On Wed, Jan 11, 2017 at 4:49 PM, Dave May <dave.mayhe...@gmail.com> wrote:
> > > It looks like the Schur solve is requiring a huge number of iterations to converge (based on the number of MatMult calls). This is killing the performance.
> > >
> > > Are you sure that A11 is a good approximation to S? You might consider trying the selfp option:
> > >
> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre
> > >
> > > Note that the best approximation to S is likely both problem and discretisation dependent, so if selfp is also terrible, you might want to consider coding up your own approximation to S for your specific system.
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Wed, 11 Jan 2017 at 22:34, David Knezevic <david.kneze...@akselos.com> wrote:
> > > I have a definite 2x2 block system and I figured it'd be good to apply the PCFIELDSPLIT functionality with a Schur complement, as described in Section 4.5 of the manual.
> > >
> > > The A00 block of my matrix is very small, so I figured I'd specify a direct solver (i.e. MUMPS) for that block.
> > >
> > > So I did the following (a sketch of this setup appears at the end of the thread):
> > > - PCFieldSplitSetIS to specify the indices of the two splits
> > > - PCFieldSplitGetSubKSP to get the two KSP objects, and to set the solver and PC types for each (MUMPS for A00, ILU+CG for A11)
> > > - I set -pc_fieldsplit_schur_fact_type full
> > >
> > > Below I have pasted the output of "-ksp_view -ksp_monitor -log_view" for a test case. It seems to converge well, but I'm concerned about the speed (about 90 seconds, vs. about 1 second if I use a direct solver for the entire system). I just wanted to check whether I'm setting this up in a good way?
> > >
> > > Many thanks,
> > > David
> > >
> > > -----------------------------------------------------------------------------------
> > >
> > > 0 KSP Residual norm 5.405774214400e+04
> > > 1 KSP Residual norm 1.849649014371e+02
> > > 2 KSP Residual norm 7.462775074989e-02
> > > 3 KSP Residual norm 2.680497175260e-04
> > > KSP Object: 1 MPI processes
> > >   type: cg
> > >   maximum iterations=1000
> > >   tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
> > >   left preconditioning
> > >   using nonzero initial guess
> > >   using PRECONDITIONED norm type for convergence test
> > > PC Object: 1 MPI processes
> > >   type: fieldsplit
> > >     FieldSplit with Schur preconditioner, factorization FULL
> > >     Preconditioner for the Schur complement formed from A11
> > >     Split info:
> > >     Split number 0 Defined by IS
> > >     Split number 1 Defined by IS
> > >     KSP solver for A00 block
> > >       KSP Object: (fieldsplit_RB_split_) 1 MPI processes
> > >         type: preonly
> > >         maximum iterations=10000, initial guess is zero
> > >         tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> > >         left preconditioning
> > >         using NONE norm type for convergence test
> > >       PC Object: (fieldsplit_RB_split_) 1 MPI processes
> > >         type: cholesky
> > >           Cholesky: out-of-place factorization
> > >           tolerance for zero pivot 2.22045e-14
> > >           matrix ordering: natural
> > >           factor fill ratio given 0., needed 0.
> > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > package used to perform factorization: mumps > > > total: nonzeros=3042, allocated nonzeros=3042 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 2 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > > > 0 > > > ICNTL(13) (efficiency control): > > > 0 > > > ICNTL(14) (percentage of estimated workspace > > > increase): 20 > > > ICNTL(18) (input mat struct): > > > 0 > > > ICNTL(19) (Shur complement info): > > > 0 > > > ICNTL(20) (rhs sparse pattern): > > > 0 > > > ICNTL(21) (solution struct): > > > 0 > > > ICNTL(22) (in-core/out-of-core facility): > > > 0 > > > ICNTL(23) (max size of memory can be allocated > > > locally):0 > > > ICNTL(24) (detection of null pivot rows): > > > 0 > > > ICNTL(25) (computation of a null space basis): > > > 0 > > > ICNTL(26) (Schur options for rhs or solution): > > > 0 > > > ICNTL(27) (experimental parameter): > > > -24 > > > ICNTL(28) (use parallel or sequential ordering): > > > 1 > > > ICNTL(29) (parallel ordering): > > > 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): > > > 0 > > > ICNTL(31) (factors is discarded in the solve phase): > > > 0 > > > ICNTL(33) (compute determinant): > > > 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): > > > 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0. > > > CNTL(4) (value of static pivoting): -1. > > > CNTL(5) (fixation for null pivots): 0. > > > RINFO(1) (local estimated flops for the elimination > > > after analysis): > > > [0] 29394. > > > RINFO(2) (local estimated flops for the assembly > > > after factorization): > > > [0] 1092. > > > RINFO(3) (local estimated flops for the elimination > > > after factorization): > > > [0] 29394. > > > INFO(15) (estimated size of (in MB) MUMPS internal > > > data for running numerical factorization): > > > [0] 1 > > > INFO(16) (size of (in MB) MUMPS internal data used > > > during numerical factorization): > > > [0] 1 > > > INFO(23) (num of pivots eliminated on this processor > > > after factorization): > > > [0] 324 > > > RINFOG(1) (global estimated flops for the elimination > > > after analysis): 29394. > > > RINFOG(2) (global estimated flops for the assembly > > > after factorization): 1092. > > > RINFOG(3) (global estimated flops for the elimination > > > after factorization): 29394. 
> > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > > > (0.,0.)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all > > > processors after analysis): 3888 > > > INFOG(4) (estimated integer workspace for factors on > > > all processors after analysis): 2067 > > > INFOG(5) (estimated maximum front size in the > > > complete tree): 12 > > > INFOG(6) (number of nodes in the complete tree): 53 > > > INFOG(7) (ordering option effectively use after > > > analysis): 2 > > > INFOG(8) (structural symmetry in percent of the > > > permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the > > > matrix factors after factorization): 3888 > > > INFOG(10) (total integer space store the matrix > > > factors after factorization): 2067 > > > INFOG(11) (order of largest frontal matrix after > > > factorization): 12 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after > > > factorization): 0 > > > INFOG(14) (number of memory compress after > > > factorization): 0 > > > INFOG(15) (number of steps of iterative refinement > > > after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS > > > internal data for factorization after analysis: value on the most memory > > > consuming processor): 1 > > > INFOG(17) (estimated size of all MUMPS internal data > > > for factorization after analysis: sum over all processors): 1 > > > INFOG(18) (size of all MUMPS internal data allocated > > > during factorization: value on the most memory consuming processor): 1 > > > INFOG(19) (size of all MUMPS internal data allocated > > > during factorization: sum over all processors): 1 > > > INFOG(20) (estimated number of entries in the > > > factors): 3042 > > > INFOG(21) (size in MB of memory effectively used > > > during factorization - value on the most memory consuming processor): 1 > > > INFOG(22) (size in MB of memory effectively used > > > during factorization - sum over all processors): 1 > > > INFOG(23) (after analysis: value of ICNTL(6) > > > effectively used): 5 > > > INFOG(24) (after analysis: value of ICNTL(12) > > > effectively used): 1 > > > INFOG(25) (after factorization: number of pivots > > > modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots > > > encountered): 0 > > > INFOG(29) (after factorization: effective number of > > > entries in the factors (sum over all processors)): 3042 > > > INFOG(30, 31) (after solution: size in Mbytes of > > > memory used during solution phase): 0, 0 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): -2 > > > INFOG(34) (exponent of the determinant if determinant > > > is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_RB_split_) 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > total: nonzeros=5760, allocated nonzeros=5760 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 108 nodes, limit used is 5 > > > KSP solver for S = A11 - A10 inv(A00) A01 > > > KSP Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: cg > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > > left preconditioning > > > using PRECONDITIONED norm type for convergence test > > > PC Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: bjacobi > > > block Jacobi: number of blocks = 1 > > > Local solve is same for all blocks, in the following KSP and PC > > > objects: > > > KSP Object: (fieldsplit_FE_split_sub_) 1 MPI > > > processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_FE_split_sub_) 1 MPI > > > processes > > > type: ilu > > > ILU: out-of-place factorization > > > 0 levels of fill > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 1., needed 1. > > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > package used to perform factorization: petsc > > > total: nonzeros=1017054, allocated nonzeros=1017054 > > > total number of mallocs used during MatSetValues > > > calls =0 > > > using I-node routines: found 9492 nodes, limit used > > > is 5 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_FE_split_) 1 > > > MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > total: nonzeros=1017054, allocated nonzeros=1017054 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9492 nodes, limit used is 5 > > > linear system matrix followed by preconditioner matrix: > > > Mat Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: schurcomplement > > > rows=28476, cols=28476 > > > Schur complement A11 - A10 inv(A00) A01 > > > A11 > > > Mat Object: (fieldsplit_FE_split_) > > > 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > total: nonzeros=1017054, allocated nonzeros=1017054 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9492 nodes, limit used is 5 > > > A10 > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=28476, cols=324 > > > total: nonzeros=936, allocated nonzeros=936 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 5717 nodes, limit used is 5 > > > KSP of A00 > > > KSP Object: (fieldsplit_RB_split_) > > > 1 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, > > > divergence=10000. > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_RB_split_) > > > 1 MPI processes > > > type: cholesky > > > Cholesky: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0., needed 0. 
> > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > package used to perform factorization: mumps > > > total: nonzeros=3042, allocated nonzeros=3042 > > > total number of mallocs used during MatSetValues > > > calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 2 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > > > 0 > > > ICNTL(13) (efficiency control): > > > 0 > > > ICNTL(14) (percentage of estimated workspace > > > increase): 20 > > > ICNTL(18) (input mat struct): > > > 0 > > > ICNTL(19) (Shur complement info): > > > 0 > > > ICNTL(20) (rhs sparse pattern): > > > 0 > > > ICNTL(21) (solution struct): > > > 0 > > > ICNTL(22) (in-core/out-of-core facility): > > > 0 > > > ICNTL(23) (max size of memory can be > > > allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): > > > 0 > > > ICNTL(25) (computation of a null space > > > basis): 0 > > > ICNTL(26) (Schur options for rhs or > > > solution): 0 > > > ICNTL(27) (experimental parameter): > > > -24 > > > ICNTL(28) (use parallel or sequential > > > ordering): 1 > > > ICNTL(29) (parallel ordering): > > > 0 > > > ICNTL(30) (user-specified set of entries in > > > inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve > > > phase): 0 > > > ICNTL(33) (compute determinant): > > > 0 > > > CNTL(1) (relative pivoting threshold): > > > 0.01 > > > CNTL(2) (stopping criterion of refinement): > > > 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0. > > > CNTL(4) (value of static pivoting): > > > -1. > > > CNTL(5) (fixation for null pivots): 0. > > > RINFO(1) (local estimated flops for the > > > elimination after analysis): > > > [0] 29394. > > > RINFO(2) (local estimated flops for the > > > assembly after factorization): > > > [0] 1092. > > > RINFO(3) (local estimated flops for the > > > elimination after factorization): > > > [0] 29394. > > > INFO(15) (estimated size of (in MB) MUMPS > > > internal data for running numerical factorization): > > > [0] 1 > > > INFO(16) (size of (in MB) MUMPS internal data > > > used during numerical factorization): > > > [0] 1 > > > INFO(23) (num of pivots eliminated on this > > > processor after factorization): > > > [0] 324 > > > RINFOG(1) (global estimated flops for the > > > elimination after analysis): 29394. > > > RINFOG(2) (global estimated flops for the > > > assembly after factorization): 1092. > > > RINFOG(3) (global estimated flops for the > > > elimination after factorization): 29394. 
> > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > > > (determinant): (0.,0.)*(2^0) > > > INFOG(3) (estimated real workspace for > > > factors on all processors after analysis): 3888 > > > INFOG(4) (estimated integer workspace for > > > factors on all processors after analysis): 2067 > > > INFOG(5) (estimated maximum front size in the > > > complete tree): 12 > > > INFOG(6) (number of nodes in the complete > > > tree): 53 > > > INFOG(7) (ordering option effectively use > > > after analysis): 2 > > > INFOG(8) (structural symmetry in percent of > > > the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to > > > store the matrix factors after factorization): 3888 > > > INFOG(10) (total integer space store the > > > matrix factors after factorization): 2067 > > > INFOG(11) (order of largest frontal matrix > > > after factorization): 12 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after > > > factorization): 0 > > > INFOG(14) (number of memory compress after > > > factorization): 0 > > > INFOG(15) (number of steps of iterative > > > refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all > > > MUMPS internal data for factorization after analysis: value on the most > > > memory consuming processor): 1 > > > INFOG(17) (estimated size of all MUMPS > > > internal data for factorization after analysis: sum over all processors): > > > 1 > > > INFOG(18) (size of all MUMPS internal data > > > allocated during factorization: value on the most memory consuming > > > processor): 1 > > > INFOG(19) (size of all MUMPS internal data > > > allocated during factorization: sum over all processors): 1 > > > INFOG(20) (estimated number of entries in the > > > factors): 3042 > > > INFOG(21) (size in MB of memory effectively > > > used during factorization - value on the most memory consuming > > > processor): 1 > > > INFOG(22) (size in MB of memory effectively > > > used during factorization - sum over all processors): 1 > > > INFOG(23) (after analysis: value of ICNTL(6) > > > effectively used): 5 > > > INFOG(24) (after analysis: value of ICNTL(12) > > > effectively used): 1 > > > INFOG(25) (after factorization: number of > > > pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of > > > null pivots encountered): 0 > > > INFOG(29) (after factorization: effective > > > number of entries in the factors (sum over all processors)): 3042 > > > INFOG(30, 31) (after solution: size in Mbytes > > > of memory used during solution phase): 0, 0 > > > INFOG(32) (after analysis: type of analysis > > > done): 1 > > > INFOG(33) (value used for ICNTL(8)): -2 > > > INFOG(34) (exponent of the determinant if > > > determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_RB_split_) > > > 1 MPI processes > > > type: seqaij > > > rows=324, cols=324 > > > total: nonzeros=5760, allocated nonzeros=5760 > > > total number of mallocs used during MatSetValues calls > > > =0 > > > using I-node routines: found 108 nodes, limit used is > > > 5 > > > A01 > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=324, cols=28476 > > > total: nonzeros=936, allocated nonzeros=936 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 67 nodes, limit used is 5 > > > Mat Object: (fieldsplit_FE_split_) 1 MPI processes > > > type: seqaij > > > rows=28476, cols=28476 > > > total: nonzeros=1017054, allocated nonzeros=1017054 > > > total 
number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9492 nodes, limit used is 5 > > > linear system matrix = precond matrix: > > > Mat Object: () 1 MPI processes > > > type: seqaij > > > rows=28800, cols=28800 > > > total: nonzeros=1024686, allocated nonzeros=1024794 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 9600 nodes, limit used is 5 > > > > > > > > > ---------------------------------------------- PETSc Performance Summary: > > > ---------------------------------------------- > > > > > > /home/dknez/akselos-dev/scrbe/build/bin/fe_solver-opt_real on a > > > arch-linux2-c-opt named david-Lenovo with 1 processor, by dknez Wed Jan > > > 11 16:16:47 2017 > > > Using Petsc Release Version 3.7.3, unknown > > > > > > Max Max/Min Avg Total > > > Time (sec): 9.179e+01 1.00000 9.179e+01 > > > Objects: 1.990e+02 1.00000 1.990e+02 > > > Flops: 1.634e+11 1.00000 1.634e+11 1.634e+11 > > > Flops/sec: 1.780e+09 1.00000 1.780e+09 1.780e+09 > > > MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 > > > MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 > > > MPI Reductions: 0.000e+00 0.00000 > > > > > > Flop counting convention: 1 flop = 1 real number operation of type > > > (multiply/divide/add/subtract) > > > e.g., VecAXPY() for real vectors of length N > > > --> 2N flops > > > and VecAXPY() for complex vectors of length N > > > --> 8N flops > > > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > > > --- -- Message Lengths -- -- Reductions -- > > > Avg %Total Avg %Total counts > > > %Total Avg %Total counts %Total > > > 0: Main Stage: 9.1787e+01 100.0% 1.6336e+11 100.0% 0.000e+00 > > > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > ------------------------------------------------------------------------------------------------------------------------ > > > See the 'Profiling' chapter of the users' manual for details on > > > interpreting output. > > > Phase summary info: > > > Count: number of times phase was executed > > > Time and Flops: Max - maximum over all processors > > > Ratio - ratio of maximum to minimum over all processors > > > Mess: number of messages sent > > > Avg. len: average message length (bytes) > > > Reduct: number of global reductions > > > Global: entire computation > > > Stage: stages of a computation. Set stages with PetscLogStagePush() > > > and PetscLogStagePop(). 
> > > %T - percent time in this phase %F - percent flops in this > > > phase > > > %M - percent messages in this phase %L - percent message > > > lengths in this phase > > > %R - percent reductions in this phase > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > > Event Count Time (sec) Flops > > > --- Global --- --- Stage --- Total > > > Max Ratio Max Ratio Max Ratio Mess Avg len > > > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > --- Event Stage 0: Main Stage > > > > > > VecDot 42 1.0 2.4080e-05 1.0 8.53e+03 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 354 > > > VecTDot 74012 1.0 1.2440e+00 1.0 4.22e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 1 3 0 0 0 1 3 0 0 0 3388 > > > VecNorm 37020 1.0 8.3580e-01 1.0 2.11e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 1 1 0 0 0 1 1 0 0 0 2523 > > > VecScale 37008 1.0 3.5800e-01 1.0 1.05e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 1 0 0 0 0 1 0 0 0 2944 > > > VecCopy 37034 1.0 2.5754e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecSet 74137 1.0 3.0537e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecAXPY 74029 1.0 1.7233e+00 1.0 4.22e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 2 3 0 0 0 2 3 0 0 0 2446 > > > VecAYPX 37001 1.0 1.2214e+00 1.0 2.11e+09 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 1 1 0 0 0 1 1 0 0 0 1725 > > > VecAssemblyBegin 68 1.0 2.0432e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecAssemblyEnd 68 1.0 2.5988e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecScatterBegin 48 1.0 4.6921e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatMult 37017 1.0 4.1269e+01 1.0 7.65e+10 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 45 47 0 0 0 45 47 0 0 0 1853 > > > MatMultAdd 37015 1.0 3.3638e+01 1.0 7.53e+10 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 37 46 0 0 0 37 46 0 0 0 2238 > > > MatSolve 74021 1.0 4.6602e+01 1.0 7.42e+10 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 51 45 0 0 0 51 45 0 0 0 1593 > > > MatLUFactorNum 1 1.0 1.7209e-02 1.0 2.44e+07 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1420 > > > MatCholFctrSym 1 1.0 8.8310e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatCholFctrNum 1 1.0 3.6907e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatILUFactorSym 1 1.0 3.7372e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatAssemblyBegin 29 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatAssemblyEnd 29 1.0 9.9473e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetRow 58026 1.0 2.8155e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetRowIJ 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetSubMatrice 6 1.0 1.5399e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetOrdering 2 1.0 3.0112e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatZeroEntries 6 1.0 2.9490e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatView 7 1.0 3.4356e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 
0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSetUp 4 1.0 9.4891e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSolve 1 1.0 8.8793e+01 1.0 1.63e+11 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 97100 0 0 0 97100 0 0 0 1840 > > > PCSetUp 4 1.0 3.8375e-02 1.0 2.44e+07 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 637 > > > PCSetUpOnBlocks 5 1.0 2.1250e-02 1.0 2.44e+07 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1150 > > > PCApply 5 1.0 8.8789e+01 1.0 1.63e+11 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 97100 0 0 0 97100 0 0 0 1840 > > > KSPSolve_FS_0 5 1.0 7.5364e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSolve_FS_Schu 5 1.0 8.8785e+01 1.0 1.63e+11 1.0 0.0e+00 0.0e+00 > > > 0.0e+00 97100 0 0 0 97100 0 0 0 1840 > > > KSPSolve_FS_Low 5 1.0 2.1019e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > Memory usage is given in bytes: > > > > > > Object Type Creations Destructions Memory Descendants' > > > Mem. > > > Reports information only for process 0. > > > > > > --- Event Stage 0: Main Stage > > > > > > Vector 91 91 9693912 0. > > > Vector Scatter 24 24 15936 0. > > > Index Set 51 51 537888 0. > > > IS L to G Mapping 3 3 240408 0. > > > Matrix 13 13 64097868 0. > > > Krylov Solver 6 6 7888 0. > > > Preconditioner 6 6 6288 0. > > > Viewer 1 0 0 0. > > > Distributed Mesh 1 1 4624 0. > > > Star Forest Bipartite Graph 2 2 1616 0. > > > Discrete System 1 1 872 0. > > > ======================================================================================================================== > > > Average time to get PetscTime(): 0. > > > #PETSc Option Table entries: > > > -ksp_monitor > > > -ksp_view > > > -log_view > > > #End of PETSc Option Table entries > > > Compiled without FORTRAN kernels > > > Compiled with full precision matrices (default) > > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > > > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > > > Configure options: --with-shared-libraries=1 --with-debugging=0 > > > --download-suitesparse --download-blacs --download-ptscotch=yes > > > --with-blas-lapack-dir=/opt/intel/system_studio_2015.2.050/mkl > > > --CXXFLAGS=-Wl,--no-as-needed --download-scalapack --download-mumps > > > --download-metis > > > --prefix=/home/dknez/software/libmesh_install/opt_real/petsc > > > --download-hypre --download-ml > > > ----------------------------------------- > > > Libraries compiled on Wed Sep 21 17:38:52 2016 on david-Lenovo > > > Machine characteristics: > > > Linux-4.4.0-38-generic-x86_64-with-Ubuntu-16.04-xenial > > > Using PETSc directory: /home/dknez/software/petsc-src > > > Using PETSc arch: arch-linux2-c-opt > > > ----------------------------------------- > > > > > > Using C compiler: mpicc -fPIC -Wall -Wwrite-strings > > > -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > ${COPTFLAGS} ${CFLAGS} > > > Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 > > > -Wno-unused-dummy-argument -g -O ${FOPTFLAGS} ${FFLAGS} > > > ----------------------------------------- > > > > > > Using include paths: > > > -I/home/dknez/software/petsc-src/arch-linux2-c-opt/include > > > -I/home/dknez/software/petsc-src/include > > > -I/home/dknez/software/petsc-src/include > > > -I/home/dknez/software/petsc-src/arch-linux2-c-opt/include > > > -I/home/dknez/software/libmesh_install/opt_real/petsc/include > > > 
-I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent > > > -I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include > > > -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi > > > ----------------------------------------- > > > > > > Using C linker: mpicc > > > Using Fortran linker: mpif90 > > > Using libraries: > > > -Wl,-rpath,/home/dknez/software/petsc-src/arch-linux2-c-opt/lib > > > -L/home/dknez/software/petsc-src/arch-linux2-c-opt/lib -lpetsc > > > -Wl,-rpath,/home/dknez/software/libmesh_install/opt_real/petsc/lib > > > -L/home/dknez/software/libmesh_install/opt_real/petsc/lib -lcmumps > > > -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lmetis -lHYPRE > > > -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib > > > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 > > > -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > > > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > > > -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -lscalapack -lml -lmpi_cxx > > > -lstdc++ -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd > > > -lsuitesparseconfig > > > -Wl,-rpath,/opt/intel/system_studio_2015.2.050/mkl/lib/intel64 > > > -L/opt/intel/system_studio_2015.2.050/mkl/lib/intel64 -lmkl_intel_lp64 > > > -lmkl_sequential -lmkl_core -lpthread -lm -lhwloc -lptesmumps -lptscotch > > > -lptscotcherr -lscotch -lscotcherr -lX11 -lm -lmpi_usempif08 > > > -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -lm -lgfortran -lm > > > -lquadmath -lm -lmpi_cxx -lstdc++ -lrt -lm -lpthread -lz > > > -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib > > > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 > > > -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > > > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > > > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > > > -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/usr/lib/openmpi/lib -lmpi > > > -lgcc_s -lpthread -ldl > > > ----------------------------------------- > > > > > > > > > > > > > > > > > > > > > > > > > > > <logfile_1.txt><logfile_2.txt> > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener
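-----------------------------------------

For readers reconstructing the setup discussed above, here is a minimal sketch of the PCFIELDSPLIT configuration David describes (two splits set via PCFieldSplitSetIS, full Schur factorization, MUMPS Cholesky on the small A00 block). It is a reconstruction under stated assumptions, not David's actual code: the matrix A and the index sets isRB and isFE are assumed to exist already.

    /* Sketch of the Schur-complement fieldsplit setup from this thread.
       A, isRB, isFE are assumptions; error checking omitted. */
    KSP       ksp, *subksp;
    PC        pc;
    PetscInt  nsplits;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCFIELDSPLIT);
    PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR);
    PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_FULL);
    PCFieldSplitSetIS(pc, "RB_split", isRB);
    PCFieldSplitSetIS(pc, "FE_split", isFE);
    KSPSetFromOptions(ksp);
    KSPSetUp(ksp);                                /* before grabbing sub-KSPs */
    PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);
    /* subksp[0] -> A00 (RB_split): preonly + Cholesky via MUMPS           */
    /* subksp[1] -> Schur complement (FE_split): CG + the PC under study   */
    PetscFree(subksp);

On the command line, the same factorization choice is -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full. Dave's suggestion corresponds to -pc_fieldsplit_schur_precondition selfp, the run that "helped a lot" used -fieldsplit_FE_split_pc_type cholesky, and -fieldsplit_FE_split_pc_type gamg (with a near null space attached, as sketched earlier in the thread) is the proposed next experiment.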