Hi Barry, Thanks for your reply. I was wondering why this happens only when we use superlu_dist. I am trying to understand the algorithm in superlu_dist. If we use ASM or MUMPS, we do not produce these differences.
The differences actually are NOT meaningless. In fact, we have a real transient application that presents this issue. When we run the simulation with superlu_dist in parallel for thousands of time steps, the final physics solution looks totally different from different runs. The differences are not acceptable any more. For a steady problem, the difference may be meaningless. But it is significant for the transient problem. This makes the solution not reproducible, and we can not even set a targeting solution in the test system because the solution is so different from one run to another. I guess there might/may be a tiny bug in superlu_dist or the PETSc interface to superlu_dist. Fande, On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote: > > Meaningless differences > > > > On Nov 15, 2017, at 2:26 PM, Kong, Fande <fande.k...@inl.gov> wrote: > > > > Hi, > > > > There is a heat conduction problem. When superlu_dist is used as a > preconditioner, we have random results from different runs. Is there a > random algorithm in superlu_dist? If we use ASM or MUMPS as the > preconditioner, we then don't have this issue. > > > > run 1: > > > > 0 Nonlinear |R| = 9.447423e+03 > > 0 Linear |R| = 9.447423e+03 > > 1 Linear |R| = 1.013384e-02 > > 2 Linear |R| = 4.020995e-08 > > 1 Nonlinear |R| = 1.404678e-02 > > 0 Linear |R| = 1.404678e-02 > > 1 Linear |R| = 5.104757e-08 > > 2 Linear |R| = 7.699637e-14 > > 2 Nonlinear |R| = 5.106418e-08 > > > > > > run 2: > > > > 0 Nonlinear |R| = 9.447423e+03 > > 0 Linear |R| = 9.447423e+03 > > 1 Linear |R| = 1.013384e-02 > > 2 Linear |R| = 4.020995e-08 > > 1 Nonlinear |R| = 1.404678e-02 > > 0 Linear |R| = 1.404678e-02 > > 1 Linear |R| = 5.109913e-08 > > 2 Linear |R| = 7.189091e-14 > > 2 Nonlinear |R| = 5.111591e-08 > > > > run 3: > > > > 0 Nonlinear |R| = 9.447423e+03 > > 0 Linear |R| = 9.447423e+03 > > 1 Linear |R| = 1.013384e-02 > > 2 Linear |R| = 4.020995e-08 > > 1 Nonlinear |R| = 1.404678e-02 > > 0 Linear |R| = 1.404678e-02 > > 1 Linear |R| = 5.104942e-08 > > 2 Linear |R| = 7.465572e-14 > > 2 Nonlinear |R| = 5.106642e-08 > > > > run 4: > > > > 0 Nonlinear |R| = 9.447423e+03 > > 0 Linear |R| = 9.447423e+03 > > 1 Linear |R| = 1.013384e-02 > > 2 Linear |R| = 4.020995e-08 > > 1 Nonlinear |R| = 1.404678e-02 > > 0 Linear |R| = 1.404678e-02 > > 1 Linear |R| = 5.102730e-08 > > 2 Linear |R| = 7.132220e-14 > > 2 Nonlinear |R| = 5.104442e-08 > > > > Solver details: > > > > SNES Object: 8 MPI processes > > type: newtonls > > maximum iterations=15, maximum function evaluations=10000 > > tolerances: relative=1e-08, absolute=1e-11, solution=1e-50 > > total number of linear solver iterations=4 > > total number of function evaluations=7 > > norm schedule ALWAYS > > SNESLineSearch Object: 8 MPI processes > > type: basic > > maxstep=1.000000e+08, minlambda=1.000000e-12 > > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > > maximum iterations=40 > > KSP Object: 8 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=100, initial guess is zero > > tolerances: relative=1e-06, absolute=1e-50, divergence=10000. > > right preconditioning > > using UNPRECONDITIONED norm type for convergence test > > PC Object: 8 MPI processes > > type: lu > > out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0., needed 0. > > Factored matrix follows: > > Mat Object: 8 MPI processes > > type: superlu_dist > > rows=7925, cols=7925 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 4 x npcol 2 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots FALSE > > Use iterative refinement TRUE > > Processors in row 4 col partition 2 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern > > linear system matrix followed by preconditioner matrix: > > Mat Object: 8 MPI processes > > type: mffd > > rows=7925, cols=7925 > > Matrix-free approximation: > > err=1.49012e-08 (relative error in function evaluation) > > Using wp compute h routine > > Does not compute normU > > Mat Object: () 8 MPI processes > > type: mpiaij > > rows=7925, cols=7925 > > total: nonzeros=63587, allocated nonzeros=63865 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > > > > > Fande, > > > > > >