$ ./ex10 -pc_type lu -ksp_monitor_true_residual -f0 ~/Downloads/mat.dat -rhs ~/Downloads/rhs.dat -ksp_converged_reason Linear solve did not converge due to DIVERGED_NANORINF iterations 0 Number of iterations = 0 Residual norm 0.0220971
The matrix has a zero pivot with the nd ordering $ ./ex10 -pc_type lu -ksp_monitor_true_residual -f0 ~/Downloads/mat.dat -rhs ~/Downloads/rhs.dat -ksp_converged_reason -ksp_error_if_not_converged [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Zero pivot in LU factorization: http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot [0]PETSC ERROR: Zero pivot row 5 value 0. tolerance 2.22045e-14 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.6.2-1539-g5ca2a2b GIT Date: 2015-11-13 02:00:47 -0600 [0]PETSC ERROR: ./ex10 on a arch-mpich-nemesis named Barrys-MacBook-Pro.local by barrysmith Sun Nov 15 21:37:15 2015 [0]PETSC ERROR: Configure options --download-mpich --download-mpich-device=ch3:nemesis [0]PETSC ERROR: #1 MatPivotCheck_none() line 688 in /Users/barrysmith/Src/PETSc/include/petsc/private/matimpl.h [0]PETSC ERROR: #2 MatPivotCheck() line 707 in /Users/barrysmith/Src/PETSc/include/petsc/private/matimpl.h [0]PETSC ERROR: #3 MatLUFactorNumeric_SeqAIJ_Inode() line 1332 in /Users/barrysmith/Src/PETSc/src/mat/impls/aij/seq/inode.c [0]PETSC ERROR: #4 MatLUFactorNumeric() line 2946 in /Users/barrysmith/Src/PETSc/src/mat/interface/matrix.c [0]PETSC ERROR: #5 PCSetUp_LU() line 152 in /Users/barrysmith/Src/PETSc/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: #6 PCSetUp() line 984 in /Users/barrysmith/Src/PETSc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #7 KSPSetUp() line 332 in /Users/barrysmith/Src/PETSc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #8 main() line 312 in /Users/barrysmith/Src/petsc/src/ksp/ksp/examples/tutorials/ex10.c [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -f0 /Users/barrysmith/Downloads/mat.dat [0]PETSC ERROR: -ksp_converged_reason [0]PETSC ERROR: -ksp_error_if_not_converged [0]PETSC ERROR: -ksp_monitor_true_residual [0]PETSC ERROR: -malloc_test [0]PETSC ERROR: -pc_type lu [0]PETSC ERROR: -rhs /Users/barrysmith/Downloads/rhs.dat [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected] application called MPI_Abort(MPI_COMM_WORLD, 71) - process 0 ~/Src/petsc/src/ksp/ksp/examples/tutorials (barry/utilize-hwloc *>) arch-mpich-nemesis $ ./ex10 -pc_type lu -ksp_monitor_true_residual -f0 ~/Downloads/mat.dat -rhs ~/Downloads/rhs.dat -ksp_converged_reason -ksp_error_if_not_converged -pc_factor_nonzeros_along_diagonal" > ~/Src/petsc/src/ksp/ksp/examples/tutorials (barry/utilize-hwloc *>) arch-mpich-nemesis $ ./ex10 -pc_type lu -ksp_monitor_true_residual -f0 ~/Downloads/mat.dat -rhs ~/Downloads/rhs.dat -ksp_converged_reason -ksp_error_if_not_converged -pc_factor_nonzeros_along_diagonal 0 KSP preconditioned resid norm 1.905901677970e+00 true resid norm 2.209708691208e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.703926496877e-14 true resid norm 5.880234823611e-15 ||r(i)||/||b|| 2.661090507997e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 Number of iterations = 1 Residual norm < 1.e-12 ~/Src/petsc/src/ksp/ksp/examples/tutorials (barry/utilize-hwloc *>) arch-mpich-nemesis Jan, Remember you ALWAYS have to call KSPConvergedReason() after a KSPSolve to see what happened in the solve. > On Nov 15, 2015, at 8:53 PM, Hong <[email protected]> wrote: > > Jan: > I can reproduce reported behavior using > petsc/src/ksp/ksp/examples/tutorials/ex10.c on your mat.dat and rhs.dat. > > Using petsc sequential lu with default ordering 'nd', I get > ./ex10 -f0 mat.dat -rhs rhs.dat -pc_type lu > Number of iterations = 0 > Residual norm 0.0220971 > > Changing to > ./ex10 -f0 mat.dat -rhs rhs.dat -pc_type lu -pc_factor_mat_ordering_type > natural > Number of iterations = 1 > Residual norm < 1.e-12 > > Back to superlu_dist, I get > mpiexec -n 3 ./ex10 -f0 /homes/hzhang/tmp/mat.dat -rhs > /homes/hzhang/tmp/rhs.dat -pc_type lu -pc_factor_mat_solver_package > superlu_dist > Number of iterations = 4 > Residual norm 25650.8 > > which uses default ordering (-ksp_view) > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > > Run it with > mpiexec -n 3 ./ex10 -f0 mat.dat -rhs rhs.dat -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_rowperm NATURAL > -mat_superlu_dist_colperm NATURAL > Number of iterations = 1 > Residual norm < 1.e-12 > > i.e., your problem is sensitive to matrix ordering, which I do not know why. > > I checked condition number of your mat.dat using superlu: > ./ex10 -f0 mat.dat -rhs rhs.dat -pc_type lu -pc_factor_mat_solver_package > superlu -mat_superlu_conditionnumber > Recip. condition number = 1.137938e-03 > Number of iterations = 1 > Residual norm < 1.e-12 > > As you see, matrix is well-conditioned. Why is it so sensitive to matrix > ordering? > > Hong > > Using attached petsc4py code, matrix and right-hand side, SuperLU_dist > returns totally wrong solution for mixed Laplacian: > > $ tar -xzf report.tar.gz > $ python test-solve.py -pc_factor_mat_solver_package mumps -ksp_final_residual > KSP final norm of residual 3.81865e-15 > $ python test-solve.py -pc_factor_mat_solver_package umfpack > -ksp_final_residual > KSP final norm of residual 3.68546e-14 > $ python test-solve.py -pc_factor_mat_solver_package superlu_dist > -ksp_final_residual > KSP final norm of residual 1827.72 > > Moreover final residual is random when run using mpirun -np 3. Maybe > a memory corruption issue? This is reproducible using PETSc 3.6.2 (and > SuperLU_dist configured by PETSc) and much older, see > http://fenicsproject.org/pipermail/fenics-support/2014-March/000439.html > but has never been reported upstream. > > The code for assembling the matrix and rhs using FEniCS is also > included for the sake of completeness. > > Jan >
