On 30/6/2014 1:53 PM, Barry Smith wrote:
On Jun 30, 2014, at 12:00 AM, TAY wee-beng wrote:
Hi,
I have a CFD code which gives an error when solving the momentum eqn at time step
= 1109. KSPGetConvergedReason gives a value < 0 with the optimized build.
What value < 0? It is possible there is no bug. Bi-CG-stab (though it is
stabilized) is not always stable and it can come to grief even if the matrix and right
hand side are “reasonable”. Or the preconditioner may be generating
inappropriately huge values (for example if ILU is being used inside it).
Yes, don’t try to print the matrix or anything like that.
I would start by trying with KSPBCGSL (manual page below). It is designed
to be more stable than Bi-CG-stab. Try it with the default options; you can
also increase the ell if it fails.
GMRES is always a good bet but I am thinking you are not using it because
it requires too much memory due to restart length.
Barry
KSPBCGSL - Implements a slight variant of the Enhanced
BiCGStab(L) algorithm in (3) and (2). The variation
concerns cases when either kappa0**2 or kappa1**2 is
negative due to round-off. Kappa0 has also been pulled
out of the denominator in the formula for ghat.
References:
1. G.L.G. Sleijpen, H.A. van der Vorst, "An overview of
approaches for the stable computation of hybrid BiCG
methods", Applied Numerical Mathematics: Transactions
of IMACS, 19(3), pp 235-54, 1996.
2. G.L.G. Sleijpen, H.A. van der Vorst, D.R. Fokkema,
"BiCGStab(L) and other hybrid Bi-CG methods",
Numerical Algorithms, 7, pp 75-109, 1994.
3. D.R. Fokkema, "Enhanced implementation of BiCGStab(L)
for solving linear systems of equations", preprint
from www.citeseer.com.
Contributed by: Joel M. Malard
Options Database Keys:
+ -ksp_bcgsl_ell <ell> - Number of Krylov search directions, defaults to 2 --
KSPBCGSLSetEll()
. -ksp_bcgsl_cxpol - Use a convex function of the MinRes and OR polynomials
after the BiCG step instead of default MinRes -- KSPBCGSLSetPol()
. -ksp_bcgsl_mrpoly - Use the default MinRes polynomial after the BiCG step
-- KSPBCGSLSetPol()
. -ksp_bcgsl_xres <res> - Threshold used to decide when to refresh computed
residuals -- KSPBCGSLSetXRes()
- -ksp_bcgsl_pinv <true/false> - (de)activate use of pseudoinverse --
KSPBCGSLSetUsePseudoinverse()
Level: beginner
.seealso: KSPCreate(), KSPSetType(), KSPType (for list of available types),
KSP, KSPFGMRES, KSPBCGS, KSPSetPCSide(), KSPBCGSLSetEll(), KSPBCGSLSetXRes()
Hi Barry,
I mean when I run:
KSPGetConvergedReason(ksp_semi_xyz,reason,ierr)
reason < 0.
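Since the specific negative value matters here, the following is a minimal sketch of a lookup from the integer returned in reason to its symbolic name. The helper diverged_reason_name() is hypothetical, and the numeric values are written from memory of petscksp.h, so treat them as an assumption and check them against the header of the installed PETSc version (older releases may spell some names differently). Running with the option -ksp_converged_reason should print the same information without any code change.

```c
#include <stddef.h>

/* Negative KSPConvergedReason values, as recalled from petscksp.h.
   ASSUMPTION: verify against the header of your own PETSc version;
   some names differ between releases. */
typedef struct {
  int         code;
  const char *name;
} ReasonName;

static const ReasonName diverged_reasons[] = {
  {-2,  "KSP_DIVERGED_NULL"},
  {-3,  "KSP_DIVERGED_ITS"},            /* hit the iteration limit */
  {-4,  "KSP_DIVERGED_DTOL"},           /* residual grew past divergence tolerance */
  {-5,  "KSP_DIVERGED_BREAKDOWN"},      /* breakdown in the Krylov method */
  {-6,  "KSP_DIVERGED_BREAKDOWN_BICG"},
  {-7,  "KSP_DIVERGED_NONSYMMETRIC"},
  {-8,  "KSP_DIVERGED_INDEFINITE_PC"},
  {-9,  "KSP_DIVERGED_NANORINF"},       /* NaN or Inf encountered */
  {-10, "KSP_DIVERGED_INDEFINITE_MAT"},
};

/* Hypothetical helper: name for a negative reason, or "UNKNOWN". */
const char *diverged_reason_name(int reason)
{
  size_t n = sizeof(diverged_reasons) / sizeof(diverged_reasons[0]);
  for (size_t i = 0; i < n; i++) {
    if (diverged_reasons[i].code == reason) return diverged_reasons[i].name;
  }
  return "UNKNOWN";
}
```

Printing diverged_reason_name(reason) right after the solve would distinguish, for example, a plain iteration-limit failure from a breakdown or a NaN.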
I forgot to add that the problem happens with my newly modified code. In
my old code, it works perfectly. So during my modification, the matrix
or vector may have been changed unintentionally. By right, the new and
old code should give the same matrix, except for small differences due
to truncation error. Based on this information, is there a better way to
debug? I will also change to KSPBCGSL as suggested.
Thanks
Regards.
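One way to act on the "old and new code should give the same matrix" observation, without printing 13 million rows, might be to compare the locally owned values of the right-hand side (or of the matrix rows) from both codes at the same time step and report only the worst mismatch. The sketch below is PETSc-free and assumes real double-precision values already copied into plain arrays; max_abs_diff() is a hypothetical helper, not a PETSc routine.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the largest |a[i] - b[i]|
   and stores that difference in *maxdiff.
   If a difference is NaN or Inf, returns that index immediately,
   since a non-finite entry is the most suspicious mismatch.
   For n == 0 it returns 0 with *maxdiff == 0. */
size_t max_abs_diff(const double *a, const double *b, size_t n, double *maxdiff)
{
  size_t imax = 0;
  *maxdiff = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = fabs(a[i] - b[i]);
    if (!isfinite(d)) {
      *maxdiff = d;
      return i;
    }
    if (d > *maxdiff) {
      *maxdiff = d;
      imax = i;
    }
  }
  return imax;
}
```

Each process would report its own worst index and difference; an entry that differs by far more than round-off points at the part of the modification to inspect.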
I retried using a debug build and it gives the error below. I submitted the job to a job
scheduler on 32 procs. So what is the best way to debug? Should I print out the
matrix? It is very big, since the grid size is 13 million.
Thanks. Regards.
[n12-10:13681] 31 more processes have sent help message help-mpi-btl-base.txt /
btl:no-nics
[n12-10:13681] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
help / error messages
[17]PETSC ERROR:
------------------------------------------------------------------------
[17]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably
divide by zero
[17]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[17]PETSC ERROR: or see
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[17]PETSC ERROR: or
try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory
corruption errors
[17]PETSC ERROR: likely location of problem given in stack below
[17]PETSC ERROR: --------------------- Stack Frames
------------------------------------
[17]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[17]PETSC ERROR: INSTEAD the line number of the start of the function
[17]PETSC ERROR: is given.
[17]PETSC ERROR: [17] VecNorm_MPI line 57
/home/wtay/Codes/petsc-3.4.4/src/vec/vec/impls/mpi/pvec2.c
[17]PETSC ERROR: [17] VecNorm line 224
/home/wtay/Codes/petsc-3.4.4/src/vec/vec/interface/rvector.c
[17]PETSC ERROR: [17] KSPSolve_BCGS line 39
/home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/impls/bcgs/bcgs.c
[17]PETSC ERROR: [17] KSPSolve line 356
/home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c
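
The trace puts the floating point exception inside VecNorm called from KSPSolve_BCGS, which would be consistent with (though it does not prove) a NaN or Inf already sitting in a vector handed to the solver. A minimal sketch of a check that could be run on the local part of the right-hand side before calling KSPSolve follows. It assumes real double-precision scalars and that the local values are available as a plain array (for example through VecGetArray); first_nonfinite() is a hypothetical helper.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first NaN or Inf
   entry in x[0..n-1], or n if every entry is finite. */
size_t first_nonfinite(const double *x, size_t n)
{
  for (size_t i = 0; i < n; i++) {
    if (!isfinite(x[i])) return i;
  }
  return n;
}
```

If the returned index is less than n on any process, the problem was introduced while assembling the vector rather than inside the Krylov solver, and the local index narrows down which grid point to inspect.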
--
Thank you
Yours sincerely,
TAY wee-beng