Dominik Szczerba wrote:
> After several months of code stability I experience now the famous
> infamous dependence of my results on the # of CPUs (and more precisely,
> divergence with -np > 4). I upgraded Petsc version meanwhile, but I want
> to start by assuming I have a problem. Any pointers to how to set up a
> trap for this bug are highly appreciated.
It's hopeless to get bit-for-bit identical answers from any parallel
program, but the situation is much trickier with iterative solvers
because almost every preconditioner is a different algorithm when you
change the distribution. That said, you should be able to get the
differences to converge by tightening tolerances (until you run out of
floating point precision with very ill-conditioned systems).

When you have a convergence problem in the linear solver, try a direct
solver, -pc_type lu -pc_factor_mat_solver_package mumps (or even
-pc_type redundant if you don't have a parallel direct solver). You can
also write out a matrix in parallel (see MatView()) and compare it to
the serial matrix.

If you use SNES, I recommend running a small problem size in parallel
and using -snes_type test, which compares your Jacobian to one computed
by finite differences.

If you are still having problems after confirming that your function
evaluation and matrix assembly are correct in parallel, the
preconditioner is almost certainly the culprit. You have 3 options:

1. Use a parallel direct solver like MUMPS.

2. Try to brute-force a standard preconditioner, e.g.
   -pc_type asm -pc_asm_overlap 3 -sub_pc_type lu

3. Read up on/design a problem-specific preconditioner that actually
   works and implement it.

As Barry said here the other day, (3) is probably only a good choice if
you want to do research in this rather than get some problems solved.

Jed
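P.S. For concreteness, here is a sketch of how the options above might
be combined on the command line. The executable name ./app, the process
counts, and the tolerance value are placeholders, not something taken
from your setup:

  # Tighten tolerances and watch the true (unpreconditioned) residual:
  mpiexec -n 8 ./app -ksp_rtol 1e-12 -ksp_monitor_true_residual -ksp_converged_reason

  # Take the preconditioner out of the picture with a parallel direct solve:
  mpiexec -n 8 ./app -pc_type lu -pc_factor_mat_solver_package mumps

  # Brute-force option (2): overlapping Schwarz with exact subdomain solves:
  mpiexec -n 8 ./app -pc_type asm -pc_asm_overlap 3 -sub_pc_type lu

  # Check the hand-coded Jacobian against finite differences on a small problem:
  mpiexec -n 2 ./app -snes_type test

If the direct solve converges where your usual preconditioner diverges,
that points at the preconditioner rather than at the assembly.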
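And a minimal sketch of the MatView() dump mentioned above, assuming A
is your assembled system matrix and a reasonably recent PETSc (older
releases pass the viewer to PetscViewerDestroy() by value); error
checking is omitted. Write the matrix from both the serial and the
parallel run and compare the two binary files, e.g. by loading them
back with MatLoad():

  #include <petscmat.h>

  /* Drop this in right after assembly: dump A in PETSc binary format
     so the np=1 and np>1 matrices can be compared offline. */
  PetscViewer viewer;
  PetscViewerBinaryOpen(PETSC_COMM_WORLD, "A.bin", FILE_MODE_WRITE, &viewer);
  MatView(A, viewer);
  PetscViewerDestroy(&viewer);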
