If each block is sequential, try replacing SuperLU_DIST with SuperLU, which 
would be more robust. You may also try MUMPS LU.
Hong
________________________________
From: petsc-users <[email protected]> on behalf of Barry Smith 
<[email protected]>
Sent: Saturday, June 11, 2022 9:45 AM
To: Jorti, Zakariae <[email protected]>
Cc: [email protected] <[email protected]>; Tang, Xianzhu 
<[email protected]>
Subject: Re: [petsc-users] [EXTERNAL] Question about SuperLU



On Jun 10, 2022, at 7:30 PM, Jorti, Zakariae 
<[email protected]> wrote:

Hi,

Thank you all for your answers.
I have tried your suggestions and here is what I found.
Barry, you were right about the first case. But in the second case, I am not 
using a Schur fieldsplit but a multiplicative fieldsplit:
-fieldsplit_TEBV_fieldsplit_EBV_pc_type fieldsplit 
-fieldsplit_TEBV_fieldsplit_EBV_pc_fieldsplit_type multiplicative

  The previous email indicated

b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B 
blocks:
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -

  which means there is a Schur complement PC inside the multiplicative one, so 
my explanation that the Schur complement "saves" the problem by passing into 
SuperLU_DIST a non-singular matrix that is some approximation to the Schur 
complement could still be true.

Then I tried this flag 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat 
binary:BVmat:binary_matlab and checked the resulting matrix in Matlab.
For Matlab, this is a full rank matrix, and the LU factorization there was 
carried out without any issues.
I also outputted the BV block directly from the Jacobian matrix.
Once again, according to Matlab, it is a full rank matrix and it computes the 
LU factorization without any problem.

  The BV matrix you saved into Matlab is a "block" matrix where the first block 
is B and the second block is V (presumably both the same size). Can you, in 
Matlab, extract the two blocks separately, examine them (via, say, spy), and 
also have Matlab factor each of them separately? In your failed fieldsplit case 
SuperLU_DIST is factoring each of these matrices separately, which could 
produce a zero pivot that would not occur when the larger matrix (of both 
blocks) is factored together. Let's see what happens with Matlab's solver.
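
  A toy illustration of that point, as a minimal sketch in pure Python rather 
than anything taken from the actual B and V matrices: the 4x4 matrix below is 
made up, and the helper name rank() is hypothetical. The full matrix has full 
rank, yet its two diagonal blocks are singular, so factoring the blocks one at 
a time would hit a zero pivot that a factorization of the whole matrix avoids.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination with row pivoting, in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

# Made-up "BV" matrix: a singular 2x2 first block, an all-zero second block,
# and identity coupling blocks off the diagonal.
bv = [[1, 2, 1, 0],
      [2, 4, 0, 1],
      [1, 0, 0, 0],
      [0, 1, 0, 0]]
b_block = [row[:2] for row in bv[:2]]   # first diagonal block
v_block = [row[2:] for row in bv[2:]]   # second diagonal block

print("rank of full matrix:", rank(bv), "of", len(bv))
print("rank of first block:", rank(b_block), "of", len(b_block))
print("rank of second block:", rank(v_block), "of", len(v_block))
```

  The same check in Matlab would amount to slicing the saved matrix into its 
two diagonal blocks and calling rank or lu on each one.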

   It looks like you are running on one rank?  If the above process is not 
informative, this is what to do next.

   Use PetscBinaryWrite() from Matlab to save each of the two blocks (one for B 
and one for V) to two files. Then use a simple standalone PETSc code, say 
src/ksp/ksp/tutorials/ex10.c to read each of the files and use SuperLU_DIST 
directly on each of the two linear systems. This will, at least to my 
understanding, result in the exact same SuperLU_DIST solves that you get with 
the failed use of PCFIELDSPLIT. Whether they succeed or fail will be very 
informative.

  Barry




So, there should not be any Schur complement approximation Sp.

When I ran a test with 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pre 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged, I got 
this error:

    0 SNES Function norm 6.368031218939e-02
      0 KSP Residual norm 6.368031218939e-02
      Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0
        Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to 
CONVERGED_RTOL iterations 1
          Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged 
due to CONVERGED_RTOL iterations 3
[0]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[0]PETSC ERROR: Zero pivot in LU factorization: 
https://petsc.org/release/faq/#zeropivot
[0]PETSC ERROR: Zero pivot in row 1658
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-751-g2f43bd9bc3  GIT 
Date: 2022-01-26 22:34:02 -0600
[0]PETSC ERROR: ./main on a macx named 
pn2032683.lanl.gov by zjorti Fri Jun 10 16:17:35 
2022
[0]PETSC ERROR: Configure options PETSC_ARCH=macx --with-fc=0 
--with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 
--with-cxx-dialect=C++11 --download-superlu_dist --download-parmetis 
--download-metis --download-ptscotch --download-cmake





________________________________
From: Barry Smith <[email protected]>
Sent: Friday, June 10, 2022 7:32 AM
To: Jorti, Zakariae
Cc: [email protected]; Tang, Xianzhu
Subject: [EXTERNAL] Re: [petsc-users] Question about SuperLU


  It is difficult to tell exactly how the preconditioner is being formed from 
the information below, but it looks like in the

first case: the original B diagonal block and V diagonal block of the matrix 
are being factored separately with SuperLU_DIST

second case: the B block is factored with SuperLU_DIST and an explicit 
approximation to a Schur complement of the V block (Schur complement on 
eliminating the B block) is formed using "Preconditioner for the Schur 
complement formed from Sp, an assembled approximation to S, which uses A00's 
diagonal's inverse" (this is the printout from a KSPView() for this part of 
the preconditioner).

  My guess is you have a "Stokes"-like problem where the V block is identically 
0 so, of course, SuperLU_DIST will fail on it. But the approximation of the 
Schur complement onto that block is not singular, so SuperLU_DIST has no 
trouble. If I am wrong and the V block is not identically 0, then it may be 
singular (or possibly, but less likely, just badly ordered) so that 
SuperLU_DIST encounters a zero pivot.
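
  A small worked example of that guess, as a sketch under stated assumptions: 
the blocks below are made up, the helper name selfp() is hypothetical, and the 
formula Sp = A11 - A10 inv(diag(A00)) A01 is the one PETSc documents for the 
selfp option. Here A00 plays the role of the B block and A11 the role of an 
identically zero V block.

```python
from fractions import Fraction

def selfp(a00, a01, a10, a11):
    """Return Sp = A11 - A10 * inv(diag(A00)) * A01 for small dense blocks."""
    n0 = len(a00)   # size of the eliminated (A00) block
    n1 = len(a11)   # size of the Schur complement block
    inv_diag = [Fraction(1) / Fraction(a00[k][k]) for k in range(n0)]
    return [[Fraction(a11[i][j])
             - sum(Fraction(a10[i][k]) * inv_diag[k] * Fraction(a01[k][j])
                   for k in range(n0))
             for j in range(n1)]
            for i in range(n1)]

# Made-up saddle-point blocks: A00 is nonsingular, A11 is identically zero.
a00 = [[2, 0],
       [0, 4]]
a01 = [[1],
       [1]]
a10 = [[1, 1]]
a11 = [[0]]

sp = selfp(a00, a01, a10, a11)
print("A11 =", a11)   # zero block: a direct LU of it has a zero pivot
print("Sp  =", sp)    # nonzero, so it can be factored
```

  In this toy case A11 is the 1x1 zero matrix while Sp is -3/4, which is why an 
LU factorization applied to Sp can succeed where one applied to A11 cannot.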

  You can run with -ksp_view_pre to have the KSP print the KSP solver algorithm 
details BEFORE the linear solve (hence they would get printed despite your 
failed solve). That would be useful to see exactly what your preconditioner is.

   You can use -ksp_view_pmat (with appropriate prefix) to display the matrix 
that is going to be factored. Thus you can quickly verify what V is.

  If you run with -ksp_error_if_not_converged then the solver will stop exactly 
when the zero pivot is encountered; this would include some information from 
SuperLU_DIST which might include the row number etc.

  Notes on PETSc improvements needed.

1) The man page for KSPCheckSolve() is terribly misleading

2) It would be nice to have a view that displayed the nested fieldsplit 
preconditioners more clearly






On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users 
<[email protected]> wrote:


Hi,

I am solving a nonlinear problem that has 5 unknowns {ni, T, E, B, V}, and for 
the preconditioning part, I am using a FieldSplit preconditioner. At the last 
fieldsplit level, we are left with a {B,V} block that I tried to precondition 
in 2 different ways:
a) SuperLU:
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type 
superlu_dist
b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B 
blocks:
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition 
selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type 
preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type
 superlu_dist 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu 
-fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type
 superlu_dist

Option a) yields the following error:
"     Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0
        Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to 
CONVERGED_RTOL iterations 1
          Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged 
due to CONVERGED_RTOL iterations 5
          Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not 
converge due to DIVERGED_PC_FAILED iterations 0
                         PC failed due to FACTOR_NUMERIC_ZEROPIVOT "
whereas option b) seems to be working well.
Is it possible that SuperLU on the {V,B} block uses a reordering that 
introduces a zero pivot, or could there be another explanation for this error?

Many thanks.
Best,

Zakariae
