1. Code going through line 692 looses the near nullspace of the matrices 
attached to the sub-KSPs
 2. The call to MatZeroRowsColumns() changes then non-zero structure for MPIAIJ 
but not for SEQAIJ
      (unless MAT_KEEP_NONZERO_PATTERN is used)

MatZeroRowsColumns() manual page states:

Unlike `MatZeroRows()` this does not change the nonzero structure of the 
matrix, it merely zeros those entries in the matrix.

 MatZeroRowsColumns_MPIAIJ() has the code

  /* only change matrix nonzero state if pattern was allowed to be changed */
  if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
    PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
    PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, MPI_SUM, 
PetscObjectComm((PetscObject)A)));
  }

The if() test is simply an optimization to avoid the reduction if 
keepnonzeropattern is true. In your first run (when MAT_KEEP_NONZERO_PATTERN is 
not used) the  keepnonzeropattern is not set so the A matrix nonzerostate is 
updated using the nonzerostate of the two submatrices. But that A nonzero state 
will only change if one of the two submatrices nonzerostate changes. In the 
second run the A nonzerostate cannot be changed.

MatZeroRowsColumns_SeqAIJ() has the code

  if (diag != 0.0) {
    PetscCall(MatMissingDiagonal_SeqAIJ(A, &missing, &d));
    if (missing) {
      for (i = 0; i < N; i++) {
        if (rows[i] >= A->cmap->N) continue;
        PetscCheck(!a->nonew || rows[i] < d, PETSC_COMM_SELF, 
PETSC_ERR_ARG_WRONGSTATE, "Matrix is missing diagonal entry in row %" 
PetscInt_FMT " (%" PetscInt_FMT ")", d, rows[i]);
        PetscCall(MatSetValues_SeqAIJ(A, 1, &rows[i], 1, &rows[i], &diag, 
INSERT_VALUES));

which adds a nonzero to the matrix to fill in a missing diagonal. This will 
change the nonzerostate of the matrix, but in a different way than 
keepnonzeropattern flag is for.

So, do you have missing diagonals in your matrix?

Bug 1 - documentation
The documentation statement "Unlike `MatZeroRows()` this does not change the 
nonzero structure of the matrix, it merely zeros those entries in the matrix." 
appears to be incorrect for matrices missing diagonal entries since new 
nonzeros are added (to fill in the diagonal).  The documentation should really 
say 

"Unlike `MatZeroRows()` this routine  cannot remove the zeroed entries from the 
nonzero structure of the matrix; in other words setting the option 
`MAT_KEEP_NONZERO_PATTERN to PETSC_FALSE has no effect on this routine.

Bug 2 - 
The short circuit if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) is 
wrong because that flag is meaningless for this operation. The check should be 
changed (I think) to nonew instead of keepnonzeropattern.

I will prepare a bug fix later today. For now you can just use the 
MAT_KEEP_NONZERO_PATTERN  option for you code to work.

 Barry







> On Feb 9, 2024, at 8:00 AM, Jeremy Theler (External) 
> <jeremy.theler-...@ansys.com> wrote:
> 
> > > Because of a combination of settings, our code passes through this line:
> > >
> > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/fieldsplit/fieldsplit.c?ref_type=heads#L692
> > > 
> > > i.e. the matrices associated with each of the sub-KSPs of a fieldsplit 
> > > are destroyed and then re-created later.
> > > The thing is that one of these destroyed matrices had a near nullspace 
> > > attached, which is lost because the new matrix does > not have it anymore.
> > > 
> > > Is this a bug or are we missing something?
> > 
> > I just want to get a clear picture. You create a PCFIELDSPLIT, set it up, 
> > then pull out the matrices and attach a nullspace before the solve.
> 
> We need to solve an SNES. We use dmplex so we have the jacobian allocated 
> before starting the solve.
> At setup time we 
> 
>  1. define the PC of the KSP of the SNES to be fieldsplit
>  2. define the fields with ISes
>  3. call PCSetup() to create the sub-KSPs
>  4. retrieve the matrix attached to the sub-KSP that needs the near nullspace 
> and attach it to that matrix
> 
> > At a later time, you start another solve with this PC, and it has the 
> > DIFFERENT_NONZERO_PATTERN flag, so it recreates these matrices and loses 
> > your attached nullspace.
> 
> At a later time, in the jacobian evaluation we populate the global matrix 
> (i.e. not the matrices attached to each sub-KSPs) and then we set dirichlet 
> bcs with MatZeroRowsColumns() on that same global matrix.
> For some reason, in serial the near nullspace is not lost but in parallel the 
> call to MatZeroRowsColumns() does change the non-zero structure (even though 
> the manual says it does not) and then the code goes through that line 692 in 
> fieldsplit.c and the near nullspace is lost.
> 
> > First, does the matrix really change?
> 
> Well, the matrix during setup is not filled in, just allocated.
> The thing is that if we set MAT_KEEP_NONZERO_PATTERN to true with 
> MatSetOption() before setting the dirichlet BCs, then the near nullspace is 
> not lost (because the code does not go through line 692 of fieldsplit.c).
> 
> So there are (at least) two issues:
> 
>  1. Code going through line 692 looses the near nullspace of the matrices 
> attached to the sub-KSPs
>  2. The call to MatZeroRowsColumns() changes then non-zero structure for 
> MPIAIJ but not for SEQAIJ
> 
> 
> > Second, I had the same problem. I added callbacks to DM which allow a 
> > nullspace to be automatically attached if you extract a certain subfield. 
> > Are you using a DM?
> 
> 
> Yes. Can you give us an example?
> 
> Regards
> --
> jeremy

Reply via email to