The bug fix for 2 is availabel in https://gitlab.com/petsc/petsc/-/merge_requests/7279
> On Feb 9, 2024, at 10:50 AM, Barry Smith <[email protected]> wrote: > > > 1. Code going through line 692 looses the near nullspace of the matrices > attached to the sub-KSPs > 2. The call to MatZeroRowsColumns() changes then non-zero structure for > MPIAIJ but not for SEQAIJ > (unless MAT_KEEP_NONZERO_PATTERN is used) > > MatZeroRowsColumns() manual page states: > > Unlike `MatZeroRows()` this does not change the nonzero structure of the > matrix, it merely zeros those entries in the matrix. > > MatZeroRowsColumns_MPIAIJ() has the code > > /* only change matrix nonzero state if pattern was allowed to be changed */ > if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) { > PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate; > PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, > MPI_SUM, PetscObjectComm((PetscObject)A))); > } > > The if() test is simply an optimization to avoid the reduction if > keepnonzeropattern is true. In your first run (when MAT_KEEP_NONZERO_PATTERN > is not used) the keepnonzeropattern is not set so the A matrix nonzerostate > is updated using the nonzerostate of the two submatrices. But that A nonzero > state will only change if one of the two submatrices nonzerostate changes. In > the second run the A nonzerostate cannot be changed. > > MatZeroRowsColumns_SeqAIJ() has the code > > if (diag != 0.0) { > PetscCall(MatMissingDiagonal_SeqAIJ(A, &missing, &d)); > if (missing) { > for (i = 0; i < N; i++) { > if (rows[i] >= A->cmap->N) continue; > PetscCheck(!a->nonew || rows[i] < d, PETSC_COMM_SELF, > PETSC_ERR_ARG_WRONGSTATE, "Matrix is missing diagonal entry in row %" > PetscInt_FMT " (%" PetscInt_FMT ")", d, rows[i]); > PetscCall(MatSetValues_SeqAIJ(A, 1, &rows[i], 1, &rows[i], &diag, > INSERT_VALUES)); > > which adds a nonzero to the matrix to fill in a missing diagonal. This will > change the nonzerostate of the matrix, but in a different way than > keepnonzeropattern flag is for. > > So, do you have missing diagonals in your matrix? > > Bug 1 - documentation > The documentation statement "Unlike `MatZeroRows()` this does not change the > nonzero structure of the matrix, it merely zeros those entries in the > matrix." appears to be incorrect for matrices missing diagonal entries since > new nonzeros are added (to fill in the diagonal). The documentation should > really say > > "Unlike `MatZeroRows()` this routine cannot remove the zeroed entries from > the nonzero structure of the matrix; in other words setting the option > `MAT_KEEP_NONZERO_PATTERN to PETSC_FALSE has no effect on this routine. > > Bug 2 - > The short circuit if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) is > wrong because that flag is meaningless for this operation. The check should > be changed (I think) to nonew instead of keepnonzeropattern. > > I will prepare a bug fix later today. For now you can just use the > MAT_KEEP_NONZERO_PATTERN option for you code to work. > > Barry > > > > > > > >> On Feb 9, 2024, at 8:00 AM, Jeremy Theler (External) >> <[email protected]> wrote: >> >> > > Because of a combination of settings, our code passes through this line: >> > > >> > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/fieldsplit/fieldsplit.c?ref_type=heads#L692 >> > > >> > > i.e. the matrices associated with each of the sub-KSPs of a fieldsplit >> > > are destroyed and then re-created later. >> > > The thing is that one of these destroyed matrices had a near nullspace >> > > attached, which is lost because the new matrix does > not have it >> > > anymore. >> > > >> > > Is this a bug or are we missing something? >> > >> > I just want to get a clear picture. You create a PCFIELDSPLIT, set it up, >> > then pull out the matrices and attach a nullspace before the solve. >> >> We need to solve an SNES. We use dmplex so we have the jacobian allocated >> before starting the solve. >> At setup time we >> >> 1. define the PC of the KSP of the SNES to be fieldsplit >> 2. define the fields with ISes >> 3. call PCSetup() to create the sub-KSPs >> 4. retrieve the matrix attached to the sub-KSP that needs the near >> nullspace and attach it to that matrix >> >> > At a later time, you start another solve with this PC, and it has the >> > DIFFERENT_NONZERO_PATTERN flag, so it recreates these matrices and loses >> > your attached nullspace. >> >> At a later time, in the jacobian evaluation we populate the global matrix >> (i.e. not the matrices attached to each sub-KSPs) and then we set dirichlet >> bcs with MatZeroRowsColumns() on that same global matrix. >> For some reason, in serial the near nullspace is not lost but in parallel >> the call to MatZeroRowsColumns() does change the non-zero structure (even >> though the manual says it does not) and then the code goes through that line >> 692 in fieldsplit.c and the near nullspace is lost. >> >> > First, does the matrix really change? >> >> Well, the matrix during setup is not filled in, just allocated. >> The thing is that if we set MAT_KEEP_NONZERO_PATTERN to true with >> MatSetOption() before setting the dirichlet BCs, then the near nullspace is >> not lost (because the code does not go through line 692 of fieldsplit.c). >> >> So there are (at least) two issues: >> >> 1. Code going through line 692 looses the near nullspace of the matrices >> attached to the sub-KSPs >> 2. The call to MatZeroRowsColumns() changes then non-zero structure for >> MPIAIJ but not for SEQAIJ >> >> >> > Second, I had the same problem. I added callbacks to DM which allow a >> > nullspace to be automatically attached if you extract a certain subfield. >> > Are you using a DM? >> >> >> Yes. Can you give us an example? >> >> Regards >> -- >> jeremy >
