The bug fix for 2 is availabel in 
https://gitlab.com/petsc/petsc/-/merge_requests/7279



> On Feb 9, 2024, at 10:50 AM, Barry Smith <[email protected]> wrote:
> 
> 
> 1. Code going through line 692 looses the near nullspace of the matrices 
> attached to the sub-KSPs
>  2. The call to MatZeroRowsColumns() changes then non-zero structure for 
> MPIAIJ but not for SEQAIJ
>       (unless MAT_KEEP_NONZERO_PATTERN is used)
> 
> MatZeroRowsColumns() manual page states:
> 
> Unlike `MatZeroRows()` this does not change the nonzero structure of the 
> matrix, it merely zeros those entries in the matrix.
> 
>  MatZeroRowsColumns_MPIAIJ() has the code
> 
>   /* only change matrix nonzero state if pattern was allowed to be changed */
>   if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
>     PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
>     PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, 
> MPI_SUM, PetscObjectComm((PetscObject)A)));
>   }
> 
> The if() test is simply an optimization to avoid the reduction if 
> keepnonzeropattern is true. In your first run (when MAT_KEEP_NONZERO_PATTERN 
> is not used) the  keepnonzeropattern is not set so the A matrix nonzerostate 
> is updated using the nonzerostate of the two submatrices. But that A nonzero 
> state will only change if one of the two submatrices nonzerostate changes. In 
> the second run the A nonzerostate cannot be changed.
> 
> MatZeroRowsColumns_SeqAIJ() has the code
> 
>   if (diag != 0.0) {
>     PetscCall(MatMissingDiagonal_SeqAIJ(A, &missing, &d));
>     if (missing) {
>       for (i = 0; i < N; i++) {
>         if (rows[i] >= A->cmap->N) continue;
>         PetscCheck(!a->nonew || rows[i] < d, PETSC_COMM_SELF, 
> PETSC_ERR_ARG_WRONGSTATE, "Matrix is missing diagonal entry in row %" 
> PetscInt_FMT " (%" PetscInt_FMT ")", d, rows[i]);
>         PetscCall(MatSetValues_SeqAIJ(A, 1, &rows[i], 1, &rows[i], &diag, 
> INSERT_VALUES));
> 
> which adds a nonzero to the matrix to fill in a missing diagonal. This will 
> change the nonzerostate of the matrix, but in a different way than 
> keepnonzeropattern flag is for.
> 
> So, do you have missing diagonals in your matrix?
> 
> Bug 1 - documentation
> The documentation statement "Unlike `MatZeroRows()` this does not change the 
> nonzero structure of the matrix, it merely zeros those entries in the 
> matrix." appears to be incorrect for matrices missing diagonal entries since 
> new nonzeros are added (to fill in the diagonal).  The documentation should 
> really say 
> 
> "Unlike `MatZeroRows()` this routine  cannot remove the zeroed entries from 
> the nonzero structure of the matrix; in other words setting the option 
> `MAT_KEEP_NONZERO_PATTERN to PETSC_FALSE has no effect on this routine.
> 
> Bug 2 - 
> The short circuit if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) is 
> wrong because that flag is meaningless for this operation. The check should 
> be changed (I think) to nonew instead of keepnonzeropattern.
> 
> I will prepare a bug fix later today. For now you can just use the 
> MAT_KEEP_NONZERO_PATTERN  option for you code to work.
> 
>  Barry
> 
> 
> 
> 
> 
> 
> 
>> On Feb 9, 2024, at 8:00 AM, Jeremy Theler (External) 
>> <[email protected]> wrote:
>> 
>> > > Because of a combination of settings, our code passes through this line:
>> > >
>> > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/fieldsplit/fieldsplit.c?ref_type=heads#L692
>> > > 
>> > > i.e. the matrices associated with each of the sub-KSPs of a fieldsplit 
>> > > are destroyed and then re-created later.
>> > > The thing is that one of these destroyed matrices had a near nullspace 
>> > > attached, which is lost because the new matrix does > not have it 
>> > > anymore.
>> > > 
>> > > Is this a bug or are we missing something?
>> > 
>> > I just want to get a clear picture. You create a PCFIELDSPLIT, set it up, 
>> > then pull out the matrices and attach a nullspace before the solve.
>> 
>> We need to solve an SNES. We use dmplex so we have the jacobian allocated 
>> before starting the solve.
>> At setup time we 
>> 
>>  1. define the PC of the KSP of the SNES to be fieldsplit
>>  2. define the fields with ISes
>>  3. call PCSetup() to create the sub-KSPs
>>  4. retrieve the matrix attached to the sub-KSP that needs the near 
>> nullspace and attach it to that matrix
>> 
>> > At a later time, you start another solve with this PC, and it has the 
>> > DIFFERENT_NONZERO_PATTERN flag, so it recreates these matrices and loses 
>> > your attached nullspace.
>> 
>> At a later time, in the jacobian evaluation we populate the global matrix 
>> (i.e. not the matrices attached to each sub-KSPs) and then we set dirichlet 
>> bcs with MatZeroRowsColumns() on that same global matrix.
>> For some reason, in serial the near nullspace is not lost but in parallel 
>> the call to MatZeroRowsColumns() does change the non-zero structure (even 
>> though the manual says it does not) and then the code goes through that line 
>> 692 in fieldsplit.c and the near nullspace is lost.
>> 
>> > First, does the matrix really change?
>> 
>> Well, the matrix during setup is not filled in, just allocated.
>> The thing is that if we set MAT_KEEP_NONZERO_PATTERN to true with 
>> MatSetOption() before setting the dirichlet BCs, then the near nullspace is 
>> not lost (because the code does not go through line 692 of fieldsplit.c).
>> 
>> So there are (at least) two issues:
>> 
>>  1. Code going through line 692 looses the near nullspace of the matrices 
>> attached to the sub-KSPs
>>  2. The call to MatZeroRowsColumns() changes then non-zero structure for 
>> MPIAIJ but not for SEQAIJ
>> 
>> 
>> > Second, I had the same problem. I added callbacks to DM which allow a 
>> > nullspace to be automatically attached if you extract a certain subfield. 
>> > Are you using a DM?
>> 
>> 
>> Yes. Can you give us an example?
>> 
>> Regards
>> --
>> jeremy
> 

Reply via email to