Thank you for the code.  

A) By default, MatZeroRows() does change the nonzero structure of the matrix.

B) PCFIELDSPLIT loses the null spaces attached to the submatrices if the 
nonzero structure of the matrix changes.

For the example code, if one sets MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, 
PETSC_TRUE); then MatZeroRows() no longer changes the nonzero structure, 
and the code behaves the same for both MatZeroRows() and 
MatZeroRowsColumns().
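
For reference, a minimal sketch of that workaround (the names A, x, b, nbc, 
and bcrows are placeholders for your operator, solution and right-hand-side 
vectors, and the count and global indices of the Dirichlet rows):

    Mat             A;       /* your assembled operator (placeholder) */
    Vec             x, b;    /* solution and right-hand side (placeholders) */
    PetscInt        nbc;     /* number of Dirichlet rows (placeholder) */
    const PetscInt *bcrows;  /* their global indices (placeholder) */

    /* keep zeroed entries allocated as explicit zeros ... */
    PetscCall(MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE));
    /* ... so this call no longer changes the nonzero structure */
    PetscCall(MatZeroRows(A, nbc, bcrows, 1.0, x, b));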

Should B lose the null spaces when the nonzero structure changes? The 
documentation for MatSetNullSpace() does not explicitly state that the attached 
null space remains with the matrix for its entire life unless the user calls 
MatSetNullSpace() again, but it implies as much. Thus, I think B should not 
lose the attached null spaces. I will change the code to preserve the null 
spaces in PCFIELDSPLIT when the nonzero structure changes.
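
Concretely, I mean the usual attachment pattern should keep working across 
MatZeroRows(). A sketch, using the near-null-space variant since that is what 
your MWE attaches (Asub and coords are hypothetical: the elasticity submatrix 
and a Vec of nodal coordinates):

    MatNullSpace nns;
    /* build rigid-body modes from the nodal coordinates */
    PetscCall(MatNullSpaceCreateRigidBody(coords, &nns));
    /* attach to the (sub)matrix; this should persist for the life of the
       matrix, even if MatZeroRows() later changes its nonzero structure */
    PetscCall(MatSetNearNullSpace(Asub, nns));
    PetscCall(MatNullSpaceDestroy(&nns));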

Going back to your two points:

1. Code going through line 692 of fieldsplit.c loses the near null space of 
the matrices attached to the sub-KSPs.
2. The call to MatZeroRowsColumns() changes the nonzero structure for MPIAIJ 
but not for SEQAIJ.

My new branch will prevent the loss of the null space in 1.

My previous fix (now in main) addresses the bug where MatZeroRowsColumns() in 
parallel reported that it had changed the nonzero structure when it actually 
had not.

Note: independent of 1 and 2, when you use MatZeroRows() in your code as you 
describe, you will most likely want to call 
MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE); the code will likely be 
a bit more efficient and will have less memory churn.
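
For example, in a transient run where the matrix is reassembled and the 
boundary rows re-zeroed every step, something like the following sketch 
(nsteps, nbc, and bcrows are placeholders; the assembly is elided) avoids 
repeatedly deallocating and reallocating those entries:

    PetscCall(MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE));
    for (PetscInt step = 0; step < nsteps; step++) {
      /* ... MatSetValues() for this step's entries ... */
      PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
      PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
      /* zeroed locations stay in the pattern, so the next step's
         MatSetValues() into those rows does not trigger new allocation */
      PetscCall(MatZeroRows(A, nbc, bcrows, 1.0, NULL, NULL));
    }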

> On Feb 13, 2024, at 7:12 AM, Jeremy Theler (External) 
> <jeremy.theler-...@ansys.com> wrote:
> 
> Hi Barry
> 
> >   7279 does change the code for MatZeroRowsColumns_MPIAIJ(). But perhaps 
> > that does not resolve the problem you are seeing? If that is the case we 
> > will need a reproducible example so we can determine exactly what else is 
> > happening in your code to cause the difficulties. 
> >
> > Here is the diff for MatZeroRowsColumns_MPIAIJ()
> >
> > @@ -1026,7 +1023,7 @@ static PetscErrorCode MatZeroRowsColumns_MPIAIJ(Mat A, PetscInt N, const PetscIn
> >
> >   PetscCall(PetscFree(lrows));
> >
> >   /* only change matrix nonzero state if pattern was allowed to be changed */
> > -  if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
> > +  if (!((Mat_SeqAIJ *)(l->A->data))->nonew) {
> >      PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
> >      PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, MPI_SUM, PetscObjectComm((PetscObject)A)));
> >    }
> 
> Fair enough, I might have overlooked this single line. But this change does 
> not fix our issue: the if condition is still true.
> I'll need some time to come up with a reproducible example because it 
> involves setting up the PC of the KSP of an SNES of a TS before doing 
> TSSolve() with the DM-allocated Jacobian.
> 
> However, regarding bug #1, I do have an MWE. It is not CI-friendly. I tried 
> to modify ex47.c to illustrate the issue but could not make it work.
> Anyway, the attached tarball:
> - reads a global matrix A from a file
> - reads two ISs from another file
> - sets up a fieldsplit PC
> - attaches a near null space (read from another file) to one of the sub-KSPs
> - sets "Dirichlet BCs" with MatZeroRowsColumns() or MatZeroRows(), depending 
>   on whether -columns was given on the command line
> 
> The issue comes when calling MatZeroRowsColumns() or MatZeroRows().
> In the first case the near null space is not lost, but in the second it is:
> 
> $ make lost
> $ ./lost -columns -ksp_view | grep "near null"
>             has attached near null space
>         has attached near null space
> $ mpiexec -n 2 ./lost -columns -ksp_view | grep "near null"
>             has attached near null space
>         has attached near null space
> $ ./lost -ksp_view | grep "near null"
> $ mpiexec -n 2 ./lost -ksp_view | grep "near null"
> $ 
> 
> When using MatZeroRows(), the code passes through fieldsplit.c:692, which 
> loses the near null space.
> 
> 
> Note that the original issue we see in our code is that there is a difference 
> between the serial and parallel implementations of MatZeroRowsColumns().
> We lose the near null space only in parallel, not in serial, because of a 
> combination of bugs #1 and #2.
> 
> 
> --
> jeremy
> 
> <lost.tar.gz>
