Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-23 Thread Barry Smith


> On May 23, 2024, at 3:48 PM, Stefano Zampini  
> wrote:
> 
> The null space of the Schur complement is the restriction of the original 
> null space. I guess if fieldsplit is Schur type then we could in principle 
> extract the sub vectors and renormalize them

   Is this true if A is singular?   Or are you assuming the Schur complement 
form is only used if A is nonsingular? Would the user need to somehow indicate 
A is nonsingular?
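
For reference, a quick derivation of Stefano's claim under the assumption that the 00 block A is nonsingular (exactly the caveat raised above): if (x, y) is a null vector of the full saddle-point matrix, its restriction y is a null vector of the Schur complement S = D - C A^{-1} B.

```latex
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} = 0
\quad\Longrightarrow\quad
x = -A^{-1} B y
\quad\Longrightarrow\quad
S y = (D - C A^{-1} B)\,y = D y + C x = 0 .
```

When A itself is singular, x need not be recoverable from y, so the argument breaks down — which is the point of the question above.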


> 
> 
> On Thu, May 23, 2024, 22:13 Jed Brown <j...@jedbrown.org> wrote:
>>  Barry Smith <bsm...@petsc.dev> writes:
>> 
>> >Unfortunately it cannot happen automatically because 
>> > -pc_fieldsplit_detect_saddle_point just grabs part of the matrix (having 
>> > no concept of "what part") and so doesn't know to grab the null space 
>> > information. 
>> >
>> >It would be possible for PCFIELDSPLIT to access the null space of the 
>> > larger matrix directly as vectors and check if they are all zero in the 00 
>> > block; then it would know that the null space applies only to the second 
>> > block and could use it for the Schur complement.
>> >
>> >Matt, Jed, Stefano, Pierre, does this make sense?
>> 
>> I think that would work (also need to check that the has_cnst flag is 
>> false), though if you've gone to the effort of filling in that Vec, you 
>> might as well provide the IS.
>> 
>> I also wonder if the RHS is consistent.
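
One way to check Jed's point in practice, assuming the (constant) null space was already attached to the full operator with MatSetNullSpace(): project the null-space component out of the right-hand side before solving. A minimal sketch — the helper name is illustrative, and it relies on the operator being symmetric so the left and right null spaces coincide:

```c
#include <petscksp.h>

/* Sketch: make the right-hand side b consistent with a singular symmetric
   operator A by projecting out its null-space component. Assumes the null
   space was attached earlier with MatSetNullSpace(A, nsp). */
static PetscErrorCode MakeRHSConsistent(Mat A, Vec b)
{
  MatNullSpace nsp;

  PetscFunctionBeginUser;
  PetscCall(MatGetNullSpace(A, &nsp));
  if (nsp) PetscCall(MatNullSpaceRemove(nsp, b));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```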



Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-23 Thread Barry Smith

   Unfortunately it cannot happen automatically because 
-pc_fieldsplit_detect_saddle_point just grabs part of the matrix (having no 
concept of "what part") and so doesn't know to grab the null space information. 

   It would be possible for PCFIELDSPLIT to access the null space of the larger 
matrix directly as vectors and check if they are all zero in the 00 block; then 
it would know that the null space applies only to the second block and could 
use it for the Schur complement.

   Matt, Jed, Stefano, Pierre, does this make sense?
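
A rough sketch of the check described above (the function name and tolerance are illustrative, not existing PETSc internals):

```c
#include <petscksp.h>

/* Sketch: given the null space attached to the full matrix A and the IS
   for the 00 block, test whether every null space vector vanishes there,
   i.e. whether the null space lives entirely in the second block. */
static PetscErrorCode NullSpaceOnlyInSecondBlock(Mat A, IS is0, PetscBool *flg)
{
  MatNullSpace nsp;
  PetscBool    has_cnst;
  PetscInt     i, n;
  const Vec   *vecs;

  PetscFunctionBeginUser;
  *flg = PETSC_FALSE;
  PetscCall(MatGetNullSpace(A, &nsp));
  if (!nsp) PetscFunctionReturn(PETSC_SUCCESS);
  PetscCall(MatNullSpaceGetVecs(nsp, &has_cnst, &n, &vecs));
  if (has_cnst) PetscFunctionReturn(PETSC_SUCCESS); /* constant is nonzero in the 00 block too */
  for (i = 0; i < n; i++) {
    Vec       sub;
    PetscReal nrm;

    PetscCall(VecGetSubVector(vecs[i], is0, &sub));
    PetscCall(VecNorm(sub, NORM_2, &nrm));
    PetscCall(VecRestoreSubVector(vecs[i], is0, &sub));
    if (nrm > PETSC_SMALL) PetscFunctionReturn(PETSC_SUCCESS);
  }
  *flg = PETSC_TRUE;
  PetscFunctionReturn(PETSC_SUCCESS);
}
```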

   Colton,
   
Meanwhile, the quickest thing you can do is to generate the ISs that define 
the first and second blocks yourself (instead of using 
-pc_fieldsplit_detect_saddle_point) and use PetscObjectCompose() to attach the 
constant null space to the IS of the second block under the name "nullspace". 
PCFIELDSPLIT will then use this null space for the Schur complement solve.
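
A sketch of this workaround (is0/is1 are the application-built index sets for the two blocks; the variable names are illustrative, and per the note below this hook is currently undocumented, so treat the details as subject to change):

```c
/* Sketch: define the splits explicitly and attach a constant null space to
   the IS of the second (pressure) block under the name "nullspace", which
   PCFIELDSPLIT then transfers to the Schur complement solve. */
MatNullSpace nsp;

PetscCall(PCSetType(pc, PCFIELDSPLIT));
PetscCall(PCFieldSplitSetIS(pc, "0", is0));
PetscCall(PCFieldSplitSetIS(pc, "1", is1));
PetscCall(MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nsp));
PetscCall(PetscObjectCompose((PetscObject)is1, "nullspace", (PetscObject)nsp));
PetscCall(MatNullSpaceDestroy(&nsp)); /* the IS keeps its own reference */
```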

  Barry


> On May 23, 2024, at 2:43 PM, Colton Bryant 
>  wrote:
> 
> Yes, the original operator definitely has a constant null space corresponding 
> to the constant pressure mode. I am currently handling this by using the 
> MatSetNullSpace function when the matrix is being created. Does this 
> information get passed to the submatrices of the fieldsplit?
> 
> -Colton
> 
> On Thu, May 23, 2024 at 12:36 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>Ok,
>> 
>> So what is happening is that GMRES with a restart of 30 is running on 
>> the Schur complement system with no preconditioning and LU (as a direct 
>> solver) is being used in the application of S (the Schur complement).  The 
>> convergence of GMRES is stagnating after getting about 8 digits of accuracy 
>> in the residual. Then at the second GMRES restart it compares the 
>> explicitly computed residual b - Ax with the one computed inside the GMRES 
>> algorithm (via a recurrence formula) and, finding a large difference, 
>> generates an error.  Since you are using a direct solver on the A_{00} 
>> block and it is well-conditioned, this problem is not expected.
>> 
>>Is it possible that the S operator has a null space (perhaps of the 
>> constant vector)? Or, relatedly, does your original full matrix have a null 
>> space?
>> 
>>We have a way to associate null spaces with the submatrices in 
>> PCFIELDSPLIT by attaching them to the ISs that define the fields, but 
>> unfortunately not trivially when using -pc_fieldsplit_detect_saddle_point. 
>> And sadly the current support seems completely undocumented. 
>> 
>>   Barry
>> 
>> 
>> 
>>> On May 23, 2024, at 2:16 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>> 
>>> Hi Barry,
>>> 
>>> I saw that it was reported as an unused option, and the error message I 
>>> sent was from a run with -fieldsplit_0_ksp_type preonly.
>>> 
>>> -Colton
>>> 
>>> On Thu, May 23, 2024 at 12:13 PM Barry Smith <bsm...@petsc.dev> wrote:
>>>> 
>>>> 
>>>>Sorry I gave the wrong option. Use  -fieldsplit_0_ksp_type preonly
>>>> 
>>>> Barry
>>>> 
>>>>> On May 23, 2024, at 12:51 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>>>> 
>>>>> That produces the error: 
>>>>> 
>>>>> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 
>>>>> 2.68054e-07 is far from the computed residual norm 6.86309e-06 at 
>>>>> restart, residual norm at start of cycle 2.68804e-07
>>>>> 
>>>>> The rest of the error is identical.
>>>>> 
>>>>> On Thu, May 23, 2024 at 10:46 AM Barry Smith <bsm...@petsc.dev> wrote:
>>>>>> 
>>>>>>   Use -pc_fieldsplit_0_ksp_type preonly
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On May 23, 2024, at 12:43 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>>>>>> 
>>>>>>> That produces the following error:
>>>>>>> 
>>>>>>> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 
>>>>>>> 2.79175e-07 is far from the computed residual norm 0.000113154 at 
>>>>>>> restart, residual norm at start of cycle 2.83065e-07
>>>>>>> [0]PETSC ERROR: See 
>>>

Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-23 Thread Barry Smith

   Ok,

So what is happening is that GMRES with a restart of 30 is running on the 
Schur complement system with no preconditioning and LU (as a direct solver) is 
being used in the application of S (the Schur complement).  The convergence of 
GMRES is stagnating after getting about 8 digits of accuracy in the residual. 
Then at the second GMRES restart it compares the explicitly computed residual 
b - Ax with the one computed inside the GMRES algorithm (via a recurrence 
formula) and, finding a large difference, generates an error.  Since you are 
using a direct solver on the A_{00} block and it is well-conditioned, this 
problem is not expected.

   Is it possible that the S operator has a null space (perhaps of the constant 
vector)? Or, relatedly, does your original full matrix have a null space?

   We have a way to associate null spaces with the submatrices in PCFIELDSPLIT 
by attaching them to the ISs that define the fields, but unfortunately not 
trivially when using -pc_fieldsplit_detect_saddle_point. And sadly the current 
support seems completely undocumented. 
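
One way to answer the null space question empirically: build a candidate constant null space and test it against the operator (applied to the full matrix, or to an explicitly assembled Schur complement if one is available). A sketch:

```c
#include <petscksp.h>

/* Sketch: test whether the constant vector is (numerically) a null vector
   of A, i.e. whether ||A * 1|| is tiny. */
static PetscErrorCode HasConstantNullSpace(Mat A, PetscBool *isNull)
{
  MatNullSpace nsp;

  PetscFunctionBeginUser;
  PetscCall(MatNullSpaceCreate(PetscObjectComm((PetscObject)A), PETSC_TRUE, 0, NULL, &nsp));
  PetscCall(MatNullSpaceTest(nsp, A, isNull));
  PetscCall(MatNullSpaceDestroy(&nsp));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```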

  Barry



> On May 23, 2024, at 2:16 PM, Colton Bryant 
>  wrote:
> 
> Hi Barry,
> 
> I saw that it was reported as an unused option, and the error message I sent 
> was from a run with -fieldsplit_0_ksp_type preonly.
> 
> -Colton
> 
> On Thu, May 23, 2024 at 12:13 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>> 
>>Sorry I gave the wrong option. Use  -fieldsplit_0_ksp_type preonly
>> 
>> Barry
>> 
>>> On May 23, 2024, at 12:51 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>> 
>>> That produces the error: 
>>> 
>>> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 
>>> 2.68054e-07 is far from the computed residual norm 6.86309e-06 at restart, 
>>> residual norm at start of cycle 2.68804e-07
>>> 
>>> The rest of the error is identical.
>>> 
>>> On Thu, May 23, 2024 at 10:46 AM Barry Smith <bsm...@petsc.dev> wrote:
>>>> 
>>>>   Use -pc_fieldsplit_0_ksp_type preonly
>>>> 
>>>> 
>>>> 
>>>>> On May 23, 2024, at 12:43 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>>>> 
>>>>> That produces the following error:
>>>>> 
>>>>> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 
>>>>> 2.79175e-07 is far from the computed residual norm 0.000113154 at 
>>>>> restart, residual norm at start of cycle 2.83065e-07
>>>>> [0]PETSC ERROR: See 
>>>>> https://petsc.org/release/faq/ for trouble shooting.
>>>>> [0]PETSC ERROR: Petsc Release Version 3.21.0, unknown 
>>>>> [0]PETSC ERROR: ./mainOversetLS_exe on a arch-linux-c-opt named glass by 
>>>>> colton Thu May 23 10:41:09 2024
>>>>> [0]PETSC ERROR: Configure options --download-mpich --with-cc=gcc 
>>>>> --with-cxx=g++ --with-debugging=no --with-fc=gfortran COPTFLAGS=-O3 
>>>>> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux-c-opt 
>>>>> --download-sowing
>>>>> [0]PETSC ERROR: #1 KSPGMRESCycle() at 
>>>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:115
>>>>> [0]PETSC ERROR: #2 KSPSolve_GMRES() at 
>>>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:227
>>>>> [0]PETSC ERROR: #3 KSPSolve_Private() at 
>>>>> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
>>>>> [0]PETSC ERROR: #4 KSPSolve() at 
>>>>> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
>>>>> [0]PETSC ERROR: #5 PCApply_FieldSplit_Schur() at 
>>>>> /home/colton/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1203
>>>>> [0]PETSC ERROR: #6 PCApply() at 
>>>>> /home/colton/petsc/src/ksp/pc/interface/precon.c:497
>>>>> [0]PETSC ERROR: #7 KSP_PCApply() at 
>>>>> /home/colton/petsc/include/petsc/private/kspimpl.h:409
>>>>> [0]PETSC ERROR: #8 KSPFGMRESCycle() at 
>>>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:123
>>>>> [0]PETSC ERROR: #9 KSPSolve_FGMRES() at 
>>>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:235
>>>>> [0]PETSC ERROR: #10 KSPSolve_Private() at 
>>>>> /home/colton/petsc/s

Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-23 Thread Barry Smith


   Sorry I gave the wrong option. Use  -fieldsplit_0_ksp_type preonly

Barry

> On May 23, 2024, at 12:51 PM, Colton Bryant 
>  wrote:
> 
> That produces the error: 
> 
> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 2.68054e-07 
> is far from the computed residual norm 6.86309e-06 at restart, residual norm 
> at start of cycle 2.68804e-07
> 
> The rest of the error is identical.
> 
> On Thu, May 23, 2024 at 10:46 AM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>   Use -pc_fieldsplit_0_ksp_type preonly
>> 
>> 
>> 
>>> On May 23, 2024, at 12:43 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>> 
>>> That produces the following error:
>>> 
>>> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 
>>> 2.79175e-07 is far from the computed residual norm 0.000113154 at restart, 
>>> residual norm at start of cycle 2.83065e-07
>>> [0]PETSC ERROR: See 
>>> https://petsc.org/release/faq/ for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.21.0, unknown 
>>> [0]PETSC ERROR: ./mainOversetLS_exe on a arch-linux-c-opt named glass by 
>>> colton Thu May 23 10:41:09 2024
>>> [0]PETSC ERROR: Configure options --download-mpich --with-cc=gcc 
>>> --with-cxx=g++ --with-debugging=no --with-fc=gfortran COPTFLAGS=-O3 
>>> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux-c-opt --download-sowing
>>> [0]PETSC ERROR: #1 KSPGMRESCycle() at 
>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:115
>>> [0]PETSC ERROR: #2 KSPSolve_GMRES() at 
>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:227
>>> [0]PETSC ERROR: #3 KSPSolve_Private() at 
>>> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
>>> [0]PETSC ERROR: #4 KSPSolve() at 
>>> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
>>> [0]PETSC ERROR: #5 PCApply_FieldSplit_Schur() at 
>>> /home/colton/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1203
>>> [0]PETSC ERROR: #6 PCApply() at 
>>> /home/colton/petsc/src/ksp/pc/interface/precon.c:497
>>> [0]PETSC ERROR: #7 KSP_PCApply() at 
>>> /home/colton/petsc/include/petsc/private/kspimpl.h:409
>>> [0]PETSC ERROR: #8 KSPFGMRESCycle() at 
>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:123
>>> [0]PETSC ERROR: #9 KSPSolve_FGMRES() at 
>>> /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:235
>>> [0]PETSC ERROR: #10 KSPSolve_Private() at 
>>> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
>>> [0]PETSC ERROR: #11 KSPSolve() at 
>>> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
>>> [0]PETSC ERROR: #12 solveStokes() at cartesianStokesGrid.cpp:1403
>>> 
>>> 
>>> 
>>> On Thu, May 23, 2024 at 10:33 AM Barry Smith <bsm...@petsc.dev> wrote:
>>>> 
>>>>   Run the failing case with also -ksp_error_if_not_converged so we see 
>>>> exactly where the problem is first detected.
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On May 23, 2024, at 11:51 AM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>>>> 
>>>>> Hi Barry,
>>>>> 
>>>>> Thanks for letting me know about the need to use fgmres in this case. I 
>>>>> ran a smaller problem (1230 in the first block) and saw similar behavior 
>>>>> in the true residual.
>>>>> 
>>>>> I also ran the same problem with the options -fieldsplit_0_pc_type svd 
>>>>> -fieldsplit_0_pc_svd_monitor and get the following output:
>>>>>   SVD: condition number 1.933639985881e+03, 0 of 1230 singular values 
>>>>> are (nearly) zero
>>>>>   SVD: smallest singular values: 4.132036392141e-03 
>>>>> 4.166444542385e-03 4.669534028645e-03 4.845532162256e-03 
>>>>> 5.047038625390e-03
>>>>>   SVD: largest singular values : 7.947990616611e+00 
>>>>> 7.961437414477e+00 7.961851612473e+00 7.971335373142e+00 
>>>>> 7.989870790960e+00
>>>>> 
>>>>> I would be surprised if the A_{00} block is ill conditioned as it's just 
>>>>> a standard discretization of

Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-23 Thread Barry Smith

  Use -pc_fieldsplit_0_ksp_type preonly



> On May 23, 2024, at 12:43 PM, Colton Bryant 
>  wrote:
> 
> That produces the following error:
> 
> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 2.79175e-07 
> is far from the computed residual norm 0.000113154 at restart, residual norm 
> at start of cycle 2.83065e-07
> [0]PETSC ERROR: See 
> https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.21.0, unknown 
> [0]PETSC ERROR: ./mainOversetLS_exe on a arch-linux-c-opt named glass by 
> colton Thu May 23 10:41:09 2024
> [0]PETSC ERROR: Configure options --download-mpich --with-cc=gcc 
> --with-cxx=g++ --with-debugging=no --with-fc=gfortran COPTFLAGS=-O3 
> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux-c-opt --download-sowing
> [0]PETSC ERROR: #1 KSPGMRESCycle() at 
> /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:115
> [0]PETSC ERROR: #2 KSPSolve_GMRES() at 
> /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:227
> [0]PETSC ERROR: #3 KSPSolve_Private() at 
> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
> [0]PETSC ERROR: #4 KSPSolve() at 
> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
> [0]PETSC ERROR: #5 PCApply_FieldSplit_Schur() at 
> /home/colton/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1203
> [0]PETSC ERROR: #6 PCApply() at 
> /home/colton/petsc/src/ksp/pc/interface/precon.c:497
> [0]PETSC ERROR: #7 KSP_PCApply() at 
> /home/colton/petsc/include/petsc/private/kspimpl.h:409
> [0]PETSC ERROR: #8 KSPFGMRESCycle() at 
> /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:123
> [0]PETSC ERROR: #9 KSPSolve_FGMRES() at 
> /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:235
> [0]PETSC ERROR: #10 KSPSolve_Private() at 
> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
> [0]PETSC ERROR: #11 KSPSolve() at 
> /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
> [0]PETSC ERROR: #12 solveStokes() at cartesianStokesGrid.cpp:1403
> 
> 
> 
> On Thu, May 23, 2024 at 10:33 AM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>   Run the failing case with also -ksp_error_if_not_converged so we see 
>> exactly where the problem is first detected.
>> 
>> 
>> 
>> 
>>> On May 23, 2024, at 11:51 AM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>> 
>>> Hi Barry,
>>> 
>>> Thanks for letting me know about the need to use fgmres in this case. I ran 
>>> a smaller problem (1230 in the first block) and saw similar behavior in the 
>>> true residual.
>>> 
>>> I also ran the same problem with the options -fieldsplit_0_pc_type svd 
>>> -fieldsplit_0_pc_svd_monitor and get the following output:
>>>   SVD: condition number 1.933639985881e+03, 0 of 1230 singular values 
>>> are (nearly) zero
>>>   SVD: smallest singular values: 4.132036392141e-03 4.166444542385e-03 
>>> 4.669534028645e-03 4.845532162256e-03 5.047038625390e-03
>>>   SVD: largest singular values : 7.947990616611e+00 7.961437414477e+00 
>>> 7.961851612473e+00 7.971335373142e+00 7.989870790960e+00
>>> 
>>> I would be surprised if the A_{00} block is ill-conditioned as it's just a 
>>> standard discretization of the Laplacian with some rows replaced with ones 
>>> on the diagonal due to interpolations from the overset mesh. I'm wondering 
>>> if I'm somehow violating a solvability condition of the problem?
>>> 
>>> Thanks for the help!
>>> 
>>> -Colton
>>> 
>>> On Wed, May 22, 2024 at 6:09 PM Barry Smith <bsm...@petsc.dev> wrote:
>>>> 
>>>>   Thanks for the info. I see you are using GMRES inside the Schur 
>>>> complement solver; this is ok, but when you do, you need to use fgmres as 
>>>> the outer solver. But this is unlikely to be the cause of the exact 
>>>> problem you are seeing.
>>>> 
>>>>   I'm not sure why the Schur complement KSP is suddenly seeing a large 
>>>> increase in the true residual norm.  Is it possible the A_{00} block is 
>>>> ill-conditioned?
>>>> 
>>>>Can you run with a smaller problem? Say 2,000 or so in the first block? 
>>>> Is there still a problem?
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On May 22, 2024, at 6:00 PM, Colton Bryant 
>>>>> >>&g

Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-23 Thread Barry Smith

  Run the failing case with also -ksp_error_if_not_converged so we see exactly 
where the problem is first detected.




> On May 23, 2024, at 11:51 AM, Colton Bryant 
>  wrote:
> 
> Hi Barry,
> 
> Thanks for letting me know about the need to use fgmres in this case. I ran a 
> smaller problem (1230 in the first block) and saw similar behavior in the 
> true residual.
> 
> I also ran the same problem with the options -fieldsplit_0_pc_type svd 
> -fieldsplit_0_pc_svd_monitor and get the following output:
>   SVD: condition number 1.933639985881e+03, 0 of 1230 singular values are 
> (nearly) zero
>   SVD: smallest singular values: 4.132036392141e-03 4.166444542385e-03 
> 4.669534028645e-03 4.845532162256e-03 5.047038625390e-03
>   SVD: largest singular values : 7.947990616611e+00 7.961437414477e+00 
> 7.961851612473e+00 7.971335373142e+00 7.989870790960e+00
> 
> I would be surprised if the A_{00} block is ill-conditioned as it's just a 
> standard discretization of the Laplacian with some rows replaced with ones on 
> the diagonal due to interpolations from the overset mesh. I'm wondering if 
> I'm somehow violating a solvability condition of the problem?
> 
> Thanks for the help!
> 
> -Colton
> 
> On Wed, May 22, 2024 at 6:09 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>   Thanks for the info. I see you are using GMRES inside the Schur complement 
>> solver; this is ok, but when you do, you need to use fgmres as the outer 
>> solver. But this is unlikely to be the cause of the exact problem you are 
>> seeing.
>> 
>>   I'm not sure why the Schur complement KSP is suddenly seeing a large 
>> increase in the true residual norm.  Is it possible the A_{00} block is 
>> ill-conditioned?
>> 
>>Can you run with a smaller problem? Say 2,000 or so in the first block? 
>> Is there still a problem?
>> 
>> 
>> 
>> 
>> 
>>> On May 22, 2024, at 6:00 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>> 
>>> Hi Barry,
>>> 
>>> I have not used any other solver parameters in the code and the full set of 
>>> solver related command line options are those I mentioned in the previous 
>>> email.
>>> 
>>> Below is the output from -ksp_view:
>>> 
>>> KSP Object: (back_) 1 MPI process
>>>   type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>>> with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>>   maximum iterations=1, initial guess is zero
>>>   tolerances: relative=1e-08, absolute=1e-50, divergence=1.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: (back_) 1 MPI process
>>>   type: fieldsplit
>>> FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL
>>> Preconditioner for the Schur complement formed from S itself
>>> Split info:
>>> Split number 0 Defined by IS
>>> Split number 1 Defined by IS
>>> KSP solver for A00 block
>>>   KSP Object: (back_fieldsplit_0_) 1 MPI process
>>> type: gmres
>>>   restart=30, using Classical (unmodified) Gram-Schmidt 
>>> Orthogonalization with no iterative refinement
>>>   happy breakdown tolerance 1e-30
>>> maximum iterations=1, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=1.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>>   PC Object: (back_fieldsplit_0_) 1 MPI process
>>> type: lu
>>>   out-of-place factorization
>>>   tolerance for zero pivot 2.22045e-14
>>>   matrix ordering: nd
>>>   factor fill ratio given 5., needed 8.83482
>>> Factored matrix follows:
>>>   Mat Object: (back_fieldsplit_0_) 1 MPI process
>>> type: seqaij
>>> rows=30150, cols=30150
>>> package used to perform factorization: petsc
>>> total: nonzeros=2649120, allocated nonzeros=2649120
>>>   using I-node routines: found 15019 nodes, limit used is 5
>>> linear system matrix = precond matrix:
>>> Mat Object: (back_fieldsplit_0_) 1 MPI process
>>>   type: seqaij
>>>   rows=30150, cols=30150
>>>   

Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-22 Thread Barry Smith
49550, allocated nonzeros=149550
> total number of mallocs used during MatSetValues calls=0
>   using I-node routines: found 15150 nodes, limit used is 5
>   linear system matrix = precond matrix:
>   Mat Object: (back_) 1 MPI process
> type: seqaij
> rows=45150, cols=45150
> total: nonzeros=673650, allocated nonzeros=673650
> total number of mallocs used during MatSetValues calls=0
>   has attached null space
>   using I-node routines: found 15150 nodes, limit used is 5
> 
> Thanks again!
> 
> -Colton
> 
> On Wed, May 22, 2024 at 3:39 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>   Are you using any other command line options, or did you hardwire any 
>> solver parameters in the code with, e.g., KSPSetXXX() or PCSetXXX()? Please 
>> send all of them.
>> 
>>   Something funky definitely happened when the true residual norms jumped up.
>> 
>>   Could you run the same thing with -ksp_view, and don't use anything like 
>> -ksp_error_if_not_converged, so we can see exactly what is being run.
>> 
>>   Barry
>> 
>> 
>>> On May 22, 2024, at 3:21 PM, Colton Bryant <coltonbryant2...@u.northwestern.edu> wrote:
>>> 
>>> Hello,
>>> 
>>> I am solving the Stokes equations on a MAC grid discretized by finite 
>>> differences using a DMSTAG object. I have tested the solver quite 
>>> extensively on manufactured problems and it seems to work well. As I am 
>>> still just trying to get things working and not yet worried about speed I 
>>> am using the following solver options: 
>>> -pc_type fieldsplit
>>> -pc_fieldsplit_detect_saddle_point
>>> -fieldsplit_0_pc_type lu
>>> -fieldsplit_1_ksp_rtol 1.e-8
>>> 
>>> However I am now using this solver as an inner step of a larger code and 
>>> have run into issues. The code repeatedly solves the Stokes equations with 
>>> varying right hand sides coming from changing problem geometry (the solver 
>>> is a part of an overset grid scheme coupled to a level set method evolving 
>>> in time). After a couple timesteps I observe the following output when 
>>> running with -fieldsplit_1_ksp_converged_reason 
>>> -fieldsplit_1_ksp_monitor_true_residual: 
>>> 
>>> Residual norms for back_fieldsplit_1_ solve.
>>> 0 KSP preconditioned resid norm 2.826514299465e-02 true resid norm 
>>> 2.826514299465e-02 ||r(i)||/||b|| 1.e+00
>>> 1 KSP preconditioned resid norm 7.286621865915e-03 true resid norm 
>>> 7.286621865915e-03 ||r(i)||/||b|| 2.577953300039e-01
>>> 2 KSP preconditioned resid norm 1.500598474492e-03 true resid norm 
>>> 1.500598474492e-03 ||r(i)||/||b|| 5.309007192273e-02
>>> 3 KSP preconditioned resid norm 3.796396924978e-04 true resid norm 
>>> 3.796396924978e-04 ||r(i)||/||b|| 1.343137349666e-02
>>> 4 KSP preconditioned resid norm 8.091057439816e-05 true resid norm 
>>> 8.091057439816e-05 ||r(i)||/||b|| 2.862556697960e-03
>>> 5 KSP preconditioned resid norm 3.689113122359e-05 true resid norm 
>>> 3.689113122359e-05 ||r(i)||/||b|| 1.305181128239e-03
>>> 6 KSP preconditioned resid norm 2.116450533352e-05 true resid norm 
>>> 2.116450533352e-05 ||r(i)||/||b|| 7.487846545662e-04
>>> 7 KSP preconditioned resid norm 3.968234031201e-06 true resid norm 
>>> 3.968234031200e-06 ||r(i)||/||b|| 1.403932055801e-04
>>> 8 KSP preconditioned resid norm 6.666949419511e-07 true resid norm 
>>> 6.666949419506e-07 ||r(i)||/||b|| 2.358717739644e-05
>>> 9 KSP preconditioned resid norm 1.941522884928e-07 true resid norm 
>>> 1.941522884931e-07 ||r(i)||/||b|| 6.868965372998e-06
>>>10 KSP preconditioned resid norm 6.729545258682e-08 true resid norm 
>>> 6.729545258626e-08 ||r(i)||/||b|| 2.380863687793e-06
>>>11 KSP preconditioned resid norm 3.009070131709e-08 true resid norm 
>>> 3.009070131735e-08 ||r(i)||/||b|| 1.064586912687e-06
>>>12 KSP preconditioned resid norm 7.849353009588e-09 true resid norm 
>>> 7.849353009903e-09 ||r(i)||/||b|| 2.777043445840e-07
>>>13 KSP preconditioned resid norm 2.306283345754e-09 true resid norm 
>>> 2.306283346677e-09 ||r(i)||/||b|| 8.159461097060e-08
>>>14 KSP preconditioned resid norm 9.336302495083e-10 true resid norm 
>>> 9.336302502503e-10 ||r(i)||/||b|| 3.303115255517e-08
>&

Re: [petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem

2024-05-22 Thread Barry Smith

  Are you using any other command line options, or did you hardwire any solver 
parameters in the code with, e.g., KSPSetXXX() or PCSetXXX()? Please send all of 
them.

  Something funky definitely happened when the true residual norms jumped up.

  Could you run the same thing with -ksp_view, and don't use anything like 
-ksp_error_if_not_converged, so we can see exactly what is being run.

  Barry


> On May 22, 2024, at 3:21 PM, Colton Bryant 
>  wrote:
> 
> Hello,
> 
> I am solving the Stokes equations on a MAC grid discretized by finite 
> differences using a DMSTAG object. I have tested the solver quite extensively 
> on manufactured problems and it seems to work well. As I am still just trying 
> to get things working and not yet worried about speed I am using the 
> following solver options: 
> -pc_type fieldsplit
> -pc_fieldsplit_detect_saddle_point
> -fieldsplit_0_pc_type lu
> -fieldsplit_1_ksp_rtol 1.e-8
> 
> However I am now using this solver as an inner step of a larger code and have 
> run into issues. The code repeatedly solves the Stokes equations with varying 
> right hand sides coming from changing problem geometry (the solver is a part 
> of an overset grid scheme coupled to a level set method evolving in time). 
> After a couple timesteps I observe the following output when running with 
> -fieldsplit_1_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual: 
> 
> Residual norms for back_fieldsplit_1_ solve.
> 0 KSP preconditioned resid norm 2.826514299465e-02 true resid norm 
> 2.826514299465e-02 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 7.286621865915e-03 true resid norm 
> 7.286621865915e-03 ||r(i)||/||b|| 2.577953300039e-01
> 2 KSP preconditioned resid norm 1.500598474492e-03 true resid norm 
> 1.500598474492e-03 ||r(i)||/||b|| 5.309007192273e-02
> 3 KSP preconditioned resid norm 3.796396924978e-04 true resid norm 
> 3.796396924978e-04 ||r(i)||/||b|| 1.343137349666e-02
> 4 KSP preconditioned resid norm 8.091057439816e-05 true resid norm 
> 8.091057439816e-05 ||r(i)||/||b|| 2.862556697960e-03
> 5 KSP preconditioned resid norm 3.689113122359e-05 true resid norm 
> 3.689113122359e-05 ||r(i)||/||b|| 1.305181128239e-03
> 6 KSP preconditioned resid norm 2.116450533352e-05 true resid norm 
> 2.116450533352e-05 ||r(i)||/||b|| 7.487846545662e-04
> 7 KSP preconditioned resid norm 3.968234031201e-06 true resid norm 
> 3.968234031200e-06 ||r(i)||/||b|| 1.403932055801e-04
> 8 KSP preconditioned resid norm 6.666949419511e-07 true resid norm 
> 6.666949419506e-07 ||r(i)||/||b|| 2.358717739644e-05
> 9 KSP preconditioned resid norm 1.941522884928e-07 true resid norm 
> 1.941522884931e-07 ||r(i)||/||b|| 6.868965372998e-06
>10 KSP preconditioned resid norm 6.729545258682e-08 true resid norm 
> 6.729545258626e-08 ||r(i)||/||b|| 2.380863687793e-06
>11 KSP preconditioned resid norm 3.009070131709e-08 true resid norm 
> 3.009070131735e-08 ||r(i)||/||b|| 1.064586912687e-06
>12 KSP preconditioned resid norm 7.849353009588e-09 true resid norm 
> 7.849353009903e-09 ||r(i)||/||b|| 2.777043445840e-07
>13 KSP preconditioned resid norm 2.306283345754e-09 true resid norm 
> 2.306283346677e-09 ||r(i)||/||b|| 8.159461097060e-08
>14 KSP preconditioned resid norm 9.336302495083e-10 true resid norm 
> 9.336302502503e-10 ||r(i)||/||b|| 3.303115255517e-08
>15 KSP preconditioned resid norm 6.537456143401e-10 true resid norm 
> 6.537456141617e-10 ||r(i)||/||b|| 2.312903968982e-08
>16 KSP preconditioned resid norm 6.389159552788e-10 true resid norm 
> 6.389159550304e-10 ||r(i)||/||b|| 2.260437724130e-08
>17 KSP preconditioned resid norm 6.380905134246e-10 true resid norm 
> 6.380905136023e-10 ||r(i)||/||b|| 2.257517372981e-08
>18 KSP preconditioned resid norm 6.380440605992e-10 true resid norm 
> 6.380440604688e-10 ||r(i)||/||b|| 2.257353025207e-08
>19 KSP preconditioned resid norm 6.380427156582e-10 true resid norm 
> 6.380427157894e-10 ||r(i)||/||b|| 2.257348267830e-08
>20 KSP preconditioned resid norm 6.380426714897e-10 true resid norm 
> 6.380426714004e-10 ||r(i)||/||b|| 2.257348110785e-08
>21 KSP preconditioned resid norm 6.380426656970e-10 true resid norm 
> 6.380426658839e-10 ||r(i)||/||b|| 2.257348091268e-08
>22 KSP preconditioned resid norm 6.380426650538e-10 true resid norm 
> 6.380426650287e-10 ||r(i)||/||b|| 2.257348088242e-08
>23 KSP preconditioned resid norm 6.380426649918e-10 true resid norm 
> 6.380426645888e-10 ||r(i)||/||b|| 2.257348086686e-08
>24 KSP preconditioned resid norm 6.380426649803e-10 true resid norm 
> 6.380426644294e-10 ||r(i)||/||b|| 2.257348086122e-08
>25 KSP preconditioned resid norm 6.380426649796e-10 true resid norm 
> 6.380426649774e-10 ||r(i)||/||b|| 2.257348088061e-08
>26 KSP preconditioned resid norm 6.380426649795e-10 true resid norm 
> 6.380426653788e-10 

Re: [petsc-users] Modify matrix nonzero structure

2024-05-19 Thread Barry Smith

  Certainly missing Jacobian entries can dramatically change the Newton 
direction and hence the convergence. Even if the optimal (in time) setup skips 
some Jacobian entries, it is always good to have runs with all the entries to 
see the "best possible" convergence.
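The effect Barry describes can be seen on a toy problem. The sketch below (purely illustrative, not PETSc and not Adrian's actual model) uses a hypothetical two-cell system with a 0.9-strength extraction/reinjection coupling: Newton with the exact Jacobian converges quadratically, while dropping the off-diagonal coupling entries degrades it to slow linear convergence.

```python
def residual(x):
    # hypothetical two-cell balance equations with a 0.9-strength
    # extraction/reinjection coupling between the cells
    return [x[0]**2 + 0.9*x[1] - 2.0,
            0.9*x[0] + x[1]**2 - 2.0]

def jac_full(x):      # exact Jacobian, including the cross-cell entries
    return [[2.0*x[0], 0.9],
            [0.9, 2.0*x[1]]]

def jac_missing(x):   # "missing entries": the coupling terms are dropped
    return [[2.0*x[0], 0.0],
            [0.0, 2.0*x[1]]]

def solve2(a, b):     # 2x2 linear solve by Cramer's rule
    det = a[0][0]*a[1][1] - a[0][1]*a[1][0]
    return [(b[0]*a[1][1] - a[0][1]*b[1]) / det,
            (a[0][0]*b[1] - b[0]*a[1][0]) / det]

def newton(jac, x, tol=1e-10, maxit=100):
    # returns the iteration count needed to drive the residual below tol
    for it in range(maxit):
        r = residual(x)
        if max(abs(r[0]), abs(r[1])) < tol:
            return it
        d = solve2(jac(x), r)
        x = [x[0] - d[0], x[1] - d[1]]
    return maxit

its_full = newton(jac_full, [1.0, 1.0])        # quadratic convergence
its_missing = newton(jac_missing, [1.0, 1.0])  # many more iterations
```

Here the incomplete Jacobian still converges because the coupling is modest; with stronger coupling, the same iteration can stagnate or diverge outright, which matches the poor nonlinear convergence reported below.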

  Barry


> On May 19, 2024, at 10:44 PM, Adrian Croucher  
> wrote:
> 
> Great, it sounds like this might be easier than I expected. Thanks very much.
> 
> Did you have any thoughts on my diagnosis of the problem (the poor nonlinear 
> solver convergence being caused by missing Jacobian elements representing 
> interaction between the sources)?
> 
> - Adrian
> 
> On 20/05/24 12:41 pm, Matthew Knepley wrote:
>> On Sun, May 19, 2024 at 8:25 PM Barry Smith > <mailto:bsm...@petsc.dev>> wrote:
>>> 
>>>You can call MatSetOption(mat, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_FALSE) 
>>> and then insert the new values. If it is just a handful of new insertions, 
>>> the extra time should be small.
>>> 
>>> Making a copy of the matrix won't give you a new matrix that is any 
>>> faster to insert into, so it is best to just use the same matrix.
>> 
>> Let me add to Barry's answer. The preallocation infrastructure is now not 
>> strictly necessary. It is possible to just add all your nonzeros and then 
>> assemble, and the performance will be pretty good (it uses hashing, etc.). So 
>> if just adding a few nonzeros does not work, we can go this route.
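A rough sketch of the idea Matt describes, hash-accumulate arbitrary insertions and compress once at assembly time (illustrative Python only, not PETSc's actual implementation):

```python
def assemble_csr(entries, nrows):
    """Accumulate arbitrary (row, col, val) insertions in a hash map, then
    compress to CSR arrays once at 'assembly' time.  Duplicate (row, col)
    pairs are summed, analogous to inserting with ADD_VALUES."""
    acc = {}
    for i, j, v in entries:
        acc[(i, j)] = acc.get((i, j), 0.0) + v
    rowptr = [0] * (nrows + 1)
    cols, vals = [], []
    for (i, j), v in sorted(acc.items()):   # row-major order for CSR
        rowptr[i + 1] += 1
        cols.append(j)
        vals.append(v)
    for i in range(nrows):                  # prefix-sum the row counts
        rowptr[i + 1] += rowptr[i]
    return rowptr, cols, vals
```

The point of the pattern is that no nonzero structure has to be declared up front; the cost is one hash lookup per insertion plus a single compression pass.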
>> 
>>   Thanks,
>> 
>>  Matt
>>  
>>>   Barry
>>> 
>>> 
>>>> On May 19, 2024, at 7:44 PM, Adrian Croucher >>> <mailto:a.crouc...@auckland.ac.nz>> wrote:
>>>> 
>>>> hi,
>>>> 
>>>> I have a Jacobian matrix created using DMCreateMatrix(). What would be 
>>>> the best way to add extra nonzero entries into it?
>>>> 
>>>> I'm guessing that DMCreateMatrix() allocates the storage so the nonzero 
>>>> structure can't really be easily modified. Would it be a case of 
>>>> creating a new matrix, copying the nonzero entries from the original one 
>>>> and then adding the extra ones, before calling MatSetUp() or similar? If 
>>>> so, how exactly would you copy the nonzero structure from the original 
>>>> matrix?
>>>> 
>>>> Background: the flow problem I'm solving (on a DMPlex with finite volume 
>>>> method) has complex source terms that depend on the solution (e.g. 
>>>> pressure), and can also depend on other source terms. A simple example 
>>>> is when fluid is extracted from one location, with a pressure-dependent 
>>>> flow rate, and some of it is then reinjected in another location. This 
>>>> can result in poor nonlinear solver convergence. I think the reason is 
>>>> that there are effectively missing Jacobian entries in the row for the 
>>>> reinjection cell, which should have an additional dependence on the 
>>>> solution in the cell where fluid is extracted.
>>>> 
>>>> - Adrian
>> 
> -- 
> Dr Adrian Croucher
> Senior Research Fellow
> Department of Engineering Science
> Waipapa Taumata Rau / University of Auckland, New Zealand
> email: a.crouc...@auckland.ac.nz <mailto:a.crouc...@auckland.ac.nz>
> tel: +64 (0)9 923 4611



Re: [petsc-users] Modify matrix nonzero structure

2024-05-19 Thread Barry Smith

   You can call MatSetOption(mat, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_FALSE) and 
then insert the new values. If it is just a handful of new insertions, the extra 
time should be small.

Making a copy of the matrix won't give you a new matrix that is any faster 
to insert into, so it is best to just use the same matrix.

  Barry


> On May 19, 2024, at 7:44 PM, Adrian Croucher  
> wrote:
> 
> hi,
> 
> I have a Jacobian matrix created using DMCreateMatrix(). What would be 
> the best way to add extra nonzero entries into it?
> 
> I'm guessing that DMCreateMatrix() allocates the storage so the nonzero 
> structure can't really be easily modified. Would it be a case of 
> creating a new matrix, copying the nonzero entries from the original one 
> and then adding the extra ones, before calling MatSetUp() or similar? If 
> so, how exactly would you copy the nonzero structure from the original 
> matrix?
> 
> Background: the flow problem I'm solving (on a DMPlex with finite volume 
> method) has complex source terms that depend on the solution (e.g. 
> pressure), and can also depend on other source terms. A simple example 
> is when fluid is extracted from one location, with a pressure-dependent 
> flow rate, and some of it is then reinjected in another location. This 
> can result in poor nonlinear solver convergence. I think the reason is 
> that there are effectively missing Jacobian entries in the row for the 
> reinjection cell, which should have an additional dependence on the 
> solution in the cell where fluid is extracted.
> 
> - Adrian
> 
> -- 
> Dr Adrian Croucher
> Senior Research Fellow
> Department of Engineering Science
> Waipapa Taumata Rau / University of Auckland, New Zealand
> email: a.crouc...@auckland.ac.nz 
> tel: +64 (0)9 923 4611
> 
> 



Re: [petsc-users] duplicated libs

2024-05-13 Thread Barry Smith

  It is not always safe to remove duplicate libraries that appear at different 
places in the link line, hence we cannot simply always remove them.

  Barry
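For anyone scripting the manual cleanup of petscvariables, one common policy (not universally safe, per the caveat above) is to keep only the last occurrence of each repeated -l token while preserving overall order; a hypothetical sketch:

```python
def dedup_link_libs(libs):
    """Keep only the LAST occurrence of each repeated -l token, preserving
    overall order.  Caution (per the note above): with static archives and
    circular dependencies this is NOT always safe, which is exactly why
    configure does not do it automatically."""
    seen = set()
    out = []
    for lib in reversed(libs):
        if lib.startswith("-l") and lib in seen:
            continue                 # drop the earlier duplicate
        seen.add(lib)
        out.append(lib)
    return list(reversed(out))
```

Non-library tokens (e.g. -L search paths) are left alone; only repeated -lfoo entries are collapsed.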

> On May 13, 2024, at 10:00 PM, Runjian Wu  wrote:
> 
> Thanks for your reply!  Since I can manually remove duplicates, how about 
> adding a function to automatically remove duplicates at the end of 
> "configure" in the next PETSc version?
> 
> 
> Runjian
> 
> 
> 
> On 5/13/2024 9:31 PM, Barry Smith wrote:
>> 
>>Because the order of the libraries can be important, it is difficult for 
>> ./configure to remove unneeded duplicates automatically.
>> 
>>You can manually remove duplicates by editing 
>> $PETSC_ARCH/lib/petsc/conf/petscvariables after running ./configure
>> 
>>Barry
>> 
>> 
>> 
>>> On May 13, 2024, at 7:47 AM, Runjian Wu  
>>> <mailto:wurunj...@gmail.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> After I compiled PETSc, I found some duplicated libs in the variable 
>>> PETSC_EXTERNAL_LIB_BASIC, e.g., -lm, -lgfortran -lstdc++.  I am curious how 
>>> it happened and how to remove the duplicates?
>>> 
>>> Thanks,
>>> 
>>> Runjian Wu
>> 



Re: [petsc-users] VecGetArrayF90 vs. VecGetArrayReadF90

2024-05-13 Thread Barry Smith

  It errors in C because the argument is labeled with const, but there does not 
seem to be a way in Fortran to indicate that an array is read-only.
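As an aside, some languages enforce read-only array views at run time rather than at compile time; a loose Python analogy (illustrative only, it does not change what Fortran can express):

```python
buf = bytearray(b"petsc")
ro = memoryview(buf).toreadonly()   # read-only view of the same storage

value = ro[0]                       # reading through the view is allowed
try:
    ro[0] = 0                       # writing raises TypeError -- at run
    write_rejected = False          # time, not at compile time as C's
except TypeError:                   # const does
    write_rejected = True
```

So the C interface catches the mistake before the program runs, while a run-time scheme only catches it when the offending write executes; Fortran's intent(in) applies to dummy arguments, not to an array obtained through a getter like VecGetArrayReadF90().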



> On May 13, 2024, at 10:21 PM, Runjian Wu  wrote:
> 
> I only know the "intent(in)" attribute for the dummy arguments. 
> 
> In the counterpart C function VecGetArrayRead(...), if I write values, will the 
> compiler report an error? 
> 
> Runjian
> 
> On Mon, May 13, 2024 at 9:36 PM Barry Smith  <mailto:bsm...@petsc.dev>> wrote:
>> 
>>I couldn't find a way in Fortran to declare an array as read-only. Is 
>> there such support?
>> 
>>Barry
>> 
>> 
>>> On May 13, 2024, at 7:28 AM, Runjian Wu >> <mailto:wurunj...@gmail.com>> wrote:
>>> 
>>> Hi all,
>>> 
>>> I have a question about VecGetArrayReadF90(...).  If I use 
>>> VecGetArrayReadF90(...), I can still write entries into the array like 
>>> VecGetArrayF90(...).  Is it possible to report an error during compile 
>>> process?
>>> 
>>> Thanks,
>>> 
>>> Runjian Wu
>> 



Re: [petsc-users] Using Compute-Sanitizer with PETSc

2024-05-13 Thread Barry Smith

  Depending on your MPI, mpiexec may not be needed, so

  compute-sanitizer --tool memcheck --leak-check full ./a.out args

  may work.

> On May 13, 2024, at 8:16 PM, Sreeram R Venkat  wrote:
> 
> I am trying to check my program for GPU memory leaks with the 
> compute-sanitizer tool. If I run my application with: 
> mpiexec -n 1 compute-sanitizer --tool memcheck --leak-check full ./a.out args
> 
> I get the message:
> Error: No attachable process found. compute-sanitizer timed-out.
> 
> Adding --target-processes all does not help.
> 
> Is there anything else I should be doing?
> 
> Thanks,
> Sreeram



Re: [petsc-users] VecGetArrayF90 vs. VecGetArrayReadF90

2024-05-13 Thread Barry Smith

   I couldn't find a way in Fortran to declare an array as read-only. Is there 
such support?

   Barry


> On May 13, 2024, at 7:28 AM, Runjian Wu  wrote:
> 
> Hi all,
> 
> I have a question about VecGetArrayReadF90(...).  If I use 
> VecGetArrayReadF90(...), I can still write entries into the array like 
> VecGetArrayF90(...).  Is it possible to report an error during compile 
> process?
> 
> Thanks,
> 
> Runjian Wu



Re: [petsc-users] duplicated libs

2024-05-13 Thread Barry Smith

   Because the order of the libraries can be important, it is difficult for 
./configure to remove unneeded duplicates automatically.

   You can manually remove duplicates by editing 
$PETSC_ARCH/lib/petsc/conf/petscvariables after running ./configure

   Barry



> On May 13, 2024, at 7:47 AM, Runjian Wu  wrote:
> 
> Hi all,
> 
> After I compiled PETSc, I found some duplicated libs in the variable 
> PETSC_EXTERNAL_LIB_BASIC, e.g., -lm, -lgfortran -lstdc++.  I am curious how 
> it happened and how to remove the duplicates?
> 
> Thanks,
> 
> Runjian Wu



Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code

2024-05-07 Thread Barry Smith

   Have you considered 
https://www.pflotran.org/documentation/user_guide/how_to/installation/installation.html ?


> On May 7, 2024, at 2:22 PM, Shatanawi, Sawsan Muhammad via petsc-users 
>  wrote:
> 
> Hello everyone,
>  
> I hope this email finds you well.
> 
> 
>  My name is Sawsan Shatanawi, and I was developing a Fortran code for 
> simulating groundwater flow in a 3D system with nonlinear behavior. I solved 
> the nonlinear system using the PCG solver and Picard iteration, but I did not 
> get good results even though I checked my matrix and RHS carefully, so I 
> decided to change my solver to the Newton-Raphson method.
> I checked PETSc documents but I have a few questions:
> 1) My groundwater system is time-dependent, so should I use TS only instead 
> of SNES?
> 2) My system has its deltaT, would using deltaT as dt affect my solver, or is 
> it better to use TS-PETSc dt? Also, would using PETSc dt affect the 
> simulation of the groundwater system
> 3) I want my Jacobian matrix to be calculated by PETSc automatically
> 4) Do I need to define and calculate the residual vector?
>  
> My A-Matrix contains coefficients and external sources and my RHS vector 
> includes the boundary conditions  
> 
> 
> Please find the attached file contains a draft of my code
> 
> Thank you in advance for your time and help.
> 
> Best regards,
> 
>  Sawsan
> 
> 
> From: Shatanawi, Sawsan Muhammad  <mailto:sawsan.shatan...@wsu.edu>>
> Sent: Tuesday, January 16, 2024 10:43 AM
> To: Junchao Zhang mailto:junchao.zh...@gmail.com>>
> Cc: Barry Smith mailto:bsm...@petsc.dev>>; Matthew Knepley 
> mailto:knep...@gmail.com>>; Mark Adams  <mailto:mfad...@lbl.gov>>; petsc-users@mcs.anl.gov 
> <mailto:petsc-users@mcs.anl.gov>  <mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran 
> Groundwater Flow Simulation Code
>  
> Hello all,
> 
> Thank you for your valuable help. I will do your recommendations and hope it 
> will run without any issues.
> 
> Bests,
> Sawsan
>  
> From: Junchao Zhang mailto:junchao.zh...@gmail.com>>
> Sent: Friday, January 12, 2024 8:46 AM
> To: Shatanawi, Sawsan Muhammad  <mailto:sawsan.shatan...@wsu.edu>>
> Cc: Barry Smith mailto:bsm...@petsc.dev>>; Matthew Knepley 
> mailto:knep...@gmail.com>>; Mark Adams  <mailto:mfad...@lbl.gov>>; petsc-users@mcs.anl.gov 
> <mailto:petsc-users@mcs.anl.gov>  <mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran 
> Groundwater Flow Simulation Code
>  
> Hi, Sawsan,
>First in test_main.F90, you need to call VecGetArrayF90(temp_solution, 
> H_vector, ierr) and  VecRestoreArrayF90 (temp_solution, H_vector, ierr)  as 
> Barry mentioned.
>Secondly, in the loop of test_main.F90, it calls GW_solver(). Within it, 
> it calls PetscInitialize()/PetscFinalize(). But unless MPI is initialized 
> separately, PetscInitialize()/PetscFinalize() can only be called once per program.
> do timestep = 2, NTSP
>    call GW_boundary_conditions(timestep-1)
>    ! print *, HNEW(1,1,1)
>    call GW_elevation()
>    ! print *, GWTOP(2,2,2)
>    call GW_conductance()
>    ! print *, CC(2,2,2)
>    call GW_recharge()
>    ! print *, B_Rech(5,4)
>    call GW_pumping(timestep-1)
>    ! print *, B_pump(2,2,2)
>    call GW_SW(timestep-1)
>    print *, B_RIVER(2,2,2)
>    call GW_solver(timestep-1,N)
>    call GW_deallocate_loop()
> end do
> 
> A solution is to delete PetscInitialize()/PetscFinalize() in 
> GW_solver_try.F90 and move it to test_main.F90,  outside the do loop.
> 
> diff --git a/test_main.F90 b/test_main.F90
> index b5997c55..107bd3ee 100644
> --- a/test_main.F90
> +++ b/test_main.F90
> @@ -1,5 +1,6 @@
>  program test_GW
>  
> +#include 
>  use petsc
>  use GW_constants
>  use GW_param_by_user
> @@ -8,6 +9,9 @@ program test_GW
>  implicit none
>  integer :: N
>  integer :: timestep
> +PetscErrorCode ierr
> +
> +call PetscInitialize(ierr)
>  call GW_domain(N)
>  !print *, "N=",N
>  !print *, DELTAT
> @@ -37,4 +41,5 @@ program test_GW
>  end do
>  

Re: [petsc-users] PETSc options

2024-05-06 Thread Barry Smith


> On May 6, 2024, at 8:38 AM, Mark Adams  wrote:
> 
> I don't know why this should have changed, but you can either not feed -v to 
> PETSc (a pain, probably), use PETSc's options-getting methods instead of 
> Fortran's, or make a dummy call to PETSc's methods in addition to yours.

   Yes, just call PetscOptionsHasName() with -v and ignore the result.
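The first alternative Mark mentions, not feeding -v to the library at all, amounts to splitting the command line before PetscInitialize() ever sees it. A generic sketch of that pattern (shown in Python for brevity; Waiwera itself is Fortran, and the flag names here are just the ones from this thread):

```python
def split_args(argv, mine=("-v", "-h")):
    """Separate options the application owns from the ones handed on to
    the solver library, so the library never sees (and never warns about)
    flags it does not own.  'mine' lists the application's own flags."""
    own  = [a for a in argv if a in mine]
    rest = [a for a in argv if a not in mine]
    return own, rest

# hypothetical command line: the application consumes -v itself
own, rest = split_args(["waiwera", "-v", "-ksp_type", "gmres"])
```

Barry's suggestion is the lighter-weight fix: leave the command line alone and simply query the option so PETSc marks it as used.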


> 
> Hope this helps,
> Mark
> 
> On Mon, May 6, 2024 at 1:04 AM Adrian Croucher  > wrote:
>> hi,
>> 
>> My code has some optional command line arguments -v and -h for output of 
>> version number and usage help. These are processed using Fortran's 
>> get_command_argument().
>> 
>> Since updating PETSc to version 3.21, I get some extra warnings after 
>> the output:
>> 
>> acro018@EN438880:~$ waiwera -v
>> 1.5.0b1
>> WARNING! There are options you set that were not used!
>> WARNING! could be spelling mistake, etc!
>> There is one unused database option. It is:
>> Option left: name:-v (no value) source: command line
>> 
>> That didn't used to happen. What should I do to make them go away?
>> 
>> Regards, Adrian
>> 
>> -- 
>> Dr Adrian Croucher
>> Senior Research Fellow
>> Department of Engineering Science
>> Waipapa Taumata Rau / University of Auckland, New Zealand
>> email: a.crouc...@auckland.ac.nz 
>> tel: +64 (0)9 923 4611
>> 
> 



Re: [petsc-users] [petsc-maint] Inquiry about Multithreading Capabilities in PETSc's KSPSolver

2024-04-29 Thread Barry Smith

   Do you need Fortran? If not, just run configure again, adding --with-fc=0 
--with-sowing=0

   If you need fortran send configure.log


> On Apr 29, 2024, at 3:45 PM, Yongzhong Li  
> wrote:
> 
> Hi Barry,
> 
> Thanks for your reply, I checkout to the git branch 
> barry/2023-09-15/fix-log-pcmpi  but get some errors when configuring PETSc, 
> below is the error message,
> 
> =
>  Configuring PETSc to compile on your system
> =
> =
>  * WARNING *
>   Found environment variable: MAKEFLAGS=s -j14 --jobserver-auth=3,5. Ignoring 
> it! Use
>   "./configure MAKEFLAGS=$MAKEFLAGS" if you really want to use this value
> =
> =
>   Running configure on SOWING; this may take several minutes
> =
>  
> *
>UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for 
> details):
> -
>   Error running configure on SOWING
> *
> My configuration is 
> 
> ./configure PETSC_ARCH=config-release --with-scalar-type=complex 
> --with-fortran-kernels=1 --with-debugging=0 COPTFLAGS=-O3 -march=native 
> CXXOPTFLAGS=-O3 -march=native FOPTFLAGS=-O3 -march=native --with-cxx=g++ 
> --download-openmpi --download-superlu --download-opencascade 
> --with-openblas-include=${OPENBLAS_INC} --with-openblas-lib=${OPENBLAS_LIB} 
> --with-threadsafety --with-log=0 --with-openmp
> 
> I didn’t have this issue when I configured PETSc using tarball release 
> download version. Any suggestions on this?
> 
> Thanks and regards,
> Yongzhong
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Saturday, April 27, 2024 at 12:54 PM
> To: Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>, 
> petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> 
> mailto:petsc-ma...@mcs.anl.gov>>
> Subject: Re: [petsc-maint] Inquiry about Multithreading Capabilities in 
> PETSc's KSPSolver
> 
>  
>You should use the git branch barry/2023-09-15/fix-log-pcmpi  It is still 
> work-in-progress but much better than what is currently in the main PETSc 
> branch.
>  
>By default, the MPI linear solver server requires 10,000 unknowns per MPI 
> process, so for smaller problems, it will only run on one MPI rank and list 
> Sequential   in your output. In general you need on the order of at least 
> 10,000 unknowns per MPI process to get good speedup. You can control it with 
>  
>-mpi_linear_solver_server_minimum_count_per_rank 
>  
> Regarding the report of 1 iteration, that is fixed in the branch listed above.
>  
>   Barry
>  
> On Apr 26, 2024, at 5:11 PM, Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>> wrote:
>  
> Hi Barry,
> 
> Thanks, I am interested in this PCMPI solution provided by PETSc!
> 
> I tried the src/ksp/ksp/tutorials/ex1.c which is configured in CMakelists as 
> follows:
> 
> ./configure PETSC_ARCH=config-debug --with-scalar-type=complex 
> --with-fortran-kernels=1 --with-debugging=0 --with-logging=0 --with-cxx=g++ 
> --download-mpich --download-superlu --download-opencascade 
> --with-openblas-include=${OPENBLAS_INC} --with-openblas-lib=${OPENBLAS_LIB}
> 
> In the linux terminal, my bash script is as follows,
> 
> mpiexec -n 4 ./ex1 -mpi_linear_solver_server -mpi_linear_solver_server_view
>  
> However, I found the ouput a bit strange
> 
> Norm of error 1.23629e-15, Iterations 1
> MPI linear solver server statistics:
> Ranks    KSPSolve()s    Mats    KSPs    Avg. Size    Avg. Its
>   Sequential 3 210   1
> Norm of error 1.23629e-15, Iterations 1
> MPI linear solver server statistics:
> Ranks    KSPSolve()s    Mats    KSPs    Avg. Size    Avg. Its
>   Sequenti

Re: [petsc-users] Asking SuiteSparse to use Metis at PETSc config time

2024-04-29 Thread Barry Smith

--with-x=0


> On Apr 29, 2024, at 12:05 PM, Vanella, Marcos (Fed) via petsc-users 
>  wrote:
> 
> Hi Satish, 
> Ok thank you for clarifying. I don't need to include Metis in the config 
> phase then (not using anywhere else).
> Is there a way I can configure PETSc to not require X11 (Xgraph functions, 
> etc.)?
> Thank you,
> Marcos
> From: Satish Balay mailto:ba...@mcs.anl.gov>>
> Sent: Monday, April 29, 2024 12:00 PM
> To: Vanella, Marcos (Fed)  >
> Cc: petsc-users@mcs.anl.gov  
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Asking SuiteSparse to use Metis at PETSc config 
> time
>  
> 
> # Other CMakeLists.txt files inside SuiteSparse are from dependent packages
> # (LAGraph/deps/json_h, GraphBLAS/cpu_features, and CHOLMOD/SuiteSparse_metis
> # which is a slightly revised copy of METIS 5.0.1) but none of those
> # CMakeLists.txt files are used to build any package in SuiteSparse.
> 
> 
> So suitesparse includes a copy of metis sources - i.e does not use external 
> metis library?
> 
> >>
> balay@pj01:~/petsc/arch-linux-c-debug/lib$ nm -Ao *.so |grep 
> METIS_PartGraphKway
> libcholmod.so:0026e500 T SuiteSparse_metis_METIS_PartGraphKway
> <<<
> 
> And metis routines are already in -lcholmod [with some namespace fixes]
> 
> Satish
> 
> On Mon, 29 Apr 2024, Vanella, Marcos (Fed) via petsc-users wrote:
> 
> > Hi all, I'm wondering.. Is it possible to get SuiteSparse to use Metis at 
> > configure time with PETSc? Using Metis for reordering at symbolic 
> > factorization phase gives lower filling for factorization matrices than AMD 
> > in some cases (faster solution phase).
> > I tried this with gcc compilers and openmpi:
> > 
> > $./configure LDFLAGS="-ld_classic" COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" 
> > FOPTFLAGS="-O2 -g" --with-debugging=0 --with-shared-libraries=0 
> > --download-metis --download-suitesparse --download-hypre 
> > --download-fblaslapack --download-make --force
> > 
> > and get for SuiteSparse:
> > 
> > metis:
> >   Version:5.1.0
> >   Includes:   
> > -I/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/include
> >   Libraries:  
> > -Wl,-rpath,/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/lib 
> > -L/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/lib -lmetis
> > SuiteSparse:
> >   Version:7.7.0
> >   Includes:   
> > -I/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/include/suitesparse
> >  -I/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/include
> >   Libraries:  
> > -Wl,-rpath,/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/lib 
> > -L/Users/mnv/Documents/Software/petsc/arch-darwin-opt-gcc/lib -lspqr 
> > -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd 
> > -lsuitesparseconfig
> > 
> > for which I see Metis will be compiled but I don't have -lmetis linking in 
> > the SuiteSparse Libraries.
> > Thank you for your time!
> > Marcos
> > 



Re: [petsc-users] [petsc-maint] Inquiry about Multithreading Capabilities in PETSc's KSPSolver

2024-04-27 Thread Barry Smith

   You should use the git branch barry/2023-09-15/fix-log-pcmpi  It is still 
work-in-progress but much better than what is currently in the main PETSc 
branch.

   By default, the MPI linear solver server requires 10,000 unknowns per MPI 
process, so for smaller problems, it will only run on one MPI rank and list 
Sequential   in your output. In general you need on the order of at least 
10,000 unknowns per MPI process to get good speedup. You can control it with 

   -mpi_linear_solver_server_minimum_count_per_rank 

Regarding the report of 1 iteration, that is fixed in the branch listed above.

  Barry

> On Apr 26, 2024, at 5:11 PM, Yongzhong Li  
> wrote:
> 
> Hi Barry,
> 
> Thanks, I am interested in this PCMPI solution provided by PETSc!
> 
> I tried the src/ksp/ksp/tutorials/ex1.c which is configured in CMakelists as 
> follows:
> 
> ./configure PETSC_ARCH=config-debug --with-scalar-type=complex 
> --with-fortran-kernels=1 --with-debugging=0 --with-logging=0 --with-cxx=g++ 
> --download-mpich --download-superlu --download-opencascade 
> --with-openblas-include=${OPENBLAS_INC} --with-openblas-lib=${OPENBLAS_LIB}
> 
> In the linux terminal, my bash script is as follows,
> 
> mpiexec -n 4 ./ex1 -mpi_linear_solver_server -mpi_linear_solver_server_view
>  
> However, I found the ouput a bit strange
> 
> Norm of error 1.23629e-15, Iterations 1
> MPI linear solver server statistics:
> Ranks    KSPSolve()s    Mats    KSPs    Avg. Size    Avg. Its
>   Sequential 3 210   1
> Norm of error 1.23629e-15, Iterations 1
> MPI linear solver server statistics:
> Ranks    KSPSolve()s    Mats    KSPs    Avg. Size    Avg. Its
>   Sequential 3 210   1
> Norm of error 1.23629e-15, Iterations 1
> MPI linear solver server statistics:
> Ranks    KSPSolve()s    Mats    KSPs    Avg. Size    Avg. Its
>   Sequential 3 210   1
> Norm of error 1.23629e-15, Iterations 1
> MPI linear solver server statistics:
> Ranks    KSPSolve()s    Mats    KSPs    Avg. Size    Avg. Its
>   Sequential 3 210   1
> 
> It seems that mpi started four processes, but they all did the same things, 
> and I am confused why the ranks showed sequential. Are these supposed to be 
> the desired output when the mpi_linear_solver_server is turned on?
> 
> And if I run ex1 without any hyphen options, I got 
> 
> Norm of error 2.47258e-15, Iterations 5
> 
> It looks like the KSPSolver uses 5 iterations to reach convergence, but why, 
> when mpi_linear_solver_server is enabled, does it use 1?
> 
> I hope to get some help on these issues, thank you!
> 
> Sincerely,
> Yongzhong
> 
> 
> 
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Tuesday, April 23, 2024 at 5:15 PM
> To: Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>, 
> petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> 
> mailto:petsc-ma...@mcs.anl.gov>>, Piero Triverio 
> mailto:piero.trive...@utoronto.ca>>
> Subject: Re: [petsc-maint] Inquiry about Multithreading Capabilities in 
> PETSc's KSPSolver
> 
>  
>   Yes, only the routines that can explicitly use BLAS have multi-threading.
>  
>PETSc does support using any MPI linear solver from a sequential (or 
> OpenMP) main program using the 
> https://petsc.org/release/manualpages/PC/PCMPI/#pcmpi 
> construct. I am finishing up better support in the branch 
> barry/2023-09-15/fix-log-pcmpi
>  
>   Barry
>  
>  
>  
>  
>  
> 
> 
> On Apr 23, 2024, at 3:59 PM, Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>> wrote:
>  
> Thanks Barry! Does this mean that the sparse matrix-vector products, which 
> actually constitute the majority of the computations in my GMRES routine in 
> PETSc, don’t utilize multithreading? Only basic vector operations such as 
> VecAXPY and VecDot or dense matrix operations in PETSc will benefit from 
> multithreading, is it correct?
> 
> Best,
> Yongzhong
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Tuesday, April 23, 2024 at 3:35 PM
> To: Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:pets

Re: [petsc-users] CUDA GPU supported KSPs and PCs

2024-04-24 Thread Barry Smith

   It is less a question of which KSPs and PCs support running with CUDA and more 
a question of what parts of each KSP and PC run with CUDA (and which parts 
don't, causing memory traffic back and forth between the CPU and GPU).  

 Generally speaking, all the PETSc Vec operations run on CUDA. Thus "all" 
the KSPs "support CUDA". For Mat operations, it is more complicated; triangular 
solves do not run well (or at all) on CUDA, but many of the other operations do 
run on CUDA. Since setting up and solving with some PCs involve rather 
complicated Mat operations (like PCGAMG and PCFIELDSPLIT), parts may work on 
CUDA and parts may not. 

 The best way to determine how the GPU is being utilized is to run with 
-log_view and look at the columns that present the amount of memory traffic 
between the CPU and GPU and the percentage of floating point that is done on 
the GPU. Feel free to ask specific questions about the output. In some cases, 
given the output, we may be able to add additional CUDA support that is missing 
to decrease the memory traffic between the CPU and GPU and increase the flops 
done on the GPU.

We cannot produce a table of what is "supported" and what is not supported, 
or even how much is supported, since there are so many possible combinations; 
hence it is best to run to determine the problematic places.

> On Apr 24, 2024, at 10:22 AM, Giyantha Binu Amaratunga Mukadange 
>  wrote:
> 
> Hi, 
> 
> Is it possible to know which KSPs and PCs currently support running on Nvidia 
> GPUs with CUDA, or a source that has this information?
> The following page doesn't provide details about the supported KSPs and PCs. 
> https://petsc.org/main/overview/gpu_roadmap/ 
> 
> 
> Thank you very much!
> 
> Best regards, 
> Binu



Re: [petsc-users] [petsc-maint] Inquiry about Multithreading Capabilities in PETSc's KSPSolver

2024-04-23 Thread Barry Smith

  Yes, only the routines that can explicitly use BLAS have multi-threading.

   PETSc does support using any MPI linear solver from a sequential (or 
OpenMP) main program using the 
https://petsc.org/release/manualpages/PC/PCMPI/#pcmpi 
construct. I am finishing up better support in the branch 
barry/2023-09-15/fix-log-pcmpi

  Barry






> On Apr 23, 2024, at 3:59 PM, Yongzhong Li  
> wrote:
> 
> Thanks Barry! Does this mean that the sparse matrix-vector products, which 
> actually constitute the majority of the computations in my GMRES routine in 
> PETSc, don’t utilize multithreading? Only basic vector operations such as 
> VecAXPY and VecDot or dense matrix operations in PETSc will benefit from 
> multithreading, is it correct?
> 
> Best,
> Yongzhong
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Tuesday, April 23, 2024 at 3:35 PM
> To: Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>, 
> petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> 
> mailto:petsc-ma...@mcs.anl.gov>>, Piero Triverio 
> mailto:piero.trive...@utoronto.ca>>
> Subject: Re: [petsc-maint] Inquiry about Multithreading Capabilities in 
> PETSc's KSPSolver
> 
>  
>Intel MKL or OpenBLAS are the best bet, but for vector operations the 
> gains will not be significant, since the vector operations do not dominate 
> the computations.
> 
> 
> On Apr 23, 2024, at 3:23 PM, Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>> wrote:
>  
> Hi Barry,
> 
> Thank you for the information provided!
> 
> Do you think a different BLAS implementation will affect the multithreading 
> performance of some vector operations in GMRES in PETSc?
>  
> I am now using OpenBLAS but didn't see much improvement when the 
> multithreading is enabled; do you think other implementations such as netlib 
> and intel-mkl will help?
> 
> Best,
> Yongzhong
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Monday, April 22, 2024 at 4:20 PM
> To: Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>, 
> petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> 
> mailto:petsc-ma...@mcs.anl.gov>>, Piero Triverio 
> mailto:piero.trive...@utoronto.ca>>
> Subject: Re: [petsc-maint] Inquiry about Multithreading Capabilities in 
> PETSc's KSPSolver
> 
>  
>PETSc provided solvers do not directly use threads. 
>  
>The BLAS used by LAPACK and PETSc may use threads depending on what BLAS 
> is being used and how it was configured. 
>  
>Some of the vector operations in GMRES in PETSc use BLAS that can use 
> threads, including axpy, dot, etc. For sufficiently large problems, the use 
> of threaded BLAS can help with these routines, but not significantly for the 
> solver. 
>  
>Dense matrix-vector products MatMult() and dense matrix direct solvers 
> PCLU use BLAS and thus can benefit from threading. The benefit can be 
> significant for large enough problems with good hardware, especially with 
> PCLU. 
>  
>If you run with -blas_view PETSc tries to indicate information about the 
> threading of BLAS. You can also use -blas_num_threads  to set the number 
> of threads, equivalent to setting the environment variable.  For dense 
> solvers you can vary the number of threads and run with -log_view to see what 
> it helps to improve and what it does not affect.
>  
>  
>  
> 
> On Apr 22, 2024, at 4:06 PM, Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>> wrote:
>  
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello all,
>  
> I am writing to ask if PETSc’s KSPSolver makes use of OpenMP/multithreading, 
> specifically when performing iterative solutions with the GMRES algorithm.
>  
> The questions appeared when I was r

Re: [petsc-users] [petsc-maint] Inquiry about Multithreading Capabilities in PETSc's KSPSolver

2024-04-23 Thread Barry Smith

   Intel MKL or OpenBLAS are the best bet, but for vector operations the choice will 
not be significant, since vector operations do not dominate the computations.

> On Apr 23, 2024, at 3:23 PM, Yongzhong Li  
> wrote:
> 
> Hi Barry,
> 
> Thank you for the information provided!
> 
> Do you think different BLAS implementations will affect the multithreading 
> performance of some vector operations in GMRES in PETSc?
>  
> I am now using OpenBLAS but didn’t see much improvement when 
> multithreading is enabled; do you think other implementations such as Netlib 
> or Intel MKL will help?
> 
> Best,
> Yongzhong
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Monday, April 22, 2024 at 4:20 PM
> To: Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>, 
> petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> 
> mailto:petsc-ma...@mcs.anl.gov>>, Piero Triverio 
> mailto:piero.trive...@utoronto.ca>>
> Subject: Re: [petsc-maint] Inquiry about Multithreading Capabilities in 
> PETSc's KSPSolver
> 
>  
>PETSc provided solvers do not directly use threads. 
>  
>The BLAS used by LAPACK and PETSc may use threads depending on what BLAS 
> is being used and how it was configured. 
>  
>Some of the vector operations in GMRES in PETSc use BLAS that can use 
> threads, including axpy, dot, etc. For sufficiently large problems, the use 
> of threaded BLAS can help with these routines, but not significantly for the 
> solver. 
>  
>Dense matrix-vector products MatMult() and dense matrix direct solvers 
> PCLU use BLAS and thus can benefit from threading. The benefit can be 
> significant for large enough problems with good hardware, especially with 
> PCLU. 
>  
>If you run with -blas_view PETSc tries to indicate information about the 
> threading of BLAS. You can also use -blas_num_threads  to set the number 
> of threads, equivalent to setting the environment variable.  For dense 
> solvers you can vary the number of threads and run with -log_view to see what 
> it helps to improve and what it does not affect.
>  
>  
> 
> 
> On Apr 22, 2024, at 4:06 PM, Yongzhong Li  <mailto:yongzhong...@mail.utoronto.ca>> wrote:
>  
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello all,
>  
> I am writing to ask if PETSc’s KSPSolver makes use of OpenMP/multithreading, 
> specifically when performing iterative solutions with the GMRES algorithm.
>  
> The questions appeared when I was running a large numerical program based on 
> boundary element method. I used the PETSc's GMRES algorithm in KSPSolve to 
> solve a shell matrix system iteratively. I observed that threads were being 
> utilized, controlled by the OPENBLAS_NUM_THREADS environment variable. 
> However, I noticed no significant performance difference between running the 
> solver with multiple threads versus a single thread.
> 
> Could you please confirm if GMRES in KSPSolve leverages multithreading, and 
> also whether it is influenced by the multithreading of the low-level math 
> libraries such as BLAS and LAPACK? If so, how can I enable multithreading 
> effectively to see noticeable improvements in solution times when using 
> GMRES? If not, why do I observe that threads are being used during the GMRES 
> solutions?
>  
> For reference, I am using PETSc version 3.16.0, configured in CMakelists as 
> follows:
> 
> ./configure PETSC_ARCH=config-release --with-scalar-type=complex 
> --with-fortran-kernels=1 --with-debugging=0 COPTFLAGS=-O3 -march=native 
> CXXOPTFLAGS=-O3 -march=native FOPTFLAGS=-O3 -march=native --with-cxx=g++ 
> --download-openmpi --download-superlu --download-opencascade 
> --with-openblas-include=${OPENBLAS_INC} --with-openblas-lib=${OPENBLAS_LIB} 
> --with-threadsafety --with-log=0 --with-openmp
> 
> To simplify the diagnosis of potential issues, I have also written a small 
> example program using GMRES to solve a sparse matrix system derived from a 2D 
> Poisson problem using the finite difference method. I found similar issues on 
> this piece of codes. The code is as follows:
> 
> #include <petscksp.h>
> 
> /* Monitor function to print iteration number and residual norm */
> PetscErrorCode MyKSPMonitor(KSP ksp, PetscInt n, PetscReal rnorm, void *ctx) {
>   

Re: [petsc-users] [petsc-maint] Inquiry about Multithreading Capabilities in PETSc's KSPSolver

2024-04-22 Thread Barry Smith

   PETSc provided solvers do not directly use threads. 

   The BLAS used by LAPACK and PETSc may use threads depending on what BLAS is 
being used and how it was configured. 

   Some of the vector operations in GMRES in PETSc use BLAS that can use 
threads, including axpy, dot, etc. For sufficiently large problems, the use of 
threaded BLAS can help with these routines, but not significantly for the 
solver. 

   Dense matrix-vector products MatMult() and dense matrix direct solvers PCLU 
use BLAS and thus can benefit from threading. The benefit can be significant 
for large enough problems with good hardware, especially with PCLU. 

   If you run with -blas_view PETSc tries to indicate information about the 
threading of BLAS. You can also use -blas_num_threads  to set the number of 
threads, equivalent to setting the environment variable.  For dense solvers 
you can vary the number of threads and run with -log_view to see what it helps 
to improve and what it does not affect.
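As a sketch of that workflow (the binary name ./app is a placeholder for your PETSc application; the options themselves are the PETSc options named above):

```shell
# Show which BLAS PETSc linked against and whether it is threaded
./app -blas_view

# Vary the BLAS thread count and write each run's -log_view output
# to a separate file for comparison
for t in 1 2 4 8; do
  ./app -blas_num_threads $t -log_view :log_${t}threads.txt
done
```

Comparing events such as MatMult and the dense-solve stages across the log files shows which operations, if any, benefit from the extra threads.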



> On Apr 22, 2024, at 4:06 PM, Yongzhong Li  
> wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello all,
>  
> I am writing to ask if PETSc’s KSPSolver makes use of OpenMP/multithreading, 
> specifically when performing iterative solutions with the GMRES algorithm.
>  
> The questions appeared when I was running a large numerical program based on 
> boundary element method. I used the PETSc's GMRES algorithm in KSPSolve to 
> solve a shell matrix system iteratively. I observed that threads were being 
> utilized, controlled by the OPENBLAS_NUM_THREADS environment variable. 
> However, I noticed no significant performance difference between running the 
> solver with multiple threads versus a single thread.
> 
> Could you please confirm if GMRES in KSPSolve leverages multithreading, and 
> also whether it is influenced by the multithreading of the low-level math 
> libraries such as BLAS and LAPACK? If so, how can I enable multithreading 
> effectively to see noticeable improvements in solution times when using 
> GMRES? If not, why do I observe that threads are being used during the GMRES 
> solutions?
>  
> For reference, I am using PETSc version 3.16.0, configured in CMakelists as 
> follows:
> 
> ./configure PETSC_ARCH=config-release --with-scalar-type=complex 
> --with-fortran-kernels=1 --with-debugging=0 COPTFLAGS=-O3 -march=native 
> CXXOPTFLAGS=-O3 -march=native FOPTFLAGS=-O3 -march=native --with-cxx=g++ 
> --download-openmpi --download-superlu --download-opencascade 
> --with-openblas-include=${OPENBLAS_INC} --with-openblas-lib=${OPENBLAS_LIB} 
> --with-threadsafety --with-log=0 --with-openmp
> 
> To simplify the diagnosis of potential issues, I have also written a small 
> example program using GMRES to solve a sparse matrix system derived from a 2D 
> Poisson problem using the finite difference method. I found similar issues on 
> this piece of codes. The code is as follows:
> 
> #include <petscksp.h>
> 
> /* Monitor function to print iteration number and residual norm */
> PetscErrorCode MyKSPMonitor(KSP ksp, PetscInt n, PetscReal rnorm, void *ctx) {
> PetscErrorCode ierr;
> ierr = PetscPrintf(PETSC_COMM_WORLD, "Iteration %D, Residual norm %g\n", 
> n, (double)rnorm);
> CHKERRQ(ierr);
> return 0;
> }
> 
> int main(int argc, char **args) {
> Vec x, b, x_true, e;
> Mat A;
> KSP ksp;
> PetscErrorCode ierr;
> PetscInt i, j, Ii, J, n = 500; // Size of the grid n x n
> PetscInt Istart, Iend, ncols;
> PetscScalar v;
> PetscMPIInt rank;
> PetscInitialize(&argc, &args, NULL, NULL);
> PetscLogDouble t1, t2; // Variables for timing
> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
> 
> // Create vectors and matrix
> ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, n*n, &x); 
> CHKERRQ(ierr);
> ierr = VecDuplicate(x, &b); CHKERRQ(ierr);
> ierr = VecDuplicate(x, &x_true); CHKERRQ(ierr);
> 
> // Set true solution as all ones
> ierr = VecSet(x_true, 1.0); CHKERRQ(ierr);
> 
> // Create and assemble matrix A for the 2D Laplacian using 5-point stencil
> ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
> ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n*n, n*n); 
> CHKERRQ(ierr);
> ierr = MatSetFromOptions(A); CHKERRQ(ierr);
> ierr = MatSetUp(A); CHKERRQ(ierr);
> ierr = MatGetOwnershipRange(A, &Istart, &Iend); CHKERRQ(ierr);
> for (Ii = Istart; Ii < Iend; Ii++) {
> i = Ii / n; // Row index
> j = Ii % n; // Column index
> v = -4.0;
> ierr = MatSetValues(A, 1, &Ii, 1, &Ii, &v, INSERT_VALUES); 
> CHKERRQ(ierr);
> if (i > 0) { // South
> J = Ii - n;
> v = 1.0;
> ierr = MatSetValues(A, 1, &Ii, 1, &J, &v, INSERT_VALUES); 
> CHKERRQ(ierr);
> }
> if (i < n - 1) { // North
> J = Ii + n;
> v = 1.0;
> ierr = MatSetValues(A, 1, &Ii, 1, &J, &v, INSERT_VALUES); 
> CHKERRQ(ierr);
> }
> if (j > 

Re: [petsc-users] Using PETSc GPU backend

2024-04-12 Thread Barry Smith

  800k is a pretty small problem for GPUs. 

  We would need to see the runs with output from -ksp_view -log_view to see if 
the timing results are reasonable.
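For reference, a sketch of how such runs might be collected (the binary name ./app and the srun settings are placeholders; the PETSc options are standard):

```shell
# CPU-only baseline: 128 MPI tasks
srun -n 128 ./app -ksp_view -log_view :mpi128.log

# Hybrid run: 4 MPI tasks, one GPU each, CUDA back end
srun -n 4 --gpus-per-task=1 ./app -mat_type aijcusparse -vec_type cuda \
     -ksp_view -log_view :gpu4.log
```

For GPU builds, -log_view also reports the fraction of flops performed on the GPU and the host-to-device copy counts, which is usually where timings like the 27 s vs 32 s above get explained.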

> On Apr 12, 2024, at 1:48 PM, Ng, Cho-Kuen  wrote:
> 
> I performed tests on comparison using KSP with and without cuda backend on 
> NERSC's Perlmutter. For a finite element solve with 800k degrees of freedom, 
> the best times obtained using MPI and MPI+GPU were
> 
> o MPI - 128 MPI tasks, 27 s
> 
> o MPI+GPU - 4 MPI tasks, 4 GPU's, 32 s
> 
> Is that the performance one would expect using the hybrid mode of 
> computation? The attached image shows the scaling on a single node.
> 
> Thanks,
> Cho
> From: Ng, Cho-Kuen mailto:c...@slac.stanford.edu>>
> Sent: Saturday, August 12, 2023 8:08 AM
> To: Jacob Faibussowitsch mailto:jacob@gmail.com>>
> Cc: Barry Smith mailto:bsm...@petsc.dev>>; petsc-users 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using PETSc GPU backend
>  
> Thanks Jacob.
> From: Jacob Faibussowitsch mailto:jacob@gmail.com>>
> Sent: Saturday, August 12, 2023 5:02 AM
> To: Ng, Cho-Kuen mailto:c...@slac.stanford.edu>>
> Cc: Barry Smith mailto:bsm...@petsc.dev>>; petsc-users 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using PETSc GPU backend
>  
> > Can petsc show the number of GPUs used?
> 
> -device_view
> 
> Best regards,
> 
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
> 
> > On Aug 12, 2023, at 00:53, Ng, Cho-Kuen via petsc-users 
> > mailto:petsc-users@mcs.anl.gov>> wrote:
> > 
> > Barry,
> > 
> > I tried again today on Perlmutter and running on multiple GPU nodes worked. 
> > Likely, I had messed up something the other day. Also, I was able to have 
> > multiple MPI tasks on a GPU using Nvidia MPS. The petsc output shows the 
> > number of MPI tasks:
> > 
> > KSP Object: 32 MPI processes
> > 
> > Can petsc show the number of GPUs used?
> > 
> > Thanks,
> > Cho
> > 
> > From: Barry Smith mailto:bsm...@petsc.dev>>
> > Sent: Wednesday, August 9, 2023 4:09 PM
> > To: Ng, Cho-Kuen mailto:c...@slac.stanford.edu>>
> > Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> > mailto:petsc-users@mcs.anl.gov>>
> > Subject: Re: [petsc-users] Using PETSc GPU backend
> >  
> >   We would need more information about "hanging". Do PETSc examples and 
> > tiny problems "hang" on multiple nodes? If you run with -info what are the 
> > last messages printed? Can you run with a debugger to see where it is 
> > "hanging"?
> > 
> > 
> > 
> >> On Aug 9, 2023, at 5:59 PM, Ng, Cho-Kuen  >> <mailto:c...@slac.stanford.edu>> wrote:
> >> 
> >> Barry and Matt,
> >> 
> >> Thanks for your help. Now I can use petsc GPU backend on Perlmutter: 1 
> >> node, 4 MPI tasks and 4 GPUs. However, I ran into problems with multiple 
> >> nodes: 2 nodes, 8 MPI tasks and 8 GPUs. The run hung on KSPSolve. How can 
> >> I fix this?
> >> 
> >> Best,
> >> Cho
> >> 
> >>  From: Barry Smith mailto:bsm...@petsc.dev>>
> >> Sent: Monday, July 17, 2023 6:58 AM
> >> To: Ng, Cho-Kuen mailto:c...@slac.stanford.edu>>
> >> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >> mailto:petsc-users@mcs.anl.gov>>
> >> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>  
> >>  The examples that use DM, in particular DMDA all trivially support using 
> >> the GPU with -dm_mat_type aijcusparse -dm_vec_type cuda
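A sketch of that command line (the example binary and grid options are placeholders; the two -dm_* options are the ones named above):

```shell
# Move a DMDA-based PETSc example to the GPU purely from the command line
mpiexec -n 4 ./example -da_grid_x 128 -da_grid_y 128 \
        -dm_mat_type aijcusparse -dm_vec_type cuda -log_view
```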
> >> 
> >> 
> >> 
> >>> On Jul 17, 2023, at 1:45 AM, Ng, Cho-Kuen  >>> <mailto:c...@slac.stanford.edu>> wrote:
> >>> 
> >>> Barry,
> >>> 
> >>> Thank you so much for the clarification. 
> >>> 
> >>> I see that ex104.c and ex300.c use  MatXAIJSetPreallocation(). Are there 
> >>> other tutorials available?
> >>> 
> >>> Cho
> >>>  From: Barry Smith mailto:bsm...@petsc.dev>>
> >>> Sent: Saturday, July 15, 2023 8:36 AM
> >>> To: Ng, Cho-Kuen mailto:c...@slac.stanford.edu>>
> >>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>> mailto:petsc-users@mcs.anl.gov>>
> >>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>  
> >>>  Cho,
> >>> 
> >>>

Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-05 Thread Barry Smith
/manual/nvhpc/23.7/Linux_x86_64/23.7/compilers/extras/qd/lib
 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/compilers/extras/qd/lib
 
-L/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/compilers/extras/qd/lib
 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/extras/CUPTI/lib64
 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/extras/CUPTI/lib64
 
-L/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/extras/CUPTI/lib64
 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64
 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64
 
-L/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64
 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 
-Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 
-L/usr/lib/gcc/x86_64-redhat-linux/11 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/comm_libs/mpi/lib
 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/compilers/lib
 -lnvf -lnvomp -ldl -lnvhpcatm -latomic -lpthread -lnvcpumath -lnsnvc -lrt 
-lgcc_s -lm -lquadmath 


We see that libnvJitLink.so.12 is in 
/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64
 

Then when it links the executable (above) it passes 
-Wl,-rpath,/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64
 to the linker so that at
run time, it should be able to find all the libraries in 
/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64.
 

Is  
/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64/libnvJitLink.so.12
 a link to something that actually exists?
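One way to check, using the paths from the link line above (replace ./app with the actual executable):

```shell
CUDALIB=/software/sse2/tetralith_el9/manual/nvhpc/23.7/Linux_x86_64/23.7/cuda/12.2/lib64

# Is libnvJitLink.so.12 present, and if it is a symlink, does the target exist?
ls -lL ${CUDALIB}/libnvJitLink.so.12

# Does the final executable resolve it at run time via the embedded rpath?
ldd ./app | grep -i nvjitlink
```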

> 
> At the moment I am not sure where it tries to find that library. Therefore I 
> thought that maybe the problem
> is that BlasLapack could put some path in the library, which does not exist. 
> At the beginning of configure.log
> it mentions  libnvJitlink.so.12, but then it seems to get lost somewhere.
> 
> I have to see again if there is already a problem when I make petsc check, or 
> if it is just in my program later.
> Not quite sure anymore.
> 
> 
> I will write back next week, Frank
> 
> 
> 
> 
> 
>> On 5 Apr 2024, at 19:47, Barry Smith > <mailto:bsm...@petsc.dev>> wrote:
>> 
>> 
>>   Thanks for the configure.log. Send the configure.log for the failed 
>> nvJitlink problem.
>> 
>> 
>>> On Apr 5, 2024, at 12:58 PM, Frank Bramkamp >> <mailto:bramk...@nsc.liu.se>> wrote:
>>> 
>>> Hi Barry,
>>> 
>>> Here comes the latest configure.log file
>>> 
>>> My cuda nvJitlink problem unfortunately still exists. 
>>> I will try it on a different cluster to see if this a specific problem of 
>>> the actual nvhpc installation.
>>> 
>>> 
>>> Have a nice weekend, Frank
>>> 
>>> 
>>> 
>>> 
>> 
> 



Re: [petsc-users] Compiling PETSc with strumpack in ORNL Frontier

2024-04-05 Thread Barry Smith

   Please send the entire configure.log 

> On Apr 5, 2024, at 3:42 PM, Vanella, Marcos (Fed) via petsc-users 
>  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hi all, we are trying to compile PETSc in Frontier using the structured 
> matrix hierarchical solver strumpack, which uses GPU and might be a good 
> candidate for our Poisson discretization.
> The list of libs I used for PETSc in this case is:
> 
> $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" 
> FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3 --offload-arch=gfx90a" --with-debugging=0 
> --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hip-arch=gfx908 
> --with-hipc=hipcc   --LIBS="-L${MPICH_DIR}/lib -lmpi 
> ${CRAY_XPMEM_POST_LINK_OPTS} -lxpmem ${PE_MPICH_GTL_DIR_amd_gfx90a} 
> ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels 
> --download-suitesparse --download-hypre --download-superlu_dist 
> --download-strumpack --download-metis --download-slate --download-magma 
> --download-parmetis --download-ptscotch --download-zfp 
> --download-butterflypack --with-openmp-dir=/opt/cray/pe/gcc/12.2.0/snos 
> --download-scalapack --download-cmake --force
> 
> I'm getting an error at configure time:
> 
> ...
>   Trying to download 
> https://urldefense.us/v3/__https://github.com/liuyangzhuan/ButterflyPACK__;!!G_uCfscf7eWS!Zixr16YdQu3fiyHhdpuVPSpY2C6CE_eyJBpOizV54Ljkkw_4u9KcWP5QRT1Ukap5cNKYJ7t3If6OkGXrUyG8E-A$
>   
> 
>  for BUTTERFLYPACK
> =
> =
>  Configuring BUTTERFLYPACK with CMake; this may take several 
> minutes
> =
> =
> Compiling and installing BUTTERFLYPACK; this may take several 
> minutes
> =
> =
> Trying to download 
> https://urldefense.us/v3/__https://github.com/pghysels/STRUMPACK__;!!G_uCfscf7eWS!Zixr16YdQu3fiyHhdpuVPSpY2C6CE_eyJBpOizV54Ljkkw_4u9KcWP5QRT1Ukap5cNKYJ7t3If6OkGXrA0zDldI$
>   
> 
>  for STRUMPACK
> =
> =
>Configuring STRUMPACK with CMake; this may take several minutes
> =
> =
>   Compiling and installing STRUMPACK; this may take several 
> minutes
> =
> 
> *
>UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for 
> details):
> -
>Error running make on  STRUMPACK
> *
> 
>  Looking in the configure.log file I see errors like this related to strumpack 
> compilation:
> 
> /opt/cray/pe/craype/2.7.19/bin/CC -D__HIP_PLATFORM_AMD__=1 
> -D__HIP_PLATFORM_HCC__=1 -Dstrumpack_EXPORTS 
> -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc-direct/externalpackages/git.strumpack/src
>  
> -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc-direct/externalpackages/git.strumpack/petsc-build
>  -isystem 
> /autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc-direct/include
>  -isystem /opt/rocm-5.4.0/include -isystem /opt/rocm-5.4.0/hip/include 
> -isystem /opt/rocm-5.4.0/llvm/lib/clang/15.0.0/.. -Wno-lto-type-mismatch 
> -Wno-psabi -O3 -fPIC -fopenmp -Wno-lto-type-mismatch -Wno-psabi -O3 -fPIC 
> -fopenmp -fPIC -Wall -Wno-overloaded-virtual -fopenmp -x hip 
> --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx908 
> --offload-arch=gfx90a --offload-arch=gfx1030 -MD -MT 
> 

Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-05 Thread Barry Smith

   I see what you are talking about in the blas checks. However those checks 
"don't really matter" in that configure still succeeds. 

   Do you have a problem later with libnvJitLink.so? When you build PETSc (send 
make.log), when you run the tests (make check), or when you try to run your code?

   Barry


> On Apr 5, 2024, at 2:14 PM, Frank Bramkamp  wrote:
> 
> Hi Barry,
> 
> Here I send you the configure.log file for the libnvJitLink problem.
> 
> At the top of the configure.log file it seems to find libnvJitLink.so.12
> 
> But in the test for BlasLapack, it mentions 
> stdout:  /tmp/petsc-h7tpd5_s/config.packages.BlasLapack/conftest: error while 
> loading shared libraries: libnvJitLink.so.12: cannot open shared object file: 
> No such file or directory
> 
> 
> It seems that for the BlasLapack test you include the "stubs" directory. In 
> “stubs”, we only have libnvJitLink.so
> but not libnvJitLink.so.12. The libnvJitLink.so.12 we have in a different 
> directory (lib64), where it was also found before.
> But maybe in the BlasLapack test, you only search for the libnvJitLink.so.12 
> in the stubs directory.
> 
> Here I am not sure, if there should be a  libnvJitLink.so.12 in stubs as well 
> or not. 
> That means, it is not so clear if you should check in different directories, 
> or if we should add another link from libnvJitLink.so in stubs to 
> libnvJitLink.so.12.  But if other people also do not have
> libnvJitLink.so.12 in the stubs directory by default, that would be still a 
> problem. But I also do not know if the stubs directory is the problem.
> 
> 
> Thanks, Frank
> 
> 



Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-05 Thread Barry Smith





   Thanks for the configure.log. Send the configure.log for the failed nvJitlink problem.


> On Apr 5, 2024, at 12:58 PM, Frank Bramkamp  wrote:
> 
> Hi Barry,
> 
> Here comes the latest configure.log file
> 
> My cuda nvJitlink problem unfortunately still exists. 
> I will try it on a different cluster to see if this a specific problem of the actual nvhpc installation.
> 
> 
> Have a nice weekend, Frank
> 
> 
> 
> 




Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-05 Thread Barry Smith





  Frank,

   Could you send the final, successful configure.log. I want to see if PETSc ever mucks with it later in the configure process.

  Barry


> On Apr 5, 2024, at 10:44 AM, Frank Bramkamp  wrote:
> 
> 
> Dear Barry,
> 
> That looks very good now. The -lnvc is gone now.
> 
> I also tested my small fortran program. There I can see that libnvc is automatically added as well, but this time it comes after the 
> libaccdevice.so library for OpenACC. And then my openacc commands also work again.
> 
> 
> I also mentioned some issues with some cuda nvJitlink  library. I just found out that some path in our cuda compiler module was not set correctly.
> I will try to compile it with cuda again as well.
> 
> We are just starting to get PETSc on GPUs with the cuda backend, and I am starting with OpenACC for our fortran code to get first experience of how everything works with GPU
> porting.
> 
> 
> Good that you could fix the issue. 
> 
> Thanks for the great help. Have a nice weekend, Frank Bramkamp
> 
> 
> 
> 
> 
> 




Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-05 Thread Barry Smith





   There was a bug in my attempted fix so it actually did not skip the option.

   Try git pull and then run configure again.


> On Apr 5, 2024, at 6:30 AM, Frank Bramkamp  wrote:
> 
> Dear Barry,
> 
> I tried your fix for -lnvc.  Unfortunately it did not work so far.
> Here I send you the configure.log file again.
> 
> One can see that you try to skip something, but later it still always includes -lnvc for the linker.
> In the file petscvariables it also appears as before.
> 
> As I see it, it lists the linker options including -lnvc also before you try to skip it.
> Maybe it is already in the linker options before the skipping.
> 
> 
> Greetings, Frank 
> 
> 
> 




Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-04 Thread Barry Smith

   Frank,

Please try the PETSc git branch barry/2024-04-04/rm-lnvc-link-line/release 

  This will hopefully resolve the -lnvc issue. Please let us know and we can 
add the fix to our current release.

 Barry

> On Apr 4, 2024, at 9:37 AM, Frank Bramkamp  wrote:
> 
> This Message Is From an External Sender 
> This message came from outside your organization.
> Dear PETSC Team,
> 
> I found the following problem:
> I compile petsc 3.20.5 with Nvidia compiler 23.7.
> 
> 
> I use a pretty standard configuration, including
> 
> --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort COPTFLAGS="-O2 -g" 
> CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g"  --with-debugging=0 --with-log=1 
> --download-fblaslapack --with-cuda=0
> 
> I exclude cuda, since I was not sure if the problem was cuda related. 
> 
> 
> The problem is now: if I have a simple fortran program where I link the petsc 
> library, but I actually do not use petsc in that program
> (Just for testing). I want to use OpenACC directives in my program, e.g. 
> !$acc parallel loop .
> The problem is now: as soon as I link with the petsc library, the openacc 
> commands do not work anymore.
> It seems that openacc is not initialised and hence it cannot find a GPU.
> 
> The problem seems that you link with -lnvc.
> In “petscvariables” => PETSC_WITH_EXTERNAL_LIB you include “-lnvc”.
> If I take this out, then openacc works. With “-lnvc” something gets messed up.
> 
> The problem is also discussed here:
> https://urldefense.us/v3/__https://forums.developer.nvidia.com/t/failed-cuda-device-detection-when-explicitly-linking-libnvc/203225/1__;!!G_uCfscf7eWS!Z2uhPVP0GUrttP3rh6nLk6BQsoI2EIfKfoLVXcwQFksSvtvvRILt4Yq0y-FFYmi3ugybPdn-te0Pw5mfExHSw7Y$
>   
> 
> 
> My understanding is that libnvc is more a runtime library that does not need 
> to be included by the linker.
> Not sure if there is a specific reason to include libnvc (I am not so 
> familiar what this library does).
> 
> If I take out -lnvc from “petscvariables”, then my program with openacc works 
> as expected. I did not try any more realistic program that includes petsc.
> 
> 
> 
> 2)
> When compiling petsc with cuda support, I also found that in the petsc 
> library the library libnvJitLink.so.12
> Is not found. On my system this library is in $CUDA_ROOT/lib64
> I am not sure where this library is on your system ?! 
> 
> 
> Thanks a lot, Frank Bramkamp



Re: [petsc-users] Problem with NVIDIA compiler and OpenACC

2024-04-04 Thread Barry Smith

   Please send configure.log 

   We do not explicitly include libnvc but as Satish noted it may get listed 
when configure is generating link lines.

   With configure.log we'll know where it is being included (and we may be able 
to provide a fix that removes it  explicitly since it is apparently not needed 
according to the NVIDIA folks).

   Barry


> On Apr 4, 2024, at 10:33 AM, Frank Bramkamp  wrote:
> 
> This Message Is From an External Sender 
> This message came from outside your organization.
> Thanks for the reply,
> 
> Do you know if you actively include the libnvc library ?!
> Or is this somehow automatically included ?! 
> 
> Greetings, Frank
> 
> 
> 
> 
>> On 4 Apr 2024, at 15:56, Satish Balay > > wrote:
>> 
>> 
>> On Thu, 4 Apr 2024, Frank Bramkamp wrote:
>> 
>>> Dear PETSC Team,
>>> 
>>> I found the following problem:
>>> I compile petsc 3.20.5 with Nvidia compiler 23.7.
>>> 
>>> 
>>> I use a pretty standard configuration, including
>>> 
>>> --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort COPTFLAGS="-O2 -g" 
>>> CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" --with-debugging=0 --with-log=1 
>>> --download-fblaslapack --with-cuda=0
>>> 
>>> I exclude cuda, since I was not sure if the problem was cuda related. 
>> 
>> Can you try using (to exclude cuda): --with-cudac=0
>> 
>>> 
>>> 
>>> The problem is now: if I have a simple fortran program where I link the 
>>> petsc library, but I actually do not use petsc in that program
>>> (Just for testing). I want to use OpenACC directives in my program, e.g. 
>>> !$acc parallel loop .
>>> The problem is now: as soon as I link with the petsc library, the openacc 
>>> commands do not work anymore.
>>> It seems that openacc is not initialised and hence it cannot find a GPU.
>>> 
>>> The problem seems that you link with -lnvc.
>>> In “petscvariables” => PETSC_WITH_EXTERNAL_LIB you include “-lnvc”.
>>> If I take this out, then openacc works. With “-lnvc” something gets messed 
>>> up.
>>> 
>>> The problem is also discussed here:
>>> https://urldefense.us/v3/__https://forums.developer.nvidia.com/t/failed-cuda-device-detection-when-explicitly-linking-libnvc/203225/1__;!!G_uCfscf7eWS!dlXNyKBzSbximQ13OXxwO506OF71yRM_H5KEnarqXE75D6Vg-ePZr2u6SJ5V3YpRETatvb9pMOUVmpyN0-19SFlbug$>>  >
>>> 
>>> My understanding is that libnvc is more a runtime library that does not 
>>> need to be included by the linker.
>>> Not sure if there is a specific reason to include libnvc (I am not so 
>>> familiar what this library does).
>>> 
>>> If I take out -lnvc from “petscvariables”, then my program with openacc 
>>> works as expected. I did not try any more realistic program that includes 
>>> petsc.
>>> 
>>> 
>>> 
>>> 2)
>>> When compiling petsc with cuda support, I also found that the library 
>>> libnvJitLink.so.12 is not found. On my system this library is in $CUDA_ROOT/lib64
>>> I am not sure where this library is on your system ?! 
>> 
>> Hm - good if you can send configure.log for this. configure attempts '$CC 
>> -v' to determine the link libraries to get c/c++/fortran compatibility 
>> libraries. But it can grab other libraries that the compilers are using 
>> internally here.
>> 
>> To avoid this - you can explicitly list these libraries to configure. For 
>> ex: for gcc/g++/gfortran
>> 
>> ./configure CC=gcc CXX=g++ FC=gfortran LIBS="-lgfortran -lstdc++"
>> 
>> Satish
>> 
>>> 
>>> 
>>> Thanks a lot, Frank Bramkamp



Re: [petsc-users] PETSC Matrix debugging

2024-04-01 Thread Barry Smith

  Note, you can also run with the option -mat_view and it will print each 
matrix that gets assembled.

  Also in the debugger you can do call MatView(mat,0)
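   For reference, -mat_view accepts viewer arguments on the command line or in a 
.petscrc options file; a sketch using standard PETSc viewer options:

```text
# Print each assembled matrix as ASCII on stdout
-mat_view
# Draw the nonzero structure in an X window instead
-mat_view draw
# Print in a MATLAB-loadable format
-mat_view ::ascii_matlab
```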



> On Apr 1, 2024, at 2:18 PM, Matthew Knepley  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> On Mon, Apr 1, 2024 at 1:57 PM Shatanawi, Sawsan Muhammad via petsc-users 
> mailto:petsc-users@mcs.anl.gov>> wrote:
>> This Message Is From an External Sender
>> This message came from outside your organization.
>>  
>> Hello everyone,
>> 
>> I hope this email finds you well.
>> 
>> Is there a way to check what the matrix looks like after setting it?
>> I have tried debugging it with gdb (breakpoints) and print statements, but 
>> it only gave me one value instead of a matrix.
>>  
>> Thank you in advance for your time and assistance.
>> 
> I usually use MatView(), which can print to the screen. Is that what you want?
> 
>   Thanks,
> 
>  Matt 
>> Best regards,
>> 
>>  Sawsan
>> 
>> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
> 
> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bT1jdavVG8lGjxjhujAttmcaK9R1GFUxJtuFl1S2JK74c0mhrCwc2DkQippCFh8qwrk_9x5Dxjv-2H967RRgQPA$
>   
> 


Re: [petsc-users] ex19: Segmentation Violation when run with MUMPS on MacOS (arm64)

2024-03-30 Thread Barry Smith





  On my Mac at configure time I get 

clang: error: linker command failed with exit code 1 (use -v to see invocation)
Possible ERROR while running linker: exit code 1

when trying to link a test case against the lapack libraries.  I cannot generate an error message showing what error has occurred


> On Mar 30, 2024, at 2:59 PM, Satish Balay  wrote:
> 
> I'll just note - I can reproduce with:
> 
> petsc@npro petsc.x % ./configure --download-mpich --download-mumps --download-scalapack && make && make check 
> 
> And then - the following work fine for me:
> 
> petsc@npro petsc.x % ./configure --download-mpich --download-mumps --download-scalapack COPTFLAGS=-O0 FOPTFLAGS=-O0 LDFLAGS=-Wl,-ld_classic && make && make check
> 
>CLINKER arch-darwin-c-debug/lib/libpetsc.3.021.0.dylib
>   DSYMUTIL arch-darwin-c-debug/lib/libpetsc.3.021.0.dylib
> =
> Now to check if the libraries are working do:
> make PETSC_DIR=/Users/petsc/petsc.x PETSC_ARCH=arch-darwin-c-debug check
> =
> Running PETSc check examples to verify correct installation
> Using PETSC_DIR=/Users/petsc/petsc.x and PETSC_ARCH=arch-darwin-c-debug
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> C/C++ example src/snes/tutorials/ex19 run successfully with MUMPS
> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
> Completed PETSc check examples
> petsc@npro petsc.x %
> 
> petsc@npro petsc.z % ./configure --download-openmpi=https://urldefense.us/v3/__https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.3rc1.tar.bz2__;!!G_uCfscf7eWS!ZCdX8VX2WwwopcGzGQq5RpwHmqiuIiIb1r4zByAJVKyH9howYTVO-FFqVMGMbc8fUX6bC9Jsln49TEDpvn4TyIo$ --download-mumps --download-scalapack && make && make check
> 
>   DSYMUTIL arch-darwin-c-debug/lib/libpetsc.3.021.0.dylib
> =
> Now to check if the libraries are working do:
> make PETSC_DIR=/Users/petsc/petsc.z PETSC_ARCH=arch-darwin-c-debug check
> =
> Running PETSc check examples to verify correct installation
> Using PETSC_DIR=/Users/petsc/petsc.z and PETSC_ARCH=arch-darwin-c-debug
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> C/C++ example src/snes/tutorials/ex19 run successfully with MUMPS
> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
> Completed PETSc check examples
> petsc@npro petsc.z % 
> 
> [however parmmg and pastix are failing to build with openmpi]
> 
> And I thought this worked for me yesterday - but I see failures now.
> 
> ./configure --download-bison --download-chaco --download-ctetgen --download-eigen --download-fftw --download-hdf5 --download-hpddm --download-hwloc --download-hwloc-configure-arguments=--disable-opencl --download-hypre --download-libpng --download-metis --download-mmg --download-mpich --download-mpich-configure-arguments=--disable-opencl --download-mumps --download-netcdf --download-openblas --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" --download-parmmg --download-pastix --download-pnetcdf --download-pragmatic --download-ptscotch --download-scalapack --download-slepc --download-suitesparse --download-superlu_dist --download-tetgen --download-triangle --with-c2html=0 --with-debugging=1 --with-fortran-bindings=0 --with-shared-libraries=1 --with-x=0 --with-zlib --COPTFLAGS=-O0 --FOPTFLAGS=-O0 --LDFLAGS=-Wl,-ld_classic --with-clean
> 
> Satish
> 
> On Sat, 30 Mar 2024, Barry Smith wrote:
> 
>> 
>>  Can you check the value of IRHSCOMP in the debugger? Using gdb as the debugger may work better for this. 
>> 
>>  Barry
>> 
>> 
>>> On Mar 30, 2024, at 3:46 AM, zeyu xia  wrote:
>>> 
>>> This Message Is From an External Sender
>>> This message came from outside your organization.
>>> Hi! Thanks for your reply.
>>> 
>>> There still exist some problems, as seen in the files 'configure.log', 'make check3.txt', and 'debug.txt' in the attachment. Particularly, the file 'debug.txt' contains the output 

Re: [petsc-users] ex19: Segmentation Violation when run with MUMPS on MacOS (arm64)

2024-03-30 Thread Barry Smith

  Can you check the value of IRHSCOMP in the debugger? Using gdb as the 
debugger may work better for this. 

  Barry


> On Mar 30, 2024, at 3:46 AM, zeyu xia  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hi! Thanks for your reply.
> 
> There still exist some problems, as seen in the files 'configure.log', 'make 
> check3.txt', and 'debug.txt' in the attachment. Particularly, the file 
> 'debug.txt' contains the output of bt command of lldb.
> 
> Thanks for your attention.
> 
> Best regards,
> Zeyu Xia
> 
> 
> Satish Balay mailto:ba...@mcs.anl.gov>> 于2024年3月30日周六 
> 02:52写道:
>> I'm able to reproduce this error on a slightly older xcode [but don't know 
>> why this issue comes up]
>> 
>> > Apple clang version 15.0.0 (clang-1500.1.0.2.5)
>> 
>> Can you try using the additional configure options (along with 
>> LDFLAGS=-Wl,-ld_classic)  and see if it works?
>> 
>> COPTFLAGS=-O0 FOPTFLAGS=-O0
>> 
>> Satish
>> 
>> On Fri, 29 Mar 2024, zeyu xia wrote:
>> 
>> > Hi! I am grateful for your prompt response.
>> > 
>> > I follow your suggestions, and however, it still does not work. For the
>> > related information please find the files 'make check2.txt' and
>> > 'configure.log' in the attachment.
>> > 
>> > If possible, please do me a favor again. Thanks for your patience.
>> > 
>> > Best wishes,
>> > Zeyu Xia
>> > 
>> > 
>> > Satish Balay mailto:ba...@mcs.anl.gov>> 于2024年3月29日周五 
>> > 23:48写道:
>> > 
>> > > Could you:
>> > >
>> > > - reinstall brew after the xcode upgrade (not just update)
>> > > https://urldefense.us/v3/__https://petsc.org/main/install/install/*installing-on-macos__;Iw!!G_uCfscf7eWS!dGItos-D58VSJn4kOlKy2TEX-PWhflbWfNuM0zqhEXbGniD5S13iWCxgBmg9wYk4OrSwaP6jjzANIHN1ZHATKXE$
>> > > - not use --LDFLAGS=-Wl,-ld_classic
>> > >
>> > > And see if the problem persists?
>> > >
>> > > Satish
>> > >
>> > > On Fri, 29 Mar 2024, zeyu xia wrote:
>> > >
>> > > > Dear PETSc team:
>> > > >
>> > > > Recently I installed firedrake on MacOS (arm64) with the latest
>> > > > Xcode, and there seems some error with mumps. I ran two times of the
>> > > > command `make check`. The first time it just output wrong results, and
>> > > the
>> > > > second time it raised an error with Segmentation Violation. Please see
>> > > the
>> > > > files “make check.txt” and “configure.log” in the attachment.
>> > > >
>> > > > I will certainly be happy and grateful if you can take some 
>> > > > time
>> > > to
>> > > > deal with this problem. Thanks for your patience.
>> > > >
>> > > > Best wishes,
>> > > > Zeyu Xia
>> > > >
>> > >
>> > 
> 



[petsc-users] PETSc 3.21 release

2024-03-29 Thread Barry Smith
We are pleased to announce the release of PETSc version 3.21.0 at 
https://urldefense.us/v3/__https://petsc.org/release/download/__;!!G_uCfscf7eWS!eYJ4I4DyRSmpyGr66cTWnTsPF_K-dLY6xDA_znXt4dYB-KDtxykopISUhT4RK_hB0ljDEzelUUGxjML3npkg5jU$
  

A list of the major changes and updates can be found at 
https://urldefense.us/v3/__https://petsc.org/release/changes/321/__;!!G_uCfscf7eWS!eYJ4I4DyRSmpyGr66cTWnTsPF_K-dLY6xDA_znXt4dYB-KDtxykopISUhT4RK_hB0ljDEzelUUGxjML3X-Mnvnc$
 

The final update to petsc-3.20 i.e., petsc-3.20.6 is also available.

We recommend upgrading to PETSc 3.21.0 soon. As always, please report problems 
to petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov>  and ask questions 
at petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>

A reminder that releases are at the end of March and September each year.

This release includes contributions from

Albert Cowie
Alex Lindsay
Barry Smith
Blanca Mellado Pinto
David Andrs
David Kamensky
David Wells
Fabien Evard
Fande Kong
Hansol Suh
Hong Zhang
Ilya Fursov
James Wright
Jed Brown
Jeongu Kim
Jeremy L Thompson
Jeremy Theler
Jose Roman
Junchao Zhang
Koki Sagiyama
Lars Bilke
Lisandro Dalcin
Mark Adams
Martin Diehl
Massimiliano Leoni
Matthew Knepley
Matt McGurn
Mr. Hong Zhang
Nils Friess
Pablo Brubeck
Pierre Jolivet
René Chenard
Rezgar Shakeri
Richard Tran Mills
Satish Balay
Sebastian Grimberg
Stefano Zampini
Stephan Köhler
Toby Isaac
YANG Zongze
Zach Atkins

and bug reports/proposed improvements received from

Alain O' Miniussi
Benjamin Sturdevant
Damian Marek
David Bold
Fabian Wermelinger
Fabien Evrard
Gerard Henry
Gourav Kumbhojkar
Glenn Hammond
Hana Honnerová
Hao Luo
Henrik Büsing
Ilya Fursov
Jeremy Theler
Jesse Madsen
Jose Roman
Kevin G. Wang
Mark Adams
Mehmet Sahin
Miguel Angel Salazar de Troya
Niclas Götting
Pierre Jolivet
Simone Scacchi
Victor Eijkhout
Timothy J. Williams
Yi Hu

As always, thanks for your support,

Barry

Re: [petsc-users] Does ILU(15) still make sense or should just use LU?

2024-03-29 Thread Barry Smith

  Generically you see ~[DOF]^3 for dense matrix factorizations. For sparse 
factorizations, depending on the problem and the space dimension (1, 2, or 3), 
you do much better than ~[DOF]^3. Iterative solvers, when working well, offer 
the possibility of ~[DOF], which is why they are needed for very large problems.
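The wall times quoted below bear this out; a small Python check of the empirical exponent, using the numbers reported in the table in the reply below (a sketch, assuming time ~ C * DOF^p):

```python
import math

# Wall times (sec) reported below for -pc_type lu on an m x n 2-D grid
timings = {
    100 * 100: 0.016,
    250 * 250: 0.150,
    500 * 500: 1.007,
    750 * 750: 3.142,
    1000 * 1000: 7.442,
}

# Fit the exponent p in  time ~ C * DOF^p  between the smallest and largest runs
dofs = sorted(timings)
p = math.log(timings[dofs[-1]] / timings[dofs[0]]) / math.log(dofs[-1] / dofs[0])
print(f"p ~ {p:.3f}")  # prints p ~ 1.334 -- sparse 2-D LU, far below the dense DOF^3
```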

> On Mar 29, 2024, at 3:29 PM, Zou, Ling via petsc-users 
>  wrote:
> 
> Note that [Wall Time] ~ [DOF]^1.333, instead of being ~[DOF]^3.
> The [DOF]^3 rule was the scary part that made me want to avoid LU.
>  
> -Ling
>  
> From: petsc-users  <mailto:petsc-users-boun...@mcs.anl.gov>> on behalf of Zou, Ling via 
> petsc-users mailto:petsc-users@mcs.anl.gov>>
> Date: Friday, March 29, 2024 at 2:06 PM
> To: Barry Smith mailto:bsm...@petsc.dev>>, Zhang, Hong 
> mailto:hzh...@mcs.anl.gov>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
> Hong, are these results somewhat expected? I don’t see any speed up for using 
> 2 processors (maybe I don’t have 2 processors?).
>  
> Option                                                     Wall Time (sec)
> -pc_type lu                                                7.442
> mpiexec -n 2 -pc_type lu                                   9.112
> -pc_type lu -pc_factor_mat_solver_type mumps               8.748
> mpiexec -n 2 -pc_type lu -pc_factor_mat_solver_type mumps  9.013
>  
> For different size problems:
> -pc_type lu -m 1000 -n 1000   7.442
> -pc_type lu -m 750 -n 750     3.142
> -pc_type lu -m 500 -n 500     1.007
> -pc_type lu -m 250 -n 250     0.150
> -pc_type lu -m 100 -n 100     0.016
>  
> 
>  
>  
>  
> From: petsc-users  <mailto:petsc-users-boun...@mcs.anl.gov>> on behalf of Zou, Ling via 
> petsc-users mailto:petsc-users@mcs.anl.gov>>
> Date: Friday, March 29, 2024 at 12:50 PM
> To: Barry Smith mailto:bsm...@petsc.dev>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
> I cannot believe that I typed: make ex02
> Thanks, it works.
>  
> -Ling
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Friday, March 29, 2024 at 12:43 PM
> To: Zou, Ling mailto:l...@anl.gov>>
> Cc: Zhang, Hong mailto:hzh...@mcs.anl.gov>>, 
> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
>  
>cd src/ksp/ksp/tutorials
> make ex2 
>  
>  
> 
> On Mar 29, 2024, at 1:10 PM, Zou, Ling mailto:l...@anl.gov>> 
> wrote:
>  
> Hong, thanks! That’s great to know.
> I’d like to try the ex2 tutorial case locally to see how it performs. I have 
> already installed PETSc 3.20.5 on my Mac.
> Here shows the very last step of installation.
>  
> make PETSC_DIR=/Users/lingzou/Downloads/petsc-3.20.5 PETSC_ARCH=arch-opt check
> Running PETSc check examples to verify correct installation
> Using PETSC_DIR=/Users/lingzou/Downloads/petsc-3.20.5 and PETSC_ARCH=arch-opt
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed PETSc check examples
>  
> I found myself not knowing how to compile petsc/src/ksp/ksp/tutorials/ex2.c
> Do we have a page for how to do that?
>  
> Best,
>  
> -Ling
>  
> From: Zhang, Hong mailto:hzh...@mcs.anl.gov>>
> Date: Thursday, March 28, 2024 at 4:59 PM
> To: Zou, Ling mailto:l...@anl.gov>>, Barry Smith 
> mailto:bsm...@petsc.dev>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
> Ling,
> MUMPS 
> https://urldefense.us/v3/__https://mumps-solver.org/index.php__;!!G_uCfscf7eWS!ZmTlsQateB-nACNJAmqiJGcDWxWQOps2BeB7_vEs7q7-Rr8Do1invh3ez12a6aaIkSB7-jziREAovRpWXE73gS4$
>   
> <https://urldefense.us/v3/__https:/mumps-solver.org/index.php__;!!G_uCfscf7eWS!b4SLVXTUaKyR1_NPGNEtGinrk2pTkW9odwoiYKcTjslyDUQxuhihIs1ZLqrh2z33R3C5VLIwl86Bvw$>
>  , superlu and  superlu_dist 
> https://urldefense.us/v3/__http

Re: [petsc-users] Does ILU(15) still make sense or should just use LU?

2024-03-29 Thread Barry Smith

   cd src/ksp/ksp/tutorials
make ex2 


> On Mar 29, 2024, at 1:10 PM, Zou, Ling  wrote:
> 
> Hong, thanks! That’s great to know.
> I’d like to try the ex2 tutorial case locally to see how it performs. I have 
> already installed PETSc 3.20.5 on my Mac.
> Here shows the very last step of installation.
>  
> make PETSC_DIR=/Users/lingzou/Downloads/petsc-3.20.5 PETSC_ARCH=arch-opt check
> Running PETSc check examples to verify correct installation
> Using PETSC_DIR=/Users/lingzou/Downloads/petsc-3.20.5 and PETSC_ARCH=arch-opt
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed PETSc check examples
>  
> I found myself not knowing how to compile petsc/src/ksp/ksp/tutorials/ex2.c
> Do we have a page for how to do that?
>  
> Best,
>  
> -Ling
>  
> From: Zhang, Hong 
> Date: Thursday, March 28, 2024 at 4:59 PM
> To: Zou, Ling , Barry Smith 
> Cc: petsc-users@mcs.anl.gov 
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
> Ling,
> MUMPS 
> https://urldefense.us/v3/__https://mumps-solver.org/index.php__;!!G_uCfscf7eWS!fYi1HJwMm9FudQ0Jmc80axT8PKPd_uSQDnx_QONzQKRQWyTElDsv-kkch9H3dHrw1M1ezregBqWojsAXknJURaY$
>   , superlu and  superlu_dist 
> https://urldefense.us/v3/__https://portal.nersc.gov/project/sparse/superlu/__;!!G_uCfscf7eWS!fYi1HJwMm9FudQ0Jmc80axT8PKPd_uSQDnx_QONzQKRQWyTElDsv-kkch9H3dHrw1M1ezregBqWojsAXkSrGTOI$
>  
> are sparse LU solvers, i.e., they produce SPARSE LU matrix factors. For many 
> applications, they can solve 1 million DOF easily even in sequential mode. 
> For example 
> petsc/src/ksp/ksp/tutorials 
> ./ex2 -pc_type lu -pc_factor_mat_solver_type mumps -m 1000 -n 1000 
> -ksp_monitor_true_residual
>   0 KSP preconditioned resid norm 1.e+03 true resid norm 
> 6.330876716538e+01 ||r(i)||/||b|| 1.e+00
>   1 KSP preconditioned resid norm 9.976801056860e-09 true resid norm 
> 3.908107755078e-10 ||r(i)||/||b|| 6.173090916254e-12
> Norm of error 9.98582e-09 iterations 1
>  
> MUMPS LU solves this matrix of size 1.e6 in one iteration (takes few sec on 
> my laptop).
> As Barry suggests, try mumps first. If it fails or it is too slow, then 
> explore other solvers available in PETSc 
> https://urldefense.us/v3/__https://petsc.org/release/overview/linear_solve_table/__;!!G_uCfscf7eWS!fYi1HJwMm9FudQ0Jmc80axT8PKPd_uSQDnx_QONzQKRQWyTElDsv-kkch9H3dHrw1M1ezregBqWojsAXj0nzSvY$
>  
>  
> From my experiments, MUMPS is faster and more robust than 
> superlu/superlu_dist, yet it consumes slightly more memory.
> See 
> https://urldefense.us/v3/__https://petsc.org/release/manual/ksp/*using-external-linear-solvers__;Iw!!G_uCfscf7eWS!fYi1HJwMm9FudQ0Jmc80axT8PKPd_uSQDnx_QONzQKRQWyTElDsv-kkch9H3dHrw1M1ezregBqWojsAXrhJbO84$
>   on how to install mumps with petsc.
>  
> Hong
>  
>  
>  
>  
>  
>  
> From: Zou, Ling 
> Sent: Thursday, March 28, 2024 2:34 PM
> To: Barry Smith 
> Cc: Zhang, Hong ; petsc-users@mcs.anl.gov 
> 
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
>  
> Thank you. Those are great suggestions. Although I mentioned 1 million DOF, 
> but we rarely go there, so maybe stick with what is working now, and 
> meanwhile seeking helps from literatures.
>  
> -Ling
>  
> From: Barry Smith 
> Date: Thursday, March 28, 2024 at 2:26 PM
> To: Zou, Ling 
> Cc: Zhang, Hong , petsc-users@mcs.anl.gov 
> 
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
>  
>You may benefit from a literature search on your model AND preconditioners 
> to see what others have used. But I would try PETSc/MUMPS on the biggest size 
> you want and see how it goes (better it runs for a little longer and you 
> don't waste months trying to find a good preconditioner).
>  
>  
>  
>  
> 
> On Mar 28, 2024, at 2:20 PM, Zou, Ling  wrote:
>  
> Thank you, Barry.
> Yes, I have tried different preconditioners, but in a naïve way, i.e., 
> looping through possible options using `-pc_type ` command line.
> But no, not in a meaningful way because the lack of understanding of the 
> connection between physics (the problem we are dealing with) to ma

Re: [petsc-users] Does ILU(15) still make sense or should just use LU?

2024-03-28 Thread Barry Smith

   You may benefit from a literature search on your model AND preconditioners 
to see what others have used. But I would try PETSc/MUMPS on the biggest size 
you want and see how it goes (better it runs for a little longer and you don't 
waste months trying to find a good preconditioner).
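A minimal options fragment for that experiment (all standard PETSc options; the monitor and reason lines are optional but useful for judging the run):

```text
-pc_type lu
-pc_factor_mat_solver_type mumps
-ksp_monitor_true_residual
-ksp_converged_reason
```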




> On Mar 28, 2024, at 2:20 PM, Zou, Ling  wrote:
> 
> Thank you, Barry.
> Yes, I have tried different preconditioners, but in a naïve way, i.e., 
> looping through possible options using `-pc_type ` command line.
> But no, not in a meaningful way because the lack of understanding of the 
> connection between physics (the problem we are dealing with) to math (the 
> correct combination of those preconditioners).
>  
> -Ling
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Thursday, March 28, 2024 at 1:09 PM
> To: Zou, Ling mailto:l...@anl.gov>>
> Cc: Zhang, Hong mailto:hzh...@mcs.anl.gov>>, 
> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
>  
>1 million is possible for direct solvers using PETSc with the MUMPS direct 
> solver when you cannot get a preconditioner to work well for your problems.
>  
> ILU are not very robust preconditioners and I would not rely on them. 
> Have you investigated other preconditioners in PETSc? PCGAMG, PCASM, 
> PCFIELDSPLIT, or some combination of these preconditioners work for many 
> problems, though certainly not all.
>  
> 
> 
> On Mar 28, 2024, at 1:14 PM, Zou, Ling mailto:l...@anl.gov>> 
> wrote:
>  
> Thank you, Barry.
> Yeah, this is unfortunate given that the problem we are handling is quite 
> heterogeneous (in both mesh and physics).
> I expect that our problem sizes will be mostly smaller than 1 million DOF, 
> should LU still be a practical solution? Can it scale well if we choose to 
> run the problem in a parallel way?
>  
> PS1: -ksp_norm_type unpreconditioned did not work as the true residual did 
> not go down, even with 300 linear iterations.
> PS2: what do you think if it will be beneficial to have more detailed 
> discussions (e.g., a presentation?) on the problem we are solving to seek 
> more advice?
>  
> -Ling
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Thursday, March 28, 2024 at 11:14 AM
> To: Zou, Ling mailto:l...@anl.gov>>
> Cc: Zhang, Hong mailto:hzh...@mcs.anl.gov>>, 
> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
>  
>This is a bad situation, the solver is not really converging. This can 
> happen with ILU() sometimes, it so badly scales things that the 
> preconditioned residual decreases a lot but the true residual is not really 
> getting smaller. Since your matrices are small best to stick to LU.
>  
> You can use -ksp_norm_type unpreconditioned to force the convergence test 
> to use the true residual for a convergence test and the solver will discover 
> that it is not converging.
>  
>Barry
>  
>  
> 
> On Mar 28, 2024, at 11:43 AM, Zou, Ling via petsc-users 
> mailto:petsc-users@mcs.anl.gov>> wrote:
>  
> Hong, thanks! That makes perfect sense.
> A follow up question about ILU.
>  
> The following is the performance of ILU(5). Note that each KSP solve 
> reports converged, but as the output shows, the preconditioned residual 
> decreases while the true residual does not. Is there any way this 
> performance could be improved?
> Background: the preconditioning matrix is finite difference generated, and 
> should be exact.
>  
> -Ling
>  
> Time Step 21, time = -491.75, dt = 1
> NL Step =  0, fnorm =  6.98749E+01
> 0 KSP preconditioned resid norm 1.6841

Re: [petsc-users] Does ILU(15) still make sense or should just use LU?

2024-03-28 Thread Barry Smith

   1 million is possible for direct solvers using PETSc with the MUMPS direct 
solver when you cannot get a preconditioner to work well for your problems.

ILU preconditioners are not very robust and I would not rely on them. Have 
you investigated other preconditioners in PETSc? PCGAMG, PCASM, PCFIELDSPLIT, or 
some combination of these preconditioners work for many problems, though 
certainly not all.


> On Mar 28, 2024, at 1:14 PM, Zou, Ling  wrote:
> 
> Thank you, Barry.
> Yeah, this is unfortunate given that the problem we are handling is quite 
> heterogeneous (in both mesh and physics).
> I expect that our problem sizes will be mostly smaller than 1 million DOF, 
> should LU still be a practical solution? Can it scale well if we choose to 
> run the problem in a parallel way?
>  
> PS1: -ksp_norm_type unpreconditioned did not work as the true residual did 
> not go down, even with 300 linear iterations.
> PS2: what do you think if it will be beneficial to have more detailed 
> discussions (e.g., a presentation?) on the problem we are solving to seek 
> more advice?
>  
> -Ling
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Thursday, March 28, 2024 at 11:14 AM
> To: Zou, Ling mailto:l...@anl.gov>>
> Cc: Zhang, Hong mailto:hzh...@mcs.anl.gov>>, 
> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use 
> LU?
> 
>  
>This is a bad situation, the solver is not really converging. This can 
> happen with ILU() sometimes, it so badly scales things that the 
> preconditioned residual decreases a lot but the true residual is not really 
> getting smaller. Since your matrices are small best to stick to LU.
>  
> You can use -ksp_norm_type unpreconditioned to force the convergence test 
> to use the true residual for a convergence test and the solver will discover 
> that it is not converging.
>  
>Barry
>  
> 
> 
> On Mar 28, 2024, at 11:43 AM, Zou, Ling via petsc-users 
> mailto:petsc-users@mcs.anl.gov>> wrote:
>  
> Hong, thanks! That makes perfect sense.
> A follow up question about ILU.
>  
> The following is the performance of ILU(5). Note that each KSP solve 
> reports converged, but as the output shows, the preconditioned residual 
> decreases while the true residual does not. Is there any way this 
> performance could be improved?
> Background: the preconditioning matrix is finite difference generated, and 
> should be exact.
>  
> -Ling
>  
> Time Step 21, time = -491.75, dt = 1
> NL Step =  0, fnorm =  6.98749E+01
> 0 KSP preconditioned resid norm 1.684131526824e+04 true resid norm 
> 6.987489798042e+01 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 5.970568556551e+02 true resid norm 
> 6.459553545222e+01 ||r(i)||/||b|| 9.244455064582e-01
> 2 KSP preconditioned resid norm 3.349113985192e+02 true resid norm 
> 7.250836872274e+01 ||r(i)||/||b|| 1.037688366186e+00
> 3 KSP preconditioned resid norm 3.290585904777e+01 true resid norm 
> 1.186282435163e+02 ||r(i)||/||b|| 1.697723316169e+00
> 4 KSP preconditioned resid norm 8.530606201233e+00 true resid norm 
> 4.088729421459e+01 ||r(i)||/||b|| 5.851499665310e-01
>   Linear solve converged due to CONVERGED_RTOL iterations 4
> NL Step =  1, fnorm =  4.08788E+01
> 0 KSP preconditioned resid norm 1.851047973094e+03 true resid norm 
> 4.087882723223e+01 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 3.696809614513e+01 true resid norm 
> 2.720016413105e+01 ||r(i)||/||b|| 6.653851387793e-01
> 2 KSP preconditioned resid norm 5.751891392534e+00 true resid norm 
> 3.326338240872e+01 ||r(i)||/||b|| 8.137068663873e-01
> 3 KSP preconditioned resid norm 8.540729397958e-01 true resid norm 
> 8.672410748720e+00 ||r(i)||/||b|| 2.121492062249e-01
>   Linear solve converged due to CONVERGED_RTOL iterations 3
> NL Step =  2, fnorm =  8.67124E+00
> 0 KSP preconditioned resid norm 5.511333966852e+00 true resid norm 
> 8.671237519593e+00 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 1.174962622023e+00 true resid norm 
> 8.731034658309e+00 ||r(i)||/||b|| 1.006896032842e+00
> 2 KSP preconditioned resid norm 1.104

Re: [petsc-users] Does ILU(15) still make sense or should just use LU?

2024-03-28 Thread Barry Smith

   This is a bad situation, the solver is not really converging. This can 
happen with ILU() sometimes, it so badly scales things that the preconditioned 
residual decreases a lot but the true residual is not really getting smaller. 
Since your matrices are small best to stick to LU.

You can use -ksp_norm_type unpreconditioned to force the convergence test 
to use the true residual, and the solver will then discover that it is not 
converging.

   Barry
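As an options-file sketch of the suggestion above (standard KSP options):

```text
# Judge convergence (and monitor) using the true residual b - Ax
-ksp_norm_type unpreconditioned
-ksp_monitor_true_residual
```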


> On Mar 28, 2024, at 11:43 AM, Zou, Ling via petsc-users 
>  wrote:
> 
> Hong, thanks! That makes perfect sense.
> A follow up question about ILU.
>  
> The following is the performance of ILU(5). Note that each KSP solve 
> reports converged, but as the output shows, the preconditioned residual 
> decreases while the true residual does not. Is there any way this 
> performance could be improved?
> Background: the preconditioning matrix is finite difference generated, and 
> should be exact.
>  
> -Ling
>  
> Time Step 21, time = -491.75, dt = 1
> NL Step =  0, fnorm =  6.98749E+01
> 0 KSP preconditioned resid norm 1.684131526824e+04 true resid norm 
> 6.987489798042e+01 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 5.970568556551e+02 true resid norm 
> 6.459553545222e+01 ||r(i)||/||b|| 9.244455064582e-01
> 2 KSP preconditioned resid norm 3.349113985192e+02 true resid norm 
> 7.250836872274e+01 ||r(i)||/||b|| 1.037688366186e+00
> 3 KSP preconditioned resid norm 3.290585904777e+01 true resid norm 
> 1.186282435163e+02 ||r(i)||/||b|| 1.697723316169e+00
> 4 KSP preconditioned resid norm 8.530606201233e+00 true resid norm 
> 4.088729421459e+01 ||r(i)||/||b|| 5.851499665310e-01
>   Linear solve converged due to CONVERGED_RTOL iterations 4
> NL Step =  1, fnorm =  4.08788E+01
> 0 KSP preconditioned resid norm 1.851047973094e+03 true resid norm 
> 4.087882723223e+01 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 3.696809614513e+01 true resid norm 
> 2.720016413105e+01 ||r(i)||/||b|| 6.653851387793e-01
> 2 KSP preconditioned resid norm 5.751891392534e+00 true resid norm 
> 3.326338240872e+01 ||r(i)||/||b|| 8.137068663873e-01
> 3 KSP preconditioned resid norm 8.540729397958e-01 true resid norm 
> 8.672410748720e+00 ||r(i)||/||b|| 2.121492062249e-01
>   Linear solve converged due to CONVERGED_RTOL iterations 3
> NL Step =  2, fnorm =  8.67124E+00
> 0 KSP preconditioned resid norm 5.511333966852e+00 true resid norm 
> 8.671237519593e+00 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 1.174962622023e+00 true resid norm 
> 8.731034658309e+00 ||r(i)||/||b|| 1.006896032842e+00
> 2 KSP preconditioned resid norm 1.104604471016e+00 true resid norm 
> 1.018397505468e+01 ||r(i)||/||b|| 1.174454630227e+00
> 3 KSP preconditioned resid norm 4.257063674222e-01 true resid norm 
> 4.023093124996e+00 ||r(i)||/||b|| 4.639583584126e-01
> 4 KSP preconditioned resid norm 1.023038868263e-01 true resid norm 
> 2.365298462869e+00 ||r(i)||/||b|| 2.727751901068e-01
> 5 KSP preconditioned resid norm 4.073772638935e-02 true resid norm 
> 2.302623112025e+00 ||r(i)||/||b|| 2.655472309255e-01
> 6 KSP preconditioned resid norm 1.510323179379e-02 true resid norm 
> 2.300216593521e+00 ||r(i)||/||b|| 2.652697020839e-01
> 7 KSP preconditioned resid norm 1.337324816903e-02 true resid norm 
> 2.300057733345e+00 ||r(i)||/||b|| 2.652513817259e-01
> 8 KSP preconditioned resid norm 1.247384902656e-02 true resid norm 
> 2.300456226062e+00 ||r(i)||/||b|| 2.652973374174e-01
> 9 KSP preconditioned resid norm 1.247038855375e-02 true resid norm 
> 2.300532560993e+00 ||r(i)||/||b|| 2.653061406512e-01
>10 KSP preconditioned resid norm 1.244611343317e-02 true resid norm 
> 2.299441241514e+00 ||r(i)||/||b|| 2.651802855496e-01
>11 KSP preconditioned resid norm 1.227243209527e-02 true resid norm 
> 2.273668115236e+00 ||r(i)||/||b|| 2.622080308720e-01
>12 KSP preconditioned resid norm 1.172621459354e-02 true resid norm 
> 2.113927895437e+00 ||r(i)||/||b|| 2.437861828442e-01
>13 KSP preconditioned resid norm 2.880752338189e-03 true resid norm 
> 1.076190247720e-01 ||r(i)||/||b|| 1.241103412620e-02
>   Linear solve converged due to CONVERGED_RTOL iterations 13
> NL Step =  3, fnorm =  1.59729E-01
> 0 KSP preconditioned resid norm 1.676948440854e+03 true resid norm 
> 1.597288981238e-01 ||r(i)||/||b|| 1.e+00
> 1 KSP preconditioned resid norm 2.266131510513e+00 true resid norm 
> 1.819663943811e+00 ||r(i)||/||b|| 1.139220244542e+01
> 2 KSP preconditioned resid norm 2.239911493901e+00 true resid norm 
> 1.923976907755e+00 ||r(i)||/||b|| 1.204526501062e+01
> 3 KSP preconditioned resid norm 1.446859034276e-01 true resid norm 
> 8.692945031946e-01 ||r(i)||/||b|| 5.442312026225e+00
>   Linear solve converged due to CONVERGED_RTOL iterations 3
> NL Step =  4, fnorm =  1.59564E-01
> 0 KSP preconditioned resid 

Re: [petsc-users] Using PetscPartitioner on WINDOWS

2024-03-25 Thread Barry Smith
> > > > > -Original Message-
> > > > > From: "Satish Balay" mailto:ba...@mcs.anl.gov>>
> > > > > Sent: 2024-03-20 21:29:56 (Wednesday)
> > > > > To: 程奔  > > > > <mailto:ctcheng...@mail.scut.edu.cn>>
> > > > > Cc: petsc-users  > > > > <mailto:petsc-users@mcs.anl.gov>>
> > > > > Subject: Re: [petsc-users] Using PetscPartitioner on WINDOWS
> > > > > 
> > > > > >>>>
> > > > > Configure Options: --configModules=PETSc.Configure 
> > > > > --optionsModule=config.compilerOptions --with-debugging=0 
> > > > > --with-cc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl
> > > > >  
> > > > > --with-fc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_ifort
> > > > >  --with-cxx=/cygdrive/g/mypetsc/p
> > > > etsc-3.20.2/lib/petsc/bin/win32fe/win_cl 
> > > > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64
> > > >  mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib 
> > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include 
> > > > --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/20
> > > > 21.10.0/lib/release/impi.lib 
> > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec 
> > > > -localonly 
> > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz
> > > >  
> > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz
> > > >  --with-strict-petscerrorcode=0
> > > > > <<<
> > > > > 
> > > > > >>>>>>>>
> > > > > Warning: win32fe: File Not Found: /Ox
> > > > > Error: win32fe: Input File Not Found: 
> > > > > G:\mypetsc\PETSC-~2.2\ARCH-M~1\EXTERN~1\PETSC-~1\PETSC-~1\libmetis\/Ox
> > > > > >>>>>>>>>>
> > > > > 
> > > > > Looks like you are using an old snapshot of metis. Can you remove 
> > > > > your local tarballs - and let [cygwin] git download the appropriate 
> > > > > latest version?
> > > > > 
> > > > > Or download and use: 
> > > > > https://bitbucket.org/petsc/pkg-metis/get/8b194fdf09661ac41b36fa16db0474d38f46f1ac.tar.gz
> > > > > 
> > > > > Similarly for parmetis 
> > > > > https://bitbucket.org/petsc/pkg-parmetis/get/f5e3aab04fd5fe6e09fa02f885c1c29d349f9f8b.tar.gz
> > > > > 
> > > > > Satish
> > > > > 
> > > > > On Wed, 20 Mar 2024, 程奔 wrote:
> > > > > 
> > > > > > Hi 
> > > > > > I try petsc-3.20.2 and petsc-3.20.5 with configure 
> > > > > > 
> > > > > > ./configure  --with-debugging=0  --with-cc=cl --with-fc=ifort 
> > > > > > --with-cxx=cl  
> > > > > > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64
> > > > > >  mkl-intel-lp6

Re: [petsc-users] Fortran interfaces: Google Summer of Code project?

2024-03-21 Thread Barry Smith


> On Mar 21, 2024, at 6:35 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
> 
>> In my limited understanding of the Fortran iso_c_binding, if we do not 
>> provide an equivalent Fortran stub (the user calls) that uses the 
>> iso_c_binding to call PETSc C code, then when the user calls PETSc C code 
>> directly via the iso_c_binding they have to pass iso_c_binding type 
>> arguments to the call. This I consider unacceptable. So my conclusion was 
>> there is the same number of stubs, just in a different language, so there is 
>> no reason to consider changing since we cannot "delete lots of stubs", but I 
>> could be wrong.
> 
> I don't want users to deal with iso_c_binding manually.
> 
> We already have the generated ftn-auto-interfaces/*.h90. The INTERFACE 
> keyword could be replaced with CONTAINS (making these definitions instead of 
> just interfaces), and then the bodies could use iso_c_binding to call the C 
> functions. That would reduce repetition and be the standards-compliant way to 
> do this.

   Sure, the interface and the stub go in the same file instead of two files. 
This is slightly nicer but not significantly simpler, and alone, it is not 
reason enough to write an entire new stub generator.


> What we do now with detecting the Fortran mangling scheme and calling 
> conventions "works" but doesn't conform to any standard and there's nothing 
> stopping Fortran implementations from creating yet another variant that we 
> have to deal with manually.

   From practical experience, calling C/Fortran using non-standards has only 
gotten easier over the last thirty-five years (fewer variants on how char* is 
handled); it has not gotten more complicated, so I submit the likelihood of 
"nothing stopping Fortran implementations from creating yet another variant 
that we have to deal with manually" is (though possible) rather unlikely. As 
far as I am concerned, much of the iso_c_binding stuff just solved a problem that 
never really existed (except in some people's minds), since calling C/Fortran 
has always been easy, modulo knowing a tiny bit of information.


> I don't know if this change would enable inlining without LTO, though I think 
> the indirection through our C sourcef.c is rarely a performance factor for 
> Fortran users.



Re: [petsc-users] Fortran interfaces: Google Summer of Code project?

2024-03-21 Thread Barry Smith





> On Mar 21, 2024, at 5:19 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
> 
>> We've always had some tension between adding new features to bfort vs developing an entirely new tool (for example in Python (maybe calling a little LLVM to help parse the C function), for maybe there is already a tool out there) to replace bfort.
> 
> Note that depending on LLVM (presumably libclang) is a nontrivial dependency if the users don't already have it installed on their systems. I'm all for making it easier to extend the stub generator, but an equally-hacky pybfort wouldn't make much difference. If some better tools have emerged or we have a clear idea for a better design, let's discuss that.
> 
>> Both approaches have their advantages and disadvantages instead we've relied on the quick and dirty of providing the interfaces as needed). We have not needed the Fortran standard C interface stuff and I would prefer not to use it unless it offers some huge advantage).
> 
> Mainly that lots of C stubs could be deleted in favor of iso_c_binding.

In my limited understanding of the Fortran iso_c_binding, if we do not provide an equivalent Fortran stub (the user calls) that uses the iso_c_binding to call PETSc C code, then when the user calls PETSc C code directly via the iso_c_binding they have to pass iso_c_binding type arguments to the call. This I consider unacceptable. So my conclusion was there is the same number of stubs, just in a different language, so there is no reason to consider changing since we cannot "delete lots of stubs", but I could be wrong.








Re: [petsc-users] Fortran interfaces: Google Summer of Code project?

2024-03-21 Thread Barry Smith




   Martin,

Thanks for the suggestions and offer.

The tool we use for automatically generating the Fortran stubs and interfaces is bfort. 

 Its limitations include that it cannot handle string arguments automatically and cannot generate more than one interface for a function. This is why we need to provide these manually (the use of a,b,... is to prevent long lines and the need for continuations in the definitions of the interfaces).

 Adding support for strings is very straightforward, just a little more smarts in bfort. 

 Adding support for multiple interface generation is a bit trickier because the code must (based on the C calling sequence) automatically determine all the combinations of array vs single value the interfaces should generate and then generate a Fortran stub for each (all mapping back to the same master stub for that function). I've talked to Bill Gropp about having him add such support, but he simply does not have time for such work so most recent work on the bfort that PETSc uses has been by us.
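The combinatorial part described above can be sketched quickly (a hypothetical generator in Python, not bfort itself; argument names are made up): given which C arguments may be either an array or a single value, enumerate every combination for which a Fortran interface would be emitted, all mapping back to the same C stub.

```python
from itertools import product

def interface_variants(args):
    """args: list of (name, may_be_array) pairs; returns one dict per variant."""
    choices = [((name, "array"), (name, "scalar")) if flexible
               else ((name, "scalar"),)
               for name, flexible in args]
    return [dict(v) for v in product(*choices)]

# e.g. a MatSetValues-like signature: two flexible arguments, one fixed one
variants = interface_variants([("idx", True), ("vals", True), ("mode", False)])
print(len(variants))  # 4 combinations, each needing its own Fortran interface
```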

 We've always had some tension between adding new features to bfort vs developing an entirely new tool to replace bfort (for example in Python, maybe calling a little LLVM to help parse the C functions, or maybe there is already a tool out there). Both approaches have their advantages and disadvantages; instead we've relied on the quick-and-dirty approach of providing the interfaces as needed. We have not needed the Fortran standard C interface stuff, and I would prefer not to use it unless it offers some huge advantage.

Thoughts?

   Barry



 



> On Mar 21, 2024, at 12:21 PM, Martin Diehl  wrote:
> 
> Dear PETSc team,
> 
> I've worked on Fortran interfaces (see
> https://gitlab.com/petsc/petsc/-/issues/1540) but could not get far in
> the time I could afford.
> 
> In discussion with Javier (in CC) the idea came up to propose to offer
> the work on Fortran interfaces for PETSc as a Google Summer of Code
> project.
> 
> fortran-lang has been accepted as organization and the current projects
> are on:
> https://github.com/fortran-lang/webpage/wiki/GSoC-2024-Project-ideas
> 
> The main work would be the automatization of interfaces that are
> currently manually created via Python. This includes an improved user
> experience, because correct variable names (not a, b, c) can be used.
> It should be also possible to automatically create descriptions of the
> enumerators.
> 
> As outlook tasks, I would propose:
> - check whether a unified automatization script can also replace the
> current tool for creation of interfaces.
> - investigate improved handling of strings (there are ways in newer
> standards).
> 
> I can offer to do the supervision, but would certainly need guidance
> and the ok from the PETSc core team.
> 
> best regards,
> Martin
> 
> -- 
> KU Leuven
> Department of Computer Science
> Department of Materials Engineering
> Celestijnenlaan 200a
> 3001 Leuven, Belgium




Re: [petsc-users] MatSetValues() can't work right

2024-03-18 Thread Barry Smith
options --with-cc=gcc --with-cxx=g++ 
> --with-fc=gfortran --download-fblaslapack --download-hypre 
> --with-debugging=yes --download-mpich --with-clanguage=cxx
> [1]PETSC ERROR: #1 ISLocalToGlobalMappingApply() at 
> /home/lei/Software/PETSc/petsc-3.20.4/src/vec/is/utils/isltog.c:789
> [1]PETSC ERROR: #2 MatSetValuesLocal() at 
> /home/lei/Software/PETSc/petsc-3.20.4/src/mat/interface/matrix.c:2408
> [1]PETSC ERROR: #3 MatSetValuesStencil() at 
> /home/lei/Software/PETSc/petsc-3.20.4/src/mat/interface/matrix.c:1762
> 
> Is it not possible to set values across processors using MatSetValuesStencil? 
> If I want to set values of the matrix across processors, what should I do? 
> I am really confused, and I would greatly appreciate your help.
> 
> On Mon, Mar 18, 2024 at 9:28 PM Barry Smith  <mailto:bsm...@petsc.dev>> wrote:
>> 
>>The output is correct (only confusing). For PETSc DMDA by default viewing 
>> a parallel matrix converts it to the "natural" ordering instead of the PETSc 
>> parallel ordering.
>> 
>>See the Notes in 
>> https://urldefense.us/v3/__https://petsc.org/release/manualpages/DM/DMCreateMatrix/__;!!G_uCfscf7eWS!fTO1ShsqXrxcXKmKrn7uXjX68PlSaKv4RBgRvwP9BUQpeowdAqyQyxq3cSp_3H231u74LG5cJRd24lnABMYgziE$
>>  
>> 
>>   Barry
>> 
>> 
>>> On Mar 18, 2024, at 8:06 AM, Waltz Jan >> <mailto:jl2862237...@gmail.com>> wrote:
>>> 
>>> This Message Is From an External Sender
>>> This message came from outside your organization.
>>> PETSc version: 3.20.4
>>> Program:
>>> /* Header names and '&' characters restored (stripped by the mail archive). */
>>> #include <petscdmda.h>
>>> #include <petscmat.h>
>>> #include <petscviewer.h>
>>> 
>>> int main()
>>> {
>>> PetscInitialize(NULL, NULL, NULL, NULL);
>>> DM da;
>>> DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, 
>>> DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DMDA_STENCIL_STAR,
>>>  10, 1, 10, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 3, 1, 
>>> NULL, NULL, NULL, &da);
>>> DMSetFromOptions(da);
>>> DMSetUp(da);
>>> Mat Jac;
>>> DMCreateMatrix(da, &Jac);
>>> int row = 100, col = 100;
>>> double val = 1.;
>>> MatSetValues(Jac, 1, &row, 1, &col, &val, INSERT_VALUES);
>>> MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY);
>>> MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY);
>>> 
>>> PetscViewer viewer;
>>> PetscViewerASCIIOpen(PETSC_COMM_WORLD, "./jacobianmatrix.m", &viewer);
>>> PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);
>>> MatView(Jac, viewer);
>>> PetscViewerDestroy(&viewer);
>>> 
>>> PetscFinalize();
>>> }
>>> 
>>> When I ran the program with np = 6, I got the result as the below
>>> 
>>> It's obviously wrong.
>>> When I ran the program with np = 1 or 8, I got the right result as
>>> 
>> 



Re: [petsc-users] MatSetValues() can't work right

2024-03-18 Thread Barry Smith

   The output is correct (only confusing). For PETSc DMDA by default viewing a 
parallel matrix converts it to the "natural" ordering instead of the PETSc 
parallel ordering.
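The remapping can be sketched with a toy numbering (a hypothetical 2x2 grid split over two ranks, plain Python, not PETSc): the same grid cell gets a different global index in the two orderings, which is why the viewed output looks permuted rather than wrong.

```python
# "Natural" ordering: row-major over the whole grid.
# PETSc ordering: indices contiguous per owning rank.
cells = [(0, 0), (0, 1), (1, 0), (1, 1)]           # (row, col)
natural = {c: 2 * c[0] + c[1] for c in cells}       # row-major numbering
rank_of = {c: c[1] for c in cells}                  # rank 0 owns col 0, rank 1 owns col 1
petsc_order = sorted(cells, key=lambda c: (rank_of[c], c[0]))
petsc = {c: i for i, c in enumerate(petsc_order)}

print(natural)  # {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3}
print(petsc)    # {(0, 0): 0, (1, 0): 1, (0, 1): 2, (1, 1): 3}
```

Cell (1, 0) is global index 2 in natural ordering but 1 in PETSc ordering; the viewer converts between the two before printing.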

   See the Notes in 
https://petsc.org/release/manualpages/DM/DMCreateMatrix/

  Barry


> On Mar 18, 2024, at 8:06 AM, Waltz Jan  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> PETSc version: 3.20.4
> Program:
> /* Header names and '&' characters restored (stripped by the mail archive). */
> #include <petscdmda.h>
> #include <petscmat.h>
> #include <petscviewer.h>
> 
> int main()
> {
> PetscInitialize(NULL, NULL, NULL, NULL);
> DM da;
> DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, 
> DM_BOUNDARY_GHOSTED, DMDA_STENCIL_STAR,
>  10, 1, 10, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 3, 1, 
> NULL, NULL, NULL, &da);
> DMSetFromOptions(da);
> DMSetUp(da);
> Mat Jac;
> DMCreateMatrix(da, &Jac);
> int row = 100, col = 100;
> double val = 1.;
> MatSetValues(Jac, 1, &row, 1, &col, &val, INSERT_VALUES);
> MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY);
> 
> PetscViewer viewer;
> PetscViewerASCIIOpen(PETSC_COMM_WORLD, "./jacobianmatrix.m", &viewer);
> PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);
> MatView(Jac, viewer);
> PetscViewerDestroy(&viewer);
> 
> PetscFinalize();
> }
> 
> When I ran the program with np = 6, I got the result as the below
> 
> It's obviously wrong.
> When I ran the program with np = 1 or 8, I got the right result as
> 



Re: [petsc-users] Using PetscPartitioner on WINDOWS

2024-03-18 Thread Barry Smith

Please switch to the latest PETSc version; it supports METIS and ParMETIS on 
Windows.

  Barry


> On Mar 17, 2024, at 11:57 PM, 程奔 <202321009...@mail.scut.edu.cn> wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello,
> 
> Recently I tried to install PETSc with Cygwin, since I'd like to use PETSc with 
> Visual Studio on the Windows 10 platform. For the sake of clarity, I first list 
> the software/packages used below:
> 1. PETSc: version 3.16.5
> 2. VS: version 2022 
> 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit
> 4. Cygwin
> 
> 
> On Windows,
> I then try to calculate a simple cantilever beam that uses a tetrahedral 
> mesh, so it is an unstructured grid.
> I use DMPlexCreateFromFile() to create the DMPlex.
> And then I want to distribute the mesh using the PETSCPARTITIONERPARMETIS 
> type (in my opinion this PetscPartitioner type may be the best for DMPlex;
> 
> see fig 1 for my work comparing different PetscPartitioner types for a 
> cantilever beam on a Linux system.)
> 
> But unfortunately, when I try to use ParMETIS on Windows and configure PETSc 
> as follows
> 
> 
>  ./configure  --with-debugging=0  --with-cc='win32fe cl' --with-fc='win32fe 
> ifort' --with-cxx='win32fe cl'  
> 
> --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz
>   --with-shared-libraries=0 
> 
> --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include
>  --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib 
> --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec 
> --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz
>  
> --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz 
> 
> 
> 
> 
> it shows that 
> ***
> External package metis does not support --download-metis with Microsoft 
> compilers
> ***
> configure.log and make.log is attached
> 
> 
> 
> If I use the PetscPartitioner Simple type, the computation time is much 
> higher than with the PETSCPARTITIONERPARMETIS type.
> So on Windows I want to use a PetscPartitioner like ParMETIS; is there 
> any other PetscPartitioner type that can do the same work as ParMETIS, 
> 
> or I just try to download ParMETIS separately on Windows (like this website: 
> https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows ) 
> 
> and then use Visual Studio to link its library, but I don't know whether 
> PETSc could use it successfully this way.
> 
> 
> 
> So I write this email to report my problem and ask for your help.
> 
> Looking forward to your reply!
> 
> 
> Sincerely,
> Ben.
> 



Re: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS

2024-03-17 Thread Barry Smith

  I would just avoid the --download-openblas  option. The BLAS/LAPACK provided 
by Apple should perform fine, perhaps even better than OpenBLAS on your system.


> On Mar 17, 2024, at 9:58 AM, Zongze Yang  wrote:
> 
> This Message Is From an External Sender 
> This message came from outside your organization.
> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not 
> resolve the issue. The same error persisted.
> 
> Best wishes,
> Zongze
> 
>> On 17 Mar 2024, at 20:58, Pierre Jolivet > > wrote:
>> 
>> 
>> 
>>> On 17 Mar 2024, at 1:04 PM, Zongze Yang >> > wrote:
>>> 
>>> Thank you for providing the instructions. I try the first option.
>>> 
>>> Now, the error of the configuration is related to OpenBLAS.
>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. 
>>> Should this be reported to OpenBLAS? Or need to fix the configure in petsc?
>> 
>> I see our linux-opt-arm runner is using the additional flag 
>> '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to 
>> add that as well?
>> I don’t think there is much to fix on our end, OpenBLAS has been very broken 
>> lately on arm (current version is 0.3.26 but we can’t update because there 
>> is a huge performance regression which makes the pipeline timeout).
>> 
>> Thanks,
>> Pierre
>> 
>>> 
>>> The configure.log is attached. The errors are show below:
>>> ```
>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer 
>>> conversion passing 'blasint' (aka 'int') to parameter of type 'const 
>>> blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, 
>>> beta, C, info);
>>> 
>>> ^~~~
>>> 
>>> &
>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here
>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const 
>>> blasint *, const blasint *, const float *, const float *, const blasint *, 
>>> const float *, const blasint *, const float *, float *, const blasint *);
>>> 
>>> 
>>>^
>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer 
>>> conversion passing 'blasint' (aka 'int') to parameter of type 'const 
>>> blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, 
>>> beta, C, info);
>>> 
>>> ^~~~
>>> 
>>> &
>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here
>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const 
>>> blasint *, const blasint *, const double *, const double *, const blasint 
>>> *, const double *, const blasint *, const double *, double *, const blasint 
>>> *);
>>> 
>>> 
>>> ^
>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer 
>>> conversion passing 'blasint' (aka 'int') to parameter of type 'const 
>>> blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, 
>>> beta, C, info);
>>> 
>>> ^~~~
>>> 
>>> &
>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here
>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const 
>>> blasint *, const blasint *, const float *, const float *, const blasint *, 
>>> const float *, const blasint *, const float *, float *, const blasint *);
>>> 
>>> 
>>>^
>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer 
>>> conversion passing 'blasint' (aka 'int') to parameter of type 'const 
>>> blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, 
>>> 

Re: [petsc-users] MATSETVALUES: Fortran problem

2024-03-15 Thread Barry Smith


> On Mar 15, 2024, at 9:53 AM, Frank Bramkamp  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Dear PETSc Team,
> 
> I am using the latest petsc version 3.20.5.
> 
> 
> I would like to create a matrix using
> MatCreateSeqAIJ
> 
> To insert values, I use MatSetValues.
> It seems that the Fortran interfaces/stubs are missing for MatSetValues, as 
> the linker does not find any subroutine with that name.
> MatSetValueLocal seems to be fine.

   Please send the exact error message (cut and paste); there are definitely 
Fortran stubs for this function, but it could be that your exact parameter input 
does not have a stub yet.

   Barry

> 
> 
> Typically I am using a blocked matrix format (BAIJ), which works fine in 
> fortran.
> Soon we want to try PETSC on GPUs, using the format MATAIJCUSPARSE, since 
> there seems not to be a blocked format available in PETSC for GPUs so far.
> Therefore I first want to try the pointwise format MatCreateSeqAIJ format on 
> a CPU, before using the GPU format.
> 
> I think that CUDA also supports a block format now ?! Maybe that would be 
> also useful to have one day.
> 
> 
> Greetings, Frank Bramkamp
> 
> 
> 
> 
> 
> 
> 



Re: [petsc-users] Fieldsplit, multigrid and DM interaction

2024-03-13 Thread Barry Smith


   Sorry no one responded to this email sooner.

> On Mar 12, 2024, at 4:18 AM, Marco Seiz  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello,
> 
> 
> I'd like to solve a Stokes-like equation with PETSc, i.e.
> 
> 
> div( mu * symgrad(u) ) = -grad(p) - grad(mu*q)
> 
> div(u) = q
> 
> 
> with the spatially variable coefficients (mu, q) coming from another 
> application, which will advect and evolve fields via the velocity field 
> u from the Stokes solution, and throw back new (mu, q) to PETSc in a 
> loop, everything using finite difference. In preparation for this and 
> getting used to PETSc I wrote a simple inhomogeneous coefficient Poisson 
> solver, i.e.
> 
>   div (mu*grad(u) = -grad(mu*q), u unknown,
> 
> based on src/ksp/ksp/tutorials/ex32.c which converges really nicely even 
> for mu contrasts of 10^10 using -ksp_type fgmres -pc_type mg. Since my 
> coefficients later on can't be calculated from coordinates, I put them 
> on a separate DM and attached it to the main DM via PetscObjectCompose 
> and used a DMCoarsenHookAdd to coarsen the DM the coefficients live on, 
> inspired by src/ts/tutorials/ex29.c .
> 
> Adding another uncoupled DoF was simple enough and it converged 
> according to -ksp_converged_reason, but the solution started looking 
> very weird; roughly constant for each DoF, when it should be some 
> function going from roughly -value to +value due to symmetry. This 
> doesn't happen when I use a direct solver ( -ksp_type preonly -pc_type 
> lu -pc_factor_mat_solver_type umfpack ) and reading the archives, I 
> ought to be using -pc_type fieldsplit due to the block nature of the 
> matrix. I did that and the solution looked sensible again.
Hmm, this sounds like the operator has a constant null space that is 
accumulating in the iterative method.

The standard way to handle this is to use MatSetNullSpace() to provide the 
null space information so the iterative solver can remove it at each iteration.
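A minimal sketch (plain Python, not PETSc) of the projection that MatSetNullSpace() enables the Krylov solver to apply: removing the component along the constant vector from a residual at each iteration, so the null-space component cannot accumulate.

```python
def remove_constant_nullspace(v):
    # subtract the mean, i.e. the component along the (normalized) constant vector
    mean = sum(v) / len(v)
    return [x - mean for x in v]

r = [3.0, 5.0, 7.0]
r_proj = remove_constant_nullspace(r)
print(r_proj)       # [-2.0, 0.0, 2.0]
print(sum(r_proj))  # 0.0 -> orthogonal to the constant vector
```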

> 
> Now here comes the actual problem: Once I try adding multigrid 
> preconditioning to the split fields I get errors probably relating to 
> fieldsplit not "inheriting" (for lack of a better term) the associated 
> interpolations/added DMs and hooks on the fine DM. That is, when I use 
> the DMDA_Q0 interpolation, fieldsplit dies because it switches to 
> DMDA_Q1 and the size ratio is wrong ( Ratio between levels: (mx - 1)/(Mx 
> - 1) must be integer: mx 64 Mx 32 ). When I use DMDA_Q1, once the KSP 
> tries to setup the matrix on the coarsened problem the DM no longer has 
> the coefficient DMs which I previously had associated with it, i.e. 
> PetscCall(PetscObjectQuery((PetscObject)da, "coefficientdm", 
> (PetscObject *)_coeff)); puts a NULL pointer in dm_coeff and PETSc 
> dies when trying to get a named vector from that, but it works nicely 
> without fieldsplit.
> 
> Is there some way to get fieldsplit to automagically "inherit" those 
> added parts or do I need to manually modify the DMs the fieldsplit is 
> using? I've been using KSPSetComputeOperators since it allows for 
> re-discretization without having to manage the levels myself, whereas 
> some more involved examples like src/dm/impls/stag/tutorials/ex4.c build 
> the matrices in advance when re-discretizing and set them with 
> KSPSetOperators, which would avoid the problem as well but also means 
> managing the levels.

We don't have hooks to get your inner information automatically passed in from 
the outer DM, but I think you can use

PCFieldSplitGetSubKSP() after KSPSetUp()

to get your two sub-KSPs; you can then set the "sub" DMs on these, etc., to get 
them to "largely" behave as in your previous "uncoupled" code, and 
hopefully also use KSPSetOperators().

  Barry

> 
> 
> Any advice concerning solving my target Stokes-like equation is welcome 
> as well. I am coming from a explicit timestepping background so reading 
> up on saddle point problems and their efficient solution is all quite 
> new to me.
> 
> 
> Best regards,
> 
> Marco
> 
> 
> 
> 
> 



Re: [petsc-users] petsc4py error code 86 from ViewerHDF5().create

2024-03-13 Thread Barry Smith





> On Mar 12, 2024, at 11:54 PM, adigitoleo (Leon)  wrote:
> 
>>   You need to ./configure PETSc for HDF5 using
>> 
>>> --with-fortran-bindings=0 --with-mpi-dir=/usr --download-hdf5
>> 
> 
> Thanks, this has worked. I assumed PETSc would just pick up the HDF5
> library I already had on my system but perhaps that requires
> --with-hdf5-dir=/usr or something similar? Would this HDF5 library need
> to be configured for MPI as well?

  This could work if that HDF5 is configured for MPI. We prefer the --download-hdf5 option since it ensures the appropriate version of HDF5 is built with the same compilers and compile options as PETSc. A previously installed version of HDF5 is often incompatible in some way.
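For completeness, pointing ./configure at an already-installed HDF5 would look something like the line below; this is a hypothetical invocation, and it only works if that HDF5 was itself built against the same MPI:

```sh
# Hypothetical configure line: --with-hdf5-dir must point at an HDF5
# installation built with the same MPI and compilers as PETSc.
./configure --with-fortran-bindings=0 --with-mpi-dir=/usr --with-hdf5-dir=/usr
```

As said above, --download-hdf5 avoids the compiler/version mismatch problem entirely, so it remains the recommended route.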


> 
> The underworld3 test suite is mostly passing but I do get a handful of
> failures coming from
> 
>petsc4py.PETSc.SNES.getConvergedReason()
> 
> giving -3 instead of the expected 0. But that's more a question for
> underworld devs.
> 
> Leon




Re: [petsc-users] petsc4py error code 86 from ViewerHDF5().create

2024-03-12 Thread Barry Smith

   You need to ./configure PETSc for HDF5 using

> --with-fortran-bindings=0 --with-mpi-dir=/usr --download-hdf5

  It may need additional options; if it does, rerun ./configure with 
the additional options it lists.


> On Mar 12, 2024, at 8:19 PM, adigitoleo (Leon)  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello,
> 
> I'm new to the list and have a limited knowledge of PETSc so far, but
> I'm trying to use a software (underworld3) that relies on petsc4py.
> I have built PETSc with the following configure options:
> 
> --with-fortran-bindings=0 --with-mpi-dir=/usr
> 
> and `make test` gives me 160 failures which all seem to be timeouts or
> arising from my having insufficient "slots" (cores?). I subsequently
> built underworld3 with something like
> 
> cd $PETSC_DIR
> PETSC_DIR=... PETSC_ARCH=... NUMPY_INCLUDE=... pip install 
> src/binding/petsc4py
> cd /path/to/underworld3/tree
> pip install h5py
> pip install mpi4py
> PETSC_DIR=... PETSC_ARCH=... NUMPY_INCLUDE=... pip install -e .
> 
> following their instructions. Building their python wheel/package was
> successful, however when I run their tests (using pytest) I get errors
> during test collection, which all come from petsc4py and have a stack
> trace that ends in the snippet attached below. Am I going about this
> wrong? How do I ensure that the HDF5 types are defined?
> 
> src/underworld3/discretisation.py:86: in _from_gmsh
> viewer = PETSc.ViewerHDF5().create(filename + ".h5", "w", 
> comm=PETSc.COMM_SELF)
> petsc4py/PETSc/Viewer.pyx:916: in petsc4py.PETSc.ViewerHDF5.create
> ???
> E   petsc4py.PETSc.Error: error code 86
> --- Captured stderr 
> 
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: 
> https://petsc.org/release/install/install/#external-packages
> [0]PETSC ERROR: Unknown PetscViewer type given: hdf5
> [0]PETSC ERROR: See 
> https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.20.4, unknown
> [0]PETSC ERROR: /home/leon/vcs/underworld3/.venv-underworld3/bin/pytest 
> on a arch-linux-c-debug named roci by leon Wed Mar 13 00:01:33 2024
> [0]PETSC ERROR: Configure options --with-fortran-bindings=0 
> --with-mpi-dir=/usr
> [0]PETSC ERROR: #1 PetscViewerSetType() at 
> /home/leon/vcs/petsc/src/sys/classes/viewer/interface/viewreg.c:535
> 
> Cheers,
> Leon



Re: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch

2024-03-10 Thread Barry Smith


> On Mar 10, 2024, at 10:16 AM, Yi Hu  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Dear Mark,
> 
> Thanks for your reply. I see this mismatch. In fact my global DoF is 324. It 
> seems like I always get the local size = global Dof / np^2, np is my 
> processor number. By the way, I used DMDASNESsetFunctionLocal() to set my 
> form function. Is it eligible to mix DMDASNESsetFunctionLocal() and a native 
> SNESSetJacobian()?
> 

Yes

> Best,
> 
> Yi
> 
> On 3/10/24 13:55, Mark Adams wrote:
>> It looks like your input vector is the global vector, size 162, and the 
>> local matrix size is 81.
>> Mark
>> [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal 
>> input vector size 162
>> 
>> On Sun, Mar 10, 2024 at 7:21 AM Yi Hu mailto:y...@mpie.de>> 
>> wrote:
>>> This Message Is From an External Sender 
>>> This message came from outside your organization. 
>>>  
>>> Dear petsc team,
>>> 
>>> I implemented a matrix-free jacobian, and it can run sequentially. But 
>>> running parallel I got the pc error like this (running with mpirun -np 
>>> 2, only error from rank1 is presented here)
>>> 
>>> [1]PETSC ERROR: - Error Message 
>>> --
>>> [1]PETSC ERROR: Nonconforming object sizes
>>> [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal 
>>> input vector size 162
>>> [1]PETSC ERROR: See 
>>> https://petsc.org/release/faq/ for trouble shooting.
>>> [1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022
>>> [1]PETSC ERROR: /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a 
>>> arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 12:01:46 2024
>>> [1]PETSC ERROR: Configure options --download-fftw --download-hdf5 
>>> --with-hdf5-fortran-bindings --download-fblaslapack --download-chaco 
>>> --download-hypre --download-metis --download-mumps --download-parmetis 
>>> --download-scalapack --download-suitesparse --download-superlu 
>>> --download-superlu_dist --download-triangle --download-zlib 
>>> --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 
>>> --with-debugging=0 --with-ssl=0 --with-x=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 
>>> FOPTFLAGS=-O3
>>> [1]PETSC ERROR: #1 PCApply() at 
>>> /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424
>>> [1]PETSC ERROR: #2 KSP_PCApply() at 
>>> /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376
>>> [1]PETSC ERROR: #3 KSPInitialResidual() at 
>>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64
>>> [1]PETSC ERROR: #4 KSPSolve_GMRES() at 
>>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242
>>> [1]PETSC ERROR: #5 KSPSolve_Private() at 
>>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902
>>> [1]PETSC ERROR: #6 KSPSolve() at 
>>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078
>>> [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at 
>>> /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222
>>> [1]PETSC ERROR: #8 SNESSolve() at 
>>> /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756
>>> [1]PETSC ERROR: #9 User provided function() at User file:0 
>>> 
>>> However, from snes matrix-free documentation 
>>> (https://petsc.org/release/manual/snes/#matrix-free-methods), it is said 
>>> matrix-free is used with pcnone. So I assume it would not apply 
>>> preconditioner, but it did use preconditioning probably the same as my 
>>> matrix-free shell matrix. Here is how i initialize my shell matrix and 
>>> the corresponding customized multiplication.
>>> 
>>>call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,&
>>> int(9*product(cells(1:2))*cells3,pPETSCINT),&
>>> int(9*product(cells(1:2))*cells3,pPETSCINT),&
>>>F_PETSc,Jac_PETSc,err_PETSc)
>>>call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc)
>>>call SNESSetDM(SNES_mech,DM_mech,err_PETSc)
>>>call 
>>> SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc)
>>>call SNESGetKSP(SNES_mech,ksp,err_PETSc)
>>>call PCSetType(pc,PCNONE,err_PETSc)
>>> 
>>> And my GK_op is like
>>> 
>>> subroutine GK_op(Jac,dF_global,output_local,err_PETSc)
>>> 
>>>DM   :: dm_local
>>>Vec  :: dF_global, dF_local, output_local
>>>Mat  :: Jac
>>>PetscErrorCode   :: err_PETSc
>>> 
>>>real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal
>>> 
>>>real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: &
>>>  dF
>>>real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: &
>>>  output
>>> 
>>>call 

Re: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR

2024-03-03 Thread Barry Smith

Clarify in documentation which routines copy the provided values 
https://gitlab.com/petsc/petsc/-/merge_requests/7336

> On Mar 2, 2024, at 6:40 AM, Fabian Wermelinger  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> On Fri, 01 Mar 2024 12:51:45 -0600, Junchao Zhang wrote:
> >> The preferred method then is to use MatSetValues() for the copies (or
> >> possibly MatSetValuesRow() or MatSetValuesBlocked())?
> > MatSetValuesRow()
> 
> Thanks
> 
> >> The term "Preallocation" is confusing to me.  For example,
> >> MatCreateMPIBAIJWithArrays clearly states in the doc that arrays are copied
> >> (https://petsc.org/release/manualpages/Mat/MatCreateMPIBAIJWithArrays/), I 
> >> would then assume PETSc maintains internal storage for it.  If something is
> >> preallocated, I would not make that assumption.
> >"Preallocation" in petsc means "tell petsc sizes of rows in a matrix",  so 
> >that
> >petsc can preallocate the memory before you do MatSetValues(). This is 
> >clearer
> >in 
> >https://petsc.org/release/manualpages/Mat/MatMPIAIJSetPreallocation/
> 
> Thanks for the clarification!
> 
> All best,
> 
> -- 
> fabs



Re: [petsc-users] 'Preconditioning' with lower-order method

2024-03-03 Thread Barry Smith

   Are you forming the Jacobian for the first and second order cases inside of 
Newton?

   You can run both with -log_view to see how much time is spent in the various 
events (compute function, compute Jacobian, linear solve, ...) for the two 
cases and compare them.



> On Mar 3, 2024, at 11:42 AM, Zou, Ling via petsc-users 
>  wrote:
> 
> Original email may have been sent to the incorrect place.
> See below.
>  
> -Ling
>  
> From: Zou, Ling mailto:l...@anl.gov>>
> Date: Sunday, March 3, 2024 at 10:34 AM
> To: petsc-users  >
> Subject: 'Preconditioning' with lower-order method
> 
> Hi all,
>  
> I am solving a PDE system over a spatial domain. Numerical methods are:
> Finite volume method (both 1st and 2nd order implemented)
> BDF1 and BDF2 for time integration.
> What I have noticed is that 1st order FVM converges much faster than 2nd 
> order FVM, regardless of the time integration scheme. Well, not surprising since 
> 2nd order FVM introduces additional non-linearity.
>  
> I’m thinking about two possible ways to speed up 2nd order FVM, and would 
> like to get some thoughts or community knowledge before jumping into code 
> implementation.
>  
> Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order 
> FVM residual function be F1(x) = 0.
> Option – 1, multi-step for each time step
> Step 1: solving F1(x) = 0 to obtain a temporary solution x1
> Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final 
> solution.
> [Not sure if this gains any savings at all]
>  
> Option -2, dynamically changing residual function F(x)
> In pseudo code, would be something like.
>  
> snesFormFunction(SNES snes, Vec u, Vec f, void *)
> {
>   if (snes.nl_it_no < 4) // 4 being arbitrary here
> f = F1(u);
>   else
> f = F2(u);
> }
>  
> I know this might be a bit crazy since it may crash after switching residual 
> function, still, any thoughts?
>  
> Best,
>  
> -Ling
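Beyond Barry's -log_view suggestion, the two options quoted above are easy to prototype outside PETSc with a toy scalar Newton iteration. In this sketch F1(x) = x^2 - 2 stands in for the cheaper first-order residual and F2(x) = x^4 - 4 for the second-order one (both share the root sqrt(2)); all names are illustrative, and none of this is PETSc API:

```python
def newton(F, dF, x, tol=1e-12, maxit=50):
    """Plain scalar Newton iteration; returns (root, iterations used)."""
    for it in range(maxit):
        r = F(x)
        if abs(r) < tol:
            return x, it
        x -= r / dF(x)
    return x, maxit

F1, dF1 = lambda x: x**2 - 2, lambda x: 2 * x      # stand-in "1st order" residual
F2, dF2 = lambda x: x**4 - 4, lambda x: 4 * x**3   # stand-in "2nd order" residual

def option1(x0):
    """Option-1: solve F1(x)=0 fully, feed the result to Newton on F2."""
    x1, _ = newton(F1, dF1, x0)
    return newton(F2, dF2, x1)

def option2(x0, nswitch=4, tol=1e-12, maxit=50):
    """Option-2: one Newton loop whose residual switches from F1 to F2."""
    x = x0
    for it in range(maxit):
        if abs(F2(x)) < tol:          # convergence judged on the target residual
            return x, it
        F, dF = (F1, dF1) if it < nswitch else (F2, dF2)
        x -= F(x) / dF(x)
    return x, maxit
```

Option-2 here measures convergence on the target residual F2 while taking early steps with F1, which is the analogue of switching the residual inside snesFormFunction() after a fixed number of nonlinear iterations.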



[petsc-users] Remember to register for the PETSc meeting, May 23-24 2024 in Cologne, Germany.

2024-02-27 Thread Barry Smith

Reminder: the next PETSc annual meeting will be held May 23-24 in Cologne, 
Germany.

   Please register at 
https://cds.uni-koeln.de/en/workshops/petsc-2024/home, submit your presentation abstract, and register for the hotel now.

   Thanks.

   Barry






Re: [petsc-users] Problem of using PCFactorSetDropTolerance

2024-02-27 Thread Barry Smith

   We don't consider drop tolerance preconditioners as reliable or robust so I 
don't see anyone implementing it.
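
What PETSc's ILU does support is level-of-fill control, ILU(k), plus pivot shifting; a sketch of the equivalent run-time options (names from the PCFactor option family, adjust to taste):

```sh
# Illustrative, not a tuned recipe: request ILU(2) with a shift that keeps
# the factorization positive definite, all from the command line.
-pc_type ilu -pc_factor_levels 2 -pc_factor_shift_type positive_definite
```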



> On Feb 27, 2024, at 10:11 AM, Константин Мурусидзе 
>  wrote:
> 
> Thank you! Should we expect it to appear in the near future?
> 
> 
> 27.02.2024, 18:07, "Barry Smith" :
> 
>I'm sorry for the confusion. PETSc does not have a drop tolerance ILU so 
> this function does nothing as you've noted.
> 
>Barry
> 
> 
> On Feb 27, 2024, at 7:27 AM, Константин Мурусидзе 
> mailto:konstantin.murusi...@math.msu.ru>> 
> wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello! I have a problem while solving a linear system and using 
> PCFactorSetDropTolerance. As I understand it, this function could significantly 
> reduce the number of iterations needed to solve the system, but there is no 
> decrease in iterations. I have tried a lot of options for this function, but 
> with no result.
> Could you tell me if I have made a mistake while setting up the 
> preconditioner?
>  
> PetscCall(KSPSetType(ksp, KSPCG));
> PetscCall(KSPSetInitialGuessNonzero(ksp,PETSC_TRUE));
> PetscCall(KSPSetNormType(ksp, KSP_NORM_DEFAULT));
> PetscCall(KSPCGSetType(ksp, KSP_CG_SYMMETRIC));
>  
> PetscCall(KSPGetPC(ksp, ));
> PetscCall(PCSetType(pc, PCILU));
> PetscCall(PCFactorSetShiftType(pc, MAT_SHIFT_POSITIVE_DEFINITE));
>   
> PetscCall(PCFactorSetLevels(pc, 2));
> PetscCall(PCFactorSetReuseOrdering(pc,PETSC_TRUE));
> PetscCall(PCFactorSetDropTolerance(pc, 1.e-5, 1.e-3, 1));
> PetscCall(PCFactorSetZeroPivot(pc, 1.e-3));
> PetscCall(KSPSetTolerances(ksp, 1.e-8, 1.e-8, PETSC_DEFAULT, 
> PETSC_DEFAULT));
>  
> PetscCall(KSPSetFromOptions(ksp));
>  
> PetscCall(KSPSolve(ksp, b, x));
>  PetscCall(KSPGetIterationNumber(ksp, ));
> 



Re: [petsc-users] Problem of using PCFactorSetDropTolerance

2024-02-27 Thread Barry Smith

   I'm sorry for the confusion. PETSc does not have a drop tolerance ILU so 
this function does nothing as you've noted.

   Barry


> On Feb 27, 2024, at 7:27 AM, Константин Мурусидзе 
>  wrote:
> 
> This Message Is From an External Sender
> This message came from outside your organization.
> Hello! I have a problem while solving a linear system and using 
> PCFactorSetDropTolerance. As I understand it, this function could significantly 
> reduce the number of iterations needed to solve the system, but there is no 
> decrease in iterations. I have tried a lot of options for this function, but 
> with no result.
> Could you tell me if I have made a mistake while setting up the 
> preconditioner?
>  
> PetscCall(KSPSetType(ksp, KSPCG));
> PetscCall(KSPSetInitialGuessNonzero(ksp,PETSC_TRUE));
> PetscCall(KSPSetNormType(ksp, KSP_NORM_DEFAULT));
> PetscCall(KSPCGSetType(ksp, KSP_CG_SYMMETRIC));
>  
> PetscCall(KSPGetPC(ksp, ));
> PetscCall(PCSetType(pc, PCILU));
> PetscCall(PCFactorSetShiftType(pc, MAT_SHIFT_POSITIVE_DEFINITE));
>   
> PetscCall(PCFactorSetLevels(pc, 2));
> PetscCall(PCFactorSetReuseOrdering(pc,PETSC_TRUE));
> PetscCall(PCFactorSetDropTolerance(pc, 1.e-5, 1.e-3, 1));
> PetscCall(PCFactorSetZeroPivot(pc, 1.e-3));
> PetscCall(KSPSetTolerances(ksp, 1.e-8, 1.e-8, PETSC_DEFAULT, 
> PETSC_DEFAULT));
>  
> PetscCall(KSPSetFromOptions(ksp));
>  
> PetscCall(KSPSolve(ksp, b, x));
>  PetscCall(KSPGetIterationNumber(ksp, ));



Re: [petsc-users] how to check state of form residual of snes, for solving ksp or line search?

2024-02-24 Thread Barry Smith










> On Feb 24, 2024, at 10:50 AM, Yi Hu  wrote:
> 
> Dear Barry,
> 
> Thanks for the hint.
> 
> This works for my purpose. I did not need to access my form function value, so I did not call SNESGetFunction() in my callback function.
> 
> Maybe another small question about Jacobian test.
> 
> I implemented a matrix-free jacobian in my snes solver, then I ran my code with  "-snes_type newtonls -ksp_type gmres -snes_linesearch_type bt -snes_test_jacobian -snes_test_jacobian_view". It gave a message like this
> 
> ||J - Jfd||_F/||J||_F = 0.321634, ||J - Jfd||_F = 6.81067e+12
> 
> which is much larger than the expected scale 1e-8, however the above ratio is constant and my |J-Jfd|_F is gradually changing at this scale around 6.81e+12. Is my own hand-coded matrix-free jacobian wrong?

  Most definitely something is wrong. Doesn't -snes_test_jacobian_view cause it to print the matrix difference so you can see what entries (if not all) are wrong? 

  BTW: "Newton" can converge even with incorrect Jacobians so just having Newton converge does not mean that the Jacobian is correct.
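
As background on what the test reports: -snes_test_jacobian builds a finite-difference Jacobian Jfd column by column and prints ||J - Jfd||_F / ||J||_F. For a correct hand-coded Jacobian that ratio sits near the differencing error; a single wrong entry pushes it to O(1), like the 0.32 above. A minimal stand-alone sketch of the same check (toy functions, not the poster's code):

```python
def F(x):
    """Toy residual R^2 -> R^2 (illustrative only)."""
    return [x[0]**2 + x[1], x[0] * x[1] - 1.0]

def J_exact(x):
    """Hand-coded Jacobian of F."""
    return [[2 * x[0], 1.0], [x[1], x[0]]]

def J_fd(F, x, h=1e-7):
    """Forward-difference Jacobian, column by column."""
    n, f0 = len(x), F(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x)
        xp[j] += h
        fp = F(xp)
        for i in range(n):
            J[i][j] = (fp[i] - f0[i]) / h
    return J

def rel_frob(A, B):
    """||A - B||_F / ||A||_F for small dense matrices."""
    diff = sum((a - b)**2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))
    norm = sum(a * a for row in A for a in row)
    return (diff / norm) ** 0.5

x0 = [1.3, 0.7]
ratio_good = rel_frob(J_exact(x0), J_fd(F, x0))   # tiny: Jacobian is right
J_bug = [[2 * x0[0], 1.0], [x0[1], -x0[0]]]       # one sign error
ratio_bad = rel_frob(J_bug, J_fd(F, x0))          # O(1): Jacobian is wrong
```

A constant O(1) ratio like the reported 0.32, independent of the evaluation point, is the signature of a systematically wrong (not merely inaccurate) Jacobian entry.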
> 
> In fact my code has a converged and reasonable result for various cases. I guess Jfd is an approximation, so I could still have a possibly correct Jacobian.
> 
> Best wishes,
> 
> Yi
> 
> On 2/15/24 04:35, Barry Smith wrote:
>>   Use SNESSetUpdate() to provide a callback function that gets called by SNES automatically immediately before each linear solve. Inside your callback use SNESGetFunction(snes,f,NULL,NULL); to access the last computed value of your function, from this you can update your global variable.
>> 
>> Barry
>> 
>> 
>>> On Feb 14, 2024, at 4:28 PM, Yi Hu  wrote:
>>> 
>>> Dear PETSc team,
>>> 
>>> I am using a newtonls snes solver. I know that form residual is invoked at several locations of the algorithm, first evaluated for the rhs of ksp solver, then several times for obtaining the optimal step of line search.
>>> 
>>> In my problem I have a global variable that is updated every time when form residual is called, which is not desired in the context of netwonls. Basically, it should be only evaluated for the purpose of rhs of ksp (and not in line search). So is it possible to know in which context my form residual is used for? Then I can just skip my line of updating global variable depending on this condition. If not possible, could you suggest a workaround? Or do I need to design a customized line search routine?
>>> 
>>> Thanks for your help.
>>> 
>>> Best regards,
>>> 
>>> Yi
>>> 
>>> 
>>> 
>>> -
>>> Stay up to date and follow us on LinkedIn, Twitter and YouTube.
>>> 
>>> Max-Planck-Institut für Eisenforschung GmbH
>>> Max-Planck-Straße 1
>>> D-40237 Düsseldorf
>>> Handelsregister B 2533 Amtsgericht Düsseldorf
>>> Geschäftsführung
>>> Prof. Dr. Gerhard Dehm
>>> Prof. Dr. Jörg Neugebauer
>>> Prof. Dr. Dierk Raabe
>>> Dr. Kai de Weldige
>>> Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000
>>> 
>>> 
>>> Please consider that invitations and e-mails of our institute are only valid if they end with …@mpie.de. If you are not sure of the validity please contact r...@mpie.de
>>> 
>>> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails
>>> aus unserem Haus nur mit der Endung …@mpie.de gültig sind. In Zweifelsfällen wenden Sie sich bitte an r...@mpie.de
>>> -
>>> 
> 
> 
> -
> Stay up to date and follow us on LinkedIn, Twitter and YouTube.
> 
> Max-Planck-Institut für Eisenforschung GmbH
> Max-Planck-Straße 1
> D-40237 Düsseldorf
> Handelsregister B 2533 Amtsgericht Düsseldorf
> Geschäftsführung
> Prof. Dr. Gerhard Dehm
> Prof. Dr. Jörg Neugebauer
> Prof. Dr. Dierk Raabe
> Dr. Kai de Weldige
> Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000
> 
> 
> Please consider that invitations and e-mails of our institute are only valid if they end with …@mpie.de. If you are not sure of the validity please contact r...@mpie.de
> 
> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails
> aus unserem Haus nur mit der Endung …@mpie.de gültig sind. In Zweifelsfällen wenden Sie sich bitte an r...@mpie.de
> -
> 




Re: [petsc-users] how to check state of form residual of snes, for solving ksp or line search?

2024-02-14 Thread Barry Smith


  Use SNESSetUpdate() to provide a callback function that gets called by SNES 
automatically immediately before each linear solve. Inside your callback use 
SNESGetFunction(snes,f,NULL,NULL); to access the last computed value of your 
function, from this you can update your global variable.

Barry


> On Feb 14, 2024, at 4:28 PM, Yi Hu  wrote:
> 
> Dear PETSc team,
> 
> I am using a newtonls snes solver. I know that form residual is invoked at 
> several locations of the algorithm, first evaluated for the rhs of ksp 
> solver, then several times for obtaining the optimal step of line search.
> 
> In my problem I have a global variable that is updated every time when form 
> residual is called, which is not desired in the context of netwonls. 
> Basically, it should be only evaluated for the purpose of rhs of ksp (and not 
> in line search). So is it possible to know in which context my form residual 
> is used for? Then I can just skip my line of updating global variable 
> depending on this condition. If not possible, could you suggest a workaround? 
> Or do I need to design a customized line search routine?
> 
> Thanks for your help.
> 
> Best regards,
> 
> Yi
> 
> 
> 
> -
> Stay up to date and follow us on LinkedIn, Twitter and YouTube.
> 
> Max-Planck-Institut für Eisenforschung GmbH
> Max-Planck-Straße 1
> D-40237 Düsseldorf
> Handelsregister B 2533 Amtsgericht Düsseldorf
> Geschäftsführung
> Prof. Dr. Gerhard Dehm
> Prof. Dr. Jörg Neugebauer
> Prof. Dr. Dierk Raabe
> Dr. Kai de Weldige
> Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000
> 
> 
> Please consider that invitations and e-mails of our institute are only valid 
> if they end with …@mpie.de. If you are not sure of the validity please 
> contact r...@mpie.de
> 
> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails
> aus unserem Haus nur mit der Endung …@mpie.de gültig sind. In Zweifelsfällen 
> wenden Sie sich bitte an r...@mpie.de
> -
> 



Re: [petsc-users] Near nullspace lost in fieldsplit

2024-02-13 Thread Barry Smith

   The new code is in https://gitlab.com/petsc/petsc/-/merge_requests/7293 and 
retains the null space on the submatrices for both MatZeroRows() and 
MatZeroRowsAndColumns() regardless of changes to the nonzero structure of the 
matrix.

  Barry


> On Feb 13, 2024, at 7:12 AM, Jeremy Theler (External) 
>  wrote:
> 
> Hi Barry
> 
> >   7279 does change the code for MatZeroRowsColumns_MPIAIJ(). But perhaps 
> > that does not resolve the problem you are seeing? If that is the case we 
> > will need a reproducible example so we can determine exactly what else is 
> > happening in your code to cause the difficulties. 
> >
> >Here is the diff for  MatZeroRowsColumns_MPIAIJ()
> >
> >@@ -1026,7 +1023,7 @@ static PetscErrorCode MatZeroRowsColumns_MPIAIJ(Mat A, 
> >PetscInt N, const PetscIn
> >
> >   PetscCall(PetscFree(lrows));
> >
> >   /* only change matrix nonzero state if pattern was allowed to be changed 
> > */
> > -  if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
> > +  if (!((Mat_SeqAIJ *)(l->A->data))->nonew) {
> >  PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
> >  PetscCall(MPIU_Allreduce(, >nonzerostate, 1, MPIU_INT64, 
> > MPI_SUM, PetscObjectComm((PetscObject)A)));
> >}
> 
> Fair enough, I might have overlooked this single line. But this change does 
> not fix our issue. The if condition is still true. 
> I'll need some time to come up with a reproducible example because it 
> involves setting up the PC of the KSP of an SNES of a TS before doing 
> TSSolve() with the DM-allocated jacobian.
> 
> However, regarding bug #1, I do have a MWE. It is not CI-friendly. I tried to 
> modify ex47.c to illustrate the issue but could not make it work.
> Anyway, the attached tarball:
> reads a global matrix A from a file.
> reads two ISs from another file
> sets up a field split PC 
> attaches a near nullspace (read from another file) to one of the sub-KSPs
> sets "dirichlet BCs" with MatZeroRowsColumns() or MatZeroRows() depending if 
> -columns was given in the command line
> 
> The issue comes when calling MatZeroRowsColumns() or MatZeroRows().
> In the first case, the near nullspace is not lost but it is in the second:
> 
> $ make lost
> $ ./lost -columns -ksp_view | grep "near null"
> has attached near null space
> has attached near null space
> $ mpiexec -n 2 ./lost -columns -ksp_view | grep "near null"
> has attached near null space
> has attached near null space
> $ ./lost -ksp_view | grep "near null"
> $ mpiexec -n 2 ./lost -ksp_view | grep "near null"
> $ 
> 
> When using MatZeroRows(), the code passes through fieldsplit.c:692 which 
> loses the near nullspace.
> 
> 
> Note that the original issue we see in our code is that there is a difference 
> between the serial and parallel implementation of MatZeroRowsColumns().
> We lose the near nullspace only in parallel but not in serial because of a 
> combination of bugs #1 and #2.
> 
> 
> --
> jeremy
> 
> 



Re: [petsc-users] Near nullspace lost in fieldsplit

2024-02-13 Thread Barry Smith

   Thank you for the code.  

A) By default MatZeroRows() does change the nonzero structure of the matrix 

B) PCFIELDSPLIT loses the null spaces attached to the submatrices if the 
nonzero structure of the matrix changes.

For the example code if one sets MatSetOption(A,MAT_KEEP_NONZERO_PATTERN, 
PETSC_TRUE); then using  MatZeroRows()  no longer changes the nonzero structure 
and thus the code behaves similarly for both MatZeroRows() and 
MatZeroRowsColumns().

Should B lose the null spaces when the nonzero structure changes? The 
documentation for MatSetNullSpace() does not explicitly state that the attached 
null space will remain with the matrix for its entire life unless the user 
calls MatSetNullSpace() again, but it implies as much. Thus, I think 
B should not lose the attached null spaces. I will change the code to preserve 
the null spaces in PCFIELDSPLIT when the nonzero structure changes.

Going back to your two points

1. Code going through line 692 loses the near nullspace of the matrices 
attached to the sub-KSPs
 2. The call to MatZeroRowsColumns() changes the nonzero structure for MPIAIJ 
but not for SEQAIJ

My new branch will prevent the loss of the nullspace in 1.

My previous fix (that is now in main) fixes the bug where MatZeroRowsColumns() 
in parallel thought it changed the nonzero structure (while it actually did 
not).

Note: Independent of 1 and 2 most likely when you use MatZeroRows() in your 
code as you describe it you will want to use 
MatSetOption(A,MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE); since the code will 
likely be a bit more efficient and will have less memory churn.





> On Feb 13, 2024, at 7:12 AM, Jeremy Theler (External) 
>  wrote:
> 
> Hi Barry
> 
> >   7279 does change the code for MatZeroRowsColumns_MPIAIJ(). But perhaps 
> > that does not resolve the problem you are seeing? If that is the case we 
> > will need a reproducible example so we can determine exactly what else is 
> > happening in your code to cause the difficulties. 
> >
> >Here is the diff for  MatZeroRowsColumns_MPIAIJ()
> >
> >@@ -1026,7 +1023,7 @@ static PetscErrorCode MatZeroRowsColumns_MPIAIJ(Mat A, 
> >PetscInt N, const PetscIn
> >
> >   PetscCall(PetscFree(lrows));
> >
> >   /* only change matrix nonzero state if pattern was allowed to be changed 
> > */
> > -  if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
> > +  if (!((Mat_SeqAIJ *)(l->A->data))->nonew) {
> >  PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
> >  PetscCall(MPIU_Allreduce(, >nonzerostate, 1, MPIU_INT64, 
> > MPI_SUM, PetscObjectComm((PetscObject)A)));
> >}
> 
> Fair enough, I might have overlooked this single line. But this change does 
> not fix our issue. The if condition is still true. 
> I'll need some time to come up with a reproducible example because it 
> involves setting up the PC of the KSP of an SNES of a TS before doing 
> TSSolve() with the DM-allocated jacobian.
> 
> However, regarding bug #1, I do have a MWE. It is not CI-friendly. I tried to 
> modify ex47.c to illustrate the issue but could not make it work.
> Anyway, the attached tarball:
> reads a global matrix A from a file.
> reads two ISs from another file
> sets up a field split PC 
> attaches a near nullspace (read from another file) to one of the sub-KSPs
> sets "dirichlet BCs" with MatZeroRowsColumns() or MatZeroRows() depending if 
> -columns was given in the command line
> 
> The issue comes when calling MatZeroRowsColumns() or MatZeroRows().
> In the first case, the near nullspace is not lost but it is in the second:
> 
> $ make lost
> $ ./lost -columns -ksp_view | grep "near null"
> has attached near null space
> has attached near null space
> $ mpiexec -n 2 ./lost -columns -ksp_view | grep "near null"
> has attached near null space
> has attached near null space
> $ ./lost -ksp_view | grep "near null"
> $ mpiexec -n 2 ./lost -ksp_view | grep "near null"
> $ 
> 
> When using MatZeroRows(), the code passes through fieldsplit.c:692 which 
> loses the near nullspace.
> 
> 
> Note that the original issue we see in our code is that there is a difference 
> between the serial and parallel implementation of MatZeroRowsColumns().
> We lose the near nullspace only in parallel but not in serial because of a 
> combination of bugs #1 and #2.
> 
> 
> --
> jeremy
> 
> 



Re: [petsc-users] Near nullspace lost in fieldsplit

2024-02-12 Thread Barry Smith

   7279 does change the code for MatZeroRowsColumns_MPIAIJ(). But perhaps that 
does not resolve the problem you are seeing? If that is the case we will need a 
reproducible example so we can determine exactly what else is happening in your 
code to cause the difficulties. 

Here is the diff for  MatZeroRowsColumns_MPIAIJ()

@@ -1026,7 +1023,7 @@ static PetscErrorCode MatZeroRowsColumns_MPIAIJ(Mat A, 
PetscInt N, const PetscIn
   PetscCall(PetscFree(lrows));
 
   /* only change matrix nonzero state if pattern was allowed to be changed */
-  if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
+  if (!((Mat_SeqAIJ *)(l->A->data))->nonew) {
 PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
 PetscCall(MPIU_Allreduce(, >nonzerostate, 1, MPIU_INT64, MPI_SUM, 
PetscObjectComm((PetscObject)A)));
   }




> On Feb 12, 2024, at 7:02 AM, Jeremy Theler (External) 
>  wrote:
> 
> Hi Barry
> 
> >   The bug fix for 2 is available in 
> > https://gitlab.com/petsc/petsc/-/merge_requests/7279
> 
> Note that our code goes through   MatZeroRowsColumns_MPIAIJ() not through 
> MatZeroRows_MPIAIJ() so this fix does not change anything for the case I 
> mentioned.
> 
> --
> jeremy



Re: [petsc-users] Help with compiling PETSc on Summit with gcc 12.1.0

2024-02-09 Thread Barry Smith

 error while loading shared libraries: libmpi_ibm_usempif08.so: cannot open 
shared object file: No such file or directory

 So using mpif90 does not work because it links a shared library that 
cannot be found at run time.

 Perhaps that library is only visible on the batch nodes. You can try adding 
-with-batch=0 to the ./configure options.


  Barry


> On Feb 9, 2024, at 5:01 PM, Zhang, Chonglin  wrote:
> 
> Dear PETSc developers,
>  
> I am trying to compile PETSc on Summit with gcc 12.1.0 and spectrum-mpi 
> 10.4.0.6, but encountered the following configuration issues:
>  
> =
>   Configuring PETSc to compile on your system 
>
> =
> TESTING: checkFortranCompiler from 
> config.setCompilers(config/BuildSystem/config/setCompilers.py:1271)   
>   
>
> ***
> OSError while running ./configure 
> ---
> Cannot run executables created with FC. If this machine uses a batch system 
> to submit jobs you will need to configure using ./configure with the 
> additional option  --with-batch.
>  Otherwise there is problem with the compilers. Can you compile and run code 
> with your compiler 'mpif90'?
> ***
>  
> Also attached is the configure.log file. Could you help with this issue?
>  
> Thanks,
> Chonglin
>  
>  
>  
> 



Re: [petsc-users] Near nullspace lost in fieldsplit

2024-02-09 Thread Barry Smith

   The bug fix for 2 is available in 
https://gitlab.com/petsc/petsc/-/merge_requests/7279



> On Feb 9, 2024, at 10:50 AM, Barry Smith  wrote:
> 
> 
> 1. Code going through line 692 loses the near nullspace of the matrices 
> attached to the sub-KSPs
>  2. The call to MatZeroRowsColumns() changes the non-zero structure for 
> MPIAIJ but not for SEQAIJ
>   (unless MAT_KEEP_NONZERO_PATTERN is used)
> 
> MatZeroRowsColumns() manual page states:
> 
> Unlike `MatZeroRows()` this does not change the nonzero structure of the 
> matrix, it merely zeros those entries in the matrix.
> 
>  MatZeroRowsColumns_MPIAIJ() has the code
> 
>   /* only change matrix nonzero state if pattern was allowed to be changed */
>   if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
> PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
> PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, MPI_SUM, PetscObjectComm((PetscObject)A)));
>   }
> 
> The if() test is simply an optimization to avoid the reduction if 
> keepnonzeropattern is true. In your first run (when MAT_KEEP_NONZERO_PATTERN 
> is not used) the  keepnonzeropattern is not set so the A matrix nonzerostate 
> is updated using the nonzerostate of the two submatrices. But that A nonzero 
> state will only change if one of the two submatrices nonzerostate changes. In 
> the second run the A nonzerostate cannot be changed.
> 
> MatZeroRowsColumns_SeqAIJ() has the code
> 
>   if (diag != 0.0) {
> PetscCall(MatMissingDiagonal_SeqAIJ(A, &missing, &d));
> if (missing) {
>   for (i = 0; i < N; i++) {
> if (rows[i] >= A->cmap->N) continue;
> PetscCheck(!a->nonew || rows[i] < d, PETSC_COMM_SELF, 
> PETSC_ERR_ARG_WRONGSTATE, "Matrix is missing diagonal entry in row %" 
> PetscInt_FMT " (%" PetscInt_FMT ")", d, rows[i]);
> PetscCall(MatSetValues_SeqAIJ(A, 1, &rows[i], 1, &rows[i], &diag, INSERT_VALUES));
> 
> which adds a nonzero to the matrix to fill in a missing diagonal. This will 
> change the nonzerostate of the matrix, but in a different way than the one 
> the keepnonzeropattern flag is meant to track.
> 
> So, do you have missing diagonals in your matrix?
> 
> Bug 1 - documentation
> The documentation statement "Unlike `MatZeroRows()` this does not change the 
> nonzero structure of the matrix, it merely zeros those entries in the 
> matrix." appears to be incorrect for matrices missing diagonal entries since 
> new nonzeros are added (to fill in the diagonal).  The documentation should 
> really say 
> 
> "Unlike `MatZeroRows()` this routine cannot remove the zeroed entries from 
> the nonzero structure of the matrix; in other words, setting the option 
> `MAT_KEEP_NONZERO_PATTERN` to `PETSC_FALSE` has no effect on this routine."
> 
> Bug 2 - 
> The short circuit if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) is 
> wrong because that flag is meaningless for this operation. The check should 
> be changed (I think) to nonew instead of keepnonzeropattern.
> 
> I will prepare a bug fix later today. For now you can just use the 
> MAT_KEEP_NONZERO_PATTERN option for your code to work.
> 
>  Barry
> 
> 
> 
> 
> 
> 
> 
>> On Feb 9, 2024, at 8:00 AM, Jeremy Theler (External) 
>>  wrote:
>> 
>> > > Because of a combination of settings, our code passes through this line:
>> > >
>> > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/fieldsplit/fieldsplit.c?ref_type=heads#L692
>> > > 
>> > > i.e. the matrices associated with each of the sub-KSPs of a fieldsplit 
>> > > are destroyed and then re-created later.
>> > > The thing is that one of these destroyed matrices had a near nullspace 
>> > > attached, which is lost because the new matrix does not have it anymore.
>> > > 
>> > > Is this a bug or are we missing something?
>> > 
>> > I just want to get a clear picture. You create a PCFIELDSPLIT, set it up, 
>> > then pull out the matrices and attach a nullspace before the solve.
>> 
>> We need to solve an SNES. We use dmplex so we have the jacobian allocated 
>> before starting the solve.
>> At setup time we 
>> 
>>  1. define the PC of the KSP of the SNES to be fieldsplit
>>  2. define the fields with ISes
>>  3. call PCSetup() to create the sub-KSPs
>>  4. retrieve the matrix attached to the sub-KSP that needs the near 
>> nullspace and attach it to that matrix
>> 
>> > At a later time, you start another solve with this PC, and it has the 
>> >

Re: [petsc-users] Near nullspace lost in fieldsplit

2024-02-09 Thread Barry Smith

1. Code going through line 692 loses the near nullspace of the matrices 
attached to the sub-KSPs
 2. The call to MatZeroRowsColumns() changes the non-zero structure for MPIAIJ 
but not for SEQAIJ
  (unless MAT_KEEP_NONZERO_PATTERN is used)

MatZeroRowsColumns() manual page states:

Unlike `MatZeroRows()` this does not change the nonzero structure of the 
matrix, it merely zeros those entries in the matrix.

 MatZeroRowsColumns_MPIAIJ() has the code

  /* only change matrix nonzero state if pattern was allowed to be changed */
  if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, MPI_SUM, PetscObjectComm((PetscObject)A)));
  }

The if() test is simply an optimization to avoid the reduction if 
keepnonzeropattern is true. In your first run (when MAT_KEEP_NONZERO_PATTERN is 
not used) the  keepnonzeropattern is not set so the A matrix nonzerostate is 
updated using the nonzerostate of the two submatrices. But that A nonzero state 
will only change if one of the two submatrices nonzerostate changes. In the 
second run the A nonzerostate cannot be changed.

MatZeroRowsColumns_SeqAIJ() has the code

  if (diag != 0.0) {
PetscCall(MatMissingDiagonal_SeqAIJ(A, &missing, &d));
if (missing) {
  for (i = 0; i < N; i++) {
if (rows[i] >= A->cmap->N) continue;
PetscCheck(!a->nonew || rows[i] < d, PETSC_COMM_SELF, 
PETSC_ERR_ARG_WRONGSTATE, "Matrix is missing diagonal entry in row %" 
PetscInt_FMT " (%" PetscInt_FMT ")", d, rows[i]);
PetscCall(MatSetValues_SeqAIJ(A, 1, &rows[i], 1, &rows[i], &diag, INSERT_VALUES));

which adds a nonzero to the matrix to fill in a missing diagonal. This will 
change the nonzerostate of the matrix, but in a different way than the one 
the keepnonzeropattern flag is meant to track.

So, do you have missing diagonals in your matrix?

Bug 1 - documentation
The documentation statement "Unlike `MatZeroRows()` this does not change the 
nonzero structure of the matrix, it merely zeros those entries in the matrix." 
appears to be incorrect for matrices missing diagonal entries since new 
nonzeros are added (to fill in the diagonal).  The documentation should really 
say 

"Unlike `MatZeroRows()` this routine cannot remove the zeroed entries from the 
nonzero structure of the matrix; in other words, setting the option 
`MAT_KEEP_NONZERO_PATTERN` to `PETSC_FALSE` has no effect on this routine."

Bug 2 - 
The short circuit if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) is 
wrong because that flag is meaningless for this operation. The check should be 
changed (I think) to nonew instead of keepnonzeropattern.

I will prepare a bug fix later today. For now you can just use the 
MAT_KEEP_NONZERO_PATTERN option for your code to work.

 Barry
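
For reference, the workaround mentioned above might look like the following in
user code. This is only a minimal sketch using the standard PETSc C API; the
matrix name A and the row list nrows/rows (and the vectors x, b) are
illustrative placeholders, not taken from the code under discussion.

```c
/* Sketch of the suggested workaround: fix the nonzero pattern before
   applying Dirichlet BCs so that MatZeroRowsColumns() cannot change the
   matrix's nonzero state (and PCFIELDSPLIT does not rebuild its
   submatrices, which would drop the attached near nullspace). */
PetscCall(MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE));
PetscCall(MatZeroRowsColumns(A, nrows, rows, 1.0, x, b));
```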







> On Feb 9, 2024, at 8:00 AM, Jeremy Theler (External) 
>  wrote:
> 
> > > Because of a combination of settings, our code passes through this line:
> > >
> > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/fieldsplit/fieldsplit.c?ref_type=heads#L692
> > > 
> > > i.e. the matrices associated with each of the sub-KSPs of a fieldsplit 
> > > are destroyed and then re-created later.
> > > The thing is that one of these destroyed matrices had a near nullspace 
> > > attached, which is lost because the new matrix does not have it anymore.
> > > 
> > > Is this a bug or are we missing something?
> > 
> > I just want to get a clear picture. You create a PCFIELDSPLIT, set it up, 
> > then pull out the matrices and attach a nullspace before the solve.
> 
> We need to solve an SNES. We use dmplex so we have the jacobian allocated 
> before starting the solve.
> At setup time we 
> 
>  1. define the PC of the KSP of the SNES to be fieldsplit
>  2. define the fields with ISes
>  3. call PCSetup() to create the sub-KSPs
>  4. retrieve the matrix attached to the sub-KSP that needs the near nullspace 
> and attach it to that matrix
> 
> > At a later time, you start another solve with this PC, and it has the 
> > DIFFERENT_NONZERO_PATTERN flag, so it recreates these matrices and loses 
> > your attached nullspace.
> 
> At a later time, in the jacobian evaluation we populate the global matrix 
> (i.e. not the matrices attached to each sub-KSPs) and then we set dirichlet 
> bcs with MatZeroRowsColumns() on that same global matrix.
> For some reason, in serial the near nullspace is not lost but in parallel the 
> call to MatZeroRowsColumns() does change the non-zero structure (even though 
> the manual says it does not) and then the code goes through that line 692 in 
> fieldsplit.c and the near nullspace is lost.
> 
> > First, does the matrix really change?
> 
> Well, the matrix during setup is not filled in, just allocated.
> The thing is that if we set MAT_KEEP_NONZERO_PATTERN to true with 
> MatSetOption() before setting the dirichlet BCs, then the near nullspace is 
> not lost (because the code does not go through line 692 

Re: [petsc-users] PETSc crashes when different rank sets row, col and A values using MatCreateSeqAIJWithArrays

2024-02-08 Thread Barry Smith

  No, it uses the exact layout you provided. 

  You can use 
https://petsc.org/release/manualpages/PC/PCREDISTRIBUTE/#pcredistribute to have 
the solver redistribute the rows to have an equal number per MPI process during 
the solve process, which will give you the effect you are looking for.

  Barry
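
For example, the redistribution can be requested purely from the options
database (option names follow the PCREDISTRIBUTE manual page; the inner
solver choices below are only illustrative):

```
# Redistribute rows evenly across ranks, then solve with the inner KSP/PC
-ksp_type preonly -pc_type redistribute
-redistribute_ksp_type gmres -redistribute_pc_type bjacobi
```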


> On Feb 8, 2024, at 4:07 AM, Maruthi NH  wrote:
> 
> Hi Barry,
> Thanks. Yes, the global column index was wrong. I have one more question 
> regarding MatCreateMPIAIJWithArrays. If I have 100 elements in rank 0 and 50 
> in rank 1, does PETSc redistribute equally among procs before solving?
> 
> Regards,
> Maruthi
> 
> On Mon, Feb 5, 2024 at 2:18 AM Barry Smith  <mailto:bsm...@petsc.dev>> wrote:
>> 
>>Is each rank trying to create its own sequential matrix with 
>> MatCreateSeqAIJWithArrays() or did you mean MatCreateMPIAIJWithArrays()?
>> 
>>If the latter, then possibly one of your size arguments is wrong or the 
>> indices are incorrect for the given sizes.
>> 
>>Barry
>> 
>> 
>> > On Feb 4, 2024, at 3:15 PM, Maruthi NH > > <mailto:maruth...@gmail.com>> wrote:
>> > 
>> > Hi all,
>> > 
>> > I have a row, col, and A values in CSR format; let's say rank 0 has 200 
>> > unknowns and rank 1 has 100 unknowns. If I use MatCreateSeqAIJWithArrays 
>> > to create a Matrix, it crashes. However, if each rank has an equal number 
>> > of unknowns, it works fine. Please let me know how to proceed
>> > 
>> > 
>> > Regards,
>> > Maruthi
>> 



Re: [petsc-users] KSP has an extra iteration when use shell matrix

2024-02-06 Thread Barry Smith


> On Feb 6, 2024, at 3:49 PM, Yi Hu  wrote:
> 
> Dear Barry,
> 
> Thanks for your help. Your suggestion works.
> 
> I found out another approach. I can still use Vec as ctx of my MatShell, then 
> I just need to initialize my MatShell with my solver Vec x. Then I can use an 
> empty FormJacobShell(), coz my MatShell ctx is automatically updated with my 
> solver vec (as I initialized with it). In MyMult the ctx of MatShell is the 
> current solve vec.
> 
  This is slightly dangerous and may not always work. This is because the x 
passed to the computeJacobian routine is not guaranteed to be exactly the 
vector you passed into SNESSolve() as the solution (it could potentially be 
some temporary work vector). I think it is best not to rely on the vector 
always being the same.



> Best regards,
> 
> Yi
> 
> On 2/6/24 01:39, Barry Smith wrote:
>> 
>>   You cannot do call MatShellSetContext(jac,X,ierr) in subroutine 
>> FormJacobianShell(snes,X,jac,B,dummy,ierr) because Fortran arguments are 
>> always pass by address and the address of the X being passed in may not be 
>> valid after the routine that called FormJacobianShell() has returned. This 
>> is why the GetContext works inside this function but not later in MyMult. 
>> 
>>Instead add a Vec Xbase member to the MatCtx type and update that inside 
>> FormJacobianShell(). Like
>> 
>>   TYPE(MatCtx),POINTER :: ctxF_pt
>>call MatShellGetContext(jac,ctxF_pt,ierr)
>>ctxF_pt%Xbase = X
>> 
>> The reason this works is because ctxF_pt%Xbase = X copies the value of X 
>> (the PETSc vector) to Xbase, not the address of X.
>> 
>>   Barry
>> 
>> 
>> 
>> 
>>> On Feb 5, 2024, at 4:18 PM, Yi Hu  <mailto:y...@mpie.de> 
>>> wrote:
>>> 
>>> Dear Barry,
>>> 
>>> the code is attached.
>>> 
>>> Just to let you know. When I commented out MatShellSetContext() in 
>>> FormJacobianShell(), then the code seems to work, meaning that the base 
>>> vector is passed to shell matrix context behind the scene.
>>> 
>>> Best regards,
>>> 
>>> Yi
>>> 
>>> On 2/5/24 19:09, Barry Smith wrote:
>>>> 
>>>>   Send the entire code.
>>>> 
>>>> 
>>>>> On Feb 4, 2024, at 4:43 PM, Yi Hu  <mailto:y...@mpie.de> 
>>>>> wrote:
>>>>> 
>>>>> Thanks for your response. You are correct. I overlooked this step.
>>>>> 
>>>>> Now I am trying to correct my "shell matrix approach" for ex1f.F90 of 
>>>>> snes solver 
>>>>> (https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90).
>>>>>  I realized that I need to record the base vector X in the context of 
>>>>> shell matrix and then use this info to carry MyMult. However, the context 
>>>>> cannot be obtained through MatShellGetContext(). Here are the critical 
>>>>> parts of my code.
>>>>> 
>>>>>INTERFACE MatCreateShell
>>>>>  SUBROUTINE MatCreateShell(comm,mloc,nloc,m,n,ctx,mat,ierr)
>>>>>USE solver_context
>>>>>MPI_Comm :: comm
>>>>>PetscInt :: mloc,nloc,m,n
>>>>>Vec :: ctx
>>>>>Mat :: mat
>>>>>PetscErrorCode :: ierr
>>>>>  END SUBROUTINE MatCreateShell
>>>>>END INTERFACE MatCreateShell
>>>>> 
>>>>>INTERFACE MatShellSetContext
>>>>>  SUBROUTINE MatShellSetContext(mat,ctx,ierr)
>>>>>USE solver_context
>>>>>Mat :: mat
>>>>>!TYPE(MatCtx) :: ctx
>>>>>Vec :: ctx
>>>>>PetscErrorCode :: ierr
>>>>>  END SUBROUTINE MatShellSetContext
>>>>>END INTERFACE MatShellSetContext
>>>>> 
>>>>>INTERFACE MatShellGetContext
>>>>>  SUBROUTINE MatShellGetContext(mat,ctx,ierr)
>>>>>USE solver_context
>>>>>Mat :: mat
>>>>>Vec, Pointer :: ctx
>>>>>PetscErrorCode :: ierr
>>>>>  END SUBROUTINE MatShellGetContext
>>>>>END INTERFACE MatShellGetContext
>>>>> 
>>>>> in my FormShellJacobian() I did
>>>>> 
>>>>> subroutine Form

Re: [petsc-users] KSP has an extra iteration when use shell matrix

2024-02-05 Thread Barry Smith

  You cannot do call MatShellSetContext(jac,X,ierr) in subroutine 
FormJacobianShell(snes,X,jac,B,dummy,ierr) because Fortran arguments are always 
passed by address and the address of the X being passed in may not be valid after 
the routine that called FormJacobianShell() has returned. This is why the 
GetContext works inside this function but not later in MyMult. 

   Instead add a Vec Xbase member to the MatCtx type and update that inside 
FormJacobianShell(). Like

  TYPE(MatCtx),POINTER :: ctxF_pt
   call MatShellGetContext(jac,ctxF_pt,ierr)
   ctxF_pt%Xbase = X

The reason this works is because ctxF_pt%Xbase = X copies the value of X (the 
PETSc vector) to Xbase, not the address of X.

  Barry




> On Feb 5, 2024, at 4:18 PM, Yi Hu  wrote:
> 
> Dear Barry,
> 
> the code is attached.
> 
> Just to let you know. When I commented out MatShellSetContext() in 
> FormJacobianShell(), then the code seems to work, meaning that the base 
> vector is passed to shell matrix context behind the scene.
> 
> Best regards,
> 
> Yi
> 
> On 2/5/24 19:09, Barry Smith wrote:
>> 
>>   Send the entire code.
>> 
>> 
>>> On Feb 4, 2024, at 4:43 PM, Yi Hu  <mailto:y...@mpie.de> 
>>> wrote:
>>> 
>>> Thanks for your response. You are correct. I overlooked this step.
>>> 
>>> Now I am trying to correct my "shell matrix approach" for ex1f.F90 of snes 
>>> solver 
>>> (https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90).
>>>  I realized that I need to record the base vector X in the context of shell 
>>> matrix and then use this info to carry MyMult. However, the context cannot 
>>> be obtained through MatShellGetContext(). Here are the critical parts of my 
>>> code.
>>> 
>>>INTERFACE MatCreateShell
>>>  SUBROUTINE MatCreateShell(comm,mloc,nloc,m,n,ctx,mat,ierr)
>>>USE solver_context
>>>MPI_Comm :: comm
>>>PetscInt :: mloc,nloc,m,n
>>>Vec :: ctx
>>>Mat :: mat
>>>PetscErrorCode :: ierr
>>>  END SUBROUTINE MatCreateShell
>>>END INTERFACE MatCreateShell
>>> 
>>>INTERFACE MatShellSetContext
>>>  SUBROUTINE MatShellSetContext(mat,ctx,ierr)
>>>USE solver_context
>>>Mat :: mat
>>>!TYPE(MatCtx) :: ctx
>>>Vec :: ctx
>>>PetscErrorCode :: ierr
>>>  END SUBROUTINE MatShellSetContext
>>>END INTERFACE MatShellSetContext
>>> 
>>>INTERFACE MatShellGetContext
>>>  SUBROUTINE MatShellGetContext(mat,ctx,ierr)
>>>USE solver_context
>>>Mat :: mat
>>>Vec, Pointer :: ctx
>>>PetscErrorCode :: ierr
>>>  END SUBROUTINE MatShellGetContext
>>>END INTERFACE MatShellGetContext
>>> 
>>> in my FormShellJacobian() I did
>>> 
>>> subroutine FormJacobianShell(snes,X,jac,B,dummy,ierr)
>>> 
>>> ..
>>> 
>>>   call MatShellSetContext(jac,X,ierr)
>>> 
>>> ..
>>> 
>>> Then in MyMult() I tried to recover this context by
>>> 
>>> call MatShellGetContext(J,x,ierr)
>>> 
>>> call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr)
>>> 
>>> Then the program failed with
>>> 
>>> [0]PETSC ERROR: - Error Message 
>>> --
>>> [0]PETSC ERROR: Null argument, when expecting valid pointer
>>> [0]PETSC ERROR: Null Pointer: Parameter # 1
>>> 
>>> In MyMult, I actually defined x to be a pointer. So I am confused here. 
>>> 
>>> Best regards,
>>> 
>>> Yi
>>> 
>>> On 1/31/24 03:18, Barry Smith wrote:
>>>> 
>>>>It is not running an extra KSP iteration. This "extra" matmult is 
>>>> normal and occurs in many of the SNESLineSearchApply_* functions, for 
>>>> example, 
>>>> https://petsc.org/release/src/snes/linesearch/impls/bt/linesearchbt.c.html#SNESLineSearchApply_BT
>>>>  It is used to decide if the Newton step results in sufficient decrease of 
>>>> the function value.
>>>> 
>>>>   Barry
>>>> 
>>>> 
>>>> 
>>>>> On Jan 30, 2024, at 3:19 PM, Yi Hu  <mailto:y...@mpie.de> 
>>>>> wrote:
>

Re: [petsc-users] KSP has an extra iteration when use shell matrix

2024-02-05 Thread Barry Smith

  Send the entire code.


> On Feb 4, 2024, at 4:43 PM, Yi Hu  wrote:
> 
> Thanks for your response. You are correct. I overlooked this step.
> 
> Now I am trying to correct my "shell matrix approach" for ex1f.F90 of snes 
> solver 
> (https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90).
>  I realized that I need to record the base vector X in the context of shell 
> matrix and then use this info to carry MyMult. However, the context cannot be 
> obtained through MatShellGetContext(). Here are the critical parts of my code.
> 
>INTERFACE MatCreateShell
>  SUBROUTINE MatCreateShell(comm,mloc,nloc,m,n,ctx,mat,ierr)
>USE solver_context
>MPI_Comm :: comm
>PetscInt :: mloc,nloc,m,n
>Vec :: ctx
>Mat :: mat
>PetscErrorCode :: ierr
>  END SUBROUTINE MatCreateShell
>END INTERFACE MatCreateShell
> 
>INTERFACE MatShellSetContext
>  SUBROUTINE MatShellSetContext(mat,ctx,ierr)
>USE solver_context
>Mat :: mat
>!TYPE(MatCtx) :: ctx
>Vec :: ctx
>PetscErrorCode :: ierr
>  END SUBROUTINE MatShellSetContext
>END INTERFACE MatShellSetContext
> 
>INTERFACE MatShellGetContext
>  SUBROUTINE MatShellGetContext(mat,ctx,ierr)
>USE solver_context
>Mat :: mat
>Vec, Pointer :: ctx
>PetscErrorCode :: ierr
>  END SUBROUTINE MatShellGetContext
>END INTERFACE MatShellGetContext
> 
> in my FormShellJacobian() I did
> 
> subroutine FormJacobianShell(snes,X,jac,B,dummy,ierr)
> 
> ..
> 
>   call MatShellSetContext(jac,X,ierr)
> 
> ..
> 
> Then in MyMult() I tried to recover this context by
> 
> call MatShellGetContext(J,x,ierr)
> 
> call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr)
> 
> Then the program failed with
> 
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Null argument, when expecting valid pointer
> [0]PETSC ERROR: Null Pointer: Parameter # 1
> 
> In MyMult, I actually defined x to be a pointer. So I am confused here. 
> 
> Best regards,
> 
> Yi
> 
> On 1/31/24 03:18, Barry Smith wrote:
>> 
>>It is not running an extra KSP iteration. This "extra" matmult is normal 
>> and occurs in many of the SNESLineSearchApply_* functions, for example, 
>> https://petsc.org/release/src/snes/linesearch/impls/bt/linesearchbt.c.html#SNESLineSearchApply_BT
>>  It is used to decide if the Newton step results in sufficient decrease of 
>> the function value.
>> 
>>   Barry
>> 
>> 
>> 
>>> On Jan 30, 2024, at 3:19 PM, Yi Hu  <mailto:y...@mpie.de> 
>>> wrote:
>>> 
>>> Hello Barry,
>>> 
>>> Thanks for your reply. The monitor options are fine. I actually meant my 
>>> modification of snes tutorial ex1f.F90 does not work and has some 
>>> unexpected behavior. I basically wanted to test if I can use a shell matrix 
>>> as my jacobian (code is here 
>>> https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90).
>>>  After compile my modified version and run with these monitor options, it 
>>> gives me the following,
>>> 
>>>  ( in rhs )
>>>  ( leave rhs )
>>>   0 SNES Function norm 6.041522986797e+00 
>>>   in jac shell +++
>>> 0 KSP Residual norm 6.041522986797e+00 
>>>  === start mymult ===
>>>  === done mymult ===
>>> 1 KSP Residual norm 5.065392549852e-16 
>>>   Linear solve converged due to CONVERGED_RTOL iterations 1
>>>  === start mymult ===
>>>  === done mymult ===
>>>  ( in rhs )
>>>  ( leave rhs )
>>>   1 SNES Function norm 3.512662245652e+00 
>>>   in jac shell +++
>>> 0 KSP Residual norm 3.512662245652e+00 
>>>  === start mymult ===
>>>  === done mymult ===
>>> 1 KSP Residual norm 6.230314124713e-16 
>>>   Linear solve converged due to CONVERGED_RTOL iterations 1
>>>  === start mymult ===
>>>  === done mymult ===
>>>  ( in rhs )
>>>  ( leave rhs )
>>>  ( in rhs )
>>>  ( leave rhs )
>>>   2 SNES Function norm 8.969285922373e-01 
>>>   in jac shell +++
>>> 0 KSP Residual norm 8.969285922373e-01 
>>>  === start mymult ===
>>>  === done mymult ===

Re: [petsc-users] PETSc crashes when different rank sets row, col and A values using MatCreateSeqAIJWithArrays

2024-02-04 Thread Barry Smith


   Is each rank trying to create its own sequential matrix with 
MatCreateSeqAIJWithArrays() or did you mean MatCreateMPIAIJWithArrays()?

   If the latter, then possibly one of your size arguments is wrong or the 
indices are incorrect for the given sizes.

   Barry


> On Feb 4, 2024, at 3:15 PM, Maruthi NH  wrote:
> 
> Hi all,
> 
> I have a row, col, and A values in CSR format; let's say rank 0 has 200 
> unknowns and rank 1 has 100 unknowns. If I use MatCreateSeqAIJWithArrays to 
> create a Matrix, it crashes. However, if each rank has an equal number of 
> unknowns, it works fine. Please let me know how to proceed
> 
> 
> Regards,
> Maruthi



Re: [petsc-users] PetscSection: Fortran interfaces

2024-02-03 Thread Barry Smith


  The Fortran "stubs" (subroutines) should be in 
$PETSC_ARCH/src/vec/is/section/interface/ftn-auto/sectionf.c and compiled and 
linked into the PETSc library.

  The same tool that builds the interfaces in 
$PETSC_ARCH/src/vec/f90-mod/ftn-auto-interfaces/petscpetscsection.h90,  also 
builds the stubs so it is surprising one would exist but not the other.

  Barry


> On Feb 3, 2024, at 11:27 AM, Martin Diehl  wrote:
> 
> Dear PETSc team,
> 
> I currently can't make use of Fortran interfaces for "section".
> In particular, I can't see how to use
> 
> PetscSectionGetFieldComponents
> PetscSectionGetFieldDof   
> PetscSectionGetFieldOffset
> 
> The interfaces for them are created in $PETSC_ARCH/src/vec/f90-mod/ftn-
> auto-interfaces/petscpetscsection.h90, but it seems that they are not
> exposed to the public.
> 
> Could you give me a hint how to use them or fix this?
> 
> with best regards,
> Martin
> 
> 
> -- 
> KU Leuven
> Department of Computer Science
> Department of Materials Engineering
> Celestijnenlaan 200a
> 3001 Leuven, Belgium
> 



Re: [petsc-users] Build a basis of the kernel of a matrix

2024-02-02 Thread Barry Smith

   Call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE) 
   For each column call VecGetArray(), zero the "small entries", then call 
MatSetValues() for that single column. 

   Barry
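
Put together, the column loop might look like the following sketch (standard
PETSc C API; nconv, the SLEPc singular vectors vs[], the drop tolerance tol,
and the preallocated MPIAIJ matrix V are assumptions for illustration only):

```c
/* Assemble the kernel-basis vectors v_j as sparse columns of V, dropping
   near-zero entries.  Assumes MatSetOption(V, MAT_IGNORE_ZERO_ENTRIES,
   PETSC_TRUE) was called so zeroed entries are not stored. */
for (PetscInt j = 0; j < nconv; j++) {
  const PetscScalar *a;
  PetscInt           rstart, rend;
  PetscCall(VecGetOwnershipRange(vs[j], &rstart, &rend));
  PetscCall(VecGetArrayRead(vs[j], &a));
  for (PetscInt i = rstart; i < rend; i++) {
    PetscScalar v = (PetscAbsScalar(a[i - rstart]) < tol) ? 0.0 : a[i - rstart];
    PetscCall(MatSetValue(V, i, j, v, INSERT_VALUES));
  }
  PetscCall(VecRestoreArrayRead(vs[j], &a));
}
PetscCall(MatAssemblyBegin(V, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(V, MAT_FINAL_ASSEMBLY));
```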


> On Feb 2, 2024, at 12:28 PM, TARDIEU Nicolas via petsc-users 
>  wrote:
> 
> Dear PETSc users,
> 
> I consider a sparse rectangular matrix B of size n x m, n < m, and I want to 
> compute a basis of its kernel.
> Conceptually, I compute the SVD decomposition u, s, v = svd(B) and obtain the 
> basis from the columns of v^T associated with (near) zero singular values.
> 
> In terms of implementation, B is an MPIAIJ matrix and I can use SLEPc to 
> retrieve each vector v_i associated with a given singular value. 
> Then I want to assemble a matrix from the set of v_i vectors. Depending on 
> the sparsity of B, the v_i may also be very sparse, so I want to build a 
> sparse matrix.
> 
> Is there an efficient way to do this in PETSc?
> 
> Regards,
> Nicolas
> 
> This message and any attachments (the 'Message') are intended solely for the 
> addressees. The information contained in this Message is confidential. Any 
> use of information contained in this Message not in accord with its purpose, 
> any dissemination or disclosure, either whole or partial, is prohibited 
> except formal approval.
> If you are not the addressee, you may not copy, forward, disclose or use any 
> part of it. If you have received this message in error, please delete it and 
> all copies from your system and notify the sender immediately by return 
> message.
> E-mail communication cannot be guaranteed to be timely secure, error or 
> virus-free.



Re: [petsc-users] Preconditioning of Liouvillian Superoperator

2024-02-01 Thread Barry Smith



> On Feb 1, 2024, at 6:57 AM, Niclas Götting  
> wrote:
> 
> Thank you very much for the input!
> 
> I've spent a lot of time to compress the linear system to a quarter of its 
> size. This resulted in a form, though, which cannot be represented by 
> Kronecker products. Maybe I should return to the original form..
> 
> The new structure of the linear system is as follows:
> 
> +---+---++
> | A | 0 | -2*B^T |
> +---+---++
> | 0 | D | -C^T   |
> +---+---++
> | B | C | D  |
> +---+---++
> 
> The matrices D and C are huge and should therefore be accountable for most of 
> the computational cost. Looking at the visual structure of C, I assume that 
> it can maybe be written as a sum of (skew-)symmetric matrices. A, B, and D 
> are not symmetric at all, though.
> 
> Can the block substructure alone be helpful for solving the linear system in 
> a smarter manner?

  Possibly, it depends on details about A and D. If good preconditioners are 
available for A and D separately then PCFIELDSPLIT can be used to construct a 
preconditioner for the entire matrix. If A is small then in the fieldsplit 
process LU can be used for A so it does not need a good preconditioner. So what 
is the structure of D?
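
If A is indeed small, a first experiment along these lines could be driven
from the options database (illustrative only; the field definitions for the
blocks must still be supplied in code, e.g. with PCFieldSplitSetIS()):

```
# Treat the A block as field 0 and direct-solve it with LU,
# iterate on the remaining fields.
-pc_type fieldsplit
-fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu
-fieldsplit_1_ksp_type gmres
```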
> 
> On 1/31/24 18:45, Barry Smith wrote:
>>For large problems, preconditioners have to take advantage of some 
>> underlying mathematical structure of the operator to perform well (require 
>> few iterations). Just black-boxing the system with simple preconditioners 
>> will not be effective.
>> 
>>So, one needs to look at the Liouvillian Superoperator's structure to see 
>> what one can take advantage of. I first noticed that it can be represented 
>> as a Kronecker product:   A x I or a combination of Kronecker products? In 
>> theory, one can take advantage of Kronecker structure to solve such systems 
>> much more efficiently than just directly solving the huge system naively as 
>> a huge system. In addition it may be possible to use the Kronecker structure 
>> of the operator to perform matrix-vector products with the operator much 
>> more efficiently than by first explicitly forming the huge matrix 
>> representation and doing the multiplies with that. I suggest some googling 
>> with linear solver, preconditioning, Kronecker product.
>> 
>>> On Jan 31, 2024, at 6:51 AM, Niclas Götting  
>>> wrote:
>>> 
>>> Hi all,
>>> 
>>> I've been trying for the last couple of days to solve a linear system using 
>>> iterative methods. The system size itself scales exponentially (64^N) with 
>>> the number of components, so I receive sizes of
>>> 
>>> * (64, 64) for one component
>>> * (4096, 4096) for two components
>>> * (262144, 262144) for three components
>>> 
>>> I can solve the first two cases with direct solvers and don't run into any 
>>> problems; however, the last case is the first nontrivial and it's too large 
>>> for a direct solution, which is why I believe that I need an iterative 
>>> solver.
>>> 
>>> As I know the solution for the first two cases, I tried to reproduce them 
>>> using GMRES and failed on the second, because GMRES didn't converge and 
>>> seems to have been going in the wrong direction (the vector to which it 
>>> "tries" to converge is a totally different one than the correct solution). 
>>> I went as far as -ksp_max_it 100, which takes orders of magnitude 
>>> longer than the LU solution and I'd intuitively think that GMRES should not 
>>> take *that* much longer than LU. Here is the information I have about this 
>>> (4096, 4096) system:
>>> 
>>> * not symmetric (which is why I went for GMRES)
>>> * not singular (SVD: condition number 1.427743623238e+06, 0 of 4096 
>>> singular values are (nearly) zero)
>>> * solving without preconditioning does not converge (DIVERGED_ITS)
>>> * solving with iLU and natural ordering fails due to zeros on the diagonal
>>> * solving with iLU and RCM ordering does not converge (DIVERGED_ITS)
>>> 
>>> After some searching I also found [this](http://arxiv.org/abs/1504.06768) 
>>> paper, which mentions the use of ILUTP, which I believe in PETSc should be 
>>> used via hypre, which, however, threw a SEGV for me, and I'm not sure if 
>>> it's worth debugging at this point in time, because I might be missing 
>>> something entirely different.
>>> 
>>> Does anybody have an idea how this system could be solved in finite time, 
>>> such that the method also scales to the three component problem?
>>> 
>>> Thank you all very much in advance!
>>> 
>>> Best regards
>>> Niclas
>>> 



Re: [petsc-users] Preconditioning of Liouvillian Superoperator

2024-01-31 Thread Barry Smith


   For large problems, preconditioners have to take advantage of some 
underlying mathematical structure of the operator to perform well (require few 
iterations). Just black-boxing the system with simple preconditioners will not 
be effective. 

   So, one needs to look at the Liouvillian Superoperator's structure to see 
what one can take advantage of. I first ask: can it be represented as a 
Kronecker product, A x I, or as a combination of Kronecker products? In theory, 
one can take advantage of Kronecker structure to solve such systems much more 
efficiently than naively solving the huge system directly. In addition, it may 
be possible to use the Kronecker structure of the operator to perform 
matrix-vector products with the operator much more efficiently than by first 
explicitly forming the huge matrix representation and multiplying with that. I 
suggest some googling with the terms: linear solver, preconditioning, Kronecker 
product.
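For readers following up on the Kronecker suggestion, here is a minimal NumPy 
sketch (not PETSc code; the matrices and sizes are made up) of the standard 
identity that lets one apply (A x B) to a vector without ever forming the huge 
Kronecker matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
x = rng.standard_normal(n * n)

# Naive: explicitly form the n^2 x n^2 Kronecker product and multiply.
y_naive = np.kron(A, B) @ x

# Matrix-free: (A kron B) vec(X) = vec(B X A^T), with column-major vec.
# Cost drops from O(n^4) storage/work to two n x n matrix products.
X = x.reshape(n, n, order="F")
y_free = (B @ X @ A.T).reshape(-1, order="F")

assert np.allclose(y_naive, y_free)
```

The same reshape trick is what a MATSHELL MatMult for such an operator would 
implement, so the three-component (262144 x 262144) system never needs to be 
stored explicitly.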

> On Jan 31, 2024, at 6:51 AM, Niclas Götting  
> wrote:
> 
> Hi all,
> 
> I've been trying for the last couple of days to solve a linear system using 
> iterative methods. The system size itself scales exponentially (64^N) with 
> the number of components, so I receive sizes of
> 
> * (64, 64) for one component
> * (4096, 4096) for two components
> * (262144, 262144) for three components
> 
> I can solve the first two cases with direct solvers and don't run into any 
> problems; however, the last case is the first nontrivial and it's too large 
> for a direct solution, which is why I believe that I need an iterative solver.
> 
> As I know the solution for the first two cases, I tried to reproduce them 
> using GMRES and failed on the second, because GMRES didn't converge and seems 
> to have been going in the wrong direction (the vector to which it "tries" to 
> converge is a totally different one than the correct solution). I went as far 
> as -ksp_max_it 100, which takes orders of magnitude longer than the LU 
> solution and I'd intuitively think that GMRES should not take *that* much 
> longer than LU. Here is the information I have about this (4096, 4096) system:
> 
> * not symmetric (which is why I went for GMRES)
> * not singular (SVD: condition number 1.427743623238e+06, 0 of 4096 singular 
> values are (nearly) zero)
> * solving without preconditioning does not converge (DIVERGED_ITS)
> * solving with iLU and natural ordering fails due to zeros on the diagonal
> * solving with iLU and RCM ordering does not converge (DIVERGED_ITS)
> 
> After some searching I also found [this](http://arxiv.org/abs/1504.06768) 
> paper, which mentions the use of ILUTP, which I believe in PETSc should be 
> used via hypre, which, however, threw a SEGV for me, and I'm not sure if it's 
> worth debugging at this point in time, because I might be missing something 
> entirely different.
> 
> Does anybody have an idea how this system could be solved in finite time, 
> such that the method also scales to the three component problem?
> 
> Thank you all very much in advance!
> 
> Best regards
> Niclas
> 



Re: [petsc-users] KSP has an extra iteration when use shell matrix

2024-01-30 Thread Barry Smith
)
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  ( in rhs )
>  ( leave rhs )
>  === start mymult ===
>  === done mymult ===
> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 8
> Number of SNES iterations = 8
> 
> After each "Linear solve converged due to CONVERGED_ATOL iterations", the 
> code starts to do mymult again. So I thought it did an extra (unwanted) KSP 
> iteration. I would like to ask if this extra iteration could be disabled, or 
> maybe I am wrong about it.
> 
> Best regards,
> 
> Yi
> 
> On 1/30/24 18:35, Barry Smith wrote:
>> 
>>   How do I see a difference? What does "hence ruin my previous converged KSP 
>> result" mean? A different answer at the end of the KSP solve?
>> 
>> $ ./joe > joe.basic
>> ~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) 
>> arch-fix-log-pcmpi
>> $ ./joe -ksp_monitor -ksp_converged_reason -snes_monitor > joe.monitor
>> ~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) 
>> arch-fix-log-pcmpi
>> $ diff joe.basic joe.monitor 
>> 0a1,36
>> >   0 SNES Function norm 6.041522986797e+00 
>> > 0 KSP Residual norm 6.041522986797e+00 
>> > 1 KSP Residual norm 5.065392549852e-16 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   1 SNES Function norm 3.512662245652e+00 
>> > 0 KSP Residual norm 3.512662245652e+00 
>> > 1 KSP Residual norm 6.230314124713e-16 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   2 SNES Function norm 8.969285922373e-01 
>> > 0 KSP Residual norm 8.969285922373e-01 
>> > 1 KSP Residual norm 0.e+00 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   3 SNES Function norm 4.863816734540e-01 
>> > 0 KSP Residual norm 4.863816734540e-01 
>> > 1 KSP Residual norm 0.e+00 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   4 SNES Function norm 3.512070785520e-01 
>> > 0 KSP Residual norm 3.512070785520e-01 
>> > 1 KSP Residual norm 0.e+00 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   5 SNES Function norm 2.769700293115e-01 
>> > 0 KSP Residual norm 2.769700293115e-01 
>> > 1 KSP Residual norm 1.104778916974e-16 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   6 SNES Function norm 2.055345318150e-01 
>> > 0 KSP Residual norm 2.055345318150e-01 
>> > 1 KSP Residual norm 1.535110861002e-17 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   7 SNES Function norm 1.267482220786e-01 
>> > 0 KSP Residual norm 1.267482220786e-01 
>> > 1 KSP Residual norm 1.498679601680e-17 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >   8 SNES Function norm 3.468150619264e-02 
>> > 0 KSP Residual norm 3.468150619264e-02 
>> > 1 KSP Residual norm 5.944160522951e-18 
>> >   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> 
>> 
>> 
>>> On Jan 30, 2024, at 11:19 AM, Yi Hu  <mailto:y...@mpie.de> 
>>> wrote:
>>> 
>>> Dear PETSc team,
>>>  
>>> I am still trying to sort out my previous thread 
>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2024-January/050079.html 
>>> using a minimal working example. However, I encountered another problem. 
>>> Basically I combined the basic usage of SNES solver and shell matrix and 
>>> tried to make it work. The jacobian of my snes is replaced by a customized 
>>> MATOP_MULT. The minimal example code can be viewed here 
>>> https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90
>>>  
>>> When running with -ksp_monitor -ksp_converged_reason, it shows an extra 
>>> mymult step, and hence ruins my previously converged KSP result. 
>>> Implementing a customized convergence callback also does not help. I am 
>>> wondering how to skip this extra KSP iteration. Could anyone help me with this?
>>>  
>>> Thanks for your help.
>>>  
>>> Best wishes,
>>> Yi
>>> 
>>> 
>>> -

Re: [petsc-users] KSP has an extra iteration when use shell matrix

2024-01-30 Thread Barry Smith

  How do I see a difference? What does "hence ruin my previous converged KSP 
result" mean? A different answer at the end of the KSP solve?

$ ./joe > joe.basic
~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) 
arch-fix-log-pcmpi
$ ./joe -ksp_monitor -ksp_converged_reason -snes_monitor > joe.monitor
~/Src/petsc/src/ksp/ksp/tutorials (barry/2023-09-15/fix-log-pcmpi=) 
arch-fix-log-pcmpi
$ diff joe.basic joe.monitor 
0a1,36
>   0 SNES Function norm 6.041522986797e+00 
> 0 KSP Residual norm 6.041522986797e+00 
> 1 KSP Residual norm 5.065392549852e-16 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   1 SNES Function norm 3.512662245652e+00 
> 0 KSP Residual norm 3.512662245652e+00 
> 1 KSP Residual norm 6.230314124713e-16 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   2 SNES Function norm 8.969285922373e-01 
> 0 KSP Residual norm 8.969285922373e-01 
> 1 KSP Residual norm 0.e+00 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   3 SNES Function norm 4.863816734540e-01 
> 0 KSP Residual norm 4.863816734540e-01 
> 1 KSP Residual norm 0.e+00 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   4 SNES Function norm 3.512070785520e-01 
> 0 KSP Residual norm 3.512070785520e-01 
> 1 KSP Residual norm 0.e+00 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   5 SNES Function norm 2.769700293115e-01 
> 0 KSP Residual norm 2.769700293115e-01 
> 1 KSP Residual norm 1.104778916974e-16 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   6 SNES Function norm 2.055345318150e-01 
> 0 KSP Residual norm 2.055345318150e-01 
> 1 KSP Residual norm 1.535110861002e-17 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   7 SNES Function norm 1.267482220786e-01 
> 0 KSP Residual norm 1.267482220786e-01 
> 1 KSP Residual norm 1.498679601680e-17 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>   8 SNES Function norm 3.468150619264e-02 
> 0 KSP Residual norm 3.468150619264e-02 
> 1 KSP Residual norm 5.944160522951e-18 
>   Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1



> On Jan 30, 2024, at 11:19 AM, Yi Hu  wrote:
> 
> Dear PETSc team,
>  
> I am still trying to sort out my previous thread 
> https://lists.mcs.anl.gov/pipermail/petsc-users/2024-January/050079.html 
> using a minimal working example. However, I encountered another problem. 
> Basically I combined the basic usage of SNES solver and shell matrix and 
> tried to make it work. The jacobian of my snes is replaced by a customized 
> MATOP_MULT. The minimal example code can be viewed here 
> https://github.com/hyharry/small_petsc_test/blob/master/test_shell_jac/ex1f.F90
>  
> When running with -ksp_monitor -ksp_converged_reason, it shows an extra 
> mymult step, and hence ruins my previously converged KSP result. Implementing 
> a customized convergence callback also does not help. I am wondering how to 
> skip this extra KSP iteration. Could anyone help me with this?
>  
> Thanks for your help.
>  
> Best wishes,
> Yi
> 
> 
> -
> Stay up to date and follow us on LinkedIn, Twitter and YouTube.
> 
> Max-Planck-Institut für Eisenforschung GmbH
> Max-Planck-Straße 1
> D-40237 Düsseldorf
>  
> Handelsregister B 2533 
> Amtsgericht Düsseldorf
>  
> Geschäftsführung
> Prof. Dr. Gerhard Dehm
> Prof. Dr. Jörg Neugebauer
> Prof. Dr. Dierk Raabe
> Dr. Kai de Weldige
>  
> Ust.-Id.-Nr.: DE 11 93 58 514 
> Steuernummer: 105 5891 1000
> 
> 
> Please consider that invitations and e-mails of our institute are 
> only valid if they end with …@mpie.de. 
> If you are not sure of the validity please contact r...@mpie.de 
> 
> 
> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails
> aus unserem Haus nur mit der Endung …@mpie.de gültig sind. 
> In Zweifelsfällen wenden Sie sich bitte an r...@mpie.de 
> -



Re: [petsc-users] Parallel vector layout for TAO optimization with separable state/design structure

2024-01-30 Thread Barry Smith

   This is a problem with MPI programming and optimization; I am unaware of a 
perfect solution.

   Put the design variables into the solution vector on MPI rank 0, and when 
doing your objective/gradient, send the values to all the MPI processes where 
you use them. You can use a VecScatter to handle the communication you need or 
MPI_Scatter() etc whatever makes the most sense in your code. 

   Barry
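As a toy illustration of the interleaved layout discussed in the quoted message 
below (a serial NumPy simulation of the ranks, not actual TAO/VecScatter code; 
all sizes are hypothetical), the replicated design slots can each carry 
design_grad / n_ranks so that summing the slots over ranks recovers the full 
design gradient:

```python
import numpy as np

n_state, n_design, n_ranks = 12, 3, 4

rng = np.random.default_rng(1)
# Each "rank" owns a slice of the state gradient and contributes a
# partial sum to the design gradient.
state_grads  = [rng.standard_normal(n_state // n_ranks) for _ in range(n_ranks)]
design_parts = [rng.standard_normal(n_design) for _ in range(n_ranks)]

# Interleaved optimization vector: x = [state_0, design, state_1, design, ...]
# Store design_grad / n_ranks in each replicated slot so the pieces add up.
design_grad = sum(design_parts)
grad = np.concatenate(
    [np.concatenate([sg, design_grad / n_ranks]) for sg in state_grads]
)

# Recover the design gradient by summing the replicated slots.
chunk = n_state // n_ranks + n_design
recovered = sum(grad[i * chunk + n_state // n_ranks : (i + 1) * chunk]
                for i in range(n_ranks))
assert np.allclose(recovered, design_grad)
```

In a real MPI code the summation over the replicated slots is exactly what a 
VecScatter with ADD_VALUES (or an MPI reduction) would do.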


> On Jan 30, 2024, at 10:53 AM, Guenther, Stefanie via petsc-users 
>  wrote:
> 
> Hi Petsc team, 
>  
> I have a question regarding parallel layout of a Petsc vector to be used in 
> TAO optimizers for cases where the optimization variables split into ‘design’ 
> and ‘state’ variables (e.g. such as in PDE-constrained optimization as in 
> tao_lcl). In our case, the state variable naturally parallelizes evenly 
> amongst multiple processors and this distribution is fixed. The ‘design’ 
> vector however does not, it is very short compared to the state vector and it 
> is required on all state-processors when evaluating the objective function 
> and gradient. My question would be how the TAO optimization vector x = 
> [design,state] should be created in such a way that the ‘state’ part is 
> distributed as needed in our solver, while the design part is not.
>  
> My only idea so far was to copy the design variables to all processors and 
> augment / interleave the optimization vector as x = [state_proc1,design, 
> state_proc2, design, … ] . When creating this vector in parallel on 
> PETSC_COMM_WORLD, each processor would then own the same number of variables 
> ( [state_proc, design] ), as long as the numbers match up, and I would 
> only need to be careful when gathering the gradient wrt the design parts from 
> all processors.
>  
> This seems cumbersome however, and I would be worried whether the 
> optimization problem is harder to solve this way. Is there any other way to 
> achieve this splitting, that I am missing here? Note that the distribution of 
> the state itself is given and can not be changed, and that the state vs 
> design vectors have very different (and independent) dimensions.
>  
> Thanks for your help and thoughts!
> Best,
> Stefanie



Re: [petsc-users] pc_redistribute issue

2024-01-29 Thread Barry Smith

Document the change in behavior for matrices with a block size greater than one

https://gitlab.com/petsc/petsc/-/merge_requests/7246
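A quick count (illustrative numbers only) shows why dropping rows, as 
PCREDISTRIBUTE does for rows with zero off-diagonal entries, breaks a block 
size greater than one:

```python
bs = 3                 # block size of the original matrix
n_rows = 12            # divisible by bs, so the blocks are well defined
removed = {4, 7}       # rows PCREDISTRIBUTE strips out (e.g. Dirichlet rows)

remaining = n_rows - len(removed)

assert n_rows % bs == 0       # the original matrix has clean bs=3 blocks
assert remaining % bs != 0    # 10 remaining rows: bs=3 no longer divides,
                              # so the reduced matrix cannot keep the blocks
```

Removing an arbitrary subset of rows slices through blocks, so the reduced 
matrix must fall back to block size one, which is the behavior the merge 
request documents.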


> On Jan 27, 2024, at 3:37 PM, Mark Adams  wrote:
> 
> Note, pc_redistribute is a great idea but you lose the block size, which is 
> obvious after you realize it, but is error prone.
> Maybe it would be better to throw an error if bs > 1 and add a 
> -pc_redistribute_ignore_block_size or something for users that want to press 
> on.
> 
> Thanks,
> Mark
> 
> On Sat, Jan 27, 2024 at 1:26 PM Mark Adams  > wrote:
>> Well, that puts the reason after the iterations, which is progress.
>> 
>> Oh, I see the preconditioned norm goes down a lot, but the reported residual 
>> that you would think is used for testing (see first post) does not go down 
>> 12 digits.
>> This matrix is very ill conditioned. LU just gets about 7 digits.
>> 
>> Thanks,
>> Mark
>> 
>> Residual norms for redistribute_ solve.
>> 0 KSP preconditioned resid norm 3.97683909e+16 true resid norm 
>> 6.646245659859e+06 ||r(i)||/||b|| 1.e+00
>> 1 KSP preconditioned resid norm 3.257912040767e+02 true resid norm 
>> 1.741027565497e-04 ||r(i)||/||b|| 2.619565472898e-11
>>   Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1
>> KSP Object: (redistribute_) 1 MPI process
>>   type: gmres
>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
>> with no iterative refinement
>> happy breakdown tolerance 1e-30
>>   maximum iterations=1, initial guess is zero
>>   tolerances:  relative=1e-12, absolute=1e-50, divergence=1.
>>   left preconditioning
>>   using PRECONDITIONED norm type for convergence test
>> PC Object: (redistribute_) 1 MPI process
>>   type: bjacobi
>> number of blocks = 1
>> Local solver information for first block is in the following KSP and PC 
>> objects on rank 0:
>> Use -redistribute_ksp_view ::ascii_info_detail to display information 
>> for all blocks
>> KSP Object: (redistribute_sub_) 1 MPI process
>>   type: preonly
>>   maximum iterations=1, initial guess is zero
>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>>   left preconditioning
>>   using NONE norm type for convergence test
>> PC Object: (redistribute_sub_) 1 MPI process
>>   type: lu
>> out-of-place factorization
>> tolerance for zero pivot 2.22045e-14
>> matrix ordering: external
>> factor fill ratio given 0., needed 0.
>>   Factored matrix follows:
>> Mat Object: (redistribute_sub_) 1 MPI process
>>   type: mumps
>>   rows=44378, cols=44378
>>   package used to perform factorization: mumps
>>   total: nonzeros=50309372, allocated nonzeros=50309372
>> MUMPS run parameters:
>> 
>> On Sat, Jan 27, 2024 at 12:51 PM Matthew Knepley > > wrote:
>>> Okay, so the tolerance is right. It must be using ||b|| instead of ||r0||. 
>>> Run with
>>> 
>>>   -redistribute_ksp_monitor_true_residual
>>> 
>>> You might have to force r0.
>>> 
>>>   Thanks,
>>> 
>>>  Matt
>>> 
>>> On Sat, Jan 27, 2024 at 11:44 AM Mark Adams >> > wrote:
 KSP Object: (redistribute_) 1 MPI process
   type: gmres
 restart=30, using Classical (unmodified) Gram-Schmidt 
 Orthogonalization with no iterative refinement
 happy breakdown tolerance 1e-30
   maximum iterations=1, initial guess is zero
   tolerances:  relative=1e-12, absolute=1e-50, divergence=1.
   left preconditioning
   using PRECONDITIONED norm type for convergence test
 PC Object: (redistribute_) 1 MPI process
   type: bjacobi
 number of blocks = 1
 Local solver information for first block is in the following KSP and 
 PC objects on rank 0:
 Use -redistribute_ksp_view ::ascii_info_detail to display information 
 for all blocks
 KSP Object: (redistribute_sub_) 1 MPI process
   type: preonly
   maximum iterations=1, initial guess is zero
   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
   left preconditioning
   using NONE norm type for convergence test
 PC Object: (redistribute_sub_) 1 MPI process
   type: lu
 out-of-place factorization
 tolerance for zero pivot 2.22045e-14
 
 On Sat, Jan 27, 2024 at 10:24 AM Matthew Knepley >>> > wrote:
> View the solver.
> 
>   Matt
> 
> On Sat, Jan 27, 2024 at 9:43 AM Mark Adams  > wrote:
>> I am not getting ksp_rtol 1e-12 into pc_redistribute correctly?
>> 
>>   Linear redistribute_ solve converged due to CONVERGED_RTOL iterations 1
>>   0 KSP Residual norm 2.182384017537e+02 
>>   1 KSP Residual norm 1.889764161573e-04 
>> Number of 

Re: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-26 Thread Barry Smith

   When run with -log_view_gpu_time, each event has two times: the time of the 
kernel on the GPU (computed directly on the GPU using GPU timers) and the time 
of the CPU clock. The time on the CPU for the event always encloses the entire 
kernel (hence its time is always at least as large as the kernel's time). 
Basically, the CPU time for an event where all the action happens on the GPU is 
the time of the kernel launch plus the time to run the kernel and confirm it is 
finished. So the GPU flop rate is the rate actually achieved on the GPU, while 
the CPU flop rate is the effective flop rate the user gets in the application.

  

> On Jan 26, 2024, at 6:48 AM, Anthony Jourdon  wrote:
> 
> Hello,
> 
> Thank you for your answers.
> I am working with Dave May on this topic.
> 
> Still running src/ksp/ksp/tutorials/ex34 with the same options reported by 
> Dave, I added the option -log_view_gpu_time.
> Now the log provides gpu flop/s instead of nans.
> However, I have trouble understanding the numbers reported in the log (file 
> attached).
> The numbers reported for Total Mflop/s and GPU Mflop/s are different even 
> when 100% of the work is supposed to be done on the GPU.
> The numbers reported for GPU Mflop/s are always higher than the numbers 
> reported for Total Mflop/s.
> As I understand, the Total Mflop/s should be the sum of both GPU and CPU 
> flop/s, but if the gpu does 100% of the work, why are there different numbers 
> reported by the GPU and Total flop/s columns and why the GPU flop/s are 
> always higher than the Total flop/s ?
> Or am I missing something?
> 
> Thank you for your attention.
> Anthony Jourdon
> 
> 
> 
> Le sam. 20 janv. 2024 à 02:25, Barry Smith  <mailto:bsm...@petsc.dev>> a écrit :
>> 
>>Nans indicate we do not have valid computational times for these 
>> operations; think of them as Not Available. Providing valid times for the 
>> "inner" operations listed with Nans requires inaccurate times (higher) for 
>> the outer operations, since extra synchronization between the CPU and GPU 
>> must be done to get valid times for the inner options. We opted to have the 
>> best valid times for the outer operations since those times reflect the time 
>> of the application.
>> 
>> 
>> 
>> 
>> 
>> > On Jan 19, 2024, at 12:35 PM, Dave May > > <mailto:dave.mayhe...@gmail.com>> wrote:
>> > 
>> > Hi all,
>> > 
>> > I am trying to understand the logging information associated with the 
>> > %flops-performed-on-the-gpu reported by -log_view when running 
>> >   src/ksp/ksp/tutorials/ex34
>> > with the following options
>> > -da_grid_x 192
>> > -da_grid_y 192
>> > -da_grid_z 192
>> > -dm_mat_type seqaijhipsparse
>> > -dm_vec_type seqhip
>> > -ksp_max_it 10
>> > -ksp_monitor
>> > -ksp_type richardson
>> > -ksp_view
>> > -log_view
>> > -mg_coarse_ksp_max_it 2
>> > -mg_coarse_ksp_type richardson
>> > -mg_coarse_pc_type none
>> > -mg_levels_ksp_type richardson
>> > -mg_levels_pc_type none
>> > -options_left
>> > -pc_mg_levels 3
>> > -pc_mg_log
>> > -pc_type mg
>> > 
>> > This config is not intended to actually solve the problem, rather it is a 
>> > stripped down set of options designed to understand what parts of the 
>> > smoothers are being executed on the GPU.
>> > 
>> > With respect to the log file attached, my first set of questions related 
>> > to the data reported under "Event Stage 2: MG Apply".
>> > 
>> > [1] Why is the log littered with nan's?
>> > * I don't understand how and why "GPU Mflop/s" should be reported as nan 
>> > when a value is given for "GPU %F" (see MatMult for example).
>> > 
>> > * For events executed on the GPU, I assume the column "Time (sec)" relates 
>> > to "CPU execute time", this would explain why we see a nan in "Time (sec)" 
>> > for MatMult.
>> > If my assumption is correct, how should I interpret the column "Flop 
>> > (Max)" which is showing 1.92e+09? 
>> > I would assume if "Time (sec)" relates to the CPU then "Flop (Max)" should 
>> > also relate to the CPU, and GPU flops would be logged in "GPU Mflop/s"
>> > 
>> > [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, 
>> > MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" 
>> > as 93. I believe

Re: [petsc-users] Bug in VecNorm, 3.20.3

2024-01-23 Thread Barry Smith


   This could happen if the values in the vector get changed but the 
PetscObjectState does not get updated. Normally this is impossible: any action 
that changes a vector's values changes its state (for example, calling 
VecGetArray()/VecRestoreArray() updates the state). 

   Are you accessing the vector values in any non-standard way? 

   Barry
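A toy sketch of the state-based caching mechanism (plain Python, not the actual 
PETSc implementation; all names are made up) shows how a value change that 
bypasses the state counter produces exactly this stale-norm symptom:

```python
import math

class StateCachedVec:
    """Toy model of PetscObjectState-based norm caching in VecNorm."""
    def __init__(self, vals):
        self._vals = list(vals)
        self._state = 0
        self._cached_norm = None
        self._cached_state = -1

    def set_values(self, vals):          # the standard access path
        self._vals = list(vals)
        self._state += 1                 # any legitimate write bumps the state

    def norm(self):
        if self._cached_state == self._state:   # the early return at rvector.c:217
            return self._cached_norm
        self._cached_norm = math.sqrt(sum(v * v for v in self._vals))
        self._cached_state = self._state
        return self._cached_norm

v = StateCachedVec([0.0, 0.0])
assert v.norm() == 0.0
v._vals = [3.0, 4.0]      # non-standard write: values change, state does not
assert v.norm() == 0.0    # stale cache -- the zero norm reported above
v.set_values([3.0, 4.0])  # a proper write bumps the state
assert v.norm() == 5.0    # cache invalidated, correct norm recomputed
```

Commenting out the early return (as the reporter did) hides the symptom by 
recomputing every time, but the underlying bug is whatever write path skipped 
the state update.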


> On Jan 23, 2024, at 11:39 AM, mich...@paraffinalia.co.uk wrote:
> 
> Hello,
> 
> I have used the GMRES solver in PETSc successfully up to now, but on 
> installing the most recent release, 3.20.3, the solver fails by exiting 
> early. Output from the code is:
> 
> lt-nbi-solve-laplace: starting PETSc solver [23.0537]
>  0 KSP Residual norm < 1.e-11
> Linear solve converged due to CONVERGED_ATOL iterations 0
> lt-nbi-solve-laplace: 0 iterations [23.0542] (22.9678)
> 
> and tracing execution shows the norm returned by VecNorm to be 0.
> 
> If I modify the function by commenting out line 217 of
> 
>  src/vec/vec/interface/rvector.c
> 
>  /* if (flg) PetscFunctionReturn(PETSC_SUCCESS); */
> 
> the code executes correctly:
> 
> lt-nbi-solve-laplace: starting PETSc solver [22.9392]
>  0 KSP Residual norm 1.10836
>  1 KSP Residual norm 0.0778301
>  2 KSP Residual norm 0.0125121
>  3 KSP Residual norm 0.00165836
>  4 KSP Residual norm 0.000164066
>  5 KSP Residual norm 2.12824e-05
>  6 KSP Residual norm 4.50696e-06
>  7 KSP Residual norm 5.85082e-07
> Linear solve converged due to CONVERGED_RTOL iterations 7
> 
> My compile options are:
> 
> PETSC_ARCH=linux-gnu-real ./configure --with-mpi=0 --with-scalar-type=real 
> --with-threadsafety --with-debugging=0 --with-log=0 --with-openmp
> 
> uname -a returns:
> 
> 5.15.80 #1 SMP PREEMPT Sun Nov 27 13:28:05 CST 2022 x86_64 Intel(R) Core(TM) 
> i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux
> 



Re: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-19 Thread Barry Smith

   Junchao,

How come  vecseqcupm_impl.hpp  has  PetscCall(PetscLogFlops(n)); instead of 
logging the flops on the GPU? 

This could be the root of the problem: the VecShift used to remove the null 
space from vectors in the solver is logging incorrectly. (For some reason there 
is no LogEventBegin/End() for VecShift, which is why it doesn't get its own 
line in the -log_view.)

   Barry
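If a kernel such as VecShift logs its flops with PetscLogFlops() (counted as 
CPU work) instead of PetscLogGpuFlops(), the reported "GPU %F" drops below 100 
even though everything ran on the device. A sketch with invented flop counts:

```python
gpu_flops = 1.86e9   # flops correctly logged via PetscLogGpuFlops()
cpu_flops = 1.40e8   # flops mislogged via PetscLogFlops(), e.g. VecShift

# -log_view reports GPU %F as the GPU share of all logged flops.
gpu_percent = 100.0 * gpu_flops / (gpu_flops + cpu_flops)
print(f"GPU %F = {gpu_percent:.0f}")   # 93, not 100
```

This matches the 93% seen for MGSmooth in the attached log: the ratio reflects 
where the flops were *logged*, not where they were executed.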




> On Jan 19, 2024, at 3:17 PM, Barry Smith  wrote:
> 
> 
>   Junchao
> 
> I ran the following on the CI machine; why does this happen? With trivial 
> solver options it runs ok.
> 
> bsmith@petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34 
> -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse 
> -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson 
> -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson 
> -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type 
> none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: GPU error
> [0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE)
> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program 
> crashed before usage or a spelling mistake, etc!
> [0]PETSC ERROR:   Option left: name:-options_left (no value) source: command 
> line
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.20.3, unknown 
> [0]PETSC ERROR: ./ex34 on a  named petsc-gpu-02 by bsmith Fri Jan 19 14:15:20 
> 2024
> [0]PETSC ERROR: Configure options 
> --package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24 
> --with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc 
> --with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" 
> CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1 
> --with-precision=double --with-clanguage=c --download-kokkos 
> --download-kokkos-kernels --download-hypre --download-magma 
> --with-magma-fortran-bindings=0 --download-mfem --download-metis 
> --with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double
> [0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at 
> /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131
> [0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at 
> /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004
> [0]PETSC ERROR: #3 MatMultAdd() at 
> /scratch/bsmith/petsc/src/mat/interface/matrix.c:2770
> [0]PETSC ERROR: #4 MatInterpolateAdd() at 
> /scratch/bsmith/petsc/src/mat/interface/matrix.c:8603
> [0]PETSC ERROR: #5 PCMGMCycle_Private() at 
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87
> [0]PETSC ERROR: #6 PCMGMCycle_Private() at 
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83
> [0]PETSC ERROR: #7 PCApply_MG_Internal() at 
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611
> [0]PETSC ERROR: #8 PCApply_MG() at 
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633
> [0]PETSC ERROR: #9 PCApply() at 
> /scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498
> [0]PETSC ERROR: #10 KSP_PCApply() at 
> /scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383
> [0]PETSC ERROR: #11 KSPSolve_Richardson() at 
> /scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106
> [0]PETSC ERROR: #12 KSPSolve_Private() at 
> /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906
> [0]PETSC ERROR: #13 KSPSolve() at 
> /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079
> [0]PETSC ERROR: #14 main() at ex34.c:52
> [0]PETSC ERROR: PETSc Option Table entries:
> 
>   Dave,
> 
> Trying to debug the 7% now, but having trouble running, as you see above.
> 
> 
> 
>> On Jan 19, 2024, at 3:02 PM, Dave May  wrote:
>> 
>> Thank you Barry and Junchao for these explanations. I'll turn on 
>> -log_view_gpu_time.
>> 
>> Do either of you have any thoughts regarding why the percentage of flop's 
>> being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this 
>> solver configuration?
>> 
>> This number should have nothing to do with timings as it reports the ratio 
>> of operations performed on the GPU and CPU, presumably obtained from 
>> PetscLogFlops() and PetscLogGpuFlops().
>> 
>> Cheers,
>> Dave
>> 
>> On Fri, 19 Jan 2024 at 11:39, Junchao Zhang > <mailto:junchao.zh...@gmail.com>> wrote:
>>> Try to also add -log_view_gpu_time, 
>>> https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/
>>> 
>>> --Jun

Re: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-19 Thread Barry Smith

  Junchao

I ran the following on the CI machine; why does this happen? With trivial 
solver options it runs ok.

bsmith@petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34 
-da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse 
-dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson -ksp_view 
-log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson 
-mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type none 
-options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg
[0]PETSC ERROR: - Error Message 
--
[0]PETSC ERROR: GPU error
[0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE)
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program 
crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR:   Option left: name:-options_left (no value) source: command 
line
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.20.3, unknown 
[0]PETSC ERROR: ./ex34 on a  named petsc-gpu-02 by bsmith Fri Jan 19 14:15:20 
2024
[0]PETSC ERROR: Configure options 
--package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24 
--with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc 
--with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" 
CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1 
--with-precision=double --with-clanguage=c --download-kokkos 
--download-kokkos-kernels --download-hypre --download-magma 
--with-magma-fortran-bindings=0 --download-mfem --download-metis 
--with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double
[0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at 
/scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131
[0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at 
/scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004
[0]PETSC ERROR: #3 MatMultAdd() at 
/scratch/bsmith/petsc/src/mat/interface/matrix.c:2770
[0]PETSC ERROR: #4 MatInterpolateAdd() at 
/scratch/bsmith/petsc/src/mat/interface/matrix.c:8603
[0]PETSC ERROR: #5 PCMGMCycle_Private() at 
/scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87
[0]PETSC ERROR: #6 PCMGMCycle_Private() at 
/scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83
[0]PETSC ERROR: #7 PCApply_MG_Internal() at 
/scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611
[0]PETSC ERROR: #8 PCApply_MG() at 
/scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633
[0]PETSC ERROR: #9 PCApply() at 
/scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498
[0]PETSC ERROR: #10 KSP_PCApply() at 
/scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383
[0]PETSC ERROR: #11 KSPSolve_Richardson() at 
/scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106
[0]PETSC ERROR: #12 KSPSolve_Private() at 
/scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906
[0]PETSC ERROR: #13 KSPSolve() at 
/scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079
[0]PETSC ERROR: #14 main() at ex34.c:52
[0]PETSC ERROR: PETSc Option Table entries:

  Dave,

Trying to debug the 7% now, but having trouble running, as you see above.



> On Jan 19, 2024, at 3:02 PM, Dave May  wrote:
> 
> Thank you Barry and Junchao for these explanations. I'll turn on 
> -log_view_gpu_time.
> 
> Do either of you have any thoughts regarding why the percentage of flop's 
> being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this 
> solver configuration?
> 
> This number should have nothing to do with timings as it reports the ratio of 
> operations performed on the GPU and CPU, presumably obtained from 
> PetscLogFlops() and PetscLogGpuFlops().
> 
> Cheers,
> Dave
> 
> On Fri, 19 Jan 2024 at 11:39, Junchao Zhang  > wrote:
>> Try to also add -log_view_gpu_time, 
>> https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/
>> 
>> --Junchao Zhang
>> 
>> 
>> On Fri, Jan 19, 2024 at 11:35 AM Dave May > > wrote:
>>> Hi all,
>>> 
>>> I am trying to understand the logging information associated with the 
>>> %flops-performed-on-the-gpu reported by -log_view when running 
>>>   src/ksp/ksp/tutorials/ex34
>>> with the following options
>>> -da_grid_x 192
>>> -da_grid_y 192
>>> -da_grid_z 192
>>> -dm_mat_type seqaijhipsparse
>>> -dm_vec_type seqhip
>>> -ksp_max_it 10
>>> -ksp_monitor
>>> -ksp_type richardson
>>> -ksp_view
>>> -log_view
>>> -mg_coarse_ksp_max_it 2
>>> -mg_coarse_ksp_type richardson
>>> -mg_coarse_pc_type none
>>> -mg_levels_ksp_type richardson
>>> -mg_levels_pc_type none
>>> -options_left
>>> -pc_mg_levels 3
>>> -pc_mg_log
>>> -pc_type mg
>>> 
>>> This config is not intended to actually solve the problem, rather it is a 
>>> stripped down set of options designed to understand what parts of the 
>>> smoothers are being executed on the GPU.
>>> 
>>> With respect to the log file attached, my first set of questions related to 
>>> the data reported under "Event Stage 2: MG Apply".

Re: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-19 Thread Barry Smith


   NaNs indicate we do not have valid computational times for these operations; 
think of them as Not Available. Providing valid times for the "inner" 
operations listed with NaNs would require inaccurate (higher) times for the 
outer operations, since extra synchronization between the CPU and GPU must be 
done to get valid times for the inner operations. We opted to have the best 
valid times for the outer operations since those times reflect the time of the 
application.





> On Jan 19, 2024, at 12:35 PM, Dave May  wrote:
> 
> Hi all,
> 
> I am trying to understand the logging information associated with the 
> %flops-performed-on-the-gpu reported by -log_view when running 
>   src/ksp/ksp/tutorials/ex34
> with the following options
> -da_grid_x 192
> -da_grid_y 192
> -da_grid_z 192
> -dm_mat_type seqaijhipsparse
> -dm_vec_type seqhip
> -ksp_max_it 10
> -ksp_monitor
> -ksp_type richardson
> -ksp_view
> -log_view
> -mg_coarse_ksp_max_it 2
> -mg_coarse_ksp_type richardson
> -mg_coarse_pc_type none
> -mg_levels_ksp_type richardson
> -mg_levels_pc_type none
> -options_left
> -pc_mg_levels 3
> -pc_mg_log
> -pc_type mg
> 
> This config is not intended to actually solve the problem, rather it is a 
> stripped down set of options designed to understand what parts of the 
> smoothers are being executed on the GPU.
> 
> With respect to the log file attached, my first set of questions related to 
> the data reported under "Event Stage 2: MG Apply".
> 
> [1] Why is the log littered with nan's?
> * I don't understand how and why "GPU Mflop/s" should be reported as nan when 
> a value is given for "GPU %F" (see MatMult for example).
> 
> * For events executed on the GPU, I assume the column "Time (sec)" relates to 
> "CPU execute time", this would explain why we see a nan in "Time (sec)" for 
> MatMult.
> If my assumption is correct, how should I interpret the column "Flop (Max)", 
> which is showing 1.92e+09? 
> I would assume that if "Time (sec)" relates to the CPU, then "Flop (Max)" 
> should also relate to the CPU, and GPU flops would be logged in "GPU Mflop/s".
> 
> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve, MGSmooth 
> Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as 93. I 
> believe this value should be 100 as the smoother (and coarse grid solver) are 
> configured as richardson(2)+none and thus should run entirely on the GPU. 
> Furthermore, when one inspects all events listed under "Event Stage 2: MG 
> Apply" those events which do flops correctly report "GPU %F" as 100. 
> And the events showing "GPU %F" = 0 such as 
>   MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync
> don't do any flops (on the CPU or GPU) - which is also correct (although non 
> GPU events should show nan??)
> 
> Hence I am wondering what is the explanation for the missing 7% from "GPU %F" 
> for KSPSolve and MGSmooth {0,1,2}??
> 
> Does anyone understand this -log_view, or can explain to me how to interpret 
> it?
> 
> It could simply be that:
> a) something is messed up with -pc_mg_log
> b) something is messed up with the PETSc build
> c) I am putting too much faith in -log_view and should profile the code 
> differently.
> 
> Either way I'd really like to understand what is going on.
> 
> 
> Cheers,
> Dave
> 
> 
> 
> 



Re: [petsc-users] Question about a parallel implementation of PCFIELDSPLIT

2024-01-19 Thread Barry Smith

   Generally fieldsplit is used on problems that have a natural "split" of the 
variables into two or more subsets, for example u0,v0,u1,v1,u2,v2,u3,v3. This is 
often indicated in the vectors and matrices with the "blocksize" argument, 2 in 
this case. DM also often provides this information. 

   When laying out a vector/matrix with a blocksize one must ensure that a 
whole number of the subsets appears on each MPI process. So, for example, if 
the above vector is distributed over 3 MPI processes one could use 
u0,v0,u1,v1 | u2,v2 | u3,v3, but one cannot use u0,v0,u1 | v1,u2,v2 | 
u3,v3. Another way to think about it is that one must split up the vector as 
indexed by block among the processes. For most multicomponent problems this 
type of decomposition is very natural in the logic of the code.

  Barry
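The rule above can be sketched in a few lines (plain Python, not petsc4py; 
block_local_sizes is a made-up helper for illustration): to respect a blocksize 
one distributes whole blocks, not individual unknowns, across the ranks.

```python
def block_local_sizes(n_blocks, bs, nprocs):
    # Distribute n_blocks blocks (each holding bs unknowns) as evenly as
    # possible, so every rank owns a whole number of blocks.
    base, extra = divmod(n_blocks, nprocs)
    return [(base + (1 if rank < extra else 0)) * bs for rank in range(nprocs)]

# u0,v0,u1,v1,u2,v2,u3,v3 (blocksize 2) over 3 ranks -> 4 + 2 + 2 unknowns,
# matching the admissible layout u0,v0,u1,v1 | u2,v2 | u3,v3
sizes = block_local_sizes(4, 2, 3)
assert sizes == [4, 2, 2]
assert sum(sizes) == 8 and all(s % 2 == 0 for s in sizes)
```

Setting the local sizes this way (instead of letting PETSc split N unknowns 
evenly) avoids the "Local columns of A10 ... do not equal local rows of A00" 
error when the rank count does not divide N.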
 

> On Jan 19, 2024, at 3:19 AM, Pantelis Moschopoulos 
>  wrote:
> 
> Dear all,
> 
> When I am using PCFIELDSPLIT and pc type "schur" in serial mode everything 
> works fine. When I turn now to parallel, I observe that the number of ranks 
> that I can use must divide the number of N without any remainder, where N is 
> the number of unknowns. Otherwise, an error of the following form emerges: 
> "Local columns of A10 3473 do not equal local rows of A00 3471".
> 
> Can I do something to overcome this?
> 
> Thanks,
> Pantelis



Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   Thanks. Same version I tried. 


> On Jan 18, 2024, at 6:09 PM, Yesypenko, Anna  wrote:
> 
> Hi Barry,
> 
> I'm using version 3.20.3. The tacc system is lonestar6.
> 
> Best,
> Anna
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Sent: Thursday, January 18, 2024 4:43 PM
> To: Yesypenko, Anna mailto:a...@oden.utexas.edu>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>; Victor Eijkhout 
> mailto:eijkh...@tacc.utexas.edu>>
> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>  
> 
>Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, 
> even increased the problem size without producing any problems. Both versions 
> of the Python code. 
> 
>Anna,
> 
>What version of PETSc are you using?
> 
>Victor,
> 
>Does anyone at ANL have access to this TACC system to try to reproduce?
> 
> 
>   Barry
> 
>
> 
>> On Jan 18, 2024, at 4:38 PM, Barry Smith > <mailto:bsm...@petsc.dev>> wrote:
>> 
>> 
>>It is using the hash map system for inserting values which only inserts 
>> on the CPU, not on the GPU. So I don't see that it would be moving any data 
>> to the GPU until the mat assembly() is done which it never gets to. Hence I 
>> have trouble understanding why the GPU has anything to do with the crash. 
>> 
>>I guess I need to try to reproduce it on a GPU system.
>> 
>>Barry
>> 
>> 
>> 
>> 
>>> On Jan 18, 2024, at 4:28 PM, Matthew Knepley >> <mailto:knep...@gmail.com>> wrote:
>>> 
>>> On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna >> <mailto:a...@oden.utexas.edu>> wrote:
>>> Hi Matt, Barry,
>>> 
>>> Apologies for the extra dependency on scipy. I can replicate the error by 
>>> calling setValue (i,j,v) in a loop as well.
>>> In roughly half of 10 runs, the following script fails because of an error 
>>> in hashmapijv – the same as my original post.
>>> It successfully runs without error the other times.
>>> 
>>> Barry is right that it's CUDA specific. The script runs fine on the CPU.
>>> Do you have any suggestions or example scripts on assigning entries to a 
>>> AIJCUSPARSE matrix?
>>> 
>>> Oh, you definitely do not want to be doing this. I believe you would rather
>>> 
>>> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.
>>> 
>>> 2) Produce the values on the GPU and call
>>> 
>>>   https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
>>>   https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/
>>> 
>>>   This is what most people do who are forming matrices directly on the GPU.
>>> 
>>> What you are currently doing is incredibly inefficient, and I think 
>>> accounts for you running out of memory.
>>> It talks back and forth between the CPU and GPU.
>>> 
>>>   Thanks,
>>> 
>>>  Matt
>>> 
>>> Here is a minimum snippet that doesn't depend on scipy.
>>> ```
>>> from petsc4py import PETSc
>>> import numpy as np
>>> 
>>> n = int(5e5); 
>>> nnz = 3 * np.ones(n, dtype=np.int32)
>>> nnz[0] = nnz[-1] = 2
>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>>> A.setType('aijcusparse')
>>> 
>>> A.setValue(0, 0, 2)
>>> A.setValue(0, 1, -1)
>>> A.setValue(n-1, n-2, -1)
>>> A.setValue(n-1, n-1, 2)
>>> 
>>> for index in range(1, n - 1):
>>>  A.setValue(index, index - 1, -1)
>>>  A.setValue(index, index, 2)
>>>  A.setValue(index, index + 1, -1)
>>> A.assemble()
>>> ```
>>> If it means anything to you, when the hash error occurs, it is for index 
>>> 67283 after filling 201851 nonzero values.
>>> 
>>> Thank you for your help and suggestions!
>>> Anna
>>> 
>>> From: Barry Smith mailto:bsm...@petsc.dev>>
>>> Sent: Thursday, January 18, 2024 2:35 PM
>>> To: Yesypenko, Anna mailto:a...@oden.utexas.edu>>
>>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
>>> mailto:petsc-users@mcs.anl.gov>>
>>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>>>  
>>> 
>>>    Do you ever get a problem with 'aij` ?   Can you run in a loop with 
>>> 'aij' to confirm it doesn't fail then?
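Matt's COO suggestion in the thread above can be sketched in plain Python. Only 
the triplet construction is runnable here; the two petsc4py calls 
(setPreallocationCOO/setValuesCOO, assumed available in petsc4py >= 3.18) 
appear as comments.

```python
def tridiag_coo(n):
    # COO triplets for the [-1, 2, -1] tridiagonal stencil, one pass per row
    rows, cols, vals = [], [], []
    for r in range(n):
        if r > 0:
            rows.append(r); cols.append(r - 1); vals.append(-1.0)
        rows.append(r); cols.append(r); vals.append(2.0)
        if r < n - 1:
            rows.append(r); cols.append(r + 1); vals.append(-1.0)
    return rows, cols, vals

ci, cj, cv = tridiag_coo(5)
assert len(cv) == 3 * 5 - 2  # nnz of a tridiagonal matrix

# The arrays would then be handed to petsc4py in one shot, e.g.:
#   A.setPreallocationCOO(ci, cj)
#   A.setValuesCOO(cv)
# avoiding the per-entry CPU<->GPU traffic of setValue in a loop.
```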

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, 
even increased the problem size without producing any problems. Both versions 
of the Python code. 

   Anna,

   What version of PETSc are you using?

   Victor,

   Does anyone at ANL have access to this TACC system to try to reproduce?


  Barry

   

> On Jan 18, 2024, at 4:38 PM, Barry Smith  wrote:
> 
> 
>It is using the hash map system for inserting values which only inserts on 
> the CPU, not on the GPU. So I don't see that it would be moving any data to 
> the GPU until the mat assembly() is done which it never gets to. Hence I have 
> trouble understanding why the GPU has anything to do with the crash. 
> 
>I guess I need to try to reproduce it on a GPU system.
> 
>Barry
> 
> 
> 
> 
>> On Jan 18, 2024, at 4:28 PM, Matthew Knepley  wrote:
>> 
>> On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna > <mailto:a...@oden.utexas.edu>> wrote:
>>> Hi Matt, Barry,
>>> 
>>> Apologies for the extra dependency on scipy. I can replicate the error by 
>>> calling setValue (i,j,v) in a loop as well.
>>> In roughly half of 10 runs, the following script fails because of an error 
>>> in hashmapijv – the same as my original post.
>>> It successfully runs without error the other times.
>>> 
>>> Barry is right that it's CUDA specific. The script runs fine on the CPU.
>>> Do you have any suggestions or example scripts on assigning entries to a 
>>> AIJCUSPARSE matrix?
>> 
>> Oh, you definitely do not want to be doing this. I believe you would rather
>> 
>> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.
>> 
>> 2) Produce the values on the GPU and call
>> 
>>   https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
>>   https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/
>> 
>>   This is what most people do who are forming matrices directly on the GPU.
>> 
>> What you are currently doing is incredibly inefficient, and I think accounts 
>> for you running out of memory.
>> It talks back and forth between the CPU and GPU.
>> 
>>   Thanks,
>> 
>>  Matt
>> 
>>> Here is a minimum snippet that doesn't depend on scipy.
>>> ```
>>> from petsc4py import PETSc
>>> import numpy as np
>>> 
>>> n = int(5e5); 
>>> nnz = 3 * np.ones(n, dtype=np.int32)
>>> nnz[0] = nnz[-1] = 2
>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>>> A.setType('aijcusparse')
>>> 
>>> A.setValue(0, 0, 2)
>>> A.setValue(0, 1, -1)
>>> A.setValue(n-1, n-2, -1)
>>> A.setValue(n-1, n-1, 2)
>>> 
>>> for index in range(1, n - 1):
>>>  A.setValue(index, index - 1, -1)
>>>  A.setValue(index, index, 2)
>>>  A.setValue(index, index + 1, -1)
>>> A.assemble()
>>> ```
>>> If it means anything to you, when the hash error occurs, it is for index 
>>> 67283 after filling 201851 nonzero values.
>>> 
>>> Thank you for your help and suggestions!
>>> Anna
>>> 
>>> From: Barry Smith mailto:bsm...@petsc.dev>>
>>> Sent: Thursday, January 18, 2024 2:35 PM
>>> To: Yesypenko, Anna mailto:a...@oden.utexas.edu>>
>>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
>>> mailto:petsc-users@mcs.anl.gov>>
>>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>>>  
>>> 
>>>Do you ever get a problem with 'aij` ?   Can you run in a loop with 
>>> 'aij' to confirm it doesn't fail then?
>>> 
>>>
>>> 
>>>Barry
>>> 
>>> 
>>>> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna >>> <mailto:a...@oden.utexas.edu>> wrote:
>>>> 
>>>> Dear Petsc users/developers,
>>>> 
>>>> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
>>>> mistake in how I set up a AIJCUSPARSE matrix.
>>>> For larger matrices, I sometimes encounter a error in assigning matrix 
>>>> values; the error is thrown in PetscHMapIJVQuerySet().
>>>> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
>>>> 
>>>> ```
>>>> from petsc4py import PETSc
>>>> from scipy.sparse import diags
>>>> import numpy as np
>>>> 
>>>> n = int(5e5); 

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   It is using the hash map system for inserting values which only inserts on 
the CPU, not on the GPU. So I don't see that it would be moving any data to the 
GPU until the mat assembly() is done which it never gets to. Hence I have 
trouble understanding why the GPU has anything to do with the crash. 

   I guess I need to try to reproduce it on a GPU system.

   Barry




> On Jan 18, 2024, at 4:28 PM, Matthew Knepley  wrote:
> 
> On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna  <mailto:a...@oden.utexas.edu>> wrote:
>> Hi Matt, Barry,
>> 
>> Apologies for the extra dependency on scipy. I can replicate the error by 
>> calling setValue (i,j,v) in a loop as well.
>> In roughly half of 10 runs, the following script fails because of an error 
>> in hashmapijv – the same as my original post.
>> It successfully runs without error the other times.
>> 
>> Barry is right that it's CUDA specific. The script runs fine on the CPU.
>> Do you have any suggestions or example scripts on assigning entries to a 
>> AIJCUSPARSE matrix?
> 
> Oh, you definitely do not want to be doing this. I believe you would rather
> 
> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.
> 
> 2) Produce the values on the GPU and call
> 
>   https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
>   https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/
> 
>   This is what most people do who are forming matrices directly on the GPU.
> 
> What you are currently doing is incredibly inefficient, and I think accounts 
> for you running out of memory.
> It talks back and forth between the CPU and GPU.
> 
>   Thanks,
> 
>  Matt
> 
>> Here is a minimum snippet that doesn't depend on scipy.
>> ```
>> from petsc4py import PETSc
>> import numpy as np
>> 
>> n = int(5e5); 
>> nnz = 3 * np.ones(n, dtype=np.int32)
>> nnz[0] = nnz[-1] = 2
>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>> A.setType('aijcusparse')
>> 
>> A.setValue(0, 0, 2)
>> A.setValue(0, 1, -1)
>> A.setValue(n-1, n-2, -1)
>> A.setValue(n-1, n-1, 2)
>> 
>> for index in range(1, n - 1):
>>  A.setValue(index, index - 1, -1)
>>  A.setValue(index, index, 2)
>>  A.setValue(index, index + 1, -1)
>> A.assemble()
>> ```
>> If it means anything to you, when the hash error occurs, it is for index 
>> 67283 after filling 201851 nonzero values.
>> 
>> Thank you for your help and suggestions!
>> Anna
>> 
>> From: Barry Smith mailto:bsm...@petsc.dev>>
>> Sent: Thursday, January 18, 2024 2:35 PM
>> To: Yesypenko, Anna mailto:a...@oden.utexas.edu>>
>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
>> mailto:petsc-users@mcs.anl.gov>>
>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>>  
>> 
>>Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' 
>> to confirm it doesn't fail then?
>> 
>>
>> 
>>Barry
>> 
>> 
>>> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna >> <mailto:a...@oden.utexas.edu>> wrote:
>>> 
>>> Dear Petsc users/developers,
>>> 
>>> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
>>> mistake in how I set up a AIJCUSPARSE matrix.
>>> For larger matrices, I sometimes encounter a error in assigning matrix 
>>> values; the error is thrown in PetscHMapIJVQuerySet().
>>> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
>>> 
>>> ```
>>> from petsc4py import PETSc
>>> from scipy.sparse import diags
>>> import numpy as np
>>> 
>>> n = int(5e5); 
>>> 
>>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>>> A.setType('aijcusparse')
>>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
>>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
>>> ### this is the line where the error is thrown.
>>> A.assemble()
>>> ```
>>> 
>>> The error trace is below:
>>> ```
>>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>>>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
>>> petsc4py.PETSc.matsetvalues_csr
>>>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
>>> petsc4py.PETSc.matsetvalues_ijv

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   It appears to be crashing in kh_resize() in khash.h on a memory allocation 
failure when it tries to get additional memory for storing the matrix.

   This code seems to be only using the CPU memory so it should also fail in a 
similar way with 'aij'.   

  But the matrix is not large, so I don't think it should be running out of 
memory. I cannot reproduce the crash with the same parameters on my non-CUDA 
machine, so debugging will be tricky.

   Barry






> On Jan 18, 2024, at 3:35 PM, Barry Smith  wrote:
> 
> 
>Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' 
> to confirm it doesn't fail then?
> 
>
> 
>Barry
> 
> 
>> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna  wrote:
>> 
>> Dear Petsc users/developers,
>> 
>> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
>> mistake in how I set up a AIJCUSPARSE matrix.
>> For larger matrices, I sometimes encounter a error in assigning matrix 
>> values; the error is thrown in PetscHMapIJVQuerySet().
>> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
>> 
>> ```
>> from petsc4py import PETSc
>> from scipy.sparse import diags
>> import numpy as np
>> 
>> n = int(5e5); 
>> 
>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>> A.setType('aijcusparse')
>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
>> ### this is the line where the error is thrown.
>> A.assemble()
>> ```
>> 
>> The error trace is below:
>> ```
>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
>> petsc4py.PETSc.matsetvalues_csr
>>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
>> petsc4py.PETSc.matsetvalues_ijv
>> petsc4py.PETSc.Error: error code 76
>> [0] MatSetValues() at 
>> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
>> [0] MatSetValues_Seq_Hash() at 
>> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
>> [0] PetscHMapIJVQuerySet() at 
>> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
>> [0] Error in external library
>> [0] [khash] Assertion: `ret >= 0' failed.
>> ```
>> 
>> If I run the same script a handful of times, it will run without errors 
>> eventually.
>> Does anyone have insight on why it is behaving this way? I'm running on a 
>> node with 3x NVIDIA A100 PCIE 40GB.
>> 
>> Thank you!
>> Anna
> 



Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' to 
confirm it doesn't fail then?

   

   Barry


> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna  wrote:
> 
> Dear Petsc users/developers,
> 
> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
> mistake in how I set up a AIJCUSPARSE matrix.
> For larger matrices, I sometimes encounter a error in assigning matrix 
> values; the error is thrown in PetscHMapIJVQuerySet().
> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
> 
> ```
> from petsc4py import PETSc
> from scipy.sparse import diags
> import numpy as np
> 
> n = int(5e5); 
> 
> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
> A.setType('aijcusparse')
> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
> ### this is the line where the error is thrown.
> A.assemble()
> ```
> 
> The error trace is below:
> ```
> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
> petsc4py.PETSc.matsetvalues_csr
>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
> petsc4py.PETSc.matsetvalues_ijv
> petsc4py.PETSc.Error: error code 76
> [0] MatSetValues() at 
> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
> [0] MatSetValues_Seq_Hash() at 
> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
> [0] PetscHMapIJVQuerySet() at 
> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
> [0] Error in external library
> [0] [khash] Assertion: `ret >= 0' failed.
> ```
> 
> If I run the same script a handful of times, it will run without errors 
> eventually.
> Does anyone have insight on why it is behaving this way? I'm running on a 
> node with 3x NVIDIA A100 PCIE 40GB.
> 
> Thank you!
> Anna



Re: [petsc-users] ScaLAPACK EPS error

2024-01-18 Thread Barry Smith

   Looks like you are using an older version of PETSc. Could you please switch 
to the latest, try again, and send the same information if that also fails.

  Barry


> On Jan 18, 2024, at 12:59 PM, Peder Jørgensgaard Olesen via petsc-users 
>  wrote:
> 
> Hello,
> 
> I need to determine the full set of eigenpairs to a rather large (N=16,000) 
> dense Hermitian matrix. I've managed to do this using SLEPc's standard 
> Krylov-Schur EPS, but I think it could be done more efficiently using 
> ScaLAPACK. I receive the following error when attempting this. As I 
> understand it, descinit is used to initialize an array, and the variable in 
> question designates the leading dimension of the array, for which it seems an 
> illegal value is somehow passed.
> 
> I know ScaLAPACK is an external package, but it seems as if the error would 
> be in the call from SLEPc. Any ideas as to what could cause this?
> 
> Thanks,
> Peder
> 
> Error message (excerpt):
> 
> PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032
> PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250
> PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47
> PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323
> PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134
> PETSC ERROR: -- Error message --
> PETSC ERROR: Error in external library
> PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9
> (...)
> 
> Log file (excerpt):
> {  357,0}:  On entry to DESCINIT parameter number   9 had an illegal value
> [and a few hundred lines similar to this]



Re: [petsc-users] undefined reference to `petsc_allreduce_ct_th'

2024-01-18 Thread Barry Smith

   The PETSc petsclog.h (included by petscsys.h) uses C macro magic to log 
calls to MPI routines. This is how the symbol is getting into your code. 
Normally, if you use PetscInitialize() and link to the PETSc library, the 
symbol would get resolved.

   If that part of the code does not need PETSc at all, you can skip including 
petscsys.h and instead include mpi.h; otherwise you need to track down why, 
when your code gets linked against the PETSc libraries, that symbol is not 
resolved.

  Barry


> On Jan 18, 2024, at 11:55 AM, Aaron Scheinberg  wrote:
> 
> Hello,
> 
> I'm getting this error when linking:
> 
> undefined reference to `petsc_allreduce_ct_th'
> 
> The instances are regular MPI_Allreduces in my code that are not located in 
> parts of the code related to PETSc, so I'm wondering what is happening to 
> involve PETSc here? Can I configure it to avoid that? I consulted google, the 
> FAQ and skimmed other documentation but didn't see anything. Thanks!
> 
> Aaron



Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code

2024-01-12 Thread Barry Smith

PETSc vectors contain inside themselves an array with the numerical values. 
VecGetArrayF90() exposes this array to Fortran so you may access the values in 
that array. So VecGetArrayF90()  does not create a new array, it gives you 
temporary access to an already existing array inside the vector.

  Barry
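For intuition, a small pure-Python analogue (array/memoryview here merely stand 
in for the Vec's internal storage and VecGetArrayF90): the "gotten" array is a 
view of storage the vector already owns, so writes through it change the vector 
and no copy is made.

```python
import array

vec_storage = array.array('d', [1.0, 2.0, 3.0])  # the Vec's internal array
view = memoryview(vec_storage)                    # analogue of VecGetArrayF90
view[1] = 42.0                                    # write through the view
assert vec_storage[1] == 42.0                     # the vector itself changed
assert view.obj is vec_storage                    # no new array was created
```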




> On Jan 11, 2024, at 11:49 PM, Shatanawi, Sawsan Muhammad 
>  wrote:
> 
> Hello,
> 
> Thank you all for your help.
> 
> I have changed VecGetArray to VecGetArrayF90, and the location of the destroy 
> call. But I want to make sure: does VecGetArrayF90 make a new array 
> (vector) that I can use in the rest of my Fortran code?
> 
> when I run it and debugged it, I got 
> 
>   5.200E-03
>50.0
>10.0
>   0.000E+00
> PETSC: Attaching gdb to 
> /weka/data/lab/richey/sawsan/GW_CODE/code2024/SS_GWM/./GW.exe of pid 33065 on 
> display :0.0 on machine sn16
> Unable to start debugger in xterm: No such file or directory
>   0.000E+00
> Attempting to use an MPI routine after finalizing MPICH
> srun: error: sn16: task 0: Exited with exit code 1
> [sawsan.shatanawi@login-p2n02 SS_GWM]$ gdb ./GW/exe
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> ./GW/exe: No such file or directory.
> (gdb) run
> Starting program:
> No executable file specified.
> Use the "file" or "exec-file" command.
> (gdb) bt
> No stack.
> (gdb)
> 
> If the highlighted line is the error, I don't know why, when I run gdb, it 
> does not show me the location of the error.
> The code : sshatanawi/SS_GWM (github.com) 
> <https://github.com/sshatanawi/SS_GWM> 
> 
> I really appreciate your helps
> 
> Sawsan
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Sent: Wednesday, January 10, 2024 5:35 PM
> To: Junchao Zhang mailto:junchao.zh...@gmail.com>>
> Cc: Shatanawi, Sawsan Muhammad  <mailto:sawsan.shatan...@wsu.edu>>; Mark Adams  <mailto:mfad...@lbl.gov>>; petsc-users@mcs.anl.gov 
> <mailto:petsc-users@mcs.anl.gov>  <mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran 
> Groundwater Flow Simulation Code
>  
> [EXTERNAL EMAIL]
> 
> 
>> On Jan 10, 2024, at 6:49 PM, Junchao Zhang > <mailto:junchao.zh...@gmail.com>> wrote:
>> 
>> Hi, Sawsan,
>>  I could build your code and I also could gdb it.
>> 
>> $ gdb ./GW.exe
>> ...
>> $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault.
>> 0x71e6d44f in vecgetarray_ (x=0x7fffa718, fa=0x0, 
>> ia=0x7fffa75c, ierr=0x0) at 
>> /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257
>> 257   *ierr = VecGetArray(*x, );
>> (gdb) bt
>> #0  0x71e6d44f in vecgetarray_ (x=0x7fffa718, fa=0x0, 
>> ia=0x7fffa75c, ierr=0x0) at 
>> /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257
>> #1  0x0040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at 
>> GW_solver_try.F90:169
>> #2  0x0040c6a8 in test_gw () at test_main.F90:35
>>  
>> ierr=0x0  caused the segfault.  See 
>> https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray 
>> <https://urldefense.com/v3/__https://petsc.org/release/manualpages/Vec/VecGetArray/*vecgetarray__;Iw!!JmPEgBY0HMszNaDT!tqBApprMfYxwNz4Zvnk8coNE5AeWjA9wSdAM7QJcIIVP1z0VDsVIalo4Sew2b0fW3bZtTAbPh-h0MUsZ9Km12jA$>,
>>  you should use VecGetArrayF90 instead.
>> 
>> BTW,  Barry,  the code 
>> https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 
>> <https://urldefense.com/v3/__https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90*L169__;Iw!!JmPEgBY0HMszNaDT!tqBApprMfYxwNz4Zvnk8coNE5AeWjA9wSdAM7QJcIIVP1z0VDsVIalo4Sew2b0fW3bZtTAbPh-h0MUsZh2eAi4o$>
>>  has "call VecGetArray(temp_solution, H_vector, ierr)".I don't find 
>> petsc Fortran examples doing VecGetArray.  Do we still support it?
> 
> This is not the correct calling sequence for VecGetArray() from Fortran. 
> 
> Regardless, definitely should not be writing any new code that uses 
> VecGetArray() from Fortran. Should use VecGetArrayF90().
> 

Re: [petsc-users] KSP number of iterations different from size of residual array

2024-01-11 Thread Barry Smith

   Kevin

   A couple of different things are at play here producing the unexpected 
results.

   I have created a merge request 
https://gitlab.com/petsc/petsc/-/merge_requests/7179 clarifying in the docs why 
the results obtained from KSPGetResidualHistory() and KSPGetIterationNumber() 
can differ.

   I also fixed a couple of locations of KSPLogResidual() (in gmres and fgmres) 
that resulted in extra incorrect logging of the history.

   In summary, with the "standard" textbook Krylov methods, one expects numIts 
= nEntries - 1, but this need not be the case for advanced Krylov methods (like 
those with inner iterations or pipelining) or under exceptional circumstances 
like the use of CG in trust region methods.

   Barry
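
A minimal sketch (not from the thread; `ksp`, `b`, and `x` are assumed to be an
already-configured KSP and Vecs) of how the two counts can be compared after a
solve:

```c
/* Sketch: compare the KSP iteration count with the residual-history length.
 * Assumes ksp, b, x are already set up elsewhere. */
const PetscReal *hist;
PetscInt         numIts, nEntries;

/* reset = PETSC_TRUE so the history is cleared on each new KSPSolve() */
PetscCall(KSPSetResidualHistory(ksp, NULL, PETSC_DECIDE, PETSC_TRUE));
PetscCall(KSPSolve(ksp, b, x));
PetscCall(KSPGetIterationNumber(ksp, &numIts));
PetscCall(KSPGetResidualHistory(ksp, &hist, &nEntries));
/* For textbook Krylov methods expect nEntries == numIts + 1, since the
 * initial residual is logged before the first iteration; advanced methods
 * (inner iterations, pipelining) may deviate, as Barry notes above. */
```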



> On Jan 11, 2024, at 1:20 PM, Kevin G. Wang  wrote:
> 
> Hi Barry,
> 
> Thanks for your help!!
> 
> I have checked that in KSPSetResidualHistory, "reset" is set to PETSC_TRUE. I 
> did a few quick tests after reading your message. There seems to be some 
> patterns between "numIts" (given by KSPGetIterationNumber) and "nEntries" 
> (given by KSPGetResidualHistory):
> 
> 1. With gmres or fgmres as the solver:
>   - If the number of iterations (until error tolerance is met) is small, like 
> 20 - 30, indeed as you said, numIts = nEntries - 1.
>   - if the number of iterations is large, this is no longer true. I have a 
> case where nEntries = 372, numIts = 360.
> 2. With bcgsl, it looks like numIts = 2*(nEntries - 1).
> 3. With ibcgs, nEntries = 0, while numIts is nonzero.
> 
> In all these tests, I have set the preconditioner to "none".
> 
> My code (where the KSP functions are called) is here: 
> https://github.com/kevinwgy/m2c/blob/main/LinearSystemSolver.cpp
> 
> I am using PETSc 3.12.4.
> 
> Thanks!
> Kevin
> 
> 
> On Thu, Jan 11, 2024 at 12:26 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>Trying again.
>> 
>> Normally, numIts would be one less than nEntries since the initial 
>> residual is computed (and stored in the history) before any iterations.
>> 
>> Is this what you are seeing or are you seeing other values for the two?
>> 
>> I've started a run of the PETSc test suite that compares the two values 
>> for inconsistencies for all tests to see if I can find any problems.
>> 
>> Barry
>> 
>> Also note the importance of the reset value in KSPSetResidualHistory() 
>> which means the values will not match when reset is PETSC_FALSE.
>> 
>>> On Jan 10, 2024, at 7:09 PM, Kevin G. Wang <kevi...@vt.edu> wrote:
>>> 
>>> Hello everyone!
>>> 
>>> I am writing a code that uses PETSc/KSP to solve linear systems. I just 
>>> realized that after running "KSPSolve(...)", the number of iterations given 
>>> by 
>>> 
>>> KSPGetIterationNumber(ksp, &numIts)
>>> 
>>> is *different* from the size of the residual history given by
>>> 
>>> KSPGetResidualHistory(ksp, NULL, &nEntries);
>>> 
>>> That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in 
>>> my code? (I thought they should be the same...)
>>> 
>>> I have tried several pairs of solvers and preconditioners (e.g., fgmres & 
>>> bjacobi, ibcgs & bjacobi). This issue happens to all of them.
>>> 
>>> Thanks!
>>> Kevin
>>> 
>>> --
>>> Kevin G. Wang, Ph.D.
>>> Associate Professor
>>> Kevin T. Crofton Department of Aerospace and Ocean Engineering
>>> Virginia Tech
>>> 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061
>>> Office: (540) 231-7547  |  Mobile: (650) 862-2663 
>>> URL: https://www.aoe.vt.edu/people/faculty/wang.html 
>>> Codes: https://github.com/kevinwgy
>> 
> 
> 
> --
> Kevin G. Wang, Ph.D.
> Associate Professor
> Kevin T. Crofton Department of Aerospace and Ocean Engineering
> Virginia Tech
> 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061
> Office: (540) 231-7547  |  Mobile: (650) 862-2663 
> URL: https://www.aoe.vt.edu/people/faculty/wang.html 
> Codes: https://github.com/kevinwgy



Re: [petsc-users] KSP number of iterations different from size of residual array

2024-01-11 Thread Barry Smith

   Trying again.

Normally, numIts would be one less than nEntries since the initial residual 
is computed (and stored in the history) before any iterations.

Is this what you are seeing or are you seeing other values for the two?

I've started a run of the PETSc test suite that compares the two values for 
inconsistencies for all tests to see if I can find any problems.

Barry

Also note the importance of the reset value in KSPSetResidualHistory() 
which means the values will not match when reset is PETSC_FALSE.

> On Jan 10, 2024, at 7:09 PM, Kevin G. Wang  wrote:
> 
> Hello everyone!
> 
> I am writing a code that uses PETSc/KSP to solve linear systems. I just 
> realized that after running "KSPSolve(...)", the number of iterations given 
> by 
> 
> KSPGetIterationNumber(ksp, &numIts)
> 
> is *different* from the size of the residual history given by
> 
> KSPGetResidualHistory(ksp, NULL, &nEntries);
> 
> That is, "numIts" is not equal to "nEntries". Is this expected, or a bug in 
> my code? (I thought they should be the same...)
> 
> I have tried several pairs of solvers and preconditioners (e.g., fgmres & 
> bjacobi, ibcgs & bjacobi). This issue happens to all of them.
> 
> Thanks!
> Kevin
> 
> --
> Kevin G. Wang, Ph.D.
> Associate Professor
> Kevin T. Crofton Department of Aerospace and Ocean Engineering
> Virginia Tech
> 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061
> Office: (540) 231-7547  |  Mobile: (650) 862-2663 
> URL: https://www.aoe.vt.edu/people/faculty/wang.html 
> Codes: https://github.com/kevinwgy



Re: [petsc-users] Parallel processes run significantly slower

2024-01-11 Thread Barry Smith

   Take a look at the discussion in 
https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html
 and I suggest you run the streams benchmark from the branch 
barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for what kind 
of speedup you can expect.  

Then let us know your thoughts.

   Barry



> On Jan 11, 2024, at 11:37 AM, Stefano Zampini  
> wrote:
> 
> You are creating the matrix on the wrong communicator if you want it 
> parallel. You are using PETSc.COMM_SELF
> 
> On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen 
> <swilk...@itp.uni-bremen.de> wrote:
>> Hi all,
>> 
>> I'm trying to do repeated matrix-vector-multiplication of large sparse 
>> matrices in python using petsc4py. Even the most simple method of 
>> parallelization, dividing up the calculation to run on multiple processes 
>> independently, does not seem to give a significant speed up for large 
>> matrices. I constructed a minimal working example, which I run using
>> 
>> mpiexec -n N python parallel_example.py,
>> 
>> where N is the number of processes. Instead of taking approximately the same 
>> time irrespective of the number of processes used, the calculation is much 
>> slower when starting more MPI processes. This translates to little to no 
>> speed up when splitting up a fixed number of calculations over N processes. 
>> As an example, running with N=1 takes 9s, while running with N=4 takes 34s. 
>> When running with smaller matrices, the problem is not as severe (only 
>> slower by a factor of 1.5 when setting MATSIZE=1e+5 instead of 
>> MATSIZE=1e+6). I get the same problems when just starting the script four 
>> times manually without using MPI.
>> I attached both the script and the log file for running the script with N=4. 
>> Any help would be greatly appreciated. Calculations are done on my laptop, 
>> arch linux version 6.6.8 and PETSc version 3.20.2.
>> 
>> Kind Regards
>> Steffen
>> 
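
Stefano's point about the communicator can be sketched in C (petsc4py's
PETSc.COMM_SELF corresponds to PETSC_COMM_SELF; `n` is an assumed global size):

```c
/* Sketch: a matrix intended for parallel MatMult must be created on a
 * parallel communicator.  With PETSC_COMM_SELF every MPI rank builds its
 * own private full copy, so adding ranks cannot speed the multiply up. */
Mat A;
PetscCall(MatCreate(PETSC_COMM_WORLD, &A));   /* not PETSC_COMM_SELF */
PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
PetscCall(MatSetFromOptions(A));
PetscCall(MatSetUp(A));
/* ... assemble, then MatMult(A, x, y) is distributed across ranks ... */
```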



Re: [petsc-users] SNES seems not use my matrix-free operation

2024-01-11 Thread Barry Smith

   The following assumes you are not using the shell matrix context for some 
other purpose

> subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc)
>   
>   SNES :: snes
>   Vec  :: F_global
>  
>   ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: &
>   !   F
>   Mat  :: Jac, Jac_pre
>   PetscObject  :: dummy
>   PetscErrorCode   :: err_PETSc
>  
>   print*, '@@ start build my jac'
>   
>   PetscCall(MatShellSetContext(Jac,F_global,ierr))   ! record the current 
> base vector where the Jacobian is to be applied
>   print*, '@@ end build my jac'
>  
> end subroutine formJacobian

subroutine Gk_op 
...
   Vec base
   PetscCall(MatShellGetContext(Jac,base,ierr))

   ! use base in the computation of your matrix-free Jacobian vector product

end subroutine Gk_op
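
The same context pattern, sketched in C with hypothetical names (FormJacobian
and GKOp are illustrative, not PETSc API):

```c
/* Sketch: the SNESSetJacobian callback records the base vector u in the
 * shell matrix context; the matrix-free multiply reads it back, so the
 * product is applied at the right linearization point J(u)*x. */
static PetscErrorCode FormJacobian(SNES snes, Vec u, Mat J, Mat Jpre, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(MatShellSetContext(J, (void *)u)); /* where J(u) is evaluated */
  PetscFunctionReturn(PETSC_SUCCESS);
}

static PetscErrorCode GKOp(Mat J, Vec x, Vec y)
{
  Vec base;
  PetscFunctionBeginUser;
  PetscCall(MatShellGetContext(J, &base));
  /* ... use base to compute y = J(base) * x matrix-free ... */
  PetscFunctionReturn(PETSC_SUCCESS);
}
```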




> On Jan 11, 2024, at 5:55 AM, Yi Hu  wrote:
> 
> Now I understand a bit more about the workflow of setting the Jacobian. It seems that 
> the SNES can be really fine-grained. As you point out, J is built via 
> formJacobian() callback, and can be based on previous solution (or the base 
> vector u, as you mentioned). And then KSP can use a customized MATOP_MULT to 
> solve the linear equations J(u)*x=rhs. 
>  
> So I followed your idea about removing DMSNESSetJacobianLocal() and did the 
> following.
>  
> ……
>   call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,&
>   
> 9*product(cells(1:2))*cells3,9*product(cells(1:2))*cells3,&
>   0,Jac_PETSc,err_PETSc)
>   CHKERRQ(err_PETSc)
>   call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc)
>   CHKERRQ(err_PETSc)
>   call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,formJacobian,0,err_PETSc)
>   CHKERRQ(err_PETSc)
> ……
>  
> And my formJacobian() is 
>  
> subroutine formJacobian(snes,F_global,Jac,Jac_pre,dummy,err_PETSc)
>   
>   SNES :: snes
>   Vec  :: F_global
>  
>   ! real(pREAL), dimension(3,3,cells(1),cells(2),cells3), intent(in) :: &
>   !   F
>   Mat  :: Jac, Jac_pre
>   PetscObject  :: dummy
>   PetscErrorCode   :: err_PETSc
>  
>   print*, '@@ start build my jac'
>   
>   call MatCopy(Jac_PETSc,Jac,SAME_NONZERO_PATTERN,err_PETSc)
>   CHKERRQ(err_PETSc)
>   call MatCopy(Jac_PETSc,Jac_pre,SAME_NONZERO_PATTERN,err_PETSc)
>   CHKERRQ(err_PETSc)
>   ! Jac = Jac_PETSc
>   ! Jac_pre = Jac_PETSc
>  
>   print*, '@@ end build my jac'
>  
> end subroutine formJacobian
>  
> it turns out that, whether by a simple assignment or MatCopy(), the compiled 
> program gives me the same error as before. So I guess the real jacobian is 
> still not set. I wonder how to get around this and let this built jac in 
> formJacobian() to be the same as my shell matrix.
>  
> Yi
>  
> From: Barry Smith <bsm...@petsc.dev>
> Sent: Wednesday, January 10, 2024 4:27 PM
> To: Yi Hu <y...@mpie.de>
> Cc: petsc-users <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] SNES seems not use my matrix-free operation
>  
>  
>   By default if SNESSetJacobian() is not called with a function pointer PETSc 
> attempts to compute the Jacobian matrix explicitly with finite differences 
> and coloring. This doesn't make sense with a shell matrix. Hence the error 
> message below regarding MatFDColoringCreate().
>  
>   DMSNESSetJacobianLocal() calls SNESSetJacobian() with a function pointer of 
> SNESComputeJacobian_DMLocal() so preventing the error from triggering in your 
> code.
>  
>   You can provide your own function to SNESSetJacobian() and thus not need to 
> call DMSNESSetJacobianLocal(). What you do depends on how you want to record 
> the "base" vector that tells your matrix-free multiply routine where the 
> Jacobian matrix vector product is being applied, that is J(u)*x. u is the 
> "base" vector which is passed to the function provided with SNESSetJacobian().
>  
>Barry
>  
> 
> 
> On Jan 10, 2024, at 6:20 AM, Yi Hu <y...@mpie.de> wrote:
>  
> Thanks for the clarification. It is more clear to me now about the global to 
> local processes after checking the examples, e.g. 
> ksp/ksp/tutorials/ex14f.F90. 
>  
> And for using Vec locally, I followed your advice of VecGet.. and VecRestore… 
> In fact I used DMDAVecGetArrayReadF90() and some other relevant subroutines. 
>  
> For your comment on DMSNESSetJacobianLocal(). It seems that I need to use 
> both SNESSetJacobian() and then DMSNESSetJacobianLocal() to get things 
> working. When I do only SNESSetJacobian(), it d

Re: [petsc-users] Help with Integrating PETSc into Fortran Groundwater Flow Simulation Code

2024-01-10 Thread Barry Smith


> On Jan 10, 2024, at 6:49 PM, Junchao Zhang  wrote:
> 
> Hi, Sawsan,
>  I could build your code and I also could gdb it.
> 
> $ gdb ./GW.exe
> ...
> $ Thread 1 "GW.exe" received signal SIGSEGV, Segmentation fault.
> 0x71e6d44f in vecgetarray_ (x=0x7fffa718, fa=0x0, 
> ia=0x7fffa75c, ierr=0x0) at 
> /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257
> 257   *ierr = VecGetArray(*x, );
> (gdb) bt
> #0  0x71e6d44f in vecgetarray_ (x=0x7fffa718, fa=0x0, 
> ia=0x7fffa75c, ierr=0x0) at 
> /scratch/jczhang/petsc/src/vec/vec/interface/ftn-custom/zvectorf.c:257
> #1  0x0040b6e3 in gw_solver (t_s=1.40129846e-45, n=300) at 
> GW_solver_try.F90:169
> #2  0x0040c6a8 in test_gw () at test_main.F90:35
>  
> ierr=0x0  caused the segfault.  See 
> https://petsc.org/release/manualpages/Vec/VecGetArray/#vecgetarray, you 
> should use VecGetArrayF90 instead.
> 
> BTW,  Barry,  the code 
> https://github.com/sshatanawi/SS_GWM/blob/main/GW_solver_try.F90#L169 has 
> "call VecGetArray(temp_solution, H_vector, ierr)".I don't find petsc 
> Fortran examples doing VecGetArray.  Do we still support it?

This is not the correct calling sequence for VecGetArray() from Fortran. 

Regardless, definitely should not be writing any new code that uses 
VecGetArray() from Fortran. Should use VecGetArrayF90().

> 
> --Junchao Zhang
> 
> 
> On Wed, Jan 10, 2024 at 2:38 PM Shatanawi, Sawsan Muhammad via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
>> Hello all,
>> 
>> I hope you are doing well.
>> 
>> Generally, I use gdb  to debug the code.
>>  I got the attached error message.
>> 
>> I have tried to add the flag -start_in_debugger in the make file, but it 
>> didn't work, so it seems I was doing it in the wrong way
>> 
>> This is the link for the whole code: https://github.com/sshatanawi/SS_GWM
>> 
>> 
>> You can read the description of the code in " Model Desprciption.pdf"
>> the compiling file is makefile_f90 where you can find the linked code files
>> 
>> I really appreciate your help
>> 
>> Bests,
>> Sawsan
>> From: Mark Adams <mfad...@lbl.gov>
>> Sent: Friday, January 5, 2024 4:53 AM
>> To: Shatanawi, Sawsan Muhammad <sawsan.shatan...@wsu.edu>
>> Cc: Matthew Knepley <knep...@gmail.com>; petsc-users@mcs.anl.gov
>> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran 
>> Groundwater Flow Simulation Code
>>  
>> [EXTERNAL EMAIL]
>> 
>> This is a segv. As Matt said, you need to use a debugger for this or add 
>> print statements to narrow down the place where this happens.
>> 
>> You will need to learn how to use debuggers to do your project so you might 
>> as well start now.
>> 
>> If you have a machine with a GUI debugger that is easier but command line 
>> debuggers are good to learn anyway.
>> 
>> I tend to run debuggers directly (eg, lldb ./a.out -- program-args ...) and 
>> use a GUI debugger (eg, Totalview or DDT) if available.
>> 
>> Mark
>> 
>> 
>> On Wed, Dec 20, 2023 at 10:02 PM Shatanawi, Sawsan Muhammad via petsc-users 
>> <petsc-users@mcs.anl.gov> wrote:
>> Hello Matthew,
>> 
>> Thank you for your help. I am sorry that I keep coming back with my error 
>> messages, but I reached a point that I don't know how to fix them, and I 
>> don't understand them easily.
>> The list of errors is getting shorter, now I am getting the attached error 
>> messages 
>> 
>> Thank you again,
>> 
>> Sawsan
>> From: Matthew Knepley <knep...@gmail.com>
>> Sent: Wednesday, December 20, 2023 6:54 PM
>> To: Shatanawi, Sawsan Muhammad <sawsan.shatan...@wsu.edu>
>> Cc: Barry Smith <bsm...@petsc.dev>; petsc-users@mcs.anl.gov
>> Subject: Re: [petsc-users] Help with Integrating PETSc into Fortran 
>> Groundwater Flow Simulation Code
>>  
>> [EXTERNAL EMAIL]
>> 
>> On Wed, Dec 20, 2023 at 9:49 PM Sh
