Re: [petsc-users] GAMG and Hypre preconditioner

2023-06-27 Thread Zisheng Ye via petsc-users
Hi Jed

Thanks for your reply. I have sent the log files to petsc-ma...@mcs.anl.gov.

Zisheng

From: Jed Brown 
Sent: Tuesday, June 27, 2023 1:02 PM
To: Zisheng Ye ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] GAMG and Hypre preconditioner

[External Sender]

Zisheng Ye via petsc-users  writes:

> Dear PETSc Team
>
> We are testing the GPU support in PETSc's KSPSolve, especially for the GAMG 
> and Hypre preconditioners. We have encountered several issues and would 
> like to ask for your suggestions.
>
> First, we have a couple of questions when working with a single MPI rank:
>
>   1.  We have tested two backends, CUDA and Kokkos. One commonly encountered 
> error is related to SpGEMM in CUDA when the matrix is large, as listed below:
>
> cudaMalloc((void **), bufferSize2) error( cudaErrorMemoryAllocation): 
> out of memory
>
> For the CUDA backend, one can use "-matmatmult_backend_cpu -matptap_backend_cpu" 
> to avoid these problems. However, there seem to be no equivalent options for the 
> Kokkos backend. Is there a good practice for avoiding this error with both 
> backends, and can it be avoided with the Kokkos backend?

Junchao will know more about KK tuning, but the faster GPU matrix-matrix 
algorithms use extra memory. We should be able to make the host option 
available with Kokkos.
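
For reference, a minimal sketch of a run that combines these options on the
CUDA backend (the executable name ./app is illustrative; the options themselves
are the ones quoted above):

  ./app -pc_type gamg -dm_mat_type aijcusparse -dm_vec_type cuda \
    -matmatmult_backend_cpu -matptap_backend_cpu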

>   2.  We have tested the combination of Hypre and the Kokkos backend. It looks 
> like the two are not compatible with each other: we observed that 
> KSPSolve takes a greater number of iterations to exit, and the residual norm 
> in the post-check is much larger than the one obtained with the 
> CUDA backend. This happens for matrices with block size larger than 1. Is 
> there any explanation for this behavior?
>
> Second, we have a couple more questions when working with multiple MPI ranks:
>
>   1.  We are currently using OpenMPI, as we couldn't get Intel MPI to work as a 
> GPU-aware MPI. Is this a known issue with Intel MPI?

As far as I know, Intel's MPI is only for SYCL/Intel GPUs. In general, 
GPU-aware MPI has been incredibly flaky on all HPC systems despite being 
introduced ten years ago.

>   2.  With OpenMPI we currently see a slowdown when increasing the MPI rank 
> count, as shown in the figure below. Is this normal?

Could you share -log_view output from a couple of representative runs? You could 
send those here or to petsc-ma...@mcs.anl.gov. We need to see what kind of work 
is not scaling so we can attribute what may be causing it.
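
For example, a sketch of one representative run (executable name and rank count
illustrative), where "-log_view :gamg_np4.txt" writes the log to a file you can
attach:

  mpiexec -n 4 ./app -pc_type gamg -log_view :gamg_np4.txt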


Re: [petsc-users] GAMG failure

2023-03-28 Thread Mark Adams
On Tue, Mar 28, 2023 at 12:38 PM Blaise Bourdin  wrote:

>
>
> On Mar 27, 2023, at 9:11 PM, Mark Adams  wrote:
>
> Yes, the eigen estimates are converging slowly.
>
> BTW, have you tried hypre? It is a good solver (lots and lots more woman-years).
> These eigen estimates are conceptually simple, but they can lead to
> problems like this (hypre uses an eigen-estimate-free smoother).
>
> I just moved from petsc 3.3 to main, so my experience with an old version
> of hypre has not been very convincing. Strangely enough, ML has always been
> the most efficient PC for me.
>

ML is a good solver.


> Maybe it’s time to revisit.
> That said, I would really like to get decent performance out of gamg. One
> day, I’d like to be able to account for the special structure of
> phase-field fracture in the construction of the coarse space.
>
>
> But try this (good to have options anyway):
>
> -pc_gamg_esteig_ksp_max_it 20
>
> Chebyshev will scale the estimate that we give by, I think, 5% by default.
> Maybe 10%.
> You can set that with:
>
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05
>
> 0.2 is the scaling of the high eigen estimate for the low eigen value in
> Chebyshev.
>
>
>
> Jed’s suggestion of using -pc_gamg_reuse_interpolation 0 worked.
>

OK, I have to admit I am surprised.
But I guess with your fracture problem the matrix/physics/dynamics does change a lot.


> I am testing your options at the moment.
>

There are a lot of options and it is cumbersome, but they are finite and
good to know.
Glad it's working,


>
> Thanks a lot,
>
> Blaise
>
> —
> Canada Research Chair in Mathematical and Computational Aspects of Solid
> Mechanics (Tier 1)
> Professor, Department of Mathematics & Statistics
> Hamilton Hall room 409A, McMaster University
> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada
> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243
>
>


Re: [petsc-users] GAMG failure

2023-03-28 Thread Jed Brown
This suite has been good for my solid mechanics solvers. (It's written here as 
a coarse grid solver because we do matrix-free p-MG first, but you can use it 
directly.)

https://github.com/hypre-space/hypre/issues/601#issuecomment-1069426997
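
For reference, that options suite is:

-pc_type hypre -pc_hypre_boomeramg_coarsen_type pmis
-pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no_CF
-pc_hypre_boomeramg_P_max 6 -pc_hypre_boomeramg_relax_type_down Chebyshev
-pc_hypre_boomeramg_relax_type_up Chebyshev
-pc_hypre_boomeramg_strong_threshold 0.5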

Blaise Bourdin  writes:

>  On Mar 27, 2023, at 9:11 PM, Mark Adams  wrote:
>
>  Yes, the eigen estimates are converging slowly. 
>
>  BTW, have you tried hypre? It is a good solver (lots and lots more woman-years).
>  These eigen estimates are conceptually simple, but they can lead to problems
>  like this (hypre uses an eigen-estimate-free smoother).
>
> I just moved from petsc 3.3 to main, so my experience with an old version of
> hypre has not been very convincing. Strangely enough, ML has always been the
> most efficient PC for me. Maybe it’s time to revisit.
> That said, I would really like to get decent performance out of gamg. One
> day, I’d like to be able to account for the special structure
> of phase-field fracture in the construction of the coarse space.
>
>  But try this (good to have options anyway):
>
>  -pc_gamg_esteig_ksp_max_it 20
>
>  Chebyshev will scale the estimate that we give by, I think, 5% by default.
>  Maybe 10%.
>  You can set that with:
>
>  -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05
>
>  0.2 is the scaling of the high eigen estimate for the low eigen value in 
> Chebyshev.
>
> Jed’s suggestion of using -pc_gamg_reuse_interpolation 0 worked. I am testing 
> your options at the moment.
>
> Thanks a lot,
>
> Blaise
>
> — 
> Canada Research Chair in Mathematical and Computational Aspects of Solid 
> Mechanics (Tier 1)
> Professor, Department of Mathematics & Statistics
> Hamilton Hall room 409A, McMaster University
> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada 
> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243


Re: [petsc-users] GAMG failure

2023-03-28 Thread Blaise Bourdin

On Mar 27, 2023, at 9:11 PM, Mark Adams  wrote:

> Yes, the eigen estimates are converging slowly.
>
> BTW, have you tried hypre? It is a good solver (lots and lots more woman-years).
> These eigen estimates are conceptually simple, but they can lead to problems
> like this (hypre uses an eigen-estimate-free smoother).

I just moved from petsc 3.3 to main, so my experience with an old version of
hypre has not been very convincing. Strangely enough, ML has always been the
most efficient PC for me. Maybe it’s time to revisit.
That said, I would really like to get decent performance out of gamg. One day,
I’d like to be able to account for the special structure of phase-field
fracture in the construction of the coarse space.

> But try this (good to have options anyway):
>
> -pc_gamg_esteig_ksp_max_it 20
>
> Chebyshev will scale the estimate that we give by, I think, 5% by default.
> Maybe 10%.
> You can set that with:
>
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05
>
> 0.2 is the scaling of the high eigen estimate for the low eigen value in
> Chebyshev.

Jed’s suggestion of using -pc_gamg_reuse_interpolation 0 worked. I am testing
your options at the moment.

Thanks a lot,

Blaise

—
Canada Research Chair in Mathematical and Computational Aspects of Solid
Mechanics (Tier 1)
Professor, Department of Mathematics & Statistics
Hamilton Hall room 409A, McMaster University
1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada
https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243


Re: [petsc-users] GAMG failure

2023-03-27 Thread Mark Adams
Yes, the eigen estimates are converging slowly.

BTW, have you tried hypre? It is a good solver (lots and lots more woman-years).
These eigen estimates are conceptually simple, but they can lead to
problems like this (hypre uses an eigen-estimate-free smoother).

But try this (good to have options anyway):

-pc_gamg_esteig_ksp_max_it 20

Chebyshev will scale the estimate that we give by, I think, 5% by default.
Maybe 10%.
You can set that with:

-mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05

0.2 is the scaling of the high eigen estimate for the low eigen value in
Chebyshev.
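
Put together, a sketch of the option set suggested in this thread (including
the singular-value monitor from the message quoted below):

  -pc_gamg_esteig_ksp_max_it 20
  -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05
  -pc_gamg_esteig_ksp_monitor_singular_value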


On Mon, Mar 27, 2023 at 5:06 PM Blaise Bourdin  wrote:

>
>
> On Mar 24, 2023, at 3:21 PM, Mark Adams  wrote:
>
> * Do you set:
>
> PetscCall(MatSetOption(Amat, MAT_SPD, PETSC_TRUE));
>
> PetscCall(MatSetOption(Amat, MAT_SPD_ETERNAL, PETSC_TRUE));
>
>
> Yes
>
>
> Do that to get CG Eigen estimates. Outright failure is usually caused by a
> bad Eigen estimate.
> -pc_gamg_esteig_ksp_monitor_singular_value
> Will print out the estimates as it is iterating. You can look at that to
> check that the max has converged.
>
>
> I just did, and something is off:
> I do multiple calls to SNESSolve (staggered scheme for phase-field
> fracture), but only get information on the first solve (which is not the
> one failing, of course).
> Here is what I get:
> Residual norms for Displacement_pc_gamg_esteig_ solve.
>   0 KSP Residual norm 7.636421712860e+01 % max 1.e+00 min
> 1.e+00 max/min 1.e+00
>   1 KSP Residual norm 3.402024867977e+01 % max 1.114319928921e+00 min
> 1.114319928921e+00 max/min 1.e+00
>   2 KSP Residual norm 2.124815079671e+01 % max 1.501143586520e+00 min
> 5.739351119078e-01 max/min 2.615528402732e+00
>   3 KSP Residual norm 1.581785698912e+01 % max 1.644351137983e+00 min
> 3.263683482596e-01 max/min 5.038329074347e+00
>   4 KSP Residual norm 1.254871990315e+01 % max 1.714668863819e+00 min
> 2.044075812142e-01 max/min 8.388479789416e+00
>   5 KSP Residual norm 1.051198229090e+01 % max 1.760078533063e+00 min
> 1.409327403114e-01 max/min 1.248878386367e+01
>   6 KSP Residual norm 9.061658306086e+00 % max 1.792995287686e+00 min
> 1.023484740555e-01 max/min 1.751853463603e+01
>   7 KSP Residual norm 8.015529297567e+00 % max 1.821497535985e+00 min
> 7.818018001928e-02 max/min 2.329871248104e+01
>   8 KSP Residual norm 7.201063258957e+00 % max 1.855140071935e+00 min
> 6.178572472468e-02 max/min 3.002538337458e+01
>   9 KSP Residual norm 6.548491711695e+00 % max 1.903578294573e+00 min
> 5.008612895206e-02 max/min 3.800609738466e+01
>  10 KSP Residual norm 6.002109992255e+00 % max 1.961356890125e+00 min
> 4.130572033722e-02 max/min 4.748390475004e+01
>   Residual norms for Displacement_pc_gamg_esteig_ solve.
>   0 KSP Residual norm 2.373573910237e+02 % max 1.e+00 min
> 1.e+00 max/min 1.e+00
>   1 KSP Residual norm 8.845061415709e+01 % max 1.081192207576e+00 min
> 1.081192207576e+00 max/min 1.e+00
>   2 KSP Residual norm 5.607525485152e+01 % max 1.345947059840e+00 min
> 5.768825326129e-01 max/min 2.333138869267e+00
>   3 KSP Residual norm 4.123522550864e+01 % max 1.481153523075e+00 min
> 3.070603564913e-01 max/min 4.823655974348e+00
>   4 KSP Residual norm 3.345765664017e+01 % max 1.551374710727e+00 min
> 1.953487694959e-01 max/min 7.941563771968e+00
>   5 KSP Residual norm 2.859712984893e+01 % max 1.604588395452e+00 min
> 1.313871480574e-01 max/min 1.221267391199e+01
>   6 KSP Residual norm 2.525636054248e+01 % max 1.650487481750e+00 min
> 9.322735730688e-02 max/min 1.770389646804e+01
>   7 KSP Residual norm 2.270711391451e+01 % max 1.697243639599e+00 min
> 6.945419058256e-02 max/min 2.443687883140e+01
>   8 KSP Residual norm 2.074739485241e+01 % max 1.737293728907e+00 min
> 5.319942519758e-02 max/min 3.265624999621e+01
>   9 KSP Residual norm 1.912808268870e+01 % max 1.771708608618e+00 min
> 4.229776586667e-02 max/min 4.188657656771e+01
>  10 KSP Residual norm 1.787394414641e+01 % max 1.802834420843e+00 min
> 3.460455235448e-02 max/min 5.209818645753e+01
>   Residual norms for Displacement_pc_gamg_esteig_ solve.
>   0 KSP Residual norm 1.361990679391e+03 % max 1.e+00 min
> 1.e+00 max/min 1.e+00
>   1 KSP Residual norm 5.377188333825e+02 % max 1.086812916769e+00 min
> 1.086812916769e+00 max/min 1.e+00
>   2 KSP Residual norm 2.819790765047e+02 % max 1.474233179517e+00 min
> 6.475176340551e-01 max/min 2.276745994212e+00
>   3 KSP Residual norm 1.856720658591e+02 % max 1.646049713883e+00 min
> 4.391851040105e-01 max/min 3.747963441500e+00
>   4 KSP Residual norm 1.446507859917e+02 % max 1.760403013135e+00 min
> 2.972886103795e-01 max/min 5.921528614526e+00
>   5 KSP Residual norm 1.212491636433e+02 % max 1.839250080524e+00 min
> 1.921591413785e-01 max/min 9.571494061277e+00
>   6 KSP Residual norm 1.052783637696e+02 % max 1.887062042760e+00 min
> 1.275920366984e-01 max/min 1.478981048966e+01
>   7 KSP 

Re: [petsc-users] GAMG failure

2023-03-27 Thread Jed Brown
Try -pc_gamg_reuse_interpolation 0. I thought this was disabled by default, but 
I see pc_gamg->reuse_prol = PETSC_TRUE in the code.
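
If you prefer to set this in code rather than on the command line, a minimal
sketch (assuming it runs before the PC options are processed):

  PetscCall(PetscOptionsSetValue(NULL, "-pc_gamg_reuse_interpolation", "0"));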

Blaise Bourdin  writes:

>  On Mar 24, 2023, at 3:21 PM, Mark Adams  wrote:
>
>  * Do you set: 
>
>  PetscCall(MatSetOption(Amat, MAT_SPD, PETSC_TRUE));
>
>  PetscCall(MatSetOption(Amat, MAT_SPD_ETERNAL, PETSC_TRUE));
>
> Yes
>
>  Do that to get CG Eigen estimates. Outright failure is usually caused by a 
> bad Eigen estimate.
>  -pc_gamg_esteig_ksp_monitor_singular_value
>  Will print out the estimates as it is iterating. You can look at that to check 
> that the max has converged.
>
> I just did, and something is off:
> I do multiple calls to SNESSolve (staggered scheme for phase-field fracture), 
> but only get information on the first solve (which is
> not the one failing, of course)
> Here is what I get:
> Residual norms for Displacement_pc_gamg_esteig_ solve.
>   0 KSP Residual norm 7.636421712860e+01 % max 1.e+00 min 
> 1.e+00 max/min
> 1.e+00
>   1 KSP Residual norm 3.402024867977e+01 % max 1.114319928921e+00 min 
> 1.114319928921e+00 max/min
> 1.e+00
>   2 KSP Residual norm 2.124815079671e+01 % max 1.501143586520e+00 min 
> 5.739351119078e-01 max/min
> 2.615528402732e+00
>   3 KSP Residual norm 1.581785698912e+01 % max 1.644351137983e+00 min 
> 3.263683482596e-01 max/min
> 5.038329074347e+00
>   4 KSP Residual norm 1.254871990315e+01 % max 1.714668863819e+00 min 
> 2.044075812142e-01 max/min
> 8.388479789416e+00
>   5 KSP Residual norm 1.051198229090e+01 % max 1.760078533063e+00 min 
> 1.409327403114e-01 max/min
> 1.248878386367e+01
>   6 KSP Residual norm 9.061658306086e+00 % max 1.792995287686e+00 min 
> 1.023484740555e-01 max/min
> 1.751853463603e+01
>   7 KSP Residual norm 8.015529297567e+00 % max 1.821497535985e+00 min 
> 7.818018001928e-02 max/min
> 2.329871248104e+01
>   8 KSP Residual norm 7.201063258957e+00 % max 1.855140071935e+00 min 
> 6.178572472468e-02 max/min
> 3.002538337458e+01
>   9 KSP Residual norm 6.548491711695e+00 % max 1.903578294573e+00 min 
> 5.008612895206e-02 max/min
> 3.800609738466e+01
>  10 KSP Residual norm 6.002109992255e+00 % max 1.961356890125e+00 min 
> 4.130572033722e-02 max/min
> 4.748390475004e+01
>   Residual norms for Displacement_pc_gamg_esteig_ solve.
>   0 KSP Residual norm 2.373573910237e+02 % max 1.e+00 min 
> 1.e+00 max/min
> 1.e+00
>   1 KSP Residual norm 8.845061415709e+01 % max 1.081192207576e+00 min 
> 1.081192207576e+00 max/min
> 1.e+00
>   2 KSP Residual norm 5.607525485152e+01 % max 1.345947059840e+00 min 
> 5.768825326129e-01 max/min
> 2.333138869267e+00
>   3 KSP Residual norm 4.123522550864e+01 % max 1.481153523075e+00 min 
> 3.070603564913e-01 max/min
> 4.823655974348e+00
>   4 KSP Residual norm 3.345765664017e+01 % max 1.551374710727e+00 min 
> 1.953487694959e-01 max/min
> 7.941563771968e+00
>   5 KSP Residual norm 2.859712984893e+01 % max 1.604588395452e+00 min 
> 1.313871480574e-01 max/min
> 1.221267391199e+01
>   6 KSP Residual norm 2.525636054248e+01 % max 1.650487481750e+00 min 
> 9.322735730688e-02 max/min
> 1.770389646804e+01
>   7 KSP Residual norm 2.270711391451e+01 % max 1.697243639599e+00 min 
> 6.945419058256e-02 max/min
> 2.443687883140e+01
>   8 KSP Residual norm 2.074739485241e+01 % max 1.737293728907e+00 min 
> 5.319942519758e-02 max/min
> 3.265624999621e+01
>   9 KSP Residual norm 1.912808268870e+01 % max 1.771708608618e+00 min 
> 4.229776586667e-02 max/min
> 4.188657656771e+01
>  10 KSP Residual norm 1.787394414641e+01 % max 1.802834420843e+00 min 
> 3.460455235448e-02 max/min
> 5.209818645753e+01
>   Residual norms for Displacement_pc_gamg_esteig_ solve.
>   0 KSP Residual norm 1.361990679391e+03 % max 1.e+00 min 
> 1.e+00 max/min
> 1.e+00
>   1 KSP Residual norm 5.377188333825e+02 % max 1.086812916769e+00 min 
> 1.086812916769e+00 max/min
> 1.e+00
>   2 KSP Residual norm 2.819790765047e+02 % max 1.474233179517e+00 min 
> 6.475176340551e-01 max/min
> 2.276745994212e+00
>   3 KSP Residual norm 1.856720658591e+02 % max 1.646049713883e+00 min 
> 4.391851040105e-01 max/min
> 3.747963441500e+00
>   4 KSP Residual norm 1.446507859917e+02 % max 1.760403013135e+00 min 
> 2.972886103795e-01 max/min
> 5.921528614526e+00
>   5 KSP Residual norm 1.212491636433e+02 % max 1.839250080524e+00 min 
> 1.921591413785e-01 max/min
> 9.571494061277e+00
>   6 KSP Residual norm 1.052783637696e+02 % max 1.887062042760e+00 min 
> 1.275920366984e-01 max/min
> 1.478981048966e+01
>   7 KSP Residual norm 9.230292625762e+01 % max 1.917891358356e+00 min 
> 8.853577120467e-02 max/min
> 2.166233300122e+01
>   8 KSP Residual norm 8.262607594297e+01 % max 1.935857204308e+00 min 
> 6.706949937710e-02 max/min
> 2.886345093206e+01
>   9 KSP Residual norm 7.616474911000e+01 % max 1.946323901431e+00 min 
> 5.354310733090e-02 max/min
> 3.635059671458e+01
>  10 KSP Residual norm 

Re: [petsc-users] GAMG failure

2023-03-27 Thread Blaise Bourdin

On Mar 24, 2023, at 3:21 PM, Mark Adams  wrote:

> * Do you set:
>
>     PetscCall(MatSetOption(Amat, MAT_SPD, PETSC_TRUE));
>     PetscCall(MatSetOption(Amat, MAT_SPD_ETERNAL, PETSC_TRUE));

Yes

> Do that to get CG Eigen estimates. Outright failure is usually caused by a bad
> Eigen estimate.
> -pc_gamg_esteig_ksp_monitor_singular_value
> Will print out the estimates as it is iterating. You can look at that to check
> that the max has converged.

I just did, and something is off:
I do multiple calls to SNESSolve (staggered scheme for phase-field fracture),
but only get information on the first solve (which is not the one failing, of
course).
Here is what I get:

Residual norms for Displacement_pc_gamg_esteig_ solve.
  0 KSP Residual norm 7.636421712860e+01 % max 1.e+00 min 1.e+00 max/min 1.e+00
  1 KSP Residual norm 3.402024867977e+01 % max 1.114319928921e+00 min 1.114319928921e+00 max/min 1.e+00
  2 KSP Residual norm 2.124815079671e+01 % max 1.501143586520e+00 min 5.739351119078e-01 max/min 2.615528402732e+00
  3 KSP Residual norm 1.581785698912e+01 % max 1.644351137983e+00 min 3.263683482596e-01 max/min 5.038329074347e+00
  4 KSP Residual norm 1.254871990315e+01 % max 1.714668863819e+00 min 2.044075812142e-01 max/min 8.388479789416e+00
  5 KSP Residual norm 1.051198229090e+01 % max 1.760078533063e+00 min 1.409327403114e-01 max/min 1.248878386367e+01
  6 KSP Residual norm 9.061658306086e+00 % max 1.792995287686e+00 min 1.023484740555e-01 max/min 1.751853463603e+01
  7 KSP Residual norm 8.015529297567e+00 % max 1.821497535985e+00 min 7.818018001928e-02 max/min 2.329871248104e+01
  8 KSP Residual norm 7.201063258957e+00 % max 1.855140071935e+00 min 6.178572472468e-02 max/min 3.002538337458e+01
  9 KSP Residual norm 6.548491711695e+00 % max 1.903578294573e+00 min 5.008612895206e-02 max/min 3.800609738466e+01
 10 KSP Residual norm 6.002109992255e+00 % max 1.961356890125e+00 min 4.130572033722e-02 max/min 4.748390475004e+01
  Residual norms for Displacement_pc_gamg_esteig_ solve.
  0 KSP Residual norm 2.373573910237e+02 % max 1.e+00 min 1.e+00 max/min 1.e+00
  1 KSP Residual norm 8.845061415709e+01 % max 1.081192207576e+00 min 1.081192207576e+00 max/min 1.e+00
  2 KSP Residual norm 5.607525485152e+01 % max 1.345947059840e+00 min 5.768825326129e-01 max/min 2.333138869267e+00
  3 KSP Residual norm 4.123522550864e+01 % max 1.481153523075e+00 min 3.070603564913e-01 max/min 4.823655974348e+00
  4 KSP Residual norm 3.345765664017e+01 % max 1.551374710727e+00 min 1.953487694959e-01 max/min 7.941563771968e+00
  5 KSP Residual norm 2.859712984893e+01 % max 1.604588395452e+00 min 1.313871480574e-01 max/min 1.221267391199e+01
  6 KSP Residual norm 2.525636054248e+01 % max 1.650487481750e+00 min 9.322735730688e-02 max/min 1.770389646804e+01
  7 KSP Residual norm 2.270711391451e+01 % max 1.697243639599e+00 min 6.945419058256e-02 max/min 2.443687883140e+01
  8 KSP Residual norm 2.074739485241e+01 % max 1.737293728907e+00 min 5.319942519758e-02 max/min 3.265624999621e+01
  9 KSP Residual norm 1.912808268870e+01 % max 1.771708608618e+00 min 4.229776586667e-02 max/min 4.188657656771e+01
 10 KSP Residual norm 1.787394414641e+01 % max 1.802834420843e+00 min 3.460455235448e-02 max/min 5.209818645753e+01
  Residual norms for Displacement_pc_gamg_esteig_ solve.
  0 KSP Residual norm 1.361990679391e+03 % max 1.e+00 min 1.e+00 max/min 1.e+00
  1 KSP Residual norm 5.377188333825e+02 % max 1.086812916769e+00 min 1.086812916769e+00 max/min 1.e+00
  2 KSP Residual norm 2.819790765047e+02 % max 1.474233179517e+00 min 6.475176340551e-01 max/min 2.276745994212e+00
  3 KSP Residual norm 1.856720658591e+02 % max 1.646049713883e+00 min 4.391851040105e-01 max/min 3.747963441500e+00
  4 KSP Residual norm 1.446507859917e+02 % max 1.760403013135e+00 min 2.972886103795e-01 max/min 5.921528614526e+00
  5 KSP Residual norm 1.212491636433e+02 % max 1.839250080524e+00 min 1.921591413785e-01 max/min 9.571494061277e+00
  6 KSP Residual norm 1.052783637696e+02 % max 1.887062042760e+00 min 1.275920366984e-01 max/min 1.478981048966e+01
  7 KSP Residual norm 9.230292625762e+01 % max 1.917891358356e+00 min 8.853577120467e-02 max/min 2.166233300122e+01
  8 KSP Residual norm 8.262607594297e+01 % max 1.935857204308e+00 min 6.706949937710e-02 max/min 2.886345093206e+01
  9 KSP Residual norm 7.616474911000e+01 % max 1.946323901431e+00 min 5.354310733090e-02 max/min 3.635059671458e+01
 10 KSP Residual norm 7.138356892221e+01 % max 1.954382723686e+00 min 4.367661484659e-02 max/min 4.474666204216e+01
  Residual norms for Displacement_pc_gamg_esteig_ solve.
  0 KSP Residual norm 3.702300162209e+03 % max 1.e+00 min 1.e+00 max/min 1.e+00
  1 KSP Residual norm 1.255008322497e+03 % max 

Re: [petsc-users] GAMG failure

2023-03-24 Thread Mark Adams
* Do you set:

PetscCall(MatSetOption(Amat, MAT_SPD, PETSC_TRUE));
PetscCall(MatSetOption(Amat, MAT_SPD_ETERNAL, PETSC_TRUE));

Do that to get CG Eigen estimates. Outright failure is usually caused by a
bad Eigen estimate.
-pc_gamg_esteig_ksp_monitor_singular_value
Will print out the estimates as it is iterating. You can look at that to
check that the max has converged.

*  -pc_gamg_aggressive_coarsening 0

will slow coarsening as well as threshold.

* you can run with '-info :pc' and send me the output (grep on GAMG)
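
Put together, a sketch of one diagnostic run combining the options above (the
executable name ./app is illustrative):

  ./app -pc_type gamg -pc_gamg_esteig_ksp_monitor_singular_value \
    -pc_gamg_aggressive_coarsening 0 -info :pc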

Mark

On Fri, Mar 24, 2023 at 2:47 PM Jed Brown  wrote:

> You can use -pc_gamg_threshold .02 to slow the coarsening, and use either a
> stronger smoother or more iterations for the estimation (or a looser
> tolerance). I assume your system is SPD and you've set the near-null space.
>
> Blaise Bourdin  writes:
>
> > Hi,
> >
> > I am having issues with GAMG for some very ill-conditioned 2D linearized
> elasticity problems (sharp variation of elastic moduli with thin regions
> of nearly incompressible material). I use snes_type newtonls,
> linesearch_type cp, and pc_type gamg without any further options. pc_type
> Jacobi converges fine (although slowly of course).
> >
> >
> > I am not really surprised that gamg would not converge out of the box,
> but don’t know where to start to investigate the convergence failure. Can
> anybody help?
> >
> > Blaise
> >
> > —
> > Canada Research Chair in Mathematical and Computational Aspects of Solid
> Mechanics (Tier 1)
> > Professor, Department of Mathematics & Statistics
> > Hamilton Hall room 409A, McMaster University
> > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada
> > https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243
>


Re: [petsc-users] GAMG failure

2023-03-24 Thread Jed Brown
You can use -pc_gamg_threshold .02 to slow the coarsening, and use either a 
stronger smoother or more iterations for the estimation (or a looser 
tolerance). I assume your system is SPD and you've set the near-null space.
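
Attaching the near-null space for elasticity looks roughly like this (a minimal
sketch, assuming a Vec coords of nodal coordinates and a system matrix A):

  MatNullSpace nullsp;
  PetscCall(MatNullSpaceCreateRigidBody(coords, &nullsp)); /* rigid-body modes */
  PetscCall(MatSetNearNullSpace(A, nullsp));
  PetscCall(MatNullSpaceDestroy(&nullsp));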

Blaise Bourdin  writes:

> Hi,
>
> I am having issues with GAMG for some very ill-conditioned 2D linearized 
> elasticity problems (sharp variation of elastic moduli with thin regions of 
> nearly incompressible material). I use snes_type newtonls, linesearch_type 
> cp, and pc_type gamg without any further options. pc_type Jacobi converges 
> fine (although slowly of course).
>
>
> I am not really surprised that gamg would not converge out of the box, but 
> don’t know where to start to investigate the convergence failure. Can anybody 
> help?
>
> Blaise
>
> — 
> Canada Research Chair in Mathematical and Computational Aspects of Solid 
> Mechanics (Tier 1)
> Professor, Department of Mathematics & Statistics
> Hamilton Hall room 409A, McMaster University
> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada 
> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243


Re: [petsc-users] gamg out of memory with gpu

2022-12-26 Thread Matthew Knepley
On Mon, Dec 26, 2022 at 10:29 AM Edoardo Centofanti <
edoardo.centofant...@universitadipavia.it> wrote:

> Thank you for your answer. Can you provide the full path of the example
> you have in mind? The one I found does not seem to exploit algebraic
> multigrid, but just geometric multigrid.
>

cd $PETSC_DIR/src/snes/tutorials
./ex5 -da_grid_x 64 -da_grid_y 64 -mms 3 -pc_type gamg

and for GPUs I think you need the options to move things over

  -dm_vec_type cuda -dm_mat_type aijcusparse
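
i.e., something like (a sketch; the grid sizes above are illustrative):

  ./ex5 -da_grid_x 64 -da_grid_y 64 -mms 3 -pc_type gamg \
    -dm_vec_type cuda -dm_mat_type aijcusparse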

  Thanks,

 Matt


> Thanks,
> Edoardo
>
> On Mon, Dec 26, 2022 at 3:39 PM Matthew Knepley <
> knep...@gmail.com> wrote:
>
>> On Mon, Dec 26, 2022 at 4:41 AM Edoardo Centofanti <
>> edoardo.centofant...@universitadipavia.it> wrote:
>>
>>> Hi PETSc Users,
>>>
>>> I am experiencing some issues with the GAMG preconditioner when used
>>> with GPUs.
>>> In particular, it seems to go out of memory very easily (around 5000
>>> dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2
>>> (cudaErrorMemoryAllocation) : out of memory" error).
>>> I have these issues both with single and multiple GPUs (on the same or
>>> on different nodes). The exact same problems work like a charm with HYPRE
>>> BoomerAMG on GPUs.
>>> With both preconditioners I exploit the device acceleration by giving
>>> the usual command line options "-dm_vec_type cuda" and "-dm_mat_type
>>> aijcusparse" (I am working with structured meshes). My PETSc version is
>>> 3.17.
>>>
>>> Is this a known issue of the GAMG preconditioner?
>>>
>>
>> No. Can you get it to do this with a PETSc example? Say SNES ex5?
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Thank you in advance,
>>> Edoardo
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] gamg out of memory with gpu

2022-12-26 Thread Edoardo Centofanti
Thank you for your answer. Can you provide the full path of the example
you have in mind? The one I found does not seem to exploit algebraic
multigrid, but just geometric multigrid.

Thanks,
Edoardo

On Mon, Dec 26, 2022 at 3:39 PM Matthew Knepley 
wrote:

> On Mon, Dec 26, 2022 at 4:41 AM Edoardo Centofanti <
> edoardo.centofant...@universitadipavia.it> wrote:
>
>> Hi PETSc Users,
>>
>> I am experiencing some issues with the GAMG preconditioner when used with
>> GPUs.
>> In particular, it seems to go out of memory very easily (around 5000
>> dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2
>> (cudaErrorMemoryAllocation) : out of memory" error).
>> I have these issues both with single and multiple GPUs (on the same or on
>> different nodes). The exact same problems work like a charm with HYPRE
>> BoomerAMG on GPUs.
>> With both preconditioners I exploit the device acceleration by giving the
>> usual command line options "-dm_vec_type cuda" and "-dm_mat_type
>> aijcusparse" (I am working with structured meshes). My PETSc version is
>> 3.17.
>>
>> Is this a known issue of the GAMG preconditioner?
>>
>
> No. Can you get it to do this with a PETSc example? Say SNES ex5?
>
>   Thanks,
>
>  Matt
>
>
>> Thank you in advance,
>> Edoardo
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] gamg out of memory with gpu

2022-12-26 Thread Matthew Knepley
On Mon, Dec 26, 2022 at 4:41 AM Edoardo Centofanti <
edoardo.centofant...@universitadipavia.it> wrote:

> Hi PETSc Users,
>
> I am experiencing some issues with the GAMG preconditioner when used with
> GPUs.
> In particular, it seems to go out of memory very easily (around 5000
> dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2
> (cudaErrorMemoryAllocation) : out of memory" error).
> I have these issues both with single and multiple GPUs (on the same or on
> different nodes). The exact same problems work like a charm with HYPRE
> BoomerAMG on GPUs.
> With both preconditioners I exploit the device acceleration by giving the
> usual command line options "-dm_vec_type cuda" and "-dm_mat_type
> aijcusparse" (I am working with structured meshes). My PETSc version is
> 3.17.
>
> Is this a known issue of the GAMG preconditioner?
>

No. Can you get it to do this with a PETSc example? Say SNES ex5?

  Thanks,

 Matt


> Thank you in advance,
> Edoardo
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] GAMG and linearized elasticity

2022-12-13 Thread Jed Brown
Do you have slip/symmetry boundary conditions, where some components are 
constrained? In that case, there is no uniform block size and I think you'll 
need DMPlexCreateRigidBody() and MatSetNearNullSpace().

The PCSetCoordinates() code won't work for non-constant block size.
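
A minimal sketch of that, assuming a DMPlex dm and a system matrix Amat (the
field index 0 for the displacement field is illustrative):

  MatNullSpace rbm;
  PetscCall(DMPlexCreateRigidBody(dm, 0, &rbm)); /* rigid-body modes for field 0 */
  PetscCall(MatSetNearNullSpace(Amat, rbm));
  PetscCall(MatNullSpaceDestroy(&rbm));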

-pc_type gamg should work okay out of the box for elasticity. For hypre, I've 
had good luck with this options suite, which also runs on GPU.

-pc_type hypre -pc_hypre_boomeramg_coarsen_type pmis 
-pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no_CF 
-pc_hypre_boomeramg_P_max 6 -pc_hypre_boomeramg_relax_type_down Chebyshev 
-pc_hypre_boomeramg_relax_type_up Chebyshev 
-pc_hypre_boomeramg_strong_threshold 0.5

Blaise Bourdin  writes:

> Hi,
>
> I am getting close to finishing porting a code from petsc 3.3 / sieve to
> main / dmplex, but am now encountering difficulties.
> I am reasonably sure that the Jacobian and residual are correct. The codes 
> handle boundary
> conditions differently (MatZeroRowCols vs dmplex constraints) so it is not 
> trivial to compare
> them. Running with snes_type ksponly pc_type Jacobi or hypre gives me the 
> same results in
> roughly the same number of iterations.
>
> In my old code, gamg would work out of the box. When using petsc-main, 
> -pc_type gamg -pc_gamg_type agg works for _some_ problems using P1-Lagrange
> elements, but never for 
> P2-Lagrange. The typical error message is in gamg_agg.txt
>
> When using -pc_gamg_type classical, a problem where the KSP would converge in
> 47 iterations in
> 3.3 now takes 1400.  ksp_view_3.3.txt and ksp_view_main.txt show the output 
> of -ksp_view
> for both versions. I don’t notice anything obvious.
>
> Strangely, removing the call to PCSetCoordinates does not have any impact on 
> the
> convergence.
>
> I am sure that I am missing something, or not passing the right options. 
> What’s a good
> starting point for 3D elasticity?
> Regards,
> Blaise
>
> — 
> Canada Research Chair in Mathematical and Computational Aspects of Solid 
> Mechanics
> (Tier 1)
> Professor, Department of Mathematics & Statistics
> Hamilton Hall room 409A, McMaster University
> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada 
> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Petsc has generated inconsistent data
> [0]PETSC ERROR: Computed maximum singular value as zero
> [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be 
> the program crashed before they were used or a spelling mistake, etc!
> [0]PETSC ERROR: Option left: name:-displacement_ksp_converged_reason value: 
> ascii source: file
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.18.2-341-g16200351da0  GIT 
> Date: 2022-12-12 23:42:20 +
> [0]PETSC ERROR: 
> /home/bourdinb/Development/mef90/mef90-dmplex/bbserv-gcc11.2.1-mvapich2-2.3.7-O/bin/ThermoElasticity
>  on a bbserv-gcc11.2.1-mvapich2-2.3.7-O named bb01 by bourdinb Tue Dec 13 
> 17:02:19 2022
> [0]PETSC ERROR: Configure options --CFLAGS=-Wunused 
> --FFLAGS="-ffree-line-length-none -fallow-argument-mismatch -Wunused" 
> --COPTFLAGS="-O2 -march=znver2" --CXXOPTFLAGS="-O2 -march=znver2" 
> --FOPTFLAGS="-O2 -march=znver2" --download-chaco=1 --download-exodusii=1 
> --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 
> --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 
> --download-p4est=1 --download-parmetis=1 --download-pnetcdf=1 
> --download-scalapack=1 --download-sowing=1 
> --download-sowing-cc=/opt/rh/devtoolset-9/root/usr/bin/gcc 
> --download-sowing-cxx=/opt/rh/devtoolset-9/root/usr/bin/g++ 
> --download-sowing-cpp=/opt/rh/devtoolset-9/root/usr/bin/cpp 
> --download-sowing-cxxcpp=/opt/rh/devtoolset-9/root/usr/bin/cpp 
> --download-superlu=1 --download-triangle=1 --download-yaml=1 
> --download-zlib=1 --with-debugging=0 
> --with-mpi-dir=/opt/HPC/mvapich2/2.3.7-gcc11.2.1 --with-pic 
> --with-shared-libraries=1 --with-mpiexec=srun --with-x11=0
> [0]PETSC ERROR: #1 PCGAMGOptProlongator_AGG() at 
> /1/HPC/petsc/main/src/ksp/pc/impls/gamg/agg.c:779
> [0]PETSC ERROR: #2 PCSetUp_GAMG() at 
> /1/HPC/petsc/main/src/ksp/pc/impls/gamg/gamg.c:639
> [0]PETSC ERROR: #3 PCSetUp() at 
> /1/HPC/petsc/main/src/ksp/pc/interface/precon.c:994
> [0]PETSC ERROR: #4 KSPSetUp() at 
> /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:405
> [0]PETSC ERROR: #5 KSPSolve_Private() at 
> /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:824
> [0]PETSC ERROR: #6 KSPSolve() at 
> /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:1070
> [0]PETSC ERROR: #7 SNESSolve_KSPONLY() at 
> /1/HPC/petsc/main/src/snes/impls/ksponly/ksponly.c:48
> [0]PETSC ERROR: #8 SNESSolve() at 
> /1/HPC/petsc/main/src/snes/interface/snes.c:4693
> [0]PETSC ERROR: #9 
> 

Re: [petsc-users] GAMG crash during setup when using multiple GPUs

2022-02-11 Thread Sajid Ali Syed
Hi Mark,

Thanks for the information.

@Junchao: Given that there are known issues with GPU aware MPI, it might be 
best to wait until there is an updated version of cray-mpich (which hopefully 
contains the relevant fixes).

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Mark Adams 
Sent: Thursday, February 10, 2022 8:47 PM
To: Junchao Zhang 
Cc: Sajid Ali Syed ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs

Perlmutter has problems with GPU aware MPI.
This is being actively worked on at NERSC.

Mark

On Thu, Feb 10, 2022 at 9:22 PM Junchao Zhang  wrote:
Hi, Sajid Ali,
  I have no clue. I have access to perlmutter.  I am thinking how to debug that.
  If your app is open-sourced and easy to build, then I can build and debug it. 
Otherwise, suppose you build and install petsc (only with options needed by 
your app) to a shared directory, and I can access your executable (which uses 
RPATH for libraries), then maybe I can debug it (I only need to install my own 
petsc to the shared directory)

--Junchao Zhang


On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed  wrote:
Hi Junchao,

With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for this 
case with this email.

I also ran with gpu aware mpi to see if I could reproduce the error and got the 
error but from a different location. This logfile is also attached.

This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me know 
if I can share further information to help with debugging this.

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Junchao Zhang 
Sent: Thursday, February 10, 2022 1:43 PM
To: Sajid Ali Syed 
Cc: petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs

Also, try "-use_gpu_aware_mpi 0" to see if there is a difference.

--Junchao Zhang


On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang  wrote:
Did it fail without GPU at 64 MPI ranks?

--Junchao Zhang


On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed  wrote:

Hi PETSc-developers,

I’m seeing the following crash that occurs during the setup phase of the 
preconditioner when using multiple GPUs. The relevant error trace is shown 
below:

(GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, 
CUDA_ERROR_ALREADY_MAPPED, line no 272
[24]PETSC ERROR: - Error Message 
--
[24]PETSC ERROR: General MPI error
[24]PETSC ERROR: MPI error 1 Invalid buffer pointer
[24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[24]PETSC ERROR: Petsc Development GIT revision: 
f351d5494b5462f62c419e00645ac2e477b88cae  GIT Date: 2022-02-08 15:08:19 +
...
[24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54
[24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274
[24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218
[24]PETSC ERROR: #4 PetscSFBcastEnd() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499
[24]PETSC ERROR: #5 VecScatterEnd_Internal() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87
[24]PETSC ERROR: #6 VecScatterEnd() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366
[24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6k

Re: [petsc-users] GAMG crash during setup when using multiple GPUs

2022-02-10 Thread Mark Adams
Perlmutter has problems with GPU aware MPI.
This is being actively worked on at NERSC.

Mark

On Thu, Feb 10, 2022 at 9:22 PM Junchao Zhang 
wrote:

> Hi, Sajid Ali,
>   I have no clue. I have access to perlmutter.  I am thinking how to debug
> that.
>   If your app is open-sourced and easy to build, then I can build and
> debug it. Otherwise, suppose you build and install petsc (only with options
> needed by your app) to a shared directory, and I can access your executable
> (which uses RPATH for libraries), then maybe I can debug it (I only need to
> install my own petsc to the shared directory)
>
> --Junchao Zhang
>
>
> On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed  wrote:
>
>> Hi Junchao,
>>
>> With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for
>> this case with this email.
>>
>> I also ran with gpu aware mpi to see if I could reproduce the error and
>> got the error but from a different location. This logfile is also attached.
>>
>> This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me
>> know if I can share further information to help with debugging this.
>>
>> Thank You,
>> Sajid Ali (he/him) | Research Associate
>> Scientific Computing Division
>> Fermi National Accelerator Laboratory
>> s-sajid-ali.github.io
>>
>> ----------
>> *From:* Junchao Zhang 
>> *Sent:* Thursday, February 10, 2022 1:43 PM
>> *To:* Sajid Ali Syed 
>> *Cc:* petsc-users@mcs.anl.gov 
>> *Subject:* Re: [petsc-users] GAMG crash during setup when using multiple
>> GPUs
>>
>> Also, try "-use_gpu_aware_mpi 0" to see if there is a difference.
>>
>> --Junchao Zhang
>>
>>
>> On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang 
>> wrote:
>>
>> Did it fail without GPU at 64 MPI ranks?
>>
>> --Junchao Zhang
>>
>>
>> On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed  wrote:
>>
>> Hi PETSc-developers,
>>
>> I’m seeing the following crash that occurs during the setup phase of the
>> preconditioner when using multiple GPUs. The relevant error trace is shown
>> below:
>>
>> (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, 
>> CUDA_ERROR_ALREADY_MAPPED, line no 272
>> [24]PETSC ERROR: - Error Message 
>> --
>> [24]PETSC ERROR: General MPI error
>> [24]PETSC ERROR: MPI error 1 Invalid buffer pointer
>> [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> [24]PETSC ERROR: Petsc Development GIT revision: 
>> f351d5494b5462f62c419e00645ac2e477b88cae  GIT Date: 2022-02-08 15:08:19 +
>> ...
>> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54
>> [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274
>> [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218
>> [24]PETSC ERROR: #4 PetscSFBcastEnd() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499
>> [24]PETSC ERROR: #5 VecScatterEnd_Internal() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87
>> [24]PETSC ERROR: #6 VecScatterEnd() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366
>> [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302
>> [24]PETSC ERROR: #8 MatMult() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttlj

Re: [petsc-users] GAMG crash during setup when using multiple GPUs

2022-02-10 Thread Junchao Zhang
Hi, Sajid Ali,
  I have no clue. I have access to perlmutter.  I am thinking how to debug
that.
  If your app is open-sourced and easy to build, then I can build and debug
it. Otherwise, suppose you build and install petsc (only with options
needed by your app) to a shared directory, and I can access your executable
(which uses RPATH for libraries), then maybe I can debug it (I only need to
install my own petsc to the shared directory)

--Junchao Zhang


On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed  wrote:

> Hi Junchao,
>
> With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for
> this case with this email.
>
> I also ran with gpu aware mpi to see if I could reproduce the error and
> got the error but from a different location. This logfile is also attached.
>
> This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me
> know if I can share further information to help with debugging this.
>
> Thank You,
> Sajid Ali (he/him) | Research Associate
> Scientific Computing Division
> Fermi National Accelerator Laboratory
> s-sajid-ali.github.io
>
> --
> *From:* Junchao Zhang 
> *Sent:* Thursday, February 10, 2022 1:43 PM
> *To:* Sajid Ali Syed 
> *Cc:* petsc-users@mcs.anl.gov 
> *Subject:* Re: [petsc-users] GAMG crash during setup when using multiple
> GPUs
>
> Also, try "-use_gpu_aware_mpi 0" to see if there is a difference.
>
> --Junchao Zhang
>
>
> On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang 
> wrote:
>
> Did it fail without GPU at 64 MPI ranks?
>
> --Junchao Zhang
>
>
> On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed  wrote:
>
> Hi PETSc-developers,
>
> I’m seeing the following crash that occurs during the setup phase of the
> preconditioner when using multiple GPUs. The relevant error trace is shown
> below:
>
> (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, 
> CUDA_ERROR_ALREADY_MAPPED, line no 272
> [24]PETSC ERROR: - Error Message 
> --
> [24]PETSC ERROR: General MPI error
> [24]PETSC ERROR: MPI error 1 Invalid buffer pointer
> [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [24]PETSC ERROR: Petsc Development GIT revision: 
> f351d5494b5462f62c419e00645ac2e477b88cae  GIT Date: 2022-02-08 15:08:19 +
> ...
> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54
> [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274
> [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218
> [24]PETSC ERROR: #4 PetscSFBcastEnd() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499
> [24]PETSC ERROR: #5 VecScatterEnd_Internal() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87
> [24]PETSC ERROR: #6 VecScatterEnd() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366
> [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302
> [24]PETSC ERROR: #8 MatMult() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438
> [24]PETSC ERROR: #9 PCApplyBAorAB() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730
> [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421
> [24]PETSC ERROR: #11 KSPGMRESCycle() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro

Re: [petsc-users] GAMG crash during setup when using multiple GPUs

2022-02-10 Thread Junchao Zhang
Also, try "-use_gpu_aware_mpi 0" to see if there is a difference.

--Junchao Zhang


On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang 
wrote:

> Did it fail without GPU at 64 MPI ranks?
>
> --Junchao Zhang
>
>
> On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed  wrote:
>
>> Hi PETSc-developers,
>>
>> I’m seeing the following crash that occurs during the setup phase of the
>> preconditioner when using multiple GPUs. The relevant error trace is shown
>> below:
>>
>> (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, 
>> CUDA_ERROR_ALREADY_MAPPED, line no 272
>> [24]PETSC ERROR: - Error Message 
>> --
>> [24]PETSC ERROR: General MPI error
>> [24]PETSC ERROR: MPI error 1 Invalid buffer pointer
>> [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> [24]PETSC ERROR: Petsc Development GIT revision: 
>> f351d5494b5462f62c419e00645ac2e477b88cae  GIT Date: 2022-02-08 15:08:19 +
>> ...
>> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54
>> [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274
>> [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218
>> [24]PETSC ERROR: #4 PetscSFBcastEnd() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499
>> [24]PETSC ERROR: #5 VecScatterEnd_Internal() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87
>> [24]PETSC ERROR: #6 VecScatterEnd() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366
>> [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302
>> [24]PETSC ERROR: #8 MatMult() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438
>> [24]PETSC ERROR: #9 PCApplyBAorAB() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730
>> [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421
>> [24]PETSC ERROR: #11 KSPGMRESCycle() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162
>> [24]PETSC ERROR: #12 KSPSolve_GMRES() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247
>> [24]PETSC ERROR: #13 KSPSolve_Private() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925
>> [24]PETSC ERROR: #14 KSPSolve() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103
>> [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127
>> [24]PETSC ERROR: #16 PCSetUp_GAMG() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626
>> [24]PETSC ERROR: #17 PCSetUp() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017
>> [24]PETSC ERROR: #18 KSPSetUp() at 
>> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417
>> [24]PETSC ERROR: #19 main() at poisson3d.c:69
>> [24]PETSC ERROR: PETSc Option Table entries:
>> [24]PETSC ERROR: -dm_mat_type aijcusparse
>> [24]PETSC ERROR: -dm_vec_type cuda
>> [24]PETSC ERROR: -ksp_monitor
>> [24]PETSC ERROR: -ksp_norm_type unpreconditioned
>> [24]PETSC ERROR: -ksp_type cg
>> [24]PETSC ERROR: -ksp_view
>> [24]PETSC ERROR: -log_view
>> [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg
>> [24]PETSC ERROR: -mg_levels_ksp_type chebyshev
>> [24]PETSC ERROR: -mg_levels_pc_type jacobi
>> [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1
>> [24]PETSC ERROR: -pc_gamg_square_graph 1
>> [24]PETSC ERROR: -pc_gamg_threshold 0.0
>> [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0
>> [24]PETSC ERROR: -pc_gamg_type agg
>> [24]PETSC ERROR: -pc_type gamg
>> [24]PETSC ERROR: End of Error Message ---send entire 
>> error message to petsc-ma...@mcs.anl.gov--

Re: [petsc-users] GAMG crash during setup when using multiple GPUs

2022-02-10 Thread Junchao Zhang
Did it fail without GPU at 64 MPI ranks?

--Junchao Zhang


On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed  wrote:

> Hi PETSc-developers,
>
> I’m seeing the following crash that occurs during the setup phase of the
> preconditioner when using multiple GPUs. The relevant error trace is shown
> below:
>
> (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, 
> CUDA_ERROR_ALREADY_MAPPED, line no 272
> [24]PETSC ERROR: - Error Message 
> --
> [24]PETSC ERROR: General MPI error
> [24]PETSC ERROR: MPI error 1 Invalid buffer pointer
> [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [24]PETSC ERROR: Petsc Development GIT revision: 
> f351d5494b5462f62c419e00645ac2e477b88cae  GIT Date: 2022-02-08 15:08:19 +
> ...
> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54
> [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274
> [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218
> [24]PETSC ERROR: #4 PetscSFBcastEnd() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499
> [24]PETSC ERROR: #5 VecScatterEnd_Internal() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87
> [24]PETSC ERROR: #6 VecScatterEnd() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366
> [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302
> [24]PETSC ERROR: #8 MatMult() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438
> [24]PETSC ERROR: #9 PCApplyBAorAB() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730
> [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421
> [24]PETSC ERROR: #11 KSPGMRESCycle() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162
> [24]PETSC ERROR: #12 KSPSolve_GMRES() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247
> [24]PETSC ERROR: #13 KSPSolve_Private() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925
> [24]PETSC ERROR: #14 KSPSolve() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103
> [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127
> [24]PETSC ERROR: #16 PCSetUp_GAMG() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626
> [24]PETSC ERROR: #17 PCSetUp() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017
> [24]PETSC ERROR: #18 KSPSetUp() at 
> /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417
> [24]PETSC ERROR: #19 main() at poisson3d.c:69
> [24]PETSC ERROR: PETSc Option Table entries:
> [24]PETSC ERROR: -dm_mat_type aijcusparse
> [24]PETSC ERROR: -dm_vec_type cuda
> [24]PETSC ERROR: -ksp_monitor
> [24]PETSC ERROR: -ksp_norm_type unpreconditioned
> [24]PETSC ERROR: -ksp_type cg
> [24]PETSC ERROR: -ksp_view
> [24]PETSC ERROR: -log_view
> [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg
> [24]PETSC ERROR: -mg_levels_ksp_type chebyshev
> [24]PETSC ERROR: -mg_levels_pc_type jacobi
> [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1
> [24]PETSC ERROR: -pc_gamg_square_graph 1
> [24]PETSC ERROR: -pc_gamg_threshold 0.0
> [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0
> [24]PETSC ERROR: -pc_gamg_type agg
> [24]PETSC ERROR: -pc_type gamg
> [24]PETSC ERROR: End of Error Message ---send entire 
> error message to petsc-ma...@mcs.anl.gov--
>
> Attached with this email is the full error log and the submit script for an
> 8-node/64-GPU/64 MPI rank job. I’ll also note that the same program did not
> crash 

Re: [petsc-users] GAMG memory consumption

2021-11-24 Thread Dave May
I think your run with -pc_type mg is defining a multigrid hierarchy with
only a single level. (A single-level mg PC would also explain the 100+
iterations required to converge.) The gamg configuration is definitely
coarsening your problem and has a deeper hierarchy.  A single level
hierarchy will require less memory than a multilevel hierarchy.
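
For illustration, a minimal sketch of how one could confirm the hierarchy depth
programmatically, in addition to reading -ksp_view output (it assumes a KSP
named ksp whose operators and options are already set, and uses the PetscCall()
macro from recent PETSc):

  PC       pc;
  PetscInt nlevels;
  PetscCall(KSPSetUp(ksp));               /* forces the MG/GAMG hierarchy to be built */
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCMGGetLevels(pc, &nlevels)); /* GAMG is built on PCMG, so this works for both */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "multigrid levels: %d\n", (int)nlevels));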

Cheers,
Dave

On Wed 24. Nov 2021 at 19:03, Matthew Knepley  wrote:

> On Wed, Nov 24, 2021 at 12:26 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalin...@stfc.ac.uk> wrote:
>
>> Hello,
>>
>>
>>
>> I would like to understand why more memory is consumed by -pc_type gamg
>> compared to -pc_type mg for the same problem size
>>
>>
>>
>> ksp/ksp/tutorial: ./ex45 -da_grid_x 368 -da_grid_y 368 -da_grid_z 368
>> -ksp_type cg
>>
>>
>>
>> -pc_type mg
>>
>>
>>
>> Maximum (over computational time) process memory:total 1.9399e+10
>> max 9.7000e+09 min 9.6992e+09
>>
>>
>>
>> -pc_type gamg
>>
>>
>>
>> Maximum (over computational time) process memory:total 4.9671e+10
>> max 2.4836e+10 min 2.4835e+10
>>
>>
>>
>>
>> Am I right in understanding that the memory limiting factor is ‘max
>> 2.4836e+10’ as it is the maximum memory used at any given time?
>>
>
> Yes, I believe so.
>
> GAMG is using A_C = P^T A P, where P is the prolongation from coarse to
> fine, in order to compute the coarse operator A_C, rather than
> rediscretization, since it does not have any notion of discretization or
> coarse meshes. This takes more memory.
>
>   Thanks,
>
> Matt
>
>
>> I have attached the -log_view output of both the preconditioners.
>>
>>
>>
>> Best regards,
>>
>> Karthik.
>>
>>
>>
>> This email and any attachments are intended solely for the use of the
>> named recipients. If you are not the intended recipient you must not use,
>> disclose, copy or distribute this email or any of its attachments and
>> should notify the sender immediately and delete this email from your
>> system. UK Research and Innovation (UKRI) has taken every reasonable
>> precaution to minimise risk of this email or any attachments containing
>> viruses or malware but the recipient should carry out its own virus and
>> malware checks before opening the attachments. UKRI does not accept any
>> liability for any losses or damages which the recipient may sustain due to
>> presence of any viruses.
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] GAMG memory consumption

2021-11-24 Thread Mark Adams
As Matt said GAMG uses more memory.
But these numbers look odd: max == min and total = max + min, for both
cases.
I would use
https://petsc.org/release/docs/manualpages/Sys/PetscMallocDump.html to look
at this more closely.
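
For illustration, a minimal sketch of that kind of instrumentation (assumptions:
PetscMallocDump() reports only when PETSc malloc logging is active, e.g. when
run with -malloc_debug, and PetscCall() requires a recent PETSc):

  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscLogDouble mem;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    PetscCall(PetscMemorySetGetMaximumUsage()); /* enable high-water-mark tracking early */
    /* ... assemble matrices, set up the preconditioner, solve ... */
    PetscCall(PetscMemoryGetMaximumUsage(&mem));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "max process memory: %g bytes\n", (double)mem));
    PetscCall(PetscMallocDump(PETSC_STDOUT)); /* list PETSc allocations not yet freed */
    PetscCall(PetscFinalize());
    return 0;
  }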

On Wed, Nov 24, 2021 at 1:03 PM Matthew Knepley  wrote:

> On Wed, Nov 24, 2021 at 12:26 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalin...@stfc.ac.uk> wrote:
>
>> Hello,
>>
>>
>>
>> I would like to understand why more memory is consumed by -pc_type gamg
>> compared to -pc_type mg for the same problem size
>>
>>
>>
>> ksp/ksp/tutorial: ./ex45 -da_grid_x 368 -da_grid_y 368 -da_grid_z 368
>> -ksp_type cg
>>
>>
>>
>> -pc_type mg
>>
>>
>>
>> Maximum (over computational time) process memory:total 1.9399e+10
>> max 9.7000e+09 min 9.6992e+09
>>
>>
>>
>> -pc_type gamg
>>
>>
>>
>> Maximum (over computational time) process memory:total 4.9671e+10
>> max 2.4836e+10 min 2.4835e+10
>>
>>
>>
>>
>> Am I right in understanding that the memory limiting factor is ‘max
>> 2.4836e+10’ as it is the maximum memory used at any given time?
>>
>
> Yes, I believe so.
>
> GAMG is using A_C = P^T A P, where P is the prolongation from coarse to
> fine, in order to compute the coarse operator A_C, rather than
> rediscretization, since it does not have any notion of discretization or
> coarse meshes. This takes more memory.
>
>   Thanks,
>
> Matt
>
>
>> I have attached the -log_view output of both the preconditioners.
>>
>>
>>
>> Best regards,
>>
>> Karthik.
>>
>>
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] GAMG memory consumption

2021-11-24 Thread Matthew Knepley
On Wed, Nov 24, 2021 at 12:26 PM Karthikeyan Chockalingam - STFC UKRI <
karthikeyan.chockalin...@stfc.ac.uk> wrote:

> Hello,
>
>
>
> I would like to understand why more memory is consumed by -pc_type gamg
> compared to -pc_type mg for the same problem size
>
>
>
> ksp/ksp/tutorial: ./ex45 -da_grid_x 368 -da_grid_y 368 -da_grid_z 368
> -ksp_type cg
>
>
>
> -pc_type mg
>
>
>
> Maximum (over computational time) process memory:total 1.9399e+10
> max 9.7000e+09 min 9.6992e+09
>
>
>
> -pc_type gamg
>
>
>
> Maximum (over computational time) process memory:total 4.9671e+10
> max 2.4836e+10 min 2.4835e+10
>
>
>
>
> Am I right in understanding that the memory limiting factor is ‘max
> 2.4836e+10’ as it is the maximum memory used at any given time?
>

Yes, I believe so.

GAMG is using A_C = P^T A P, where P is the prolongation from coarse to
fine, in order to compute the coarse operator A_C, rather than
rediscretization, since it does not have any notion of discretization or
coarse meshes. This takes more memory.
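
For reference, that Galerkin product is exposed directly in the API; a minimal
sketch, assuming A is the fine-grid operator and P the prolongation (both of
type Mat):

  Mat Ac; /* coarse operator */
  PetscCall(MatPtAP(A, P, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &Ac)); /* Ac = P^T A P */
  /* if A changes but P is kept, the symbolic setup can be reused: */
  PetscCall(MatPtAP(A, P, MAT_REUSE_MATRIX, PETSC_DEFAULT, &Ac));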

  Thanks,

Matt


> I have attached the -log_view output of both the preconditioners.
>
>
>
> Best regards,
>
> Karthik.
>
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] gamg student questions

2021-10-17 Thread Matthew Knepley
On Sun, Oct 17, 2021 at 9:04 AM Mark Adams  wrote:

> Hi Daniel, [this is a PETSc users list question so let me move it there]
>
> The behavior that you are seeing is a bit odd but not surprising.
>
> First, you should start with simple problems and get AMG (you might want
> to try this exercise with hypre as well: --download-hypre and use -pc_type
> hypre, or BDDC, see below).
>

We have two examples that do this:

  1) SNES ex56: This shows good performance of GAMG on Q1 and Q2 elasticity

  2) SNES ex17: This sets up a lot of finite element elasticity problems
where you can experiment with GAMG, ML, Hypre, BDDC, and other
preconditioners
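
For illustration, one plausible way to build and run the first of these (a
sketch; the paths assume a recent PETSc source tree, and the options echo the
ones Mark lists later in this digest):

  cd $PETSC_DIR/src/snes/tutorials && make ex56
  mpiexec -n 4 ./ex56 -cells 8,12,16 -max_conv_its 3 -petscspace_degree 2 \
    -ksp_type cg -pc_type gamg -use_mat_nearnullspace true -ksp_converged_reason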

As a rule of thumb, if my solver is taking more than 100 iterations
(usually for 1e-8 tolerance), something is very wrong. Either the problem
is setup incorrectly, the solver is
configured incorrectly, or I need to switch solvers.

  Thanks,

 Matt


> There are, alas, a lot of tuning parameters in AMG/DD and I recommend a
> homotopy process: you can start with issues that deal with your
> discretization on a simple cube, linear elasticity, cube elements, modest
> Poisson ratio, etc., and first get "textbook multigrid efficiency" (TME),
> which for elasticity and a V(2,2) cycle in GAMG is about one digit of error
> reduction per iteration and perfectly monotonic until it hits floating
> point precision.
>
> I would set this problem up and I would hope it runs OK, but the
> problems that you want to do are probably pretty hard (high order FE,
> plasticity, incompressibility) so there will be more work to do.
>
> That said, PETSc has nice domain decomposition solvers that are more
> optimized and maintained for elasticity. Now that I think about it, you
> should probably look at these (
> https://petsc.org/release/docs/manualpages/PC/PCBDDC.html
> https://petsc.org/release/docs/manual/ksp/#balancing-domain-decomposition-by-constraints).
> I think they prefer, but do not require, that you do not assemble your
> element matrices, but let them do it. The docs will make that clear.
>
> BDDC is great but it is not magic, and it is no less complex, so I would
> still recommend the same process of getting TME and then moving to the
> problems that you want to solve.
>
> Good luck,
> Mark
>
>
>
> On Sat, Oct 16, 2021 at 10:50 PM Daniel N Pickard  wrote:
>
>> Hi Dr Adams,
>>
>>
>> I am using the gamg in petsc to solve some elasticity problems for
>> modeling bones. I am new to profiling with petsc, but I am observing that
>> around a thousand iterations my norm has gone down 3 orders of magnitude
>> but the solver slows down and progress sort of stalls. The norm
>> also doesn't decrease monotonically, but jumps around a bit. I also notice
>> that if I request to only use 1 multigrid level, the preconditioner is
>> much cheaper and not as powerful so the code takes more iterations, but
>> runs 2-3x faster. Is it expected that large models require lots of
>> iterations and convergence slows down as we get more accurate? What exactly
>> should I be looking for when I am profiling to try to understand how to run
>> faster? I see that a lot of my ratio's are 2.7, but I think that is because
>> my mesh partitioner is not doing a great job making equal domains. What are
>> the giveaways in the log_view that tell you that petsc could be optimized
>> more?
>>
>>
>> Also when I look at the solution with just 4 orders of magnitude of
>> convergence I can see that the solver has not made much progress in the
>> interior of the domain, but seems to have smoothed out the boundary where
>> forces were applied very well. Does this mean I should use a larger
>> threshold to get more coarse grids that can fix the low frequency error?
>>
>>
>> Thanks,
>>
>> Daniel Pickard
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] gamg student questions

2021-10-17 Thread Mark Adams
Hi Daniel, [this is a PETSc users list question so let me move it there]

The behavior that you are seeing is a bit odd but not surprising.

First, you should start with simple problems and get AMG (you might want to
try this exercise with hypre as well: --download-hypre and use -pc_type
hypre, or BDDC, see below).

There are, alas, a lot of tuning parameters in AMG/DD and I recommend a
homotopy process: you can start with issues that deal with your
discretization on a simple cube, linear elasticity, cube elements, modest
Poisson ratio, etc., and first get "textbook multigrid efficiency" (TME),
which for elasticity and a V(2,2) cycle in GAMG is about one digit of error
reduction per iteration and perfectly monotonic until it hits floating
point precision.

I would set this problem up and I would hope it runs OK, but the
problems that you want to do are probably pretty hard (high order FE,
plasticity, incompressibility) so there will be more work to do.

That said, PETSc has nice domain decomposition solvers that are more
optimized and maintained for elasticity. Now that I think about it, you
should probably look at these (
https://petsc.org/release/docs/manualpages/PC/PCBDDC.html
https://petsc.org/release/docs/manual/ksp/#balancing-domain-decomposition-by-constraints).
I think they prefer, but do not require, that you do not assemble your
element matrices, but let them do it. The docs will make that clear.

BDDC is great but it is not magic, and it is no less complex, so I would
still recommend the same process of getting TME and then moving to the
problems that you want to solve.

Good luck,
Mark



On Sat, Oct 16, 2021 at 10:50 PM Daniel N Pickard  wrote:

> Hi Dr Adams,
>
>
> I am using the gamg in petsc to solve some elasticity problems for
> modeling bones. I am new to profiling with petsc, but I am observing that
> around a thousand iterations my norm has gone down 3 orders of magnitude
> but the solver slows down and progress sort of stalls. The norm
> also doesn't decrease monotonically, but jumps around a bit. I also notice
> that if I request to only use 1 multigrid level, the preconditioner is
> much cheaper and not as powerful so the code takes more iterations, but
> runs 2-3x faster. Is it expected that large models require lots of
> iterations and convergence slows down as we get more accurate? What exactly
> should I be looking for when I am profiling to try to understand how to run
> faster? I see that a lot of my ratios are 2.7, but I think that is because
> my mesh partitioner is not doing a great job making equal domains. What are
> the giveaways in the log_view that tell you that petsc could be optimized
> more?
>
>
> Also when I look at the solution with just 4 orders of magnitude of
> convergence I can see that the solver has not made much progress in the
> interior of the domain, but seems to have smoothed out the boundary where
> forces were applied very well. Does this mean I should use a larger
> threshold to get more coarse grids that can fix the low frequency error?
>
>
> Thanks,
>
> Daniel Pickard
>


Re: [petsc-users] GAMG preconditioning

2021-04-12 Thread Barry Smith

  Please send -log_view for the ilu and GAMG case.

  Barry


> On Apr 12, 2021, at 10:34 AM, Milan Pelletier via petsc-users 
>  wrote:
> 
> Dear all,
> 
> I am currently trying to use PETSc with CG solver and GAMG preconditioner.
> I have started with the following set of parameters:
> -ksp_type cg
> -pc_type gamg
> -pc_gamg_agg_nsmooths 1 
> -pc_gamg_threshold 0.02 
> -mg_levels_ksp_type chebyshev 
> -mg_levels_pc_type sor 
> -mg_levels_ksp_max_it 2
> 
> Unfortunately, the preconditioning seems to run extremely slowly. I tried to 
> play around with the numbers, to check if I could notice some difference, but 
> could not observe significant changes. 
> As a comparison, the KSPSetup call with GAMG PC takes more than 10 times 
> longer than completing the whole computation (preconditioning + ~400 KSP 
> iterations to convergence) of the similar case using the following parameters 
> :
> -ksp_type cg
> -pc_type ilu
> -pc_factor_levels 0
> 
> The matrix size for my case is ~1,850,000*1,850,000 elements, with 
> ~38,000,000 non-zero terms (i.e. ~20 per row). For both ILU and AMG cases I 
> use matseqaij/vecseq storage (as a first step I work with only 1 MPI process).
> 
> Is there something wrong in the parameter set I have been using?
> I understand that the preconditioning overhead with AMG is higher than with 
> ILU, but I would also expect CG/GAMG to be competitive against CG/ILU, 
> especially considering the relatively big problem size.
> 
> For information, I am using the PETSc version built from commit 
> 6840fe907c1a3d26068082d180636158471d79a2 (release branch from April 7, 2021). 
> 
> Any clue or idea would be greatly appreciated!
> Thanks for your help,
> 
> Best regards,
> Milan Pelletier
> 
> 



Re: [petsc-users] GAMG preconditioning

2021-04-12 Thread Mark Adams
Can you briefly describe your application?

AMG usually only works well for straightforward elliptic problems, at least
right out of the box.


On Mon, Apr 12, 2021 at 11:35 AM Milan Pelletier via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Dear all,
>
> I am currently trying to use PETSc with CG solver and GAMG preconditioner.
> I have started with the following set of parameters:
> -ksp_type cg
> -pc_type gamg
> -pc_gamg_agg_nsmooths 1
> -pc_gamg_threshold 0.02
> -mg_levels_ksp_type chebyshev
> -mg_levels_pc_type sor
> -mg_levels_ksp_max_it 2
>
> Unfortunately, the preconditioning seems to run extremely slowly. I tried
> to play around with the numbers, to check if I could notice some
> difference, but could not observe significant changes.
> As a comparison, the KSPSetup call with GAMG PC takes more than 10 times
> longer than completing the whole computation (preconditioning + ~400 KSP
> iterations to convergence) of the similar case using the following
> parameters :
> -ksp_type cg
> -pc_type ilu
> -pc_factor_levels 0
>
> The matrix size for my case is ~1,850,000*1,850,000 elements, with
> ~38,000,000 non-zero terms (i.e. ~20 per row). For both ILU and AMG cases I
> use matseqaij/vecseq storage (as a first step I work with only 1 MPI
> process).
>
> Is there something wrong in the parameter set I have been using?
> I understand that the preconditioning overhead with AMG is higher than
> with ILU, but I would also expect CG/GAMG to be competitive against CG/ILU,
> especially considering the relatively big problem size.
>
> For information, I am using the PETSc version built from commit
> 6840fe907c1a3d26068082d180636158471d79a2 (release branch from April 7,
> 2021).
>
> Any clue or idea would be greatly appreciated!
> Thanks for your help,
>
> Best regards,
> Milan Pelletier
>
>
>


Re: [petsc-users] GAMG parameters for ideal coarsening ratio

2020-03-17 Thread Mark Adams
On Tue, Mar 17, 2020 at 1:42 PM Sajid Ali 
wrote:

> Hi Mark/Jed,
>
> The problem I'm solving is scalar helmholtz in 2D, (u_t = A*u_xx + A*u_yy
> + F_t*u, with the familiar 5 point central difference as the derivative
> approximation,
>

I assume this is definite Helmholtz. The time integrator will also add a
mass term. I'm assuming F_t looks like a mass matrix.


> I'm also attaching the result of -info | grep GAMG if that helps). My goal
> is to get weak and strong scaling results for the FD solver (leading me to
> double check all my parameters). I ran the sweep again as Mark suggested
> and it looks like my base params were close to optimal ( negative threshold
> and 10 levels of squaring
>

For low order discretizations, squaring every level, as you are doing,
sounds right. And the mass matrix confuses GAMG's filtering heuristics, so no
filter sounds reasonable.

Note, hypre would do better than GAMG on this problem.
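
For anyone trying that comparison, a typical starting point (assuming PETSc was
configured with --download-hypre) is:

  -pc_type hypre -pc_hypre_type boomeramg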


> with gmres/jacobi smoothers (chebyshev/sor is slower)).
>

You don't want to use GMRES as a smoother (unless you have
indefinite Helmholtz). SOR will be more expensive but often converges a lot
faster. chebyshev/jacobi would probably be better for you.

And you want CG (-ksp_type cg) if this system is symmetric positive
definite.
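
For illustration, those recommendations map to option settings along these
lines (a sketch, not a tested configuration):

  -ksp_type cg
  -mg_levels_ksp_type chebyshev
  -mg_levels_pc_type jacobi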


>
>
> While I think that the base parameters should work well for strong
> scaling, do I have to modify any of my parameters for a weak scaling run ?
> Does GAMG automatically increase the number of mg-levels as grid size
> increases or is it upon the user to do that ?
>
> @Mark : Is there a GAMG implementation paper I should cite ? I've already
> added a citation for the Comput. Mech. (2007) 39: 497–507 as a reference
> for the general idea of applying agglomeration type multigrid
> preconditioning to helmholtz operators.
>
>
> Thank You,
> Sajid Ali | PhD Candidate
> Applied Physics
> Northwestern University
> s-sajid-ali.github.io
>
>


Re: [petsc-users] GAMG parameters for ideal coarsening ratio

2020-03-17 Thread Sajid Ali
 Hi Mark/Jed,

The problem I'm solving is scalar helmholtz in 2D, (u_t = A*u_xx + A*u_yy +
F_t*u, with the familiar 5 point central difference as the derivative
approximation, I'm also attaching the result of -info | grep GAMG if that
helps). My goal is to get weak and strong scaling results for the FD solver
(leading me to double check all my parameters). I ran the sweep again as
Mark suggested and it looks like my base params were close to optimal (
negative threshold and 10 levels of squaring with gmres/jacobi smoothers
(chebyshev/sor is slower)).


While I think that the base parameters should work well for strong scaling,
do I have to modify any of my parameters for a weak scaling run ? Does GAMG
automatically increase the number of mg-levels as grid size increases or is
it upon the user to do that ?

@Mark : Is there a GAMG implementation paper I should cite ? I've already
added a citation for the Comput. Mech. (2007) 39: 497–507 as a reference
for the general idea of applying agglomeration type multigrid
preconditioning to helmholtz operators.


Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] GAMG parameters for ideal coarsening ratio

2020-03-16 Thread Jed Brown
Sajid Ali  writes:

> Hi PETSc-developers,
>
> As per the manual, the ideal gamg parameters are those which result in
> MatPtAP time being roughly similar to (or just slightly larger than) KSP
> solve times. The way to adjust this is by changing the threshold for
> coarsening and/or squaring the graph. I was working with a grid of size
> 2^14 by 2^14 in a linear & time-independent TS with the following params :
>
> #PETSc Option Table entries:
> -ksp_monitor
> -ksp_rtol 1e-5
> -ksp_type fgmres
> -ksp_view
> -log_view
> -mg_levels_ksp_type gmres
> -mg_levels_pc_type jacobi
> -pc_gamg_coarse_eq_limit 1000
> -pc_gamg_reuse_interpolation true
> -pc_gamg_square_graph 10
> -pc_gamg_threshold -0.04
> -pc_gamg_type agg
> -pc_gamg_use_parallel_coarse_grid_solver
> -pc_mg_monitor
> -pc_type gamg
> -prop_steps 8
> -ts_monitor
> -ts_type cn
> #End of PETSc Option Table entries
>
> With this I get a grid complexity of 1.33047, 6 multigrid levels,
> MatPtAP/KSPSolve ratio of 0.24, and the linear solve at each TS step takes
> 5 iterations (with approx one order of magnitude reduction in residual per
> step for iterations 2 through 5 and two orders for the first). The
> convergence and grid complexity look good, but the ratio of grid coarsening
> time to ksp-solve time is far from ideal. I've attached the log file from
> this set of base parameters as well.
>
> To investigate the effect of coarsening rates, I ran a parameter sweep over
> the coarsening parameters (threshold and sq. graph) and I'm confused by the
> results. For some reason either the number of gamg levels turns out to be
> too high or it is set to 1. When I try to manually set the number of levels
> to 4 (with pc_mg_levels 4 and thres. -0.04/ squaring of 10) I see
> performance much worse than the base parameters. Any advice as to what I'm
> missing in my search for a set of params where MatPtAP to KSPSolve is ~ 1 ?

Your solver looks efficient and the time to setup roughly matches the
solve time:

PCSetUp                8 1.0 1.2202e+02 1.0 4.39e+09 1.0 4.9e+05 6.5e+03 6.3e+02 36 12 19 27 21  36 12 19 27 22  9201
PCApply               40 1.0 1.1077e+02 1.0 2.63e+10 1.0 2.0e+06 3.8e+03 2.0e+03 33 72 79 65 68  33 72 79 65 68 60662

If you have a specific need to reduce setup time or reduce solve time
(e.g., if you'll do many solves with the same setup), you might be able
to adjust.  But your iteration count is pretty low so probably not a lot
of room in that direction.
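
For illustration, when many solves share one setup, the preconditioner can be
kept across KSPSolve() calls; a minimal sketch, assuming a configured KSP named
ksp (the same is available on the command line as -ksp_reuse_preconditioner):

  PetscCall(KSPSetReusePreconditioner(ksp, PETSC_TRUE)); /* keep the GAMG setup across solves */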


Re: [petsc-users] GAMG scalability for serendipity 20 nodes hexahedra

2019-06-27 Thread TARDIEU Nicolas via petsc-users


Thank you very much for your answer, Mark.
Do you think it is worth it to play around with aggregation variants? Plain 
aggregation "à la Notay" for instance.

Nicolas
 



 

From: mfad...@lbl.gov 
Sent: Wednesday, June 26, 2019 22:37
To: TARDIEU Nicolas
Cc: PETSc users list
Subject: Re: [petsc-users] GAMG scalability for serendipity 20 nodes hexahedra
  

I get growth with Q2 elements also. I've never seen anyone report scaling of 
high order elements with generic AMG.


First, discretizations are very important for AMG solvers (for all optimal 
solvers, really). I've never looked at serendipity elements. It might be a good idea to 
try Q2 as well.


SNES ex56 is 3D elasticity on a cube with tensor elements. Below are parameters 
that I have been using. I see some evidence that more smoothing steps 
(-mg_levels_ksp_max_it N) helps "scaling" but not necessarily solve time.


An example of what I see, running ex56 with -cells 8,12,16  -max_conv_its 5 and 
the below params I get these iteration counts: 19, 20, 31, 31, 38.


My guess is that you need higher order interpolation for higher order elements 
and when you add a new level you get an increase in condition number (ie, it is 
not an optimal MG method). But, the original smoothed aggregation paper did 
have high order discretizations, and their theory said it was still optimal, as I 
recall.


Mark


-log_view
-max_conv_its 5
-petscspace_degree 2
-snes_max_it 2
-ksp_max_it 100
-ksp_type cg
-ksp_rtol 1.e-11
-ksp_atol 1.e-71
-ksp_norm_type unpreconditioned
-snes_rtol 1.e-10
-pc_type gamg
-pc_gamg_type agg
-pc_gamg_agg_nsmooths 1
-pc_gamg_coarse_eq_limit 1000
-pc_gamg_process_eq_limit 200
-pc_gamg_reuse_interpolation true
-ksp_converged_reason
-snes_monitor_short
-ksp_monitor_short
-snes_converged_reason
-use_mat_nearnullspace true
-mg_levels_ksp_max_it 4
-mg_levels_ksp_type chebyshev
-mg_levels_esteig_ksp_type cg
-gamg_est_ksp_type cg
-gamg_est_ksp_max_it 10
-mg_levels_esteig_ksp_max_it 10
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_pc_type jacobi
-petscpartitioner_type simple
-mat_block_size 3
-matptap_via scalable
-run_type 1
-pc_gamg_repartition false
-pc_gamg_threshold 0.0
-pc_gamg_threshold_scale .25
-pc_gamg_square_graph 1
-check_pointer_intensity 0
-snes_type ksponly
-ex56_dm_view
-options_left




 


On Wed, Jun 26, 2019 at 8:21 AM TARDIEU Nicolas via petsc-users 
 wrote:
 
Dear PETSc team,


I have run a simple weak scalability test based on a canonical 3D elasticity 
problem: a cube, meshed with 8-node hexahedra, clamped on one of its faces and 
submitted to a pressure load on the opposite face. 
I am using the FGMRES ksp with GAMG as preconditioner. I have set the rigid 
body modes using MatNullSpaceCreateRigidBody and it works like a charm. The 
solver exhibits perfect scalability up to 800 cores (I haven't tested with 
more cores). The ksp always converges in 11 or 12 iterations. Let me emphasize 
that I use GAMG default options.
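
For readers unfamiliar with that near-nullspace setup, a minimal sketch
(assuming coords is a Vec of nodal coordinates with block size equal to the
spatial dimension, and A is the assembled elasticity Mat; PetscCall() is the
error-checking macro from recent PETSc):

  MatNullSpace nearnull;
  PetscCall(MatNullSpaceCreateRigidBody(coords, &nearnull));
  PetscCall(MatSetNearNullSpace(A, nearnull)); /* GAMG uses this to build interpolation */
  PetscCall(MatNullSpaceDestroy(&nearnull));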



Nevertheless, if I switch to a quadratic mesh with 20-node serendipity 
hexahedra, the weak scalability deteriorates. For instance, the number of 
iterations for the ksp increases from 20 for the smallest problem to 
30 for the biggest. 
Here is my question: I wonder what is the right tuning for GAMG to recover the 
same weak scalability as in the linear case? I apologize if this is a stupid 
question...





 
I look forward to hearing from you,
Nicolas

  


This message and any attachments (the 'Message') are intended solely for the 
addressees. The information contained in this Message is confidential. Any use 
of information contained in this Message not in accord with its purpose, any 
dissemination or disclosure,  either whole or partial, is prohibited except 
formal approval.
If you are not the addressee, you may not copy, forward, disclose or use any 
part of it. If you have received this message in error, please delete it and 
all copies from your system and notify the sender immediately by return message. 

Re: [petsc-users] GAMG scalability for serendipity 20 nodes hexahedra

2019-06-26 Thread Mark Adams via petsc-users
I get growth with Q2 elements also. I've never seen anyone report scaling
of high order elements with generic AMG.

First, discretizations are very important for AMG solvers (for all optimal
solvers, really). I've never looked at serendipity elements. It might be a
good idea to try Q2 as well.

SNES ex56 is 3D elasticity on a cube with tensor elements. Below are
parameters that I have been using. I see some evidence that more smoothing
steps (-mg_levels_ksp_max_it N) helps "scaling" but not necessarily solve
time.

An example of what I see, running ex56 with -cells 8,12,16  -max_conv_its 5
and the below params I get these iteration counts: 19, 20, 31, 31, 38.

My guess is that you need higher order interpolation for higher order
elements and when you add a new level you get an increase in condition
number (ie, it is not an optimal MG method). But, the original smoothed
aggregation paper did have high order discretizations, and their theory said it
was still optimal, as I recall.

Mark

-log_view
-max_conv_its 5
-petscspace_degree 2
-snes_max_it 2
-ksp_max_it 100
-ksp_type cg
-ksp_rtol 1.e-11
-ksp_atol 1.e-71
-ksp_norm_type unpreconditioned
-snes_rtol 1.e-10
-pc_type gamg
-pc_gamg_type agg
-pc_gamg_agg_nsmooths 1
-pc_gamg_coarse_eq_limit 1000
-pc_gamg_process_eq_limit 200
-pc_gamg_reuse_interpolation true
-ksp_converged_reason
-snes_monitor_short
-ksp_monitor_short
-snes_converged_reason
-use_mat_nearnullspace true
-mg_levels_ksp_max_it 4
-mg_levels_ksp_type chebyshev
-mg_levels_esteig_ksp_type cg
-gamg_est_ksp_type cg
-gamg_est_ksp_max_it 10
-mg_levels_esteig_ksp_max_it 10
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_pc_type jacobi
-petscpartitioner_type simple
-mat_block_size 3
-matptap_via scalable
-run_type 1
-pc_gamg_repartition false
-pc_gamg_threshold 0.0
-pc_gamg_threshold_scale .25
-pc_gamg_square_graph 1
-check_pointer_intensity 0
-snes_type ksponly
-ex56_dm_view
-options_left



On Wed, Jun 26, 2019 at 8:21 AM TARDIEU Nicolas via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Dear PETSc team,
>
>
> I have run a simple weak scalability test based on a canonical 3D elasticity
> problem: a cube, meshed with 8-node hexahedra, clamped on one of its faces
> and submitted to a pressure load on the opposite face.
>
> I am using the FGMRES ksp with GAMG as preconditioner. I have set the
> rigid body modes using MatNullSpaceCreateRigidBody and it works like a
> charm. The solver exhibits perfect scalability up to 800 cores (I haven't
> tested with more cores). The ksp always converges in 11 or 12 iterations.
> Let me emphasize that I use GAMG default options.
>
>
> Nevertheless, if I switch to a quadratic mesh with 20-node serendipity
> hexahedra, the weak scalability deteriorates. For instance, the number of
> iterations for the ksp increases from 20 for the smallest problem
> to 30 for the biggest.
>
> Here is my question: I wonder what is the right tuning for GAMG to
> recover the same weak scalability as in the linear case? I apologize if
> this is a stupid question...
>
> I look forward to hearing from you,
> Nicolas
>
>
> 
>
> This message and any attachments (the 'Message') are intended solely for
> the addressees. The information contained in this Message is confidential.
> Any use of information contained in this Message not in accord with its
> purpose, any dissemination or disclosure, either whole or partial, is
> prohibited except formal approval.
>
> If you are not the addressee, you may not copy, forward, disclose or use
> any part of it. If you have received this message in error, please delete
> it and all copies from your system and notify the sender immediately by
> return message.
>
> E-mail communication cannot be guaranteed to be timely secure, error or
> virus-free.
>


Re: [petsc-users] GAMG parallel convergence sensitivity

2019-03-14 Thread Jed Brown via petsc-users
Mark Lohry  writes:

> It seems to me with these semi-implicit methods the CFL limit is still so
> close to the explicit limit (that paper stops at 30), I don't really see
> the purpose unless you're running purely incompressible? That's just my
> ignorance speaking though. I'm currently running fully implicit for
> everything, with CFLs around 1e3 - 1e5 or so.

It depends what you're trying to resolve.  Sounds like maybe you're
stepping toward steady state.  The paper is wishing to resolve vortex
and baroclinic dynamics while stepping over acoustics and barotropic
waves.


Re: [petsc-users] GAMG parallel convergence sensitivity

2019-03-13 Thread Jed Brown via petsc-users
Mark Lohry via petsc-users  writes:

> For what it's worth, I'm regularly solving much larger problems (1M-100M
> unknowns, unsteady) with this discretization and AMG setup on 500+ cores
> with impressively great convergence, dramatically better than ILU/ASM. This
> just happens to be the first time I've experimented with this extremely low
> Mach number, which is known to have a whole host of issues and generally
> needs low-mach preconditioners, I was just a bit surprised by this specific
> failure mechanism.

A common technique for low-Mach preconditioning is to convert to
primitive variables (much better conditioned for the solve) and use a
Schur fieldsplit into the pressure space.  For modest time step, you can
use SIMPLE-like method ("selfp" in PCFieldSplit lingo) to approximate
that Schur complement.  You can also rediscretize to form that
approximation.  This paper has a bunch of examples of choices for the
state variables and derivation of the continuous pressure preconditioner
in each case.  (They present it as a classical semi-implicit method, but
that would be the Schur complement preconditioner if using FieldSplit
with a fully implicit or IMEX method.)

https://doi.org/10.1137/090775889
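
For illustration, that approach corresponds roughly to a PCFieldSplit
configuration along these lines (a sketch; it assumes the velocity/pressure
splits have been registered with the PC, and the factorization type shown is
one common choice, not the only one):

  -pc_type fieldsplit
  -pc_fieldsplit_type schur
  -pc_fieldsplit_schur_fact_type lower
  -pc_fieldsplit_schur_precondition selfp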


Re: [petsc-users] GAMG parallel convergence sensitivity

2019-03-13 Thread Mark Adams via petsc-users
>
>
>
> Any thoughts here? Is there anything obviously wrong with my setup?
>

Fast and robust solvers for NS require specialized methods that are not
provided in PETSc and the methods tend to require tighter integration with
the meshing and discretization than the algebraic interface supports.

I see you are using 20 smoothing steps. That is very high. Generally you
want to use the v-cycle more (ie, lower number of smoothing steps and more
iterations).

And, full MG is a bit tricky. I would not use it, but if it helps, fine.


> Any way to reduce the dependence of the convergence iterations on the
> parallelism?
>

This comes from the bjacobi smoother. Use jacobi and you will not have a
parallelism problem; bjacobi approaches jacobi in the limit of parallelism anyway.


> -- obviously I expect the iteration count to be higher in parallel, but I
> didn't expect such catastrophic failure.
>
>
You are beyond what AMG is designed for. If you press this problem it will
break any solver and will break generic AMG relatively early.

This makes it hard to give much advice. You really just need to test things
and use what works best. There are special purpose methods that you can
implement in PETSc but that is a topic for a significant project.


Re: [petsc-users] GAMG scaling

2018-12-24 Thread Mark Adams via petsc-users
On Tue, Dec 25, 2018 at 12:10 AM Jed Brown  wrote:

> Mark Adams  writes:
>
> > On Mon, Dec 24, 2018 at 4:56 PM Jed Brown  wrote:
> >
> >> Mark Adams via petsc-users  writes:
> >>
> >> > Anyway, my data for this is in my SC 2004 paper (MakeNextMat_private
> in
> >> > attached, NB, this is code that I wrote in grad school). It is memory
> >> > efficient and simple, just four nested loops i,j,I,J: C(I,J) =
> >> > P(i,I)*A(i,j)*P(j,J). In eyeballing the numbers and from new data
> that I
> >> am
> >> > getting from my bone modeling colleagues, that use this old code on
> >> > Stampede now, the times look reasonable compared to GAMG. This is
> >> optimized
> >> > for elasticity, where I unroll loops (so it is really six nested
> loops).
> >>
> >> Is the A above meant to include some ghosted rows?
> >>
> >
> > You could but I was thinking of having i in the outer loop. In C(I,J) =
> > P(i,I)*A(i,j)*P(j,J), the iteration over 'i' need only be the local rows
> of
> > A (and the left term P).
>
> Okay, so you need to gather those rows of P referenced by the
> off-diagonal parts of A.


yes, and this looks correct ..


> Once you have them, do
>
>   for i:
> v[:] = 0 # sparse vector
> for j:
>   v[:] += A[i,j] * P[j,:]
> for I:
>   C[I,:] += P[i,I] * v[:]
>
> One inefficiency is that you don't actually get "hits" on all the
> entries of C[I,:], but that much remains no matter how you reorder loops
> (unless you make I the outermost).
>

> >> > In thinking about this now, I think you want to make a local copy of P
> >> with
> >> > rows (j) for every column in A that you have locally, then transpose
> this
> >> > local thing for the P(j,J) term. A sparse AXPY on j. (My code uses a
> >> > special tree data structure but a matrix is simpler.)
> >>
> >> Why transpose for P(j,J)?
> >>
> >
> > (premature) optimization. I was thinking 'j' being in the inner loop and
> > doing sparse inner product, but now that I think about it there are other
> > options.
>
> Sparse inner products tend to be quite inefficient.  Explicit blocking
> helps some, but I would try to avoid it.
>

Yea, the design space here is non-trivial.

BTW, I have a Cal ME grad student that I've been working with on getting my
old parallel FE / Prometheus code running on Stampede for his bone modeling
problems. He started from zero in HPC but he is eager and has been picking
it up. If there is interest we could get performance data with the existing
code, as a benchmark, and we could generate matrices, if anyone wants to
look into this.


Re: [petsc-users] GAMG scaling

2018-12-24 Thread Jed Brown via petsc-users
Mark Adams  writes:

> On Mon, Dec 24, 2018 at 4:56 PM Jed Brown  wrote:
>
>> Mark Adams via petsc-users  writes:
>>
>> > Anyway, my data for this is in my SC 2004 paper (MakeNextMat_private in
>> > attached, NB, this is code that I wrote in grad school). It is memory
>> > efficient and simple, just four nested loops i,j,I,J: C(I,J) =
>> > P(i,I)*A(i,j)*P(j,J). In eyeballing the numbers and from new data that I
>> am
>> > getting from my bone modeling colleagues, that use this old code on
>> > Stampede now, the times look reasonable compared to GAMG. This is
>> optimized
>> > for elasticity, where I unroll loops (so it is really six nested loops).
>>
>> Is the A above meant to include some ghosted rows?
>>
>
> You could but I was thinking of having i in the outer loop. In C(I,J) =
> P(i,I)*A(i,j)*P(j,J), the iteration over 'i' need only be the local rows of
> A (and the left term P).

Okay, so you need to gather those rows of P referenced by the
off-diagonal parts of A.  Once you have them, do

  for i:
v[:] = 0 # sparse vector
for j:
  v[:] += A[i,j] * P[j,:]
for I:
  C[I,:] += P[i,I] * v[:]

One inefficiency is that you don't actually get "hits" on all the
entries of C[I,:], but that much remains no matter how you reorder loops
(unless you make I the outermost).
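
For illustration, a serial rendering of that loop structure over CSR storage
(a sketch: the accumulator v and the result C are densified for clarity, which
corresponds to the "dense axpy over an array of length PN" variant discussed
later in this digest; all names are hypothetical, and a real implementation
would keep C sparse):

  #include <stdlib.h>
  #include <string.h>

  /* C += P^T A P for CSR A (n x n) and CSR P (n x pn); C is a dense pn x pn
     array zeroed by the caller. Row i of A*P is accumulated into v, then
     scattered into the rows of C selected by row i of P. */
  static void ptap_rowwise(int n, int pn,
                           const int *ai, const int *aj, const double *av,
                           const int *pi, const int *pj, const double *pv,
                           double *C)
  {
    double *v = calloc((size_t)pn, sizeof(*v)); /* dense accumulator, length pn */
    for (int i = 0; i < n; i++) {
      memset(v, 0, (size_t)pn * sizeof(*v));
      for (int k = ai[i]; k < ai[i + 1]; k++) {  /* v += A(i,j) * P(j,:) */
        const int j = aj[k];
        for (int l = pi[j]; l < pi[j + 1]; l++) v[pj[l]] += av[k] * pv[l];
      }
      for (int l = pi[i]; l < pi[i + 1]; l++) {  /* C(I,:) += P(i,I) * v */
        const int I = pj[l];
        for (int J = 0; J < pn; J++) C[(size_t)I * pn + J] += pv[l] * v[J];
      }
    }
    free(v);
  }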

>> > In thinking about this now, I think you want to make a local copy of P
>> with
>> > rows (j) for every column in A that you have locally, then transpose this
>> > local thing for the P(j,J) term. A sparse AXPY on j. (My code uses a
>> > special tree data structure but a matrix is simpler.)
>>
>> Why transpose for P(j,J)?
>>
>
> (premature) optimization. I was thinking 'j' being in the inner loop and
> doing sparse inner product, but now that I think about it there are other
> options.

Sparse inner products tend to be quite inefficient.  Explicit blocking
helps some, but I would try to avoid it.


Re: [petsc-users] GAMG scaling

2018-12-24 Thread Mark Adams via petsc-users
On Mon, Dec 24, 2018 at 4:56 PM Jed Brown  wrote:

> Mark Adams via petsc-users  writes:
>
> > Anyway, my data for this is in my SC 2004 paper (MakeNextMat_private in
> > attached, NB, this is code that I wrote in grad school). It is memory
> > efficient and simple, just four nested loops i,j,I,J: C(I,J) =
> > P(i,I)*A(i,j)*P(j,J). In eyeballing the numbers and from new data that I
> am
> > getting from my bone modeling colleagues, that use this old code on
> > Stampede now, the times look reasonable compared to GAMG. This is
> optimized
> > for elasticity, where I unroll loops (so it is really six nested loops).
>
> Is the A above meant to include some ghosted rows?
>

You could but I was thinking of having i in the outer loop. In C(I,J) =
P(i,I)*A(i,j)*P(j,J), the iteration over 'i' need only be the local rows of
A (and the left term P).


>
> > In thinking about this now, I think you want to make a local copy of P
> with
> > rows (j) for every column in A that you have locally, then transpose this
> > local thing for the P(j,J) term. A sparse AXPY on j. (My code uses a
> > special tree data structure but a matrix is simpler.)
>
> Why transpose for P(j,J)?
>

(premature) optimization. I was thinking 'j' being in the inner loop and
doing sparse inner product, but now that I think about it there are other
options.


Re: [petsc-users] GAMG scaling

2018-12-24 Thread Jed Brown via petsc-users
Mark Adams via petsc-users  writes:

> Anyway, my data for this is in my SC 2004 paper (MakeNextMat_private in
> attached, NB, this is code that I wrote in grad school). It is memory
> efficient and simple, just four nested loops i,j,I,J: C(I,J) =
> P(i,I)*A(i,j)*P(j,J). In eyeballing the numbers and from new data that I am
> getting from my bone modeling colleagues, that use this old code on
> Stampede now, the times look reasonable compared to GAMG. This is optimized
> for elasticity, where I unroll loops (so it is really six nested loops).

Is the A above meant to include some ghosted rows?

> In thinking about this now, I think you want to make a local copy of P with
> rows (j) for every column in A that you have locally, then transpose this
> local thing for the P(j,J) term. A sparse AXPY on j. (My code uses a
> special tree data structure but a matrix is simpler.)

Why transpose for P(j,J)?


Re: [petsc-users] GAMG scaling

2018-12-22 Thread Mark Adams via petsc-users
Wow, this is an old thread.

Sorry if I sound like an old fart talking about the good old days but I
originally did RAP in Prometheus, in a non-work-optimal way that might be
of interest. Not hard to implement. I bring this up because we continue to
struggle with this damn thing. I think this approach is perfectly scalable
and pretty low overhead, and simple.

Note, I talked to the hypre people about this in like 97 when they were
implementing RAP and perhaps they are doing it this way ... the 4x slower
way.

Anyway, my data for this is in my SC 2004 paper (MakeNextMat_private in
attached, NB, this is code that I wrote in grad school). It is memory
efficient and simple, just four nested loops i,j,I,J: C(I,J) =
P(i,I)*A(i,j)*P(j,J). In eyeballing the numbers and from new data that I am
getting from my bone modeling colleagues, that use this old code on
Stampede now, the times look reasonable compared to GAMG. This is optimized
for elasticity, where I unroll loops (so it is really six nested loops).

In thinking about this now, I think you want to make a local copy of P with
rows (j) for every column in A that you have locally, then transpose this
local thing for the P(j,J) term. A sparse AXPY on j. (My code uses a
special tree data structure but a matrix is simpler.)


On Sat, Dec 22, 2018 at 3:39 AM Mark Adams  wrote:

> OK, so this thread has drifted, see title :)
>
> On Fri, Dec 21, 2018 at 10:01 PM Fande Kong  wrote:
>
>> Sorry, hit the wrong button.
>>
>>
>>
>> On Fri, Dec 21, 2018 at 7:56 PM Fande Kong  wrote:
>>
>>>
>>>
>>> On Fri, Dec 21, 2018 at 9:44 AM Mark Adams  wrote:
>>>
 Also, you mentioned that you are using 10 levels. This is very strange
 with GAMG. You can run with -info and grep on GAMG to see the sizes and the
 number of non-zeros per level. You should coarsen at a rate of about 2^D to
 3^D with GAMG (with 10 levels this would imply a very large fine grid
 problem so I suspect there is something strange going on with coarsening).
 Mark

>>>
>>> Hi Mark,
>>>
>>>
>> Thanks for your email. We did not try GAMG much for our problems since we
>> still have troubles to figure out how to effectively use GAMG so far.
>> Instead, we are building our own customized  AMG  that needs to use PtAP to
>> construct coarse matrices.  The customized AMG works pretty well for our
>> specific simulations. The bottleneck right now is that PtAP might
>> take too much memory, and the code crashes within the function "PtAP". I
>> definitely need a memory profiler to confirm my statement here.
>>
>> Thanks,
>>
>> Fande Kong,
>>
>>
>>
>>>
>>>
>>>

 On Fri, Dec 21, 2018 at 11:36 AM Zhang, Hong via petsc-users <
 petsc-users@mcs.anl.gov> wrote:

> Fande:
> I will explore it and get back to you.
> Does anyone know how to profile memory usage?
> Hong
>
> Thanks, Hong,
>>
>> I just briefly went through the code. I was wondering if it is
>> possible to destroy "c->ptap" (that caches a lot of intermediate data) to
>> release the memory after the coarse matrix is assembled. I understand you
>> may still want to reuse these data structures by default but for my
>> simulation, the preconditioner is fixed and there is no reason to keep 
>> the
>> "c->ptap".
>>
>
>> It would be great, if we could have this optional functionality.
>>
>> Fande Kong,
>>
>> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong 
>> wrote:
>>
>>> We use nonscalable implementation as default, and switch to scalable
>>> for matrices over finer grids. You may use option '-matptap_via 
>>> scalable'
>>> to force scalable PtAP  implementation for all PtAP. Let me know if it
>>> works.
>>> Hong
>>>
>>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>>> wrote:
>>>

   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable
 automatically for "large" problems, which is determined by some 
 heuristic.

Barry


 > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
 petsc-users@mcs.anl.gov> wrote:
 >
 >
 >
 > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
 wrote:
 > Fande:
 > Hong,
 > Thanks for your improvements on PtAP that is critical for MG-type
 algorithms.
 >
 > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
 > Mark,
 > Below is the copy of my email sent to you on Feb 27:
 >
 > I implemented scalable MatPtAP and did comparisons of three
 implementations using ex56.c on alcf cetus machine (this machine has 
 small
 memory, 1GB/core):
 > - nonscalable PtAP: use an array of length PN to do dense axpy
 > - scalable PtAP:   do sparse axpy without use of PN array
 >
 > What PN means here?
 > Global number of columns of P.

Re: [petsc-users] GAMG scaling

2018-12-22 Thread Mark Adams via petsc-users
OK, so this thread has drifted, see title :)

On Fri, Dec 21, 2018 at 10:01 PM Fande Kong  wrote:

> Sorry, hit the wrong button.
>
>
>
> On Fri, Dec 21, 2018 at 7:56 PM Fande Kong  wrote:
>
>>
>>
>> On Fri, Dec 21, 2018 at 9:44 AM Mark Adams  wrote:
>>
>>> Also, you mentioned that you are using 10 levels. This is very strange
>>> with GAMG. You can run with -info and grep on GAMG to see the sizes and the
>>> number of non-zeros per level. You should coarsen at a rate of about 2^D to
>>> 3^D with GAMG (with 10 levels this would imply a very large fine grid
>>> problem so I suspect there is something strange going on with coarsening).
>>> Mark
>>>
>>
>> Hi Mark,
>>
>>
> Thanks for your email. We did not try GAMG much for our problems since we
> still have troubles to figure out how to effectively use GAMG so far.
> Instead, we are building our own customized  AMG  that needs to use PtAP to
> construct coarse matrices.  The customized AMG works pretty well for our
> specific simulations. The bottleneck right now is that PtAP might
> take too much memory, and the code crashes within the function "PtAP". I
> definitely need a memory profiler to confirm my statement here.
>
> Thanks,
>
> Fande Kong,
>
>
>
>>
>>
>>
>>>
>>> On Fri, Dec 21, 2018 at 11:36 AM Zhang, Hong via petsc-users <
>>> petsc-users@mcs.anl.gov> wrote:
>>>
 Fande:
 I will explore it and get back to you.
 Does anyone know how to profile memory usage?
 Hong

 Thanks, Hong,
>
> I just briefly went through the code. I was wondering if it is
> possible to destroy "c->ptap" (that caches a lot of intermediate data) to
> release the memory after the coarse matrix is assembled. I understand you
> may still want to reuse these data structures by default but for my
> simulation, the preconditioner is fixed and there is no reason to keep the
> "c->ptap".
>

> It would be great, if we could have this optional functionality.
>
> Fande Kong,
>
> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong 
> wrote:
>
>> We use nonscalable implementation as default, and switch to scalable
>> for matrices over finer grids. You may use option '-matptap_via scalable'
>> to force scalable PtAP  implementation for all PtAP. Let me know if it
>> works.
>> Hong
>>
>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>> wrote:
>>
>>>
>>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
>>> for "large" problems, which is determined by some heuristic.
>>>
>>>Barry
>>>
>>>
>>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>>> petsc-users@mcs.anl.gov> wrote:
>>> >
>>> >
>>> >
>>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
>>> wrote:
>>> > Fande:
>>> > Hong,
>>> > Thanks for your improvements on PtAP that is critical for MG-type
>>> algorithms.
>>> >
>>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>>> > Mark,
>>> > Below is the copy of my email sent to you on Feb 27:
>>> >
>>> > I implemented scalable MatPtAP and did comparisons of three
>>> implementations using ex56.c on alcf cetus machine (this machine has 
>>> small
>>> memory, 1GB/core):
>>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>>> > - scalable PtAP:   do sparse axpy without use of PN array
>>> >
>>> > What does PN mean here?
>>> > Global number of columns of P.
>>> >
>>> > - hypre PtAP.
>>> >
>>> > The results are attached. Summary:
>>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than
>>> hypre PtAP
>>> > - scalable PtAP is 4x faster than hypre PtAP
>>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>>> >
>>> > I was wondering how much more memory PETSc PtAP uses than hypre? I
>>> am implementing an AMG algorithm based on PETSc right now, and it is
>>> working well. But we have found a bottleneck with PtAP. For the same P
>>> and A, PETSc PtAP fails to generate a coarse matrix because it runs out
>>> of memory, while hypre can still generate the coarse matrix.
>>> >
>>> > I do not want to just use the HYPRE one because we had to
>>> duplicate matrices if I used HYPRE PtAP.
>>> >
>>> > It would be nice if you guys have already done some comparisons of
>>> these implementations' memory usage.
>>> > Do you encounter memory issues with scalable PtAP?
>>> >
>>> > By default do we use the scalable PtAP?? Do we have to specify
>>> some options to use the scalable version of PtAP?  If so, it would be 
>>> nice
>>> to use the scalable version by default.  I am totally missing something
>>> here.
>>> >
>>> > Thanks,
>>> >
>>> > Fande
>>> >
>>> >
>>> > Karl had a student in the summer who improved MatPtAP(). Do you
>>> use the latest version of 

Re: [petsc-users] GAMG scaling

2018-12-21 Thread Fande Kong via petsc-users
Sorry, hit the wrong button.



On Fri, Dec 21, 2018 at 7:56 PM Fande Kong  wrote:

>
>
> On Fri, Dec 21, 2018 at 9:44 AM Mark Adams  wrote:
>
>> Also, you mentioned that you are using 10 levels. This is very strange
>> with GAMG. You can run with -info and grep on GAMG to see the sizes and the
>> number of non-zeros per level. You should coarsen at a rate of about 2^D to
>> 3^D with GAMG (with 10 levels this would imply a very large fine grid
>> problem so I suspect there is something strange going on with coarsening).
>> Mark
>>
>
> Hi Mark,
>
>
Thanks for your email. We did not try GAMG much for our problems since we
still have trouble figuring out how to use GAMG effectively so far.
Instead, we are building our own customized AMG that needs to use PtAP to
construct coarse matrices. The customized AMG works pretty well for our
specific simulations. The bottleneck right now is that PtAP might
take too much memory, and the code crashes within the function "PtAP". I
definitely need a memory profiler to confirm my statement here.

Thanks,

Fande Kong,



>
>
>
>>
>> On Fri, Dec 21, 2018 at 11:36 AM Zhang, Hong via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Fande:
>>> I will explore it and get back to you.
>>> Does anyone know how to profile memory usage?
>>> Hong
>>>
>>> Thanks, Hong,

 I just briefly went through the code. I was wondering if it is possible
 to destroy "c->ptap" (that caches a lot of intermediate data) to release
 the memory after the coarse matrix is assembled. I understand you may still
 want to reuse these data structures by default but for my simulation, the
 preconditioner is fixed and there is no reason to keep the "c->ptap".

>>>
It would be great if we could have this optional functionality.

 Fande Kong,

 On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:

> We use nonscalable implementation as default, and switch to scalable
> for matrices over finer grids. You may use option '-matptap_via scalable'
> to force scalable PtAP  implementation for all PtAP. Let me know if it
> works.
> Hong
>
> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
> wrote:
>
>>
>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
>> for "large" problems, which is determined by some heuristic.
>>
>>Barry
>>
>>
>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>> >
>> >
>> >
>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
>> wrote:
>> > Fande:
>> > Hong,
>> > Thanks for your improvements on PtAP, which is critical for MG-type
>> algorithms.
>> >
>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>> > Mark,
>> > Below is the copy of my email sent to you on Feb 27:
>> >
>> > I implemented scalable MatPtAP and did comparisons of three
>> implementations using ex56.c on alcf cetus machine (this machine has 
>> small
>> memory, 1GB/core):
>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>> > - scalable PtAP:   do sparse axpy without use of PN array
>> >
>> > What does PN mean here?
>> > Global number of columns of P.
>> >
>> > - hypre PtAP.
>> >
>> > The results are attached. Summary:
>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
>> PtAP
>> > - scalable PtAP is 4x faster than hypre PtAP
>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>> >
>> > I was wondering how much more memory PETSc PtAP uses than hypre? I
>> am implementing an AMG algorithm based on PETSc right now, and it is
>> working well. But we have found a bottleneck with PtAP. For the same P and
>> A, PETSc PtAP fails to generate a coarse matrix because it runs out of
>> memory, while hypre can still generate the coarse matrix.
>> >
>> > I do not want to just use the HYPRE one because we had to duplicate
>> matrices if I used HYPRE PtAP.
>> >
>> > It would be nice if you guys have already done some comparisons of
>> these implementations' memory usage.
>> > Do you encounter memory issues with scalable PtAP?
>> >
>> > By default do we use the scalable PtAP?? Do we have to specify some
>> options to use the scalable version of PtAP?  If so, it would be nice to
>> use the scalable version by default.  I am totally missing something 
>> here.
>> >
>> > Thanks,
>> >
>> > Fande
>> >
>> >
>> > Karl had a student in the summer who improved MatPtAP(). Do you use
>> the latest version of petsc?
>> > HYPRE may use less memory than PETSc because it does not save and
>> reuse the matrices.
>> >
>> > I do not understand why generating coarse matrix fails due to out
>> of memory. Do you use direct solver at coarse grid?
>> > Hong
>> >
>> 

Re: [petsc-users] GAMG scaling

2018-12-21 Thread Fande Kong via petsc-users
Thanks so much, Hong,

If any new finding, please let me know.


On Fri, Dec 21, 2018 at 9:36 AM Zhang, Hong  wrote:

> Fande:
> I will explore it and get back to you.
> Does anyone know how to profile memory usage?
>

We are using gperftools
https://gperftools.github.io/gperftools/heapprofile.html
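
A minimal sketch of how we invoke it (assuming the executable is linked
against gperftools' tcmalloc, e.g. with -ltcmalloc; "myapp" and its options
are placeholders):

  # each process dumps heap snapshots like /tmp/myapp.hprof.0001.heap;
  # with several ranks you may want a per-rank prefix to avoid collisions
  HEAPPROFILE=/tmp/myapp.hprof mpiexec -n 8 ./myapp -options_file opts.txt
  # turn one snapshot into a text report
  pprof --text ./myapp /tmp/myapp.hprof.0001.heap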

Fande,



> Hong
>
> Thanks, Hong,
>>
>> I just briefly went through the code. I was wondering if it is possible
>> to destroy "c->ptap" (that caches a lot of intermediate data) to release
>> the memory after the coarse matrix is assembled. I understand you may still
>> want to reuse these data structures by default but for my simulation, the
>> preconditioner is fixed and there is no reason to keep the "c->ptap".
>>
>
>> It would be great if we could have this optional functionality.
>>
>> Fande Kong,
>>
>> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
>>
>>> We use nonscalable implementation as default, and switch to scalable for
>>> matrices over finer grids. You may use option '-matptap_via scalable' to
>>> force scalable PtAP  implementation for all PtAP. Let me know if it works.
>>> Hong
>>>
>>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>>> wrote:
>>>

   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
 for "large" problems, which is determined by some heuristic.

Barry


 > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
 petsc-users@mcs.anl.gov> wrote:
 >
 >
 >
 > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
 wrote:
 > Fande:
 > Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type
 algorithms.
 >
 > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
 > Mark,
 > Below is the copy of my email sent to you on Feb 27:
 >
 > I implemented scalable MatPtAP and did comparisons of three
 implementations using ex56.c on alcf cetus machine (this machine has small
 memory, 1GB/core):
 > - nonscalable PtAP: use an array of length PN to do dense axpy
 > - scalable PtAP:   do sparse axpy without use of PN array
 >
> What does PN mean here?
 > Global number of columns of P.
 >
 > - hypre PtAP.
 >
 > The results are attached. Summary:
 > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
 PtAP
 > - scalable PtAP is 4x faster than hypre PtAP
 > - hypre uses less memory (see job.ne399.n63.np1000.sh)
 >
 > I was wondering how much more memory PETSc PtAP uses than hypre? I am
 implementing an AMG algorithm based on PETSc right now, and it is working
well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
PtAP fails to generate a coarse matrix because it runs out of memory, while
hypre can still generate the coarse matrix.
 >
 > I do not want to just use the HYPRE one because we had to duplicate
 matrices if I used HYPRE PtAP.
 >
> It would be nice if you guys have already done some comparisons of
these implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
 >
 > By default do we use the scalable PtAP?? Do we have to specify some
 options to use the scalable version of PtAP?  If so, it would be nice to
 use the scalable version by default.  I am totally missing something here.
 >
 > Thanks,
 >
 > Fande
 >
 >
 > Karl had a student in the summer who improved MatPtAP(). Do you use
 the latest version of petsc?
 > HYPRE may use less memory than PETSc because it does not save and
 reuse the matrices.
 >
 > I do not understand why generating coarse matrix fails due to out of
 memory. Do you use direct solver at coarse grid?
 > Hong
 >
> Based on the above observation, I set the default PtAP algorithm to
'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
switches to 'scalable'.
> The user can override the default.
 >
 > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
 > MatPtAP   3.6224e+01 (nonscalable for small mats,
 scalable for larger ones)
 > scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
 >
> This work is on petsc-master. Give it a try. If you encounter any
 problem, let me know.
 >
 > Hong
 >
 > On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
 > (Hong), what is the current state of optimizing RAP for scaling?
 >
> Nate is driving 3D elasticity problems at scale with GAMG and we
 are working out performance problems. They are hitting problems at ~1.5B
 dof problems on a basic Cray (XC30 I think).
 >
 > Thanks,
 > Mark
 >




Re: [petsc-users] GAMG scaling

2018-12-21 Thread Matthew Knepley via petsc-users
On Fri, Dec 21, 2018 at 12:55 PM Zhang, Hong  wrote:

> Matt:
>
>> Does anyone know how to profile memory usage?
>>>
>>
>> The best serial way is to use Massif, which is part of valgrind. I think
>> it might work in parallel if you
>> only look at one process at a time.
>>
>
> Can you give an example of using  Massif?
> For example, how to use it on petsc/src/ksp/ksp/examples/tutorials/ex56.c
> with np=8?
>

I have not used it in a while, so I have nothing lying around. However,
the manual is very good:

http://valgrind.org/docs/manual/ms-manual.html
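
For the np=8 ex56 case, something along these lines should work (the ex56
options are placeholders; each rank writes its own massif.out.<pid> file):

  cd src/ksp/ksp/examples/tutorials
  mpiexec -n 8 valgrind --tool=massif --massif-out-file=massif.out.%p \
      ./ex56 -ne 13 -pc_type gamg
  ms_print massif.out.12345   # pick any one rank's output file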

  Thanks,

Matt


> Hong
>
>>
>>
>>> Hong
>>>
>>> Thanks, Hong,

 I just briefly went through the code. I was wondering if it is possible
 to destroy "c->ptap" (that caches a lot of intermediate data) to release
 the memory after the coarse matrix is assembled. I understand you may still
 want to reuse these data structures by default but for my simulation, the
 preconditioner is fixed and there is no reason to keep the "c->ptap".

>>>
It would be great if we could have this optional functionality.

 Fande Kong,

 On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:

> We use nonscalable implementation as default, and switch to scalable
> for matrices over finer grids. You may use option '-matptap_via scalable'
> to force scalable PtAP  implementation for all PtAP. Let me know if it
> works.
> Hong
>
> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
> wrote:
>
>>
>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
>> for "large" problems, which is determined by some heuristic.
>>
>>Barry
>>
>>
>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>> >
>> >
>> >
>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
>> wrote:
>> > Fande:
>> > Hong,
>> > Thanks for your improvements on PtAP, which is critical for MG-type
>> algorithms.
>> >
>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>> > Mark,
>> > Below is the copy of my email sent to you on Feb 27:
>> >
>> > I implemented scalable MatPtAP and did comparisons of three
>> implementations using ex56.c on alcf cetus machine (this machine has 
>> small
>> memory, 1GB/core):
>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>> > - scalable PtAP:   do sparse axpy without use of PN array
>> >
>> > What does PN mean here?
>> > Global number of columns of P.
>> >
>> > - hypre PtAP.
>> >
>> > The results are attached. Summary:
>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
>> PtAP
>> > - scalable PtAP is 4x faster than hypre PtAP
>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>> >
>> > I was wondering how much more memory PETSc PtAP uses than hypre? I
>> am implementing an AMG algorithm based on PETSc right now, and it is
>> working well. But we have found a bottleneck with PtAP. For the same P and
>> A, PETSc PtAP fails to generate a coarse matrix because it runs out of
>> memory, while hypre can still generate the coarse matrix.
>> >
>> > I do not want to just use the HYPRE one because we had to duplicate
>> matrices if I used HYPRE PtAP.
>> >
>> > It would be nice if you guys have already done some comparisons of
>> these implementations' memory usage.
>> > Do you encounter memory issues with scalable PtAP?
>> >
>> > By default do we use the scalable PtAP?? Do we have to specify some
>> options to use the scalable version of PtAP?  If so, it would be nice to
>> use the scalable version by default.  I am totally missing something 
>> here.
>> >
>> > Thanks,
>> >
>> > Fande
>> >
>> >
>> > Karl had a student in the summer who improved MatPtAP(). Do you use
>> the latest version of petsc?
>> > HYPRE may use less memory than PETSc because it does not save and
>> reuse the matrices.
>> >
>> > I do not understand why generating coarse matrix fails due to out
>> of memory. Do you use direct solver at coarse grid?
>> > Hong
>> >
>> > Based on the above observation, I set the default PtAP algorithm to
>> 'nonscalable'.
>> > When PN > the local estimated number of nonzeros of C=PtAP, the default
>> switches to 'scalable'.
>> > The user can override the default.
>> >
>> > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I
>> get
>> > MatPtAP   3.6224e+01 (nonscalable for small mats,
>> scalable for larger ones)
>> > scalable MatPtAP 4.6129e+01
>> > hypre            1.9389e+02
>> >
>> > This work is on petsc-master. Give it a try. If you encounter any
>> problem, let me know.
>> >
>> > Hong
>> >
>> > On Wed, May 3, 2017 at 

Re: [petsc-users] GAMG scaling

2018-12-21 Thread Zhang, Hong via petsc-users
Matt:
Does anyone know how to profile memory usage?

The best serial way is to use Massif, which is part of valgrind. I think it 
might work in parallel if you
only look at one process at a time.

Can you give an example of using  Massif?
For example, how to use it on petsc/src/ksp/ksp/examples/tutorials/ex56.c with 
np=8?
Hong

Hong

Thanks, Hong,

I just briefly went through the code. I was wondering if it is possible to 
destroy "c->ptap" (that caches a lot of intermediate data) to release the 
memory after the coarse matrix is assembled. I understand you may still want to 
reuse these data structures by default but for my simulation, the 
preconditioner is fixed and there is no reason to keep the "c->ptap".

It would be great if we could have this optional functionality.

Fande Kong,

On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
We use nonscalable implementation as default, and switch to scalable for 
matrices over finer grids. You may use option '-matptap_via scalable' to force 
scalable PtAP  implementation for all PtAP. Let me know if it works.
Hong

On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F.  wrote:

  See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically for 
"large" problems, which is determined by some heuristic.

   Barry


> On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users  wrote:
>
>
>
> On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong  wrote:
> Fande:
> Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type algorithms.
>
> On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
> Mark,
> Below is the copy of my email sent to you on Feb 27:
>
> I implemented scalable MatPtAP and did comparisons of three implementations 
> using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP:   do sparse axpy without use of PN array
>
> What does PN mean here?
> Global number of columns of P.
>
> - hypre PtAP.
>
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see 
> job.ne399.n63.np1000.sh)
>
> I was wondering how much more memory PETSc PtAP uses than hypre? I am 
> implementing an AMG algorithm based on PETSc right now, and it is working 
> well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
> PtAP fails to generate a coarse matrix because it runs out of memory, while
> hypre can still generate the coarse matrix.
>
> I do not want to just use the HYPRE one because we had to duplicate matrices 
> if I used HYPRE PtAP.
>
> It would be nice if you guys have already done some comparisons of these
> implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
>
> By default do we use the scalable PtAP?? Do we have to specify some options 
> to use the scalable version of PtAP?  If so, it would be nice to use the 
> scalable version by default.  I am totally missing something here.
>
> Thanks,
>
> Fande
>
>
> Karl had a student in the summer who improved MatPtAP(). Do you use the 
> latest version of petsc?
> HYPRE may use less memory than PETSc because it does not save and reuse the 
> matrices.
>
> I do not understand why generating coarse matrix fails due to out of memory. 
> Do you use direct solver at coarse grid?
> Hong
>
> Based on the above observation, I set the default PtAP algorithm to 'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
> switches to 'scalable'.
> The user can override the default.
>
> For the case of np=8000, ne=599 (see 
> job.ne599.n500.np8000.sh), I get
> MatPtAP   3.6224e+01 (nonscalable for small mats, scalable 
> for larger ones)
> scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
>
> This work is on petsc-master. Give it a try. If you encounter any problem,
> let me know.
>
> Hong
>
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
> (Hong), what is the current state of optimizing RAP for scaling?
>
> Nate is driving 3D elasticity problems at scale with GAMG and we are
> working out performance problems. They are hitting problems at ~1.5B dof 
> problems on a basic Cray (XC30 I think).
>
> Thanks,
> Mark
>



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/


Re: [petsc-users] GAMG scaling

2018-12-21 Thread Matthew Knepley via petsc-users
On Fri, Dec 21, 2018 at 11:36 AM Zhang, Hong via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Fande:
> I will explore it and get back to you.
> Does anyone know how to profile memory usage?
>

The best serial way is to use Massif, which is part of valgrind. I think it
might work in parallel if you
only look at one process at a time.

  Matt


> Hong
>
> Thanks, Hong,
>>
>> I just briefly went through the code. I was wondering if it is possible
>> to destroy "c->ptap" (that caches a lot of intermediate data) to release
>> the memory after the coarse matrix is assembled. I understand you may still
>> want to reuse these data structures by default but for my simulation, the
>> preconditioner is fixed and there is no reason to keep the "c->ptap".
>>
>
>> It would be great if we could have this optional functionality.
>>
>> Fande Kong,
>>
>> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
>>
>>> We use nonscalable implementation as default, and switch to scalable for
>>> matrices over finer grids. You may use option '-matptap_via scalable' to
>>> force scalable PtAP  implementation for all PtAP. Let me know if it works.
>>> Hong
>>>
>>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>>> wrote:
>>>

   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
 for "large" problems, which is determined by some heuristic.

Barry


 > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
 petsc-users@mcs.anl.gov> wrote:
 >
 >
 >
 > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
 wrote:
 > Fande:
 > Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type
 algorithms.
 >
 > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
 > Mark,
 > Below is the copy of my email sent to you on Feb 27:
 >
 > I implemented scalable MatPtAP and did comparisons of three
 implementations using ex56.c on alcf cetus machine (this machine has small
 memory, 1GB/core):
 > - nonscalable PtAP: use an array of length PN to do dense axpy
 > - scalable PtAP:   do sparse axpy without use of PN array
 >
> What does PN mean here?
 > Global number of columns of P.
 >
 > - hypre PtAP.
 >
 > The results are attached. Summary:
 > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
 PtAP
 > - scalable PtAP is 4x faster than hypre PtAP
 > - hypre uses less memory (see job.ne399.n63.np1000.sh)
 >
 > I was wondering how much more memory PETSc PtAP uses than hypre? I am
 implementing an AMG algorithm based on PETSc right now, and it is working
well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
PtAP fails to generate a coarse matrix because it runs out of memory, while
hypre can still generate the coarse matrix.
 >
 > I do not want to just use the HYPRE one because we had to duplicate
 matrices if I used HYPRE PtAP.
 >
> It would be nice if you guys have already done some comparisons of
these implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
 >
 > By default do we use the scalable PtAP?? Do we have to specify some
 options to use the scalable version of PtAP?  If so, it would be nice to
 use the scalable version by default.  I am totally missing something here.
 >
 > Thanks,
 >
 > Fande
 >
 >
 > Karl had a student in the summer who improved MatPtAP(). Do you use
 the latest version of petsc?
 > HYPRE may use less memory than PETSc because it does not save and
 reuse the matrices.
 >
 > I do not understand why generating coarse matrix fails due to out of
 memory. Do you use direct solver at coarse grid?
 > Hong
 >
> Based on the above observation, I set the default PtAP algorithm to
'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
switches to 'scalable'.
> The user can override the default.
 >
 > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
 > MatPtAP   3.6224e+01 (nonscalable for small mats,
 scalable for larger ones)
 > scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
 >
> This work is on petsc-master. Give it a try. If you encounter any
 problem, let me know.
 >
 > Hong
 >
 > On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
 > (Hong), what is the current state of optimizing RAP for scaling?
 >
> Nate is driving 3D elasticity problems at scale with GAMG and we
 are working out performance problems. They are hitting problems at ~1.5B
 dof problems on a basic Cray (XC30 I think).
 >
 > Thanks,
 > Mark
 >



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their

Re: [petsc-users] GAMG scaling

2018-12-21 Thread Mark Adams via petsc-users
Also, you mentioned that you are using 10 levels. This is very strange with
GAMG. You can run with -info and grep on GAMG to see the sizes and the
number of non-zeros per level. You should coarsen at a rate of about 2^D to
3^D with GAMG (with 10 levels this would imply a very large fine grid
problem so I suspect there is something strange going on with coarsening).
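
For example (the executable and its other options are placeholders):

  mpiexec -n 8 ./myapp -pc_type gamg -info 2>&1 | grep GAMG

Each level then shows up with its size and nonzero counts, so the coarsening
rate can be read off directly.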
Mark

On Fri, Dec 21, 2018 at 11:36 AM Zhang, Hong via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Fande:
> I will explore it and get back to you.
> Does anyone know how to profile memory usage?
> Hong
>
> Thanks, Hong,
>>
>> I just briefly went through the code. I was wondering if it is possible
>> to destroy "c->ptap" (that caches a lot of intermediate data) to release
>> the memory after the coarse matrix is assembled. I understand you may still
>> want to reuse these data structures by default but for my simulation, the
>> preconditioner is fixed and there is no reason to keep the "c->ptap".
>>
>
>> It would be great if we could have this optional functionality.
>>
>> Fande Kong,
>>
>> On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
>>
>>> We use nonscalable implementation as default, and switch to scalable for
>>> matrices over finer grids. You may use option '-matptap_via scalable' to
>>> force scalable PtAP  implementation for all PtAP. Let me know if it works.
>>> Hong
>>>
>>> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
>>> wrote:
>>>

   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically
 for "large" problems, which is determined by some heuristic.

Barry


 > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
 petsc-users@mcs.anl.gov> wrote:
 >
 >
 >
 > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong 
 wrote:
 > Fande:
 > Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type
 algorithms.
 >
 > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
 > Mark,
 > Below is the copy of my email sent to you on Feb 27:
 >
 > I implemented scalable MatPtAP and did comparisons of three
 implementations using ex56.c on alcf cetus machine (this machine has small
 memory, 1GB/core):
 > - nonscalable PtAP: use an array of length PN to do dense axpy
 > - scalable PtAP:   do sparse axpy without use of PN array
 >
> What does PN mean here?
 > Global number of columns of P.
 >
 > - hypre PtAP.
 >
 > The results are attached. Summary:
 > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre
 PtAP
 > - scalable PtAP is 4x faster than hypre PtAP
 > - hypre uses less memory (see job.ne399.n63.np1000.sh)
 >
 > I was wondering how much more memory PETSc PtAP uses than hypre? I am
 implementing an AMG algorithm based on PETSc right now, and it is working
well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
PtAP fails to generate a coarse matrix because it runs out of memory, while
hypre can still generate the coarse matrix.
 >
 > I do not want to just use the HYPRE one because we had to duplicate
 matrices if I used HYPRE PtAP.
 >
> It would be nice if you guys have already done some comparisons of
these implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
 >
 > By default do we use the scalable PtAP?? Do we have to specify some
 options to use the scalable version of PtAP?  If so, it would be nice to
 use the scalable version by default.  I am totally missing something here.
 >
 > Thanks,
 >
 > Fande
 >
 >
 > Karl had a student in the summer who improved MatPtAP(). Do you use
 the latest version of petsc?
 > HYPRE may use less memory than PETSc because it does not save and
 reuse the matrices.
 >
 > I do not understand why generating coarse matrix fails due to out of
 memory. Do you use direct solver at coarse grid?
 > Hong
 >
> Based on the above observation, I set the default PtAP algorithm to
'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
switches to 'scalable'.
> The user can override the default.
 >
 > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
 > MatPtAP   3.6224e+01 (nonscalable for small mats,
 scalable for larger ones)
 > scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
 >
> This work is on petsc-master. Give it a try. If you encounter any
 problem, let me know.
 >
 > Hong
 >
 > On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
 > (Hong), what is the current state of optimizing RAP for scaling?
 >
> Nate is driving 3D elasticity problems at scale with GAMG and we
 are working out performance problems. They are hitting problems at ~1.5B
 dof problems on a basic Cray 

Re: [petsc-users] GAMG scaling

2018-12-21 Thread Zhang, Hong via petsc-users
Fande:
I will explore it and get back to you.
Does anyone know how to profile memory usage?
Hong

Thanks, Hong,

I just briefly went through the code. I was wondering if it is possible to 
destroy "c->ptap" (that caches a lot of intermediate data) to release the 
memory after the coarse matrix is assembled. I understand you may still want to 
reuse these data structures by default but for my simulation, the 
preconditioner is fixed and there is no reason to keep the "c->ptap".

It would be great if we could have this optional functionality.

Fande Kong,

On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:
We use nonscalable implementation as default, and switch to scalable for 
matrices over finer grids. You may use option '-matptap_via scalable' to force 
scalable PtAP  implementation for all PtAP. Let me know if it works.
Hong

On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F.  wrote:

  See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically for 
"large" problems, which is determined by some heuristic.

   Barry


> On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users  wrote:
>
>
>
> On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong  wrote:
> Fande:
> Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type algorithms.
>
> On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
> Mark,
> Below is the copy of my email sent to you on Feb 27:
>
> I implemented scalable MatPtAP and did comparisons of three implementations 
> using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP:   do sparse axpy without use of PN array
>
> What does PN mean here?
> Global number of columns of P.
>
> - hypre PtAP.
>
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see 
> job.ne399.n63.np1000.sh)
>
> I was wondering how much more memory PETSc PtAP uses than hypre? I am 
> implementing an AMG algorithm based on PETSc right now, and it is working 
> well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
> PtAP fails to generate a coarse matrix because it runs out of memory, while
> hypre can still generate the coarse matrix.
>
> I do not want to just use the HYPRE one because we had to duplicate matrices 
> if I used HYPRE PtAP.
>
> It would be nice if you guys have already done some comparisons of these
> implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
>
> By default do we use the scalable PtAP?? Do we have to specify some options 
> to use the scalable version of PtAP?  If so, it would be nice to use the 
> scalable version by default.  I am totally missing something here.
>
> Thanks,
>
> Fande
>
>
> Karl had a student in the summer who improved MatPtAP(). Do you use the 
> latest version of petsc?
> HYPRE may use less memory than PETSc because it does not save and reuse the 
> matrices.
>
> I do not understand why generating coarse matrix fails due to out of memory. 
> Do you use direct solver at coarse grid?
> Hong
>
> Based on the above observation, I set the default PtAP algorithm to 'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
> switches to 'scalable'.
> The user can override the default.
>
> For the case of np=8000, ne=599 (see 
> job.ne599.n500.np8000.sh), I get
> MatPtAP   3.6224e+01 (nonscalable for small mats, scalable 
> for larger ones)
> scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
>
> This work is on petsc-master. Give it a try. If you encounter any problem,
> let me know.
>
> Hong
>
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
> (Hong), what is the current state of optimizing RAP for scaling?
>
> Nate is driving 3D elasticity problems at scale with GAMG and we are
> working out performance problems. They are hitting problems at ~1.5B dof 
> problems on a basic Cray (XC30 I think).
>
> Thanks,
> Mark
>



Re: [petsc-users] GAMG scaling

2018-12-20 Thread Fande Kong via petsc-users
Thanks, Hong,

I just briefly went through the code. I was wondering if it is possible to
destroy "c->ptap" (that caches a lot of intermediate data) to release the
memory after the coarse matrix is assembled. I understand you may still
want to reuse these data structures by default but for my simulation, the
preconditioner is fixed and there is no reason to keep the "c->ptap".

It would be great if we could have this optional functionality.

Fande Kong,
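
To make the request concrete, here is a rough sketch of the shape such an
API could take; the free call at the end is hypothetical (its name and
availability are illustrative, not something this thread establishes):

  /* build the coarse operator once */
  Mat C;
  PetscErrorCode ierr;
  ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 2.0, &C);CHKERRQ(ierr);
  /* ... hand C to the coarse level of the preconditioner ... */
  /* hypothetical: drop the cached intermediate PtAP data, trading the
     ability to cheaply recompute C when A changes for lower memory use */
  ierr = MatFreeIntermediateDataStructures(C);CHKERRQ(ierr);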

On Thu, Dec 20, 2018 at 9:45 PM Zhang, Hong  wrote:

> We use nonscalable implementation as default, and switch to scalable for
> matrices over finer grids. You may use option '-matptap_via scalable' to
> force scalable PtAP  implementation for all PtAP. Let me know if it works.
> Hong
>
> On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F. 
> wrote:
>
>>
>>   See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically for
>> "large" problems, which is determined by some heuristic.
>>
>>Barry
>>
>>
>> > On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>> >
>> >
>> >
>> > On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong  wrote:
>> > Fande:
>> > Hong,
>> > Thanks for your improvements on PtAP, which is critical for MG-type
>> algorithms.
>> >
>> > On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
>> > Mark,
>> > Below is the copy of my email sent to you on Feb 27:
>> >
>> > I implemented scalable MatPtAP and did comparisons of three
>> implementations using ex56.c on alcf cetus machine (this machine has small
>> memory, 1GB/core):
>> > - nonscalable PtAP: use an array of length PN to do dense axpy
>> > - scalable PtAP:   do sparse axpy without use of PN array
>> >
>> > What does PN mean here?
>> > Global number of columns of P.
>> >
>> > - hypre PtAP.
>> >
>> > The results are attached. Summary:
>> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
>> > - scalable PtAP is 4x faster than hypre PtAP
>> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
>> >
>> > I was wondering how much more memory PETSc PtAP uses than hypre? I am
>> implementing an AMG algorithm based on PETSc right now, and it is working
>> well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
>> PtAP fails to generate a coarse matrix because it runs out of memory, while
>> hypre can still generate the coarse matrix.
>> >
>> > I do not want to just use the HYPRE one because we had to duplicate
>> matrices if I used HYPRE PtAP.
>> >
>> > It would be nice if you guys have already done some comparisons of
>> these implementations' memory usage.
>> > Do you encounter memory issues with scalable PtAP?
>> >
>> > By default do we use the scalable PtAP?? Do we have to specify some
>> options to use the scalable version of PtAP?  If so, it would be nice to
>> use the scalable version by default.  I am totally missing something here.
>> >
>> > Thanks,
>> >
>> > Fande
>> >
>> >
>> > Karl had a student in the summer who improved MatPtAP(). Do you use the
>> latest version of petsc?
>> > HYPRE may use less memory than PETSc because it does not save and reuse
>> the matrices.
>> >
>> > I do not understand why generating coarse matrix fails due to out of
>> memory. Do you use direct solver at coarse grid?
>> > Hong
>> >
>> > Based on the above observation, I set the default PtAP algorithm to
>> 'nonscalable'.
>> > When PN > the local estimated number of nonzeros of C=PtAP, the default
>> switches to 'scalable'.
>> > The user can override the default.
>> >
>> > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
>> > MatPtAP   3.6224e+01 (nonscalable for small mats,
>> scalable for larger ones)
>> > scalable MatPtAP 4.6129e+01
>> > hypre            1.9389e+02
>> >
>> > This work is on petsc-master. Give it a try. If you encounter any
>> problem, let me know.
>> >
>> > Hong
>> >
>> > On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
>> > (Hong), what is the current state of optimizing RAP for scaling?
>> >
>> > Nate is driving 3D elasticity problems at scale with GAMG and we are
>> working out performance problems. They are hitting problems at ~1.5B dof
>> problems on a basic Cray (XC30 I think).
>> >
>> > Thanks,
>> > Mark
>> >
>>
>>


Re: [petsc-users] GAMG scaling

2018-12-20 Thread Zhang, Hong via petsc-users
We use nonscalable implementation as default, and switch to scalable for 
matrices over finer grids. You may use option '-matptap_via scalable' to force 
scalable PtAP  implementation for all PtAP. Let me know if it works.
Hong
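
For example (the executable and its other options are placeholders):

  mpiexec -n 1000 ./myapp -pc_type gamg -matptap_via scalable -log_view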

On Thu, Dec 20, 2018 at 8:16 PM Smith, Barry F.  wrote:

  See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically for 
"large" problems, which is determined by some heuristic.

   Barry


> On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users  wrote:
>
>
>
> On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong  wrote:
> Fande:
> Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type algorithms.
>
> On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
> Mark,
> Below is the copy of my email sent to you on Feb 27:
>
> I implemented scalable MatPtAP and did comparisons of three implementations 
> using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP:   do sparse axpy without use of PN array
>
> What does PN mean here?
> Global number of columns of P.
>
> - hypre PtAP.
>
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see 
> job.ne399.n63.np1000.sh)
>
> I was wondering how much more memory PETSc PtAP uses than hypre? I am 
> implementing an AMG algorithm based on PETSc right now, and it is working 
> well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
> PtAP fails to generate a coarse matrix because it runs out of memory, while
> hypre can still generate the coarse matrix.
>
> I do not want to just use the HYPRE one because we had to duplicate matrices 
> if I used HYPRE PtAP.
>
> It would be nice if you guys have already done some comparisons of these
> implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
>
> By default do we use the scalable PtAP?? Do we have to specify some options 
> to use the scalable version of PtAP?  If so, it would be nice to use the 
> scalable version by default.  I am totally missing something here.
>
> Thanks,
>
> Fande
>
>
> Karl had a student in the summer who improved MatPtAP(). Do you use the 
> latest version of petsc?
> HYPRE may use less memory than PETSc because it does not save and reuse the 
> matrices.
>
> I do not understand why generating coarse matrix fails due to out of memory. 
> Do you use direct solver at coarse grid?
> Hong
>
> Based on the above observation, I set the default PtAP algorithm to 'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
> switches to 'scalable'.
> The user can override the default.
>
> For the case of np=8000, ne=599 (see 
> job.ne599.n500.np8000.sh), I get
> MatPtAP   3.6224e+01 (nonscalable for small mats, scalable 
> for larger ones)
> scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
>
> This work is on petsc-master. Give it a try. If you encounter any problem,
> let me know.
>
> Hong
>
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
> (Hong), what is the current state of optimizing RAP for scaling?
>
> Nate is driving 3D elasticity problems at scale with GAMG and we are
> working out performance problems. They are hitting problems at ~1.5B dof 
> problems on a basic Cray (XC30 I think).
>
> Thanks,
> Mark
>



Re: [petsc-users] GAMG scaling

2018-12-20 Thread Smith, Barry F. via petsc-users


  See MatPtAP_MPIAIJ_MPIAIJ(). It switches to scalable automatically for 
"large" problems, which is determined by some heuristic.

   Barry


> On Dec 20, 2018, at 6:46 PM, Fande Kong via petsc-users 
>  wrote:
> 
> 
> 
> On Thu, Dec 20, 2018 at 4:43 PM Zhang, Hong  wrote:
> Fande:
> Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type algorithms.
> 
> On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
> Mark,
> Below is the copy of my email sent to you on Feb 27:
> 
> I implemented scalable MatPtAP and did comparisons of three implementations 
> using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP:   do sparse axpy without use of PN array
> 
> What does PN mean here?
> Global number of columns of P. 
> 
> - hypre PtAP.
> 
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see job.ne399.n63.np1000.sh)
> 
> I was wondering how much more memory PETSc PtAP uses than hypre? I am 
> implementing an AMG algorithm based on PETSc right now, and it is working 
> well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
> PtAP fails to generate a coarse matrix because it runs out of memory, while
> hypre can still generate the coarse matrix.
> 
> I do not want to just use the HYPRE one because we had to duplicate matrices 
> if I used HYPRE PtAP.
> 
> It would be nice if you guys have already done some comparisons of these
> implementations' memory usage.
> Do you encounter memory issues with scalable PtAP?
> 
> By default do we use the scalable PtAP?? Do we have to specify some options 
> to use the scalable version of PtAP?  If so, it would be nice to use the 
> scalable version by default.  I am totally missing something here. 
> 
> Thanks,
> 
> Fande
> 
>  
> Karl had a student in the summer who improved MatPtAP(). Do you use the 
> latest version of petsc?
> HYPRE may use less memory than PETSc because it does not save and reuse the 
> matrices.
> 
> I do not understand why generating coarse matrix fails due to out of memory. 
> Do you use direct solver at coarse grid?
> Hong
> 
> Based on the above observation, I set the default PtAP algorithm to
> 'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
> switches to 'scalable'.
> The user can override the default.
> 
> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
> MatPtAP   3.6224e+01 (nonscalable for small mats, scalable 
> for larger ones)
> scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
> 
> This work is on petsc-master. Give it a try. If you encounter any problem,
> let me know.
> 
> Hong
> 
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
> (Hong), what is the current state of optimizing RAP for scaling?
> 
> Nate is driving 3D elasticity problems at scale with GAMG and we are
> working out performance problems. They are hitting problems at ~1.5B dof 
> problems on a basic Cray (XC30 I think).
> 
> Thanks,
> Mark
> 



Re: [petsc-users] GAMG scaling

2018-12-20 Thread Smith, Barry F. via petsc-users



> On Dec 20, 2018, at 5:51 PM, Zhang, Hong via petsc-users 
>  wrote:
> 
> Fande:
> Hong,
> Thanks for your improvements on PtAP, which is critical for MG-type algorithms.
> 
> On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
> Mark,
> Below is the copy of my email sent to you on Feb 27:
> 
> I implemented scalable MatPtAP and did comparisons of three implementations 
> using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP:   do sparse axpy without use of PN array
> 
> What does PN mean here?
> Global number of columns of P. 
> 
> - hypre PtAP.
> 
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see job.ne399.n63.np1000.sh)
> 
> I was wondering how much more memory PETSc PtAP uses than hypre? I am 
> implementing an AMG algorithm based on PETSc right now, and it is working 
> well. But we have found a bottleneck with PtAP. For the same P and A, PETSc
> PtAP fails to generate a coarse matrix because it runs out of memory, while
> hypre can still generate the coarse matrix.
> 
> I do not want to just use the HYPRE one because we had to duplicate matrices 
> if I used HYPRE PtAP.
> 
> It would be nice if you guys have already done some comparisons of these
> implementations' memory usage.
> Do you encounter memory issues with scalable PtAP? Karl had a student in the
> summer who improved MatPtAP(). Do you use the latest version of petsc?
> HYPRE may use less memory than PETSc because it does not save and reuse the 
> matrices.

   Could PETSc have an option where it does not save and reuse the matrices? 
And thus require less memory but with more compute time for multiple setups? 
How much memory would it save, 20%, 50%? 

   Barry

> 
> I do not understand why generating coarse matrix fails due to out of memory. 
> Do you use direct solver at coarse grid?
> Hong
> 
> Based on the above observation, I set the default PtAP algorithm to
> 'nonscalable'.
> When PN > the local estimated number of nonzeros of C=PtAP, the default
> switches to 'scalable'.
> The user can override the default.
> 
> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
> MatPtAP   3.6224e+01 (nonscalable for small mats, scalable 
> for larger ones)
> scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
> 
> This work is on petsc-master. Give it a try. If you encounter any problem,
> let me know.
> 
> Hong
> 
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
> (Hong), what is the current state of optimizing RAP for scaling?
> 
> Nate is driving 3D elasticity problems at scale with GAMG and we are
> working out performance problems. They are hitting problems at ~1.5B dof 
> problems on a basic Cray (XC30 I think).
> 
> Thanks,
> Mark
> 



Re: [petsc-users] GAMG scaling

2018-12-20 Thread Zhang, Hong via petsc-users
Fande:
Hong,
Thanks for your improvements on PtAP, which is critical for MG-type algorithms.

On Wed, May 3, 2017 at 10:17 AM Hong  wrote:
Mark,
Below is the copy of my email sent to you on Feb 27:

I implemented scalable MatPtAP and did comparisons of three implementations 
using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core):
- nonscalable PtAP: use an array of length PN to do dense axpy
- scalable PtAP:   do sparse axpy without use of PN array

What does PN mean here?
Global number of columns of P.

- hypre PtAP.

The results are attached. Summary:
- nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
- scalable PtAP is 4x faster than hypre PtAP
- hypre uses less memory (see 
job.ne399.n63.np1000.sh)

I was wondering how much more memory PETSc PtAP uses than hypre? I am 
implementing an AMG algorithm based on PETSc right now, and it is working well. 
But we have found a bottleneck with PtAP. For the same P and A, PETSc PtAP fails
to generate a coarse matrix because it runs out of memory, while hypre can
still generate the coarse matrix.

I do not want to just use the HYPRE one because we had to duplicate matrices if 
I used HYPRE PtAP.

It would be nice if you guys have already done some comparisons of these
implementations' memory usage.
Do you encounter memory issues with scalable PtAP? Karl had a student in the
summer who improved MatPtAP(). Do you use the latest version of petsc?
HYPRE may use less memory than PETSc because it does not save and reuse the 
matrices.

I do not understand why generating coarse matrix fails due to out of memory. Do 
you use direct solver at coarse grid?
Hong

Based on the above observation, I set the default PtAP algorithm to 'nonscalable'.
When PN > the local estimated number of nonzeros of C=PtAP, the default switches
to 'scalable'.
The user can override the default.

For the case of np=8000, ne=599 (see 
job.ne599.n500.np8000.sh), I get
MatPtAP   3.6224e+01 (nonscalable for small mats, scalable for 
larger ones)
scalable MatPtAP 4.6129e+01
hypre            1.9389e+02

This work is on petsc-master. Give it a try. If you encounter any problem, let
me know.

Hong

On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
(Hong), what is the current state of optimizing RAP for scaling?

Nate is driving 3D elasticity problems at scale with GAMG and we are working
out performance problems. They are hitting problems at ~1.5B dof problems on a 
basic Cray (XC30 I think).

Thanks,
Mark



Re: [petsc-users] GAMG Parallel Performance

2018-11-15 Thread Smith, Barry F. via petsc-users



> On Nov 15, 2018, at 1:02 PM, Mark Adams  wrote:
> 
> There is a lot of load imbalance in VecMAXPY also. The partitioning could be 
> bad and if not it's the machine.


> 
> On Thu, Nov 15, 2018 at 1:56 PM Smith, Barry F. via petsc-users 
>  wrote:
> 
> Something is odd about your configuration. Just consider the time for 
> VecMAXPY which is an embarrassingly parallel operation. On 1000 MPI processes 
> it produces
> 
>    Time                                                  flop rate
>  VecMAXPY 575 1.0 8.4132e-01 1.5 1.36e+09 1.0 0.0e+00 0.0e+00 
> 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,600,021
> 
> on 1500 processes it produces
> 
>  VecMAXPY 583 1.0 1.0786e+00 3.4 9.38e+08 1.0 0.0e+00 0.0e+00 
> 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,289,187
> 
> That is, it actually takes longer (the time goes from .84 seconds to 1.08
> seconds and the flop rate drops from 1,600,021 down to 1,289,187). You would
> never expect this kind of behavior.
> 
> and on 2000 processes it produces
> 
> VecMAXPY 583 1.0 7.1103e-01 2.7 7.03e+08 1.0 0.0e+00 0.0e+00 
> 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,955,563
> 
> so it speeds up again but not by very much. This is very mysterious and not 
> what you would expect.
> 
>I'm inclined to believe something is out of whack on your computer, are 
> you sure all nodes on the computer are equivalent? Same processors, same 
> clock speeds? What happens if you run the 1000 process case several times, do 
> you get very similar numbers for VecMAXPY()? You should but I am guessing you 
> may not.
> 
> Barry
> 
>   Note that this performance issue doesn't really have anything to do with 
> the preconditioner you are using.
> 
> 
> 
> 
> 
> > On Nov 15, 2018, at 10:50 AM, Karin via petsc-users 
> >  wrote:
> > 
> > Dear PETSc team,
> > 
> > I am solving a linear transient dynamic problem, based on a discretization 
> > with finite elements. To do that, I am using FGMRES with GAMG as a 
> > preconditioner. I consider here 10 time steps. 
> > The problem has around 118e6 dof and I am running on 1000, 1500 and 2000
> > procs. So I have something like 100e3, 78e3 and 50e3 dof/proc.
> > I notice that the performance deteriorates when I increase the number of 
> > processes. 
> > You can find attached the log_view of the execution and the
> > detailed definition of the KSP.
> > 
> > Is the problem too small to run on that number of processes or is there 
> > something wrong with my use of GAMG?
> > 
> > I thank you in advance for your help,
> > Nicolas
> > 
> 



Re: [petsc-users] GAMG Parallel Performance

2018-11-15 Thread Mark Adams via petsc-users
There is a lot of load imbalance in VecMAXPY also. The partitioning could
be bad and if not it's the machine.

On Thu, Nov 15, 2018 at 1:56 PM Smith, Barry F. via petsc-users <
petsc-users@mcs.anl.gov> wrote:

>
> Something is odd about your configuration. Just consider the time for
> VecMAXPY which is an embarrassingly parallel operation. On 1000 MPI
> processes it produces
>
>    Time                                                  flop rate
>  VecMAXPY 575 1.0 8.4132e-01 1.5 1.36e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,600,021
>
> on 1500 processes it produces
>
>  VecMAXPY 583 1.0 1.0786e+00 3.4 9.38e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,289,187
>
> That is, it actually takes longer (the time goes from .84 seconds to 1.08
> seconds and the flop rate drops from 1,600,021 down to 1,289,187). You would
> never expect this kind of behavior.
>
> and on 2000 processes it produces
>
> VecMAXPY 583 1.0 7.1103e-01 2.7 7.03e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,955,563
>
> so it speeds up again but not by very much. This is very mysterious and
> not what you would expect.
>
>I'm inclined to believe something is out of whack on your computer, are
> you sure all nodes on the computer are equivalent? Same processors, same
> clock speeds? What happens if you run the 1000 process case several times,
> do you get very similar numbers for VecMAXPY()? You should but I am
> guessing you may not.
>
> Barry
>
>   Note that this performance issue doesn't really have anything to do with
> the preconditioner you are using.
>
>
>
>
>
> > On Nov 15, 2018, at 10:50 AM, Karin via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
> >
> > Dear PETSc team,
> >
> > I am solving a linear transient dynamic problem, based on a
> discretization with finite elements. To do that, I am using FGMRES with
> GAMG as a preconditioner. I consider here 10 time steps.
> > The problem has around 118e6 dof and I am running on 1000, 1500 and
> 2000 procs. So I have something like 100e3, 78e3 and 50e3 dof/proc.
> > I notice that the performance deteriorates when I increase the number of
> processes.
> > You can find attached the log_view of the execution and the
> detailed definition of the KSP.
> >
> > Is the problem too small to run on that number of processes or is there
> something wrong with my use of GAMG?
> >
> > I thank you in advance for your help,
> > Nicolas
> >
> 
>
>


Re: [petsc-users] GAMG Parallel Performance

2018-11-15 Thread Smith, Barry F. via petsc-users


Something is odd about your configuration. Just consider the time for 
VecMAXPY which is an embarrassingly parallel operation. On 1000 MPI processes 
it produces

   Time                                                  flop rate
 VecMAXPY 575 1.0 8.4132e-01 1.5 1.36e+09 1.0 0.0e+00 0.0e+00 
0.0e+00  0  2  0  0  0   0  2  0  0  0 1,600,021

on 1500 processes it produces

 VecMAXPY 583 1.0 1.0786e+00 3.4 9.38e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  0  2  0  0  0   0  2  0  0  0 1,289,187

That is, it actually takes longer (the time goes from .84 seconds to 1.08
seconds and the flop rate drops from 1,600,021 down to 1,289,187). You would
never expect this kind of behavior.

and on 2000 processes it produces

VecMAXPY 583 1.0 7.1103e-01 2.7 7.03e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  0  2  0  0  0   0  2  0  0  0 1,955,563

so it speeds up again but not by very much. This is very mysterious and not 
what you would expect.

   I'm inclined to believe something is out of whack on your computer, are you 
sure all nodes on the computer are equivalent? Same processors, same clock 
speeds? What happens if you run the 1000 process case several times, do you get 
very similar numbers for VecMAXPY()? You should but I am guessing you may not.

Barry

  Note that this performance issue doesn't really have anything to do with the 
preconditioner you are using.
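
One quick way to check (placeholder executable; on a real cluster these runs
would go through the batch system):

  for i in 1 2 3; do
    mpiexec -n 1000 ./myapp -log_view 2>&1 | grep VecMAXPY
  done

If the VecMAXPY time and flop rate vary a lot across otherwise identical
runs, the machine or the node allocation is the likely culprit.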





> On Nov 15, 2018, at 10:50 AM, Karin via petsc-users 
>  wrote:
> 
> Dear PETSc team,
> 
> I am solving a linear transient dynamic problem, based on a discretization 
> with finite elements. To do that, I am using FGMRES with GAMG as a 
> preconditioner. I consider here 10 time steps. 
> The problem has around 118e6 dof and I am running on 1000, 1500 and 2000 
> procs. So I have something like 100e3, 78e3 and 50e3 dof/proc.
> I notice that the performance deteriorates when I increase the number of 
> processes. 
> You can find attached the log_view of the execution and the detailed 
> definition of the KSP.
> 
> Is the problem too small to run on that number of processes or is there 
> something wrong with my use of GAMG?
> 
> I thank you in advance for your help,
> Nicolas
> 



Re: [petsc-users] GAMG Parallel Performance

2018-11-15 Thread Matthew Knepley via petsc-users
On Thu, Nov 15, 2018 at 11:52 AM Karin via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Dear PETSc team,
>
> I am solving a linear transient dynamic problem, based on a discretization
> with finite elements. To do that, I am using FGMRES with GAMG as a
> preconditioner. I consider here 10 time steps.
> The problem has around 118e6 dof and I am running on 1000, 1500 and 2000
> procs. So I have something like 100e3, 78e3 and 50e3 dof/proc.
> I notice that the performance deteriorates when I increase the number of
> processes.
> You can find attached the log_view of the execution and the
> detailed definition of the KSP.
>
> Is the problem too small to run on that number of processes or is there
> something wrong with my use of GAMG?
>

I am having a hard time understanding the data. Just to be clear, I
understand you to be running the exact same problem on 1000, 1500, and 2000
processes, so we are looking for strong speedup. The PCSetUp time actually
sped up a little, which is great, and it's still a small percentage (notice
that your whole solve is only half the runtime). Let's just look at a big
time component, MatMult:

P = 1000

MatMult 7342 1.0 4.4956e+01 1.4 4.09e+10 1.2 9.6e+07
4.3e+03 0.0e+00 23 53 81 86  0  23 53 81 86  0 859939


P = 2000

MatMult 7470 1.0 4.7611e+01 1.9 2.11e+10 1.2 2.0e+08
2.9e+03 0.0e+00 11 53 81 86  0  11 53 81 86  0 827107


So there was no speedup at all. It is doing 1/2 the flops per process, but
taking almost exactly the same time. This looks like your 2000 process run
is on exactly the same number of nodes as your 1000 process run, but you
just use more processes. Your 1000 process run was maxing out the bandwidth
of those nodes, and thus 2000 runs no faster. Is this true? Otherwise, I am
misunderstanding the run.
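
A rough back-of-envelope supports the bandwidth reading (assuming standard
AIJ storage; the logs do not state the nodes' memory bandwidth, so the last
step is illustrative):

  flops per nonzero      ~ 2    (one multiply, one add)
  bytes per nonzero      ~ 12   (8-byte scalar + 4-byte column index)
  arithmetic intensity   ~ 2/12 = 1/6 flop/byte
  P = 1000 MatMult rate  ~ 860 GF/s aggregate (859,939 in the MF/s column)
  implied memory traffic ~ 860e9 * 6 bytes/s ~ 5 TB/s aggregate

If ~5 TB/s already saturates the memory systems of the nodes in use, adding
ranks on the same nodes cannot raise the MatMult rate, which matches the
P = 2000 numbers above.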

  Thanks,

Matt


> I thank you in advance for your help,
> Nicolas
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] GAMG advice

2017-11-10 Thread Mark Adams
On Thu, Nov 9, 2017 at 2:19 PM, David Nolte  wrote:

> Hi Mark,
>
> thanks for clarifying.
> When I wrote the initial question I had somehow overlooked the fact that
> the GAMG standard smoother was Chebychev while ML uses SOR. All the other
> comments concerning threshold etc were based on this mistake.
>
> The following settings work quite well, of course LU is used on the coarse
> level.
>
> -pc_type gamg
> -pc_gamg_type agg
> -pc_gamg_threshold 0.03
> -pc_gamg_square_graph 10    # no effect ?
> -pc_gamg_sym_graph
> -mg_levels_ksp_type richardson
> -mg_levels_pc_type sor
>
> -pc_gamg_agg_nsmooths 0 does not seem to improve the convergence.
>

Looks reasonable. And this smoothing is good for elliptic operator
convergence, but it makes the operator more expensive. It's worth doing for
elliptic operators, but in my experience not for others. If your convergence
rate does not change, then you probably want -pc_gamg_agg_nsmooths 0. This
is a cheaper (if smoothing does not help convergence a lot), simpler method,
and you want to use it.
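
For reference (standard smoothed-aggregation notation, not from this
thread): with -pc_gamg_agg_nsmooths 1 the tentative prolongator P_tent is
replaced by one damped-Jacobi sweep,

    P = (I - omega * D^{-1} A) P_tent,

so the Galerkin coarse operator P^T A P picks up a wider stencil and more
nonzeros per row; -pc_gamg_agg_nsmooths 0 keeps P = P_tent (plain
aggregation), the cheaper option described above.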


>
> The ksp view now looks like this: (does this seem reasonable?)
>
>
> KSP Object: 4 MPI processes
>   type: fgmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=1
>   tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
>   right preconditioning
>   using nonzero initial guess
>   using UNPRECONDITIONED norm type for convergence test
> PC Object: 4 MPI processes
>   type: gamg
> MG: type is MULTIPLICATIVE, levels=5 cycles=v
>   Cycles per PCApply=1
>   Using Galerkin computed coarse grid matrices
>   GAMG specific options
> Threshold for dropping small values from graph 0.03
> AGG specific options
>   Symmetric graph true
>   Coarse grid solver -- level ---
> KSP Object:(mg_coarse_) 4 MPI processes
>   type: preonly
>   maximum iterations=1, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object:(mg_coarse_) 4 MPI processes
>   type: bjacobi
> block Jacobi: number of blocks = 4
> Local solve is same for all blocks, in the following KSP and PC
> objects:
>   KSP Object:  (mg_coarse_sub_)   1 MPI processes
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> left preconditioning
> using NONE norm type for convergence test
>   PC Object:  (mg_coarse_sub_)   1 MPI processes
> type: lu
>   LU: out-of-place factorization
>   tolerance for zero pivot 2.22045e-14
>   using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
>   matrix ordering: nd
>   factor fill ratio given 5., needed 1.
> Factored matrix follows:
>   Mat Object:   1 MPI processes
> type: seqaij
> rows=38, cols=38
> package used to perform factorization: petsc
> total: nonzeros=1444, allocated nonzeros=1444
> total number of mallocs used during MatSetValues calls =0
>   using I-node routines: found 8 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=38, cols=38
>   total: nonzeros=1444, allocated nonzeros=1444
>   total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 8 nodes, limit used is 5
>   linear system matrix = precond matrix:
>   Mat Object:   4 MPI processes
> type: mpiaij
> rows=38, cols=38
> total: nonzeros=1444, allocated nonzeros=1444
> total number of mallocs used during MatSetValues calls =0
>   using I-node (on process 0) routines: found 8 nodes, limit used
> is 5
>   Down solver (pre-smoother) on level 1 ---
> KSP Object:(mg_levels_1_) 4 MPI processes
>   type: richardson
> Richardson: damping factor=1.
>   maximum iterations=2
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>   left preconditioning
>   using nonzero initial guess
>   using NONE norm type for convergence test
> PC Object:(mg_levels_1_) 4 MPI processes
>   type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations = 1,
> omega = 1.
>   linear system matrix = precond matrix:
>   Mat Object:   4 MPI processes
> type: mpiaij
> rows=168, cols=168
> total: nonzeros=19874, allocated nonzeros=19874

Re: [petsc-users] GAMG advice

2017-11-09 Thread David Nolte
Hi Mark,

thanks for clarifying.
When I wrote the initial question I had somehow overlooked the fact that
the GAMG standard smoother was Chebyshev while ML uses SOR. All the
other comments concerning threshold etc. were based on this mistake.

The following settings work quite well, of course LU is used on the
coarse level.

    -pc_type gamg
    -pc_gamg_type agg
    -pc_gamg_threshold 0.03
    -pc_gamg_square_graph 10        # no effect ?
    -pc_gamg_sym_graph
    -mg_levels_ksp_type richardson
    -mg_levels_pc_type sor

-pc_gamg_agg_nsmooths 0 does not seem to improve the convergence.

The ksp view now looks like this: (does this seem reasonable?)


KSP Object: 4 MPI processes
  type: fgmres
    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=1
  tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
  right preconditioning
  using nonzero initial guess
  using UNPRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: gamg
    MG: type is MULTIPLICATIVE, levels=5 cycles=v
  Cycles per PCApply=1
  Using Galerkin computed coarse grid matrices
  GAMG specific options
    Threshold for dropping small values from graph 0.03
    AGG specific options
  Symmetric graph true
  Coarse grid solver -- level ---
    KSP Object:    (mg_coarse_) 4 MPI processes
  type: preonly
  maximum iterations=1, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
  left preconditioning
  using NONE norm type for convergence test
    PC Object:    (mg_coarse_) 4 MPI processes
  type: bjacobi
    block Jacobi: number of blocks = 4
    Local solve is same for all blocks, in the following KSP and PC
objects:
  KSP Object:  (mg_coarse_sub_)   1 MPI processes
    type: preonly
    maximum iterations=1, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
    left preconditioning
    using NONE norm type for convergence test
  PC Object:  (mg_coarse_sub_)   1 MPI processes
    type: lu
  LU: out-of-place factorization
  tolerance for zero pivot 2.22045e-14
  using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
  matrix ordering: nd
  factor fill ratio given 5., needed 1.
    Factored matrix follows:
  Mat Object:   1 MPI processes
    type: seqaij
    rows=38, cols=38
    package used to perform factorization: petsc
    total: nonzeros=1444, allocated nonzeros=1444
    total number of mallocs used during MatSetValues calls =0
  using I-node routines: found 8 nodes, limit used is 5
    linear system matrix = precond matrix:
    Mat Object: 1 MPI processes
  type: seqaij
  rows=38, cols=38
  total: nonzeros=1444, allocated nonzeros=1444
  total number of mallocs used during MatSetValues calls =0
    using I-node routines: found 8 nodes, limit used is 5
  linear system matrix = precond matrix:
  Mat Object:   4 MPI processes
    type: mpiaij
    rows=38, cols=38
    total: nonzeros=1444, allocated nonzeros=1444
    total number of mallocs used during MatSetValues calls =0
  using I-node (on process 0) routines: found 8 nodes, limit
used is 5
  Down solver (pre-smoother) on level 1 ---
    KSP Object:    (mg_levels_1_) 4 MPI processes
  type: richardson
    Richardson: damping factor=1.
  maximum iterations=2
  tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
  left preconditioning
  using nonzero initial guess
  using NONE norm type for convergence test
    PC Object:    (mg_levels_1_) 4 MPI processes
  type: sor
    SOR: type = local_symmetric, iterations = 1, local iterations =
1, omega = 1.
  linear system matrix = precond matrix:
  Mat Object:   4 MPI processes
    type: mpiaij
    rows=168, cols=168
    total: nonzeros=19874, allocated nonzeros=19874
    total number of mallocs used during MatSetValues calls =0
  using I-node (on process 0) routines: found 17 nodes, limit
used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 ---
    KSP Object:    (mg_levels_2_) 4 MPI processes
  type: richardson
    Richardson: damping factor=1.
  maximum iterations=2
  tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
  left preconditioning
  using nonzero initial guess
  using NONE norm type for convergence test
    PC Object:    (mg_levels_2_) 4 MPI processes
  type: 

Re: [petsc-users] GAMG advice

2017-11-08 Thread Mark Adams
On Wed, Nov 1, 2017 at 5:45 PM, David Nolte  wrote:

> Thanks Barry.
> By simply replacing chebychev by richardson I get similar performance
> with GAMG and ML


That too (I assumed you were using the same, I could not see cheby in your
view data).

I guess SOR works for the coarse grid solver because the coarse grid is
small. It should help to use lu.


> (GAMG even slightly faster):
>

This is "random" fluctuations.


>
> -pc_type gamg
> -pc_gamg_type agg
> -pc_gamg_threshold 0.03
> -pc_gamg_square_graph 10
> -pc_gamg_sym_graph
> -mg_levels_ksp_type richardson
> -mg_levels_pc_type sor
>
> Is it still true that I need to set "-pc_gamg_sym_graph" if the matrix
> is asymmetric?


yes,


> For serial runs it doesn't seem to matter,


yes,


> but in
> parallel the PC setup hangs (after calls of
> PCGAMGFilterGraph()) if -pc_gamg_sym_graph is not set.
>

yep,


>
> David
>
>
> On 10/21/2017 12:10 AM, Barry Smith wrote:
> >   David,
> >
> >GAMG picks the number of levels based on how the coarsening process
> etc proceeds. You cannot hardwire it to a particular value. You can run
> with -info to get more info potentially on the decisions GAMG is making.
> >
> >   Barry
> >
> >> On Oct 20, 2017, at 2:06 PM, David Nolte  wrote:
> >>
> >> PS: I didn't realize at first, it looks as if the -pc_mg_levels 3 option
> >> was not taken into account:
> >> type: gamg
> >> MG: type is MULTIPLICATIVE, levels=1 cycles=v
> >>
> >>
> >>
> >> On 10/20/2017 03:32 PM, David Nolte wrote:
> >>> Dear all,
> >>>
> >>> I have some problems using GAMG as a preconditioner for (F)GMRES.
> >>> Background: I am solving the incompressible, unsteady Navier-Stokes
> >>> equations with a coupled mixed FEM approach, using P1/P1 elements for
> >>> velocity and pressure on an unstructured tetrahedron mesh with about
> >>> 2mio DOFs (and up to 15mio). The method is stabilized with SUPG/PSPG,
> >>> hence, no zeros on the diagonal of the pressure block. Time
> >>> discretization with semi-implicit backward Euler. The flow is a
> >>> convection dominated flow through a nozzle.
> >>>
> >>> So far, for this setup, I have been quite happy with a simple FGMRES/ML
> >>> solver for the full system (rather bruteforce, I admit, but much faster
> >>> than any block/Schur preconditioners I tried):
> >>>
> >>> -ksp_converged_reason
> >>> -ksp_monitor_true_residual
> >>> -ksp_type fgmres
> >>> -ksp_rtol 1.0e-6
> >>> -ksp_initial_guess_nonzero
> >>>
> >>> -pc_type ml
> >>> -pc_ml_Threshold 0.03
> >>> -pc_ml_maxNlevels 3
> >>>
> >>> This setup converges in ~100 iterations (see below the ksp_view output)
> >>> to rtol:
> >>>
> >>> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm
> >>> 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
> >>> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm
> >>> 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
> >>> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm
> >>> 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
> >>> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm
> >>> 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
> >>>
> >>>
> >>> Now I'd like to try GAMG instead of ML. However, I don't know how to
> set
> >>> it up to get similar performance.
> >>> The obvious/naive
> >>>
> >>> -pc_type gamg
> >>> -pc_gamg_type agg
> >>>
> >>> # with and without
> >>> -pc_gamg_threshold 0.03
> >>> -pc_mg_levels 3
> >>>
> >>> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
> >>> proc), for instance:
> >>> np = 1:
> >>> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
> >>> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
> >>> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
> >>> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
> >>> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
> >>> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
> >>>
> >>> np = 8:
> >>> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
> >>> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
> >>> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> >>> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> >>> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> >>> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> >>>
> >>> A very high threshold seems to improve the GAMG PC, for instance with
> >>> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
> >>> What else should I try?
> >>>
> >>> I would very much appreciate any advice on configuring GAMG and
> >>> differences w.r.t ML to be taken into account (not a multigrid expert
> >>> though).
> >>>
> >>> Thanks, best wishes
> >>> David
> >>>
> >>>
> >>> --
> >>> ksp_view for -pc_type gamg  

Re: [petsc-users] GAMG advice

2017-11-08 Thread Mark Adams
On Fri, Oct 20, 2017 at 11:10 PM, Barry Smith  wrote:

>
>   David,
>
>GAMG picks the number of levels based on how the coarsening process etc
> proceeds. You cannot hardwire it to a particular value.


Yes you can. GAMG will respect -pc_mg_levels N, but we don't recommend
using it.


> You can run with -info to get more info potentially on the decisions GAMG
> is making.
>

This is noisy, but grep on GAMG and you will see the levels, sizes, etc.
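
For example (an illustrative command line; substitute your own executable
and options):

mpiexec -n 8 ./app -pc_type gamg -info 2>&1 | grep GAMG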


>
>   Barry
>
> > On Oct 20, 2017, at 2:06 PM, David Nolte  wrote:
> >
> > PS: I didn't realize at first, it looks as if the -pc_mg_levels 3 option
> > was not taken into account:
> > type: gamg
> > MG: type is MULTIPLICATIVE, levels=1 cycles=v
> >
> >
> >
> > On 10/20/2017 03:32 PM, David Nolte wrote:
> >> Dear all,
> >>
> >> I have some problems using GAMG as a preconditioner for (F)GMRES.
> >> Background: I am solving the incompressible, unsteady Navier-Stokes
> >> equations with a coupled mixed FEM approach, using P1/P1 elements for
> >> velocity and pressure on an unstructured tetrahedron mesh with about
> >> 2mio DOFs (and up to 15mio). The method is stabilized with SUPG/PSPG,
> >> hence, no zeros on the diagonal of the pressure block. Time
> >> discretization with semi-implicit backward Euler. The flow is a
> >> convection dominated flow through a nozzle.
> >>
> >> So far, for this setup, I have been quite happy with a simple FGMRES/ML
> >> solver for the full system (rather bruteforce, I admit, but much faster
> >> than any block/Schur preconditioners I tried):
> >>
> >> -ksp_converged_reason
> >> -ksp_monitor_true_residual
> >> -ksp_type fgmres
> >> -ksp_rtol 1.0e-6
> >> -ksp_initial_guess_nonzero
> >>
> >> -pc_type ml
> >> -pc_ml_Threshold 0.03
> >> -pc_ml_maxNlevels 3
> >>
> >> This setup converges in ~100 iterations (see below the ksp_view output)
> >> to rtol:
> >>
> >> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm
> >> 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
> >> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm
> >> 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
> >> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm
> >> 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
> >> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm
> >> 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
> >>
> >>
> >> Now I'd like to try GAMG instead of ML. However, I don't know how to set
> >> it up to get similar performance.
> >> The obvious/naive
> >>
> >> -pc_type gamg
> >> -pc_gamg_type agg
> >>
> >> # with and without
> >> -pc_gamg_threshold 0.03
> >> -pc_mg_levels 3
> >>
> >> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
> >> proc), for instance:
> >> np = 1:
> >> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
> >> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
> >> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
> >> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
> >> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
> >> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
> >>
> >> np = 8:
> >> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
> >> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
> >> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> >> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> >> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> >> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> >>
> >> A very high threshold seems to improve the GAMG PC, for instance with
> >> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
> >> What else should I try?
> >>
> >> I would very much appreciate any advice on configuring GAMG and
> >> differences w.r.t ML to be taken into account (not a multigrid expert
> >> though).
> >>
> >> Thanks, best wishes
> >> David
> >>
> >>
> >> --
> >> ksp_view for -pc_type gamg  -pc_gamg_threshold 0.75 -pc_mg_levels 3
> >>
> >> KSP Object: 1 MPI processes
> >>   type: fgmres
> >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> >> Orthogonalization with no iterative refinement
> >> GMRES: happy breakdown tolerance 1e-30
> >>   maximum iterations=1
> >>   tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
> >>   right preconditioning
> >>   using nonzero initial guess
> >>   using UNPRECONDITIONED norm type for convergence test
> >> PC Object: 1 MPI processes
> >>   type: gamg
> >> MG: type is MULTIPLICATIVE, levels=1 cycles=v
> >>   Cycles per PCApply=1
> >>   Using Galerkin computed coarse grid matrices
> >>   GAMG specific options
> >> Threshold for dropping small values from graph 0.75
> >> AGG specific options
> >>   Symmetric graph false
> 

Re: [petsc-users] GAMG advice

2017-11-08 Thread Mark Adams
>
>
> Now I'd like to try GAMG instead of ML. However, I don't know how to set
> it up to get similar performance.
> The obvious/naive
>
> -pc_type gamg
> -pc_gamg_type agg
>
> # with and without
> -pc_gamg_threshold 0.03
> -pc_mg_levels 3
>
>
This looks fine. I would not set the number of levels but if it helps then
go for it.


> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
> proc), for instance:
> np = 1:
> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
>
> np = 8:
> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>
> A very high threshold seems to improve the GAMG PC, for instance with
> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
> What else should I try?
>

Not sure. ML uses the same algorithm as GAMG (so the threshold means pretty
much the same thing). ML is a good solver, and its leader, Ray Tuminaro, has
had a lot of NS experience. But I'm not sure what the differences are that
are causing this performance gap.

* It looks like you are using sor for the coarse grid solver in gamg:

  Coarse grid solver -- level ---
KSP Object:(mg_levels_0_) 1 MPI processes
  type: preonly
  maximum iterations=2, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
  left preconditioning
  using NONE norm type for convergence test
PC Object:(mg_levels_0_) 1 MPI processes
  type: sor
SOR: type = local_symmetric, iterations = 1, local iterations =

You should/must use lu, like in ML. This will kill you. (See the sketch
after this list.)

* smoothed aggregation vs unsmoothed: GAMG's view data does not say if it
is smoothing. Damn, I need to fix that. For NS, you probably want
unsmoothed (-pc_gamg_agg_nsmooths 0). I'm not sure what the ML parameter
for this is, nor do I know the default. It should make a noticeable
difference (good or bad).

* Threshold for dropping small values from graph 0.75 -- this is crazy :)

This is all that I can think of now.
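
Two sketches for the points above (the options use standard PETSc prefixes;
the drop-rule formula is approximate). For the coarse solve:

-mg_coarse_ksp_type preonly
-mg_coarse_pc_type lu

(in parallel, -mg_coarse_pc_type redundant, or bjacobi with LU sub-solves as
in the ksp_view earlier in this thread, since plain LU is sequential). On
the threshold: if entries are dropped roughly when
|a_ij| < threshold * sqrt(|a_ii * a_jj|), then for a 2D 5-point Laplacian
row (a_ii = 4, off-diagonals -1) that ratio is 0.25, so any threshold above
0.25 disconnects the whole graph while 0.03 keeps every edge; hence 0.75
being "crazy", with values in roughly the 0.0-0.05 range being typical.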

Mark


>
> I would very much appreciate any advice on configuring GAMG and
> differences w.r.t ML to be taken into account (not a multigrid expert
> though).
>
> Thanks, best wishes
> David
>
>
> --
> ksp_view for -pc_type gamg  -pc_gamg_threshold 0.75 -pc_mg_levels 3
>
> KSP Object: 1 MPI processes
>   type: fgmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=1
>   tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
>   right preconditioning
>   using nonzero initial guess
>   using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: gamg
> MG: type is MULTIPLICATIVE, levels=1 cycles=v
>   Cycles per PCApply=1
>   Using Galerkin computed coarse grid matrices
>   GAMG specific options
> Threshold for dropping small values from graph 0.75
> AGG specific options
>   Symmetric graph false
>   Coarse grid solver -- level ---
> KSP Object:(mg_levels_0_) 1 MPI processes
>   type: preonly
>   maximum iterations=2, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object:(mg_levels_0_) 1 MPI processes
>   type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations =
> 1, omega = 1.
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
> type: seqaij
> rows=1745224, cols=1745224
> total: nonzeros=99452608, allocated nonzeros=99452608
> total number of mallocs used during MatSetValues calls =0
>   using I-node routines: found 1037847 nodes, limit used is 5
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
> type: seqaij
> rows=1745224, cols=1745224
> total: nonzeros=99452608, allocated nonzeros=99452608
> total number of mallocs used during MatSetValues calls =0
>   using I-node routines: found 1037847 nodes, limit used is 5
>
>
> --
> ksp_view for -pc_type ml:
>
> KSP Object: 8 MPI processes
>   type: fgmres
> GMRES: 

Re: [petsc-users] GAMG advice

2017-11-01 Thread David Nolte
Thanks Barry.
By simply replacing chebyshev with richardson I get similar performance
with GAMG and ML (GAMG even slightly faster):

-pc_type gamg
-pc_gamg_type agg
-pc_gamg_threshold 0.03
-pc_gamg_square_graph 10
-pc_gamg_sym_graph
-mg_levels_ksp_type richardson
-mg_levels_pc_type sor

Is it still true that I need to set "-pc_gamg_sym_graph" if the matrix
is asymmetric? For serial runs it doesn't seem to matter, but in
parallel the PC setup hangs (after calls to
PCGAMGFilterGraph()) if -pc_gamg_sym_graph is not set.

David
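
For intuition on why this matters (an illustrative sketch, not GAMG's
internal code): -pc_gamg_sym_graph makes the aggregation work on a
symmetrized graph, conceptually the pattern of A + A^T, so the parallel
MIS-based coarsening sees a symmetric adjacency; built by hand that would
look roughly like:

#include <petscmat.h>

/* Sketch: symmetrized aggregation graph G = A + A^T (only the pattern
   matters). Illustrates what -pc_gamg_sym_graph implies; not PETSc's
   actual code. PETSC_SUCCESS needs PETSc >= 3.19; older versions return 0. */
static PetscErrorCode SymmetrizeGraph(Mat A, Mat *G)
{
  Mat At;

  PetscFunctionBeginUser;
  PetscCall(MatTranspose(A, MAT_INITIAL_MATRIX, &At));
  PetscCall(MatDuplicate(A, MAT_COPY_VALUES, G));
  PetscCall(MatAXPY(*G, 1.0, At, DIFFERENT_NONZERO_PATTERN));
  PetscCall(MatDestroy(&At));
  PetscFunctionReturn(PETSC_SUCCESS);
}

Without the symmetrization, ranks can disagree about which graph edges
exist, one plausible mechanism for the parallel hang described above.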


On 10/21/2017 12:10 AM, Barry Smith wrote:
>   David,
>
>GAMG picks the number of levels based on how the coarsening process etc 
> proceeds. You cannot hardwire it to a particular value. You can run with 
> -info to get more info potentially on the decisions GAMG is making.
>
>   Barry
>
>> On Oct 20, 2017, at 2:06 PM, David Nolte  wrote:
>>
>> PS: I didn't realize at first, it looks as if the -pc_mg_levels 3 option
>> was not taken into account:  
>> type: gamg
>> MG: type is MULTIPLICATIVE, levels=1 cycles=v
>>
>>
>>
>> On 10/20/2017 03:32 PM, David Nolte wrote:
>>> Dear all,
>>>
>>> I have some problems using GAMG as a preconditioner for (F)GMRES.
>>> Background: I am solving the incompressible, unsteady Navier-Stokes
>>> equations with a coupled mixed FEM approach, using P1/P1 elements for
>>> velocity and pressure on an unstructured tetrahedron mesh with about
>>> 2mio DOFs (and up to 15mio). The method is stabilized with SUPG/PSPG,
>>> hence, no zeros on the diagonal of the pressure block. Time
>>> discretization with semi-implicit backward Euler. The flow is a
>>> convection dominated flow through a nozzle.
>>>
>>> So far, for this setup, I have been quite happy with a simple FGMRES/ML
>>> solver for the full system (rather bruteforce, I admit, but much faster
>>> than any block/Schur preconditioners I tried):
>>>
>>> -ksp_converged_reason
>>> -ksp_monitor_true_residual
>>> -ksp_type fgmres
>>> -ksp_rtol 1.0e-6
>>> -ksp_initial_guess_nonzero
>>>
>>> -pc_type ml
>>> -pc_ml_Threshold 0.03
>>> -pc_ml_maxNlevels 3
>>>
>>> This setup converges in ~100 iterations (see below the ksp_view output)
>>> to rtol:
>>>
>>> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm
>>> 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
>>> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm
>>> 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
>>> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm
>>> 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
>>> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm
>>> 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
>>>
>>>
>>> Now I'd like to try GAMG instead of ML. However, I don't know how to set
>>> it up to get similar performance.
>>> The obvious/naive
>>>
>>> -pc_type gamg
>>> -pc_gamg_type agg
>>>
>>> # with and without
>>> -pc_gamg_threshold 0.03
>>> -pc_mg_levels 3
>>>
>>> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
>>> proc), for instance:
>>> np = 1:
>>> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
>>> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
>>> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
>>> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
>>> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
>>> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
>>>
>>> np = 8:
>>> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
>>> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
>>> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
>>> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>>> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
>>> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>>>
>>> A very high threshold seems to improve the GAMG PC, for instance with
>>> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
>>> What else should I try?
>>>
>>> I would very much appreciate any advice on configuring GAMG and
>>> differences w.r.t ML to be taken into account (not a multigrid expert
>>> though).
>>>
>>> Thanks, best wishes
>>> David
>>>
>>>
>>> --
>>> ksp_view for -pc_type gamg  -pc_gamg_threshold 0.75 

Re: [petsc-users] GAMG advice

2017-10-20 Thread Barry Smith

  David,

   GAMG picks the number of levels based on how the coarsening process etc.
proceeds. You cannot hardwire it to a particular value. You can run with -info
to potentially get more information on the decisions GAMG is making.

  Barry

> On Oct 20, 2017, at 2:06 PM, David Nolte  wrote:
> 
> PS: I didn't realize at first, it looks as if the -pc_mg_levels 3 option
> was not taken into account:  
> type: gamg
> MG: type is MULTIPLICATIVE, levels=1 cycles=v
> 
> 
> 
> On 10/20/2017 03:32 PM, David Nolte wrote:
>> Dear all,
>> 
>> I have some problems using GAMG as a preconditioner for (F)GMRES.
>> Background: I am solving the incompressible, unsteady Navier-Stokes
>> equations with a coupled mixed FEM approach, using P1/P1 elements for
>> velocity and pressure on an unstructured tetrahedron mesh with about
>> 2mio DOFs (and up to 15mio). The method is stabilized with SUPG/PSPG,
>> hence, no zeros on the diagonal of the pressure block. Time
>> discretization with semi-implicit backward Euler. The flow is a
>> convection dominated flow through a nozzle.
>> 
>> So far, for this setup, I have been quite happy with a simple FGMRES/ML
>> solver for the full system (rather bruteforce, I admit, but much faster
>> than any block/Schur preconditioners I tried):
>> 
>> -ksp_converged_reason
>> -ksp_monitor_true_residual
>> -ksp_type fgmres
>> -ksp_rtol 1.0e-6
>> -ksp_initial_guess_nonzero
>> 
>> -pc_type ml
>> -pc_ml_Threshold 0.03
>> -pc_ml_maxNlevels 3
>> 
>> This setup converges in ~100 iterations (see below the ksp_view output)
>> to rtol:
>> 
>> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm
>> 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
>> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm
>> 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
>> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm
>> 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
>> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm
>> 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
>> 
>> 
>> Now I'd like to try GAMG instead of ML. However, I don't know how to set
>> it up to get similar performance.
>> The obvious/naive
>> 
>> -pc_type gamg
>> -pc_gamg_type agg
>> 
>> # with and without
>> -pc_gamg_threshold 0.03
>> -pc_mg_levels 3
>> 
>> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
>> proc), for instance:
>> np = 1:
>> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
>> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
>> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
>> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
>> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
>> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
>> 
>> np = 8:
>> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
>> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
>> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
>> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
>> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>> 
>> A very high threshold seems to improve the GAMG PC, for instance with
>> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
>> What else should I try?
>> 
>> I would very much appreciate any advice on configuring GAMG and
>> differences w.r.t ML to be taken into account (not a multigrid expert
>> though).
>> 
>> Thanks, best wishes
>> David
>> 
>> 
>> --
>> ksp_view for -pc_type gamg  -pc_gamg_threshold 0.75 -pc_mg_levels 3
>> 
>> KSP Object: 1 MPI processes
>>   type: fgmres
>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>>   maximum iterations=1
>>   tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
>>   right preconditioning
>>   using nonzero initial guess
>>   using UNPRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI processes
>>   type: gamg
>> MG: type is MULTIPLICATIVE, levels=1 cycles=v
>>   Cycles per PCApply=1
>>   Using Galerkin computed coarse grid matrices
>>   GAMG specific options
>> Threshold for dropping small values from graph 0.75
>> AGG specific options
>>   Symmetric graph false
>>   Coarse grid solver -- level ---
>> KSP Object:(mg_levels_0_) 1 MPI processes
>>   type: preonly
>>   maximum iterations=2, initial guess is zero
>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>>   left preconditioning
>>   using NONE norm type for convergence test
>> PC Object:(mg_levels_0_) 1 MPI processes
>>   type: sor
>> SOR: type = 

Re: [petsc-users] GAMG advice

2017-10-20 Thread David Nolte
PS: I didn't realize it at first, but it looks as if the -pc_mg_levels 3 option
was not taken into account:
type: gamg
    MG: type is MULTIPLICATIVE, levels=1 cycles=v



On 10/20/2017 03:32 PM, David Nolte wrote:
> Dear all,
>
> I have some problems using GAMG as a preconditioner for (F)GMRES.
> Background: I am solving the incompressible, unsteady Navier-Stokes
> equations with a coupled mixed FEM approach, using P1/P1 elements for
> velocity and pressure on an unstructured tetrahedron mesh with about
> 2mio DOFs (and up to 15mio). The method is stabilized with SUPG/PSPG,
> hence, no zeros on the diagonal of the pressure block. Time
> discretization with semi-implicit backward Euler. The flow is a
> convection dominated flow through a nozzle.
>
> So far, for this setup, I have been quite happy with a simple FGMRES/ML
> solver for the full system (rather bruteforce, I admit, but much faster
> than any block/Schur preconditioners I tried):
>
>     -ksp_converged_reason
>     -ksp_monitor_true_residual
>     -ksp_type fgmres
>     -ksp_rtol 1.0e-6
>     -ksp_initial_guess_nonzero
>
>     -pc_type ml
>     -pc_ml_Threshold 0.03
>     -pc_ml_maxNlevels 3
>
> This setup converges in ~100 iterations (see below the ksp_view output)
> to rtol:
>
> 119 KSP unpreconditioned resid norm 4.004030812027e-05 true resid norm
> 4.004030812037e-05 ||r(i)||/||b|| 1.621791251517e-06
> 120 KSP unpreconditioned resid norm 3.256863709982e-05 true resid norm
> 3.256863709982e-05 ||r(i)||/||b|| 1.319158947617e-06
> 121 KSP unpreconditioned resid norm 2.751959681502e-05 true resid norm
> 2.751959681503e-05 ||r(i)||/||b|| 1.114652795021e-06
> 122 KSP unpreconditioned resid norm 2.420611122789e-05 true resid norm
> 2.420611122788e-05 ||r(i)||/||b|| 9.804434897105e-07
>
>
> Now I'd like to try GAMG instead of ML. However, I don't know how to set
> it up to get similar performance.
> The obvious/naive
>
>     -pc_type gamg
>     -pc_gamg_type agg
>
> # with and without
>     -pc_gamg_threshold 0.03
>     -pc_mg_levels 3
>
> converges very slowly on 1 proc and much worse on 8 (~200k dofs per
> proc), for instance:
> np = 1:
> 980 KSP unpreconditioned resid norm 1.065009356215e-02 true resid norm
> 1.065009356215e-02 ||r(i)||/||b|| 4.532259705508e-04
> 981 KSP unpreconditioned resid norm 1.064978578182e-02 true resid norm
> 1.064978578182e-02 ||r(i)||/||b|| 4.532128726342e-04
> 982 KSP unpreconditioned resid norm 1.064956706598e-02 true resid norm
> 1.064956706598e-02 ||r(i)||/||b|| 4.532035649508e-04
>
> np = 8:
> 980 KSP unpreconditioned resid norm 3.179946748495e-02 true resid norm
> 3.179946748495e-02 ||r(i)||/||b|| 1.353259896710e-03
> 981 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
> 982 KSP unpreconditioned resid norm 3.179946748317e-02 true resid norm
> 3.179946748317e-02 ||r(i)||/||b|| 1.353259896634e-03
>
> A very high threshold seems to improve the GAMG PC, for instance with
> 0.75 I get convergence to rtol=1e-6 after 744 iterations.
> What else should I try?
>
> I would very much appreciate any advice on configuring GAMG and
> differences w.r.t ML to be taken into account (not a multigrid expert
> though).
>
> Thanks, best wishes
> David
>
>
> --
> ksp_view for -pc_type gamg      -pc_gamg_threshold 0.75 -pc_mg_levels 3
>
> KSP Object: 1 MPI processes
>   type: fgmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=1
>   tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
>   right preconditioning
>   using nonzero initial guess
>   using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: gamg
>     MG: type is MULTIPLICATIVE, levels=1 cycles=v
>   Cycles per PCApply=1
>   Using Galerkin computed coarse grid matrices
>   GAMG specific options
>     Threshold for dropping small values from graph 0.75
>     AGG specific options
>   Symmetric graph false
>   Coarse grid solver -- level ---
>     KSP Object:    (mg_levels_0_) 1 MPI processes
>   type: preonly
>   maximum iterations=2, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>   left preconditioning
>   using NONE norm type for convergence test
>     PC Object:    (mg_levels_0_) 1 MPI processes
>   type: sor
>     SOR: type = local_symmetric, iterations = 1, local iterations =
> 1, omega = 1.
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=1745224, cols=1745224
>     total: nonzeros=99452608, allocated nonzeros=99452608
>     total number of mallocs used during MatSetValues calls =0
>   using I-node routines: found 1037847 nodes, limit used is 5
>   linear system matrix = precond matrix:

Re: [petsc-users] GAMG scaling

2017-05-04 Thread Hong
Mark,
Fixed
https://bitbucket.org/petsc/petsc/commits/68eacb73b84ae7f3fd7363217d47f23a8f967155

Run ex56 gives
mpiexec -n 8 ./ex56 -ne 13 ... -h |grep via
  -mattransposematmult_via  Algorithmic approach (choose one of)
scalable nonscalable matmatmult (MatTransposeMatMult)
  -matmatmult_via  Algorithmic approach (choose one of)
scalable nonscalable hypre (MatMatMult)
  -matptap_via  Algorithmic approach (choose one of) scalable
nonscalable hypre (MatPtAP)
...

I'll merge it to master after regression tests.

Hong

On Thu, May 4, 2017 at 10:33 AM, Hong  wrote:

> Mark:
>>
>> I am not seeing these options with -help ...
>>
> Hmm, this might be a bug - I'll check it.
> Hong
>
>
>>
>> On Wed, May 3, 2017 at 10:05 PM, Hong  wrote:
>>
> >>> I basically used 'runex56' and set '-ne' to be compatible with np.
>>> Then I used option
>>> '-matptap_via scalable'
>>> '-matptap_via hypre'
>>> '-matptap_via nonscalable'
>>>
>>> I attached a job script below.
>>>
>>> In master branch, I set default as 'nonscalable' for small - medium size
>>> matrices, and automatically switch to 'scalable' when matrix size gets
>>> larger.
>>>
>>> Petsc solver uses MatPtAP,  which does local RAP to reduce communication
>>> and accelerate computation.
>>> I suggest you simply use default setting. Let me know if you encounter
>>> trouble.
>>>
>>> Hong
>>>
>>> job.ne174.n8.np125.sh:
>>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56
>>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
>>> -pc_gamg_reuse_interpolation true -ksp_converged_reason
>>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
>>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
>>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
>>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
>>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
>>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
>>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
>>> -pc_gamg_repartition false -pc_mg_cycle_type v
>>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
>>> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via scalable >
>>> log.ne174.n8.np125.scalable
>>>
>>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56
>>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
>>> -pc_gamg_reuse_interpolation true -ksp_converged_reason
>>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
>>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
>>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
>>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
>>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
>>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
>>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
>>> -pc_gamg_repartition false -pc_mg_cycle_type v
>>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
>>> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via hypre >
>>> log.ne174.n8.np125.hypre
>>>
>>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56
>>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
>>> -pc_gamg_reuse_interpolation true -ksp_converged_reason
>>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
>>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
>>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
>>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
>>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
>>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
>>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
>>> -pc_gamg_repartition false -pc_mg_cycle_type v
>>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
>>> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via nonscalable >
>>> log.ne174.n8.np125.nonscalable
>>>
>>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56
>>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
>>> -pc_gamg_reuse_interpolation true -ksp_converged_reason
>>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
>>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
>>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
>>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
>>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
>>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
>>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
>>> -pc_gamg_repartition false -pc_mg_cycle_type v
>>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
>>> -mg_coarse_ksp_type cg -ksp_monitor -log_view > log.ne174.n8.np125
>>>
>>> On Wed, May 3, 2017 at 2:08 PM, Mark Adams  wrote:
>>>

Re: [petsc-users] GAMG scaling

2017-05-04 Thread Mark Adams
Thanks Hong,

I am not seeing these options with -help ...

On Wed, May 3, 2017 at 10:05 PM, Hong  wrote:

> I basically used 'runex56' and set '-ne' to be compatible with np.
> Then I used option
> '-matptap_via scalable'
> '-matptap_via hypre'
> '-matptap_via nonscalable'
>
> I attached a job script below.
>
> In master branch, I set default as 'nonscalable' for small - medium size
> matrices, and automatically switch to 'scalable' when matrix size gets
> larger.
>
> Petsc solver uses MatPtAP,  which does local RAP to reduce communication
> and accelerate computation.
> I suggest you simply use default setting. Let me know if you encounter
> trouble.
>
> Hong
>
> job.ne174.n8.np125.sh:
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view
> -matptap_via scalable > log.ne174.n8.np125.scalable
>
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view
> -matptap_via hypre > log.ne174.n8.np125.hypre
>
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view
> -matptap_via nonscalable > log.ne174.n8.np125.nonscalable
>
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view >
> log.ne174.n8.np125
>
> On Wed, May 3, 2017 at 2:08 PM, Mark Adams  wrote:
>
>> Hong, the input files do not seem to be accessible. What are the command
>> line option? (I don't see a "rap" or "scale" in the source).
>>
>>
>>
>> On Wed, May 3, 2017 at 12:17 PM, Hong  wrote:
>>
>>> Mark,
>>> Below is the copy of my email sent to you on Feb 27:
>>>
>>> I implemented scalable MatPtAP and did comparisons of three
>>> implementations using ex56.c on alcf cetus machine (this machine has
>>> small memory, 1GB/core):
>>> - nonscalable PtAP: use an array of length PN to do dense axpy
>>> - scalable PtAP:   do sparse axpy without use of PN array
>>> - hypre PtAP.
>>>
>>> The results are attached. Summary:
>>> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
>>> - scalable PtAP is 4x faster than hypre PtAP
>>> - hypre uses less memory 

Re: [petsc-users] GAMG scaling

2017-05-03 Thread Hong
I basically used 'runex56' and set '-ne' to be compatible with np.
Then I used option
'-matptap_via scalable'
'-matptap_via hypre'
'-matptap_via nonscalable'

I attached a job script below.

In master branch, I set default as 'nonscalable' for small - medium size
matrices, and automatically switch to 'scalable' when matrix size gets
larger.

Petsc solver uses MatPtAP,  which does local RAP to reduce communication
and accelerate computation.
I suggest you simply use default setting. Let me know if you encounter
trouble.

Hong
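
For readers outside the thread: MatPtAP computes the Galerkin triple product
C = P^T * A * P in one call (the "RAP" product with R = P^T), and it is
usually the communication-heavy part of the GAMG setup. A minimal usage
sketch, assuming Mat A and prolongator P already exist:

Mat C;
/* C = Pt*A*P; the fill estimate (2.0) is just a preallocation guess */
PetscCall(MatPtAP(A, P, MAT_INITIAL_MATRIX, 2.0, &C));

The -matptap_via option in the scripts below selects which internal
algorithm this single call uses.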

job.ne174.n8.np125.sh:
runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
-pc_gamg_reuse_interpolation true -ksp_converged_reason
-use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
-mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
-mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
-mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
-gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
-mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
-pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
-pc_gamg_repartition false -pc_mg_cycle_type v
-pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
-mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via scalable >
log.ne174.n8.np125.scalable

runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
-pc_gamg_reuse_interpolation true -ksp_converged_reason
-use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
-mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
-mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
-mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
-gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
-mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
-pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
-pc_gamg_repartition false -pc_mg_cycle_type v
-pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
-mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via hypre >
log.ne174.n8.np125.hypre

runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
-pc_gamg_reuse_interpolation true -ksp_converged_reason
-use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
-mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
-mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
-mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
-gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
-mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
-pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
-pc_gamg_repartition false -pc_mg_cycle_type v
-pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
-mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via nonscalable >
log.ne174.n8.np125.nonscalable

runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
-pc_gamg_reuse_interpolation true -ksp_converged_reason
-use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
-mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
-mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
-mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
-gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
-mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
-pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
-pc_gamg_repartition false -pc_mg_cycle_type v
-pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi
-mg_coarse_ksp_type cg -ksp_monitor -log_view > log.ne174.n8.np125

On Wed, May 3, 2017 at 2:08 PM, Mark Adams  wrote:

> Hong, the input files do not seem to be accessible. What are the command
> line option? (I don't see a "rap" or "scale" in the source).
>
>
>
> On Wed, May 3, 2017 at 12:17 PM, Hong  wrote:
>
>> Mark,
>> Below is the copy of my email sent to you on Feb 27:
>>
>> I implemented scalable MatPtAP and did comparisons of three
>> implementations using ex56.c on alcf cetus machine (this machine has
>> small memory, 1GB/core):
>> - nonscalable PtAP: use an array of length PN to do dense axpy
>> - scalable PtAP:   do sparse axpy without use of PN array
>> - hypre PtAP.
>>
>> The results are attached. Summary:
>> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
>> - scalable PtAP is 4x faster than hypre PtAP
>> - hypre uses less memory (see job.ne399.n63.np1000.sh)
>>
>> Based on above observation, I set the default PtAP algorithm as
>> 'nonscalable'.
>> When PN > local estimated nonzero of C=PtAP, then switch default to
>> 'scalable'.
>> User can overwrite default.
>>
>> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I 

Re: [petsc-users] GAMG scaling

2017-05-03 Thread Mark Adams
Hong, the input files do not seem to be accessible. What are the command
line option? (I don't see a "rap" or "scale" in the source).



On Wed, May 3, 2017 at 12:17 PM, Hong  wrote:

> Mark,
> Below is the copy of my email sent to you on Feb 27:
>
> I implemented scalable MatPtAP and did comparisons of three
> implementations using ex56.c on alcf cetus machine (this machine has
> small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP:   do sparse axpy without use of PN array
> - hypre PtAP.
>
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see job.ne399.n63.np1000.sh)
>
> Based on above observation, I set the default PtAP algorithm as
> 'nonscalable'.
> When PN > local estimated nonzero of C=PtAP, then switch default to
> 'scalable'.
> User can overwrite default.
>
> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
> MatPtAP          3.6224e+01  (nonscalable for small mats, scalable
> for larger ones)
> scalable MatPtAP 4.6129e+01
> hypre            1.9389e+02
>
> This work is in petsc-master. Give it a try. If you encounter any problem,
> let me know.
>
> Hong
>
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:
>
>> (Hong), what is the current state of optimizing RAP for scaling?
>>
>> Nate is driving 3D elasticity problems at scale with GAMG and we are
>> working out performance problems. They are hitting problems at ~1.5B dof
>> on a basic Cray (XC30, I think).
>>
>> Thanks,
>> Mark
>>
>
>


Re: [petsc-users] GAMG scaling

2017-05-03 Thread Hong
Mark,
Below is the copy of my email sent to you on Feb 27:

I implemented scalable MatPtAP and did comparisons of three implementations
using ex56.c on alcf cetus machine (this machine has small memory,
1GB/core):
- nonscalable PtAP: use an array of length PN to do dense axpy
- scalable PtAP:   do sparse axpy without use of PN array
- hypre PtAP.
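
(To make the dense-vs-sparse axpy distinction concrete, here is a minimal C
sketch; it is an illustration, not the PETSc source, and the helper names are
hypothetical. It assumes cidx/cval have spare capacity for insertions.)

  #include <string.h>   /* memmove */

  /* "nonscalable": dense axpy into a work array w of length N (= PN).
     Fast per row, but every process allocates O(N) memory. */
  static void axpy_dense(double *w, double alpha,
                         int nnz, const int *idx, const double *val)
  {
    for (int k = 0; k < nnz; k++) w[idx[k]] += alpha * val[k];
  }

  /* "scalable": sparse axpy into a sorted accumulator (cidx, cval) of
     current length cn; memory tracks the local nonzeros of C, not N. */
  static int axpy_sparse(int cn, int *cidx, double *cval, double alpha,
                         int nnz, const int *idx, const double *val)
  {
    for (int k = 0; k < nnz; k++) {
      int j = 0;
      while (j < cn && cidx[j] < idx[k]) j++;        /* insertion point  */
      if (j < cn && cidx[j] == idx[k]) {
        cval[j] += alpha * val[k];                   /* existing entry   */
      } else {                                       /* insert new entry */
        memmove(cidx + j + 1, cidx + j, (size_t)(cn - j) * sizeof(*cidx));
        memmove(cval + j + 1, cval + j, (size_t)(cn - j) * sizeof(*cval));
        cidx[j] = idx[k];
        cval[j] = alpha * val[k];
        cn++;
      }
    }
    return cn;                                       /* new length */
  }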

The results are attached. Summary:
- nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
- scalable PtAP is 4x faster than hypre PtAP
- hypre uses less memory (see job.ne399.n63.np1000.sh)

Based on the above observations, I set the default PtAP algorithm to
'nonscalable'.
When PN > the local estimated number of nonzeros of C=PtAP, the default
switches to 'scalable'.
Users can override the default.
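
(For reference, the override is a command-line option; the runjob lines at
the top of this archive pass it explicitly, e.g.

  -matptap_via nonscalable

and, by the same pattern, -matptap_via scalable should select the sparse-axpy
variant by hand.)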

For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
MatPtAP          3.6224e+01  (nonscalable for small mats, scalable for larger ones)
scalable MatPtAP 4.6129e+01
hypre            1.9389e+02

This work is in petsc-master. Give it a try. If you encounter any problem,
let me know.

Hong

On Wed, May 3, 2017 at 10:01 AM, Mark Adams  wrote:

> (Hong), what is the current state of optimizing RAP for scaling?
>
> Nate is driving 3D elasticity problems at scale with GAMG and we are
> working out performance problems. They are hitting problems at ~1.5B dof
> on a basic Cray (XC30, I think).
>
> Thanks,
> Mark
>


out_ex56_cetus_short
Description: Binary data


Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-19 Thread Kong, Fande
Thanks, Mark,

Now, the total compute time using GAMG is competitive with ASM. It looks like
I cannot use something like "-mg_level_1_ksp_type gmres" because this
option makes the compute time much worse.

Fande,

On Thu, Apr 13, 2017 at 9:14 AM, Mark Adams  wrote:

>
>
> On Wed, Apr 12, 2017 at 7:04 PM, Kong, Fande  wrote:
>
>>
>>
>> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:
>>
>>> You seem to have two levels here and 3M eqs on the fine grid and 37 on
>>> the coarse grid. I don't understand that.
>>>
>>> You are also calling the AMG setup a lot, but not spending much time
>>> in it. Try running with -info and grep on "GAMG".
>>>
>>
>> I got the following output:
>>
>> [0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
>> nnz/row (ave)=71, np=384
>> [0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold
>> 0., 73.6364 nnz ave. (N=3020875)
>> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
>> [0] PCGAMGProlongator_AGG(): New grid 18162 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978702e+00
>> min=2.559747e-02 PC=jacobi
>> [0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
>> neq(loc)=40
>> [0] PCSetUp_GAMG(): 1) N=18162, n data cols=1, nnz/row (ave)=94, 384
>> active pes
>> [0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00795
>> [0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
>> nnz/row (ave)=71, np=384
>> [0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold
>> 0., 73.6364 nnz ave. (N=3020875)
>> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
>> [0] PCGAMGProlongator_AGG(): New grid 18145 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978584e+00
>> min=2.557887e-02 PC=jacobi
>> [0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
>> neq(loc)=37
>> [0] PCSetUp_GAMG(): 1) N=18145, n data cols=1, nnz/row (ave)=94, 384
>> active pes
>>
>
> You are still doing two levels. Just use the parameters that I told you
> and you should see that 1) this coarsest (last) grid has "1 active pes" and
> 2) the overall solve time and convergence rate are much better.
>
>
>> [0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00792
>> GAMG specific options
>> PCGAMGGraph_AGG   40 1.0 8.0759e+00 1.0 3.56e+07 2.3 1.6e+06 1.9e+04
>> 7.6e+02  2  0  2  4  2   2  0  2  4  2  1170
>> PCGAMGCoarse_AGG  40 1.0 7.1698e+01 1.0 4.05e+09 2.3 4.0e+06 5.1e+04
>> 1.2e+03 18 37  5 27  3  18 37  5 27  3 14632
>> PCGAMGProl_AGG40 1.0 9.2650e-01 1.2 0.00e+00 0.0 9.8e+05 2.9e+03
>> 9.6e+02  0  0  1  0  2   0  0  1  0  2 0
>> PCGAMGPOpt_AGG40 1.0 2.4484e+00 1.0 4.72e+08 2.3 3.1e+06 2.3e+03
>> 1.9e+03  1  4  4  1  4   1  4  4  1  4 51328
>> GAMG: createProl  40 1.0 8.3786e+01 1.0 4.56e+09 2.3 9.6e+06 2.5e+04
>> 4.8e+03 21 42 12 32 10  21 42 12 32 10 14134
>> GAMG: partLevel   40 1.0 6.7755e+00 1.1 2.59e+08 2.3 2.9e+06 2.5e+03
>> 1.5e+03  2  2  4  1  3   2  2  4  1  3  9431
>>
>>
>>
>>
>>
>>
>>
>>
>>>
>>>
>>> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
>>> > Thanks, Barry.
>>> >
>>> > It works.
>>> >
>>> > GAMG is three times better than ASM in terms of the number of linear
>>> > iterations, but it is five times slower than ASM. Any suggestions to
>>> improve
>>> > the performance of GAMG? Log files are attached.
>>> >
>>> > Fande,
>>> >
>>> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith 
>>> wrote:
>>> >>
>>> >>
>>> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
>>> >> >
>>> >> > Thanks, Mark and Barry,
>>> >> >
>>> >> > It works pretty well in terms of the number of linear iterations
>>> (using
>>> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
>>> I am
>>> >> > using the two-level method via "-pc_mg_levels 2". The reason why
>>> the compute
>>> >> > time is larger than other preconditioning options is that a matrix
>>> free
>>> >> > method is used in the fine level and in my particular problem the
>>> function
>>> >> > evaluation is expensive.
>>> >> >
>>> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free
>>> Newton,
>>> >> > but I do not think I want to make the preconditioning part
>>> matrix-free.  Do
>>> >> > you guys know how to turn off the matrix-free method for GAMG?
>>> >>
>>> >>-pc_use_amat false
>>> >>
>>> >> >
>>> >> > Here is the detailed solver:
>>> >> >
>>> >> > SNES Object: 384 MPI processes
>>> >> >   type: newtonls
>>> >> >   maximum iterations=200, maximum function evaluations=1
>>> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
>>> >> >   total number of linear solver iterations=20
>>> >> >   total number of function evaluations=166
>>> >> >   norm schedule ALWAYS
>>> >> >   SNESLineSearch Object:   384 MPI processes
>>> >> > type: bt
>>> >> >   interpolation: cubic
>>> >> >   

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-13 Thread Mark Adams
On Wed, Apr 12, 2017 at 7:04 PM, Kong, Fande  wrote:

>
>
> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:
>
>> You seem to have two levels here and 3M eqs on the fine grid and 37 on
>> the coarse grid. I don't understand that.
>>
>> You are also calling the AMG setup a lot, but not spending much time
>> in it. Try running with -info and grep on "GAMG".
>>
>
> I got the following output:
>
> [0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
> nnz/row (ave)=71, np=384
> [0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold
> 0., 73.6364 nnz ave. (N=3020875)
> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
> [0] PCGAMGProlongator_AGG(): New grid 18162 nodes
> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978702e+00
> min=2.559747e-02 PC=jacobi
> [0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
> neq(loc)=40
> [0] PCSetUp_GAMG(): 1) N=18162, n data cols=1, nnz/row (ave)=94, 384
> active pes
> [0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00795
> [0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
> nnz/row (ave)=71, np=384
> [0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold
> 0., 73.6364 nnz ave. (N=3020875)
> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
> [0] PCGAMGProlongator_AGG(): New grid 18145 nodes
> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978584e+00
> min=2.557887e-02 PC=jacobi
> [0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
> neq(loc)=37
> [0] PCSetUp_GAMG(): 1) N=18145, n data cols=1, nnz/row (ave)=94, 384
> active pes
>

You are still doing two levels. Just use the parameters that I told you and
you should see that 1) this coarsest (last) grid has "1 active pes" and 2)
the overall solve time and convergence rate are much better.


> [0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00792
> GAMG specific options
> PCGAMGGraph_AGG   40 1.0 8.0759e+00 1.0 3.56e+07 2.3 1.6e+06 1.9e+04
> 7.6e+02  2  0  2  4  2   2  0  2  4  2  1170
> PCGAMGCoarse_AGG  40 1.0 7.1698e+01 1.0 4.05e+09 2.3 4.0e+06 5.1e+04
> 1.2e+03 18 37  5 27  3  18 37  5 27  3 14632
> PCGAMGProl_AGG40 1.0 9.2650e-01 1.2 0.00e+00 0.0 9.8e+05 2.9e+03
> 9.6e+02  0  0  1  0  2   0  0  1  0  2 0
> PCGAMGPOpt_AGG40 1.0 2.4484e+00 1.0 4.72e+08 2.3 3.1e+06 2.3e+03
> 1.9e+03  1  4  4  1  4   1  4  4  1  4 51328
> GAMG: createProl  40 1.0 8.3786e+01 1.0 4.56e+09 2.3 9.6e+06 2.5e+04
> 4.8e+03 21 42 12 32 10  21 42 12 32 10 14134
> GAMG: partLevel   40 1.0 6.7755e+00 1.1 2.59e+08 2.3 2.9e+06 2.5e+03
> 1.5e+03  2  2  4  1  3   2  2  4  1  3  9431
>
>
>
>
>
>
>
>
>>
>>
>> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
>> > Thanks, Barry.
>> >
>> > It works.
>> >
>> > GAMG is three times better than ASM in terms of the number of linear
>> > iterations, but it is five times slower than ASM. Any suggestions to
>> improve
>> > the performance of GAMG? Log files are attached.
>> >
>> > Fande,
>> >
>> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
>> >>
>> >>
>> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
>> >> >
>> >> > Thanks, Mark and Barry,
>> >> >
>> >> > It works pretty well in terms of the number of linear iterations
>> (using
>> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
>> I am
>> >> > using the two-level method via "-pc_mg_levels 2". The reason why the
>> compute
>> >> > time is larger than other preconditioning options is that a matrix
>> free
>> >> > method is used in the fine level and in my particular problem the
>> function
>> >> > evaluation is expensive.
>> >> >
>> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
>> >> > but I do not think I want to make the preconditioning part
>> matrix-free.  Do
>> >> > you guys know how to turn off the matrix-free method for GAMG?
>> >>
>> >>-pc_use_amat false
>> >>
>> >> >
>> >> > Here is the detailed solver:
>> >> >
>> >> > SNES Object: 384 MPI processes
>> >> >   type: newtonls
>> >> >   maximum iterations=200, maximum function evaluations=1
>> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
>> >> >   total number of linear solver iterations=20
>> >> >   total number of function evaluations=166
>> >> >   norm schedule ALWAYS
>> >> >   SNESLineSearch Object:   384 MPI processes
>> >> > type: bt
>> >> >   interpolation: cubic
>> >> >   alpha=1.00e-04
>> >> > maxstep=1.00e+08, minlambda=1.00e-12
>> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
>> >> > lambda=1.00e-08
>> >> > maximum iterations=40
>> >> >   KSP Object:   384 MPI processes
>> >> > type: gmres
>> >> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
>> >> > Orthogonalization with no iterative refinement
>> >> >   GMRES: 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-13 Thread Mark Adams
On Wed, Apr 12, 2017 at 1:31 PM, Kong, Fande  wrote:

> Hi Mark,
>
> Thanks for your reply.
>
> On Wed, Apr 12, 2017 at 9:16 AM, Mark Adams  wrote:
>
>> The problem comes from setting the number of MG levels (-pc_mg_levels 2).
>> Not your fault, it looks like the GAMG logic is faulty, in your version at
>> least.
>>
>
> What I want is for GAMG to coarsen the fine matrix once and then stop
> doing anything.  I did not see any benefit to having more levels if the
> number of processors is small.
>

The number of levels is a math issue and has nothing to do with
parallelism. If you do just one level of coarsening, your coarse grid is very
large and expensive to solve, so you want to keep coarsening. There is rarely
a need to use -pc_mg_levels.
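
(A rough illustration with the numbers from this thread: coarsening ~3M
equations by roughly 8x per level reaches a coarse-grid target of a few
hundred equations, e.g. -pc_gamg_coarse_eq_limit 200 as used elsewhere in
this archive, after about log(3e6/200)/log(8) ~ 4.6, so 5, levels; capping
at -pc_mg_levels 2 instead strands an 18145-equation coarse problem.)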


>
>
>>
>> GAMG will force the coarsest grid to one processor by default, in newer
>> versions. You can override the default with:
>>
>> -pc_gamg_use_parallel_coarse_grid_solver
>>
>> Your coarse grid solver is ASM with these 37 equations per process and 512
>> processes. That is bad.
>>
>
> Why is this bad? Is the subdomain problem too small?
>

Because ASM with 512 blocks is a weak solver. You want the coarse grid to
be solved exactly.
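
(One way to get an exact parallel coarse solve, sketched here under the
assumption that PETSc was configured with a parallel direct solver such as
SuperLU_DIST, is something like:

  -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu
  -mg_coarse_pc_factor_mat_solver_package superlu_dist

Alternatively, GAMG's newer default reduces the coarsest grid to one process
and factors it there, which also gives an exact coarse solve.)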


>
>
>> Note, you could run this on one process to see the proper convergence
>> rate.
>>
>
> Convergence rate for which part? coarse solver, subdomain solver?
>

The overall convergence rate.


>
>
>> You can fix this with parameters:
>>
>> >   -pc_gamg_process_eq_limit <50>: Limit (goal) on number of equations
>> per process on coarse grids (PCGAMGSetProcEqLim)
>> >   -pc_gamg_coarse_eq_limit <50>: Limit on number of equations for the
>> coarse grid (PCGAMGSetCoarseEqLim)
>>
>> If you really want two levels then set something like
>> -pc_gamg_coarse_eq_limit 18145 (or higher).
>>
>
>
> Maybe have something like: make the coarse problem 1/8 as large as the
> original problem? Otherwise, this number is just problem dependent.
>

GAMG will stop automatically so that you do not need problem-dependent
parameters.


>
>
>
>> You can run with -info and grep on GAMG and you will see meta-data for
>> each level. You should see "npe=1" for the coarsest, last, grid. Or use a
>> parallel direct solver.
>>
>
> I will try.
>
>
>>
>> Note, you should not see much degradation as you increase the number of
>> levels. 18145 eqs on a 3D problem will probably be noticeable. I generally
>> aim for about 3000.
>>
>
> It should be fine as long as the coarse problem is solved by a parallel
> solver.
>

>
> Fande,
>
>
>>
>>
>> On Mon, Apr 10, 2017 at 12:17 PM, Kong, Fande  wrote:
>>
>>>
>>>
>>> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:
>>>
 You seem to have two levels here and 3M eqs on the fine grid and 37 on
 the coarse grid.
>>>
>>>
>>> 37 is on the subdomain.
>>>
>>>  rows=18145, cols=18145 on the entire coarse grid.
>>>
>>>
>>>
>>>
>>>
 I don't understand that.

 You are also calling the AMG setup a lot, but not spending much time
 in it. Try running with -info and grep on "GAMG".


 On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
 > Thanks, Barry.
 >
 > It works.
 >
 > GAMG is three times better than ASM in terms of the number of linear
 > iterations, but it is five times slower than ASM. Any suggestions to
 improve
 > the performance of GAMG? Log files are attached.
 >
 > Fande,
 >
 > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith 
 wrote:
 >>
 >>
 >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande 
 wrote:
 >> >
 >> > Thanks, Mark and Barry,
 >> >
 >> > It works pretty well in terms of the number of linear iterations
 (using
 >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute
 time. I am
 >> > using the two-level method via "-pc_mg_levels 2". The reason why
 the compute
 >> > time is larger than other preconditioning options is that a matrix
 free
 >> > method is used in the fine level and in my particular problem the
 function
 >> > evaluation is expensive.
 >> >
 >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free
 Newton,
 >> > but I do not think I want to make the preconditioning part
 matrix-free.  Do
 >> > you guys know how to turn off the matrix-free method for GAMG?
 >>
 >>-pc_use_amat false
 >>
 >> >
 >> > Here is the detailed solver:
 >> >
 >> > SNES Object: 384 MPI processes
 >> >   type: newtonls
 >> >   maximum iterations=200, maximum function evaluations=1
 >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
 >> >   total number of linear solver iterations=20
 >> >   total number of function evaluations=166
 >> >   norm schedule ALWAYS
 >> >   SNESLineSearch Object:   384 MPI processes
 >> 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-12 Thread Kong, Fande
On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:

> You seem to have two levels here and 3M eqs on the fine grid and 37 on
> the coarse grid. I don't understand that.
>
> You are also calling the AMG setup a lot, but not spending much time
> in it. Try running with -info and grep on "GAMG".
>

I got the following output:

[0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
nnz/row (ave)=71, np=384
[0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold 0.,
73.6364 nnz ave. (N=3020875)
[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
[0] PCGAMGProlongator_AGG(): New grid 18162 nodes
[0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978702e+00
min=2.559747e-02 PC=jacobi
[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
neq(loc)=40
[0] PCSetUp_GAMG(): 1) N=18162, n data cols=1, nnz/row (ave)=94, 384 active
pes
[0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00795
[0] PCSetUp_GAMG(): level 0) N=3020875, n data rows=1, n data cols=1,
nnz/row (ave)=71, np=384
[0] PCGAMGFilterGraph():  100.% nnz after filtering, with threshold 0.,
73.6364 nnz ave. (N=3020875)
[0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
[0] PCGAMGProlongator_AGG(): New grid 18145 nodes
[0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.978584e+00
min=2.557887e-02 PC=jacobi
[0] PCGAMGCreateLevel_GAMG(): Aggregate processors noop: new_size=384,
neq(loc)=37
[0] PCSetUp_GAMG(): 1) N=18145, n data cols=1, nnz/row (ave)=94, 384 active
pes
[0] PCSetUp_GAMG(): 2 levels, grid complexity = 1.00792
GAMG specific options
PCGAMGGraph_AGG   40 1.0 8.0759e+00 1.0 3.56e+07 2.3 1.6e+06 1.9e+04
7.6e+02  2  0  2  4  2   2  0  2  4  2  1170
PCGAMGCoarse_AGG  40 1.0 7.1698e+01 1.0 4.05e+09 2.3 4.0e+06 5.1e+04
1.2e+03 18 37  5 27  3  18 37  5 27  3 14632
PCGAMGProl_AGG40 1.0 9.2650e-01 1.2 0.00e+00 0.0 9.8e+05 2.9e+03
9.6e+02  0  0  1  0  2   0  0  1  0  2 0
PCGAMGPOpt_AGG40 1.0 2.4484e+00 1.0 4.72e+08 2.3 3.1e+06 2.3e+03
1.9e+03  1  4  4  1  4   1  4  4  1  4 51328
GAMG: createProl  40 1.0 8.3786e+01 1.0 4.56e+09 2.3 9.6e+06 2.5e+04
4.8e+03 21 42 12 32 10  21 42 12 32 10 14134
GAMG: partLevel   40 1.0 6.7755e+00 1.1 2.59e+08 2.3 2.9e+06 2.5e+03
1.5e+03  2  2  4  1  3   2  2  4  1  3  9431








>
>
> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
> > Thanks, Barry.
> >
> > It works.
> >
> > GAMG is three times better than ASM in terms of the number of linear
> > iterations, but it is five times slower than ASM. Any suggestions to
> improve
> > the performance of GAMG? Log files are attached.
> >
> > Fande,
> >
> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
> >>
> >>
> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> >> >
> >> > Thanks, Mark and Barry,
> >> >
> >> > It works pretty well in terms of the number of linear iterations
> (using
> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I
> am
> >> > using the two-level method via "-pc_mg_levels 2". The reason why the
> compute
> >> > time is larger than other preconditioning options is that a matrix
> free
> >> > method is used in the fine level and in my particular problem the
> function
> >> > evaluation is expensive.
> >> >
> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
> >> > but I do not think I want to make the preconditioning part
> matrix-free.  Do
> >> > you guys know how to turn off the matrix-free method for GAMG?
> >>
> >>-pc_use_amat false
> >>
> >> >
> >> > Here is the detailed solver:
> >> >
> >> > SNES Object: 384 MPI processes
> >> >   type: newtonls
> >> >   maximum iterations=200, maximum function evaluations=1
> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >> >   total number of linear solver iterations=20
> >> >   total number of function evaluations=166
> >> >   norm schedule ALWAYS
> >> >   SNESLineSearch Object:   384 MPI processes
> >> > type: bt
> >> >   interpolation: cubic
> >> >   alpha=1.00e-04
> >> > maxstep=1.00e+08, minlambda=1.00e-12
> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> >> > lambda=1.00e-08
> >> > maximum iterations=40
> >> >   KSP Object:   384 MPI processes
> >> > type: gmres
> >> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> >> > Orthogonalization with no iterative refinement
> >> >   GMRES: happy breakdown tolerance 1e-30
> >> > maximum iterations=100, initial guess is zero
> >> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> >> > right preconditioning
> >> > using UNPRECONDITIONED norm type for convergence test
> >> >   PC Object:   384 MPI processes
> >> > type: gamg
> >> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> >> > Cycles per PCApply=1
> >> > Using Galerkin computed coarse 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-12 Thread Kong, Fande
Hi Mark,

Thanks for your reply.

On Wed, Apr 12, 2017 at 9:16 AM, Mark Adams  wrote:

> The problem comes from setting the number of MG levels (-pc_mg_levels 2).
> Not your fault, it looks like the GAMG logic is faulty, in your version at
> least.
>

What I want is for GAMG to coarsen the fine matrix once and then stop doing
anything.  I did not see any benefit to having more levels if the number of
processors is small.


>
> GAMG will force the coarsest grid to one processor by default, in newer
> versions. You can override the default with:
>
> -pc_gamg_use_parallel_coarse_grid_solver
>
> Your coarse grid solver is ASM with these 37 equations per process and 512
> processes. That is bad.
>

Why is this bad? Is the subdomain problem too small?


> Note, you could run this on one process to see the proper convergence
> rate.
>

Convergence rate for which part? coarse solver, subdomain solver?


> You can fix this with parameters:
>
> >   -pc_gamg_process_eq_limit <50>: Limit (goal) on number of equations
> per process on coarse grids (PCGAMGSetProcEqLim)
> >   -pc_gamg_coarse_eq_limit <50>: Limit on number of equations for the
> coarse grid (PCGAMGSetCoarseEqLim)
>
> If you really want two levels then set something like
> -pc_gamg_coarse_eq_limit 18145 (or higher).
>


Maybe have something like: make the coarse problem 1/8 as large as the
original problem? Otherwise, this number is just problem dependent.



> You can run with -info and grep on GAMG and you will see meta-data for
> each level. You should see "npe=1" for the coarsest, last, grid. Or use a
> parallel direct solver.
>

I will try.


>
> Note, you should not see much degradation as you increase the number of
> levels. 18145 eqs on a 3D problem will probably be noticeable. I generally
> aim for about 3000.
>

It should be fine as long as the coarse problem is solved by a parallel
solver.


Fande,


>
>
> On Mon, Apr 10, 2017 at 12:17 PM, Kong, Fande  wrote:
>
>>
>>
>> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:
>>
>>> You seem to have two levels here and 3M eqs on the fine grid and 37 on
>>> the coarse grid.
>>
>>
>> 37 is on the subdomain.
>>
>>  rows=18145, cols=18145 on the entire coarse grid.
>>
>>
>>
>>
>>
>>> I don't understand that.
>>>
>>> You are also calling the AMG setup a lot, but not spending much time
>>> in it. Try running with -info and grep on "GAMG".
>>>
>>>
>>> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
>>> > Thanks, Barry.
>>> >
>>> > It works.
>>> >
>>> > GAMG is three times better than ASM in terms of the number of linear
>>> > iterations, but it is five times slower than ASM. Any suggestions to
>>> improve
>>> > the performance of GAMG? Log files are attached.
>>> >
>>> > Fande,
>>> >
>>> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith 
>>> wrote:
>>> >>
>>> >>
>>> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
>>> >> >
>>> >> > Thanks, Mark and Barry,
>>> >> >
>>> >> > It works pretty well in terms of the number of linear iterations
>>> (using
>>> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
>>> I am
>>> >> > using the two-level method via "-pc_mg_levels 2". The reason why
>>> the compute
>>> >> > time is larger than other preconditioning options is that a matrix
>>> free
>>> >> > method is used in the fine level and in my particular problem the
>>> function
>>> >> > evaluation is expensive.
>>> >> >
>>> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free
>>> Newton,
>>> >> > but I do not think I want to make the preconditioning part
>>> matrix-free.  Do
>>> >> > you guys know how to turn off the matrix-free method for GAMG?
>>> >>
>>> >>-pc_use_amat false
>>> >>
>>> >> >
>>> >> > Here is the detailed solver:
>>> >> >
>>> >> > SNES Object: 384 MPI processes
>>> >> >   type: newtonls
>>> >> >   maximum iterations=200, maximum function evaluations=1
>>> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
>>> >> >   total number of linear solver iterations=20
>>> >> >   total number of function evaluations=166
>>> >> >   norm schedule ALWAYS
>>> >> >   SNESLineSearch Object:   384 MPI processes
>>> >> > type: bt
>>> >> >   interpolation: cubic
>>> >> >   alpha=1.00e-04
>>> >> > maxstep=1.00e+08, minlambda=1.00e-12
>>> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
>>> >> > lambda=1.00e-08
>>> >> > maximum iterations=40
>>> >> >   KSP Object:   384 MPI processes
>>> >> > type: gmres
>>> >> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
>>> >> > Orthogonalization with no iterative refinement
>>> >> >   GMRES: happy breakdown tolerance 1e-30
>>> >> > maximum iterations=100, initial guess is zero
>>> >> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
>>> >> > right preconditioning

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-12 Thread Mark Adams
The problem comes from setting the number of MG levels (-pc_mg_levels 2).
Not your fault, it looks like the GAMG logic is faulty, in your version at
least.

GAMG will force the coarsest grid to one processor by default, in newer
versions. You can override the default with:

-pc_gamg_use_parallel_coarse_grid_solver

Your coarse grid solver is ASM with these 37 equations per process and 512
processes. That is bad. Note, you could run this on one process to see the
proper convergence rate.  You can fix this with parameters:

>   -pc_gamg_process_eq_limit <50>: Limit (goal) on number of equations per
process on coarse grids (PCGAMGSetProcEqLim)
>   -pc_gamg_coarse_eq_limit <50>: Limit on number of equations for the
coarse grid (PCGAMGSetCoarseEqLim)

If you really want two levels then set something like
-pc_gamg_coarse_eq_limit 18145 (or higher). You can run with -info and grep
on GAMG and you will see meta-data for each level. You should see "npe=1" for
the coarsest, last, grid. Or use a parallel direct solver.

Note, you should not see much degradation as you increase the number of
levels. 18145 eqs on a 3D problem will probably be noticeable. I generally
aim for about 3000.
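
(A minimal C sketch of the same advice via the routines named above,
PCGAMGSetProcEqLim and PCGAMGSetCoarseEqLim; the specific limits are
illustrative, not prescriptive:)

  #include <petscksp.h>

  static PetscErrorCode SetupGAMG(KSP ksp)
  {
    PC             pc;
    PetscErrorCode ierr;

    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCGAMG);CHKERRQ(ierr);
    /* goal (not a hard bound) on equations per process on coarse grids */
    ierr = PCGAMGSetProcEqLim(pc, 50);CHKERRQ(ierr);
    /* stop coarsening near ~3000 total equations, per the rule of thumb */
    ierr = PCGAMGSetCoarseEqLim(pc, 3000);CHKERRQ(ierr);
    /* command-line options (e.g. -pc_gamg_coarse_eq_limit) still win */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    return 0;
  }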


On Mon, Apr 10, 2017 at 12:17 PM, Kong, Fande  wrote:

>
>
> On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:
>
>> You seem to have two levels here and 3M eqs on the fine grid and 37 on
>> the coarse grid.
>
>
> 37 is on the subdomain.
>
>  rows=18145, cols=18145 on the entire coarse grid.
>
>
>
>
>
>> I don't understand that.
>>
>> You are also calling the AMG setup a lot, but not spending much time
>> in it. Try running with -info and grep on "GAMG".
>>
>>
>> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
>> > Thanks, Barry.
>> >
>> > It works.
>> >
>> > GAMG is three times better than ASM in terms of the number of linear
>> > iterations, but it is five times slower than ASM. Any suggestions to
>> improve
>> > the performance of GAMG? Log files are attached.
>> >
>> > Fande,
>> >
>> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
>> >>
>> >>
>> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
>> >> >
>> >> > Thanks, Mark and Barry,
>> >> >
>> >> > It works pretty well in terms of the number of linear iterations
>> (using
>> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
>> I am
>> >> > using the two-level method via "-pc_mg_levels 2". The reason why the
>> compute
>> >> > time is larger than other preconditioning options is that a matrix
>> free
>> >> > method is used in the fine level and in my particular problem the
>> function
>> >> > evaluation is expensive.
>> >> >
>> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
>> >> > but I do not think I want to make the preconditioning part
>> matrix-free.  Do
>> >> > you guys know how to turn off the matrix-free method for GAMG?
>> >>
>> >>-pc_use_amat false
>> >>
>> >> >
>> >> > Here is the detailed solver:
>> >> >
>> >> > SNES Object: 384 MPI processes
>> >> >   type: newtonls
>> >> >   maximum iterations=200, maximum function evaluations=1
>> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
>> >> >   total number of linear solver iterations=20
>> >> >   total number of function evaluations=166
>> >> >   norm schedule ALWAYS
>> >> >   SNESLineSearch Object:   384 MPI processes
>> >> > type: bt
>> >> >   interpolation: cubic
>> >> >   alpha=1.00e-04
>> >> > maxstep=1.00e+08, minlambda=1.00e-12
>> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
>> >> > lambda=1.00e-08
>> >> > maximum iterations=40
>> >> >   KSP Object:   384 MPI processes
>> >> > type: gmres
>> >> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
>> >> > Orthogonalization with no iterative refinement
>> >> >   GMRES: happy breakdown tolerance 1e-30
>> >> > maximum iterations=100, initial guess is zero
>> >> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
>> >> > right preconditioning
>> >> > using UNPRECONDITIONED norm type for convergence test
>> >> >   PC Object:   384 MPI processes
>> >> > type: gamg
>> >> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
>> >> > Cycles per PCApply=1
>> >> > Using Galerkin computed coarse grid matrices
>> >> > GAMG specific options
>> >> >   Threshold for dropping small values from graph 0.
>> >> >   AGG specific options
>> >> > Symmetric graph true
>> >> > Coarse grid solver -- level ---
>> >> >   KSP Object:  (mg_coarse_)   384 MPI processes
>> >> > type: preonly
>> >> > maximum iterations=1, initial guess is zero
>> >> > tolerances:  relative=1e-05, absolute=1e-50,
>> divergence=1.
>> >> > left 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-10 Thread Kong, Fande
On Sun, Apr 9, 2017 at 6:04 AM, Mark Adams  wrote:

> You seem to have two levels here and 3M eqs on the fine grid and 37 on
> the coarse grid.


37 is on the subdomain.

 rows=18145, cols=18145 on the entire coarse grid.





> I don't understand that.
>
> You are also calling the AMG setup a lot, but not spending much time
> in it. Try running with -info and grep on "GAMG".
>
>
> On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
> > Thanks, Barry.
> >
> > It works.
> >
> > GAMG is three times better than ASM in terms of the number of linear
> > iterations, but it is five times slower than ASM. Any suggestions to
> improve
> > the performance of GAMG? Log files are attached.
> >
> > Fande,
> >
> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
> >>
> >>
> >> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> >> >
> >> > Thanks, Mark and Barry,
> >> >
> >> > It works pretty well in terms of the number of linear iterations
> (using
> >> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I
> am
> >> > using the two-level method via "-pc_mg_levels 2". The reason why the
> compute
> >> > time is larger than other preconditioning options is that a matrix
> free
> >> > method is used in the fine level and in my particular problem the
> function
> >> > evaluation is expensive.
> >> >
> >> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
> >> > but I do not think I want to make the preconditioning part
> matrix-free.  Do
> >> > you guys know how to turn off the matrix-free method for GAMG?
> >>
> >>-pc_use_amat false
> >>
> >> >
> >> > Here is the detailed solver:
> >> >
> >> > SNES Object: 384 MPI processes
> >> >   type: newtonls
> >> >   maximum iterations=200, maximum function evaluations=1
> >> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >> >   total number of linear solver iterations=20
> >> >   total number of function evaluations=166
> >> >   norm schedule ALWAYS
> >> >   SNESLineSearch Object:   384 MPI processes
> >> > type: bt
> >> >   interpolation: cubic
> >> >   alpha=1.00e-04
> >> > maxstep=1.00e+08, minlambda=1.00e-12
> >> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> >> > lambda=1.00e-08
> >> > maximum iterations=40
> >> >   KSP Object:   384 MPI processes
> >> > type: gmres
> >> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> >> > Orthogonalization with no iterative refinement
> >> >   GMRES: happy breakdown tolerance 1e-30
> >> > maximum iterations=100, initial guess is zero
> >> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> >> > right preconditioning
> >> > using UNPRECONDITIONED norm type for convergence test
> >> >   PC Object:   384 MPI processes
> >> > type: gamg
> >> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> >> > Cycles per PCApply=1
> >> > Using Galerkin computed coarse grid matrices
> >> > GAMG specific options
> >> >   Threshold for dropping small values from graph 0.
> >> >   AGG specific options
> >> > Symmetric graph true
> >> > Coarse grid solver -- level ---
> >> >   KSP Object:  (mg_coarse_)   384 MPI processes
> >> > type: preonly
> >> > maximum iterations=1, initial guess is zero
> >> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> >> > left preconditioning
> >> > using NONE norm type for convergence test
> >> >   PC Object:  (mg_coarse_)   384 MPI processes
> >> > type: bjacobi
> >> >   block Jacobi: number of blocks = 384
> >> >   Local solve is same for all blocks, in the following KSP and
> >> > PC objects:
> >> > KSP Object:(mg_coarse_sub_) 1 MPI processes
> >> >   type: preonly
> >> >   maximum iterations=1, initial guess is zero
> >> >   tolerances:  relative=1e-05, absolute=1e-50,
> divergence=1.
> >> >   left preconditioning
> >> >   using NONE norm type for convergence test
> >> > PC Object:(mg_coarse_sub_) 1 MPI processes
> >> >   type: lu
> >> > LU: out-of-place factorization
> >> > tolerance for zero pivot 2.22045e-14
> >> > using diagonal shift on blocks to prevent zero pivot
> >> > [INBLOCKS]
> >> > matrix ordering: nd
> >> > factor fill ratio given 5., needed 1.31367
> >> >   Factored matrix follows:
> >> > Mat Object: 1 MPI processes
> >> >   type: seqaij
> >> >   rows=37, cols=37
> >> >   package used to perform factorization: petsc
> >> >   total: nonzeros=913, allocated nonzeros=913
> >> >   

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-09 Thread Mark Adams
You seem to have two levels here and 3M eqs on the fine grid and 37 on
the coarse grid. I don't understand that.

You are also calling the AMG setup a lot, but not spending much time
in it. Try running with -info and grep on "GAMG".
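
(For example, something along these lines, with a generic MPI launcher
standing in for whatever your machine uses:

  mpiexec -n 384 ./ex56 <your options> -info 2>&1 | grep GAMG

This pulls out the PCSetUp_GAMG() / PCGAMGCreateLevel_GAMG() lines quoted
elsewhere in this thread.)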


On Fri, Apr 7, 2017 at 5:29 PM, Kong, Fande  wrote:
> Thanks, Barry.
>
> It works.
>
> GAMG is three times better than ASM in terms of the number of linear
> iterations, but it is five times slower than ASM. Any suggestions to improve
> the performance of GAMG? Log files are attached.
>
> Fande,
>
> On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
>>
>>
>> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
>> >
>> > Thanks, Mark and Barry,
>> >
>> > It works pretty well in terms of the number of linear iterations (using
>> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am
>> > using the two-level method via "-pc_mg_levels 2". The reason why the 
>> > compute
>> > time is larger than other preconditioning options is that a matrix free
>> > method is used in the fine level and in my particular problem the function
>> > evaluation is expensive.
>> >
>> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
>> > but I do not think I want to make the preconditioning part matrix-free.  Do
>> > you guys know how to turn off the matrix-free method for GAMG?
>>
>>-pc_use_amat false
>>
>> >
>> > Here is the detailed solver:
>> >
>> > SNES Object: 384 MPI processes
>> >   type: newtonls
>> >   maximum iterations=200, maximum function evaluations=1
>> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
>> >   total number of linear solver iterations=20
>> >   total number of function evaluations=166
>> >   norm schedule ALWAYS
>> >   SNESLineSearch Object:   384 MPI processes
>> > type: bt
>> >   interpolation: cubic
>> >   alpha=1.00e-04
>> > maxstep=1.00e+08, minlambda=1.00e-12
>> > tolerances: relative=1.00e-08, absolute=1.00e-15,
>> > lambda=1.00e-08
>> > maximum iterations=40
>> >   KSP Object:   384 MPI processes
>> > type: gmres
>> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
>> > Orthogonalization with no iterative refinement
>> >   GMRES: happy breakdown tolerance 1e-30
>> > maximum iterations=100, initial guess is zero
>> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
>> > right preconditioning
>> > using UNPRECONDITIONED norm type for convergence test
>> >   PC Object:   384 MPI processes
>> > type: gamg
>> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
>> > Cycles per PCApply=1
>> > Using Galerkin computed coarse grid matrices
>> > GAMG specific options
>> >   Threshold for dropping small values from graph 0.
>> >   AGG specific options
>> > Symmetric graph true
>> > Coarse grid solver -- level ---
>> >   KSP Object:  (mg_coarse_)   384 MPI processes
>> > type: preonly
>> > maximum iterations=1, initial guess is zero
>> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> >   PC Object:  (mg_coarse_)   384 MPI processes
>> > type: bjacobi
>> >   block Jacobi: number of blocks = 384
>> >   Local solve is same for all blocks, in the following KSP and
>> > PC objects:
>> > KSP Object:(mg_coarse_sub_) 1 MPI processes
>> >   type: preonly
>> >   maximum iterations=1, initial guess is zero
>> >   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>> >   left preconditioning
>> >   using NONE norm type for convergence test
>> > PC Object:(mg_coarse_sub_) 1 MPI processes
>> >   type: lu
>> > LU: out-of-place factorization
>> > tolerance for zero pivot 2.22045e-14
>> > using diagonal shift on blocks to prevent zero pivot
>> > [INBLOCKS]
>> > matrix ordering: nd
>> > factor fill ratio given 5., needed 1.31367
>> >   Factored matrix follows:
>> > Mat Object: 1 MPI processes
>> >   type: seqaij
>> >   rows=37, cols=37
>> >   package used to perform factorization: petsc
>> >   total: nonzeros=913, allocated nonzeros=913
>> >   total number of mallocs used during MatSetValues calls
>> > =0
>> > not using I-node routines
>> >   linear system matrix = precond matrix:
>> >   Mat Object:   1 MPI processes
>> > type: seqaij
>> > rows=37, cols=37
>> > total: nonzeros=695, allocated nonzeros=695
>> > total number 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-07 Thread Kong, Fande
On Fri, Apr 7, 2017 at 3:52 PM, Barry Smith  wrote:

>
> > On Apr 7, 2017, at 4:46 PM, Kong, Fande  wrote:
> >
> >
> >
> > On Fri, Apr 7, 2017 at 3:39 PM, Barry Smith  wrote:
> >
> >   Using Petsc Release Version 3.7.5, unknown
> >
> >So are you using the release or are you using master branch?
> >
> > I am working on the maint branch.
> >
> > I did something two months ago:
> >
>  git clone -b maint https://bitbucket.org/petsc/petsc petsc.
> >
> >
> > I am interested in improving the GAMG performance.
>
>   Why, why not use the best solver for your problem?
>

I am just curious. I want to understand the potential of interesting
preconditioners.



>
> > Is it possible? It cannot beat ASM at all? The multilevel method should
> > be better than the one-level method if the number of processor cores is large.
>
>The ASM is taking 30 iterations; this is fantastic. It is really going
> to be tough to get GAMG to be faster (setup time for GAMG is high).
>
>What happens to both with 10 times as many processes? 100 times as many?
>


Did not try many processes yet.

Fande,



>
>
>Barry
>
> >
> > Fande,
> >
> >
> >If you use master the ASM will be even faster.
> >
> > What's new in master?
> >
> >
> > Fande,
> >
> >
> >
> > > On Apr 7, 2017, at 4:29 PM, Kong, Fande  wrote:
> > >
> > > Thanks, Barry.
> > >
> > > It works.
> > >
> > > GAMG is three times better than ASM in terms of the number of linear
> iterations, but it is five times slower than ASM. Any suggestions to
> improve the performance of GAMG? Log files are attached.
> > >
> > > Fande,
> > >
> > > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith 
> wrote:
> > >
> > > > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> > > >
> > > > Thanks, Mark and Barry,
> > > >
> > > > It works pretty well in terms of the number of linear iterations
> (using "-pc_gamg_sym_graph true"), but it is horrible in the compute time.
> I am using the two-level method via "-pc_mg_levels 2". The reason why the
> compute time is larger than other preconditioning options is that a matrix
> free method is used in the fine level and in my particular problem the
> function evaluation is expensive.
> > > >
> > > > I am using "-snes_mf_operator 1" to turn on the Jacobian-free
> Newton, but I do not think I want to make the preconditioning part
> matrix-free.  Do you guys know how to turn off the matrix-free method for
> GAMG?
> > >
> > >-pc_use_amat false
> > >
> > > >
> > > > Here is the detailed solver:
> > > >
> > > > SNES Object: 384 MPI processes
> > > >   type: newtonls
> > > >   maximum iterations=200, maximum function evaluations=1
> > > >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> > > >   total number of linear solver iterations=20
> > > >   total number of function evaluations=166
> > > >   norm schedule ALWAYS
> > > >   SNESLineSearch Object:   384 MPI processes
> > > > type: bt
> > > >   interpolation: cubic
> > > >   alpha=1.00e-04
> > > > maxstep=1.00e+08, minlambda=1.00e-12
> > > > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > > > maximum iterations=40
> > > >   KSP Object:   384 MPI processes
> > > > type: gmres
> > > >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> > > >   GMRES: happy breakdown tolerance 1e-30
> > > > maximum iterations=100, initial guess is zero
> > > > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> > > > right preconditioning
> > > > using UNPRECONDITIONED norm type for convergence test
> > > >   PC Object:   384 MPI processes
> > > > type: gamg
> > > >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> > > > Cycles per PCApply=1
> > > > Using Galerkin computed coarse grid matrices
> > > > GAMG specific options
> > > >   Threshold for dropping small values from graph 0.
> > > >   AGG specific options
> > > > Symmetric graph true
> > > > Coarse grid solver -- level ---
> > > >   KSP Object:  (mg_coarse_)   384 MPI processes
> > > > type: preonly
> > > > maximum iterations=1, initial guess is zero
> > > > tolerances:  relative=1e-05, absolute=1e-50,
> divergence=1.
> > > > left preconditioning
> > > > using NONE norm type for convergence test
> > > >   PC Object:  (mg_coarse_)   384 MPI processes
> > > > type: bjacobi
> > > >   block Jacobi: number of blocks = 384
> > > >   Local solve is 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-07 Thread Barry Smith

> On Apr 7, 2017, at 4:46 PM, Kong, Fande  wrote:
> 
> 
> 
> On Fri, Apr 7, 2017 at 3:39 PM, Barry Smith  wrote:
> 
>   Using Petsc Release Version 3.7.5, unknown
> 
>So are you using the release or are you using master branch?
> 
> I am working on the maint branch. 
> 
> I did something two months ago:
> 
>  git clone -b maint https://bitbucket.org/petsc/petsc petsc.
> 
> 
> I am interested in improving the GAMG performance.

  Why, why not use the best solver for your problem?

> Is it possible? It cannot beat ASM at all? The multilevel method should be
> better than the one-level method if the number of processor cores is large.

   The ASM is taking 30 iterations; this is fantastic. It is really going to be
tough to get GAMG to be faster (setup time for GAMG is high).
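
   (One setup-cost lever that appears in the job scripts elsewhere in this
archive is -pc_gamg_reuse_interpolation true, which reuses the prolongation
across solves and so amortizes part of the GAMG setup; whether it applies
here depends on how often the operator changes.)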

   What happens to both with 10 times as many processes? 100 times as many?


   Barry

> 
> Fande,
>  
> 
>If you use master the ASM will be even faster.
> 
> What's new in master?
> 
> 
> Fande,
>  
> 
> 
> > On Apr 7, 2017, at 4:29 PM, Kong, Fande  wrote:
> >
> > Thanks, Barry.
> >
> > It works.
> >
> > GAMG is three times better than ASM in terms of the number of linear 
> > iterations, but it is five times slower than ASM. Any suggestions to 
> > improve the performance of GAMG? Log files are attached.
> >
> > Fande,
> >
> > On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
> >
> > > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> > >
> > > Thanks, Mark and Barry,
> > >
> > > It works pretty well in terms of the number of linear iterations (using 
> > > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am 
> > > using the two-level method via "-pc_mg_levels 2". The reason why the 
> > > compute time is larger than other preconditioning options is that a 
> > > matrix free method is used in the fine level and in my particular problem 
> > > the function evaluation is expensive.
> > >
> > > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but 
> > > I do not think I want to make the preconditioning part matrix-free.  Do 
> > > you guys know how to turn off the matrix-free method for GAMG?
> >
> >-pc_use_amat false
> >
> > >
> > > Here is the detailed solver:
> > >
> > > SNES Object: 384 MPI processes
> > >   type: newtonls
> > >   maximum iterations=200, maximum function evaluations=1
> > >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> > >   total number of linear solver iterations=20
> > >   total number of function evaluations=166
> > >   norm schedule ALWAYS
> > >   SNESLineSearch Object:   384 MPI processes
> > > type: bt
> > >   interpolation: cubic
> > >   alpha=1.00e-04
> > > maxstep=1.00e+08, minlambda=1.00e-12
> > > tolerances: relative=1.00e-08, absolute=1.00e-15, 
> > > lambda=1.00e-08
> > > maximum iterations=40
> > >   KSP Object:   384 MPI processes
> > > type: gmres
> > >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt 
> > > Orthogonalization with no iterative refinement
> > >   GMRES: happy breakdown tolerance 1e-30
> > > maximum iterations=100, initial guess is zero
> > > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> > > right preconditioning
> > > using UNPRECONDITIONED norm type for convergence test
> > >   PC Object:   384 MPI processes
> > > type: gamg
> > >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> > > Cycles per PCApply=1
> > > Using Galerkin computed coarse grid matrices
> > > GAMG specific options
> > >   Threshold for dropping small values from graph 0.
> > >   AGG specific options
> > > Symmetric graph true
> > > Coarse grid solver -- level ---
> > >   KSP Object:  (mg_coarse_)   384 MPI processes
> > > type: preonly
> > > maximum iterations=1, initial guess is zero
> > > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> > > left preconditioning
> > > using NONE norm type for convergence test
> > >   PC Object:  (mg_coarse_)   384 MPI processes
> > > type: bjacobi
> > >   block Jacobi: number of blocks = 384
> > >   Local solve is same for all blocks, in the following KSP and PC 
> > > objects:
> > > KSP Object:(mg_coarse_sub_) 1 MPI processes
> > >   type: preonly
> > >   maximum iterations=1, initial guess is zero
> > >   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> > >   left preconditioning
> > >   using NONE norm type for convergence test
> > > PC Object:(mg_coarse_sub_) 1 MPI processes
> > >   type: lu
> > > LU: out-of-place factorization
> > > tolerance for zero pivot 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-07 Thread Barry Smith

  Using Petsc Release Version 3.7.5, unknown 

   So are you using the release or are you using master branch?

   If you use master the ASM will be even faster.


> On Apr 7, 2017, at 4:29 PM, Kong, Fande  wrote:
> 
> Thanks, Barry.
> 
> It works.
> 
> GAMG is three times better than ASM in terms of the number of linear 
> iterations, but it is five times slower than ASM. Any suggestions to improve 
> the performance of GAMG? Log files are attached.
> 
> Fande,
> 
> On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:
> 
> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> >
> > Thanks, Mark and Barry,
> >
> > It works pretty well in terms of the number of linear iterations (using 
> > "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am 
> > using the two-level method via "-pc_mg_levels 2". The reason why the 
> > compute time is larger than other preconditioning options is that a matrix 
> > free method is used in the fine level and in my particular problem the 
> > function evaluation is expensive.
> >
> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but I 
> > do not think I want to make the preconditioning part matrix-free.  Do you 
> > guys know how to turn off the matrix-free method for GAMG?
> 
>-pc_use_amat false
> 
> >
> > Here is the detailed solver:
> >
> > SNES Object: 384 MPI processes
> >   type: newtonls
> >   maximum iterations=200, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >   total number of linear solver iterations=20
> >   total number of function evaluations=166
> >   norm schedule ALWAYS
> >   SNESLineSearch Object:   384 MPI processes
> > type: bt
> >   interpolation: cubic
> >   alpha=1.00e-04
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15, 
> > lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object:   384 MPI processes
> > type: gmres
> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt 
> > Orthogonalization with no iterative refinement
> >   GMRES: happy breakdown tolerance 1e-30
> > maximum iterations=100, initial guess is zero
> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> >   PC Object:   384 MPI processes
> > type: gamg
> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> > Cycles per PCApply=1
> > Using Galerkin computed coarse grid matrices
> > GAMG specific options
> >   Threshold for dropping small values from graph 0.
> >   AGG specific options
> > Symmetric graph true
> > Coarse grid solver -- level ---
> >   KSP Object:  (mg_coarse_)   384 MPI processes
> > type: preonly
> > maximum iterations=1, initial guess is zero
> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> > left preconditioning
> > using NONE norm type for convergence test
> >   PC Object:  (mg_coarse_)   384 MPI processes
> > type: bjacobi
> >   block Jacobi: number of blocks = 384
> >   Local solve is same for all blocks, in the following KSP and PC 
> > objects:
> > KSP Object:(mg_coarse_sub_) 1 MPI processes
> >   type: preonly
> >   maximum iterations=1, initial guess is zero
> >   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> >   left preconditioning
> >   using NONE norm type for convergence test
> > PC Object:(mg_coarse_sub_) 1 MPI processes
> >   type: lu
> > LU: out-of-place factorization
> > tolerance for zero pivot 2.22045e-14
> > using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
> > matrix ordering: nd
> > factor fill ratio given 5., needed 1.31367
> >   Factored matrix follows:
> > Mat Object: 1 MPI processes
> >   type: seqaij
> >   rows=37, cols=37
> >   package used to perform factorization: petsc
> >   total: nonzeros=913, allocated nonzeros=913
> >   total number of mallocs used during MatSetValues calls =0
> > not using I-node routines
> >   linear system matrix = precond matrix:
> >   Mat Object:   1 MPI processes
> > type: seqaij
> > rows=37, cols=37
> > total: nonzeros=695, allocated nonzeros=695
> > total number of mallocs used during MatSetValues calls =0
> >   not using I-node routines
> > linear system matrix = precond matrix:
> > Mat Object: 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-07 Thread Kong, Fande
Thanks, Barry.

It works.

GAMG is three times better than ASM in terms of the number of linear
iterations, but it is five times slower than ASM. Any suggestions to
improve the performance of GAMG? Log files are attached.

Fande,

On Thu, Apr 6, 2017 at 3:39 PM, Barry Smith  wrote:

>
> > On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> >
> > Thanks, Mark and Barry,
> >
> > It works pretty well in terms of the number of linear iterations (using
> "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am
> using the two-level method via "-pc_mg_levels 2". The reason why the
> compute time is larger than other preconditioning options is that a matrix
> free method is used in the fine level and in my particular problem the
> function evaluation is expensive.
> >
> > I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton,
> but I do not think I want to make the preconditioning part matrix-free.  Do
> you guys know how to turn off the matrix-free method for GAMG?
>
>-pc_use_amat false
>
> >
> > Here is the detailed solver:
> >
> > SNES Object: 384 MPI processes
> >   type: newtonls
> >   maximum iterations=200, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
> >   total number of linear solver iterations=20
> >   total number of function evaluations=166
> >   norm schedule ALWAYS
> >   SNESLineSearch Object:   384 MPI processes
> > type: bt
> >   interpolation: cubic
> >   alpha=1.00e-04
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object:   384 MPI processes
> > type: gmres
> >   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> >   GMRES: happy breakdown tolerance 1e-30
> > maximum iterations=100, initial guess is zero
> > tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> >   PC Object:   384 MPI processes
> > type: gamg
> >   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> > Cycles per PCApply=1
> > Using Galerkin computed coarse grid matrices
> > GAMG specific options
> >   Threshold for dropping small values from graph 0.
> >   AGG specific options
> > Symmetric graph true
> > Coarse grid solver -- level ---
> >   KSP Object:  (mg_coarse_)   384 MPI processes
> > type: preonly
> > maximum iterations=1, initial guess is zero
> > tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> > left preconditioning
> > using NONE norm type for convergence test
> >   PC Object:  (mg_coarse_)   384 MPI processes
> > type: bjacobi
> >   block Jacobi: number of blocks = 384
> >   Local solve is same for all blocks, in the following KSP and
> PC objects:
> > KSP Object:(mg_coarse_sub_) 1 MPI processes
> >   type: preonly
> >   maximum iterations=1, initial guess is zero
> >   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> >   left preconditioning
> >   using NONE norm type for convergence test
> > PC Object:(mg_coarse_sub_) 1 MPI processes
> >   type: lu
> > LU: out-of-place factorization
> > tolerance for zero pivot 2.22045e-14
> > using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
> > matrix ordering: nd
> > factor fill ratio given 5., needed 1.31367
> >   Factored matrix follows:
> > Mat Object: 1 MPI processes
> >   type: seqaij
> >   rows=37, cols=37
> >   package used to perform factorization: petsc
> >   total: nonzeros=913, allocated nonzeros=913
> >   total number of mallocs used during MatSetValues calls
> =0
> > not using I-node routines
> >   linear system matrix = precond matrix:
> >   Mat Object:   1 MPI processes
> > type: seqaij
> > rows=37, cols=37
> > total: nonzeros=695, allocated nonzeros=695
> > total number of mallocs used during MatSetValues calls =0
> >   not using I-node routines
> > linear system matrix = precond matrix:
> > Mat Object: 384 MPI processes
> >   type: mpiaij
> >   rows=18145, cols=18145
> >   total: nonzeros=1709115, allocated nonzeros=1709115
> >   total number of mallocs used during MatSetValues calls =0
> > not using I-node (on process 0) routines
> >  

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-06 Thread Barry Smith

> On Apr 6, 2017, at 9:39 AM, Kong, Fande  wrote:
> 
> Thanks, Mark and Barry,
> 
> It works pretty well in terms of the number of linear iterations (using 
> "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am 
> using the two-level method via "-pc_mg_levels 2". The reason why the compute 
> time is larger than other preconditioning options is that a matrix free 
> method is used in the fine level and in my particular problem the function 
> evaluation is expensive. 
> 
> I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but I 
> do not think I want to make the preconditioning part matrix-free.  Do you 
> guys know how to turn off the matrix-free method for GAMG?

   -pc_use_amat false

> 
> Here is the detailed solver:
> 
> SNES Object: 384 MPI processes
>   type: newtonls
>   maximum iterations=200, maximum function evaluations=1
>   tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
>   total number of linear solver iterations=20
>   total number of function evaluations=166
>   norm schedule ALWAYS
>   SNESLineSearch Object:   384 MPI processes
> type: bt
>   interpolation: cubic
>   alpha=1.00e-04
> maxstep=1.00e+08, minlambda=1.00e-12
> tolerances: relative=1.00e-08, absolute=1.00e-15, 
> lambda=1.00e-08
> maximum iterations=40
>   KSP Object:   384 MPI processes
> type: gmres
>   GMRES: restart=100, using Classical (unmodified) Gram-Schmidt 
> Orthogonalization with no iterative refinement
>   GMRES: happy breakdown tolerance 1e-30
> maximum iterations=100, initial guess is zero
> tolerances:  relative=0.001, absolute=1e-50, divergence=1.
> right preconditioning
> using UNPRECONDITIONED norm type for convergence test
>   PC Object:   384 MPI processes
> type: gamg
>   MG: type is MULTIPLICATIVE, levels=2 cycles=v
> Cycles per PCApply=1
> Using Galerkin computed coarse grid matrices
> GAMG specific options
>   Threshold for dropping small values from graph 0.
>   AGG specific options
> Symmetric graph true
> Coarse grid solver -- level ---
>   KSP Object:  (mg_coarse_)   384 MPI processes
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
> left preconditioning
> using NONE norm type for convergence test
>   PC Object:  (mg_coarse_)   384 MPI processes
> type: bjacobi
>   block Jacobi: number of blocks = 384
>   Local solve is same for all blocks, in the following KSP and PC 
> objects:
> KSP Object:(mg_coarse_sub_) 1 MPI processes
>   type: preonly
>   maximum iterations=1, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object:(mg_coarse_sub_) 1 MPI processes
>   type: lu
> LU: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
> matrix ordering: nd
> factor fill ratio given 5., needed 1.31367
>   Factored matrix follows:
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=37, cols=37
>   package used to perform factorization: petsc
>   total: nonzeros=913, allocated nonzeros=913
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
> type: seqaij
> rows=37, cols=37
> total: nonzeros=695, allocated nonzeros=695
> total number of mallocs used during MatSetValues calls =0
>   not using I-node routines
> linear system matrix = precond matrix:
> Mat Object: 384 MPI processes
>   type: mpiaij
>   rows=18145, cols=18145
>   total: nonzeros=1709115, allocated nonzeros=1709115
>   total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 1 ---
>   KSP Object:  (mg_levels_1_)   384 MPI processes
> type: chebyshev
>   Chebyshev: eigenvalue estimates:  min = 0.19, max = 1.46673
>   Chebyshev: eigenvalues estimated using gmres with translations  [0. 
> 0.1; 0. 1.1]
>   KSP Object:  (mg_levels_1_esteig_)   384 MPI 
> processes
> type: gmres
>   GMRES: 

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-06 Thread Mark Adams
On Thu, Apr 6, 2017 at 7:39 AM, Kong, Fande  wrote:
> Thanks, Mark and Barry,
>
> It works pretty well in terms of the number of linear iterations (using
> "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am
> using the two-level method via "-pc_mg_levels 2". The reason why the compute
> time is larger than other preconditioning options is that a matrix free
> method is used in the fine level and in my particular problem the function
> evaluation is expensive.
>
> I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but I
> do not think I want to make the preconditioning part matrix-free.  Do you
> guys know how to turn off the matrix-free method for GAMG?

You do have an option to use the operator or the preconditioner
operator (matrix) for the fine grid smoother, but I thought it uses
the PC matrix by default. I don't recall the parameters nor do I see
this in the view output.  Others should be able to help.
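
A minimal sketch of the setup being discussed (hedged; "snes", "J", and
"FormJacobian" are placeholder names, not code from this thread). With
-snes_mf_operator, SNES applies the true Jacobian matrix-free while the
assembled matrix passed as Pmat is what GAMG coarsens and smooths with;
PCSetUseAmat(pc, PETSC_FALSE) is the API form of -pc_use_amat false:

  /* Assembled (approximate) Jacobian J is the Pmat; with -snes_mf_operator,
     SNES swaps the Amat for a matrix-free MFFD operator at run time. */
  KSP ksp;
  PC  pc;
  SNESSetJacobian(snes, J, J, FormJacobian, NULL);
  SNESGetKSP(snes, &ksp);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCGAMG);
  PCSetUseAmat(pc, PETSC_FALSE);  /* equivalent to -pc_use_amat false */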


Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-06 Thread Kong, Fande
Thanks, Mark and Barry,

It works pretty well in terms of the number of linear iterations (using
"-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am
using the two-level method via "-pc_mg_levels 2". The reason why the
compute time is larger than other preconditioning options is that a matrix
free method is used in the fine level and in my particular problem the
function evaluation is expensive.

I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but I
do not think I want to make the preconditioning part matrix-free.  Do you
guys know how to turn off the matrix-free method for GAMG?

Here is the detailed solver:

SNES Object: 384 MPI processes
  type: newtonls
  maximum iterations=200, maximum function evaluations=1
  tolerances: relative=1e-08, absolute=1e-08, solution=1e-50
  total number of linear solver iterations=20
  total number of function evaluations=166
  norm schedule ALWAYS
  SNESLineSearch Object:   384 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.00e-04
    maxstep=1.00e+08, minlambda=1.00e-12
    tolerances: relative=1.00e-08, absolute=1.00e-15, lambda=1.00e-08
    maximum iterations=40
  KSP Object:   384 MPI processes
    type: gmres
      GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
      GMRES: happy breakdown tolerance 1e-30
    maximum iterations=100, initial guess is zero
    tolerances:  relative=0.001, absolute=1e-50, divergence=1.
    right preconditioning
    using UNPRECONDITIONED norm type for convergence test
  PC Object:   384 MPI processes
    type: gamg
      MG: type is MULTIPLICATIVE, levels=2 cycles=v
        Cycles per PCApply=1
        Using Galerkin computed coarse grid matrices
        GAMG specific options
          Threshold for dropping small values from graph 0.
          AGG specific options
            Symmetric graph true
    Coarse grid solver -- level -------------------------------
      KSP Object:  (mg_coarse_)   384 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
        left preconditioning
        using NONE norm type for convergence test
      PC Object:  (mg_coarse_)   384 MPI processes
        type: bjacobi
          block Jacobi: number of blocks = 384
          Local solve is same for all blocks, in the following KSP and PC objects:
        KSP Object:    (mg_coarse_sub_) 1 MPI processes
          type: preonly
          maximum iterations=1, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
          left preconditioning
          using NONE norm type for convergence test
        PC Object:    (mg_coarse_sub_) 1 MPI processes
          type: lu
            LU: out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
            matrix ordering: nd
            factor fill ratio given 5., needed 1.31367
              Factored matrix follows:
                Mat Object: 1 MPI processes
                  type: seqaij
                  rows=37, cols=37
                  package used to perform factorization: petsc
                  total: nonzeros=913, allocated nonzeros=913
                  total number of mallocs used during MatSetValues calls =0
                    not using I-node routines
          linear system matrix = precond matrix:
          Mat Object:   1 MPI processes
            type: seqaij
            rows=37, cols=37
            total: nonzeros=695, allocated nonzeros=695
            total number of mallocs used during MatSetValues calls =0
              not using I-node routines
        linear system matrix = precond matrix:
        Mat Object: 384 MPI processes
          type: mpiaij
          rows=18145, cols=18145
          total: nonzeros=1709115, allocated nonzeros=1709115
          total number of mallocs used during MatSetValues calls =0
            not using I-node (on process 0) routines
    Down solver (pre-smoother) on level 1 -------------------------------
      KSP Object:  (mg_levels_1_)   384 MPI processes
        type: chebyshev
          Chebyshev: eigenvalue estimates:  min = 0.19, max = 1.46673
          Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]
          KSP Object:  (mg_levels_1_esteig_)   384 MPI processes
            type: gmres
              GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
              GMRES: happy breakdown tolerance 1e-30
            maximum iterations=10, initial guess is zero
            tolerances:  relative=1e-12, absolute=1e-50, divergence=1.
            left preconditioning
            using PRECONDITIONED norm type for convergence test

Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-06 Thread Mark Adams
On Tue, Apr 4, 2017 at 10:10 AM, Barry Smith  wrote:
>
>> Does this mean that GAMG works for the symmetrical matrix only?
>
>   No, it means that for non symmetric nonzero structure you need the extra 
> flag. So use the extra flag. The reason we don't always use the flag is 
> because it adds extra cost and isn't needed if the matrix already has a 
> symmetric nonzero structure.

BTW, if you have symmetric non-zero structure you can just set
'-pc_gamg_threshold -1.0'; note the "or" in the message.

If you want to mess with the threshold then you need to use the
symmetrized flag.
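
For reference, the two recipes quoted in this thread reduce to the following
run-time options (3.7-era GAMG):

  -pc_type gamg -pc_gamg_sym_graph true     (unsymmetric nonzero structure: symmetrize the graph)
  -pc_type gamg -pc_gamg_threshold -1.0     (structurally symmetric matrix: keep the whole graph)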


Re: [petsc-users] GAMG for the unsymmetrical matrix

2017-04-04 Thread Barry Smith

> Does this mean that GAMG works for the symmetrical matrix only?

  No, it means that for non symmetric nonzero structure you need the extra 
flag. So use the extra flag. The reason we don't always use the flag is because 
it adds extra cost and isn't needed if the matrix already has a symmetric 
nonzero structure.


  Barry

> On Apr 4, 2017, at 11:46 AM, Kong, Fande  wrote:
> 
> Hi All,
> 
> I am using GAMG to solve a group of coupled diffusion equations, but the 
> resulting matrix is not symmetrical. I got the following error messages:
> 
> 
> [0]PETSC ERROR: Petsc has generated inconsistent data
> [0]PETSC ERROR: Have un-symmetric graph (apparently). Use '-pc_gamg_sym_graph 
> true' to symetrize the graph or '-pc_gamg_threshold -1.0' if the matrix is 
> structurally symmetric.
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
> trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.7.5, unknown 
> [0]PETSC ERROR: /home/kongf/workhome/projects/yak/yak-opt on a 
> arch-linux2-c-opt named r2i2n0 by kongf Mon Apr  3 16:19:59 2017
> [0]PETSC ERROR: /home/kongf/workhome/projects/yak/yak-opt on a 
> arch-linux2-c-opt named r2i2n0 by kongf Mon Apr  3 16:19:59 2017
> [0]PETSC ERROR: #1 smoothAggs() line 462 in 
> /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/agg.c
> [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in 
> /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/agg.c
> [0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in 
> /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/gamg.c
> [0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in 
> /home/kongf/workhome/projects/petsc/src/ksp/pc/impls/gamg/gamg.c
> 
> Does this mean that GAMG works for the symmetrical matrix only?
> 
> Fande,



Re: [petsc-users] GAMG huge hash being requested

2017-02-21 Thread Lawrence Mitchell
Hi Justin,

On 21/02/17 06:01, Justin Chang wrote:
> Okay thanks

Now done.

Cheers,

Lawrence





Re: [petsc-users] GAMG huge hash being requested

2017-02-20 Thread Justin Chang
Okay thanks

On Sun, Feb 19, 2017 at 2:32 PM, Lawrence Mitchell <
lawrence.mitch...@imperial.ac.uk> wrote:

>
>
> > On 19 Feb 2017, at 18:55, Justin Chang  wrote:
> >
> > Okay, it doesn't seem like the Firedrake fork (which is what I am using)
> has this latest fix. Lawrence, when do you think it's possible you folks
> can incorporate these fixes
>
> I'll fast forward our branch pointer on Monday.
>
> Lawrence
>


Re: [petsc-users] GAMG huge hash being requested

2017-02-19 Thread Lawrence Mitchell


> On 19 Feb 2017, at 18:55, Justin Chang  wrote:
> 
> Okay, it doesn't seem like the Firedrake fork (which is what I am using) has 
> this latest fix. Lawrence, when do you think it's possible you folks can 
> incorporate these fixes

I'll fast forward our branch pointer on Monday. 

Lawrence 


Re: [petsc-users] GAMG huge hash being requested

2017-02-19 Thread Justin Chang
Okay, it doesn't seem like the Firedrake fork (which is what I am using)
has this latest fix. Lawrence, when do you think it's possible you folks
can incorporate these fixes?

On Sun, Feb 19, 2017 at 8:56 AM, Matthew Knepley  wrote:

> Satish fixed this error. I believe the fix is now in master.
>
>   Thanks,
>
>  Matt
>
> On Sun, Feb 19, 2017 at 3:05 AM, Justin Chang  wrote:
>
>> Hi all,
>>
>> So I am attempting to employ the DG1 finite element method on the Poisson
>> equation using GAMG. When I attempt to solve a problem with roughly 4
>> million DOFs across 20 cores, I get this error:
>>
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 122, in solve
>> _solve_varproblem(*args, **kwargs)
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 152, in _solve_varproblem
>> solver.solve()
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/variational_solver.py",
>> line 220, in solve
>> self.snes.solve(None, v)
>>   File "PETSc/SNES.pyx", line 537, in petsc4py.PETSc.SNES.solve
>> (src/petsc4py.PETSc.c:172359)
>> petsc4py.PETSc.Error: error code 63
>> [ 6] SNESSolve() line 4128 in /tmp/pip-FNpsya-build/src/snes
>> /interface/snes.c
>> [ 6] SNESSolve_KSPONLY() line 40 in /tmp/pip-FNpsya-build/src/snes
>> /impls/ksponly/ksponly.c
>> [ 6] KSPSolve() line 620 in /tmp/pip-FNpsya-build/src/ksp/
>> ksp/interface/itfunc.c
>> [ 6] KSPSetUp() line 393 in /tmp/pip-FNpsya-build/src/ksp/
>> ksp/interface/itfunc.c
>> [ 6] PCSetUp() line 968 in /tmp/pip-FNpsya-build/src/ksp/
>> pc/interface/precon.c
>> [ 6] PCSetUp_GAMG() line 524 in /tmp/pip-FNpsya-build/src/ksp/
>> pc/impls/gamg/gamg.c
>> [ 6] PCGAMGCoarsen_AGG() line 955 in /tmp/pip-FNpsya-build/src/ksp/
>> pc/impls/gamg/agg.c
>> [ 6] MatTransposeMatMult() line 9962 in /tmp/pip-FNpsya-build/src/mat/
>> interface/matrix.c
>> [ 6] MatTransposeMatMult_MPIAIJ_MPIAIJ() line 902 in
>> /tmp/pip-FNpsya-build/src/mat/impls/aij/mpi/mpimatmatmult.c
>> [ 6] MatTransposeMatMultSymbolic_MPIAIJ_MPIAIJ() line 1676 in
>> /tmp/pip-FNpsya-build/src/mat/impls/aij/mpi/mpimatmatmult.c
>> [ 6] PetscTableCreate() line 52 in /tmp/pip-FNpsya-build/src/sys/
>> utils/ctable.c
>> [ 6] PetscTableCreateHashSize() line 28 in /tmp/pip-FNpsya-build/src/sys/
>> utils/ctable.c
>> [ 6] Argument out of range
>> [ 6] A really huge hash is being requested.. cannot process: 4096000
>> 
>> --
>> MPI_ABORT was invoked on rank 6 in communicator MPI_COMM_WORLD
>> with errorcode 1.
>>
>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> You may or may not see output from other processes, depending on
>> exactly when Open MPI kills them.
>> 
>> --
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>> Traceback (most recent call last):
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>>   File "pFiredrake.py", line 109, in 
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 122, in solve
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>> _solve_varproblem(*args, **kwargs)
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 152, in _solve_varproblem
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 122, in solve
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>> Traceback (most recent call last):
>>   File "pFiredrake.py", line 109, in 
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 122, in solve
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 122, in solve
>>   File 
>> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
>> line 122, in solve
>> solve(a==L,solution,options_prefix='fe_',solver_parameters=
>> solver_params)
>>   File 
>> 

Re: [petsc-users] GAMG huge hash being requested

2017-02-19 Thread Matthew Knepley
Satish fixed this error. I believe the fix is now in master.

  Thanks,

 Matt

On Sun, Feb 19, 2017 at 3:05 AM, Justin Chang  wrote:

> Hi all,
>
> So I am attempting to employ the DG1 finite element method on the Poisson
> equation using GAMG. When I attempt to solve a problem with roughly 4
> million DOFs across 20 cores, I get this error:
>
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
> _solve_varproblem(*args, **kwargs)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 152, in _solve_varproblem
> solver.solve()
>   File "/home/jchang23/Software/firedrake/src/firedrake/
> firedrake/variational_solver.py", line 220, in solve
> self.snes.solve(None, v)
>   File "PETSc/SNES.pyx", line 537, in petsc4py.PETSc.SNES.solve
> (src/petsc4py.PETSc.c:172359)
> petsc4py.PETSc.Error: error code 63
> [ 6] SNESSolve() line 4128 in /tmp/pip-FNpsya-build/src/
> snes/interface/snes.c
> [ 6] SNESSolve_KSPONLY() line 40 in /tmp/pip-FNpsya-build/src/
> snes/impls/ksponly/ksponly.c
> [ 6] KSPSolve() line 620 in /tmp/pip-FNpsya-build/src/ksp/
> ksp/interface/itfunc.c
> [ 6] KSPSetUp() line 393 in /tmp/pip-FNpsya-build/src/ksp/
> ksp/interface/itfunc.c
> [ 6] PCSetUp() line 968 in /tmp/pip-FNpsya-build/src/ksp/
> pc/interface/precon.c
> [ 6] PCSetUp_GAMG() line 524 in /tmp/pip-FNpsya-build/src/ksp/
> pc/impls/gamg/gamg.c
> [ 6] PCGAMGCoarsen_AGG() line 955 in /tmp/pip-FNpsya-build/src/ksp/
> pc/impls/gamg/agg.c
> [ 6] MatTransposeMatMult() line 9962 in /tmp/pip-FNpsya-build/src/mat/
> interface/matrix.c
> [ 6] MatTransposeMatMult_MPIAIJ_MPIAIJ() line 902 in
> /tmp/pip-FNpsya-build/src/mat/impls/aij/mpi/mpimatmatmult.c
> [ 6] MatTransposeMatMultSymbolic_MPIAIJ_MPIAIJ() line 1676 in
> /tmp/pip-FNpsya-build/src/mat/impls/aij/mpi/mpimatmatmult.c
> [ 6] PetscTableCreate() line 52 in /tmp/pip-FNpsya-build/src/sys/
> utils/ctable.c
> [ 6] PetscTableCreateHashSize() line 28 in /tmp/pip-FNpsya-build/src/sys/
> utils/ctable.c
> [ 6] Argument out of range
> [ 6] A really huge hash is being requested.. cannot process: 4096000
> --
> MPI_ABORT was invoked on rank 6 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
> Traceback (most recent call last):
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
>   File "pFiredrake.py", line 109, in 
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
> _solve_varproblem(*args, **kwargs)
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 152, in _solve_varproblem
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
> Traceback (most recent call last):
>   File "pFiredrake.py", line 109, in 
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
>   File 
> "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
> solve(a==L,solution,options_prefix='fe_',solver_
> parameters=solver_params)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 122, in solve
> _solve_varproblem(*args, **kwargs)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 152, in _solve_varproblem
> _solve_varproblem(*args, **kwargs)
>   File "/home/jchang23/Software/firedrake/src/firedrake/firedrake/solving.py",
> line 152, in _solve_varproblem
> _solve_varproblem(*args, **kwargs)
>   File 

Re: [petsc-users] GAMG

2016-11-01 Thread Mark Adams
>
>
> The labeling is right, I re-checked. That's the funny part, I can't get
> GAMG to work with PCSetCoordinates (which BTW, I think its documentation
> does not address the issue of DOF ordering).
>

Yep, this needs to be made clear. I guess people do actually look at the
manual so I will add that.

As far as symmetry and BCs: if all dofs are set at a node, so the node is
all zero when you zero out BCs, then this should cause a problem in parallel
(you may need several processors to hit it). You really should remove them
from the matrix and adjust the RHS accordingly, but if you set
-pc_gamg_threshold X to a negative number (the default is zero) and keep the
zeros in the matrix after zeroing out rows, then it should work (the graph
algorithms will keep your zero node, so the graph is symmetric).
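
A hedged sketch of the "remove them from the matrix and adjust the RHS"
route (names are placeholders, not code from this thread):
MatZeroRowsColumnsIS() zeroes both rows and columns of the Dirichlet dofs
and folds the known boundary values into the right-hand side, so the
operator stays symmetric, unlike MatZeroRows() alone:

  /* A  = assembled stiffness matrix, bc = IS of Dirichlet dof indices,
     x  = Vec holding the prescribed boundary values, b = right-hand side.
     Rows and columns in bc are zeroed, 1.0 is put on the diagonal, and
     b is updated with the eliminated column contributions. */
  MatZeroRowsColumnsIS(A, bc, 1.0, x, b);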


Re: [petsc-users] GAMG

2016-10-31 Thread Jeremy Theler
On Mon, 2016-10-31 at 08:44 -0600, Jed Brown wrote:

> > After understanding Matt's point about the near nullspace (and reading
> > some interesting comments from Jed on scicomp stackexchange) I did built
> > my own vectors (I had to take a look at MatNullSpaceCreateRigidBody()
> > because I found out by running the code the nullspace should be an
> > orthonormal basis, it should say so in the docs).
> 
> Where?
> "vecs - the vectors that span the null space (excluding the constant 
> vector); these vectors must be orthonormal."
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatNullSpaceCreate.html

ok, I might have passed on that but I started with 

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetNearNullSpace.html

that says “attaches a null space to a matrix, which is often the null
space (rigid body modes) of the operator without boundary conditions
This null space will be used to provide near null space vectors to a
multigrid preconditioner built from this matrix.”

It wouldn't hurt to remind dumb users like me that “...it is often the
set of _orthonormalized_ rigid body modes...”


> And if you run in debug mode (default), as you always should until you
> are confident that your code is correct, MatNullSpaceCreate tests that
> your vectors are orthonormal.

That's how I realized I needed to normalize. Then I found
MatNullSpaceCreateRigidBody() and copied the code to orthogonalize.

Wouldn't it be better to orthonormalize inside MatSetNullSpace()? I bet
an orthogonalization from PETSc's code would beat any user-side code.
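
A hedged sketch of that route (placeholder names; assumes 3D elasticity with
node-based, interlaced u,v,w ordering and a Vec "coords" holding the nodal
coordinates in the same order). MatNullSpaceCreateRigidBody() builds the six
rigid-body modes from the coordinates and orthonormalizes them internally,
so no user-side orthogonalization is needed:

  MatNullSpace nullsp;
  MatSetBlockSize(A, 3);                         /* 3 dofs per node */
  VecSetBlockSize(coords, 3);
  MatNullSpaceCreateRigidBody(coords, &nullsp);  /* orthonormalized rigid-body modes */
  MatSetNearNullSpace(A, nullsp);
  MatNullSpaceDestroy(&nullsp);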

> > Now, there are some results I do not understand. I tried these six
> > combinations:
> >
> > order    near-nullspace    iterations   norm
> > -------  ----------------  ----------   ------
> > unknown  explicit          10           1.6e-6
> > unknown  PCSetCoordinates  15           1.7e-7
> > unknown  none              15           2.4e-7
> > node     explicit          fails with error -11
> > node     PCSetCoordinates  fails with error -11
> > node     none              13           3.8e-7
> 
> Did you set a block size for the "node-based" orderings?  Are you sure
> the above is labeled correctly?  Anyway, PCSetCoordinates uses
> "node-based" ordering.  Implementation performance will generally be
> better with node-based ordering -- it has better memory streaming and
> cache behavior.

Yes. Indeed, when I save the stiffness matrix as a binary file I get
a .info file that contains

-matload_block_size 3

The labeling is right, I re-checked. That's the funny part, I can't get
GAMG to work with PCSetCoordinates (which BTW, I think its documentation
does not address the issue of DOF ordering).

Any idea of what can be happening to me?


> The AIJ matrix format will also automatically do an "inode" optimization
> to reduce memory bandwidth and enable block smoothing (default
> configuration uses SOR smoothing).  You can use -mat_no_inode to try
> turning that off.


That option does not make any difference.

> 
> > Error -11 is 
> > PETSc's linear solver did not converge with reason
> > 'DIVERGED_PCSETUP_FAILED' (-11)
> Isn't there an actual error message?

Sorry, KSPGetConvergedReason() returns -11 and then my code prints that
error string. Find attached the output with -info.

Thanks
--
jeremy



[0] PetscInitialize(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS tom.(none)
[0] PetscInitialize(): Running on machine: tom
[0] SlepcInitialize(): SLEPc successfully started
697 3611
[0] PetscCommDuplicate(): Duplicating a communicator 2 2 max tags = 1
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2091 X 2091; storage space: 835506 
unneeded,74079 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 
2091) < 0.6. Do not use CompressedRow routines.
[0] MatSeqAIJCheckInode(): Found 697 nodes of 2091. Limit used: 5. Using Inode 
routines
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2091 X 2091; storage space: 0 
unneeded,74079 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 
2091) < 0.6. Do not use CompressedRow routines.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2091 X 2091; storage space: 0 
unneeded,74079 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 
2091) < 0.6. Do not use CompressedRow routines.
[0] PCSetUp(): Setting up PC for first time
[0] PCSetUp_GAMG(): level 0) N=2091, n data rows=3, n data cols=6, nnz/row 

Re: [petsc-users] GAMG

2016-10-31 Thread Jed Brown
"Kong, Fande"  writes:
> If the boundary values are not zero, no way to maintain symmetry unless we
> reduce the extra part of  the matrix. Not updating the columns is better in
> this situation.

The inhomogeneity of the boundary condition has nothing to do with operator 
symmetry.

I like this formulation for Dirichlet conditions.

https://scicomp.stackexchange.com/questions/3298/appropriate-space-for-weak-solutions-to-an-elliptical-pde-with-mixed-inhomogeneo/3300#3300
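
Spelling the symmetry point out (a sketch in my own notation, with I =
interior dofs, B = Dirichlet dofs, g the boundary data):

  zeroing rows only:         [ A_II  A_IB ] [ u_I ]   [ f_I ]
                             [  0     I   ] [ u_B ] = [  g  ]     (unsymmetric)

  zeroing rows and columns:  A_II u_I = f_I - A_IB g,  u_B = g    (A_II symmetric if A is)

The lifted right-hand side f_I - A_IB g is what MatZeroRowsColumns() produces
when it is passed the boundary values.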




Re: [petsc-users] GAMG

2016-10-31 Thread Matthew Knepley
On Mon, Oct 31, 2016 at 10:29 AM, Kong, Fande  wrote:

> On Mon, Oct 31, 2016 at 8:44 AM, Jed Brown  wrote:
>
>> Jeremy Theler  writes:
>>
>> > Hi again
>> >
>> > I have been working on these issues. Long story short: it is about the
>> > ordering of the unknown fields in the vector.
>> >
>> > Long story:
>> > The physics is linear elastic problem, you can see it does work with LU
>> > over a simple cube (warp the displacements to see it does represent an
>> > elastic problem, E=200e3, nu=0.3):
>> >
>> > https://caeplex.com/demo/results.php?id=5817146bdb561
>> >
>> >
>> > Say my three displacements (unknowns) are u,v,w. I can define the
>> > unknown vector as (is this called node-based ordering?)
>> >
>> > [u1 v1 w1 u2 v2 w2 ... un vn wn]^T
>> >
>> > Another option is (is this called unknown-based ordering?)
>> >
>> > [u1 u2 ... un v1 v2 ... vn w1 w2 ... wn]^T
>> >
>> >
>> > With lu/preonly the results are the same, although the stiffness matrices
>> > for each case are attached as PNGs. And of course, the near-nullspace
>> > vectors are different. So PCSetCoordinates() should work with one
>> > ordering and not with another one, an issue I did not take into
>> > consideration.
>> >
>> > After understanding Matt's point about the near nullspace (and reading
>> > some interesting comments from Jed on scicomp stackexchange) I did build
>> > my own vectors (I had to take a look at MatNullSpaceCreateRigidBody()
>> > because I found out by running the code the nullspace should be an
>> > orthonormal basis, it should say so in the docs).
>>
>> Where?
>>
>> "vecs   - the vectors that span the null space (excluding the constant
>> vector); these vectors must be orthonormal."
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages
>> /Mat/MatNullSpaceCreate.html
>>
>> And if you run in debug mode (default), as you always should until you
>> are confident that your code is correct, MatNullSpaceCreate tests that
>> your vectors are orthonormal.
>>
>> > Now, there are some results I do not understand. I tried these six
>> > combinations:
>> >
>> > order    near-nullspace    iterations   norm
>> > -------  ----------------  ----------   ------
>> > unknown  explicit          10           1.6e-6
>> > unknown  PCSetCoordinates  15           1.7e-7
>> > unknown  none              15           2.4e-7
>> > node     explicit          fails with error -11
>> > node     PCSetCoordinates  fails with error -11
>> > node     none              13           3.8e-7
>>
>> Did you set a block size for the "node-based" orderings?  Are you sure
>> the above is labeled correctly?  Anyway, PCSetCoordinates uses
>> "node-based" ordering.  Implementation performance will generally be
>> better with node-based ordering -- it has better memory streaming and
>> cache behavior.
>>
>> The AIJ matrix format will also automatically do an "inode" optimization
>> to reduce memory bandwidth and enable block smoothing (default
>> configuration uses SOR smoothing).  You can use -mat_no_inode to try
>> turning that off.
>>
>> > Error -11 is
>> > PETSc's linear solver did not converge with reason
>> > 'DIVERGED_PCSETUP_FAILED' (-11)
>>
>> Isn't there an actual error message?
>>
>> > Any explanation (for dumbs)?
>> > Another thing to take into account: I am setting the dirichlet BCs with
>> > MatZeroRows(), but I am not updating the columns to keep symmetry. Can
>> > this pose a problem for GAMG?
>>
>> Usually minor, but it is better to maintain symmetry.
>>
>
> If the boundary values are not zero, no way to maintain symmetry unless we
> reduce the extra part of  the matrix. Not updating the columns is better
> in this situation.
>

?

You just eliminate the unknowns.

   Matt


>
> Fande,
>
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener


Re: [petsc-users] GAMG

2016-10-31 Thread Kong, Fande
On Mon, Oct 31, 2016 at 8:44 AM, Jed Brown  wrote:

> Jeremy Theler  writes:
>
> > Hi again
> >
> > I have been working on these issues. Long story short: it is about the
> > ordering of the unknown fields in the vector.
> >
> > Long story:
> > The physics is linear elastic problem, you can see it does work with LU
> > over a simple cube (warp the displacements to see it does represent an
> > elastic problem, E=200e3, nu=0.3):
> >
> > https://caeplex.com/demo/results.php?id=5817146bdb561
> >
> >
> > Say my three displacements (unknowns) are u,v,w. I can define the
> > unknown vector as (is this called node-based ordering?)
> >
> > [u1 v1 w1 u2 v2 w2 ... un vn wn]^T
> >
> > Another option is (is this called unknown-based ordering?)
> >
> > [u1 u2 ... un v1 v2 ... vn w1 w2 ... wn]^T
> >
> >
> > With lu/preonly the results are the same, although the stiffness matrices
> > for each case are attached as PNGs. And of course, the near-nullspace
> > vectors are different. So PCSetCoordinates() should work with one
> > ordering and not with another one, an issue I did not take into
> > consideration.
> >
> > After understanding Matt's point about the near nullspace (and reading
> > some interesting comments from Jed on scicomp stackexchange) I did build
> > my own vectors (I had to take a look at MatNullSpaceCreateRigidBody()
> > because I found out by running the code the nullspace should be an
> > orthonormal basis, it should say so in the docs).
>
> Where?
>
> "vecs   - the vectors that span the null space (excluding the constant
> vector); these vectors must be orthonormal."
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/
> MatNullSpaceCreate.html
>
> And if you run in debug mode (default), as you always should until you
> are confident that your code is correct, MatNullSpaceCreate tests that
> your vectors are orthonormal.
>
> > Now, there are some results I do not understand. I tried these six
> > combinations:
> >
> > order    near-nullspace    iterations   norm
> > -------  ----------------  ----------   ------
> > unknown  explicit          10           1.6e-6
> > unknown  PCSetCoordinates  15           1.7e-7
> > unknown  none              15           2.4e-7
> > node     explicit          fails with error -11
> > node     PCSetCoordinates  fails with error -11
> > node     none              13           3.8e-7
>
> Did you set a block size for the "node-based" orderings?  Are you sure
> the above is labeled correctly?  Anyway, PCSetCoordinates uses
> "node-based" ordering.  Implementation performance will generally be
> better with node-based ordering -- it has better memory streaming and
> cache behavior.
>
> The AIJ matrix format will also automatically do an "inode" optimization
> to reduce memory bandwidth and enable block smoothing (default
> configuration uses SOR smoothing).  You can use -mat_no_inode to try
> turning that off.
>
> > Error -11 is
> > PETSc's linear solver did not converge with reason
> > 'DIVERGED_PCSETUP_FAILED' (-11)
>
> Isn't there an actual error message?
>
> > Any explanation (for dumbs)?
> > Another thing to take into account: I am setting the dirichlet BCs with
> > MatZeroRows(), but I am not updating the columns to keep symmetry. Can
> > this pose a problem for GAMG?
>
> Usually minor, but it is better to maintain symmetry.
>

If the boundary values are not zero, no way to maintain symmetry unless we
reduce the extra part of  the matrix. Not updating the columns is better in
this situation.

Fande,


Re: [petsc-users] GAMG

2016-10-31 Thread Jed Brown
Jeremy Theler  writes:

> Hi again
>
> I have been working on these issues. Long story short: it is about the
> ordering of the unknown fields in the vector.
>
> Long story:
> The physics is linear elastic problem, you can see it does work with LU
> over a simple cube (warp the displacements to see it does represent an
> elastic problem, E=200e3, nu=0.3):
>
> https://caeplex.com/demo/results.php?id=5817146bdb561
>
>
> Say my three displacements (unknowns) are u,v,w. I can define the
> unknown vector as (is this called node-based ordering?)
>
> [u1 v1 w1 u2 v2 w2 ... un vn wn]^T
>
> Another option is (is this called unknown-based ordering?)
>
> [u1 u2 ... un v1 v2 ... vn w1 w2 ... wn]^T
>
>
> With lu/preonly the results are the same, although the stiffness matrices
> for each case are attached as PNGs. And of course, the near-nullspace
> vectors are different. So PCSetCoordinates() should work with one
> ordering and not with another one, an issue I did not take into
> consideration.
>
> After understanding Matt's point about the near nullspace (and reading
> some interesting comments from Jed on scicomp stackexchange) I did build
> my own vectors (I had to take a look at MatNullSpaceCreateRigidBody()
> because I found out by running the code the nullspace should be an
> orthonormal basis, it should say so in the docs).

Where?

"vecs   - the vectors that span the null space (excluding the constant vector); 
these vectors must be orthonormal."

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatNullSpaceCreate.html

And if you run in debug mode (default), as you always should until you
are confident that your code is correct, MatNullSpaceCreate tests that
your vectors are orthonormal.

> Now, there are some results I do not understand. I tried these six
> combinations:
>
> order    near-nullspace    iterations   norm
> -------  ----------------  ----------   ------
> unknown  explicit          10           1.6e-6
> unknown  PCSetCoordinates  15           1.7e-7
> unknown  none              15           2.4e-7
> node     explicit          fails with error -11
> node     PCSetCoordinates  fails with error -11
> node     none              13           3.8e-7

Did you set a block size for the "node-based" orderings?  Are you sure
the above is labeled correctly?  Anyway, PCSetCoordinates uses
"node-based" ordering.  Implementation performance will generally be
better with node-based ordering -- it has better memory streaming and
cache behavior.

The AIJ matrix format will also automatically do an "inode" optimization
to reduce memory bandwidth and enable block smoothing (default
configuration uses SOR smoothing).  You can use -mat_no_inode to try
turning that off.

> Error -11 is 
> PETSc's linear solver did not converge with reason
> 'DIVERGED_PCSETUP_FAILED' (-11)

Isn't there an actual error message?

> Any explanation (for dumbs)?
> Another thing to take into account: I am setting the dirichlet BCs with
> MatZeroRows(), but I am not updating the columns to keep symmetry. Can
> this pose a problem for GAMG?

Usually minor, but it is better to maintain symmetry.




Re: [petsc-users] GAMG

2016-10-28 Thread Mark Adams
>
>
>>
> AMG (the agglomeration kind) needs to know the near null space of your
> operator in order
> to work. You have an elasticity problem (I think), and if you take that
> operator without boundary
> conditions, the energy is invariant to translations and rotations. The
> space of translations and
> rotations is a 6D space (3 translations, 3 rotations). You need to express
> these in the basis for
> your problem (I assume linear elements, P1).
>

Actually, these vectors are purely geometric. If these rigid body modes are
not your kernel then you have a bad discretization or you are not doing 3D
elasticity.

Anyway, this reminds me that the problem goes away w/o the RBMs.  The fine
grid eigen estimate was large and will not be affected by the null space
business. The second grid had a huge eigenvalue and that could be affected
by the null space.

What is your Poisson ratio?


> This is what PCSetCoordinates() tries to do. Something
> is going wrong, but its hard for us to say what since I have no idea what
> your problem looks like.
> So you can make these vectors yourself and provide them to GAMG using
> MatSetNearNullSpace().
>
>Matt
>
>
>> --
>> jeremy
>>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>


Re: [petsc-users] GAMG

2016-10-28 Thread Matthew Knepley
On Fri, Oct 28, 2016 at 8:38 AM, Jeremy Theler  wrote:

>
> >
> > If I do not call PCSetCoordinates() the error goes away but
> > convergence
> > is slow.
> > Is it possible that your coordinates lie on a 2D surface? All this
> > does is make the 6 basis vectors
> > for translations and rotations. You can just make these yourself and
> > call MatSetNearNullSpace()
> > and see what you get.
> >
> No, they do not lie on a 2D surface :-/
>
> Sorry but I did not get the point about the 6 basis vectors and
> MatSetNearNullSpace().
>

AMG (the agglomeration kind) needs to know the near null space of your
operator in order
to work. You have an elasticity problem (I think), and if you take that
operator without boundary
conditions, the energy is invariant to translations and rotations. The
space of translations and
rotations is a 6D space (3 translations, 3 rotations). You need to express
these in the basis for
your problem (I assume linear elements, P1). This is what
PCSetCoordinates() tries to do. Something
is going wrong, but its hard for us to say what since I have no idea what
your problem looks like.
So you can make these vectors yourself and provide them to GAMG using
MatSetNearNullSpace().

   Matt
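
To make that 6D space concrete, a sketch in standard notation (not copied
from the thread): for a node at coordinates (x, y, z), its (u, v, w) entries
in the six near-nullspace vectors are, before orthonormalization,

  t1 = (1, 0, 0)    t2 = (0, 1, 0)    t3 = (0, 0, 1)      (translations)
  r1 = (0, -z, y)   r2 = (z, 0, -x)   r3 = (-y, x, 0)     (rotations about x, y, z)

As noted elsewhere in this thread, MatNullSpaceCreateRigidBody() builds this
set from the coordinates and orthonormalizes it.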


> --
> jeremy
>
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener


Re: [petsc-users] GAMG

2016-10-28 Thread Mark Adams
So this is a fully 3D problem, or is it a very flat disc? What is the worst
aspect ratio (or whatever) of an element, approximately? That is, is this a
bad mesh?

You might want to start with a simple problem like a cube. The eigen
estimates (Smooth P0: max eigen=1.09e+01) are huge and they are a lower
bound.

You might also try -gamg_est_ksp_max_it 50 and see if these eigen
estimates go up much (and use -gamg_est_ksp_type cg). GAMG's eigen
estimates are working, but I manufacture a seed vector, which is
slightly different from what Cheby does.

Also, what version of PETSc are you using? It would be best to use git to
clone the repository. This would give you maint or master branch which have
a fix for the cheby eigen estimator that your version might not have (use
-ksp_view and grep for "noisy" to see if you have an up to date version).


On Fri, Oct 28, 2016 at 10:12 AM, jeremy theler  wrote:

> I will try these options in a couple of hours (I have to go out now). I
> forgot to mention that the geometry has revolution symmetry around the z
> axis (just the geometry, not the problem because it has a non-symmetric
> temperature distribution).
> I am solving with only one proc, there are approx 50k nodes so 150k dofs.
> Thanks again.
>
> On Fri, Oct 28, 2016, 11:07 Mark Adams  wrote:
>
>> Also, try solving the problem with a one level iterative method and
>> Chebyshev, like:
>>
>> -ksp_type chebyshev
>> -pc_type jacobi
>>
>> It will take a long time to solve but I just want to see if it has the
>> same error.
>>
>>
>> On Fri, Oct 28, 2016 at 10:04 AM, Mark Adams  wrote:
>>
>> GAMG's eigen estimator worked but the values are very high.  You have a
>> very low number of equations per processor; is this a thin body? Are the
>> elements badly stretched?
>>
>> Do this again with these parameters:
>>
>> -mg_levels_ksp_type chebyshev
>> -mg_levels_esteig_ksp_type cg
>> -mg_levels_esteig_ksp_max_it 10
>> -mg_levels_ksp_chebyshev_esteig 0,.1,0,1.05
>> -gamg_est_ksp_type cg
>>
>>
>> On Fri, Oct 28, 2016 at 9:48 AM, Jeremy Theler 
>> wrote:
>>
>> On Fri, 2016-10-28 at 09:46 -0400, Mark Adams wrote:
>> > Please run with -info and grep on GAMG.
>> >
>> [0] PCSetUp_GAMG(): level 0) N=120726, n data rows=3, n data cols=6,
>> nnz/row (ave)=41, np=1
>> [0] PCGAMGFilterGraph(): 99.904% nnz after filtering, with
>> threshold 0., 13.7468 nnz ave. (N=40242)
>> [0] PCGAMGCoarsen_AGG(): Square Graph on level 1 of 1 to square
>> [0] PCGAMGProlongator_AGG(): New grid 1894 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=5.726852e+00
>> min=1.330683e-01 PC=jacobi
>> [0] PCSetUp_GAMG(): 1) N=11364, n data cols=6, nnz/row (ave)=196, 1
>> active pes
>> [0] PCGAMGFilterGraph(): 99.9839% nnz after filtering, with
>> threshold 0., 32.7656 nnz ave. (N=1894)
>> [0] PCGAMGProlongator_AGG(): New grid 155 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.09e+01
>> min=1.832878e-04 PC=jacobi
>> [0] PCSetUp_GAMG(): 2) N=930, n data cols=6, nnz/row (ave)=196, 1 active
>> pes
>> [0] PCGAMGFilterGraph(): 100.% nnz after filtering, with
>> threshold 0., 32.7806 nnz ave. (N=155)
>> [0] PCGAMGProlongator_AGG(): New grid 9 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=2.116373e+00
>> min=6.337173e-03 PC=jacobi
>> [0] PCSetUp_GAMG(): 3) N=54, n data cols=6, nnz/row (ave)=34, 1 active
>> pes
>> [0] PCGAMGFilterGraph(): 100.% nnz after filtering, with
>> threshold 0., 5.7 nnz ave. (N=9)
>> [0] PCGAMGProlongator_AGG(): New grid 2 nodes
>> [0] PCGAMGOptProlongator_AGG(): Smooth P0: max eigen=1.984549e+00
>> min=8.582767e-03 PC=jacobi
>> [0] PCSetUp_GAMG(): 4) N=12, n data cols=6, nnz/row (ave)=12, 1 active
>> pes
>> [0] PCSetUp_GAMG(): 5 levels, grid complexity = 1.48586
>> error: PETSc error 77-0 'Eigen estimator failed: DIVERGED_NANORINF at
>> iteration 0'
>> in /home/gtheler/libs/petsc-3.7.4/src/ksp/ksp/impls/cheby/cheby.c
>> KSPSolve_Chebyshev:440
>>
>>
>>
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>>

