Re: [petsc-users] GAMG scaling

Mark Adams Thu, 04 May 2017 05:44:41 -0700

Thanks Hong,

I am not seeing these options with -help ...


On Wed, May 3, 2017 at 10:05 PM, Hong <[email protected]> wrote:

> I basically used 'runex56' and set '-ne' to be compatible with np.
> Then I used option
> '-matptap_via scalable'
> '-matptap_via hypre'
> '-matptap_via nonscalable'
>
> I attached a job script below.
>
> In the master branch, I set the default to 'nonscalable' for small to
> medium-sized matrices; it switches automatically to 'scalable' when the
> matrix size gets larger.
>
> The PETSc solver uses MatPtAP, which does a local RAP to reduce
> communication and accelerate computation.
> I suggest you simply use the default setting. Let me know if you
> encounter trouble.
>
> Hong
>
> job.ne174.n8.np125.sh:
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view
> -matptap_via scalable > log.ne174.n8.np125.scalable
>
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view
> -matptap_via hypre > log.ne174.n8.np125.hypre
>
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view
> -matptap_via nonscalable > log.ne174.n8.np125.nonscalable
>
> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne
> 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1
> -pc_gamg_reuse_interpolation true -ksp_converged_reason
> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg
> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1
> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev
> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg
> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu
> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01
> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30
> -pc_gamg_repartition false -pc_mg_cycle_type v 
> -pc_gamg_use_parallel_coarse_grid_solver
> -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view >
> log.ne174.n8.np125
>
> On Wed, May 3, 2017 at 2:08 PM, Mark Adams <[email protected]> wrote:
>
>> Hong, the input files do not seem to be accessible. What are the command
>> line options? (I don't see a "rap" or "scale" in the source).
>>
>>
>>
>> On Wed, May 3, 2017 at 12:17 PM, Hong <[email protected]> wrote:
>>
>>> Mark,
>>> Below is the copy of my email sent to you on Feb 27:
>>>
>>> I implemented scalable MatPtAP and did comparisons of three
>>> implementations using ex56.c on alcf cetus machine (this machine has
>>> small memory, 1GB/core):
>>> - nonscalable PtAP: use an array of length PN to do dense axpy
>>> - scalable PtAP: do sparse axpy without use of PN array
>>> - hypre PtAP.
>>>
>>> The results are attached. Summary:
>>> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
>>> - scalable PtAP is 4x faster than hypre PtAP
>>> - hypre uses less memory (see job.ne399.n63.np1000.sh)
>>>
>>> Based on the above observations, I set the default PtAP algorithm to
>>> 'nonscalable'.
>>> When PN > the local estimated number of nonzeros of C=PtAP, the default
>>> switches to 'scalable'.
>>> Users can override the default.
>>>
>>> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
>>> MatPtAP              3.6224e+01  (nonscalable for small mats, scalable for larger ones)
>>> scalable MatPtAP     4.6129e+01
>>> hypre                1.9389e+02
>>>
>>> This work is on petsc-master. Give it a try. If you encounter any
>>> problem, let me know.
>>>
>>> Hong
>>>
>>> On Wed, May 3, 2017 at 10:01 AM, Mark Adams <[email protected]> wrote:
>>>
>>>> (Hong), what is the current state of optimizing RAP for scaling?
>>>>
>>>> Nate is driving 3D elasticity problems at scale with GAMG, and we are
>>>> working out performance problems. They are hitting problems at ~1.5B
>>>> dof on a basic Cray (XC30, I think).
>>>>
>>>> Thanks,
>>>> Mark
>>>>
>>>
>>>
>>
>
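
For readers following the thread, here is a minimal sketch of the difference between the 'nonscalable' and 'scalable' accumulation described in the Feb 27 summary (dense axpy into a work array of length PN versus sparse axpy with no PN array). It is not PETSc code. The function names are hypothetical, PN is assumed to mean the global number of columns of P, and the Python dict is only a stand-in for whatever sparse structure the real implementation uses.

```python
def accumulate_dense(contribs, pn):
    """'nonscalable' style: dense axpy into a work array of length PN.

    contribs: iterable of (global_col, value) contributions to one row of C.
    pn: assumed global number of columns of P (hypothetical parameter name).
    """
    work = [0.0] * pn        # O(PN) memory per process, however sparse the row is
    seen = [False] * pn
    cols = []
    for col, val in contribs:
        if not seen[col]:
            seen[col] = True
            cols.append(col)
        work[col] += val     # direct indexing: fast, no searching
    return {c: work[c] for c in sorted(cols)}


def accumulate_sparse(contribs):
    """'scalable' style: sparse axpy, storage proportional to nonzeros only."""
    row = {}                 # stand-in for a sparse accumulator
    for col, val in contribs:
        row[col] = row.get(col, 0.0) + val
    return dict(sorted(row.items()))


contribs = [(5, 1.0), (2, 0.5), (5, 2.0)]
dense_row = accumulate_dense(contribs, pn=8)
sparse_row = accumulate_sparse(contribs)
```

Both routines produce the same row. The trade-off is that the dense variant indexes directly but allocates a length-PN array on every process, while the sparse variant pays for lookups but needs memory only for the nonzeros it actually touches.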

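The default-selection rule quoted above (use 'nonscalable' unless PN exceeds the local estimated nonzero count of C=PtAP) can be written as a one-line check. This is only an illustration of the rule as stated in the email: the function and argument names are hypothetical, and the actual test in the PETSc source may differ in detail.

```python
def choose_ptap_algorithm(pn, est_local_nnz_c):
    """Return the default PtAP variant under the rule described in the thread.

    pn: length of the dense work array (assumed: global column count of P).
    est_local_nnz_c: local estimate of the number of nonzeros in C = PtAP.
    Both names are hypothetical; only the comparison comes from the email.
    """
    return "scalable" if pn > est_local_nnz_c else "nonscalable"


small_case = choose_ptap_algorithm(pn=1_000, est_local_nnz_c=50_000)
large_case = choose_ptap_algorithm(pn=10_000_000, est_local_nnz_c=50_000)
```

Under this rule, the dense work array is used only while it is no larger than the local result it helps build. Passing '-matptap_via scalable', '-matptap_via nonscalable' or '-matptap_via hypre', as in the job script, selects the variant explicitly.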