On Mon, Nov 22, 2021 at 12:41 PM Karthikeyan Chockalingam - STFC UKRI < [email protected]> wrote:
> @Matthew Knepley <[email protected]> Yes, it works with main build! > That is good. I think we might need to push up the next point release then, since we have a Hypre bug for which a fix is non-obvious. Thanks, Matt > @Mark Adams <[email protected]> I have attached the log_view output of ex4 > for your reference. But pcapply or pcsetup did not record any gpu flops. > > If it is not too much trouble can you send me the log_view output of > ksp/ksp/tutorial/ex45 using hypre? > > > > Many thanks! > > Karthik. > > > > *From: *Matthew Knepley <[email protected]> > *Date: *Monday, 22 November 2021 at 15:26 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *Stefano Zampini <[email protected]>, " > [email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > On Mon, Nov 22, 2021 at 10:20 AM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > Thanks Stefano, > > > > I just did a clean build with no problems. Can you try a clean build of > main? > > > > Thanks, > > > > Matt > > > > > > I have another build without the --download-hypre-commit=origin/hypre_petsc > but that gives a different error. > > > > [kchockalingam@glados tutorials]$ ./ex4 -ksp_view -ksp_type cg -mat_type > hypre -pc_type hypre > > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------* > > [0]PETSC ERROR: Error in external library > > [0]PETSC ERROR: Error in HYPRE_IJMatrixAssemble(): error code 12 > > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.15.3, Aug 06, 2021 > > [0]PETSC ERROR: ./ex4 on a named glados.dl.ac.uk by kchockalingam Mon > Nov 22 15:07:46 2021 > > [0]PETSC ERROR: Configure options > --package-prefix-hash=/home/kchockalingam/petsc-hash-pkgs > --with-make-test-np=2 COPTFLAGS="-g -O3 -fno-omit-frame-pointer" > FOPTFLAGS="-g -O3 -fno-omit-frame-pointer" CXXOPTFLAGS="-g -O3 > -fno-omit-frame-pointer" --with-cuda=1 --with-cuda-arch=70 > --with-blaslapack=1 --with-cuda-dir=/apps/packages/cuda/10.1/ > --with-mpi-dir=/apps/packages/gcc/7.3.0/openmpi/3.1.2 --download-hypre=1 > --download-hypre-configure-arguments=HYPRE_CUDA_SM=70 --with-debugging=no > PETSC_ARCH=arch-ci-linux-cuda11-double > > [0]PETSC ERROR: #1 MatAssemblyEnd_HYPRE() at > /home/kchockalingam/tools/petsc-3.15.3/src/mat/impls/hypre/mhypre.c:1212 > > [0]PETSC ERROR: #2 MatAssemblyEnd() at > /home/kchockalingam/tools/petsc-3.15.3/src/mat/interface/matrix.c:5652 > > [0]PETSC ERROR: #3 main() at ex4.c:84 > > [0]PETSC ERROR: PETSc Option Table entries: > > [0]PETSC ERROR: -ksp_type cg > > [0]PETSC ERROR: -ksp_view > > [0]PETSC ERROR: -mat_type hypre > > [0]PETSC ERROR: -pc_type hypre > > *[0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to [email protected]* > > -------------------------------------------------------------------------- > > > > Best, > > Karthik. > > > > *From: *Stefano Zampini <[email protected]> > *Date: *Monday, 22 November 2021 at 14:46 > *To: *Matthew Knepley <[email protected]> > *Cc: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]>, "[email protected]" < > [email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > You don't need to specify the HYPRE commit. Remove > --download-hypre-commit=origin/hypre_petsc > from the configuration options > > > > Il giorno lun 22 nov 2021 alle ore 17:29 Matthew Knepley < > [email protected]> ha scritto: > > On Mon, Nov 22, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > Hi Matt, > > > > Below is the entire error message: > > > > I cannot reproduce this: > > > > main $:/PETSc3/petsc/petsc-dev/src/ksp/ksp/tutorials$ ./ex4 -ksp_view > -ksp_type cg -mat_type hypre -pc_type hypre > KSP Object: 1 MPI processes > type: cg > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > Cycle type V > Maximum number of levels 25 > Maximum number of iterations PER hypre call 1 > Convergence tolerance PER hypre call 0. > Threshold for strong coupling 0.25 > Interpolation truncation factor 0. > Interpolation: max elements per row 0 > Number of levels of aggressive coarsening 0 > Number of paths for aggressive coarsening 1 > Maximum row sums 0.9 > Sweeps down 1 > Sweeps up 1 > Sweeps on coarse 1 > Relax down symmetric-SOR/Jacobi > Relax up symmetric-SOR/Jacobi > Relax on coarse Gaussian-elimination > Relax weight (all) 1. > Outer relax weight (all) 1. > Using CF-relaxation > Not using more complex smoothers. > Measure type local > Coarsen type Falgout > Interpolation type classical > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: hypre > rows=56, cols=56 > Norm of error 8.69801e-05 iterations 2 > > This is on the 'main' branch. So either there is some bug in release, or > something is strange on your end. Since we run Hypre tests for the CI, > > I am leaning toward the latter. Can you try the 'main' branch? We will > have to use this anyway if we want any fixes. > > > > Thanks, > > > > Matt > > > > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------* > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the > matrix has not yet been factored on argument 1 "mat" before > MatGetOwnershipRange() > > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.16.0, Sep 29, 2021 > > [0]PETSC ERROR: ./ex4 on a named sqg2b4.bullx by kxc07-lxm25 Mon Nov 22 > 11:33:41 2021 > > [0]PETSC ERROR: Configure options > --prefix=/lustre/scafellpike/local/apps/gcc7/petsc/3.16.0-cuda11.2 > --with-debugging=yes > --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl > --with-cuda=1 --with-cuda-arch=70 --download-hypre=yes > --download-hypre-configure-arguments=HYPRE_CUDA_SM=70 > --download-hypre-commit=origin/hypre_petsc --with-shared-libraries=1 > --known-mpi-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx > -with-fc=mpif90 > > [0]PETSC ERROR: #1 MatGetOwnershipRange() at > /netfs/smain01/scafellpike/local/package_build/build/rja87-build/petsc-cuda-3.16.0/src/mat/interface/matrix.c:6784 > > [0]PETSC ERROR: #2 main() at ex4.c:40 > > [0]PETSC ERROR: PETSc Option Table entries: > > [0]PETSC ERROR: -ksp_type cg > > [0]PETSC ERROR: -ksp_view > > [0]PETSC ERROR: -mat_type hypre > > [0]PETSC ERROR: -pc_type hypre > > [0]PETSC ERROR: -use_gpu_aware_mpi 0 > > *[0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to [email protected]* > > -------------------------------------------------------------------------- > > > > I have also attached the make.log. Thank you for having a look. > > > > Best, > > Karthik. > > > > *From: *Matthew Knepley <[email protected]> > *Date: *Monday, 22 November 2021 at 13:41 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *Mark Adams <[email protected]>, "[email protected]" < > [email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > On Mon, Nov 22, 2021 at 6:47 AM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > Thank you for your response. I tried to run the same example > > > > petsc/src/ksp/ksp/tutorials$ *./ex4 -ksp_type cg -mat_type hypre > -ksp_view -pc_type hypre* > > > > but it crashed with the below error > > > > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------* > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the > matrix has not yet been factored on argument 1 "mat" before > MatGetOwnershipRange() > > > > Hi Karthik, > > > > Please do not clip the error message. We need the entire output. This > seems strange that you would get a logic error, > > since that should be the same on any architecture. So could you also send > the make.log? > > > > Thanks, > > > > Matt > > > > Below are the options used to configure hypre with cuda support. Do you > spot any mistakes? > > > > --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl > > > --with-cuda=1 > > --with-cuda-arch=70 > > --download-hypre=yes > > --download-hypre-configure-arguments=HYPRE_CUDA_SM=70 > > --download-hypre-commit=origin/hypre_petsc > > --with-shared-libraries=1 > > --known-mpi-shared-libraries=1 > > --with-cc=mpicc > > --with-cxx=mpicxx > > -with-fc=mpif90 > > > > Best, > > Karthik. > > > > > > *From: *Mark Adams <[email protected]> > *Date: *Friday, 19 November 2021 at 16:31 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *"[email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > You should run with -options_left to check that your options are being > used. It may be -mat_type hypre. > > > > I have tested this: > > > > petsc/src/ksp/ksp/tutorials$ *srun -n2 ./ex4 -ksp_type cg -mat_type hypre > -ksp_view -pc_type hypre* > > > > You can add -log_view and that will print performance data for each method > like KSPSolve. > > > > If you are configured with a GPU there will be some extra columns that > give the percent of flops on the GPU. > > > > In the past hypre has not registered flops with us, but I do get flops > from hypre now, and -ksp_view showed that it did indeed use hypre. > > I saw that the flops were 100% GPU in hypre. > > > > > > On Fri, Nov 19, 2021 at 10:47 AM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > Hello, > > > > I tried to solve a 3D Poisson (ksp/tutorial/ex45) using -pc_type hypre on > gpus > > > > ./ex45 -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -dm_mat_type hypre > -dm_vec_type cuda -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg > -ksp_monitor -log_view > > > > I profiled the run using NSYS - attached you find all the relevant files. > > Looking at the profile I doubt if hypre is running on gpus. The cuda > kernels are barely active. > > I don’t see any cuda kernel relevant to solve. > > Is my assessment correct? How can I verify if hypre indeed is running on > gpus? > > > > Best, > > Karthik. > > > > *From: *Mark Adams <[email protected]> > *Date: *Friday, 8 October 2021 at 18:47 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *"[email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > I think you would want to use 'cuda' vec_type, but I . > > You might ask Hypre how one verifies that the GPU is used. > > Mark > > > > On Fri, Oct 8, 2021 at 1:35 PM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > Yes, I used it for both cpu and gpu. Is that not okay? > > > > For gpu: -dm_mat_type hypre -dm_vec_type mpicuda > > > > For cpu: -dm_mat_type hypre -dm_vec_type mpi > > > > *From: *Mark Adams <[email protected]> > *Date: *Friday, 8 October 2021 at 18:28 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *"[email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > Did you use -dm_mat_type hypre on the GPU case ? > > > > On Fri, Oct 8, 2021 at 12:19 PM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > I tried a different exercise ran the same problem on two cpu cores and on > two gpu: > > > > On gpu > > > > PCApply 6 1.0 6.0335e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+00 15 0 0 0 1 15 0 0 0 1 0 0 0 0.00e+00 5 > 9.65e+01 0 > > > > and on cpu > > > > PCApply 6 1.0 5.6348e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 16 0 0 0 0 16 0 0 0 0 0 > > > > timings again are close but gpu version did a reduction 6.0e+00 but the > cpu version did not 0.0e+00. > > I am not sure if that is any indication if hypre ran on gpus? > > > > Thanks, > > Karthik. > > > > > > *From: *Mark Adams <[email protected]> > *Date: *Friday, 8 October 2021 at 16:36 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *"[email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > > > > > On Fri, Oct 8, 2021 at 10:29 AM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > The PCApply timing on > > > > gpu > > > > PCApply 6 1.0 1.0235e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 39 0 0 0 0 39 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > > > > and cpu > > > > PCApply 6 1.0 1.0242e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 41 0 0 0 0 41 0 0 0 0 0 > > > > > > You don't have GPUs. probably. > > Use -dm_mat_type hypre. > > > > are close. It is hard for me tell if hypre on gpu is on or not. > > > > Best, > > Karthik. > > > > > > *From: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Date: *Friday, 8 October 2021 at 14:55 > *To: *Mark Adams <[email protected]> > *Cc: *"[email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > Thanks Mark, I will try your recommendations. > > Should I also change -dm_vec_type to hypre currently I have it as mpicuda? > > > > Karthik. > > > > > > *From: *Mark Adams <[email protected]> > *Date: *Friday, 8 October 2021 at 14:33 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > [email protected]> > *Cc: *"[email protected]" <[email protected]> > *Subject: *Re: [petsc-users] hypre on gpus > > > > Hypre does not record its flops with PETSc's timers. > > Configure with and without CUDA and see if the timings change in PCApply. > > Hypre does not dynamically switch between CUDA and CPU solves at > this time, but you want to use -dm_mat_type hypre. > > Mark > > > > On Fri, Oct 8, 2021 at 6:59 AM Karthikeyan Chockalingam - STFC UKRI < > [email protected]> wrote: > > Hello, > > > > I am trying to run ex45 (in KSP tutorial) using hypre on gpus. I have > attached the python configuration file and -log_view output from running > the below command options > > > > mpirun -n 2 ./ex45 -log_view -da_grid_x 169 -da_grid_y 169 -da_grid_z 169 > -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type gmres -pc_type > hypre -pc_hypre_type boomeramg -ksp_gmres_restart 31 > -pc_hypre_boomeramg_strong_threshold 0.7 -ksp_monitor > > > > The problem was solved and converged but from the output file I suspect > hypre is not running on gpus as PCApply and DMCreate does *not* record > any gpu Mflop/s. However, some events such KSPSolve, MatMult etc are > running on gpus. > > > > Can you please let me know if I need to add any extra flag to the attached > arch-ci-linux-cuda11-double-xx.py script file to get hypre working on gpus? > > > > Thanks, > > Karthik. > > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > <http://www.cse.buffalo.edu/~knepley/> > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > <http://www.cse.buffalo.edu/~knepley/> > > > > -- > > Stefano > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > <http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
