On Mon, Jul 3, 2017 at 4:18 AM, Damian Kaliszan <[email protected]> wrote:
> Hi,
>
> OK. So this is clear now.
> Maybe you will be able to help answer the question I raised some time ago:
> why, when submitting a Slurm job and setting different numbers of
> --cpus-per-task, do the execution times of the KSP solver (GMRES type) vary
> when the maximum number of iterations is set to 100,000, and why does
> ksp.getIterationNumber() (in petsc4py) return 900 in some cases and 100,000
> in others once the solve has finished? It looks like the number of CPUs per
> task influences the number of steps the solver takes to solve the equation.
> Is that possible at all?

If you have one solver executing on the whole communicator, PETSC_COMM_WORLD,
all processes will report the same number of iterates. If you have
independent, serial solvers on each process, PETSC_COMM_SELF, they can report
different numbers of iterates depending on the convergence criteria.

   Matt

> Best,
> Damian
>
> In a message dated 1 July 2017 (22:23:19), the following was written:
>
>> On Jul 1, 2017, at 3:15 PM, Damian Kaliszan <[email protected]> wrote:
>>
>>> Hi,
>>> So... the --with-openmp=0/1 configuration option seems to be useless?...
>>
>> It merely enables the compiler flags for compiling with OpenMP;
>> for example, if your code has OpenMP in it.
>>
>>> In one of my previous messages I wrote that, when OpenMP is enabled and
>>> OMP_NUM_THREADS is set, I notice different timings for the KSP solver.
>>> Strange...?
>>
>> For the PETSc part, yes, the number of threads shouldn't matter, since
>> PETSc would only use 1.
>>
>>> Best,
>>> Damian
>>>
>>> On 1 July 2017, at 00:50, Barry Smith <[email protected]> wrote:
>>>
>>>> The current version of PETSc does not use OpenMP; you are free to use
>>>> OpenMP in your portions of the code, of course. If you want PETSc using
>>>> OpenMP you have to use the old, unsupported version of PETSc. We never
>>>> found any benefit to using OpenMP.
>>>>
>>>>    Barry
>>>>
>>>> On Jun 30, 2017, at 5:40 PM, Danyang Su <[email protected]> wrote:
>>>>
>>>>> Dear All,
>>>>>
>>>>> I recall there was OpenMP support in an old development version of
>>>>> PETSc; googling "petsc hybrid mpi openmp" returns some papers about
>>>>> this feature. My code was first parallelized using OpenMP and then
>>>>> redeveloped using PETSc, with OpenMP kept but not used together with
>>>>> MPI. Before retesting the code with hybrid MPI-OpenMP, I picked one
>>>>> PETSc example, ex10, and added "omp_set_num_threads(max_threads);"
>>>>> right after PetscInitialize.
>>>>>
>>>>> PETSc is the current development version, configured as follows:
>>>>>
>>>>> --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0
>>>>> --CFLAGS=-fopenmp --CXXFLAGS=-fopenmp --FFLAGS=-fopenmp
>>>>> COPTFLAGS="-O3 -march=native -mtune=native"
>>>>> CXXOPTFLAGS="-O3 -march=native -mtune=native"
>>>>> FOPTFLAGS="-O3 -march=native -mtune=native" --with-large-file-io=1
>>>>> --download-cmake=yes --download-mumps --download-scalapack
>>>>> --download-parmetis --download-metis --download-ptscotch
>>>>> --download-fblaslapack --download-mpich --download-hypre
>>>>> --download-superlu_dist --download-hdf5=yes --with-openmp
>>>>> --with-threadcomm --with-pthreadclasses --with-openmpclasses
>>>>>
>>>>> The code compiles successfully. However, when I run it with OpenMP it
>>>>> does not work: the timings show no change in performance whether 1 or 2
>>>>> threads per processor are used, and the CPU/thread usage indicates that
>>>>> no threads are used.
>>>>>
>>>>> I just wonder whether OpenMP is still available in the latest version,
>>>>> even though it is not recommended.
>>>>>
>>>>> mpiexec -n 2 ./ex10 -f0 mat_rhs_pc_nonzero/a_react_in_2.bin
>>>>>   -rhs mat_rhs_pc_nonzero/b_react_in_2.bin -ksp_rtol 1.0e-20
>>>>>   -ksp_monitor -ksp_error_if_not_converged
>>>>>   -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info
>>>>>   -log_view -max_threads 1 -threadcomm_type openmp -threadcomm_nthreads 1
>>>>>
>>>>> KSPSolve           1 1.0 8.9934e-01 1.0 1.03e+09 1.0 7.8e+01 3.6e+04 7.8e+01 69 97 89  6 76  89 97 98 98 96  2290
>>>>> PCSetUp            2 1.0 8.9590e-02 1.0 2.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00  7  3  0  0  0   9  3  0  0  0   648
>>>>> PCSetUpOnBlocks    2 1.0 8.9465e-02 1.0 2.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00  7  3  0  0  0   9  3  0  0  0   649
>>>>> PCApply           40 1.0 3.1993e-01 1.0 2.70e+08 1.0 0.0e+00 0.0e+00 0.0e+00 24 25  0  0  0  32 25  0  0  0  1686
>>>>>
>>>>> mpiexec -n 2 ./ex10 -f0 mat_rhs_pc_nonzero/a_react_in_2.bin
>>>>>   -rhs mat_rhs_pc_nonzero/b_react_in_2.bin -ksp_rtol 1.0e-20
>>>>>   -ksp_monitor -ksp_error_if_not_converged
>>>>>   -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info
>>>>>   -log_view -max_threads 2 -threadcomm_type openmp -threadcomm_nthreads 2
>>>>>
>>>>> KSPSolve           1 1.0 8.9701e-01 1.0 1.03e+09 1.0 7.8e+01 3.6e+04 7.8e+01 69 97 89  6 76  89 97 98 98 96  2296
>>>>> PCSetUp            2 1.0 8.7635e-02 1.0 2.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00  7  3  0  0  0   9  3  0  0  0   663
>>>>> PCSetUpOnBlocks    2 1.0 8.7511e-02 1.0 2.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00  7  3  0  0  0   9  3  0  0  0   664
>>>>> PCApply           40 1.0 3.1878e-01 1.0 2.70e+08 1.0 0.0e+00 0.0e+00 0.0e+00 24 25  0  0  0  32 25  0  0  0  1692
>>>>>
>>>>> Thanks and regards,
>>>>>
>>>>> Danyang
>>>>>
>>>>> <ex10.c><makefile.txt>

> -------------------------------------------------------
> Damian Kaliszan
>
> Poznan Supercomputing and Networking Center
> HPC and Data Centres Technologies
> ul. Jana Pawła II 10
> 61-139 Poznan
> POLAND
>
> phone (+48 61) 858 5109
> e-mail [email protected]
> www - http://www.man.poznan.pl/
> -------------------------------------------------------

--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener

http://www.caam.rice.edu/~mk51/
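
A minimal petsc4py sketch of the distinction Matt describes above: a single
KSP created on PETSc.COMM_WORLD reports one global iteration count on every
rank, while independent KSPs created on PETSc.COMM_SELF each report their
own count, which can differ whenever the per-rank systems or preconditioners
differ. The helper name solve_on, the stand-in 1D Laplacian, and the
tolerances are illustrative assumptions, not anything taken from the runs
discussed in the thread.

import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

def solve_on(comm, n=1000):
    # Assemble a tridiagonal (1D Laplacian) matrix on the given communicator.
    # This is only a stand-in for the application matrices in the thread.
    A = PETSc.Mat().createAIJ([n, n], comm=comm)
    A.setUp()
    A.setOption(PETSc.Mat.Option.NEW_NONZERO_ALLOCATION_ERR, False)
    rstart, rend = A.getOwnershipRange()
    for i in range(rstart, rend):
        A.setValue(i, i, 2.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
        if i < n - 1:
            A.setValue(i, i + 1, -1.0)
    A.assemble()

    x, b = A.createVecs()
    b.set(1.0)

    # GMRES with a large iteration cap, mirroring the settings in the question.
    ksp = PETSc.KSP().create(comm=comm)
    ksp.setOperators(A)
    ksp.setType(PETSc.KSP.Type.GMRES)
    ksp.setTolerances(rtol=1.0e-8, max_it=100000)
    ksp.setFromOptions()
    ksp.solve(b, x)
    return ksp.getIterationNumber()

rank = PETSc.COMM_WORLD.getRank()

# One parallel solver on PETSC_COMM_WORLD: every rank reports the same count.
its_world = solve_on(PETSc.COMM_WORLD)
PETSc.Sys.Print("COMM_WORLD solve: %d iterations (same on every rank)" % its_world)

# Independent serial solvers on PETSC_COMM_SELF: each rank runs its own KSP,
# so the counts can differ whenever the per-rank systems or preconditioners
# differ; here the systems are identical, so the counts happen to agree.
its_self = solve_on(PETSc.COMM_SELF)
print("[rank %d] COMM_SELF solve: %d iterations" % (rank, its_self))

Run, for example, with "mpiexec -n 2 python solve_on_demo.py" (hypothetical
file name); the COMM_WORLD count is printed once, while each rank prints its
own COMM_SELF count.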
