Hi Barry,

My code uses DMDA, and the mesh is partitioned in the x-direction only. Can I make MPI+OpenMP work in the following way?

I want to create a new communicator that includes only the processes with Rank%12==0, and create the PETSc objects on this subset of processes. On each node (which has 12 cores), the first core (Rank%12==0) does the MPI communication (with the Rank%12==0 processes of the other nodes) and then directs the other 11 processes to do the computation using OpenMP.

Thank you.

On Tue, May 23, 2017 at 1:58 AM, Barry Smith <[email protected]> wrote:

>
> > On May 22, 2017, at 11:25 AM, Pham Pham <[email protected]> wrote:
> >
> > Hi Matt,
> >
> > For the machine I have, is it a good idea to mix MPI and OpenMP: MPI
> > for cores with Rank%12==0 and OpenMP for the others?
> >
>
> MPI+OpenMP doesn't work this way. Each "rank" is an MPI process; you
> cannot say some ranks are MPI and some are OpenMP. If you want to use one
> MPI process per node and have each MPI process have 12 OpenMP threads, you
> need to find out, for YOUR system's MPI, how to tell it to put one MPI
> process per node.
>
> Barry
>
> > Thank you,
> >
> > PVS.
> >
> > On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley <[email protected]> wrote:
> > On Thu, May 11, 2017 at 7:08 AM, Pham Pham <[email protected]> wrote:
> > Hi Matt,
> >
> > Thank you for the reply.
> >
> > I am using the University HPC, which has multiple nodes and should be good
> > for parallel computing. The bad performance might be due to the way I
> > install and run PETSc...
> >
> > Looking at the output when running streams, I can see that the Processor
> > names were the same.
> > Does that mean only one processor was involved in computing, and did it
> > cause the bad performance?
> >
> > Yes. From the data, it appears that the kind of processor you have has
> > 12 cores, but only enough memory bandwidth to support 1.5 cores.
> > Try running the STREAMS with only 1 process per node. This is a setting
> > in your submission script, but it is different for every cluster. Thus
> > I would ask the local sysadmin for this machine to help you do that.
> > You should see almost perfect scaling with that configuration. You might
> > also try 2 processes per node to compare.
> >
> > Thanks,
> >
> > Matt
> >
> > Thank you very much.
> >
> > Ph.
> >
> > Below is testing output:
> >
> > [mpepvs@atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
> > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
> > /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o MPIVersion.o -c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include `pwd`/MPIVersion.c
> > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > The version of PETSc you are using is out-of-date, we recommend updating to the new release
> > Available Version: 3.7.6 Installed Version: 3.7.5
> > http://www.mcs.anl.gov/petsc/download/index.html
> > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Running streams with 'mpiexec.hydra ' using 'NPMAX=12'
> > Number of MPI processes 1 Processor names atlas5-c01
> > Triad: 11026.7604 Rate (MB/s)
> > Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01
> > Triad: 14669.6730 Rate (MB/s)
> > Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 12848.2644 Rate (MB/s)
> > Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 15033.7687 Rate (MB/s)
> > Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 13299.3830 Rate (MB/s)
> > Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 14382.2116 Rate (MB/s)
> > Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 13194.2573 Rate (MB/s)
> > Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 14199.7255 Rate (MB/s)
> > Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 13045.8946 Rate (MB/s)
> > Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 13058.3283 Rate (MB/s)
> > Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 13037.3334 Rate (MB/s)
> > Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
> > Triad: 12526.6096 Rate (MB/s)
> > ------------------------------------------------
> > np speedup
> > 1 1.0
> > 2 1.33
> > 3 1.17
> > 4 1.36
> > 5 1.21
> > 6 1.3
> > 7 1.2
> > 8 1.29
> > 9 1.18
> > 10 1.18
> > 11 1.18
> > 12 1.14
> > Estimation of possible speedup of MPI programs based on Streams benchmark.
> > It appears you have 1 node(s)
> > See graph in the file src/benchmarks/streams/scaling.png
> >
> > On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley <[email protected]> wrote:
> > On Fri, May 5, 2017 at 10:18 AM, Pham Pham <[email protected]> wrote:
> > Hi Satish,
> >
> > It runs now, and shows a bad speed up:
> > Please help to improve this.
> >
> > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers
> >
> > The short answer is: You cannot improve this without buying a different machine.
> > This is a fundamental algorithmic limitation that cannot be helped by
> > threads, or vectorization, or anything else.
> >
> > Matt
> >
> > Thank you.
> >
> > <scaling.png>
> >
> >
> > On Fri, May 5, 2017 at 10:02 PM, Satish Balay <[email protected]> wrote:
> > With Intel MPI - it's best to use mpiexec.hydra [and not mpiexec]
> >
> > So you can do:
> >
> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test
> >
> >
> > [you can also specify --with-mpiexec=mpiexec.hydra at configure time]
> >
> > Satish
> >
> >
> > On Fri, 5 May 2017, Pham Pham wrote:
> >
> > > *Hi,*
> > > *I can configure now, but it fails when testing:*
> > >
> > > [mpepvs@atlas7-c10 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt test
> > > Running test examples to verify correct installation
> > > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and PETSC_ARCH=arch-linux-cxx-opt
> > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process
> > > See http://www.mcs.anl.gov/petsc/documentation/faq.html
> > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes:
> > > 1. no mpd is running on this host
> > > 2. an mpd is running but was started without a "console" (-n option)
> > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
> > > See http://www.mcs.anl.gov/petsc/documentation/faq.html
> > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes:
> > > 1. no mpd is running on this host
> > > 2. an mpd is running but was started without a "console" (-n option)
> > > Possible error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process
> > > See http://www.mcs.anl.gov/petsc/documentation/faq.html
> > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes:
> > > 1. no mpd is running on this host
> > > 2. an mpd is running but was started without a "console" (-n option)
> > > Completed test examples
> > > =========================================
> > > Now to evaluate the computer systems you plan use - do:
> > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
> > >
> > >
> > > *Please help on this.*
> > > *Many thanks!*
> > >
> > >
> > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay <[email protected]> wrote:
> > >
> > > > Sorry - should have mentioned:
> > > >
> > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again.
> > > >
> > > > The mpich install from previous build [that is currently in
> > > > arch-linux-cxx-opt/]
> > > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/
> > > >
> > > > Satish
> > > >
> > > >
> > > > On Wed, 19 Apr 2017, Pham Pham wrote:
> > > >
> > > > > I reconfigured PETSc with the installed MPI; however, I got a serious error:
> > > > >
> > > > > **************************ERROR**************************************
> > > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/conf/make.log
> > > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to
> > > > > [email protected]
> > > > > ********************************************************************
> > > > >
> > > > > Please explain what is happening.
> > > > >
> > > > > Thank you very much.
> > > > >
> > > > >
> > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay <[email protected]> wrote:
> > > > >
> > > > > > Presumably your cluster already has a recommended MPI to use [which is
> > > > > > already installed]. So you should use that - instead of
> > > > > > --download-mpich=1
> > > > > >
> > > > > > Satish
> > > > > >
> > > > > > On Wed, 19 Apr 2017, Pham Pham wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I just installed petsc-3.7.5 on my university cluster. When evaluating
> > > > > > > the computer system, PETSc reports "It appears you have 1 node(s)". I do not
> > > > > > > understand this, since the system is a multi-node system. Could you please
> > > > > > > explain this to me?
> > > > > > >
> > > > > > > Thank you very much.
> > > > > > >
> > > > > > > S.
> > > > > > >
> > > > > > > Output:
> > > > > > > =========================================
> > > > > > > Now to evaluate the computer systems you plan use - do:
> > > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
> > > > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
> > > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
> > > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx -o MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g -O -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include `pwd`/MPIVersion.c
> > > > > > > Running streams with '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec ' using 'NPMAX=12'
> > > > > > > Number of MPI processes 1 Processor names atlas7-c10
> > > > > > > Triad: 9137.5025 Rate (MB/s)
> > > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10
> > > > > > > Triad: 9707.2815 Rate (MB/s)
> > > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 13559.5275 Rate (MB/s)
> > > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 14193.0597 Rate (MB/s)
> > > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 14492.9234 Rate (MB/s)
> > > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15476.5912 Rate (MB/s)
> > > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15148.7388 Rate (MB/s)
> > > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15799.1290 Rate (MB/s)
> > > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15671.3104 Rate (MB/s)
> > > > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15601.4754 Rate (MB/s)
> > > > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15434.5790 Rate (MB/s)
> > > > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
> > > > > > > Triad: 15134.1263 Rate (MB/s)
> > > > > > > ------------------------------------------------
> > > > > > > np speedup
> > > > > > > 1 1.0
> > > > > > > 2 1.06
> > > > > > > 3 1.48
> > > > > > > 4 1.55
> > > > > > > 5 1.59
> > > > > > > 6 1.69
> > > > > > > 7 1.66
> > > > > > > 8 1.73
> > > > > > > 9 1.72
> > > > > > > 10 1.71
> > > > > > > 11 1.69
> > > > > > > 12 1.66
> > > > > > > Estimation of possible speedup of MPI programs based on Streams benchmark.
> > > > > > > It appears you have 1 node(s)
> > > > > > > Unable to plot speedup to a file
> > > > > > > Unable to open matplotlib to plot speedup
> > > > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$
> > > > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$
> >
> > --
> > What most experimenters take for granted before they begin their
> > experiments is infinitely more interesting than any results to which their
> > experiments lead.
> > -- Norbert Wiener
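
The "np speedup" column in the streams output quoted in this thread appears to be the Triad rate at each process count divided by the single-process rate. The following is a minimal Python sketch of that arithmetic, assuming the atlas5-c01 rates from the quoted run. The names triad_mb_s, speedups and best_process_count are illustrative only and are not part of PETSc.

```python
# Triad rates (MB/s) copied from the atlas5-c01 STREAMS run quoted in the thread.
# Keys are the number of MPI processes, all placed on the same node.
triad_mb_s = {
    1: 11026.7604, 2: 14669.6730, 3: 12848.2644, 4: 15033.7687,
    5: 13299.3830, 6: 14382.2116, 7: 13194.2573, 8: 14199.7255,
    9: 13045.8946, 10: 13058.3283, 11: 13037.3334, 12: 12526.6096,
}


def speedups(rates):
    """Return {np: rate(np) / rate(1)} for each process count (hypothetical helper)."""
    base = rates[1]
    return {n: rate / base for n, rate in sorted(rates.items())}


def best_process_count(rates):
    """Return the process count with the highest aggregate Triad rate."""
    return max(rates, key=rates.get)


if __name__ == "__main__":
    # Print a table comparable to the "np speedup" block in the streams output.
    for n, s in speedups(triad_mb_s).items():
        print(f"{n:2d}  {s:.2f}")
    print("best np:", best_process_count(triad_mb_s))
```

On these numbers the aggregate rate peaks at 4 processes, at about 1.36 times the single-process rate, and adding more processes on the same node does not raise it further. That is consistent with the memory bandwidth limit Matt describes.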
