Hi Matt,

For the machine I have, is it a good idea to mix MPI and OpenMP: MPI for cores with rank % 12 == 0 and OpenMP for the others?
Thank you,
PVS.

On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley <[email protected]> wrote:

> On Thu, May 11, 2017 at 7:08 AM, Pham Pham <[email protected]> wrote:
>
>> Hi Matt,
>>
>> Thank you for the reply.
>>
>> I am using the university HPC, which has multiple nodes and should be good
>> for parallel computing. The bad performance might be due to the way I
>> install and run PETSc...
>>
>> Looking at the output when running streams, I can see that the Processor
>> names were all the same.
>> Does that mean only one processor was involved in computing, and did that
>> cause the bad performance?
>>
>
> Yes. From the data, it appears that the kind of processor you have has 12
> cores, but only enough memory bandwidth to support 1.5 cores.
> Try running the STREAMS with only 1 process per node. This is a setting in
> your submission script, but it is different for every cluster. Thus
> I would ask the local sysadmin for this machine to help you do that. You
> should see almost perfect scaling with that configuration. You might
> also try 2 processes per node to compare.
>
>   Thanks,
>
>      Matt
>
>
>> Thank you very much.
>>
>> Ph.
>>
>> Below is testing output:
>>
>> [mpepvs@atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
>> cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
>> /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o MPIVersion.o -c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include `pwd`/MPIVersion.c
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> The version of PETSc you are using is out-of-date, we recommend updating to the new release
>> Available Version: 3.7.6   Installed Version: 3.7.5
>> http://www.mcs.anl.gov/petsc/download/index.html
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Running streams with 'mpiexec.hydra ' using 'NPMAX=12'
>> Number of MPI processes 1 Processor names atlas5-c01
>> Triad: 11026.7604 Rate (MB/s)
>> Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01
>> Triad: 14669.6730 Rate (MB/s)
>> Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 12848.2644 Rate (MB/s)
>> Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 15033.7687 Rate (MB/s)
>> Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 13299.3830 Rate (MB/s)
>> Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 14382.2116 Rate (MB/s)
>> Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 13194.2573 Rate (MB/s)
>> Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 14199.7255 Rate (MB/s)
>> Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 13045.8946 Rate (MB/s)
>> Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 13058.3283 Rate (MB/s)
>> Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 13037.3334 Rate (MB/s)
>> Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
>> Triad: 12526.6096 Rate (MB/s)
>> ------------------------------------------------
>> np  speedup
>> 1   1.0
>> 2   1.33
>> 3   1.17
>> 4   1.36
>> 5   1.21
>> 6   1.3
>> 7   1.2
>> 8   1.29
>> 9   1.18
>> 10  1.18
>> 11  1.18
>> 12  1.14
>> Estimation of possible speedup of MPI programs based on Streams benchmark.
>> It appears you have 1 node(s)
>> See graph in the file src/benchmarks/streams/scaling.png
>>
>> On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley <[email protected]> wrote:
>>
>>> On Fri, May 5, 2017 at 10:18 AM, Pham Pham <[email protected]> wrote:
>>>
>>>> Hi Satish,
>>>>
>>>> It runs now, and shows a bad speed up:
>>>> Please help to improve this.
>>>>
>>>
>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers
>>>
>>> The short answer is: You cannot improve this without buying a different
>>> machine. This is a fundamental algorithmic limitation that cannot be
>>> helped by threads, or vectorization, or anything else.
>>>
>>>    Matt
>>>
>>>
>>>> Thank you.
>>>>
>>>> On Fri, May 5, 2017 at 10:02 PM, Satish Balay <[email protected]> wrote:
>>>>
>>>>> With Intel MPI - it's best to use mpiexec.hydra [and not mpiexec]
>>>>>
>>>>> So you can do:
>>>>>
>>>>> make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test
>>>>>
>>>>> [you can also specify --with-mpiexec=mpiexec.hydra at configure time]
>>>>>
>>>>> Satish
>>>>>
>>>>> On Fri, 5 May 2017, Pham Pham wrote:
>>>>>
>>>>> > *Hi,*
>>>>> > *I can configure now, but fail when testing:*
>>>>> >
>>>>> > [mpepvs@atlas7-c10 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt test
>>>>> > Running test examples to verify correct installation
>>>>> > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and PETSC_ARCH=arch-linux-cxx-opt
>>>>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process
>>>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>>>> > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes:
>>>>> >   1. no mpd is running on this host
>>>>> >   2. an mpd is running but was started without a "console" (-n option)
>>>>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
>>>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>>>> > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes:
>>>>> >   1. no mpd is running on this host
>>>>> >   2. an mpd is running but was started without a "console" (-n option)
>>>>> > Possible error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process
>>>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>>>> > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes:
>>>>> >   1. no mpd is running on this host
>>>>> >   2. an mpd is running but was started without a "console" (-n option)
>>>>> > Completed test examples
>>>>> > =========================================
>>>>> > Now to evaluate the computer systems you plan use - do:
>>>>> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
>>>>> >
>>>>> > *Please help on this.*
>>>>> > *Many thanks!*
>>>>> >
>>>>> >
>>>>> > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay <[email protected]> wrote:
>>>>> >
>>>>> > > Sorry - should have mentioned:
>>>>> > >
>>>>> > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again.
>>>>> > >
>>>>> > > The mpich install from previous build [that is currently in arch-linux-cxx-opt/]
>>>>> > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/
>>>>> > >
>>>>> > > Satish
>>>>> > >
>>>>> > >
>>>>> > > On Wed, 19 Apr 2017, Pham Pham wrote:
>>>>> > >
>>>>> > > > I reconfigured PETSc with the installed MPI; however, I got a serious error:
>>>>> > > >
>>>>> > > > **************************ERROR*************************************
>>>>> > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/conf/make.log
>>>>> > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to
>>>>> > > > [email protected]
>>>>> > > > ********************************************************************
>>>>> > > >
>>>>> > > > Please explain what is happening?
>>>>> > > >
>>>>> > > > Thank you very much.
>>>>> > > >
>>>>> > > >
>>>>> > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay <[email protected]> wrote:
>>>>> > > >
>>>>> > > > > Presumably your cluster already has a recommended MPI to use [which is
>>>>> > > > > already installed]. So you should use that - instead of
>>>>> > > > > --download-mpich=1
>>>>> > > > >
>>>>> > > > > Satish
>>>>> > > > >
>>>>> > > > > On Wed, 19 Apr 2017, Pham Pham wrote:
>>>>> > > > >
>>>>> > > > > > Hi,
>>>>> > > > > >
>>>>> > > > > > I just installed petsc-3.7.5 on my university cluster. When evaluating
>>>>> > > > > > the computer system, PETSc reports "It appears you have 1 node(s)". I do not
>>>>> > > > > > understand this, since the system is a multi-node system. Could you please
>>>>> > > > > > explain this to me?
>>>>> > > > > >
>>>>> > > > > > Thank you very much.
>>>>> > > > > >
>>>>> > > > > > S.
>>>>> > > > > >
>>>>> > > > > > Output:
>>>>> > > > > > =========================================
>>>>> > > > > > Now to evaluate the computer systems you plan use - do:
>>>>> > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
>>>>> > > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
>>>>> > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams
>>>>> > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx -o MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g -O -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include `pwd`/MPIVersion.c
>>>>> > > > > > Running streams with '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec ' using 'NPMAX=12'
>>>>> > > > > > Number of MPI processes 1 Processor names atlas7-c10
>>>>> > > > > > Triad: 9137.5025 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 9707.2815 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 13559.5275 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 14193.0597 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 14492.9234 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15476.5912 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15148.7388 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15799.1290 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15671.3104 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15601.4754 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15434.5790 Rate (MB/s)
>>>>> > > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10
>>>>> > > > > > Triad: 15134.1263 Rate (MB/s)
>>>>> > > > > > ------------------------------------------------
>>>>> > > > > > np  speedup
>>>>> > > > > > 1   1.0
>>>>> > > > > > 2   1.06
>>>>> > > > > > 3   1.48
>>>>> > > > > > 4   1.55
>>>>> > > > > > 5   1.59
>>>>> > > > > > 6   1.69
>>>>> > > > > > 7   1.66
>>>>> > > > > > 8   1.73
>>>>> > > > > > 9   1.72
>>>>> > > > > > 10  1.71
>>>>> > > > > > 11  1.69
>>>>> > > > > > 12  1.66
>>>>> > > > > > Estimation of possible speedup of MPI programs based on Streams benchmark.
>>>>> > > > > > It appears you have 1 node(s)
>>>>> > > > > > Unable to plot speedup to a file
>>>>> > > > > > Unable to open matplotlib to plot speedup
>>>>> > > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$
>>>>> > > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
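For reference, the "speedup" column in the quoted streams output appears to be each Triad rate divided by the single-process rate. The sketch below is only an illustration of that arithmetic, not the script that PETSc itself runs: the rates are copied from the atlas5-c01 run quoted above, and the names triad_mb_s, speedups and saturation_cores are made up for this example.

```python
# Triad rates (MB/s) copied from the atlas5-c01 streams run quoted above,
# listed in order of MPI process count (1 through 12).
triad_mb_s = [
    11026.7604, 14669.6730, 12848.2644, 15033.7687,
    13299.3830, 14382.2116, 13194.2573, 14199.7255,
    13045.8946, 13058.3283, 13037.3334, 12526.6096,
]


def speedups(rates):
    """Return rate(np) / rate(1) for each process count, rounded to 2 places."""
    base = rates[0]
    return [round(r / base, 2) for r in rates]


def saturation_cores(rates):
    """Best aggregate bandwidth, in units of one process's bandwidth."""
    return max(rates) / rates[0]


if __name__ == "__main__":
    for nproc, s in enumerate(speedups(triad_mb_s), start=1):
        print(nproc, s)
    print("peak / single-process rate: %.2f" % saturation_cores(triad_mb_s))
```

With these numbers the computed speedups agree with the table printed by streams, and the peak aggregate rate is only about 1.36 times the single-process rate, which is the sense in which the node's memory bandwidth supports far fewer than 12 cores.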
