Barry,

Please find attached the patch with the minor change to control the number of elements for snes/ex20.c from the command line. I know this can already be done with -grid_x etc. from the command line, but I thought a single option made the typing a little easier during the refinement runs. I apologize if there was any confusion.
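Since the patch itself is only an attachment, here is a minimal sketch of the idea as a standalone program (this is not the attached patch; the PetscOptionsGetInt signature is the petsc-3.1-era one that matches this thread, and the variable names are made up for illustration):

    #include "petsc.h"

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;
      PetscInt       grid = 20;   /* default number of elements per dimension */
      PetscTruth     flg;         /* PetscBool in later PETSc releases */

      ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);

      /* pick up -grid <n> from the command line, if it was given */
      ierr = PetscOptionsGetInt(PETSC_NULL, "-grid", &grid, &flg); CHKERRQ(ierr);

      /* in ex20.c the single value would then be used for every grid
         dimension, e.g. mx = my = mz = grid, instead of passing separate
         -grid_x/-grid_y/-grid_z options */
      ierr = PetscPrintf(PETSC_COMM_WORLD, "grid = %D\n", grid); CHKERRQ(ierr);

      ierr = PetscFinalize(); CHKERRQ(ierr);
      return 0;
    }

The only point is to type one option instead of three when refining the grid.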
Also attached are the full -log_summary outputs for the 1- and 2-process runs.

Thanks,
Vijay

On Wed, Feb 2, 2011 at 6:06 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>   We need all the information from -log_summary to see what is going on.
>
>   Not sure what -grid 20 means, but don't expect any good parallel
> performance with fewer than about 10,000 unknowns per process.
>
>    Barry
>
> On Feb 2, 2011, at 5:38 PM, Vijay S. Mahadevan wrote:
>
>> Here are the performance statistics for the 1- and 2-processor runs.
>>
>> /usr/lib/petsc/linux-gnu-cxx-opt/bin/mpiexec -n 1 ./ex20 -grid 20
>> -log_summary
>>
>>                          Max       Max/Min        Avg      Total
>> Time (sec):           8.452e+00      1.00000   8.452e+00
>> Objects:              1.470e+02      1.00000   1.470e+02
>> Flops:                5.045e+09      1.00000   5.045e+09  5.045e+09
>> Flops/sec:            5.969e+08      1.00000   5.969e+08  5.969e+08
>> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
>> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
>> MPI Reductions:       4.440e+02      1.00000
>>
>> /usr/lib/petsc/linux-gnu-cxx-opt/bin/mpiexec -n 2 ./ex20 -grid 20
>> -log_summary
>>
>>                          Max       Max/Min        Avg      Total
>> Time (sec):           7.851e+00      1.00000   7.851e+00
>> Objects:              2.000e+02      1.00000   2.000e+02
>> Flops:                4.670e+09      1.00580   4.657e+09  9.313e+09
>> Flops/sec:            5.948e+08      1.00580   5.931e+08  1.186e+09
>> MPI Messages:         7.965e+02      1.00000   7.965e+02  1.593e+03
>> MPI Message Lengths:  1.412e+07      1.00000   1.773e+04  2.824e+07
>> MPI Reductions:       1.046e+03      1.00000
>>
>> I am not entirely sure I can make sense of those statistics, but if
>> there is something more you need, please let me know.
>>
>> Vijay
>>
>> On Wed, Feb 2, 2011 at 5:15 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>> On Wed, Feb 2, 2011 at 5:04 PM, Vijay S. Mahadevan <vijay.m at gmail.com>
>>> wrote:
>>>>
>>>> Matt,
>>>>
>>>> The --with-debugging=1 option is certainly not meant for performance
>>>> studies, but I didn't expect the 2-processor run to need the same CPU
>>>> time as a single processor for snes/ex20; i.e., my runs with 1 and 2
>>>> processors take approximately the same amount of time to compute the
>>>> solution. I am currently configuring without debugging symbols and
>>>> shall let you know what that yields.
>>>>
>>>> On a similar note, is there something extra that needs to be done to
>>>> make use of multi-core machines while using MPI? I am not sure if
>>>> this is even related to PETSc, but it could be an MPI configuration
>>>> option that either I or the configure process is missing. All ideas
>>>> are much appreciated.
>>>
>>> Sparse MatVec (MatMult) is a memory-bandwidth-limited operation. On most
>>> cheap multicore machines there is a single memory bus, and thus using
>>> more cores gains you very little extra performance. I still suspect you
>>> are not actually running in parallel, because even then you usually see
>>> a small speedup. That is why I suggested looking at -log_summary, since
>>> it tells you how many processes were run and breaks down the time.
>>>
>>>    Matt
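(To put rough numbers on the bandwidth argument above: PETSc's AIJ MatMult does
2 flops per stored nonzero while streaming roughly 12 bytes per nonzero (an
8-byte double value plus a 4-byte column index), ignoring vector and
row-pointer traffic. So the achievable rate is about

    flop rate ~ (2 flops / 12 bytes) * memory bandwidth ~ bandwidth / 6

With, say, 6 GB/s of sustained bandwidth (an assumed figure for illustration,
not a measurement of this machine), that caps MatMult near 1 GFlop/s in total,
shared by all cores on the socket, since adding cores does not add memory
bandwidth.)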
>>>
>>>> Vijay
>>>>
>>>> On Wed, Feb 2, 2011 at 4:53 PM, Matthew Knepley <knepley at gmail.com>
>>>> wrote:
>>>>> On Wed, Feb 2, 2011 at 4:46 PM, Vijay S. Mahadevan <vijay.m at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to configure my PETSc install with an MPI installation to
>>>>>> make use of a dual quad-core desktop system running Ubuntu. Even
>>>>>> though the configure/make process went through without problems, the
>>>>>> scalability of the programs does not seem to reflect what I expected.
>>>>>> My configure options are
>>>>>>
>>>>>> --download-f-blas-lapack=1 --with-mpi-dir=/usr/lib/ --download-mpich=1
>>>>>> --with-mpi-shared=0 --with-shared=0 --COPTFLAGS=-g
>>>>>> --download-parmetis=1 --download-superlu_dist=1 --download-hypre=1
>>>>>> --download-blacs=1 --download-scalapack=1 --with-clanguage=C++
>>>>>> --download-plapack=1 --download-mumps=1 --download-umfpack=yes
>>>>>> --with-debugging=1 --with-errorchecking=yes
>>>>>
>>>>> 1) For performance studies, make a build using --with-debugging=0
>>>>>
>>>>> 2) Look at -log_summary for a breakdown of performance
>>>>>
>>>>>    Matt
>>>>>
>>>>>> Is there something else that needs to be done as part of the
>>>>>> configure process to enable decent scaling? I am only comparing runs
>>>>>> with mpiexec -n 1 and -n 2, but they seem to take approximately the
>>>>>> same time as reported by -log_summary. If it helps, I have been
>>>>>> testing with snes/examples/tutorials/ex20.c throughout, with a custom
>>>>>> -grid parameter from the command line to control the number of
>>>>>> unknowns.
>>>>>>
>>>>>> If there is something you have seen before with this configuration,
>>>>>> or if you need anything else to analyze the problem, do let me know.
>>>>>>
>>>>>> Thanks,
>>>>>> Vijay
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which
>>>>> their experiments lead.
>>>>> -- Norbert Wiener
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which
>>> their experiments lead.
>>> -- Norbert Wiener

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex20.patch
Type: text/x-patch
Size: 526 bytes
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20110202/b1b8c55d/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex20_np1.out
Type: application/octet-stream
Size: 11823 bytes
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20110202/b1b8c55d/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex20_np2.out
Type: application/octet-stream
Size: 12814 bytes
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20110202/b1b8c55d/attachment-0003.obj>
