> On Apr 12, 2015, at 12:48 PM, Gideon Simpson <[email protected]> wrote:
> 
> I was hoping to demonstrate in my class the computational gain with petsc/mpi 
> in solving a simple problem, like discretized poisson or heat, as the number 
> of processes increases.  Can anyone recommend any of the petsc examples for 
> this purpose?  Perhaps I’m just using poorly chosen KSP/PC pairs, but I 
> haven’t been able to observe gain.  I’m planning to demo this on a commodity 
> intel cluster with infiniband.  

   Gideon,

   I would use src/ksp/ksp/examples/tutorials/ex45 to get across three
concepts:

   1) Algorithmic complexity (1 process). Run the example with several levels
of refinement (say -da_refine 4, depending on how much memory you have) with
        a) -pc_type jacobi -ksp_type bcgs   (an algorithm with poor
computational complexity but excellent parallelism)
        b) -pc_type mg -ksp_type bcgs   (an algorithm with good computational
complexity; it also parallelizes well, though not as perfectly as Jacobi)
      then run both again with one more level of refinement (say -da_refine 5)
and see how much more time each method takes (command lines for these runs are
sketched after this list).

   2) Scaling (2 processes). Run the same cases as in 1) but on two processes
and note that the "poorer" algorithm, Jacobi, gives better "speedup" than mg.

   3) Understanding the limitations of your machine (see
http://www.mcs.anl.gov/petsc/documentation/faq.html#computers): the total
memory bandwidth of all the cores you use determines the performance of the
PETSc solvers. Run the streams benchmark (now included with PETSc in the
src/benchmarks/streams directory) to see its speedup for different numbers of
cores and different placements of those cores within and across nodes, and
then run the PETSc example with the same placements to see its speedups. Note
that you will likely have to do something smarter than "mpiexec -n n ./ex45
..." to launch the program, since you need control over which nodes the MPI
launcher puts each process on; for example, does it spread the processes one
per node, or does it first pack them onto one node? (Check the documentation
for your mpiexec to see how to control this.) You will find that different
placement choices lead to very different performance, and this can be related
to the streams results and the available memory bandwidth (placement examples
are included in the sketch below).
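
   For concreteness, here is a rough command sketch of the three experiments.
It assumes ex45 is built in its tutorials directory, that -log_view is
available for timings (older PETSc releases use -log_summary), and that the
launcher is Open MPI; the streams make target and the process-placement flags
are assumptions, so check your PETSc version and your mpiexec documentation
for the exact names.

      # build the example from the PETSc source tree
      cd $PETSC_DIR/src/ksp/ksp/examples/tutorials
      make ex45

      # 1) algorithmic complexity on one process
      ./ex45 -da_refine 4 -pc_type jacobi -ksp_type bcgs -log_view
      ./ex45 -da_refine 4 -pc_type mg     -ksp_type bcgs -log_view
      ./ex45 -da_refine 5 -pc_type jacobi -ksp_type bcgs -log_view
      ./ex45 -da_refine 5 -pc_type mg     -ksp_type bcgs -log_view

      # 2) the same runs on two processes, to compare speedups
      mpiexec -n 2 ./ex45 -da_refine 4 -pc_type jacobi -ksp_type bcgs -log_view
      mpiexec -n 2 ./ex45 -da_refine 4 -pc_type mg     -ksp_type bcgs -log_view

      # 3) streams for several core counts/placements, then ex45 with the
      #    same placements (--map-by is Open MPI syntax; other MPIs differ)
      cd $PETSC_DIR/src/benchmarks/streams
      make streams                            # target name may vary by PETSc version
      mpiexec -n 2 --map-by node ./ex45 ...   # spread: one process per node
      mpiexec -n 2 --map-by core ./ex45 ...   # pack: both processes on one node

   Comparing the KSPSolve time reported by -log_view (or -log_summary) across
these runs, together with the streams bandwidth for the same core placements,
is what makes the memory-bandwidth story concrete.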

  Barry



> 
> -gideon
> 
