Hi,

Since Vec and most of Mat are now threaded, we have started to do more detailed profiling. I'm posting these initial tasters from a two-socket Intel Core Bloomfield processor system (i.e. 8 cores) to stimulate discussion.

The matrix comes from a 3D lock exchange problem discretised using a continuous Galerkin finite element formulation and has about 450k degrees of freedom. I have configured the simulator (Fluidity - http://amcg.ese.ic.ac.uk/Fluidity) to dump out PETSc matrices at each solve. These individual matrices are then solved using petsc-dev/src/ksp/ksp/examples/tests/ex6 compiled with GCC 4.6.3 and --with-debugging=0.

The PETSc options are:

-get_total_flops -pc_type gamg -ksp_type cg -ksp_rtol 1.0e-6 -log_summary

The 3 log files attached are for OMP_NUM_THREADS=1, OMP_NUM_THREADS=8 and a non-threaded MPI run with 8 processes for comparison.

This benchmark is interesting because it is a pressure solve, which is really stiff, and because it uses GAMG as a black box. Using xxdiff to compare the logs, I think the interesting points are:

- Overall, OpenMP compares favourably with MPI.
- OpenMP converged in 2 fewer iterations than MPI. I had expected fewer iterations simply because there are no partitions to diminish the effectiveness of coarsening. I have not been following Mark's GAMG development, but it looks like repartitioning is being used to get around that issue (?). The biggest plus, however, is that because Chebyshev is used as the smoother (rather than something difficult to parallelise like SSOR), GAMG appears to scale pretty well when threaded with OpenMP.
- Important operations like MatMult etc. perform well.
- From the summary, "mystage 1" is the main section where OMP appears to need more work. We suffer in operations such as MatPtAP and MatTrnMatMult, for example, which we have not got around to looking at yet.

As this is a relatively small and boring UMA machine, I have not bothered with scaling curves. We are setting the same benchmark up on 32-core Interlagos compute nodes at the moment; hopefully these will be ready by tomorrow.

Comments welcome.

Cheers
Gerard

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lock_exchange.tar.gz
Type: application/x-gzip
Size: 7559 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20120314/c96b4c35/attachment.bin>
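
For reference, the three runs behind the attached logs (OMP_NUM_THREADS=1, OMP_NUM_THREADS=8 and 8 MPI processes) could be scripted roughly as below. This is only a sketch under assumptions, not the exact commands used: the `-f <file>` argument for passing the dumped matrix to ex6, the `./ex6` path, the `mpiexec -n 8` launcher and the file name `pressure.mat` are all placeholders. Only the PETSc options and the OMP_NUM_THREADS values come from the post.

```python
import shlex

# PETSc options exactly as quoted in the post.
PETSC_OPTIONS = [
    "-get_total_flops",
    "-pc_type", "gamg",
    "-ksp_type", "cg",
    "-ksp_rtol", "1.0e-6",
    "-log_summary",
]


def build_runs(matrix_file, exe="./ex6"):
    """Return (label, env, argv) for the three configurations compared.

    Assumptions (not from the post): the matrix is passed with "-f",
    the executable is at ./ex6, and MPI jobs start via "mpiexec -n 8".
    """
    base = [exe, "-f", matrix_file] + PETSC_OPTIONS
    return [
        ("omp1", {"OMP_NUM_THREADS": "1"}, base),
        ("omp8", {"OMP_NUM_THREADS": "8"}, base),
        # Non-threaded MPI build, so no OpenMP environment is set here.
        ("mpi8", {}, ["mpiexec", "-n", "8"] + base),
    ]


def as_shell(env, argv):
    """Format one run as a single shell command line."""
    assignments = ["%s=%s" % (key, value) for key, value in env.items()]
    return " ".join(assignments + [shlex.quote(arg) for arg in argv])


if __name__ == "__main__":
    # "pressure.mat" is a hypothetical name for one dumped matrix.
    for label, env, argv in build_runs("pressure.mat"):
        print(label + ": " + as_shell(env, argv))
```

Redirecting the output of each command to its own file gives three logs that can be compared side by side with xxdiff, as described above.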
