Hi,

I'm fine-tuning a simulation in the frequency domain (similar to 
https://github.com/dealii/dealii/pull/6747). Because it is in the frequency 
domain, I can parallelize in several ways:

   1. Using MPI
   2. Running several simulations in parallel for different independent 
   frequency ranges.
   3. A combination of the previous options (MPI + independent frequency 
   ranges), as sketched just after this list.
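
For example, on a 16-core machine, option 3 could look like the sketch below. 
This is only an illustration, assuming two parameter files (one per 
independent frequency range) and 8 MPI processes per job; the file naming 
follows the commands further down:

mpiexec -n 8 ./membrane membrane_step_0.h5 &
mpiexec -n 8 ./membrane membrane_step_1.h5 &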

I found a curious behavior that I cannot explain.

If I run the command below, I obtain the following (note that ./membrane is a 
deal.II simulation and the HDF5 file contains the simulation parameters):

./membrane membrane_step_0.h5

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      12.3s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |      11.7s |        95% |
| output                          |         1 |  2.03e-06s |         0% |
| setup                           |         1 |     0.263s |       2.1% |
| solve                           |         1 |     0.303s |       2.5% |
+---------------------------------+-----------+------------+------------+

If I run the command below, I obtain, as expected, a similar result:

mpiexec -n 1 ./membrane membrane_step_0.h5

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      12.3s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |      11.7s |        95% |
| output                          |         1 |  1.95e-06s |         0% |
| setup                           |         1 |     0.259s |       2.1% |
| solve                           |         1 |     0.282s |       2.3% |
+---------------------------------+-----------+------------+------------+

If I run 16 independent processes in parallel on a 16-core machine, I obtain 
a similar result. It is slightly slower, but I think that is probably normal:

./membrane membrane_step_0.h5 &
./membrane membrane_step_1.h5 &
./membrane membrane_step_2.h5 &
./membrane membrane_step_3.h5 &
./membrane membrane_step_4.h5 &
./membrane membrane_step_5.h5 &
./membrane membrane_step_6.h5 &
./membrane membrane_step_7.h5 &
./membrane membrane_step_8.h5 &
./membrane membrane_step_9.h5 &
./membrane membrane_step_10.h5 &
./membrane membrane_step_11.h5 &
./membrane membrane_step_12.h5 &
./membrane membrane_step_13.h5 &
./membrane membrane_step_14.h5 &
./membrane membrane_step_15.h5 &
./membrane membrane_step_16.h5 &

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      15.1s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |      14.5s |        96% |
| output                          |         1 |  2.24e-06s |         0% |
| setup                           |         1 |      0.25s |       1.7% |
| solve                           |         1 |     0.341s |       2.3% |
+---------------------------------+-----------+------------+------------+
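
As a side note, the long chain of commands above could equally be written as 
a loop (assuming the membrane_step_<i>.h5 naming used here):

for i in $(seq 0 16); do ./membrane membrane_step_${i}.h5 & done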
 

But it is striking that if I run 16 MPI jobs of one process each on the same 
machine, I get a significant decrease in performance. Is there an MPI 
overhead per MPI job? I thought that a single MPI process uses only a single 
core. Am I missing something?

mpiexec -n 1 ./membrane membrane_step_0.h5 &
mpiexec -n 1 ./membrane membrane_step_1.h5 &
mpiexec -n 1 ./membrane membrane_step_2.h5 &
mpiexec -n 1 ./membrane membrane_step_3.h5 &
mpiexec -n 1 ./membrane membrane_step_4.h5 &
mpiexec -n 1 ./membrane membrane_step_5.h5 &
mpiexec -n 1 ./membrane membrane_step_6.h5 &
mpiexec -n 1 ./membrane membrane_step_7.h5 &
mpiexec -n 1 ./membrane membrane_step_8.h5 &
mpiexec -n 1 ./membrane membrane_step_9.h5 &
mpiexec -n 1 ./membrane membrane_step_10.h5 &
mpiexec -n 1 ./membrane membrane_step_11.h5 &
mpiexec -n 1 ./membrane membrane_step_12.h5 &
mpiexec -n 1 ./membrane membrane_step_13.h5 &
mpiexec -n 1 ./membrane membrane_step_14.h5 &
mpiexec -n 1 ./membrane membrane_step_15.h5 &

+---------------------------------------------+------------+------------+
| Total wallclock time elapsed since start    |      25.8s |            |
|                                             |            |            |
| Section                         | no. calls |  wall time | % of total |
+---------------------------------+-----------+------------+------------+
| assembly                        |         1 |      24.9s |        96% |
| output                          |         1 |  3.14e-06s |         0% |
| setup                           |         1 |     0.343s |       1.3% |
| solve                           |         1 |     0.539s |       2.1% |
+---------------------------------+-----------+------------+------------+
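
In case it helps to narrow this down: one thing I could check is whether 
mpiexec pins every job to the same core. Assuming Open MPI is in use, the 
binding could be inspected or disabled with something like the following 
(these flags are Open MPI specific, just an idea):

mpiexec --report-bindings -n 1 ./membrane membrane_step_0.h5
mpiexec --bind-to none -n 1 ./membrane membrane_step_0.h5 &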

Thanks!
Daniel
