Nizamov Shawkat wrote:
Dear Meep users!
Lately I have been trying to implement a Pythonic interface to libmeep. My
current goal is to enable the MPI features of libmeep from Python scripts.
While enabling MPI is as straightforward as calling the MPI initialization
with the correct arguments, the actual MPI speed-up does not meet my
expectations. I ran some tests with meep-mpi and with C++ programs as well,
and found that the results are essentially the same: MPI efficiency is
lower than expected. With MPICH, the overall calculation speed on a
dual-core Pentium was actually slower than on a single core (the MPI
interconnect took far too much time). With OpenMPI things got better. My
simple tests, performed with meep-mpi (Scheme), C++ test programs compiled
from the Meep sources, and the Pythonic interface, all show roughly 20-30%
speed-up (at best 40% in the time-stepping code, see below) when comparing
a dual-core Pentium to a single core on the same PC. My expectation for
such an SMP system was much higher, around 75-80%.
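For reference, here is roughly what the initialization looks like on the
C++ side (my Python wrapper ends up calling the same thing). This is only
a minimal sketch: the my_eps() function, the source position and the run
time are illustrative placeholders, not my actual test case.

#include <math.h>
#include <meep.hpp>
using namespace meep;

// Placeholder dielectric: eps = 12 inside a 1-unit-thick slab, vacuum elsewhere.
double my_eps(const vec &p) {
  return (fabs(p.y() - 4.0) < 0.5) ? 12.0 : 1.0;
}

int main(int argc, char **argv) {
  initialize mpi(argc, argv);        // brings MPI up; finalized when mpi goes out of scope
  grid_volume v = vol2d(16, 8, 10);  // 16x8 cell at resolution 10
  structure s(v, my_eps, pml(1.0));  // the cell is divided among the MPI processes here
  fields f(&s);
  f.add_point_source(Ez, continuous_src_time(0.15), vec(1.0, 4.0));
  while (f.time() < 200)
    f.step();
  return 0;
}

The point is simply that the single initialize line is all that is needed
to bring MPI up; everything after it is the usual libmeep setup.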
So the first question is: what MPI efficiency do you observe on your
systems? Can someone provide a simple benchmark (ctl preferred) that does
not take hours to run on a desktop PC and clearly demonstrates the
advantage of MPI?
Next question: there is a special argument, num_chunks, in the structure
class. Is it supposed to be the number of available processor cores, so
that the calculation domain is split optimally among the nodes?
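For what it's worth, in the C++ interface I pass num_chunks directly to
the structure constructor. My reading of meep.hpp is that it comes after
the boundary region and symmetry arguments and defaults to 0 (let Meep
decide), but please correct me if the sketch below gets the signature
wrong.

#include <meep.hpp>
using namespace meep;

// Trivial vacuum epsilon, just enough to make the sketch compile and run.
double my_eps(const vec &p) {
  (void)p;
  return 1.0;
}

int main(int argc, char **argv) {
  initialize mpi(argc, argv);
  grid_volume v = vol2d(16, 8, 10);
  int num_chunks = 2;  // my guess: one chunk per core; the default 0 lets Meep decide
  structure s(v, my_eps, pml(1.0), identity(), num_chunks);
  fields f(&s);
  while (f.time() < 10)
    f.step();
  return 0;
}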
And the last one: are there any hints for using Meep with MPI support?
With best regards
Nizamov Shawkat
Supplementary:
cat 1.ctl
(set! geometry-lattice (make lattice (size 16 8 no-size)))
(set! geometry (list
  (make block (center 0 0) (size infinity 1 infinity)
    (material (make dielectric (epsilon 12))))))
(set! sources (list
  (make source
    (src (make continuous-src (frequency 0.15)))
    (component Ez)
    (center -7 0))))
(set! pml-layers (list (make pml (thickness 1.0))))
(set! resolution 10)
(run-until 2000
  (at-beginning output-epsilon)
  (at-end output-efield-z))
mpirun -np 1 /usr/bin/meep-mpi 1.ctl
Using MPI version 2.0, 1 processes
---
Initializing structure...
Working in 2D dimensions.
block, center = (0,0,0)
size (1e+20,1,1e+20)
axes (1,0,0), (0,1,0), (0,0,1)
dielectric constant epsilon = 12
time for set_epsilon = 0.054827 s
---
creating output file ./1-eps-00.00.h5...
Meep progress: 230.95/2000.0 = 11.5% done in 4.0s, 30.6s to go
on time step 4625 (time=231.25), 0.000865026 s/step
Meep progress: 468.55/2000.0 = 23.4% done in 8.0s, 26.2s to go
on time step 9378 (time=468.9), 0.000841727 s/step
Meep progress: 705.8/2000.0 = 35.3% done in 12.0s, 22.0s to go
on time step 14123 (time=706.15), 0.000843144 s/step
Meep progress: 943.35/2000.0 = 47.2% done in 16.0s, 17.9s to go
on time step 18874 (time=943.7), 0.000841985 s/step
Meep progress: 1181.4/2000.0 = 59.1% done in 20.0s, 13.9s to go
on time step 23635 (time=1181.75), 0.00084028 s/step
Meep progress: 1418.85/2000.0 = 70.9% done in 24.0s, 9.8s to go
on time step 28384 (time=1419.2), 0.000842386 s/step
Meep progress: 1654.05/2000.0 = 82.7% done in 28.0s, 5.9s to go
on time step 33088 (time=1654.4), 0.000850374 s/step
Meep progress: 1891.5/2000.0 = 94.6% done in 32.0s, 1.8s to go
on time step 37837 (time=1891.85), 0.000842369 s/step
creating output file ./1-ez-002000.00.h5...
run 0 finished at t = 2000.0 (4 timesteps)
Elapsed run time = 33.9869 s
mpirun -np 2 /usr/bin/meep-mpi 1.ctl
Using MPI version 2.0, 2 processes
---
Initializing structure...
Working in 2D dimensions.
block, center = (0,0,0)
size (1e+20,1,1e+20)
axes (1,0,0), (0,1,0), (0,0,1)
dielectric constant epsilon = 12
time for set_epsilon = 0.0299381 s
---
creating output file ./1-eps-00.00.h5...
Meep progress: 328.85/2000.0 = 16.4% done in 4.0s, 20.4s to go
on time step 6577 (time=328.85), 0.00060946 s/step
Meep progress: 656.1/2000.0 = 32.8% done