[Meep-discuss] MPI efficiency

2009-01-12 Thread Nizamov Shawkat
Dear Meep users!

Lately I have been trying to implement a Pythonic interface for libmeep. My
current goal is enabling the MPI feature of libmeep from Python scripts. While
enabling MPI is as straightforward as calling mpi_init with the correct
arguments, the actual MPI acceleration does not meet my expectations. I also
did some tests with meep-mpi and C++ programs and found that the results are
essentially the same: MPI efficiency is lower than expected. With MPICH, the
overall calculation speed on a dual-core Pentium was actually slower than on a
single core (MPI communication took far too much time). With Open MPI, things
got better. My simple tests, performed with meep-mpi (Scheme), the compiled
C++ test files (available in the Meep source) and the Pythonic interface, all
show approximately 20-30% acceleration (at best 40% in the time-stepping code,
see below) when comparing both cores of a dual-core Pentium against a single
core on the same PC. My expectation for such an SMP system was much higher:
about 75-80%.
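For reference, this is roughly what the MPI setup looks like through the C++
interface; the sketch below is only illustrative (the eps function, source
position and run time are placeholders, and the C++ interface measures
coordinates from a corner of the cell rather than from its center):

#include <meep.hpp>
using namespace meep;

// placeholder dielectric function: vacuum everywhere
double eps(const vec &p) { (void)p; return 1.0; }

int main(int argc, char **argv) {
  initialize mpi(argc, argv);       // wraps MPI_Init; harmless in a serial build
  grid_volume v = vol2d(16, 8, 10); // 16x8 cell at resolution 10, as in 1.ctl below
  structure s(v, eps, pml(1.0));    // the grid is divided among MPI processes here
  fields f(&s);
  continuous_src_time src(0.15);
  f.add_point_source(Ez, src, vec(1.0, 4.0)); // illustrative source position
  while (f.time() < 200) f.step();  // each step exchanges chunk boundaries over MPI
  f.output_hdf5(Ez, v.surroundings());
  return 0;                         // ~initialize calls MPI_Finalize
}

Presumably the Python wrapper just has to construct the same meep::initialize
object (i.e. call MPI_Init) before any structure or fields objects are created.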

So the first question is: what MPI efficiency do you observe on your systems?
Can someone provide a simple benchmark (ctl preferred) that does not take hours
to run on a desktop PC and clearly demonstrates the advantage of MPI?

Next question: there is a special argument, num_chunks, in the structure class.
Is it supposed to be the number of available processor cores, so that the
calculation domain is optimally split among the nodes?

And the last: are there any hints for using Meep with MPI support?

With best regards
Nizamov Shawkat





Supplementary:

 cat 1.ctl
(set! geometry-lattice (make lattice (size 16 8 no-size)))
(set! geometry (list
                (make block (center 0 0) (size infinity 1 infinity)
                      (material (make dielectric (epsilon 12))))))
(set! sources (list
               (make source
                 (src (make continuous-src (frequency 0.15)))
                 (component Ez)
                 (center -7 0))))
(set! pml-layers (list (make pml (thickness 1.0))))
(set! resolution 10)
(run-until 2000
   (at-beginning output-epsilon)
   (at-end output-efield-z))


 mpirun -np 1 /usr/bin/meep-mpi 1.ctl
Using MPI version 2.0, 1 processes
---
Initializing structure...
Working in 2D dimensions.
 block, center = (0,0,0)
  size (1e+20,1,1e+20)
  axes (1,0,0), (0,1,0), (0,0,1)
  dielectric constant epsilon = 12
time for set_epsilon = 0.054827 s
---
creating output file ./1-eps-00.00.h5...
Meep progress: 230.95/2000.0 = 11.5% done in 4.0s, 30.6s to go
on time step 4625 (time=231.25), 0.000865026 s/step
Meep progress: 468.55/2000.0 = 23.4% done in 8.0s, 26.2s to go
on time step 9378 (time=468.9), 0.000841727 s/step
Meep progress: 705.8/2000.0 = 35.3% done in 12.0s, 22.0s to go
on time step 14123 (time=706.15), 0.000843144 s/step
Meep progress: 943.35/2000.0 = 47.2% done in 16.0s, 17.9s to go
on time step 18874 (time=943.7), 0.000841985 s/step
Meep progress: 1181.4/2000.0 = 59.1% done in 20.0s, 13.9s to go
on time step 23635 (time=1181.75), 0.00084028 s/step
Meep progress: 1418.85/2000.0 = 70.9% done in 24.0s, 9.8s to go
on time step 28384 (time=1419.2), 0.000842386 s/step
Meep progress: 1654.05/2000.0 = 82.7% done in 28.0s, 5.9s to go
on time step 33088 (time=1654.4), 0.000850374 s/step
Meep progress: 1891.5/2000.0 = 94.6% done in 32.0s, 1.8s to go
on time step 37837 (time=1891.85), 0.000842369 s/step
creating output file ./1-ez-002000.00.h5...
run 0 finished at t = 2000.0 (4 timesteps)

Elapsed run time = 33.9869 s

 mpirun -np 2 /usr/bin/meep-mpi 1.ctl
Using MPI version 2.0, 2 processes
---
Initializing structure...
Working in 2D dimensions.
 block, center = (0,0,0)
  size (1e+20,1,1e+20)
  axes (1,0,0), (0,1,0), (0,0,1)
  dielectric constant epsilon = 12
time for set_epsilon = 0.0299381 s
---
creating output file ./1-eps-00.00.h5...
Meep progress: 328.85/2000.0 = 16.4% done in 4.0s, 20.4s to go
on time step 6577 (time=328.85), 0.00060946 s/step
Meep progress: 656.1/2000.0 = 32.8% done in 8.0s, 16.4s to go
on time step 13123 (time=656.15), 0.000611187 s/step
Meep progress: 985.9/2000.0 = 49.3% done in 12.0s, 12.4s to go
on time step 19719 (time=985.95), 0.000606462 s/step
Meep progress: 1315.1/2000.0 = 65.8% done in 16.0s, 8.3s to go
on time step 26302 (time=1315.1), 0.000608525 s/step
Meep progress: 1644.95/2000.0 = 82.2% done in 20.0s, 4.3s to go
on time step 32911 (time=1645.55), 0.000605277 s/step
Meep progress: 1975.0/2000.0 = 98.8% done in 24.0s, 0.3s to go
on time step 39512 (time=1975.6), 0.000606022 s/step
creating output file ./1-ez-002000.00.h5...
run 0 finished at t = 2000.0 (4 timesteps)

Elapsed run time = 24.57 s


For the Python script (skipping harmless warnings like "[ubuntu:24234] mca:
base: component_find: unable to open osc pt2pt: file not found (ignored)"):

 mpirun -np 1 ./test-tut1-mpi.py
Using MPI version 2.0, 1 processes
Count processors: 1 My rank is 0
time for set_epsilon = 0.696046 s
creating output file

Re: [Meep-discuss] MPI efficiency

2009-01-12 Thread Benjamin M. Schwartz
Nizamov Shawkat wrote:
 So the first question is - what mpi efficiency do you observe on your
 systems?

I have observed excellent acceleration.  Dual-core systems have been
near-linear, and running on 8 machines with a GigE interconnect I got
about 6x acceleration.

The important thing is to make sure that your problem size is large
enough.  The website suggests that MPI is only useful for 3D problems, and
I would add that it is best for problems that are at least 100 MB in size.
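To put that in perspective for the 1.ctl benchmark quoted above, a
back-of-envelope estimate (assuming, very roughly, something like fifteen
double-precision arrays per pixel; the real number depends on which fields and
features are in use) lands far below that threshold:

#include <cstdio>
int main() {
  double nx = 16 * 10, ny = 8 * 10;  // 16x8 cell at resolution 10 -> 160x80 pixels
  double bytes_per_pixel = 15 * 8;   // rough guess: ~15 doubles per pixel
  double mbytes = nx * ny * bytes_per_pixel / 1e6;
  std::printf("approx. problem size: %.1f MB\n", mbytes); // about 1.5 MB
  return 0;
}

With a problem that small, the communication overhead per time step is
relatively large, which would be consistent with the modest 20-40% speed-up
reported above.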

If your problem size is already very large, then it's possible that there
is a misconfiguration in your system.

--Ben





[Meep-discuss] To get the value of the E-field in SI units

2009-01-12 Thread Yan Liu
Hello everyone,
First, I am not very sure how the E-field is normalized in Meep. In the FDTD
method, the E-field is often scaled by sqrt(epsilon_0/mu_0), where
epsilon_0 = 8.85e-12 and mu_0 = 1.2566e-6. I wonder how to convert Meep's
E-field output into SI values (V/m).
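(For what it is worth, plugging in those values gives
sqrt(epsilon_0/mu_0) = sqrt(8.85e-12 / 1.2566e-6) ≈ 2.65e-3 S, i.e. the
inverse of the free-space impedance Z_0 ≈ 376.7 ohms; whether that factor is
the right bridge between Meep's dimensionless units and V/m is precisely what
I am unsure about.)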

Second, in this link
http://www.mail-archive.com/meep-discuss@ab-initio.mit.edu/msg00386.html
Steven said that we need to calibrate the 'amplitude' by the following step:

Suppose you want to input 5 W of power into a waveguide (or whatever). What you
should do is put in a source with some amplitude A and measure the power P that
you get in your waveguide (using one of Meep's flux functions). Then, to make
the power 5, you should scale A by sqrt(5/P).
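Since the injected power scales as the square of the source amplitude, the
rescaling step itself is plain arithmetic; a minimal sketch (the measured power
P below is a made-up placeholder for whatever one of Meep's flux functions
returns, and the Meep-specific measurement itself is not shown):

#include <cmath>
#include <cstdio>

int main() {
  double A = 1.0;        // trial amplitude used in the calibration run
  double P = 0.37;       // power measured in the waveguide with amplitude A (made up)
  double P_target = 5.0; // desired input power
  double A_new = A * std::sqrt(P_target / P); // power is proportional to amplitude^2
  std::printf("rescaled amplitude: %g\n", A_new);
  return 0;
}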

I wonder how I can do this with the C++ interface?

Best Regards!