[petsc-users] On what condition is useful MPI-based solution?

2010-01-14 Thread Jose E. Roman

On 14/01/2010, at 01:04, Takuya Sekikawa wrote:

 Dear SLEPc/PETSc team,
 
 On Wed, 13 Jan 2010 09:50:57 +0100
 Jose E. Roman jroman at dsic.upv.es wrote:
 
 [Q2]
 In general, is MPI only useful for very large matrices?
 I now have to solve an eigenvalue problem for a 1M x 1M matrix.
 Should I use an MPI-based system?
 
 For a 1 million matrix I would suggest running in parallel on an MPI cluster. 
 However, a single computer might be enough if the matrix is very sparse, you 
 need very few eigenvalues, and/or the system has enough memory (but in that 
 case, be prepared for very long response times, depending on how your 
 problem converges).
 Jose
 
 Do you have any examples of how many PCs are needed to solve a problem
 of this size, and how much memory each PC should have?
 
 I would like to know how many resources I need (PCs, memory)
 and how long the solve takes. (A rough estimate is enough.)
 
 The problem is a 1M x 1M symmetric sparse matrix, and I need only a few
 eigenpairs (at least 1), so I currently plan to use the Lanczos or
 Krylov-Schur method with EPS_NEV=1.

For nev=1, the workspace used by the solver is moderate: maybe 20 vectors of 
length 1M (i.e. 160 Mbytes at 8 bytes per entry). If the matrix is really 
sparse, say 30 nonzero elements per row, then the matrix is not a problem 
either (roughly 364 Mbytes). So one PC may be enough. If the matrix is much 
denser, or you have convergence problems, or you need to do shift-and-invert, 
then things get worse.
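
The two memory figures above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes 8-byte double-precision values, 4-byte integer indices, and a compressed sparse row layout (values, column indices, row pointers). None of these details are stated in the message, and the function names are made up for the example.

```python
# Rough memory estimate for the figures quoted above.
# Assumptions (not stated in the thread): 8-byte values, 4-byte indices,
# and a compressed sparse row (CSR) storage layout.

def workspace_bytes(n, num_vectors, value_bytes=8):
    """Memory for num_vectors dense vectors of length n."""
    return n * num_vectors * value_bytes

def csr_matrix_bytes(n, nnz_per_row, value_bytes=8, index_bytes=4):
    """Memory for an n x n CSR matrix: values, column indices, row pointers."""
    nnz = n * nnz_per_row
    return nnz * (value_bytes + index_bytes) + (n + 1) * index_bytes

n = 1_000_000
vec_mb = workspace_bytes(n, 20) / 1e6   # 20 work vectors
mat_mb = csr_matrix_bytes(n, 30) / 1e6  # 30 nonzeros per row
print(f"workspace: {vec_mb:.0f} MB, matrix: {mat_mb:.0f} MB")
# prints "workspace: 160 MB, matrix: 364 MB"
```

Changing `nnz_per_row` or `num_vectors` shows how quickly a denser matrix or a larger subspace would outgrow a single PC.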

The execution time basically depends on convergence. For instance, ex1 with 
n=1M will have very bad convergence, but your problem may not. Run with 
-eps_monitor or -eps_monitor_draw to see how the solver is progressing.

Jose


 
 Takuya
 ---
   Takuya Sekikawa
 Mathematical Systems, Inc
sekikawa at msi.co.jp
 ---
 
 



[petsc-users] On what condition is useful MPI-based solution?

2010-01-13 Thread Takuya Sekikawa
Dear SLEPc/PETSc team,

I tried running SLEPc's sample program ex1 in an MPI-based multi-PC
environment (1 PC vs. 2 PCs).

(ex)
[running ex1 in only 1 PC]
--
$ time -p mpiexec -n 1 ./ex1 -n 4000 -eps_max_it 1
...
real 76.2
--
That is, ex1 took 76.2 seconds.

Next, I ran the same example in a 2-PC environment.

[running ex1 in 2 PCs]
--
$ time -p mpiexec -n 2 ./ex1 -n 4000 -eps_max_it 1
...
real 265.54
--
I got 265.54 seconds (slower than on a single PC).

[Q1]
Can the ex1 example be sped up with MPI? If so, under what conditions in general?

[Q2]
In general, is MPI only useful for very large matrices?
I now have to solve an eigenvalue problem for a 1M x 1M matrix.
Should I use an MPI-based system?

Thanks,
Takuya
---
  Takuya Sekikawa
 Mathematical Systems, Inc
sekikawa at msi.co.jp
---



[petsc-users] On what condition is useful MPI-based solution?

2010-01-13 Thread Takuya Sekikawa
Hi Jose,
Thank you for very quick reply.

On Wed, 13 Jan 2010 09:50:57 +0100
Jose E. Roman jroman at dsic.upv.es wrote:

 On 13/01/2010, Takuya Sekikawa wrote:
 
  Dear SLEPc/PETSc team,
  
  I tried running SLEPc's sample program ex1 in an MPI-based multi-PC
  environment (1 PC vs. 2 PCs).
  
  (ex)
  [running ex1 in only 1 PC]
  --
  $ time -p mpiexec -n 1 ./ex1 -n 4000 -eps_max_it 1
  ...
  real 76.2
  --
  That is, ex1 took 76.2 seconds.
  
  Next, I ran the same example in a 2-PC environment.
  
  [running ex1 in 2 PCs]
  --
  $ time -p mpiexec -n 2 ./ex1 -n 4000 -eps_max_it 1
  ...
  real 265.54
  --
  I got 265.54 seconds (slower than on a single PC).
  
  [Q1]
  Can the ex1 example be sped up with MPI? If so, under what conditions in general?
 
 Yes. The same example on my desktop computer (Intel Core 2 Duo):
 With -n 1 -- real 33.99
 With -n 2 -- real 21.90
 If you simply have two PCs connected via a slow network, then you cannot 
 expect good speedup. Try on a cluster with a fast network.
 On the other hand, a better way to measure the parallel execution time is to 
 edit the source file and put PetscGetTime around the Solve call. 

Thank you for benchmarking in your environment.
That result is quite useful for me.
My current test network is certainly not fast.
That is probably the problem. Thank you.
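
To compare the two sets of timings quoted above, it helps to convert them into speedup (one-process time divided by two-process time) and parallel efficiency (speedup divided by the number of processes). The snippet below is a small sketch using only the `real` times reported in this thread; the function names are illustrative. These are whole-program wall-clock times, so they include startup and not only the solve.

```python
def speedup(t_serial, t_parallel):
    """Ratio of one-process wall time to multi-process wall time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, nprocs):
    """Speedup normalized by the number of processes (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / nprocs

# (time with -n 1, time with -n 2), in seconds, as reported in the thread
runs = {
    "two PCs, slow network": (76.2, 265.54),
    "one dual-core desktop": (33.99, 21.90),
}

for label, (t1, t2) in runs.items():
    s = speedup(t1, t2)
    e = efficiency(t1, t2, 2)
    print(f"{label}: speedup {s:.2f}, efficiency {e:.2f}")
```

A speedup below 1 (about 0.29 for the two-PC run) means the parallel run is slower than the sequential one, while the dual-core desktop reaches roughly 1.55.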

  [Q2]
  In general, is MPI only useful for very large matrices?
  I now have to solve an eigenvalue problem for a 1M x 1M matrix.
  Should I use an MPI-based system?
 
 For a 1 million matrix I would suggest running in parallel on an MPI cluster. 
 However, a single computer might be enough if the matrix is very sparse, you 
 need very few eigenvalues, and/or the system has enough memory (but in that 
 case, be prepared for very long response times, depending on how your problem 
 converges).

This advice is also very helpful to me. Thank you so much!

Takuya

 Jose
 
  
  
 

---
Takuya Sekikawa
 Mathematical Systems, Inc
sekikawa at msi.co.jp
---




[petsc-users] On what condition is useful MPI-based solution?

2010-01-13 Thread Takuya Sekikawa
I have a few more questions, just to be sure.

[Q1]
In general, as with ex1, is no change to the source code needed to
run a program on an MPI-based system?
In other words, to run a SLEPc-based program on an MPI-based system,
is all I need to do to change the ./configure options (--with-mpi=1, etc.)
and recompile (without changing the source)?

[Q2]
Can I use the Intel compiler (icc/icpc) for an MPI-based SLEPc application?
(The Intel compiler is really fast, so I want to use it on an MPI-based
system too.)

Takuya





[petsc-users] On what condition is useful MPI-based solution?

2010-01-13 Thread Jose E. Roman

On 13/01/2010, Takuya Sekikawa wrote:

 I have a few more questions, just to be sure.

 [Q1]
 In general, as with ex1, is no change to the source code needed to
 run a program on an MPI-based system?
 In other words, to run a SLEPc-based program on an MPI-based system,
 is all I need to do to change the ./configure options (--with-mpi=1, etc.)
 and recompile (without changing the source)?

No change in the program. Just enable MPI in the installation.


 [Q2]
 Can I use the Intel compiler (icc/icpc) for an MPI-based SLEPc application?
 (The Intel compiler is really fast, so I want to use it on an MPI-based
 system too.)

 Takuya

Yes, no problem.



[petsc-users] On what condition is useful MPI-based solution?

2010-01-13 Thread Jed Brown
On Wed, 13 Jan 2010 09:50:57 +0100, Jose E. Roman jroman at dsic.upv.es 
wrote:
 On the other hand, a better way to measure the parallel execution time
 is to edit the source file and put PetscGetTime around the Solve call.

Or use -log_summary

Jed


[petsc-users] On what condition is useful MPI-based solution?

2010-01-13 Thread Satish Balay
On Wed, 13 Jan 2010, Jose E. Roman wrote:

 
 On 13/01/2010, Takuya Sekikawa wrote:
 
  I have a few more questions, just to be sure.
  
  [Q1]
  In general, as with ex1, is no change to the source code needed to
  run a program on an MPI-based system?
  In other words, to run a SLEPc-based program on an MPI-based system,
  is all I need to do to change the ./configure options (--with-mpi=1, etc.)
  and recompile (without changing the source)?
 
 No change in the program. Just enable MPI in the installation.

Note that the examples are written to be parallel - and run with or
without MPI. You cannot take any random sequential code, compile it
with MPI - and expect it to run in parallel.

Even though SLEPc/PETSc hide most of the MPI-related stuff from the
user - there is generally some user code that should be MPI-aware. [In
ex1 - the matrix assembly is written to be MPI-aware - and the data is
distributed and assembled in parallel.]
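
As a rough illustration of what "MPI-aware assembly" means, the sketch below simulates it in plain Python without MPI: each rank is given a contiguous block of rows and generates entries only for the rows it owns. Everything here is an assumption made for illustration: the function names are hypothetical, the matrix is taken to be a 1-D Laplacian (2 on the diagonal, -1 next to it), and the even block split is just one plausible rule. In a real PETSc program the owned range would be queried with MatGetOwnershipRange and the entries inserted with MatSetValues.

```python
def ownership_range(n, size, rank):
    """Contiguous block of rows [start, end) owned by `rank` out of `size` ranks.

    Rows are split as evenly as possible; the first n % size ranks get one extra.
    """
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end

def local_laplacian_rows(n, size, rank):
    """(row, col, value) entries of a 1-D Laplacian for the rows this rank owns."""
    start, end = ownership_range(n, size, rank)
    entries = []
    for i in range(start, end):
        if i > 0:
            entries.append((i, i - 1, -1.0))
        entries.append((i, i, 2.0))
        if i < n - 1:
            entries.append((i, i + 1, -1.0))
    return entries

# Simulate 3 ranks assembling a 10 x 10 matrix; no rank touches another's rows.
n, size = 10, 3
for rank in range(size):
    rows = ownership_range(n, size, rank)
    print(f"rank {rank}: rows {rows}, {len(local_laplacian_rows(n, size, rank))} entries")
```

A sequential code that loops over all rows on every process would not have this structure, which is why it cannot simply be recompiled with MPI.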

  
  [Q2]
  Can I use the Intel compiler (icc/icpc) for an MPI-based SLEPc application?
  (The Intel compiler is really fast, so I want to use it on an MPI-based
  system too.)
 
 Yes,

Note: Faster compilers speed up the sequential part of the code [hence the
overall runtime for the parallel run as well]. But your bottleneck is
MPI/communication cost - which won't change. So communication takes an even
larger share of the runtime, and your parallel scalability will look even worse.
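
A toy cost model makes this effect concrete. In the sketch below, the compute time divides evenly across processes while the communication cost is a fixed amount that a faster compiler does not change. All numbers are hypothetical and chosen only to show the trend; they are not measurements from this thread.

```python
def run_time(t_compute, t_comm, nprocs):
    """Toy model: compute splits evenly; communication is a fixed cost if nprocs > 1."""
    return t_compute / nprocs + (t_comm if nprocs > 1 else 0.0)

def speedup(t_compute, t_comm, nprocs):
    """Speedup of the nprocs run relative to the one-process run."""
    return run_time(t_compute, t_comm, 1) / run_time(t_compute, t_comm, nprocs)

t_comm = 10.0  # hypothetical communication cost in seconds, unaffected by the compiler
for label, t_compute in [("baseline compiler", 60.0), ("faster compiler", 40.0)]:
    t2 = run_time(t_compute, t_comm, 2)
    s2 = speedup(t_compute, t_comm, 2)
    print(f"{label}: 2-process time {t2:.1f} s, speedup {s2:.2f}")
```

In this model the faster compiler lowers the two-process runtime from 40 s to 30 s, yet the measured speedup drops from 1.50 to 1.33, because the unchanged communication cost is now a larger fraction of the total.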

To improve parallel scalability - you'll have to get the fastest
network you can - between the nodes. And then install an MPI
implementation that performs well on the given network setup.

[and as Jed mentioned -log_summary is a better tool to compare performance]

Satish