Re: [petsc-users] MUMPS error

2015-05-24 Thread venkatesh g
I am using the MatLoad() option, as in the ex7.c code provided by SLEPc.
ierr = MatLoad(A,viewer);CHKERRQ(ierr);


There is no problem here, right? Or is any additional option required for
very large matrices when running the eigensolver in parallel?
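
For reference, the full loading pattern around that call looks roughly like
this (a sketch based on the usual PETSc binary-viewer idiom; the filename
"matrix.dat" is a placeholder, not from this thread):

  Mat            A;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  /* Open the binary file collectively; MatLoad() then distributes the
     rows of the matrix across all processes in the communicator */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.dat",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* honors -mat_type, e.g. aij */
  ierr = MatLoad(A,viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);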

cheers,
Venkatesh

On Sat, May 23, 2015 at 5:43 PM, Matthew Knepley knep...@gmail.com wrote:

 On Sat, May 23, 2015 at 7:09 AM, venkatesh g venkateshg...@gmail.com
 wrote:

 Hi,
 Thanks.
 Per node it has 24 cores, and each core has 4 GB of RAM. The job was
 submitted on 10 nodes.

 So, does it mean it requires 10 GB for one core, or for one node?


 The error message from MUMPS said that it tried to allocate 10G. We must
 assume each process
 tried to do the same thing. That means if you scheduled 24 processes on a
 node, it would try to
 allocate at least 240G, which is in excess of what you specify above.

 Note that this has nothing to do with PETSc. It is all in the
 documentation for that machine and its
 scheduling policy.
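
 For illustration (assuming the machine uses Cray's aprun launcher, which
 is an assumption about this cluster), you can cap the processes per node
 so each process has more memory to work with:

   aprun -n 80 -N 8 ./ex7 ...

 With 24 cores * 4 GB = 96 GB per node, 8 processes per node leaves about
 12 GB per process, enough for the 10G allocation MUMPS attempted.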

   Thanks,

  Matt


 cheers,

 Venkatesh

 On Sat, May 23, 2015 at 5:17 PM, Matthew Knepley knep...@gmail.com
 wrote:

 On Sat, May 23, 2015 at 6:44 AM, venkatesh g venkateshg...@gmail.com
 wrote:

 Hi,
 The same eigenproblem runs with 120 GB of RAM on a serial machine in
 Matlab.

 On the Cray I ran with 240 cores * 4 GB of RAM in parallel. So it should
 fit, right?


 I do not know how MUMPS allocates memory, but the message is
 unambiguous. Also,
 this is concerned with the memory available per node. Do you know how
 many processes
 per node were scheduled? The message below indicates that it was trying
 to allocate 10G
 for one process.


 And for small matrices it shows negative scaling, i.e., the 24-core run
 is faster.


 Yes, for strong scaling you always get slowdown eventually since
 overheads dominate
 work, see Amdahl's Law.
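
 For reference, Amdahl's Law for a code with parallel fraction p run on N
 processes (a standard formula, not specific to this run):

   S(N) = 1 / ((1 - p) + p/N),  so  S(N) -> 1/(1 - p) as N -> infinity.

 Once the serial part (1 - p) dominates, adding cores stops helping, and
 communication overhead can then make runs slower.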

   Thanks,

  Matt


 I have attached the submission script.

 Please take a look and kindly let me know.

 cheers,
 Venkatesh


 On Sat, May 23, 2015 at 4:58 PM, Matthew Knepley knep...@gmail.com
 wrote:

 On Sat, May 23, 2015 at 2:39 AM, venkatesh g venkateshg...@gmail.com
 wrote:

 Hi again,

 I have installed PETSc and SLEPc on the Cray with the Intel compilers
 and MUMPS.

 I am getting this error when I solve an eigenvalue problem with large
 matrices: [201]PETSC ERROR: Error reported by MUMPS in numerical
 factorization phase: Cannot allocate required memory 9632 megabytes


 It ran out of memory on the node.


 Also it is again not scaling well for small matrices.


 MUMPS strong scaling for small matrices is not very good. Weak scaling
 means looking at big matrices as the process count grows.

   Thanks,

  Matt


 Kindly let me know what to do.

 cheers,

 Venkatesh


 On Tue, May 19, 2015 at 3:02 PM, Matthew Knepley knep...@gmail.com
 wrote:

 On Tue, May 19, 2015 at 1:04 AM, venkatesh g 
 venkateshg...@gmail.com wrote:

 Hi,

 I have attached the log of the command which I gave in the master
 node: make streams NPMAX=32

 I don't know why it says 'It appears you have only 1 node', since other
 codes run in parallel with good scaling on 8 nodes.


 If you look at the STREAMS numbers, you can see that your system is
 only able to support about 2 cores with the
 available memory bandwidth. Thus for bandwidth constrained
 operations (almost everything in sparse linear algebra
 and solvers), your speedup will not be bigger than 2.

 Other codes may do well on this machine, but they would be compute
 constrained, using things like DGEMM.

   Thanks,

  Matt


 Kindly let me know.

 Venkatesh



 On Mon, May 18, 2015 at 11:21 PM, Barry Smith bsm...@mcs.anl.gov
 wrote:


Run the streams benchmark on this system and send the results.
 http://www.mcs.anl.gov/petsc/documentation/faq.html#computers


  On May 18, 2015, at 11:14 AM, venkatesh g 
 venkateshg...@gmail.com wrote:
 
  Hi,
  I have emailed the mumps-user list.
  Actually the cluster has 8 nodes with 16 cores each, and other codes
 scale well.
  I wanted to ask: since this job takes a long time, if I submit on
 more cores I have to increase icntl(14), which would again take a
 long time.
 
  So is there another way ?
 
  cheers,
  Venkatesh
 
  On Mon, May 18, 2015 at 7:16 PM, Matthew Knepley 
 knep...@gmail.com wrote:
  On Mon, May 18, 2015 at 8:29 AM, venkatesh g 
 venkateshg...@gmail.com wrote:
  Hi, I have attached the performance logs for 2 jobs on different numbers
 of processors. I had to increase the workspace parameter icntl(14) when I
 submit on more cores, since it fails with a small value of icntl(14) (a
 programmatic alternative is sketched after the list below).
 
  1. performance_log1.txt is run on 8 cores (option given
 -mat_mumps_icntl_14 200)
  2. performance_log2.txt is run on 2 cores (option given
 -mat_mumps_icntl_14 85  )
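  
  For reference, ICNTL(14) can also be set from code rather than on the
 command line. A minimal sketch, assuming you can reach the KSP that holds
 the MUMPS factorization (in SLEPc, e.g., via EPSGetST()/STGetKSP(); the
 variable names are illustrative):

   PC  pc;
   Mat F;

   ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
   ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr);
   ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr); /* creates the factor matrix F */
   ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);
   ierr = MatMumpsSetIcntl(F,14,200);CHKERRQ(ierr); /* % increase of estimated workspace */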
 
  1) Your number of iterates increased from 7600 to 9600, but that
 is a relatively small effect
 
  2) MUMPS is just taking a lot longer to do the forward/backward
 solve. You might try emailing their mailing list. However, I would bet
 that your system has enough bandwidth for 2 procs and not much more.
 
   Thanks,

  Matt

Re: [petsc-users] MUMPS error

2015-05-24 Thread Matthew Knepley
On Sun, May 24, 2015 at 8:57 AM, venkatesh g venkateshg...@gmail.com
wrote:

 I am using the MatLoad() option, as in the ex7.c code provided by SLEPc.
 ierr = MatLoad(A,viewer);CHKERRQ(ierr);


 There is no problem here, right? Or is any additional option required for
 very large matrices when running the eigensolver in parallel?


This will load the matrix from the viewer (presumably disk). There are no
options for large matrices.

  Thanks,

  Matt


[petsc-users] Passing a context to TSPreStep()

2015-05-24 Thread Miguel Angel Salazar de Troya
Hi all,

I would like to record information about the failed steps in an adaptive
time-stepping algorithm. I have noticed that I could call TSPreStep() and
get the current time step, but I want to store it in an external data
structure. For TSMonitor(), one of the arguments is a context that I can
use to store some results. Is there a similar mechanism for TSPreStep()?
How could I implement it?

Thanks
Miguel

-- 
*Miguel Angel Salazar de Troya*
Graduate Research Assistant
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
(217) 550-2360
salaz...@illinois.edu
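
For what it's worth, one pattern for this (a sketch, untested; the struct
and function names are illustrative) is to attach the structure to the TS
itself with TSSetApplicationContext() and fetch it inside the pre-step
callback, since the callback receives the TS:

  typedef struct {
    PetscInt  nsteps;   /* number of times the pre-step hook fired */
    PetscReal last_dt;  /* step size about to be attempted */
  } PreStepCtx;

  PetscErrorCode MyPreStep(TS ts)
  {
    PreStepCtx    *ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = TSGetApplicationContext(ts,&ctx);CHKERRQ(ierr);
    ierr = TSGetTimeStep(ts,&ctx->last_dt);CHKERRQ(ierr);
    ctx->nsteps++;
    PetscFunctionReturn(0);
  }

  /* during setup */
  PreStepCtx ctx = {0, 0.0};
  ierr = TSSetApplicationContext(ts,&ctx);CHKERRQ(ierr);
  ierr = TSSetPreStep(ts,MyPreStep);CHKERRQ(ierr);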