Re: [petsc-users] create global vector in latest version of petsc

2016-10-05 Thread Hengjie Wang

Hi,

There is no error now. Thank you so much.

Frank

On 10/5/2016 8:50 PM, Barry Smith wrote:

   Sorry, as indicated in 
http://www.mcs.anl.gov/petsc/documentation/changes/dev.html, in order to get the 
previous behavior of DMDACreate3d() you need to follow it with the two lines:

DMSetFromOptions(da);
DMSetUp(da);
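
In Fortran (the test program in this thread is a .F90 file) the same sequence would 
look roughly like the sketch below. The boundary types, grid sizes, and variable 
names are illustrative only, not taken from Frank's program, and the usual PETSc 
Fortran include/use statements are assumed to be in scope:

      DM              da
      Vec             x
      PetscErrorCode  ierr

      ! Create the DMDA object; since the API change it is no longer set up here.
      call DMDACreate3d(PETSC_COMM_WORLD,                                      &
           DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC,   &
           DMDA_STENCIL_STAR, 64, 64, 64,                                      &
           PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1,                     &
           PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, da, ierr)
      call DMSetFromOptions(da, ierr)   ! new: pick up any -da_* runtime options
      call DMSetUp(da, ierr)            ! new: actually build the DMDA internals
      ! Only after DMSetUp() is it safe to ask the DM for vectors:
      call DMCreateGlobalVector(da, x, ierr)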

   Barry




On Oct 5, 2016, at 9:11 PM, Hengjie Wang <hengj...@uci.edu> wrote:

Hi,

I just tried .F90. It had the error. I attached the full error log.

Thank you.

Frank


On 10/5/2016 6:57 PM, Barry Smith wrote:

   PETSc Fortran programs should always end with .F90, not .f90. Can you try 
again with that name? The capital F is important.

   Barry


On Oct 5, 2016, at 7:57 PM, frank <hengj...@uci.edu> wrote:

Hi,

I updated PETSc to the latest version by pulling from the repo. Then I found that one 
of my old codes, which worked before, now outputs errors.
After debugging, I found that the error is caused by "DMCreateGlobalVector".
I attach a short program which can reproduce the error. This program works 
well with an older version of PETSc.
I also attach the script I used to configure PETSc.

The error message is below. Did I miss something in the installation? Thank 
you.

1 [0]PETSC ERROR: - Error Message 
--
  2 [0]PETSC ERROR: Null argument, when expecting valid pointer
  3 [0]PETSC ERROR: Null Object: Parameter # 2
  4 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
  5 [0]PETSC ERROR: Petsc Development GIT revision: v3.7.4-1571-g7fc5cb5  GIT 
Date: 2016-10-05 10:56:19 -0500
  6 [0]PETSC ERROR: [2]PETSC ERROR: ./test_ksp.exe on a gnu-dbg-32idx named 
kolmog1 by frank Wed Oct  5 17:40:07 2016
  7 [0]PETSC ERROR: Configure options --known-mpi-shared="0 " --known-memcmp-ok  --with-debugging="1 " --with-shared-libraries=0 
--with-mpi-compilers="1 " --download-blacs="1 " --download-metis="1 " --download-parmetis="1 " 
--download-superlu_dist="1 " --download-hypre=1 PETSC_ARCH=gnu-dbg-32idx
  8 [0]PETSC ERROR: #1 VecSetLocalToGlobalMapping() line 83 in 
/home/frank/petsc/src/vec/vec/interface/vector.c
  9 [0]PETSC ERROR: #2 DMCreateGlobalVector_DA() line 45 in 
/home/frank/petsc/src/dm/impls/da/dadist.c
10 [0]PETSC ERROR: #3 DMCreateGlobalVector() line 880 in 
/home/frank/petsc/src/dm/interface/dm.c


Regards,
Frank










Re: [petsc-users] create global vector in latest version of petsc

2016-10-05 Thread Hengjie Wang

Hi,

I just tried .F90. It had the error. I attached the full error log.

Thank you.

Frank


On 10/5/2016 6:57 PM, Barry Smith wrote:

   PETSc Fortran programs should always end with .F90, not .f90. Can you try 
again with that name? The capital F is important.

   Barry


On Oct 5, 2016, at 7:57 PM, frank  wrote:

Hi,

I updated PETSc to the latest version by pulling from the repo. Then I found that one 
of my old codes, which worked before, now outputs errors.
After debugging, I found that the error is caused by "DMCreateGlobalVector".
I attach a short program which can reproduce the error. This program works 
well with an older version of PETSc.
I also attach the script I used to configure PETSc.

The error message is below. Did I miss something in the installation? Thank 
you.

1 [0]PETSC ERROR: - Error Message 
--
  2 [0]PETSC ERROR: Null argument, when expecting valid pointer
  3 [0]PETSC ERROR: Null Object: Parameter # 2
  4 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
  5 [0]PETSC ERROR: Petsc Development GIT revision: v3.7.4-1571-g7fc5cb5  GIT 
Date: 2016-10-05 10:56:19 -0500
  6 [0]PETSC ERROR: [2]PETSC ERROR: ./test_ksp.exe on a gnu-dbg-32idx named 
kolmog1 by frank Wed Oct  5 17:40:07 2016
  7 [0]PETSC ERROR: Configure options --known-mpi-shared="0 " --known-memcmp-ok  --with-debugging="1 " --with-shared-libraries=0 
--with-mpi-compilers="1 " --download-blacs="1 " --download-metis="1 " --download-parmetis="1 " 
--download-superlu_dist="1 " --download-hypre=1 PETSC_ARCH=gnu-dbg-32idx
  8 [0]PETSC ERROR: #1 VecSetLocalToGlobalMapping() line 83 in 
/home/frank/petsc/src/vec/vec/interface/vector.c
  9 [0]PETSC ERROR: #2 DMCreateGlobalVector_DA() line 45 in 
/home/frank/petsc/src/dm/impls/da/dadist.c
10 [0]PETSC ERROR: #3 DMCreateGlobalVector() line 880 in 
/home/frank/petsc/src/dm/interface/dm.c


Regards,
Frank






[0]PETSC ERROR: - Error Message 
--
[0]PETSC ERROR: Null argument, when expecting valid pointer
[0]PETSC ERROR: Null Object: Parameter # 2
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.4-1571-g7fc5cb5  GIT Date: 
2016-10-05 10:56:19 -0500
[0]PETSC ERROR: [2]PETSC ERROR: ./test_ksp.exe on a gnu-dbg-32idx named kolmog1 
by frank Wed Oct  5 18:58:44 2016
[0]PETSC ERROR: Configure options --known-mpi-shared="0 " --known-memcmp-ok  
--with-debugging="1 " --with-shared-libraries=0 --with-mpi-compilers="1 " 
--download-blacs="1 " --download-metis="1 " --download-parmetis="1 " 
--download-superlu_dist="1 " --download-hypre=1 PETSC_ARCH=gnu-dbg-32idx
[0]PETSC ERROR: #1 VecSetLocalToGlobalMapping() line 83 in 
/home/frank/petsc/src/vec/vec/interface/vector.c
- Error Message 
--
[2]PETSC ERROR: Null argument, when expecting valid pointer
[2]PETSC ERROR: Null Object: Parameter # 2
[2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
[2]PETSC ERROR: Petsc Development GIT revision: v3.7.4-1571-g7fc5cb5  GIT Date: 
2016-10-05 10:56:19 -0500
[2]PETSC ERROR: ./test_ksp.exe on a gnu-dbg-32idx named kolmog1 by frank Wed 
Oct  5 18:58:44 2016
[2]PETSC ERROR: Configure options --known-mpi-shared="0 " --known-memcmp-ok  
--with-debugging="1 " --with-shared-libraries=0 --with-mpi-compilers="1 " 
--download-blacs="1 " --download-metis="1 " --download-parmetis="1 " 
--download-superlu_dist="1 " --download-hypre=1 PETSC_ARCH=gnu-dbg-32idx
[2]PETSC ERROR: #1 VecSetLocalToGlobalMapping() line 83 in 
/home/frank/petsc/src/vec/vec/interface/vector.c
[2]PETSC ERROR: [0]PETSC ERROR: #2 DMCreateGlobalVector_DA() line 45 in 
/home/frank/petsc/src/dm/impls/da/dadist.c
[2]PETSC ERROR: #3 DMCreateGlobalVector() line 880 in 
/home/frank/petsc/src/dm/interface/dm.c
#2 DMCreateGlobalVector_DA() line 45 in 
/home/frank/petsc/src/dm/impls/da/dadist.c
[0]PETSC ERROR: #3 DMCreateGlobalVector() line 880 in 
/home/frank/petsc/src/dm/interface/dm.c
[3]PETSC ERROR: - Error Message 
--
[3]PETSC ERROR: Null argument, when expecting valid pointer
[3]PETSC ERROR: Null Object: Parameter # 2
[3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
[3]PETSC ERROR: Petsc Development GIT revision: v3.7.4-1571-g7fc5cb5  GIT Date: 
2016-10-05 10:56:19 -0500
[3]PETSC ERROR: ./test_ksp.exe on a gnu-dbg-32idx named kolmog1 by frank Wed 
Oct  5 18:58:44 2016
[3]PETSC ERROR: Configure options --known-mpi-shared="0 " --known-memcmp-ok  
--with-debugging="1 " --with-shared-libraries=0 --with-mpi-compilers="1 " 
--download-blacs="1 " --download-metis="1 " --download-parmetis="1 " 

Re: [petsc-users] create global vector in latest version of petsc

2016-10-05 Thread Hengjie Wang

Hi,

I did.
I am using GNU compiler 5.4.0. I don't know if this matters.
Thank you.

Frank

On 10/5/2016 6:08 PM, Matthew Knepley wrote:
On Wed, Oct 5, 2016 at 7:57 PM, frank wrote:


Hi,

I updated PETSc to the latest version by pulling from the repo.
Then I found that one of my old codes, which worked before, now
outputs errors.
After debugging, I found that the error is caused by
"DMCreateGlobalVector".
I attach a short program which can reproduce the error. This
program works well with an older version of PETSc.
I also attach the script I used to configure PETSc.


First, did you reconfigure after pulling? If not, please do this, 
rebuild, and try again.


  Thanks,

Matt

The error message is below. Did I miss something in the
installation? Thank you.

1 [0]PETSC ERROR: - Error Message
--
  2 [0]PETSC ERROR: Null argument, when expecting valid pointer
  3 [0]PETSC ERROR: Null Object: Parameter # 2
  4 [0]PETSC ERROR: See
http://www.mcs.anl.gov/petsc/documentation/faq.html
 for trouble
shooting.
  5 [0]PETSC ERROR: Petsc Development GIT revision:
v3.7.4-1571-g7fc5cb5  GIT Date: 2016-10-05 10:56:19 -0500
  6 [0]PETSC ERROR: [2]PETSC ERROR: ./test_ksp.exe on a
gnu-dbg-32idx named kolmog1 by frank Wed Oct  5 17:40:07 2016
  7 [0]PETSC ERROR: Configure options --known-mpi-shared="0 "
--known-memcmp-ok --with-debugging="1 " --with-shared-libraries=0
--with-mpi-compilers="1 " --download-blacs="1 "
--download-metis="1 " --download-parmetis="1 "
--download-superlu_dist="1 " --download-hypre=1
PETSC_ARCH=gnu-dbg-32idx
  8 [0]PETSC ERROR: #1 VecSetLocalToGlobalMapping() line 83 in
/home/frank/petsc/src/vec/vec/interface/vector.c
  9 [0]PETSC ERROR: #2 DMCreateGlobalVector_DA() line 45 in
/home/frank/petsc/src/dm/impls/da/dadist.c
 10 [0]PETSC ERROR: #3 DMCreateGlobalVector() line 880 in
/home/frank/petsc/src/dm/interface/dm.c


Regards,
Frank






--
What most experimenters take for granted before they begin their 
experiments is infinitely more interesting than any results to which 
their experiments lead.

-- Norbert Wiener




Re: [petsc-users] Question about memory usage in Multigrid preconditioner

2016-09-16 Thread Hengjie Wang

Hi Dave,

I added both options and tested them by solving the Poisson equation on a 1024^3 
grid with 32^3 cores. This test used to give the OOM error; now it runs 
well.

I attach the ksp_view and log_view output in case you want to look at them.
I also tested my original code with those PETSc options by simulating 
decaying turbulence on a 1024^3 grid, and it also works. I am going to test 
the code at a larger scale; if there is any problem then, I will let you 
know.

This really helps a lot. Thank you so much.

Regards,
Frank


On 9/15/2016 3:35 AM, Dave May wrote:

Hi all,

The only unexpected memory usage I can see is associated with the 
call to MatPtAP().

Here is something you can try immediately.
Run your code with the additional options
  -matrap 0 -matptap_scalable

I didn't realize this before, but the default behaviour of MatPtAP in 
parallel is actually to explicitly form the transpose of P (e.g. 
assemble R = P^T) and then compute R.A.P.

You don't want to do this. The option -matrap 0 resolves this issue.

The implementation of P^T.A.P has two variants.
The scalable implementation (with respect to memory usage) is selected 
via the second option -matptap_scalable.


Try it out - I see a significant memory reduction using these options 
for particular mesh sizes / partitions.
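
For concreteness, these could go in a PETSc options file (loaded with 
-options_file <filename>) together with the monitoring options already used in 
this thread; the file below is an illustrative sketch, not the exact file Frank 
ran with:

  # do not explicitly assemble R = P^T inside MatPtAP
  -matrap 0
  # select the memory-scalable P^T*A*P implementation
  -matptap_scalable
  # monitoring, as already used in this thread
  -ksp_view
  -log_view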


I've attached a cleaned-up version of the code you sent me.
There were a number of memory leaks and other issues.
The main points are:
  * You should call DMDAVecGetArrayF90() before VecAssembly{Begin,End}
  * You should call PetscFinalize(), otherwise the option -log_summary 
(-log_view) will not display anything once the program has completed.
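
A minimal sketch of how the end of such a Fortran program could look, using the 
illustrative names da and x (a DMDA and a global vector); the destroys address the 
reported leaks, and PetscFinalize() is what actually triggers the -log_summary / 
-log_view report:

      ! ... assemble, solve, post-process ...
      call VecDestroy(x, ierr)      ! free PETSc objects to avoid memory leaks
      call DMDestroy(da, ierr)
      call PetscFinalize(ierr)      ! required, otherwise -log_view prints nothing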



Thanks,
  Dave


On 15 September 2016 at 08:03, Hengjie Wang <hengj...@uci.edu> wrote:


Hi Dave,

Sorry, I should have put more comments in to explain the code.
The number of processes in each dimension is the same: Px = Py = Pz = P.
So is the domain size.
So if you want to run the code for 512^3 grid points on
16^3 cores, you need to set "-N 512 -P 16" on the command line.
I added more comments and also fixed an error in the attached code.
(The error only affects the accuracy of the solution, not the memory
usage.)
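
As an illustration, a 512^3 run on 16^3 = 4096 cores would then be launched
along the lines of the following (the executable name is only a placeholder):

  mpiexec -n 4096 ./test_poisson.exe -N 512 -P 16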

Thank you.
Frank


On 9/14/2016 9:05 PM, Dave May wrote:



On Thursday, 15 September 2016, Dave May <dave.mayhe...@gmail.com> wrote:



On Thursday, 15 September 2016, frank <hengj...@uci.edu> wrote:

Hi,

I wrote a simple code to reproduce the error. I hope
this can help to diagnose the problem.
The code just solves a 3D Poisson equation.


Why is the stencil width a runtime parameter?? And why is the
default value 2? For 7-pnt FD Laplace, you only need
a stencil width of 1.

Was this choice made to mimic something in the
real application code?


Please ignore - I misunderstood your usage of the param set by -P


I ran the code on a 1024^3 mesh. The process partition is
32 * 32 * 32. That is when I reproduce the OOM error.
Each core has about 2 GB of memory.
I also ran the code on a 512^3 mesh with 16 * 16 * 16
processes. The KSP solver works fine.
I attached the code, ksp_view_pre's output and my PETSc
options file.

Thank you.
Frank

    On 09/09/2016 06:38 PM, Hengjie Wang wrote:

Hi Barry,

I checked. On the supercomputer, I had the option
"-ksp_view_pre" but it is not in the file I sent you. I am
sorry for the confusion.

Regards,
Frank

On Friday, September 9, 2016, Barry Smith
<bsm...@mcs.anl.gov> wrote:


> On Sep 9, 2016, at 3:11 PM, frank
<hengj...@uci.edu> wrote:
>
> Hi Barry,
>
> I think the first KSP view output is from
-ksp_view_pre. Before I submitted the test, I was
not sure whether there would be OOM error or not. So
I added both -ksp_view_pre and -ksp_view.

  But the options file you sent specifically does
NOT list the -ksp_view_pre so how could it be from that?

   Sorry to be pedantic but I've spent too much time
in the past trying to debug from incorrect
information and want to make sure that the
information I have is correct before thinking.
Please recheck exactly what happened. Rerun with the
exact input file you emailed if that is needed.

   Barry

>
> Frank
>
>
> On 09/09/2016

Re: [petsc-users] Question about memory usage in Multigrid preconditioner

2016-09-15 Thread Hengjie Wang

Hi Dave,

Sorry, I should have put more comments in to explain the code.
The number of processes in each dimension is the same: Px = Py = Pz = P. So is 
the domain size.
So if you want to run the code for 512^3 grid points on 16^3 
cores, you need to set "-N 512 -P 16" on the command line.
I added more comments and also fixed an error in the attached code. (The 
error only affects the accuracy of the solution, not the memory usage.)


Thank you.
Frank

On 9/14/2016 9:05 PM, Dave May wrote:



On Thursday, 15 September 2016, Dave May <dave.mayhe...@gmail.com> wrote:




On Thursday, 15 September 2016, frank <hengj...@uci.edu> wrote:

Hi,

I wrote a simple code to reproduce the error. I hope this can
help to diagnose the problem.
The code just solves a 3D Poisson equation.


Why is the stencil width a runtime parameter?? And why is the
default value 2? For 7-pnt FD Laplace, you only need a stencil
width of 1.

Was this choice made to mimic something in the real application code?


Please ignore - I misunderstood your usage of the param set by -P


I ran the code on a 1024^3 mesh. The process partition is 32 *
32 * 32. That is when I reproduce the OOM error. Each core has
about 2 GB of memory.
I also ran the code on a 512^3 mesh with 16 * 16 * 16
processes. The KSP solver works fine.
I attached the code, ksp_view_pre's output and my PETSc options
file.

Thank you.
Frank

    On 09/09/2016 06:38 PM, Hengjie Wang wrote:

Hi Barry,

I checked. On the supercomputer, I had the option
"-ksp_view_pre" but it is not in the file I sent you. I am sorry
for the confusion.

Regards,
Frank

On Friday, September 9, 2016, Barry Smith
<bsm...@mcs.anl.gov> wrote:


> On Sep 9, 2016, at 3:11 PM, frank <hengj...@uci.edu> wrote:
>
> Hi Barry,
>
> I think the first KSP view output is from
-ksp_view_pre. Before I submitted the test, I was not
sure whether there would be OOM error or not. So I added
both -ksp_view_pre and -ksp_view.

  But the options file you sent specifically does NOT
list the -ksp_view_pre so how could it be from that?

   Sorry to be pedantic but I've spent too much time in
the past trying to debug from incorrect information and
want to make sure that the information I have is correct
before thinking. Please recheck exactly what happened.
Rerun with the exact input file you emailed if that is
needed.

   Barry

>
> Frank
>
>
> On 09/09/2016 12:38 PM, Barry Smith wrote:
>>   Why does ksp_view2.txt have two KSP views in it
while ksp_view1.txt has only one KSPView in it? Did you
run two different solves in the 2 case but not the one?
>>
>>   Barry
>>
>>
>>
>>> On Sep 9, 2016, at 10:56 AM, frank <hengj...@uci.edu>
wrote:
>>>
>>> Hi,
>>>
>>> I want to continue digging into the memory problem here.
>>> I did find a workaround in the past, which is to use
fewer cores per node so that each core has 8G memory.
However this is inefficient and expensive. I hope to locate
the place that uses the most memory.
>>>
>>> Here is a brief summary of the tests I did in the past:
>>>> Test1:  Mesh 1536*128*384  | Process Mesh 48*4*12
>>> Maximum (over computational time) process memory:        total 7.0727e+08
>>> Current process memory:                                  total 7.0727e+08
>>> Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11
>>> Current space PetscMalloc()ed:                           total 1.8275e+09
>>>
>>>> Test2:  Mesh 1536*128*384  | Process Mesh 96*8*24
>>> Maximum (over computational time) process memory:        total 5.9431e+09
>>> Current process memory:                                  total 5.9431e+09
>>> Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12
>>&

Re: [petsc-users] Question about memory usage in Multigrid preconditioner

2016-09-09 Thread Hengjie Wang
Hi Barry,

I checked. On the supercomputer, I had the option "-ksp_view_pre" but it is
not in the file I sent you. I am sorry for the confusion.

Regards,
Frank

On Friday, September 9, 2016, Barry Smith  wrote:

>
> > On Sep 9, 2016, at 3:11 PM, frank wrote:
> >
> > Hi Barry,
> >
> > I think the first KSP view output is from -ksp_view_pre. Before I
> submitted the test, I was not sure whether there would be OOM error or not.
> So I added both -ksp_view_pre and -ksp_view.
>
>   But the options file you sent specifically does NOT list the
> -ksp_view_pre so how could it be from that?
>
>Sorry to be pedantic but I've spent too much time in the past trying to
> debug from incorrect information and want to make sure that the information
> I have is correct before thinking. Please recheck exactly what happened.
> Rerun with the exact input file you emailed if that is needed.
>
>Barry
>
> >
> > Frank
> >
> >
> > On 09/09/2016 12:38 PM, Barry Smith wrote:
> >>   Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt
> has only one KSPView in it? Did you run two different solves in the 2 case
> but not the one?
> >>
> >>   Barry
> >>
> >>
> >>
> >>> On Sep 9, 2016, at 10:56 AM, frank wrote:
> >>>
> >>> Hi,
> >>>
> >>> I want to continue digging into the memory problem here.
> >>> I did find a workaround in the past, which is to use fewer cores per
> node so that each core has 8G memory. However this is inefficient and
> expensive. I hope to locate the place that uses the most memory.
> >>>
> >>> Here is a brief summary of the tests I did in the past:
> >>>> Test1:  Mesh 1536*128*384  |  Process Mesh 48*4*12
> >>> Maximum (over computational time) process memory:        total 7.0727e+08
> >>> Current process memory:                                  total 7.0727e+08
> >>> Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11
> >>> Current space PetscMalloc()ed:                           total 1.8275e+09
> >>>
> >>>> Test2:  Mesh 1536*128*384  |  Process Mesh 96*8*24
> >>> Maximum (over computational time) process memory:        total 5.9431e+09
> >>> Current process memory:                                  total 5.9431e+09
> >>> Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12
> >>> Current space PetscMalloc()ed:                           total 5.4844e+09
> >>>
> >>>> Test3:  Mesh 3072*256*768  |  Process Mesh 96*8*24
> >>> OOM (Out Of Memory) killer of the supercomputer terminated the
> job during "KSPSolve".
> >>>
> >>> I attached the output of ksp_view (the third test's output is from
> ksp_view_pre), memory_view, and also the PETSc options.
> >>>
> >>> In all the tests, each core can access about 2 GB of memory. In test3,
> there are 4223139840 non-zeros in the matrix. This will consume about
> 1.74 MB per process, using double precision. Considering some extra memory
> used to store integer indices, 2 GB of memory should still be way more than enough.
> >>>
> >>> Is there a way to find out which part of KSPSolve uses the most memory?
> >>> Thank you so much.
> >>>
> >>> BTW, there are 4 options that remain unused and I don't understand why
> they are omitted:
> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly
> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi
> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1
> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson
> >>>
> >>>
> >>> Regards,
> >>> Frank
> >>>
> >>> On 07/13/2016 05:47 PM, Dave May wrote:
> 
>  On 14 July 2016 at 01:07, frank wrote:
>  Hi Dave,
> 
>  Sorry for the late reply.
>  Thank you so much for your detailed reply.
> 
>  I have a question about the estimation of the memory usage. There are
> 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is
> used. So the memory per process is:
>    4223139840 * 8 bytes / 18432 / 1024 / 1024 = 1.74M ?
>  Did I do something wrong here? Because this seems too small.
> 
>  No - I totally f***ed it up. You are correct. That'll teach me for
> fumbling around with my iphone calculator and not using my brain. (Note
> that to convert to MB just divide by 1e6, not 1024^2 - although I
> apparently cannot convert between units correctly)
> 
>  From the PETSc objects associated with the solver, It looks like it
> _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities
> are: somewhere in your usage of PETSc you've introduced a memory leak;
> PETSc is doing a huge over allocation (e.g. as per our discussion of
> MatPtAP); or in your application code there are other objects you have
> forgotten to log the memory for.
> 
> 
> 
>  I am running this job on Bluewater.
>  I am using the 7-point FD stencil in 3D.
> 
>  I thought so on both counts.
> 
>  I apologize that I made a stupid mistake in computing the memory per
> core. My settings render each core can access