Re: [petsc-users] performance regression with GAMG

2023-10-05 Thread Stephan Kramer
Great, that seems to fix the issue indeed - i.e. on the branch with the 
low memory filtering switched off (by default) we no longer see the 
"inconsistent data" error or hangs, and going back to the square graph 
aggressive coarsening brings us back the old performance. So we'd be 
keen to have that branch merged indeed

Many thanks for your assistance with this
Stephan

On 05/10/2023 01:11, Mark Adams wrote:

Thanks Stephan,

It looks like the matrix is in a bad/incorrect state and parallel Mat-Mat
is waiting for messages that were not sent. A bug.

Can you try my branch, which is ready to merge, adams/gamg-fast-filter.
We added a new filtering method in main that uses low memory but I found it
was slow, so this branch brings back the old filter code, used by default,
and keeps the low memory version as an option.
It is possible this low memory filtering messed up the internals of the Mat
in some way.
I hope this is it, but if not we can continue.

This MR also makes square graph the default.
I have found it does create better aggregates and on GPUs, with Kokkos bug
fixes from Junchao, Mat-Mat is fast. (it might be slow on CPUs)

Mark




On Wed, Oct 4, 2023 at 12:30 AM Stephan Kramer 
wrote:


Hi Mark

Thanks again for re-enabling the square graph aggressive coarsening
option which seems to have restored performance for most of our cases.
Unfortunately we do have a remaining issue, which only seems to occur
for the larger mesh size ("level 7" which has 6,389,890 vertices and we
normally run on 1536 cpus): we either get a "Petsc has generated
inconsistent data" error, or a hang - both when constructing the square
graph matrix. So this is with the new
-pc_gamg_aggressive_square_graph=true option, without the option there's
no error but of course we would get back to the worse performance.

Backtrace for the "inconsistent data" error. Note this is actually just
petsc main from 17 Sep, git 9a75acf6e50cfe213617e - so after your merge
of adams/gamg-add-old-coarsening into main - with one unrelated commit
from firedrake

[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: Petsc has generated inconsistent data
[0]PETSC ERROR: j 8 not equal to expected number of sends 9
[0]PETSC ERROR: Petsc Development GIT revision:
v3.4.2-43104-ga3b76b71a1  GIT Date: 2023-09-18 10:26:04 +0100
[0]PETSC ERROR: stokes_cubed_sphere_7e3_A3_TS1.py on a  named
gadi-cpu-clx-0241.gadi.nci.org.au by sck551 Wed Oct  4 14:30:41 2023
[0]PETSC ERROR: Configure options --prefix=/tmp/firedrake-prefix
--with-make-np=4 --with-debugging=0 --with-shared-libraries=1
--with-fortran-bindings=0 --with-zlib --with-c2html=0
--with-mpiexec=mpiexec --with-cc=mpicc --with-cxx=mpicxx
--with-fc=mpifort --download-hdf5 --download-hypre
--download-superlu_dist --download-ptscotch --download-suitesparse
--download-pastix --download-hwloc --download-metis --download-scalapack
--download-mumps --download-chaco --download-ml
CFLAGS=-diag-disable=10441 CXXFLAGS=-diag-disable=10441
[0]PETSC ERROR: #1 PetscGatherMessageLengths2() at
/jobfs/95504034.gadi-pbs/petsc/src/sys/utils/mpimesg.c:270
[0]PETSC ERROR: #2 MatTransposeMatMultSymbolic_MPIAIJ_MPIAIJ() at
/jobfs/95504034.gadi-pbs/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:1867
[0]PETSC ERROR: #3 MatProductSymbolic_AtB_MPIAIJ_MPIAIJ() at
/jobfs/95504034.gadi-pbs/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:2071
[0]PETSC ERROR: #4 MatProductSymbolic() at
/jobfs/95504034.gadi-pbs/petsc/src/mat/interface/matproduct.c:795
[0]PETSC ERROR: #5 PCGAMGSquareGraph_GAMG() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/impls/gamg/gamg.c:489
[0]PETSC ERROR: #6 PCGAMGCoarsen_AGG() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/impls/gamg/agg.c:969
[0]PETSC ERROR: #7 PCSetUp_GAMG() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/impls/gamg/gamg.c:645
[0]PETSC ERROR: #8 PCSetUp() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/interface/precon.c:1069
[0]PETSC ERROR: #9 PCApply() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/interface/precon.c:484
[0]PETSC ERROR: #10 PCApply() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/interface/precon.c:487
[0]PETSC ERROR: #11 KSP_PCApply() at
/jobfs/95504034.gadi-pbs/petsc/include/petsc/private/kspimpl.h:383
[0]PETSC ERROR: #12 KSPSolve_CG() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/ksp/impls/cg/cg.c:162
[0]PETSC ERROR: #13 KSPSolve_Private() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/ksp/interface/itfunc.c:910
[0]PETSC ERROR: #14 KSPSolve() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/ksp/interface/itfunc.c:1082
[0]PETSC ERROR: #15 PCApply_FieldSplit_Schur() at

/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1175
[0]PETSC ERROR: #16 PCApply() at
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/interface/precon.c:487
[0]PETSC ERROR: #17 KSP_PCApply() at
/jobfs/95504034.gadi-pbs/petsc/include/petsc/private/kspimpl.h:383
[0]PETSC ERROR: #18 KSPSolve_PREONLY() at
/jobfs/9

Re: [petsc-users] performance regression with GAMG

2023-10-03 Thread Stephan Kramer
_ceval.h:46
#20 _PyEval_Vector (tstate=, con=, 
locals=, args=, argcount=4, 
kwnames=) at ../Python/ceval.c:5065
#21 0x15378ee1e057 in __Pyx_PyObject_FastCallDict (func=out>, args=0x1, _nargs=, kwargs=) at 
src/petsc4py/PETSc.c:548022
#22 __pyx_f_8petsc4py_5PETSc_PCApply_Python (__pyx_v_pc=0x2cc7500, 
__pyx_v_x=0x1, __pyx_v_y=0xc0fe132c) at src/petsc4py/PETSc.c:31979
#23 0x15378d216cba in PCApply (pc=0x2cc7500, x=0x1, y=0xc0fe132c) at 
/jobfs/95504034.gadi-pbs/petsc/src/ksp/pc/interface/precon.c:487
#24 0x15378d4d153c in KSP_PCApply (ksp=0x2cc7500, x=0x1, 
y=0xc0fe132c) at 
/jobfs/95504034.gadi-pbs/petsc/include/petsc/private/kspimpl.h:383
#25 0x15378d4d1097 in KSPSolve_CG (ksp=0x2cc7500) at 
/jobfs/95504034.gadi-pbs/petsc/src/ksp/ksp/impls/cg/cg.c:162


Let me know if there is anything further we can try to debug this issue

Kind regards
Stephan Kramer


On 02/09/2023 01:58, Mark Adams wrote:

Fantastic!

I fixed a memory free problem. You should be OK now.
I am pretty sure you are good but I would like to wait to get any feedback
from you.
We should have a release at the end of the month and it would be nice to
get this into it.

Thanks,
Mark


On Fri, Sep 1, 2023 at 7:07 AM Stephan Kramer 
wrote:


Hi Mark

Sorry took a while to report back. We have tried your branch but hit a
few issues, some of which we're not entirely sure are related.

First switching off minimum degree ordering, and then switching to the
old version of aggressive coarsening, as you suggested, got us back to
the coarsening behaviour that we had previously, but then we also
observed an even further worsening of the iteration count: it had
previously gone up by 50% already (with the newer main petsc), but now
was more than double "old" petsc. Took us a while to realize this was
due to the default smoother changing from Cheby+SOR to Cheby+Jacobi.
Switching this also back to the old default we get back to very similar
coarsening levels (see below for more details if it is of interest) and
iteration counts.

So that's all very good news. However, we were also starting seeing
memory errors (double free or corruption) when we switched off the
minimum degree ordering. Because this was at an earlier version of your
branch we then rebuild, hoping this was just an earlier bug that had
been fixed, but then we were having MPI-lockup issues. We have now
figured out the MPI issues are completely unrelated - some combination
with a newer mpi build and firedrake on our cluster which also occur
using main branches of everything. So switching back to an older MPI
build we are hoping to now test your most recent version of
adams/gamg-add-old-coarsening with these options and see whether the
memory errors are still there. Will let you know

Best wishes
Stephan Kramer

Coarsening details with various options for Level 6 of the test case:

In our original setup (using "old" petsc), we had:

rows=516, cols=516, bs=6
rows=12660, cols=12660, bs=6
rows=346974, cols=346974, bs=6
rows=19169670, cols=19169670, bs=3

Then with the newer main petsc we had

rows=666, cols=666, bs=6
rows=7740, cols=7740, bs=6
rows=34902, cols=34902, bs=6
rows=736578, cols=736578, bs=6
rows=19169670, cols=19169670, bs=3

Then on your branch with minimum_degree_ordering False:

rows=504, cols=504, bs=6
rows=2274, cols=2274, bs=6
rows=11010, cols=11010, bs=6
rows=35790, cols=35790, bs=6
rows=430686, cols=430686, bs=6
rows=19169670, cols=19169670, bs=3

And with minimum_degree_ordering False and use_aggressive_square_graph
True:

rows=498, cols=498, bs=6
rows=12672, cols=12672, bs=6
rows=346974, cols=346974, bs=6
rows=19169670, cols=19169670, bs=3

So that is indeed pretty much back to what it was before








On 31/08/2023 23:40, Mark Adams wrote:

Hi Stephan,

This branch is settling down.  adams/gamg-add-old-coarsening
<https://gitlab.com/petsc/petsc/-/commits/adams/gamg-add-old-coarsening>
I made the old, not minimum degree, ordering the default but kept the new
"aggressive" coarsening as the default, so I am hoping that just adding
"-pc_gamg_use_aggressive_square_graph true" to your regression tests will
get you back to where you were before.
Fingers crossed ... let me know if you have any success or not.

Thanks,
Mark


On Tue, Aug 15, 2023 at 1:45 PM Mark Adams  wrote:


Hi Stephan,

I have a branch that you can try: adams/gamg-add-old-coarsening
<https://gitlab.com/petsc/petsc/-/commits/adams/gamg-add-old-coarsening
Things to test:
* First, verify that nothing unintended changed by reproducing your bad
results with this branch (the defaults are the same)
* Try not using the minimum degree ordering that I suggested
with: -pc_gamg_use_minimum_degree_ordering false
-- I am eag

Re: [petsc-users] performance regression with GAMG

2023-09-01 Thread Stephan Kramer

Hi Mark

Sorry took a while to report back. We have tried your branch but hit a 
few issues, some of which we're not entirely sure are related.


First switching off minimum degree ordering, and then switching to the 
old version of aggressive coarsening, as you suggested, got us back to 
the coarsening behaviour that we had previously, but then we also 
observed an even further worsening of the iteration count: it had 
previously gone up by 50% already (with the newer main petsc), but now 
was more than double "old" petsc. Took us a while to realize this was 
due to the default smoother changing from Cheby+SOR to Cheby+Jacobi. 
Switching this also back to the old default we get back to very similar 
coarsening levels (see below for more details if it is of interest) and 
iteration counts.


So that's all very good news. However, we were also starting seeing 
memory errors (double free or corruption) when we switched off the 
minimum degree ordering. Because this was at an earlier version of your 
branch we then rebuild, hoping this was just an earlier bug that had 
been fixed, but then we were having MPI-lockup issues. We have now 
figured out the MPI issues are completely unrelated - some combination 
with a newer mpi build and firedrake on our cluster which also occur 
using main branches of everything. So switching back to an older MPI 
build we are hoping to now test your most recent version of 
adams/gamg-add-old-coarsening with these options and see whether the 
memory errors are still there. Will let you know


Best wishes
Stephan Kramer

Coarsening details with various options for Level 6 of the test case:

In our original setup (using "old" petsc), we had:

  rows=516, cols=516, bs=6
  rows=12660, cols=12660, bs=6
  rows=346974, cols=346974, bs=6
  rows=19169670, cols=19169670, bs=3

Then with the newer main petsc we had

  rows=666, cols=666, bs=6
  rows=7740, cols=7740, bs=6
  rows=34902, cols=34902, bs=6
  rows=736578, cols=736578, bs=6
  rows=19169670, cols=19169670, bs=3

Then on your branch with minimum_degree_ordering False:

  rows=504, cols=504, bs=6
  rows=2274, cols=2274, bs=6
  rows=11010, cols=11010, bs=6
  rows=35790, cols=35790, bs=6
  rows=430686, cols=430686, bs=6
  rows=19169670, cols=19169670, bs=3

And with minimum_degree_ordering False and use_aggressive_square_graph True:

  rows=498, cols=498, bs=6
  rows=12672, cols=12672, bs=6
  rows=346974, cols=346974, bs=6
  rows=19169670, cols=19169670, bs=3

So that is indeed pretty much back to what it was before








On 31/08/2023 23:40, Mark Adams wrote:

Hi Stephan,

This branch is settling down.  adams/gamg-add-old-coarsening
<https://gitlab.com/petsc/petsc/-/commits/adams/gamg-add-old-coarsening>
I made the old, not minimum degree, ordering the default but kept the new
"aggressive" coarsening as the default, so I am hoping that just adding
"-pc_gamg_use_aggressive_square_graph true" to your regression tests will
get you back to where you were before.
Fingers crossed ... let me know if you have any success or not.

Thanks,
Mark


On Tue, Aug 15, 2023 at 1:45 PM Mark Adams  wrote:


Hi Stephan,

I have a branch that you can try: adams/gamg-add-old-coarsening
<https://gitlab.com/petsc/petsc/-/commits/adams/gamg-add-old-coarsening>

Things to test:
* First, verify that nothing unintended changed by reproducing your bad
results with this branch (the defaults are the same)
* Try not using the minimum degree ordering that I suggested
with: -pc_gamg_use_minimum_degree_ordering false
   -- I am eager to see if that is the main problem.
* Go back to what I think is the old method:
-pc_gamg_use_minimum_degree_ordering
false -pc_gamg_use_aggressive_square_graph true

When we get back to where you were, I would like to try to get modern
stuff working.
I did add a -pc_gamg_aggressive_mis_k <2>
You could to another step of MIS coarsening with -pc_gamg_aggressive_mis_k
3

Anyway, lots to look at but, alas, AMG does have a lot of parameters.

Thanks,
Mark

On Mon, Aug 14, 2023 at 4:26 PM Mark Adams  wrote:



On Mon, Aug 14, 2023 at 11:03 AM Stephan Kramer 
wrote:


Many thanks for looking into this, Mark

My 3D tests were not that different and I see you lowered the

threshold.

Note, you can set the threshold to zero, but your test is running so

much

differently than mine there is something else going on.
Note, the new, bad, coarsening rate of 30:1 is what we tend to shoot

for

in 3D.

So it is not clear what the problem is.  Some questions:

* do you have a picture of this mesh to show me?

It's just a standard hexahedral cubed sphere mesh with the refinement
level giving the number of times each of the six sides have been
subdivided: so Level_5 mean 2^5 x 2^5 squares which is extruded to 16
layers. So the total number

Re: [petsc-users] performance regression with GAMG

2023-08-14 Thread Stephan Kramer

Many thanks for looking into this, Mark

My 3D tests were not that different and I see you lowered the threshold.
Note, you can set the threshold to zero, but your test is running so much
differently than mine there is something else going on.
Note, the new, bad, coarsening rate of 30:1 is what we tend to shoot for
in 3D.

So it is not clear what the problem is.  Some questions:

* do you have a picture of this mesh to show me?


It's just a standard hexahedral cubed sphere mesh with the refinement 
level giving the number of times each of the six sides have been 
subdivided: so Level_5 mean 2^5 x 2^5 squares which is extruded to 16 
layers. So the total number of elements at Level_5 is 6 x 32 x 32 x 16 = 
98304  hexes. And everything doubles in all 3 dimensions (so 2^3) going 
to the next Level



* what do you mean by Q1-Q2 elements?


Q2-Q1, basically Taylor hood on hexes, so (tri)quadratic for velocity 
and (tri)linear for pressure


I guess you could argue we could/should just do good old geometric 
multigrid instead. More generally we do use this solver configuration a 
lot for tetrahedral Taylor Hood (P2-P1) in particular also for our 
adaptive mesh runs - would it be worth to see if we have the same 
performance issues with tetrahedral P2-P1?


It would be nice to see if the new and old codes are similar without
aggressive coarsening.
This was the intended change of the major change in this time frame as you
noticed.
If these jobs are easy to run, could you check that the old and new
versions are similar with "-pc_gamg_square_graph  0 ",  ( and you only need
one time step).
All you need to do is check that the first coarse grid has about the same
number of equations (large).
Unfortunately we're seeing some memory errors when we use this option, 
and I'm not entirely clear whether we're just running out of memory and 
need to put it on a special queue.


The run with square_graph 0 using new PETSc managed to get through one 
solve at level 5, and is giving the following mg levels:


    rows=174, cols=174, bs=6
  total: nonzeros=30276, allocated nonzeros=30276
--
  rows=2106, cols=2106, bs=6
  total: nonzeros=4238532, allocated nonzeros=4238532
--
  rows=21828, cols=21828, bs=6
  total: nonzeros=62588232, allocated nonzeros=62588232
--
  rows=589824, cols=589824, bs=6
  total: nonzeros=1082528928, allocated nonzeros=1082528928
--
  rows=2433222, cols=2433222, bs=3
  total: nonzeros=456526098, allocated nonzeros=456526098

comparing with square_graph 100 with new PETSc

  rows=96, cols=96, bs=6
  total: nonzeros=9216, allocated nonzeros=9216
--
  rows=1440, cols=1440, bs=6
  total: nonzeros=647856, allocated nonzeros=647856
--
  rows=97242, cols=97242, bs=6
  total: nonzeros=65656836, allocated nonzeros=65656836
--
  rows=2433222, cols=2433222, bs=3
  total: nonzeros=456526098, allocated nonzeros=456526098

and old PETSc with square_graph 100

  rows=90, cols=90, bs=6
  total: nonzeros=8100, allocated nonzeros=8100
--
  rows=1872, cols=1872, bs=6
  total: nonzeros=1234080, allocated nonzeros=1234080
--
  rows=47652, cols=47652, bs=6
  total: nonzeros=23343264, allocated nonzeros=23343264
--
  rows=2433222, cols=2433222, bs=3
  total: nonzeros=456526098, allocated nonzeros=456526098
--

Unfortunately old PETSc with square_graph 0 did not complete a single 
solve before giving the memory error




BTW, I am starting to think I should add the old method back as an option.
I did not think this change would cause large differences.


Yes, I think that would be much appreciated. Let us know if we can do 
any testing


Best wishes
Stephan




Thanks,
Mark





Note that we are providing the rigid body near nullspace,
hence the bs=3 to bs=6.
We have tried different values for the gamg_threshold but it doesn't
really seem to significantly alter the coarsening amount in that first
step.

Do you have any suggestions for further things we should try/look at?
Any feedback would be much appreciated

Best wishes
Stephan Kramer

Full logs including log_view timings available from
https://github.com/stephankramer/petsc-scaling/

In particular:


https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_5/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_5/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_6/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_6/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_7/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_7/output_2.dat






[petsc-users] performance regression with GAMG

2023-08-09 Thread Stephan Kramer

Dear petsc devs

We have noticed a performance regression using GAMG as the 
preconditioner to solve the velocity block in a Stokes equations saddle 
point system with variable viscosity solved on a 3D hexahedral mesh of a 
spherical shell using Q2-Q1 elements. This is comparing performance from 
the beginning of last year (petsc 3.16.4) and a more recent petsc master 
(from around May this year). This is the weak scaling analysis we 
published in https://doi.org/10.5194/gmd-15-5127-2022 Previously the 
number of iterations for the velocity block (inner solve of the Schur 
complement) starts at 40 iterations 
(https://gmd.copernicus.org/articles/15/5127/2022/gmd-15-5127-2022-f10-web.png) 
and only slowly going for larger problems (+more cores). Now the number 
of iterations now starts at 60 
(https://github.com/stephankramer/petsc-scaling/blob/main/after/SPD_Combined_Iterations.png), 
same tolerances, again slowly going up with increasing size, with the 
cost per iteration also gone up (slightly) - resulting in an increased 
runtime of > 50%.


The main change we can see is that the coarsening seems to have gotten a 
lot less aggressive at the first coarsening stage (finest->to 
one-but-finest) - presumably after the MIS(A^T A) -> MIS(MIS(A)) change? 
The performance issues might be similar to 
https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2023-April/048366.html ?


As an example at "Level 7" (6,389,890 vertices, run on 1536 cpus) on the 
older petsc version we had:


  rows=126, cols=126, bs=6
  total: nonzeros=15876, allocated nonzeros=15876
--
  rows=3072, cols=3072, bs=6
  total: nonzeros=3344688, allocated nonzeros=3344688
--
  rows=91152, cols=91152, bs=6
  total: nonzeros=109729584, allocated nonzeros=109729584
--
  rows=2655378, cols=2655378, bs=6
  total: nonzeros=1468980252, allocated nonzeros=1468980252
--
  rows=152175366, cols=152175366, bs=3
  total: nonzeros=29047661586, allocated nonzeros=29047661586

Whereas with the newer version we get:

  rows=420, cols=420, bs=6
  total: nonzeros=176400, allocated nonzeros=176400
--
  rows=6462, cols=6462, bs=6
  total: nonzeros=10891908, allocated nonzeros=10891908
--
  rows=91716, cols=91716, bs=6
  total: nonzeros=81687384, allocated nonzeros=81687384
--
  rows=5419362, cols=5419362, bs=6
  total: nonzeros=3668190588, allocated nonzeros=3668190588
--
  rows=152175366, cols=152175366, bs=3
  total: nonzeros=29047661586, allocated nonzeros=29047661586

So in the first step it coarsens from 150e6 to 5.4e6 DOFs instead of to 
2.6e6 DOFs. Note that we are providing the rigid body near nullspace, 
hence the bs=3 to bs=6.
We have tried different values for the gamg_threshold but it doesn't 
really seem to significantly alter the coarsening amount in that first step.


Do you have any suggestions for further things we should try/look at? 
Any feedback would be much appreciated


Best wishes
Stephan Kramer

Full logs including log_view timings available from 
https://github.com/stephankramer/petsc-scaling/


In particular:

https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_5/output_2.dat
https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_5/output_2.dat
https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_6/output_2.dat
https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_6/output_2.dat
https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_7/output_2.dat
https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_7/output_2.dat 



Re: [petsc-users] move from KSPSetNullSpace to MatSetNullSpace

2015-07-05 Thread Stephan Kramer

   Stephan,

   Sorry for the long delay. I have updated the branch 
barry/maint/mv-matnullspace-to-mat  to use the Amat argument for passing in the 
null space instead of the pmat argument. Could you please try it with your 
configuration of solvers and let us know if it resolves the problem?

   Thanks

   Barry

If it works it will eventually go into the maint (and then patch release).




That's great! I've tried out the branch and all our tests are passing again. 
Thanks a lot for looking into this

Cheers
Stephan


Re: [petsc-users] move from KSPSetNullSpace to MatSetNullSpace

2015-06-28 Thread Stephan Kramer

Sorry if we're talking completely cross-purposes here: but the pcksp solve
(that doesn't actually have a nullspace) is inside a solve that does have a
nullspace. If what you are saying is that applying the nullspace inside the
pcksp solve does not affect the outer solve, I can only see that if the
difference in outcome of the pcksp solve (with or without the nullspace)
would be inside the nullspace, because such a difference would be projected
out anyway straight after the pcksp solve. I don't see that this is true
however. My reasoning is the following:

Say the original PCKSP solve (that doesn't apply the nullspace) gets
passed some residual r from the outer solve, so it solves:

  Pz=r

where P is our mass matrix approximation to the system in the outer solve,
and P is full rank. Now if we do remove the nullspace during the pcksp
solve, effectively we change the system to the following:

  NM^{-1}P z = NM^{-1}r

where M^{-1} is the preconditioner inside the pcksp solve (assuming
left-preconditiong here), and N is the projection operator that projects
out the nullspace. This system is now rank-deficient, where I can add:

  z - z + P^{-1}M n

for arbitrary n in the nullspace.

So not only is the possible difference between solving with or without a
nullspace not found in that nullspace, but worse, I've ended up with a
preconditioner that's rank deficient in the outer solve.



This analysis does not make sense to me. The whole point of having all this
nullspace stuff is so that we do not get
a component of the null space in the solution. The way we avoid these in
Krylov methods is to project them out,
which is what happens when you attach it. Thus, your PCKSP solution 'z'
will not have any 'n' component. Your
outer solve will also not have any 'n' component, so I do not see where the
inconsistency comes in.



The nullspace component does indeed get projected out in the pcksp
solve. However that does not mean it gives the right answer - the right
answer would be the same answer I would get before (when we weren't
subtracting the nullspace in the pcksp) modulo the nullspace. For instance
if my initial residual r is exactly Mn I would get the answer 0 instead
of P^{-1}Mn - or equally correct would have been P^{-1}Mn with
the nullspace component projected out afterwards, but not 0. So the 
inconsistency is
in having a preconditioner that's not full-rank and whose nullspace
is different than the nullspace of the real matrix in the outer solve.

Some more observations:
* if I switch the preconditioner inside the pcksp to be a right-preconditioner
(-ksp_ksp_pc_side right) I get back the correct answer - which is kind of what
I expect as in that case it should just give the correct pcksp answer
with the nullspace projected out
* if instead of gmres+sor for the pcksp solve, I select cg+sor (which is
what I actually want), the pcksp solve fails with an indefinite pc. Both cases
produce the same wrong answer in the outer solve (both using default
left-preconditioning)

Hope this makes sense
Cheers
Stephan








Re: [petsc-users] move from KSPSetNullSpace to MatSetNullSpace

2015-06-28 Thread Stephan Kramer

   If we move the location of the nullspace used in removal from the pmat to 
the mat would that completely resolve the problem for you?

  Barry


That would indeed resolve the issue (and to me would also make the most sense)

Stephan


Re: [petsc-users] move from KSPSetNullSpace to MatSetNullSpace

2015-06-23 Thread Stephan Kramer

On Mon, Jun 22, 2015 at 1:13 PM, Stephan Kramer s.kramer at

imperial.ac.uk
wrote:

 Dear petsc devs


I've been trying to move our code from using KSPSetNullSpace to use
MatSetNullSpace instead. Although I appreciate the convenience of the
nullspace being propagated automatically through the solver hierarchy,
I'm
still a little confused on how to deal with the case that mat/=pmat in a
ksp.

If I read the code correctly I need to call MatSetNullSpace on the pmat
in
order for a ksp to project that nullspace out of the residual during the
krylov iteration. However the nullspaces of mat and pmat are not
necessarily the same. For instance, what should I do if the system that I
want to solve has a nullspace (i.e. the `mat` has a null vector), but the
preconditioner matrix does not.

As an example of existing setups that we have for which this is the case:
take a standard Stokes velocity, pressure system - where we want to solve
for pressure through a Schur complement. Suppose the boundary conditions
are Dirichlet for velocity on all boundaries, then the pressure equation
has the standard, constant nullspace. A frequently used preconditioner is
the scaled pressure mass matrix where G^T K^{-1} G is approximated by a
pressure mass matrix that is scaled by the value of the viscosity. So in
this case the actual system has a nullspace, but the preconditioner
matrix
does not.

The way we previously used to solve this is by setting the nullspace on
the ksp of the Schur system, where the mat comes from
MatCreateSchurComplement and the pmat is the scaled mass matrix. We then
set the pctype to PCKSP and do not set a nullspace on its associated
ksp. I
don't really see how I can do this using only MatSetNullSpace - unless I
create a copy of the mass matrix and have one copy with and one copy
without a nullspace so that I could use the one with the nullspace for
the
pmat of the Schur ksp (and thus the mat of the pcksp) and the copy
without
the nullspace as the pmat for the pcksp.

We would very much appreciate some guidance on what the correct way to
deal with this kind of situation is



I can understand that this is inconsistent (we have not really thought of
a
good way to make this consistent). However, does
this produce the wrong results? If A has a null space, won't you get the
same answer by attaching it to P, or am I missing the
import of the above example?



Attaching the nullspace to P means it will be applied in the PCKSP solve as
well - which in this case is just a mass matrix solve which doesn't
have a nullspace. I was under the impression that removing a nullspace in
a solve
where the matrix doesn't having one, would lead to trouble - but I'm happy
to
be told wrong.



I guess I was saying that this solve is only applied after a system with
that null space,
so I did not think it would change the answer.

  Thanks,

 Matt


Sorry if we're talking completely cross-purposes here: but the pcksp 
solve (that doesn't actually have a nullspace) is inside a solve that 
does have a nullspace. If what you are saying is that applying the 
nullspace inside the pcksp solve does not affect the outer solve, I can 
only see that if the difference in outcome of the pcksp solve (with or 
without the nullspace) would be inside the nullspace, because such a 
difference would be projected out anyway straight after the pcksp solve. 
I don't see that this is true however. My reasoning is the following:


Say the original PCKSP solve (that doesn't apply the nullspace) gets 
passed some residual r from the outer solve, so it solves:


  Pz=r

where P is our mass matrix approximation to the system in the outer 
solve, and P is full rank. Now if we do remove the nullspace during the 
pcksp solve, effectively we change the system to the following:


  NM^{-1}P z = NM^{-1}r

where M^{-1} is the preconditioner inside the pcksp solve (assuming 
left-preconditiong here), and N is the projection operator that projects 
out the nullspace. This system is now rank-deficient, where I can add:


  z - z + P^{-1}M n

for arbitrary n in the nullspace.

So not only is the possible difference between solving with or without a 
nullspace not found in that nullspace, but worse, I've ended up with a 
preconditioner that's rank deficient in the outer solve.


I've experimented with the example I described before with the Schur 
complement of a Stokes velocity,pressure system on the outside using 
fgmres and the viscosit scaled pressure mass matrix as preconditioner 
which is solved in the pcksp solve with cg+sor. With petsc 3.5 (so I can 
still use KSPSetNullSpace) if I set a nullspace only on the outside ksp, 
or if I set it both on the outside ksp and the pcksp sub_ksp, I get 
different answers (and also a different a number of iterations). The 
difference in answer visually looks like some grid scale noise that 
indeed could very well be of the form of a constant vector multiplied by 
an inverse mass matrix.


If you agree that applying

Re: [petsc-users] move from KSPSetNullSpace to MatSetNullSpace

2015-06-22 Thread Stephan Kramer

On Mon, Jun 22, 2015 at 1:13 PM, Stephan Kramer s.kramer at imperial.ac.uk
wrote:


Dear petsc devs

I've been trying to move our code from using KSPSetNullSpace to use
MatSetNullSpace instead. Although I appreciate the convenience of the
nullspace being propagated automatically through the solver hierarchy, I'm
still a little confused on how to deal with the case that mat/=pmat in a
ksp.

If I read the code correctly I need to call MatSetNullSpace on the pmat in
order for a ksp to project that nullspace out of the residual during the
krylov iteration. However the nullspaces of mat and pmat are not
necessarily the same. For instance, what should I do if the system that I
want to solve has a nullspace (i.e. the `mat` has a null vector), but the
preconditioner matrix does not.

As an example of existing setups that we have for which this is the case:
take a standard Stokes velocity, pressure system - where we want to solve
for pressure through a Schur complement. Suppose the boundary conditions
are Dirichlet for velocity on all boundaries, then the pressure equation
has the standard, constant nullspace. A frequently used preconditioner is
the scaled pressure mass matrix where G^T K^{-1} G is approximated by a
pressure mass matrix that is scaled by the value of the viscosity. So in
this case the actual system has a nullspace, but the preconditioner matrix
does not.

The way we previously used to solve this is by setting the nullspace on
the ksp of the Schur system, where the mat comes from
MatCreateSchurComplement and the pmat is the scaled mass matrix. We then
set the pctype to PCKSP and do not set a nullspace on its associated ksp. I
don't really see how I can do this using only MatSetNullSpace - unless I
create a copy of the mass matrix and have one copy with and one copy
without a nullspace so that I could use the one with the nullspace for the
pmat of the Schur ksp (and thus the mat of the pcksp) and the copy without
the nullspace as the pmat for the pcksp.

We would very much appreciate some guidance on what the correct way to
deal with this kind of situation is



I can understand that this is inconsistent (we have not really thought of a
good way to make this consistent). However, does
this produce the wrong results? If A has a null space, won't you get the
same answer by attaching it to P, or am I missing the
import of the above example?


Attaching the nullspace to P means it will be applied in the PCKSP solve as
well - which in this case is just a mass matrix solve which doesn't
have a nullspace. I was under the impression that removing a nullspace in a 
solve
where the matrix doesn't having one, would lead to trouble - but I'm happy to
be told wrong.

Cheers
Stephan


[petsc-users] move from KSPSetNullSpace to MatSetNullSpace

2015-06-22 Thread Stephan Kramer

Dear petsc devs

I've been trying to move our code from using KSPSetNullSpace to use 
MatSetNullSpace instead. Although I appreciate the convenience of the 
nullspace being propagated automatically through the solver hierarchy, 
I'm still a little confused on how to deal with the case that mat/=pmat 
in a ksp.


If I read the code correctly I need to call MatSetNullSpace on the pmat 
in order for a ksp to project that nullspace out of the residual during 
the krylov iteration. However the nullspaces of mat and pmat are not 
necessarily the same. For instance, what should I do if the system that 
I want to solve has a nullspace (i.e. the `mat` has a null vector), but 
the preconditioner matrix does not.


As an example of existing setups that we have for which this is the 
case: take a standard Stokes velocity, pressure system - where we want 
to solve for pressure through a Schur complement. Suppose the boundary 
conditions are Dirichlet for velocity on all boundaries, then the 
pressure equation has the standard, constant nullspace. A frequently 
used preconditioner is the scaled pressure mass matrix where G^T 
K^{-1} G is approximated by a pressure mass matrix that is scaled by the 
value of the viscosity. So in this case the actual system has a 
nullspace, but the preconditioner matrix does not.


The way we previously used to solve this is by setting the nullspace on 
the ksp of the Schur system, where the mat comes from 
MatCreateSchurComplement and the pmat is the scaled mass matrix. We then 
set the pctype to PCKSP and do not set a nullspace on its associated 
ksp. I don't really see how I can do this using only MatSetNullSpace - 
unless I create a copy of the mass matrix and have one copy with and one 
copy without a nullspace so that I could use the one with the nullspace 
for the pmat of the Schur ksp (and thus the mat of the pcksp) and the 
copy without the nullspace as the pmat for the pcksp.


We would very much appreciate some guidance on what the correct way to 
deal with this kind of situation is


Cheers
Stephan


Re: [petsc-users] gamg failure with petsc-dev

2014-04-01 Thread Stephan Kramer

On 01/04/14 16:07, Mark Adams wrote:

Stephan, I have pushed a pull request to fix this but for now you can just
use -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi.  This used to
be the default be we move to SOR recently.
Mark


Ah, that's great news. Thanks a lot for the effort. You're right: the previous 
defaults should be fine for us; your fix should hopefully only improve things




On Sat, Mar 29, 2014 at 5:52 PM, Mark Adams mfad...@lbl.gov wrote:


Sorry for getting to this late.  I think you have figured it out basically
but there are a few things:

1) You must set the block size of A (bs=2) for the null spaces to work and
for aggregation MG to work properly. SA-AMG really does not make sense
unless you work at the vertex level, for which we need the block size.


Yes indeed. I've come to realize this now by looking into how smoothed aggregation with a near null space actually works. We currently have our dofs numbered the wrong way around (vertices on the 
inside, velocity component on the outside - which made sense for other eqns we solve with the model) so will take a bit of work, but might well be worth the effort


Thanks a lot for looking into this
Cheers
Stephan




2) You must be right that the zero column is because the aggregation
produced a singleton aggregate.  And so the coarse grid is low rank.  This
is not catastrophic, it is like a fake BC equations.  The numerics just
have to work around it.  Jacobi does this.  I will fix SOR.

Mark



Ok, I found out a bit more. The fact that the prolongator has zero
columns appears to arise in petsc 3.4 as well. The only reason it wasn't
flagged before is that the default for the smoother (not the aggregation
smoother but the standard pre and post smoothing) changed from jacobi to
sor. I can make the example work with the additional option:

$ ./ex49 -elas_pc_type gamg -mx 100 -my 100 -mat_no_inode
-elas_mg_levels_1_pc_type jacobi

Vice versa, if in petsc 3.4.4 I change ex49 to include the near nullspace
(the /* constrain near-null space bit */) at the end, it works with jacobi
(the default in 3.4) but it breaks with sor with the same error message as
above. I'm not entirely sure why jacobi doesn't give an error with a zero
on the diagonal, but the zero column also means that the related coarse dof
doesn't actually affect the fine grid solution.

I think (but I might be barking up the wrong tree here) that the zero
columns appear because the aggregation method typically will have a few
small aggregates that are not big enough to support the polynomials of the
near null space (i.e. the polynomials restricted to an aggregate are not
linearly independent). A solution would be to reduce the number of
polynomials for these aggregates (only take the linearly independent).
Obviously this has the down-side that the degrees of freedom per aggregate
at the coarse level is no longer a constant making the administration more
complicated. It would be nice to find a solution though as I've always been
taught that jacobi is not a robust smoother for multigrid.

Cheers
Stephan









Re: [petsc-users] gamg failure with petsc-dev

2014-03-24 Thread Stephan Kramer

On 21/03/14 11:34, Stephan Kramer wrote:

On 21/03/14 04:24, Jed Brown wrote:

Stephan Kramer s.kra...@imperial.ac.uk writes:


We have been having some problems with GAMG on petsc-dev (master) for
cases that worked fine on petsc 3.4. We're solving a Stokes equation
(just the velocity block) for a simple convection in a square box
(isoviscous). The problem only occurs if we supply a near null space
(via MatSetNearNullSpace) where we supply the usual (1,0) (0,1) and
(-y,x) (near) null space vectors. If we supply those, the smoother
complains that the diagonal of the A matrix at the first coarsened
level contains a zero. If I dump out the prolongator from the finest
to the first coarsened level it indeed contains a zero column at that
same index. We're pretty confident that the fine level A matrix is
correct (it solves fine with LU). I've briefly spoken to Matt about
this and he suggested trying to run with -pc_gamg_agg_nsmooths 0 (as
the default changed from 3.4 - dev) but that didn't make any
difference, the dumped out prolongator still has zero columns, and it
crashes in the same way. Do you have any further suggestions what to
try and how to further debug this?


Do you set the block size?  Can you reproduce by modifying
src/ksp/ksp/examples/tutorials/ex49.c (plane strain elasticity)?



I don't set a block size, no. About ex49: Ah great, with master (just updated 
now) I get:

[skramer@stommel]{/data/stephan/git/petsc/src/ksp/ksp/examples/tutorials}$ 
./ex49 -elas_pc_type gamg -mx 100 -my 100 -mat_no_inode
[0]PETSC ERROR: - Error Message 
--
[0]PETSC ERROR: Arguments are incompatible
[0]PETSC ERROR: Zero diagonal on row 1
[0]PETSC ERROR: See http://http://www.mcs.anl.gov/petsc/documentation/faq.html 
for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-3671-gbb161d1  GIT Date: 
2014-03-21 01:14:15 +
[0]PETSC ERROR: ./ex49 on a linux-gnu-c-opt named stommel by skramer Fri Mar 21 
11:25:55 2014
[0]PETSC ERROR: Configure options --download-fblaslapack=1 --download-blacs=1 
--download-scalapack=1 --download-ptscotch=1 --download-mumps=1 
--download-hypre=1 --download-suitesparse=1 --download-ml=1
[0]PETSC ERROR: #1 MatInvertDiagonal_SeqAIJ() line 1728 in 
/data/stephan/git/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #2 MatSOR_SeqAIJ() line 1760 in 
/data/stephan/git/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #3 MatSOR() line 3734 in 
/data/stephan/git/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: #4 PCApply_SOR() line 35 in 
/data/stephan/git/petsc/src/ksp/pc/impls/sor/sor.c
[0]PETSC ERROR: #5 PCApply() line 440 in 
/data/stephan/git/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #6 KSP_PCApply() line 227 in 
/data/stephan/git/petsc/include/petsc-private/kspimpl.h
[0]PETSC ERROR: #7 KSPSolve_Chebyshev() line 456 in 
/data/stephan/git/petsc/src/ksp/ksp/impls/cheby/cheby.c
[0]PETSC ERROR: #8 KSPSolve() line 458 in 
/data/stephan/git/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #9 PCMGMCycle_Private() line 19 in 
/data/stephan/git/petsc/src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: #10 PCMGMCycle_Private() line 48 in 
/data/stephan/git/petsc/src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: #11 PCApply_MG() line 330 in 
/data/stephan/git/petsc/src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: #12 PCApply() line 440 in 
/data/stephan/git/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #13 KSP_PCApply() line 227 in 
/data/stephan/git/petsc/include/petsc-private/kspimpl.h
[0]PETSC ERROR: #14 KSPInitialResidual() line 63 in 
/data/stephan/git/petsc/src/ksp/ksp/interface/itres.c
[0]PETSC ERROR: #15 KSPSolve_GMRES() line 234 in 
/data/stephan/git/petsc/src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: #16 KSPSolve() line 458 in 
/data/stephan/git/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #17 solve_elasticity_2d() line 1053 in 
/data/stephan/git/petsc/src/ksp/ksp/examples/tutorials/ex49.c
[0]PETSC ERROR: #18 main() line 1104 in 
/data/stephan/git/petsc/src/ksp/ksp/examples/tutorials/ex49.c
[0]PETSC ERROR: End of Error Message ---send entire error 
message to petsc-ma...@mcs.anl.gov--

Which is the same error we were getting on our problem
Cheers
Stephan




Ok, I found out a bit more. The fact that the prolongator has zero columns appears to arise in petsc 3.4 as well. The only reason it wasn't flagged before is that the default for the smoother (not the 
aggregation smoother but the standard pre and post smoothing) changed from jacobi to sor. I can make the example work with the additional option:


$ ./ex49 -elas_pc_type gamg -mx 100 -my 100 -mat_no_inode 
-elas_mg_levels_1_pc_type jacobi

Vice versa, if in petsc 3.4.4 I change ex49 to include the near nullspace (the /* constrain near-null space bit */) at the end, it works with jacobi (the default in 3.4) but it breaks with sor with 
the same error message as above. I'm not entirely sure why jacobi doesn't

Re: [petsc-users] gamg failure with petsc-dev

2014-03-21 Thread Stephan Kramer

On 21/03/14 04:24, Jed Brown wrote:

Stephan Kramer s.kra...@imperial.ac.uk writes:


We have been having some problems with GAMG on petsc-dev (master) for
cases that worked fine on petsc 3.4. We're solving a Stokes equation
(just the velocity block) for a simple convection in a square box
(isoviscous). The problem only occurs if we supply a near null space
(via MatSetNearNullSpace) where we supply the usual (1,0) (0,1) and
(-y,x) (near) null space vectors. If we supply those, the smoother
complains that the diagonal of the A matrix at the first coarsened
level contains a zero. If I dump out the prolongator from the finest
to the first coarsened level it indeed contains a zero column at that
same index. We're pretty confident that the fine level A matrix is
correct (it solves fine with LU). I've briefly spoken to Matt about
this and he suggested trying to run with -pc_gamg_agg_nsmooths 0 (as
the default changed from 3.4 - dev) but that didn't make any
difference, the dumped out prolongator still has zero columns, and it
crashes in the same way. Do you have any further suggestions what to
try and how to further debug this?


Do you set the block size?  Can you reproduce by modifying
src/ksp/ksp/examples/tutorials/ex49.c (plane strain elasticity)?



I don't set a block size, no. About ex49: Ah great, with master (just updated 
now) I get:

[skramer@stommel]{/data/stephan/git/petsc/src/ksp/ksp/examples/tutorials}$ 
./ex49 -elas_pc_type gamg -mx 100 -my 100 -mat_no_inode
[0]PETSC ERROR: - Error Message 
--
[0]PETSC ERROR: Arguments are incompatible
[0]PETSC ERROR: Zero diagonal on row 1
[0]PETSC ERROR: See http://http://www.mcs.anl.gov/petsc/documentation/faq.html 
for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-3671-gbb161d1  GIT Date: 
2014-03-21 01:14:15 +
[0]PETSC ERROR: ./ex49 on a linux-gnu-c-opt named stommel by skramer Fri Mar 21 
11:25:55 2014
[0]PETSC ERROR: Configure options --download-fblaslapack=1 --download-blacs=1 
--download-scalapack=1 --download-ptscotch=1 --download-mumps=1 
--download-hypre=1 --download-suitesparse=1 --download-ml=1
[0]PETSC ERROR: #1 MatInvertDiagonal_SeqAIJ() line 1728 in 
/data/stephan/git/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #2 MatSOR_SeqAIJ() line 1760 in 
/data/stephan/git/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #3 MatSOR() line 3734 in 
/data/stephan/git/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: #4 PCApply_SOR() line 35 in 
/data/stephan/git/petsc/src/ksp/pc/impls/sor/sor.c
[0]PETSC ERROR: #5 PCApply() line 440 in 
/data/stephan/git/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #6 KSP_PCApply() line 227 in 
/data/stephan/git/petsc/include/petsc-private/kspimpl.h
[0]PETSC ERROR: #7 KSPSolve_Chebyshev() line 456 in 
/data/stephan/git/petsc/src/ksp/ksp/impls/cheby/cheby.c
[0]PETSC ERROR: #8 KSPSolve() line 458 in 
/data/stephan/git/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #9 PCMGMCycle_Private() line 19 in 
/data/stephan/git/petsc/src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: #10 PCMGMCycle_Private() line 48 in 
/data/stephan/git/petsc/src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: #11 PCApply_MG() line 330 in 
/data/stephan/git/petsc/src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: #12 PCApply() line 440 in 
/data/stephan/git/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #13 KSP_PCApply() line 227 in 
/data/stephan/git/petsc/include/petsc-private/kspimpl.h
[0]PETSC ERROR: #14 KSPInitialResidual() line 63 in 
/data/stephan/git/petsc/src/ksp/ksp/interface/itres.c
[0]PETSC ERROR: #15 KSPSolve_GMRES() line 234 in 
/data/stephan/git/petsc/src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: #16 KSPSolve() line 458 in 
/data/stephan/git/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #17 solve_elasticity_2d() line 1053 in 
/data/stephan/git/petsc/src/ksp/ksp/examples/tutorials/ex49.c
[0]PETSC ERROR: #18 main() line 1104 in 
/data/stephan/git/petsc/src/ksp/ksp/examples/tutorials/ex49.c
[0]PETSC ERROR: End of Error Message ---send entire error 
message to petsc-ma...@mcs.anl.gov--

Which is the same error we were getting on our problem
Cheers
Stephan


[petsc-users] gamg failure with petsc-dev

2014-03-20 Thread Stephan Kramer

Hi guys,

We have been having some problems with GAMG on petsc-dev (master) for cases that worked fine on petsc 3.4. We're solving a Stokes equation (just the velocity block) for a simple convection in a square 
box (isoviscous). The problem only occurs if we supply a near null space (via MatSetNearNullSpace) where we supply the usual (1,0) (0,1) and (-y,x) (near) null space vectors. If we supply those, the 
smoother complains that the diagonal of the A matrix at the first coarsened level contains a zero. If I dump out the prolongator from the finest to the first coarsened level it indeed contains a zero 
column at that same index. We're pretty confident that the fine level A matrix is correct (it solves fine with LU). I've briefly spoken to Matt about this and he suggested trying to run with 
-pc_gamg_agg_nsmooths 0 (as the default changed from 3.4 - dev) but that didn't make any difference, the dumped out prolongator still has zero columns, and it crashes in the same way. Do you have any 
further suggestions what to try and how to further debug this?


Cheers
Stephan


Re: [petsc-users] Fwd: MatZeroRowsColumns for mpibaij

2013-11-11 Thread Stephan Kramer

On 01/11/13 06:01, Matthew Knepley wrote:

On Wed, Oct 30, 2013 at 9:19 PM, Rhodri Davies rhodri.dav...@anu.edu.au 
mailto:rhodri.dav...@anu.edu.au wrote:

Dear all,

Would you mind giving us an update on your thoughts re: this question? 
We're very keen to be able to use such functionally and it would be good to 
know if you're looking into this / have any
suggestions?


I looked at the code. It does not look hard to make a MPIBAIJ version. Since 
the scatter context for BAIJ is already exploded, most
of the MPIAIJ code goes straight through, it just a matter of handling the 
blocking in a few places.

Here is what would be really helpful for us. Write a nice test. The code would 
probably only take an hour of being careful. However,
the test assembly would be more. If you give us a nice test example for it, 
writing it could happen pretty quickly.

   Thanks,

 Matt


That would be great. I had a look at the existing tests and found src/mat/examples/tests/ex140.c which already claims to test MatZeroRowsColumns with MPIBAIJ - except it isn't being run in parallel. 
Would running this test in parallel be a sufficient test to work with? It currently fails with the expected message No support for this operation for this object type!. I'd be happy to write another 
test but it would basically be creating another matrix and testing it does the right thing with the matrix and the rhs. Let me know if there's anything else we could help out with


Cheers
Stephan


From: Stephan Kramer s.kra...@imperial.ac.uk
Subject: Fwd: [petsc-users] MatZeroRowsColumns for mpibaij
Date: 31 October 2013 13:16:05 GMT+11:00
To: Rhodri Davies rhodri.dav...@anu.edu.au




 Original Message 
Subject: [petsc-users] MatZeroRowsColumns for mpibaij
Date: Wed, 19 Jun 2013 13:22:31 +0100
From: Stephan Kramer s.kra...@imperial.ac.uk
To: PETSc users list petsc-users@mcs.anl.gov

Dear all,

We have found the MatZeroRowsColumns() routine to be very useful for lifting 
boundary conditions. Unfortunately we also found that it isn't implemented for 
matrix type mpibaij and we're quite keen on
keeping the baij structure as it seems to give us significant savings for DG 
problems (in particular the blocked sor). I had a quick look at the 
implementation of MatZeroRowsColums_MPIAIJ but couldn't
really estimate how hard it would be to do something similar for mpibaij (an 
implementation for seqbaij exists already). Is this something that's likely to 
be looked at in the future?

Stephan


[petsc-users] MatZeroRowsColumns for mpibaij

2013-06-19 Thread Stephan Kramer

Dear all,

We have found the MatZeroRowsColumns() routine to be very useful for lifting boundary conditions. Unfortunately we also found that it isn't implemented for matrix type mpibaij and we're quite keen on 
keeping the baij structure as it seems to give us significant savings for DG problems (in particular the blocked sor). I had a quick look at the implementation of MatZeroRowsColums_MPIAIJ but couldn't 
really estimate how hard it would be to do something similar for mpibaij (an implementation for seqbaij exists already). Is this something that's likely to be looked at in the future?


Stephan


[petsc-users] Help with make: m2c: Command not found

2010-11-15 Thread Stephan Kramer
On 15/11/10 09:54, TAY wee-beng wrote:
 Hi,

 I have a modified makefile originally given to me by the PETSc team. It
 worked well but there's a small problem. Sometimes, when I have edited a
 fortran file and tries to rebuild, it gives this error:

 /opt/openmpi-1.4.1/bin/mpif90 -r8  -w95 -c -O3 -save airfoil.f90
 /opt/openmpi-1.4.1/bin/mpif90 -w95 -c -O3 -save grid.f90
 m2c-o cell_data.o cell_data.mod
 make: m2c: Command not found
 make: *** [cell_data.o] Error 127

 However, if I delete all the *.o and *.mod files and build clean,
 there's no such problems. Is there anyway to solve this problem?

 I have attached the makefile

 Thanks!

 Yours sincerely,

 TAY wee-beng





This is caused by a built-in implicit make rule that tells make it can make 
object files
from modula-2 .mod files by running m2c - obviously not what you want. You can 
override it by adding the following
line to your makefile:

%.o: %.mod

Make sure the next line doesn't start with a tab.

Cheers
Stephan


[petsc-users] [Fortran] subroutines inside modules?

2010-10-01 Thread Stephan Kramer
On 30/09/10 23:09, Leo van Kampenhout wrote:
 Declaring it external in the program/subroutine that is using the module
 results in

 main.F:65.43:
external gridtest
 Error: Cannot change attributes of USE-associated symbol at (1)

 Thanks, Leo

Yes, as I said before :) - module subroutines should *not* be declared 
external. You do
not need that line.

Cheers
Stephan




 2010/9/30 Stephan Kramer s.kramer at imperial.ac.uk
 mailto:s.kramer at imperial.ac.uk

 On 30/09/10 15:31, Leo van Kampenhout wrote:

 Hi all,

 since it is mandatory to declare all subroutines as external in
 Fortran, is it possible for Modules to have subroutines? I'm
 unable to
 declare the subroutine external inside the module itself, nor in the
 program which is using it. Not declaring it external at all
 results in
 the following compilation error:

 /net/users/csg/csg4035/master/workdir/src/main.F:97: undefined
 reference
 to `__grid_MOD_readgrid'

 (the module is here is named grid, the subroutine readgrid )

 Thanks,
 Leo


 If you put your subroutine in a module, it should not be declared
 external. You can directly call it from within the module itself. When
 calling it inside any other module/program you need to add use
 grid before
 the implicit none.

 Putting subroutines inside a module is highly recommended as it
 automatically
 provides an explicit interface so that the compiler can check the
 arguments in
 your subroutine call.

 Cheers
 Stephan





[petsc-users] [Fortran] subroutines inside modules?

2010-09-30 Thread Stephan Kramer
On 30/09/10 15:31, Leo van Kampenhout wrote:
 Hi all,

 since it is mandatory to declare all subroutines as external in
 Fortran, is it possible for Modules to have subroutines? I'm unable to
 declare the subroutine external inside the module itself, nor in the
 program which is using it. Not declaring it external at all results in
 the following compilation error:

 /net/users/csg/csg4035/master/workdir/src/main.F:97: undefined reference
 to `__grid_MOD_readgrid'

 (the module is here is named grid, the subroutine readgrid )

 Thanks,
 Leo


If you put your subroutine in a module, it should not be declared
external. You can directly call it from within the module itself. When
calling it inside any other module/program you need to add use grid before
the implicit none.

Putting subroutines inside a module is highly recommended as it automatically
provides an explicit interface so that the compiler can check the arguments in
your subroutine call.

Cheers
Stephan


[petsc-users] reporting failing pcksp solves

2010-06-20 Thread Stephan Kramer
On 20/06/10 16:21, Stephan Kramer wrote:
 On 20/06/10 05:07, Barry Smith wrote:


 Hmm, this is a good question. There are tons of places where some sort 
 of inner solve is imbedded in an outer solver, in fact many levels of 
 nesting. We should handle this in a systematic way which likely means 
 carefully checks put in various places in the code.

 We could instead (or in addition) add an option to have KSPSolve() 
 generate an error on not-convergence  instead of returning with a negative 
 converged reason. The drawback to this is that you cannot handle the error 
 and continue running, it just ends. Would this be useful?

  Barry

 I can see the logistical nightmare of having to check return values of every 
 PCApply, MatMult, etc. in petsc code. I guess this is where exception 
 handling would be a great thing to have.

 I think for our own purposes, we will follow Matt's suggestion of wrapping 
 KSPDefaultConverged() and set a global flag that can be checked afterwards. 
 Our code checks successful completion convergence
 of every solve afterwards. It always handles solver failures as a fatal 
 error, but the the code continues for a bit (end of timestep) to ensure final 
 diagnostics, mesh and fields are written out - it
 is an adaptive mesh model, so users may not have even seen the mesh that was 
 used to solve on.

 For simpler applications a petsc error generated on the spot would be a 
 useful option, yes, but a more robust general way of handling nested solver 
 failures would be greatly appreciated. I was
 slightly surprised to be honest to discover petsc happily continuing its ksp 
 solve even though the MatSchurComplement solves were silently failing. A 
 drawback of the suggested workaround, is that even
 if the MatSchurComplement solve fails in the first iteration (for instance 
 reaching max iterations), the outer solve will still continue, trying and 
 failing to solve the MatSchurComplement inner solve
 each iteration, until it finally hits the maximum number of outer iterations.

 Cheers
 Stephan

Actually, looking a little closer, I see a lot of negative KSPConvergedReasons 
are not actually raised by KSPDefaultConvergence(). In particular I still won't 
be able to trap KSP_DIVERGED_ITS this 
way, except for me adding an extra n  ksp-max_it check of course, but that 
still leaves out a lot of krylov method-specific reasons...





 On Jun 18, 2010, at 11:35 AM, Stephan Kramer wrote:

 Dear all,

 Is there a way in petsc, when performing inner solves like PCKSP or 
 MatSchurComplementGetKSP, to make the outer solve stop immediately and 
 report back a negative convergence reason? I find that often when such 
 inner solves fail, the outer solve happily continues and sometimes 
 falsely reports convergence due to the preconditioner becoming rank 
 deficient. I'd like our code, using petsc, to be able to trap that sort of 
 situations and give a suitable error message.

 Cheers
 Stephan





-- 
Stephan Kramer  s.kramer at imperial.ac.uk
Applied Modelling and Computation Group,
Department of Earth Science and Engineering,
Imperial College London


[petsc-users] reporting failing pcksp solves

2010-06-18 Thread Stephan Kramer
Dear all,

Is there a way in petsc, when performing inner solves like PCKSP or 
MatSchurComplementGetKSP, to make the outer solve stop immediately and report 
back a negative convergence reason? I find that often 
when such inner solves fail, the outer solve happily continues and sometimes 
falsely reports convergence due to the preconditioner becoming rank deficient. 
I'd like our code, using petsc, to be able 
to trap that sort of situations and give a suitable error message.

Cheers
Stephan


[petsc-users] configure script for petsc

2010-04-23 Thread Stephan Kramer
Hi all,

I'm trying to write a portable configure script for our software that uses 
petsc, and needs to deal with petsc installations on a number of different 
platforms. In petsc 3.0 I could do

   make -f $(PETSC_DIR)/conf/base getlinklibs

and

   make -f $(PETSC_DIR)/conf/base getincludedirs

that would work for both installed petsc and petsc in its own build directory 
(provided PETSC_ARCH is set correctly of course). In petsc 3.1 conf/base seems 
to have disappeared, is the best
way to deal with this to make my own little makefile:

include ${PETSC_DIR}/conf/variables
include ${PETSC_DIR}/conf/rules

and use make -f my_own_little_makefile getlinklibs instead?

Cheers
Stephan


[petsc-users] configure script for petsc

2010-04-23 Thread Stephan Kramer
Satish Balay wrote:
 On Fri, 23 Apr 2010, Stephan Kramer wrote:
 
 Hi all,

 I'm trying to write a portable configure script for our software that uses
 petsc, and needs to deal with petsc installations on a number of different
 platforms. In petsc 3.0 I could do

   make -f $(PETSC_DIR)/conf/base getlinklibs

 and

   make -f $(PETSC_DIR)/conf/base getincludedirs

 that would work for both installed petsc and petsc in its own build
 directory (provided PETSC_ARCH is set correctly of course). In petsc 3.1
 conf/base seems to have disappeared, is the best
 way to deal with this to make my own little makefile:

 include ${PETSC_DIR}/conf/variables
 include ${PETSC_DIR}/conf/rules

 and use make -f my_own_little_makefile getlinklibs instead?
 
 Yes - this way - you can add more stuff into this makefile - if needed.
 
 If you just do getlinklibs and getincludedirs - then you can probably
 just do:
 
 make -f $(PETSC_DIR)/makefile getincludedirs

True, except the makefile doesn't make it into a prefix-installed
installation of petsc. Having my own makefile do it works fine -
  was just checking if this is indeed the recommended way

Thanks a lot
Cheers
Stephan


 
 etc..
 
 Satish
 


-- 
Stephan Kramer  s.kramer at imperial.ac.uk
Applied Modelling and Computation Group,
Department of Earth Science and Engineering,
Imperial College London


MPIBAIJ and sor

2009-10-27 Thread Stephan Kramer
Hi all,

I'm slightly confused by the fact that if I use my BAIJ matrix in parallel and 
select the sor preconditioner. it complains about mat-ops-relax and 
mat-ops-pbrelax not being defined. Also I can't 
find a MatRelax_MPIBAIJ in src/mat/impls/baij/mpi/mpibaij.c as I would have 
expected. Is there no sor type relaxation for parallel block matrices at all?

Cheers
Stephan


some sor questions

2009-09-22 Thread Stephan Kramer
Hi all,

I have some questions basically about the MatRelax_SeqAIJ routine:

If I understand correctly there are 2 versions of the sor routine 
depending on whether or not there is a zero guess, so that with a zero 
guess in the forward sweep you don't need to multiply the upper diagonal 
part U with the part of the x vector that is still zero. Why then does 
it look like that both versions log the same number of flops? I would 
have expected that the full forward sweep (i.e. no zero guess) takes 
2*a-nz flops (i.e. the same as a matvec) and not a-nz.

Why does the Richardson iteration with sor not take the zero guess into 
account, i.e. why does PCApplyRichardson_SOR not set SOR_ZERO_INIT_GUESS 
in the call to MatRelax if the Richardson ksp has a zero initial guess set?

In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or 
SOR_LOCAL_BACKWARD_SWEEP it calls MatRelax on the local part of the 
matrix, mat-A, with its=lits and lits=PETSC_NULL (?). However the first 
line of MatRelax_SeqAIJ then says: its = its*lits. Is that right?

Please tell me if I'm totally misunderstanding how the routine works, 
thanks for any help.

Cheers
Stephan

-- 
Stephan Kramer  s.kramer at imperial.ac.uk
Applied Modelling and Computation Group,
Department of Earth Science and Engineering,
Imperial College London


some sor questions

2009-09-22 Thread Stephan Kramer
Thanks for your answers

Barry Smith wrote:
 On Sep 22, 2009, at 8:47 AM, Stephan Kramer wrote:
 
 Hi all,

 I have some questions basically about the MatRelax_SeqAIJ routine:

 If I understand correctly there are 2 versions of the sor routine  
 depending on whether or not there is a zero guess, so that with a  
 zero guess in the forward sweep you don't need to multiply the upper  
 diagonal part U with the part of the x vector that is still zero.  
 Why then does it look like that both versions log the same number of  
 flops? I would have expected that the full forward sweep (i.e. no  
 zero guess) takes 2*a-nz flops (i.e. the same as a matvec) and not  
 a-nz.
 
 You are right. This is an error in our code. It will be in the  
 next patch.
 
 Why does the Richardson iteration with sor not take the zero guess  
 into account, i.e. why does PCApplyRichardson_SOR not set  
 SOR_ZERO_INIT_GUESS in the call to MatRelax if the Richardson ksp  
 has a zero initial guess set?
 
 This appears to be a design limitation. There is no mechanism to  
 pass the information that the initial guess is zero into  
 PCApplyRichardson(). We could add support for this by adding one more  
 argument to PCApplyRichardson() for this information. I don't see a  
 simpler way.  If one is running, say 2 iterations of Richardson then  
 this would be a measurable improvement in time. If one is running many  
 iterations then the savings is tiny. Perhaps this support should be  
 added.

I'm thinking of the application of ksp richardson with sor as a smoother 
in pcmg. In which case the down smoother will have zero initial guess 
(as it only acts on the residual), and there will be typicaly only 1 or 
2 iterations, so the saving would be significant. Is there another way I 
should set this up instead?

 
 
 
 In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or  
 SOR_LOCAL_BACKWARD_SWEEP it
 
 
 calls MatRelax on the local part of the matrix, mat-A, with  
 its=lits and lits=PETSC_NULL (?).
 
 However the first line of MatRelax_SeqAIJ then says: its = its*lits.  
 Is that right?
 
 This is all wrong. It should be passing 1 in, not PETSC_NULL. This  
 was fixed in petsc-dev but not in petsc-3.0.0 I will fix it in  
 petsc-3.0.0 and it will be in the next patch.
 
 Thanks for pointing out the problems.
 
 If you plan to use SOR a lot, you might consider switching to  
 petsc-dev since I have made some improvements there. Also consider the  
 Eisenstat trick preconditioner.
 
 Barry
 
 
 Please tell me if I'm totally misunderstanding how the routine  
 works, thanks for any help.

 Cheers
 Stephan

 -- 
 Stephan Kramer  s.kramer at imperial.ac.uk
 Applied Modelling and Computation Group,
 Department of Earth Science and Engineering,
 Imperial College London
 
 



MatGetArrayF90 returns 2d array

2009-06-30 Thread Stephan Kramer
Hello,

Why is it that MatGetArrayF90 returns a pointer to a 2d scalar array, 
instead of 1d as stated in the documentation? In fact it seems to always 
return a nrows x ncolumns array. Is this to deal with the MATDENSE case? 
Would it not be more elegant to always return a 1d array, so you get 
what you expect for the sparse matrices, and return a 1d array of length 
nrows*ncolumns in the case of MATDENSE?

Cheers
Stephan Kramer

-- 
Stephan Kramer
Applied Modelling and Computation Group,
Department of Earth Science and Engineering,
Imperial College London


Mismatch in explicit fortran interface for MatGetInfo

2009-06-18 Thread Stephan Kramer
Hi Satish,

I tried with p6 and it indeed works fine now. Thanks a lot for looking 
into this. A related question: it seems still quite a lot of very common 
interfaces (PETScInitialize, KSPSettype, MatView, etc.) are missing. Are 
there any plans of adding those in the future?

Cheers
Stephan

Satish Balay wrote:
 On Sat, 30 May 2009, Stephan Kramer wrote:
 
 Satish Balay wrote:
 On Sat, 30 May 2009, Stephan Kramer wrote:

 Thanks a lot for looking into this. The explicit fortran interfaces are in
 general very useful. The problem occurred for me with petsc-3.0.0-p1. I'm
 happy to try it out with a more recent patch-level or with petsc-dev.
 Did you configure with '--with-fortran-interfaces=1' or are you
 directly using '#include finclude/ftn-auto/petscmat.h90'?

 Configured with '--with-fortran-interfaces=1', yes, and then using them via
 the fortran modules: use petscksp, use petscmat, etc.
 
 ok. --with-fortran-interfaces was broken in p0, worked in p1,p2,p3,p4
 - broken in curent p5. The next patch update - p6 will have the fix
 for this issue [along with the fix for MatGetInfo() interface]
 
 Satish
 



Mismatch in explicit fortran interface for MatGetInfo

2009-05-30 Thread Stephan Kramer
Thanks a lot for looking into this. The explicit fortran interfaces are 
in general very useful. The problem occurred for me with petsc-3.0.0-p1. 
I'm happy to try it out with a more recent patch-level or with petsc-dev.

My current workaround is to simply wrap the call the MatGetInfo inside 
an external subroutine that doesn't use the fortran interfaces, so we 
can still apply the interfaces in the rest of the code. This is part of 
our CFD code that has to build on a number of different platforms, so I 
will probably have to maintain this workaround, so it will build with 
whatever petsc 3 version is available on a given platform.

The attached files were for current petsc-dev?

Cheers
Stephan

Satish Balay wrote:
 I have the following fixed files [untested yet]. But could you tell me
 how you've configured PETSc? - and what patchlevel?
 
 [It appears that there are quiet a few breakages with
 --with-fortran-interfaces=1 - I might fix this in petsc-dev - not
 3.0.0 - as it depends upon some f90 interface changes that are only in
 petsc-dev]
 
 
 Attaching the modified files that go with my untested fix.
 
 include/finclude/ftn-auto/petscmat.h90
 include/finclude/ftn-custom/petscmat.h90
 src/mat/interface/ftn-auto/matrixf.c
 src/mat/interface/ftn-custom/zmatrixf.c
 src/mat/interface/matrix.c
 
 Satish
 
 
 On Wed, 27 May 2009, Barry Smith wrote:
 
  Stephan,

   Satish is working on the patch for this and will get it to you shortly.

   Sorry for the delay, we were debating how to handle it.

Barry

 On May 23, 2009, at 9:00 AM, Stephan Kramer wrote:

 Hi all,

 First of all thanks of a lot for providing explicit fortran interfaces for
 most functions in Petsc 3. This is of great help. I do however run into a
 problem using MatGetInfo. The calling sequence for fortran (according to the
 manual) is:

 double precision info(MAT_INFO_SIZE)
 Mat cid:part1.02040105.02020103 at imperial.ac.uk A
 integer ierr

 call MatGetInfo
 cid:part2.00070305.02060306 at imperial.ac.uk(A,MAT_LOCAL,info,ierr)

 The interface however seems to indicate the info argument has to be a single
 double precision (i.e. a scalar not an array). I guess with implicit
 interfaces this sort of thing would work, but with the provided explicit
 interface, at least gfortran won't let me have it.

 Cheers
 Stephan




Mismatch in explicit fortran interface for MatGetInfo

2009-05-30 Thread Stephan Kramer
Satish Balay wrote:
 On Sat, 30 May 2009, Stephan Kramer wrote:
 
 Thanks a lot for looking into this. The explicit fortran interfaces are in
 general very useful. The problem occurred for me with petsc-3.0.0-p1. I'm
 happy to try it out with a more recent patch-level or with petsc-dev.
 
 Did you configure with '--with-fortran-interfaces=1' or are you
 directly using '#include finclude/ftn-auto/petscmat.h90'?
 

Configured with '--with-fortran-interfaces=1', yes, and then using them 
via the fortran modules: use petscksp, use petscmat, etc.

 With my builds --with-fortran-interfaces=1 is broken with p0 [so p1
 might also be broken] There was some reorganizing of f90 interface in
 the newer patchlevel [p4/ or p5] - and thats also broken [so petsc-dev
 is also broken]
 
 
 My current workaround is to simply wrap the call the MatGetInfo inside an
 external subroutine that doesn't use the fortran interfaces, so we can still
 apply the interfaces in the rest of the code. This is part of our CFD code
 that has to build on a number of different platforms, so I will probably have
 to maintain this workaround, so it will build with whatever petsc 3 version 
 is
 available on a given platform.

 The attached files were for current petsc-dev?
 
 Thare are from petsc-3.0.0 - but it should also work with
 petsc-dev. Its tested by building PETSc normally - and changing the
 example ex12f.F as follows:
 
 asterix:/home/balay/tmp/petsc-dist-testhg diff 
 src/ksp/ksp/examples/tests/ex12f.F
 diff -r 981c76f817e6 src/ksp/ksp/examples/tests/ex12f.F
 --- a/src/ksp/ksp/examples/tests/ex12f.FTue May 26 22:13:06 2009 -0500
 +++ b/src/ksp/ksp/examples/tests/ex12f.FSat May 30 11:02:03 2009 -0500
 @@ -8,6 +8,10 @@
  #include finclude/petscpc.h
  #include finclude/petscksp.h
  #include finclude/petscviewer.h
 +#define PETSC_USE_FORTRAN_INTERFACES
 +#include finclude/petscmat.h90
 +#undef PETSC_USE_FORTRAN_INTERFACES
 +
  !
  !  This example is the Fortran version of ex6.c.  The program reads a PETSc 
 matrix
  !  and vector from a file and solves a linear system.  Input arguments are:
 asterix:/home/balay/tmp/petsc-dist-test
 
 Satish
 

I'll have a go at it

 Cheers
 Stephan

 Satish Balay wrote:
 I have the following fixed files [untested yet]. But could you tell me
 how you've configured PETSc? - and what patchlevel?

 [It appears that there are quiet a few breakages with
 --with-fortran-interfaces=1 - I might fix this in petsc-dev - not
 3.0.0 - as it depends upon some f90 interface changes that are only in
 petsc-dev]


 Attaching the modified files that go with my untested fix.

 include/finclude/ftn-auto/petscmat.h90
 include/finclude/ftn-custom/petscmat.h90
 src/mat/interface/ftn-auto/matrixf.c
 src/mat/interface/ftn-custom/zmatrixf.c
 src/mat/interface/matrix.c

 Satish


 On Wed, 27 May 2009, Barry Smith wrote:

  Stephan,

   Satish is working on the patch for this and will get it to you shortly.

   Sorry for the delay, we were debating how to handle it.

Barry

 On May 23, 2009, at 9:00 AM, Stephan Kramer wrote:

 Hi all,

 First of all thanks of a lot for providing explicit fortran interfaces
 for
 most functions in Petsc 3. This is of great help. I do however run into
 a
 problem using MatGetInfo. The calling sequence for fortran (according to
 the
 manual) is:

 double precision info(MAT_INFO_SIZE)
 Mat cid:part1.02040105.02020103 at imperial.ac.uk A
 integer ierr

 call MatGetInfo
 cid:part2.00070305.02060306 at imperial.ac.uk(A,MAT_LOCAL,info,ierr)

 The interface however seems to indicate the info argument has to be a
 single
 double precision (i.e. a scalar not an array). I guess with implicit
 interfaces this sort of thing would work, but with the provided explicit
 interface, at least gfortran won't let me have it.

 Cheers
 Stephan


 
 



Mismatch in explicit fortran interface for MatGetInfo

2009-05-23 Thread Stephan Kramer
Hi all,

First of all thanks of a lot for providing explicit fortran interfaces 
for most functions in Petsc 3. This is of great help. I do however run 
into a problem using MatGetInfo. The calling sequence for fortran 
(according to the manual) is:

double precision info(MAT_INFO_SIZE)
Mat cid:part1.02040105.02020103 at imperial.ac.uk A
integer ierr

call MatGetInfo 
cid:part2.00070305.02060306 at imperial.ac.uk(A,MAT_LOCAL,info,ierr)

The interface however seems to indicate the info argument has to be a 
single double precision (i.e. a scalar not an array). I guess with 
implicit interfaces this sort of thing would work, but with the provided 
explicit interface, at least gfortran won't let me have it.

Cheers
Stephan



redistribution of vectors

2009-02-10 Thread Stephan Kramer
Hi,

I'm trying to pick up a linear system (matrix and rhs) that I've written 
out in a previous parallel run using MatView and VecView in binary mode. 
Now when I've read the matrix and rhs vector the rows are evenly 
distributed by PETSc over the processes. The partioning previously used 
in the parallel run however doesn't in general correspond to this even 
distribution (the global node numbering is the same, but number of rows 
are not exactly equal per process due to other constraints). So what I'm 
trying to do is redistribute the read in matrix and vector to match the 
previously used partioning. I've found that for the matrix this is easy 
to do using MatGetSubMatrix(), but I'm a little stuck on how to do this 
for the vector. Is there a similar way of doing this for the vector? 
There seems to be no way of extracting vector values not stored on this 
processor.

I've tried to look at src/ksp/ksp/examples/tutorials/example10.c where 
something similar is done for a repartioning created by MatPartioning. 
It does the redistribution of the matrix with MatGetSubMatrix() as I 
expected, but then seems to do nothing for the vector (there is only a 
comment:  /* need to move the vector also */ ).

Does this mean I just have to extract the vector values locally and do 
some redistributing with MPI myself?

Cheers
Stephan


redistribution of vectors

2009-02-10 Thread Stephan Kramer
Matthew Knepley wrote:
 On Tue, Feb 10, 2009 at 2:11 PM, Stephan Kramer
 stephan.kramer at imperial.ac.uk wrote:
   
 Hi,

 I'm trying to pick up a linear system (matrix and rhs) that I've written out
 in a previous parallel run using MatView and VecView in binary mode. Now
 when I've read the matrix and rhs vector the rows are evenly distributed by
 PETSc over the processes. The partioning previously used in the parallel run
 however doesn't in general correspond to this even distribution (the global
 node numbering is the same, but number of rows are not exactly equal per
 process due to other constraints). So what I'm trying to do is redistribute
 the read in matrix and vector to match the previously used partioning. I've
 found that for the matrix this is easy to do using MatGetSubMatrix(), but
 I'm a little stuck on how to do this for the vector. Is there a similar way
 of doing this for the vector? There seems to be no way of extracting vector
 values not stored on this processor.

 I've tried to look at src/ksp/ksp/examples/tutorials/example10.c where
 something similar is done for a repartioning created by MatPartioning. It
 does the redistribution of the matrix with MatGetSubMatrix() as I expected,
 but then seems to do nothing for the vector (there is only a comment:  /*
 need to move the vector also */ ).

 Does this mean I just have to extract the vector values locally and do some
 redistributing with MPI myself?
 

 You can certainly use a VecScatter to redistribute a vector. You
 already have the
 indices from MatGetSubMatrix() so it should be easy.

 However, you can also use VecLoadIntoVector() and provide an initial
 partitioning.

   Matt
   
Ah brilliant, I must have overlooked these options (there are so many in 
PETSc!). In that case is there also something like VecLoadIntoVector for 
MatLoad, i.e. where I specify the distribution beforehand, otherwise 
I'll just go for the MatGetSubMatrix() + VecScatter().

Just a small related question, for MatGetSubMatrix where I ask for the 
rows owned by each process but all columns, should I really assemble an 
index set ranging over all global indices? This seems a bit wasteful and 
non-scalable. Or should I put in the effort and, using information I 
have over my halo regions, work out which columns might have nonzeros 
for these rows. Or am I missing something and is there an easier short-cut?

Thanks a lot
Stephan

   
 Cheers
 Stephan
 



redistribution of vectors

2009-02-10 Thread Stephan Kramer
Barry Smith wrote:

 On Feb 10, 2009, at 5:01 PM, Stephan Kramer wrote:






 You can certainly use a VecScatter to redistribute a vector. You
 already have the
 indices from MatGetSubMatrix() so it should be easy.

 However, you can also use VecLoadIntoVector() and provide an initial
 partitioning.

  Matt

 Ah brilliant, I must have overlooked these options (there are so many 
 in PETSc!). In that case is there also something like 
 VecLoadIntoVector for MatLoad, i.e. where I specify the distribution 
 beforehand, otherwise I'll just go for the MatGetSubMatrix() + 
 VecScatter().

We don't currently have a MatLoadIntoVector(). The current plan is 
 to actually remove VecLoadIntoVector() and have both VecLoad() and 
 MatLoad() be essentially load into operations where one is free to 
 define the layout of the vector and matrix before or not thus getting 
 the full spectrum of possibilities in one clean interface. Sadly we 
 haven't done this reorganization due to having many other things to do.


 Just a small related question, for MatGetSubMatrix where I ask for 
 the rows owned by each process but all columns, should I really 
 assemble an index set ranging over all global indices? This seems a 
 bit wasteful and non-scalable. Or should I put in the effort and, 
 using information I have over my halo regions, work out which columns 
 might have nonzeros for these rows. Or am I missing something and is 
 there an easier short-cut?

It is true that it is not totally scalable but then dumping some 
 huge honkying sparse matrix to a file and reading it in is not 
 scalable either. We've never had a problem with this yet for 
 reasonable matrix sizes. Rather than fixing this MatGetSubMatrix() , 
 in your case, having a better MatLoad() merged with 
 MatLoadIntoMatrix() would be the thing I recommend optimizing. But 
 frankly until you get up to 1 billion unknowns what is there now is 
 probably good enough.

Barry

Good point. No, wasn't really planning on using this for anything that 
huge. Thanks a lot for your answers

Cheers
Stephan


 Thanks a lot
 Stephan


 Cheers
 Stephan







KSPSetNullSpace and CG for singular systems

2008-08-02 Thread Stephan Kramer
Shao-Ching Huang wrote:
 Hi,

 I am trying to use CG to solve a singular systems (i.e. Poisson
 equation with periodic conditions on all boundaries). 

 The code works when I use GMRES, but it diverges when I switch CG.

 Since the null space is a constant vector, in the code I have:
   
I might be missing something here, but aren't linearly varying vectors 
in your null space as well?

Cheers
Stephan
KSP ksp;
MatNullSpace nullspace;
...
MatNullSpaceCreate(MPI_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, nullspace);
KSPSetNullSpace(ksp, nullspace);
MatNullSpaceRemove(nullspace, f-p_rhs, PETSC_NULL);

KSPSetFromOptions(ksp);
KSPSetUp(ksp);
KSPSolve(ksp, f-p_rhs, f-phi);

 [f-p_rhs is the RHS vector. f-phi is the solution vector.]

 When I use -ksp_type gmres, it converges (shown by -ksp_monitor).

 However, when I switch to -ksp_type cg, it diverges.
 In this case (cg), -ksp_converged_reason says:

   Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 1

 I also try adding -sub_pc_factor_shift_nonzero 0.01 but the
 CG case still fails.

 Am I missing some step in handling the null space when using CG?

 Thanks,

 Shao-Ching Huang



   




PetscPrintf/ifort 9.1

2007-08-29 Thread Stephan Kramer
Paul T. Bauman wrote:
 Hi Barry,

 To preface, I sincerely appreciate your prompt response and 
 willingness to produce quick fixes.

 However, it was pointed out to me in my office today (by someone who 
 knows more about interfacing C and FORTRAN than I and who watches the 
 petsc-users list) that the following code would have been the correct 
 way for me to do this (I tested and it works on intel and g95):

 call PetscPrintf(PETSC_COMM_WORLD, 
 ===//achar(10), ierr)
 call PetscPrintf(PETSC_COMM_WORLD,  Beginning of 
 Program//achar(10), ierr)
 call PetscPrintf(PETSC_COMM_WORLD, 
 ===//achar(10), ierr)

 'achar' is a FORTRAN intrinsic function that gets fed an integer and 
 returns the corresponding ascii character from the ascii character 
 set.  In this case, 10 corresponds to the new line command and thus 
 passes the correct format to the C call. I tested on the reverted 
 version of zmprint.c and it works correctly (although it also works 
 correctly with the patch you sent).
While this may work for you, it is certainly not *the* correct way to do 
it. achar(10) corresponds to LF, whereas windows systems
expect CR+LF (i.e. achar(13)//achar(10)), it may still work under 
windows as well though, depends on your terminal/ application
you view the output with. The correct way in C is really to use \n (and 
in the above case I guess), and in fortran to use two
separate write statements.

Cheers Stephan
 I wanted to share out of the sake of proliferation of knowledge (and 
 not to make you feel like I wasted your time). Thanks again for your 
 efforts, especially to us lingering FORTRAN'ers. Hopefully this will 
 clarify in the future if it ever comes up again. Sorry for the trouble.

 Paul

 Barry Smith wrote:
   Paul,

I have pushed a fix to petsc-dev that resolves this problem. If 
 you are
 not using petsc-dev you can simply replace the file 
 src/sys/fileio/ftn-custom/zmprintf.c with the attached file then run 
 make lib shared in that directory then relink your program.

Barry


 On Tue, 28 Aug 2007, Paul T. Bauman wrote:

  
 Hello,

 Kind of a non-numerical question - sorry.  I'm using PetscPrintf 
 from Fortran
 using the following calling sequence (for example):

 call PetscPrintf(PETSC_COMM_WORLD,
 === \n, ierr)
 call PetscPrintf(PETSC_COMM_WORLD,Beginning of 
 Program
 \n, ierr)
 call PetscPrintf(PETSC_COMM_WORLD,
 === \n, ierr)

 On my mac laptop (PowerPC) with PETSc 2.3.2-p7 compiled with gcc 
 4.0.1 and
 g95, this prints as expected:

 ===
   Beginning of Program
 ===

 On a linux workstation (AMD Opteron) with PETSc 2.3.2-p7 compiled 
 with icc,
 icxx, ifort 9.1, the \n is treated as a character and not treated 
 as move
 to new line so the output is all on one line:

 === \n   
 Beginning of
 Program \n=== \n

 While not the end of the world, it is a bit annoying. Am I screwing 
 up the
 calling sequence and just got lucky with g95?  Or a bug (PETSc or 
 Intel)?

 Thanks,

 Paul

 P.S. The reason why I care is because on any compiler I've used 
 (other than
 Intel), write(*,*) prints out at different times and not in sequence 
 with
 PETSc so all my output comes out after PETSc is done outputting. Using
 PetscPrintf, I can have everything print out in order.

 






PCShellSetContext fortran interface

2007-06-07 Thread Stephan Kramer
Hi,

I don't know if this is the expected behaviour but calling PCShellSetContext 
from fortran, the thing that is stored is not the integer value of context, 
but the pointer to this integer. Thus problems arise when supplying an 
integer variable that goes out of scope (e.g. on the stack) inbetween 
PCShellSetContext and the call to the 'apply shell preconditioner' routine. I 
know this makes sense from the C point of view, where you are merely 
supplying a void pointer, but this is not what you expect in Fortran were the 
only use of context could be as a reference/index itself to another object, 
and you thus expect the value of context to be stored.

Cheers
Stephan Kramer