Re: [petsc-users] pcfieldsplit for a composite dm with multiple subfields

Gideon Simpson Mon, 07 Sep 2015 18:19:30 -0700

Running with that flag gives me this:

[0]PETSC ERROR: PETSC: Attaching gdb to ./blowup_batch_refine of pid 16111 on 
gs_air
Unable to start debugger: No such file or directory




-gideon

> On Sep 7, 2015, at 9:11 PM, Barry Smith <[email protected]> wrote:
> 
> 
>  This should not happen. Run with a debug version of PETSc installed and the 
> option -start_in_debugger noxterm. Once the debugger starts up, type cont, and 
> when it crashes, type where or bt. Send all output.
> 
> 
> 
>  Barry
> 
> 
>> On Sep 7, 2015, at 8:09 PM, Gideon Simpson <[email protected]> wrote:
>> 
>> I’m getting an error with -snes_mf_operator:
>> 
>>  0 SNES Function norm 1.421454390131e-02 
>> [0]PETSC ERROR: 
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, 
>> probably memory access out of range
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see 
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X 
>> to find memory corruption errors
>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and 
>> run 
>> [0]PETSC ERROR: to get more information on the crash.
>> [0]PETSC ERROR: --------------------- Error Message 
>> --------------------------------------------------------------
>> [0]PETSC ERROR: Signal received
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
>> trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown 
>> [0]PETSC ERROR: ./blowup_batch_refine on a arch-macports named gs_air by 
>> gideon Mon Sep  7 21:08:19 2015
>> [0]PETSC ERROR: Configure options --prefix=/opt/local 
>> --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries 
>> --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 
>> --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate
>>  --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local 
>> --with-superlu-dir=/opt/local --with-metis-dir=/opt/local 
>> --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local 
>> --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local 
>> CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp 
>> FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp 
>> F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os 
>> FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" 
>> CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os 
>> FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports 
>> --with-mpiexec=mpiexec-mpich-mp
>> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> 
>> -gideon
>> 
>>> On Sep 7, 2015, at 9:01 PM, Barry Smith <[email protected]> wrote:
>>> 
>>> 
>>> My guess is the Jacobian is not correct (or correct "enough"), hence PETSc 
>>> SNES is generating a poor descent direction. You can try 
>>> -snes_mf_operator -ksp_monitor_true_residual as additional arguments. What 
>>> happens?
>>> 
>>> Barry
>>> 
>>> 
>>> 
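One way to test the guess above outside PETSc is to compare a hand-coded Jacobian against a finite-difference approximation of the residual. The sketch below is not code from this thread: the two-unknown residual, the function names, and the step size are all hypothetical stand-ins. It mirrors the idea behind -snes_mf_operator, which applies the Jacobian by differencing the function and uses the user-supplied matrix only to build the preconditioner.

```c
#include <stddef.h>

#define N 2

/* Hypothetical residual F(x) for a toy problem with two unknowns. */
static void residual(const double x[N], double f[N])
{
    f[0] = x[0] * x[0] + x[1] - 3.0;
    f[1] = x[0] + x[1] * x[1] * x[1] - 5.0;
}

/* Hand-coded Jacobian J[i][j] = dF_i/dx_j of the residual above. */
static void jacobian(const double x[N], double J[N][N])
{
    J[0][0] = 2.0 * x[0];
    J[0][1] = 1.0;
    J[1][0] = 1.0;
    J[1][1] = 3.0 * x[1] * x[1];
}

/* Return the largest absolute difference between the hand-coded Jacobian
   and a forward-difference approximation built one column at a time. */
double jacobian_fd_error(const double x[N], double h)
{
    double J[N][N], f0[N], f1[N], xp[N], err = 0.0;

    jacobian(x, J);
    residual(x, f0);
    for (size_t j = 0; j < N; j++) {
        for (size_t k = 0; k < N; k++) xp[k] = x[k];
        xp[j] += h;                       /* perturb unknown j only */
        residual(xp, f1);
        for (size_t i = 0; i < N; i++) {
            double d = (f1[i] - f0[i]) / h - J[i][j];
            if (d < 0.0) d = -d;
            if (d > err) err = d;
        }
    }
    return err;
}
```

With a correct Jacobian the returned error should shrink roughly in proportion to h until roundoff takes over. An error that stays large as h decreases suggests a mistake in the hand-coded entries.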
>>>> On Sep 7, 2015, at 7:49 PM, Gideon Simpson <[email protected]> 
>>>> wrote:
>>>> 
>>>> No problem Matt, I don’t think we had previously discussed that output.  
>>>> Here is a case where things fail.
>>>> 
>>>>     0 SNES Function norm 4.027481756921e-09 
>>>>     1 SNES Function norm 1.760477878365e-12 
>>>>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1
>>>>   0 SNES Function norm 5.066222213176e+03 
>>>>   1 SNES Function norm 8.484697184230e+02 
>>>>   2 SNES Function norm 6.549559723294e+02 
>>>>   3 SNES Function norm 5.770723278153e+02 
>>>>   4 SNES Function norm 5.237702240594e+02 
>>>>   5 SNES Function norm 4.753909019848e+02 
>>>>   6 SNES Function norm 4.221784590755e+02 
>>>>   7 SNES Function norm 3.806525080483e+02 
>>>>   8 SNES Function norm 3.762054656019e+02 
>>>>   9 SNES Function norm 3.758975226873e+02 
>>>>  10 SNES Function norm 3.757032042706e+02 
>>>>  11 SNES Function norm 3.728798164234e+02 
>>>>  12 SNES Function norm 3.723078741075e+02 
>>>>  13 SNES Function norm 3.721848059825e+02 
>>>>  14 SNES Function norm 3.720227575629e+02 
>>>>  15 SNES Function norm 3.720051998555e+02 
>>>>  16 SNES Function norm 3.718945430587e+02 
>>>>  17 SNES Function norm 3.700412694044e+02 
>>>>  18 SNES Function norm 3.351964889461e+02 
>>>>  19 SNES Function norm 3.096016086233e+02 
>>>>  20 SNES Function norm 3.008410789787e+02 
>>>>  21 SNES Function norm 2.752316716557e+02 
>>>>  22 SNES Function norm 2.707658474165e+02 
>>>>  23 SNES Function norm 2.698436736049e+02 
>>>>  24 SNES Function norm 2.618233857172e+02 
>>>>  25 SNES Function norm 2.600121920634e+02 
>>>>  26 SNES Function norm 2.585046423168e+02 
>>>>  27 SNES Function norm 2.568551090220e+02 
>>>>  28 SNES Function norm 2.556404537064e+02 
>>>>  29 SNES Function norm 2.536353523683e+02 
>>>>  30 SNES Function norm 2.533596070171e+02 
>>>>  31 SNES Function norm 2.532324379596e+02 
>>>>  32 SNES Function norm 2.531842335211e+02 
>>>>  33 SNES Function norm 2.531684527520e+02 
>>>>  34 SNES Function norm 2.531637604618e+02 
>>>>  35 SNES Function norm 2.531624767821e+02 
>>>>  36 SNES Function norm 2.531621359093e+02 
>>>>  37 SNES Function norm 2.531620504925e+02 
>>>>  38 SNES Function norm 2.531620350055e+02 
>>>>  39 SNES Function norm 2.531620310522e+02 
>>>>  40 SNES Function norm 2.531620300471e+02 
>>>>  41 SNES Function norm 2.531620298084e+02 
>>>>  42 SNES Function norm 2.531620297478e+02 
>>>>  43 SNES Function norm 2.531620297324e+02 
>>>>  44 SNES Function norm 2.531620297303e+02 
>>>>  45 SNES Function norm 2.531620297302e+02 
>>>> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 45
>>>> 0 SNES Function norm 9.636339304380e+03 
>>>> 1 SNES Function norm 8.997731184634e+03 
>>>> 2 SNES Function norm 8.120498349232e+03 
>>>> 3 SNES Function norm 7.322379894820e+03 
>>>> 4 SNES Function norm 6.599581599149e+03 
>>>> 5 SNES Function norm 6.374872854688e+03 
>>>> 6 SNES Function norm 6.372518007653e+03 
>>>> 7 SNES Function norm 6.073996314301e+03 
>>>> 8 SNES Function norm 5.635965277054e+03 
>>>> 9 SNES Function norm 5.155389064046e+03 
>>>> 10 SNES Function norm 5.080567902638e+03 
>>>> 11 SNES Function norm 5.058878643969e+03 
>>>> 12 SNES Function norm 5.058835649793e+03 
>>>> 13 SNES Function norm 5.058491285707e+03 
>>>> 14 SNES Function norm 5.057452865337e+03 
>>>> 15 SNES Function norm 5.057226140688e+03 
>>>> 16 SNES Function norm 5.056651272898e+03 
>>>> 17 SNES Function norm 5.056575190057e+03 
>>>> 18 SNES Function norm 5.056574632598e+03 
>>>> 19 SNES Function norm 5.056574520229e+03 
>>>> 20 SNES Function norm 5.056574492569e+03 
>>>> 21 SNES Function norm 5.056574485124e+03 
>>>> 22 SNES Function norm 5.056574483029e+03 
>>>> 23 SNES Function norm 5.056574482427e+03 
>>>> 24 SNES Function norm 5.056574482302e+03 
>>>> 25 SNES Function norm 5.056574482287e+03 
>>>> 26 SNES Function norm 5.056574482282e+03 
>>>> 27 SNES Function norm 5.056574482281e+03 
>>>> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 27
>>>> SNES Object: 1 MPI processes
>>>> type: newtonls
>>>> maximum iterations=50, maximum function evaluations=10000
>>>> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>>> total number of linear solver iterations=28
>>>> total number of function evaluations=323
>>>> total number of grid sequence refinements=2
>>>> SNESLineSearch Object:   1 MPI processes
>>>>   type: bt
>>>>     interpolation: cubic
>>>>     alpha=1.000000e-04
>>>>   maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>   tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
>>>> lambda=1.000000e-08
>>>>   maximum iterations=40
>>>> KSP Object:   1 MPI processes
>>>>   type: gmres
>>>>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
>>>> Orthogonalization with no iterative refinement
>>>>     GMRES: happy breakdown tolerance 1e-30
>>>>   maximum iterations=10000, initial guess is zero
>>>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>   left preconditioning
>>>>   using PRECONDITIONED norm type for convergence test
>>>> PC Object:   1 MPI processes
>>>>   type: lu
>>>>     LU: out-of-place factorization
>>>>     tolerance for zero pivot 2.22045e-14
>>>>     matrix ordering: nd
>>>>     factor fill ratio given 0, needed 0
>>>>       Factored matrix follows:
>>>>         Mat Object:           1 MPI processes
>>>>           type: seqaij
>>>>           rows=15991, cols=15991
>>>>           package used to perform factorization: mumps
>>>>           total: nonzeros=255801, allocated nonzeros=255801
>>>>           total number of mallocs used during MatSetValues calls =0
>>>>             MUMPS run parameters:
>>>>               SYM (matrix type):                   0 
>>>>               PAR (host participation):            1 
>>>>               ICNTL(1) (output for error):         6 
>>>>               ICNTL(2) (output of diagnostic msg): 0 
>>>>               ICNTL(3) (output for global info):   0 
>>>>               ICNTL(4) (level of printing):        0 
>>>>               ICNTL(5) (input mat struct):         0 
>>>>               ICNTL(6) (matrix prescaling):        7 
>>>>               ICNTL(7) (sequentia matrix ordering):6 
>>>>               ICNTL(8) (scalling strategy):        77 
>>>>               ICNTL(10) (max num of refinements):  0 
>>>>               ICNTL(11) (error analysis):          0 
>>>>               ICNTL(12) (efficiency control):                         1 
>>>>               ICNTL(13) (efficiency control):                         0 
>>>>               ICNTL(14) (percentage of estimated workspace increase): 20 
>>>>               ICNTL(18) (input mat struct):                           0 
>>>>               ICNTL(19) (Shur complement info):                       0 
>>>>               ICNTL(20) (rhs sparse pattern):                         0 
>>>>               ICNTL(21) (somumpstion struct):                            0 
>>>>               ICNTL(22) (in-core/out-of-core facility):               0 
>>>>               ICNTL(23) (max size of memory can be allocated locally):0 
>>>>               ICNTL(24) (detection of null pivot rows):               0 
>>>>               ICNTL(25) (computation of a null space basis):          0 
>>>>               ICNTL(26) (Schur options for rhs or solution):          0 
>>>>               ICNTL(27) (experimental parameter):                     -8 
>>>>               ICNTL(28) (use parallel or sequential ordering):        1 
>>>>               ICNTL(29) (parallel ordering):                          0 
>>>>               ICNTL(30) (user-specified set of entries in inv(A)):    0 
>>>>               ICNTL(31) (factors is discarded in the solve phase):    0 
>>>>               ICNTL(33) (compute determinant):                        0 
>>>>               CNTL(1) (relative pivoting threshold):      0.01 
>>>>               CNTL(2) (stopping criterion of refinement): 1.49012e-08 
>>>>               CNTL(3) (absomumpste pivoting threshold):      0 
>>>>               CNTL(4) (vamumpse of static pivoting):         -1 
>>>>               CNTL(5) (fixation for null pivots):         0 
>>>>               RINFO(1) (local estimated flops for the elimination after 
>>>> analysis): 
>>>>                 [0] 1.95838e+06 
>>>>               RINFO(2) (local estimated flops for the assembly after 
>>>> factorization): 
>>>>                 [0]  143924 
>>>>               RINFO(3) (local estimated flops for the elimination after 
>>>> factorization): 
>>>>                 [0]  1.95943e+06 
>>>>               INFO(15) (estimated size of (in MB) MUMPS internal data for 
>>>> running numerical factorization): 
>>>>               [0] 7 
>>>>               INFO(16) (size of (in MB) MUMPS internal data used during 
>>>> numerical factorization): 
>>>>                 [0] 7 
>>>>               INFO(23) (num of pivots eliminated on this processor after 
>>>> factorization): 
>>>>                 [0] 15991 
>>>>               RINFOG(1) (global estimated flops for the elimination after 
>>>> analysis): 1.95838e+06 
>>>>               RINFOG(2) (global estimated flops for the assembly after 
>>>> factorization): 143924 
>>>>               RINFOG(3) (global estimated flops for the elimination after 
>>>> factorization): 1.95943e+06 
>>>>               (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): 
>>>> (0,0)*(2^0)
>>>>               INFOG(3) (estimated real workspace for factors on all 
>>>> processors after analysis): 255801 
>>>>               INFOG(4) (estimated integer workspace for factors on all 
>>>> processors after analysis): 127874 
>>>>               INFOG(5) (estimated maximum front size in the complete 
>>>> tree): 11 
>>>>               INFOG(6) (number of nodes in the complete tree): 3996 
>>>>               INFOG(7) (ordering option effectively use after analysis): 6 
>>>>               INFOG(8) (structural symmetry in percent of the permuted 
>>>> matrix after analysis): 86 
>>>>               INFOG(9) (total real/complex workspace to store the matrix 
>>>> factors after factorization): 255865 
>>>>               INFOG(10) (total integer space store the matrix factors 
>>>> after factorization): 127890 
>>>>               INFOG(11) (order of largest frontal matrix after 
>>>> factorization): 11 
>>>>               INFOG(12) (number of off-diagonal pivots): 19 
>>>>               INFOG(13) (number of delayed pivots after factorization): 8 
>>>>               INFOG(14) (number of memory compress after factorization): 0 
>>>>               INFOG(15) (number of steps of iterative refinement after 
>>>> solution): 0 
>>>>               INFOG(16) (estimated size (in MB) of all MUMPS internal data 
>>>> for factorization after analysis: value on the most memory consuming 
>>>> processor): 7 
>>>>               INFOG(17) (estimated size of all MUMPS internal data for 
>>>> factorization after analysis: sum over all processors): 7 
>>>>               INFOG(18) (size of all MUMPS internal data allocated during 
>>>> factorization: value on the most memory consuming processor): 7 
>>>>               INFOG(19) (size of all MUMPS internal data allocated during 
>>>> factorization: sum over all processors): 7 
>>>>               INFOG(20) (estimated number of entries in the factors): 
>>>> 255801 
>>>>               INFOG(21) (size in MB of memory effectively used during 
>>>> factorization - value on the most memory consuming processor): 7 
>>>>               INFOG(22) (size in MB of memory effectively used during 
>>>> factorization - sum over all processors): 7 
>>>>               INFOG(23) (after analysis: value of ICNTL(6) effectively 
>>>> used): 0 
>>>>               INFOG(24) (after analysis: value of ICNTL(12) effectively 
>>>> used): 1 
>>>>               INFOG(25) (after factorization: number of pivots modified by 
>>>> static pivoting): 0 
>>>>               INFOG(28) (after factorization: number of null pivots 
>>>> encountered): 0
>>>>               INFOG(29) (after factorization: effective number of entries 
>>>> in the factors (sum over all processors)): 255865
>>>>               INFOG(30, 31) (after solution: size in Mbytes of memory used 
>>>> during solution phase): 5, 5
>>>>               INFOG(32) (after analysis: type of analysis done): 1
>>>>               INFOG(33) (value used for ICNTL(8)): 7
>>>>               INFOG(34) (exponent of the determinant if determinant is 
>>>> requested): 0
>>>>   linear system matrix = precond matrix:
>>>>   Mat Object:     1 MPI processes
>>>>     type: seqaij
>>>>     rows=15991, cols=15991
>>>>     total: nonzeros=223820, allocated nonzeros=431698
>>>>     total number of mallocs used during MatSetValues calls =15991
>>>>       using I-node routines: found 4000 nodes, limit used is 5
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -gideon
>>>> 
>>>>> On Sep 7, 2015, at 8:40 PM, Matthew Knepley <[email protected]> wrote:
>>>>> 
>>>>> On Mon, Sep 7, 2015 at 7:32 PM, Gideon Simpson <[email protected]> 
>>>>> wrote:
>>>>> Barry,
>>>>> 
>>>>> I finally got a chance to really try using the grid sequencing within my 
>>>>> code.  I find that, in some cases, even if it can solve successfully on 
>>>>> the coarsest mesh, the SNES fails, usually due to a line search failure, 
>>>>> when it tries to compute along the grid sequence.  Would you have any 
>>>>> suggestions?
>>>>> 
>>>>> I apologize if I have asked before, but can you give me -snes_view for 
>>>>> the solver? I could not find it in the email thread.
>>>>> 
>>>>> I would suggest trying to fiddle with the line search, or precondition it 
>>>>> with Richardson. It would be nice to see -snes_monitor
>>>>> for the runs that fail, and then we can break down the residual into 
>>>>> fields and look at it again (if my custom residual monitor
>>>>> does not work we can write one easily). Seeing which part of the residual 
>>>>> does not converge is key to designing the NASM
>>>>> for the problem. I have just seen the virtuoso of this, Xiao-Chuan Cai, 
>>>>> present it. We need better monitoring in PETSc.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>>   Matt
>>>>> 
>>>>> -gideon
>>>>> 
>>>>>> On Aug 28, 2015, at 4:21 PM, Barry Smith <[email protected]> wrote:
>>>>>> 
>>>>>> 
>>>>>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Yes, if I continue in this parameter on the coarse mesh, I can 
>>>>>>> generally solve at all values. I do find that I need to do some amount 
>>>>>>> of continuation to solve near the endpoint.  The problem is that on the 
>>>>>>> coarse mesh, things are not fully resolved at all the values along the 
>>>>>>> continuation parameter, and I would like to do refinement.  
>>>>>>> 
>>>>>>> One subtlety is that I actually want the intermediate continuation 
>>>>>>> solutions  too.  Currently, without doing any grid sequence, I compute 
>>>>>>> each, write it to disk, and then go on to the next one.  So I now need 
>>>>>>> to go back and refine them.  I was thinking that perhaps I could refine 
>>>>>>> them on the fly, dump them to disk, and use the coarse solution as the 
>>>>>>> starting guess at the next iteration, but that would seem to require 
>>>>>>> resetting the snes back to the coarse grid.
>>>>>>> 
>>>>>>> The alternative would be to just script the mesh refinement in a post 
>>>>>>> processing stage, where each value of the continuation parameter is 
>>>>>>> loaded on the coarse mesh, and refined.  Perhaps that’s the most 
>>>>>>> practical thing to do.
>>>>>> 
>>>>>> I would do the following. Create your DM and create a SNES that will do 
>>>>>> the continuation
>>>>>> 
>>>>>> loop over continuation parameter
>>>>>> 
>>>>>>      SNESSolve(snes,NULL,Ucoarse);
>>>>>> 
>>>>>>      if (you decide you want to see the refined solution at this 
>>>>>> continuation point) {
>>>>>>           SNESCreate(comm,&snesrefine);
>>>>>>           SNESSetDM()
>>>>>>           etc
>>>>>>           SNESSetGridSequence(snesrefine,)
>>>>>>           SNESSolve(snesrefine,0,Ucoarse);
>>>>>>           SNESGetSolution(snesrefine,&Ufine);
>>>>>>           VecView(Ufine,...), or do whatever you want to do with the 
>>>>>> Ufine at that continuation point
>>>>>>           SNESDestroy(&snesrefine);
>>>>>>     }
>>>>>> 
>>>>>> end loop over continuation parameter.
>>>>>> 
>>>>>> Barry
>>>>>> 
>>>>>>> 
>>>>>>> -gideon
>>>>>>> 
>>>>>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 3.  This problem is actually part of a continuation problem that 
>>>>>>>>> roughly looks like this 
>>>>>>>>> 
>>>>>>>>> for( continuation parameter p = 0 to 1){
>>>>>>>>> 
>>>>>>>>>       solve with parameter p_i using solution from p_{i-1},
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> What I would like to do is to start the solver, for each value of 
>>>>>>>>> parameter p_i on the coarse mesh, and then do grid sequencing on 
>>>>>>>>> that.  But it appears that after doing grid sequencing on the initial 
>>>>>>>>> p_0 = 0, the SNES is set to use the finer mesh.
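The continuation loop described above can be illustrated on a scalar problem without any PETSc objects. This is only a sketch under stated assumptions: the residual u^3 + u - 10p, the function names, and the tolerances are hypothetical and are not taken from the application in this thread. The point is the structure, where each solve starts from the solution at the previous parameter value.

```c
/* Hypothetical scalar residual f(u; p) = u^3 + u - 10 p. */
static double resid_scalar(double u, double p)
{
    return u * u * u + u - 10.0 * p;
}

/* Derivative of the residual with respect to u. */
static double dresid_scalar(double u)
{
    return 3.0 * u * u + 1.0;
}

/* Newton's method at fixed p, starting from *u.
   Returns 0 on convergence, 1 if maxit updates were not enough. */
static int newton(double *u, double p, double tol, int maxit)
{
    for (int it = 0; it < maxit; it++) {
        double r = resid_scalar(*u, p);
        if ((r < 0.0 ? -r : r) < tol) return 0;
        *u -= r / dresid_scalar(*u);
    }
    return 1;
}

/* March p from 0 to 1 in nsteps equal steps, reusing each solution as the
   initial guess for the next step. sol must hold nsteps + 1 values;
   sol[i] receives the solution at p_i = i / nsteps.
   Returns 0 if every solve converged. */
int continuation(double *sol, int nsteps)
{
    double u = 0.0;                       /* exact solution at p = 0 */
    for (int i = 0; i <= nsteps; i++) {
        double p = (double)i / nsteps;
        if (newton(&u, p, 1e-12, 50)) return 1;
        sol[i] = u;
    }
    return 0;
}
```

Calling continuation(sol, 10) fills sol with eleven intermediate solutions, which corresponds to the wish expressed later in the thread to keep every continuation point rather than only the last one.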
>>>>>>>> 
>>>>>>>> So you are using continuation to give you a good enough initial guess 
>>>>>>>> on the coarse level to even get convergence on the coarse level? First 
>>>>>>>> I would check if you even need the continuation (or can you not even 
>>>>>>>> solve the coarse problem without it).
>>>>>>>> 
>>>>>>>> If you do need the continuation then you will need to tweak how you do 
>>>>>>>> the grid sequencing. I think this will work: 
>>>>>>>> 
>>>>>>>> Do not use -snes_grid_sequence  
>>>>>>>> 
>>>>>>>> Run SNESSolve() as many times as you want with your continuation 
>>>>>>>> parameter. This will all happen on the coarse mesh.
>>>>>>>> 
>>>>>>>> Call SNESSetGridSequence()
>>>>>>>> 
>>>>>>>> Then call SNESSolve() again and it will do one solve on the coarse 
>>>>>>>> level and then interpolate to the next level etc.
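The "interpolate to the next level" step can be pictured with a standalone 1D stand-in. In PETSc the DM supplies the refinement and the interpolation, so the function below is a hypothetical illustration rather than what the library does internally: it assumes a uniform 1D mesh refined by a factor of two and uses linear interpolation to turn the coarse solution into an initial guess on the fine mesh.

```c
/* Interpolate nc coarse nodal values onto the 2*nc - 1 nodes of a uniformly
   refined 1D mesh. Coarse nodes are copied; each new midpoint gets the
   average of its two coarse neighbours. fine must hold 2*nc - 1 values. */
void interpolate_1d(const double *coarse, int nc, double *fine)
{
    for (int i = 0; i < nc; i++) {
        fine[2 * i] = coarse[i];
        if (i + 1 < nc)
            fine[2 * i + 1] = 0.5 * (coarse[i] + coarse[i + 1]);
    }
}
```

Applying the function repeatedly gives one initial guess per level of the grid sequence, with a nonlinear solve on each level in between.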
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> What most experimenters take for granted before they begin their 
>>>>> experiments is infinitely more interesting than any results to which 
>>>>> their experiments lead.
>>>>> -- Norbert Wiener
>>>> 
>>> 
>> 
>
