Re: [petsc-users] MatMult() returning different values depending on # of processors?
MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on the provided comm, so the logic you have built to call it on one proc at a time is unnecessary at best and most likely incorrect and likely to produce strange results. You can forgo the if statement and loop over processors, create the ISes on the same comm as x, and then call MatGetSubMatrix() once. - Peter On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej akur...@gmail.com wrote: I cannot send the original code, but I reproduced the problem in another code. I have attached a makefile the code, and the data for the x vector and A matrix. I think the problem may be with my ShortenMatrix function, but it's not clear to me what exactly is going wrong and how to fix it. So I would appreciate some assistance there. Thanks, Arthur On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith bsm...@mcs.anl.gov wrote: Can you send the code that reproduces this behavior? Barry On Jun 25, 2014, at 4:37 PM, Arthur Kurlej akur...@gmail.com wrote: Hi Barry, So for the matrix C that I am currently testing (size 162x162), the condition number is roughly 10^4. For reference, I'm porting MATLAB code into PETSc, and for one processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. So I know that for one processor, my program is performing as expected. I've included examples below of values for b (also of size 162), ranging from indices 131 to 141. #processors=1: 0 1.315217173959314e-20 1.315217173959314e-20 4.843201487740107e-17 4.843201487740107e-17 8.16610470065e-14 8.16610470065e-14 6.303834267553249e-11 6.303834267553249e-11 2.227932688485483e-08 2.227932688485483e-08 # processors=2: 5.480410831461926e-22 2.892553944350444e-22 2.892553944350444e-22 7.524038923310717e-24 7.524038923214420e-24 -3.340766769043093e-26 -7.558372155761972e-27 5.551561288838557e-25 5.550551546879874e-25 -1.579397982093437e-22 2.655766754178065e-22 # processors = 4: 5.480410831461926e-22 2.892553944351728e-22 2.892553944351728e-22 7.524092205125593e-24 7.524092205125593e-24 -2.584939414228212e-26 -2.584939414228212e-26 0 0 -1.245940797657998e-23 -1.245940797657998e-23 # processors = 8: 5.480410831461926e-22 2.892553944023035e-22 2.892553944023035e-22 7.524065744581494e-24 7.524065744581494e-24 -2.250265175188197e-26 -2.250265175188197e-26 -6.543127892265160e-26 1.544288143499193e-317 8.788794008375919e-25 8.788794008375919e-25 Thanks, Arthur On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith bsm...@mcs.anl.gov wrote: How different are the values in b? Can you send back a few examples of the different b’s? Any idea of the condition number of C? Barry On Jun 25, 2014, at 3:10 PM, Arthur Kurlej akur...@gmail.com wrote: Hi all, While running my code, I have found that MatMult() returns different values depending on the number of processors I use (and there is quite the variance in the values). The setup of my code is as follows (I can go into more depth/background if needed): -Generate parallel AIJ matrix of size NxN, denoted as A -Retrieve parallel AIJ submatrix from the last N-1 rowscolumns from A, denoted as C -Generate vector of length N-1, denoted as x -Find C*x=b I have already checked that A, C, and x are all equivalent when ran for any number of processors, it is only the values of vector b that varies. Does anyone have an idea about what's going on? Thanks, Arthur
Re: [petsc-users] MatMult() returning different values depending on # of processors?
Yep! Looks good. One other thing is that you can pre/postface all PETSc calls with ierr = ... CHKERRQ(ierr); and it will properly trace other problems if they arise. - Peter On Thu, Jun 26, 2014 at 8:56 PM, Arthur Kurlej akur...@gmail.com wrote: Hello, I'm sorry, but this is a bit new for me, and I'm still not quite sure I follow. Are you recommending that opposed to doing this: if(procs!=1){ for(i=0;iprocs;i++){ if(rank==i){ VecGetLocalSize(*x,length); VecGetOwnershipRange(*x,div,NULL); ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,iscol); ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,isrow); ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); CHKERRQ(ierr); } } } else{ ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,iscol); ISCreateStride(PETSC_COMM_SELF,final_size,begin,1,isrow); ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); CHKERRQ(ierr); } The proper implementation would instead just be the following: VecGetLocalSize(*x,length); VecGetOwnershipRange(*x,div,NULL); ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,iscol); ISCreateStride(PETSC_COMM_WORLD,length,div+begin,1,isrow); ierr = MatGetSubMatrix(*A,isrow,iscol,MAT_INITIAL_MATRIX,AA); CHKERRQ(ierr); ? On Thu, Jun 26, 2014 at 5:32 PM, Peter Brune prbr...@gmail.com wrote: MatGetSubMatrix() is collective on Mat and ISCreateXXX is collective on the provided comm, so the logic you have built to call it on one proc at a time is unnecessary at best and most likely incorrect and likely to produce strange results. You can forgo the if statement and loop over processors, create the ISes on the same comm as x, and then call MatGetSubMatrix() once. - Peter On Thu, Jun 26, 2014 at 4:26 PM, Arthur Kurlej akur...@gmail.com wrote: I cannot send the original code, but I reproduced the problem in another code. I have attached a makefile the code, and the data for the x vector and A matrix. I think the problem may be with my ShortenMatrix function, but it's not clear to me what exactly is going wrong and how to fix it. So I would appreciate some assistance there. Thanks, Arthur On Wed, Jun 25, 2014 at 6:24 PM, Barry Smith bsm...@mcs.anl.gov wrote: Can you send the code that reproduces this behavior? Barry On Jun 25, 2014, at 4:37 PM, Arthur Kurlej akur...@gmail.com wrote: Hi Barry, So for the matrix C that I am currently testing (size 162x162), the condition number is roughly 10^4. For reference, I'm porting MATLAB code into PETSc, and for one processor, the PETSc b vector is roughly equivalent to the MATLAB b vector. So I know that for one processor, my program is performing as expected. I've included examples below of values for b (also of size 162), ranging from indices 131 to 141. #processors=1: 0 1.315217173959314e-20 1.315217173959314e-20 4.843201487740107e-17 4.843201487740107e-17 8.16610470065e-14 8.16610470065e-14 6.303834267553249e-11 6.303834267553249e-11 2.227932688485483e-08 2.227932688485483e-08 # processors=2: 5.480410831461926e-22 2.892553944350444e-22 2.892553944350444e-22 7.524038923310717e-24 7.524038923214420e-24 -3.340766769043093e-26 -7.558372155761972e-27 5.551561288838557e-25 5.550551546879874e-25 -1.579397982093437e-22 2.655766754178065e-22 # processors = 4: 5.480410831461926e-22 2.892553944351728e-22 2.892553944351728e-22 7.524092205125593e-24 7.524092205125593e-24 -2.584939414228212e-26 -2.584939414228212e-26 0 0 -1.245940797657998e-23 -1.245940797657998e-23 # processors = 8: 5.480410831461926e-22 2.892553944023035e-22 2.892553944023035e-22 7.524065744581494e-24 7.524065744581494e-24 -2.250265175188197e-26 -2.250265175188197e-26 -6.543127892265160e-26 1.544288143499193e-317 8.788794008375919e-25 8.788794008375919e-25 Thanks, Arthur On Wed, Jun 25, 2014 at 4:06 PM, Barry Smith bsm...@mcs.anl.gov wrote: How different are the values in b? Can you send back a few examples of the different b’s? Any idea of the condition number of C? Barry On Jun 25, 2014, at 3:10 PM, Arthur Kurlej akur...@gmail.com wrote: Hi all, While running my code, I have found that MatMult() returns different values depending on the number of processors I use (and there is quite the variance in the values). The setup of my code is as follows (I can go into more depth/background if needed): -Generate parallel AIJ matrix of size NxN, denoted as A -Retrieve parallel AIJ submatrix from the last N-1 rowscolumns from A, denoted as C -Generate vector of length N-1, denoted as x -Find C*x=b
Re: [petsc-users] Scattering PetscScalar
If you have created a vector with the desired PETSc layout, you may use VecGetOwnershipRanges() http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecGetOwnershipRanges.html to get the ownership ranges for each processor indexable by rank, but with type PetscInt instead of PetscScalar. It's generally discouraged to use types such as PetscScalar to denote index or length quantities. - Peter On Mon, Jun 9, 2014 at 12:59 PM, Likun Tan tlk0...@hotmail.com wrote: Dear Petsc developers, I defined PetscScalar *len to store the number of nodes in each processor, MPI_Comm_rank(PETSC_COMM_WPRLD, rank); len[rank]=NumNode; but I want the elements in len to be seen in all the processors. Is there any function in Petsc being able to do this job? Many thanks, Likun
Re: [petsc-users] PetscMalloc with Fortran
You should be using an array of type ISColoringValue. ISColoringValue is by default a short, not an int, so you're getting nonsense entries. We should either maintain or remove ex5s if it does something like this. - Peter On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer mairho...@itt.uni-stuttgart.de wrote: If 'colors' can be a dynamically allocated array then I dont know where the mistake is in this code: ISColoring iscoloring Integer, allocatable :: colors(:) PetscInt maxc ... !calculate max. number of colors maxc = 2*irc+1 !irc is the number of ghost nodes needed to calculate the function I want to solve allocate(colors(user%xm)) !where user%xm is the number of locally owned nodes of a global array !Set colors DO i=1,user%xm colors(i) = mod(i,maxc) END DO call ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) ... deallocate(colors) call ISColoringDestroy(iscoloring,ierr) On execution I get the following error message (running the DO Loop from 0 to user%xm-1 does not change anything): [0]PETSC ERROR: - Error Message [0]PETSC ERROR: Arguments are incompatible! [0]PETSC ERROR: Number of colors passed in 291 is less then the actual number of colors in array 61665! [0]PETSC ERROR: [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 [0]PETSC ERROR: Libraries linked from /usr/ITT/mhofer/Documents/Diss/NumericalMethods/ Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack --download-mpich [0]PETSC ERROR: [0]PETSC ERROR: ISColoringCreate() line 276 in /usr/ITT/mhofer/Documents/Diss/NumericalMethods/ Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c But when I print out colors, it only has entries from 0 to 218, so no entry is larger then 291 as stated in the error message. Am 15.05.2014 16:45, schrieb Jed Brown: Jonas Mairhofer mairho...@itt.uni-stuttgart.de writes: Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. Therefore I need an array 'colors' which in C can be creates as (from example ex5s.c) int *colors PetscMalloc(...,colors) There is no PetscMalloc in Fortran, due to language deficiencies. colors(i) = ISColoringCreate(...) How do I have to define the array colors in Fortran? I tried: Integer, allocatable :: colors(:)andallocate() instead of PetscMalloc and Integer, pointer :: colors but neither worked. The ISColoringCreate Fortran binding copies from the array you pass into one allocated using PetscMalloc. You should pass a normal Fortran array (statically or dynamically allocated).
Re: [petsc-users] Test Jacobian
I've updated the manpage for SNESTEST in next and master to reflect the exact behavior of SNESTEST. Hopefully users will actually check the docs and this will clear up some of the confusion. - Peter On Wed, Apr 30, 2014 at 9:23 AM, Jed Brown j...@jedbrown.org wrote: Xiangdong epsco...@gmail.com writes: I echo the error message Que reported: [0]PETSC ERROR: - Error Message [0]PETSC ERROR: Object is in wrong state! [0]PETSC ERROR: SNESTest aborts after Jacobian test! [0]PETSC ERROR: The snes_fd, snes_fd_color and my own Jacobian work fine separately, but -snes_type test complains the same error message Que mentioned before. I'm afraid we'll reply with the same response. As indicated in the message, this is how SNESTest works. What do you want to have happen?
Re: [petsc-users] SNES: approximating the Jacobian with computed residuals?
On Tue, Apr 22, 2014 at 8:48 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: Hello PETSc-users, I'm using the SNES component with the NGMRES method in my application. I'm using a matrix-free context for the Jacobian and the MatMFFDComputeJacobian() function in my FormJacobian routine. My understanding is that this effectively approximates the Jacobian using the equation at the bottom of Page 103 in the PETSc User's Manual. This works, but the expense of computing two function evaluations in each SNES iteration nearly wipes out the performance improvements over Picard iteration. Try -snes_type anderson. It's less stable than NGMRES, but requires one function evaluation per iteration. The manual is out of date. I guess it's time to fix that. It's interesting that the cost of matrix assembly and a linear solve is around the same as that of a function evaluation. Output from -log_summary would help in the diagnosis. Based on my (limited) understanding of the Oosterlee/Washio SIAM paper (Krylov Subspace Acceleration of Nonlinear Multigrid...), they seem to suggest that it's possible to approximate the Jacobian with a series of previously-computed residuals (eq 2.14), rather than additional function evaluations in each iteration. Is this correct? If so, could someone point me to a reference that demonstrates how to do this with PETSc? What indication do you have that the Jacobian is calculated at all in the NGMRES method? The two function evaluations are related to computing the quantities labeled F(u_M) and F(u_A) in O/W. We already use the Jacobian approximation for the minimization problem (2.14). - Peter Or, perhaps a better question to ask is: are there other ways of reducing the computing burden associated with estimating the Jacobian? Thanks, Greg
Re: [petsc-users] 2-stage preconditioner
PCComposite is probably the answer http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCCOMPOSITE.html The multiplicative variant will call one after the other. In order to nest it you may have to register your own custom type rather than using shell; this is doable with http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCRegister.html - Peter On Tue, Apr 22, 2014 at 9:43 AM, Qin Lu lu_qin_2...@yahoo.com wrote: Hello, I need to implement a 2-stage preconditioner using PETSc linear solver: 1. The first stage uses a user-provided preconditioner routine. It seems I can set it with PCShellSetApply. 2. The second stage uses PETSc's ILU. Shall I just call this two preconditioners in sequence, or there is a particular way to hook them up? Is there any sample code for this? Many thanks for your suggestions. Best Regards, Qin
Re: [petsc-users] SNES: approximating the Jacobian with computed residuals?
On Tue, Apr 22, 2014 at 10:56 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: *From:* Peter Brune [mailto:prbr...@gmail.com] *Sent:* Tuesday, April 22, 2014 10:16 AM *To:* Fischer, Greg A. *Cc:* petsc-users@mcs.anl.gov *Subject:* Re: [petsc-users] SNES: approximating the Jacobian with computed residuals? On Tue, Apr 22, 2014 at 8:48 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: Hello PETSc-users, I'm using the SNES component with the NGMRES method in my application. I'm using a matrix-free context for the Jacobian and the MatMFFDComputeJacobian() function in my FormJacobian routine. My understanding is that this effectively approximates the Jacobian using the equation at the bottom of Page 103 in the PETSc User's Manual. This works, but the expense of computing two function evaluations in each SNES iteration nearly wipes out the performance improvements over Picard iteration. Try -snes_type anderson. It's less stable than NGMRES, but requires one function evaluation per iteration. The manual is out of date. I guess it's time to fix that. It's interesting that the cost of matrix assembly and a linear solve is around the same as that of a function evaluation. Output from -log_summary would help in the diagnosis. I tried the –snes_type anderson option, and it seems to be requiring even more function evaluations than the Picard iterations. I’ve attached –log_summary output. This seems strange, because I can use the NLKAIN code ( http://nlkain.sourceforge.net/) to fairly good effect, and I’ve read that it’s related to Anderson mixing. Would it be useful to adjust the parameters? If I recall correctly, NLKAIN is yet another improvement on Anderson Mixing. Our NGMRES is what's in O/W and is built largely around being nonlinearly preconditionable with something strong like FAS. What is the perceived difference in convergence? (what does -snes_monitor say?) Any multitude of tolerances may be different between the two methods, and it's hard to judge without knowing much, much more. Seeing what happens when one changes the parameters is of course important if you're looking at performance. By Picard, you mean simple fixed-point iteration, right? What constitutes a Picard iteration is a longstanding argument on this list and therefore requires clarification, unfortunately. :) This (without linesearch) can be duplicated in PETSc with -snes_type nrichardson -snes_linesearch_type basic. For a typical problem one must damp this with -snes_linesearch_damping damping parameter That's what the linesearch is there to avoid, but this takes more function evaluations. I’ve also attached –log_summary output for NGMRES. Does anything jump out as being amiss? ## ## # WARNING!!!# ## # This code was compiled with a debugging option, # # To get timing results run ./configure# # using --with-debugging=no, the performance will # # be generally two or three times faster. # ## ## Timing comparisons aren't reasonable with debugging on. Based on my (limited) understanding of the Oosterlee/Washio SIAM paper (Krylov Subspace Acceleration of Nonlinear Multigrid...), they seem to suggest that it's possible to approximate the Jacobian with a series of previously-computed residuals (eq 2.14), rather than additional function evaluations in each iteration. Is this correct? If so, could someone point me to a reference that demonstrates how to do this with PETSc? What indication do you have that the Jacobian is calculated at all in the NGMRES method? The two function evaluations are related to computing the quantities labeled F(u_M) and F(u_A) in O/W. We already use the Jacobian approximation for the minimization problem (2.14). - Peter Thanks for the clarification. -Greg Or, perhaps a better question to ask is: are there other ways of reducing the computing burden associated with estimating the Jacobian? Thanks, Greg
Re: [petsc-users] SNES: approximating the Jacobian with computed residuals?
-snes_view would also be useful for basic santity checks. On Tue, Apr 22, 2014 at 11:43 AM, Peter Brune prbr...@gmail.com wrote: On Tue, Apr 22, 2014 at 10:56 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: *From:* Peter Brune [mailto:prbr...@gmail.com] *Sent:* Tuesday, April 22, 2014 10:16 AM *To:* Fischer, Greg A. *Cc:* petsc-users@mcs.anl.gov *Subject:* Re: [petsc-users] SNES: approximating the Jacobian with computed residuals? On Tue, Apr 22, 2014 at 8:48 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: Hello PETSc-users, I'm using the SNES component with the NGMRES method in my application. I'm using a matrix-free context for the Jacobian and the MatMFFDComputeJacobian() function in my FormJacobian routine. My understanding is that this effectively approximates the Jacobian using the equation at the bottom of Page 103 in the PETSc User's Manual. This works, but the expense of computing two function evaluations in each SNES iteration nearly wipes out the performance improvements over Picard iteration. Try -snes_type anderson. It's less stable than NGMRES, but requires one function evaluation per iteration. The manual is out of date. I guess it's time to fix that. It's interesting that the cost of matrix assembly and a linear solve is around the same as that of a function evaluation. Output from -log_summary would help in the diagnosis. I tried the –snes_type anderson option, and it seems to be requiring even more function evaluations than the Picard iterations. I’ve attached –log_summary output. This seems strange, because I can use the NLKAIN code ( http://nlkain.sourceforge.net/) to fairly good effect, and I’ve read that it’s related to Anderson mixing. Would it be useful to adjust the parameters? If I recall correctly, NLKAIN is yet another improvement on Anderson Mixing. Our NGMRES is what's in O/W and is built largely around being nonlinearly preconditionable with something strong like FAS. What is the perceived difference in convergence? (what does -snes_monitor say?) Any multitude of tolerances may be different between the two methods, and it's hard to judge without knowing much, much more. Seeing what happens when one changes the parameters is of course important if you're looking at performance. By Picard, you mean simple fixed-point iteration, right? What constitutes a Picard iteration is a longstanding argument on this list and therefore requires clarification, unfortunately. :) This (without linesearch) can be duplicated in PETSc with -snes_type nrichardson -snes_linesearch_type basic. For a typical problem one must damp this with -snes_linesearch_damping damping parameter That's what the linesearch is there to avoid, but this takes more function evaluations. I’ve also attached –log_summary output for NGMRES. Does anything jump out as being amiss? ## ## # WARNING!!!# ## # This code was compiled with a debugging option, # # To get timing results run ./configure# # using --with-debugging=no, the performance will # # be generally two or three times faster. # ## ## Timing comparisons aren't reasonable with debugging on. Based on my (limited) understanding of the Oosterlee/Washio SIAM paper (Krylov Subspace Acceleration of Nonlinear Multigrid...), they seem to suggest that it's possible to approximate the Jacobian with a series of previously-computed residuals (eq 2.14), rather than additional function evaluations in each iteration. Is this correct? If so, could someone point me to a reference that demonstrates how to do this with PETSc? What indication do you have that the Jacobian is calculated at all in the NGMRES method? The two function evaluations are related to computing the quantities labeled F(u_M) and F(u_A) in O/W. We already use the Jacobian approximation for the minimization problem (2.14). - Peter Thanks for the clarification. -Greg Or, perhaps a better question to ask is: are there other ways of reducing the computing burden associated with estimating the Jacobian? Thanks, Greg
Re: [petsc-users] SNES: approximating the Jacobian with computed residuals?
I accidentally forgot to reply to the list; resending. On Tue, Apr 22, 2014 at 4:55 PM, Peter Brune prbr...@gmail.com wrote: On Tue, Apr 22, 2014 at 4:40 PM, Fischer, Greg A. fisch...@westinghouse.com wrote: *From:* Peter Brune [mailto:prbr...@gmail.com] *Sent:* Tuesday, April 22, 2014 12:44 PM *To:* Fischer, Greg A. *Cc:* petsc-users@mcs.anl.gov *Subject:* Re: [petsc-users] SNES: approximating the Jacobian with computed residuals? On Tue, Apr 22, 2014 at 10:56 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: *From:* Peter Brune [mailto:prbr...@gmail.com] *Sent:* Tuesday, April 22, 2014 10:16 AM *To:* Fischer, Greg A. *Cc:* petsc-users@mcs.anl.gov *Subject:* Re: [petsc-users] SNES: approximating the Jacobian with computed residuals? On Tue, Apr 22, 2014 at 8:48 AM, Fischer, Greg A. fisch...@westinghouse.com wrote: Hello PETSc-users, I'm using the SNES component with the NGMRES method in my application. I'm using a matrix-free context for the Jacobian and the MatMFFDComputeJacobian() function in my FormJacobian routine. My understanding is that this effectively approximates the Jacobian using the equation at the bottom of Page 103 in the PETSc User's Manual. This works, but the expense of computing two function evaluations in each SNES iteration nearly wipes out the performance improvements over Picard iteration. Try -snes_type anderson. It's less stable than NGMRES, but requires one function evaluation per iteration. The manual is out of date. I guess it's time to fix that. It's interesting that the cost of matrix assembly and a linear solve is around the same as that of a function evaluation. Output from -log_summary would help in the diagnosis. I tried the –snes_type anderson option, and it seems to be requiring even more function evaluations than the Picard iterations. I’ve attached –log_summary output. This seems strange, because I can use the NLKAIN code ( http://nlkain.sourceforge.net/) to fairly good effect, and I’ve read that it’s related to Anderson mixing. Would it be useful to adjust the parameters? If I recall correctly, NLKAIN is yet another improvement on Anderson Mixing. Our NGMRES is what's in O/W and is built largely around being nonlinearly preconditionable with something strong like FAS. What is the perceived difference in convergence? (what does -snes_monitor say?) Any multitude of tolerances may be different between the two methods, and it's hard to judge without knowing much, much more. Seeing what happens when one changes the parameters is of course important if you're looking at performance. I’m not looking to apply a preconditioner, so it sounds like perhaps NGMRES isn’t a good choice for this application. It all depends on the characteristics of your problem. I tried a different problem (one that is larger and more realistic), and found the SNESAnderson performance to be substantially better. NLKAIN still converges faster, though. (NLKAIN: ~1350 function calls; SNESAnderson: ~1800 function calls; fixed-point: ~2550 function calls). The –snes_monitor seems to indicate steadily decreasing function norms. Playing around with -snes_anderson_beta will in all likelihood yield some performance improvements; perhaps significant. If I run snes/examples/tutorials/ex5 with -snes_anderson_beta 1 and snes_anderson_beta 0.1 the difference is 304 to 103 iterations. I may have to go look and see what NLKAIN does for damping; most Anderson Mixing implementations seem to do nothing complicated for this. Anything adaptive would take more function evaluations. The –snes_anderson_monitor option doesn’t seem to produce any output. I’ve tried passing “-snes_anderson_monitor” and “-snes_anderson_monitor true” as options, but don’t see any output analogous to “-snes_gmres_monitor”. What’s the correct way to pass that option? in the -snes_ngmres_monitor case, the monitor prints what happens with the choice between the candidate and the minimized solutions AND if the restart conditions are in play. In Anderson mixing, the first part doesn't apply, so -snes_anderson_monitor only prints anything if the method is restarting; the default is no restart, so nothing will show up. Changing to periodic restart or difference restart (which takes more reductions) will yield other output. By Picard, you mean simple fixed-point iteration, right? What constitutes a Picard iteration is a longstanding argument on this list and therefore requires clarification, unfortunately. :) This (without linesearch) can be duplicated in PETSc with -snes_type nrichardson -snes_linesearch_type basic. For a typical problem one must damp this with -snes_linesearch_damping damping parameter That's what the linesearch is there to avoid, but this takes more function evaluations. Yes, I mean fixed-point iteration. Cool. I’ve also attached –log_summary output for NGMRES
Re: [petsc-users] SNES failed with reason -3
-3 is a linear solver failure. You can see the whole list of convergence/divergence reasons under SNESConvergedReason in $PETSC_DIR/include/petscsnes.h Anyhow, what kind of output do you get when running with -ksp_monitor and -ksp_converged_reason? - Peter On Wed, Apr 16, 2014 at 9:57 AM, Que Cat quecat...@gmail.com wrote: Dear Petsc-Users, I call the nonliner solver snes in my code and it said that SNES Failed! Reason -3. I testes with -pc_type lu and it passed. I have checked the hand-coded Jacobian, ant it looks fine. Could you suggest any possible sources of the errors? THank you. Que
Re: [petsc-users] preconditioner for -snes_fd or -snes_fd_color
On Fri, Apr 4, 2014 at 3:57 PM, Xiangdong epsco...@gmail.com wrote: Hello everyone, If we use -snes_fd or -snes_fd_color, can we still pass a preconditioner to snes/ksp? In snes/ex1.c, when I use -snes_fd, the program never calls the FormJacobian1 to get the preconditioner. FormJacobian1 forms... wait for it... the Jacobian! Not the preconditioner. -snes_fd and -snes_fd_color replace the Jacobian-forming routine. Any suggestions about providing preconditioner to snes_fd or snes_fd_color? Both of these explicitly create the approximate Jacobian matrix by using finite differences, so they may be used with any preconditioner. -snes_mf is a different beast entirely and merely approximates the action of the matrix without forming it, which is where the preconditioner issues come in. - Peter Thanks. Xiangdong
Re: [petsc-users] preconditioner for -snes_fd or -snes_fd_color
On Fri, Apr 4, 2014 at 4:22 PM, Xiangdong epsco...@gmail.com wrote: On Fri, Apr 4, 2014 at 5:15 PM, Xiangdong epsco...@gmail.com wrote: On Fri, Apr 4, 2014 at 5:02 PM, Peter Brune br...@mcs.anl.gov wrote: On Fri, Apr 4, 2014 at 3:57 PM, Xiangdong epsco...@gmail.com wrote: Hello everyone, If we use -snes_fd or -snes_fd_color, can we still pass a preconditioner to snes/ksp? In snes/ex1.c, when I use -snes_fd, the program never calls the FormJacobian1 to get the preconditioner. FormJacobian1 forms... wait for it... the Jacobian! Not the preconditioner. -snes_fd and -snes_fd_color replace the Jacobian-forming routine. It looks like the FormJacobian1 forms both Jacobian J and preconditioner B. My observations are 1) Without any options, FormJacobian1 can provide both Jacobian J and preconditioner B. From http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESJacobianFunction.html#SNESJacobianFunction: Pmat - the matrix to be used in constructing the preconditioner, usually the same as Amat The preconditioner is formed later using Pmat, whether or not it's just the same as the Jacobian. 2) With -snes_fd, the program does not call the FormJacobian1 function at all, even if we have SNESSetJacobian in the codes. If it does not call the FormJacobian1 and SNESSetJacobian is ignored with -snes_fd, how can I provide the preconditioner to snes? Thank you. Xiangdong Any suggestions about providing preconditioner to snes_fd or snes_fd_color? Both of these explicitly create the approximate Jacobian matrix by using finite differences, so they may be used with any preconditioner. -snes_mf is a different beast entirely and merely approximates the action of the matrix without forming it, which is where the preconditioner issues come in. Given that -snes_fd is explicitly formed, so -pc_type gamg or ilu will work. What happens if I want to provide my own preconditioner? If you have a custom preconditioner in mind, you could implement it in a PCSHELL http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCSHELL.htmlor register a custom preconditioner http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCRegister.html. If your preconditioner is merely a matrix you may use PCMAT http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMAT.html Thank you. Xiangdong - Peter Thanks. Xiangdong
Re: [petsc-users] [Bitbucket] Pull request #161: GAMG: shift coarse grids to avoid zero diags from low rank coarse grid space (petsc/petsc)
On Tue, Apr 1, 2014 at 5:21 PM, Mark Adams mfad...@lbl.gov wrote: Jed: get a thesaurus! I am getting sick of 'elide' :) The thesaurus is the problem, man! The thesaurus is the problem! - Peter The fix would be to eliminate the rotation mode(s) (or any of the modes for that matter) for that vertex of P. It just happens on the first coarse grid where the number of equation per vertex goes up (2--3 in 2D elasticity). This would be problematic for a few reasons but the most obvious is that the coarse grid would have variable block size. Note, because we have Galerkin coarse grids we can really just add anything to the diagonal. I've had good luck with doing this but it is tedious to get you implementations to deal with these fake equations. But I've found that getting fancy does not help convergence and is not perfect anyway. On Tue, Apr 1, 2014 at 5:59 PM, Jed Brown pullrequests-re...@bitbucket.org wrote: Jed Brown commented on pull request #161: GAMG: shift coarse grids to avoid zero diags from low rank coarse grid space [jedbrown] *Jed Brown* commented on pull request #161: GAMG: shift coarse grids to avoid zero diags from low rank coarse grid spacehttps://bitbucket.org/petsc/petsc/pull-request/161/gamg-shift-coarse-grids-to-avoid-zero#comment-1559358 Hmm, why can't singletons be elided completely? Then you would have a zero row of P. View this pull requesthttps://bitbucket.org/petsc/petsc/pull-request/161/gamg-shift-coarse-grids-to-avoid-zero#comment-1559358or add a comment by replying to this email. In reply to *Mark Adams* avoid[ing] zero columns columns is impossible. And the fix is the same thing that people do for BCs: make an equation not in the math; it is just going along for the ride. I had custom code to do this in Prometheus, it would try to add singletons to other aggregates, in fact it would break up all small aggregates and try to distribute them, but it is not guaranteed to succeed. I could add post processing step where I scan for singletons and then add them to other aggregates but this will not always work: for instance, given a high threshold a (good) input problem might have a singleton in the graph. What do you do? lower the threshold, redo the graph and try again. Really? This is crufty MPI code: any singletons?; send messages to you neighbors: can you take me?; receive replies; pick one; send it. What happens if there are two singletons next to each other? Now I say if I can take you and I am a singleton, and I have a higher processor ID I will take you and you must give me your singleton so I will no longer be a singleton. Fine. What if there are three singletons next each other Just to avoid a BC like equation? This there a polynomial time algorithm to do this? Unwatch this pull requesthttps://bitbucket.org/petsc/petsc/pull-request/161/unwatch/madams/f80fc7b4784df2819a4a5e329f4f4fc1813e22a9/to stop receiving email updates. [image: Bitbucket] https://bitbucket.org
Re: [petsc-users] GMRES error
That's what happens when you try to VecAXPY/MAXPY/whatever with a NaN or inf. We should really have a better error message for this. - Peter On Fri, Mar 28, 2014 at 7:50 AM, Mark Adams mfad...@lbl.gov wrote: I get this error in the middle of a GMRES iteration. Any ideas? 0 SNES Function norm 6.067786990073e+17 Residual norms for fsa_ solve. 0 KSP Residual norm 6.067786990073e+17 1 KSP Residual norm 6.037192129591e+17 2 KSP Residual norm 5.852390318109e+17 3 KSP Residual norm 5.562063100790e+17 4 KSP Residual norm 5.284787094948e+17 5 KSP Residual norm 5.183753281867e+17 6 KSP Residual norm 4.813052731918e+17 7 KSP Residual norm 4.571658388437e+17 8 KSP Residual norm 4.326765203018e+17 9 KSP Residual norm 4.044786063878e+17 10 KSP Residual norm 3.852338418686e+17 11 KSP Residual norm 3.533696733608e+17 12 KSP Residual norm 3.242503119640e+17 13 KSP Residual norm 2.946010215248e+17 14 KSP Residual norm 2.642699328492e+17 15 KSP Residual norm 2.365537803535e+17 16 KSP Residual norm 1.841438363878e+17 17 KSP Residual norm 1.333683638359e+17 18 KSP Residual norm 7.325375319129e+16 19 KSP Residual norm 2.964254455676e+16 20 KSP Residual norm 1.216650547542e+16[3]PETSC ERROR: - Error Message [3]PETSC ERROR: Invalid argument! [3]PETSC ERROR: Scalar value must be same on all processes, argument # 3! [3]PETSC ERROR: [3]PETSC ERROR: Petsc Development GIT revision: a9e422f38a085385bc6ffdad704c9d6a22342ac9 GIT Date: 2013-12-31 01:26:24 -0700 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: [3]PETSC ERROR: ../../XGC1_3/xgc2 on a arch-xc30-dbg named nid04306 by madams Fri Mar 28 05:34:54 2014 [3]PETSC ERROR: Libraries linked from /global/homes/m/madams/petsc_private/arch-xc30-dbg/lib [3]PETSC ERROR: Configure run at Tue Dec 31 14:53:42 2013 [3]PETSC ERROR: Configure options --COPTFLAGS=-g -no-ipo --CXXOPTFLAGS=-g -no-ipo --FOPTFLAGS=-g -no-ipo --download-hypre --download-metis --download-parmetis --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=1 --with-fc=ftn --with-fortranlib-autodetect=0 --with-hdf5-dir=/opt/cray/hdf5-parallel/1.8.9/intel/120/ --with-shared-libraries=0 --with-x=0 LIBS=-lstdc++ PETSC_ARCH=arch-xc30-dbg [3]PETSC ERROR: [3]PETSC ERROR: VecMAXPY() line 1257 in /global/u2/m/madams/petsc_private/src/vec/vec/interface/rvector.c [3]PETSC ERROR: KSPGMRESBuildSoln() line 353 in /global/u2/m/madams/petsc_private/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPGMRESCycle() line 211 in /global/u2/m/madams/petsc_private/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve_GMRES() line 235 in /global/u2/m/madams/petsc_private/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 432 in /global/u2/m/madams/petsc_private/src/ksp/ksp/interface/itfunc.c [3]PETSC ERROR: SNESSolve_KSPONLY() line 44 in /global/u2/m/madams/petsc_private/src/snes/impls/ksponly/ksponly.c [3]PETSC ERROR: SNESSolve() line 3812 in /global/u2/m/madams/petsc_private/src/snes/interface/snes.c Rank 3 [Fri Mar 28 05:35:04 2014] [c6-2c1s4n2] application called MPI_Abort(MPI_COMM_WORLD, 62) - process 3
Re: [petsc-users] Compute norm of a single component of DMDAVec struct
You could use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMCreateFieldDecomposition.html to get DMDAs representing the layout of the individual components and DMCreateGlobalVector() on those DMDAs to get properly laid-out individual field vectors. Then, you would use the ISes given by this function to build VecScatters using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html from the monolithic vector to the component ones. Then apply the scatter and do whatever you want with those vectors (take norms, etc.) Hope this helps. - Peter On Thu, Mar 20, 2014 at 10:38 AM, Mark Lohry mlo...@gmail.com wrote: I'm using a struct for a multi-component PDE as suggested in the manual, like so: typedef struct { PetscScalar u,v,omega,temperature; } Node; Node **f,**u; DMDAVecGetArray(DM da,Vec local,u); DMDAVecGetArray(DM da,Vec global,f); Calling VecNorm(...) on these vectors gives a norm for the entire vector. If one wants separate norms for each component of the struct, i.e. Norm(u) or Norm(v), what's the right approach? Would I need to manually compute norms locally and then call an MPI reduce function, or is this ability built-in to PETSc somewhere? -Mark Lohry
Re: [petsc-users] When does DIVERGED_LINE_SEARCH Happen?
Is there more output from the line search? What happens when you run with -snes_linesearch_monitor? I remember there being a reason that I didn't put this update in the maintenance branch. Let me figure out exactly why and get back to you. On Mon, Mar 17, 2014 at 5:37 PM, Dafang Wang dafang.w...@jhu.edu wrote: Hi Peter, My version of PETSc (v3.4.3) does not contain the bug fix you mentioned: + ierr = SNESLineSearchSetNorms(linesearch,xnorm,fnorm,ynorm);CHKERRQ(ierr); Would that be a problem? I typically used the default value of -snes_stol, never setting it to zero. I will let you know soon if you believe this is important. It would certainly be worth a try. - Peter Cheers, Dafang On 03/17/2014 06:27 PM, Peter Brune wrote: This may be related to a bug we had reported before to petsc-maint: https://bitbucket.org/petsc/petsc/commits/ced04f9d467b04aa83a18d3f8875c7f72c17217a What version of PETSc are you running? Also, what happens if you set -snes_stol to zero? Thanks, - Peter On Mon, Mar 17, 2014 at 5:19 PM, Dafang Wang dafang.w...@jhu.edu wrote: Hi Barry, Thanks for your tips. I have read the webpage you mentioned many times before, but still I have been stuck on the line-search problem for weeks. I cannot guarantee my Jacobian is correct but I believe an incorrect Jacobian is very unlikely. My Jacobian-calculation code has been under test for a year with both analytical and realistic models, and the results have been good until recently when I ran a very realistic physical model. Also, I looked up the implementation of SNESSolve_NEWTONLS() in ls.c. According to the algorithm, when the function SNESLineSearchApply() does not succeed, one may encounter two possible outcomes: CONVERGED_SNORM_RELATIVE (if the search step is too small) or otherwise, DIVERGED_LINE_SEARCH. Does this mean that both these two outcomes indicate that the line search fails? I ask this question because my simulation encountered many CONVERGED_SNORM_RELATIVE. I treated them as if my nonlinear system converged, accepted the nonlinear solution, and then proceeded to the next time step of my simulation. Apparently, such practice has worked well in most cases, (even when I encountered suspicious DIVERGED_LINE_SEARCH behaviors). However, I wonder if there are any potential pitfalls in my practice such as missing a nonlinear solve divergence and taking a partial solution as the correct solution. Thank you very much for your time and help. Best, Dafang On 03/15/2014 11:15 AM, Barry Smith wrote: Failed line search are almost always due to an incorrect Jacobian. Please let us know if the suggestions at http://www.mcs.anl.gov/petsc/documentation/faq.html#newton don't help. Barry On Mar 14, 2014, at 8:57 PM, Dafang Wang dafang.w...@jhu.edu wrote: Hi, Does anyone know what the error code DIVERGED_LINE_SEARCH means in the SNES nonlinear solve? Or what scenario would lead to this error code? Running a solid mechanics simulation, I found that the occurrence of DIVERGED_LINE_SEARCH was very unpredictable and sensitive to the input values to my nonlinear system, although my system should not be that unstable. As shown by the two examples below, my system diverged in one case and converged in the other, although the input values in these two cases differed by only 1e-4, Moreover, the Newton steps in the two cases were very similar up to NL step 1. Since then, however, Case 1 encountered a line-search divergence whereas Case 2 converged successfully. This is my main confusion. (Note that each residual vector contains 3e04 DOF, so when their L2 norms differ within 1e-4, the two systems should be very close.) My simulation input consists of two scalar values (p1 and p2), each of which acts as a constant pressure boundary condition. Case 1, diverge: p1= -10.190869 p2= -2.367555 NL step 0, |residual|_2 = 1.621402e-02 Line search: Using full step: fnorm 1.621401550027e-02 gnorm 7.022558235262e-05 NL step 1, |residual|_2 = 7.022558e-05 Line search: Using full step: fnorm 7.022558235262e-05 gnorm 1.636418730611e-06 NL step 2, |residual|_2 = 1.636419e-06 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 2 Case 2: converge: p1= -10.190747 p2= -2.367558 NL step 0, |residual|_2 = 1.621380e-02 Line search: Using full step: fnorm 1.621379778276e-02 gnorm 6.976373804153e-05 NL step 1, |residual|_2 = 6.976374e-05 Line search: Using full step: fnorm 6.976373804153e-05 gnorm 4.000992847275e-07 NL step 2, |residual|_2 = 4.000993e-07 Line search: Using full step: fnorm 4.000992847275e-07 gnorm 1.621646014441e-08 NL step 3, |residual|_2 = 1.621646e-08 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 Aside from the input values, the initial solution in both cases may differ very slightly. (Each case is one time step
Re: [petsc-users] Drastic increase in memory usage between DMDA_STENCIL_BOX and DMDA_STENCIL_STAR
On Tue, Mar 18, 2014 at 8:04 PM, Mani Chandra mc0...@gmail.com wrote: Hi, I see a 4x increase in the memory usage when I change from DMDA_STENCIL_STAR to DMDA_STENCIL_BOX. Attached are the outputs of -log_summary which shows a huge increase in the matrix memory usage. Is this expected? Yup. In 2D your matrices balloon from 5 entries per row to 9. In 3D it's 7 to 27. The number of colors required to color the matrix will increase by a significant factor as well, explaining the situation in your log files by simple multiplication. On a different note, suppose I am running a serial calculation with no need to exchange data but I am using corner node information, do I need to use DMDA_STENCIL_BOX? Would the jacobian when computed using colored finite differences be correctly represented if I use corner information but still use DMDA_STENCIL_STAR? If you're using diagonal corners in your function evaluation, then you need DMDA_STENCIL_BOX. Otherwise you're going to have columns of the Jacobian of the same color that aren't structurally orthogonal, and the finite difference Jacobian will be straightup wrong. - Peter Thanks, Mani
Re: [petsc-users] When does DIVERGED_LINE_SEARCH Happen?
This may be related to a bug we had reported before to petsc-maint: https://bitbucket.org/petsc/petsc/commits/ced04f9d467b04aa83a18d3f8875c7f72c17217a What version of PETSc are you running? Also, what happens if you set -snes_stol to zero? Thanks, - Peter On Mon, Mar 17, 2014 at 5:19 PM, Dafang Wang dafang.w...@jhu.edu wrote: Hi Barry, Thanks for your tips. I have read the webpage you mentioned many times before, but still I have been stuck on the line-search problem for weeks. I cannot guarantee my Jacobian is correct but I believe an incorrect Jacobian is very unlikely. My Jacobian-calculation code has been under test for a year with both analytical and realistic models, and the results have been good until recently when I ran a very realistic physical model. Also, I looked up the implementation of SNESSolve_NEWTONLS() in ls.c. According to the algorithm, when the function SNESLineSearchApply() does not succeed, one may encounter two possible outcomes: CONVERGED_SNORM_RELATIVE (if the search step is too small) or otherwise, DIVERGED_LINE_SEARCH. Does this mean that both these two outcomes indicate that the line search fails? I ask this question because my simulation encountered many CONVERGED_SNORM_RELATIVE. I treated them as if my nonlinear system converged, accepted the nonlinear solution, and then proceeded to the next time step of my simulation. Apparently, such practice has worked well in most cases, (even when I encountered suspicious DIVERGED_LINE_SEARCH behaviors). However, I wonder if there are any potential pitfalls in my practice such as missing a nonlinear solve divergence and taking a partial solution as the correct solution. Thank you very much for your time and help. Best, Dafang On 03/15/2014 11:15 AM, Barry Smith wrote: Failed line search are almost always due to an incorrect Jacobian. Please let us know if the suggestions at http://www.mcs.anl.gov/petsc/ documentation/faq.html#newton don't help. Barry On Mar 14, 2014, at 8:57 PM, Dafang Wang dafang.w...@jhu.edu wrote: Hi, Does anyone know what the error code DIVERGED_LINE_SEARCH means in the SNES nonlinear solve? Or what scenario would lead to this error code? Running a solid mechanics simulation, I found that the occurrence of DIVERGED_LINE_SEARCH was very unpredictable and sensitive to the input values to my nonlinear system, although my system should not be that unstable. As shown by the two examples below, my system diverged in one case and converged in the other, although the input values in these two cases differed by only 1e-4, Moreover, the Newton steps in the two cases were very similar up to NL step 1. Since then, however, Case 1 encountered a line-search divergence whereas Case 2 converged successfully. This is my main confusion. (Note that each residual vector contains 3e04 DOF, so when their L2 norms differ within 1e-4, the two systems should be very close.) My simulation input consists of two scalar values (p1 and p2), each of which acts as a constant pressure boundary condition. Case 1, diverge: p1= -10.190869 p2= -2.367555 NL step 0, |residual|_2 = 1.621402e-02 Line search: Using full step: fnorm 1.621401550027e-02 gnorm 7.022558235262e-05 NL step 1, |residual|_2 = 7.022558e-05 Line search: Using full step: fnorm 7.022558235262e-05 gnorm 1.636418730611e-06 NL step 2, |residual|_2 = 1.636419e-06 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 2 Case 2: converge: p1= -10.190747 p2= -2.367558 NL step 0, |residual|_2 = 1.621380e-02 Line search: Using full step: fnorm 1.621379778276e-02 gnorm 6.976373804153e-05 NL step 1, |residual|_2 = 6.976374e-05 Line search: Using full step: fnorm 6.976373804153e-05 gnorm 4.000992847275e-07 NL step 2, |residual|_2 = 4.000993e-07 Line search: Using full step: fnorm 4.000992847275e-07 gnorm 1.621646014441e-08 NL step 3, |residual|_2 = 1.621646e-08 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 Aside from the input values, the initial solution in both cases may differ very slightly. (Each case is one time step in a time-sequence simulation. The two cases behaved nearly identically up to the last time step before the step shown above, so their initial solutions may differ by a cumulative error but such error should be very small.) Is it possible that little difference in initial guess leads to different local minimum regions where the line search in Case 1 failed? Any comments will be greatly appreciated. Thanks, Dafang -- Dafang Wang, Ph.D Postdoctoral Fellow Institute of Computational Medicine Department of Biomedical Engineering Johns Hopkins University Hackerman Hall Room 218 Baltimore, MD, 21218 -- Dafang Wang, Ph.D Postdoctoral Fellow Institute of Computational Medicine Department of Biomedical Engineering Johns Hopkins University Hackerman
Re: [petsc-users] TS matrix free
Please send the entire error output. - Peter On Wed, Jan 29, 2014 at 10:09 AM, Konstantinos Kontzialis ckontzia...@lycos.com wrote: Dear all, I am using a matrix free method with a user defined preconditioner. I run with the option -snes_mf_operator, but I get the following: Must call DMShellSetMatrix() or DMShellSetCreateMatrix()! Why is this happening? I use the TS library. Kostas
Re: [petsc-users] Parallel Graph Coloring
On Tue, Nov 5, 2013 at 10:29 AM, Gaetan Kenway gaet...@gmail.com wrote: Hi I have a quick question regarding the MatGetColoring() function. According to the documentation, For parallel matrices currently converts to sequential matrix and uses the sequential coloring on that. The meaning of this is that it transfers the parallel matrix to a single processor and does the coloring there, not that it only considers on-processor entries. What comes out is therefore a valid column coloring of the whole matrix. Parallel matrix coloring is in development but is not quite ready for public consumption. Anecdotal evidence suggests that the serial coloring does not become a bottleneck on small parallel runs. A good option is to provide the coloring through your discretization. This will often be more efficient than the matrix colorings, as you know the structure of your problem. - Peter I am wondering does this give actually give a valid parallel coloring? My application is a cell centered multi-block finite volume code, (with block-based decomposition) and the communications are done using a two level exchange of halo cells. I would like to use FD (actually forward mode AD to compute the jacbian matrix). So, what I'm wondering is, if a cell with color 0 is perturbed on processor 0, is it guaranteed that a residual on processor 1, that is influenced by the 0-colored cell on proc zero is *only* influenced by the color 0 from proc 0 and not a 0-colored cell on proc 1? I hope that is clear Thank you, Gaetan Kenway
Re: [petsc-users] Parallel Graph Coloring
On Tue, Nov 5, 2013 at 11:07 AM, Gaetan Kenway gaet...@gmail.com wrote: Hi Peter Thanks for the info. I actually already have a coloring that is done through the discretrization that is near optimal. The issue is that I have a funny interprocessor dependence that is tricky to do my own sequential coloring. So I was looking at doing it in a global, parallel sense. For interest, I tried using the matGetColor() on my assembled matrix just to see how many colors it would take. My discretrization coloring has 35 colors (for a stencil with 33 cells) and the matGetColor() resulted in approximately 78 colors or about twice as many. Jacobian coloring requires a column, or distance-2 coloring of the matrix. The distance-2 colorings we get from the serial algorithms aren't that bad, so I suspect that you might be doing a 1-coloring from your mesh; I assume that you mean that your residual/Jacobian stencil has 33 entries rather than some distance-2 thing? How did the problems with your coloring manifest themselves? - Peter Thanks, Gaetan On Tue, Nov 5, 2013 at 11:39 AM, Peter Brune br...@mcs.anl.gov wrote: On Tue, Nov 5, 2013 at 10:29 AM, Gaetan Kenway gaet...@gmail.com wrote: Hi I have a quick question regarding the MatGetColoring() function. According to the documentation, For parallel matrices currently converts to sequential matrix and uses the sequential coloring on that. The meaning of this is that it transfers the parallel matrix to a single processor and does the coloring there, not that it only considers on-processor entries. What comes out is therefore a valid column coloring of the whole matrix. Parallel matrix coloring is in development but is not quite ready for public consumption. Anecdotal evidence suggests that the serial coloring does not become a bottleneck on small parallel runs. A good option is to provide the coloring through your discretization. This will often be more efficient than the matrix colorings, as you know the structure of your problem. - Peter I am wondering does this give actually give a valid parallel coloring? My application is a cell centered multi-block finite volume code, (with block-based decomposition) and the communications are done using a two level exchange of halo cells. I would like to use FD (actually forward mode AD to compute the jacbian matrix). So, what I'm wondering is, if a cell with color 0 is perturbed on processor 0, is it guaranteed that a residual on processor 1, that is influenced by the 0-colored cell on proc zero is *only* influenced by the color 0 from proc 0 and not a 0-colored cell on proc 1? I hope that is clear Thank you, Gaetan Kenway
Re: [petsc-users] How to get processes layout from already created DMDA object?
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMDAGetOwnershipRanges.html will give you the ownership ranges in each dimension. - Peter On Wed, Sep 25, 2013 at 10:08 AM, Petr Kotas petr.ko...@gmail.com wrote: Hi guys, I have probably an easy question. How do I extract distribution of processes from already created DM object. Which was created with call to ierr = DMDACreate3d(PETSC_COMM_WORLD, boundaryType, boundaryType, boundaryType, DMDA_STENCIL_BOX, M[0], M[1], M[2], PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, dof, stencilSize, PETSC_NULL, PETSC_NULL, PETSC_NULL, da); where *boundaryType*, *M[i], dof, stencilSize *are my variables. Simply, I would like to to know how PETSC_DECIDE(ed) about the layout. I need it for my other part of the code. Thank You, Petr Kotas
Re: [petsc-users] SNES Line Search Monitor
On Sat, Aug 17, 2013 at 6:46 PM, Su Yan suy...@gmail.com wrote: I am using PETSc 3.4.1. This problem is reproducible. This is not reproduction of the problem. Does it happen with a simple test problem that you can allow us to debug? Additionally, does your problem have the same convergence history every time that it is run with the same options? If not, then the -snes_linesearch_monitor is most likely a misdiagnosis of some other difficulty. - Peter I tested several other cases, almost every case has the similar problem. Take this one as an example: with ./abc.bin -snes_monitor -snes_linesearch_monitor 0 SNES Function norm 1.470975405802e+06 Line search: lambdas = [1, 0.5, 0], fnorms = [7051.02, 735338, 1.47098e+06] Line search terminated: lambda = 0.999762, fnorms = 7026.38 1 SNES Function norm 7.026377609722e+03 Line search: lambdas = [1, 0.5, 0], fnorms = [189.711, 3514.55, 7026.38] Line search terminated: lambda = 0.999295, fnorms = 189.783 2 SNES Function norm 1.897827226353e+02 Line search: lambdas = [1, 0.5, 0], fnorms = [32.3811, 100.302, 189.783] Line search terminated: lambda = 1.01592, fnorms = 32.5125 3 SNES Function norm 3.251248928384e+01 Line search: lambdas = [1, 0.5, 0], fnorms = [0.00236829, 16.2564, 32.5125] Line search terminated: lambda = 1.1, fnorms = 0.00233087 4 SNES Function norm 2.330867616400e-03 with ./abc.bin -snes_monitor 0 SNES Function norm 1.470975405802e+06 1 SNES Function norm 7.026377609170e+03 2 SNES Function norm 1.897827231670e+02 3 SNES Function norm 3.251248934919e+01 4 SNES Function norm 2.171496483661e-02 Significant difference can be observed. The equation I am solving is quite ill-conditioned. Without L2 line search it is hard to converge in some cases. Still try to figure out the reason. Thanks, Su On Sat, Aug 17, 2013 at 8:49 AM, Peter Brune br...@mcs.anl.gov wrote: On Sat, Aug 17, 2013 at 1:40 AM, Su Yan suy...@gmail.com wrote: Hi, I ran into something really weird when I tried to solve a nonlinear equation with Newton method and line search. Specifically, I used SNESSetType(snesMbr, SNESNEWTONLS); and PetscOptionsSetValue(-snes_linesearch_type, l2); Which version of PETSc are you using? When I execute my program abc.bin with the following command: ./abc.bin -snes_monitor -snes_linesearch_monitor I got the following output: 0 SNES Function norm 1.457697974866e+07 Line search: lambdas = [1, 0.5, 0], fnorms = [669102, 7.35102e+06, 1.4577e+07] Line search terminated: lambda = 1.00553, fnorms = 652606 1 SNES Function norm 6.526060362905e+05 Line search: lambdas = [1, 0.5, 0], fnorms = [3406.6, 326873, 652606] Line search terminated: lambda = 1.00171, fnorms = 2801.6 2 SNES Function norm 2.801596249480e+03 Line search: lambdas = [1, 0.5, 0], fnorms = [2.51242, 1401.2, 2801.6] Line search terminated: lambda = 1.00029, fnorms = 2.09292 3 SNES Function norm 2.092918540169e+00 Line search: lambdas = [1, 0.5, 0], fnorms = [0.000123295, 1.04646, 2.09292] Line search terminated: lambda = 1, fnorms = 0.000122588 4 SNES Function norm 1.225883678418e-04 Converged Reason: FNORM_RELATIVE The nonlinear problem converged normally with a relative f_norm set as 1E-7. However, if I execute exactly the same program, but with a slightly different runtime command: ./abc.bin -snes_monitor I got the following output: 0 SNES Function norm 1.457697974975e+07 1 SNES Function norm 6.526060348917e+05 2 SNES Function norm 2.801608208510e+03 3 SNES Function norm 2.450488738084e+03 4 SNES Function norm 3.269507987119e+02 5 SNES Function norm 3.016606325384e+02 6 SNES Function norm 2.463851989463e+02 7 SNES Function norm 1.546266418976e+02 8 SNES Function norm 1.492518400407e+02 9 SNES Function norm 1.477122410995e+02 10 SNES Function norm 1.503359418680e+02 11 SNES Function norm 1.504759910776e+02 12 SNES Function norm 1.417592634863e+02 13 SNES Function norm 3.047096130411e+05 Is this problem reproducible? Note that even your 0th norm is very slightly different, and the first three norms are quite similar. and the solver diverged. The only difference was that whether I used -snes_linesearch_monitor or not. In my understanding, this runtime option only turns on the screen print. So why did it make such a big difference? Is there anything special with this option turned on? Hope someone could help me out. This is correct. It should not influence the solution at all, and all it does is enable printing. Also, the l2 line search takes significantly more work for well-behaved Newton convergence than bt, as bt automatically accepts the full step. l2 is mostly meant for cases where the step is automatically ill-scaled. - Peter Thanks a lot. Regards, Su
Re: [petsc-users] SNES Line Search Monitor
On Sat, Aug 17, 2013 at 1:40 AM, Su Yan suy...@gmail.com wrote: Hi, I ran into something really weird when I tried to solve a nonlinear equation with Newton method and line search. Specifically, I used SNESSetType(snesMbr, SNESNEWTONLS); and PetscOptionsSetValue(-snes_linesearch_type, l2); Which version of PETSc are you using? When I execute my program abc.bin with the following command: ./abc.bin -snes_monitor -snes_linesearch_monitor I got the following output: 0 SNES Function norm 1.457697974866e+07 Line search: lambdas = [1, 0.5, 0], fnorms = [669102, 7.35102e+06, 1.4577e+07] Line search terminated: lambda = 1.00553, fnorms = 652606 1 SNES Function norm 6.526060362905e+05 Line search: lambdas = [1, 0.5, 0], fnorms = [3406.6, 326873, 652606] Line search terminated: lambda = 1.00171, fnorms = 2801.6 2 SNES Function norm 2.801596249480e+03 Line search: lambdas = [1, 0.5, 0], fnorms = [2.51242, 1401.2, 2801.6] Line search terminated: lambda = 1.00029, fnorms = 2.09292 3 SNES Function norm 2.092918540169e+00 Line search: lambdas = [1, 0.5, 0], fnorms = [0.000123295, 1.04646, 2.09292] Line search terminated: lambda = 1, fnorms = 0.000122588 4 SNES Function norm 1.225883678418e-04 Converged Reason: FNORM_RELATIVE The nonlinear problem converged normally with a relative f_norm set as 1E-7. However, if I execute exactly the same program, but with a slightly different runtime command: ./abc.bin -snes_monitor I got the following output: 0 SNES Function norm 1.457697974975e+07 1 SNES Function norm 6.526060348917e+05 2 SNES Function norm 2.801608208510e+03 3 SNES Function norm 2.450488738084e+03 4 SNES Function norm 3.269507987119e+02 5 SNES Function norm 3.016606325384e+02 6 SNES Function norm 2.463851989463e+02 7 SNES Function norm 1.546266418976e+02 8 SNES Function norm 1.492518400407e+02 9 SNES Function norm 1.477122410995e+02 10 SNES Function norm 1.503359418680e+02 11 SNES Function norm 1.504759910776e+02 12 SNES Function norm 1.417592634863e+02 13 SNES Function norm 3.047096130411e+05 Is this problem reproducible? Note that even your 0th norm is very slightly different, and the first three norms are quite similar. and the solver diverged. The only difference was that whether I used -snes_linesearch_monitor or not. In my understanding, this runtime option only turns on the screen print. So why did it make such a big difference? Is there anything special with this option turned on? Hope someone could help me out. This is correct. It should not influence the solution at all, and all it does is enable printing. Also, the l2 line search takes significantly more work for well-behaved Newton convergence than bt, as bt automatically accepts the full step. l2 is mostly meant for cases where the step is automatically ill-scaled. - Peter Thanks a lot. Regards, Su
[petsc-users] odd SNES behavior
Mark - I can reproduce on a simple single processor example. It appears that the MatMFFD is being used within GAMG instead of the preconditioning matrix. Therefore, SNESComputeFunction is being called at each chebychev iteration on the coarse grid. For instance, I get backtraces like: #0 SNESComputeFunction (snes=0x870250, x=0xb1a020, y=0xc5e150) at /home/prbrune/packages/petsc-dev/src/snes/interface/snes.c:1993 #1 0x768964cf in MatMult_MFFD (mat=0xa82e30, a=0xc54350, y=0xc5e150) at /home/prbrune/packages/petsc-dev/src/mat/impls/mffd/mffd.c:404 #2 0x76692070 in MatMult (mat=0xa82e30, x=0xc54350, y=0xc5e150) at /home/prbrune/packages/petsc-dev/src/mat/interface/matrix.c:2156 #3 0x7705fa81 in KSP_MatMult (ksp=0xbf6410, A=0xa82e30, x=0xc54350, y=0xc5e150) at /home/prbrune/packages/petsc-dev/include/petsc-private/kspimpl.h:198 #4 0x7706686d in KSPSolve_Chebyshev (ksp=0xbf6410) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/impls/cheby/cheby.c:502 #5 0x76f7c53c in KSPSolve (ksp=0xbf6410, b=0xa6e9a0, x=0xb41990) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/interface/itfunc.c:430 #6 0x76f4d9ae in PCMGMCycle_Private (pc=0xa22bf0, mglevelsin=0xbd72c8, reason=0x0) at /home/prbrune/packages/petsc-dev/src/ksp/pc/impls/mg/mg.c:19 #7 0x76f52280 in PCApply_MG (pc=0xa22bf0, b=0xa6e9a0, x=0xb41990) at /home/prbrune/packages/petsc-dev/src/ksp/pc/impls/mg/mg.c:330 #8 0x76e32b23 in PCApply (pc=0xa22bf0, x=0xa6e9a0, y=0xb41990) at /home/prbrune/packages/petsc-dev/src/ksp/pc/interface/precon.c:390 #9 0x76fa8a4d in KSP_PCApply (ksp=0x9e60f0, x=0xa6e9a0, y=0xb41990) at /home/prbrune/packages/petsc-dev/include/petsc-private/kspimpl.h:221 #10 0x76fa98f3 in KSPInitialResidual (ksp=0x9e60f0, vsoln=0xa79030, vt1=0xb2d4f0, vt2=0xb37b90, vres=0xb41990, vb=0xa6e9a0) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/interface/itres.c:64 #11 0x77031a88 in KSPSolve_GMRES (ksp=0x9e60f0) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/impls/gmres/gmres.c:234 #12 0x76f7c53c in KSPSolve (ksp=0x9e60f0, b=0xa6e9a0, x=0xa79030) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/interface/itfunc.c:430 #13 0x7714b1ea in SNES_KSPSolve (snes=0x870250, ksp=0x9e60f0, b=0xa6e9a0, x=0xa79030) at /home/prbrune/packages/petsc-dev/src/snes/interface/snes.c:4418 #14 0x770d8ec5 in SNESSolve_NEWTONLS (snes=0x870250) at /home/prbrune/packages/petsc-dev/src/snes/impls/ls/ls.c:217 #15 0x77142e9b in SNESSolve (snes=0x870250, b=0x0, x=0xa24840) at /home/prbrune/packages/petsc-dev/src/snes/interface/snes.c:3649 #16 0x00402c83 in main (argc=9, argv=0x7fffde08) at ex5.c:157 Is the wrong matrix getting sent to the smoothers somewhere? - Peter On Wed, Mar 6, 2013 at 11:25 AM, Mark F. Adams mark.adams at columbia.eduwrote: I have a SNES solve if I switch the pc_type between gamg and hypre I get some odd results: 1) The gamg solve takes two more Newton iterations and converges for a different reason. This could be OK, but it seems odd. 2) The more mystifying issue is that the gamg solve seems to take just a few more total linear solve iterations but does about 5x as many function evaluations. I've attached two log files. Any ideas what is going on here? It seems like it should be impossible for the choice of PC to effect the nonlinear solve like this. Mark -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20130306/842f9c51/attachment.html
[petsc-users] odd SNES behavior
That's the thing; you WANT a MFFD matrix in the snes-jacobian slot but not the snes-jacobian_pre slot. Somewhere the PC's getting the wrong one and putting it on the smoother. - Peter On Wed, Mar 6, 2013 at 3:44 PM, Mark F. Adams mark.adams at columbia.eduwrote: Yikes, the logic of SNES and MG (with dm etc) is getting a bit too convoluted. I would run in the debugger with a break point for MatCreate_MFFD() this will give a hint why it is being used. #0 0x000100fca059 in MatCreate_MFFD () #1 0x000100eedc34 in MatSetType () #2 0x000100fc7bc3 in MatCreateMFFD () #3 0x0001012acc90 in MatCreateSNESMF () #4 0x0001012c092e in SNESSetUp () #5 0x0001012c277b in SNESSolve () I will keep digging. This is the first one and it is not in the PC ... -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20130306/f6871cd9/attachment.html
[petsc-users] odd SNES behavior
This is a likely culprit: (gdb) bt #0 KSPSetOperators (ksp=0xbf6410, Amat=0xa82e30, Pmat=0xaac190, flag=DIFFERENT_NONZERO_PATTERN) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/interface/itcreate.c:540 #1 0x76f554cf in PCSetUp_MG (pc=0xa22bf0) at /home/prbrune/packages/petsc-dev/src/ksp/pc/impls/mg/mg.c:572 #2 0x76ef2f07 in PCSetUp_GAMG (pc=0xa22bf0) at /home/prbrune/packages/petsc-dev/src/ksp/pc/impls/gamg/gamg.c:974 #3 0x76e3936e in PCSetUp (pc=0xa22bf0) at /home/prbrune/packages/petsc-dev/src/ksp/pc/interface/precon.c:838 #4 0x76f7adb7 in KSPSetUp (ksp=0x9e60f0) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/interface/itfunc.c:278 #5 0x76f7bc7d in KSPSolve (ksp=0x9e60f0, b=0xa6e9a0, x=0xa79030) at /home/prbrune/packages/petsc-dev/src/ksp/ksp/interface/itfunc.c:388 #6 0x7714b1ea in SNES_KSPSolve (snes=0x870250, ksp=0x9e60f0, b=0xa6e9a0, x=0xa79030) at /home/prbrune/packages/petsc-dev/src/snes/interface/snes.c:4418 #7 0x770d8ec5 in SNESSolve_NEWTONLS (snes=0x870250) at /home/prbrune/packages/petsc-dev/src/snes/impls/ls/ls.c:217 #8 0x77142e9b in SNESSolve (snes=0x870250, b=0x0, x=0xa24840) at /home/prbrune/packages/petsc-dev/src/snes/interface/snes.c:3649 #9 0x00402c83 in main (argc=9, argv=0x7fffde08) at ex5.c:157 (gdb) up #1 0x76f554cf in PCSetUp_MG (pc=0xa22bf0) at /home/prbrune/packages/petsc-dev/src/ksp/pc/impls/mg/mg.c:572 572 ierr = KSPSetOperators(mglevels[n-1]-smoothd,pc-mat,pc-pmat,pc-flag);CHKERRQ(ierr); The matrix on the smoother gets RESET to the non-preconditioner matrix in the final call to PCSetUp_MG in PCSetUp_GAMG. You could just reset it after that, like you do with the PC type. - Peter On Wed, Mar 6, 2013 at 4:31 PM, Mark F. Adams mark.adams at columbia.eduwrote: It looks like SNES is not getting set it's master Jacobian. If you run the same code with hypre does it also hit this create? MatCreate_MFFD is called just once with gamg and hypre. It looks like Cheby is using the outside operator instead of the pre_op. This could be desirable but I don't want it here. I'll keep digging. -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20130306/f1a9fe87/attachment.html
[petsc-users] SNES line search
On Wed, Oct 17, 2012 at 2:40 PM, Dmitry Karpeev karpeev at mcs.anl.gov wrote: On Wed, Oct 17, 2012 at 2:11 PM, Jed Brown jedbrown at mcs.anl.gov wrote: On Wed, Oct 17, 2012 at 2:02 PM, Shiyuan Gu sgu at anl.gov wrote: Hi, I am using petsc-3.3-p2. For my particular equation, for the first few iterations, the residual norm is effectively reduced and the solution X is very close to the exact solution (the residual norm is e-17). Then in the next iteration, SNES_KSPSolve() returns a newton step very close to zero which is correct, but the line search after it fails, and the function SNESLSCheckLocalMin_Private(...) is called. In the function SNESLSCheckLocalMin_Private(), the solution vector X is rewritten to X=J^T*F where J is the Jacobian and F is the residual. Since F is very close to zero, X is also zero which is wrong. Is there a way to skip the line search?(if the line search is not performed, there would be no problem for this particular equation). How should we handle this situation? You can turn off the line search with -snes_linesearch_type basic, but I recommend setting an appropriate atol and stol instead, to prevent attempting to solve the system more accurately than machine precision when given a good initial guess. Maybe it wouldn't be a bad idea to (a) not use the solution vector as a temporary in detecting the local minimum, Dumping garbage into X upon failure appears to be my careless mistake. I'll fix it in 3.3. (b) print the norm of the last solution before bailing out of the Newton loop? This can go into the -info output. - Peter Dmitry. Also, after the solution vector X is rewritten by SNESLSCheckLocalMin_Private(), the new residual norm is not printed on screen (which is huge for this particular problem). Since the residual norm printed on the screen is going to machine accuracy, users might think that the SNESSolve converges nicely and the returned solution is correct. This seems a bit misleading in my opinion. Are you checking -snes_converged_reason? Thanks. Shiyuan -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20121017/47457651/attachment.html
[petsc-users] SNES line search
On Wed, Oct 17, 2012 at 3:01 PM, Dmitry Karpeev karpeev at mcs.anl.gov wrote: On Wed, Oct 17, 2012 at 2:56 PM, Peter Brune prbrune at gmail.com wrote: On Wed, Oct 17, 2012 at 2:40 PM, Dmitry Karpeev karpeev at mcs.anl.govwrote: On Wed, Oct 17, 2012 at 2:11 PM, Jed Brown jedbrown at mcs.anl.gov wrote: On Wed, Oct 17, 2012 at 2:02 PM, Shiyuan Gu sgu at anl.gov wrote: Hi, I am using petsc-3.3-p2. For my particular equation, for the first few iterations, the residual norm is effectively reduced and the solution X is very close to the exact solution (the residual norm is e-17). Then in the next iteration, SNES_KSPSolve() returns a newton step very close to zero which is correct, but the line search after it fails, and the function SNESLSCheckLocalMin_Private(...) is called. In the function SNESLSCheckLocalMin_Private(), the solution vector X is rewritten to X=J^T*F where J is the Jacobian and F is the residual. Since F is very close to zero, X is also zero which is wrong. Is there a way to skip the line search?(if the line search is not performed, there would be no problem for this particular equation). How should we handle this situation? You can turn off the line search with -snes_linesearch_type basic, but I recommend setting an appropriate atol and stol instead, to prevent attempting to solve the system more accurately than machine precision when given a good initial guess. Maybe it wouldn't be a bad idea to (a) not use the solution vector as a temporary in detecting the local minimum, Dumping garbage into X upon failure appears to be my careless mistake. I'll fix it in 3.3. (b) print the norm of the last solution before bailing out of the Newton loop? This can go into the -info output. Why not (also) let SNESMonitor() be called on that last solution? Oh, I interpreted norm of the solution literally rather than norm of the function at that solution, which makes more sense. However, as it quits after failing the line search and doesn't transfer the failed solution over. This is indeed a divergence, so printing a failed norm probably isn't the right solution. You can see what the line search is doing with -snes_linesearch_monitor. We changed the behavior in petsc-dev a little bit with respect to this type of case, as we were seeing bad interaction between line search divergence in the case of small step sizes and small-normed solutions with the logic as it was in 3.3 and before. This could be that type of problem, where the solution is well converged and the line search cannot make more progress. In any case, I'll fix the problem with writing garbage into the solution. In order to truly diagnose this, output with -snes_linesearch_monitor and -snes_converged reason would be nice. - Peter Dmitry. - Peter Dmitry. Also, after the solution vector X is rewritten by SNESLSCheckLocalMin_Private(), the new residual norm is not printed on screen (which is huge for this particular problem). Since the residual norm printed on the screen is going to machine accuracy, users might think that the SNESSolve converges nicely and the returned solution is correct. This seems a bit misleading in my opinion. Are you checking -snes_converged_reason? Thanks. Shiyuan -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20121017/bae7df4d/attachment-0001.html
[petsc-users] SNES line search
Fix for the garbage in X part is in. http://petsc.cs.iit.edu/petsc/releases/petsc-3.3/rev/9d679552ef45 - Peter On Wed, Oct 17, 2012 at 3:34 PM, Peter Brune prbrune at gmail.com wrote: On Wed, Oct 17, 2012 at 3:01 PM, Dmitry Karpeev karpeev at mcs.anl.govwrote: On Wed, Oct 17, 2012 at 2:56 PM, Peter Brune prbrune at gmail.com wrote: On Wed, Oct 17, 2012 at 2:40 PM, Dmitry Karpeev karpeev at mcs.anl.govwrote: On Wed, Oct 17, 2012 at 2:11 PM, Jed Brown jedbrown at mcs.anl.govwrote: On Wed, Oct 17, 2012 at 2:02 PM, Shiyuan Gu sgu at anl.gov wrote: Hi, I am using petsc-3.3-p2. For my particular equation, for the first few iterations, the residual norm is effectively reduced and the solution X is very close to the exact solution (the residual norm is e-17). Then in the next iteration, SNES_KSPSolve() returns a newton step very close to zero which is correct, but the line search after it fails, and the function SNESLSCheckLocalMin_Private(...) is called. In the function SNESLSCheckLocalMin_Private(), the solution vector X is rewritten to X=J^T*F where J is the Jacobian and F is the residual. Since F is very close to zero, X is also zero which is wrong. Is there a way to skip the line search?(if the line search is not performed, there would be no problem for this particular equation). How should we handle this situation? You can turn off the line search with -snes_linesearch_type basic, but I recommend setting an appropriate atol and stol instead, to prevent attempting to solve the system more accurately than machine precision when given a good initial guess. Maybe it wouldn't be a bad idea to (a) not use the solution vector as a temporary in detecting the local minimum, Dumping garbage into X upon failure appears to be my careless mistake. I'll fix it in 3.3. (b) print the norm of the last solution before bailing out of the Newton loop? This can go into the -info output. Why not (also) let SNESMonitor() be called on that last solution? Oh, I interpreted norm of the solution literally rather than norm of the function at that solution, which makes more sense. However, as it quits after failing the line search and doesn't transfer the failed solution over. This is indeed a divergence, so printing a failed norm probably isn't the right solution. You can see what the line search is doing with -snes_linesearch_monitor. We changed the behavior in petsc-dev a little bit with respect to this type of case, as we were seeing bad interaction between line search divergence in the case of small step sizes and small-normed solutions with the logic as it was in 3.3 and before. This could be that type of problem, where the solution is well converged and the line search cannot make more progress. In any case, I'll fix the problem with writing garbage into the solution. In order to truly diagnose this, output with -snes_linesearch_monitor and -snes_converged reason would be nice. - Peter Dmitry. - Peter Dmitry. Also, after the solution vector X is rewritten by SNESLSCheckLocalMin_Private(), the new residual norm is not printed on screen (which is huge for this particular problem). Since the residual norm printed on the screen is going to machine accuracy, users might think that the SNESSolve converges nicely and the returned solution is correct. This seems a bit misleading in my opinion. Are you checking -snes_converged_reason? Thanks. Shiyuan -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20121017/6545e477/attachment.html
[petsc-users] SNES Shell Problems
I've pushed a real fix to petsc-3.3 now. Using SNESShellSetSolve() on psnes should be enough now. - Peter On Mon, Aug 27, 2012 at 11:19 PM, Jed Brown jedbrown at mcs.anl.gov wrote: On Mon, Aug 27, 2012 at 11:13 PM, Gaetan Kenway kenway at utias.utoronto.cawrote: Excellent! It finally runs without segfaulting. Now... to get it actually solve my problem :) Great. This is not a real fix, but the real fix will take a bit more effort. Let us know if you have any problems, otherwise we'll let you know when the proper fix is pushed. -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120828/1eb5bd77/attachment.html
[petsc-users] SNES Shell Problems
I've tested and pushed a fix to petsc-3.3. If there are any problems, let me know. On Mon, Aug 27, 2012 at 12:24 PM, Gaetan Kenway kenway at utias.utoronto.cawrote: Excellent. Thanks. The reason for using the snes shell solver is I would like to try out the ngmres snes. solver. The code I'm working with already has a has a non-linear multi-grid method, coded directly in the fortran source. My plan was to setup the the ngmres snes solver, then create a snes shell as the preconditioner that wraps the nonlienear multigrid solver already in place. Does this make sense? Is this the sort of usage you had in mind? This is very much the usage we were intending for this. - Peter Thanks, Gaetan On Mon, Aug 27, 2012 at 1:10 PM, Jed Brown jedbrown at mcs.anl.gov wrote: On Mon, Aug 27, 2012 at 12:09 PM, Gaetan Kenway kenway at utias.utoronto.ca wrote: That's what I had deduced. I tried modifying the source code taking out the 'C', but it still didn't seem to generate the required wrapper. Does the functions that get passed function pointer handle's require special handling? It needs custom handling. Peter is working on it. -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120827/a41363ec/attachment.html
[petsc-users] SNES Shell Problems
I suspect that you could get past your problem by using SNESGetPC to create the initial psnes context from the outer solver context and then setting its type as you've done here. A more complete backtrace from the debugger might give me a better notion of exactly what is going wrong. You can get this output conveniently using the -on_error_attach_debugger. Thanks, - Peter On Mon, Aug 27, 2012 at 6:51 PM, Gaetan Kenway kenway at utias.utoronto.cawrote: Hi Again I've pulled the most recent 3.3 branch code, compiled and tried running with the shell solver. When I run just the snessolve() on the shell solver everything works just fine. However, when I try to combine it with NGMRES I get issues. Here's how I setup the solvers: ! --- Create Shell snes context - call SNESCreate(SUMB_COMM_WORLD, psnes, ierr) call EChk(ierr, __FILE__, __LINE__) ! Set the type to shell call SNESSetType(psnes, SNESSHELL, ierr) call EChk(ierr, __FILE__, __LINE__) ! Set the user-defined function for psnes to our funtion call SNESShellSetSolve(psnes, psnes_func, ierr) call EChk(ierr, __FILE__, __LINE__) ! --- Create main snes solver object call SNESCreate(SUMB_COMM_WORLD, outer_snes, ierr) call EChk(ierr, __FILE__, __LINE__) ! Set the type to the nonlinear ngmres call SNESSetType(outer_snes, SNESNGMRES, ierr) call EChk(ierr, __FILE__, __LINE__) call SNESSetFunction(outer_snes, rVec, formFunction_snes, ctx2, ierr) call EChk(ierr, __FILE__, __LINE__) ! Set the above create psnes context to the 'pc' of the one just created call SNESSetPC(outer_snes, psnes, ierr) call EChk(ierr, __FILE__, __LINE__) call snessetfromoptions(outer_snes, ierr) call EChk(ierr, __FILE__, __LINE__) The actual commands I'm running are: call SNESSolve(outer_snes, rhs, wVec, ierr) ! this works when 'outer_snes' is replaced with 'psnes' call EChk(ierr,__FILE__,__LINE__) ! I'm pretty sure the function is correct since it runs through when just the shell solver is called. the formFunction_snes also seems to work ok by itself. The traceback I get is below: [0]PETSC ERROR: [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSCERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: - Stack Frames [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] SNESSolve_Shell line 166 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/snes/impls/shell/snesshell.c [0]PETSC ERROR: [0] SNESSolve line 3491 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_NGMRES line 189 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/snes/impls/ngmres/snesngmres.c [0]PETSC ERROR: [0] SNESSolve line 3491 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] VecCreateMPIWithArray line 313 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: [0] VecCreateMPIWithArray line 313 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: [0] VecCreateMPIWithArray line 313 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: [0] VecCreateMPIWithArray line 313 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/vec/vec/impls/mpi/pbvec.c I can tell from the output it first calls the supplied 'formFunction' correctly and then has made its way to the SNESSolve_Shell command as expected. However, my guess is it actually dies at the following line: ierr = (*shell-solve)(snes,snes-vec_sol);CHKERRQ(ierr); I'm not quite sure what's going on. Something fishy with the vec_sol pointer or the solve command pointer? Do you have any examples of using NGMRES? Thanks, Gaetan On Mon, Aug 27, 2012 at 5:06 PM, Gaetan Kenway kenway at utias.utoronto.cawrote: Thanks very much for the very quick response. I will pull the change from the mercurial repository and give it a try in the next couple of days. Gaetan On Mon, Aug 27, 2012 at 3:59 PM, Peter Brune brune at mcs.anl.gov wrote: I've tested and pushed a fix to petsc-3.3. If there are any problems, let me know. On Mon, Aug 27, 2012 at 12:24 PM, Gaetan Kenway kenway at utias.utoronto.ca wrote: Excellent. Thanks
[petsc-users] LineSearch question
Most likely not a bug; more like a feature. BT is meant to take the full step when the descent condition is fine. The descent condition can be fine without making a ton of progress. Taking the full step avoids a lot of work (extra function evaluations) when the solve is working right, but messes up hard when it isn't. I'm wondering why your problem is doing this. The question is, therefore, why is 2x the optimum step? For 1D problems this is explained by classical numerical analysis; the root the Newton's method is finding has multiplicity 2. I'm wondering what exactly would exhibit this behavior in multiple dimensions for a linear problem. - Peter On Tue, Aug 14, 2012 at 9:51 PM, Barry Smith bsmith at mcs.anl.gov wrote: Likely a bug in our bt. It says it is taking a full step but then seems to take a half-step (based on the result from l2Line search: lambdas = [1, 0.5, 0], fnorms = [2.1936e-08, 1.87107, 3.74214] [0] SNESLineSearchApply_BT(): Initial fnorm 3.742138023215e+00 gnorm 1.871069011453e+00 Line search: Using full step: fnorm 3.742138023215e+00 gnorm 1.871069011453e+00 [0] SNESSolve_LS(): fnorm=3.7421380232151638e+00, gnorm=1.8710690114527022e+00, ynorm=2.5919210284812890e+00, lssucceed=1 1 SNES Function norm 1.871069011453e+00 ^ result as if took a 1/2 step Barry On Aug 14, 2012, at 6:24 PM, Blaise Bourdin bourdin at lsu.edu wrote: Hi, On Aug 14, 2012, at 5:58 PM, Barry Smith bsmith at mcs.anl.gov wrote: Blaise, You can run with -snes_linesearch_monitor -info -ksp_monitor_true_residual -ksp_converged_reason to get much more information about what is happening and why the line search is failing. Focussing on the relevant part of the output, I get the following for the first SNES step *** Using l2 Residual norms for temp_ solve. 0 KSP preconditioned resid norm 2.352873068990e+00 true resid norm 3.742138023215e+00 ||r(i)||/||b|| 1.e+00 [0] KSPDefaultConverged(): user has provided nonzero initial guess, computing 2-norm of preconditioned RHS 1 KSP preconditioned resid norm 7.175926524783e-01 true resid norm 8.026926174904e-01 ||r(i)||/||b|| 2.145010719836e-01 2 KSP preconditioned resid norm 4.099791012407e-01 true resid norm 6.219898727406e-01 ||r(i)||/||b|| 1.662124349455e-01 3 KSP preconditioned resid norm 2.769612150659e-01 true resid norm 4.622335508644e-01 ||r(i)||/||b|| 1.235212458752e-01 4 KSP preconditioned resid norm 8.991164116822e-02 true resid norm 1.938840972976e-01 ||r(i)||/||b|| 5.181104921701e-02 5 KSP preconditioned resid norm 1.369794733551e-02 true resid norm 2.867541652138e-02 ||r(i)||/||b|| 7.662843097578e-03 6 KSP preconditioned resid norm 3.522245138600e-03 true resid norm 5.452585588775e-03 ||r(i)||/||b|| 1.457077626466e-03 7 KSP preconditioned resid norm 1.117008651636e-03 true resid norm 1.551905826961e-03 ||r(i)||/||b|| 4.147110067382e-04 8 KSP preconditioned resid norm 5.008635361807e-04 true resid norm 6.226120116381e-04 ||r(i)||/||b|| 1.663786872038e-04 9 KSP preconditioned resid norm 2.079118910844e-04 true resid norm 3.430664466007e-04 ||r(i)||/||b|| 9.167658821571e-05 10 KSP preconditioned resid norm 4.528126074206e-05 true resid norm 9.520866575330e-05 ||r(i)||/||b|| 2.544231804457e-05 11 KSP preconditioned resid norm 8.441137224072e-06 true resid norm 1.519916221879e-05 ||r(i)||/||b|| 4.061625232553e-06 12 KSP preconditioned resid norm 1.839317974157e-06 true resid norm 3.245208227421e-06 ||r(i)||/||b|| 8.672069836252e-07 13 KSP preconditioned resid norm 4.346353491153e-07 true resid norm 7.198101954057e-07 ||r(i)||/||b|| 1.923526580100e-07 14 KSP preconditioned resid norm 6.321413805477e-08 true resid norm 1.280486229700e-07 ||r(i)||/||b|| 3.421803850515e-08 15 KSP preconditioned resid norm 9.029476674935e-09 true resid norm 2.193598397200e-08 ||r(i)||/||b|| 5.861885327562e-09 [0] KSPDefaultConverged(): Linear solver has converged. Residual norm 9.029476674935e-09 is less than relative tolerance 1.e-08 times initial right hand side norm 2.352873068990e+00 at iteration 15 Linear solve converged due to CONVERGED_RTOL iterations 15 [0] SNESSolve_LS(): iter=0, linear solve iterations=15 [0] SNESLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 4.179425981164e+00 near zero implies inconsistent rhs Line search: lambdas = [1, 0.5, 0], fnorms = [2.1936e-08, 1.87107, 3.74214] Line search terminated: lambda = 1, fnorms = 2.19338e-08 [0] SNESSolve_LS(): fnorm=3.7421380232151638e+00, gnorm=2.1933796240714327e-08, ynorm=5.1838420564550498e+00, lssucceed=1 1 SNES Function norm 2.193379624071e-08 [0] SNESDefaultConverged(): Converged due to function norm 2.193379624071e-08 3.742138023215e-08 (relative tolerance) SNESTemp converged in in1
[petsc-users] Using -ts_type sundials with -snes-fd
Note that in the code in ts/impls/implicit/sundials it says: This uses its own nonlinear solver and krylov method so PETSc SNES and KSP options do not apply... - Peter On Jun 21, 2012 7:59 AM, Geoff Oxberry goxberry at gmail.com wrote: Running the following example from PETSC 3.3.0-dev (changeset: 23631:0e86ac5e4170) /path/to/petsc-dev/src/ts/examples/tutorials/ex8 -problem_type rober -snes_fd -ts_type sundials gives the following output steps 1000 (0 rejected, 0 SNES fails), ftime 744.845, nonlinits 3739, linits 3739 WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-snes_fd no value Just to confirm, is it currently impossible to use a finite difference Jacobian matrix in concert with CVODE? If so, could this feature be implemented in a future release? I currently rely on Sundials to integrate stiff systems of ODEs, and for my application, it is impractical to derive an analytical Jacobian matrix. (It is an issue I've discussed both with Jed and Matt on another forum.) Cheers, Geoff -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120621/b6e21df1/attachment.html
[petsc-users] Using -ts_type sundials with -snes-fd
What do you see when you run with -ts_view? - Peter On Thu, Jun 21, 2012 at 9:40 AM, Geoff Oxberry goxberry at gmail.com wrote: Peter, Just wanted to make sure there wasn't some Sundials-specific option for finite difference Jacobians that I was missing; despite reading the manual, it's a large package, and it's easy to miss things. If that's the case, I'd like to make a feature request for such an option. Geoff On Jun 21, 2012, at 9:53 AM, Peter Brune wrote: Note that in the code in ts/impls/implicit/sundials it says: This uses its own nonlinear solver and krylov method so PETSc SNES and KSP options do not apply... - Peter On Jun 21, 2012 7:59 AM, Geoff Oxberry goxberry at gmail.com wrote: Running the following example from PETSC 3.3.0-dev (changeset: 23631:0e86ac5e4170) /path/to/petsc-dev/src/ts/examples/tutorials/ex8 -problem_type rober -snes_fd -ts_type sundials gives the following output steps 1000 (0 rejected, 0 SNES fails), ftime 744.845, nonlinits 3739, linits 3739 WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-snes_fd no value Just to confirm, is it currently impossible to use a finite difference Jacobian matrix in concert with CVODE? If so, could this feature be implemented in a future release? I currently rely on Sundials to integrate stiff systems of ODEs, and for my application, it is impractical to derive an analytical Jacobian matrix. (It is an issue I've discussed both with Jed and Matt on another forum.) Cheers, Geoff -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120621/e6bc88df/attachment-0001.html
[petsc-users] Using -ts_type sundials with -snes-fd
I asked my question because it might be happening automatically. His output looked like the equation was solved. A brief google search on this makes it look like they do this anyways. In addition: http://www.mcs.anl.gov/petsc/petsc-dev/src/ts/impls/implicit/sundials/sundials.c.html#TSView_Sundials directly refers to Sundials no. of rhs calls for finite diff. Jacobian-vector evals. - Peter On Thu, Jun 21, 2012 at 9:48 AM, Matthew Knepley knepley at gmail.com wrote: On Thu, Jun 21, 2012 at 8:45 AM, Peter Brune prbrune at gmail.com wrote: What do you see when you run with -ts_view? - Peter On Thu, Jun 21, 2012 at 9:40 AM, Geoff Oxberry goxberry at gmail.comwrote: Peter, Just wanted to make sure there wasn't some Sundials-specific option for finite difference Jacobians that I was missing; despite reading the manual, it's a large package, and it's easy to miss things. If that's the case, I'd like to make a feature request for such an option. If I understand correctly, you want a MF Jacobian with Sundials. We can't do that because Sundials is completely closed package, which we cannot pry apart to insert something like this. The alternative is to use the stuff solvers we currently have in TS. I thought that you had used the Rosenbrock-W stuff. Is this sufficient? Thanks, Matt Geoff On Jun 21, 2012, at 9:53 AM, Peter Brune wrote: Note that in the code in ts/impls/implicit/sundials it says: This uses its own nonlinear solver and krylov method so PETSc SNES and KSP options do not apply... - Peter On Jun 21, 2012 7:59 AM, Geoff Oxberry goxberry at gmail.com wrote: Running the following example from PETSC 3.3.0-dev (changeset: 23631:0e86ac5e4170) /path/to/petsc-dev/src/ts/examples/tutorials/ex8 -problem_type rober -snes_fd -ts_type sundials gives the following output steps 1000 (0 rejected, 0 SNES fails), ftime 744.845, nonlinits 3739, linits 3739 WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-snes_fd no value Just to confirm, is it currently impossible to use a finite difference Jacobian matrix in concert with CVODE? If so, could this feature be implemented in a future release? I currently rely on Sundials to integrate stiff systems of ODEs, and for my application, it is impractical to derive an analytical Jacobian matrix. (It is an issue I've discussed both with Jed and Matt on another forum.) Cheers, Geoff -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120621/627e5c15/attachment.html
[petsc-users] Linesearch algorithm
On Tue, May 15, 2012 at 3:43 PM, behzad baghapour behzad.baghapour at gmail.com wrote: In addition, Is it possible to define knew objective function for line search in SNES in order to deal with pseudo-transient continuation? This feature will be added to petsc-dev shortly post-release. - Peter -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120515/29368f73/attachment.htm
[petsc-users] SNES convergence tol question
What you're seeing is convergence in the step tolerance, not the relative tolerance applied to the norm of the residual. This means that the norm of the steplength between successive iterations has gone below snes-xtol times the norm of the solution. This appears to be set through -snes_stol. - Peter On Tue, Mar 20, 2012 at 10:29 AM, Mark F. Adams mark.adams at columbia.eduwrote: We've been trying to drive the nonlinear residual down and SNES seems to return prematurely. Appended is Dan's note. We are asking for -snes_rtol 1.e-12 but only getting ~1.e-8. Anyone have any ideas? Mark Nonlinear solve converged due to CONVERGED_PNORM_RELATIVE iterations 8 However, here are the initial and final norms: 0 SNES Function norm 1.999411334131e+08 8 SNES Function norm 5.999267565004e-01 2e8/6e-1 = 3.3e8. So, I'm confused. Am I misunderstanding the convergence criterion? here's the dump of petsc options at the end: #PETSc Option Table entries: -ksp_converged_use_initial_residual_norm -ksp_max_it 100 -ksp_monitor -ksp_rtol 1.e-6 -ksp_type gmres -mg_levels_ksp_max_it 4 -options_left -pc_gamg_agg_nsmooths 1 -pc_gamg_sym_graph -pc_gamg_threshold .01 -pc_gamg_type agg -pc_gamg_verbose 2 -pc_hypre_boomeramg_agg_nl 1 -pc_hypre_boomeramg_coarsen_type HMIS -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no CF -pc_hypre_type boomeramg -pc_type gamg -snes_converged_reason -snes_mf_operator -snes_monitor -snes_rtol 1.e-12 #End of PETSc Option Table entries Mark -- next part -- An HTML attachment was scrubbed... URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120320/9312c64e/attachment.htm