Re: [petsc-users] Possible to recover ILU(k) from hypre/pilut?

2017-11-15 Thread Smith, Barry F.


> On Nov 15, 2017, at 9:57 PM, Mark Lohry  wrote:
> 
> What are the limitations of ILU in parallel you're referring to? Does 
> Schwarz+local ILU typically fare better?

  If ILU works scalably in parallel for you, that is great. Most of the PETSc 
team has an explicit bias against ILU, generally speaking.
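
   For reference, overlapping Schwarz (or block Jacobi) with a local ILU(k) 
subsolve can be selected at run time with options along these lines (a minimal 
sketch; the overlap and fill level should be tuned for the problem):

   -pc_type asm -pc_asm_overlap 1 \
   -sub_ksp_type preonly -sub_pc_type ilu -sub_pc_factor_levels 1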

  Barry

> 
> On Nov 15, 2017 10:50 PM, "Smith, Barry F."  wrote:
> 
> 
> > On Nov 15, 2017, at 9:40 PM, Jed Brown  wrote:
> >
> > "Smith, Barry F."  writes:
> >
> >>> On Nov 15, 2017, at 6:38 AM, Mark Lohry  wrote:
> >>>
> >>> I've found ILU(0) or (1) to be working well for my problem, but the petsc 
> >>> implementation is serial only. Running with -pc_type hypre -pc_hypre_type 
> >>> pilut with default settings has considerably worse convergence. I've 
> >>> tried using -pc_hypre_pilut_factorrowsize (number of actual elements in 
> >>> row) to trick it into doing ILU(0), to no effect.
> >>>
> >>> Is there any way to recover classical ILU(k) from pilut?
> >>>
> >>> Hypre's docs state pilut is no longer supported, and Euclid should be 
> >>> used for anything moving forward. pc_hypre_boomeramg has options for 
> >>> Euclid smoothers. Any hope of a pc_hypre_type euclid?
> >>
> >>  Not unless someone outside the PETSc team decides to put it back in.
> >
> > PETSc used to have a Euclid interface.  My recollection is that Barry
> > removed it because users were finding too many bugs in Euclid and
> > upstream wasn't fixing them.  A contributed revival of the interface
> > won't fix the upstream problem.
> 
>The hypre team now claims they care about Euclid. But given the 
> limitations of ILU in parallel I can't imagine anyone cares all that much.
> 
> 



Re: [petsc-users] Possible to recover ILU(k) from hypre/pilut?

2017-11-15 Thread Mark Lohry
What are the limitations of ILU in parallel you're referring to? Does
Schwarz+local ILU typically fare better?

On Nov 15, 2017 10:50 PM, "Smith, Barry F."  wrote:

>
>
> > On Nov 15, 2017, at 9:40 PM, Jed Brown  wrote:
> >
> > "Smith, Barry F."  writes:
> >
> >>> On Nov 15, 2017, at 6:38 AM, Mark Lohry  wrote:
> >>>
> >>> I've found ILU(0) or (1) to be working well for my problem, but the
> petsc implementation is serial only. Running with -pc_type hypre
> -pc_hypre_type pilut with default settings has considerably worse
> convergence. I've tried using -pc_hypre_pilut_factorrowsize (number of
> actual elements in row) to trick it into doing ILU(0), to no effect.
> >>>
> >>> Is there any way to recover classical ILU(k) from pilut?
> >>>
> >>> Hypre's docs state pilut is no longer supported, and Euclid should be
> used for anything moving forward. pc_hypre_boomeramg has options for Euclid
> smoothers. Any hope of a pc_hypre_type euclid?
> >>
> >>  Not unless someone outside the PETSc team decides to put it back in.
> >
> > PETSc used to have a Euclid interface.  My recollection is that Barry
> > removed it because users were finding too many bugs in Euclid and
> > upstream wasn't fixing them.  A contributed revival of the interface
> > won't fix the upstream problem.
>
>The hypre team now claims they care about Euclid. But given the
> limitations of ILU in parallel I can't imagine anyone cares all that much.
>
>
>


Re: [petsc-users] Possible to recover ILU(k) from hypre/pilut?

2017-11-15 Thread Smith, Barry F.


> On Nov 15, 2017, at 9:40 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
>>> On Nov 15, 2017, at 6:38 AM, Mark Lohry  wrote:
>>> 
>>> I've found ILU(0) or (1) to be working well for my problem, but the petsc 
>>> implementation is serial only. Running with -pc_type hypre -pc_hypre_type 
>>> pilut with default settings has considerably worse convergence. I've tried 
>>> using -pc_hypre_pilut_factorrowsize (number of actual elements in row) to 
>>> trick it into doing ILU(0), to no effect.
>>> 
>>> Is there any way to recover classical ILU(k) from pilut?
>>> 
>>> Hypre's docs state pilut is no longer supported, and Euclid should be used 
>>> for anything moving forward. pc_hypre_boomeramg has options for Euclid 
>>> smoothers. Any hope of a pc_hypre_type euclid?
>> 
>>  Not unless someone outside the PETSc team decides to put it back in. 
> 
> PETSc used to have a Euclid interface.  My recollection is that Barry
> removed it because users were finding too many bugs in Euclid and
> upstream wasn't fixing them.  A contributed revival of the interface
> won't fix the upstream problem.

   The hypre team now claims they care about Euclid. But given the limitations 
of ILU in parallel I can't imagine anyone cares all that much.




Re: [petsc-users] Possible to recover ILU(k) from hypre/pilut?

2017-11-15 Thread Jed Brown
"Smith, Barry F."  writes:

>> On Nov 15, 2017, at 6:38 AM, Mark Lohry  wrote:
>> 
>> I've found ILU(0) or (1) to be working well for my problem, but the petsc 
>> implementation is serial only. Running with -pc_type hypre -pc_hypre_type 
>> pilut with default settings has considerably worse convergence. I've tried 
>> using -pc_hypre_pilut_factorrowsize (number of actual elements in row) to 
>> trick it into doing ILU(0), to no effect.
>> 
>> Is there any way to recover classical ILU(k) from pilut?
>> 
>> Hypre's docs state pilut is no longer supported, and Euclid should be used 
>> for anything moving forward. pc_hypre_boomeramg has options for Euclid 
>> smoothers. Any hope of a pc_hypre_type euclid?
>
>   Not unless someone outside the PETSc team decides to put it back in. 

PETSc used to have a Euclid interface.  My recollection is that Barry
removed it because users were finding too many bugs in Euclid and
upstream wasn't fixing them.  A contributed revival of the interface
won't fix the upstream problem.


Re: [petsc-users] ISGlobalToLocalMappingApplyBlock

2017-11-15 Thread Adrian Croucher
I've debugged into the ISGlobalToLocalMappingApplyBlock() function and 
it seems to me the bounds checking in there is not correct when the 
blocksize is > 1.


It checks against the same bounds, scaled up by the blocksize, in both 
the block and non-block versions of the function. I think for the block 
version the bounds should not be scaled.
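
As a purely illustrative sketch (hypothetical helper names, not the actual 
PETSc source), the distinction would look roughly like this: if the mapping 
owns global block indices in [gstart, gend) with block size bs, then

/* illustrative only -- hypothetical helpers, not the PETSc implementation */
static int inRangePoint(int idx, int gstart, int gend, int bs)
{
  /* point (non-block) indices are checked against bounds scaled by bs */
  return idx >= gstart*bs && idx < gend*bs;
}
static int inRangeBlock(int bidx, int gstart, int gend)
{
  /* block indices should be checked against the unscaled block bounds */
  return bidx >= gstart && bidx < gend;
}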


I've just created a pull request 
(acroucher/fix-IS-global-to-local-mapping-block) with a suggested fix.


- Adrian


On 16/11/17 11:52, Adrian Croucher wrote:
I actually attached the wrong test program last time - I've attached 
the right one here, which is much simpler. It tests global indices 0, 
1, ... 9.


If I run on 2 processes, the local indices it returns are:

rank 0: 0, 1, 2, 3, 4, 0, 0, 0, -253701943, 0
rank 1: -1, -1, -1, -1, -1, -1, -1, -1, -1, -1

The results I expected are:

rank 0: 0, 1, 2, 3, 4, -1, -1, -1, -1, -1
rank 1: -1, -1, -1, -1, -1, 0, 1, 2, 3, 4

So the results for global indices 0, 1,... 4 are what I expected, on 
both ranks. But the results for global indices 5, 6, ... 9 are not.


I tried increasing the blocksize to 3 or 4, and the results were 
exactly the same.


It only gives the results I expected if I change the blocksize to 1.

- Adrian



--
Dr Adrian Croucher
Senior Research Fellow
Department of Engineering Science
University of Auckland, New Zealand
email: a.crouc...@auckland.ac.nz
tel: +64 (0)9 923 4611



Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
Thanks, Barry,

On Wed, Nov 15, 2017 at 4:04 PM, Smith, Barry F.  wrote:

>
>   Do the ASM runs for thousands of time-steps produce the same final
> "physical results" as the MUMPS run for thousands of timesteps?  While with
> SuperLU you get a very different "physical results"?
>

Let me update a little bit more. The simulation with SuperLU may fail at a
certain time step. Sometimes we can also run the simulation successfully
for the whole time range. It is totally random.

We will try ASM and MUMPS.

Fande,




>
>   Barry
>
>
> > On Nov 15, 2017, at 4:52 PM, Kong, Fande  wrote:
> >
> >
> >
> > On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F. 
> wrote:
> >
> >   Since the convergence labeled linear does not converge to 14 digits in
> one iteration I am assuming you are using lagged preconditioning and or
> lagged  Jacobian?
> >
> > We are using Jacobian-free Newton. So Jacobian is different from the
> preconditioning matrix.
> >
> >
> >What happens if you do no lagging and solve each linear solve with a
> new LU factorization?
> >
> > We have the following results without using Jacobian-free Newton. Again,
> superlu_dist produces differences, while MUMPS gives the same results in
> terms of the residual norms.
> >
> >
> > Fande,
> >
> >
> > Superlu_dist run1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.322285e-11
> >  1 Nonlinear |R| = 1.666987e-11
> >
> >
> > Superlu_dist run2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.322171e-11
> >  1 Nonlinear |R| = 1.666977e-11
> >
> >
> > Superlu_dist run3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.321964e-11
> >  1 Nonlinear |R| = 1.666959e-11
> >
> >
> > Superlu_dist run4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.321978e-11
> >  1 Nonlinear |R| = 1.668688e-11
> >
> >
> > MUMPS run1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run 2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run 3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.360637e-11
> >  1 Nonlinear |R| = 1.654334e-11
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >   Barry
> >
> >
> > > On Nov 15, 2017, at 4:24 PM, Kong, Fande  wrote:
> > >
> > >
> > >
> > > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. 
> wrote:
> > >
> > >
> > > > On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> > > >
> > > > Hi Barry,
> > > >
> > > > Thanks for your reply. I was wondering why this happens only when we
> use superlu_dist. I am trying to understand the algorithm in superlu_dist.
> If we use ASM or MUMPS, we do not produce these differences.
> > > >
> > > > The differences actually are NOT meaningless.  In fact, we have a
> real transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
> > >
> > >   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
> > > >
> > > > This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
> > >
> > >   This is possible but it is also possible this is due to normal round
> off inside of SuperLU dist.
> > >
> > >Since you have SuperLU_Dist inside a nonlinear iteration it
> shouldn't really matter exactly how well SuperLU_Dist does. The nonlinear
> iteration does essential defect correction for you; are you making sure
> that the nonlinear iteration always works for every timestep? For example
> confirm that SNESGetConvergedReason() is always positive.
> > >
> > > Definitely it could be something wrong on my side.  But let us focus
> on the simple question first.
> > >
> > > To make the discussion a little simpler, let 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Smith, Barry F.

  Do the ASM runs for thousands of time steps produce the same final "physical 
results" as the MUMPS run for thousands of time steps?  While with SuperLU you 
get very different "physical results"?

  Barry


> On Nov 15, 2017, at 4:52 PM, Kong, Fande  wrote:
> 
> 
> 
> On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F.  wrote:
> 
>   Since the convergence labeled linear does not converge to 14 digits in one 
> iteration I am assuming you are using lagged preconditioning and or lagged  
> Jacobian?
> 
> We are using Jacobian-free Newton. So Jacobian is different from the 
> preconditioning matrix. 
>  
> 
>What happens if you do no lagging and solve each linear solve with a new 
> LU factorization?
> 
> We have the following results without using Jacobian-free Newton. Again, 
> superlu_dist produces differences, while MUMPS gives the same results in 
> terms of the residual norms.
> 
> 
> Fande,
>  
> 
> Superlu_dist run1:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.322285e-11
>  1 Nonlinear |R| = 1.666987e-11
> 
> 
> Superlu_dist run2:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.322171e-11
>  1 Nonlinear |R| = 1.666977e-11
> 
> 
> Superlu_dist run3:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.321964e-11
>  1 Nonlinear |R| = 1.666959e-11
> 
> 
> Superlu_dist run4:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.321978e-11
>  1 Nonlinear |R| = 1.668688e-11
> 
> 
> MUMPS run1:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.360637e-11
>  1 Nonlinear |R| = 1.654334e-11
> 
> MUMPS run 2:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.360637e-11
>  1 Nonlinear |R| = 1.654334e-11
> 
> MUMPS run 3:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.360637e-11
>  1 Nonlinear |R| = 1.654334e-11
> 
> MUMPS run4:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.360637e-11
>  1 Nonlinear |R| = 1.654334e-11
> 
> 
> 
> 
> 
> 
> 
>  
> 
>   Barry
> 
> 
> > On Nov 15, 2017, at 4:24 PM, Kong, Fande  wrote:
> >
> >
> >
> > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F.  wrote:
> >
> >
> > > On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> > >
> > > Hi Barry,
> > >
> > > Thanks for your reply. I was wondering why this happens only when we use 
> > > superlu_dist. I am trying to understand the algorithm in superlu_dist. If 
> > > we use ASM or MUMPS, we do not produce these differences.
> > >
> > > The differences actually are NOT meaningless.  In fact, we have a real 
> > > transient application that presents this issue.   When we run the 
> > > simulation with superlu_dist in parallel for thousands of time steps, the 
> > > final physics  solution looks totally different from different runs. The 
> > > differences are not acceptable any more.  For a steady problem, the 
> > > difference may be meaningless. But it is significant for the transient 
> > > problem.
> >
> >   I submit that the "physics solution" of all of these runs is equally 
> > right and equally wrong. If the solutions are very different due to a small 
> > perturbation than something is wrong with the model or the integrator, I 
> > don't think you can blame the linear solver (see below)
> > >
> > > This makes the solution not reproducible, and we can not even set a 
> > > targeting solution in the test system because the solution is so 
> > > different from one run to another.   I guess there might/may be a tiny 
> > > bug in superlu_dist or the PETSc interface to superlu_dist.
> >
> >   This is possible but it is also possible this is due to normal round off 
> > inside of SuperLU dist.
> >
> >Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't 
> > really matter exactly how well SuperLU_Dist does. The nonlinear iteration 
> > does essential defect correction for you; are you making sure that the 
> > nonlinear iteration always works for every timestep? For example confirm 
> > that SNESGetConvergedReason() is always positive.
> >
> > Definitely it could be something wrong on my side.  But let us focus on the 
> > simple question first.
> >
> > To make the discussion a little simpler, let us back to the simple problem 
> > (heat conduction).   Now I want to understand why this happens to 
> > superlu_dist only. When we are using ASM or MUMPS,  why we can not see the 
> > differences from one run to another?  I posted the residual histories for 
> > MUMPS and ASM.  We can not see any differences in terms of the residual 
> > norms when using MUMPS or ASM. Does superlu_dist have higher round off than 
> > other solvers?
> >
> 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F.  wrote:

>
>   Since the convergence labeled linear does not converge to 14 digits in
> one iteration I am assuming you are using lagged preconditioning and or
> lagged  Jacobian?
>

We are using Jacobian-free Newton, so the Jacobian is different from the
preconditioning matrix.
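
In PETSc options terms this corresponds roughly to the following (a sketch 
only; the application may set this up programmatically rather than via the 
options database, and the matrix-solver option name depends on the PETSc 
version):

-snes_mf_operator -pc_type lu -pc_factor_mat_solver_package superlu_dist

i.e. the Jacobian action is applied matrix-free while the assembled matrix is 
used only to build the (SuperLU_DIST) factorization for preconditioning.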


>
>What happens if you do no lagging and solve each linear solve with a
> new LU factorization?
>

We have the following results without using Jacobian-free Newton. Again,
superlu_dist produces differences, while MUMPS gives the same results in
terms of the residual norms.


Fande,


Superlu_dist run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.322285e-11
 1 Nonlinear |R| = 1.666987e-11


Superlu_dist run2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.322171e-11
 1 Nonlinear |R| = 1.666977e-11


Superlu_dist run3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.321964e-11
 1 Nonlinear |R| = 1.666959e-11


Superlu_dist run4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.321978e-11
 1 Nonlinear |R| = 1.668688e-11


MUMPS run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11

MUMPS run 2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11

MUMPS run 3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11

MUMPS run4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.360637e-11
 1 Nonlinear |R| = 1.654334e-11









>
>   Barry
>
>
> > On Nov 15, 2017, at 4:24 PM, Kong, Fande  wrote:
> >
> >
> >
> > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. 
> wrote:
> >
> >
> > > On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> > >
> > > Hi Barry,
> > >
> > > Thanks for your reply. I was wondering why this happens only when we
> use superlu_dist. I am trying to understand the algorithm in superlu_dist.
> If we use ASM or MUMPS, we do not produce these differences.
> > >
> > > The differences actually are NOT meaningless.  In fact, we have a real
> transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
> >
> >   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
> > >
> > > This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
> >
> >   This is possible but it is also possible this is due to normal round
> off inside of SuperLU dist.
> >
> >Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't
> really matter exactly how well SuperLU_Dist does. The nonlinear iteration
> does essential defect correction for you; are you making sure that the
> nonlinear iteration always works for every timestep? For example confirm
> that SNESGetConvergedReason() is always positive.
> >
> > Definitely it could be something wrong on my side.  But let us focus on
> the simple question first.
> >
> > To make the discussion a little simpler, let us back to the simple
> problem (heat conduction).   Now I want to understand why this happens to
> superlu_dist only. When we are using ASM or MUMPS,  why we can not see the
> differences from one run to another?  I posted the residual histories for
> MUMPS and ASM.  We can not see any differences in terms of the residual
> norms when using MUMPS or ASM. Does superlu_dist have higher round off than
> other solvers?
> >
> >
> >
> > MUMPS run1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020993e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 4.836162e-08
> >   2 Linear |R| = 7.055620e-14
> >  2 Nonlinear |R| = 4.836392e-08
> >
> > MUMPS run2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020993e-08
> >  1 Nonlinear |R| = 

Re: [petsc-users] ISGlobalToLocalMappingApplyBlock

2017-11-15 Thread Adrian Croucher
I actually attached the wrong test program last time - I've attached the 
right one here, which is much simpler. It tests global indices 0, 1, ... 9.


If I run on 2 processes, the local indices it returns are:

rank 0: 0, 1, 2, 3, 4, 0, 0, 0, -253701943, 0
rank 1: -1, -1, -1, -1, -1, -1, -1, -1, -1, -1

The results I expected are:

rank 0: 0, 1, 2, 3, 4, -1, -1, -1, -1, -1
rank 1: -1, -1, -1, -1, -1, 0, 1, 2, 3, 4

So the results for global indices 0, 1,... 4 are what I expected, on 
both ranks. But the results for global indices 5, 6, ... 9 are not.


I tried increasing the blocksize to 3 or 4, and the results were exactly 
the same.


It only gives the results I expected if I change the blocksize to 1.

- Adrian

--
Dr Adrian Croucher
Senior Research Fellow
Department of Engineering Science
University of Auckland, New Zealand
email: a.crouc...@auckland.ac.nz
tel: +64 (0)9 923 4611

program testl2g

  ! Test PETSc local-to-global mapping

#include <petsc/finclude/petsc.h>

  use petsc

  implicit none
  PetscInt, parameter :: blocksize = 1
  PetscInt, parameter :: n = 5, nin = 10
  PetscMPIInt :: rank
  PetscInt :: i
  PetscInt, allocatable :: indices(:)
  ISLocalToGlobalMapping :: l2g
  PetscInt :: global(nin), local(nin), nout
  PetscErrorCode :: ierr

  call PetscInitialize(PETSC_NULL_CHARACTER, ierr); CHKERRQ(ierr)
  call MPI_comm_rank(PETSC_COMM_WORLD, rank, ierr)

  ! Each rank owns n = 5 consecutive global indices:
  ! 0..4 on rank 0 and 5..9 on rank 1
  select case (rank)
  case (0)
     indices = [(i, i = 0, 4)]
  case (1)
     indices = [(i, i = 5, 9)]
  end select

  ! Create the local-to-global mapping with the given block size
  call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, blocksize, n, indices, &
       PETSC_COPY_VALUES, l2g, ierr); CHKERRQ(ierr)
  call ISLocalToGlobalMappingView(l2g, PETSC_VIEWER_STDOUT_WORLD, ierr); CHKERRQ(ierr)

  ! Map global (block) indices 0..9 back to local indices; with IS_GTOLM_MASK,
  ! any index not owned on this rank should come back as -1
  global = [(i, i = 0, 9)]
  call ISGlobalToLocalMappingApplyBlock(l2g, IS_GTOLM_MASK, nin, &
       global, nout, local, ierr); CHKERRQ(ierr)
  write(*,*) 'rank:', rank, 'global:', global, 'local:', local, 'nout:', nout

  call ISLocalToGlobalMappingDestroy(l2g, ierr); CHKERRQ(ierr)
  call PetscFinalize(ierr); CHKERRQ(ierr)

end program testl2g
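
Assuming the source above is built into an executable called testl2g (the 
name here is just for illustration), the two-process run described can be 
reproduced with something like:

mpiexec -n 2 ./testl2g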


Re: [petsc-users] IDR availalbe in PETSC?

2017-11-15 Thread Jed Brown
There isn't an IDR in PETSc, but there is BCGSL which usually performs
similarly.  Contributions welcome.
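
As a minimal sketch, BCGSL can be selected, and its ell parameter adjusted, 
with run-time options such as:

-ksp_type bcgsl -ksp_bcgsl_ell 2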

Evan Um  writes:

> Dear PETSC users,
>
> I was wondering if anyone already tried/developed an induced dimension
> reduction (IDR) solver for PETSC? I think that it is a useful one but I
> couldn't find its example with PETSC. If you have any idea about IDR
> routines for PETSC, please let me know. Thanks!
>
> Best,
> Evan


Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Smith, Barry F.

  Since the convergence labeled linear does not converge to 14 digits in one 
iteration I am assuming you are using lagged preconditioning and/or a lagged 
Jacobian?

   What happens if you do no lagging and solve each linear solve with a new LU 
factorization?
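
For a plain PETSc application, one way to request a fresh Jacobian and a fresh 
factorization for every linear solve (i.e. no lagging) is with options along 
these lines; this is only a sketch, since a framework such as MOOSE may control 
lagging itself and the matrix-solver option name depends on the PETSc version:

-snes_lag_jacobian 1 -snes_lag_preconditioner 1 \
-pc_type lu -pc_factor_mat_solver_package superlu_dist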

  Barry


> On Nov 15, 2017, at 4:24 PM, Kong, Fande  wrote:
> 
> 
> 
> On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F.  wrote:
> 
> 
> > On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> >
> > Hi Barry,
> >
> > Thanks for your reply. I was wondering why this happens only when we use 
> > superlu_dist. I am trying to understand the algorithm in superlu_dist. If 
> > we use ASM or MUMPS, we do not produce these differences.
> >
> > The differences actually are NOT meaningless.  In fact, we have a real 
> > transient application that presents this issue.   When we run the 
> > simulation with superlu_dist in parallel for thousands of time steps, the 
> > final physics  solution looks totally different from different runs. The 
> > differences are not acceptable any more.  For a steady problem, the 
> > difference may be meaningless. But it is significant for the transient 
> > problem.
> 
>   I submit that the "physics solution" of all of these runs is equally right 
> and equally wrong. If the solutions are very different due to a small 
> perturbation than something is wrong with the model or the integrator, I 
> don't think you can blame the linear solver (see below)
> >
> > This makes the solution not reproducible, and we can not even set a 
> > targeting solution in the test system because the solution is so different 
> > from one run to another.   I guess there might/may be a tiny bug in 
> > superlu_dist or the PETSc interface to superlu_dist.
> 
>   This is possible but it is also possible this is due to normal round off 
> inside of SuperLU dist.
> 
>Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't 
> really matter exactly how well SuperLU_Dist does. The nonlinear iteration 
> does essential defect correction for you; are you making sure that the 
> nonlinear iteration always works for every timestep? For example confirm that 
> SNESGetConvergedReason() is always positive.
> 
> Definitely it could be something wrong on my side.  But let us focus on the 
> simple question first. 
> 
> To make the discussion a little simpler, let us back to the simple problem 
> (heat conduction).   Now I want to understand why this happens to 
> superlu_dist only. When we are using ASM or MUMPS,  why we can not see the 
> differences from one run to another?  I posted the residual histories for 
> MUMPS and ASM.  We can not see any differences in terms of the residual norms 
> when using MUMPS or ASM. Does superlu_dist have higher round off than other 
> solvers?   
> 
>  
> 
> MUMPS run1:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020993e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 4.836162e-08
>   2 Linear |R| = 7.055620e-14
>  2 Nonlinear |R| = 4.836392e-08
> 
> MUMPS run2:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020993e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 4.836162e-08
>   2 Linear |R| = 7.055620e-14
>  2 Nonlinear |R| = 4.836392e-08
> 
> MUMPS run3:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020993e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 4.836162e-08
>   2 Linear |R| = 7.055620e-14
>  2 Nonlinear |R| = 4.836392e-08
> 
> MUMPS run4:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020993e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 4.836162e-08
>   2 Linear |R| = 7.055620e-14
>  2 Nonlinear |R| = 4.836392e-08
> 
> 
> 
> ASM run1:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 6.189229e+03
>   2 Linear |R| = 3.252487e+02
>   3 Linear |R| = 3.485174e+01
>   4 Linear |R| = 8.600695e+00
>   5 Linear |R| = 3.333942e+00
>   6 Linear |R| = 1.706112e+00
>   7 Linear |R| = 5.047863e-01
>   8 Linear |R| = 2.337297e-01
>   9 Linear |R| = 1.071627e-01
>  10 Linear |R| = 4.692177e-02
>  11 Linear |R| = 1.340717e-02
>  12 Linear |R| = 4.753951e-03
>  1 Nonlinear |R| = 2.320271e-02
>   0 Linear |R| = 2.320271e-02
>   1 Linear |R| = 4.367880e-03
>   2 Linear |R| = 1.407852e-03
>   3 Linear |R| = 6.036360e-04
>   4 Linear |R| = 1.867661e-04
>   5 Linear |R| = 8.760076e-05
>   6 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F.  wrote:

>
>
> > On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> >
> > Hi Barry,
> >
> > Thanks for your reply. I was wondering why this happens only when we use
> superlu_dist. I am trying to understand the algorithm in superlu_dist. If
> we use ASM or MUMPS, we do not produce these differences.
> >
> > The differences actually are NOT meaningless.  In fact, we have a real
> transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
>
>   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
>
>
> > This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
>
>   This is possible but it is also possible this is due to normal round off
> inside of SuperLU dist.
>
>Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't
> really matter exactly how well SuperLU_Dist does. The nonlinear iteration
> does essential defect correction for you; are you making sure that the
> nonlinear iteration always works for every timestep? For example confirm
> that SNESGetConvergedReason() is always positive.
>

Definitely it could be something wrong on my side.  But let us focus on the
simple question first.

To make the discussion a little simpler, let us go back to the simple problem
(heat conduction).   Now I want to understand why this happens with
superlu_dist only. When we are using ASM or MUMPS, why can we not see the
differences from one run to another?  I posted the residual histories for
MUMPS and ASM.  We cannot see any differences in terms of the residual
norms when using MUMPS or ASM. Does superlu_dist have higher round-off than
other solvers?



MUMPS run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08

MUMPS run2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08

MUMPS run3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08

MUMPS run4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020993e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 4.836162e-08
  2 Linear |R| = 7.055620e-14
 2 Nonlinear |R| = 4.836392e-08



ASM run1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 6.189229e+03
  2 Linear |R| = 3.252487e+02
  3 Linear |R| = 3.485174e+01
  4 Linear |R| = 8.600695e+00
  5 Linear |R| = 3.333942e+00
  6 Linear |R| = 1.706112e+00
  7 Linear |R| = 5.047863e-01
  8 Linear |R| = 2.337297e-01
  9 Linear |R| = 1.071627e-01
 10 Linear |R| = 4.692177e-02
 11 Linear |R| = 1.340717e-02
 12 Linear |R| = 4.753951e-03
 1 Nonlinear |R| = 2.320271e-02
  0 Linear |R| = 2.320271e-02
  1 Linear |R| = 4.367880e-03
  2 Linear |R| = 1.407852e-03
  3 Linear |R| = 6.036360e-04
  4 Linear |R| = 1.867661e-04
  5 Linear |R| = 8.760076e-05
  6 Linear |R| = 3.260519e-05
  7 Linear |R| = 1.435418e-05
  8 Linear |R| = 4.532875e-06
  9 Linear |R| = 2.439053e-06
 10 Linear |R| = 7.998549e-07
 11 Linear |R| = 2.428064e-07
 12 Linear |R| = 4.766918e-08
 13 Linear |R| = 1.713748e-08
 2 Nonlinear |R| = 3.671573e-07


ASM run2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 6.189229e+03
  2 Linear |R| = 3.252487e+02
  3 Linear |R| = 3.485174e+01
  4 Linear |R| = 8.600695e+00
  5 Linear |R| = 3.333942e+00
  6 Linear |R| = 1.706112e+00
  7 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Mark Adams
To be clear: these differences completely go away with MUMPS?

Can you valgrind this? We have seen some valgrind warnings from MUMPS coming
from BLAS routines. It could be that your BLAS is buggy (and SuperLU uses some
BLAS routines that MUMPS does not). I think SuperLU does more/different
pivoting than MUMPS.  What BLAS are you using? (download, MKL, ...)
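
One common way to run this under valgrind (a sketch, assuming an MPICH-style 
mpiexec; the executable name and options are placeholders):

mpiexec -n 8 valgrind --tool=memcheck -q --num-callers=20 \
    --log-file=valgrind.log.%p ./your_app -your_options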


On Wed, Nov 15, 2017 at 4:52 PM, Smith, Barry F.  wrote:

>
>
> > On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> >
> > Hi Barry,
> >
> > Thanks for your reply. I was wondering why this happens only when we use
> superlu_dist. I am trying to understand the algorithm in superlu_dist. If
> we use ASM or MUMPS, we do not produce these differences.
> >
> > The differences actually are NOT meaningless.  In fact, we have a real
> transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
>
>   I submit that the "physics solution" of all of these runs is equally
> right and equally wrong. If the solutions are very different due to a small
> perturbation than something is wrong with the model or the integrator, I
> don't think you can blame the linear solver (see below)
> >
> > This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
>
>   This is possible but it is also possible this is due to normal round off
> inside of SuperLU dist.
>
>Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't
> really matter exactly how well SuperLU_Dist does. The nonlinear iteration
> does essential defect correction for you; are you making sure that the
> nonlinear iteration always works for every timestep? For example confirm
> that SNESGetConvergedReason() is always positive.
>
>
> >
> >
> > Fande,
> >
> >
> >
> >
> > On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. 
> wrote:
> >
> >   Meaningless differences
> >
> >
> > > On Nov 15, 2017, at 2:26 PM, Kong, Fande  wrote:
> > >
> > > Hi,
> > >
> > > There is a heat conduction problem. When superlu_dist is used as a
> preconditioner, we have random results from different runs. Is there a
> random algorithm in superlu_dist? If we use ASM or MUMPS as the
> preconditioner, we then don't have this issue.
> > >
> > > run 1:
> > >
> > >  0 Nonlinear |R| = 9.447423e+03
> > >   0 Linear |R| = 9.447423e+03
> > >   1 Linear |R| = 1.013384e-02
> > >   2 Linear |R| = 4.020995e-08
> > >  1 Nonlinear |R| = 1.404678e-02
> > >   0 Linear |R| = 1.404678e-02
> > >   1 Linear |R| = 5.104757e-08
> > >   2 Linear |R| = 7.699637e-14
> > >  2 Nonlinear |R| = 5.106418e-08
> > >
> > >
> > > run 2:
> > >
> > >  0 Nonlinear |R| = 9.447423e+03
> > >   0 Linear |R| = 9.447423e+03
> > >   1 Linear |R| = 1.013384e-02
> > >   2 Linear |R| = 4.020995e-08
> > >  1 Nonlinear |R| = 1.404678e-02
> > >   0 Linear |R| = 1.404678e-02
> > >   1 Linear |R| = 5.109913e-08
> > >   2 Linear |R| = 7.189091e-14
> > >  2 Nonlinear |R| = 5.111591e-08
> > >
> > > run 3:
> > >
> > >  0 Nonlinear |R| = 9.447423e+03
> > >   0 Linear |R| = 9.447423e+03
> > >   1 Linear |R| = 1.013384e-02
> > >   2 Linear |R| = 4.020995e-08
> > >  1 Nonlinear |R| = 1.404678e-02
> > >   0 Linear |R| = 1.404678e-02
> > >   1 Linear |R| = 5.104942e-08
> > >   2 Linear |R| = 7.465572e-14
> > >  2 Nonlinear |R| = 5.106642e-08
> > >
> > > run 4:
> > >
> > >  0 Nonlinear |R| = 9.447423e+03
> > >   0 Linear |R| = 9.447423e+03
> > >   1 Linear |R| = 1.013384e-02
> > >   2 Linear |R| = 4.020995e-08
> > >  1 Nonlinear |R| = 1.404678e-02
> > >   0 Linear |R| = 1.404678e-02
> > >   1 Linear |R| = 5.102730e-08
> > >   2 Linear |R| = 7.132220e-14
> > >  2 Nonlinear |R| = 5.104442e-08
> > >
> > > Solver details:
> > >
> > > SNES Object: 8 MPI processes
> > >   type: newtonls
> > >   maximum iterations=15, maximum function evaluations=1
> > >   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
> > >   total number of linear solver iterations=4
> > >   total number of function evaluations=7
> > >   norm schedule ALWAYS
> > >   SNESLineSearch Object: 8 MPI processes
> > > type: basic
> > > maxstep=1.00e+08, minlambda=1.00e-12
> > > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > > maximum iterations=40
> > >   KSP Object: 8 MPI processes
> > > type: gmres
> > >   restart=30, using Classical (unmodified) Gram-Schmidt
> 

Re: [petsc-users] ISGlobalToLocalMappingApplyBlock

2017-11-15 Thread Adrian Croucher

hi Dave,


On 15/11/17 21:34, Dave May wrote:



Or am I wrong to expect this to give the same results regardless of
blocksize?



Yep.


Maybe I am not using this function correctly then.

The man page says it "Provides the local block numbering for a list of 
integers specified with a block global numbering."


So I thought if I put in global block indices it would give me the 
corresponding local block indices, which would be the same regardless of 
the size of each block.


However, the large negative number being printed looks like an uninitialized 
variable. This seems odd, as with mode = MASK, nout should equal N and 
any requested block indices not in the IS should result in -1 being 
inserted in your local_indices array.


What's the value of nout?


nout returns 1 on both ranks, as expected.

- Adrian

--
Dr Adrian Croucher
Senior Research Fellow
Department of Engineering Science
University of Auckland, New Zealand
email: a.crouc...@auckland.ac.nz
tel: +64 (0)9 923 4611



Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Matthew Knepley
On Wed, Nov 15, 2017 at 4:36 PM, Kong, Fande  wrote:

> Hi Barry,
>
> Thanks for your reply. I was wondering why this happens only when we use
> superlu_dist. I am trying to understand the algorithm in superlu_dist. If
> we use ASM or MUMPS, we do not produce these differences.
>
> The differences actually are NOT meaningless.  In fact, we have a real
> transient application that presents this issue.   When we run the
> simulation with superlu_dist in parallel for thousands of time steps, the
> final physics  solution looks totally different from different runs. The
> differences are not acceptable any more.  For a steady problem, the
> difference may be meaningless. But it is significant for the transient
> problem.
>

Are you sure this formulation is stable? It does not seem like it.

Matt


> This makes the solution not reproducible, and we can not even set a
> targeting solution in the test system because the solution is so different
> from one run to another.   I guess there might/may be a tiny bug in
> superlu_dist or the PETSc interface to superlu_dist.
>
>
> Fande,
>
>
>
>
> On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. 
> wrote:
>
>>
>>   Meaningless differences
>>
>>
>> > On Nov 15, 2017, at 2:26 PM, Kong, Fande  wrote:
>> >
>> > Hi,
>> >
>> > There is a heat conduction problem. When superlu_dist is used as a
>> preconditioner, we have random results from different runs. Is there a
>> random algorithm in superlu_dist? If we use ASM or MUMPS as the
>> preconditioner, we then don't have this issue.
>> >
>> > run 1:
>> >
>> >  0 Nonlinear |R| = 9.447423e+03
>> >   0 Linear |R| = 9.447423e+03
>> >   1 Linear |R| = 1.013384e-02
>> >   2 Linear |R| = 4.020995e-08
>> >  1 Nonlinear |R| = 1.404678e-02
>> >   0 Linear |R| = 1.404678e-02
>> >   1 Linear |R| = 5.104757e-08
>> >   2 Linear |R| = 7.699637e-14
>> >  2 Nonlinear |R| = 5.106418e-08
>> >
>> >
>> > run 2:
>> >
>> >  0 Nonlinear |R| = 9.447423e+03
>> >   0 Linear |R| = 9.447423e+03
>> >   1 Linear |R| = 1.013384e-02
>> >   2 Linear |R| = 4.020995e-08
>> >  1 Nonlinear |R| = 1.404678e-02
>> >   0 Linear |R| = 1.404678e-02
>> >   1 Linear |R| = 5.109913e-08
>> >   2 Linear |R| = 7.189091e-14
>> >  2 Nonlinear |R| = 5.111591e-08
>> >
>> > run 3:
>> >
>> >  0 Nonlinear |R| = 9.447423e+03
>> >   0 Linear |R| = 9.447423e+03
>> >   1 Linear |R| = 1.013384e-02
>> >   2 Linear |R| = 4.020995e-08
>> >  1 Nonlinear |R| = 1.404678e-02
>> >   0 Linear |R| = 1.404678e-02
>> >   1 Linear |R| = 5.104942e-08
>> >   2 Linear |R| = 7.465572e-14
>> >  2 Nonlinear |R| = 5.106642e-08
>> >
>> > run 4:
>> >
>> >  0 Nonlinear |R| = 9.447423e+03
>> >   0 Linear |R| = 9.447423e+03
>> >   1 Linear |R| = 1.013384e-02
>> >   2 Linear |R| = 4.020995e-08
>> >  1 Nonlinear |R| = 1.404678e-02
>> >   0 Linear |R| = 1.404678e-02
>> >   1 Linear |R| = 5.102730e-08
>> >   2 Linear |R| = 7.132220e-14
>> >  2 Nonlinear |R| = 5.104442e-08
>> >
>> > Solver details:
>> >
>> > SNES Object: 8 MPI processes
>> >   type: newtonls
>> >   maximum iterations=15, maximum function evaluations=1
>> >   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
>> >   total number of linear solver iterations=4
>> >   total number of function evaluations=7
>> >   norm schedule ALWAYS
>> >   SNESLineSearch Object: 8 MPI processes
>> > type: basic
>> > maxstep=1.00e+08, minlambda=1.00e-12
>> > tolerances: relative=1.00e-08, absolute=1.00e-15,
>> lambda=1.00e-08
>> > maximum iterations=40
>> >   KSP Object: 8 MPI processes
>> > type: gmres
>> >   restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>> >   happy breakdown tolerance 1e-30
>> > maximum iterations=100, initial guess is zero
>> > tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
>> > right preconditioning
>> > using UNPRECONDITIONED norm type for convergence test
>> >   PC Object: 8 MPI processes
>> > type: lu
>> >   out-of-place factorization
>> >   tolerance for zero pivot 2.22045e-14
>> >   matrix ordering: natural
>> >   factor fill ratio given 0., needed 0.
>> > Factored matrix follows:
>> >   Mat Object: 8 MPI processes
>> > type: superlu_dist
>> > rows=7925, cols=7925
>> > package used to perform factorization: superlu_dist
>> > total: nonzeros=0, allocated nonzeros=0
>> > total number of mallocs used during MatSetValues calls =0
>> >   SuperLU_DIST run parameters:
>> > Process grid nprow 4 x npcol 2
>> > Equilibrate matrix TRUE
>> > Matrix input mode 1
>> > Replace tiny pivots FALSE
>> > Use iterative refinement TRUE
>> > Processors in row 4 col 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Smith, Barry F.


> On Nov 15, 2017, at 3:36 PM, Kong, Fande  wrote:
> 
> Hi Barry,
> 
> Thanks for your reply. I was wondering why this happens only when we use 
> superlu_dist. I am trying to understand the algorithm in superlu_dist. If we 
> use ASM or MUMPS, we do not produce these differences. 
> 
> The differences actually are NOT meaningless.  In fact, we have a real 
> transient application that presents this issue.   When we run the simulation 
> with superlu_dist in parallel for thousands of time steps, the final physics  
> solution looks totally different from different runs. The differences are not 
> acceptable any more.  For a steady problem, the difference may be 
> meaningless. But it is significant for the transient problem. 

  I submit that the "physics solution" of all of these runs is equally right 
and equally wrong. If the solutions are very different due to a small 
perturbation then something is wrong with the model or the integrator; I don't 
think you can blame the linear solver (see below).
> 
> This makes the solution not reproducible, and we can not even set a targeting 
> solution in the test system because the solution is so different from one run 
> to another.   I guess there might/may be a tiny bug in superlu_dist or the 
> PETSc interface to superlu_dist.

  This is possible but it is also possible this is due to normal round off 
inside of SuperLU dist.

   Since you have SuperLU_Dist inside a nonlinear iteration it shouldn't really 
matter exactly how well SuperLU_Dist does. The nonlinear iteration does 
essential defect correction for you; are you making sure that the nonlinear 
iteration always works for every timestep? For example confirm that 
SNESGetConvergedReason() is always positive.
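
A minimal sketch of that check in C, assuming snes and the solution vector x 
are already set up (or, equivalently, run with -snes_converged_reason):

  SNESConvergedReason reason;
  PetscErrorCode      ierr;

  ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);
  ierr = SNESGetConvergedReason(snes, &reason);CHKERRQ(ierr);
  if (reason < 0) {
    /* the nonlinear solve for this time step did not converge */
    SETERRQ1(PETSC_COMM_WORLD, PETSC_ERR_CONV_FAILED,
             "SNES diverged with reason %d", (int)reason);
  }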

  
> 
> 
> Fande,
> 
> 
> 
> 
> On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F.  wrote:
> 
>   Meaningless differences
> 
> 
> > On Nov 15, 2017, at 2:26 PM, Kong, Fande  wrote:
> >
> > Hi,
> >
> > There is a heat conduction problem. When superlu_dist is used as a 
> > preconditioner, we have random results from different runs. Is there a 
> > random algorithm in superlu_dist? If we use ASM or MUMPS as the 
> > preconditioner, we then don't have this issue.
> >
> > run 1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.104757e-08
> >   2 Linear |R| = 7.699637e-14
> >  2 Nonlinear |R| = 5.106418e-08
> >
> >
> > run 2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.109913e-08
> >   2 Linear |R| = 7.189091e-14
> >  2 Nonlinear |R| = 5.111591e-08
> >
> > run 3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.104942e-08
> >   2 Linear |R| = 7.465572e-14
> >  2 Nonlinear |R| = 5.106642e-08
> >
> > run 4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.102730e-08
> >   2 Linear |R| = 7.132220e-14
> >  2 Nonlinear |R| = 5.104442e-08
> >
> > Solver details:
> >
> > SNES Object: 8 MPI processes
> >   type: newtonls
> >   maximum iterations=15, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
> >   total number of linear solver iterations=4
> >   total number of function evaluations=7
> >   norm schedule ALWAYS
> >   SNESLineSearch Object: 8 MPI processes
> > type: basic
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15, 
> > lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object: 8 MPI processes
> > type: gmres
> >   restart=30, using Classical (unmodified) Gram-Schmidt 
> > Orthogonalization with no iterative refinement
> >   happy breakdown tolerance 1e-30
> > maximum iterations=100, initial guess is zero
> > tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> >   PC Object: 8 MPI processes
> > type: lu
> >   out-of-place factorization
> >   tolerance for zero pivot 2.22045e-14
> >   matrix ordering: natural
> >   factor fill ratio given 0., needed 0.
> > Factored matrix follows:
> >   Mat Object: 8 MPI processes
> > 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
Hi Barry,

Thanks for your reply. I was wondering why this happens only when we use
superlu_dist. I am trying to understand the algorithm in superlu_dist. If
we use ASM or MUMPS, we do not produce these differences.

The differences actually are NOT meaningless.  In fact, we have a real
transient application that presents this issue.  When we run the
simulation with superlu_dist in parallel for thousands of time steps, the
final physics solution looks totally different between runs. The
differences are not acceptable any more.  For a steady problem, the
difference may be meaningless, but it is significant for the transient
problem.

This makes the solution not reproducible, and we cannot even set a
target solution in the test system because the solution is so different
from one run to another.  I guess there might be a tiny bug in
superlu_dist or the PETSc interface to superlu_dist.


Fande,




On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F.  wrote:

>
>   Meaningless differences
>
>
> > On Nov 15, 2017, at 2:26 PM, Kong, Fande  wrote:
> >
> > Hi,
> >
> > There is a heat conduction problem. When superlu_dist is used as a
> preconditioner, we have random results from different runs. Is there a
> random algorithm in superlu_dist? If we use ASM or MUMPS as the
> preconditioner, we then don't have this issue.
> >
> > run 1:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.104757e-08
> >   2 Linear |R| = 7.699637e-14
> >  2 Nonlinear |R| = 5.106418e-08
> >
> >
> > run 2:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.109913e-08
> >   2 Linear |R| = 7.189091e-14
> >  2 Nonlinear |R| = 5.111591e-08
> >
> > run 3:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.104942e-08
> >   2 Linear |R| = 7.465572e-14
> >  2 Nonlinear |R| = 5.106642e-08
> >
> > run 4:
> >
> >  0 Nonlinear |R| = 9.447423e+03
> >   0 Linear |R| = 9.447423e+03
> >   1 Linear |R| = 1.013384e-02
> >   2 Linear |R| = 4.020995e-08
> >  1 Nonlinear |R| = 1.404678e-02
> >   0 Linear |R| = 1.404678e-02
> >   1 Linear |R| = 5.102730e-08
> >   2 Linear |R| = 7.132220e-14
> >  2 Nonlinear |R| = 5.104442e-08
> >
> > Solver details:
> >
> > SNES Object: 8 MPI processes
> >   type: newtonls
> >   maximum iterations=15, maximum function evaluations=1
> >   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
> >   total number of linear solver iterations=4
> >   total number of function evaluations=7
> >   norm schedule ALWAYS
> >   SNESLineSearch Object: 8 MPI processes
> > type: basic
> > maxstep=1.00e+08, minlambda=1.00e-12
> > tolerances: relative=1.00e-08, absolute=1.00e-15,
> lambda=1.00e-08
> > maximum iterations=40
> >   KSP Object: 8 MPI processes
> > type: gmres
> >   restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> >   happy breakdown tolerance 1e-30
> > maximum iterations=100, initial guess is zero
> > tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> >   PC Object: 8 MPI processes
> > type: lu
> >   out-of-place factorization
> >   tolerance for zero pivot 2.22045e-14
> >   matrix ordering: natural
> >   factor fill ratio given 0., needed 0.
> > Factored matrix follows:
> >   Mat Object: 8 MPI processes
> > type: superlu_dist
> > rows=7925, cols=7925
> > package used to perform factorization: superlu_dist
> > total: nonzeros=0, allocated nonzeros=0
> > total number of mallocs used during MatSetValues calls =0
> >   SuperLU_DIST run parameters:
> > Process grid nprow 4 x npcol 2
> > Equilibrate matrix TRUE
> > Matrix input mode 1
> > Replace tiny pivots FALSE
> > Use iterative refinement TRUE
> > Processors in row 4 col partition 2
> > Row permutation LargeDiag
> > Column permutation METIS_AT_PLUS_A
> > Parallel symbolic factorization FALSE
> > Repeated factorization SamePattern
> > linear system matrix followed by preconditioner matrix:
> > Mat 

Re: [petsc-users] superlu_dist produces random results

2017-11-15 Thread Smith, Barry F.

  Meaningless differences


> On Nov 15, 2017, at 2:26 PM, Kong, Fande  wrote:
> 
> Hi,
> 
> There is a heat conduction problem. When superlu_dist is used as a 
> preconditioner, we have random results from different runs. Is there a random 
> algorithm in superlu_dist? If we use ASM or MUMPS as the preconditioner, we 
> then don't have this issue. 
> 
> run 1:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020995e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 5.104757e-08
>   2 Linear |R| = 7.699637e-14
>  2 Nonlinear |R| = 5.106418e-08
> 
> 
> run 2:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020995e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 5.109913e-08
>   2 Linear |R| = 7.189091e-14
>  2 Nonlinear |R| = 5.111591e-08
> 
> run 3:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020995e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 5.104942e-08
>   2 Linear |R| = 7.465572e-14
>  2 Nonlinear |R| = 5.106642e-08
> 
> run 4:
> 
>  0 Nonlinear |R| = 9.447423e+03
>   0 Linear |R| = 9.447423e+03
>   1 Linear |R| = 1.013384e-02
>   2 Linear |R| = 4.020995e-08
>  1 Nonlinear |R| = 1.404678e-02
>   0 Linear |R| = 1.404678e-02
>   1 Linear |R| = 5.102730e-08
>   2 Linear |R| = 7.132220e-14
>  2 Nonlinear |R| = 5.104442e-08
> 
> Solver details:
> 
> SNES Object: 8 MPI processes
>   type: newtonls
>   maximum iterations=15, maximum function evaluations=1
>   tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
>   total number of linear solver iterations=4
>   total number of function evaluations=7
>   norm schedule ALWAYS
>   SNESLineSearch Object: 8 MPI processes
> type: basic
> maxstep=1.00e+08, minlambda=1.00e-12
> tolerances: relative=1.00e-08, absolute=1.00e-15, 
> lambda=1.00e-08
> maximum iterations=40
>   KSP Object: 8 MPI processes
> type: gmres
>   restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
> with no iterative refinement
>   happy breakdown tolerance 1e-30
> maximum iterations=100, initial guess is zero
> tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
> right preconditioning
> using UNPRECONDITIONED norm type for convergence test
>   PC Object: 8 MPI processes
> type: lu
>   out-of-place factorization
>   tolerance for zero pivot 2.22045e-14
>   matrix ordering: natural
>   factor fill ratio given 0., needed 0.
> Factored matrix follows:
>   Mat Object: 8 MPI processes
> type: superlu_dist
> rows=7925, cols=7925
> package used to perform factorization: superlu_dist
> total: nonzeros=0, allocated nonzeros=0
> total number of mallocs used during MatSetValues calls =0
>   SuperLU_DIST run parameters:
> Process grid nprow 4 x npcol 2 
> Equilibrate matrix TRUE 
> Matrix input mode 1 
> Replace tiny pivots FALSE 
> Use iterative refinement TRUE 
> Processors in row 4 col partition 2 
> Row permutation LargeDiag 
> Column permutation METIS_AT_PLUS_A
> Parallel symbolic factorization FALSE 
> Repeated factorization SamePattern
> linear system matrix followed by preconditioner matrix:
> Mat Object: 8 MPI processes
>   type: mffd
>   rows=7925, cols=7925
> Matrix-free approximation:
>   err=1.49012e-08 (relative error in function evaluation)
>   Using wp compute h routine
>   Does not compute normU
> Mat Object: () 8 MPI processes
>   type: mpiaij
>   rows=7925, cols=7925
>   total: nonzeros=63587, allocated nonzeros=63865
>   total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> 
> 
> Fande,
> 
> 



[petsc-users] superlu_dist produces random results

2017-11-15 Thread Kong, Fande
Hi,

There is a heat conduction problem. When superlu_dist is used as a
preconditioner, we have random results from different runs. Is there a
random algorithm in superlu_dist? If we use ASM or MUMPS as the
preconditioner, we then don't have this issue.

run 1:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.104757e-08
  2 Linear |R| = 7.699637e-14
 2 Nonlinear |R| = 5.106418e-08


run 2:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.109913e-08
  2 Linear |R| = 7.189091e-14
 2 Nonlinear |R| = 5.111591e-08

run 3:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.104942e-08
  2 Linear |R| = 7.465572e-14
 2 Nonlinear |R| = 5.106642e-08

run 4:

 0 Nonlinear |R| = 9.447423e+03
  0 Linear |R| = 9.447423e+03
  1 Linear |R| = 1.013384e-02
  2 Linear |R| = 4.020995e-08
 1 Nonlinear |R| = 1.404678e-02
  0 Linear |R| = 1.404678e-02
  1 Linear |R| = 5.102730e-08
  2 Linear |R| = 7.132220e-14
 2 Nonlinear |R| = 5.104442e-08

Solver details:

SNES Object: 8 MPI processes
  type: newtonls
  maximum iterations=15, maximum function evaluations=1
  tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
  total number of linear solver iterations=4
  total number of function evaluations=7
  norm schedule ALWAYS
  SNESLineSearch Object: 8 MPI processes
type: basic
maxstep=1.00e+08, minlambda=1.00e-12
tolerances: relative=1.00e-08, absolute=1.00e-15,
lambda=1.00e-08
maximum iterations=40
  KSP Object: 8 MPI processes
type: gmres
  restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
  happy breakdown tolerance 1e-30
maximum iterations=100, initial guess is zero
tolerances:  relative=1e-06, absolute=1e-50, divergence=1.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
  PC Object: 8 MPI processes
type: lu
  out-of-place factorization
  tolerance for zero pivot 2.22045e-14
  matrix ordering: natural
  factor fill ratio given 0., needed 0.
Factored matrix follows:
  Mat Object: 8 MPI processes
type: superlu_dist
rows=7925, cols=7925
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
  SuperLU_DIST run parameters:
Process grid nprow 4 x npcol 2
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots FALSE
Use iterative refinement TRUE
Processors in row 4 col partition 2
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern
linear system matrix followed by preconditioner matrix:
Mat Object: 8 MPI processes
  type: mffd
  rows=7925, cols=7925
Matrix-free approximation:
  err=1.49012e-08 (relative error in function evaluation)
  Using wp compute h routine
  Does not compute normU
Mat Object: () 8 MPI processes
  type: mpiaij
  rows=7925, cols=7925
  total: nonzeros=63587, allocated nonzeros=63865
  total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines


Fande,


[petsc-users] IDR available in PETSC?

2017-11-15 Thread Evan Um
Dear PETSC users,

I was wondering if anyone has already tried or developed an induced dimension
reduction (IDR) solver for PETSC? I think it would be a useful one, but I
couldn't find an example of it with PETSC. If you have any ideas about IDR
routines for PETSC, please let me know. Thanks!

Best,
Evan


Re: [petsc-users] Possible to recover ILU(k) from hypre/pilut?

2017-11-15 Thread Mark Lohry
>
>
> > Partially unrelated, PC block-jacobi fails with MFFD type not supported,
> but additive schwarz with 0 overlap, which I think is identical, works
> fine. Is this a bug?
>
>    Huh, is this related to hypre, or plain PETSc? Please send all
> information, command line options etc that reproduce the problem,
> preferably on a PETSc example.


 Unrelated to hypre, pure petsc. Using:

SNESSetJacobian(ctx.snes, ctx.JPre, ctx.JPre,
SNESComputeJacobianDefaultColor, fdcoloring);

and -snes_mf_operator,
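
For context, a typical coloring setup leading to that SNESSetJacobian call
looks roughly like this (a sketch; FormFunction is a placeholder for the
residual callback, error checking omitted):

  ISColoring    iscoloring;
  MatColoring   mc;
  MatFDColoring fdcoloring;

  /* color the preconditioner matrix so the Jacobian can be assembled by
     finite differences with few residual evaluations */
  MatColoringCreate(ctx.JPre, &mc);
  MatColoringSetType(mc, MATCOLORINGSL);
  MatColoringApply(mc, &iscoloring);
  MatColoringDestroy(&mc);

  MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring);
  MatFDColoringSetFunction(fdcoloring,
      (PetscErrorCode (*)(void)) FormFunction, &ctx);
  MatFDColoringSetFromOptions(fdcoloring);
  MatFDColoringSetUp(ctx.JPre, iscoloring, fdcoloring);
  ISColoringDestroy(&iscoloring);

  SNESSetJacobian(ctx.snes, ctx.JPre, ctx.JPre,
                  SNESComputeJacobianDefaultColor, fdcoloring);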

-pc_type asm works as expected, ksp_view:

PC Object: 32 MPI processes
  type: asm
total subdomain blocks = 32, amount of overlap = 1
restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC objects:
  KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
left preconditioning
using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
type: ilu
  out-of-place factorization
  0 levels of fill
  tolerance for zero pivot 2.22045e-14
  matrix ordering: natural
  factor fill ratio given 1., needed 1.
Factored matrix follows:
  Mat Object: 1 MPI processes
type: seqaij
rows=3600, cols=3600
package used to perform factorization: petsc
total: nonzeros=69, allocated nonzeros=69
total number of mallocs used during MatSetValues calls =0
  using I-node routines: found 720 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
  type: seqaij
  rows=3600, cols=3600
  total: nonzeros=69, allocated nonzeros=69
  total number of mallocs used during MatSetValues calls =0
using I-node routines: found 720 nodes, limit used is 5
  linear system matrix followed by preconditioner matrix:
  Mat Object: 32 MPI processes
type: mffd
rows=76800, cols=76800
  Matrix-free approximation:
err=1.49012e-08 (relative error in function evaluation)
Using wp compute h routine
Does not compute normU
  Mat Object: 32 MPI processes
type: mpiaij
rows=76800, cols=76800
total: nonzeros=1632, allocated nonzeros=1632
total number of mallocs used during MatSetValues calls =0


-pc_type bjacobi:

[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: Not coded for this matrix type
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.8.1, Nov, 04, 2017
[0]PETSC ERROR: maDG on a arch-linux2-c-opt named templeton by mlohry Wed
Nov 15 08:53:20 2017
[0]PETSC ERROR: Configure options
PETSC_DIR=/home/mlohry/dev/build/external/petsc
PETSC_ARCH=arch-linux2-c-opt --with-cc=mpicc --with-cxx=mpic++ --with-fc=0
--with-clanguage=C++ --with-pic=1 --with-debugging=1 COPTFLAGS=-O3
CXXOPTFLAGS=-O3 --with-shared-libraries=1 --download-parmetis
--download-metis --download-hypre=yes --download-superlu_dist=yes
--with-64-bit-indices
[0]PETSC ERROR: #1 MatGetDiagonalBlock() line 307 in
/home/mlohry/dev/build/external/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: #2 PCSetUp_BJacobi() line 119 in
/home/mlohry/dev/build/external/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c
[0]PETSC ERROR: #3 PCSetUp() line 924 in
/home/mlohry/dev/build/external/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #4 KSPSetUp() line 381 in
/home/mlohry/dev/build/external/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #5 KSPSolve() line 612 in
/home/mlohry/dev/build/external/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #6 SNESSolve_NEWTONLS() line 224 in
/home/mlohry/dev/build/external/petsc/src/snes/impls/ls/ls.c
[0]PETSC ERROR: #7 SNESSolve() line 4108 in
/home/mlohry/dev/build/external/petsc/src/snes/interface/snes.c
[0]PETSC ERROR: #8 TS_SNESSolve() line 176 in
/home/mlohry/dev/build/external/petsc/src/ts/impls/implicit/theta/theta.c
[0]PETSC ERROR: #9 TSStep_Theta() line 216 in
/home/mlohry/dev/build/external/petsc/src/ts/impls/implicit/theta/theta.c
[0]PETSC ERROR: #10 TSStep() line 4120 in
/home/mlohry/dev/build/external/petsc/src/ts/interface/ts.c
[0]PETSC ERROR: #11 TSSolve() line 4373 in
/home/mlohry/dev/build/external/petsc/src/ts/interface/ts.c









On Wed, Nov 15, 2017 at 8:47 AM, Smith, Barry F.  wrote:

>
>
> > On Nov 15, 2017, at 6:38 AM, Mark Lohry  wrote:
> >
> > I've found ILU(0) or (1) to be working well for my problem, but the
> petsc implementation is serial only. Running with -pc_type hypre
> -pc_hypre_type pilut with default settings has considerably worse
> convergence. I've tried using -pc_hypre_pilut_factorrowsize (number of
> actual elements in 

[petsc-users] Possible to recover ILU(k) from hypre/pilut?

2017-11-15 Thread Mark Lohry
I've found ILU(0) or (1) to be working well for my problem, but the petsc
implementation is serial only. Running with -pc_type hypre -pc_hypre_type
pilut with default settings has considerably worse convergence. I've tried
using -pc_hypre_pilut_factorrowsize (number of actual elements in row) to
trick it into doing ILU(0), to no effect.
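
For concreteness, the option sets being compared are roughly the following
(a sketch; the fill level and row-size values are only examples):

  # ILU(k) applied per process, inside a domain-decomposition preconditioner
  -pc_type asm -sub_pc_type ilu -sub_pc_factor_levels 1

  # hypre's pilut
  -pc_type hypre -pc_hypre_type pilut -pc_hypre_pilut_factorrowsize 10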

Is there any way to recover classical ILU(k) from pilut?

Hypre's docs state pilut is no longer supported, and Euclid should be used
for anything moving forward. pc_hypre_boomeramg has options for Euclid
smoothers. Any hope of a pc_hypre_type euclid?


Partially unrelated, PC block-jacobi fails with MFFD type not supported,
but additive schwarz with 0 overlap, which I think is identical, works
fine. Is this a bug?


Thanks,
Mark


Re: [petsc-users] indices into Vec/Mat associated to a DMPlex

2017-11-15 Thread Matthew Knepley
On Wed, Nov 15, 2017 at 6:09 AM, Matteo Semplice 
wrote:

> On 15/11/2017 11:39, Matthew Knepley wrote:
>
> On Wed, Nov 15, 2017 at 3:11 AM, Matteo Semplice  > wrote:
>
>> Hi.
>>
>> I am struggling with indices into matrices associated to a DMPlex mesh. I
>> can reduce my problem to the following minimal example.
>>
>> Let's say I want to assemble the matrix to solve an equation (say
>> Laplace) with data attached to cells and the finite volume method. In
>> principle I
>>
>> - loop over the faces of the DMPlex (height=1)
>>
>> - for each face, find neighbouring cells (say points i and j in the
>> DMPlex) and compute the contributions coming from that face (in the example
>> it would be something like (u_j-u_i)*<x_j-x_i,n> with x=centroids, n=scaled
>> normal to face and <,> the inner product)
>>
>> - insert the contributions +/- in rows/columns n(i) and n(j)
>> of the matrix where n(k) is the index into Vec/Mat of the unknown
>> associated to the k-th cell
>>
>> My problem is how to find out n(k). I assume that the Section should be
>> able to tell me, but I cannot find the correct function to call. I see that
>> FEM methods can use DMPlexMatSetClosure but here we'd need a
>> DMPlexMatSetCone...
>>
>
> Everything always reduces to raw Section calls. For instance, for cell c
>
>   PetscSectionGetDof(sec, c, &dof);
>   PetscSectionGetOffset(sec, c, &off);
>
> give the number of degrees of freedom on the cell, and the offset into
> storage (a local vector for the local section, and global vector for the
> global section).
> The Closure stuff just calls this for every point in the closure.
>
>
> All right, so the offset is also the (global) index and that is to be used
> in calls to MatSetValues.
>

Not quite. For all owned dofs, it is (in the global section). For all
unowned dofs, it is -(off+1), so you have
to convert it if you want to set off-process values.
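
In code the conversion is just (a sketch, with gsec the global section and c
a cell point):

  PetscInt dof, off;

  PetscSectionGetDof(gsec, c, &dof);
  PetscSectionGetOffset(gsec, c, &off);
  if (off < 0) off = -(off + 1);  /* unowned point: recover the global index */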

  Thanks,

Matt


> Thanks a lot!
>
> Matteo
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] indices into Vec/Mat associated to a DMPlex

2017-11-15 Thread Matteo Semplice



On 15/11/2017 11:39, Matthew Knepley wrote:
On Wed, Nov 15, 2017 at 3:11 AM, Matteo Semplice 
> wrote:


Hi.

I am struggling with indices into matrices associated to a DMPlex
mesh. I can reduce my problem to the following minimal example.

Let's say I want to assemble the matrix to solve an equation (say
Laplace) with data attached to cells and the finite volume method.
In principle I

- loop over the faces of the DMPlex (height=1)

- for each face, find neighbouring cells (say points i and j in
the DMPlex) and compute the contributions coming from that face
(in the example it would be something like (u_j-u_i)*<x_j-x_i,n>
with x=centroids, n=scaled normal to face and <,> the inner product)

- insert the contributions +/- in rows/columns n(i) and
n(j) of the matrix where n(k) is the index into Vec/Mat of the
unknown associated to the k-th cell

My problem is how to find out n(k). I assume that the Section
should be able to tell me, but I cannot find the correct function
to call. I see that FEM methods can use DMPlexMatSetClosure but
here we'd need a DMPlexMatSetCone...


Everything always reduces to raw Section calls. For instance, for cell c

  PetscSectionGetDof(sec, c, &dof);
  PetscSectionGetOffset(sec, c, &off);

give the number of degrees of freedom on the cell, and the offset into 
storage (a local vector for the local section, and global vector for 
the global section).

The Closure stuff just calls this for every point in the closure.


All right, so the offset is also the (global) index and that is to be 
used in calls to MatSetValues.


Thanks a lot!

    Matteo


Re: [petsc-users] indices into Vec/Mat associated to a DMPlex

2017-11-15 Thread Matthew Knepley
On Wed, Nov 15, 2017 at 3:11 AM, Matteo Semplice 
wrote:

> Hi.
>
> I am struggling with indices into matrices associated to a DMPlex mesh. I
> can reduce my problem to the following minimal example.
>
> Let's say I want to assemble the matrix to solve an equation (say Laplace)
> with data attached to cells and the finite volume method. In principle I
>
> - loop over the faces of the DMPlex (height=1)
>
> - for each face, find neighbouring cells (say points i and j in the
> DMPlex) and compute the contributions coming from that face (in the example
> it would be something like (u_j-u_i)*<x_j-x_i,n> with x=centroids, n=scaled
> normal to face and <,> the inner product)
>
> - insert the contributions +/- in rows/columns n(i) and n(j) of
> the matrix where n(k) is the index into Vec/Mat of the unknown associated
> to the k-th cell
>
> My problem is how to find out n(k). I assume that the Section should be
> able to tell me, but I cannot find the correct function to call. I see that
> FEM methods can use DMPlexMatSetClosure but here we'd need a
> DMPlexMatSetCone...
>

Everything always reduces to raw Section calls. For instance, for cell c

  PetscSectionGetDof(sec, c, &dof);
  PetscSectionGetOffset(sec, c, &off);

give the number of degrees of freedom on the cell, and the offset into
storage (a local vector for the local section, and global vector for the
global section).
The Closure stuff just calls this for every point in the closure.
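
As an illustration (my sketch, not tested code; A, dm and the flux
coefficient a_f are placeholders and error checking is omitted), the face
loop you describe could pull the matrix indices from the global section
like this:

  PetscSection gsec;
  PetscInt     fStart, fEnd, f;

  DMGetDefaultGlobalSection(dm, &gsec);
  DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd);      /* faces have height 1 */
  for (f = fStart; f < fEnd; ++f) {
    const PetscInt *cells;
    PetscInt        ncells, row[2];
    PetscScalar     a_f = 1.0;  /* placeholder for the face coefficient */
    PetscScalar     v[4] = { a_f, -a_f, -a_f, a_f };

    DMPlexGetSupportSize(dm, f, &ncells);
    if (ncells != 2) continue;                        /* skip boundary faces */
    DMPlexGetSupport(dm, f, &cells);
    PetscSectionGetOffset(gsec, cells[0], &row[0]);
    PetscSectionGetOffset(gsec, cells[1], &row[1]);
    if (row[0] < 0) row[0] = -(row[0] + 1);           /* unowned cell */
    if (row[1] < 0) row[1] = -(row[1] + 1);
    MatSetValues(A, 2, row, 2, row, v, ADD_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);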


> On a side note, I noticed that the grad field in PetscFVFaceGeom is not
> computed by DMPlexComputeGeometryFVM. Is it meant or could it be used to
> store the precomputed   values?


There are separate functions for reconstructing the gradient since it is so
expensive (and can be inaccurate for some unstructured cases at boundaries).


> But even if so, this wouldn't solve the problem of where to put the
> elements in the matrix, right?


No.

   Matt


>
> Matteo

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] ISGlobalToLocalMappingApplyBlock

2017-11-15 Thread Dave May
On Wed, 15 Nov 2017 at 05:55, Adrian Croucher 
wrote:

> hi
>
> I'm trying to use ISGlobalToLocalMappingApplyBlock() and am a bit
> puzzled about the results it's giving.
>
> I've attached a small test to illustrate. It just sets up a
> local-to-global mapping with 10 elements. Running on two processes the
> first has global indices 0 - 4 and the the second has 5 - 9. I then try
> to find the local index corresponding to global index 8.
>
> If I set the blocksize parameter to 1, it correctly gives the results -1
> on rank 0 and 3 on rank 1.
>
> But if I set the blocksize to 2 (or more), the results are -253701943 on
> rank 0 and -1 on rank 1. Neither of these are what I expected- I thought
> they should be the same as in the blocksize 1 case.


The man page says to use "block global  numbering"


>
> I'm presuming the global indices I pass in to
> ISGlobalToLocalMappingApplyBlock() should be global block indices (i.e.
> not scaled up by blocksize).


Yes, the indices should relate to the blocks

If I do scale them up it doesn't give the
> answers I expect either.
>
> Or am I wrong to expect this to give the same results regardless of
> blocksize?


Yep.

However the large negative number being printed looks like an uninitialized
variable. This seems odd as with mode = MASK nout should equal N and any
requested block indices not in the IS should result in -1 being inserted in
your local_indices array.

What's the value of nout?
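
For example, something along these lines (a sketch; ltog is the mapping from
your test, and with block size 2 the global block index for point 8 would be
4):

  PetscInt nout, gidx[1] = {4}, lidx[1];

  ISGlobalToLocalMappingApplyBlock(ltog, IS_GTOLM_MASK, 1, gidx, &nout, lidx);
  PetscPrintf(PETSC_COMM_SELF, "nout = %D  lidx[0] = %D\n", nout, lidx[0]);

which should report nout = 1 and either the local block index or -1.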

Thanks,
  Dave


>
> Cheers, Adrian
>
> --
> Dr Adrian Croucher
> Senior Research Fellow
> Department of Engineering Science
> University of Auckland, New Zealand
> email: a.crouc...@auckland.ac.nz
> tel: +64 (0)9 923 4611
>
>


[petsc-users] indices into Vec/Mat associated to a DMPlex

2017-11-15 Thread Matteo Semplice

Hi.

I am struggling with indices into matrices associated to a DMPlex mesh. 
I can reduce my problem to the following minimal example.


Let's say I want to assemble the matrix to solve an equation (say 
Laplace) with data attached to cells and the finite volume method. In 
principle I


- loop over the faces of the DMPlex (height=1)

- for each face, find neighbouring cells (say points i and j in the 
DMPlex) and compute the contributions coming from that face (in the 
example it would be something like (u_j-u_i)*<x_j-x_i,n> with 
x=centroids, n=scaled normal to face and <,> the inner product; a worked 
form follows after this list)


- insert the contributions +/- in rows/columns n(i) and n(j) 
of the matrix where n(k) is the index into Vec/Mat of the unknown 
associated to the k-th cell
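
(Spelled out, the face term I have in mind is something like

   F_f = (u_j - u_i) * <x_j - x_i, n> / |x_j - x_i|^2,

up to the exact scaling, entering the rows/columns n(i) and n(j) with 
opposite signs.)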


My problem is how to find out n(k). I assume that the Section should be 
able to tell me, but I cannot find the correct function to call. I see 
that FEM methods can use DMPlexMatSetClosure but here we'd need a 
DMPlexMatSetCone...


On a side note, I noticed that the grad field in PetscFVFaceGeom is not 
computed by DMPlexComputeGeometryFVM. Is it meant or could it be used to 
store the precomputed   values? But even if so, this wouldn't 
solve the problem of where to put the elements in the matrix, right?


Matteo