Re: [petsc-users] MPI linear solver reproducibility question

2023-04-02 Thread Mark McClure
OK, good to know. I'll update to the latest PETSc, do some testing, and let
you know either way.


On Sun, Apr 2, 2023 at 6:31 AM Jed Brown  wrote:

> Vector communication used a different code path in 3.13. If you have a
> reproducer with current PETSc, I'll have a look. Here's a demo that the
> solution is bitwise identical (the sha256sum is the same every time you run
> it, though it might be different on your computer from mine due to compiler
> version and flags).
>
> $ mpiexec -n 8 ompi/tests/snes/tutorials/ex5 -da_refine 3 -snes_monitor
> -snes_view_solution binary && sha256sum binaryoutput
>   0 SNES Function norm 1.265943996096e+00
>   1 SNES Function norm 2.831564838232e-02
>   2 SNES Function norm 4.456686729809e-04
>   3 SNES Function norm 1.206531765776e-07
>   4 SNES Function norm 1.740255643596e-12
> 5410f84e91a9db3a74a2ac336031fb48e7eaf739614192cfd53344517986
> binaryoutput


Re: [petsc-users] MPI linear solver reproducibility question

2023-04-02 Thread Jed Brown
Vector communication used a different code path in 3.13. If you have a 
reproducer with current PETSc, I'll have a look. Here's a demo that the 
solution is bitwise identical (the sha256sum is the same every time you run it, 
though it might be different on your computer from mine due to compiler version 
and flags).

$ mpiexec -n 8 ompi/tests/snes/tutorials/ex5 -da_refine 3 -snes_monitor 
-snes_view_solution binary && sha256sum binaryoutput
  0 SNES Function norm 1.265943996096e+00
  1 SNES Function norm 2.831564838232e-02
  2 SNES Function norm 4.456686729809e-04
  3 SNES Function norm 1.206531765776e-07
  4 SNES Function norm 1.740255643596e-12
5410f84e91a9db3a74a2ac336031fb48e7eaf739614192cfd53344517986  binaryoutput
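
If you want the same check in your own code, here is a minimal sketch (not
part of ex5; it assumes an already-solved KSP named "ksp" and current PETSc
error handling) that writes the solution to a binary file you can hash:

#include <petscksp.h>

/* Write the converged solution to "binaryoutput" so repeated runs can be
   compared bitwise, e.g. with "sha256sum binaryoutput". */
static PetscErrorCode DumpSolution(KSP ksp)
{
  Vec         x;
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(KSPGetSolution(ksp, &x));
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "binaryoutput", FILE_MODE_WRITE, &viewer));
  PetscCall(VecView(x, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscFunctionReturn(0);
}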

Mark McClure  writes:

> In the typical FD implementation, you only set local rows, but with FE and
> sometimes FV, you also create values that need to be communicated and
> summed on other processors.
> Makes sense.
>
> Anyway, in this case, I am certain that I am giving the solver bitwise
> identical matrices from each process. I am not using a preconditioner; I am
> using BCGS with PETSc version 3.13.3.
>
> So then, how can I make sure that I am "using an MPI that follows the
> suggestion for implementers about determinism"? I am using MPICH version
> 3.3a2 and didn't do anything special when installing it. Does that sound OK?
> If so, I could upgrade to the latest PETSc, try again, and, if the issue
> persists, provide a reproduction scenario.


Re: [petsc-users] MPI linear solver reproducibility question

2023-04-01 Thread Mark McClure
In the typical FD implementation, you only set local rows, but with FE and
sometimes FV, you also create values that need to be communicated and
summed on other processors.
Makes sense.

Anyway, in this case, I am certain that I am giving the solver bitwise
identical matrices from each process. I am not using a preconditioner; I am
using BCGS with PETSc version 3.13.3.

So then, how can I make sure that I am "using an MPI that follows the
suggestion for implementers about determinism"? I am using MPICH version
3.3a2 and didn't do anything special when installing it. Does that sound OK?
If so, I could upgrade to the latest PETSc, try again, and, if the issue
persists, provide a reproduction scenario.
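
One way I could confirm exactly which MPI library and version my build
actually links (a standalone sketch using the standard MPI-3 query; the
file name is just illustrative) would be:

/* mpiversion.c (illustrative): print the MPI library version string,
   e.g. "MPICH Version: 3.3a2 ...", to confirm what is actually linked. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  char version[MPI_MAX_LIBRARY_VERSION_STRING];
  int  len;

  MPI_Init(&argc, &argv);
  MPI_Get_library_version(version, &len);
  printf("%s\n", version);
  MPI_Finalize();
  return 0;
}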



On Sat, Apr 1, 2023 at 9:53 PM Jed Brown  wrote:

> In the typical FD implementation, you only set local rows, but with FE and
> sometimes FV, you also create values that need to be communicated and
> summed on other processors.
>


Re: [petsc-users] MPI linear solver reproducibility question

2023-04-01 Thread Jed Brown
Mark McClure  writes:

> Thank you, I will try BCGSL.
>
> And good to know that this is worth pursuing, and that it is possible. Step
> 1, I guess, is to upgrade to the latest release of PETSc.
>
> How can I make sure that I am "using an MPI that follows the suggestion for
> implementers about determinism"? I am using MPICH version 3.3a2.
>
> I am pretty sure that I'm assembling the same matrix every time, but I'm
> not sure how it would depend on 'how you do the communication'. Each
> process is doing a series of MatSetValues calls with INSERT_VALUES,
> assembling the matrix by rows. My understanding of this process is that
> it'd be deterministic.

In the typical FD implementation, you only set local rows, but with FE and 
sometimes FV, you also create values that need to be communicated and summed on 
other processors.
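
As an illustration (a sketch only, not code from your application): in an
FE-style loop, each rank adds element contributions that may touch rows
owned by other ranks, and those ADD_VALUES entries are communicated and
summed during assembly.

/* Sketch of FE-style assembly: this element's global indices may include
   rows owned by other ranks; PETSc ships and sums those contributions
   during MatAssemblyBegin/End. "A", "idx", and "Ke" are illustrative. */
PetscInt    idx[3];   /* global indices of this element's dofs, possibly off-process */
PetscScalar Ke[9];    /* 3x3 element matrix, row-major */
/* ... fill idx and Ke for each element ... */
PetscCall(MatSetValues(A, 3, idx, 3, idx, Ke, ADD_VALUES));
/* after all elements on all ranks: */
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));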


Re: [petsc-users] MPI linear solver reproducibility question

2023-04-01 Thread Mark McClure
Thank you, I will try BCGSL.

And good to know that this is worth pursuing, and that it is possible. Step
1, I guess, is to upgrade to the latest release of PETSc.

How can I make sure that I am "using an MPI that follows the suggestion for
implementers about determinism"? I am using MPICH version 3.3a2.

I am pretty sure that I'm assembling the same matrix every time, but I'm
not sure how it would depend on 'how you do the communication'. Each
process is doing a series of MatSetValues calls with INSERT_VALUES,
assembling the matrix by rows. My understanding of this process is that
it'd be deterministic.
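
Concretely, the pattern is something like this sketch (placeholder indices
and values only; each rank touches only rows in its own ownership range):

PetscErrorCode ierr;
PetscInt       rstart, rend, row;
ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
for (row = rstart; row < rend; row++) {
  PetscInt    col = row;   /* placeholder column */
  PetscScalar val = 1.0;   /* placeholder value */
  ierr = MatSetValues(A, 1, &row, 1, &col, &val, INSERT_VALUES);CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);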



On Sat, Apr 1, 2023 at 9:05 PM Jed Brown  wrote:

> If you use unpreconditioned BCGS and ensure that you assemble the same
> matrix (depends how you do the communication for that), I think you'll get
> bitwise reproducible results when using an MPI that follows the suggestion
> for implementers about determinism. Beyond that, it'll depend somewhat on
> the preconditioner.
>
> If you like BCGS, you may want to try BCGSL, which has a longer memory and
> tends to be more robust. But preconditioning is usually critical and the
> place to devote most effort.


Re: [petsc-users] MPI linear solver reproducibility question

2023-04-01 Thread Jed Brown
If you use unpreconditioned BCGS and ensure that you assemble the same matrix 
(depends how you do the communication for that), I think you'll get bitwise 
reproducible results when using an MPI that follows the suggestion for 
implementers about determinism. Beyond that, it'll depend somewhat on the 
preconditioner.

If you like BCGS, you may want to try BCGSL, which has a longer memory and 
tends to be more robust. But preconditioning is usually critical and the place 
to devote most effort.
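
For reference, a sketch of selecting these from code (equivalent to
-ksp_type bcgs -pc_type none, or -ksp_type bcgsl, on the command line;
assumes an existing KSP named "ksp"):

PC pc;
PetscCall(KSPSetType(ksp, KSPBCGS));   /* or KSPBCGSL for the longer-memory variant */
PetscCall(KSPGetPC(ksp, &pc));
PetscCall(PCSetType(pc, PCNONE));      /* run unpreconditioned */
PetscCall(KSPSetFromOptions(ksp));     /* still honor command-line overrides */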

Mark McClure  writes:

> Hello,
>
> I have been a user of PETSc for quite a few years, though I haven't updated
> my version in a few years, so it's possible that my comments below could be
> 'out of date'.
>
> Several years ago, I'd asked you guys about reproducibility. I observed
> that if I gave an identical matrix to the PETSc linear solver, I would get
> a bitwise identical result back when running on one processor, but if I ran
> with MPI, I would see differences in the final significant figures, below
> the convergence criterion, even when rerunning the same exact calculation
> on the same exact machine.
>
> I.e., with repeated tests, it was always converging to the same answer
> 'within convergence tolerance', but was not consistent in the significant
> figures beyond the convergence tolerance.
>
> At the time, the response was that this was unavoidable and related to the
> fact that floating-point arithmetic is not associative, so the timing of
> when processors recombined information (which was random, effectively a
> race condition) caused these differences.
>
> Am I remembering correctly? And, if so, is this still a property of the
> PETSc linear solver with MPI, and is there now any option available to
> resolve it? I would be willing to accept a performance hit in order to get
> guaranteed bitwise consistency, even when running with MPI.
>
> I am using the solver KSPBCGS, without a preconditioner. I chose it because
> several years ago I did testing and found that, on the particular linear
> systems I usually work with, this solver (with no preconditioner) was the
> most robust in terms of consistently converging, and also in terms of
> performance. I also tested a variety of linear solvers outside PETSc
> (including other implementations of BiCGStab) and found that the PETSc BCGS
> was the best performer. Though I'm curious: have there been updates to that
> algorithm in recent years, such that I should consider updating to a newer
> PETSc build and comparing?
>
> Best regards,
> Mark McClure


[petsc-users] MPI linear solver reproducibility question

2023-04-01 Thread Mark McClure
Hello,

I have been a user of PETSc for quite a few years, though I haven't updated
my version in a few years, so it's possible that my comments below could be
'out of date'.

Several years ago, I'd asked you guys about reproducibility. I observed
that if I gave an identical matrix to the PETSc linear solver, I would get
a bitwise identical result back when running on one processor, but if I ran
with MPI, I would see differences in the final significant figures, below
the convergence criterion, even when rerunning the same exact calculation
on the same exact machine.

I.e., with repeated tests, it was always converging to the same answer
'within convergence tolerance', but was not consistent in the significant
figures beyond the convergence tolerance.

At the time, the response was that this was unavoidable and related to the
fact that floating-point arithmetic is not associative, so the timing of
when processors recombined information (which was random, effectively a
race condition) caused these differences.

Am I remembering correctly? And, if so, is this still a property of the
PETSc linear solver with MPI, and is there now any option available to
resolve it? I would be willing to accept a performance hit in order to get
guaranteed bitwise consistency, even when running with MPI.

I am using the solver KSPBCGS, without a preconditioner. I chose it because
several years ago I did testing and found that, on the particular linear
systems I usually work with, this solver (with no preconditioner) was the
most robust in terms of consistently converging, and also in terms of
performance. I also tested a variety of linear solvers outside PETSc
(including other implementations of BiCGStab) and found that the PETSc BCGS
was the best performer. Though I'm curious: have there been updates to that
algorithm in recent years, such that I should consider updating to a newer
PETSc build and comparing?

Best regards,
Mark McClure