On Fri, Jul 30, 2021 at 12:22 PM Sebastian Berg <sebast...@sipsolutions.net>
wrote:

> On Fri, 2021-07-30 at 11:04 -0700, Jerry Morrison wrote:
> > On Tue, Jul 27, 2021 at 4:55 PM Sebastian Berg <
> > sebast...@sipsolutions.net>
> > wrote:
> >
> > > Hi all,
> > >
> > > there is a proposal to add some Intel-specific fast math routines to
> > > NumPy:
> > >
> > >     https://github.com/numpy/numpy/pull/19478
> > >
> > > Part of numerical algorithms is that there is always a speed vs.
> > > precision trade-off: giving a more precise result is slower.
> > >
>
> <snip>
>


> > "Close enough" depends on the application but non-linear models can
> > get the "butterfly effect" where the results diverge if they aren't
> > identical.
>
>
> Right, so my hope was to gauge what the general expectation is.  I take
> it you expect a high accuracy.
>
> The error of the computations themselves seems low at first sight, but
> of course errors can explode quickly in non-linear settings...
> (In the chaotic systems I worked with, the shadowing theorem would
> usually alleviate such worries. And testing the integration would be
> more important.  But I am sure for certain questions things may be far
> more tricky.)
>

I'll put forth an expectation that after installing a specific set of
libraries, the floating point results would be identical across platforms
and into the future. Ideally developers could install library updates (for
hardware compatibility, security fixes, or other reasons) and still get
identical results.
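
To illustrate why "identical" rather than "close" matters for the
non-linear models mentioned above, here is a minimal sketch. It is not
taken from the PR or from NumPy; the logistic map, the starting value 0.3,
and the 1e-15 perturbation are assumptions chosen only for illustration.
The perturbation stands in for a last-digits difference such as one math
library might produce relative to another.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return all iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ only in the last few bits of a double.
a = logistic_trajectory(0.3, 200)
b = logistic_trajectory(0.3 + 1e-15, 200)

# Absolute gap between the two trajectories at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]

print(f"initial gap: {gaps[0]:.1e}")
print(f"largest gap over 200 steps: {max(gaps):.2f}")
```

With r = 4 the map is chaotic, so the gap grows roughly geometrically
until the two runs are unrelated. A test that compares such outputs with a
tolerance would pass early in the run and fail later.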

That expectation is for reproducibility, not high accuracy. So it'd be fine
to install different libraries [or maybe use those pip package options in
brackets, whatever they do?] to trade accuracy for speed. Could any
particular choice of accuracy still provide reproducible results across
platforms and time?
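
One small data point on that question, as a hedged sketch rather than a
statement about how NumPy works: floating-point addition is not
associative, so anything that changes the order of operations can change
the last bits of a result. Different SIMD widths are one plausible cause,
though that is my assumption here. A correctly rounded routine does not
have that freedom. The example uses only the Python standard library, with
`math.fsum` as the correctly rounded sum, and the three input values are
arbitrary.

```python
import itertools
import math

values = [0.1, 0.2, 0.3]

# Same numbers, two groupings: the rounded results differ in the last bit.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])
print(left_to_right == right_to_left)

# math.fsum tracks exact partial sums and rounds once at the end,
# so every ordering of the inputs gives the same double.
exact_sums = {math.fsum(p) for p in itertools.permutations(values)}
print(len(exact_sums))
```

So at least for this one operation, the highest-accuracy choice is also the
reproducible one, at a cost in speed. Whether the same holds for the
routines in the PR is a separate question that this sketch does not answer.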



> > For a certain class of scientific programming applications,
> > reproducibility is paramount.
> >
> > Development teams may use a variety of development laptops,
> > workstations, scientific computing clusters, and cloud computing
> > platforms. If the tests pass on your machine but fail in CI, you have a
> > debugging problem.
> >
> > If your published scientific article links to source code that
> > replicates your computation, scientists will expect to be able to run
> > that code, now or in a couple decades, and replicate the same outputs.
> > They'll be using different OS releases and maybe different CPU +
> > accelerator architectures.
> >
> > Reproducible Science is good. Replicated Science is better.
> > <http://rescience.github.io/>
> >
> > Clearly there are other applications where it's easy to trade
> > reproducibility and some precision for speed.
>
>
> Agreed, although there are so many factors, often out of our control,
> that I am not sure that true replicability is achievable without
> containers :(.
>
> It would be amazing if NumPy could have a "replicable" mode, but I am
> not sure how that could be done, or if the "ground work" in the math
> and linear algebra libraries even exists.
>
>
> However, even if it is practically impossible to guarantee that results
> are replicable, there is an argument for improving reproducibility and
> replicability, e.g. by choosing the high-accuracy version here.
>

Yes! Let's at least have reproducibility in mind and work on improving
it, e.g. by removing failure modes.
(Ditto for security :-)
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion