Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-27 Thread Arnaldo Russo
Hi there,

Is this the implementation of fma?
https://github.com/nschloe/pyfma

Cheers,
Arnaldo

   .
   |\
 _/]_\_
~"~^~

Arnaldo D'Amaral Pereira Granja Russo
c i c l o t u x . o r g

2018-02-25 8:34 GMT-03:00 Nils Becker :

> Hey,
>
>
>> Anyone that understands FP better than I do:
>>
>> In the above code, you are multiplying the step by an integer -- is there
>> any precision loss when you do that??
>>
>>
> Elementary operations (add, sub, mul, div) are demanded to be correctly
> rounded (cr) by IEEE, i.e. accurate within +/- 0.5 ulp. Consequently, a cr
> multiplication followed by a cr addition will be accurate within +/-1 ulp.
> This is also true if the first multiplicand is an integer.
> Using FMA will reduce this to +/- 0.5 ulp. This increase in accuracy of
> the grid calculation should not be relevant - but it also does not hurt.
>
> Still I would suggest adding the FMA operation to numpy, e.g. np.fma(a, b,
> c). There are several places in numpy that could benefit from the increased
> accuracy, e.g. evaluation of polynomials using Horner's method. In cases
> like this due to iteration and consequent error propagation the accuracy
> benefit of using FMA can be far larger. There may also be a performance
> benefit on platforms that implement FMA in hardware (although I am not sure
> about that).
>
> Cheers
> Nils
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-25 Thread Nils Becker
Hey,


> Anyone that understands FP better than I do:
>
> In the above code, you are multiplying the step by an integer -- is there
> any precision loss when you do that??
>
>
Elementary operations (add, sub, mul, div) are demanded to be correctly
rounded (cr) by IEEE, i.e. accurate within +/- 0.5 ulp. Consequently, a cr
multiplication followed by a cr addition will be accurate within +/-1 ulp.
This is also true if the first multiplicand is an integer.
Using FMA will reduce this to +/- 0.5 ulp. This increase in accuracy of the
grid calculation should not be relevant - but it also does not hurt.

Still I would suggest adding the FMA operation to numpy, e.g. np.fma(a, b,
c). There are several places in numpy that could benefit from the increased
accuracy, e.g. evaluation of polynomials using Horner's method. In cases
like this due to iteration and consequent error propagation the accuracy
benefit of using FMA can be far larger. There may also be a performance
benefit on platforms that implement FMA in hardware (although I am not sure
about that).
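For instance, a Horner evaluation is just a chain of multiply-adds, so each step
could be one fma. A minimal sketch, emulating a float32 fma by doing each
multiply-add in float64 (the product of two float32 values is exact in float64,
so this rounds once per step, up to rare double-rounding cases); the function
name and coefficients are made up, and a real np.fma, or the pyfma package
mentioned elsewhere in this thread, would do the same in one correctly rounded
operation:

```
import numpy as np

def horner_fma32(coeffs, x):
    # Horner's rule: acc = acc*x + c at every step, highest-order
    # coefficient first, with the multiply-add emulated in float64.
    acc = np.float32(0.0)
    for c in coeffs:
        acc = np.float32(np.float64(acc) * np.float64(x) + np.float64(c))
    return acc

coeffs = np.float32([1.0, -3.2, 0.5, 2.0])   # x**3 - 3.2*x**2 + 0.5*x + 2
print(horner_fma32(coeffs, np.float32(0.37)))
```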

Cheers
Nils
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-23 Thread Chris Barker
On Fri, Feb 9, 2018 at 1:16 PM, Matthew Harrigan  wrote:

> I apologize if I'm missing something basic, but why are floats being
> accumulated in the first place?  Can't arange and linspace operations with
> floats be done internally similar to `start + np.arange(num_steps) *
> step_size`?  I.e. always accumulate (really increment) integers to limit
> errors.
>

I haven't looked at the arange() code, but linspace does not accumulate
floats -- which is why it's already almost as good as it can be.
As regards a fused multiply-add, it does have to do a single
multiply-add operation for each value (as per your example code), so we may
be able to save an ULP there.
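A rough sketch of that non-accumulating approach (this is only an
approximation of what linspace actually does internally):

```
import numpy as np

def linspace_sketch(start, stop, num):
    step = (stop - start) / (num - 1)
    y = start + np.arange(num) * step   # one multiply and one add per point
    y[-1] = stop                        # pin the end point exactly
    return y

print(np.allclose(linspace_sketch(-5.0, 5.0, 101), np.linspace(-5.0, 5.0, 101)))
```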

The problem with arange() is that the definition is poorly specified:

start + (step_num * step) while value < stop.

Even without fp issues, it's weird if (stop - start) / step is not an
integer -- the "final" step will not be the same as the rest.

Say you want a "grid" with fully integer values. if the step is just right,
all is easy:

In [*72*]: np.arange(0, 11, 2)

Out[*72*]: array([ 0,  2,  4,  6,  8, 10])

(this is assuming you want 10 as the end point.)

but then:

In [*73*]: np.arange(0, 11, 3)

Out[*73*]: array([0, 3, 6, 9])

but I wanted 10 as an end point. so:

In [*74*]: np.arange(0, 13, 3)

Out[*74*]: array([ 0,  3,  6,  9, 12])

hmm, that's not right either. Of course it's not -- you can't get 10 as an
end point, 'cause it's not a multiple of the step. With integers, you CAN
require that the end point be a multiple of the step, but with fp, you
can't require that it be EXACTLY a multiple, because either the end point
or the step may not be exactly representable, even if you do the math with
no loss of precision. And now you are stuck with the user figuring out for
themselves whether the closest fp representation of the end point is
slightly larger or smaller than the real value, so the < check will work.
NOT good.

This is why arange is simply not the tool to use.

Making a grid, you usually want to specify the end points and the number of
steps which is almost what linspace does. Or, _maybe_ you want to specify
the step and the number of steps, and accept that the end point may not be
exactly what you "expect". There is no built-in function for this in numpy.
maybe there should be, but it's pretty easy to write, as you show above.
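Something along these lines, say (the name and numbers are made up):

```
import numpy as np

def grid_from_step(start, step, num_steps):
    # start/step/number-of-steps grid; the end point is whatever
    # start + num_steps*step happens to round to.
    return start + np.arange(num_steps + 1) * step

print(grid_from_step(0.0, 3.0, 4))   # [ 0.  3.  6.  9. 12.]
```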

Anyone that understands FP better than I do:

In the above code, you are multiplying the step by an integer -- is there
any precision loss when you do that??

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-22 Thread Chris Barker
On Thu, Feb 22, 2018 at 11:57 AM, Sebastian Berg  wrote:

> > First, you are right...Decimal is not the right module for this. I
> > think instead I should use the 'fractions' module for loading grid
> > spec information from strings (command-line, configs, etc). The
> > tricky part is getting the yaml reader to use it instead of
> > converting to a float under the hood.
>

I'm not sure fractions is any better (or necessary, anyway) -- in the end, you
need floats, so the inherent limitations of floats aren't the problem. In
your original use-case, you wanted a 32 bit float grid in the end, so doing
the calculations in 64 bit float and then downcasting is as good as you're
going to get, and easy and fast. And I suppose you could use 128 bit float
if you want to get to 64 bit in the end -- not as easy, and python itself
doesn't have it.
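i.e. something like this (the numbers here are made up):

```
import numpy as np

start, step, n = -115.0, 0.01, 10_000
grid64 = start + np.arange(n) * step      # do the math in 64 bit float
grid32 = grid64.astype(np.float32)        # downcast once at the very end
```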

>  The
> tricky part is getting the yaml reader to use it instead of
> converting to a float under the hood.

64 bit floats support about 15 decimal digits -- are your string-based
sources providing more than that?? if not, then the 64 bit float version is
as good as it's going to get.

> Second, what has been pointed out about the implementation of arange
> > actually helps to explain some oddities I have encountered. In some
> > situations, I have found that it was better for me to produce the
> > reversed sequence, and then reverse that array back and use it.
>

interesting -- though I'd still say "don't use arange" is the "correct"
answer.


> > Third, it would be nice to do what we can to improve arange()'s
> > results. Would we be ok with a PR that uses fma() if it is available,
> > but then falls back on a regular multiply and add if it isn't
> > available, or are we going to need to implement it ourselves for
> > consistency?
>

I would think calling fma() if supported would be fine -- if there is an
easy macro to check if it's there. I don't know if numpy has a policy about
this sort of thing, but I'm pretty sure everywhere else, the final details
of computation fall back to the hardware/compiler/library (i.e. Intel uses
extended precision fp, other platforms don't, etc) so I can't see that
having a slightly more accurate computation in arange on some platforms and
not others would cause a problem. If any of the tests are checking to that
level of accuracy, they should be fixed :-)
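In Python terms the proposed fallback is basically this (math.fma only exists
on Python 3.13+, so the hasattr check here stands in for the C-level macro
test):

```
import math

def fused_multiply_add(a, b, c):
    # use a correctly rounded fma when the platform provides one,
    # otherwise fall back to a plain multiply and add
    if hasattr(math, "fma"):
        return math.fma(a, b, c)
    return a * b + c
```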

  2. It sounds *not* like a fix, but rather a
>  "make it slightly less bad, but it is still awful"
>

exactly -- better accuracy is a good thing, but it's not the source of the
problem here -- the source of the problem is inherent to FP, and/or poorly
specified goal. having arange or linspace lose a couple fewer ULPs isn't
going to change anything.


> Using fma inside linspace might possibly make linspace a bit more
> exact, and would be a good thing, though I am not sure we have a
> policy yet for something that is only used sometimes,


see above -- nor do I, but it seems like a fine idea to me.


> It also would be nice to add stable summation to numpy in general (in
> whatever way), which maybe is half related but on nobody's specific
> todo list.


I recall a conversation on this list a (long) while back about compensated
summation (Kahan summation) -- I guess nothing ever came of it?
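For reference, compensated summation is just this (a plain-Python sketch, not
anything that exists in numpy today):

```
def kahan_sum(values):
    # compensated (Kahan) summation: carry the rounding error of each
    # addition forward instead of discarding it
    total = 0.0
    comp = 0.0
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

print(kahan_sum([0.1] * 10**6), sum([0.1] * 10**6))
```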

> Lastly, there definitely needs to be a better tool for grid making.
> > The problem appears easy at first, but it is fraught with many
> > pitfalls and subtle issues. It is easy to say, "always use
> > linspace()", but if the user doesn't have the number of pixels, they
> > will need to calculate that using --- gasp! -- floating point
> > numbers, which could result in the wrong answer.


agreed -- this tends to be an inherently over-specified problem:

min_value
max_value
spacing
number_of_grid_spaces

That is four values, and only three independent ones.

arange() looks like it uses: min_value, max_value, spacing -- but it
doesn't really (see previous discussion) so not the right tool for anything.

linspace() uses:  min_value, max_value, (number_of_grid_spaces + 1), which
is about as good as you can get (except for that annoying 1).

But what if you are given min_value, spacing, number_of_grid_spaces?

Maybe we need a function for that?? (which I think would simply be:

min_value + np.arange(number_of_grid_spaces + 1) * spacing

Which is why we probably don't need a function :-) (note that that's an
error of only one multiplication and one addition per grid point)

Or maybe a way to take all four values, and return a "best fit" grid. The
problem with that is that it's over-specified, and it may not be only
fp error that makes it not fit. What should a code do???

So Ben:

What is the problem you are trying to solve?  -- I'm still confused. What
information do you have to define the grid? Maybe all we need are docs for
how to best compute a grid with given specifications? And point to them in
the arange() and linspace() docstrings.
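Such docs might boil down to a recipe like this (the numbers are made up):

```
import numpy as np

# 1) start / stop / number of points  -> just use linspace
g1 = np.linspace(-115.0, -110.0, 501)

# 2) start / step / number of steps   -> one multiply-add per point
g2 = -115.0 + np.arange(501) * 0.01

# 3) start / stop / step (over-specified) -> round to an integer number of
#    spaces, then fall back to case 1 so the end points are honored exactly
n = int(round((-110.0 - -115.0) / 0.01))
g3 = np.linspace(-115.0, -110.0, n + 1)
```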

-CHB

I once wanted to add a "step" argument to linspace, but didn't in the
> end, largely 

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-22 Thread Sebastian Berg
On Thu, 2018-02-22 at 14:33 -0500, Benjamin Root wrote:
> Sorry, I have been distracted with xarray improvements the past
> couple of weeks.
> 
> Some thoughts on what has been discussed:
> 
> First, you are right...Decimal is not the right module for this. I
> think instead I should use the 'fractions' module for loading grid
> spec information from strings (command-line, configs, etc). The
> tricky part is getting the yaml reader to use it instead of
> converting to a float under the hood.
> 
> Second, what has been pointed out about the implementation of arange
> actually helps to explain some oddities I have encountered. In some
> situations, I have found that it was better for me to produce the
> reversed sequence, and then reverse that array back and use it.
> 
> Third, it would be nice to do what we can to improve arange()'s
> results. Would we be ok with a PR that uses fma() if it is available,
> but then falls back on a regular multiply and add if it isn't
> available, or are we going to need to implement it ourselves for
> consistency?
> 

I am not sure I like the idea:
  1. It sounds like it might break code
  2. It sounds *not* like a fix, but rather a
 "make it slightly less bad, but it is still awful"

Using fma inside linspace might possibly make linspace a bit more
exact, and would be a good thing, though I am not sure we have a
policy yet for something that is only used sometimes, nor am I sure it
actually helps.

It also would be nice to add stable summation to numpy in general (in
whatever way), which maybe is half related but on nobody's specific
todo list.

> 
> Lastly, there definitely needs to be a better tool for grid making.
> The problem appears easy at first, but it is fraught with many
> pitfalls and subtle issues. It is easy to say, "always use
> linspace()", but if the user doesn't have the number of pixels, they
> will need to calculate that using --- gasp! -- floating point
> numbers, which could result in the wrong answer. Or maybe their
> first/last positions were determined by some other calculation, and
> so the resulting grid does not have the expected spacing. Another
> problem that I run into is starting from two different sized grids
> and padding them both to be the same spec -- and getting that to
> match what would come about if I had generated the grid from scratch.
> 

Maybe you are right, but right now I have no clue what that tool would
do :). Whether we should add it to numpy likely depends on what exactly it
does and how complex it is.
I once wanted to add a "step" argument to linspace, but didn't in the
end, largely because it basically enforced in a very convoluted way
that the step fit exactly to a number of steps (up to floating point
precision) and nobody was quite sure it was a good idea, since it would
just be useful for a little convenience when you do not want to
calculate the steps.

Best,

Sebastian


> 
> Getting these things right is hard. I am not even certain that my
> existing code for doing this is even right. But, what I do know is that
> until we build such a tool, users will continue to incorrectly use
> arange() and linspace(), and waste time trying to re-invent the wheel
> badly, assuming they even notice their mistakes in the first place!
> So, should such a tool go into numpy, given how fundamental it is to
> generate a sequence of floating point numbers, or should we try to
> put it into a package like rasterio or xarray?
> 
> Cheers!
> Ben Root
> 
> 
> 
> On Thu, Feb 22, 2018 at 2:02 PM, Chris Barker 
> wrote:
> > @Ben: Have you found a solution to your problem? Are there things
> > we could do in numpy to make it better?
> > 
> > -CHB
> > 
> > 
> > On Mon, Feb 12, 2018 at 9:33 AM, Chris Barker  wrote:
> > > I think it's all been said, but a few comments:
> > > 
> > > On Sun, Feb 11, 2018 at 2:19 PM, Nils Becker  wrote:
> > > > Generating equidistantly spaced grids is simply not always
> > > > possible.
> > > > 
> > > 
> > > exactly -- and linspace gives pretty much the best possible
> > > result, guaranteeing that the start and end points are exact, and
> > > the spacing is within an ULP or two (maybe we could make that
> > > within 1 ULP always, but not sure that's worth it).
> > >  
> > > > The reason is that the absolute spacing of the possible
> > > > floating point numbers depends on their magnitude [1].
> > > > 
> > > 
> > > Also that the exact spacing may not be exactly representable in
> > > FP -- so you have to have at least one space that's a bit off to
> > > get the end points right (or have the endpoints not exact).
> > >  
> > > > If you - for some reason - want the same grid spacing
> > > > everywhere you may choose an appropriate new spacing.
> > > > 
> > > 
> > > well, yeah, but usually you are trying to fit to some other
> > > constraint. I'm still confused as to where these couple of ULPs
> > > actually cause problems, unless you are doing inappropriate FP
> > > comparisons elsewhere.
> > > 
> > > > Curiou

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-22 Thread Benjamin Root
Sorry, I have been distracted with xarray improvements the past couple of
weeks.

Some thoughts on what has been discussed:

First, you are right...Decimal is not the right module for this. I think
instead I should use the 'fractions' module for loading grid spec
information from strings (command-line, configs, etc). The tricky part is
getting the yaml reader to use it instead of converting to a float under
the hood.

Second, what has been pointed out about the implementation of arange
actually helps to explain some oddities I have encountered. In some
situations, I have found that it was better for me to produce the reversed
sequence, and then reverse that array back and use it.

Third, it would be nice to do what we can to improve arange()'s results.
Would we be ok with a PR that uses fma() if it is available, but then falls
back on a regular multiply and add if it isn't available, or are we going
to need to implement it ourselves for consistency?

Lastly, there definitely needs to be a better tool for grid making. The
problem appears easy at first, but it is fraught with many pitfalls and
subtle issues. It is easy to say, "always use linspace()", but if the user
doesn't have the number of pixels, they will need to calculate that using
--- gasp! -- floating point numbers, which could result in the wrong
answer. Or maybe their first/last positions were determined by some other
calculation, and so the resulting grid does not have the expected spacing.
Another problem that I run into is starting from two different sized grids
and padding them both to be the same spec -- and getting that to match what
would come about if I had generated the grid from scratch.
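One way to make a padded grid match the from-scratch grid is to rebuild every
point from its integer index relative to the original origin, rather than
appending onto the existing array -- a sketch with made-up numbers:

```
import numpy as np

x0, step, n_small, n_big = -115.0, 0.01, 300, 500

small = x0 + np.arange(n_small) * step        # original grid
scratch = x0 + np.arange(n_big) * step        # same spec, built from scratch

# padding by accumulating from the last point tends to drift away:
appended = small[-1] + np.arange(1, n_big - n_small + 1) * step
accum = np.concatenate([small, appended])

# padding by recomputing from the integer index reproduces `scratch` exactly:
indexed = x0 + np.arange(n_big) * step

print(np.max(np.abs(accum - scratch)))   # small but (usually) nonzero
print(np.array_equal(indexed, scratch))  # True
```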

Getting these things right is hard. I am not even certain that my existing
code for doing this is even right. But, what I do know is that until we build
such a tool, users will continue to incorrectly use arange() and
linspace(), and waste time trying to re-invent the wheel badly, assuming
they even notice their mistakes in the first place! So, should such a tool
go into numpy, given how fundamental it is to generate a sequence of
floating point numbers, or should we try to put it into a package like
rasterio or xarray?

Cheers!
Ben Root



On Thu, Feb 22, 2018 at 2:02 PM, Chris Barker  wrote:

> @Ben: Have you found a solution to your problem? Are there things we could
> do in numpy to make it better?
>
> -CHB
>
>
> On Mon, Feb 12, 2018 at 9:33 AM, Chris Barker 
> wrote:
>
>> I think it's all been said, but a few comments:
>>
>> On Sun, Feb 11, 2018 at 2:19 PM, Nils Becker 
>> wrote:
>>
>>> Generating equidistantly spaced grids is simply not always possible.
>>>
>>
>> exactly -- and linspace gives pretty much the best possible result,
>> guaranteeing that the start and end points are exact, and the spacing is
>> within an ULP or two (maybe we could make that within 1 ULP always, but not
>> sure that's worth it).
>>
>>
>>> The reason is that the absolute spacing of the possible floating point
>>> numbers depends on their magnitude [1].
>>>
>>
>> Also that the exact spacing may not be exactly representable in FP -- so
>> you have to have at least one space that's a bit off to get the end points
>> right (or have the endpoints not exact).
>>
>>
>>> If you - for some reason - want the same grid spacing everywhere you may
>>> choose an appropriate new spacing.
>>>
>>
>> well, yeah, but usually you are trying to fit to some other constraint.
>> I'm still confused as to where these couple of ULPs actually cause
>> problems, unless you are doing inappropriate FP comparisons elsewhere.
>>
>> Curiously, either by design or accident, arange() seems to do something
>>> similar as was mentioned by Eric. It creates a new grid spacing by adding
>>> and subtracting the starting point of the grid. This often has similar
>>> effect as adding and subtracting N*dx (e.g. if the grid is symmetric around
>>> 0.0). Consequently, arange() seems to trade keeping the grid spacing
>>> constant for a larger error in the grid size and consequently in the end
>>> point.
>>>
>>
>> interesting -- but it actually makes sense -- that is the definition of
>> arange(), borrowed from range(), which was designed for integers, and, in
>> fact, pretty much mirrored the classic C index for loop:
>>
>>
>> for (int i=0; i<N; i++)
>>     ...
>>
>>
>> or in python:
>>
>> i = start
>> while i < stop:
>> i += step
>>
>> The problem here is that termination criteria -- i < stop -- that is the
>> definition of the function, and works just fine for integers (where it came
>> from), but with FP, even with no error accumulation, stop may not be
>> exactly representable, so you could end up with a value for your last item
>> that is about (stop-step), or you could end up with a value that is a
>> couple ULPs less than step -- essentially including the end point when you
>> weren't supposed to.
>>
>> The truth is, making a floating point range() was simply a bad idea to
>> begin with -- it's not the way to define a r

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-22 Thread Chris Barker
@Ben: Have you found a solution to your problem? Are there things we could
do in numpy to make it better?

-CHB


On Mon, Feb 12, 2018 at 9:33 AM, Chris Barker  wrote:

> I think it's all been said, but a few comments:
>
> On Sun, Feb 11, 2018 at 2:19 PM, Nils Becker 
> wrote:
>
>> Generating equidistantly spaced grids is simply not always possible.
>>
>
> exactly -- and linspace gives pretty much the best possible result,
> guaranteeing that the start and end points are exact, and the spacing is
> within an ULP or two (maybe we could make that within 1 ULP always, but not
> sure that's worth it).
>
>
>> The reason is that the absolute spacing of the possible floating point
>> numbers depends on their magnitude [1].
>>
>
> Also that the exact spacing may not be exactly representable in FP -- so
> you have to have at least one space that's a bit off to get the end points
> right (or have the endpoints not exact).
>
>
>> If you - for some reason - want the same grid spacing everywhere you may
>> choose an appropriate new spacing.
>>
>
> well, yeah, but usually you are trying to fit to some other constraint.
> I'm still confused as to where these couple of ULPs actually cause
> problems, unless you are doing inappropriate FP comparisons elsewhere.
>
> Curiously, either by design or accident, arange() seems to do something
>> similar as was mentioned by Eric. It creates a new grid spacing by adding
>> and subtracting the starting point of the grid. This often has similar
>> effect as adding and subtracting N*dx (e.g. if the grid is symmetric around
>> 0.0). Consequently, arange() seems to trade keeping the grid spacing
>> constant for a larger error in the grid size and consequently in the end
>> point.
>>
>
> interesting -- but it actually makes sense -- that is the definition of
> arange(), borrowed from range(), which was designed for integers, and, in
> fact, pretty much mirrored the classic C index for loop:
>
>
> for (int i=0; i<N; i++)
>     ...
>
>
> or in python:
>
> i = start
> while i < stop:
> i += step
>
> The problem here is that termination criteria -- i < stop -- that is the
> definition of the function, and works just fine for integers (where it came
> from), but with FP, even with no error accumulation, stop may not be
> exactly representable, so you could end up with a value for your last item
> that is about (stop-step), or you could end up with a value that is a
> couple ULPs less than step -- essentially including the end point when you
> weren't supposed to.
>
> The truth is, making a floating point range() was simply a bad idea to
> begin with -- it's not the way to define a range of numbers in floating
> point. Which is why the docs now say "When using a non-integer step, such
> as 0.1, the results will often not
> be consistent.  It is better to use ``linspace`` for these cases."
>
> Ben wants a way to define grids in a consistent way -- makes sense. And
> yes, sometimes, the original source you are trying to match (like GDAL)
> provides a starting point and step. But with FP, that is simply
> problematic. If:
>
> start + step*num_steps != stop
>
> exactly in floating point, then you'll need to do the math one way or
> another to get what you want -- and I'm not sure anyone but the user knows
> what they want -- do you want step to be as exact as possible, or do you
> want stop to be as exact as possible?
>
> All that being said -- if arange() could be made a tiny bit more accurate
> with fma or other numerical technique, why not? it won't solve the
> problem, but if someone writes and tests the code (and it does not require
> compiler or hardware features that aren't supported everywhere numpy
> compiles), then sure. (Same for linspace, though I'm not sure it's possible)
>
> There is one other option: a new function (or option) that makes a grid
> from a specification of: start, step, num_points. If that is really a
> common use case (that is, you don't care exactly what the end-point is),
> then it might be handy to have it as a utility.
>
> We could also have an arange-like function that, rather than < stop, would
> do "close to" stop. Someone that understands FP better than I might be able
> to compute what the expected error might be, and find the closest end point
> within that error. But I think that's a bad specification -- (stop - start)
> / step may be nowhere near an integer -- then what is the function supposed
> to do??
>
>
> BTW: I kind of wish that linspace specified the number of steps, rather
> than the number of points, that is (num_points - 1), that would save a
> little bit of logical thinking. So if something new is made, let's keep
> that in mind.
>
>
>
>> 1. Comparison to calculations with decimal can be difficult as not all
>> simple decimal step sizes are exactly representable as
>>
> finite floating point numbers.
>>
>
> yeah, this is what I mean by inappropriate use of Decimal -- decimal is
> not inherently "more accurate" than fp -- is just can represent _decimal_
> 

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-12 Thread Chris Barker
I think it's all been said, but a few comments:

On Sun, Feb 11, 2018 at 2:19 PM, Nils Becker  wrote:

> Generating equidistantly spaced grids is simply not always possible.
>

exactly -- and linspace gives pretty much the best possible result,
guaranteeing that the start and end points are exact, and the spacing is
within an ULP or two (maybe we could make that within 1 ULP always, but not
sure that's worth it).


> The reason is that the absolute spacing of the possible floating point
> numbers depends on their magnitude [1].
>

Also that the exact spacing may not be exactly representable in FP -- so
you have to have at least one space that's a bit off to get the end points
right (or have the endpoints not exact).


> If you - for some reason - want the same grid spacing everywhere you may
> choose an appropriate new spacing.
>

well, yeah, but usually you are trying to fit to some other constraint. I'm
still confused as to where these couple of ULPs actually cause problems,
unless you are doing inappropriate FP comparisons elsewhere.

Curiously, either by design or accident, arange() seems to do something
> similar as was mentioned by Eric. It creates a new grid spacing by adding
> and subtracting the starting point of the grid. This often has similar
> effect as adding and subtracting N*dx (e.g. if the grid is symmetric around
> 0.0). Consequently, arange() seems to trade keeping the grid spacing
> constant for a larger error in the grid size and consequently in the end
> point.
>

interesting -- but it actually makes sense -- that is the definition of
arange(), borrowed from range(), which was designed for integers, and, in
fact, pretty much mirrored the classic C index for loop:


for (int i=0; i<N; i++)
    ...


or in python:

i = start
while i < stop:
    i += step

The problem here is that termination criteria -- i < stop -- that is the
definition of the function, and works just fine for integers (where it came
from), but with FP, even with no error accumulation, stop may not be
exactly representable, so you could end up with a value for your last item
that is about (stop-step), or you could end up with a value that is a
couple ULPs less than step -- essentially including the end point when you
weren't supposed to.

The truth is, making a floating point range() was simply a bad idea to
begin with -- it's not the way to define a range of numbers in floating
point. Which is why the docs now say "When using a non-integer step, such
as 0.1, the results will often not be consistent.  It is better to use
``linspace`` for these cases."

Ben wants a way to define grids in a consistent way -- makes sense. And
yes, sometimes, the original source you are trying to match (like GDAL)
provides a starting point and step. But with FP, that is simply
problematic. If:

start + step*num_steps != stop

exactly in floating point, then you'll need to do the math one way or
another to get what you want -- and I'm not sure anyone but the user knows
what they want -- do you want step to be as exact as possible, or do you
want stop to be as exact as possible?

All that being said -- if arange() could be made a tiny bit more accurate
with fma or other numerical technique, why not? it won't solve the
problem, but if someone writes and tests the code (and it does not require
compiler or hardware features that aren't supported everywhere numpy
compiles), then sure. (Same for linspace, though I'm not sure it's possible)

There is one other option: a new function (or option) that makes a grid
from a specification of: start, step, num_points. If that is really a
common use case (that is, you don't care exactly what the end-point is),
then it might be handy to have it as a utility.

We could also have an arange-like function that, rather than < stop, would
do "close to" stop. Someone that understands FP better than I might be able
to compute what the expected error might be, and find the closest end point
within that error. But I think that's a bad specification -- (stop - start)
/ step may be nowhere near an integer -- then what is the function supposed
to do??

BTW: I kind of wish that linspace specified the number of steps, rather
than the number of points, that is (num_points - 1), that would save a
little bit of logical thinking. So if something new is made, let's keep
that in mind.

> 1. Comparison to calculations with decimal can be difficult as not all
> simple decimal step sizes are exactly representable as
>
finite floating point numbers.
>

yeah, this is what I mean by inappropriate use of Decimal -- decimal is not
inherently "more accurate" than fp -- is just can represent _decimal_
numbers exactly, which we are all use to -- we want  1 / 10 to be exact,
but dont mind that 1 / 3 isn't.

Decimal also provided variable precision -- so it can be handy for that. I
kinda wish Python had an arbitrary precision binary floating point built
in...
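The closest thing available today is probably mpmath (the package used
further down this thread), which does provide arbitrary-precision binary
floats:

```
from mpmath import mp, mpf

mp.prec = 100                  # 100 bits of binary precision
step = mpf('0.1')              # 0.1 correctly rounded to 100 bits
print(step)
print(sum(step for _ in range(10)) - 1)   # tiny, but still not exactly zero
```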

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-11 Thread Nils Becker
A slightly longer comment, although I have to admit that it partially misses
the discussion here. Still, I hope it provides some background to your
question.

Generating equidistantly spaced grids is simply not always possible. The
reason is that the absolute spacing of the possible floating point numbers
depends on their magnitude [1]. The true, equidistant grid may not be
representable in the precision that is used. Consequently, the spacing
between consecutive points in a grid created with linspace can vary:

x = np.linspace(-5.0, 5.0, 101) # dx = 0.1
np.min(np.diff(x))
> 0.0991268851
np.max(np.diff(x))
> 0.100582077

This does not necessarily mean that the calculation of the grid points is
off by more than one ulp. It simply shows that the absolute value of one
ulp changes over the extent of the grid.
It follows that even if the grid is calculated with higher precision (e.g.
fma operations) or even arbitrary precision, we will hit a fundamental
boundary: the finite precision floating point numbers closest to the "true"
grid simply are not equidistantly spaced.
My estimate is that the grid spacing is at most off by ~N ulp. This should
be far too small to have any consequences in calculations based on the grid
points themselves.
However, if you write code that reconstructs the grid spacing from
consecutive grid points, e.g. for scaling the value of an integral, you may
notice it. I did when writing a unit test for Fourier transforms where I
compared to analytical results. Take home message: use dx = (x[-1] -
x[0])/(N-1) instead of dx = x[1] - x[0].
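The difference is easy to see:

```
import numpy as np

x = np.linspace(-5.0, 5.0, 101)
dx_local = x[1] - x[0]                        # picks up the local wobble
dx_global = (x[-1] - x[0]) / (len(x) - 1)     # averages it out over the grid
print(dx_local, dx_global)
```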

If you - for some reason - want the same grid spacing everywhere you may
choose an appropriate new spacing. What generally seems to work (but not
always) is one that fits the floating point spacing at the largest number
in the grid. In other words add and subtract the largest number in your
grid to the grid spacing:

dx = 0.1
N = 101
x0 = -5.0

# "round" dx, assuming that the maximal magnitude of the grid is ~N*dx
dx = (dx + N * dx) - (N * dx)
> 0.100582077
x = x0 + np.arange(N) * dx
np.max(np.diff(x)) - np.min(np.diff(x))
> 0.0

Curiously, either by design or accident, arange() seems to do something
similar as was mentioned by Eric. It creates a new grid spacing by adding
and subtracting the starting point of the grid. This often has similar
effect as adding and subtracting N*dx (e.g. if the grid is symmetric around
0.0). Consequently, arange() seems to trade keeping the grid spacing
constant for a larger error in the grid size and consequently in the end
point.

The above discussion somewhat ignored additional errors in the calculation
of the grid points. If every grid point is the result of one multiplication
and one addition, they should stay within one ulp (for x[i] != 0.0). If the
grid is calculated by accumulation they may be larger.
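A quick way to see the difference between the two constructions (made-up
numbers):

```
import numpy as np

N, x0, dx = 10_000, 0.0, 0.1

direct = x0 + np.arange(N) * dx    # one multiplication + one addition per point

accum = np.empty(N)                # repeated accumulation of dx
accum[0] = x0
for i in range(1, N):
    accum[i] = accum[i - 1] + dx

print(np.max(np.abs(direct - accum)))   # typically well above one ulp of the end value
```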

If you are concerned about how arange() or linspace() calculates the grid,
I would recommend to simply use

x = x0 + np.arange(N) * dx

which will yield an accurate representation of the grid you most probably
want (within 1 ulp for x[i] != 0.0). If you do not have x0, N, and dx, try
to infer them from other information on the grid.

To sum this all up: Even if your grid calculation is free of round-off
error, the resulting grid using finite precision floating point numbers
will only be exactly equidistant for appropriate grid spacing dx and
starting values x0. On the other hand neither this effect nor round-off in
the grid calculation should ever play a significant role in your
calculations as the resulting errors are too small.

Some additional comments:

1. Comparison to calculations with decimal can be difficult as not all
simple decimal step sizes are exactly representable as finite floating
point numbers. E.g., in single precision 0.100000001490116119384765625 is
the number closest to 0.1. To make a good comparison you would have to
compare the output of arange() to a grid calculated to infinite precision
which has the single precision representation of 0.1 as grid spacing and
not its decimal value.
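That single precision representation is easy to inspect:

```
import numpy as np
from decimal import Decimal

# exact decimal expansion of the float32 closest to 0.1 -- this, not the
# decimal number 0.1, is the spacing a float32 grid is really built from
print(Decimal(float(np.float32(0.1))))
# 0.100000001490116119384765625
```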

2. Calculating the difference in single precision in your C code as below
yields

float actual = -44.990;
float err1 = fabs(x1 - actual);
float err2 = fabs(x2 - actual);
$ ./test_fma
x1 -44.9899978637695312  x2 -44.9900016784667969
err1 3.81470e-06  err2 0.0e+00

If you calculate the ulp error you notice that the non-FMA solution is off
by exactly one ulp, while the FMA solution yields the single precision
float closest to the actual solution (error = 0 ulp). Losing one ulp should
not matter for any application. If it does, use a data type with higher
precision, e.g., double.
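The ulp error can be computed with np.spacing(), e.g.:

```
import numpy as np

def ulp_error(computed, exact, dtype=np.float32):
    # error measured in units of the last place of `exact`
    computed, exact = dtype(computed), dtype(exact)
    return abs(float(computed) - float(exact)) / np.spacing(np.abs(exact))

print(ulp_error(-44.9899978637695312, -44.9900016784667969))  # ~1.0 (non-FMA)
print(ulp_error(-44.9900016784667969, -44.9900016784667969))  # 0.0 (FMA)
```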

3. I wrote a Python script that compares grids generated in different ways
with a grid calculated with mpmath in high precision [2]. Maybe it is
useful for further insights.

Cheers
Nils

PS: I realize that I went a little off-topic here and apologize for that.
However, a while 

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-10 Thread Matthias Geier
I just want to add a few links related to the topic:

https://mail.python.org/pipermail/numpy-discussion/2007-September/029129.html
https://quantumwise.com/forum/index.php?topic=110.0#.VIVgyIctjRZ
https://mail.python.org/pipermail/numpy-discussion/2012-February/060238.html
http://nbviewer.jupyter.org/github/mgeier/python-audio/blob/master/misc/arange.ipynb

cheers,
Matthias

On Fri, Feb 9, 2018 at 11:17 PM, Benjamin Root  wrote:

> Interesting...
>
> ```
> static void
> @NAME@_fill(@type@ *buffer, npy_intp length, void *NPY_UNUSED(ignored))
> {
> npy_intp i;
> @type@ start = buffer[0];
> @type@ delta = buffer[1];
> delta -= start;
> for (i = 2; i < length; ++i) {
> buffer[i] = start + i*delta;
> }
> }
> ```
>
> So, the second value is computed using the delta arange was given, but
> then tries to get the delta back, which incurs errors:
> ```
> >>> a = np.float32(-115)
> >>> delta = np.float32(0.01)
> >>> b = a + delta
> >>> new_delta = b - a
> >>> "%.16f" % delta
> '0.0099999997764826'
> >>> "%.16f" % new_delta
> '0.0100021362304688'
> ```
>
> Also, right there is a good example of where the use of fma() could be of
> value.
>
> Cheers!
> Ben Root
>
>
> On Fri, Feb 9, 2018 at 4:56 PM, Eric Wieser 
> wrote:
>
>> Can’t arange and linspace operations with floats be done internally
>>
>> Yes, and they probably should be - they’re done this way as a hack
>> because the api exposed for custom dtypes is here
>> ,
>> (example implementation here
>> )
>> - essentially, you give it the first two elements of the array, and ask it
>> to fill in the rest.
>> ​
>>
>> On Fri, 9 Feb 2018 at 13:17 Matthew Harrigan 
>> wrote:
>>
>>> I apologize if I'm missing something basic, but why are floats being
>>> accumulated in the first place?  Can't arange and linspace operations with
>>> floats be done internally similar to `start + np.arange(num_steps) *
>>> step_size`?  I.e. always accumulate (really increment) integers to limit
>>> errors.
>>>
>>> On Fri, Feb 9, 2018 at 3:43 PM, Benjamin Root 
>>> wrote:
>>>


 On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker 
 wrote:

> On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers 
> wrote:
>>
>>  It is partly a plea for some development of numerically accurate
>>> functions for computing lat/lon grids from a combination of inputs: 
>>> bounds,
>>> counts, and resolutions.
>>>
>>
> Can you be more specific about what problems you've run into -- I work
> with lat-lon grids all the time, and have never had a problem.
>
> float32 degrees gives you about 1 meter accuracy or better, so I can
> see how losing a few digits might be an issue, though I would argue that
> you maybe shouldn't use float32 if you are worried about anything close to
> 1m accuracy... -- or shift to a relative coordinate system of some sort.
>

 The issue isn't so much the accuracy of the coordinates themselves. I
 am only worried about 1km resolution (which is approximately 0.01 degrees
 at mid-latitudes). My concern is with consistent *construction* of a
 coordinate grid with even spacing. As it stands right now. If I provide a
 corner coordinate, a resolution, and the number of pixels, the result is
 not terrible (indeed, this is the approach used by gdal/rasterio). If I
 have start/end coordinates and the number of pixels, the result is not bad,
 either (use linspace). But, if I have start/end coordinates and a
 resolution, then determining the number of pixels from that is actually
 tricky to get right in the general case, especially with float32 and large
 grids, and especially if the bounding box specified isn't exactly divisible
 by the resolution.


>
> I have been playing around with the decimal package a bit lately,
>>>
>>
> sigh. decimal is so often looked at as a solution to a problem it isn't
> designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype
> :-)
>
> what you get from decimal is variable precision -- maybe a binary
> variable precision lib is a better answer -- that would be a good thing to
> have easy access to in numpy, but in this case, if you want better 
> accuracy
> in a computation that will end up in float32, just use float64.
>

 I am not concerned about computing distances or anything like that, I
 am trying to properly construct my grid. I need consistent results
 regardless of which way the grid is specified (start/end/count,
 start/res/count, start/end/res). I have found that loading up the grid
 specs (using in a config file or command-line) using the Decimal class
 allows me to exactly and 

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-09 Thread Benjamin Root
Interesting...

```
static void
@NAME@_fill(@type@ *buffer, npy_intp length, void *NPY_UNUSED(ignored))
{
npy_intp i;
@type@ start = buffer[0];
@type@ delta = buffer[1];
delta -= start;
for (i = 2; i < length; ++i) {
buffer[i] = start + i*delta;
}
}
```

So, the second value is computed using the delta arange was given, but then
tries to get the delta back, which incurs errors:
```
>>> a = np.float32(-115)
>>> delta = np.float32(0.01)
>>> b = a + delta
>>> new_delta = b - a
>>> "%.16f" % delta
'0.0099999997764826'
>>> "%.16f" % new_delta
'0.0100021362304688'
```

Also, right there is a good example of where the use of fma() could be of
value.
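A quick sketch of the difference, using the same numbers, with the fma
emulated by doing the arithmetic in float64 (which is exact for these float32
products):

```
import numpy as np

start, step, n = np.float32(-115), np.float32(0.01), 8

# what the current fill loop effectively does: reconstruct delta from the
# first two elements, then start + i*delta in float32
delta = (start + step) - start
current = start + np.arange(n, dtype=np.float32) * delta

# what a correctly rounded i*step + start (an fma, emulated via float64) gives
fma_like = (np.arange(n, dtype=np.float64) * np.float64(step)
            + np.float64(start)).astype(np.float32)

print(current - fma_like)   # drifts away after a few steps
```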

Cheers!
Ben Root


On Fri, Feb 9, 2018 at 4:56 PM, Eric Wieser 
wrote:

> Can’t arange and linspace operations with floats be done internally
>
> Yes, and they probably should be - they’re done this way as a hack because
> the api exposed for custom dtypes is here
> ,
> (example implementation here
> )
> - essentially, you give it the first two elements of the array, and ask it
> to fill in the rest.
> ​
>
> On Fri, 9 Feb 2018 at 13:17 Matthew Harrigan 
> wrote:
>
>> I apologize if I'm missing something basic, but why are floats being
>> accumulated in the first place?  Can't arange and linspace operations with
>> floats be done internally similar to `start + np.arange(num_steps) *
>> step_size`?  I.e. always accumulate (really increment) integers to limit
>> errors.
>>
>> On Fri, Feb 9, 2018 at 3:43 PM, Benjamin Root 
>> wrote:
>>
>>>
>>>
>>> On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker 
>>> wrote:
>>>
 On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers 
 wrote:
>
>  It is partly a plea for some development of numerically accurate
>> functions for computing lat/lon grids from a combination of inputs: 
>> bounds,
>> counts, and resolutions.
>>
>
 Can you be more specific about what problems you've run into -- I work
 with lat-lon grids all the time, and have never had a problem.

 float32 degrees gives you about 1 meter accuracy or better, so I can
 see how losing a few digits might be an issue, though I would argue that
 you maybe shouldn't use float32 if you are worried about anything close to
 1m accuracy... -- or shift to a relative coordinate system of some sort.

>>>
>>> The issue isn't so much the accuracy of the coordinates themselves. I am
>>> only worried about 1km resolution (which is approximately 0.01 degrees at
>>> mid-latitudes). My concern is with consistent *construction* of a
>>> coordinate grid with even spacing. As it stands right now. If I provide a
>>> corner coordinate, a resolution, and the number of pixels, the result is
>>> not terrible (indeed, this is the approach used by gdal/rasterio). If I
>>> have start/end coordinates and the number of pixels, the result is not bad,
>>> either (use linspace). But, if I have start/end coordinates and a
>>> resolution, then determining the number of pixels from that is actually
>>> tricky to get right in the general case, especially with float32 and large
>>> grids, and especially if the bounding box specified isn't exactly divisible
>>> by the resolution.
>>>
>>>

 I have been playing around with the decimal package a bit lately,
>>
>
 sigh. decimal is so often looked at as a solution to a problem it isn't
 designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype
 :-)

 what you get from decimal is variable precision -- maybe a binary
 variable precision lib is a better answer -- that would be a good thing to
 have easy access to in numpy, but in this case, if you want better accuracy
 in a computation that will end up in float32, just use float64.

>>>
>>> I am not concerned about computing distances or anything like that, I am
>>> trying to properly construct my grid. I need consistent results regardless
>>> of which way the grid is specified (start/end/count, start/res/count,
>>> start/end/res). I have found that loading up the grid specs (using in a
>>> config file or command-line) using the Decimal class allows me to exactly
>>> and consistently represent the grid specification, and gets me most of the
>>> way there. But the problems with arange() is frustrating, and I have to
>>> have extra logic to go around that and over to linspace() instead.
>>>
>>>

 and I discovered the concept of "fused multiply-add" operations for
>> improved accuracy. I have come to realize that fma operations could be 
>> used
>> to greatly improve the accuracy of linspace() and arange().
>>
>
 arange() is problematic for non-integer use anyway, by its very
 definition (getting the "end point" correct

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-09 Thread Eric Wieser
Can’t arange and linspace operations with floats be done internally

Yes, and they probably should be - they’re done this way as a hack because
the api exposed for custom dtypes is here
,
(example implementation here
)
- essentially, you give it the first two elements of the array, and ask it
to fill in the rest.
​

On Fri, 9 Feb 2018 at 13:17 Matthew Harrigan 
wrote:

> I apologize if I'm missing something basic, but why are floats being
> accumulated in the first place?  Can't arange and linspace operations with
> floats be done internally similar to `start + np.arange(num_steps) *
> step_size`?  I.e. always accumulate (really increment) integers to limit
> errors.
>
> On Fri, Feb 9, 2018 at 3:43 PM, Benjamin Root 
> wrote:
>
>>
>>
>> On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker 
>> wrote:
>>
>>> On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers 
>>> wrote:

  It is partly a plea for some development of numerically accurate
> functions for computing lat/lon grids from a combination of inputs: 
> bounds,
> counts, and resolutions.
>

>>> Can you be more specific about what problems you've run into -- I work
>>> with lat-lon grids all the time, and have never had a problem.
>>>
>>> float32 degrees gives you about 1 meter accuracy or better, so I can see
>>> how losing a few digits might be an issue, though I would argue that you
>>> maybe shouldn't use float32 if you are worried about anything close to 1m
>>> accuracy... -- or shift to a relative coordinate system of some sort.
>>>
>>
>> The issue isn't so much the accuracy of the coordinates themselves. I am
>> only worried about 1km resolution (which is approximately 0.01 degrees at
>> mid-latitudes). My concern is with consistent *construction* of a
>> coordinate grid with even spacing. As it stands right now. If I provide a
>> corner coordinate, a resolution, and the number of pixels, the result is
>> not terrible (indeed, this is the approach used by gdal/rasterio). If I
>> have start/end coordinates and the number of pixels, the result is not bad,
>> either (use linspace). But, if I have start/end coordinates and a
>> resolution, then determining the number of pixels from that is actually
>> tricky to get right in the general case, especially with float32 and large
>> grids, and especially if the bounding box specified isn't exactly divisible
>> by the resolution.
>>
>>
>>>
>>> I have been playing around with the decimal package a bit lately,
>

>>> sigh. decimal is so often looked at as a solution to a problem it isn't
>>> designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype
>>> :-)
>>>
>>> what you get from decimal is variable precision -- maybe a binary
>>> variable precision lib is a better answer -- that would be a good thing to
>>> have easy access to in numpy, but in this case, if you want better accuracy
>>> in a computation that will end up in float32, just use float64.
>>>
>>
>> I am not concerned about computing distances or anything like that, I am
>> trying to properly construct my grid. I need consistent results regardless
>> of which way the grid is specified (start/end/count, start/res/count,
>> start/end/res). I have found that loading up the grid specs (using in a
>> config file or command-line) using the Decimal class allows me to exactly
>> and consistently represent the grid specification, and gets me most of the
>> way there. But the problems with arange() is frustrating, and I have to
>> have extra logic to go around that and over to linspace() instead.
>>
>>
>>>
>>> and I discovered the concept of "fused multiply-add" operations for
> improved accuracy. I have come to realize that fma operations could be 
> used
> to greatly improve the accuracy of linspace() and arange().
>

>>> arange() is problematic for non-integer use anyway, by its very
>>> definition (getting the "end point" correct requires the right step, even
>>> without FP error).
>>>
>>> and would it really help with linspace? it's computing a delta with one
>>> division in fp, then multiplying it by an integer (represented in fp --
>>> why? why not keep that an integer till the multiply?).
>>>
>>
>> Sorry, that was a left-over from a previous draft of my email after I
>> discovered that linspace's accuracy was on par with fma(). And while
>> arange() has inherent problems, it can still be made better than it is now.
>> In fact, I haven't investigated this, but I did recently discover some unit
>> tests of mine started to fail after a numpy upgrade, and traced it back to
>> a reduction in the accuracy of a usage of arange() with float32s. So,
>> something got worse at some point, which means we could still get accuracy
>> back if we can fi

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-09 Thread Matthew Harrigan
I apologize if I'm missing something basic, but why are floats being
accumulated in the first place?  Can't arange and linspace operations with
floats be done internally similar to `start + np.arange(num_steps) *
step_size`?  I.e. always accumulate (really increment) integers to limit
errors.

On Fri, Feb 9, 2018 at 3:43 PM, Benjamin Root  wrote:

>
>
> On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker 
> wrote:
>
>> On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers 
>> wrote:
>>>
>>>  It is partly a plea for some development of numerically accurate
 functions for computing lat/lon grids from a combination of inputs: bounds,
 counts, and resolutions.

>>>
>> Can you be more specific about what problems you've run into -- I work
>> with lat-lon grids all the time, and have never had a problem.
>>
>> float32 degrees gives you about 1 meter accuracy or better, so I can see
>> how losing a few digits might be an issue, though I would argue that you
>> maybe shouldn't use float32 if you are worried about anything close to 1m
>> accuracy... -- or shift to a relative coordinate system of some sort.
>>
>
> The issue isn't so much the accuracy of the coordinates themselves. I am
> only worried about 1km resolution (which is approximately 0.01 degrees at
> mid-latitudes). My concern is with consistent *construction* of a
> coordinate grid with even spacing. As it stands right now. If I provide a
> corner coordinate, a resolution, and the number of pixels, the result is
> not terrible (indeed, this is the approach used by gdal/rasterio). If I
> have start/end coordinates and the number of pixels, the result is not bad,
> either (use linspace). But, if I have start/end coordinates and a
> resolution, then determining the number of pixels from that is actually
> tricky to get right in the general case, especially with float32 and large
> grids, and especially if the bounding box specified isn't exactly divisible
> by the resolution.
>
>
>>
>> I have been playing around with the decimal package a bit lately,

>>>
>> sigh. decimal is so often looked at as a solution to a problem it isn't
>> designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype
>> :-)
>>
>> what you get from decimal is variable precision -- maybe a binary
>> variable precision lib is a better answer -- that would be a good thing to
>> have easy access to in numpy, but in this case, if you want better accuracy
>> in a computation that will end up in float32, just use float64.
>>
>
> I am not concerned about computing distances or anything like that, I am
> trying to properly construct my grid. I need consistent results regardless
> of which way the grid is specified (start/end/count, start/res/count,
> start/end/res). I have found that loading up the grid specs (using in a
> config file or command-line) using the Decimal class allows me to exactly
> and consistently represent the grid specification, and gets me most of the
> way there. But the problems with arange() is frustrating, and I have to
> have extra logic to go around that and over to linspace() instead.
>
>
>>
>> and I discovered the concept of "fused multiply-add" operations for
 improved accuracy. I have come to realize that fma operations could be used
 to greatly improve the accuracy of linspace() and arange().

>>>
>> arange() is problematic for non-integer use anyway, by its very
>> definition (getting the "end point" correct requires the right step, even
>> without FP error).
>>
>> and would it really help with linspace? it's computing a delta with one
>> division in fp, then multiplying it by an integer (represented in fp --
>> why? why not keep that an integer till the multiply?).
>>
>
> Sorry, that was a left-over from a previous draft of my email after I
> discovered that linspace's accuracy was on par with fma(). And while
> arange() has inherent problems, it can still be made better than it is now.
> In fact, I haven't investigated this, but I did recently discover some unit
> tests of mine started to fail after a numpy upgrade, and traced it back to
> a reduction in the accuracy of a usage of arange() with float32s. So,
> something got worse at some point, which means we could still get accuracy
> back if we can figure out what changed.
>
>
>>
>> In particular, I have been needing improved results for computing
 latitude/longitude grids, which tend to be done in float32's to save memory
 (at least, this is true in data I come across).

>>>
>>> If you care about saving memory *and* accuracy, wouldn't it make more
>>> sense to do your computations in float64, and convert to float32 at the
>>> end?
>>>
>>
>> that does seem to be the easy option :-)
>>
>
> Kinda missing the point, isn't it? Isn't that like saying "convert all
> your data to float64s prior to calling np.mean()"? That's ridiculous.
> Instead, we made np.mean() upcast the inner-loop operation, and even allow
> an option to specify what the dtype that should be used for the a

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-09 Thread Benjamin Root
On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker  wrote:

> On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers 
> wrote:
>>
>>  It is partly a plea for some development of numerically accurate
>>> functions for computing lat/lon grids from a combination of inputs: bounds,
>>> counts, and resolutions.
>>>
>>
> Can you be more specific about what problems you've run into -- I work
> with lat-lon grids all the time, and have never had a problem.
>
> float32 degrees gives you about 1 meter accuracy or better, so I can see
> how losing a few digits might be an issue, though I would argue that you
> maybe shouldn't use float32 if you are worried about anything close to 1m
> accuracy... -- or shift to a relative coordinate system of some sort.
>

The issue isn't so much the accuracy of the coordinates themselves. I am
only worried about 1km resolution (which is approximately 0.01 degrees at
mid-latitudes). My concern is with consistent *construction* of a
coordinate grid with even spacing. As it stands right now. If I provide a
corner coordinate, a resolution, and the number of pixels, the result is
not terrible (indeed, this is the approach used by gdal/rasterio). If I
have start/end coordinates and the number of pixels, the result is not bad,
either (use linspace). But, if I have start/end coordinates and a
resolution, then determining the number of pixels from that is actually
tricky to get right in the general case, especially with float32 and large
grids, and especially if the bounding box specified isn't exactly divisible
by the resolution.
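One possible hedge for that last case (numbers made up; it still assumes the
bounding box is nearly divisible by the resolution): compute the count from a
rounded float64 ratio, then let linspace pin the end points:

```
import numpy as np

start, stop, res = np.float32(-115.0), np.float32(-110.0), np.float32(0.01)

n_spaces = int(round((float(stop) - float(start)) / float(res)))
grid = np.linspace(start, stop, n_spaces + 1, dtype=np.float32)
print(n_spaces, grid[0], grid[-1])   # 500 -115.0 -110.0
```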


>
> I have been playing around with the decimal package a bit lately,
>>>
>>
> sigh. decimal is so often looked at as a solution to a problem it isn't
> designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype
> :-)
>
> what you get from decimal is variable precision -- maybe a binary variable
> precision lib is a better answer -- that would be a good thing to have easy
> access to in numpy, but in this case, if you want better accuracy in a
> computation that will end up in float32, just use float64.
>

I am not concerned about computing distances or anything like that, I am
trying to properly construct my grid. I need consistent results regardless
of which way the grid is specified (start/end/count, start/res/count,
start/end/res). I have found that loading up the grid specs (using in a
config file or command-line) using the Decimal class allows me to exactly
and consistently represent the grid specification, and gets me most of the
way there. But the problems with arange() is frustrating, and I have to
have extra logic to go around that and over to linspace() instead.
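Roughly the kind of extra logic involved (a sketch, with made-up numbers):
keep the spec in Decimal, derive the end point exactly, and let linspace build
the array:

```
import numpy as np
from decimal import Decimal

start, res, count = Decimal("-115.0"), Decimal("0.01"), 500
stop = start + res * count                  # exact in Decimal
grid = np.linspace(float(start), float(stop), count + 1, dtype=np.float32)
print(grid[0], grid[-1])
```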


>
> and I discovered the concept of "fused multiply-add" operations for
>>> improved accuracy. I have come to realize that fma operations could be used
>>> to greatly improve the accuracy of linspace() and arange().
>>>
>>
> arange() is problematic for non-integer use anyway, by its very definition
> (getting the "end point" correct requires the right step, even without FP
> error).
>
> and would it really help with linspace? it's computing a delta with one
> division in fp, then multiplying it by an integer (represented in fp --
> why? why not keep that an integer till the multiply?).
>

Sorry, that was a left-over from a previous draft of my email after I
discovered that linspace's accuracy was on par with fma(). And while
arange() has inherent problems, it can still be made better than it is now.
In fact, I haven't investigated this, but I did recently discover that some
unit tests of mine started to fail after a numpy upgrade, and traced it back
to a loss of accuracy in a call to arange() with float32s. So,
something got worse at some point, which means we could still get accuracy
back if we can figure out what changed.
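
Just to make the proposal concrete, a rough sketch of what an fma-based grid
evaluation could look like. It assumes a correctly rounded fma(x, y, z) is
available from somewhere -- numpy does not provide one today (that is exactly
what is being proposed), so the sketch borrows one from a recent Python's
math module, or it could come from a tiny C wrapper around C99 fma():

```
import numpy as np
from math import fma  # assumed available; otherwise wrap C99 fma()

def fma_grid(start, step, count, dtype=np.float32):
    """Evaluate start + i*step with one fused multiply-add per element."""
    start, step = dtype(start), dtype(step)
    # fma(step, i, start) is rounded once (in double precision here), and the
    # result is rounded a second time when stored into the output dtype.
    return np.array([fma(step, i, start) for i in range(count)], dtype=dtype)

x = fma_grid(-115.0, 0.01, 7002)
print(x[-1])  # the float32 closest to -44.99 (same value as my C fma() example)
```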


>
> In particular, I have been needing improved results for computing
>>> latitude/longitude grids, which tend to be done in float32's to save memory
>>> (at least, this is true in data I come across).
>>>
>>
>> If you care about saving memory *and* accuracy, wouldn't it make more
>> sense to do your computations in float64, and convert to float32 at the
>> end?
>>
>
> that does seem to be the easy option :-)
>

Kinda missing the point, isn't it? Isn't that like saying "convert all your
data to float64s prior to calling np.mean()"? That's ridiculous. Instead,
we made np.mean() upcast the inner-loop operation, and even allow an option
to specify the dtype that should be used for the accumulation.
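
(For what it is worth, a quick illustration of that option:)

```
import numpy as np

x = np.full(100_000, 0.1, dtype=np.float16)
print(x.mean())                  # float16 input is accumulated via float32 internally
print(x.mean(dtype=np.float64))  # or an explicit float64 accumulator
```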


>
>
>> Now, to the crux of my problem. It is next to impossible to generate a
>>> non-trivial numpy array of coordinates, even in double precision, without
>>> hitting significant numerical errors.
>>>
>>
> I'm confused, the example you posted doesn't have significant errors...
>

Hmm, "errors" was the wrong word. "Differences between methods" might be
more along the lines of what I was thinking. Remember, I am looking for
consistency.


>
>
>> Which has led me down the 

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-09 Thread Chris Barker
On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers 
wrote:
>
>  It is partly a plea for some development of numerically accurate
>> functions for computing lat/lon grids from a combination of inputs: bounds,
>> counts, and resolutions.
>>
>
Can you be more specific about what problems you've run into -- I work with
lat-lon grids all the time, and have never had a problem.

float32 degrees gives you about 1 meter accuracy or better, so I can see
how losing a few digits might be an issue, though I would argue that you
maybe shouldn't use float32 if you are worried about anything close to 1m
accuracy... -- or shift to a relative coordinate system of some sort.
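
A quick back-of-the-envelope check on that claim (my own sketch):

```
import numpy as np

# spacing (ulp) of a float32 longitude near 115 degrees, converted to metres
ulp_deg = np.spacing(np.float32(115.0))
print(ulp_deg * 111_320)  # one degree is ~111.32 km, so this is roughly 0.85 m
```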

I have been playing around with the decimal package a bit lately,
>>
>
sigh. decimal is so often looked at as a solution to a problem it isn't
designed for. lat-lon is natively Sexagesimal -- maybe we need that dtype
:-)

what you get from decimal is variable precision -- maybe a binary variable
precision lib is a better answer -- that would be a good thing to have easy
access to in numpy, but in this case, if you want better accuracy in a
computation that will end up in float32, just use float64.

and I discovered the concept of "fused multiply-add" operations for
>> improved accuracy. I have come to realize that fma operations could be used
>> to greatly improve the accuracy of linspace() and arange().
>>
>
arange() is problematic for non-integer use anyway, by its very definition
(getting the "end point" correct requires the right step, even without FP
error).

and would it really help with linspace? it's computing a delta with one
division in fp, then multiplying it by an integer (represented in fp --
why? why not keep that an integer till the multiply?).
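
For what it is worth, a little sketch (illustration only) contrasting the two
evaluation strategies -- accumulating the step versus keeping the index an
integer until the final multiply:

```
import numpy as np

start, step, n = np.float32(-115.0), np.float32(0.01), 7002

# naive accumulation: the rounding error of every addition is carried forward
acc = np.empty(n, dtype=np.float32)
acc[0] = start
for i in range(1, n):
    acc[i] = acc[i - 1] + step

# start + i*step: the index stays exactly representable until the one multiply
direct = start + np.arange(n, dtype=np.float32) * step

# float64 reference, converted to float32 at the end
ref = (np.float64(start) + np.arange(n) * np.float64(step)).astype(np.float32)

print(np.abs(acc - ref).max(), np.abs(direct - ref).max())
```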

In particular, I have been needing improved results for computing
>> latitude/longitude grids, which tend to be done in float32's to save memory
>> (at least, this is true in data I come across).
>>
>
> If you care about saving memory *and* accuracy, wouldn't it make more
> sense to do your computations in float64, and convert to float32 at the
> end?
>

that does seem to be the easy option :-)


> Now, to the crux of my problem. It is next to impossible to generate a
>> non-trivial numpy array of coordinates, even in double precision, without
>> hitting significant numerical errors.
>>
>
I'm confused, the example you posted doesn't have significant errors...


> Which has led me down the path of using the decimal package (which
>> doesn't play very nicely with numpy because of the lack of casting rules
>> for it). Consider the following:
>> ```
>> $ cat test_fma.py
>> from __future__ import print_function
>> import numpy as np
>> res = np.float32(0.01)
>> cnt = 7001
>> x0 = np.float32(-115.0)
>> x1 = res * cnt + x0
>> print("res * cnt + x0 = %.16f" % x1)
>> x = np.arange(-115.0, -44.99 + (res / 2), 0.01, dtype='float32')
>> print("len(arange()): %d  arange()[-1]: %16f" % (len(x), x[-1]))
>> x = np.linspace(-115.0, -44.99, cnt, dtype='float32')
>> print("linspace()[-1]: %.16f" % x[-1])
>>
>> $ python test_fma.py
>> res * cnt + x0 = -44.9900015648454428
>> len(arange()): 7002  arange()[-1]:   -44.975044
>> linspace()[-1]: -44.9900016784667969
>> ```
>> arange just produces silly results (puts out an extra element... adding
>> half of the resolution is typically mentioned as a solution on mailing
>> lists to get around arange()'s limitations -- I personally don't do this).
>>
>
The real solution is "don't do that" -- arange is not the right tool for the
job.

Then there is this:

res * cnt + x0 = -44.9900015648454428
linspace()[-1]: -44.9900016784667969

that's as good as you are ever going to get with 32 bit floats...

Though I just noticed something about your numbers -- there should be a
nice even base ten delta if you have 7001 gaps -- but linspace produces N
points, not N gaps -- so maybe you want:


In [17]: l = np.linspace(-115.0, -44.99, 7002)

In [18]: l[:5]
Out[18]: array([-115.  , -114.99, -114.98, -114.97, -114.96])

In [19]: l[-5:]
Out[19]: array([-45.03, -45.02, -45.01, -45.  , -44.99])


or, in float32 -- not as pretty:


In [20]: l = np.linspace(-115.0, -44.99, 7002, dtype=np.float32)

In [21]: l[:5]
Out[21]:
array([-115.        , -114.98999786, -114.98000336, -114.97000122,
       -114.95999908], dtype=float32)

In [22]: l[-5:]
Out[22]: array([-45.02999878, -45.02000046, -45.00999832, -45.        ,
       -44.99000168], dtype=float32)


but still as good as you get with float32, and exactly the same result as
computing in float64 and converting:



In [25]: l = np.linspace(-115.0, -44.99, 7002).astype(np.float32)

In [26]: l[:5]
Out[26]:
array([-115.        , -114.98999786, -114.98000336, -114.97000122,
       -114.95999908], dtype=float32)

In [27]: l[-5:]
Out[27]: array([-45.02999878, -45.02000046, -45.00999832, -45.        ,
       -44.99000168], dtype=float32)



>> So, does it make any sense to improve arange by utilizing fma() under the
>> hood?
>>
>

Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-06 Thread Ralf Gommers
On Wed, Feb 7, 2018 at 2:57 AM, Benjamin Root  wrote:

> Note, the following is partly to document my explorations in computing
> lat/on grids in numpy lately, in case others come across these issues in
> the future. It is partly a plea for some development of numerically
> accurate functions for computing lat/lon grids from a combination of
> inputs: bounds, counts, and resolutions.
>

You're talking about accuracy several times; it would help to define what
your actual requirements are, and to quantify what the accuracy loss
currently is. I wouldn't expect it to be more than a few ulp for float64.
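
One way to put a number on it -- a sketch using np.spacing() for the ulp size
and the float64 analogue of the res * cnt + x0 computation from the example:

```
import numpy as np

exact = -44.99                                         # nearest float64 to the decimal value
approx = np.float64(0.01) * 7001 + np.float64(-115.0)  # plain multiply-add in float64
print(abs(approx - exact) / abs(np.spacing(exact)))    # error measured in ulps
```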


>
> I have been playing around with the decimal package a bit lately, and I
> discovered the concept of "fused multiply-add" operations for improved
> accuracy. I have come to realize that fma operations could be used to
> greatly improve the accuracy of linspace() and arange(). In particular, I
> have been needing improved results for computing latitude/longitude grids,
> which tend to be done in float32's to save memory (at least, this is true
> in data I come across).
>

If you care about saving memory *and* accuracy, wouldn't it make more sense
to do your computations in float64, and convert to float32 at the end? That
must be much more accurate than any new float32 algorithm you come up with.

Ralf



>
> Since fma is not available yet in python, please consider the following C
> snippet:
>
> ```
> $ cat test_fma.c
> #include <stdio.h>
> #include <stdlib.h>
> #include <math.h>
>
> int main(){
> float res = 0.01;
> float x0 = -115.0;
> int cnt = 7001;
> float x1 = res * cnt + x0;
> float x2 = fma(res, cnt, x0);
> printf("x1 %.16f  x2 %.16f\n", x1, x2);
> float err1 = fabs(x1 - -44.990);
> float err2 = fabs(x2 - -44.990);
> printf("err1 %.5e  err2 %.5e\n", err1, err2);
> return 0;
> }
> $ gcc test_fma.c -lm -o test_fma
> $ ./test_fma
> x1 -44.9899978637695312  x2 -44.9900016784667969
> err1 2.13623e-06  err2 1.67847e-06
> ```
>
> And if you do the same in double precision, fma still yields a significant
> accuracy gain over an explicit double-precision multiply-add.
>
> Now, to the crux of my problem. It is next to impossible to generate a
> non-trivial numpy array of coordinates, even in double precision, without
> hitting significant numerical errors. Which has led me down the path of
> using the decimal package (which doesn't play very nicely with numpy
> because of the lack of casting rules for it). Consider the following:
> ```
> $ cat test_fma.py
> from __future__ import print_function
> import numpy as np
> res = np.float32(0.01)
> cnt = 7001
> x0 = np.float32(-115.0)
> x1 = res * cnt + x0
> print("res * cnt + x0 = %.16f" % x1)
> x = np.arange(-115.0, -44.99 + (res / 2), 0.01, dtype='float32')
> print("len(arange()): %d  arange()[-1]: %16f" % (len(x), x[-1]))
> x = np.linspace(-115.0, -44.99, cnt, dtype='float32')
> print("linspace()[-1]: %.16f" % x[-1])
>
> $ python test_fma.py
> res * cnt + x0 = -44.9900015648454428
> len(arange()): 7002  arange()[-1]:   -44.975044
> linspace()[-1]: -44.9900016784667969
> ```
> arange just produces silly results (puts out an extra element... adding
> half of the resolution is typically mentioned as a solution on mailing
> lists to get around arange()'s limitations -- I personally don't do this).
>
> linspace() is a tad bit better than arange(), but still a little bit worse
> than just computing the value straight out. It also produces a result that
> is on par with fma(), so that is reassuring that it is about as accurate as
> can be at the moment. This also matches up almost exactly to what I would
> get if I used Decimal()s and then casted to float32's for final storage, so
> that's not too bad as well.
>
> So, does it make any sense to improve arange by utilizing fma() under the
> hood? Also, any plans for making fma() available as a ufunc?
>
> Notice that most of my examples required knowing the number of grid points
> ahead of time. But what if I didn't know that? What if I just have the
> bounds and the resolution? Then arange() is the natural fit, but as I
> showed, its accuracy is lacking, and you have to do some sort of hack to do
> a closed interval.
>
> Thoughts? Comments?
> Ben Root
> (donning flame suit)
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion