I apologize if I'm missing something basic, but why are floats being accumulated in the first place? Can't arange and linspace operations with floats be done internally along the lines of `start + np.arange(num_steps) * step_size`? I.e., always increment an integer index (rather than accumulate floats) to limit the error.
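To make that concrete, here is a rough sketch of the comparison I have in mind (the helper names are made up, and the exact digits will depend on dtype details):

```python
import numpy as np

def frange_accumulated(start, step, num, dtype=np.float32):
    # Repeatedly add the step in the target dtype; rounding error piles up
    # a little with every addition.
    out = np.empty(num, dtype=dtype)
    val = dtype(start)
    for i in range(num):
        out[i] = val
        val = dtype(val + dtype(step))
    return out

def frange_indexed(start, step, num, dtype=np.float32):
    # Scale an exact integer index instead: each point suffers only one
    # multiply and one add worth of rounding, done here in float64.
    return (start + np.arange(num) * step).astype(dtype)

a = frange_accumulated(-115.0, 0.01, 7002)
b = frange_indexed(-115.0, 0.01, 7002)
print(a[-1], b[-1])  # the accumulated version drifts much further from -44.99
```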
On Fri, Feb 9, 2018 at 3:43 PM, Benjamin Root <ben.v.r...@gmail.com> wrote:

> On Fri, Feb 9, 2018 at 12:19 PM, Chris Barker <chris.bar...@noaa.gov> wrote:

>> On Wed, Feb 7, 2018 at 12:09 AM, Ralf Gommers <ralf.gomm...@gmail.com> wrote:

>>>> It is partly a plea for some development of numerically accurate functions for computing lat/lon grids from a combination of inputs: bounds, counts, and resolutions.

>> Can you be more specific about what problems you've run into -- I work with lat-lon grids all the time, and have never had a problem.

>> float32 degrees gives you about 1 meter accuracy or better, so I can see how losing a few digits might be an issue, though I would argue that you maybe shouldn't use float32 if you are worried about anything close to 1m accuracy... -- or shift to a relative coordinate system of some sort.

> The issue isn't so much the accuracy of the coordinates themselves. I am only worried about 1km resolution (which is approximately 0.01 degrees at mid-latitudes). My concern is with consistent *construction* of a coordinate grid with even spacing. As it stands right now, if I provide a corner coordinate, a resolution, and the number of pixels, the result is not terrible (indeed, this is the approach used by gdal/rasterio). If I have start/end coordinates and the number of pixels, the result is not bad, either (use linspace). But, if I have start/end coordinates and a resolution, then determining the number of pixels from that is actually tricky to get right in the general case, especially with float32 and large grids, and especially if the bounding box specified isn't exactly divisible by the resolution.

>>>> I have been playing around with the decimal package a bit lately,

>> sigh. decimal is so often looked at as a solution to a problem it isn't designed for. lat-lon is natively sexagesimal -- maybe we need that dtype :-)

>> what you get from decimal is variable precision -- maybe a binary variable precision lib is a better answer -- that would be a good thing to have easy access to in numpy, but in this case, if you want better accuracy in a computation that will end up in float32, just use float64.

> I am not concerned about computing distances or anything like that, I am trying to properly construct my grid. I need consistent results regardless of which way the grid is specified (start/end/count, start/res/count, start/end/res). I have found that loading up the grid specs (from a config file or the command line) using the Decimal class allows me to exactly and consistently represent the grid specification, and gets me most of the way there. But the problems with arange() are frustrating, and I have to have extra logic to go around that and over to linspace() instead.

>>>> and I discovered the concept of "fused multiply-add" operations for improved accuracy. I have come to realize that fma operations could be used to greatly improve the accuracy of linspace() and arange().

>> arange() is problematic for non-integer use anyway, by its very definition (getting the "end point" correct requires the right step, even without FP error).

>> and would it really help with linspace? it's computing a delta with one division in fp, then multiplying it by an integer (represented in fp -- why? why not keep that an integer till the multiply?).
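(Interjecting with an illustration: the Decimal-based spec handling Ben describes above can be as simple as the following sketch -- the variable names and the divisibility check are mine, not from his code.)

```python
from decimal import Decimal
import numpy as np

# Grid spec strings as they might come from a config file or the command line.
start, stop, res = Decimal("-115.0"), Decimal("-44.99"), Decimal("0.01")

ngaps = (stop - start) / res                       # exactly 7001 for these inputs
assert ngaps == ngaps.to_integral_value(), "bounds not divisible by the resolution"

# Only now drop into binary floating point, with the count already pinned down.
grid = np.linspace(float(start), float(stop), int(ngaps) + 1, dtype=np.float32)
```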
> Sorry, that was a leftover from a previous draft of my email after I discovered that linspace's accuracy was on par with fma(). And while arange() has inherent problems, it can still be made better than it is now. In fact, I haven't investigated this, but I did recently discover some unit tests of mine started to fail after a numpy upgrade, and traced it back to a reduction in the accuracy of a usage of arange() with float32s. So, something got worse at some point, which means we could still get accuracy back if we can figure out what changed.

>>>> In particular, I have been needing improved results for computing latitude/longitude grids, which tend to be done in float32's to save memory (at least, this is true in data I come across).

>>> If you care about saving memory *and* accuracy, wouldn't it make more sense to do your computations in float64, and convert to float32 at the end?

>> that does seem to be the easy option :-)

> Kinda missing the point, isn't it? Isn't that like saying "convert all your data to float64s prior to calling np.mean()"? That's ridiculous. Instead, we made np.mean() upcast the inner-loop operation, and even allow an option to specify the dtype that should be used for the aggregator.

>>>> Now, to the crux of my problem. It is next to impossible to generate a non-trivial numpy array of coordinates, even in double precision, without hitting significant numerical errors.

>> I'm confused, the example you posted doesn't have significant errors...

> Hmm, "errors" was the wrong word. "Differences between methods" might be more along the lines of what I was thinking. Remember, I am looking for consistency.

>>>> Which has led me down the path of using the decimal package (which doesn't play very nicely with numpy because of the lack of casting rules for it). Consider the following:
>>>> ```
>>>> $ cat test_fma.py
>>>> from __future__ import print_function
>>>> import numpy as np
>>>> res = np.float32(0.01)
>>>> cnt = 7001
>>>> x0 = np.float32(-115.0)
>>>> x1 = res * cnt + x0
>>>> print("res * cnt + x0 = %.16f" % x1)
>>>> x = np.arange(-115.0, -44.99 + (res / 2), 0.01, dtype='float32')
>>>> print("len(arange()): %d arange()[-1]: %16f" % (len(x), x[-1]))
>>>> x = np.linspace(-115.0, -44.99, cnt, dtype='float32')
>>>> print("linspace()[-1]: %.16f" % x[-1])
>>>>
>>>> $ python test_fma.py
>>>> res * cnt + x0 = -44.9900015648454428
>>>> len(arange()): 7002 arange()[-1]: -44.975044
>>>> linspace()[-1]: -44.9900016784667969
>>>> ```
>>>> arange just produces silly results (puts out an extra element... adding half of the resolution is typically mentioned as a solution on mailing lists to get around arange()'s limitations -- I personally don't do this).

>> The real solution is "don't do that" -- arange is not the right tool for the job.

> Well, it isn't the right tool because as far as I am concerned, it is useless for anything but integers. Why not fix it to be more suitable for floating point?

>> Then there is this:
>>
>> res * cnt + x0 = -44.9900015648454428
>> linspace()[-1]: -44.9900016784667969
>>
>> that's as good as you are ever going to get with 32 bit floats...

> Consistency is the key thing. I am fine with one of those values, so long as that value is what happens no matter which way I specify my grid.
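(Aside: the "extra" element from that arange() call above is predictable -- arange's documented length is ceil((stop - start) / step), and the half-resolution fudge pushes the ratio to roughly 7001.5, which rounds up. A quick check, using the same inputs as the script:)

```python
import math
import numpy as np

res = np.float32(0.01)
start, stop = -115.0, -44.99 + float(res) / 2   # the closed-interval fudge from the script
print(math.ceil((stop - start) / 0.01))          # 7002 -- one more than cnt above
```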
>> Though I just noticed something about your numbers -- there should be a nice even base ten delta if you have 7001 gaps -- but linspace produces N points, not N gaps -- so maybe you want:
>>
>> In [17]: l = np.linspace(-115.0, -44.99, 7002)
>>
>> In [18]: l[:5]
>> Out[18]: array([-115.  , -114.99, -114.98, -114.97, -114.96])
>>
>> In [19]: l[-5:]
>> Out[19]: array([-45.03, -45.02, -45.01, -45.  , -44.99])
>>
>> or, in float32 -- not as pretty:
>>
>> In [20]: l = np.linspace(-115.0, -44.99, 7002, dtype=np.float32)
>>
>> In [21]: l[:5]
>> Out[21]: array([-115.        , -114.98999786, -114.98000336, -114.97000122, -114.95999908], dtype=float32)
>>
>> In [22]: l[-5:]
>> Out[22]: array([-45.02999878, -45.02000046, -45.00999832, -45.        , -44.99000168], dtype=float32)
>>
>> but still as good as you get with float32, and exactly the same result as computing in float64 and converting:
>>
>> In [25]: l = np.linspace(-115.0, -44.99, 7002).astype(np.float32)
>>
>> In [26]: l[:5]
>> Out[26]: array([-115.        , -114.98999786, -114.98000336, -114.97000122, -114.95999908], dtype=float32)
>>
>> In [27]: l[-5:]
>> Out[27]: array([-45.02999878, -45.02000046, -45.00999832, -45.        , -44.99000168], dtype=float32)

> Argh! I got myself mixed up between specifying pixel corners versus pixel centers. rasterio has been messing me up on this.

>>>> So, does it make any sense to improve arange by utilizing fma() under the hood?

>> no -- this is simply not the right use-case for arange() anyway.

> arange() has accuracy problems, so why not fix it?
>
> >>> l4 = np.arange(-115, -44.99, 0.01, dtype=np.float32)
> >>> np.median(np.diff(l4))
> 0.0099945068
> >>> np.float32(0.01)
> 0.0099999998
>
> There is something significantly wrong here if arange(), which takes a resolution parameter, can't seem to produce a sequence with the proper delta.

>>>> Also, any plans for making fma() available as a ufunc?

>> could be nice -- especially if used internally.

>>>> Notice that most of my examples required knowing the number of grid points ahead of time. But what if I didn't know that? What if I just have the bounds and the resolution? Then arange() is the natural fit, but as I showed, its accuracy is lacking, and you have to do some sort of hack to do a closed interval.

>> no -- it's not -- if you have the bounds and the resolution, you have an over-specified problem. That is:
>>
>> x_min + (n * delta_x) == x_max
>>
>> If there is ANY error in either delta_x or x_max (or x_min), then you'll get a mismatch, which is why arange is not the answer (you can make the algorithm a bit more accurate, I suppose, but there is still fp-limited precision -- if you can't exactly represent either delta_x or x_max, then you CAN'T use the arange() definition and expect it to work consistently).
>>
>> The "right" way to do it is to compute N with round((x_max - x_min) / delta), and then use linspace:
>>
>> linspace(x_min, x_max, N+1)
>>
>> (note that it's too bad you need to do N+1 -- if I had to do it over again, I'd use N as the number of "gaps" rather than the number of points -- that's more commonly what people want, if they care at all)
>>
>> This way, you get a grid with the endpoints as exact as they can be, and the deltas as close to each other as they can be as well.
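(In code, that recipe is just the following sketch -- the function name is mine:)

```python
import numpy as np

def grid_from_bounds_and_res(x_min, x_max, delta):
    # Derive the integer number of gaps first, then let linspace pin both
    # endpoints exactly and spread the rounding over the interior points.
    n_gaps = int(round((x_max - x_min) / delta))
    return np.linspace(x_min, x_max, n_gaps + 1)

x = grid_from_bounds_and_res(-115.0, -44.99, 0.01)
print(len(x), x[0], x[-1])   # 7002 -115.0 -44.99
```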
>> maybe you can do a better algorithm in linspace to save an ULP, but it's hard to imagine when that would matter.

> Yes, it is overspecified. My problem is that different tools require different specs (ahem... rasterio/gdal), and I have grid specs coming from other sources. And I need to produce data onto the same grid so that tools like xarray won't yell at me when I am trying to do an operation between gridded data that should have the same coordinates, but are off slightly because they were computed differently for whatever reason.
>
> I guess I am crying out for some sort of tool that will help the community stop making the same mistakes. A one-stop shop that'll allow us to specify a grid in a few different ways and still produce the right thing, and even do the inverse... provide a coordinate array and get grid specs in whatever form we want. Maybe even have options for dealing with pixel corners vs. pixel centers, too? There are additional fun problems such as padding out coordinate arrays, which np.pad doesn't really do a great job with.
>
> Cheers!
> Ben Root
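For what it's worth, a very rough sketch of what such a "one-stop shop" might look like for the three spec flavors mentioned above (everything here -- the name, the signature, and the pixel-corner assumption -- is hypothetical):

```python
import numpy as np

def make_grid(start, stop=None, count=None, res=None, dtype=np.float32):
    # Hypothetical helper: accept start/stop/count, start/res/count, or
    # start/stop/res (pixel-corner convention), reduce them all to
    # (start, stop, count), and always build the array the same way.
    if count is None:                                  # start/stop/res
        count = int(round((stop - start) / res)) + 1
    elif stop is None:                                 # start/res/count
        stop = start + res * (count - 1)
    # start/stop/count needs no adjustment.
    return np.linspace(start, stop, count).astype(dtype)  # compute in float64, downcast last

a = make_grid(-115.0, stop=-44.99, count=7002)
b = make_grid(-115.0, stop=-44.99, res=0.01)
c = make_grid(-115.0, res=0.01, count=7002)
print(np.array_equal(a, b), np.array_equal(a, c))   # all three specs should agree
```

The inverse operation (coordinate array back to a spec), corner/center conversion, and padding would have to be layered on top of something like this, and that is where it gets less trivial.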