On Thu, Feb 22, 2018 at 11:57 AM, Sebastian Berg <sebast...@sipsolutions.net> wrote:

> > First, you are right...Decimal is not the right module for this. I
> > think instead I should use the 'fractions' module for loading grid
> > spec information from strings (command-line, configs, etc). The
> > tricky part is getting the yaml reader to use it instead of
> > converting to a float under the hood.

I'm not sure fractions is any better (or necessary, anyway). In the end you
need floats, so the inherent limitations of floats aren't the problem. In
your original use case, you wanted a 32 bit float grid in the end, so doing
the calculations in 64 bit float and then downcasting is as good as you're
going to get, and easy and fast. I suppose you could use 128 bit float
if you want to get to 64 bit in the end -- not as easy, and Python itself
doesn't have it.
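For concreteness, a minimal sketch of the compute-in-64-bit-then-downcast
approach (the endpoints and count here are made up):

    import numpy as np

    # Build the grid in float64, then downcast once at the end --
    # each point picks up at most one extra rounding from the cast.
    grid64 = np.linspace(0.0, 10.0, 101)
    grid32 = grid64.astype(np.float32)

    # versus computing directly in float32:
    grid32_direct = np.linspace(0.0, 10.0, 101, dtype=np.float32)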

>  The
> tricky part is getting the yaml reader to use it instead of
> converting to a float under the hood.

64 bit floats support about 15 decimal digits -- are your string-based
sources providing more than that? If not, then the 64 bit float version is
as good as it's going to get.
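A quick way to see what "about 15 digits" means in practice:

    import numpy as np

    print(np.finfo(np.float64).precision)  # 15 -- decimal digits a float64 always preserves
    # 17 significant digits are enough to round-trip any float64 exactly:
    x = 1.0 / 3.0
    assert float(f"{x:.17g}") == x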

> Second, what has been pointed out about the implementation of arange
> > actually helps to explain some oddities I have encountered. In some
> > situations, I have found that it was better for me to produce the
> > reversed sequence, and then reverse that array back and use it.

interesting -- though I'd still say "don't use arange" is the "correct"
answer here.

> > Third, it would be nice to do what we can to improve arange()'s
> > results. Would we be ok with a PR that uses fma() if it is available,
> > but then falls back on a regular multiply and add if it isn't
> > available, or are we going to need to implement it ourselves for
> > consistency?

I would think calling fma() if supported would be fine -- if there is an
easy macro to check whether it's there. I don't know if numpy has a policy
about this sort of thing, but I'm pretty sure everywhere else the final
details of computation fall back to the hardware/compiler/library (e.g.
Intel uses extended precision FP, other platforms don't, etc), so I can't
see that having a slightly more accurate computation in arange on some
platforms and not others would cause a problem. If any of the tests are
checking to that level of accuracy, they should be fixed :-)
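In C this would be the fma() from <math.h> behind a feature check; to
illustrate just the single-rounding idea in Python terms (fma32 is a
hypothetical helper, emulated via float64; the values are arbitrary):

    import numpy as np

    def fma32(a, b, c):
        # The product of two float32 values is exact in float64, so
        # computing a*b + c in float64 and rounding once to float32
        # closely mimics a true fused multiply-add (it can differ in
        # rare double-rounding cases).
        return np.float32(np.float64(a) * np.float64(b) + np.float64(c))

    i, step, start = np.float32(7), np.float32(0.1), np.float32(0.3)
    naive = i * step + start        # two roundings: multiply, then add
    fused = fma32(i, step, start)   # effectively one rounding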

> 2. It sounds *not* like a fix, but rather a
>    "make it slightly less bad, but it is still awful"

exactly -- better accuracy is a good thing, but it's not the source of the
problem here -- the source of the problem is inherent to FP, and/or a poorly
specified goal. Having arange or linspace lose a couple fewer ULPs isn't
going to change anything.
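To put "a couple ULPs" in perspective (the exact numbers vary by platform
and numpy version):

    import numpy as np

    a = np.arange(0.0, 1.0 + 0.05, 0.1)  # pad the stop so the endpoint isn't dropped
    l = np.linspace(0.0, 1.0, 11)
    print(len(a), len(l))                # both 11 here
    print(np.max(np.abs(a - l)))         # typically a few ULPs of 1.0 at most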

> Using fma inside linspace might make linspace a bit more exact, and
> would be a good thing, though I am not sure we have a policy yet for
> something that is only used sometimes,

see above -- nor do I, but it seems like a fine idea to me.

> It also would be nice to add stable summation to numpy in general (in
> whatever way), which maybe is half related but on nobody's specific
> todo list.

I recall a conversation on this list a (long) while back about compensated
summation (Kahan summation) -- I guess nothing ever came of it?
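For reference, the compensated-summation idea is only a few lines (a
generic sketch, not whatever was proposed back then):

    def kahan_sum(values):
        # Compensated (Kahan) summation: carry the low-order bits lost
        # in each addition along in a separate compensation term.
        total = 0.0
        comp = 0.0
        for x in values:
            y = x - comp
            t = total + y
            comp = (t - total) - y   # what the addition just dropped
            total = t
        return total

    print(sum([0.1] * 10))        # 0.9999999999999999 on CPython
    print(kahan_sum([0.1] * 10))  # 1.0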

> Lastly, there definitely needs to be a better tool for grid making.
> > The problem appears easy at first, but it is fraught with many
> > pitfalls and subtle issues. It is easy to say, "always use
> > linspace()", but if the user doesn't have the number of pixels, they
> > will need to calculate that using --- gasp! -- floating point
> > numbers, which could result in the wrong answer.

agreed -- this tends to be an inherently over-specified problem. A regular
grid is defined by: min_value, max_value, spacing, number_of_grid_spaces.

That is four values, and only three independent ones.

arange() looks like it uses: min_value, max_value, spacing -- but it
doesn't really (see previous discussion), so it's not the right tool for
anything.

linspace() uses: min_value, max_value, (number_of_grid_spaces + 1), which
is about as good as you can get (except for that annoying +1).

But what if you are given min_value, spacing, number_of_grid_spaces?

Maybe we need a function for that? Though I think it would simply be:

min_value + np.arange(number_of_grid_spaces + 1) * spacing

which is why we probably don't need a function :-) (note that that incurs
the error of only one multiply and one add per grid point)

Or maybe a way to take all four values, and return a "best fit" grid. The
problem with that is that it's over-specified, and it may not be only FP
error that makes it not fit. What should the code do then?
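One way such a "best fit" check might look (make_grid is a hypothetical
name; it takes three of the values and validates that the implied fourth
is consistent; the tolerance is a guess):

    import numpy as np

    def make_grid(min_value, max_value, spacing, rtol=1e-12):
        # Derive the number of grid spaces from the other three values,
        # and refuse if the spacing doesn't (nearly) divide the span --
        # because then the mismatch isn't just FP error.
        n_spaces = (max_value - min_value) / spacing
        n = int(round(n_spaces))
        if n < 1 or abs(n_spaces - n) > rtol * n:
            raise ValueError("spacing does not evenly divide [min, max]")
        return np.linspace(min_value, max_value, n + 1)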

So Ben:

What is the problem you are trying to solve? I'm still confused. What
information do you have to define the grid? Maybe all we need are docs for
how best to compute a grid from a given specification, with pointers to
them in the arange() and linspace() docstrings.


> I once wanted to add a "step" argument to linspace, but didn't in the
> end, largely because it basically enforced in a very convoluted way
> that the step fit exactly to a number of steps (up to floating point
> precision) and nobody was quite sure it was a good idea, since it would
> just be useful for a little convenience when you do not want to
> calculate the steps.




Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
