Re: Repair to floating point enumerations?

Christopher Lane Hinson Wed, 15 Oct 2008 14:23:50 -0700


I agree with David, we should be using multiplication, not addition.
However, I think that under the law of least surprise, we should
require that for all a,b,z:


all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].

For example, anything in the neighborhood of this is just unfair, evenif it's within David's fudge factor:


Prelude> map (\x -> 1 / (x-0.6)) [0,0.1..0.55]
[-1.6666666666666667,-2.0,-2.5,-3.333333333333334,-5.000000000000001,-10.000000000000002,Infinity]

If I want to include the terminating value, then what I really want isprobably some f such that:


f (6 :: Int) 0 0.55 = [0,0.11,0.22,0.33,0.44,0.55]

--Lane



On Wed, 15 Oct 2008, David Roundy wrote:

Here's a counter-proposal:

numericEnumFromThenTo   :: RealFloat a => a -> a -> a -> [a]
numericEnumFrom 0 = map fromInteger [1..]
numericEnumFrom n = map ((n+).fromInteger) [1..]
numericEnumFromThen n m =  map (\x -> n+fromInteger x*(m-n)) [1..]
numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n)
numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) 
(numericEnumFromThen n m)

epsilon :: RealFloat a => a
epsilon = 1/2^(floatDigits (undefined :: a))

This uses quite a reasonable approximation for the roundoff error, and
has the advantage of not inappropriately returning _|_.  It does
sometimes create duplicate entries in the list, but I think that is
better that returning an infinite list of duplicate entries as the
code proposed below does.

And yes, the fuzzy comparison is a bit ugly, but at least it means
that every user is not forced to implement fuzzy comparison in their
quick and dirty code (which is the only thing this syntax is good
for).

David

On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:

I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
of 0.1 get exactly what they deserve if x is not included.  Floating
point numbers are not the real numbers, and the sooner they learn that
the better.  We can fudge this all we like, but 0.1 is never going to
be exactly representable as a binary floating point number no matter
what we do.

On Wed, Oct 15, 2008 at 3:44 PM, David Roundy <[EMAIL PROTECTED]> wrote:

On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:

Dear Haskell-Primers (and libraries).

Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
Prelude, regarding numeric enumerations for Floats/Doubles:

    Prelude> [0, 0.3 .. 1.1]
    [0.0,0.3,0.6,0.899999,1.2]

What is odd is that the last value of the list is much larger than the
specified termination value.  But this is exactly as specified by the
Haskell'98 Report.

    http://haskell.org/onlinereport/basic.html#sect6.3.4

    "For Float and Double, the semantics of the enumFrom family is given
    by the rules for Int above, except that the list terminates when the
    elements become greater than e3+i/2 for positive increment i, or
    when they become less than e3+i/2 for negative i.

We have discussed this question (and related ones, such as whether Float
and Double belong in the Enum class at all) several times before, and I
do not wish to rehash all of those points again e.g.:

    http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
    http://www.haskell.org/pipermail/haskell/2001-October/008218.html
    http://www.haskell.org/pipermail/haskell/2002-October/010607.html

Phil proposes that, although retaining the instances of Enum for Float
and Double, we simplify the definitions of the numericEnumFrom family:

  numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
  numericEnumFrom         =  iterate (+1)
  numericEnumFromThen n m =  iterate (+(m-n)) n
  numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
  numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)

The particular feature of note is that the odd fudge factor of (1/2 *
the increment) is removed.  The inexact nature of floating point numbers
would therefore cause a specification like

    [ 0.0, 0.1 .. 0.3 ]

to yield the sequence

    [ 0.0, 0.1, 0.2 ]

that is, to omit the upper bound, because (3 * 0.1) is actually
represented as 0.30000000000004, strictly greater than 0.3.

Phil argues that this behaviour is more desirable: "the simple fix is
that the user must add a suitable epsilon to the upper bound.  The key
word here is *suitable*.  The old definitions included completely
bizarre and often highly unsuitable choices of epsilon."

This proposal seems to me to improve the consistency of the enumeration
syntax across both the integral and floating types.  Some users may
still be surprised, but the surprise will be easier to explain.

I am bringing this question to the attention of all who are interested
in Haskell Prime, because it seems like a sensible and well-reasoned
change.  Discussion on whether to adopt this proposal for H' is welcome.

But as maintainer and bug-fixer of the Haskell'98 Report, I have also
been asked whether we should make this change retrospectively to the
Haskell'98 language (as a "typo").  Since it involves not merely an
ordinary library function, but a Prelude function, and moreover a
function that is used in the desugaring of syntax, it is less clear to
me whether to alter Haskell'98.

Thoughts?


It sounds like a bad fix to me.  It seems important that the
[0.0,0.1.. x] notation should work correctly in the common cases.  And
the common case really is that the final value is intended as an exact
multiple of the increment.

Why not look for a heuristic that gets the common cases right, rather
than going with an elegant wrong solution? After all, these
enumerations are most often used by people who neither care nor know
how they're implemented, but who most likely would prefer if haskell
worked as well as matlab, python, etc.

One reasonable option would be to actually take into account the
expected roundoff error (which isn't hard to compute for simple sums
like this).

It would also be a good idea to reduce that roundoff error by using
multiplication rather than addition to create the enumeration (which
also allows us to ensure that finite enumerations terminate).
e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
Admittedly, this is a stupid thing to compute, but it's also stupid to
add many small numbers together in sequence, since it always serves to
increase the roundoff error.  It's true that most people's C code
would be just as naive, but we're writing Haskell, and you're talking
about the Prelude, which should be written intelligently, not using
the simplest code, in such a way that it won't bite programmers.

David
_______________________________________________
Libraries mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/libraries

_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime

Re: Repair to floating point enumerations?

Reply via email to