On Oct 4, 8:00 am, Fredrik Johansson <fredrik.johans...@gmail.com>
wrote:
> On Sun, Oct 4, 2009 at 6:57 AM, rjf <fate...@gmail.com> wrote:
> > On Oct 3, 5:11 am, Fredrik Johansson <fredrik.johans...@gmail.com>
> > wrote:
>
> > My guess is that you have not talked this over with a numerical
> > analyst.
>
> No, and I suppose I might if a compelling case were presented to me
> that I'm doing something wrong. So far you have failed to present such
> a case.
It is hard to argue with someone who rejects expertise.
>
> >> The purpose of this code is *not* to add a list of binary
> >> fractions accurately.
>
> > It is unlikely that the best way to add a list of any numbers that you
> > are given is
> > to start by throwing out information that provides the detailed values
> > of those numbers. Which is what you appear to be doing.
>
> Most of the time the terms are generated within the evalf code, and
> are known to be approximations.
The best way to evaluate the sum of approximations does NOT start with
approximating the approximations to lower precision. Are you disputing
this?
> The evalf code *does* allow you to add
> exact binary numbers exactly, if that's what you really want to do --
> it will do this by creating an intermediate mantissa that is large
> enough to hold the exact sum at any time.
Well, if you consulted a numerical analyst, or even read the paper I
mentioned earlier (see the Shewchuk reference), you would see that
your approach is not necessary.
> This just happens to be
> inefficient when adding terms that are several hundred orders of
> magnitude apart, which is rare.
Yes, that's why you might do something else (like Shewchuk).
> (If we're talking about differences
> less than 100 orders of magnitude or so, the overhead of doing extra
> arbitrary-precision operations is much higher than the cost of
> increasing the precision. The simple algorithm is faster in the common
> case.)
I don't understand. "increasing the precision" IS doing "extra
arbitrary-precision operations."
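For the record, the kind of alternative I have in mind looks roughly
like this. This is my own minimal Python sketch of an error-free
two-sum plus a compensated accumulation (in the spirit of Kahan and
Shewchuk); the function names are mine, not from any library:

def two_sum(a, b):
    # Knuth's TwoSum: s = fl(a + b) and a + b == s + e exactly,
    # assuming IEEE doubles with round-to-nearest.
    s = a + b
    bv = s - a
    av = s - bv
    return s, (a - av) + (b - bv)

def compensated_sum(terms):
    # Fold the per-step rounding errors into one correction term;
    # this recovers a far more accurate sum without ever widening
    # the working precision.
    total = 0.0
    correction = 0.0
    for t in terms:
        total, err = two_sum(total, t)
        correction += err
    return total + correction

terms = [1.0, 1e-17, -1.0]
print(sum(terms))              # 0.0 -- the tiny term is lost
print(compensated_sum(terms))  # 1e-17 -- recovered exactly

No widening of the mantissa is needed; the error terms carry exactly
the information that naive addition throws away.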
>
> It would be possible to check for the case where all terms are exact
> binary fractions of widely different magnitudes and detect exact
> cancellations.
All binary floating-point numbers ARE EXACT BINARY FRACTIONS.
Pardon my shouting but this is fundamental.
> But exact addition of huge and tiny binary fractions is
> a relatively uncommon thing to do and optimizing for it isn't
> necessarily worth the added complexity.
I'm not asking for you to optimize for it. Just to do it correctly.
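And doing it correctly is not exotic. Here is a plain-Python
illustration (fractions.Fraction, nothing SymPy- or Sage-specific) of
exact addition of a huge and a tiny binary fraction:

from fractions import Fraction

# Every finite IEEE double is an exact binary fraction, and Fraction
# recovers it exactly.
big  = Fraction(2.0**100)      # 1267650600228229401496703205376
tiny = Fraction(2.0**-100)     # 1/1267650600228229401496703205376

# Exact addition of terms ~60 orders of magnitude apart loses nothing,
# and the cancellation below really does give the tiny term back.
total = big + tiny
print(total - big == tiny)     # True
print(float(total - big))      # 7.888609052210118e-31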
> I've never needed faster code
> for such a thing in SymPy, and no one has asked for it.
This sounds like an excellent defense. Not.
>
> Generally, the evalf code is written subject to the assumption that
> rational parts of symbolic expressions have already been simplified.
> SymPy has separate facilities for exact rational arithmetic, and those
> are generally better for exact arithmetic.
We are not discussing the addition of terms that are stored in Sage in
a numerator/denominator form, where the addition is intended to
maintain the exactness of the addition at any cost.
>
>
>
> >>It is to approximate the sum of symbolically
> >> defined terms, which with a high probability don't have exact binary
> >> representations.
>
> > So you are saying that I am lying, or don't know what I'm doing when I
> > type in s=1.234501000, (maybe even 1234501.0/1000000.0)
>
> > You are saying that you are so sure that I don't mean that number that
> > you are going to throw out some of the digits.
>
> > And let us say that I didn't type in that number, but I typed in
> > r = 5559698243588505001/4503599627370496000
> > and then coerced it to a float, call it rf.
> > Are you still so sure that I meant some irrational number that you are
> > going to throw out some of the digits that I supplied?
>
> > would you then compute s-rf by rounding each of them to 5 digits, and
> > computing it as zero? Or would you compute the difference between s,
> > and r exactly, about 7.07e-17 ?
>
> I'm saying that the code *isn't optimized* for handling extremely
> special cases involving exact binary numbers. evalf deals *correctly*
> with them. It definitely assumes that user-input floating-point
> numbers are what they are (that the user knows what he's doing), and
> does not pretend that they are magical approximations of something
> else. If you input the Python float (=double) 0.1 it will consider it
> to be the exact fraction 3602879701896397/36028797018963968 because
> there is no reason to assume anything else.
Good.
>
> For example:
>
> >>> Add(Rational(-1,10), x).evalf(subs={x:0.1})
> 5.55111512312578e-18
> >>> Add(Rational(-1,10), x).evalf(30, subs={x:0.1})
> 5.55111512312578270211815834045e-18
>
This is pretty obscure syntactically, but I think that you are saying
that some variant of
(-1/10)+0.1 comes out as 5.55..e-18. This depends on your rules for
coercions.
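Decoding that with exact rational arithmetic (plain Python, just to
check the quoted number) does reproduce the residue:

from fractions import Fraction

# The double written as 0.1 is the exact binary fraction
# 3602879701896397/36028797018963968; subtracting the exact 1/10
# leaves the residue quoted above.
tenth_as_double = Fraction(0.1)
print(tenth_as_double)                           # 3602879701896397/36028797018963968
print(float(tenth_as_double - Fraction(1, 10)))  # 5.551115123125783e-18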
>
>
>
>
> >> In this context, including much smaller terms is a
> >> waste of time.
>
> > I think people would expect Sage to get the arithmetically closest
> > representable answer, and would not be so concerned about the
> > possibility of wasting a little time.
>
> >> I'll illustrate with a simple example (in decimal). Consider three
> >> symbolic terms (with high probability irrational or at least not exact
> >> binary numbers)
>
> > How could you possibly know that the terms you are viewing are not
> > exact binary numbers?
> > Your best bet is to believe that the numbers you are asked to add
> > together are exactly those binary numbers you are asked to add,
> > because any other bet is less accurate.
>
> Answered above.
What you said above is that if evalf is asked to convert 0.1 to a
30-digit number, it takes the binary representation of 0.1, pads it
with binary zeros, and comes up with a number that looks like
.100000000000000005551115123126
This is fine.
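For anyone following along, the same digits fall out of plain Python's
decimal module (my check, not the evalf code):

from decimal import Decimal, getcontext

# The double 0.1 converted exactly to decimal, then rounded to 30
# significant digits.
exact = Decimal(0.1)
print(exact)   # 0.1000000000000000055511151231257827021181583404541015625
getcontext().prec = 30
print(+exact)  # 0.100000000000000005551115123126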
>
> >> which to 10 digit accuracy are:
>
> >> 1.234501000
> >> 1.000000000e-10
> >> -1.234500000
>
> >> Suppose we want to evaluate their sum to a precision of 5 digits.
>
> > If you want to do arithmetic on data that is 10 decimal digits
> > precisely in decimal,
>
> That's not what I said. I said that those were 10 digit accurate
> representations of some numbers (possibly exact decimals and possibly
> not).
Whether a binary floating-point number "is an exact decimal" or not
is, in general, unknowable. So basing your arithmetic on the
assumption that it "probably isn't exact", and therefore deliberately
losing information, doesn't make sense to me.
>
>
>
> > it makes sense to do that in 10 (or more) decimal digit arithmetic. If
> > you want the answer to be correct to 10 decimals, you ordinarily have
> > to do a little extra work to get it right. If you want to evaluate the
> > sum of a new set of data that is the rounded value (to 5 digits) of a
> > set of numbers using 5 digit arithmetic, that is a different task.
> > The first two numbers add up to
> > 1.2345010001 if you did it accurately, but maybe you are not... for 5
> > digits, you are adding 1.2345 and 1.0000e-10
> > which rounds, in 5 digit arithmetic, to
> > 1.2345
> > Next, if you add -1.2345 the sum is zero.
>
> >> The
> >> sum should be about 1.0001e-6.
>
> > No, you are not adding in 5 digits at all. You apparently are using 10
> > digits.
>
> The exact sum of the original numbers, rounded to 5 digits, is
> 1.0001e-6. This is what evalf computes when called with a precision
> argument of 5 digits.
OK, then you are saying you are adding the exact numbers exactly,
e.g. numbers like this:
1.0000000000000000364321973155...e-10 (which is 1e-10 with "more bits"),
and then afterward rounding to 5 decimal digits.
Let us say the sum of the numbers is
1234.45
Do we round to "even", giving exactly 1234.4?
Or do we need to compute some more digits, because
1234.45000000000000000000000001
would be rounded to 1234.5 in order to round to "nearest"?
If the terms "round to even" and "round to nearest", or the general
issue known as the table-maker's dilemma, are unfamiliar, it might be
nice to know them.
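To spell out the two cases, here they are with Python's decimal module
(purely illustrative; evalf presumably works in binary rather than
decimal):

from decimal import Decimal, getcontext, ROUND_HALF_EVEN

getcontext().prec = 5
getcontext().rounding = ROUND_HALF_EVEN

# An exact halfway case rounds to even: 1234.45 -> 1234.4.
print(+Decimal("1234.45"))                          # 1234.4
# If further digits show the value is just above halfway,
# round-to-nearest must give 1234.5 instead.
print(+Decimal("1234.45000000000000000000000001"))  # 1234.5

Deciding which case you are in can require arbitrarily many extra
digits; that is the dilemma.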
>
> > First we evaluate each term to 5
> >> digits:
>
> >> 1.2345
> >> 1.0000e-10
> >> -1.2345
>
> >> Now, adding these terms in order naively yields 0.0 which is wrong.
>
> > Actually, 0.0 is perfectly correct, in 5 digit arithmetic.
>
> Yes, that's exactly what I said. 0.0 is the sum using 5 digit
> arithmetic, but the wrong value for the exact sum, and the exact sum
> is what evalf is trying to approximate.
Either evalf is approximating the sum, or it is computing the exact
sum. I think that what you mean is that you are attempting to return a
result which is the exact sum rounded to a certain number of places.
This is something you are probably not doing correctly, unless you
have taken care of the situation above. A naive approach might get all
the numbers right except for those rounded incorrectly on the margins,
so random testing might not discover this.
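Here is the 5-digit collapse made concrete with mpmath (the library
underneath SymPy's evalf, though this snippet is mine, not the evalf
code):

from mpmath import mp, mpf, nstr

# At 5 significant digits the naive left-to-right sum of the rounded
# terms collapses to zero.
mp.dps = 5
print(mpf('1.234501000') + mpf('1.000000000e-10') + mpf('-1.234500000'))  # 0.0

# With more working precision the same sum is 1.0001e-6 to 5 digits,
# the correctly rounded sum of the 10-digit inputs.
mp.dps = 30
s = mpf('1.234501000') + mpf('1.000000000e-10') + mpf('-1.234500000')
print(nstr(s, 5))  # 1.0001e-6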
>
>
>
> >> Adding them using compensated (or exact) summation yields 1.0000e-10,
> >> which is the exact sum of the approximated terms but the wrong value
> >> for the sum of the exact terms.
>
> > Well, it is a far more accurate sum of the three numbers you provided,
> > than you
> > would expect with a 5-digit adder.
>
> >> The only way to compute the sum of the
> >> exact terms accurately is to restart with higher precision
> >> approximations for the terms.
>
> > You can compute the sum perfectly accurately. If you restart with
> > different values, you can compute THAT sum perfectly accurately as
> > well. You are confusing summation with
> > some idea about better approximation of terms.
>
> I am definitely not.
OK, then I am quite confused by what you mean by "higher precision
approximations for the terms."
>
> > You are missing a basic fact, and that is you think there are some
> > "other" numbers, perhaps irrational (why not transcendental? why not
> > complex with itsy bitsy imaginary parts?) that are the "accurate"
> > values, and that your summation program is required to somehow read
> > the minds of the values to get "higher precision".
>
> That's right, the main purpose of the evalf code is to "read the
> minds" of the "other" numbers. It's to be able to convert an exact
> expression like exp(1/pi^1000)-1 into a floating-point approximation
> of the real (or complex) number it represents.
OK, then you are not summing numbers. You are attempting to find a
value for an expression that is accurate to some prespecified goal, by
evaluating subparts and combining them. This is quite difficult to do
right.
Try to evaluate sinh(x) - 1/2*(exp(x)-exp(-x)) accurately to 30
places.
Mathematica's ... N[ ... some expression ..., 100] would attempt to
evaluate the expression so that the result is accurate to 100 decimal
digits.
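That sinh example is nasty precisely because the expression is
identically zero. Here is an mpmath sketch of mine showing why: at any
fixed working precision the subtraction leaves only rounding noise, so
no finite precision ever certifies the value.

from mpmath import mp, mpf, sinh, exp

# sinh(x) - (exp(x) - exp(-x))/2 is exactly zero for every x, but a
# fixed-precision evaluation only ever yields rounding noise on the
# order of the working epsilon, never a certified zero.
for digits in (10, 30, 100):
    mp.dps = digits
    x = mpf('0.1')
    print(digits, sinh(x) - (exp(x) - exp(-x)) / 2)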
> Since "reading the
> minds" is generally impossible (determining whether general
> expressions are zero is undecidable), it uses a heuristic (tracking
> the error and adaptively increasing the precision) that produces the
> right result most of the time,
Most of the time? Might it ever converge and report the wrong result?
[Can you come up with examples where 10-digit arithmetic, 20-digit
arithmetic, and 30-digit arithmetic all come up with the same
incorrect answer, one that is totally different from the correct
answer computable in 31 digits...]
> and otherwise reports that it failed to
> reach the accuracy goal (in the form of a low precision result),
This is probably not a good idea (returning low precision), at least
if you are doing the same kind of thing that Mathematica does with a
"low precision result". Mathematica lets you make decisions like
Is x==0? when x has such low precision that it could be zero or not.
A warning might be better.
> allowing the user to decide what to do (e.g. repeating the calculation
> with even higher precision, try to rewrite it in a more numerically
> stable form, or simplifying it symbolically...). For example:
>
> >>> (exp(1/pi**1000)-1).evalf()
> .0e-125
> >>> (exp(1/pi**1000)-1).evalf(maxprec=1000)
> 7.08153336783938e-498
> >>> (2*exp(1/pi**1000/2)*sinh(1/pi**1000/2)).evalf()
> 7.08153336783938e-498
This is not a summation, and we've wandered pretty far afield. (By the
way, your number looks right compared to Maxima:
fpprec:1000;
bfloat(exp(1/%pi^1000)-1)
>
> The best way to handle sub-evaluations that may have exact results is
> to try to do them symbolically / with exact rationals before calling
> evalf.
It seems to me that your evalf should be perfectly capable of handling
exact rationals (perfectly).
>Ultimately the user needs to be aware of these issues and
> choose the appropriate sequence of operations.
The usual point of something like N[] in Mathematica or your evalf()
is that the user should NOT need to know these things. There is a
whole literature on robust calculation, which explains how to do this
one way or another, e.g. the work by Chee K. Yap. But if you don't
think you need to know how to do this because you are already doing it
right, what can I say?
Incidentally, I think evalf() is a function name used in Maple, but it
doesn't do the same thing.
RJF