Re: [Python-Dev] Rewrite of cmath module?

2007-03-18 Thread Tim Peters
[Mark Dickinson]
 The current cmath module has some significant numerical problems,
 particularly in the inverse trig and inverse hyperbolic trig
 functions.

IIRC, cmath was essentially whipped up over a weekend by a scientist,
transcribing formulas from a math reference.  Such an approach is
bound to suffer significant numerical problems, but has the
advantage of taking little time to implement ;-)  A few of the worst
have been improved (e.g., see c_quot(), which contains the original
code in a comment), but no systematic effort has been made.

 ...

 I'm wondering whether anyone would be interested in a rewrite of the
 cmath module.  I have a drop-in replacement written in pure Python,
 based largely on the formulas given by Kahan in his `Much Ado about
 Nothing's Sign Bit' article, which I believe eliminates the vast
 majority of the current numerical problems.

I believe that :-)

 Besides the numerical fixes, the only significant difference from the
 current behaviour is that the branch cuts for asinh have been moved.
 (The docs suggest that the current branch cuts for asinh should be
 regarded as buggy).  I'm offering to translate this into C and add
 appropriate documentation and test cases, but I have a couple of
 questions before I start.

Make the Python implementation available first and solicit feedback
from the Numeric community?

 (1) Is there actually any interest in fixing this module?  I guess
 that the cmath module can't be used that much, otherwise these
 problems would have surfaced before.

That's the rub:  most people don't use cmath, while most who do have
no idea what these functions should return.  If there's a gross
problem, most wouldn't even notice.  The few places in cmath that have
been improved were due to users who did notice.  Of course all should
be fixed.

 (2) (Disregard if the answer to question 1 is `No'.)  What should be
 done about branch cuts?  The value of a function on a branch cut is
 going to depend on signs of zeros, so it'll be pretty much impossible
 to make any guarantees about the behaviour.  Even so, there are two
 approaches that seem feasible:

...

Follow the C99 standard whenever possible.  C99 added a complex type
to C, and added complex-valued math functions too.  Alas, C99 hasn't
been widely adopted yet, but Python should be compatible with the
platform C implementation to the extent possible.  Of course the
easiest way to do that is to /use/ the platform C implementation when
possible.

...

 (3) Is it worth trying to be consistent in exceptional cases?  The
 current math module documentation notes that math.log(0) might produce
 -inf on one platform and raise ValueError on another, so being consistent
 about whether a non-representable result gives a NaN, an infinity or a
 Python exception seems as though it would be tricky.  I should also note that
 I've made no effort to do the right thing when the argument to any of the
 functions contains a NaN or an infinity in either the real or imaginary part.

Welcome to the club ;-)  To the extent that your code relies on
real-valued libm functions, you inherit uncertainties from the
platform C's implementation of those too.  There's a long debate about
whether that's a feature (Python math functions act the same way as
the platform C's, and likely the platform Fortran's too) or a bug
(Python acts differently on different platforms).  Feature is the
traditional answer to that.

...
 [list of specific bad behaviors]
...

 One more thing: since this is my first post to python-dev I should
 probably introduce myself.  I'm a mathematician, working mostly in
 number theory.  I learnt programming and numerical analysis the hard
 way, coding service-life prediction algorithms for rocket motors in
 Fortran 77 during several long summers.  I've been using Python for
 more than ten years, mainly for teaching but also occasionally for
 research purposes.

Of course a strong background in math is very helpful for this.  If
you also spent several long summers cleaning sewers with your bare
hands, your qualifications for working on libm functions are beyond
reproach ;-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Rewrite of cmath module?

2007-03-17 Thread Mark Dickinson
The current cmath module has some significant numerical problems,
particularly in the inverse trig and inverse hyperbolic trig
functions.  I've listed a few of these below, but for now I'll just
note the following upsetting result:

 asin(-1e8)
Traceback (most recent call last):
 File stdin, line 1, in module
OverflowError: math range error

(The true value is somewhere around -pi/2 +/- 19.1138j, which is
hardly deserving of an overflow error.)

I'm wondering whether anyone would be interested in a rewrite of the
cmath module.  I have a drop-in replacement written in pure Python,
based largely on the formulas given by Kahan in his `Much Ado about
Nothing's Sign Bit' article, which I believe eliminates the vast
majority of the current numerical problems.  Besides the numerical
fixes, the only significant difference from the current behaviour is
that the branch cuts for asinh have been moved. (The docs suggest that
the current branch cuts for asinh should be regarded as buggy).  I'm
offering to translate this into C and add appropriate documentation
and test cases, but I have a couple of questions before I start.

(1) Is there actually any interest in fixing this module?  I guess
that the cmath module can't be used that much, otherwise these
problems would have surfaced before.

(2) (Disregard if the answer to question 1 is `No'.)  What should be
done about branch cuts?  The value of a function on a branch cut is
going to depend on signs of zeros, so it'll be pretty much impossible
to make any guarantees about the behaviour.  Even so, there are two
approaches that seem feasible:

(a) One can ensure that on a platform that happens to have IEEE-754
compliant hardware and a signed-zero-friendly math library, the
complex functions `do the right thing': i.e., are continuous on both
sides of each of the branch cuts, using the sign of zero on a branch
cut to determine which side of the branch cut the user is interested
in.  Doing this portably in C89 involves some trickery, but it seems
possible.  For other platforms behaviour on the branch cuts would be
undefined.  Alternatively:

(b) Force collapsing of all signed zeros to `positive' zero, so that
the functions are continuous only on one side of the branch cuts, and
the continuity is exactly as specified in the current documentation
(though probably including the suggested changes for atan and atanh.)
This is mostly what happens with the current implementation, though
it's not clear to me whether that's by design or by accident.

(c) Some other approach that I haven't considered.

Which of these approaches is preferable?  The Python module that I
have currently does (a).

(3) Is it worth trying to be consistent in exceptional cases?  The
current math module documentation notes that math.log(0) might produce
-inf on one platform and raise ValueError on another, so being consistent
about whether a non-representable result gives a NaN, an infinity or a
Python exception seems as though it would be tricky.  I should also note that
I've made no effort to do the right thing when the argument to any of the
functions contains a NaN or an infinity in either the real or imaginary part.

Here are a few examples of the current cmath problems, on my machine
(OS X 10.4.8/PowerPC).

(*) As already mentioned in the docs, the current branch cuts for
asinh are somewhat unconventional, and should probably be fixed.  I
believe that this is a fairly trivial fix.

(*) asinh gives inaccurate answers in the portion of the complex plane
between the current branch cuts, when the real part is large and
negative.

 asinh(-1e7+0.9j)
(-16.811242831518271+8.9158318101199366e-08j)
 asinh(-1e8+0.9j)
(-19.113827924512311+0j)
(The imaginary parts here should be approximately 8.e-8 and
8.e-9 respectively.)

(*) asinh gives inaccurate results for small x:

 asinh(1e-12)
(1.88900582091e-12+0j)
 asinh(1e-20)
(4.4408920985006257e-16+0j)

(asinh(x) ~ x - 1/6 x^3 + O(x^5) for small x, so the correct values
are 1e-12 and 1e-20 to within machine precision.)

(*) acos behaves badly for large reals:

 acos(1e7)
16.805431370234086j # true value ~ 16.811242831518262597j
 acos(1e8)
Traceback (most recent call last):
 File stdin, line 1, in module
OverflowError: math range error

(*) tan and tanh overflow unnecessarily, even though tan(z) and
tanh(z) should be representable for all representable complex arguments.

 tan(1000j)
(nannanj)
(Actual value: 1j, to within machine precision.)

(*) log and sqrt overflow unnecessarily for arguments close to the
limits of double precision:

 z = 1.4e308+1.4e308j
 sqrt(z)
Traceback (most recent call last):
 File stdin, line 1, in module
OverflowError: math range error
 log(z)
(inf+0.78539816339744828j)

(*) sqrt gives inaccurate answers for very small arguments (where the
real and imaginary parts are denormalized numbers):

 z = .5e-323 + .5e-323j
 sqrt(z)
(2.2227587494850775e-162+0j)

(Correct value for sqrt(z) is close to 2.442109e-162 + 

Re: [Python-Dev] Rewrite of cmath module?

2007-03-17 Thread Aahz
On Sun, Mar 18, 2007, Mark Dickinson wrote:

 I'm wondering whether anyone would be interested in a rewrite of the
 cmath module.  

In theory, yes, but my observations of past discussions have been that
they tend to founder on the issues of cross-platform support and
long-term code maintenance.

 I have a drop-in replacement written in pure Python,
 based largely on the formulas given by Kahan in his `Much Ado about
 Nothing's Sign Bit' article, which I believe eliminates the vast
 majority of the current numerical problems.  

If nothing else, if you are willing to support this code, *PLEASE*
upload it to the Cheese Shop.  Public interest in such code makes it
easier to sell back to the core.

BTW, I noticed that your post was held for moderation, probably because
you aren't subscribed to python-dev.  If you want to participate in this
discussion, subscribing makes that much easier.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

Typing is cheap.  Thinking is expensive.  --Roy Smith
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com