Re: [Python-Dev] Rewrite of cmath module?
[Mark Dickinson] The current cmath module has some significant numerical problems, particularly in the inverse trig and inverse hyperbolic trig functions. IIRC, cmath was essentially whipped up over a weekend by a scientist, transcribing formulas from a math reference. Such an approach is bound to suffer significant numerical problems, but has the advantage of taking little time to implement ;-) A few of the worst have been improved (e.g., see c_quot(), which contains the original code in a comment), but no systematic effort has been made. ... I'm wondering whether anyone would be interested in a rewrite of the cmath module. I have a drop-in replacement written in pure Python, based largely on the formulas given by Kahan in his `Much Ado about Nothing's Sign Bit' article, which I believe eliminates the vast majority of the current numerical problems. I believe that :-) Besides the numerical fixes, the only significant difference from the current behaviour is that the branch cuts for asinh have been moved. (The docs suggest that the current branch cuts for asinh should be regarded as buggy). I'm offering to translate this into C and add appropriate documentation and test cases, but I have a couple of questions before I start. Make the Python implementation available first and solicit feedback from the Numeric community? (1) Is there actually any interest in fixing this module? I guess that the cmath module can't be used that much, otherwise these problems would have surfaced before. That's the rub: most people don't use cmath, while most who do have no idea what these functions should return. If there's a gross problem, most wouldn't even notice. The few places in cmath that have been improved were due to users who did notice. Of course all should be fixed. (2) (Disregard if the answer to question 1 is `No'.) What should be done about branch cuts? The value of a function on a branch cut is going to depend on signs of zeros, so it'll be pretty much impossible to make any guarantees about the behaviour. Even so, there are two approaches that seem feasible: ... Follow the C99 standard whenever possible. C99 added a complex type to C, and added complex-valued math functions too. Alas, C99 hasn't been widely adopted yet, but Python should be compatible with the platform C implementation to the extent possible. Of course the easiest way to do that is to /use/ the platform C implementation when possible. ... (3) Is it worth trying to be consistent in exceptional cases? The current math module documentation notes that math.log(0) might produce -inf on one platform and raise ValueError on another, so being consistent about whether a non-representable result gives a NaN, an infinity or a Python exception seems as though it would be tricky. I should also note that I've made no effort to do the right thing when the argument to any of the functions contains a NaN or an infinity in either the real or imaginary part. Welcome to the club ;-) To the extent that your code relies on real-valued libm functions, you inherit uncertainties from the platform C's implementation of those too. There's a long debate about whether that's a feature (Python math functions act the same way as the platform C's, and likely the platform Fortran's too) or a bug (Python acts differently on different platforms). Feature is the traditional answer to that. ... [list of specific bad behaviors] ... One more thing: since this is my first post to python-dev I should probably introduce myself. I'm a mathematician, working mostly in number theory. I learnt programming and numerical analysis the hard way, coding service-life prediction algorithms for rocket motors in Fortran 77 during several long summers. I've been using Python for more than ten years, mainly for teaching but also occasionally for research purposes. Of course a strong background in math is very helpful for this. If you also spent several long summers cleaning sewers with your bare hands, your qualifications for working on libm functions are beyond reproach ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Rewrite of cmath module?
The current cmath module has some significant numerical problems, particularly in the inverse trig and inverse hyperbolic trig functions. I've listed a few of these below, but for now I'll just note the following upsetting result: asin(-1e8) Traceback (most recent call last): File stdin, line 1, in module OverflowError: math range error (The true value is somewhere around -pi/2 +/- 19.1138j, which is hardly deserving of an overflow error.) I'm wondering whether anyone would be interested in a rewrite of the cmath module. I have a drop-in replacement written in pure Python, based largely on the formulas given by Kahan in his `Much Ado about Nothing's Sign Bit' article, which I believe eliminates the vast majority of the current numerical problems. Besides the numerical fixes, the only significant difference from the current behaviour is that the branch cuts for asinh have been moved. (The docs suggest that the current branch cuts for asinh should be regarded as buggy). I'm offering to translate this into C and add appropriate documentation and test cases, but I have a couple of questions before I start. (1) Is there actually any interest in fixing this module? I guess that the cmath module can't be used that much, otherwise these problems would have surfaced before. (2) (Disregard if the answer to question 1 is `No'.) What should be done about branch cuts? The value of a function on a branch cut is going to depend on signs of zeros, so it'll be pretty much impossible to make any guarantees about the behaviour. Even so, there are two approaches that seem feasible: (a) One can ensure that on a platform that happens to have IEEE-754 compliant hardware and a signed-zero-friendly math library, the complex functions `do the right thing': i.e., are continuous on both sides of each of the branch cuts, using the sign of zero on a branch cut to determine which side of the branch cut the user is interested in. Doing this portably in C89 involves some trickery, but it seems possible. For other platforms behaviour on the branch cuts would be undefined. Alternatively: (b) Force collapsing of all signed zeros to `positive' zero, so that the functions are continuous only on one side of the branch cuts, and the continuity is exactly as specified in the current documentation (though probably including the suggested changes for atan and atanh.) This is mostly what happens with the current implementation, though it's not clear to me whether that's by design or by accident. (c) Some other approach that I haven't considered. Which of these approaches is preferable? The Python module that I have currently does (a). (3) Is it worth trying to be consistent in exceptional cases? The current math module documentation notes that math.log(0) might produce -inf on one platform and raise ValueError on another, so being consistent about whether a non-representable result gives a NaN, an infinity or a Python exception seems as though it would be tricky. I should also note that I've made no effort to do the right thing when the argument to any of the functions contains a NaN or an infinity in either the real or imaginary part. Here are a few examples of the current cmath problems, on my machine (OS X 10.4.8/PowerPC). (*) As already mentioned in the docs, the current branch cuts for asinh are somewhat unconventional, and should probably be fixed. I believe that this is a fairly trivial fix. (*) asinh gives inaccurate answers in the portion of the complex plane between the current branch cuts, when the real part is large and negative. asinh(-1e7+0.9j) (-16.811242831518271+8.9158318101199366e-08j) asinh(-1e8+0.9j) (-19.113827924512311+0j) (The imaginary parts here should be approximately 8.e-8 and 8.e-9 respectively.) (*) asinh gives inaccurate results for small x: asinh(1e-12) (1.88900582091e-12+0j) asinh(1e-20) (4.4408920985006257e-16+0j) (asinh(x) ~ x - 1/6 x^3 + O(x^5) for small x, so the correct values are 1e-12 and 1e-20 to within machine precision.) (*) acos behaves badly for large reals: acos(1e7) 16.805431370234086j # true value ~ 16.811242831518262597j acos(1e8) Traceback (most recent call last): File stdin, line 1, in module OverflowError: math range error (*) tan and tanh overflow unnecessarily, even though tan(z) and tanh(z) should be representable for all representable complex arguments. tan(1000j) (nannanj) (Actual value: 1j, to within machine precision.) (*) log and sqrt overflow unnecessarily for arguments close to the limits of double precision: z = 1.4e308+1.4e308j sqrt(z) Traceback (most recent call last): File stdin, line 1, in module OverflowError: math range error log(z) (inf+0.78539816339744828j) (*) sqrt gives inaccurate answers for very small arguments (where the real and imaginary parts are denormalized numbers): z = .5e-323 + .5e-323j sqrt(z) (2.2227587494850775e-162+0j) (Correct value for sqrt(z) is close to 2.442109e-162 +
Re: [Python-Dev] Rewrite of cmath module?
On Sun, Mar 18, 2007, Mark Dickinson wrote: I'm wondering whether anyone would be interested in a rewrite of the cmath module. In theory, yes, but my observations of past discussions have been that they tend to founder on the issues of cross-platform support and long-term code maintenance. I have a drop-in replacement written in pure Python, based largely on the formulas given by Kahan in his `Much Ado about Nothing's Sign Bit' article, which I believe eliminates the vast majority of the current numerical problems. If nothing else, if you are willing to support this code, *PLEASE* upload it to the Cheese Shop. Public interest in such code makes it easier to sell back to the core. BTW, I noticed that your post was held for moderation, probably because you aren't subscribed to python-dev. If you want to participate in this discussion, subscribing makes that much easier. -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ Typing is cheap. Thinking is expensive. --Roy Smith ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com