The current cmath module has some significant numerical problems, particularly in the inverse trig and inverse hyperbolic trig functions. I've listed a few of these below, but for now I'll just note the following upsetting result:
>>> asin(-1e8) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: math range error (The true value is somewhere around -pi/2 +/- 19.1138j, which is hardly deserving of an overflow error.) I'm wondering whether anyone would be interested in a rewrite of the cmath module. I have a drop-in replacement written in pure Python, based largely on the formulas given by Kahan in his `Much Ado about Nothing's Sign Bit' article, which I believe eliminates the vast majority of the current numerical problems. Besides the numerical fixes, the only significant difference from the current behaviour is that the branch cuts for asinh have been moved. (The docs suggest that the current branch cuts for asinh should be regarded as buggy). I'm offering to translate this into C and add appropriate documentation and test cases, but I have a couple of questions before I start. (1) Is there actually any interest in fixing this module? I guess that the cmath module can't be used that much, otherwise these problems would have surfaced before. (2) (Disregard if the answer to question 1 is `No'.) What should be done about branch cuts? The value of a function on a branch cut is going to depend on signs of zeros, so it'll be pretty much impossible to make any guarantees about the behaviour. Even so, there are two approaches that seem feasible: (a) One can ensure that on a platform that happens to have IEEE-754 compliant hardware and a signed-zero-friendly math library, the complex functions `do the right thing': i.e., are continuous on both sides of each of the branch cuts, using the sign of zero on a branch cut to determine which side of the branch cut the user is interested in. Doing this portably in C89 involves some trickery, but it seems possible. For other platforms behaviour on the branch cuts would be undefined. Alternatively: (b) Force collapsing of all signed zeros to `positive' zero, so that the functions are continuous only on one side of the branch cuts, and the continuity is exactly as specified in the current documentation (though probably including the suggested changes for atan and atanh.) This is mostly what happens with the current implementation, though it's not clear to me whether that's by design or by accident. (c) Some other approach that I haven't considered. Which of these approaches is preferable? The Python module that I have currently does (a). (3) Is it worth trying to be consistent in exceptional cases? The current math module documentation notes that math.log(0) might produce -inf on one platform and raise ValueError on another, so being consistent about whether a non-representable result gives a NaN, an infinity or a Python exception seems as though it would be tricky. I should also note that I've made no effort to do the right thing when the argument to any of the functions contains a NaN or an infinity in either the real or imaginary part. Here are a few examples of the current cmath problems, on my machine (OS X 10.4.8/PowerPC). (*) As already mentioned in the docs, the current branch cuts for asinh are somewhat unconventional, and should probably be fixed. I believe that this is a fairly trivial fix. (*) asinh gives inaccurate answers in the portion of the complex plane between the current branch cuts, when the real part is large and negative. >>> asinh(-1e7+0.9j) (-16.811242831518271+8.9158318101199366e-08j) >>> asinh(-1e8+0.9j) (-19.113827924512311+0j) (The imaginary parts here should be approximately 8.999999999999e-8 and 8.999999999999e-9 respectively.) (*) asinh gives inaccurate results for small x: >>> asinh(1e-12) (1.000088900582091e-12+0j) >>> asinh(1e-20) (4.4408920985006257e-16+0j) (asinh(x) ~ x - 1/6 x^3 + O(x^5) for small x, so the correct values are 1e-12 and 1e-20 to within machine precision.) (*) acos behaves badly for large reals: >>> acos(1e7) 16.805431370234086j # true value ~ 16.811242831518262597j >>> acos(1e8) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: math range error (*) tan and tanh overflow unnecessarily, even though tan(z) and tanh(z) should be representable for all representable complex arguments. >>> tan(1000j) (nannanj) (Actual value: 1j, to within machine precision.) (*) log and sqrt overflow unnecessarily for arguments close to the limits of double precision: >>> z = 1.4e308+1.4e308j >>> sqrt(z) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: math range error >>> log(z) (inf+0.78539816339744828j) (*) sqrt gives inaccurate answers for very small arguments (where the real and imaginary parts are denormalized numbers): >>> z = .5e-323 + .5e-323j >>> sqrt(z) (2.2227587494850775e-162+0j) (Correct value for sqrt(z) is close to 2.442109e-162 + 1.011554e-162j.) (*) exp, cosh and sinh overflow unnecessarily for some values. For example: >>> exp(710+pi/4j) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: math range error The correct value is around 1.579673e308 + 1.579673e308j, which should be representable in IEEE double precision. This last problem is almost not worth fixing: these functions are going to overflow eventually anyway, and whether that happens when the real part of the argument is 709.0 or 710.0 isn't going to make a huge difference to anyone's life. In contrast, functions like acos, log and sqrt should have representable output for *any* finite representable complex number (with the obvious exception of 0 for log), so it's a bit embarrassing if they fail for some inputs. One more thing: since this is my first post to python-dev I should probably introduce myself. I'm a mathematician, working mostly in number theory. I learnt programming and numerical analysis the hard way, coding service-life prediction algorithms for rocket motors in Fortran 77 during several long summers. I've been using Python for more than ten years, mainly for teaching but also occasionally for research purposes. Mark Dickinson _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com