Hi Regina,

On Tuesday, 2008-08-05 22:17:56 +0200, Regina Henschel wrote:
> It seems to be one of the most difficult functions. To get an idea of
> that function, have a look at the file toms708.c of the R-project.

Wow, impressive. The original coder(s) must have spent a lot of time on
numeric error analysis there.

>>> (2)
>>> Nearly all terms inside have parts like (1-x)^r. When the x argument
>>> is close to 1, approximately x > 0.9999, the term (1-x)^r suffers
>>> large cancellation errors. I know no way to avoid it. Switching to
>>> the power series 1+x^2+x^3+... is no solution, because it hardly
>>> converges for x near 1.
>>>
>>> That leads to the question: which accuracy should the function have
>>> in which part of the domain? My suggestion would be, in the case
>>> x > 0.9999, not to try to get more accuracy but to document the loss
>>> of accuracy in the application help.
>>
>> Do you think this is a general problem for this particular function,
>> or is it related to the algorithm used? Would there be not overly
>> sophisticated algorithms that eliminate the accuracy loss?
>
> Please take a spreadsheet and enter x1=0.999999+6.1E-016 and
> x2=0.999999-4.9E-016 and then calculate 1-x1 and 1-x2. That is the
> range of inputs that all display as 0.999999 in the 15-digit format
> Calc uses.

Understood.

> So I think in the meantime that it cannot be solved
> with a program that is bound to double.

Note that the trick Leonard mentioned in his reply indeed gives
identical results for 0.5-x1+0.5 and 0.5-x2+0.5 respectively when used
on raw double values in C/C++ assigned to a double and displayed using
printf, though not in the result of a spreadsheet formula expression;
the same happens in Calc and Gnumeric, so it is not due to some Calc
rounding madness.

> The beta distribution is
> affected by cancellation errors more than other functions, because in
> its definition you have "x" as well as "1-x".
>
> If I understand the Gnumeric code correctly, they use long double
> values where possible.
> But MSVC Express doesn't have long double, but maps it
> to double.

Sigh.. not only MSVC Express, also the .NET compilers. They gave up on
it after the 16-bit era, claiming it was for compatibility with non-x86
machines.

> I would like to compare Calc with a Gnumeric, which is
> compiled without support of long double.

I'm not sure if any gcc ever had a switch for mapping long double to
double; there was a -Wno-long-double, but AFAIK that only gave a
warning, and recent versions don't have it anymore. Gnumeric for
Windows does exist, but I doubt it would be compilable using MSVC. A
last resort would be to replace long double with double in the
functions of interest and recompile ...

>>> If someone knows a solution that gives more accuracy with fewer
>>> iterations, please let me know. I failed with the algorithm BASYM
>>> from Didonato, likely because of the needed erfc function.
>
> It seems that I can solve the erfc problem :) I need to learn about
> "static methods" (see other thread).

Heh, that's an easy one :-)

>> So again MSVC seems to be lacking a C99 mathematical function. Given
>> an erfc() function, would the BASYM algorithm do fine?
>
> I have implemented BASYM directly into ScTTT now. The accuracy is over
> 10 digits, sometimes up to 15 digits, the same as the continued
> fraction, and it calculates with fewer than 1000 iterations. So the
> accuracy is about the same as the continued fraction, but with fewer
> iterations.

At least 10 digits of accuracy sounds pretty good to me.

>>> [... iterations ...]
>>> What to do? Setting a lower limit?
>>
>> We could do with a lower limit for OOo3.1 if indeed the 50000
>> iterations are hit too many times and turn out to be a bottleneck,
>> and once you read the book ;) improve on the algorithm for OOo3.2.
>
> Using the function in a spreadsheet, I notice no performance problems
> with that limit of iterations. But a real performance test would be
> necessary to find an acceptable limit.
For a first attempt, try with a thousand and with several thousand
formula cells depending on one value cell, place a SUM() function
beneath them to force calculation of all formula cells, and change the
value in the value cell.

> I'll try to finish it in a way that BASYM is used where needed.
> Unfortunately summer holidays are over and my vacation ends.

Too bad..

>>> (4)
>>> Which values of alpha and beta should be supported? The larger they
>>> are, the smaller is the range in which the result goes from near 0
>>> to near 1. So one machine number for x would cover a large range of
>>> "correct" results. I have no experience in using BETADIST in real
>>> life, but I doubt that something like alpha=20000 is really needed.
>>
>> I actually have no idea :-( Maybe someone on the list could help out
>> with some expertise?
>
> The spec should give a recommendation.

Maybe someone in the formula committee could give one.

> MuPad, the CAS I have got at
> home, overflows for large parameters. Of course high end programs like
> Mathematica have no problems there (you can use it online for single
> function calls).

Call me blind, but I didn't find that. Do you have a URL?

>>> (6)
>>> Adapt the algorithms to the solution concerning expm1 and log1p.
>>
>> Added a dependency to i91602.
>
> Without them it is impossible to get the desired accuracy.

Hopefully the results of the replacement function and the different
compilers' built-in functions are nearly identical..

> I know it is a very long mail and you need not answer.

I felt I needed to ...

> I'll attach the version with BASYM to the issue, but it will take
> some time until it is ready. I'll ask if I get stuck.

I want to give you every support I can; you're doing such great work on
our functions' accuracy.

Thanks

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3 9E96 2F1A D073 293C 05FD
