Hi Regina,

I postponed this reply far too many times... sorry. Too many questions
at once ;-)
On Saturday, 2008-07-19 00:35:29 +0200, Regina Henschel wrote:

> BETADIST(x,alpha,beta,lower bound,upper bound,cumulative)
> I have attached the actual stage of my work to issue 91547.

Thank you, I grabbed the issue from the requirements queue.

> (1)
> The definition in OpenDocument-formula-20080618.odt in 6.17.7 has errors
> in the "density" case. Eike, in addition to the document I already sent
> to you: the definition does not state what result
> BETADIST(1,1,beta,0,1,FALSE()) should give for beta < 1, nor
> BETADIST(0,alpha,1,0,1,FALSE()) for alpha < 1.

So this is merely missing the definition that these pole values do not
have a solution? We can add that to the spec.

> In both cases there is a pole. I set "illegal argument" now.

Fine.

> (2)
> Nearly all terms inside have parts like (1-x)^r. When the x argument is
> close to 1, approximately x > 0.9999, the term (1-x)^r has large
> cancellation errors. I know no way to avoid it. Switching to a power
> series 1+x^2+x^3+... is no solution, because it hardly converges at all
> for x near 1.
>
> That leads to the question: which accuracy should the function have in
> which part of the domain? My suggestion would be, in the case x > 0.9999,
> not to try to get more accuracy but to document the loss of accuracy in
> the application help.

Do you think this is a general problem for this particular function, or
is it related to the algorithm used? Would there be not overly
sophisticated algorithms that eliminate the accuracy loss? If not, and
it is a general problem, we could state that in the ODFF spec as well.
Stating it in the application help would of course be good anyway.

> (3)
> For x near alpha/(alpha+beta), which is the mean p, the loops need a
> huge number of iterations. I counted more than 50000 in some cases.
> Currently the algorithm allows these 50000 iterations, and the accuracy
> reaches up to 12 digits. Limiting the number of iterations to a
> reasonable value loses accuracy in those cases.
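For anyone following along: the loop under discussion is, I assume, the
usual continued-fraction evaluation of the regularized incomplete beta
function I_x(a,b) (modified Lentz's method); the helper names below are
mine, not from Regina's patch. A minimal self-contained sketch with the
50000-iteration cap, illustrative only:

```cpp
#include <cmath>
#include <stdexcept>

// Continued-fraction part of I_x(a,b), modified Lentz's method.
// MAXIT mirrors the 50000-iteration cap discussed in this thread.
static double betacf(double a, double b, double x)
{
    const int    MAXIT = 50000;
    const double EPS   = 3.0e-14;   // relative accuracy goal (~13 digits)
    const double FPMIN = 1.0e-300;  // guard against division by ~0

    double qab = a + b, qap = a + 1.0, qam = a - 1.0;
    double c = 1.0;
    double d = 1.0 - qab * x / qap;
    if (std::fabs(d) < FPMIN) d = FPMIN;
    d = 1.0 / d;
    double h = d;
    for (int m = 1; m <= MAXIT; ++m)
    {
        int m2 = 2 * m;
        // Even step of the continued fraction.
        double aa = m * (b - m) * x / ((qam + m2) * (a + m2));
        d = 1.0 + aa * d;  if (std::fabs(d) < FPMIN) d = FPMIN;
        c = 1.0 + aa / c;  if (std::fabs(c) < FPMIN) c = FPMIN;
        d = 1.0 / d;
        h *= d * c;
        // Odd step.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2));
        d = 1.0 + aa * d;  if (std::fabs(d) < FPMIN) d = FPMIN;
        c = 1.0 + aa / c;  if (std::fabs(c) < FPMIN) c = FPMIN;
        d = 1.0 / d;
        double del = d * c;
        h *= del;
        if (std::fabs(del - 1.0) < EPS) return h;  // converged
    }
    throw std::runtime_error("betacf: no convergence");
}

// Regularized incomplete beta I_x(a,b). Uses the symmetry
// I_x(a,b) = 1 - I_{1-x}(b,a) so the fraction converges quickly.
double betai(double a, double b, double x)
{
    if (x < 0.0 || x > 1.0) throw std::invalid_argument("x out of range");
    if (x == 0.0) return 0.0;
    if (x == 1.0) return 1.0;
    double lnBeta = std::lgamma(a) + std::lgamma(b) - std::lgamma(a + b);
    double front = std::exp(a * std::log(x) + b * std::log(1.0 - x) - lnBeta);
    if (x < (a + 1.0) / (a + b + 2.0))
        return front * betacf(a, b, x) / a;
    else
        return 1.0 - front * betacf(b, a, 1.0 - x) / b;
}
```

The convergence slowdown Regina describes hits exactly at the crossover
point x near a/(a+b), where neither orientation of the fraction is
clearly favourable.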
The normal amount of iterations is below 50. That sounds reasonable
enough for normal usage, doesn't it? How likely would the condition "x
near alpha/(alpha+beta)" occur in real data?

> I tried to shift up and down - like I_x(alpha+1,b) -, but then the
> accuracy decreases. The problem gets worse when alpha is large and
> beta small, which gives a mean near 1, and problem (2) hits in
> addition.
>
> If someone knows a solution that gives more accuracy with fewer
> iterations, please let me know. I failed with the algorithm BASYM from
> Didonato, likely because of the needed erfc function.

So again MSVC seems to be lacking a C99 mathematical function. Given an
erfc() function, would the BASYM algorithm do fine?

> In a test as a BASIC macro I got only 6 digit accuracy.

That doesn't seem to be sufficient if other algorithms would give much
better accuracy.

> There will be a new book [1] about the numerics of special functions
> at the end of July, and I hope to find some solutions there. But by
> the time I get the book via the public library, read it, and test
> algorithms, it will be too late to get a solution into OOo3.1.
>
> What to do? Set a lower limit?

We could do with a lower limit for OOo3.1 if indeed the 50000
iterations are hit too many times and turn out to be a bottleneck, and
once you read the book ;) improve on the algorithm for OOo3.2.

> Implement some shifting, which will decrease the iterations in many
> cases but give less accuracy?

I don't think this would be a good solution, given what you mentioned
above about "a mean near 1 and the problem (2) hits in addition".

> Return the reached values although they are not as accurate as others,
> or set a "no convergence" error?

Is there a runtime measurement for "not as accurate as others" that
could be used to determine a "no convergence" case as opposed to "this
might still do"? I mean, some people already freak out if the 9th
significant digit isn't correct, others don't even care about the 6th
digit.
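On the missing erfc(): if that is the only blocker for trying BASYM, a
portable fallback is easy to sketch. The classic Abramowitz & Stegun
7.1.26 rational approximation is below; note its absolute error is only
about 1.5e-7, i.e. roughly 7 digits, so on its own it would not carry a
12-digit target. Illustrative sketch, not proposed production code:

```cpp
#include <cmath>

// Fallback for the C99 erfc() that MSVC lacks.
// Abramowitz & Stegun 7.1.26: erfc(x) ~ P(t) * exp(-x^2), t = 1/(1+p*x),
// absolute error < ~1.5e-7 for x >= 0.
double erfc_approx(double x)
{
    const double ax = std::fabs(x);
    const double t  = 1.0 / (1.0 + 0.3275911 * ax);
    // Horner evaluation of the 5-term polynomial in t.
    const double poly = t * (0.254829592 + t * (-0.284496736
                       + t * (1.421413741 + t * (-1.453152027
                       + t * 1.061405429))));
    const double r = poly * std::exp(-ax * ax);
    return (x >= 0.0) ? r : 2.0 - r;   // erfc(-x) = 2 - erfc(x)
}
```

That would at least allow testing whether BASYM converges; for full
accuracy one would then replace it with a better erfc, e.g. a
continued-fraction expansion.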
> (4)
> Which values of alpha and beta should be supported? The larger they
> are, the smaller is the range in which the result goes from near 0 to
> near 1. So one machine number for x would cover a large range of
> "correct" results. I have no experience in using BETADIST in real
> life, but I doubt that something like alpha=20000 is really needed.

I actually have no idea :-( Maybe someone on the list could help out
with some expertise?

> (5)
> The spec says that the "Cumulative" parameter has type "logical". In
> which type is it pushed onto the stack? How shall I get it from the
> stack?

Use ScInterpreter::GetBool() if the parameter is present, in this case

    if ((nParamCount = GetByte()) >= 6)

> No problems, but ToDos:
> (6)
> Adapt the algorithms to the solution concerning expm1 and log1p.

Added a dependency to i91602.

> (7)
> Remove the part with ScTTT, which I have included for testing.

Hopefully I will remember that ;)

> (8)
> Write a spec,

The spec will be ODFF/OpenFormula.

> and change UI

I'll do so.

> and application help for the sixth parameter.

> (9)
> The patch contains algorithms for the Beta function in normal and
> logarithmic versions. They are needed for BETADIST. If there is a
> spec, both functions can be brought to the UI easily.

Nice.

> Currently they are only mentioned in the "huge" group in
> OpenDocument-formula-20080618.odt.

Which is the bin for "we may want to define some of these in future, if
needed".

Thanks

  Eike

-- 
 OOo/SO Calc core developer. Number formatter stricken i18n transpositionizer.
 SunSign 0x87F8D412 : 2F58 5236 DB02 F335 8304 7D6C 65C9 F9B5 87F8 D412
 OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS

Please don't send personal mail to the [EMAIL PROTECTED] account, which I
use for mailing lists only and don't read from outside Sun. Use
[EMAIL PROTECTED] Thanks.