https://bugs.documentfoundation.org/show_bug.cgi?id=117041

            Bug ID: 117041
           Summary: Cumulative hypergeometric probability calculation
                    error when discrete probabilities are too small
           Product: LibreOffice
           Version: 5.4.5.1 release
          Hardware: x86-64 (AMD64)
                OS: Windows (All)
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: Calc
          Assignee: libreoffice-bugs@lists.freedesktop.org
          Reporter: charles...@gmail.com

Description:
HYPGEOM.DIST(x,NSample,Successes,Npopulation,Cumulative) function fails to
calculate cumulative probability (Cumulative=1) for situations where part of
the individual mass functions are too small and gives out "Err:502". What the
function should do is to equate those values to zero or do not consider them
when adding Pr(X=0) + Pr(X=1) + ... + Pr(X=x) to calculate Pr(X<=x).

That is what, for instance, MS Excel 2013 seems to do when calculating
cumulative hypergeometric probabilities. Its corresponding function
(HYPGEOM.DIST) does not give out wrong answers in situations where Calc does.

Example:
HYPGEOM.DIST(34,36,89,100,1) is 1.78862E+20 in Calc version 5.4.5.1 (x64). That
is absurd, as probabilities are limited to 1.

The correct answer is 0.9562476872.

There are countless input values that will give out erros too. For instance,
HYPGEOM.DIST(53,55,269,300,1) in Calc is 0.991282957, but the more correct
answer is 0.988500551. Such kind of erro is worse, because it is not patently
absurd.

Excel, R and hypergeometric calculators found in the Internet (for instance,
http://stattrek.com/online-calculator/hypergeometric.aspx) give the correct
answers.
It seems just discarding the individual mass probabilities for values of
Pr(X=x) where the results are so small they fall beyond (or bellow) machine
precision is enough to prevent the error.

I did that using the individual hypergeometric mass probability function from
Calc (either HYPGEOMDIST or HYPGEOM.DIST with "Cumulative=0") and the error is
avoided (see the "Aux" sheet in the .ods file I am sending attached).

Steps to Reproduce:
In any Calc cell enter, for instance, "=HYPGEOM.DIST(34,36,89,100,1)" or
"HYPGEOM.DIST(53,55,269,300,1)".

Actual Results:  
The incorrect results will be, respectively, "1.78862483604230E+20" and
"0.991282957036719".

Expected Results:
The correct results should be, respectively, "0.95624768719555" and
"0.988500551352757".


Reproducible: Always


User Profile Reset: No



Additional Info:
All I did seems to point to an easy solution to have the bug fixed: just equate
to zero or do not take into consideration the values of Pr(X=x) in error (value
too small, bellow machine precision).
I only tested in Windows 10 environment. I do not know if the error in
HYPGEOM.DIST also occurs in Linux environments.


User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
Libreoffice-bugs mailing list
Libreoffice-bugs@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libreoffice-bugs

Reply via email to