This is a query about what "should happen" when a function is given 
arguments that are in some sense out of range.

The hypergeometric distribution function computes the probability that x 
"successes" are observed in a sample of n trials from a finite 
population of N elements, where M of the N are "successes". We sample 
WITHOUT replacement. Example: 10 people in a village of 100 have math 
anxiety (MA). We sample 5 of the 100 villagers and clearly don't check 
the same person twice ("without replacement"). The hypergeometric 
distribution gives the probabilities that we get 0, 1, 2, 3, 4 or 5 
people with MA in our sample.
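
To make the village example concrete, here is a minimal sketch in Python 
(3.8 or later, for math.comb). The name hypergeom_pmf is made up for 
illustration and the argument order (x, n, M, N) simply follows the 
description above. This is the textbook formula, not Gnumeric's code.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """Probability of exactly x successes in a sample of n drawn without
    replacement from N items, M of which are successes.
    Assumes integer arguments with 0 <= x <= n <= N and 0 <= M <= N."""
    # Ways to pick x successes times ways to pick n - x failures,
    # divided by the number of ways to pick any n items.
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Village example: N = 100 people, M = 10 with math anxiety, sample n = 5.
probs = [hypergeom_pmf(x, 5, 10, 100) for x in range(6)]
for x, p in enumerate(probs):
    print(x, p)
```

The six probabilities sum to 1, as they must, since 0 through 5 are the 
only possible counts in a sample of 5.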

Now suppose nobody in the village has math anxiety (M = 0). Clearly 
P(0, 5, 0, 100) = 1 (arguments in the order x, n, M, N) and the rest 
should be 0. Gnumeric reports for "=hypgeomdist(1,5,0,100)" the result

#NUM!

which has some merit, since we are doing something that is "impossible" 
(getting 1 out of none).

I can see reporting an error for

=hypgeomdist(1,15,0,10)

i.e., sample bigger than population. But I know I'd rather get 0 for the 
values that are "impossible" in a feasible sample.
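
One possible way to write down that preference, as a sketch only: the 
function name hypgeomdist_sketch and the use of a Python ValueError to 
stand in for #NUM! are my assumptions for illustration, and this is not 
how Gnumeric is implemented. Infeasible setups raise an error, while 
counts that simply cannot occur in a feasible sample get probability 0.

```python
from math import comb

def hypgeomdist_sketch(x, n, M, N):
    """Hypothetical policy for hypgeomdist(x, n, M, N) with integer inputs."""
    # Infeasible setup (e.g. sample bigger than population, or more
    # successes than elements): report an error, standing in for #NUM!.
    if N < 0 or not (0 <= M <= N) or not (0 <= n <= N):
        raise ValueError("infeasible parameters (would be #NUM!)")
    # Feasible sample, but x cannot occur: at most min(n, M) successes,
    # and at least n - (N - M) once the failures run out.
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

print(hypgeomdist_sketch(0, 5, 0, 100))  # nobody affected: certain
print(hypgeomdist_sketch(1, 5, 0, 100))  # "1 out of none": impossible
```

Under this policy "=hypgeomdist(1,5,0,100)" would give 0 rather than 
#NUM!, while "=hypgeomdist(1,15,0,10)" would still be an error.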

There are several of these borderline cases throughout the functions. It 
would be nice to document them and then come up with a good set of choices.

Comments welcome.

JN

_______________________________________________
gnumeric-list mailing list
http://mail.gnome.org/mailman/listinfo/gnumeric-list
