Thanks to Robert J. MacG. Dawson for explaining how binomial sampling
is done.

>       If P is smaller, the interval width will be correspondingly smaller.  
>This breaks down if P is so small that NP < about 10; in the latter case
>other methods can be used based on a Poisson distribution.

Since we don't actually know what P is, how would we decide that it is
more appropriate to use the Poisson distribution?
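
One possible answer, sketched below in Python, is to let the observed
error count k stand in for the unknown NP. This is only a rule of thumb
that I am assuming for illustration, not something stated in the quoted
message. The function names, the default threshold of 10, and the choice
of an exact Poisson interval found by bisection are all hypothetical
choices for this sketch. If k is at least the threshold, it uses the
usual normal approximation; otherwise it solves for the Poisson limits
on the expected count and divides by N.

```python
import math
from statistics import NormalDist


def poisson_cdf(k, mu):
    """Return P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)          # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i            # P(X = i) from P(X = i - 1)
        total += term
    return min(total, 1.0)


def solve_decreasing(f, target, lo, hi, iters=200):
    """Bisection for f(mu) == target, where f decreases as mu grows."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def error_rate_interval(k, n, alpha=0.05, threshold=10):
    """Interval for P given k errors in n sampled entries.

    Assumption: the observed count k is used in place of the unknown NP
    when deciding which method to apply.
    """
    if k >= threshold:
        # Normal approximation to the binomial.
        p = k / n
        z = NormalDist().inv_cdf(1 - alpha / 2)
        half = z * math.sqrt(p * (1 - p) / n)
        return "normal", max(0.0, p - half), min(1.0, p + half)

    # Exact Poisson limits on the expected count, then scaled by n.
    top = k + 10 * math.sqrt(k + 1) + 20   # generous upper search bound
    if k == 0:
        mu_lo = 0.0
    else:
        mu_lo = solve_decreasing(lambda m: poisson_cdf(k - 1, m),
                                 1 - alpha / 2, 0.0, top)
    mu_hi = solve_decreasing(lambda m: poisson_cdf(k, m),
                             alpha / 2, 0.0, top)
    return "poisson", mu_lo / n, min(1.0, mu_hi / n)


if __name__ == "__main__":
    print(error_rate_interval(3, 1000))    # few errors: Poisson branch
    print(error_rate_interval(50, 1000))   # many errors: normal branch
```

Note that with k = 0 the Poisson branch still gives a usable upper limit
on P, whereas the normal approximation would collapse to a zero-width
interval.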

>       The big question is: are your independence assumptions valid? And that
>depends on where the data come from. What works for a dictionary (which,
>as I've explained, is not a scenario where your assumption that the
>small book is better holds water) may not work for DNA sequencing or
>tables of integrals or phone books.

I look at it this way. If D1 (the smaller, allegedly more reliable, dictionary)
didn't exist, people would use D2 (the more comprehensive, but allegedly
less reliable, dictionary) and hope for the best while learning to navigate
around its deficiencies. Since the need for the more comprehensive dictionary
exists independently of the existence of D1, the decision a dictionary user
has to make is whether to purchase and carry around both dictionaries or
whether he/she can simply rely on D2. Therefore, it doesn't really matter
whether the assumption is valid that the probability of a word being correct
in D2 is independent of whether the word also occurs in D1. It only matters
whether D2 is good enough at performing D1's job to make D1 unnecessary most
of the time. That is why it is permissible to make the assumption, even if
it isn't true.

Admittedly, if the goal were to draw an absolutely valid conclusion about
the probability of an error in D2, one would have to examine the assumption
more critically. But for this application, I think it is ok.

Ignorantly,
Allan Adler
[EMAIL PROTECTED]

****************************************************************************
*                                                                          *
*  Disclaimer: I am a guest and *not* a member of the MIT Artificial       *
*              Intelligence Lab. My actions and comments do not reflect    *
*              in any way on MIT. Moreover, I am nowhere near the Boston   *
*              metropolitan area.                                          *
*                                                                          *
****************************************************************************
.
.
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
