[EMAIL PROTECTED] (Rich Strauss) wrote in message
news:<[EMAIL PROTECTED]>...
> The biological problem involves observing mutations in strings of DNA
> nucleotides ("sites"). Say that we have a DNA sequence of known length
> (i.e., known number of sites). If we make certain assumptions about the
> mutation rates at individual sites (where a mutation is a change in the
> state of a nucleotide), we can estimate the probability of a mutation
> occurring within the sequence. The simplest assumption is that mutations
> are Poisson-distributed, with a constant mean mutation rate among sites,
> but we're also invoking other kinds of assumptions. Since mutations are
> rather rare, the probability estimate is going to be low (on the order of,
> say, 1e-2 to 1e-4). We want to then ask: how many samples would we have
> to examine to be, say, 95% confident of observing a mutation within the
> sequence? [...]
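Let
  P = Prob(one or more mutations somewhere in a single sample)
  C = Prob(at least one mutation in N samples)
If the samples are independent, then
  C = 1 - Prob(no mutations in N samples)
    = 1 - (1-P)^N
Solving (1-P)^N = 1 - C for N gives
  N = log(1-C)/log(1-P)
rounded up to the next whole number of samples.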
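
If it helps, here is a minimal Python sketch of that formula. The function name samples_needed and the default of 95% confidence are my own choices for illustration, and it assumes the samples are independent and that P is the same for every sample.

```python
import math


def samples_needed(p, confidence=0.95):
    """Smallest N with 1 - (1 - p)**N >= confidence.

    p          -- probability of at least one mutation in a single sample
    confidence -- desired probability C of seeing a mutation in N samples
    Assumes independent samples that all share the same p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # log1p(-x) computes log(1 - x) accurately when x is very small,
    # which matters for rare events such as p = 1e-4.
    n = math.log1p(-confidence) / math.log1p(-p)
    # N must be a whole number of samples, so round up.
    return math.ceil(n)


if __name__ == "__main__":
    for p in (1e-2, 1e-3, 1e-4):
        print(p, samples_needed(p))
```

For the range mentioned in the question, this gives N = 299 at P = 1e-2 and N = 29956 at P = 1e-4 for 95% confidence, so the required sample size grows roughly in proportion to 1/P.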