[EMAIL PROTECTED] (Rich Strauss) wrote in message 
news:<[EMAIL PROTECTED]>...
> The biological problem involves observing mutations in strings of DNA 
> nucleotides ("sites").  Say that we have a DNA sequence of known length 
> (i.e., known number of sites).  If we make certain assumptions about the 
> mutation rates at individual sites (where a mutation is a change in the 
> state of a nucleotide), we can estimate the probability of a mutation 
> occurring within the sequence.  The simplest assumption is that mutations 
> are Poisson-distributed, with a constant mean mutation rate among sites, 
> but we're also invoking other kinds of assumptions.  Since mutations are 
> rather rare, the probability estimate is going to be low (on the order of, 
> say, 1e-2 to 1e-4).   We want to then ask: how many samples would we have 
> to examine to be, say, 95% confident of observing a mutation within the 
> sequence? [...]

P = Prob(one or more mutations somewhere in a single sample)

C = Prob(at least one mutation in N samples)
  = 1 - Prob(no mutations in N samples)
  = 1 - (1-P)^N

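N = log(1-C)/log(1-P), rounded up to the next whole number

(This assumes the N samples are independent and that P is the same
for each sample.)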
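
For anyone who wants to plug in numbers, here is a minimal Python sketch of the formula above. The function names (samples_needed, p_from_poisson) are made up for illustration. The second helper assumes the simplest case from the question, namely mutations Poisson-distributed with one constant rate per site and sites independent, so that P = 1 - exp(-rate * sites); it does not cover the other kinds of assumptions mentioned.

```python
import math


def samples_needed(p, confidence=0.95):
    """Smallest whole N with 1 - (1 - p)**N >= confidence.

    p is the probability of one or more mutations in a single sample.
    Samples are assumed independent with the same p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # log1p(-x) computes log(1 - x) accurately when x is small.
    return math.ceil(math.log1p(-confidence) / math.log1p(-p))


def p_from_poisson(rate_per_site, n_sites):
    """P under the constant-rate Poisson assumption (illustrative only).

    The mutation count in the whole sequence is then Poisson with mean
    rate_per_site * n_sites, so P = 1 - exp(-mean).
    """
    # -expm1(-m) computes 1 - exp(-m) accurately when m is small.
    return -math.expm1(-rate_per_site * n_sites)


if __name__ == "__main__":
    for p in (1e-2, 1e-3, 1e-4):
        print(f"P = {p:g}: N = {samples_needed(p)}")
```

For small P, log(1-P) is close to -P, so at C = 0.95 the answer is roughly 3/P (since -log(0.05) is about 3).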