Re: Rates and proportions

Robert Dawson Thu, 22 Jun 2000 05:46:01 -0700


> On Wed, 21 Jun 2000, Dale Berger wrote:
>
> > Yet, p=0 is a special case where an outcome is impossible.  A
> > reasonable confidence interval for p should not include zero if the
> > outcome has been observed in a sample.  Not so?

and Donald Burrill replied:

> I am unable to reconcile this assertion with the fact that the only
> values one can observe, in the vicinity of (some small) p, are 0/n,
> 1/n, 2/n, ... ;  and that if 1/n is observed, 0/n is possible to have
> observed, in which case one's estimate of  p  would, presumably, have
> been 0, at least to the precision available in the data.

    I do not see the conflict between these statements. A "reasonable"
confidence interval should be computed from the data that _were_ observed,
not from what they might have been.

    If the outcome has been observed, not only is 0 a value that is flatly
contradicted by the data, but moreover values very close to 0 have very low
likelihood. A "reasonable" confidence interval would thus omit such values.
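The point about low likelihood near 0 can be checked numerically. The sketch below assumes a simple binomial model and made-up data (one occurrence in twenty trials); the function names are illustrative only.

```python
def binomial_likelihood(p, x, n):
    """Binomial likelihood of p, up to the constant factor C(n, x)."""
    return p ** x * (1.0 - p) ** (n - x)


def relative_likelihood(p, x, n):
    """Likelihood of p divided by the likelihood at the MLE x/n."""
    p_hat = x / n
    return binomial_likelihood(p, x, n) / binomial_likelihood(p_hat, x, n)


if __name__ == "__main__":
    x, n = 1, 20  # hypothetical data: one occurrence in twenty trials
    for p in (0.0, 0.0001, 0.001, 0.01, 0.05):
        print(f"p = {p:<7} relative likelihood = {relative_likelihood(p, x, n):.4f}")
```

With x >= 1 the relative likelihood is exactly 0 at p = 0 and rises continuously from there, so any interval built by cutting off low-likelihood values leaves out 0 and its immediate neighbourhood.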

    Now, not all confidence intervals are reasonable. In particular, as has
been said before, a specified confidence level gives no guarantee that the
interval estimator retains any of the information about the parameter that
was present in the data. It may achieve its confidence level merely by
mixing intervals that are far too large with a few that are far too small in
a random or arbitrary fashion.
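A deliberately extreme sketch of such an estimator, for a proportion p: it never looks at the data, yet covers any true p other than 0 about 95% of the time. The construction and names here are assumptions made for illustration, not something proposed in this thread.

```python
import random


def data_free_interval(rng, level=0.95):
    """Return [0, 1] with probability `level`, else the single point 0.

    The data are never consulted, so the result says nothing about p.
    """
    if rng.random() < level:
        return (0.0, 1.0)   # far too large: every possible p
    return (0.0, 0.0)       # far too small: a single point


def coverage(true_p, trials=100_000, seed=1):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lo, hi = data_free_interval(rng)
        hits += lo <= true_p <= hi
    return hits / trials
```

Calling coverage(0.3) should return a value close to 0.95, although no individual interval carries any information about p.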

    Again, a confidence interval may be useful (if not optimal) while
including values that are obviously absurd. Examples are:

    The Z interval for a proportion, in cases where the confidence level is
greater than 98% and the np>=5 criterion is only just met.  Because the
critical value z is then greater than sqrt(5), z^2 (1 - p-hat) exceeds
n p-hat = 5 unless n is quite small, so the half-width exceeds p-hat and
the interval contains 0 and some negative values for p.
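A quick numerical check of the Z interval case, assuming the usual textbook form p-hat +/- z sqrt(p-hat (1 - p-hat) / n) and a hypothetical sample of 5 occurrences in 100 trials:

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, level):
    """Textbook Z interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    for level in (0.95, 0.98, 0.99):
        lo, hi = wald_interval(5, 100, level)
        print(f"{level:.0%}: ({lo:.4f}, {hi:.4f})")
```

For these inputs the lower limit is positive at 95% and negative at 98% and 99%. With the same count in a much smaller sample (say 5 in 50) the 98% lower limit stays just above 0, because the factor 1 - p-hat is no longer close to 1.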

    Another example: one could, motivated by the method of moments, derive a
confidence interval for the parameter A based on a sample of data from the
uniform distribution on [0,A], of the form [c x-bar, d x-bar]. This would of
course be nonoptimal, but it would not be positively stupid! For appropriate
combinations of sample size and confidence level, however, it would have a
nonzero probability of yielding an interval containing _no_ values
consistent with the data (the upper limit d x-bar can fall below the
largest observation, which A cannot do).
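The uniform example can be explored by simulation. The sketch below is one possible construction, not necessarily the one intended above: it uses the fact that x-bar / A has a distribution free of A, estimates its quantiles by Monte Carlo, and sets c and d to the reciprocals of those quantiles. The sample size, confidence level, true A and function names are all arbitrary choices.

```python
import random


def pivot_quantiles(n, level, rng, sims=20_000):
    """Estimate lower/upper quantiles of x-bar / A by simulating with A = 1."""
    means = sorted(sum(rng.random() for _ in range(n)) / n for _ in range(sims))
    tail = (1.0 - level) / 2.0
    return means[int(tail * sims)], means[int((1.0 - tail) * sims) - 1]


def simulate(n=5, level=0.80, true_a=3.0, trials=20_000, seed=2):
    """Return (coverage, fraction of intervals lying wholly below max(x))."""
    rng = random.Random(seed)
    q_lo, q_hi = pivot_quantiles(n, level, rng)
    c, d = 1.0 / q_hi, 1.0 / q_lo          # interval is [c * x-bar, d * x-bar]
    covered = absurd = 0
    for _ in range(trials):
        xs = [true_a * rng.random() for _ in range(n)]
        x_bar = sum(xs) / n
        covered += c * x_bar <= true_a <= d * x_bar
        absurd += d * x_bar < max(xs)      # every value in the interval is < max(x) <= A
    return covered / trials, absurd / trials
```

Running simulate() should give coverage near the nominal 80%, together with a small but nonzero fraction of samples whose whole interval lies below the largest observation and so contains no value of A that could have produced the data.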

    Finally (and this time we _are_ being silly) if we drop the (itself
irrelevant) condition of connectedness, we could create a 95% confidence
region for the mean of the form

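    (-infinity, xbar - t_0.475,n-1 s/sqrt(n)) union
    (xbar + t_0.475,n-1 s/sqrt(n), infinity)

    (t_0.475,n-1 being the point with upper-tail area 0.475 under the t
    distribution with n-1 degrees of freedom, so that the excluded gap
    around xbar has coverage 0.05)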

---------------------------)      (------------------------

containing precisely the _least_ likely values for mu.
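A simulation sketch of this region, restricted to n = 3 so that the t quantile (2 degrees of freedom) has a closed form and no statistics library is needed. The normal population, its parameters and the function names are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def t2_quantile(prob):
    """Exact quantile of Student's t with 2 degrees of freedom."""
    u = 2.0 * prob - 1.0
    return u / sqrt(2.0 * prob * (1.0 - prob))


def in_silly_region(mu, xs):
    """True if mu lies outside the tiny gap around x-bar (sample size 3 only)."""
    gap = t2_quantile(0.525) * stdev(xs) / sqrt(len(xs))
    return abs(mu - mean(xs)) > gap


def coverage(mu=10.0, sigma=2.0, trials=20_000, seed=3):
    """Fraction of simulated normal samples whose region contains mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(3)]
        hits += in_silly_region(mu, xs)
    return hits / trials
```

coverage() should come out close to 0.95, even though the region always excludes x-bar itself and everything within about 0.07 standard errors of it.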

    As somebody once said, the main reason that confidence intervals, as
usually constructed, work is that they often resemble likelihood intervals.

    -Robert Dawson



