In article <[EMAIL PROTECTED]>,
Robert J. MacG. Dawson <[EMAIL PROTECTED]> wrote:
>"W. D. Allen Sr." wrote:
>> A common mistake made in statistical inference is to assume every data set
>> is normally distributed. This seems to be the rule rather than the
>> exception, even among professional statisticians.
>> Either the chi-square or S-K test, as appropriate, should be conducted to
>> determine normality before interpreting population percentages using
>> standard deviations.
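As an illustration of the workflow described above; the particular test (D'Agostino-Pearson), significance level, and data below are the editor's illustrative choices, assuming numpy and scipy are available, not anything from the original post:

```python
# Check normality before quoting "within k standard deviations" percentages
# from the normal table.  The test, alpha = 0.05, and simulated data are
# illustrative assumptions only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.normal(loc=100.0, scale=15.0, size=500)

# D'Agostino-Pearson test, based on sample skewness and kurtosis.
stat, p = stats.normaltest(data)

mean, sd = data.mean(), data.std(ddof=1)
within_2sd = float(np.mean(np.abs(data - mean) < 2 * sd))
verdict = "defensible" if p > 0.05 else "not defensible"

print(f"normality test p = {p:.3f}; empirical rule is {verdict}")
print(f"fraction within 2 sd: {within_2sd:.3f} (normal theory: ~0.954)")
```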
> Another common mistake made in statistical inference is confounding the
>two propositions:
> "The data are close enough to the model that a hypothesis test cannot
>reject the model at some fixed p-value"
> and
> "The population is close enough to the model for the interpretation to
>be useful".
> Logically, these are almost entirely unrelated. In particular, for very
>large samples the test will almost always reject the model, even when
>the population distribution is very close; for small samples the model
>will rarely be rejected even when it is in fact flagrantly wrong, due to
>lack of power in the test.
> What is needed in the small-sample case is outside _knowledge_ (not
>"well, it _might_ be true" or "in this discipline we usually assume..."
>assumptions!) about the distribution - without this we should not be
>making any distributional assumptions. In the large-sample case we need
>a measure of closeness that is independent of sample size and based on
>the idea of "close enough for practical purposes" not "have we enough
>data to quibble?"
> -Robert Dawson
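The large-sample/small-sample asymmetry Dawson describes can be sketched numerically. The distributions, sample sizes, and tests below are illustrative choices by the editor, assuming numpy and scipy are available: a test on a large sample from a population *close* to normal (Student's t with 10 df) rejects normality essentially every time, while a test on a small sample from a flagrantly non-normal population (exponential) often fails to reject.

```python
# Rejection rates of normality tests in Dawson's two scenarios.
# All specifics (t with 10 df, exponential, n = 100000 vs n = 10,
# alpha = 0.05, 200 replications) are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
ALPHA = 0.05
REPS = 200

# Large sample, nearly normal population (t, 10 df): D'Agostino-Pearson test.
reject_large = sum(
    stats.normaltest(rng.standard_t(df=10, size=100_000)).pvalue < ALPHA
    for _ in range(REPS)
)

# Small sample, clearly non-normal population (exponential): Shapiro-Wilk test.
reject_small = sum(
    stats.shapiro(rng.exponential(size=10)).pvalue < ALPHA
    for _ in range(REPS)
)

rate_large = reject_large / REPS
rate_small = reject_small / REPS
print(f"close to normal,   n=100000: rejection rate ~ {rate_large:.2f}")
print(f"far from normal,   n=10:     rejection rate ~ {rate_small:.2f}")
```

The first rate is near 1 even though the population is close to normal; the second is well below 1 even though the population is grossly non-normal, which is exactly the power problem Dawson raises.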
This problem is automatically addressed in a "reasonably
correct" decision-theoretic formulation. This is where
one should start.
Has anyone else addressed this problem? It is discussed
in my paper in the First Purdue Symposium, 30 years ago.
In the symmetric test of an imprecise point null, the
general results (not well stated there, as they only
became really clear later) are as follows. If the width of
the acceptance region is small compared with the precision
of the estimator, use a test such as the one in my paper
with Sethuraman (Sankhya, 1965), taking as the "prior" of
the null the loss-weighted prior integrated over the
acceptance region. If the width is much larger than the
precision, just estimate the parameter.
The case in between unfortunately depends on the particular
prior assumptions of the user; reasonable numerical
examples bear this out.
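Rubin's regime rule can be caricatured in code. The cutoff ratios below (0.1 and 10) are hypothetical, chosen only to mark off "much smaller" and "much larger"; the actual decision-theoretic test of Rubin and Sethuraman is not implemented here.

```python
# Caricature of the regime rule above: compare the half-width `delta` of the
# imprecise null's acceptance region to the standard error `se` of the
# estimator.  The cutoff ratios 0.1 and 10 are hypothetical, purely for
# illustration; the thread gives no numbers.
def recommended_procedure(delta: float, se: float) -> str:
    ratio = delta / se
    if ratio < 0.1:
        # Acceptance region narrow relative to estimator precision: use a
        # test (e.g. of the Rubin-Sethuraman type, with the null's "prior"
        # the loss-weighted prior integrated over the region).
        return "test"
    if ratio > 10:
        # Region wide relative to precision: just estimate the parameter.
        return "estimate"
    # In between, the answer depends on the user's prior assumptions.
    return "prior-dependent"

print(recommended_procedure(delta=0.01, se=1.0))  # -> test
print(recommended_procedure(delta=50.0, se=1.0))  # -> estimate
print(recommended_procedure(delta=1.0, se=1.0))   # -> prior-dependent
```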
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558
=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
http://jse.stat.ncsu.edu/
=================================================================