what is interesting to me in our discussions of p values ... .05 for
example ... is that we have failed (generally speaking) to put this one
piece of information in the context of the total environment of the
investigation or study ... we have blown THIS one "fact" totally out of
proportion relative to all the other components of the study ... which are
FAR more important
take for example a very simple experimental study where we are doing drug
trials ... and have assigned Ss to the experimental, placebo control, and
regular control conditions ... and then when the study is over ... we do
our ANOVA ... see the printed p value ... then make our decision about the
meaningfulness of the results of this study ....
1. does this p value truly represent what p should be IF the null is true
AND nothing but sampling error is producing the result seen?
2. have Ss really been assigned at random ... or treatments at random to Ss?
3. have the drug therapies been implemented consistently and accurately
throughout the study?
4. are all the Ss who were in the study at the beginning still in it at the end?
5. was the dosage level (if that was the thing being examined) really the
right one to use?
6. if these were humans .... are we totally sure that NO S had ANY contact
with ANY other S ... thus raising the possibility of contamination across
experimental conditions?
7. have all the data been recorded correctly? if not, would there be ANY
way to know if a mistake had been made?
8. if humans were involved, and there was some element of self reporting
involved in the way the data were collected ... have Ss honestly and
accurately reported their data?
and on and on and on
there are SO many factors that produce the results ... that we have no way
of knowing which of the above, or any other ... might have influenced the
results ... BUT, the p value is only meaningful IF sampling error is the
only factor involved ...
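that calibration point can be illustrated with a small simulation ... a
hypothetical sketch (mine, not part of the original argument; the group
sizes, the z-test, and the size of the "contamination" bias are all
assumptions chosen for illustration): when sampling error is the only
factor, p < .05 occurs in about 5% of null experiments ... add even a
modest systematic artifact and the false-positive rate explodes, though
the treatment still does nothing ...

```python
import math
import random

random.seed(1)

def two_sample_p(x, y):
    """Two-sided two-sample z-test p-value (sd known to be 1)."""
    n, m = len(x), len(y)
    z = (sum(x) / n - sum(y) / m) / math.sqrt(1 / n + 1 / m)
    return math.erfc(abs(z) / math.sqrt(2))

def rejection_rate(bias, trials=2000, n=50, alpha=0.05):
    """Fraction of null experiments (no treatment effect) with p < alpha.

    `bias` stands in for a systematic artifact -- contamination across
    conditions, sloppy recording, dishonest self-report -- added to the
    'treatment' group. It is NOT a treatment effect.
    """
    hits = 0
    for _ in range(trials):
        control = [random.gauss(0, 1) for _ in range(n)]
        treated = [random.gauss(0, 1) + bias for _ in range(n)]
        if two_sample_p(control, treated) < alpha:
            hits += 1
    return hits / trials

print(rejection_rate(bias=0.0))  # close to 0.05: sampling error alone
print(rejection_rate(bias=0.5))  # well above 0.05: the p value misleads
```

the second number is what a reader of the printed ANOVA table would
mistake for evidence of a treatment effect ... the p value cannot tell
the two situations apart ...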
thus ... when we spend all this time debating the usefulness (or lack
thereof) of a p value ... whether it be the .05 level or ANY other ... we
are totally ignoring the fact that the p value that is reported ... could
have been the result of many factors having NOTHING to do with sampling
error ... and nothing to do with the treatments ...
our insistence on a p value like .05 as either the magical or
agreed-upon cut point ... is SO FAR OVERSHADOWED by all these
other potential problems ... that it makes the interpretation of, and
DEPENDENCE ON, ANY reported p value highly suspect ...
so here we are, arguing about .03 versus .06 ... when we should be arguing
about things like items 2 to 8 ... and ONLY when we have been able to
account for and rule out all of those ... MIGHT we have a look at
the p value and see what it is ...
but until we do, our essentially total fixation ON p values is attention so
misplaced ... as to be almost downright laughable behavior
and this is what we are passing along to our students? and this is what we
are passing along to our peers via published documents?
=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
http://jse.stat.ncsu.edu/
=================================================================