On Wed, 27 Nov 2013 06:48:47 -0800, Jim Clark wrote:
> Hi
>
> See
> http://www.nature.com/news/psychologists-strike-a-blow-for-reproducibility-1.14232
>
> I'm not convinced of the need for such explicit efforts, assuming that
> scientific psychology does value replication and meta-analyses. After all,
> who's to say whether the non-reproduced effects in this study were "correct"
> or the original studies that found effects?
Okay, let me make explicit some assumptions that are implicit both in
the article that Jim Clark has linked to and in the research it describes:
(1) Assume one has conducted an experimental study AND has obtained
a statistically significant result (e.g., Mean RT-Related is different from
Mean RT-Unrelated stimuli). There are then two possible reasons for that
result:
(a) there is a "real" effect, that is, in this case, one is faster at
responding to related stimuli than to unrelated stimuli,
and
(b) this is a Type I error, that is, the two means differ only because of
sampling error large enough to reach significance, which happened to occur
this 1 time out of 20.
There is no way to know which of the two cases above holds unless one
does a series of replications. The probability of committing a Type I
error
for the first experiment is .05 but the probability that a series of
statistically
significant results are all due to Type I errors decreases dramatically
as the
number of replications increases. Ten replications all providing similar
significant
results will have an astronomically low overall Type I error rate.
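To make that concrete, here is a minimal sketch of the arithmetic, assuming
the replications are independent and each is tested at alpha = .05 (the
numbers of replications shown are purely illustrative):

# Probability that k independent, significant replications are ALL Type I
# errors when there is truly no effect and each test uses alpha = .05.
alpha = 0.05
for k in (1, 3, 5, 10):
    p_all_type1 = alpha ** k
    print(f"{k:2d} significant replications, all false positives: p = {p_all_type1:.2e}")
# With 10 replications this is roughly 1e-13 -- "astronomically low" indeed.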
Meta-analysis can provide additional information about the magnitude of the
effect (the standardized difference between means) and its variability, and
meta-regression will allow one to determine whether there are
study/participant characteristics that mediate/moderate/interact with the
study design to affect the effect size. This, I think, is all to the good.
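For what it's worth, the core computation is simple enough to sketch. The
lines below show a fixed-effect, inverse-variance pooling of Cohen's d; the
three "studies" are hypothetical numbers invented purely for illustration,
and the variance formula for d is the usual approximation for two equal
groups:

import math

# (mean_related, mean_unrelated, pooled_sd, n_per_group) -- hypothetical values
studies = [(520.0, 560.0, 80.0, 30), (515.0, 550.0, 90.0, 45), (530.0, 555.0, 85.0, 60)]

weighted_sum, weight_total = 0.0, 0.0
for m1, m2, sd, n in studies:
    d = (m2 - m1) / sd                  # standardized difference between means
    var_d = (2 / n) + d ** 2 / (4 * n)  # approximate sampling variance of d
    w = 1 / var_d                       # inverse-variance weight
    weighted_sum += w * d
    weight_total += w

pooled_d = weighted_sum / weight_total
se_pooled = math.sqrt(1 / weight_total)
print(f"pooled d = {pooled_d:.2f}, 95% CI half-width = {1.96 * se_pooled:.2f}")

A meta-regression simply takes the next step and regresses the per-study d
values (weighted the same way) on study or participant characteristics.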
NOTE: Jim uses the term "correct" above but I'm not sure what he means
by
that. My focus is whether a result or a specific effect size can be
obtained
reliably under similar conditions.
(2) Another point to consider is what I refer to as the "Miller-Dworkin"
defense. This defense was presented in their 1986 paper that attempted
to explain why there is a "decline effect" in Leo DiCara's research
(i.e.,
an initial statistically significant result is found but with each
replication it
is reduced until the effect size is essentially zero; Jonathan Schooler
has
used this term to refer to some of his research, research by Rhine and
other parapsychologists, and it has been discussed previously on TiPS).
The abstract to the paper can be found here:
http://psycnet.apa.org/journals/bne/100/3/299/
Miller and Dworkin made a heroic effort to reproduce DiCara's finding of
operant conditioning of autonomic function but were unable to reproduce it.
Nonetheless, they expressed the belief that DiCara really did find such an
effect but that, because of the number of uncontrolled factors that may have
operated in his studies and differed from later studies, the effect could
not be reproduced. This is the charitable explanation; I'll leave it to
others to provide other speculations. Some of the recent work on priming
effects seems to be similar to this situation, that is, strong results are
initially found but replications produce reduced effects or no effects.
So, what is going on in these situations? Overly complex experimental
set-ups? Certain variables affecting the original studies but not later
studies? Possible expectancy effects in early studies but not in later
studies? Too many Type I errors? Who knows? What is known is that an effect
that once was reproducible is no longer reproducible. Was it really there,
or are later researchers doing something wrong? Replications, even failures
to replicate original results, allow one to determine what the overall
Type I error is. Tests by Ioannidis and Francis provide the means for
determining whether an effect size is really present (see the TiPS post at
this link and the website links within it):
http://www.mail-archive.com/[email protected]/msg07150.html
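For readers who have not seen those tests, here is a rough sketch, under my
own assumptions, of the logic behind the Ioannidis-Trikalinos
excess-significance test that Francis has applied to psychology: estimate
each study's power to detect the pooled effect, then ask whether the
observed number of significant results is plausible given those powers. The
effect size, sample sizes, and helper function below are hypothetical
illustrations, not anyone's published code:

from math import sqrt
from statistics import NormalDist

def power_two_group(d, n_per_group, alpha=0.05):
    # Approximate power of a two-sided, two-sample test for effect size d
    # (normal approximation, equal group sizes).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ncp = d * sqrt(n_per_group / 2)
    return (1 - NormalDist().cdf(z_crit - ncp)) + NormalDist().cdf(-z_crit - ncp)

pooled_d = 0.30                    # hypothetical meta-analytic effect size
ns = [20, 25, 30, 40, 50]          # hypothetical per-group sample sizes
powers = [power_two_group(pooled_d, n) for n in ns]

expected = sum(powers)             # expected number of significant results
prob_all = 1.0
for p in powers:
    prob_all *= p                  # chance that ALL studies come out significant

print(f"expected significant results: {expected:.2f} of {len(ns)}")
print(f"probability all {len(ns)} are significant: {prob_all:.4f}")
# If every study in a literature reports p < .05 while the powers say only
# one or two should, something other than luck is operating.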
> If I wanted to improve scientific psychology and its public image, I would
> encourage journals to report small empirical studies without grandiose
> theorizing (with or without statistically significant effects)
I don't think that this is ever going to happen for a variety of reasons
(e.g., studies that don't fit into a theoretical framework pose problems of
interpretation) but perhaps the critical reason is the following:
If a research study fails to produce a statistically significant result, is
it because:
(a) there is no effect to detect,
(b) there is insufficient power to detect the difference (see the sketch
below), or
(c) the researcher is incompetent and cannot conduct a proper research study
even with assistance from others?
Uncharitable people, especially on tenure and promotion committees,
might
be biased toward (c) if they want to get rid of the researcher from the
faculty.
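On point (b): the sample sizes needed to detect small effects are larger
than many people expect, which is one reason a non-significant replication
is weak evidence against an effect. A back-of-the-envelope calculation,
using a normal approximation for a two-sided, two-group comparison at 80%
power (the effect sizes and the helper function are illustrative, not taken
from any particular study):

from math import ceil
from statistics import NormalDist

def n_per_group(d, power=0.80, alpha=0.05):
    # Approximate per-group n for a two-sided, two-sample comparison.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.8, 0.5, 0.3, 0.2):
    print(f"d = {d:.1f}: about {n_per_group(d)} participants per group")
# Roughly 25, 63, 175, and 393 per group, respectively -- replications run
# with a few dozen participants simply cannot detect small effects reliably.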
However, with the growing number of "pay per page" journals, it seems
likely that a number of small studies might be published there because of
the apparent ease of getting published (and because, when one pays by the
page, an author can afford to publish only short articles).
> and ban the use of University public relations departments from
> disseminating the results of single studies in press releases, no matter
> how "newsworthy" the results might appear to be.
I strongly agree with this point, especially if the results have only been
presented at a research conference. I believe APA examines presentations at
its annual convention for research that would attract public interest and
provides press releases for them -- another practice that I think should
stop.
-Mike Palij
New York University
[email protected]