Hi

James M. Clark
Professor & Chair of Psychology
[email protected]
Room 4L41A
204-786-9757
204-774-4134 Fax
Dept of Psychology, U of Winnipeg
515 Portage Ave, Winnipeg, MB
R3B 0R4  CANADA


>>> "Mike Palij" <[email protected]> 12-Apr-13 7:00 AM >>>
On Thu, 11 Apr 2013 19:50:14 -0700, Jim Clark wrote:
Consider Leo DiCara's research, and imagine a meta-analysis were done of
his research on operant conditioning of the autonomic nervous system.
After the initial positive findings, replications failed and stopped being done.
But, net-net, there would be some non-zero effect size because of (a) the
early effects and (b) the overall large sample size.  But Dworkin and
Miller show the problems in this:

JC: I was simply asking a statistical question about whether aggregating p 
values would produce different results than a test on the entire sample using 
exactly the same data.  As Mike P has focused on cases where there is no 
effect, I changed my simulation so that the null was true (i.e., samples drawn 
from a population with mu = 50 and tested against mu = 50).  The aggregate 
result for 250 samples of 10 was p = ~.5 using Fisher's procedure.  The 
overall t was also not even close to significant.
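
For concreteness, here is a minimal sketch of that simulation in Python.  The 
population SD of 10 and the two-sided tests are assumptions for illustration; 
neither was specified above.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
k, n, mu = 250, 10, 50.0              # 250 samples of 10; the null is true

samples = rng.normal(loc=mu, scale=10.0, size=(k, n))

# One-sample t-test of each sample against mu = 50.
t, p = stats.ttest_1samp(samples, popmean=mu, axis=1)

# Fisher's procedure: -2 * sum(ln p_i) is chi-square on 2k df when the
# null holds in every sample (scipy.stats.combine_pvalues does the same).
x2 = -2.0 * np.log(p).sum()
p_fisher = stats.chi2.sf(x2, df=2 * k)

# Overall t-test on all 2500 observations pooled together.
t_all, p_all = stats.ttest_1samp(samples.ravel(), popmean=mu)

print(f"Fisher combined p = {p_fisher:.3f}")  # uniform under a true null
print(f"pooled-sample p   = {p_all:.3f}")

Under a true null the combined p is itself uniform, so any single run lands 
somewhere in (0, 1); neither test rejects more than 5% of the time.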

Mike P carries on:
You consistently avoid the issue of:

(1) Making a firm decision about what effect size a researcher thinks is
present and whether it is best to view it as a fixed effect or a random 
effect.

JC: Yes, I was not interested in that question, as noted above, except to 
examine the case of an effect with low power for individual tests.  And I'm not 
aware that fixed vs random factors applies to a single-sample test.  I thought 
it had to do with the levels of a factor when multiple conditions were being 
compared, and with whether across studies the conditions stayed the same 
(fixed) or varied (random).

Mike P:
(2) Doing an a priori power analysis in order to determine the
probability of detecting an effect (i.e., the probability of rejecting a false
null hypothesis).  If statistical power is less than .50, I think that it is
unethical to allow such research to be done -- who wants to do research where
the probability of making a Type II error is greater than 50%?  In the
course of doing an a priori power analysis, one can determine the
number of subjects/participants one will need to detect the effect size
one has specified (and the total sample will probably need some more people
in order to take into account subject loss due to attrition, errors made in
procedure, acts of God, etc.).
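
For concreteness, a sketch of such an a priori calculation using statsmodels.  
The effect size (d = .5) and the two-independent-groups design are 
illustrative assumptions, not values from the post.

from statsmodels.stats.power import TTestIndPower

# Per-group n needed to detect d = .5 with power .80 at alpha = .05
# (two-sided, two independent groups).
n1 = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                 power=0.80, alternative='two-sided')
print(f"n per group = {n1:.1f}")      # about 64

# Pad for attrition, procedural errors, acts of God, etc.
print(f"recruit about {n1 / 0.90:.0f} per group for 10% expected loss")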

JC: I've never been enamored of the idea of mixing methods and ethics 
questions, but perhaps you have a more positive view of REBs than I do.  And if 
a researcher recognizes the dangers of weak power and acts accordingly, I'm not 
sure what the ethical issues would be.

If you can't get enough subjects for an acceptable level of power
(e.g., power = .80, which I consider to be low because it means that
there is a 20% chance of committing a Type II error, a rate 4 times
the conventional 5% Type I error rate -- a ratio that makes clear a
researcher's biases and the costs attached to each kind of error), one
shouldn't do the study.  One might consider doing a pilot study to get
an estimate of the effect size one might obtain, and if it is too small
to be detected given your resources, do a qualitative study instead.
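
The same machinery gives the flip side: the power one actually has when n is 
fixed by one's resources.  The numbers (n = 20 per group, d = .5) are 
hypothetical.

from statsmodels.stats.power import TTestIndPower

# Power actually achieved with only 20 per group at d = .5.
power = TTestIndPower().power(effect_size=0.5, nobs1=20, alpha=0.05,
                              ratio=1.0, alternative='two-sided')
print(f"power = {power:.2f}")         # about .33 -- worse than a coin flip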

Neuroimaging studies are expensive across the board, and it is very bad
practice to use them in studies where it is almost impossible to detect
a false null hypothesis.  They should be used only in studies where firm
conclusions can be reached (i.e., high-powered, properly conducted
studies).  Anything less is a waste of precious resources, and it is the
type of practice that Button et al complain about.

Using low-power studies and then meta-analyzing them may
result in one detecting systematic errors and biases unrelated to the
phenomenon being studied (i.e., the "tweaking" that researchers do to get
statistically significant results).  Meta-analyze DiCara's published
studies and tell me what mean effect one obtains.  After you do so, I'll
explain why it's wrong.
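
For illustration, the standard fixed-effect (inverse-variance) pooling such a 
meta-analysis would use.  The effect sizes and standard errors below are 
invented placeholders, not DiCara's actual results.

import numpy as np

# Hypothetical per-study standardized effects and standard errors.
d  = np.array([0.80, 0.10, -0.05, 0.02])
se = np.array([0.25, 0.30, 0.20, 0.15])

w = 1.0 / se**2                       # inverse-variance weights
d_bar = (w * d).sum() / w.sum()       # pooled (fixed-effect) mean
se_bar = (1.0 / w.sum()) ** 0.5       # SE of the pooled estimate

print(f"pooled d = {d_bar:.3f}, SE = {se_bar:.3f}")

Note how a single large early effect can hold the pooled estimate away from 
zero even when the later studies cluster at zero, which is exactly the failure 
mode described above.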

JC: There may very well be biases in the system (researchers, publication 
practices, press releases, ...), which I have acknowledged several times, that 
make the aggregate approach problematic.  My question was simply whether the 
statistical approach of aggregating ps was itself problematic.  Under the 
idealized circumstances of a simulation, it indeed appears to NOT DETECT 
effects that are NOT there, and to DETECT effects that ARE there, albeit 
rarely significant in the individual samples.

Take care
Jim



