I opined "correlation is necessary but not sufficient for establishing a causal relationship."  Jim opined "depending on precisely what Karl means by 'correlation is necessary,' I'd have to disagree strongly."
 
    What I mean, stated more precisely, follows, but it is long.
 

When Does Correlation Imply Causation?

 

          It is not rare for researchers and students to confuse (1) correlation as a statistical technique with (2) nonexperimental data collection methods, which are also often described as “correlational.”  For example, a doctoral candidate at Florida State University hired me to assist him with the statistical analysis of data collected for his dissertation.  No variables were manipulated in his research.  I used multiple regression (a path analysis) to test his causal model.  When he presented this analysis to his dissertation committee, the chair asked him to reanalyze the data with an ANOVA, explaining that results obtained with ANOVA would allow them to infer causality, but results obtained with multiple regression would not because “correlation does not imply causation.”  I cannot politely tell you what my initial response to this was.  After I cooled down, and realizing that it would be fruitless to try to explain to this chair that ANOVA is simply a multiple regression with dummy coded predictors, I suggested that the student present the same analysis but describe it as a “hierarchical least squares ANOVA.”  The analysis was accepted under this name and the chair felt that she then had the appropriate tool with which to make causal inferences from the data.
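
          The claim that ANOVA is a multiple regression with dummy coded predictors can be checked numerically.  What follows is only a minimal sketch in Python, using three small groups of invented scores (not the student's data) and a hand-written solver for the normal equations; the names solve, groups, f_anova, and f_regression are made up for this illustration.  It computes F once from the usual between/within sums of squares and once from the R-squared of a regression on an intercept plus k - 1 dummy codes.

```python
# Sketch: one-way ANOVA versus least-squares regression on dummy codes.
# All scores below are invented purely for illustration.

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    size = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(size):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]

groups = [[4, 5, 6, 5], [7, 8, 6, 7], [10, 9, 11, 10]]
k = len(groups)                          # number of groups
n = sum(len(g) for g in groups)          # total number of scores
y = [score for g in groups for score in g]
grand_mean = sum(y) / n

# Route 1: classic one-way ANOVA from between and within sums of squares.
means = [sum(g) / len(g) for g in groups]
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
ss_within = sum((s - m) ** 2 for g, m in zip(groups, means) for s in g)
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# Route 2: regression on an intercept plus k - 1 dummy codes
# (the first group is the reference category).
x = [[1.0] + [1.0 if j == i else 0.0 for j in range(1, k)]
     for i, g in enumerate(groups) for _ in g]
xtx = [[sum(row[r] * row[c] for row in x) for c in range(k)] for r in range(k)]
xty = [sum(row[r] * yi for row, yi in zip(x, y)) for r in range(k)]
beta = solve(xtx, xty)                   # least-squares coefficients
fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
ss_total = sum((yi - grand_mean) ** 2 for yi in y)
ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared = 1 - ss_error / ss_total
f_regression = (r_squared / (k - 1)) / ((1 - r_squared) / (n - k))

print("F from ANOVA:     ", f_anova)
print("F from regression:", f_regression)
```

          The two F values agree apart from floating-point rounding, and the regression coefficients are simply the reference group's mean and the other groups' differences from it.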

 

          I have frequently encountered this delusion, the belief that it is the type of statistical analysis done, not the method by which the data are collected, which determines whether or not one can make causal inferences with confidence.  Several times I have had to explain to my colleagues that two-group t tests and ANOVA are just special cases of correlation/regression analysis.  One was a senior colleague who taught statistics, research methods, and experimental psychology in a graduate program.  When I demonstrated to him that a test of the null hypothesis that a point biserial correlation coefficient is zero is absolutely equivalent to an independent samples (pooled variances) two-groups t test, he was amazed.
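
          A demonstration of this sort might look like the following minimal Python sketch.  The reaction times are invented, and the function names (pooled_t, pearson_r, t_from_r) are made up for the illustration.  It computes the pooled-variances t directly, then computes the point biserial r (an ordinary Pearson r with group membership coded 0/1) and converts it to t with t = r * sqrt((n - 2) / (1 - r^2)).

```python
import math

# Sketch: pooled-variances t test versus the test of a point biserial r.
# The scores below are invented purely for illustration.

def pooled_t(group0, group1):
    """Independent-samples t (pooled variances), group1 minus group0."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))

def pearson_r(x, y):
    """Ordinary Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic (df = n - 2) for testing that a correlation is zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

placebo = [310, 295, 330, 305, 320]      # hypothetical reaction times (ms)
alcohol = [350, 340, 365, 335, 360]

t_direct = pooled_t(placebo, alcohol)

# Point biserial r: code group membership 0/1 and correlate with the scores.
codes = [0] * len(placebo) + [1] * len(alcohol)
scores = placebo + alcohol
r_pb = pearson_r(codes, scores)
t_via_r = t_from_r(r_pb, len(scores))

print("t from the t test:        ", t_direct)
print("t from point biserial r:  ", t_via_r)
```

          Both routes give the same t on the same degrees of freedom, and therefore the same p.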

 

          The hypothetical example that I give my students is this:  Imagine that we go downtown and ask people to take a reaction time test and to blow into a device that measures alcohol in the breath.  We correlate these two measures and reject the hypothesis of independence.  Can we conclude, from this evidence, that drinking alcohol causes increased reaction time?  Of course not.  There are all sorts of potential noncausal explanations of the observed correlation.  Perhaps some constellation of “third variables” is causing variance in both reaction time and alcohol consumption -- for example, perhaps certain brain defects both (1) slow reaction time and (2) dispose people to consume alcohol.  Suppose we take these same data and dichotomize the alcohol measure.  We employ an independent samples t test to compare the mean reaction time of those who have consumed alcohol with that of those who have not.  We find the mean of those who have consumed alcohol to be significantly higher.  Does that now allow us to infer that drinking alcohol causes increases in reaction time?  Of course not -- the same potential noncausal explanations that prevented such inference with the correlational analysis also prevent such inference with the two-group t test conducted on data collected by nonexperimental means.
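
          The downtown scenario can be simulated.  The sketch below is purely hypothetical: it assumes a single standardized “third variable” z that raises both a standardized alcohol score and reaction time, with alcohol given no effect at all on reaction time.  The coefficients, the seed, and every name are assumptions made for the illustration.

```python
import math
import random

# Sketch: a third variable produces a correlation with no causal effect.
# Every number here is an assumption chosen for illustration.

def pearson_r(x, y):
    """Ordinary Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pooled_t(group0, group1):
    """Independent-samples t (pooled variances), group1 minus group0."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))

rng = random.Random(1)                   # fixed seed so the run is repeatable
alcohol, reaction_time = [], []
for _ in range(2000):
    z = rng.gauss(0, 1)                  # hypothetical third variable
    alcohol.append(z + rng.gauss(0, 1))  # standardized alcohol score
    # Reaction time depends on z only; alcohol does not appear here at all.
    reaction_time.append(300 + 20 * z + rng.gauss(0, 20))

# Analysis 1: correlate the two measures.
r = pearson_r(alcohol, reaction_time)

# Analysis 2: dichotomize the alcohol measure and run a two-group t test.
high = [rt for a, rt in zip(alcohol, reaction_time) if a > 0]
low = [rt for a, rt in zip(alcohol, reaction_time) if a <= 0]
t = pooled_t(low, high)

print("r =", r)
print("t =", t)
```

          Both analyses reject independence even though alcohol has no effect in the simulated world.  Switching from r to t changes nothing about what can be inferred.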

 

          Now consider that we bring our research into the lab.  We employ experimental means -- we randomly assign some folks to an alcohol consumption group, others to a placebo group, taking care to avoid any procedural or other confounds.  When a two-groups t test shows that those in the alcohol group have significantly higher reaction time than those in the placebo group, we are confident that we have results that allow us to infer that drinking alcohol causes slowed reaction time.  If we had conducted the analysis by computing the point biserial correlation coefficient and testing its deviation from zero, we should be no less confident of our causal inference, and, of course, the value of t (or F) and p obtained by these two seemingly different analyses would be identical.
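
          A companion simulation, again only a sketch under invented assumptions, shows what random assignment buys.  A hypothetical third variable z still affects reaction time, but a coin flip decides who gets alcohol, so z cannot be related to treatment except by chance.  The true effect of 25 ms, the seed, and all names are made up for the illustration.

```python
import random

# Sketch: random assignment breaks the link between treatment and a
# third variable. Every number here is an assumption for illustration.

def mean(values):
    return sum(values) / len(values)

rng = random.Random(2)                   # fixed seed so the run is repeatable
true_effect = 25.0                       # assumed slowing (ms) due to alcohol

alcohol_rt, placebo_rt = [], []
alcohol_z, placebo_z = [], []
for _ in range(2000):
    z = rng.gauss(0, 1)                  # hypothetical third variable
    gets_alcohol = rng.random() < 0.5    # the coin flip ignores z entirely
    rt = 300 + 20 * z + rng.gauss(0, 20)
    if gets_alcohol:
        alcohol_rt.append(rt + true_effect)
        alcohol_z.append(z)
    else:
        placebo_rt.append(rt)
        placebo_z.append(z)

# The third variable is balanced across groups, apart from chance.
z_gap = mean(alcohol_z) - mean(placebo_z)

# So the difference in mean reaction time estimates the causal effect.
estimated_effect = mean(alcohol_rt) - mean(placebo_rt)

print("difference in mean z between groups:", z_gap)
print("estimated effect of alcohol (ms):   ", estimated_effect)
```

          The third variable is nearly balanced across the two groups, and the difference in means lands close to the assumed effect.  The same conclusion would follow whether these data were then analyzed with a t test or with a point biserial correlation.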

 

          Accordingly, I argue that correlation is a necessary but not a sufficient condition to make causal inferences with reasonable confidence.  Also necessary is an appropriate method of data collection.  To make such causal inferences one must gather the data by experimental means, controlling extraneous variables which might confound the results.  Having gathered the data in this fashion, if one can establish that the experimentally manipulated variable is correlated with the dependent variable (and that correlation does not need to be linear), then one should be (somewhat) comfortable in making a causal inference.  That is, when the data have been gathered by experimental means and confounds have been eliminated, correlation does imply causation.

 

          So why is it that many persons believe that one can make causal inferences with confidence from the results of two-group t tests and ANOVA but not from the results of correlation/regression techniques?  I believe that this delusion stems from the fact that experimental research typically involves so small a number of experimental treatments that data from such research are conveniently evaluated with two-group t tests and ANOVA.  Accordingly, t tests and ANOVA are covered when students are learning about experimental research.  Students then confuse the statistical technique with the experimental method.  I also feel that the use of the term “correlational design” contributes to the problem.  When students are taught to use the term “correlational design” to describe nonexperimental methods of collecting data, and cautioned regarding the problems associated with inferring causality from such data, the students confuse correlational statistical techniques with “correlational” data collection methods.  I refuse to use the word “correlational” when describing a design.  I much prefer “nonexperimental” or “observational.”

 

          In closing, let me be a bit picky about the meaning of the word “imply.”  Today this word is used most often to mean “to hint” or “to suggest” rather than “to have as a necessary part.”  Accordingly, I argue that correlation does imply (hint at) causation, even when the correlation is observed in data not collected by experimental means.  Of course, with nonexperimental data, the potential causal explanations of the observed correlation between X and Y must include models that involve additional variables and models that differ with respect to which events are causes and which are effects.


Karl L. Wuensch, Department of Psychology,
East Carolina University, Greenville NC  27858-4353
Voice:  252-328-4102     Fax:  252-328-6283
[EMAIL PROTECTED]
http://core.ecu.edu/psyc/wuenschk/klw.htm
