To understand correlations, dependence and causality better, I suggest
going to the source -- "Causality" by Judea Pearl. Good book. He goes
into necessary and sufficient conditions, etc.

P

_____________________________________
Pradyumna Sribharga Upadrashta, PhD Student
Scientific Computation, UofMN


>-----Original Message-----
>From: [EMAIL PROTECTED] 
>[mailto:[EMAIL PROTECTED] On Behalf Of Jay Warner
>Sent: Monday, September 29, 2003 5:15 PM
>Cc: [EMAIL PROTECTED]
>Subject: Re: [edstat] Unbiasedness and Variance - Regression
>
>
>Now we're getting into it - seriously!
>
>Eric Bohlman wrote:
>
>> [EMAIL PROTECTED] (Jay Warner) wrote in news:[EMAIL PROTECTED]:
>>
>> > "correlation" says that when we observe a change in one variable, we
>> > see a consistent change in the other.  If the correlation is
>> > positive, then both go up together and down together.  If the
>> > correlation is negative, then as one goes up, the other goes down,
>> > and vice versa.
>>
>> Not quite.  Correlation specifically implies that the average of one 
>> variable is proportional to the value of the other.  There are 
>> relationships that meet your definition but show little correlation.
>
>"Average"?
>
>Suppose I have a set of paired data, say 20  pairs, x(i) and 
>y(i).  I plot them on a 2-D field, a standard scatter plot.
>
>If the 'cloud' of points forms a lens tipped upward toward the 
>right, I have a positive correlation, true? And if the cloud 
>is a lens tipped downward to the right, I have a negative 
>correlation.  True? And if the cloud is a horizontal lens, or 
>a circular  shape, the correlation will, no doubt, be near 0, true?
>
>(Sorry I can't draw the pictures here.)
>
>I see nothing in here that involves the average of either set of
>data.  In fact, I believe 'correlation' doesn't care about the
>average of either set.
>
>True?
>
>Nonetheless, for the case of a positive correlation, a higher 
>x(i) is associated with a higher y(i).
>
>Where is the logic off here?
>
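On the "average" question: Pearson's r is computed from deviations about
each variable's own mean, so shifting either variable by a constant leaves
r unchanged, while tipping the cloud the other way flips its sign. A
minimal Python sketch, assuming 20 made-up pairs (the data and the helper
name pearson_r are hypothetical, not from the thread):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation, built from deviations about each mean."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# 20 made-up pairs: an upward trend plus an alternating +/-3 wiggle,
# i.e. a cloud tipped upward toward the right.
xs = list(range(1, 21))
ys = [0.5 * x + (3 if i % 2 == 0 else -3) for i, x in enumerate(xs)]

r = pearson_r(xs, ys)
# Shift both variables: the means change, the correlation does not.
r_shifted = pearson_r([x + 100 for x in xs], [y - 50 for y in ys])
# Mirror y: the cloud now tips downward to the right, so the sign flips.
r_flipped = pearson_r(xs, [-y for y in ys])

print(r, r_shifted, r_flipped)
```

So the overall averages drop out of r, and a positive r still means that
higher x(i) tends to go with higher y(i).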
>> >
>> > "dependence" indicates that variable B is controlled, or is caused
>> > by, variable A.  The question of what is 'causality' takes up more
>> > space than the internet has available.
>>
>> Specifically, dependence means that if you know the value of B, you 
>> can make a better guess at the value of A than if you didn't.
>
>This definition of 'dependence' is less stringent than what I
>was thinking of as 'cause.'  I stand corrected.  Yes, if I go
>to Oldenburg, in the years in question, and count the number
>of storks, I can predict the number of people. True.  If I
>now go to Oldenburg in this year, and the relationships of the
>variables are still valid, then I could still predict the
>human population.
>
>But if I go around the city and shoot half of the storks 
>(which I assure you, I don't intend to do), the human 
>population of the city would not go down accordingly.
>
>In the case of storks and people, and other such 'dependence'
>cases, I believe the reason is that one or more undisplayed
>variables in fact form the causal link between the two
>observed variables.  If the population of Oldenburg
>was reduced by warfare in W.W.II, I'm sure the stork
>population declined with it.  When we exercise the
>undisplayed (often called 'hidden') variables, then we see the
>effects that result from 'causes.'
>
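The hidden-variable argument can be sketched in a few lines of Python.
This is only a toy model under loud assumptions: the hidden variable
town_size, the coefficients, and the noise levels are all invented for
illustration, and none of it is the Box, Hunter & Hunter data. Storks and
people both depend on town_size, and people do not depend on storks:

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def people(size, noise):
    # Assumed structural equation: people depend on town size only.
    return 1000.0 * size + noise

rng = random.Random(0)          # fixed seed so the run is repeatable
years = 30
town_size = [1.0 + 0.1 * t for t in range(years)]  # hidden variable
stork_noise = [rng.gauss(0, 5) for _ in range(years)]
people_noise = [rng.gauss(0, 50) for _ in range(years)]

storks = [100.0 * s + e for s, e in zip(town_size, stork_noise)]
humans = [people(s, e) for s, e in zip(town_size, people_noise)]
r_observed = pearson_r(storks, humans)

# Intervention: remove half the storks. Town size is untouched, and the
# equation for people never looks at storks, so people do not change.
storks_after = [0.5 * s for s in storks]
humans_after = [people(s, e) for s, e in zip(town_size, people_noise)]

print(r_observed, humans_after == humans)
```

The observed correlation is strong, yet intervening on storks leaves the
human population exactly where it was. Changing town_size instead would
move both.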
>> > In Box, Hunter & Hunter is a plot of the human population of
>> > Oldenburg, Germany, against the number of nesting storks for a
>> > certain time period.  It "proves" that storks bring human babies, 
>> > since more storks means more people. Does the human population 
>> > "depend" on the stork population?  I don't think so.  Is the human 
>> > population correlated with the stork population?  Yup.
>>
>> But technically, that *is* a relationship of dependence.  Knowing the
>> stork population helps you estimate the human population.  That in no
>> way implies any sort of causal relationship; all it implies is that
>> the joint distribution of human population and stork population isn't
>> the same for different marginal values of stork populations.
>> Dependence != causality. Dependence is not an inherently asymmetric
>> relationship.
>
>Got it!  This looks suspiciously like a statement, in proper
>terminology, of what I was attempting to get at above.
>
>So can we say that
>
>'correlation' indicates a mutual dependence between two
>variables, and neither correlation nor dependence indicates
>a causal relationship.
>
>?
>
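One caveat on that summary: a nonzero correlation implies dependence, but
the converse does not hold. A small Python sketch with made-up data (the
values and the helper name pearson_r are hypothetical): y is completely
determined by x, yet Pearson's r is zero because the relationship is
symmetric rather than linear.

```python
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# x is symmetric about 0 and y = x^2, so y depends on x exactly.
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]
r = pearson_r(xs, ys)

# Dependence shows up as soon as we condition on x: the average y is
# very different for |x| > 0.5 than for |x| <= 0.5.
outer = [y for x, y in zip(xs, ys) if abs(x) > 0.5]
inner = [y for x, y in zip(xs, ys) if abs(x) <= 0.5]
mean_outer = sum(outer) / len(outer)
mean_inner = sum(inner) / len(inner)

print(r, mean_outer, mean_inner)
```

Knowing x improves the guess at y enormously here, even though r is
zero up to rounding.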
>In order to show a causal relationship, we would have to go to
>a DoE, where we consciously set factor levels, and watch for
>changes in response.  OR, do something like Shainin's "Big Red
>X" scene, where you make a change, see the problem go away,
>then make the change back and see if it reappears.  If it goes
>away and comes back 'under control,' then we say there is a
>causal relationship, and we can fix the original problem.
>
>Make sense?
>
>Cheers,
>Jay
>--
>Jay Warner
>Principal Scientist
>Warner Consulting, Inc.
>4444 North Green Bay Road
>Racine, WI 53404-1216
>USA
>
>Ph: (262) 634-9100
>FAX: (262) 681-1133
>email: [EMAIL PROTECTED]
>web: http://www.a2q.com
>
>The A2Q Method (tm) -- What do you want to improve today?
>
>
>
>
>.
>. =================================================================
>Instructions for joining and leaving this list, remarks about 
>the problem of INAPPROPRIATE MESSAGES, and archives are available at:
>.                  http://jse.stat.ncsu.edu/                    .
>=================================================================
>
