Re: Regression CIs (was: Normally Distributed ANOVA FACTORS?)

Gus Gassmann Sun, 29 Sep 2002 08:44:53 -0700


[EMAIL PROTECTED] wrote:

> Gus,
>
> I am still not sure what you are doing.  What is a bucket? The essence of
> what you seem to be claiming is that when we sample y to be uniform, then CR
> gives us the opposite results.  You admit, however, that CR works with the
> usual approach.

"Works" is a loaded word. I admit that when x1 and x2 are sampled to be uniform
and you compute y = x1 + x2 and further compute the correlation of y with x1,
both overall and over subsets of x1 (lower, middle and upper), the correlation
coefficients follow a predicted pattern.
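
A minimal sketch of that computation, for anyone who wants to reproduce it.
It assumes unit ranges by default, splits the range of x1 into three equal-width
pieces, and uses only the Python standard library. The function names, the
sample size and the seed are illustrative choices, not taken from the
original experiment.

```python
import random
from math import sqrt


def pearson(u, v):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def correlations_by_x1_third(n=30000, a1=0.0, b1=1.0, a2=0.0, b2=1.0, seed=0):
    """Correlation of y = x1 + x2 with x1, overall and within thirds of x1."""
    rng = random.Random(seed)  # seed is an arbitrary illustrative choice
    x1 = [rng.uniform(a1, b1) for _ in range(n)]
    x2 = [rng.uniform(a2, b2) for _ in range(n)]
    y = [u + v for u, v in zip(x1, x2)]

    overall = pearson(x1, y)

    # Assign each point to the lower, middle or upper third of [a1, b1].
    width = (b1 - a1) / 3
    names = ("lower", "middle", "upper")
    groups = {name: ([], []) for name in names}
    for u, w in zip(x1, y):
        k = min(int((u - a1) / width), 2)
        groups[names[k]][0].append(u)
        groups[names[k]][1].append(w)

    subsets = {name: pearson(xs, ys) for name, (xs, ys) in groups.items()}
    return overall, subsets


overall, subsets = correlations_by_x1_third()
print(overall, subsets)
```

With both ranges of equal width, the population correlation is 1/sqrt(2)
overall and 1/sqrt(10) within each third, because restricting x1 shrinks its
variance while the variance contributed by x2 stays the same.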

Now here is another stab at explaining what I did. First off, when you generate
x1 uniformly on [a1,b1] and x2 uniformly on [a2,b2], then you can set up
a grid on the rectangle [a1,b1] x [a2,b2], like this: (Hope this comes out;
it is intended to be viewed with a monospaced font.)

(a1,b2)                                       (b1,b2)
     ---- ---- ----       ----
    |    |    |    |     |    |
    |    |    |    | ... |    |
     ---- ---- ----       ----
    |    |    |    |     |    |
    |    |    |    |     |    |
     ---- ---- ----       ----

      ...

     ---- ---- ----       ----
    |    |    |    |     |    |
    |    |    |    | ... |    |
     ---- ---- ----       ----
(a1,a2)                                       (b1,a2)

If the distribution is uniform, you would expect each of the smaller
rectangles to contain roughly the same number of points (provided
the sample is large enough). My idea was to force this relationship
by _picking_ one point (one pair of values) in each tiny rectangle. If you have
10000 tiny rectangles (100 x 100), that gives you a pretty good
approximation to a uniform distribution (with 10000 data points).
(Do you agree?)
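
The picking step can be sketched roughly as follows. This is only one way to
do it, assuming the points come from a larger random sample and that the first
point landing in each cell is the one kept; the function name and parameters
are hypothetical.

```python
import random


def one_point_per_cell(points, a1, b1, a2, b2, m=100):
    """Keep the first point that falls in each cell of an m x m grid
    laid over the rectangle [a1, b1] x [a2, b2]."""
    chosen = {}
    for x1, x2 in points:
        # Column and row index of the cell; the min() keeps points that
        # sit exactly on the upper boundary inside the last cell.
        i = min(int((x1 - a1) / (b1 - a1) * m), m - 1)
        j = min(int((x2 - a2) / (b2 - a2) * m), m - 1)
        chosen.setdefault((i, j), (x1, x2))
    return chosen


# Small illustration on a 10 x 10 grid (the message uses 100 x 100).
rng = random.Random(1)
sample = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(5000)]
cells = one_point_per_cell(sample, 0.0, 1.0, 0.0, 1.0, m=10)
print(len(cells))
```

A cell stays empty if no sampled point falls into it, so filling all 10000
cells of a 100 x 100 grid needs a big sample that is considerably larger than
10000 points.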

So then I set out to find a subsample of the big one that was uniform
in x2 and y (where y was computed earlier as y = x1 + x2). Here's the
important question: Do you agree that the causality in the subsample
should be the same as the causality in the overall sample?
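
For concreteness, here is a sketch of how such a subsample could be drawn,
assuming the same one-point-per-cell idea is applied to a grid on (x2, y)
instead of (x1, x2). The exact selection rule used in the experiment is not
spelled out in this message, so the function below is an assumption, and its
name and parameters are hypothetical.

```python
import random


def subsample_uniform_in_x2_y(x1, x2, a1, b1, a2, b2, m=100):
    """Keep one (x1, x2, y) triple per occupied cell of an m x m grid
    on (x2, y), where y = x1 + x2 ranges over [a1 + a2, b1 + b2]."""
    y_lo, y_hi = a1 + a2, b1 + b2
    chosen = {}
    for u, v in zip(x1, x2):
        y = u + v
        i = min(int((v - a2) / (b2 - a2) * m), m - 1)
        j = min(int((y - y_lo) / (y_hi - y_lo) * m), m - 1)
        chosen.setdefault((i, j), (u, v, y))  # first point in the cell wins
    return list(chosen.values())


# Small illustration on a 10 x 10 grid with unit ranges.
rng = random.Random(2)
x1 = [rng.uniform(0, 1) for _ in range(20000)]
x2 = [rng.uniform(0, 1) for _ in range(20000)]
sub = subsample_uniform_in_x2_y(x1, x2, 0.0, 1.0, 0.0, 1.0, m=10)
print(len(sub))
```

Note that not every (x2, y) cell can be occupied: y - x2 must lie in [a1, b1],
so the feasible region is a parallelogram rather than the full rectangle, and
the subsample can be uniform only over that region.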

To my surprise and amazement, I did not find the pattern I predicted.
In fact, what this smaller sample suggests is that x2 and y cause x1.
In other words, different selection of the data points results in a
different causal relationship. This should not happen.

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
