Gus,

Does the procedure you use fill out the corners of a cross tabulation of x1
and x2?  Are the intervals of equal width?
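
To make the question concrete, here is a minimal sketch of what I mean by
"filling out the corners". It is only an illustration, not your procedure:
the function names cross_tab and corner_counts, the 5-by-5 grid, the sample
size of 1000, the [-3, 3] range for the normal case and the fixed seed are
all my own assumptions. It tabulates pairs (x1, x2) into equal-width
intervals and reports the counts in the four corner cells, once for
uniform draws and once for standard normal draws.

```python
import random

def cross_tab(x1, x2, lo, hi, bins):
    """Count (x1, x2) pairs in a bins-by-bins grid of equal-width intervals on [lo, hi]."""
    width = (hi - lo) / bins
    table = [[0] * bins for _ in range(bins)]
    for a, b in zip(x1, x2):
        if not (lo <= a <= hi and lo <= b <= hi):
            continue  # ignore pairs that fall outside the grid
        # min() puts a value exactly equal to hi into the last interval
        i = min(int((a - lo) / width), bins - 1)
        j = min(int((b - lo) / width), bins - 1)
        table[i][j] += 1
    return table

def corner_counts(table):
    """Counts in the four corner cells, where both variables are extreme."""
    last = len(table) - 1
    return [table[0][0], table[0][last], table[last][0], table[last][last]]

rng = random.Random(0)  # fixed seed (an assumption), only for repeatability
n = 1000                # illustrative sample size

uniform_tab = cross_tab([rng.random() for _ in range(n)],
                        [rng.random() for _ in range(n)], 0.0, 1.0, 5)
normal_tab = cross_tab([rng.gauss(0, 1) for _ in range(n)],
                       [rng.gauss(0, 1) for _ in range(n)], -3.0, 3.0, 5)

print("uniform corners:", corner_counts(uniform_tab))
print("normal corners: ", corner_counts(normal_tab))
```

Comparing the two printed lines shows how well each kind of sample
populates the cells where both x1 and x2 are extreme.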

Bill

"Gus Gassmann" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
>
>
> [EMAIL PROTECTED] wrote:
>
> > Hi Gus,
> >
> > major snip...
> >
> > >
> > > >       It sounds to me that when you collect subsamples you are
> > > >      selecting y values somehow so that you are building in
> > > >      additional dependencies between the collected x values and
> > > >      the y values.
> > > >
> > > This is impossible, like I said. When I construct y as the sum of x1
> > > and x2, then y is the effect and x1 and x2 are the causes. This fact
> > > is not altered in the least by my decision to report only every tenth
> > > set of values, or every one hundredth, or any other subset. (At least
> > > in my definition of "cause". If you disagree on this point, then there
> > > is indeed no purpose in continuing.) Whether the causal effect is
> > > _visible_ or not is of course another matter.
> >
> > If you simply counted every 5th or tenth value then you are collecting
> > uniform subsamples of a normal distribution. This will not work because
> > you are not allowing for coincidences of the extremes of x1 and x2. They
> > are still very rare and do not tend to occur together. Thus you are
> > merely subsampling uniformly from normal distributions! In doing so, you
> > are not filling out the corners of the cross tabulation of x1 and x2.
> > There will still not be data in which similar values of x1 and x2 are
> > crossed in their extremes. So we need to talk about what we mean by
> > uniform distributions.
>
> That is not what I meant, so let's back up. What I mean by a uniform
> random variable in one dimension is something that has the probability
> density 1/(b-a) I_{[a,b]}(x), that is, the probability that a realization
> of this random variable falls into any subinterval of [a,b] of length
> delta depends only on delta, not on the endpoints of the subinterval. The
> Excel function rand() spits out such uniformly distributed random
> variables (on [0,1]).
>
> Let's say I collect a sample of size 100 from the two uniformly
> distributed random variables x1 and x2 and I compute y = x1 + x2. In this
> sample y is caused by x1 and x2, the way I understand it. (Do you agree?)
>
> I don't want to give the entire sample here, but let's say it looks like
> this:
>
> Row     x1      x2      y
>   1       0.47   0.15   0.62
>   2       0.71   0.43   1.14
>   3       0.77   0.87   1.64
> ...
> 100     0.50   0.74   1.24
>
> Now suppose I take a subsample from this, for instance,
> I select rows 2, 7, 16, 33, 39, 54, 66, 71, 90, 99.
> In this subsample, is y still caused by x1 and x2 or not?
>
>
>
>
>
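
The setup quoted above can be sketched in a few lines. This is a minimal
illustration under my own assumptions, not the spreadsheet Gus used: Python's
random.Random stands in for the Excel rand() function, the seed is arbitrary,
and max_residual is a hypothetical helper name. It builds 100 rows with
y = x1 + x2, picks the row numbers listed in the message, and checks how far
y is from x1 + x2 in the full sample and in the subsample.

```python
import random

rng = random.Random(1)  # fixed seed (an assumption), only for repeatability

# 100 rows of (x1, x2, y), with x1 and x2 uniform on [0, 1) and y = x1 + x2
sample = []
for _ in range(100):
    x1, x2 = rng.random(), rng.random()
    sample.append((x1, x2, x1 + x2))

# The row numbers listed in the message (1-based)
rows = [2, 7, 16, 33, 39, 54, 66, 71, 90, 99]
subsample = [sample[r - 1] for r in rows]

def max_residual(data):
    """Largest |y - (x1 + x2)| over the given rows."""
    return max(abs(y - (x1 + x2)) for x1, x2, y in data)

print("full sample:", max_residual(sample))
print("subsample:  ", max_residual(subsample))
```

The check only confirms that selecting rows leaves the arithmetic relation
between y, x1 and x2 untouched in every retained row. It does not address
whether that relation is visible in summary statistics of the subsample.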



=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
