Gus,
You need to explain what you did. You keep changing your story or expressing
it in such abstract terms that it is not clear what you did.
> >
> > If you simply counted every 5th or tenth value then you are collecting
> > uniform subsamples of a normal distribution. This will not work because
> > you are not allowing for coincidences of the extremes of x1 and x2. They
> > are still very rare and do not tend to occur together. Thus you are
> > merely subsampling uniformly from normal distributions! In doing so, you
> > are not filling out the corners of the cross tabulation of x1 and x2.
> > There will still not be data in which similar values of x1 and x2 are
> > crossed in their extremes. So we need to talk about what we mean by
> > uniform distributions.
>
> That is not what I meant, so let's back up. What I mean by a uniform
> random variable in one dimension is something that has the probability
> density 1/(b-a) I_{[a,b]}(x), that is, the probability that a realization
> of this random variable falls into any subinterval of [a,b] of length
> delta depends only on delta, not on the endpoints of the subinterval. The
> Excel function rand() spits out such uniformly distributed random
> variables (on [0,1]).
Gus, it is not as complicated as you are making it. A uniform distribution
is simply a range in which every interval of equal width is sampled, on
average, an equal number of times. The point of using uniform distributions
is to allow the extremes of y to be determined by the conjunction of the
most extreme similar values of x1 and x2. When this occurs, as in a
manifold, then the highest values of y will happen when the highest values
of x1 and x2 are added. The lowest values of y will occur when the lowest
values of x1 and x2 occur together. If you sample enough to allow this to
happen, then x1 and x2 will be positively correlated in the extremes of y.
It is simple logic.
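The equal-interval description of a uniform distribution can be checked
numerically. What follows is only a minimal sketch, assuming Python with
NumPy; the seed, the sample size and the choice of ten intervals are
illustrative assumptions, not anything taken from this thread.

```python
import numpy as np

# Assumptions for illustration only: seed, sample size, number of intervals.
rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(0.0, 1.0, n)  # uniform draws on [0, 1)

# Count how many draws land in each of ten equal-width intervals.
counts, edges = np.histogram(x, bins=10, range=(0.0, 1.0))

# Under a uniform distribution each interval should hold about n / 10 draws.
expected = n / 10
max_rel_dev = float(np.max(np.abs(counts - expected)) / expected)

print(counts)
print(max_rel_dev)
```

The counts are never exactly equal in a finite sample; they only agree on
average, and the relative deviation shrinks as n grows.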
>
> Let's say I collect a sample of size 100 from the two uniformly
> distributed random variables x1 and x2 and I compute y = x1 + x2. In this
> sample y is caused by x1 and x2, the way I understand it. (Do you agree?)
Yes, this is the model of causation: y = x1 + x2, where x1 and x2 are
uniformly distributed. We therefore expect instances in which the extreme
similar values of x1 and x2 are paired to produce the theoretical highest
and lowest values of y that are possible. If not enough data is gathered to
allow x1 and x2 to be crossed at all their levels, then CR will not work
because the extremes of x1 and x2 are not paired. This would be a case of
sloppy sampling, not of invalidation of CR.
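How much data it takes for x1 and x2 to be crossed at all their levels can
be sketched with a cross tabulation. This is a rough illustration only,
assuming Python with NumPy; the helper name empty_cells, the 10 x 10 grid,
the seed and the two sample sizes are all hypothetical choices.

```python
import numpy as np

def empty_cells(n, bins=10, seed=0):
    """Cross-tabulate n uniform (x1, x2) pairs on a bins x bins grid.

    Returns the number of empty cells and the number of rows that fall in
    the two "like extremes" corners (both lowest or both highest bin).
    """
    rng = np.random.default_rng(seed)  # seed is an arbitrary assumption
    x1 = rng.uniform(0.0, 1.0, n)
    x2 = rng.uniform(0.0, 1.0, n)
    table, _, _ = np.histogram2d(
        x1, x2, bins=bins, range=[[0.0, 1.0], [0.0, 1.0]]
    )
    corners = table[0, 0] + table[-1, -1]
    return int(np.sum(table == 0)), int(corners)

empty_small, corners_small = empty_cells(100)
empty_large, corners_large = empty_cells(10_000)

print(empty_small, corners_small)
print(empty_large, corners_large)
```

With 100 rows spread over 100 cells, a fair number of cells stay empty and
the two corner cells hold very few rows; with 10,000 rows the table fills
out.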
>
> I don't want to give the entire sample here, but let's say it looks like
> this:
>
> Row   x1     x2     y
> 1     0.47   0.15   0.62
> 2     0.71   0.43   1.14
> 3     0.77   0.87   1.64
> ...
> 100   0.50   0.74   1.24
Ok.
>
> Now suppose I take a subsample from this, for instance,
> I select rows 2, 7, 16, 33, 39, 54, 66, 71, 90, 99.
> In this subsample, is y still caused by x1 and x2 or not?
I ask AGAIN... on what basis are you choosing to select rows 2, 7, 16, 33,
... 99? I know for sure that if you simply trim the sparsely elaborated
tails off y, CR works on average. If all you are doing is choosing ten
rows out of 100, I must ask why those ten? Why cut the data down from 100
observations to just 10? Do not try to snowball me with abstractions. I am
far too old to be fooled by that kind of sophistry. Anyway, I am not that
stupid with mathematics. I will trace down your abstractions to their
roots. How do you think I managed to discover CR? You still do not seem to
understand that the point of trimming and the point of uniform causes is to
allow for combinations of the extremes of x1 and x2. Normal distributions
do not allow for such extremes. Using them is to use a rigged test. If
there is no possibility, or only a very slim one, of pairing the extremes
of x1 and x2, then they cannot be correlated. REAL scientists can ask the
question: would I expect the causal pattern to occur if I paired like
extremes of the causes? All honest scientists would say yes! Does your
mysterious subsampling procedure allow us to pair like extremes of x1 and
x2 enough times to see the correlation that I predict? If not, your test is
rigged and you are cheating or ignorant of what you do. Which is it?
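The difference between the two ways of picking 10 rows out of 100 can be
put in a small sketch. It assumes Python with NumPy and an arbitrary seed.
The random draw is only a hypothetical stand-in for a subsampling rule,
since the thread does not say how rows 2, 7, 16, ... were chosen, and the
5-lowest-plus-5-highest rule is likewise just one illustrative way to
keep the extremes of y.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, an assumption
n = 100
x1 = rng.uniform(0.0, 1.0, n)
x2 = rng.uniform(0.0, 1.0, n)
y = x1 + x2

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical rule 1: pick 10 of the 100 rows at random.
rows = rng.choice(n, size=10, replace=False)
r_random = corr(x1[rows], x2[rows])

# Hypothetical rule 2: keep the 5 rows with the lowest y and the 5 rows
# with the highest y, so that like extremes of x1 and x2 are paired.
order = np.argsort(y)
tails = np.concatenate([order[:5], order[-5:]])
r_tails = corr(x1[tails], x2[tails])

print(r_random)
print(r_tails)
```

In the rows taken from the two tails of y, x1 and x2 are both low or both
high, so their pooled correlation is positive; the value for the random
rows depends on which rows happen to be drawn.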
Bill