Re: Regression CIs (was: Normally Distributed ANOVA FACTORS?)

bijag Mon, 30 Sep 2002 08:57:49 -0700

Hi Gus,

>
> > Gus,
> >
> > It would be possible to subsample any data set in order to produce any
> > effect desired. Until I know how you find the subset, I can not know if
> > that action is removing or obscuring the causal pattern.
>
> Neither do I. But I need to know whether the causal pattern is still
> there. I guess you allow the possibility that the causal relation is
> removed. I just can't see how.
> Let's say, I take the causal relation "Smoking causes cancer". The
> existence of this causal relation should not depend on how I pick my
> subjects, should it?


The causal pattern still exists in the victims of cancer, but the numbers
will not reflect it if the subjects are chosen in a way that obscures the
causal mechanism. If we subsampled a population by retaining only those
values that deviate widely from the expected values (according to
regression), then we would magnify the errors at the expense of seeing the
common variance. This would be a way of obscuring the causal pattern. I am
not saying this is what you are doing. But samples can be collected that
disinherit the causal evidence. Sampling the causes normally is one way of
disinheriting important causal data, since the extremes of the causes will
very rarely be combined. This is an example of how our sampling can obscure
the presence of causal patterns. Sampling is a very dynamic thing; it is not
the holy grail that so many correlational and SEM folks make it out to be
with their snapshot samples. What nature throws up for the convenience of a
scientist's snapshot is in no way more valid than a systematically collected
sample, in which the scientist counterbalances and seeks information in a
more efficient and conscientious manner.
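
To make the residual-based subsampling concrete, here is a minimal
simulation sketch. Everything in it is an assumption made for illustration,
not anyone's real data: a single cause x, an outcome y = x + noise, and a
rule that keeps only the 20% of cases lying farthest from the fitted
regression line. The names (pearson, kept, r_full, r_sub) are hypothetical.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed causal model: y is caused by x, plus independent noise.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

# Ordinary least-squares line fitted to the full sample.
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
abs_resid = [abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, y)]

# Keep only the cases that deviate most from the regression prediction.
cutoff = sorted(abs_resid)[int(0.8 * n)]
kept = [(xi, yi) for xi, yi, r in zip(x, y, abs_resid) if r >= cutoff]
x_sub = [p[0] for p in kept]
y_sub = [p[1] for p in kept]

r_full = pearson(x, y)
r_sub = pearson(x_sub, y_sub)
print("full sample r:", round(r_full, 3))
print("large-residual subsample r:", round(r_sub, 3))
```

Under these assumptions the relation between x and y is unchanged in every
retained case, yet the correlation in the subsample comes out weaker,
because the selection rule inflates the error variance relative to the
common variance.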



> Of course, if I end up with only nonsmokers in the subsample, I won't
> detect the relationship, but that does not invalidate its existence.
> What am I missing?

If you remove those with cancer, then you will not be able to detect cancer,
and the data will have disinherited the cancer mechanism. The folks with
cancer still have it, but the sample is inadequate to reflect the cancer.

When we trim the ends off normal distributions, the disjunctive cases still
exist, but we remove them so that the conjunctive pairings can exist.
Sampling normally removes the conjunctive cases in the extremes. By trimming
we just continue by counterbalancing and removing the disjunctive as well,
leaving data that has not been filtered (the new extremes). There is
absolutely nothing wrong with picking your subjects so long as you are not
rigging the data to imply something that is not true. We might, for example,
pick subjects with lung cancer and those with none. Then we might compare
the number of years people from each group smoked. There is nothing wrong
with doing this. Similarly, when a chemist experiments on some chemical,
there is no reason he should be forbidden from isolating that chemical from
the other chemicals that might correlate or occur with it in the wild.
Purification is a common practice in most sciences. We just have to be very
explicit about what we do.
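
As a rough illustration of how rarely the extremes of two causes are
combined under normal sampling, here is a small sketch. It assumes two
independent standard-normal causes and treats "extreme" as more than 2
standard deviations from the mean; both choices, and the four-level crossed
design used for comparison, are hypothetical and only for illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10000
CUTOFF = 2.0  # assumed definition of "extreme": beyond 2 SD from the mean

def is_extreme(value):
    return abs(value) > CUTOFF

# Snapshot sample: both causes drawn independently from normal distributions.
cause_a = [rng.gauss(0, 1) for _ in range(n)]
cause_b = [rng.gauss(0, 1) for _ in range(n)]
joint_random = sum(1 for a, b in zip(cause_a, cause_b)
                   if is_extreme(a) and is_extreme(b))
frac_random = joint_random / n

# Counterbalanced design: every level of one cause is crossed with every
# level of the other, so the extreme-extreme cells are filled on purpose.
levels = [-3.0, -1.0, 1.0, 3.0]
design = [(a, b) for a in levels for b in levels]
joint_design = sum(1 for a, b in design if is_extreme(a) and is_extreme(b))
frac_design = joint_design / len(design)

print("jointly extreme cases, normal sampling:", frac_random)
print("jointly extreme cases, crossed design:", frac_design)
```

In the crossed design a quarter of the cells pair two extreme levels, while
under normal sampling such pairings make up only a tiny fraction of the
cases.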

So far, I still do not know what you are doing.

Bill


