Steve,

You make some good points.  I respond in context below:


Subject: RE: Applying Corresponding Regressions across five data sets


> Dr. Chambers,
>
> Rather than focus on the data sets that failed, why not focus on the data
> sets where the method at least partially succeeded?

Because to me the successes are also questionable.  I approach the issue in
search of an understanding of the basic principles of causality and causal
inference, not as a commercial venture.  A method that is correct for
unknown reasons is pretty worthless to me, except as something for further
exploration.  I am not looking to real data sets for confirmation of CR,
since real data sets are usually so poorly constructed, and of such unknown
causal nature, as to be worthless as criteria for validation. I ranked tea
leaf reading a close second to the analysis of such survey data because it
is precisely the mysterious nature of real data that creates the need for
methods such as CR.

>
> Why did the housing data set show good results when the car mileage data
> set did not? Both data sets would appear to have the same amount of
> measurement error in them.

That can only be explained by a deeper knowledge of the data sets than we
currently have.  Measurement error has many forms.  Random error can be
handled by CR.  X2 may be a causal factor or an artifactual random error.
Either way CR works. The problem comes when there are nonrandom errors. All
statistical methods are vulnerable to these kinds of errors, not just CR.
Measurement theory in the social/behavioral sciences is still in its infancy.
It has been crippled by commercial/tenure interests to such an extent that
most researchers can not even comprehend the extent of the dangers of
confounding.  Just last week no one could imagine how mileage could cause
the engine specifications. But the idea of confounding a final cause (the
intention of the engineers) with a proxy measure of this intention (actual
mileage) was always a very real possibility that most folks did not and
probably still can not conceive of. The reason for this ignorance is that
current methods allow us to confound things without being shocked by our
conclusions. If we do not infer causation, we do not have to consider the
absurdity of confounded measures.
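
To make the random/nonrandom distinction concrete, here is a rough sketch in
Python (illustration only, not CR itself; the variable names and error sizes
are arbitrary assumptions of mine) of how random error merely attenuates an
ordinary regression slope, while a systematic, x-dependent distortion biases
it in a way that depends on the form of the distortion:

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

x = rng.uniform(-1, 1, n)             # true cause, uniformly sampled
y = 2.0 * x + rng.normal(0, 0.5, n)   # true effect of the cause

def slope(xv, yv):
    """Ordinary least-squares slope of yv on xv."""
    return np.cov(xv, yv)[0, 1] / np.var(xv)

# Random error added to the observed cause: classical attenuation toward zero.
x_random_err = x + rng.normal(0, 0.5, n)
# Nonrandom error: the distortion itself depends on x (a systematic error).
x_system_err = x + 0.8 * np.abs(x)

print("true slope               :", round(slope(x, y), 3))             # about 2.0
print("random error in cause    :", round(slope(x_random_err, y), 3))  # shrunk toward 0
print("systematic error in cause:", round(slope(x_system_err, y), 3))  # biased, form-dependent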

>
> Why did the cigarettes and mortality data set show good results when the
> cigarettes and cancer data set did not? Both data sets would seem to have
> the same sort of trouble with confounders.


You can not make this assumption since you do not know either data set very
well.  There is no alternative to knowing the data if you intend to make
assertions based on the data. Knowing the data takes work that most people
have neither the training nor the dedication to accomplish.  It's cheap and
easy to use convenience samples because they seem to have face validity to
those who lack the intelligence and/or education to know what a can of worms
such measures can be.

>
> When you find out the distinction between the data sets where CR fails and
> the data sets where CR succeeds, you will have made a truly valuable
> discovery.

Science progresses in steps. It will not be difficult to find out where CR
works and where it does not.  The difficulty is getting people to understand
that the problem with sloppy data is much bigger than CR.



> And if you find out that there is no possible way to tell a
> priori for which data sets CR will work or not, then you will have
> definitive proof that CR does not help in any practical setting.

Practicality is defined by what lazy and incompetent people are willing to
use to deceive the public.  If the data is inherently worthless, then no
method will make a silk purse from a sow's ear.  This is the point that you
and others seem not to get. If the data does not work with CR, and if CR is
a valid measure of combinations, then what is the data good for?  The
practicality of data is not a basis for justifying it. Garbage may be
practical but it is still garbage. I think that this is what frightens the
SEM folks most about the article they accepted and then rejected. I quoted
people like Thurstone and showed that the notion of normally distributed
causes is an invalid statistical concept.  This assertion should be far more
threatening to the SEM empire than CR, and it is not surprising that
Erlbaum's journal went to criminal lengths to suppress this point. Most SEM
folks assume it is best when all their variables are normally distributed.

If we can not tell whether or not data sets conform to fundamental
assumptions, then we really do not know what the data mean.  That some
statistics appear to be robust to such violations is probably a sign of the
insensitivity, and thus lack of validity, of the methods.  Real scientists
would not consider it asking too much to use good data. The same should apply
to the behavioral sciences. If the data is sloppy, then we should just get
better data.

>
> You have also proposed several testable hypotheses. First, CR does not work
> with convenience samples. You can test this hypothesis by running CR on a
> dozen or so data sets where the data was collected randomly rather than by a
> convenience sample. Does CR succeed more often with these data sets than
> with the ones I used?

This is just not so, Steve. Convenience data is simply any data that is
gathered in a nonscientific manner. It is data that is more practical to
gather because gathering it requires less training, intelligence, work and
expense. The convenience factor is identical to the practicality factor.
That something is practical does not mean it is scientifically useful. A
thousand convenience samples would tell us little more than would one
convenience sample. If we do not have good data, but only practical data,
then we are in the dark.  By experiments and simulations we can determine
which assumptions and data structures are important in causal inference. But
I would never trust convenience or practical samples.  Convenience/practical
samples are the big problem and should not be taken as the authors or judges
of scientific control. Throwing dirt together because it is practical and
then not getting a computer out of the pile is no sort of definitive proof.

>
> The second testable hypothesis is that CR fails when there are confounding
> variables. This is testable in two ways. One way is by identifying what the
> confounding variables are. I'd encourage you to analyze more data sets
> involving smoking, because that is an area where the confounding variables
> are well understood.

The truth is that although I will test as I can, my academic career was
destroyed and I have neither the time nor the resources to test much data.
We can think about real data and what kinds of confounds may arise from it,
simulate it, and then gain more understanding and skill in scientific
measurement. But this is not something that should be undertaken simply to
test CR. This is something that every statistician needs to do when
statistics of any sort are used.



> Another way is by running CR on designed experiments
> (not simulations) where confounding is not possible. Douglas Montgomery has
> a nice book with some examples of data from designed experiments that you
> should try. Does CR work more effectively with data from Montgomery than
> with data that I have used?

I do not know this data. In my 1991 paper I did analyze data from
experiments and the results were supportive of CR. So it's not like I have
not already done what you suggest.  Nobody listened or bothered to replicate
the findings.


>
> The third testable hypothesis is that CR fails when there is measurement
> error. This is very easily testable. Run your same simulations, but add some
> measurement error.

I have done this thousands of times. If the error is random and it attenuates
the correlation between cause and effect so that it falls in the range
between about .3 and .95, then CR will work, given large enough samples,
uniform (nontruncated) causes, etc. I do not know why this is not already
abundantly clear to all.
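
Here is a minimal sketch in Python of the kind of check I mean (illustration
only, not CR itself; the error standard deviations are arbitrary values I
chose). Random measurement error shrinks the observed cause-effect
correlation, and whether it stays inside the rough .3 to .95 window depends
on how large that error is:

import numpy as np

rng = np.random.default_rng(1)
n = 50_000

x = rng.uniform(0, 1, n)        # uniformly sampled (nontruncated) cause
y = x + rng.normal(0, 0.2, n)   # effect = cause + random disturbance

for err_sd in (0.0, 0.1, 0.3, 1.0):
    y_obs = y + rng.normal(0, err_sd, n)   # random measurement error on the effect
    r = np.corrcoef(x, y_obs)[0, 1]
    in_window = 0.3 <= r <= 0.95
    print(f"error sd {err_sd:4.1f}: observed r = {r:.2f}  (within ~.3-.95: {in_window})")

With small error the correlation sits comfortably in the window; with very
large error it drops below .3 and, on my account, CR should no longer be
expected to work.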




> You could also test this by finding a dozen or so data
> sets where there was no measurement error. Does CR succeed more often with
> these data sets than with the ones I used?


Have you seen any experiments with no measurement error? If so, show me the
data and I will analyze it.



> My theory that CR is not very robust may be a vague one, but it is based on
> comments you yourself have made. In your own emails you mentioned problems
> with
>
> 1. non-uniform errors (e.g., normal distribution errors),

In fact, so long as at least one cause is adequately sampled (uniform), CR
tends to work.

> 2. non-linearity,

Pearson's and Spearman's correlations are also not robust to nonlinearity:
Pearson's assumes a linear relation and Spearman's a monotone one. Do you
therefore dismiss them as being not robust? No, they are simply meant for
data of that kind.
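
A tiny illustration of that point (not about CR at all, just the correlations
themselves), assuming a perfectly deterministic but U-shaped relation:

import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.linspace(-1, 1, 1001)
y = x ** 2                      # exact dependence, but nonlinear and nonmonotone

print("Pearson r :", round(pearsonr(x, y)[0], 3))    # about 0: no linear relation detected
print("Spearman r:", round(spearmanr(x, y)[0], 3))   # also about 0: not monotone either

Nobody concludes from this that Pearson's r is a broken statistic; it is
simply being asked a question it was never designed to answer.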

> 3. convenience samples,

Garbage is not a proper test of robustness. It would be convenient to test
whether computers can be created from inorganic material by throwing sand
into a bucket and seeing whether a computer appears. Such a test would be
based on convenience. If the test produces nonsense, it would be wrong to say
that the test (does a computer sit in the bucket or not?) lacks robustness,
when the real problem is that the sample is just convenient dirt.  We must
sample the processes and the intricate components, not the primitive forms
that lead to them.

> 4. confounding variables, and

Absolutely no statistical test is robust to confounding.

> 5. measurement errors.

Yes, but let's be clear about what sort of errors.  Remember the chapters in
those undergraduate psychometric texts that talked about systematic versus
random errors?  We must be articulate about the meaning of errors.

Steve, your tests are not very scientific as you define them.  You seek
practicality before having an understanding of the fundamentals. That is no
way to test statistical ideas. First get the logic, then the math, then
practice operationalizing it with simulations. If it still works, then you
have something interesting. Then get transparently obvious data (as with
experiments). This is the limit of validation. When it comes to the survey
and convenience samples you espouse, those are absolutely not capable of
validating anything. They are the problem, not the solution.


>
> In addition to this list, I personally suspect that CR would have problems
> with
>
> 1. skewed data, and
> 2. outliers

Yes, I agree.  A warped compass would also give an explorer trouble.  The
answer is to throw out the garbage. If what you are interested in is
describing the distributions that reveal themselves in convenience samples,
okay... do not use CR. But if what you want to know is what causes what, then
use proper data.

>
> Whether CR actually has problems with all of these is an open question, and
> perhaps some of these things weaken but do not destroy the ability of CR to
> detect causes. But if all of these things cause serious problems with CR,
> then the list of data sets where we can apply CR safely would be very small
> indeed.

This makes me wonder how much more garbage research society will tolerate
before turning its back on research of any kind.


>
> The comment that CR works with uniform data leads to another intriguing
> hypothesis. Does CR perform better on real data when you first replace the
> data by its ranks?


This I know the answer to. NO.  CR does not require uniformity for
uniformity's sake. The issue is the adequacy of the sampling of the extremes.
In causation, something happens in the midrange and extremes of the causes
and effects (polarization). In order to see this phenomenon, we must have
enough data to show the differences. Normally sampled causes are sparsely
sampled in the extremes.  Forget about mystical pursuits of population
parameters. Do you have enough data to show the effect in the extremes of the
DV? Ranking data that is sparse in the extremes only covers up the holes in
the data. This is a mistake that John Loehlin made on semnet. If there are no
or very few cases where the extremes of y are caused by combinations of
extremes of both x1 and x2, then the cause is not reflected in the data.
This is a sampling error. The error is there whether we use CR or not.
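
A rough sketch of the point, in Python (illustration only, not CR; the sample
size and the "top 10% of range" definition of an extreme are my own arbitrary
choices):

import numpy as np

rng = np.random.default_rng(2)
n = 10_000

def in_top_of_range(v, frac=0.10):
    """True where a case falls in the top `frac` of the variable's observed range."""
    cutoff = v.min() + (1 - frac) * (v.max() - v.min())
    return v > cutoff

# Two independent causes, sampled uniformly vs. normally.
u1, u2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
n1, n2 = rng.normal(0, 1, n), rng.normal(0, 1, n)

print("uniform causes, jointly extreme cases:", int(np.sum(in_top_of_range(u1) & in_top_of_range(u2))))  # around 100
print("normal causes,  jointly extreme cases:", int(np.sum(in_top_of_range(n1) & in_top_of_range(n2))))  # usually 0

# Rank-transforming the normal sample makes the margins look uniform, so some
# cases now *count* as jointly extreme, but they are the same observations as
# before, sitting around 1.3 standard deviations, nowhere near the real extremes.
r1 = np.argsort(np.argsort(n1)).astype(float)
r2 = np.argsort(np.argsort(n2)).astype(float)
print("ranked normal,  jointly 'extreme' cases:", int(np.sum(in_top_of_range(r1) & in_top_of_range(r2))))

The ranks hide the holes; they do not supply the missing cases where both
causes are genuinely extreme.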

Why is it that I keep mentioning that no one in his right mind would create
normally distributed sample sizes across the cells of an ANOVA? They would
not! They want the extreme values of their factor levels to be adequately
sampled when the means are compared across the rows of the ANOVA.  I am
saying the EXACT same thing.  The levels of the causes in nonexperimental
data should also be uniform if you want valid measures of the causal
relations. Would you PLEASE comment on this fact, since no one in all these
years has been willing to embrace this scary thought. The implications go
way beyond CR. The paper that Erlbaum and Marcoulides suppressed essentially
revolved around this uniform sampling issue. Thurstone, Nunnally and others
knew the absurdity of using normally sampled causes. Nunnally even called
such variables DEAD DATA.  So why does everybody ignore them....
convenience.
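
To put the ANOVA analogy in concrete terms (again just a sketch, nothing here
is CR; the choice of five levels is arbitrary): split a cause into equal-width
levels across its own range and count the cases per level, exactly as one
would check cell sizes in a designed experiment.

import numpy as np

rng = np.random.default_rng(3)
n = 10_000

def level_counts(v, levels=5):
    """Cases per equal-width level across the variable's observed range."""
    counts, _ = np.histogram(v, bins=levels)
    return counts

print("uniform cause:", level_counts(rng.uniform(0, 1, n)))  # roughly equal cell sizes
print("normal cause :", level_counts(rng.normal(0, 1, n)))   # middle cells bloated, extreme cells starved

No experimenter would accept the second set of cell sizes in a designed
study, yet that is exactly what normally sampled causes hand us in
nonexperimental data.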


>
> So here's a pretty explicitly drawn road map that will tell you how to
> determine under what conditions that CR can identify cause and effect
> relationships. If I have time to run any of the above experiments, I will
> share the results with you. You are capable, though, of doing all of the
> work I have suggested. It will take a lot of time. I would welcome you to
> present any data that would address the above hypotheses.

Steve, please think about what you just said and how it was said. I have
been doing this for a long time. You catch up with me, and then we can talk
about you teaching me how to do science.


>
> For now, my working hypothesis will be that CR works well in simulations but
> not with real data because it has poor robustness properties.

Steve, do you realise that nothing else works with simulations?  Do you know
that just five years ago the 1300+ members of semnet dismissed the
possibility of doing this even with simulations?  Pause for a minute and
think about the significance of CR working in any circumstance.  The history
of science shows that such toeholds often lead to revolutionary outcomes with
further work.  Now think about Marcoulides and Erlbaum with all his money,
accepting a paper, holding it for three years, and then rejecting it without
anyone (but me) raising an eyebrow.  Is this the sort of environment that
people expect from the profession of statistics?


> That's the
> interpretation that I think is most consistent with the results I presented
> yesterday. If additional experience with other real data sets shows
> otherwise, I will be prepared to entertain a different hypothesis.

I will probably be sending more posts to the newslist, unless I am banned
from it as well.  Please do comment on that ANOVA comparison. I would really
like to know what you think.

Bill
>
> Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
> The STATS web page has moved to
> http://www.childrens-mercy.org/stats
