Steve, you make some good points. I respond in context below:
Subject: RE: Applying Corresponding Regressions across five data sets

> Dr. Chambers,
>
> Rather than focus on the data sets that failed, why not focus on the data
> sets where the method at least partially succeeded?

Because to me the successes are also questionable. I approach the issue in search of an understanding of the basic principles of causality and causal inference, not as a commercial venture. A method that is correct for unknown reasons is pretty worthless to me, except as something for further exploration. I am not looking to real data sets for confirmation of CR, since real data sets are usually so poorly constructed, and of such unknown causal nature, as to be worthless as criteria for validation. I ranked tea leaf reading a close second to the analysis of such survey data because it is the mysterious nature of real data that creates the need for methods such as CR.

> Why did the housing data set show good results when the car mileage data set
> did not? Both data sets would appear to have the same amount of measurement
> error in them.

That can only be explained by a deeper knowledge of the data sets than we currently have. Measurement error has many forms. Random error can be handled by CR: X2 may be a causal factor or an artifactual random error, and either way CR works. The problem comes when there are nonrandom errors. All statistical methods are vulnerable to those kinds of errors, not just CR. Measurement theory in the social/behavioral sciences is still in its infancy. It has been crippled by commercial and tenure interests to the point that most researchers cannot even comprehend the dangers of confounding. Just last week no one could imagine how mileage could cause the engine specifications. But the idea of confounding a final cause (the intention of the engineers) with a proxy measure of that intention (actual mileage) was always a very real possibility that most folks did not, and probably still cannot, conceive of. The reason for this ignorance is that current methods allow us to confound things without being shocked by our conclusions. If we do not infer causation, we do not have to consider the absurdity of confounded measures.
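A toy simulation makes the random versus nonrandom distinction concrete. It is only a sketch -- the sample size, error levels, and variable names are arbitrary, and it does not run CR itself:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000

    # The true causal chain: x causes y.
    x = rng.uniform(-1, 1, n)          # a uniformly sampled cause
    y = x + rng.normal(0, 0.5, n)      # its effect, with inherent noise

    def r(a, b):
        return np.corrcoef(a, b)[0, 1]

    # Random measurement error on x attenuates the correlation
    # but does not change the direction of the relation.
    x_noisy = x + rng.normal(0, 0.7, n)
    print("r(x, y) with no measurement error:", round(r(x, y), 2))
    print("r(x, y) with random error on x:   ", round(r(x_noisy, y), 2))

    # A nonrandom "error": the recorded variable is really a proxy for a
    # hidden final cause (think of the engineers' intentions versus the
    # actual mileage). Anything that takes the proxy at face value is
    # misled, and that is true of every statistical method, not just CR.
    hidden = rng.uniform(-1, 1, n)
    proxy = hidden + rng.normal(0, 0.3, n)
    y2 = hidden + rng.normal(0, 0.5, n)
    print("r(proxy, y2):", round(r(proxy, y2), 2),
          "-- a strong correlation, though the proxy causes nothing")

The first kind of error weakens the signal but leaves it pointing the right way; the second kind is the can of worms.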
> Why did the cigarettes and mortality data set show good results when the
> cigarettes and cancer data set did not? Both data sets would seem to have
> the same sort of trouble with confounders.

You cannot make this assumption, since you do not know either data set very well. There is no alternative to knowing the data if you intend to make assertions based on the data. Knowing the data takes work that most people have neither the training nor the dedication to accomplish. It is cheap and easy to use convenience samples because they seem to have face validity to those who lack the intelligence and/or education to know what a can of worms such measures can be.

> When you find out the distinction between the data sets where CR fails and
> the data sets where CR succeeds, you will have made a truly valuable
> discovery.

Science progresses in steps. It will not be difficult to find out where CR works and where it does not. The difficulty is getting people to understand that the problem with sloppy data is much bigger than CR.

> And if you find out that there is no possible way to tell a priori for which
> data sets CR will work or not, then you will have definitive proof that CR
> does not help in any practical setting.

Practicality is defined by what lazy and incompetent people are willing to use to deceive the public. If the data is inherently worthless, then no method will make a silk purse from a sow's ear. This is the point that you and others seem not to get. If the data does not work with CR, and if CR is a valid measure of combinations, then what is the data good for? The practicality of data is not a basis for justifying it. Garbage may be practical, but it is still garbage.

I think this is what frightens the SEM folks most about the article they accepted and then rejected. I quoted people like Thurstone and showed that normally distributed causes are an invalid statistical concept. This assertion should be far more threatening to the SEM empire than CR, and it is not surprising that Erlbaum's journal went to criminal lengths to suppress this point. Most SEM folks assume it is best when all their variables are normally distributed. If we cannot tell whether or not data sets conform to fundamental assumptions, then we really do not know what the data mean. That some statistics appear to be robust to such violations is probably a sign of the insensitivity, and thus the lack of validity, of the methods. Real scientists would not consider it asking too much to use good data. The same should apply to the behavioral sciences. If the data is sloppy, then we should just get better data.

> You have also proposed several testable hypotheses. First, CR does not work
> with convenience samples. You can test this hypothesis by running CR on a
> dozen or so data sets where the data was collected randomly rather than by a
> convenience sample. Does CR succeed more often with these data sets than
> with the ones I used?

This is just not so, Steve. Convenience data is simply any data that is gathered in a nonscientific manner. It is data that is more practical to gather because gathering it requires less training, intelligence, work, and expense. The convenience factor is identical to the practicality factor. That something is practical does not mean it is scientifically useful. A thousand convenience samples would tell us little more than one convenience sample would. If we do not have good data, but only practical data, then we are in the dark. By experiments and simulations we can determine which assumptions and data structures are important in causal inference. But I would never trust convenience or practical samples. Convenience/practical samples are the big problem and should not be taken as the authors or judges of scientific control. Throwing dirt together because it is practical, and then not getting a computer out of the pile, is no sort of definitive proof.

> The second testable hypothesis is that CR fails when there are confounding
> variables. This is testable in two ways. One way is by identifying what the
> confounding variables are. I'd encourage you to analyze more data sets
> involving smoking, because that is an area where the confounding variables
> are well understood.

The truth is that although I will test as I can, my academic career was destroyed and I have neither the time nor resources to test much data. We can think about real data and what kinds of confounds may arise from it, simulate it, and then gain more understanding and skill in scientific measurement. But this is not something that should be undertaken simply to test CR. This is something that every statistician needs to do when statistics of any sort are used.
> Another way is by running CR on designed experiments (not simulations) where
> confounding is not possible. Douglas Montgomery has a nice book with some
> examples of data from designed experiments that you should try. Does CR work
> more effectively with data from Montgomery than with data that I have used?

I do not know this data. In my 1991 paper I did analyze data from experiments, and the results were supportive of CR. So it is not as though I have not already done what you suggest. Nobody listened or bothered to replicate the findings.

> The third testable hypothesis is that CR fails when there is measurement
> error. This is very easily testable. Run your same simulations, but add some
> measurement error.

I have done this thousands of times. If the error is random and it attenuates the correlation between cause and effect so that it falls in the range between about .3 and .95, then CR will work, given large enough samples, uniform (nontruncated) causes, and so on. I do not know why this is not already abundantly clear to all.
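If it helps, here is roughly the kind of check I mean. It shows only the attenuation side of the story -- it does not run CR itself, and the sample size and error levels are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 10000
    x = rng.uniform(-1, 1, n)            # a uniform, nontruncated cause
    y = x + rng.normal(0, 0.4, n)        # a linear effect of that cause

    for err_sd in (0.0, 0.3, 0.8, 2.0):
        x_obs = x + rng.normal(0, err_sd, n)     # random measurement error
        r = np.corrcoef(x_obs, y)[0, 1]
        workable = 0.3 <= abs(r) <= 0.95         # the rough range quoted above
        print(f"error sd {err_sd:3.1f}   observed r {r:5.2f}   workable: {workable}")

Only the largest error level here drives the observed correlation below .3 and out of that working range.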
> You could also test this by finding a dozen or so data sets where there was
> no measurement error. Does CR succeed more often with these data sets than
> with the ones I used?

Have you seen any experiments with no measurement error? If so, show me the data and I will analyze it.

> My theory that CR is not very robust may be a vague one, but it is based on
> comments you yourself have made. In your own emails you mentioned problems
> with
>
> 1. non-uniform errors (e.g., normal distribution errors),

In fact, so long as at least one cause is adequately sampled (uniform), CR tends to work.

> 2. non-linearity,

Pearson and Spearman correlations are also linear measures and not robust to nonlinearity. Do you therefore dismiss them as not robust? No, they are simply meant for linear data.

> 3. convenience samples,

Garbage is not a proper test for robustness. It is convenient to test whether computers are created from inorganic material by throwing sand into a bucket and seeing whether a computer appears. Such a test would be based on convenience. If the test produces nonsense, it would be wrong to say the test (does a computer sit in the bucket or not?) lacks robustness, when the problem is that the sample is just convenient dirt. We must sample the processes and the intricate components, not the primitive forms that lead to them.

> 4. confounding variables, and

Absolutely no statistical test is robust to confounding.

> 5. measurement errors.

Yes, but let us be clear about what sort of errors. Remember the chapters in those undergraduate psychometric texts that talked about systematic versus random errors? We must be articulate about the meaning of errors.

Steve, your tests are not very scientific as you define them. You seek practicality before having an understanding of the fundamentals. That is no way to test statistical ideas. First get the logic, then the math, then practice operationalizing it with simulations. If it still works, then you have something interesting. Then get transparently obvious data (as with experiments). This is the limit of validation. When it comes to the survey and convenience samples you espouse, those are absolutely not capable of validating anything. They are the problem, not the solution.

> In addition to this list, I personally suspect that CR would have problems
> with
>
> 1. skewed data, and
> 2. outliers

Yes, I agree. A warped compass would also give an explorer trouble. The answer is to throw out the garbage. If what you are interested in is describing the distributions that reveal themselves in convenience samples, fine: do not use CR. But if what you want to know is what causes what, then use proper data.

> Whether CR actually has problems with all of these is an open question, and
> perhaps some of these things weaken but do not destroy the ability of CR to
> detect causes. But if all of these things cause serious problems with CR,
> then the list of data sets where we can apply CR safely would be very small
> indeed.

This makes me wonder how much more garbage research society will tolerate before turning its back on research of any kind.

> The comment that CR works with uniform data leads to another intriguing
> hypothesis. Does CR perform better on real data when you first replace the
> data by its ranks?

This I know the answer to: NO. CR does not require uniformity for uniformity's sake. The issue is the adequacy of the sampling of the extremes. In causation, something happens in the midrange and the extremes of the causes and effects (polarization). In order to see this phenomenon, we must have enough data to show the differences. Normally sampled causes are sparsely sampled in the extremes. Forget about mystical pursuits of population parameters. Do you have enough data to show the effect in the extremes of the DV? Ranking data that is sparse in the extremes only covers up the holes in the data. This is a mistake that John Loehlin made on semnet. If there are no, or very few, cases where the extremes of y are caused by combinations of extremes of both x1 and x2, then the cause is not reflected in the data. This is a sampling error, and the error is there whether we use CR or not. Why is it that I keep mentioning that no one in his right mind would create normally distributed sample sizes across the cells of an ANOVA? They would not! They want the extreme values of their factor levels to be adequately sampled when the means are compared across the rows of the ANOVA. I am saying the EXACT same thing. The levels of the causes in nonexperimental data should also be uniform if you want valid measures of the causal relations. Would you PLEASE comment on this fact? No one in all these years has been willing to embrace this scary thought, and the implications go way beyond CR. The paper that Erlbaum and Marcoulides suppressed essentially revolved around this uniform sampling issue. Thurstone, Nunnally, and others knew the absurdity of using normally sampled causes. Nunnally even called such variables DEAD DATA. So why does everybody ignore them? Convenience.
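Since you ask about ranks, and since I keep asking about the ANOVA comparison, here is a tiny count that makes the sampling point concrete. The cutoffs and sample size are arbitrary illustrations, and again this says nothing about CR itself:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 1000

    uniform_cause = rng.uniform(-3, 3, n)   # extremes sampled on purpose
    normal_cause = rng.normal(0, 1, n)      # extremes left to chance

    print("uniform cause, cases with |x| > 2:", int((np.abs(uniform_cause) > 2).sum()))
    print("normal cause,  cases with |x| > 2:", int((np.abs(normal_cause) > 2).sum()))

    # Ranking the normal sample flattens its histogram, but it creates no new
    # observations in the extremes: the top and bottom ranks are carried by the
    # same sparse handful of cases that happened to land in the tails.
    ranks = normal_cause.argsort().argsort() + 1   # 1..n, flat by construction
    print("ranks run from", int(ranks.min()), "to", int(ranks.max()),
          "but the extreme cases are unchanged:",
          int((np.abs(normal_cause) > 2).sum()))

The uniform cause puts hundreds of cases beyond |x| > 2; the normal cause puts a few dozen there, and no transformation manufactures the missing ones. That is the ANOVA comparison in miniature: nearly empty extreme cells cannot be rescued by relabeling them.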
> So here's a pretty explicitly drawn road map that will tell you how to
> determine under what conditions CR can identify cause and effect
> relationships. If I have time to run any of the above experiments, I will
> share the results with you. You are capable, though, of doing all of the
> work I have suggested. It will take a lot of time. I would welcome you to
> present any data that would address the above hypotheses.

Steve, please think about what you just said and how it was said. I have been doing this for a long time. You catch up with me, and then we can talk about you teaching me how to do science.

> For now, my working hypothesis will be that CR works well in simulations but
> not with real data because it has poor robustness properties.

Steve, do you realize that nothing else works with simulations? Do you know that just five years ago the 1300+ members of semnet dismissed the possibility of doing this even with simulations? Pause for a minute and think about the significance of CR working in any circumstance. The history of science shows that such toeholds often lead to revolutionary outcomes with further work. Now think about Marcoulides and Erlbaum, with all his money, accepting a paper, holding it for three years, and then rejecting it without anyone (but me) raising an eyebrow. Is this the sort of environment that people expect from the profession of statistics?

> That's the interpretation that I think is most consistent with the results I
> presented yesterday. If additional experience with other real data sets
> shows otherwise, I will be prepared to entertain a different hypothesis.

I will probably be sending more posts to the list, unless I am banned from it as well. Please do comment on that ANOVA comparison. I would really like to know what you think.

Bill

> Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
> The STATS web page has moved to http://www.childrens-mercy.org/stats
