Steve,

As I have said before, the survey data you are analyzing is about one small step above tea leaf reading. We just saw how Gus produced data that he says reverses the causation, but upon closer inspection he did not follow the procedures I described for trimming the data. I likewise do not know what adjustments have been made to your data, or how you analyzed the data. What were the correlations, distributions, etc., between the variables? Furthermore, CR is a statistical technique, and false positives will likely happen, as with all other statistics.
As to robustness: I have a single-shot .410 shotgun that is highly robust, more robust than my Browning automatic .30-06 rifle. You could throw the thing down in the muck and it would still work. But I do not take a small-gauge shotgun deer hunting. My hand-held calculator is more robust than my HP computer, but I never want to calculate another correlation using a hand-held calculator. A Jeep is more robust than a sports car, but I would not race a Jeep against a sports car on a sports car track.

You have mystified robustness. This is because it is easy to study the robustness of statistics. Researchers have therefore romanticized the importance of the robustness of their crude statistics at the expense of maturing as scientists and mathematicians. They would be better off just collecting good data. CR is not designed to withstand the incompetence of just any idiot who collects data and sticks it into a computer. CR is not flaky; the data is flaky. Robust statistics do not reflect many of the properties of numbers, properties that themselves reflect the nature of the phenomena. So we end up with drunks looking for lost keys under robust street lights instead of in the dark where the keys were lost.

This is the biggest problem with statisticians: they are rarely scientists, and they see scientific endeavors as just opportunities to apply elementary statistical methods. The statistics should be designed for the scientific endeavor; the endeavor should not be designed for the convenient statistical design. The most robust thing in statistics is the arrogance and closed-minded ignorance of statisticians, but that does not make it right. If you want to study the robustness of CR, then look for why and where it is not robust. Tea leaf reading does nothing but sensationalize a subject that should be based in math and careful measurement. And "traditional" methods do not let us infer causation from correlation. I refuse to limit myself to the crude measures of the past.
I applauded you because I assume you are seeking the truth, not because you have validated CR. I will applaud you louder if you ever analyze experiments that are nearly simulations. Taking small steps is a far better strategy in science. If we analyze the results of experiments, then we can more easily discover actual exceptions and formulate explanations.

Thus far, you have not given any suggestion why CR should not work. We are back to the why question. I do not accept that CR has performed badly. You have not tested it fairly. Tea leaf reading is not fair. And there is no reason to believe your bladder cancer data does not contain confounds. I do not know what the causes of bladder cancer are. It could be something as simple as alcohol consumption. We already know that people who drink a lot also tend to smoke a lot. But then we do not even know if smoking causes bladder cancer. There is a lot we do not know about your data. This is not the fault of CR. It is the fault of whoever collected the data, and it is your fault for assuming there are no confounds. This is why it is so important to work systematically up from simulations to experiments and only last to survey data.

Steve, you and I have been working privately to help you understand the trimming procedure; I think as of this afternoon you have it. Why not report on your simulations there, instead of jumping the gun and saying CR has performed badly? You have not tested CR yet. You have read tea leaves. Coming to a conclusion at this stage is intemperate and impulsive. It's like the other guy (cannot remember his name now) who insisted on interpreting data he simulated that directly violated the assumptions of CR. You need to know more about what you are doing before you jump to conclusions. Interpreting invalid tests is just a form of propaganda and beneath the level of real science.
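To make the drinking/smoking confound above concrete, here is a minimal sketch. Every count below is invented purely for illustration (none comes from the actual data sets under discussion, and the counts are deliberately exaggerated): it shows how collapsing a 2x2 table over an unmeasured confounder can reverse an odds ratio, and how a standard Mantel-Haenszel adjustment recovers the within-stratum direction.

```python
# Invented counts illustrating how an unmeasured confounder (here: heavy
# drinking) could reverse an apparent smoking -> bladder-cancer association.
# Each table is (smoker cases, smoker controls, nonsmoker cases, nonsmoker controls).

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a/b) / (c/d)."""
    return (a * d) / (b * c)

# Within each drinking stratum, smoking *raises* risk (OR > 1):
strata = {
    "light drinkers": (20, 180, 1, 19),   # OR ~ 2.1
    "heavy drinkers": (16, 4, 120, 80),   # OR ~ 2.7
}
for name, (a, b, c, d) in strata.items():
    print(name, round(odds_ratio(a, b, c, d), 2))

# Collapsing over the confounder reverses the direction (Simpson's paradox),
# because in these made-up counts the smokers are concentrated among the
# low-risk light drinkers:
A, B, C, D = (sum(t[i] for t in strata.values()) for i in range(4))
print("crude OR:", round(odds_ratio(A, B, C, D), 2))        # ~ 0.16: looks protective

# A Mantel-Haenszel adjustment recovers the within-stratum direction:
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
print("Mantel-Haenszel OR:", round(num / den, 2))           # ~ 2.5
```

The arithmetic, not the particular numbers, is the point: whether an unadjusted association reverses under stratification depends entirely on how the exposure is distributed across the confounder's strata.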
Bill

"Simon, Steve, PhD" <[EMAIL PROTECTED]> wrote in message news:E7AC96207335D411B1E7009027FC284902A9B2CE@EXCHANGE2...
> William Chambers writes:
>
> > The people on this newslist have recently attempted to
> > disprove CR by violating assumptions, ignoring the probable
> > existence of confounding variables and by highly evasive
> > subsampling strategies. Some continue to work for the
> > truth, Gus and Steve among them. I applaud them both.
>
> I'm not sure you should be applauding me. I am the one who tested CR on five
> data sets with "probable existence of confounding variables" and these data
> sets provide the strongest evidence to date against CR. This doesn't mean
> that I've given up, but we can't sugarcoat the bad news that these five data
> sets tell us.
>
> Blaming the failure of CR on confounding is an easy thing to do, but with
> that claim comes the responsibility to try to identify the potential
> confounders. If you can't identify the confounder that caused problems with
> CR with this data set, you won't be able to eliminate the possibility of an
> equal but opposite confounder in a data set where CR performs well. That
> would leave you with the unhappy option of conceding that CR does poorly
> with data sets that have any potential for confounding.
>
> For one data set that I tested CR on, the unstated confounding variables not
> only masked the obvious causal direction between smoking and cancer, but
> totally reversed it, to the point where CR provided conclusive evidence that
> cancer causes smoking.
>
> The link between smoking and cancer is very strong, with odds ratios on the
> order of 10 or 20 in many data sets. The sort of confounder that could cause
> problems with such a strong association would have to be made of kryptonite.
> So either there is a super-duper confounder that half a century of research
> has failed to identify, or CR is extremely sensitive to weak confounding.
> It's a shame if CR turns out to be so sensitive to confounding, because
> traditional statistical methods can handle confounding very well. Paul
> Rosenbaum has a delightful book that lists a wide range of strategies for
> handling confounding. He also has a very good article in The American
> Statistician. Mitch Gail provides a wonderful overview of why we know that
> smoking causes cancer, in spite of the spurious claims about confounders
> that were raised in the 1950s and 1960s.
>
> And Ahluwalia et al. discuss a very instructive case. A simple analysis
> showed a protective effect of Environmental Tobacco Smoke, a completely
> counter-intuitive finding. They demonstrate, however, that differences in
> maternal age can explain these unusual results. Chen et al. is another
> instructive example, where a protective effect of maternal smoking against
> Down's syndrome is also explained by proper adjustments for maternal age. I
> can probably find a dozen more examples where traditional statistical
> methods have overcome problems with confounding.
>
> Observational Studies. Rosenbaum P (1995) New York: Springer-Verlag.
>
> Replicating Effects and Biases. Rosenbaum PR. The American Statistician
> 2001:55(3);223-227.
>
> Statistics in Action. Gail MH. Journal of the American Statistical
> Association 1996:91(433);1-13.
>
> Exposure to environmental tobacco smoke and birth outcome: increased effects
> on pregnant women aged 30 years or older. Ahluwalia IB, Grummer-Strawn L and
> Scanlon KS. Am J Epidemiol 1997:146(1);42-47.
>
> Maternal smoking and Down syndrome: the confounding effect of maternal age.
> Chen CL, Gilbert TJ and Daling JR. Am J Epidemiol 1999:149(5);442-446.
>
> I would respectfully disagree with the words "probable existence of
> confounding variables," because a confounder that could cause a total
> reversal in the link between smoking and cancer is very IMPROBABLE. At
> least there aren't any confounders out there that are strong enough to
> cause traditional statistical methods to conclude that cancer causes
> smoking.
>
> The book isn't closed yet. We'll see how CR performs on additional data
> sets, especially data sets from non-medical areas. When I find some good
> data sets from these other areas, I will present the results here for all
> to see.
>
> But I think it is fair to say that CR performed very badly on the first
> five real data sets that I tried. Whether there are ANY real data sets out
> there where CR performs well remains an open question. And until we can
> find a large group of real data sets where CR performs well, I believe
> that simulations are a waste of time.
>
> Steve Simon, [EMAIL PROTECTED], Standard Disclaimer.
> The STATS web page has moved to
> http://www.childrens-mercy.org/stats.
>
> P.S. Dr. Chambers has also suggested that if we changed how we collect
> data, we might be able to better utilize CR to demonstrate causes. This is
> worth exploring, perhaps, but I would want to look at some designed
> experiments first. If CR performs poorly on a designed experiment with
> nice balanced data, that would be very bad news indeed.
>
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> http://jse.stat.ncsu.edu/
> =================================================================
