[R] Chi-Square Test Disagreement
I was asked by my boss to do an analysis on a large data set, and I am trying to convince him to let me use R rather than SPSS. I think Sweave could make my life much, much easier. To get me a little closer to this goal, I ran my analysis through R and SPSS and compared the resulting values. In all but one case, they were the same.

Given the matrix

         [,1] [,2]
    [1,]  110  358
    [2,]   71  312
    [3,]   29  139
    [4,]   31   77
    [5,]   13   32

this is the output from R:

    chisq.test(test29)

            Pearson's Chi-squared test

    data:  test29
    X-squared = 9.593, df = 4, p-value = 0.04787

But the same data in SPSS generates a p-value of .051. It's a small but important difference. I played around and rescaled things, and tried different values for B, but I never could get R to reach .051.

I'd like to know which program is correct - R or SPSS? I know, this is a biased place to ask such a question. I also appreciate all input that will help me use R more effectively. The difference could be the result of my own ignorance.

thanks
--andy

--
Insert something humorous here. :-)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Chi-Square Test Disagreement
On 11/26/2008 9:51 AM, Andrew Choens wrote:

> [...]
> This is the output from R:
>
>     chisq.test(test29)
>
>             Pearson's Chi-squared test
>
>     data:  test29
>     X-squared = 9.593, df = 4, p-value = 0.04787
>
> But, the same data in SPSS generates a p value of .051. It's a small but important difference. I played around and rescaled things, and tried different values for B, but I never could get R to reach .051.
>
> I'd like to know which program is correct - R or SPSS?
> [...]

The SPSS p-value is for the Likelihood Ratio Chi-squared test, not Pearson's. For Pearson's Chi-squared test in SPSS (16.0.2), I get p=0.04787, so the results do match if you do the same Chi-squared test.

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
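Chuck's explanation can be checked from the R side as well. The following is a minimal sketch, not something posted in the thread: it computes the likelihood ratio statistic by hand as G = 2 * sum(O * log(O / E)), using the expected counts that chisq.test() returns. The names pearson, G, df and p_lr are illustrative only, and the formula assumes no observed cell is zero (true for this table).

```r
# Data from the original post
test29 <- matrix(c(110, 358, 71, 312, 29, 139, 31, 77, 13, 32),
                 byrow = TRUE, ncol = 2)

# Pearson's chi-squared test, as in the original post
pearson <- chisq.test(test29)

# Expected counts under independence, taken from the fitted test object
observed <- test29
expected <- pearson$expected

# Likelihood ratio (G) statistic, computed by hand;
# assumes every observed count is > 0 so that log() is defined
G  <- 2 * sum(observed * log(observed / expected))
df <- (nrow(test29) - 1) * (ncol(test29) - 1)
p_lr <- pchisq(G, df = df, lower.tail = FALSE)

# Compare the two p-values side by side
print(c(pearson = pearson$p.value, likelihood.ratio = p_lr))
```

If the SPSS figure is indeed the likelihood ratio test, the second p-value should land just above .05, close to the .051 reported, while the first reproduces the 0.04787 from chisq.test().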
Re: [R] Chi-Square Test Disagreement
G'day Andy,

On Wed, 26 Nov 2008 14:51:50 + Andrew Choens [EMAIL PROTECTED] wrote:

> I was asked by my boss to do an analysis on a large data set, and I am trying to convince him to let me use R rather than SPSS.

Very laudable of you. :)

> This is the output from R:
>
>     chisq.test(test29)
>
>             Pearson's Chi-squared test
>
>     data:  test29
>     X-squared = 9.593, df = 4, p-value = 0.04787
>
> But, the same data in SPSS generates a p value of .051. It's a small but important difference.

Chuck already explained the reason for this small difference. I just take issue with it being an important difference. In my opinion, this difference is not important at all. It would only be important to people who are still sticking to arbitrary cut-off points that are mainly due to historical coincidences and the lack of computing power at that time in history.

If somebody tells you that this difference is important, ask him or her whether he or she will be willing to finance you a room full of calculators (in the sense of Pearson's time) and whether he or she wants you to do all your calculations and analyses with these calculators in future. Alternatively, you could ask the person whether he or she would like the anaesthetist during his or her next operation to use chloroform given his or her nostalgic penchant for out-dated rituals/methods.

> I played around and rescaled things, and tried different values for B, but I never could get R to reach .051.

Well, I have no problem when using simulated p-values to get something close to 0.051; look at the last try. The second one might also be noteworthy. Unfortunately, I didn't save the seed beforehand.
    test29 <- matrix(c(110,358,71,312,29,139,31,77,13,32), byrow=TRUE, ncol=2)
    test29
         [,1] [,2]
    [1,]  110  358
    [2,]   71  312
    [3,]   29  139
    [4,]   31   77
    [5,]   13   32

    chisq.test(test29, simul=TRUE)

            Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

    data:  test29
    X-squared = 9.593, df = NA, p-value = 0.04798

    chisq.test(test29, simul=TRUE)

            Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

    data:  test29
    X-squared = 9.593, df = NA, p-value = 0.05697

    chisq.test(test29, simul=TRUE, B=2)

            Pearson's Chi-squared test with simulated p-value (based on 2 replicates)

    data:  test29
    X-squared = 9.593, df = NA, p-value = 0.0463

    chisq.test(test29, simul=TRUE, B=2)

            Pearson's Chi-squared test with simulated p-value (based on 2 replicates)

    data:  test29
    X-squared = 9.593, df = NA, p-value = 0.0499

    chisq.test(test29, simul=TRUE, B=2)

            Pearson's Chi-squared test with simulated p-value (based on 2 replicates)

    data:  test29
    X-squared = 9.593, df = NA, p-value = 0.0486

    chisq.test(test29, simul=TRUE, B=2)

            Pearson's Chi-squared test with simulated p-value (based on 2 replicates)

    data:  test29
    X-squared = 9.593, df = NA, p-value = 0.05125

Cheers,

Berwin

=== Full address ===
Berwin A Turlach                              Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability          +65 6516 6650 (self)
Faculty of Science                            FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7           e-mail: [EMAIL PROTECTED]
Singapore 117546                              http://www.stat.nus.edu.sg/~statba
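Since the seed was not saved, the simulated runs above cannot be reproduced exactly. A minimal sketch of how one might make such a run repeatable, and gauge how much the simulated p-value is expected to wobble, is given below. It is not from the thread: the seed value, B = 2000 (the default shown in the first two runs) and the names res1, res2, p_sim and mc_se are arbitrary choices for illustration. The argument is written out in full as simulate.p.value; simul=TRUE works through R's partial argument matching.

```r
# Data from the original post
test29 <- matrix(c(110, 358, 71, 312, 29, 139, 31, 77, 13, 32),
                 byrow = TRUE, ncol = 2)

B <- 2000  # number of Monte Carlo replicates (chisq.test's default)

# Fixing the seed before each call makes the simulated p-value repeatable;
# the seed value itself is arbitrary
set.seed(20081126)
res1 <- chisq.test(test29, simulate.p.value = TRUE, B = B)

set.seed(20081126)
res2 <- chisq.test(test29, simulate.p.value = TRUE, B = B)

# Same seed, same replicates, same p-value
print(identical(res1$p.value, res2$p.value))

# Rough Monte Carlo standard error of the simulated p-value,
# treating it as a binomial proportion based on B replicates
p_sim <- res1$p.value
mc_se <- sqrt(p_sim * (1 - p_sim) / B)
print(c(p.value = p_sim, mc.se = mc_se))
```

With p near 0.05 and 2000 replicates the standard error is about 0.005, so two runs giving 0.04798 and 0.05697 are within ordinary simulation noise, and landing near .051 by chance is unremarkable.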
Re: [R] Chi-Square Test Disagreement
On Thu, 2008-11-27 at 00:46 +0800, Berwin A Turlach wrote:

> Chuck already explained the reason for this small difference. I just take issue with it being an important difference. In my opinion, this difference is not important at all. It would only be important to people who are still sticking to arbitrary cut-off points that are mainly due to historical coincidences and the lack of computing power at that time in history.
>
> If somebody tells you that this difference is important, ask him or her whether he or she will be willing to finance you a room full of calculators (in the sense of Pearson's time) and whether he or she wants you to do all your calculations and analyses with these calculators in future. Alternatively, you could ask the person whether he or she would like the anaesthetist during his or her next operation to use chloroform given his or her nostalgic penchant for out-dated rituals/methods.

Yes, he did, and when I realized the source of my confusion I was appropriately chastised. I felt like a bit of a fool. Of course, I should try comparing apples to apples. Oranges are another thing entirely.

As to the importance of the difference, I am of two minds. On the one hand, I fully agree with you. It is an anachronistic approach. On the other hand, we don't all have the pleasure of working in a math department where such subtleties are well understood.

I work for a consulting firm that advises state and local governments (USA). I personally do try to expand my understanding of statistics and math (I do not have a degree in math), but my clients do not. When I'm working with someone from the government, it is sometimes easier to simply tell them that relationship x is significant at a certain level of certainty. Although I doubt they could really explain the details, they have some basic understanding of what I am talking about. Subtleties are sometimes lost on our public servants.
And, since I do work for government, if I ask for a roomful of calculators, I might just get them. And really, what am I going to do with a roomful of calculators?

--andy

--
Insert something humorous here. :-)
Re: [R] Chi-Square Test Disagreement
On 26-Nov-08 17:57:52, Andrew Choens wrote:

> [...]
> And, since I do work for government, if I ask for a roomful of calculators, I might just get them. And really, what am I going to do with a roomful of calculators?
>
> --andy
>
> Insert something humorous here. :-)

Next time the launch of an incoming nuclear strike is detected, set them to work as follows (following Karl Pearson's historical precedent):

    Anti-aircraft guns all day long:
    Computing for the Ministry of Munitions
    JUNE BARROW GREEN (Open University)

    From January 1917 until March 1918 Pearson and his staff of mathematicians and human computers at the Drapers Biometric Laboratory worked tirelessly on the computing of ballistic charts, high-angle range tables and fuze-scales for AV Hill of the Anti-Aircraft Experimental Section. Things did not always go smoothly -- Pearson did not take kindly to the calculations of his staff being questioned -- and Hill sometimes had to work hard to keep the peace.

If you have enough of them (and Pearson undoubtedly did, so you can quote that in your requisition request), then you might just get the answer in time!

[ The above excerpted from http://tinyurl.com/6byoub ]

Good luck!
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 26-Nov-08  Time: 18:35:25
-- XFMail --
Re: [R] Chi-Square Test Disagreement
> Next time the launch of an incoming nuclear strike is detected, set them to work as follows (following Karl Pearson's historical precedent):
>
>     Anti-aircraft guns all day long:
>     Computing for the Ministry of Munitions
>     JUNE BARROW GREEN (Open University)
>
>     From January 1917 until March 1918 Pearson and his staff of mathematicians and human computers at the Drapers Biometric Laboratory worked tirelessly on the computing of ballistic charts, high-angle range tables and fuze-scales for AV Hill of the Anti-Aircraft Experimental Section. Things did not always go smoothly -- Pearson did not take kindly to the calculations of his staff being questioned -- and Hill sometimes had to work hard to keep the peace.
>
> If you have enough of them (and Pearson undoubtedly did, so you can quote that in your requisition request), then you might just get the answer in time!
>
> [ The above excerpted from http://tinyurl.com/6byoub ]
>
> Good luck!
> Ted.

That is absolutely classic.

--
Insert something humorous here. :-)