On Thu, 2005-11-24 at 18:50 -0700, P Ehlers wrote:
> Marc Schwartz wrote:
> > On Thu, 2005-11-24 at 21:55 +0000, Ted Harding wrote:
> >
> >>On 24-Nov-05 P Ehlers wrote:
> >>
> >>>Bianca Vieru-Dimulescu wrote:
> >>>
> >>>>Hello,
> >>>>I'm trying to calculate a chi-squared test to see if my data are
> >>>>different from the theoretical distribution or not:
> >>>>
> >>>>chisq.test(rbind(c(79,52,69,71,82,87,95,74,55,78,49,60),
> >>>>                 c(80,80,80,80,80,80,80,80,80,80,80,80)))
> >>>>
> >>>>        Pearson's Chi-squared test
> >>>>
> >>>>data: rbind(c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60),
> >>>>            c(80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80))
> >>>>X-squared = 17.6, df = 11, p-value = 0.09142
> >>>>
> >>>>Is this correct? If I'm doing the same thing using Excel I obtained
> >>>>a different value of p.. (1.65778E-14)
> >>>>
> >>>>Thanks a lot,
> >>>>Bianca
> >>>
> >>>It would be unusual to have 12 observed frequencies all equal to 80.
> >>>So I'm guessing that you have a 12-category variable and want to
> >>>test its fit to a discrete uniform distribution. I assume that your
> >>>frequencies are
> >>>
> >>>x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)
> >>>
> >>>Then just use
> >>>
> >>>chisq.test(x)
> >>>
> >>>(see the help page).
> >>>
> >>>(If those 80's are expected cell frequencies, they should sum to
> >>>sum(x) = 851.)
> >>>
> >>>I don't know what Excel does.
> >>>
> >>>Peter
> >>>
> >>>Peter Ehlers
> >>>University of Calgary
> >>
> >>I'm rather with Peter on this question! I've tried to infer what
> >>you're really trying to do.
> >>
> >>My a-priori plausible hypothesis was that you have
> >>
> >>  k<-12
> >>
> >>independent observations which have equal expected values
> >>
> >>  m<-rep(80,k)
> >>
> >>and are observed as
> >>
> >>  x<-c(79,52,69,71,82,87,95,74,55,78,49,60)
> >>
> >>On this basis, a chi-squared test Sum((O-E)^2/E) gives
> >>
> >>  C2<-sum(((x-m)^2)/m)
> >>
> >>so C2 = 41.1375, and on this hypothesis the chi-squared would
> >>have k=12 degrees of freedom. Then:
> >>
> >>  1-pchisq(C2,k)
> >>## [1] 4.647553e-05
> >>
> >>which is nowhere near the 1.65778E-14 you report from Excel.
> >>Also, the result from Peter's chisq.test(x) is p = 0.0006468,
> >>even further away.
> >
> >
> > It's late on Turkey Day here, but shouldn't that be:
> >
> >
> >>1 - pchisq(C2, k - 1) # 11 df
> >
> > [1] 2.282202e-05
> >
> > which is what I get using OO.org's Calc 2.0 with the CHITEST function
> > using the two vectors as the observed (x) and expected (m) values. I
> > also get this result from Gnumeric 1.4.3 using the same CHITEST
> > function.
> >
> [snip]
>
> Marc, it's a bit sad to see that OO.org copies Excel's behaviour
> to a _fault_. As Peter D. points out, we would expect the expected
> frequencies and the observed frequencies to sum to the same value.
> Excel (and Calc) blithely ignores that. R, OTH, gives an error
> message when the probabilities don't sum to 1.

Peter, yes indeed. If you search the archives, you see a thread here:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/18179.html

and

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/18474.html

where some discussion on this occurred within the context of rounding
issues and IEEE 754 compliance. Calc has truly copied Excel's behavior
to a fault, since the intention is to be a "drop-in" replacement for the
latter. At least Gnumeric has not done so in all cases, though it has
here.

Calc and Gnumeric indicate that CHITEST is a test for independence, not
for goodness of fit. I did not pay attention to Excel's description, but
presumably it is similar. Clearly no checks on O vs E sums though in any
of these apps.

Further data to reinforce the notion of not using spreadsheets for this.

> Turkey soup for a few days now?

Yes, indeed, along with turkey salad, turkey sandwiches... :-)

My son is home from McGill in Montreal for the weekend, so he gets to
celebrate Thanksgiving a second time. He can help to reduce the turkey
inventory before flying back on Sunday... ;-)

Best regards,

Marc
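
For reference, the figures quoted in this thread can be reproduced in a few
lines of R. This is only a sketch. That CHITEST computes Sum((O-E)^2/E) on
k - 1 degrees of freedom without checking the totals is an assumption,
inferred from the matching p-value reported above rather than from the
spreadsheets' documentation. The names p_sheet, gof, gof_scaled and res are
illustrative only.

```r
x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)  # observed counts
m <- rep(80, length(x))                                  # the "expected" 80s

# The two totals differ (851 vs 960), which is the root of the disagreement.
totals <- c(observed = sum(x), expected = sum(m))

# Statistic and p-value as CHITEST appears to compute them (assumption):
# no check that sum(m) equals sum(x), df = k - 1.
C2 <- sum((x - m)^2 / m)                                 # 41.1375
p_sheet <- pchisq(C2, df = length(x) - 1, lower.tail = FALSE)

# Goodness of fit to a discrete uniform, as Peter Ehlers suggested.
gof <- chisq.test(x)

# Handing chisq.test() "probabilities" that do not sum to 1 is an error.
res <- tryCatch(chisq.test(x, p = m), error = function(e) e)

# Scaling the 80s to probabilities first gives the same test as chisq.test(x),
# since every category then has probability 1/12.
gof_scaled <- chisq.test(x, p = m / sum(m))

print(totals)
print(c(C2 = C2, p_sheet = p_sheet, p_gof = gof$p.value))
print(inherits(res, "error"))
```

Here p_sheet should match the 2.282202e-05 that Calc and Gnumeric return, and
gof$p.value the 0.0006468 from chisq.test(x); the gap between the two comes
entirely from using 80 rather than 851/12 as the expected count per cell.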
