(Ted Harding) <[EMAIL PROTECTED]> writes:

> On 24-Nov-05 P Ehlers wrote:
> > Bianca Vieru-Dimulescu wrote:
> >> Hello,
> >> I'm trying to calculate a chi-squared test to see if my data are
> >> different from the theoretical distribution or not:
> >>
> >> chisq.test(rbind(c(79,52,69,71,82,87,95,74,55,78,49,60),
> >>                  c(80,80,80,80,80,80,80,80,80,80,80,80)))
> >>
> >>         Pearson's Chi-squared test
> >>
> >> data:  rbind(c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60),
> >> c(80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80))
> >> X-squared = 17.6, df = 11, p-value = 0.09142
> >>
> >> Is this correct? If I'm doing the same thing using Excel I obtained
> >> a different value of p.. (1.65778E-14)
> >>
> >> Thanks a lot,
> >> Bianca
> >
> > It would be unusual to have 12 observed frequencies all equal to 80.
> > So I'm guessing that you have a 12-category variable and want to
> > test its fit to a discrete uniform distribution. I assume that your
> > frequencies are
> >
> > x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)
> >
> > Then just use
> >
> > chisq.test(x)
> >
> > (see the help page).
> >
> > (If those 80's are expected cell frequencies, they should sum to
> > sum(x) = 851.)
> >
> > I don't know what Excel does.
> >
> > Peter
> >
> > Peter Ehlers
> > University of Calgary
>
> I'm rather with Peter on this question! I've tried to infer what
> you're really trying to do.
>
> My a-priori plausible hypothesis was that you have
>
>   k<-12
>
> independent observations which have equal expected values
>
>   m<-rep(80,k)
>
> and are observed as
>
>   x<-c(79,52,69,71,82,87,95,74,55,78,49,60)
>
> On this basis, a chi-squared test Sum((O-E)^2/E) gives
>
>   C2<-sum(((x-m)^2)/m)
>
> so C2 = 41.1375, and on this hypothesis the chi-squared would
> have k=12 degrees of freedom. Then:
>
>   1-pchisq(C2,k)
>   ## [1] 4.647553e-05
>
> which is nowhere near the 1.65778E-14 you report from Excel.
> Also, the result from Peter's chisq.test(x) is p = 0.0006468,
> even further away.
>
> So this makes me really wonder what you are doing.
>
> The nearest I can get to your Excel result 1.65778E-14 is
>
>   ix<-x<m
>   prod(2*ppois(x[ix],m[ix]))*prod(2*(1-ppois(x[!ix],m[!ix])))
>   ## 2.831963e-14
>
> which is based on the guess that independent 2-sided Poisson
> tests of agreement between O and E have been carried out on each
> component, and the final P-value is the product of these P-values.
>
> But this doesn't make a lot of sense from a statistical point
> of view, so it's time to stop guessing!
>
> Please tell us what hypothesis you are testing, what sort of
> distribution the x-values are supposed to have, what the
> repeated "80" values represent, and also please tell us
> in detail what you asked Excel to do!
>
> Then, perhaps, a useful reply can be made.

I think what Excel does is outlined here:

http://www.gifted.uconn.edu/siegle/research/ChiSquare/chiexcel.htm

(Notice the helpful wizard which in step 2 claims that you are doing a
test for independence, not for a given distribution.)

This would seem to coincide with Peter E's guess. The example on that
page matches

  chisq.test(c(10,3,2))

I believe that the expected values are expected (!) to sum to the
total counts. If they do not, I guess that Excel is numb-skulled
enough to compute sum((O-E)^2/E) anyway and look up its p value
with k-1 DF. Still gets you nowhere near 1.6e-14 though.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
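
For anyone wanting to reproduce the numbers in this thread, here is a minimal sketch in R that puts the three readings of the data side by side. It assumes, as guessed above, that the 80s are intended expected counts rather than observations; the variable names tab, gof and p_guess are made up for this illustration, and reading (c) is only a guess at what Excel might do, not confirmed behaviour.

```r
# Observed counts from the original post; 80 is the poster's "theoretical" value.
x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)
m <- rep(80, length(x))

# (a) The original call: a matrix argument makes chisq.test() treat the
#     data as a 2 x 12 contingency table and test for independence.
tab <- chisq.test(rbind(x, m))

# (b) Goodness of fit to a discrete uniform distribution; the expected
#     counts are sum(x)/12 each, so they sum to the observed total.
gof <- chisq.test(x)

# (c) ASSUMPTION: the guessed Excel behaviour, i.e. sum((O-E)^2/E) with
#     E = 80 throughout, referred to k-1 degrees of freedom even though
#     sum(m) = 960 does not equal sum(x) = 851.
C2 <- sum((x - m)^2 / m)
p_guess <- pchisq(C2, df = length(x) - 1, lower.tail = FALSE)

print(c(table = tab$p.value, gof = gof$p.value, guess = p_guess))
```

Reading (a) should reproduce the p = 0.09142 from the original post and reading (b) the p = 0.0006468 quoted above. Reading (c) uses the same statistic C2 = 41.1375 but with 11 rather than 12 degrees of freedom, so it stays many orders of magnitude away from 1.65778E-14.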
