Lucke, Joseph F wrote:
> R and SPSS are using different but equivalent statistics. R is using
> the rank sum of group1 adjusted for the mean rank. SPSS is using the
> rank sum of group2 adjusted for the mean rank.
>

Close: It is the _minimum_ possible rank sum that is getting subtracted.
If everyone in group1 is less than everyone in group2, R's W statistic
will be zero. Other way around in SPSS.

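To make the point about the minimum rank sum concrete, here is a minimal sketch in R using the data quoted further down in this thread. The names g1, g2, R1, R2, U1 and U2 are my own and do not appear in the original posts; the only assumption is that the NA at the end of group2 is simply dropped. The smallest rank sum a group of size n can have is 1 + 2 + ... + n = n(n+1)/2, and the two adjusted statistics always add up to n1*n2.

```r
# Data as posted in the thread (group2 ends in NA)
group1 <- c(1.34, 1.47, 1.48, 1.49, 1.62, 1.67, 1.7, 1.7, 1.7, 1.73, 1.81,
            1.84, 1.9, 1.96, 2, 2, 2.19, 2.29, 2.29, 2.41, 2.41, 2.46,
            2.5, 2.6, 2.8, 2.8, 3.07, 3.3)
group2 <- c(0.98, 1.18, 1.25, 1.33, 1.38, 1.4, 1.49, 1.57, 1.72, 1.75, 1.8,
            1.82, 1.86, 1.9, 1.97, 2.04, 2.14, 2.18, 2.49, 2.5, 2.55, 2.57,
            2.64, 2.73, 2.77, 2.9, 2.94, NA)

# Drop missing values explicitly
g1 <- group1[!is.na(group1)]
g2 <- group2[!is.na(group2)]
n1 <- length(g1)   # 28
n2 <- length(g2)   # 27

# Joint ranks (ties get mid-ranks), then the rank sum of each group
r  <- rank(c(g1, g2))
R1 <- sum(r[seq_len(n1)])
R2 <- sum(r[n1 + seq_len(n2)])

# Subtract the minimum possible rank sum, n(n+1)/2, from each
U1 <- R1 - n1 * (n1 + 1) / 2   # 405.5, the W that R reports in the thread
U2 <- R2 - n2 * (n2 + 1) / 2   # 350.5, the U that SPSS reports in the thread

# The two statistics are complementary: they sum to n1 * n2 = 756
stopifnot(isTRUE(all.equal(U1 + U2, n1 * n2)))

# R's own statistic, without the continuity correction
res <- suppressWarnings(wilcox.test(g1, g2, correct = FALSE))
unname(res$statistic)
```

If every value in g1 were below every value in g2, g1 would occupy ranks 1 to n1, R1 would equal n1*(n1+1)/2, and U1 would be zero, which is the case described above.
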
> Example.
>
>> G1=group1
>> G2=group2[-length(group2)] #get rid of the NA
>> n1=length(G1) #n1=28
>> n2=length(G2) #n2=27
>
> # convert to ranks
>
>> W=rank(c(G1,G2))
>> R1=W[1:n1] #put the ranks back into the groups
>> R2=W[n1+1:n2]
>
> #Get the sum of the ranks for each group
>
>> W1=sum(R1)
>> W2=sum(R2)
>
> #Adjust for mean rank for group 1
>
>> W1-n1*(n1+1)/2
>
> [1] 405.5
>
> #Adjust for mean rank for group 2
>
>> W2-n2*(n2+1)/2
>
> [1] 350.5
>
> W1-n1*(n1+1)/2 gives R's result; W2-n2*(n2+1)/2 gives SPSS's result.
>
> Ties throw a wrench in the works. R uses a continuity correction by
> default, SPSS does not.
> Taking out the continuity correction,
>
>> wilcox.test(G1,G2,correct=FALSE)
>
>         Wilcoxon rank sum test
>
> data: G1 and G2
> W = 405.5, p-value = 0.6433
> alternative hypothesis: true location shift is not equal to 0
>
> Warning message:
> cannot compute exact p-value with ties in: wilcox.test.default(G1, G2,
> correct = FALSE)
>
> This p-value is the same as SPSS's.
>
> Consult a serious non-parametrics text. I used
> Lehmann, E. L., Nonparametrics: Statistical methods based on ranks.
> 1975. Holden-Day. San Francisco, CA.
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Natalie O'Toole
> Sent: Wednesday, August 15, 2007 1:07 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Mann-Whitney U
>
> Hi,
>
> I do want to use the Mann-Whitney test which ranks my data and then uses
> those ranks rather than the actual data.
>
> Here is the R code I am using:
>
>> group1 <-
>>   c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,
>>     2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
>> group2 <-
>>   c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,
>>     2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
>> result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95,
>>                       na.action)
>
> paired = FALSE so that the Wilcoxon rank sum test, which is equivalent to
> the Mann-Whitney test, is used (my samples are NOT paired).
> conf.level = 0.95 to specify the confidence level.
> na.action is used because I have an NA value (I suspect I am not using
> na.action in the correct manner).
>
> When I use this code I get the following error message:
>
> Error in arg == choices : comparison (1) is possible only for atomic and
> list types
>
> When I use this code:
>
>> group1 <-
>>   c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,
>>     2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
>> group2 <-
>>   c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,
>>     2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
>> result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)
>
> I get the following result:
>
>         Wilcoxon rank sum test with continuity correction
>
> data: group1 and group2
> W = 405.5, p-value = 0.6494
> alternative hypothesis: true location shift is not equal to 0
>
> Warning message:
> cannot compute exact p-value with ties in: wilcox.test.default(group1,
> group2, paired = FALSE, conf.level = 0.95)
>
> The W value here is 405.5 with a p-value of 0.6494.
>
> In SPSS, I am ranking my data and then performing a Mann-Whitney U test by
> selecting Analyze - Nonparametric Tests - 2 Independent Samples and
> then checking off the Mann-Whitney U test.
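
On the na.action error quoted above: my reading (it is not stated anywhere in the thread) is that an unnamed argument in that position gets matched to `alternative`, which expects one of "two.sided", "less" or "greater", so the call fails before any data are looked at. The sketch below uses made-up toy vectors x and y, not the thread's data, and assumes the default method of wilcox.test drops missing values by itself, which the identical W = 405.5 with and without the NA in this thread suggests.

```r
# Made-up toy data (not from the thread), with one missing value in x
x <- c(1.2, 3.4, 2.2, NA)
y <- c(0.9, 2.8, 3.1, 4.0)

# A bare na.action in third unnamed position is taken as `alternative`
# and rejected; capture the error instead of stopping the script
bad <- tryCatch(
  wilcox.test(x, y, paired = FALSE, conf.level = 0.95, na.action),
  error = function(e) e
)
inherits(bad, "error")

# Leaving the argument out: the NA is dropped by the default method
ok_implicit <- wilcox.test(x, y, paired = FALSE, conf.level = 0.95)

# Removing the NA by hand gives the same statistic
ok_explicit <- wilcox.test(x[!is.na(x)], y[!is.na(y)])
identical(unname(ok_implicit$statistic), unname(ok_explicit$statistic))
```

So the NA handling is unlikely to be the source of the difference from SPSS; removing the NA explicitly just makes the intent visible in the code.
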
>
> For the Mann-Whitney test in SPSS I am getting the following results:
>
> Mann-Whitney U = 350.5
> 2-tailed p value = 0.643
>
> I think maybe the discrepancy has to do with the specification of the NA
> values in R, but I'm not sure.
>
> If anyone has any suggestions, please let me know!
>
> I hope I have provided enough information to convey my problem.
>
> Thank you,
>
> Nat
> __________________
>
> Natalie,
>
> It's best to provide at least a sample of your data. Your field names
> suggest that your data might be collected in units of mm^2 or some similar
> measurement of area. Why do you want to use Mann-Whitney, which will rank
> your data and then use those ranks rather than your actual data? Unless
> your sample is quite small, why not use a two sample t-test? Also, are your
> samples paired? If they aren't, did you use the "paired = FALSE" option?
>
> JWDougherty
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> ------------------------------------------------------------------------
>
> This communication is intended for the use of the recipient to which it
> is addressed, and may contain confidential, personal, and or privileged
> information. Please contact the sender immediately if you are not the
> intended recipient of this communication, and do not copy, distribute,
> or take action relying on it. Any communication received in error, or
> subsequent reply, should be deleted or destroyed.