This query is off-topic for this list, as it is about statistics, not R programming. stats.stackexchange.com is a good venue for statistics questions.
However, you are confused. Wilcoxon does NOT test for differences in population means. For example, consider the 2 samples:

A: 5, 6, 7
B: 1, 2, 50

B has the larger mean (about 17.7 versus 6), yet two of its three values are smaller than every value in A, so a rank-based comparison does not favor B.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Aug 22, 2017 at 9:20 AM, Karolis Uziela <karolis.uzi...@gmail.com> wrote:
> Hi,
>
> I am using wilcox.test function to test the difference between the means of
> two samples. The data points are paired, so I am using a paired test.
>
> There is one strange case. Sample A has a higher mean than a sample B.
> However, wilcox.test function says that sample B has a significantly higher
> "mean rank" than sample A. How is it possible?
>
> Here is the code (data file is attached):
> df <- read.table("wilcox_data.txt", head=TRUE)
> mean(df$A)
> [1] 0.7987849
> mean(df$B)
> [1] 0.7977966
> mean(df$C)
> [1] 0.6350737
>
> wilcox.test(df$B, df$A, paired=TRUE, alternative="greater")
> Wilcoxon signed rank test with continuity correction
>
> data: df$B and df$A
> V = 134300, p-value = 3.299e-05
> alternative hypothesis: true location shift is greater than 0
>
> wilcox.test(df$C, df$A, paired=TRUE, alternative="greater")
> Wilcoxon signed rank test with continuity correction
>
> data: df$C and df$A
> V = 41423, p-value = 1
> alternative hypothesis: true location shift is greater than 0
>
> The p-value of the first test is rather low (3.299e-05), which indicates
> that the alternative hypothesis is true - sample B has a higher "mean rank"
> than sample A. Just to make sure I am not doing a dumb mistake, I added a
> third variable C to this example, which is much smaller than A or B. As
> expected, the second test has p-value = 1, which means that "mean rank" of
> C is lower than A (null hypothesis is true).
>
> I am afraid, I am not very strong in statistics, but I would very much
> appreciate if someone could explain to me in simple words:
> 1) Wikipedia says that Wilcoxon signed-rank test is used to test whether
> population "mean ranks" differ. What is exactly the definition of "mean
> rank"? https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test
> 2) How can the mean of a variable A be bigger than the mean of variable B,
> but the "mean rank" of variable B is significantly bigger than "mean rank"
> of variable A.
>
> There is a small chance that this is because of a bug in wilcox.test
> function, but it is probably more likely that this paradox is because of
> some statistics phenomena that I don't understand.
>
> Best regards,
> Karolis Uziela
>
> P. S. I have another strange example, where the difference between A and B
> is much smaller than the difference between A and C, but the significance
> of the "mean rank" difference between A and B is much larger than the
> significance of mean rank difference between A and C. For simplicity
> reasons, I didn't add that example here, but I guess that the answer to the
> above question will be related to this one.
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
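
P.S. The two small samples in the reply at the top (A: 5, 6, 7 and B: 1, 2, 50) can be checked directly in R. This is only a minimal sketch: it assumes the samples are treated as two independent groups (the reply does not say whether they are paired), and the variable names are illustrative. It shows the sample with the larger mean ending up with the smaller mean rank.

```r
# Samples from the reply; treating them as unpaired is an assumption.
A <- c(5, 6, 7)
B <- c(1, 2, 50)

mean_A <- mean(A)   # 6
mean_B <- mean(B)   # 53/3, pulled up by the single value 50

# Rank all six values together, then average the ranks within each sample.
r <- rank(c(A, B))
mean_rank_A <- mean(r[seq_along(A)])              # ranks 3, 4, 5
mean_rank_B <- mean(r[length(A) + seq_along(B)])  # ranks 1, 2, 6

# Rank-sum statistic for A versus B: the number of (a, b) pairs with a > b.
W <- wilcox.test(A, B)$statistic

cat("means:      A =", mean_A, " B =", mean_B, "\n")
cat("mean ranks: A =", mean_rank_A, " B =", mean_rank_B, "\n")
cat("W =", W, "out of", length(A) * length(B), "pairs\n")
```

B has the higher mean, but A has the higher mean rank, because ranks ignore how far the value 50 lies from the rest. The same mechanism can make a paired signed-rank test point the opposite way from a difference in means.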