Re: [R] Difference Between R: wilcox.test and STATA: signrank

David Winsemius Mon, 09 Aug 2010 10:16:46 -0700


On Aug 9, 2010, at 9:52 AM, peter dalgaard wrote:

On Aug 9, 2010, at 3:03 PM, Alain Guillet wrote:
Hi,
Look at the output of the test made in R and you can see it is a Wilcoxon rank sum test and not a Wilcoxon signed rank test.
It might be helpful to add that paired=TRUE is needed in the call to get the signed-rank test.
If there are ties, I know I prefer wilcox.exact from the exactRankTests package.
(Not that much of an issue in larger sample sizes, I'd say. Even with binary data, the normal approximation works reasonably well under the usual assumptions of expected counts > 5, since the tie-adjustment for the variance is exact for the distribution of the ranks. The continuity correction doesn't quite work though. Anyways, wilcox.exact is of course a nice thing to have.)


The OP's data:

> table(xvals=dat$x, yvals=dat$y)
      yvals
xvals   0 0.25 0.5  1 1.1 1.5  2  3 3.5  5 5.5  6  8
  0    35    0   0  1   0   1  2  1   0  0   0  0  0
  0.5   2    1   1  0   0   0  0  0   0  0   0  0  0
  0.75  0    0   1  0   0   0  0  0   0  0   0  0  0
  1     7    0   1  3   0   0  1  0   1  0   0  0  0
  1.1   0    0   0  0   1   0  0  0   0  0   0  0  0
  1.5   1    1   0  4   0   2  0  0   0  0   0  0  0
  2     3    0   0  6   0   2  4  2   1  0   0  0  0
  2.1   0    0   1  0   0   0  0  0   0  0   0  0  0
  2.5   0    0   0  0   0   1  0  0   0  2   0  0  0
  3     2    0   0  0   0   0  5  3   1  1   0  0  0
  3.3   1    0   0  0   0   0  0  0   0  0   0  0  0
  3.33  0    0   0  1   0   0  0  0   0  0   0  0  0
  3.5   0    0   0  1   0   1  0  0   0  1   1  0  0
  5     0    0   0  0   0   0  0  0   0  2   0  1  1
  10    0    0   0  0   0   0  0  0   0  1   0  0  0

Adding paired=TRUE to the wilcox.test call gives the signed rank test, although that is not likely to satisfy the OP since she seems to be expecting a higher degree of congruence with Stata.

The wilcox.test and wilcox.exact give results that only differ at the 4th decimal place.


> wilcox.test(dat$x, dat$y, paired=TRUE)

        Wilcoxon signed rank test with continuity correction

data:  dat$x and dat$y
V = 1181, p-value = 0.08872
alternative hypothesis: true location shift is not equal to 0

> wilcox.exact(dat$x, dat$y, paired=TRUE)

        Asymptotic Wilcoxon signed rank test

data:  dat$x and dat$y
V = 1181, p-value = 0.08805
alternative hypothesis: true mu is not equal to 0
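
For reference, the V statistic in both outputs is the sum of the ranks of the positive paired differences, computed after zero differences are discarded. Here is a minimal sketch of that calculation on made-up data. The vectors x and y are hypothetical toy values, not the OP's data, and V_manual and V_r are names chosen only for this illustration:

```r
# Hypothetical toy data (NOT the OP's data), chosen to include
# zero differences and tied absolute differences.
x <- c(1.0, 2.0, 3.0, 0.0, 5.0, 2.5, 1.5, 4.0)
y <- c(0.5, 2.0, 1.0, 0.0, 5.5, 1.0, 3.5, 4.0)

d    <- x - y            # paired differences
d_nz <- d[d != 0]        # discard zero differences
r    <- rank(abs(d_nz))  # midranks of the absolute differences
V_manual <- sum(r[d_nz > 0])  # sum of ranks of positive differences

# wilcox.test warns about ties and zeros (no exact p-value);
# the warnings are suppressed here only to keep the output short.
V_r <- unname(suppressWarnings(
  wilcox.test(x, y, paired = TRUE)
)$statistic)

print(c(manual = V_manual, wilcox.test = V_r))
```

If the two numbers agree, that supports the reading that wilcox.test simply ignores the zero differences rather than ranking them.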

The Stata output indicates some sort of adjustment for zeros. The wilcox.test basically throws out the zeros (presumably the zero differences), so there may be a difference in the algorithm. Her data has 51 zero differences and 61 non-zero differences.


> sum(dat$x==dat$y)
[1] 51
> sum(dat$x!=dat$y)
[1] 61

Wait a minute; the Stata report said she had 49 zeros and only 108 records.

Different data. Different results. I suppose it could be my editing errors. Taking out all the extraneous html junk and restoring missing delimiters was kind of a pain.
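
As a rough check on what the Stata adjustment for zeros amounts to, the figures in the Stata report can be reproduced by hand from the counts it prints. The sketch below is an assumption about the arithmetic, inferred only from the fact that it matches the printed numbers; it is not taken from Stata's documentation. The tie adjustment is copied from the report rather than recomputed, and all variable names are made up for this illustration:

```r
# Figures copied from the Stata signrank report quoted below
n        <- 108      # all observations
n0       <- 49       # zero differences
t_pos    <- 3101     # sum of ranks, positive differences
expected <- 2330.5   # expected sum of ranks, positive differences
adj_ties <- -282.38  # copied from the report, not recomputed here

# Assumed formulas (they reproduce the printed values):
var_unadj <- n  * (n  + 1) * (2 * n  + 1) / 24   # unadjusted variance
adj_zeros <- -n0 * (n0 + 1) * (2 * n0 + 1) / 24  # adjustment for zeros
var_adj   <- var_unadj + adj_ties + adj_zeros    # adjusted variance

z <- (t_pos - expected) / sqrt(var_adj)  # normal approximation, no continuity correction
p <- 2 * pnorm(-abs(z))                  # two-sided p-value

print(c(var_unadj = var_unadj, adj_zeros = adj_zeros,
        var_adj = var_adj, z = z, p = p))
```

Under these assumptions the zeros stay in the ranking (they occupy ranks 1 to 49, which sum to 1225) and only the variance is reduced, whereas wilcox.test drops them before ranking. That alone would give different statistics even on identical data.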

Capasia: Don't use Google sheets to transmit data. Instead use dput on the datablatt object and just post the results of that output.
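
A minimal sketch of what that looks like. The small data frame here is a hypothetical stand-in for the real datablatt, and the round trip through capture.output is only there to show that the posted text rebuilds the same object:

```r
# Hypothetical stand-in for the OP's data frame
datablatt <- data.frame(x = c(0, 0.5, 1, 2),
                        y = c(0, 0.25, 1, 3.5))

# This prints a plain-text R expression that can be pasted into an email
dput(datablatt)

# A reader can rebuild the object from that text, simulated here by
# capturing the dput output and evaluating it again.
txt      <- paste(capture.output(dput(datablatt)), collapse = "\n")
restored <- eval(parse(text = txt))

all.equal(restored, datablatt)
```

Because the output is ordinary text, it survives plain-text mail without the HTML cleanup and delimiter repair that a spreadsheet link requires.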


--
David.

Alain

On 09-Aug-10 12:43, Capasia wrote:

This is my first post to the mailing list and I guess it's a pretty stupid question but I can't figure it out. I hope this is the right forum for these kinds of questions.
Before I started using R I was using STATA to run a Wilcoxon signed-rank test on two variables. See data below:
https://spreadsheets.google.com/pub?key=0ApodAA2GAEP_dDZkdzZHSFBqX1JHOWJBX1dMQUZCVkE&hl=en&output=html
STATA Output:
. signrank x=y

Wilcoxon signed-rank test

      sign |      obs   sum ranks    expected
-------------+---------------------------------
  positive |       41        3101      2330.5
  negative |       18        1560      2330.5
      zero |       49        1225        1225
-------------+---------------------------------
       all |      108        5886        5886

unadjusted variance   106438.50
adjustment for ties     -282.38
adjustment for zeros  -10106.25
                   ----------
adjusted variance      96049.88

Ho: transfer_2_a = transfer_2_b
           z =   2.486
  Prob>  |z| =   *0.0129*

When running a Wilcoxon signed-rank test
wilcox.test(datablatt$x, datablatt$y)
Wilcoxon rank sum test with continuity correction

data:  datablatt$x and datablatt$y
W = 7059.5, p-value = *0.09197*
alternative hypothesis: true location shift is not equal to 0
As you can see the p values are different (one with H0 rejection and the other one not). I tested whether it could be that the STATA one isn't paired but this doesn't seem to be the problem.
I'm dumbfounded as to what could lead to such a difference. I couldn't find any settings I have missed but somehow I guess I'm using the function in the wrong way...
Any ideas?
Thanks a lot in advance!

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
Bureau c.316
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve
Belgium

tel: +32 10 47 30 50



--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com



David Winsemius, MD
West Hartford, CT
