Re: [R] Bug in Kendall for n<4?

2008-11-24 Thread Martin Maechler
Dear Ian,

thanks a lot for your clarifications.

 AIM == A I McLeod [EMAIL PROTECTED]
 on Sat, 22 Nov 2008 22:24:11 -0500 (EST) writes:

AIM The package Kendall computes the p-value when there are
AIM ties in one ranking. This often happens with trend
AIM testing with environmental data. I get about 5-10
AIM emails per year from scientists using Kendall for that
AIM purpose who don't know how to use R very well. I
AIM suspect this means there are many users of this
AIM package.

Indeed, the case of ties in the data is an important one in
possibly many applications, and indeed, cor.test() is
and hence the Kendall package is
serving an important need!

I do apologize for my impolite wording, to which I was led by
the example (and 'Subject').
If the topic is just *computation* of Kendall's tau, I don't
think anyone should use the Kendall package.
If, however, one is interested in P-values of (H0:  tau = 0),
your Kendall package is indeed a valuable asset!
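
For very small n without ties, the exact null distribution behind such a
P-value can be enumerated directly. The following is a minimal sketch in
base R, not the algorithm used by cor.test() or by the Kendall package.
The helper names permutations(), kendall_s() and exact_p_two_sided() are
made up for illustration, and the sketch assumes no ties in either variable.

```r
# All orderings of a vector, returned as a list (only sensible for small n).
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Kendall's S: number of concordant pairs minus number of discordant pairs.
kendall_s <- function(x, y) {
  idx <- combn(length(x), 2)   # all pairs i < j, one pair per column
  as.numeric(sum(sign(x[idx[1, ]] - x[idx[2, ]]) *
                 sign(y[idx[1, ]] - y[idx[2, ]])))
}

# Two-sided exact P-value for H0: tau = 0, by enumerating every
# reordering of y (all equally likely under H0 when there are no ties).
exact_p_two_sided <- function(x, y) {
  s_obs  <- kendall_s(x, y)
  s_null <- vapply(permutations(seq_along(y)),
                   function(p) kendall_s(x, y[p]), numeric(1))
  mean(abs(s_null) >= abs(s_obs))
}

exact_p_two_sided(1:3, 1:3)
```

For x = y = 1:3, only 2 of the 6 orderings reach |S| = 3, so the two-sided
P-value is 1/3. The cost grows as n!, so this is only practical for small n.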

AIM Thank you though for your comments.  So I will improve
AIM the documentation for Kendall by terminating the
AIM program with an error message when n=3 (this case is
AIM of no interest to me) and warning message when n<12
AIM that the p-values may be inaccurate. My student Paul
AIM Valz in his Ph.D. thesis discussed an enumeration
AIM algorithm for the exact p-value computation for any n
AIM with arbitrary ties in both variables -- but the
AIM algorithm is complex and for practical purposes, I
AIM prefer to use the algorithm in Kendall -- especially
AIM for trend testing with block bootstrap. That is the
AIM reason for the existence of this package.

AIM Valz's algorithm was published in JCGS but I believe
AIM there is a mistake, so I don't use it.  The approximate
AIM algorithm, for p-values, that is used in Kendall, has
AIM been extensively tested.

AIM Also, I doubt if the current p-values from cor.test are
AIM correct for small n and I notice that ties in one
AIM ranking do produce a warning.

That's an interesting point about which I think we should
exchange more, but really in a different thread, possibly on
R-devel rather than R-help.
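
The warning mentioned above can be reproduced with a small made-up data
set. This is a sketch, assuming a current R in which cor.test() falls back
to a normal approximation when ties are present. The vectors x and y are
invented for illustration and are not from this thread.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 2, 3, 5, 5, 7)   # ties in one ranking only

# Collect any warnings instead of printing them, then carry on.
warned <- character(0)
res <- withCallingHandlers(
  cor.test(x, y, method = "kendall"),
  warning = function(w) {
    warned <<- c(warned, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warned                  # the message(s) issued because of the ties
names(res$statistic)    # "z": a normal approximation was used, not exact T
res$estimate            # tau, corrected for ties
res$p.value             # approximate two-sided P-value
```

Whether that approximate P-value is adequate for small n is exactly the
open question raised here; the sketch only shows where the approximation
kicks in.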

Thanking you and apologizing once more:
Martin Maechler, ETH Zurich


AIM Finally, I will also make more clear in the
AIM documentation about cor and cor.test being alternative
AIM functions which may be more appropriate for some users.

AIM Ian McLeod



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug in Kendall for n<4?

2008-11-22 Thread Martin Maechler
 SM == Stavros Macrakis [EMAIL PROTECTED]
 on Fri, 21 Nov 2008 14:44:37 -0500 writes:

 > library(Kendall)
 > Kendall(1:3,1:3)
SM WARNING: Error exit, tauk2. IFAULT = 12  tau = 1,
SM 2-sided pvalue =1

SM I believe Kendall tau is well-defined for this case and
SM the reported value is correct; isn't it a bug to give a
SM warning?  (And if, e.g., the pvalue is not well-defined
SM in this case, wouldn't it be better to return NA or NaN
SM or something?) Also, shouldn't the error code be given
SM in plain English -- or at least the meaning of IFAULT =
SM 12 documented on the help page?

The real question is  *WHY* there needs to be a separate package
'Kendall'  when R itself does everything you want and does not
show any problems ?

 cor()
 cov()
 cor.test()

all have a method = "kendall" and seem to work alright,
even for  n=2

  > cor(1:3, c(3,1,2), method="kendall")
  [1] -0.333
  > cov(1:3, c(3,1,2), method="kendall")
  [1] -2
  > cor(1:3, 1:3, method="kendall")
  [1] 1
  > cor.test(1:3, 1:3, method="kendall")

  Kendall's rank correlation tau

  data:  1:3 and 1:3 
  T = 3, p-value = 0.
  alternative hypothesis: true tau is not equal to 0 
  sample estimates:
  tau 
1 
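
The values above can be cross-checked against the pair-counting definition
of Kendall's tau. The following is a minimal sketch assuming no ties (the
tau-a form); kendall_tau() is a name made up for illustration and is not
part of base R or of the Kendall package.

```r
# Kendall's tau from its definition, assuming no ties in x or y.
kendall_tau <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  idx <- combn(length(x), 2)   # all pairs i < j, one pair per column
  # +1 for each concordant pair, -1 for each discordant pair
  s <- sum(sign(x[idx[1, ]] - x[idx[2, ]]) *
           sign(y[idx[1, ]] - y[idx[2, ]]))
  s / ncol(idx)                # divide by the number of pairs, n(n-1)/2
}

kendall_tau(1:3, c(3, 1, 2))
cor(1:3, c(3, 1, 2), method = "kendall")
```

With 1:3 against c(3,1,2) there is one concordant and two discordant pairs,
so both calls give -1/3.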


Questions about the 'Kendall' package should typically first go
to its author ...
But those on  cor(), cor.test() etc do belong here.

Best regards,
Martin Maechler, ETH Zurich


SM A somewhat less clear case is Kendall(1:2,1:2), which
SM gives the same error.  Though the usual formula for
SM Kendall tau has a zero in the denominator in this case,
SM I'd think the correct generalization is 1 if the two
SM elements are in the same order, and -1 if they are not
SM (the only possibilities). But perhaps I don't fully
SM understand the interpretation of this statistic.

SM   -s



Re: [R] Bug in Kendall for n<4?

2008-11-22 Thread Stavros Macrakis
On Sat, Nov 22, 2008 at 9:04 AM, Martin Maechler
[EMAIL PROTECTED] wrote:
SM I believe Kendall tau is well-defined for this case...

 The real question is  *WHY* there needs to be a separate package 'Kendall'  
 when R itself does everything you want and does not show any problems?

Thanks for pointing me to cor(...,method="kendall"), which I did not
know about; I used the Kendall CRAN package out of pure ignorance.

In my defense, I think it is excusable ignorance, as Search on the R
Project home page finds the Kendall package (which only mentions cor
as a See Also).  I only more recently discovered the advantages of
help.search.

By the way, is Kendall well-defined when the arguments are not
permutations of each other?  cor seems to return results even in this
case:

   a <- factor(c("Alice","Bob","Chris"))
   b <- a[1:2]
   c <- a[2:3]
   cor(b,c,method="kendall")
   => 1

apparently interpreting b as c(1,2) and c as c(1,2) based on
alphabetical order (even though it is an UNordered factor), which
seems to make the value depend on the subjects' names, which I'd think
was wrong for a rank-order statistic.
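
The dependence on names can be made explicit by looking at the integer
codes behind a factor. This is a sketch with made-up data (the vector y and
the renamed subject "Zoe" are hypothetical). The codes are extracted with
as.integer() on purpose, since more recent versions of cor() may refuse
factor arguments outright.

```r
x <- factor(c("Alice", "Bob", "Chris"))
as.integer(x)         # codes follow the sorted level names: 1 2 3

y <- c(10, 20, 30)    # made-up measurements, one per subject
tau1 <- cor(as.integer(x), y, method = "kendall")

# Same three subjects in the same order, but the first one is renamed.
x2 <- factor(c("Zoe", "Bob", "Chris"))
as.integer(x2)        # codes are now 3 1 2
tau2 <- cor(as.integer(x2), y, method = "kendall")

c(tau1, tau2)
```

The statistic moves from 1 to -1/3 although only a label changed, which is
why an unordered factor is not a meaningful input for a rank correlation.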

Thanks again,

   -s



[R] Bug in Kendall for n<4?

2008-11-21 Thread Stavros Macrakis
> library(Kendall)
> Kendall(1:3,1:3)
WARNING: Error exit, tauk2. IFAULT =  12  
tau = 1, 2-sided pvalue =1

I believe Kendall tau is well-defined for this case and the reported
value is correct; isn't it a bug to give a warning?  (And if, e.g.,
the pvalue is not well-defined in this case, wouldn't it be better to
return NA or NaN or something?) Also, shouldn't the error code be
given in plain English -- or at least the meaning of IFAULT = 12
documented on the help page?

A somewhat less clear case is Kendall(1:2,1:2), which gives the same
error.  Though the usual formula for Kendall tau has a zero in the
denominator in this case, I'd think the correct generalization is 1 if
the two elements are in the same order, and -1 if they are not (the
only possibilities). But perhaps I don't fully understand the
interpretation of this statistic.
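
The two-element case can be checked with base R's cor(). A small sketch: in
the pair-counting form of tau the normalising constant is the number of
pairs, n(n-1)/2, which equals 1 for n = 2, so at least that form of the
statistic is not a 0/0 expression here.

```r
n_pairs <- choose(2, 2)    # n(n-1)/2 for n = 2
same     <- cor(1:2, 1:2, method = "kendall")   # same order
reversed <- cor(1:2, 2:1, method = "kendall")   # opposite order
c(pairs = n_pairs, same = same, reversed = reversed)
```

This agrees with the suggested generalization: 1 when the two elements are
in the same order and -1 when they are not.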

  -s
