> -----Original Message-----
> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> Sent: Wednesday, June 27, 2012 1:24 AM
> To: Duncan Murdoch
> Cc: Adler, Avraham; r-devel@r-project.org
> Subject: Re: [Rd] Fast Kendall's Tau
> 
> On 26/06/2012 22:44, Duncan Murdoch wrote:
>> On 12-06-25 2:48 PM, Adler, Avraham wrote:
>>> Hello.
>>>
>>> Has any further action been taken regarding implementing David
>>> Simcha's fast Kendall tau code (now found in the package pcaPP as
>>> cor.fk) into R-base? It is literally hundreds of times faster,
>>> although I am uncertain as to whether he wrote code for testing the
>>> significance of the parameter. The last mention I have seen of this
>>> was in
>>> 2010<https://stat.ethz.ch/pipermail/r-devel/2010-February/056745.html>.
>>
>> You could check the NEWS file, but I don't remember anything being
>> done along these lines.  If the code is in a CRAN package, there
>> doesn't seem to be any need to move it to base R.
>
> In addition, this is something very specialized, and the code in R is fast
> enough for all but the most unusual instances of that specialized task.
> example(cor.fk) shows the R implementation takes well under a second for 2000
> cases (a far higher value than is usual).

Thank you all very much for the replies. I was approaching the problem from the
vantage point of fitting Archimedean copulas to events drawn from
non-elliptical distributions, and I had a few hundred thousand data points. That
is not as bad as the situation faced by the authors of this paper,
<http://vigna.dsi.unimi.it/ftp/papers/ParadoxicalPageRank.pdf>, who needed to
calculate Kendall's tau on hundreds of millions of pairs(!). I wrote an
implementation in VBA, and when I went to R to confirm my calculations, I was
surprised to see that even my VBA code was probably hundreds of times as fast
as R (on a vector of exactly 100,000 pairs). The implementation in pcaPP runs
in a second or less on the same vector.

Perhaps, as was suggested in another e-mail, the least intrusive (and best
bang-for-buck) option is to update the help page for "cor" so that it refers
to cor.fk. That way, those of us who have to deal with ungainly data sets
would be made aware that it is available.

Thank you again,

Avraham Adler
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
