Please do the comparison with R-devel. That has a different policy, including switching to quicksort when partial has more than 10 elements.
BTW you need to compare with quicksort (which suffices here and is quite a lot faster than shellsort for large numeric vectors). For 1 million+1 I am getting timings on my Windows laptop sort 0.39 quicksort 0.24 quantile(50%) 0.13 quantile() 0.17 kuantile(50%) 0.16 kuantile() 0.17 (timings for a million are about 50% more in the last four lines as roughly twice as many raw quantiles are needed.) Note that there is never more than about a factor of three gain over quicksort (remembering that quantile has some overhead of its own, so I also timed a version of quantile always using quicksort, using about 0.34). Given how rare this is as a task for R, it would probably be quite sufficient to always use quicksort. On Sat, 11 Mar 2006, roger koenker wrote: > Motivated by Deepayan's recent inquiries about the efficiency of the > R 'quantile' > function: > > http://tolstoy.newcastle.edu.au/R/devel/05/11/3305.html > http://tolstoy.newcastle.edu.au/R/devel/06/03/4358.html > > I decided to try to revive an old project to implement a version of > the Floyd > and Rivest (1975) algorithm for finding quantiles with O(n) > comparisons. I > used to have a direct translation from FR's algol68 in my quantreg > package, > but was forced to remove it when it encountered g77 compilers that > didn't like the way I had handled recursion. > > Fortunately, in the interim, I've corresponded with Krzysztof Kiwiel > in Warsaw > who has made a detailed study of the algorithm: > > K.C. Kiwiel: On Floyd and Rivest's SELECT Algorithm, Theoretical > Computer Sci. 347 (2005) 214-238. > > He has also kindly agreed to allow me to incorporate his > implementation of > the FR algorithm in R under GPL. > > For the moment, I have made an alpha-test package with a single > function -- > 'kuantile' -- that attempts to reproduce the functionality of the > current > base quantile function, essentially replacing the partial sorting done > there with calls to Kiwiel's 'select' function. This package is > available > from: > > http://www.econ.uiuc.edu/~roger/research/rq/kuantile > > The good news is that the new function _is_ somewhat quicker than > the base 'quantile' function. The following table is based on means > of 10 replications. The first two columns report times for computing > just the median, the next two columns for computing the five default > quantiles: seq(0,1,.25), and the last column is the time needed for > a call > of sort(). > > mean system.time()[1] for various calls* > > median only default 5 > n quantile kuantile quantile kuantile sort > > 100 0.002 0.001 0.004 0.004 0.000 > 1000 0.002 0.001 0.003 0.002 0.002 > 10000 0.003 0.002 0.009 0.005 0.005 > 1e+05 0.024 0.010 0.061 0.013 0.031 > 1e+06 0.206 0.120 0.535 0.142 0.502 > 1e+07 2.132 0.790 5.575 1.035 7.484 > > * On a mac G5 running R 2.2.1. > > The bad news is that this approach doesn't help with Deepayan's > original question which involved instances in which quantile() was > being called for rather large number of probabilities. In such > cases it seems that it is difficult to improve upon doing a full sort. > > I would welcome comments on any of this.... > > url: www.econ.uiuc.edu/~roger Roger Koenker > email: [EMAIL PROTECTED] Department of Economics > vox: 217-333-4558 University of Illinois > fax: 217-244-6678 Champaign, IL 61820 > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > > -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel