x is a 1 x N sparse matrix of numeric values. I am using the Matrix package to represent it as a sparse matrix; the representation includes a numeric vector giving the positions of the nonzero entries within the matrix. My goal is to find the columns with the n largest values, here positive correlations. Part of my strategy is to sort only the nonzeros, which are available as a numeric vector.
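A minimal sketch of that strategy, under some assumptions: the helper name top_n_cols and the example data are made up for illustration, x holds no NAs, and the values of interest are positive (so any implicit zero can be ignored). It relies on summary() from the Matrix package, which returns the stored entries as (i, j, x) triplets, so only the nonzeros are ordered.

```r
library(Matrix)

# Hypothetical helper: columns holding the n largest stored (nonzero) entries
# of a 1 x N sparse matrix. Assumes no NAs and that the wanted values are
# positive, so the implicit zeros never need to be considered.
top_n_cols <- function(x, n) {
  nz  <- summary(x)                  # triplets (i, j, x) of stored entries only
  n   <- min(n, nrow(nz))            # cannot return more than the nonzero count
  ord <- order(nz$x, decreasing = TRUE)[seq_len(n)]  # subset the index, not the sorted data
  data.frame(col = nz$j[ord], value = nz$x[ord])
}

# Illustrative data: a 1 x 10 matrix with four nonzero correlations
x <- sparseMatrix(i = c(1, 1, 1, 1), j = c(2, 5, 7, 9),
                  x = c(0.30, 0.90, 0.10, 0.75), dims = c(1, 10))
top_n_cols(x, 2)
```

For this example the two rows returned are column 5 (0.90) and column 9 (0.75). If negative values can occur and n exceeds the number of positive entries, the implicit zeros would have to be ranked as well, which this sketch does not do.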
Thanks for your interest and input.

Prof Brian Ripley wrote:
>
> What is 'x' here? What type? Does it contain NAs? Are there ties? R's
> ordering functions are rather general, and you can gain efficiency by
> ruling some of these out.
>
> See ?sort, look at the 'partial' argument, including the comments in the
> Details. And also look at ?sort.list.
>
> sort.int(x) is more efficient than x[order(x)], and x[order(x)[1:n]] is
> more efficient than x[order(x)][1:n] for most types.
>
> Finally, does efficiency matter? As the examples in ?sort show, R can
> sort a vector of length 2000 in well under 1ms, and 1e7 random normals in
> less time than they take to generate. There are not many tasks where
> gaining efficiency over x[order(x)][1:n] will be important. E.g.
>
>> system.time(x <- rnorm(1e6))
>    user  system elapsed
>    0.44    0.00    0.44
>> system.time(x[order(x)][1:4])
>    user  system elapsed
>    1.72    0.00    1.72
>> system.time(x2 <- sort.int(x, method = "quick")[1:4])
>    user  system elapsed
>    0.31    0.00    0.32
>> system.time(min(x))
>    user  system elapsed
>    0.02    0.00    0.02
>> system.time(x2 <- sort.int(x, partial=1)[1])
>    user  system elapsed
>    0.07    0.00    0.07
>
> and do savings of tenths of a second matter? (There is also
> quantreg::kselect, if you work out how to use it, which apparently is
> a bit faster at partial sorting on MacOS X but not elsewhere.)
>
>
> On Sun, 11 Nov 2007, David Katz wrote:
>
>>
>> What is the most efficient alternative to x[order(x)][1:n] where
>> length(x)>>n?
>
> That is the smallest n values, pace your subject line.
>
>> I also need the positions of the mins/maxs perhaps by preserving names.
>>
>> Thanks for any suggestions.
>>
>
> --
> Brian D. Ripley,                  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
View this message in context: http://www.nabble.com/Largest-N-Values-Efficiently--tf4788033.html#a13708965
Sent from the R help mailing list archive at Nabble.com.