The caching is done by the operating system's disc cache, not by R:
you need to find and read the package metadata for every package, and
the first pass has to fetch it from disc. AFAIK it is not easy to flush
the disc cache, but quite easy to overwrite it with later reads.
(Google for more info.)
If you are not concerned about validity of the installed packages you
could skip the tests and hence the reads.
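
A minimal sketch of what skipping the tests could look like, assuming
that any directory containing Meta/package.rds counts as an installed
package. The name packages_unchecked is hypothetical and not part of R;
the function only checks that the metadata file exists and never reads
it, so broken installs or invalid versions would not be detected.

```r
# Hypothetical helper (not part of R): list installed package names
# without reading or validating any metadata.
packages_unchecked <- function(lib.loc = .libPaths()) {
    lib.loc <- lib.loc[file.exists(lib.loc)]
    ans <- unlist(lapply(lib.loc, function(lib) {
        nams <- list.files(lib, all.files = FALSE, full.names = FALSE)
        # Existence check only: the .rds file is never opened.
        nams[file.exists(file.path(lib, nams, "Meta", "package.rds"))]
    }), use.names = FALSE)
    unique(as.character(ans))
}

packs <- packages_unchecked()
length(packs)
```

The existence check still touches directory entries on disc, so a cold
first call is not free, but it avoids opening one file per package.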
Your times are quite a bit slower than mine, so a faster disc system
might help. Since my server has just been rebooted (for a new
kernel), with all of CRAN and most of BioC I get
system.time( packs <- .packages( all = T ) )
   user  system elapsed
  0.518   0.262  25.042
system.time( packs <- .packages( all = T ) )
   user  system elapsed
  0.442   0.080   0.522
length(packs)
[1] 2096
There's a similar issue when installing packages: the Perl code reads
the indices from every visible package to resolve links, and that can
be slow the first time.
On Tue, 3 Mar 2009, Romain Francois wrote:
Hello,
The first time in a session I call .packages( all.available = T ), it takes a
long time (I have many packages installed from CRAN):
system.time( packs <- .packages( all = T ) )
   user  system elapsed
  0.738   0.276  43.787
When I call it again, the time is much reduced, so there must be some
caching somewhere. I would like to reduce the time the first call takes,
but I have not been able to identify where the caching takes place, and
so how to disable it in order to measure the running time without it.
Without this, I would have to restart my computer each time to clear the
cache before testing a new version of the function (this is not going to
happen).
Here is the .packages function. I am suspicious about this part: "ans <-
c(ans, nam)", which grows the ans vector each time a suitable package is
found; this does not sound right.
It's OK as there are only going to be ca 2000 packages. Try
profiling this: .readRDS and grepl take most of the time.
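
A minimal sketch of such a profiling run, using Rprof and summaryRprof
from utils. The temporary file name is arbitrary, and on a warm cache
with few packages the call may finish before the sampler records
anything, which is why the summary is wrapped in tryCatch.

```r
# Profile one call to .packages(all.available = TRUE).
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                      # start sampling
packs <- .packages(all.available = TRUE)
Rprof(NULL)                           # stop sampling

# A very fast run may record no samples, in which case summaryRprof
# can fail; return NULL rather than stopping.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

The by.self table attributes time to the function actually running when
each sample was taken, which is the view that separates file reading
from the regular-expression check.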
.packages
function (all.available = FALSE, lib.loc = NULL)
{
    if (is.null(lib.loc))
        lib.loc <- .libPaths()
    if (all.available) {
        ans <- character(0L)
        lib.loc <- lib.loc[file.exists(lib.loc)]
        valid_package_version_regexp <-
            .standard_regexps()$valid_package_version
        for (lib in lib.loc) {
            a <- list.files(lib, all.files = FALSE, full.names = FALSE)
            for (nam in a) {
                pfile <- file.path(lib, nam, "Meta", "package.rds")
                if (file.exists(pfile))
                    info <- .readRDS(pfile)$DESCRIPTION[c("Package",
                        "Version")]
                else next
                if ((length(info) != 2L) || any(is.na(info)))
                    next
                if (!grepl(valid_package_version_regexp, info["Version"]))
                    next
                ans <- c(ans, nam) ########## suspicious about this
            }
        }
        return(unique(ans))
    }
    s <- search()
    return(invisible(substring(s[substr(s, 1L, 8L) == "package:"],
        9)))
}
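
The cost of the flagged line can be checked in isolation. The sketch
below is an illustration only: the package names are made up, and grow
and prealloc are hypothetical helpers that compare appending with c()
against filling a preallocated vector for about 2000 names.

```r
# Synthetic stand-in for roughly 2000 package names.
n <- 2000L
nams <- sprintf("pkg%04d", seq_len(n))

# Same pattern as the flagged line: append one element at a time.
grow <- function(x) {
    ans <- character(0L)
    for (nam in x) ans <- c(ans, nam)
    ans
}

# Alternative: allocate once, fill by index.
prealloc <- function(x) {
    ans <- character(length(x))
    i <- 0L
    for (nam in x) {
        i <- i + 1L
        ans[i] <- nam
    }
    ans[seq_len(i)]
}

t_grow <- system.time(a <- grow(nams))
t_pre <- system.time(b <- prealloc(nams))
stopifnot(identical(a, b))
print(rbind(grow = t_grow, prealloc = t_pre)[, 1:3])
```

Both timings can be set against the elapsed times reported in this
thread to judge whether the append is worth changing.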
version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status         Under development (unstable)
major          2
minor          9.0
year           2009
month          02
day            08
svn rev        47879
language       R
version.string R version 2.9.0 Under development (unstable) (2009-02-08 r47879)
--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595