Now that multi-core machines are more widely available, we have been able to stress-test the parallel building capabilities of R and of packages. The current Windows and Mac build machines are both 8-core, and I test on an 8-core machine. These are all fairly recent changes of hardware, and the following applies only to R-devel, the version to become 2.9.0 next month.

Parallel builds of R under Unix-alikes have long been supported, and now allow rather more to be done in parallel. Using 'make -j' will work on a machine with enough resources, and gives something like a 3x speed-up. The main limiting factor is converting help, which is done serially and is likely to remain so until we move to R-based conversion. New for this version is the ability to install in parallel (e.g. 'make -j install install-pdf'). It is also possible to check the R build in parallel, but the output is so intermingled that it is hard to see any discrepancies. However, few people build R from scratch every day, and for those with such powerful machines building R is probably already fast enough (ca 3 mins).

When installing or checking a single package, the only thing done in parallel is making any compiled code (2.8.x ran tests in the 'tests' directory in parallel, but this is not currently done). The standard procedures work safely with parallel make, but users who write Makevars or Makefile files need to take this into account. For example, yesterday's Rcsdp/src/Makevars has

PHONY: all

all: before $(SHLIB)

before: Csdp.ts

but this 'before' target has to be completed before $(SHLIB) can be built, and nothing here tells make so: a parallel make is free to build the two prerequisites of 'all' at the same time (and the package failed to install for me). I am aware that the documentation did not stress this sufficiently in the past, and 'Writing R Extensions' has been revised to do so. (And the package has been updated with alacrity, thank you.)
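
The ordering requirement can be sketched with a toy Makefile, written out and run from a shell script. This is an illustration only: the file names mirror the quoted Makevars, but SHLIB is set by hand here, the recipes are stand-ins (the real ones are not shown in this message), and it is not necessarily how the updated package does it. The point is that the stamp file is made a prerequisite of $(SHLIB) itself rather than a sibling prerequisite of 'all'.

```shell
#!/bin/sh
# Toy reproduction in a scratch directory; recipes are hypothetical stand-ins.
set -e
dir=$(mktemp -d)
cd "$dir"

# "target: prerequisites ; recipe" keeps each rule on one line,
# so the Makefile needs no tab-indented recipe lines.
cat > Makefile <<'EOF'
SHLIB = Rcsdp.so
.PHONY: all
all: $(SHLIB)
# The library depends on the stamp file directly, so make must
# finish Csdp.ts before it starts on $(SHLIB), even under -j.
$(SHLIB): Csdp.ts ; test -f Csdp.ts && touch $(SHLIB)
Csdp.ts: ; sleep 1 && touch Csdp.ts
EOF

make -j2   # parallel make; the dependency above still forces the order
ls         # lists the files now present in the scratch directory
```

With the original layout ('all: before $(SHLIB)') a parallel make may start the $(SHLIB) recipe while the stamp file is still being made, which is the failure described above.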

Because Windows users are much less likely to be aware of these issues, and because of the paragraph on the Windows 'make' below, Uwe and I tweaked the procedures for the Windows build machine so that packages are always installed/checked with a non-parallel make.

Installing/updating packages in parallel can help a lot, and we've made two changes to facilitate that. First, there is a new option for R CMD INSTALL, --pkglock. This uses locks on a per-package basis, and so prevents more than one process trying to install a given package at the same time, but allows several packages to be installed to the same library simultaneously. This places the onus on the caller to ensure that dependencies are installed first, and the 'Ncpus' option to install.packages() provides a way to marshal package installation to make best use of multiple CPUs.
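
What "dependencies first" means for a caller driving R CMD INSTALL by hand can be sketched as follows. The tarball names (depA, depB, top) and the helper install_wave are hypothetical, and the script is a dry run by default: RUN defaults to 'echo', so it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical packages: depA and depB are independent; top needs both.
# Dry run by default; invoke as  RUN= sh thisfile.sh  to really install.
RUN=${RUN-echo}

install_wave() {
    # One R CMD INSTALL per tarball, all started in the background.
    # --pkglock locks per package, so these processes can write to
    # the same library without contending for one library-wide lock.
    for pkg in "$@"; do
        $RUN R CMD INSTALL --pkglock "$pkg" &
    done
    # Block until every install in this wave has finished.
    # Note: this plain 'wait' does not report individual failures.
    wait
}

install_wave depA_1.0.tar.gz depB_1.0.tar.gz   # wave 1: no mutual dependencies
install_wave top_1.0.tar.gz                    # wave 2: depends on wave 1
```

Within a wave the order in which the commands run (and print) is not fixed; the only guarantee is that a wave finishes before the next one starts.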

Under Windows 'make all' and 'make recommended' (but not 'make distribution') can each be done in parallel. There are some question marks over how well the 'make' used on Windows works in parallel (we found one case where it worked incorrectly and had to rethink share/make/winshlib.mk), so it should be used with caution.

Please give these new facilities a go and report (here) how you get on.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
