As I mentioned in my previous message about speeding up evalList, I've been looking at ways to speed up the R interpreter. One sees in the code many, many calls of PROTECT, UNPROTECT, and related functions, so that seems like an obvious target for optimization. Indeed, I've found that one can speed up the interpreter by about 10% by just changing these.
The functions are actually macros defined in Rinternals.h, but they end up just calling functions defined in memory.c (apparently as protect, etc., but really as Rf_protect, etc.). So there is function call overhead every time they are used.

To get rid of the function call overhead without generating lots of extra code, one can redefine the macros to handle the common case inline, and call the function in memory.c only for the uncommon cases (e.g., error on stack underflow). Here are my versions that do this:

    #define PROTECT(s) do { \
        SEXP tmp_prot_sexp = (s); \
        if (R_PPStackTop < R_PPStackSize) \
            R_PPStack[R_PPStackTop++] = tmp_prot_sexp; \
        else \
            Rf_protect(tmp_prot_sexp); \
      } while (0)

    #define UNPROTECT(n) (R_PPStackTop >= (n) ? \
        (void) (R_PPStackTop -= (n)) : Rf_unprotect(n))

    #define PROTECT_WITH_INDEX(x,i) do { \
        PROTECT(x); \
        *(i) = R_PPStackTop - 1; \
      } while (0)

    #define REPROTECT(x,i) ( (void) (R_PPStack[i] = x) )

Unfortunately, one can't just change the definitions in Rinternals.h. Some uses of PROTECT are in places where R_PPStack, etc. aren't visible. Instead, one can redefine them at the end of Defn.h, where these variables are declared. That alone doesn't work either, however, because some of the resulting references fail at link time. So instead I redefine them in Defn.h only if USE_FAST_PROTECT_MACROS is defined. I define this before including Defn.h in all the .c files in the main directory.

Another complication is that the PROTECT macro no longer returns its argument. One can avoid this by writing it another way, but this then results in its argument being referenced in two places (though only evaluated once), which seems to slow things down, presumably because the larger amount of code generated affects cache behaviour. Instead, I just changed the relatively few occurrences of v = PROTECT(...) to PROTECT(v = ...), which is the dominant idiom in the code anyway. (In some places slightly more change is needed, as when this is in an initializer.)
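For readers who want to try the idea outside the R sources, here is a minimal standalone sketch of the same pattern: an inline fast path for the common case, with an out-of-line function reserved for the rare error case. All names in it (obj_t, pp_stack, pp_stack_top, PP_STACK_SIZE, slow_calls, protect_slow, unprotect_slow, FAST_PROTECT, FAST_UNPROTECT, make_obj, protect_new_obj) are made up for illustration and are not R's. The slow paths only count how often they run, whereas the real functions in memory.c signal an error.

```c
/* Hypothetical stand-ins: none of these names come from the R sources. */
typedef void *obj_t;                  /* plays the role of SEXP */

#define PP_STACK_SIZE 4               /* tiny on purpose, so overflow is easy to hit */
static obj_t pp_stack[PP_STACK_SIZE]; /* plays the role of R_PPStack */
static int pp_stack_top = 0;          /* plays the role of R_PPStackTop */
static int slow_calls = 0;            /* how often an out-of-line path ran */

/* Out-of-line slow paths, playing the roles of Rf_protect and Rf_unprotect.
   They are reached only on overflow or underflow; this sketch just counts
   the event and leaves the stack unchanged. */
static void protect_slow(obj_t s) { (void) s; slow_calls++; }
static void unprotect_slow(int n) { (void) n; slow_calls++; }

/* Fast path inline: one comparison and one store in the common case.
   The argument is evaluated exactly once, into a temporary. */
#define FAST_PROTECT(s) do { \
    obj_t tmp_prot_obj = (s); \
    if (pp_stack_top < PP_STACK_SIZE) \
        pp_stack[pp_stack_top++] = tmp_prot_obj; \
    else \
        protect_slow(tmp_prot_obj); \
  } while (0)

/* Both arms of the conditional have type void, which C permits. */
#define FAST_UNPROTECT(n) (pp_stack_top >= (n) ? \
    (void) (pp_stack_top -= (n)) : unprotect_slow(n))

static int cell = 42;
static obj_t make_obj(void) { return &cell; }

/* FAST_PROTECT is a statement, not an expression, so
   "v = FAST_PROTECT(make_obj())" would not compile. The assignment
   moves inside the macro argument instead. */
static obj_t protect_new_obj(void)
{
    obj_t v;
    FAST_PROTECT(v = make_obj());
    return v;
}
```

A caller would pair protect_new_obj() with a later FAST_UNPROTECT(1). Note that, as in the macros above, the argument of FAST_UNPROTECT appears twice in the expansion, so it should not have side effects.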
This works fine, giving a speedup of about 10% for R programs that aren't dominated by large operations such as multiplying big matrices. The effect is cumulative with the change I mentioned in my previous message about avoiding extra CONS operations in evalList, for a total speedup of about 15%.

Radford Neal

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel