Hi everybody,

I'm currently writing a package that, for a given family of variance functions depending on a parameter theta, computes the extended quasi-likelihood (EQL) function for different values of theta.

The computation involves several calls to the 'glm' routine. What I'm doing now is calling 'lapply' on a list of theta values with a function that constructs a family object for the particular choice of theta, fits the glm, and uses the results to compute the EQL. Not surprisingly, the function is not very fast: depending on the size of the parameter space under consideration, it takes a couple of minutes to finish. Testing ~1000 parameters takes about 5 minutes on my machine.
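Roughly, the pattern looks like the sketch below. This is only an illustration: the data are toy data, quasipoisson() stands in for my theta-dependent family, and the returned value is a stand-in for the actual EQL computation.

## toy data just for illustration
set.seed(1)
dat <- data.frame(x = runif(100), y = rpois(100, 3) + 1)

eql.one <- function(theta, data) {
  ## in the real code the family's variance function depends on theta;
  ## quasipoisson() is only a placeholder here (theta is unused in this toy version)
  fit <- glm(y ~ x, family = quasipoisson(), data = data)
  -0.5 * deviance(fit)   # placeholder for the extended quasi-likelihood value
}

thetas <- seq(0.5, 3, length.out = 1000)
eql.values <- unlist(lapply(thetas, eql.one, data = dat))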

I know that loops in R are often slow, so I thought using 'lapply' would be a better way. But in the end it is just another kind of loop, and it involves some overhead for the function calls, so I'm not sure whether using 'lapply' is really the better choice.
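For example, would timing the two variants along these lines with system.time() be a sensible comparison? (f() here is just a cheap dummy workload, not my actual glm code, so I'm not sure how well such a micro-benchmark carries over.)

f <- function(i) sum(rnorm(100))           # cheap dummy workload

system.time(res1 <- lapply(1:10000, f))

system.time({
  res2 <- vector("list", 10000)            # preallocate instead of growing
  for (i in 1:10000) res2[[i]] <- f(i)
})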

What I would like to know is how to figure out where the bottleneck lies. Vectorization would help, but since I don't think there is a vectorized 'glm' function that can handle a vector of family objects, I'm not aware of any alternative to using a loop.

So my questions:
- how can I figure out where the bottleneck lies? (Is something like the Rprof sketch after this list the right approach?)
- is 'lapply' always superior to a loop in terms of execution time?
- are there any 'evil' commands that should be avoided in a loop because they slow down the computation?
- are there any good books or tutorials about how to profile R code efficiently?
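For the profiling part, would something along the following lines with Rprof()/summaryRprof() be the right way to go? (eql.for.thetas() is just a placeholder name for my wrapper that loops over all theta values.)

Rprof("eql.prof")
res <- eql.for.thetas(thetas, dat)     # a representative slow run
Rprof(NULL)
summaryRprof("eql.prof")$by.self       # time spent in each function itself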

Thanks in advance for your help,

Thorn
