Hi Nelissa, I hope the answers from Joshua and Milan helped you figure out a good solution.
So basically in R you have two big parallel frameworks: the parallel/snow approach described by Joshua, and the foreach package. If you were to use foreach, you would probably need a nested loop, which is described in detail in a vignette: http://stat.ethz.ch/CRAN/web/packages/foreach/vignettes/nested.pdf

I guess in any case you will want to be more specific in your loop about what is returned, and select precisely the parameter you want instead of returning the whole fitted object. In your case, the function getTh() will probably do the trick.

Now regarding the specifics of the threshold VECM: unfortunately the lm methods discussed by Joshua won't work here, and you cannot even use Joshua's trick of fitting multiple responses at once (a simple reason being that a VECM is already a multiple-response model). Threshold VECMs are really heavy models, so do expect a very long time to run each model, and hence an enormous time to run 3000 models! Do not hesitate to ask also on the dedicated mailing list: ts...@googlegroups.com

best

Message: 36
Date: Mon, 18 Feb 2013 08:50:04 -0800
From: Joshua Wiley <jwiley.ps...@gmail.com>
To: Milan Bouchet-Valat <nalimi...@club.fr>
Cc: "r-help@r-project.org" <r-help@r-project.org>, "Jamora, Nelissa" <nelissa.jam...@agr.uni-goettingen.de>
Subject: Re: [R] foreach loop, stata equivalent
Message-ID: <canz9z_ld_subcacw1xabbubqtds2kwu-9e2mn02f4g40h2u...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

If you have the same set of predictors for multiple outcomes, you can speed up the process considerably by taking advantage of the fact that the design matrix is the same for all of them. As an example:

set.seed(10)
y <- matrix(rnorm(10000 * 14), ncol = 14)
x <- matrix(rnorm(10000 * 2), ncol = 2)

system.time(res <- lapply(1:14, function(i) lm(y[, i] ~ x)))
##  user  system elapsed
##  0.34    0.00    0.34

system.time(res2 <- lm(y ~ x))
##  user  system elapsed
##  0.05    0.02    0.06

lm can accept a matrix as the dependent variable.
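To make Matthieu's nested-foreach suggestion above concrete, here is a minimal sketch of the loop structure only. Everything model-specific is a placeholder: fit_one() merely stands in for a real estimation call (e.g. tsDyn's TVECM() followed by getTh() to keep just the threshold), and the two grids are illustrative, not Nelissa's actual setup.

```r
library(foreach)

## Placeholder for the real estimation step. In practice this would fit
## the model and return ONLY the parameter of interest, e.g.
## getTh(TVECM(...)), rather than the whole fitted object.
fit_one <- function(a, b) {
  a + b / 10  # hypothetical scalar "result"
}

grid_a <- 1:3  # illustrative outer grid
grid_b <- 1:4  # illustrative inner grid

## %:% nests the two loops; .combine collects the scalar results into
## a compact matrix instead of a big list of fitted objects.
res <- foreach(a = grid_a, .combine = rbind) %:%
  foreach(b = grid_b, .combine = c) %do% {
    fit_one(a, b)
  }
```

Replacing %do% with %dopar% (after registering a backend, e.g. with doParallel) runs the same nested loop in parallel without further changes.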
So if various combinations of variables predict all 14 outcomes, do not fit 14 * (number of combinations of predictors) models separately; do them in chunks for substantial speedups.

Finally, as long as you are not using factors or any other fancy things, and can work with just raw data matrices, use lm.fit() instead of the high-level lm(). It is not especially clever: it just expects the design matrix and the response matrix. It will not add an intercept by default, so column-bind a vector of 1s onto your data.

system.time(res3 <- lm.fit(cbind(1, x), y))
##  user  system elapsed
##  0.02    0.00    0.01

These three methods produce identical results:

res[[1]]
## Call:
## lm(formula = y[, i] ~ x)
##
## Coefficients:
## (Intercept)           x1           x2
##   0.0014401   -0.0198232   -0.0005721

res2[[1]][, 1]
##   (Intercept)            x1            x2
##  0.0014401149 -0.0198232209 -0.0005720764

res3[[1]][, 1]
##            x1            x2            x3
##  0.0014401149 -0.0198232209 -0.0005720764

In short, taking advantage of fitting multiple responses at once (so the design matrix can be reused), and of lower-level functions when you have a simple, specific, repetitive task, takes the estimation time from 0.34 down to 0.01. You would need 34 cores to achieve that by simply throwing more hardware at the problem, as opposed to using more optimized code. Of course you can do both, and probably get the results pretty quickly.

Cheers,

Josh

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
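Joshua's closing point about "doing both" can be sketched as follows: chunk the 14 outcomes, fit each chunk with the low-level lm.fit(), and spread the chunks over worker processes with the base parallel package. The chunk split and the number of workers (2) are illustrative choices, not a recommendation.

```r
library(parallel)

set.seed(10)
y <- matrix(rnorm(10000 * 14), ncol = 14)
x <- matrix(rnorm(10000 * 2), ncol = 2)
X <- cbind(1, x)  # lm.fit() adds no intercept, so bind on a column of 1s

## Split the 14 response columns into two chunks of 7.
chunks <- split(seq_len(ncol(y)), rep(1:2, each = 7))

## Fit each chunk of responses on a separate worker; X and y are passed
## as extra arguments so nothing needs to be exported to the workers.
cl <- makeCluster(2)
fits <- parLapply(cl, chunks,
                  function(idx, X, y) lm.fit(X, y[, idx, drop = FALSE]),
                  X = X, y = y)
stopCluster(cl)

## Reassemble one 3 x 14 coefficient matrix, in the original column order.
coefs <- do.call(cbind, lapply(fits, coef))
```

Since every chunk reuses the same design matrix, the assembled coefficients match coef(lm(y ~ x)) up to coefficient names.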