[R] run many linear regressions against the same independent variables in batch
The R function lm(response ~ term) lets me run a linear regression on a single response vector. For example, I have one year of recent historical prices for a stock and for the S&P index. I can regress the stock prices (as the response vector) against the S&P index prices (as the term vector).

Now assume I have 1000 stocks to regress against the same S&P index prices. The only way I know is to write a loop and, in each iteration, run the regression for one stock. Is there a batch method to run the 1000 regressions in one shot?

Note that this functionality is available in SAS (the SAS procedure reg). In fact, we sometimes run such regressions for about 300K securities. Running the regressions in a loop takes a long time; by contrast, SAS is much faster.

Thank you in advance.

Heng Sun
212-855-5754
Director, Quantitative Risk
Depository Trust and Clearing Corporation

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] run many linear regressions against the same independent variables in batch
Thank you, Gabor. I suspected R could do that, but I tried a data frame and it did not work. Now I have tested with a matrix and it works. So it seems that running many regressions in one shot works for a matrix, but not for a data frame.

Heng

Gabor Grothendieck [EMAIL PROTECTED] wrote on 10/14/2005 12:05 PM:

> This runs a regression of each column (except the first) of matrix
> state.x77 against the first:
>
>   lm(state.x77[,-1] ~ state.x77[,1])
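The matrix-versus-data-frame behaviour described above can be sketched as follows. This is a minimal illustration using the built-in state.x77 matrix from Gabor's reply; the object names (x, Y, df, fit, fit_df, fit_fast) are made up for the example. A data frame is a list of columns rather than a numeric matrix, so converting the response columns with as.matrix() first should let the same one-shot call work. The lm.fit() call at the end is an optional shortcut for when only coefficients are needed; whether it helps at the scale of 300K securities is an assumption, not something measured here.

```r
# state.x77 ships with base R: a numeric matrix, 50 states x 8 variables.
x <- state.x77[, 1]    # common regressor (first column)
Y <- state.x77[, -1]   # one response per remaining column

# Matrix response: a single call fits all 7 regressions (class "mlm").
fit <- lm(Y ~ x)
coef(fit)              # 2 x 7 matrix: intercept and slope per response

# Data frame: convert the response columns to a matrix first.
df <- as.data.frame(state.x77)
fit_df <- lm(as.matrix(df[, -1]) ~ df[, 1])

# Lower-level alternative that skips the formula machinery; the design
# matrix (intercept column plus x) has to be built by hand.
fit_fast <- lm.fit(cbind(1, x), Y)
fit_fast$coefficients  # same 2 x 7 layout as coef(fit)
```

Each column of coef(fit) holds the same estimates that a separate lm() call on that response would return, so the loop over responses is no longer needed.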
Re: [R] KalmanLike: missing exogenous factor?
Prof Ripley,

Thanks for the explanation. I now understand where KalmanLike fits.

I should not have used the term "exogenous factor". It should be called an exogenous variable, an input, or a known effect. In my study of how trading sizes affect stock prices, trading size is this exogenous variable. As you said, this should belong to some package. I searched the internet and found something similar, but nothing that covers my case.

Restricted web access at my workplace makes it inconvenient to subscribe from there. Sorry.

Heng Sun
Senior Quantitative Analyst
Depository Trust and Clearing Corporation
New York, USA 10041
Tel: 212-755-5754

--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

> On Mon, 11 Oct 2004, Heng Sun wrote:
>
> > So what is the rationale for R to leave off the exogenous factor? Is
> > there a feasible way to convert the general model to the simple model
> > in R?
>
> What is the rationale for your raising this? KalmanLike, KalmanRun, etc
> were written for R 1.5.0 as part of the ts package (see my article in
> R-news), and the ts applications (see the See Also section) do not need
> a so-called `exogenous factor' (which is not a `factor').
>
> R does not pretend to have facilities for whatever subject area you mean
> (but do not say) by `in practice'. That's what addon packages are for
> (and some do touch on this area).
>
> We have no idea who `[EMAIL PROTECTED]' is: it is courteous to use a
> signature and give your affiliation.
>
> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] KalmanLike: missing exogenous factor?
From the help document on KalmanLike, KalmanRun, etc., I see that the linear Gaussian state space model is

    a <- T a + R e
    y = Z' a + eta

following the book of Durbin and Koopman. In practice, it is useful to run Kalman filtering/smoothing/forecasting with an exogenous factor:

    a <- T a + L b + R e
    y = Z' a + M b + eta

where b is some known vector (a function of time).

Some other software, such as S-plus and Mathematica, includes the above exogenous factor. SsfPack by Koopman et al. also has the factor built into the model to accommodate practical uses.

So what is the rationale for R to leave off the exogenous factor? Is there a feasible way to convert the general model to the simple model in R?

Thanks,
Heng Sun
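One partial route for the question above, offered as a sketch rather than a general answer: arima() in the stats package accepts an xreg argument and fits a regression on xreg with ARIMA errors. That covers the special case where the known input b enters only through the M b term of the observation equation and the remaining dynamics are ARIMA. It does not handle the L b term in the state equation or an arbitrary state space model. The simulated series, the input b, and the value M = 2 below are all invented for illustration.

```r
set.seed(1)
n <- 200

# Hypothetical known input (a function of time) and AR(1) disturbance.
b <- sin(seq_len(n) / 10)
u <- arima.sim(list(ar = 0.6), n = n, sd = 0.1)

# Observation = M * b + ARMA disturbance, with M = 2 in this toy example.
y <- 2 * b + u

# Regression on b with AR(1) errors; the likelihood is evaluated through
# a state space representation of the error process.
fit <- arima(y, order = c(1, 0, 0), xreg = b, include.mean = FALSE)
coef(fit)   # AR coefficient first, then the coefficient on b
```

If M is known in advance instead of estimated, the same idea reduces to subtracting M b from y and modelling the remainder with the simple model.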