[R] run many linear regressions against the same independent variables in batch

2005-10-14 Thread Heng Sun
The R function
lm(response ~ term)
runs a linear regression on a single response vector. For example,
I have one year of recent historical prices for a stock and for the
S&P index. I can regress the stock prices (the response vector)
against the S&P index prices (the term vector).

Now suppose I have 1000 stocks to regress against the same S&P index
prices. The only way I know is to write a loop and run the regression
for one stock in each iteration.
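
For reference, the loop described above might look like the following minimal sketch. The data are simulated, and the names `index`, `stocks`, and `coefs` and the sizes are illustrative assumptions, not anything from the original post.

```r
# Simulated data; all names and sizes here are illustrative assumptions.
set.seed(1)
n_days   <- 250                          # roughly one year of trading days
n_stocks <- 5                            # small stand-in for the 1000 stocks
index  <- cumsum(rnorm(n_days)) + 100    # hypothetical index prices
stocks <- sapply(seq_len(n_stocks),
                 function(j) 10 + j * index + rnorm(n_days))
colnames(stocks) <- paste0("stock", seq_len(n_stocks))

# One lm() call per stock, collecting the intercept and slope each time.
coefs <- matrix(NA_real_, nrow = 2, ncol = n_stocks,
                dimnames = list(c("intercept", "slope"), colnames(stocks)))
for (j in seq_len(n_stocks)) {
  fit <- lm(stocks[, j] ~ index)
  coefs[, j] <- coef(fit)
}
coefs
```

Each iteration rebuilds the model frame and refactors the same design matrix, which is one plausible reason the loop becomes slow for very many responses.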

Is there a batch method to run the 1000 regressions in one shot? Note
that this functionality is available in SAS (the SAS procedure reg).

In fact, we sometimes run such regressions for about 300K securities.
Running the regressions in a loop takes a long time. By contrast,
running them in SAS is much faster.

Thank you in advance.

Heng Sun
212-855-5754

Director
Quantitative Risk
Depository Trust and Clearing Corporation


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] run many linear regressions against the same independent variables in batch

2005-10-14 Thread Heng Sun
Thank you, Gabor.

I suspected R could do that, but I tried it with a data frame and it
did not work. I have now tested it with a matrix and it works. So it
seems that running many regressions in one shot works for a matrix
response, but not for a data frame.
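
If the responses are stored in a data frame, one possible workaround is to convert the numeric columns to a matrix first. The sketch below uses a made-up data frame (`df`, `y1`, `y2`, and `x` are hypothetical names) and assumes all response columns are numeric.

```r
# Hypothetical data frame of responses; column names are illustrative.
set.seed(2)
x  <- rnorm(50)
df <- data.frame(y1 = 1 + 2 * x + rnorm(50),
                 y2 = 3 - 1 * x + rnorm(50))

# lm() accepts a numeric matrix on the left-hand side, so convert first.
fit <- lm(as.matrix(df) ~ x)
class(fit)   # includes "mlm" for a multi-response fit
coef(fit)    # one column of coefficients per response
```

The column names of the data frame carry over to the coefficient matrix, so each fitted regression can still be identified by name.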

Heng

On 10/14/2005 12:05 PM, Gabor Grothendieck wrote:

This runs a regression of each column (except the first)
of matrix state.x77 against the first:

lm(state.x77[,-1] ~ state.x77[,1])
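
As a small follow-up sketch, the coefficients of the individual regressions can be read off the fitted object by column. The variable names `fit` and `single` are illustrative; `state.x77` is the built-in dataset used above.

```r
# Fit all responses at once: every column except the first against the first.
fit <- lm(state.x77[, -1] ~ state.x77[, 1])
dim(coef(fit))          # 2 rows (intercept, slope), one column per response
coef(fit)[, "Income"]   # coefficients for a single response

# Cross-check against a separate single-response fit.
single <- lm(state.x77[, "Income"] ~ state.x77[, 1])
all.equal(unname(coef(fit)[, "Income"]), unname(coef(single)))
```

The batch fit and the one-at-a-time fit should agree because each response is regressed on the same design matrix independently of the others.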



Re: [R] KalmanLike: missing exogenous factor?

2004-10-12 Thread Heng Sun
Prof Ripley,

Thanks for the explanation. I now understand where
KalmanLike fits.

I should not have said "exogenous factor". It should be
called an exogenous variable, inputs, or known effects.
In my study of how trading sizes affect stock prices,
trading size is this exogenous variable. As you said,
this should belong in some package. I searched the
internet and found something similar, but nothing that
covers my case.

Restricted web access at my workplace makes subscribing
from there inconvenient. Sorry.

Heng Sun
Senior Quantitative Analyst
Depository Trust and Clearing Corporation
New York, USA 10041
Tel: 212-755-5754


--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

> On Mon, 11 Oct 2004, Heng Sun wrote:
>
> > From the help document on KalmanLike, KalmanRun, etc.,
> > I see the linear Gaussian state space model is
> >
> > a <- T a + R e
> > y = Z' a + eta
> >
> > following the book of Durbin and Koopman.
> >
> > In practice, it is useful to run Kalman
> > filtering/smoothing/forecasting with exogenous factor:
> >
> > a <- T a + L b + R e
> > y = Z' a + M b + eta
> >
> > where b is some known vector (a function of time).
> >
> > Some other software like S-plus and Mathematica
> > include the above exogenous factor. SsfPack by
> > Koopman, et al. also has the factor built in the model
> > to accommodate practical uses.
> >
> > So what is the rationale for R to leave off the
> > exogenous factor? Is there a feasible way to convert
> > the general model to the simple model in R?
>
> What is the rationale for your raising this?
>
> KalmanLike, KalmanRun, etc were written for R 1.5.0 as part of the ts
> package (see my article in R-news), and the ts applications (see the See
> Also section) do not need a so-called `exogenous factor' (which is not a
> `factor').  R does not pretend to have facilities for whatever subject
> area you mean (but do not say) by `in practice'.  That's what addon
> packages are for (and some do touch on this area).
>
> We have no idea who `[EMAIL PROTECTED]' is: it is courteous to use a
> signature and give your affiliation.
>
> --
> Brian D. Ripley,                  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595





[R] KalmanLike: missing exogenous factor?

2004-10-11 Thread Heng Sun
From the help document on KalmanLike, KalmanRun, etc.,
I see the linear Gaussian state space model is

a <- T a + R e
y = Z' a + eta

following the book of Durbin and Koopman.

In practice, it is useful to run Kalman
filtering/smoothing/forecasting with an exogenous factor:

a <- T a + L b + R e
y = Z' a + M b + eta

where b is some known vector (a function of time).

Some other software, such as S-plus and Mathematica,
includes the above exogenous factor. SsfPack by
Koopman et al. also has the factor built into the model
to accommodate practical uses.

So what is the rationale for R to leave out the
exogenous factor? Is there a feasible way to convert
the general model to the simple model in R?
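
To make the extended model concrete, here is a minimal hand-written filter sketch for the equations with known inputs b. It is not part of KalmanLike or any R package: `kalman_exog` and all its argument names are hypothetical, y is assumed univariate, `RQR` stands for the state noise covariance R Q R', and b_t is assumed to enter at the same time step as the state it shifts.

```r
# Hypothetical helper (not R's KalmanLike API): a plain Kalman filter for
#   a_t = T a_{t-1} + L b_t + R e_t,   e_t   ~ N(0, Q)
#   y_t = Z' a_t    + M b_t + eta_t,   eta_t ~ N(0, h)
# with univariate y and known inputs b (one row of `b` per time point).
kalman_exog <- function(y, b, Tm, Z, L, M, RQR, h, a0, P0) {
  n <- length(y)
  a <- a0
  P <- P0
  filtered <- matrix(NA_real_, n, length(a0))
  loglik <- 0
  for (t in seq_len(n)) {
    bt <- b[t, ]
    # Prediction: known inputs shift the state mean, not its variance.
    a <- drop(Tm %*% a + L %*% bt)
    P <- Tm %*% P %*% t(Tm) + RQR
    # Innovation: also subtract the known observation effect M b_t.
    v  <- y[t] - sum(Z * a) - sum(M * bt)
    PZ <- drop(P %*% Z)
    Fv <- sum(Z * PZ) + h
    # Update step.
    a <- a + PZ * v / Fv
    P <- P - tcrossprod(PZ) / Fv
    filtered[t, ] <- a
    loglik <- loglik - 0.5 * (log(2 * pi * Fv) + v^2 / Fv)
  }
  list(filtered = filtered, loglik = loglik)
}

# Simulated scalar example; all parameter values are made up.
set.seed(3)
n <- 200
b <- matrix(sin(seq_len(n) / 10), ncol = 1)    # known input series
a_true <- numeric(n)
prev <- 0
for (t in seq_len(n)) {
  prev <- 0.8 * prev + 0.5 * b[t, 1] + rnorm(1, sd = 0.1)
  a_true[t] <- prev
}
y <- a_true + 2 * b[, 1] + rnorm(n, sd = 0.2)

out <- kalman_exog(y, b, Tm = matrix(0.8), Z = 1, L = matrix(0.5), M = 2,
                   RQR = matrix(0.01), h = 0.04, a0 = 0, P0 = matrix(1))
cor(out$filtered[, 1], a_true)
```

Because b is known, it only shifts the predicted state and the predicted observation; the variance recursions are the same as in the model without inputs.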

Thanks,

Heng Sun


