Thanks a lot for the recommendations - some of them I am implementing already.
Just a clarification: the only reason I try to compare things to SPSS
is that I am the only person in my office using R. Whenever I work on
R code, my goal is not just to make it work, but also to "boast" to
the SPSS users that it's much easier/faster/niftier in R.
So, you are preaching to the choir here.
Dimitri

On Thu, Aug 4, 2011 at 4:02 PM, Joshua Wiley <[email protected]> wrote:
>
>
> On Aug 4, 2011, at 11:46, Dimitri Liakhovitski
> <[email protected]> wrote:
>
>> Thanks a lot, guys!
>> It's really helpful. But - to be objective - it's still quite a few
>> lines longer than in SPSS.
>
> Not once you've sourced the function! For the simple case of a vector, try:
>
> X <- 1:10
> mylag2 <- function(X, lag) {
>   # pad the front with NAs and drop the last 'lag' values
>   c(rep(NA, length(seq(lag))), head(X, -lag))
> }
>
> Though this does not work for lead, it is fairly short. Then you could use
> the *apply family if you needed it on multiple columns or vectors.
>
> Cheers,
>
> Josh
>
>> Dimitri
>>
>> On Thu, Aug 4, 2011 at 2:36 PM, Daniel Nordlund <[email protected]>
>> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:[email protected]]
>>>> On Behalf Of Dimitri Liakhovitski
>>>> Sent: Thursday, August 04, 2011 8:24 AM
>>>> To: r-help
>>>> Subject: [R] Efficient way of creating a shifted (lagged) variable?
>>>>
>>>> Hello!
>>>>
>>>> I have a data set:
>>>> set.seed(123)
>>>> y<-data.frame(week=seq(as.Date("2010-01-03"), as.Date("2011-01-31"),by="week"))
>>>> y$var1<-c(1,2,3,round(rnorm(54),1))
>>>> y$var2<-c(10,20,30,round(rnorm(54),1))
>>>>
>>>> # All I need is to create lagged variables for var1 and var2. I looked
>>>> around a bit and found several ways of doing it. They all seem quite
>>>> complicated - while in SPSS it's just a few letters (like LAG()). Here
>>>> is what I've written but I wonder. It works - but maybe there is a
>>>> very simple way of doing it in R that I could not find?
>>>> I need the same for "lead" (opposite of lag).
>>>> Any hint is greatly appreciated!
>>>>
>>>> ### The function I created:
>>>> mylag <- function(x,max.lag=1){   # x has to be a 1-column data frame
>>>>   temp<-as.data.frame(embed(c(rep(NA,max.lag),x[[1]]),max.lag+1))[2:(max.lag+1)]
>>>>   for(i in 1:length(temp)){
>>>>     names(temp)[i]<-paste(names(x),".lag",i,sep="")
>>>>   }
>>>>   return(temp)
>>>> }
>>>>
>>>> ### Running mylag to get my result:
>>>> myvars<-c("var1","var2")
>>>> for(i in myvars) {
>>>>   y<-cbind(y,mylag(y[i],max.lag=2))   # max.lag belongs inside the mylag() call
>>>> }
>>>> (y)
>>>>
>>>> --
>>>> Dimitri Liakhovitski
>>>> marketfusionanalytics.com
>>>>
>>>
>>> Dimitri,
>>>
>>> I would first look into the zoo package as has already been suggested.
>>> However, if you haven't already got your solution then here are a couple of
>>> functions that might help you get started. I won't vouch for efficiency.
>>>
>>>
>>> lag.fun <- function(df, x, max.lag=1) {
>>>   for(i in x) {
>>>     for(j in 1:max.lag){
>>>       lagx <- paste(i,'.lag',j,sep='')
>>>       df[,lagx] <- c(rep(NA,j),df[1:(nrow(df)-j),i])
>>>     }
>>>   }
>>>   df
>>> }
>>>
>>> lead.fun <- function(df, x, max.lead=1) {
>>>   for(i in x) {
>>>     for(j in 1:max.lead){
>>>       leadx <- paste(i,'.lead',j,sep='')
>>>       df[,leadx] <- c(df[(j+1):(nrow(df)),i],rep(NA,j))
>>>     }
>>>   }
>>>   df
>>> }
>>>
>>> y <- lag.fun(y,myvars,2)
>>> y <- lead.fun(y,myvars,2)
>>>
>>>
>>> Hope this is helpful,
>>>
>>> Dan
>>>
>>> Daniel Nordlund
>>> Bothell, WA USA
>>>
>>>
>>>
>>
>>
>>
>> --
>> Dimitri Liakhovitski
>> marketfusionanalytics.com
>>
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
--
Dimitri Liakhovitski
marketfusionanalytics.com
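
For reference, here is a minimal base-R sketch of one helper that covers both lag and lead for a plain vector, which is the piece the thread leaves open. The name shift_vec and its argument k are made up for illustration; they are not from the thread or from any package. It assumes a whole-number shift no larger than the vector length, and the small data frame d is invented for the example.

```r
# Hypothetical helper, not from the thread or any package.
# k > 0 lags x by k positions; k < 0 leads it by |k| positions.
shift_vec <- function(x, k = 1) {
  n <- length(x)
  stopifnot(abs(k) <= n)
  pad <- rep(NA_integer_, abs(k))
  # Indexing with NA returns NA of the same type as x, so x keeps its type.
  idx <- if (k >= 0) c(pad, seq_len(n - k)) else c(seq_len(n + k) - k, pad)
  x[idx]
}

x <- 1:5
shift_vec(x, 1)    # NA  1  2  3  4
shift_vec(x, -1)   #  2  3  4  5 NA

# Applying it to several columns of a small made-up data frame
d <- data.frame(var1 = c(1, 2, 3, 4), var2 = c(10, 20, 30, 40))
for (v in c("var1", "var2")) {
  d[[paste0(v, ".lag1")]]  <- shift_vec(d[[v]], 1)
  d[[paste0(v, ".lead1")]] <- shift_vec(d[[v]], -1)
}
d
```

Building an index vector and subsetting once keeps lag and lead in the same code path. For longer series or irregular dates, the zoo package suggested earlier in the thread is probably the better starting point.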

