Hi Faradj,
Yes, the function expects at least three values for each country. Glad it worked.
Jim

On Tue, May 29, 2018 at 10:53 PM, Faradj Koliev <farad...@gmail.com> wrote:
> Dear Jim,
>
> wow! It worked! Thanks a lot.
>
> I did as you suggested and it worked well with the real data, although it
> gave me this error: Error in if (!is.na(x$Y[i])) { : argument is of length
> zero. For some reason X1 had fewer observations than the data. But it's
> not a big deal: I identified those cases and simply deleted them from the
> data (they were countries that appeared only twice in the data, e.g.
> USSR, Yugoslavia, etc.).
>
> Best,
> Faradj
>
>
> On 29 May 2018 at 02:15, Jim Lemon <drjimle...@gmail.com> wrote:
>
> Hi Faradj,
> What a problem! I think I have worked it out, but only because the
> result is the one you said you wanted.
>
> # the sample data frame is named fkdf
> Y2Xby3<-function(x) {
>  nrows<-dim(x)[1]
>  X<-rep(0,nrows)
>  for(i in 1:(nrows-2)) {
>   if(!is.na(x$Y[i])) {
>    if(x$Y[i] == 1 && any(is.na(x$Y[(i+1):(i+2)]))) X[i]<-1
>    if(i > 1) {
>     if(X[i-1] == 1) X[i]<-0
>    }
>   }
>   else {
>    if(!is.na(x$Y[i+1])) {
>     if(x$Y[i+1] == 1 && is.na(x$Y[i+2]) && X[i] == 0)
>      X[i+1]<-1
>    }
>   }
>  }
>  return(X)
> }
> countries<-as.character(unique(fkdf$country))
> X1<-NULL
> for(country in countries)
>  X1<-c(X1,Y2Xby3(fkdf[fkdf$country == country,]))
> X1
>  [1] 1 0 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0
> [38] 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 0
> [75] 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0
>
> fkdf$X
>
>  [1] 1 0 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0
> [38] 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 0
> [75] 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0
>
> Jim
>
> On Mon, May 28, 2018 at 8:43 PM, Faradj Koliev <farad...@gmail.com> wrote:
>
> Hi everyone,
>
> I am trying to generate a conditional dummy variable "X" with the following
> rules
>
> set X=1 if Y is =1, two
> years prior to the NA. [0,0,NA].
>
> For example, if the pattern for Y is 0,0,NA then the X variable is =0 for
> both of the two years prior to the NA. If the pattern for Y is 0,1,NA or
> 1,0,NA then X=1. To be clear, if 1,1,NA then X=1 for the first of those
> years only; it should count once (X=1), not twice.
>
> The code that I have now is not complete and I would appreciate some advice
> here. This is the code:
> dat2 <- dat1 %>%
>   group_by(country) %>%
>   group_by(grp = cumsum(is.na(lag(Y))), add = TRUE) %>%
>   mutate(first_year_at_1 = match(1, Y) * any(is.na(Y)) * any(tail(Y, 3) == 1L),
>          X = {x <- integer(length(Y)) ; x[first_year_at_1] <- 1L ; x}) %>%
>   ungroup()
>
> It doesn't really generate what I described above. Any help here would be
> much appreciated.
>
> Below you can see my sample data with the desired outcome "X" dummy in it.
>
> Thank you!
>
> dput(data)
>
> structure(list(year = c(1991L, 1992L, 1993L, 1994L, 1995L, 1996L,
> 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L,
> 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L,
> 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L,
> 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L,
> 2011L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L,
> 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
> 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 1993L,
> 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L,
> 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L,
> 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L,
> 1999L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L,
> 2007L, 2008L, 2009L, 2010L, 2011L), country = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Canada",
> "Cuba", "Dominican Republic", "Haiti", "Jamaica"), class = "factor"),
> Y = c(1L, NA, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L,
> 1L, NA, 1L, NA, 1L, NA, 1L, NA, NA, 1L, 1L, NA, NA, 1L, NA,
> 1L, NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA,
> NA, 1L, NA, 1L, 0L, 0L, 0L, 1L, NA, 0L, 1L, 0L, 0L, 0L, 0L,
> 0L, 1L, NA, 0L, 1L, 1L, NA, 0L, 1L, NA, 1L, NA, 1L, NA, 1L,
> NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L,
> 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, NA, 0L, 1L, 1L, 1L,
> NA, 1L, NA, 0L, 1L, 1L, NA), X = c(1L, 0L, 0L, 1L, 0L, 0L,
> 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
> 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L,
> 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L,
> 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L,
> 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L)), .Names = c("year",
> "country", "Y", "X"), class = "data.frame", row.names = c(NA,
> -110L))
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
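
The "argument is of length zero" error reported in the thread has a plain cause: when a country has fewer than three rows, 1:(nrows-2) counts down to 0 (or below), x$Y[0] is a zero-length vector, and if() cannot test it. The loop over countries stops at that point, which is why X1 came out shorter than the data. Below is a minimal sketch of a guarded version. The name Y2Xby3_safe and the toy data frame are hypothetical, and returning all zeros for short groups is an assumption, not something the thread settles.

```r
# Hypothetical guarded variant of the Y2Xby3() logic from the thread.
Y2Xby3_safe <- function(x) {
  nrows <- nrow(x)
  X <- rep(0, nrows)
  # Assumption: a group with fewer than three rows gets X = 0 throughout.
  if (nrows < 3) return(X)
  # seq_len() never counts backwards, unlike 1:(nrows - 2).
  for (i in seq_len(nrows - 2)) {
    if (!is.na(x$Y[i])) {
      # Y is 1 and an NA follows within the next two rows
      if (x$Y[i] == 1 && any(is.na(x$Y[(i + 1):(i + 2)]))) X[i] <- 1
      # count only once if the previous row was already flagged
      if (i > 1 && X[i - 1] == 1) X[i] <- 0
    } else if (!is.na(x$Y[i + 1])) {
      if (x$Y[i + 1] == 1 && is.na(x$Y[i + 2]) && X[i] == 0) X[i + 1] <- 1
    }
  }
  X
}

# Toy data (made up): country "B" has only two rows.
toy <- data.frame(country = c(rep("A", 5), rep("B", 2)),
                  Y = c(1, NA, 1, 1, NA, 1, NA))
X1 <- NULL
for (country in as.character(unique(toy$country)))
  X1 <- c(X1, Y2Xby3_safe(toy[toy$country == country, ]))
# X1 now has one value per row of toy, including the two-row country
```

With the guard in place the short countries can stay in the data; whether zeros are the right values for them depends on how the rule should treat series that short.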