Re: [R] lag, count

2016-10-15 Thread jim holtman
Here is a solution using 'dplyr'

> require(dplyr)
> lag<-read.table(text=" ID, y1, y2
+ 1,0,12/25/2014
+ 1,125,9/15/2015
+ 1,350,1/30/2016
+ 2,0,12/25/2012
+ 2,450,9/15/2014
+ 2,750,1/30/2016
+ 2,  656, 11/30/2016
+ ",sep=",",header=TRUE)
>
> new_lag <- lag %>%
+ mutate(y2 = as.Date(y2, format = "%m/%d/%Y")) %>%  # convert date
+ arrange(ID, y2) %>%  # sort if necessary
+ group_by(ID) %>%
+ mutate(flag = seq(n()),
+ y1diff = c(0, diff(y1)),
+ y2diff = c(0, diff(y2))
+ )
>
>
> new_lag
Source: local data frame [7 x 6]
Groups: ID [2]

     ID    y1         y2  flag y1diff y2diff
1     1     0 2014-12-25     1      0      0
2     1   125 2015-09-15     2    125    264
3     1   350 2016-01-30     3    225    137
4     2     0 2012-12-25     1      0      0
5     2   450 2014-09-15     2    450    629
6     2   750 2016-01-30     3    300    502
7     2   656 2016-11-30     4    -94    305
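
The same result can also be sketched with dplyr's window helpers row_number() and lag(). This is an alternative under stated assumptions, not the code posted above: the data frame is called dat (a name chosen here so that it does not share a name with dplyr::lag()), and as.numeric() turns the date difference into a plain number of days.

```r
library(dplyr)

# Same data as in the thread; `dat` is a name made up for this sketch.
dat <- read.table(text = "ID,y1,y2
1,0,12/25/2014
1,125,9/15/2015
1,350,1/30/2016
2,0,12/25/2012
2,450,9/15/2014
2,750,1/30/2016
2,656,11/30/2016", sep = ",", header = TRUE, stringsAsFactors = FALSE)

res <- dat %>%
  mutate(y2 = as.Date(y2, format = "%m/%d/%Y")) %>%  # convert before sorting
  arrange(ID, y2) %>%
  group_by(ID) %>%
  mutate(flag   = row_number(),                      # 1, 2, ... within each ID
         # current row minus previous row; the first row is compared with itself
         y1diff = y1 - dplyr::lag(y1, default = first(y1)),
         y2diff = as.numeric(y2 - dplyr::lag(y2, default = first(y2)))) %>%
  ungroup()

res
```

Using default = first(...) makes the first difference in each group 0 without having to prepend it by hand.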

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Sat, Oct 15, 2016 at 2:54 PM, Rui Barradas  wrote:
> I forgot about the sorting part and assumed the data.frame was already
> sorted. If not, after converting y2 to class Date, you can do
>
> lag <- lag[order(lag$ID, lag$y2), ]
>
> Rui Barradas

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lag, count

2016-10-15 Thread Rui Barradas
I forgot about the sorting part and assumed the data.frame was already 
sorted. If not, after converting y2 to class Date, you can do


lag <- lag[order(lag$ID, lag$y2), ]
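
A minimal sketch of why the conversion has to come first: ordering the raw month/day/year strings sorts them as text, not chronologically. The three dates are taken from ID 2 of the data; the names raw and d are made up for this illustration.

```r
raw <- c("9/15/2014", "12/25/2012", "1/30/2016")
d   <- as.Date(raw, format = "%m/%d/%Y")

# Text order compares characters, so "9/15/2014" ends up last
# even though it is the middle date chronologically.
text_sorted <- raw[order(raw)]
text_sorted

# Date order is chronological.
date_sorted <- format(d[order(d)])
date_sorted
```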

Rui Barradas



Re: [R] lag, count

2016-10-15 Thread Rui Barradas

Hello,

Try the following.


lag<-read.table(text=" ID, y1, y2
1,0,12/25/2014
1,125,9/15/2015
1,350,1/30/2016
2,0,12/25/2012
2,450,9/15/2014
2,750,1/30/2016
2,  656, 11/30/2016
",sep=",",header=TRUE)

str(lag)
lag$y2 <- as.Date(lag$y2, format = "%m/%d/%Y")
str(lag)

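# 1) Number the rows within each ID (assumes the rows are already sorted).
flag <- ave(lag$ID, lag$ID, FUN = seq_along)
lag2 <- cbind(lag[1], flag, lag[-1])

# 2) Within each ID: current row minus previous row, with 0 for the first row.
y1dif <- ave(lag2$y1, lag2$ID,
             FUN = function(y) c(0, y[-1] - y[-length(y)]))
# For the dates, tapply() returns one vector per ID and unlist() joins them.
y2dif <- unlist(tapply(lag2$y2, lag2$ID,
                       FUN = function(y) c(0, y[-1] - y[-length(y)])))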


lag2 <- cbind(lag2, y1dif, y2dif)
lag2
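
One caveat, offered as a sketch rather than a correction: unlist(tapply(...)) returns the differences grouped by ID, so they line up with the rows only when the data frame is already ordered by ID. An ave()-based variant puts each result back in its own row's position. The names df and y2dif_ave are invented here, and the rows are deliberately interleaved to show the effect; within each ID the dates still need to be in ascending order.

```r
df <- data.frame(
  ID = c(2, 1, 2, 1),
  y2 = as.Date(c("2012-12-25", "2014-12-25", "2014-09-15", "2015-09-15"))
)

# Convert the dates to day counts, then take differences within each ID.
y2dif_ave <- ave(as.numeric(df$y2), df$ID,
                 FUN = function(y) c(0, diff(y)))

cbind(df, y2dif_ave)
```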

Hope this helps,

Rui Barradas

On 15-10-2016 17:57, Val wrote:

Hi all,

1. I want to sort the data by ID and y2, then number the rows within
each ID: assign a "flag" variable to each row, counting from the first
row to the last.
For instance, in the following data ID "1" has three rows, and they
are assigned flags 1, 2, 3 in order.

2. In the second step, within each ID, I want to get the difference
between consecutive rows for y1 and for y2 (a date).
Within each ID the first value of y1dif and y2dif is always 0. Each
later value is the current row minus the previous row.
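
As a worked illustration of the rule just described, here are the values for ID 1 computed with diff(), which returns each element minus the one before it. The vector names are placeholders for this sketch.

```r
y1 <- c(0, 125, 350)
y2 <- as.Date(c("12/25/2014", "9/15/2015", "1/30/2016"), format = "%m/%d/%Y")

y1dif <- c(0, diff(y1))              # 0 125 225
y2dif <- c(0, as.numeric(diff(y2)))  # 0 264 137 (days)

y1dif
y2dif
```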



lag<-read.table(text=" ID, y1, y2
1,0,12/25/2014
1,125,9/15/2015
1,350,1/30/2016
2,0,12/25/2012
2,450,9/15/2014
2,750,1/30/2016
2,  656, 11/30/2016
",sep=",",header=TRUE)

The desired output looks as follows:

ID,flag,y1,y2,y1dif,y2dif
1,1,0,12/25/2014,0,0
1,2,125,9/15/2015,125,264
1,3,350,1/30/2016,225,137
2,1,0,12/25/2012,0,0
2,2,450,9/15/2014,450,629
2,3,750,1/30/2016,300,502
2,4,656,11/30/2016,-94,305

Thank you


