Hi All,

I'm having trouble selecting rows to delete, and I can't seem to get past it.

Below is some sample data. I am trying to dedup the data so that each user
has at most one row per timestamp (at the side I have marked the rows I
expect to be removed).

I've looked at the lag function but can't seem to make it work.

My plan was to flag the duplicate rows with an ifelse statement and then
remove the flagged rows, but it doesn't seem to work. Any help appreciated.

Let's call the data test:

test$lag <- ifelse(test$user_id == lag(test$user_id) &
                   test$timestamp == lag(test$timestamp), 1, 0)
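
A guess at why the line above misbehaves: as far as I understand, base R's lag() is meant for time series and only shifts the time index, leaving the values where they are, so each row would end up being compared with itself. Below is a minimal sketch of the shifted comparison I had in mind. prev() is a hypothetical helper (not a base R function), and the data frame holds only four rows copied from the sample. It assumes the rows are already ordered so that duplicates sit next to each other.

```r
# prev() is a hypothetical helper: shift a vector down by one position,
# padding the first element with NA.
prev <- function(x) c(NA, head(x, -1))

# Four rows copied from the sample data (row names omitted).
test <- data.frame(
  Source_type = c(1, 2, 2, 2),
  timestamp   = c("07-07-2008-13:39:17", "07-07-2008-13:39:55",
                  "07-07-2008-13:39:55", "07-07-2008-13:56:05"),
  user_id     = c("848273633667", "848273633667",
                  "848273633667", "848273633667"),
  stringsAsFactors = FALSE
)

# TRUE where a row has the same user and timestamp as the row before it;
# NA for the first row, which has no predecessor.
same_as_prev <- test$user_id == prev(test$user_id) &
  test$timestamp == prev(test$timestamp)

# Flag as 1/0, treating the first row as "not a duplicate".
test$lag <- ifelse(is.na(same_as_prev), 0, as.numeric(same_as_prev))

# Drop the flagged rows.
result <- test[test$lag == 0, ]
```

This only catches duplicates that are adjacent, so the data would need sorting by user_id and timestamp first if that is not guaranteed.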

Can anyone help with this?

Mike



      Source_type           timestamp      user_id
75381           0 07-07-2008-21:03:55 848307909687
75379           1 07-07-2008-19:52:55 848307838407
75380           2 07-07-2008-19:54:14 848307838407
75378           1 07-07-2008-15:24:01 848285633277
75374           1 07-07-2008-13:39:17 848273633667
75377           2 07-07-2008-13:39:55 848273633667
75376           2 07-07-2008-13:39:55 848273633667    Remove
75375           2 07-07-2008-13:56:05 848273633667
75373           1 07-07-2008-17:11:00 848272661427
75371           1 07-07-2008-13:19:00 848270431847
75372           2 07-07-2008-13:19:14 848270431847
75369           1 07-07-2008-12:49:16 848269676907    Remove
75370           2 07-07-2008-12:49:16 848269676907
75366           1 07-07-2008-13:29:15 848263484847
75368           2 07-07-2008-13:29:44 848263484847
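
For reference, the result I am after could perhaps be sketched with duplicated() on the user_id and timestamp columns. This is only a sketch on six rows copied from the sample above, not a tested solution for the full data. duplicated() keeps the first row of each duplicate group, whereas for the 12:49:16 pair I marked the first row for removal, so which row of a pair should survive is an assumption here; fromLast = TRUE would keep the last one instead.

```r
# Six rows copied from the sample data (row names omitted).
test <- data.frame(
  Source_type = c(1, 2, 2, 2, 1, 2),
  timestamp   = c("07-07-2008-13:39:17", "07-07-2008-13:39:55",
                  "07-07-2008-13:39:55", "07-07-2008-13:56:05",
                  "07-07-2008-12:49:16", "07-07-2008-12:49:16"),
  user_id     = c("848273633667", "848273633667", "848273633667",
                  "848273633667", "848269676907", "848269676907"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose (user_id, timestamp) pair has already appeared
# in an earlier row; the first occurrence of each pair stays FALSE.
dup <- duplicated(test[, c("user_id", "timestamp")])

# Keep only the first row for each user/timestamp combination.
deduped <- test[!dup, ]

# Variant (assumption): keep the last row of each group instead.
deduped_last <- test[!duplicated(test[, c("user_id", "timestamp")],
                                 fromLast = TRUE), ]
```

Unlike a lag-style comparison, this does not need the duplicates to be on adjacent rows.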

Thanks in advance


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.