[R] Selection on dataframe based on order of rows

2006-08-22 Thread Bonfigli Sandro

I have a dataframe with the following structure

id  date        value
---------------------
1   22/08/2006     48
1   24/08/2006     50
1   28/08/2006    150
1   30/08/2006    100
1   01/09/2006     30
2   11/08/2006     30
2   22/08/2006    100
2   28/08/2006     11
2   02/09/2006      5
3   01/07/2006      3
3   01/08/2006    100
3   01/09/2006    100
4   22/08/2006     48
4   24/08/2006     50
4   28/08/2006    150
4   30/08/2006    100
4   01/09/2006     30
4   03/09/2006    100
4   06/09/2006    100


N.B.: dates are in European format (dd/mm/yyyy); the data frame is ordered.
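
As a side note on the date format: dd/mm/yyyy strings (or a factor built from them) sort alphabetically, not chronologically. A minimal sketch, assuming the dates are held as character strings, of converting them with as.Date() before ordering. The vector names below are made up for illustration.

```r
# Three dates from the example, deliberately out of order
dates <- c("22/08/2006", "01/09/2006", "24/08/2006")

# Parse day/month/year explicitly; the default format would not match
d <- as.Date(dates, format = "%d/%m/%Y")

# Chronological order, as opposed to sort(dates), which is alphabetical
chronological <- dates[order(d)]
alphabetical  <- sort(dates)
```
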

For each id I need to select the first row that starts a run of
at least two consecutive rows with value > 50.

Rather convoluted explanation. I mean that for each id I have to select
the first row in which value is > 50, but only if at least the following
row has value > 50 too. If this is not true, I repeat the test for all the
following rows in which value > 50 until I find a record that satisfies
the condition.

This means that with my example data frame the result is:

id  date        value
---------------------
1   28/08/2006    150
3   01/08/2006    100
4   28/08/2006    150

It's clear that a for loop would work, but I think that there is a better
way.
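
For reference, the for-loop version mentioned above might look roughly like this for the values of a single id. This is only a sketch: the helper name first_run_start and its threshold argument are hypothetical, not something from the original data or from any package.

```r
# Hypothetical helper: index of the first value above the threshold that is
# immediately followed by another value above it, or NA if there is none.
first_run_start <- function(v, threshold = 50) {
  if (length(v) < 2) return(NA_integer_)
  for (i in seq_len(length(v) - 1)) {
    if (v[i] > threshold && v[i + 1] > threshold) return(i)
  }
  NA_integer_
}

r1 <- first_run_start(c(48, 50, 150, 100, 30))  # values for id 1
r2 <- first_run_start(c(30, 100, 11, 5))        # values for id 2, no qualifying row
```

With the example data, r1 points at the third row of id 1 (28/08/2006, value 150) and r2 is NA, which matches the expected result.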

I tried by() and could obtain the first row for which value is > 50.

I thought of an iterative process (delete the first row > 50, find the
second row > 50, check whether there are rows in between), but it
is quite inelegant: if the first value is not the right one, I have to
repeat the process an a priori unknown number of times.

Thanks in advance for Your help

  Sandro Bonfigli

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Selection on dataframe based on order of rows

2006-08-22 Thread Gabor Grothendieck

Try this:

# data
DF <- structure(list(id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4,
4, 4, 4, 4, 4, 4), date = structure(c(8, 9, 10, 11, 3, 7, 8,
10, 4, 1, 2, 3, 8, 9, 10, 11, 3, 5, 6), .Label = c("01/07/2006",
"01/08/2006", "01/09/2006", "02/09/2006", "03/09/2006", "06/09/2006",
"11/08/2006", "22/08/2006", "24/08/2006", "28/08/2006", "30/08/2006"
), class = "factor"), value = c(48, 50, 150, 100, 30, 30, 100,
11, 5, 3, 100, 100, 48, 50, 150, 100, 30, 100, 100)), .Names = c("id",
"date", "value"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19"))

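# For one id's rows: return the first row with value > 50 whose
# next row also has value > 50 (NULL if there is none).
f <- function(x) {
  # shift value up by one row; pad with 0 so the last row never matches
  idx <- which(x$value > 50 & c(x$value[-1], 0) > 50)
  if (length(idx) > 0) x[idx[1],]
}
# apply f per id and stack the non-NULL results
do.call(rbind, by(DF, DF$id, f))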
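
To see why the shifted comparison works, here is a small step-by-step sketch using the values of id 4 from the example. The variable names are illustrative only.

```r
# Values for id 4, in row order
v <- c(48, 50, 150, 100, 30, 100, 100)

above      <- v > 50            # is this row above the threshold?
next_above <- c(v[-1], 0) > 50  # is the following row above it? (0 pads the last row)

# Rows where both hold are those directly followed by another row > 50
hits      <- which(above & next_above)
first_hit <- hits[1]            # NA when no row qualifies
```

Here hits contains rows 3 and 6, and only the first one is kept, which is the 28/08/2006 row with value 150. Padding with 0 relies on 0 being below the threshold; with a different threshold a padding value such as -Inf would be the safer assumption.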

