Re: [R] subsetting with by() or other function??

2005-10-13 Thread Brian S Cade
Fair enough.  To clarify what I'm trying to achieve, I've pasted below a 
small piece of the larger data frame.  It shows the hierarchical structure 
of the factors POPULATION and LOCID, the ascending order of YEAR, and the 
variable DBC, which I would like to transform into another variable that is 
a lag of the previous year's DBC (call it LAG1DBC) within LOCID within 
POPULATION.  The desired outcome is shown in the second example data set, 
pasted below the first.  The setup is for doing some 1st-order 
autoregressive analyses (not in the time series library).  Any examples 
I've tried using by() only seem to work for outputting results, not for 
creating new variables in an existing data frame.  I suspect that people 
do similar types of hierarchical subgroup data manipulations all the time 
in R (I know how to do these easily in SYSTAT), so I'm sure I'm missing 
some obvious, simple trick.  My search of the R-help list archives and 
various other R documentation has not yielded any solutions yet. 
Suggestions are graciously welcomed.

     LOCID     POPULATION  YEAR  DBC
1    algb-1    A           1992  0.70451575
2    algb-1    A           1993  0.59506851
3    algb-1    A           1997  0.84837544
4    algb-1    A           1998  0.50283182
5    algb-1    A           2000  0.91242707
6    algb-2    A           1992  0.09747155
7    algb-2    A           1993  0.84772253
8    algb-2    A           1997  0.43974081
9    algb-2    A           1998  0.83108544
10   algb-2    A           2000  0.22291192
11   algb-3    A           1992  0.44234175
12   algb-3    A           1993  0.54089534
...
5680 taylr-73  B           2001  0.43918082
5681 taylr-73  B           2002  0.34694427
5682 taylr-73  B           2003  3.35619190
5683 taylr-73  B           2004  0.71575815
5684 taylr-73  B           2005  0.42038506
5685 taylr-74  B           1992  3.88410354
5686 taylr-74  B           1993  3.32472557
5687 taylr-74  B           1994  3.29861501
5688 taylr-74  B           1996  0.48153827
5689 taylr-74  B           1997  3.63570636
5690 taylr-74  B           1998  1.94630194

     LOCID     POPULATION  YEAR  DBC         LAG1DBC
1    algb-1    A           1992  0.70451575  NA
2    algb-1    A           1993  0.59506851  0.70451575
3    algb-1    A           1997  0.84837544  0.59506851
4    algb-1    A           1998  0.50283182  0.84837544
5    algb-1    A           2000  0.91242707  0.50283182
6    algb-2    A           1992  0.09747155  NA
7    algb-2    A           1993  0.84772253  0.09747155
8    algb-2    A           1997  0.43974081  0.84772253
9    algb-2    A           1998  0.83108544  0.43974081
10   algb-2    A           2000  0.22291192  0.83108544
11   algb-3    A           1992  0.44234175  NA
12   algb-3    A           1993  0.54089534  0.44234175
...
5680 taylr-73  B           2001  0.43918082  NA
5681 taylr-73  B           2002  0.34694427  0.43918082
5682 taylr-73  B           2003  3.35619190  0.34694427
5683 taylr-73  B           2004  0.71575815  3.35619190
5684 taylr-73  B           2005  0.42038506  0.71575815
5685 taylr-74  B           1992  3.88410354  NA
5686 taylr-74  B           1993  3.32472557  3.88410354
5687 taylr-74  B           1994  3.29861501  3.32472557
5688 taylr-74  B           1996  0.48153827  3.29861501
5689 taylr-74  B           1997  3.63570636  0.48153827
5690 taylr-74  B           1998  1.94630194  3.63570636
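
One way to produce a column like LAG1DBC in base R is ave(), which applies a function within groups and returns a vector as long as its input, in the original row order.  The following is only a minimal sketch under stated assumptions: the data frame name dat and the helper lag1 are made up for illustration, the values are the first twelve rows of the listing above, and the rows are assumed to be already sorted by YEAR within each group.  The lag is by row position, not by calendar year, which matches the desired output above (the 1997 row takes the 1993 value).

```r
# Toy data: the first twelve rows of the posted example.
dat <- data.frame(
  LOCID      = rep(c("algb-1", "algb-2", "algb-3"), times = c(5, 5, 2)),
  POPULATION = "A",
  YEAR       = c(1992, 1993, 1997, 1998, 2000,
                 1992, 1993, 1997, 1998, 2000,
                 1992, 1993),
  DBC        = c(0.70451575, 0.59506851, 0.84837544, 0.50283182, 0.91242707,
                 0.09747155, 0.84772253, 0.43974081, 0.83108544, 0.22291192,
                 0.44234175, 0.54089534)
)

# Hypothetical helper: shift a vector down one position, padding with NA.
lag1 <- function(x) c(NA, x[-length(x)])

# ave() groups by every LOCID-by-POPULATION combination, applies lag1 to
# each group's DBC values, and puts the results back in row order.
dat$LAG1DBC <- ave(dat$DBC, dat$LOCID, dat$POPULATION, FUN = lag1)

print(dat)
```

Because the result has one value per row, it can be assigned directly as a new column of the existing data frame.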

Brian



Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



Florence Combes [EMAIL PROTECTED] 
10/13/2005 05:34 AM

To: Brian S Cade [EMAIL PROTECTED]
cc:
Subject: Re: [R] subsetting with by() or other function??

Maybe an example of the data you have and the data you want would help 
the people on the list understand, and so be able to help you? 

best regards, 

Florence. 



On 10/12/05, Brian S Cade [EMAIL PROTECTED] wrote:
I think I must be missing something obvious, but I'm having trouble
getting a data transformation to work on groupings of data within a data
frame (csss3) as defined by 2 factors (population, locid).  The data are
sorted by year within locid within population, and I want to lag another
variable (dbc), i.e., shift it down by 1 row, replacing the first row with
NA, within groups defined by locid nested within population.  I thought I
could do something using by(csss3, list(locid, population), function) but
don't seem to be having any success.  Any suggestions??
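
For context, by() returns an object of class "by" holding one result per group, not a modified copy of the data frame, so its output cannot simply be assigned back as a column.  A split-modify-recombine approach along the same lines can be sketched as follows.  This is an illustration under assumptions, not a tested solution for the real data: the toy values below are invented, the column names follow the description above, and add_lag is a hypothetical helper name.

```r
# Invented toy data using the column names from the description above.
csss3 <- data.frame(
  population = c("A", "A", "A", "B", "B"),
  locid      = c("algb-1", "algb-1", "algb-2", "taylr-73", "taylr-73"),
  year       = c(1992, 1993, 1992, 2001, 2002),
  dbc        = c(0.70, 0.60, 0.10, 0.44, 0.35)
)

# Hypothetical helper: add a lagged copy of dbc to one group's rows.
add_lag <- function(d) {
  d$lag1dbc <- c(NA, d$dbc[-nrow(d)])
  d
}

# Split into one data frame per locid-within-population group
# (drop = TRUE discards combinations that do not occur), modify each
# piece, then stack the pieces back into a single data frame.
groups <- split(csss3, list(csss3$locid, csss3$population), drop = TRUE)
out <- do.call(rbind, lapply(groups, add_lag))
rownames(out) <- NULL

print(out)
```

Unlike an in-place column assignment, rbind() returns the rows in group order, which equals the original order only when the data are already sorted by group, as they are here.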

Brian

Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center 
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326

__ 
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

Re: [R] subsetting with by() or other function??

2005-10-13 Thread Brian S Cade
Dimitris:  Thank you for the suggestion, but I get an error, just as when I 
tried similar commands using by().  The error given is 

Error in $<-.data.frame(`*tmp*`, LAGDBC, value = tapply(csss3lagm81$DBC,  : 
        replacement has 1089 rows, data has 8314

So I'm not sure what the problem is - why does the transformed tmp only 
have 1089 rows instead of 8314 like the full data frame?
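
A likely explanation, assuming 1089 is the number of distinct LOCID values in the data: tapply() returns one element per group, not one per row.  When the function returns a vector, each element of the result is itself a whole vector, so the result has as many elements as there are groups.  The sketch below reproduces this on invented toy data (dat and lag1 are illustrative names) and shows unsplit() as one base R way to put the per-group pieces back in row order.

```r
# Invented toy data: two groups, five rows.
dat <- data.frame(
  LOCID = c("a", "a", "a", "b", "b"),
  DBC   = c(1, 2, 3, 10, 20)
)

# Hypothetical helper: shift a vector down one position, padding with NA.
lag1 <- function(x) c(NA, x[-length(x)])

# tapply() yields a list-like array with one element per LOCID level.
res <- tapply(dat$DBC, dat$LOCID, lag1)
n_groups <- length(res)   # number of groups, not number of rows
n_rows   <- nrow(dat)

# Assigning that result as a column fails because the lengths differ.
err <- tryCatch({ dat$LAGDBC <- res; NULL },
                error = function(e) conditionMessage(e))

# split() / unsplit() keep track of which row each value came from,
# so the lagged pieces can be reassembled into a full-length vector.
pieces <- split(dat$DBC, dat$LOCID)
dat$LAG1DBC <- unsplit(lapply(pieces, lag1), dat$LOCID)

print(dat)
```

Grouping on LOCID alone is enough only if each LOCID belongs to a single POPULATION; otherwise both factors would need to be supplied as a list.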

Brian

Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



Dimitris Rizopoulos [EMAIL PROTECTED] 
10/13/2005 10:04 AM

To: Brian S Cade [EMAIL PROTECTED]
cc:
Subject: Re: [R] subsetting with by() or other function??

I think this should be something like:

dat$LAG1DBC <- tapply(dat$DBC, dat$LOCID, function(x) c(NA, x[-length(x)]))

I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: Brian S Cade [EMAIL PROTECTED]
To: Florence Combes [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Sent: Thursday, October 13, 2005 5:48 PM
Subject: Re: [R] subsetting with by() or other function??


 Brian S. Cade

 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort