Re: [R] Data manipulation problem

moleps Mon, 05 Apr 2010 13:12:18 -0700

Thanks Erik,
I have no idea what went wrong with the other code snippet, but this one
works. I appreciate it.
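
For what it's worth, the tables quoted below hint at what went wrong earlier: the age5 counts add up to 6003, while the yr5 counts add up to only 56, so the two factors cannot both have had one entry per case. One possible cause (an assumption, since the original code is not shown) is that yr5 was built from a vector of distinct years rather than from the per-case year column. A minimal sketch with made-up data; `yrs`, `yr5_bad` and `yr5_ok` are hypothetical names:

```r
## Assumption: simulated data. 'age' and 'year' have one value per case,
## while 'yrs' holds only the distinct calendar years.
set.seed(1)
age  <- round(runif(200, 0, 100), 1)
year <- sample(1951:2009, 200, replace = TRUE)
yrs  <- sort(unique(year))

age5    <- cut(age,  breaks = seq(0, 100, by = 5), include.lowest = TRUE)
yr5_bad <- cut(yrs,  breaks = seq(1950, 2010, by = 5))  # built from the wrong vector
yr5_ok  <- cut(year, breaks = seq(1950, 2010, by = 5))  # one value per case

## Comparing lengths exposes the mismatch before table() complains
length(age5)
length(yr5_bad)

## table() refuses factors of different length ...
res <- try(table(yr5_bad, age5), silent = TRUE)
inherits(res, "try-error")

## ... and works once both factors come from the same rows
tab <- table(yr5_ok, age5)
sum(tab) == length(age)
```

Checking length() on each factor before cross-tabulating is a cheap way to catch this kind of mix-up.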


qta <- table(cut(age, breaks = seq(0, 100, by = 10), include.lowest = TRUE),
             cut(year, breaks = seq(1950, 2010, by = 5), include.lowest = TRUE))
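
A sketch of the same call on simulated data, in case anyone wants to try it without the real dataset. The data and the names `age_grp` and `yr_grp` are made up for illustration; naming the arguments to table() only labels the two dimensions.

```r
## Assumption: simulated data standing in for the real dataset
set.seed(42)
age  <- round(runif(500, 0, 100), 1)            # age to one decimal
year <- sample(1950:2009, 500, replace = TRUE)  # year of diagnosis

## Both factors are cut from per-case vectors, so they have the same length
age_grp <- cut(age,  breaks = seq(0, 100, by = 10), include.lowest = TRUE)
yr_grp  <- cut(year, breaks = seq(1950, 2010, by = 5), include.lowest = TRUE)

## Rows are age groups, columns are 5-year periods
qta <- table(age = age_grp, period = yr_grp)
dim(qta)          # 10 age groups by 12 periods
addmargins(qta)   # adds row and column totals for a sanity check
```

With the default right = TRUE the intervals are closed on the right, so an age of exactly 10.0 is counted in the first group rather than the second; right = FALSE flips that if the age groups should start at the round number.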

M


On 5. apr. 2010, at 21.45, Erik Iverson wrote:

> I don't know what your data are like, since you haven't given a reproducible 
> example. I was imagining something like:
> 
> ## generate fake data
> age <- sample(20:90, 100, replace = TRUE)
> year <- sample(1950:2000, 100, replace = TRUE)
> 
> ## look at big table
> table(age, year)
> 
> ## categorize data
> ## see include.lowest and right arguments to cut
> age.factor <- cut(age, breaks = seq(20, 90, by = 10),
>                  include.lowest = TRUE)
> 
> year.factor <- cut(year, breaks = seq(1950, 2000, by = 10),
>                   include.lowest = TRUE)
> 
> table(age.factor, year.factor)
> 
> moleps wrote:
>> I already did try the regression modeling approach. However, the 
>> epidemiologist (referee) turns out to be quite fond of comparing the 
>> incidence rates to different standard populations, hence the need for this 
>> laborious approach. And trying the "cutting" approach I ended up with:
>>> table (age5)
>> age5
>>    (0,5]   (5,10]  (10,15]  (15,20]  (20,25]  (25,30]  (30,35]  (35,40]  (40,45]
>>       35       34       33       47       51      109      157      231      362
>>  (45,50]  (50,55]  (55,60]  (60,65]  (65,70]  (70,75]  (75,80]  (80,85] (85,100]
>>      511      745      926     1002      866      547      247       82       18
>>> table (yr5)
>> yr5
>> (1950,1955] (1955,1960] (1960,1965] (1965,1970] (1970,1975] (1975,1980]
>>           3           5           5           5           5           5
>> (1980,1985] (1985,1990] (1990,1995] (1995,2000] (2000,2005] (2005,2009]
>>           5           5           5           5           5           3
>>> table (yr5,age5)
>> Error in table(yr5, age5) : all arguments must have the same length
>> Sincerely,
>> M
>> On 5. apr. 2010, at 20.59, Bert Gunter wrote:
>>> You have tempted, and being weak, I yield to temptation:
>>> 
>>> "Any good ideas?"
>>> 
>>> Yes. Don't do this.
>>> 
>>> (what you probably really want to do is fit a model with age as a factor,
>>> which can be done statistically e.g. by logistic regression; or graphically
>>> using conditioning plots, e.g. via trellis graphics (the lattice package).
>>> This avoids the arbitrariness and discontinuities of binning by age range.)
>>> 
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>> 
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On
>>> Behalf Of moleps
>>> Sent: Monday, April 05, 2010 11:46 AM
>>> To: [email protected]
>>> Subject: [R] Data manipulation problem
>>> 
>>> Dear R'ers.
>>> 
>>> I've got a dataset with age and year of diagnosis. In order to
>>> age-standardize the incidence I need to transform the data into a matrix
>>> with age-groups (divided in 5 or 10 years) along one axis and year divided
>>> into 5 years along the other axis. Each cell should contain the number of
>>> cases for that age group and for that period. 
>>> I.e.
>>> My data format now is
>>> ID-age (to one decimal)-year(yearly data).
>>> 
>>> What I'd like is
>>> 
>>> age 1960-1965 1966-1970 etc...
>>> 0-5 3 8 10 15
>>> 6-10 2 5 8 13
>>> etc..
>>> 
>>> 
>>> Any good ideas?
>>> 
>>> Regards,
>>> M
>>> 
>>> ______________________________________________
>>> [email protected] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
