Thanks Erik, I have no idea what went wrong with the other code snippet, but this one works. Appreciate it.

qta <- table(cut(age, breaks = seq(0, 100, by = 10), include.lowest = TRUE),
             cut(year, breaks = seq(1950, 2010, by = 5), include.lowest = TRUE))

M

On 5. apr. 2010, at 21.45, Erik Iverson wrote:

> I don't know what your data are like, since you haven't given a reproducible
> example. I was imagining something like:
>
> ## generate fake data
> age <- sample(20:90, 100, replace = TRUE)
> year <- sample(1950:2000, 100, replace = TRUE)
>
> ## look at big table
> table(age, year)
>
> ## categorize data
> ## see include.lowest and right arguments to cut
> age.factor <- cut(age, breaks = seq(20, 90, by = 10),
>                   include.lowest = TRUE)
>
> year.factor <- cut(year, breaks = seq(1950, 2000, by = 10),
>                    include.lowest = TRUE)
>
> table(age.factor, year.factor)
>
> moleps wrote:
>> I already did try the regression modeling approach. However, the
>> epidemiologist (the referee) turns out to be quite fond of comparing the
>> incidence rates to different standard populations, hence the need for this
>> laborious approach. And trying the "cutting" approach I ended up with:
>>
>> > table (age5)
>> age5
>>    (0,5]   (5,10]  (10,15]  (15,20]  (20,25]  (25,30]
>>       35       34       33       47       51      109
>>  (30,35]  (35,40]  (40,45]  (45,50]  (50,55]  (55,60]
>>      157      231      362      511      745      926
>>  (60,65]  (65,70]  (70,75]  (75,80]  (80,85] (85,100]
>>     1002      866      547      247       82       18
>>
>> > table (yr5)
>> yr5
>> (1950,1955] (1955,1960] (1960,1965] (1965,1970] (1970,1975] (1975,1980]
>>           3           5           5           5           5           5
>> (1980,1985] (1985,1990] (1990,1995] (1995,2000] (2000,2005] (2005,2009]
>>           5           5           5           5           5           3
>>
>> > table (yr5,age5)
>> Error in table(yr5, age5) : all arguments must have the same length
>>
>> Sincerely,
>> M
>>
>> On 5. apr. 2010, at 20.59, Bert Gunter wrote:
>>> You have tempted, and being weak, I yield to temptation:
>>>
>>> "Any good ideas?"
>>>
>>> Yes. Don't do this.
>>>
>>> (What you probably really want to do is fit a model with age as a factor,
>>> which can be done statistically, e.g. by logistic regression; or graphically
>>> using conditioning plots, e.g. via trellis graphics (the lattice package).
>>> This avoids the arbitrariness and discontinuities of binning by age range.)
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On
>>> Behalf Of moleps
>>> Sent: Monday, April 05, 2010 11:46 AM
>>> To: [email protected]
>>> Subject: [R] Data manipulation problem
>>>
>>> Dear R'ers.
>>>
>>> I've got a dataset with age and year of diagnosis. In order to
>>> age-standardize the incidence I need to transform the data into a matrix
>>> with age groups (divided into 5 or 10 years) along one axis and year divided
>>> into 5-year periods along the other axis. Each cell should contain the
>>> number of cases for that age group and for that period.
>>>
>>> I.e., my data format now is
>>> ID - age (to one decimal) - year (yearly data).
>>>
>>> What I'd like is
>>>
>>> age     1960-1965  1966-1970  etc...
>>> 0-5     3          8          10         15
>>> 6-10    2          5          8          13
>>> etc..
>>>
>>> Any good ideas?
>>>
>>> Regards,
>>> M
>>>
>>> ______________________________________________
>>> [email protected] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
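
For readers following the thread, here is a minimal sketch of the cross-tabulation that was asked for, using 5-year bins on both axes. The data are made up (`n`, `age.grp`, `year.grp` and `counts` are illustrative names, not from the thread), and `right = FALSE` is only one possible choice of interval closure. The error quoted above says the two arguments to `table()` differed in length; the printed counts sum to 6003 for `age5` and 56 for `yr5`, which suggests, though the thread does not confirm it, that `yr5` was built from a different vector than the per-case years. Cutting both per-case vectors directly keeps the lengths equal.

```r
## Hypothetical fake data standing in for the real dataset (not from the thread)
set.seed(1)
n    <- 500
age  <- round(runif(n, 0, 100), 1)           # age to one decimal
year <- sample(1950:2009, n, replace = TRUE) # yearly data

## 5-year breaks on both axes
age.breaks  <- seq(0, 100, by = 5)
year.breaks <- seq(1950, 2010, by = 5)

## right = FALSE gives left-closed bins [0,5), [5,10), ...;
## include.lowest = TRUE then closes the last bin on the right, e.g. [95,100]
age.grp  <- cut(age,  breaks = age.breaks,  right = FALSE, include.lowest = TRUE)
year.grp <- cut(year, breaks = year.breaks, right = FALSE, include.lowest = TRUE)

## Both factors have one entry per case, so table() accepts them
stopifnot(length(age.grp) == length(year.grp))

counts <- table(age = age.grp, period = year.grp)
dim(counts)   # 20 age groups by 12 periods
sum(counts)   # equals n when no value falls outside the breaks
```

Whether the bins should be closed on the left or on the right depends on how the standard population is tabulated, so the `right` argument is worth checking against that reference.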

