[R] FW: variable format
Anybody?

From: Cory Nissen
Sent: Tue 9/4/2007 9:30 AM
To: r-help@stat.math.ethz.ch
Subject: variable format

Okay, I want to do something similar to SAS proc format. I usually do this...

  a <- NULL
  a$divisionOld <- c(1,2,3,4,5)
  divisionTable <- matrix(c(1, "New England",
                            2, "Middle Atlantic",
                            3, "East North Central",
                            4, "West North Central",
                            5, "South Atlantic"),
                          ncol=2, byrow=T)
  a$divisionNew[match(a$divisionOld, divisionTable[,1])] <- divisionTable[,2]

But how do I handle the case where...

  a$divisionOld <- c(0,1,2,3,4,5)   # no format available for 0, this throws an error.

OR

  divisionTable <- matrix(c(1, "New England",
                            2, "Middle Atlantic",
                            3, "East North Central",
                            4, "West North Central",
                            5, "South Atlantic",
                            6, "East South Central",
                            7, "West South Central",
                            8, "Mountain",
                            9, "Pacific"),
                          ncol=2, byrow=T)

There are extra formats available... this throws a warning.

Thanks
Cory

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] FW: variable format
This was what I was looking for. I figured factor was the way to go, but I wasn't sure how to implement it. The car recommendation looks good too, but I want to try to stay away from having to download another package if I can.

Thanks
cn

From: Martin Becker [mailto:[EMAIL PROTECTED]
Sent: Fri 9/7/2007 10:55 AM
To: Cory Nissen
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] FW: variable format

Dear Cory,

I am not familiar with SAS, but is this what you are looking for?

  divisionTable <- matrix(c(1, "New England",
                            2, "Middle Atlantic",
                            3, "East North Central",
                            4, "West North Central",
                            5, "South Atlantic",
                            6, "East South Central",
                            7, "West South Central",
                            8, "Mountain",
                            9, "Pacific"),
                          ncol=2, byrow=T)
  a <- NULL
  a$divisionOld <- c(0,1,2,3,4,5)
  a$divisionNew <- as.character(factor(a$divisionOld,
                                       levels=divisionTable[,1],
                                       labels=divisionTable[,2]))
  a$divisionNew
  [1] NA                   "New England"        "Middle Atlantic"
  [4] "East North Central" "West North Central" "South Atlantic"

Kind regards,
Martin
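For comparison with the factor() answer above, here is a minimal base-R sketch of the same lookup done with match() alone. The idea is to index the label column with the result of match() rather than assigning into a subscript, since an NA subscript on the left-hand side of an assignment is what appears to trigger the error in the original attempt. The table is cut to six rows for brevity, and the variable names idx, divisionOld and divisionNew are illustrative, not from the thread.

```r
# Lookup table: codes in column 1, labels in column 2 (a character matrix).
divisionTable <- matrix(c("1", "New England",
                          "2", "Middle Atlantic",
                          "3", "East North Central",
                          "4", "West North Central",
                          "5", "South Atlantic",
                          "6", "East South Central"),
                        ncol = 2, byrow = TRUE)

# Input codes, including 0, which has no entry in the table.
divisionOld <- c(0, 1, 2, 3, 4, 5)

# match() gives the table row for each code, or NA when the code is absent.
idx <- match(divisionOld, divisionTable[, 1])

# Indexing by idx yields exactly one label per input value;
# unmatched codes become NA and unused table rows are simply ignored.
divisionNew <- divisionTable[idx, 2]
divisionNew
```

Codes without a format come back as NA, and extra rows in the table cause no warning because nothing is assigned by position.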
Re: [R] by group problem
Perfect, except for one little bit... topN.2 is missing one comma. It should read as follows:

  topN.2 <- function(data,n=5) data[order(data[,3], decreasing=T),][1:n,]

Thank you very much.
cn

From: Petr PIKAL [mailto:[EMAIL PROTECTED]
Sent: Mon 9/3/2007 3:51 AM
To: Cory Nissen
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] by group problem

Hi

now I understand better what you want

  topN.2 <- function(data,n=5) data[order(data[,3], decreasing=T),][1:n]
  # I presume data is data frame with 3 columns and the third is percent
  lapply(split(data,data$state), topN.2)

Regards
Petr
[EMAIL PROTECTED]
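A self-contained sketch of the split/lapply approach with the corrected comma, using the sample data posted in this thread. It assumes the data is held in a data frame (the posted example builds a list, so data.frame() is used here instead), and it swaps [1:n, ] for head(), which is a substitution of mine to avoid NA-padded rows when a state has fewer than n counties. The name top5 is illustrative.

```r
# Sample data from the thread, stored as a data frame.
data <- data.frame(
  state = rep(c("Illinois", "Wisconsin"), each = 10),
  county = c("Adams", "Brown", "Bureau", "Cass", "Champaign",
             "Christian", "Coles", "De Witt", "Douglas", "Edgar",
             "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
             "Burnett", "Chippewa", "Clark", "Columbia", "Crawford"),
  percentOld = c(17.554849, 16.826594, 18.196593, 17.139242, 8.743823,
                 17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
                 19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
                 20.337817, 14.293354, 17.252820, 15.647179, 16.825596),
  stringsAsFactors = FALSE
)

# Sort one state's rows by percentOld (descending) and keep the first n rows.
topN.2 <- function(d, n = 5) {
  d <- d[order(d$percentOld, decreasing = TRUE), ]
  head(d, n)
}

# split() makes one data frame per state; lapply() applies topN.2 to each.
top5 <- lapply(split(data, data$state), topN.2)
top5$Illinois[, c("county", "percentOld")]
```

Because whole rows are kept, each element of top5 carries the county alongside its value, which is what tapply() on the single column could not provide.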
[R] variable format
Okay, I want to do something similar to SAS proc format. I usually do this...

  a <- NULL
  a$divisionOld <- c(1,2,3,4,5)
  divisionTable <- matrix(c(1, "New England",
                            2, "Middle Atlantic",
                            3, "East North Central",
                            4, "West North Central",
                            5, "South Atlantic"),
                          ncol=2, byrow=T)
  a$divisionNew[match(a$divisionOld, divisionTable[,1])] <- divisionTable[,2]

But how do I handle the case where...

  a$divisionOld <- c(0,1,2,3,4,5)   # no format available for 0, this throws an error.

OR

  divisionTable <- matrix(c(1, "New England",
                            2, "Middle Atlantic",
                            3, "East North Central",
                            4, "West North Central",
                            5, "South Atlantic",
                            6, "East South Central",
                            7, "West South Central",
                            8, "Mountain",
                            9, "Pacific"),
                          ncol=2, byrow=T)

There are extra formats available... this throws a warning.

Thanks
Cory
[R] by group problem
I am working with census data. My columns of interest are...

  PercentOld - the percentage of people in each county that are over 65
  County     - the county in each state
  State      - the state in the US

There are about 3100 rows, with each row corresponding to a county within a state. I want to return the top five PercentOld by state. But I want the County and the Value.

I tried this...

  topN <- function(column, n=5) {
    column <- sort(column, decreasing=T)
    return(column[1:n])
  }
  top5PerState <- tapply(data$percentOld, data$STATE, topN)

But this only returns the value for percentOld per state, I also want the corresponding County. I think I'm close, but I just can't get it...

Thanks
cn
Re: [R] by group problem
That didn't work for me... Here's some data to help with a solution.

  data <- NULL
  data$state <- c(rep("Illinois", 10), rep("Wisconsin", 10))
  data$county <- c("Adams", "Brown", "Bureau", "Cass", "Champaign",
                   "Christian", "Coles", "De Witt", "Douglas", "Edgar",
                   "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
                   "Burnett", "Chippewa", "Clark", "Columbia", "Crawford")
  data$percentOld <- c(17.554849, 16.826594, 18.196593, 17.139242, 8.743823,
                       17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
                       19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
                       20.337817, 14.293354, 17.252820, 15.647179, 16.825596)

It should return something like this...

  $Illinois
  Edgar   18.984435
  Bureau  18.196593
  ...
  $Wisconsin
  Burnett 20.33782
  Adams   19.34702
  ...

My solution gives...

  topN <- function(column, n=5) {
    column <- sort(column, decreasing=T)
    return(column[1:n])
  }
  tapply(data$percentOld, data$state, topN)
  $Illinois
  [1] 18.98444 18.19659 17.86275 17.55485 17.13924

  $Wisconsin
  [1] 20.33782 19.34702 17.81444 17.63278 17.25282

I get an error with this try...

  aggregate(data$percentOld, list(data$state, data$county), topN)
  Error in aggregate.data.frame(as.data.frame(x), ...) :
    'FUN' must always return a scalar

Thanks
cn

From: Petr PIKAL [mailto:[EMAIL PROTECTED]
Sent: Fri 8/31/2007 8:15 AM
To: Cory Nissen
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] by group problem

Hi

> top5PerState <- tapply(data$percentOld, data$STATE, topN)

Try

  aggregate(data$PercentOld, list(data$State, data$County), topN)

Regards
Petr
[R] First elements of a list.
Suppose I have the following list:

  a <- strsplit(c("John;Smith", "Jane;Doe", "koda", "gunner"), ";")

I want to get to these two vectors without looping...

  firstNames: c("John", "Jane", "koda", "gunner")
  lastNames:  c("Smith", "Doe", NA, NA)

Thanks
cn
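One possible way to do this in base R, offered as a sketch rather than the only answer: pass the extraction operator `[` to sapply() so that each list element is indexed at position 1 or 2. Indexing a length-one vector at position 2 returns NA, which supplies the missing last names.

```r
# Split each string on ";", giving a list of character vectors.
a <- strsplit(c("John;Smith", "Jane;Doe", "koda", "gunner"), ";")

# `[` is the extraction function; sapply() calls it as x[1] or x[2]
# on every list element and simplifies the results to a character vector.
firstNames <- sapply(a, `[`, 1)
lastNames  <- sapply(a, `[`, 2)

firstNames
lastNames
```

sapply() still iterates internally, so this avoids writing an explicit for loop rather than avoiding iteration altogether.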