Re: [R] by group problem

Cory Nissen Fri, 31 Aug 2007 08:27:10 -0700

That didn't work for me...
 
Here's some data to help with a solution.
 
data <- NULL
data$state <- c(rep("Illinois", 10), rep("Wisconsin", 10))
data$county <- c("Adams", "Brown", "Bureau", "Cass", "Champaign",  
                 "Christian", "Coles", "De Witt", "Douglas", "Edgar",
                 "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",   
                 "Burnett", "Chippewa", "Clark", "Columbia", "Crawford")
data$percentOld <- c(17.554849, 16.826594, 18.196593, 17.139242,  8.743823,
                     17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
                     19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
                     20.337817, 14.293354, 17.252820, 15.647179, 16.825596)


I would like it to return something like this...
$Illinois
"Edgar"
18.984435
"Bureau"
18.196593
...
$Wisconsin
"Burnett"
20.33782
"Adams"
19.34702
...
 
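My solution gives...
# Return the n largest values of a numeric vector.
topN <- function(column, n=5)
  {
    column <- sort(column, decreasing=TRUE)
    return(column[1:n])
  }
# Top five percentOld values per state (the county names are lost).
tapply(data$percentOld, data$state, topN)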
 
$Illinois
[1] 18.98444 18.19659 17.86275 17.55485 17.13924
$Wisconsin
[1] 20.33782 19.34702 17.81444 17.63278 17.25282
 
I get an error when I try this...
aggregate(data$percentOld, list(data$state, data$county), topN)

Error in aggregate.data.frame(as.data.frame(x), ...) : 
 'FUN' must always return a scalar
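
The error follows from the message itself: aggregate() here needs FUN to return a single value per group, and topN returns five. Grouping by both state and county would also leave only one row in each group. One possible way around it, as a rough sketch only, is to sort whole rows within each state instead of sorting the bare column, so that the county stays attached to its value. The names census and topRows below are made up for illustration and are not from the thread.

```r
# Sample data from this message, rebuilt as a data frame.
# `census` is an illustrative name, not one used in the thread.
census <- data.frame(
  state = c(rep("Illinois", 10), rep("Wisconsin", 10)),
  county = c("Adams", "Brown", "Bureau", "Cass", "Champaign",
             "Christian", "Coles", "De Witt", "Douglas", "Edgar",
             "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
             "Burnett", "Chippewa", "Clark", "Columbia", "Crawford"),
  percentOld = c(17.554849, 16.826594, 18.196593, 17.139242,  8.743823,
                 17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
                 19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
                 20.337817, 14.293354, 17.252820, 15.647179, 16.825596),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n rows of one group with the largest percentOld.
# head() returns fewer rows when a group has fewer than n counties,
# whereas column[1:n] would pad the result with NA.
topRows <- function(df, n = 5) {
  df <- df[order(df$percentOld, decreasing = TRUE), ]
  head(df[, c("county", "percentOld")], n)
}

# Split the rows by state, then apply the helper to each piece.
top5 <- lapply(split(census, census$state), topRows)
top5$Illinois
```

The result is a list with one small data frame per state, each holding the county and its percentOld in decreasing order. do.call(rbind, top5) would stack them into a single data frame if that is more convenient.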
 
Thanks
 
cn
 
 

________________________________

From: Petr PIKAL [mailto:[EMAIL PROTECTED]
Sent: Fri 8/31/2007 8:15 AM
To: Cory Nissen
Cc: r-help@stat.math.ethz.ch
Subject: Odp: [R] by group problem



Hi

> I am working with census data.  My columns of interest are...
>
> PercentOld - the percentage of people in each county that are over 65
> County - the county in each state
> State - the state in the US
>
> There are about 3100 rows, with each row corresponding to a county
> within a state.
>
> I want to return the top five "PercentOld" by state.  But I want the
> County and the Value.
>
> I tried this...
>
> topN <- function(column, n=5)
>   {
>     column <- sort(column, decreasing=T)
>     return(column[1:n])
>   }
> top5PerState <- tapply(data$percentOld, data$STATE, topN)

Try

aggregate(data$PercentOld, list(data$State, data$County), topN)

Regards
Petr


>
> But this only returns the value for "percentOld" per state, I also want
> the corresponding County.
>
> I think I'm close, but I just can't get it...
>
> Thanks
>
> cn
>
>    [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




