Hi

r-help-boun...@r-project.org wrote on 26.04.2010 17:05:54:

> I guess my problem was seeing a bunch of examples where they pulled a
> variable from a dataframe..
>
> tapply(df$data, index=list(..

df$data results in a vector, and so does e.g. df[,5] unless you use the
drop=FALSE option.

> and I assumed that the df$data was just generalizable to a collection
> of vectors, a vector of vector being a vector

df[,1:15] is not a vector of vectors. R can sometimes give you a nasty
surprise with object types and modes, but changing the type of an object
merely by selecting some part of it would be quite problematic.

See:

str(df$data)
str(df[, 1])
str(df[, 1, drop=FALSE])
str(df[, 1:15])

Regards
Petr

> Thanks.
>
> On Mon, Apr 26, 2010 at 2:43 AM, Petr PIKAL <petr.pi...@precheza.cz> wrote:
>
> > Hi
> >
> > steven mosher <mosherste...@gmail.com> wrote on 26.04.2010 10:21:37:
> >
> > > That fails:
> > >
> > > The manual says:
> > >
> > > tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
> > >
> > > Arguments
> > >
> > > X
> > >     an atomic object, typically a vector.
> > >
> > > INDEX
> > >     list of factors, each of same length as X. The elements are
> > >     coerced to factors by as.factor.
> > >
> > > My error says:
> > >
> > > Error in tapply(DF[, 1:15], DF$Year, mean, na.rm = T) :
> > >   arguments must have same length
> > >
> > > The issue that I have is that I don't understand what the
> > > requirements for the list of factors are. In my example DF$Years is
> > > a sequence of years (1979, 1980, 1982, 1983, 1987, ...) like that,
> > > with missing years. So when the manual says "list of factors, each
> > > the same length as X", what does that mean? I could have a DF with
> > > 20 rows and only two different years, or 20 rows and 20 different
> > > years.
> > >
> > > Suppose:
> > >
> > > a <- c(1,2,3,4)
> > > b <- c(2,3,4,5)
> > > df = data.frame(a,b)
> > > length(df)
> >
> > A data frame is neither a vector nor atomic but a list, hence
> > length(df) gives you the number of columns.
> > It is similar to the length of a list:
> >
> > > lll <- list(a=1, b=2, c=3)
> > > length(lll)
> > [1] 3
> >
> > If you accept that the first argument of tapply has to be a vector,
> > you cannot put a data frame there.
> >
> > Next, the second argument has to be a list of factors, so you can put
> > several factors there, each of the same length as the first argument
> > (a vector).
> >
> > If you want to perform an aggregating operation on a whole data frame
> > you should consider
> >
> > ?by or ?aggregate
> >
> > Other options are the plyr or doBy packages.
> >
> > The syntax for aggregate is quite similar to tapply, only the first
> > argument can be a data frame.
> >
> > Regards
> > Petr
> >
> > > The length of DF is 2.
> > > Does that mean the "list of factors, each of same length as X."
> > > would have to be 2? That doesn't seem to make sense.
> > >
> > > On Mon, Apr 26, 2010 at 12:26 AM, Petr PIKAL <petr.pi...@precheza.cz>
> > > wrote:
> > > Hi
> > >
> > > r-help-boun...@r-project.org wrote on 26.04.2010 06:52:55:
> > >
> > > > Having some difficulties with understanding how tapply works and
> > > > getting return values I expect
> > > >
> > > > Data: dataframe. DF DF$Id $D $Year.......
> > > >
> > > > Id          D    Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
> > > > 11264402000 1    1980  NA  NA  NA  NA  NA 212 203 209 228 237  NA  NA
> > > > 11264402000 0    1981  NA  NA 243 244  NA  NA  NA  NA 225  NA 231  NA
> > > > 11264402000 1    1981  NA 251  NA 248 241  NA  NA  NA 235  NA  NA 245
> > > > 11264402000 0    1982 236 237 242 240 242 205 199  NA  NA  NA  NA  NA
> > > > 11264402000 1    1982 236  NA  NA 240 242  NA  NA  NA  NA  NA  NA  NA
> > > > 11264402000 0    1983  NA 247  NA  NA  NA  NA  NA 205  NA  NA  NA  NA
> > > > 11264402000 1    1983  NA 247  NA  NA  NA  NA  NA  NA  NA 225  NA  NA
> > > > 11264402000 0    1986  NA  NA  NA 240  NA  NA  NA 213  NA  NA  NA  NA
> > > > 11264402000 0    1987 241  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > > 11264402000 1    1987  NA  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > > 11264402000 3    1987  NA  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > > 11264402000 0    1988 238 246 249  NA 244 213 212 224 232 238 232 230
> > > > 11264402000 1    1988 238 246 249 246 244 213 212 224 232  NA  NA 230
> > > > 11264402000 3    1988 238 246 249 246 244 213 212 224 232  NA  NA 230
> > > > 11264402000 0    1989 232 233 238 239 231  NA 215  NA  NA  NA  NA 238
> > > > 11264402000 1    1989 232 233 238 239 231  NA  NA  NA  NA  NA  NA 238
> > > > 11264402000 3    1989 232 233 238 239 231  NA  NA  NA  NA  NA  NA 238
> > > >
> > > > and the result should be a dataframe of column means by year with
> > > > the variable D dropped (or kept, doesn't matter)
> > > >
> > > > 11264402000 1    1980  NA  NA  NA  NA  NA 212 203 209 228 237  NA  NA
> > > > 11264402000 .5   1981  NA  NA 243 244  NA  NA  NA  NA 225  NA 231  NA
> > > > 11264402000 .5   1982 236 237 242 240 242 205 199  NA  NA  NA  NA  NA
> > > > 11264402000 .5   1983  NA 247  NA  NA  NA  NA  NA 205  NA 225  NA  NA
> > > > 11264402000 1    1986  NA  NA  NA 240  NA  NA  NA 213  NA  NA  NA  NA
> > > > 11264402000 2    1987 241  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > > 11264402000 1.33 1988 238 246 249 246 244 213 212 224 232 238 232 230
> > > > 11264402000 1.33 1989 232 233 238 239 231  NA 215  NA  NA  NA  NA 238
> > > >
> > > > It would seem that tapply should work
> > > >
> > > > result<-tapply( DF[,1:15], DF$Year, colMeans,na.rm=T)
> > >
> > > Why colMeans? It is a function used instead of apply(...,.. ,mean).
> > >
> > > Maybe you want
> > >
> > > result<-tapply( DF[,1:15], DF$Year, mean,na.rm=T)
> > >
> > > Regards
> > > Petr
> > >
> > > > but i get errors about the length of arguments, which
> > > >
> > > > [[alternative HTML version deleted]]
> > > >
> > > > ______________________________________________
> > > > R-help@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
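
For readers following the thread, here is a minimal sketch of the two routes discussed: calling tapply on one column at a time, and using aggregate, which takes a data frame as its first argument. The toy data below is made up for illustration and is not the poster's DF; the names month_cols, by_col and by_year are likewise hypothetical.

```r
# Toy data, invented for illustration (not the data from the thread).
DF <- data.frame(
  Year = c(1980, 1981, 1981, 1982),
  Jan  = c(NA, 236, 240, 250),
  Feb  = c(212, NA, NA, 248)
)

# A data frame is a list of columns, so length() counts columns, not rows.
length(DF)       # 3 columns
length(DF$Year)  # 4 values: this is the length INDEX has to match

# Hypothetical helper name: the columns we want averaged.
month_cols <- c("Jan", "Feb")

# Route 1: tapply() takes one vector at a time, so loop over the columns.
# The result is a matrix with one row per year and one column per month.
by_col <- sapply(DF[month_cols],
                 function(x) tapply(x, DF$Year, mean, na.rm = TRUE))

# Route 2: aggregate() accepts the data frame directly and returns a
# data frame with one row per year.
by_year <- aggregate(DF[month_cols], by = list(Year = DF$Year),
                     FUN = mean, na.rm = TRUE)

print(by_col)
print(by_year)
```

Note that a year whose values are all NA in some month comes back as NaN with na.rm = TRUE, since there is nothing left to average. Whether that is acceptable depends on what the result is used for.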