I guess my problem was that I had seen a bunch of examples that pulled a
single variable from a data frame:

  tapply(df$data, index=list(..

and I assumed that df$data would simply generalize to a collection of
vectors, a vector of vectors being a vector.

Thanks.
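
A minimal sketch of the distinction, using a hypothetical toy data frame (the names df, a, b and g are illustrative and not taken from the data in this thread): a single column such as df$a is an atomic vector whose length is the number of rows, while the data frame itself is a list whose length is the number of columns.

```r
# Hypothetical toy data, not the DF discussed in this thread
df <- data.frame(a = c(1, 2, 3, 4),
                 b = c(2, 3, 4, 5),
                 g = c("x", "x", "y", "y"))

is.atomic(df$a)   # TRUE: one column is an atomic vector
is.list(df)       # TRUE: the data frame is a list of columns
length(df$a)      # 4: number of rows
length(df)        # 3: number of columns

# X is a single column and INDEX has the same length as X
means_a <- tapply(df$a, df$g, mean)
```

Here means_a holds one mean per level of g, because both arguments have four elements.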

On Mon, Apr 26, 2010 at 2:43 AM, Petr PIKAL <petr.pi...@precheza.cz> wrote:

> Hi
>
>
> steven mosher <mosherste...@gmail.com> wrote on 26.04.2010 10:21:37:
>
> > That fails:
> >
> > The manual says:
> >
> > tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
>
> > Arguments
> >
> > X
> >
> > an atomic object, typically a vector.
> >
> > INDEX
> >
> > list of factors, each of same length as X. The elements are coerced to
> > factors by as.factor.
> >
> > my error says:
>
> >
> > Error in tapply(DF[, 1:15], DF$Year, mean, na.rm = T) :
> >
> >   arguments must have same length
> >
> > The issue that I have is I don't understand what the requirements for the
> > list of factors are. In my example DF$Year is a sequence of years
> > (1979, 1980, 1982, 1983, 1987, ...) with missing years. So when the manual
> > says "list of factors, each of same length as X", what does that mean? I
> > could have a DF with 20 rows and only two different years, or 20 rows and
> > 20 different years.
> >
> > Suppose:
> >
> > a<- c(1,2,3,4)
> > > b<-c(2,3,4,5)
> > > df=data.frame(a,b)
> > > length(df)
>
> A data frame is not an atomic vector but a list, hence length(df) gives you
> the number of columns. It is similar to the length of a list:
>
> > lll<-list(a=1, b=2, c=3)
> > length(lll)
> [1] 3
> >
>
> If you accept that the first argument of tapply has to be a vector, you
> cannot put a data frame there.
>
> Next, the second argument has to be a list of factors, so you can put
> several factors there, each of the same length as the first argument (a
> vector).
>
> If you want to perform an aggregating operation on a whole data frame, you
> should consider
>
> ?by or ?aggregate
>
> Other options are the plyr or doBy packages.
>
> The syntax for aggregate is quite similar to tapply, only the first
> argument can be a data frame.
>
> Regards
> Petr
>
>
> >
> > The length of df is 2.
> > Does that mean the "list of factors, each of same length as X" would have
> > to be of length 2? That doesn't seem to make sense.
> >
> >
> >
> > On Mon, Apr 26, 2010 at 12:26 AM, Petr PIKAL <petr.pi...@precheza.cz> wrote:
> > Hi
> >
> > r-help-boun...@r-project.org wrote on 26.04.2010 06:52:55:
> >
> > > Having some difficulties with understanding how tapply works and
> > > getting return values I expect
> > >
> > > Data: dataframe. DF  DF$Id $D $Year.......
> > >
> > > Id             D Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
> > > 11264402000    1 1980  NA  NA  NA  NA  NA 212 203 209 228 237  NA  NA
> > > 11264402000    0 1981  NA  NA 243 244  NA  NA  NA  NA 225  NA 231  NA
> > > 11264402000    1 1981  NA 251  NA 248 241  NA  NA  NA 235  NA  NA 245
> > > 11264402000    0 1982 236 237 242 240 242 205 199  NA  NA  NA  NA  NA
> > > 11264402000    1 1982 236  NA  NA 240 242  NA  NA  NA  NA  NA  NA  NA
> > > 11264402000    0 1983  NA 247  NA  NA  NA  NA  NA 205  NA  NA  NA  NA
> > > 11264402000    1 1983  NA 247  NA  NA  NA  NA  NA  NA  NA 225  NA  NA
> > > 11264402000    0 1986  NA  NA  NA 240  NA  NA  NA 213  NA  NA  NA  NA
> > > 11264402000    0 1987 241  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > 11264402000    1 1987  NA  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > 11264402000    3 1987  NA  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > 11264402000    0 1988 238 246 249  NA 244 213 212 224 232 238 232 230
> > > 11264402000    1 1988 238 246 249 246 244 213 212 224 232  NA  NA 230
> > > 11264402000    3 1988 238 246 249 246 244 213 212 224 232  NA  NA 230
> > > 11264402000    0 1989 232 233 238 239 231  NA 215  NA  NA  NA  NA 238
> > > 11264402000    1 1989 232 233 238 239 231  NA  NA  NA  NA  NA  NA 238
> > > 11264402000    3 1989 232 233 238 239 231  NA  NA  NA  NA  NA  NA 238
> > >
> > > and the result should be a data frame of column means by year, with the
> > > variable D dropped (or kept, it doesn't matter)
> > >
> > > 11264402000    1 1980  NA  NA  NA  NA  NA 212 203 209 228 237  NA  NA
> > > 11264402000   .5 1981  NA  NA 243 244  NA  NA  NA  NA 225  NA 231  NA
> > > 11264402000   .5 1982 236 237 242 240 242 205 199  NA  NA  NA  NA  NA
> > > 11264402000   .5 1983  NA 247  NA  NA  NA  NA  NA 205  NA 225  NA  NA
> > > 11264402000    1 1986  NA  NA  NA 240  NA  NA  NA 213  NA  NA  NA  NA
> > > 11264402000    2 1987 241  NA  NA  NA  NA 218  NA  NA 235 243 240  NA
> > > 11264402000 1.33 1988 238 246 249 246 244 213 212 224 232 238 232 230
> > > 11264402000 1.33 1989 232 233 238 239 231  NA 215  NA  NA  NA  NA 238
> > >
> > >  It would seem that tapply should work:
> > >  result<-tapply( DF[,1:15], DF$Year, colMeans,na.rm=T)
>
> > Why colMeans? It is a function used instead of apply(...,.. ,mean).
> >
> > Maybe you want
> >
> > result<-tapply( DF[,1:15], DF$Year, mean,na.rm=T)
> >
> > Regards
> > Petr
> >
> > >
> > >  but i get errors about the length of arguments, which
> > >
> > >    [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
>
>
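
For reference, a hedged sketch of the aggregate() route suggested above, on a small made-up data frame (the values and the reduced Jan/Feb column set are assumptions for illustration, not the poster's data). It also shows tapply applied to one column at a time, which satisfies the requirement that X be a vector.

```r
# Hypothetical miniature of the yearly table; values are invented
DF <- data.frame(
  Year = c(1980, 1981, 1981, 1982),
  Jan  = c(NA, 236, 238, 240),
  Feb  = c(212, NA, NA, 250)
)

# aggregate() accepts a data frame as its first argument;
# na.rm = TRUE is passed through to mean()
by_year <- aggregate(DF[, c("Jan", "Feb")],
                     by = list(Year = DF$Year),
                     FUN = mean, na.rm = TRUE)

# Same idea with tapply, one column (a vector) at a time
per_col <- sapply(DF[, c("Jan", "Feb")],
                  function(col) tapply(col, DF$Year, mean, na.rm = TRUE))
```

With na.rm = TRUE, a year whose values in a column are all NA comes back as NaN rather than NA, since the mean is then taken over zero values.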
