On 12/15/05, January Weiner <[EMAIL PROTECTED]> wrote:
> Hello again,
>
> On 12/14/05, Thomas Lumley <[EMAIL PROTECTED]> wrote:
> > You want
> >
> > by(df[,-1], df$Day, function.that.means.each.column)
>
> OK, slowly :-) I don't understand it.
>
> - why df[,-1] and not df? don't we lose the df$Day entries?

You don't get them as a column but you get them as the component labels.

  by(df, df$Day, function(x) colMeans(x[,-1]))

If you convert it to a data frame you get them as the rownames:

  do.call("rbind", by(df, df$Day, function(x) colMeans(x[,-1])))

>
> (by the way, why does typeof(df) show "list"? I thought that
> read.table() returns a data frame?)

I think you want class(df), which shows it's a data frame.

>
> > so all you need to do is write function.that.means.each.column()
> > In this case there is a built-in function, colMeans, so you don't even
> > have to write it.
>
> Hmmmmm, I tried it and it did not work. That is, it works - but not as
> intended :-).
>
> Fake example:
>
> > df <- data.frame(Day=c("Tue","Tue","Tue", "Wed", "Wed"), val1=seq(1,5),
> >   val2=3*seq(1,5))
> > df
>   Day val1 val2
> 1 Tue    1    3
> 2 Tue    2    6
> 3 Tue    3    9
> 4 Wed    4   12
> 5 Wed    5   15
> > ddf <- by(df[,-1], df$Day, colMeans)
> > ddf
> df$Day: Tue
> val1 val2
>    2    6
> ------------------------------------------------------------
> df$Day: Wed
> val1 val2
>  4.5 13.5
> > ddf$Day
> NULL
> > ddf$val1
> NULL
>
> In real data, instead of "days", I have around 6000 items, so I need
> them to be in one column called "Days" (or whatever). OK. So correct
> me if I understand wrongly what is happening here:
>
> by() divides df into data frame subsets and applies a function
> (colMeans) to each of them. The result of colMeans ... manual says
> that colMeans returns the following:
>
>   A numeric or complex array of suitable size, or a vector if the
>   result is one-dimensional. The 'dimnames' (or 'names' for a
>   vector result) are taken from the original array.
>
> ...which doesn't tell me much. typeof(colMeans(...)) tells me
> "double" but I think it lies. OK, let's assume it is a vector (should
> be, I assume the result is one-dimensional, as I can hardly imagine a
> multidimensional result).
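
On the typeof() question, a small sketch may help. It reuses the fake df from the quoted example; the variable name m is only for illustration. typeof() reports the storage mode, class() reports what kind of object it is, and the colMeans() result for one day is a plain named numeric vector:

```r
# Same fake data as in the quoted example
df <- data.frame(Day = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5), val2 = 3 * seq(1, 5))

typeof(df)   # "list": a data frame is stored as a list of columns
class(df)    # "data.frame"

# Column means for the Tuesday rows only, dropping the Day column
m <- colMeans(df[df$Day == "Tue", -1])

typeof(m)    # "double": the storage mode of the numbers
is.vector(m) # TRUE: a one-dimensional result
names(m)     # "val1" "val2": names taken from the columns
```

So typeof() does not lie, it just answers a lower-level question than class() does.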
>
> So in the end I have a list with as many columns as I have days, and
> in each column I have a vector with N named dimensions, where N is the
> number of variables in the original data frame bar one. But what I
> would like to have is a data frame with exactly the same column names,
> and rows being just a summary. And no clue how to convert one into the
> other :-)
>
> > More generally (eg the approach would work for medians as well)
> >
> > by(df[,1], df$Day, function(today) apply(today, 2, mean))
>
> Huh? why is it df[,1] now? I think I'm completely lost.

df[,1] and df$Day both refer to the same first column, so the data
argument in that call was presumably meant to be df[,-1], as in the
first example.

>
> > Finally, you could just use aggregate().
>
> Probably, yes. As soon as I figure out how to use it, that is :-) (an

  aggregate(df[,-1], df[,1,drop = FALSE], mean)

or

  aggregate(df[,-1], list(Day = df$Day), mean)

The second arg of aggregate must be a list, which is why we used
drop = FALSE in the first instance and an explicit list in the second.

Another alternative is to use summaryBy from the doBy package found
at http://genetics.agrsci.dk/~sorenh/misc/ :

  library(doBy)
  summaryBy(cbind(val1, val2) ~ Day, data = df)

> hour later: OK, I got it! yuppie!) However what I really needed was
> smth like this:
>
> ddf <- by(df[,-1], df$Day, function(z) { return(cor(z$val1,z$val2)) ; } )
>
> (but I still don't know how to convert it to a friendly data frame...)
>

  do.call("rbind", ddf)

> Thanks for the answers!
>
> January
>
> --
> ------------ January Weiner 3 ---------------------+---------------
> Division of Bioinformatics, University of Muenster | Schloßplatz 4
> (+49)(251)8321634                                  | D48149 Münster
> http://www.uni-muenster.de/Biologie.Botanik/ebb/   | Germany
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
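
To pull the thread together, here is a minimal sketch of getting from the by() results to a data frame with an ordinary Day column. It uses only base R and the fake df from the example; the names mat, means_df, cors and cor_df are made up for illustration, and this is one possible way to do it rather than the only one.

```r
# Same fake data as in the thread
df <- data.frame(Day = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5), val2 = 3 * seq(1, 5))

# Column means per day: by() returns one named vector per day,
# rbind stacks them into a matrix whose rownames are the days
mat <- do.call("rbind", by(df[, -1], df$Day, colMeans))

# Move the rownames into a real Day column
means_df <- data.frame(Day = rownames(mat), mat, row.names = NULL)

# The same summary via aggregate(), which returns a data frame directly
agg_df <- aggregate(df[, -1], list(Day = df$Day), mean)

# One correlation per day: the by() result is a named 1-d array here,
# so names() gives the days and as.numeric() the values
cors <- by(df[, -1], df$Day, function(z) cor(z$val1, z$val2))
cor_df <- data.frame(Day = names(cors), cor = as.numeric(cors))

means_df
cor_df
```

With around 6000 groups the aggregate() route is probably the least fuss for means, while the by() route is needed when the function uses several columns at once, as cor() does.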