arnaud chozo wrote:
Hi all,
I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
of a single vector argument to create the statistical summaries.
Consider an easy case: I'd like to compute the correlation between two
variables in my dataframe, grouped according to other variables in the same
dataframe.
For example, consider the following dataframe D:
V1  V2  V3
A    1  -1
A    1   1
A   -1  -1
B    1   1
B    1   1
I'd like to use Hmisc::summarize(X=D, by=llist(myvar=D$V1), FUN=corr.V2.V3)
where corr.V2.V3 is defined as follows:
corr.V2.V3 = function(x) {
  d = cbind(x$V2, x$V3)
  # cor(d) is a 2x2 matrix; keep only the V2-V3 entry
  out = cor(d)[1, 2]
  names(out) = c("CORR")
  return(out)
}
I was not able to use Hmisc::summarize in this case because FUN would have to
be a function of a matrix argument rather than of a single vector. Any ideas?
Thanks in advance,
Arnaud
See the Hmisc mApply or summary.formula functions, or use tapply using a
vector of possible subscripts (1:n) as the first argument; then you can
use the subscripts selected to address multiple variables.
Frank
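
A minimal sketch of the tapply-over-subscripts idea described above, using only base R and the example dataframe from the question. The function name corr.by.index and the result name res are made up for illustration, and returning NA for groups where a variable is constant (such as group B) is an assumption about the desired behaviour, not something stated in the thread.

```r
# Example data from the question
D <- data.frame(
  V1 = c("A", "A", "A", "B", "B"),
  V2 = c(1, 1, -1, 1, 1),
  V3 = c(-1, 1, -1, 1, 1)
)

# Hypothetical helper: i is a vector of row subscripts for one group.
# The subscripts are used to pull the matching values of both variables.
corr.by.index <- function(i) {
  x <- D$V2[i]
  y <- D$V3[i]
  # Correlation is undefined for fewer than 2 rows or a constant variable
  if (length(i) < 2 || sd(x) == 0 || sd(y) == 0) return(NA_real_)
  cor(x, y)
}

# Group the subscripts 1:n by V1 and apply the helper to each group
res <- tapply(seq_len(nrow(D)), D$V1, corr.by.index)
print(res)
```

Because the grouped vector holds row numbers rather than data, the helper can address as many columns of D as it needs, which is what gets around the single-vector restriction on FUN.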
--
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University