[R] Correlations by group

Peter J. Lee Mon, 24 Jul 2006 03:56:28 -0700

I'm aware that S N Krishna asked the same 
question. However, I have not managed to implement the 
posted solution for running rank-order 
correlations on multiple subsets of data using the by() function.


Here is my problem:

Take a set of data from two subjects, who 
provided numerical infant mortality (IM) estimates for five countries:

         sub <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)          # grouping variable = 5 rows x 2 subjects
         est <- c(60, 20, 260, 160, 42, 2, 1, 3, 7, 12)  # response variable = 5 estimates x 2 subjects
         im <- c(4, 5, 7, 8, 10, 4, 5, 7, 8, 10)         # actual IM values x 2 subjects
         data <- cbind(sub, est, im)
         data

Using the by() function:

         by(data, sub, function(x) cor(est, im, method = "spearman"))

does return two correlation coefficients. However, 
instead of one correlation per subject, it reports 
the est x im correlation for the entire data set and 
assigns it to both subjects. This can be checked using:

         cor(est, im, method = "spearman")
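A likely explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: the anonymous function receives each subject's rows as x but never uses x, so est and im are looked up in the workspace as the full ten-element vectors. Referring to the columns of x should give one coefficient per subject. The names d and rho_by are introduced here purely for illustration, and a data frame is assumed in place of the cbind() matrix so that x$est and x$im are available.

```r
sub <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
est <- c(60, 20, 260, 160, 42, 2, 1, 3, 7, 12)
im  <- c(4, 5, 7, 8, 10, 4, 5, 7, 8, 10)

# Assumption: a data frame rather than a matrix, so columns can be accessed with $
d <- data.frame(sub, est, im)

# x holds the rows for one subject; correlate its own columns,
# not the global est and im vectors
rho_by <- by(d, d$sub, function(x) cor(x$est, x$im, method = "spearman"))
rho_by
```

This also suggests why the lm() version works: data = x tells lm() to look up est and im inside the subset first.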

Nevertheless, the true coefficients and p-values should be:

         sub[1] cor.coef = 0.1 p > .1
         sub[2] cor.coef = 0.9 p < .05
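One way to obtain both a coefficient and a p-value per subject is to run cor.test() on each subset. The sketch below assumes a data frame d and uses split() with lapply(); the names d, tests and summary_tab are hypothetical. The post does not say whether the quoted p-values are one- or two-sided, so the alternative argument is left at its default (two-sided) and would need adjusting if a directional test is intended.

```r
sub <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
est <- c(60, 20, 260, 160, 42, 2, 1, 3, 7, 12)
im  <- c(4, 5, 7, 8, 10, 4, 5, 7, 8, 10)
d <- data.frame(sub, est, im)

# One Spearman test per subject; alternative defaults to "two.sided"
tests <- lapply(split(d, d$sub), function(g)
  cor.test(g$est, g$im, method = "spearman"))

# Collect coefficient and p-value into a small matrix, one column per subject
summary_tab <- sapply(tests, function(t)
  c(rho = unname(t$estimate), p = t$p.value))
summary_tab
```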

I find it peculiar that running a simple regression by groups does work:

         by(data, sub, function(x) lm(est ~ im, data = x))

indicating that perhaps I'm using the wrong 
grouping function for correlations. I'm using a 
fairly standard Pentium 4 running Windows XP.

On occasion I am required to calculate up to a 
quarter of a million individual correlations, so 
any help would be very much appreciated.
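For a large number of groups, a plain named vector of coefficients may be easier to handle than a by object. The following is a minimal sketch using split() and vapply(), shown on the two-subject example only; no timing claims are made for a quarter of a million groups, and the names d, groups and rho are illustrative.

```r
sub <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
est <- c(60, 20, 260, 160, 42, 2, 1, 3, 7, 12)
im  <- c(4, 5, 7, 8, 10, 4, 5, 7, 8, 10)
d <- data.frame(sub, est, im)

# Split the two columns of interest into one small data frame per subject
groups <- split(d[c("est", "im")], d$sub)

# vapply() returns a numeric vector named by subject, one coefficient each
rho <- vapply(groups,
              function(g) cor(g$est, g$im, method = "spearman"),
              numeric(1))
rho
```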

Best wishes,

Peter James Lee
_________________________

Peter James Lee
Assistant Professor

Department of Psychology
Bilkent University
Bilkent
Ankara
Turkey
06800

e-mail: [EMAIL PROTECTED]
office: (90) 312 290 1807
home: (90) 312 290 3447
website: http://www.bilkent.edu.tr/~peterjl/index.html
_________________________