If I understood correctly, the following might be simpler (dat is the data frame holding the data):

> sum(ave(dat$x, dat$id, FUN=scale, scale=FALSE) *
+     ave(dat$y, dat$id, FUN=scale, scale=FALSE))
[1] 6.229377

Andy

> From: Huntsinger, Reid
>
> You could do something like
>
> ids <- unique(mydata$id)
> ans <- vector(length=length(ids), mode="list")
> for (i in ids) {
>   g <- which(mydata$id == i)
>   ans[[i]] <- (length(g) - 1)*cov(mydata$x[g], mydata$y[g])
> }
> ans
>
> but cov() returns NA for length 1 vectors, so you'd want an
> if (length(g) == 1) ans[i] <- 0 else ans[i] <- ... construction.
>
> This is almost brute force; you could also use tapply, as follows:
>
> sx <- tapply(mydata$x,INDEX=mydata$id,FUN=sum)
> sy <- tapply(mydata$y,INDEX=mydata$id,FUN=sum)
> sxy <- tapply(mydata$x*mydata$y, INDEX=mydata$id, FUN=sum)
> n <- tapply(mydata$id,INDEX=mydata$id,FUN=length) # or use table()!
>
> and now your inner sum is
>
> sxy - 2*sx*(sy/n) + n*(sx/n)*(sy/n) = sxy - sx*sy/n
>
> so
>
> sum(sxy - sx*sy/n) should do.
>
> One more approach is to make your dataset into a list of data frames, one
> for each id, then use lapply(). The list can be created by split(). In one
> line,
>
> lapply(split(mydata,f=mydata$id),function(z) (length(z$x) - 1)*cov(z$x,z$y))
>
> and take sum(,na.rm=TRUE) to remove the NAs due to single ids that you want
> to be zeros.
>
> Reid Huntsinger
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Kerry Bush
> Sent: Wednesday, June 15, 2005 11:41 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] need help on computing double summation
>
> Dear helpers in this forum,
>
> This is a clarified version of my previous questions in this forum.
> I really need your generous help on this issue.
>
> Suppose I have the following data set:
>
> ......
>
> Now I want to compute the following double summation:
>
> sum_{i=1}^k sum_{j=1}^{n_i} (x_{ij}-mean(x_i))*(y_{ij}-mean(y_i))
>
> i is from 1 to k, indexing the ith subject id; and j is from 1 to n_i,
> indexing the jth observation for the ith subject.
>
> In the above expression, mean(x_i) is the mean of x values for the ith
> subject, and mean(y_i) is the mean of y values for the ith subject.
>
> Is there a simple way to do this in R?

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
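
The approaches suggested in this thread (ave, tapply, and split with cov) should all give the same double sum. Below is a minimal sketch that checks this on made-up data; the data frame dat and its values are assumptions for illustration only, since the data set in the original question was elided. Because ave() documents its "..." arguments as grouping variables, the sketch centers each group with a small helper (center, a name introduced here) instead of passing extra arguments through to scale().

```r
# Made-up data for illustration only; not the data from the original post.
dat <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),            # id 3 has a single observation
  x  = c(1.0, 2.0, 4.0, 3.0, 5.0, 2.5),
  y  = c(2.0, 1.0, 5.0, 4.0, 8.0, 1.5)
)

# Hypothetical helper: subtract the group mean from each value.
center <- function(v) v - mean(v)

# Approach 1: center x and y within each id with ave(), then sum the products.
s_ave <- sum(ave(dat$x, dat$id, FUN = center) *
             ave(dat$y, dat$id, FUN = center))

# Approach 2: per-id sums via tapply(), using sum(xy) - sum(x)*sum(y)/n.
sx  <- tapply(dat$x, dat$id, sum)
sy  <- tapply(dat$y, dat$id, sum)
sxy <- tapply(dat$x * dat$y, dat$id, sum)
n   <- tapply(dat$id, dat$id, length)
s_tapply <- sum(sxy - sx * sy / n)

# Approach 3: split by id and use (n_i - 1) * cov(). cov() gives NA for a
# single observation, so na.rm = TRUE drops that id (a zero contribution).
s_split <- sum(sapply(split(dat, dat$id),
                      function(z) (nrow(z) - 1) * cov(z$x, z$y)),
               na.rm = TRUE)

print(c(ave = s_ave, tapply = s_tapply, split = s_split))

# The three results should agree up to floating-point error.
stopifnot(isTRUE(all.equal(s_ave, s_tapply)),
          isTRUE(all.equal(s_ave, s_split)))
```

Note that sapply() is used in the third approach so that sum() receives a numeric vector rather than the list returned by lapply().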