You've had answers (three different ones!) so far from Dennis Roberts, , and Jerry Dallal; it appears to me that only Jerry answered the question you asked. Dennis's answer will get you the so-called "pooled variance" estimate, or the within mean square (MSW) in ANOVA terms, but what you asked for would be the total sum of squares (SST) divided by the total number of degrees of freedom (N-1 = n1 + n2 + ... - 1).
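The difference between the pooled estimate and the "global" one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each reported SD is taken to use the usual n-1 denominator, and the function name `combine_summaries` and the example numbers are hypothetical, not taken from any of the posts.

```python
import math


def combine_summaries(groups):
    """Combine per-sample summaries given as (n, mean, sd) tuples.

    Assumes each sd was computed with the n-1 denominator.
    Returns (grand_mean, pooled_sd, total_sd).
    """
    k = len(groups)
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total

    # Within-group sum of squares: undo each sample's variance denominator.
    ssw = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    # Between-group sum of squares: spread of the sample means.
    ssb = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)

    pooled_var = ssw / (n_total - k)          # MSW, the "pooled variance"
    total_var = (ssw + ssb) / (n_total - 1)   # SST / (N - 1)
    return grand_mean, math.sqrt(pooled_var), math.sqrt(total_var)


# Hypothetical example: two samples of 3, e.g. raw data [1, 2, 3] and [5, 6, 7].
# Here SSW = 4 and SSB = 24, so SST = 28 and the total variance is 28/5 = 5.6.
grand_mean, pooled_sd, total_sd = combine_summaries([(3, 2.0, 1.0), (3, 6.0, 1.0)])
print(grand_mean, pooled_sd, total_sd)
```

With these made-up numbers the pooled SD is 1.0 while the total SD is sqrt(5.6), about 2.37; the gap is entirely due to the between-group term. If a source reports a standard error of the mean instead of an SD, multiplying by the square root of that sample's n recovers the SD before it goes into the tuples.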
I agree with Dennis's second post: I do not see how you can have correlated data, unless you have the same persons in each of the different samples, which would appear to be denied by your assertion of different sample sizes. But if you do have correlated data, you need to know what those correlations (or the corresponding covariances) actually are, as W.D. Allen's post indicates. But note that WDA's equations do not take into account the variability among the several means, so they would produce an analogue of MSW, not of what I understand you to be asking for as the "global variance" (or the square root thereof).

I've added a comment at the end, also.  -- DFB.

On 27 Nov 2002, David Robinson wrote:

> I've searched for this, and failed to come up with an answer. It's not
> a homework question, or even a purely academic question - I'm trying
> to combine psychoacoustic data from a number of sources, because I
> need to calculate the percentage of the population who are able to
> detect certain sounds.
>
> Here's the problem:
>
> I have several measurements of the SAME QUANTITY from different tests.
> For each test, I have:
>
> number of subjects, mean value, standard deviation (or standard error
> of the mean.)
>
> (All standard errors will be converted to standard deviations - I know
> they don't quite mean the same thing (even when divided by root N),
> but I believe SD is more appropriate in this context).
>
> I want to combine all the data, to get a global mean, and a global
> standard deviation. The results I want for these two values should be
> the results I would get if I had ALL the original data from all the
> tests, and simply calculated the mean and standard deviation over all
> the data.
>
> (I do not have all the original data - some of it does not exist
> anymore)
>
> I can calculate the global mean easily (n1*x_bar1 + n2*x_bar2 +
> ...)/n_total
>
> I do not know how to calculate the global standard deviation. Please
> can you help?
>
> (I don't understand it, but from what I have read I believe the usual
> technique of simply adding the squares of the SDs is not appropriate
> because I have (hopefully) correlated data, and different sample
> sizes)

The usual technique, as Dennis illustrates, would entail adding the sums of squared deviations from the mean (that is, the usual numerator of the variance formula) for each sample, thus producing the pooled sum of squared deviations, aka the within-group sum of squares (SSW) in ANOVA terms. To that you would then add the contribution due to the (presumed) fact that the sample means are not all identical, so that there is a between-group component (SSB), as Jerry points out. "Adding the squares of the SDs" (and then presumably dividing by the number of SDs so added) would give the same result as that pooling (that is, MSW) if the sample sizes were all the same, but not otherwise (at least, not necessarily).

HTH  -- DFB.
-----------------------------------------------------------------------
Donald F. Burrill    [EMAIL PROTECTED]
56 Sebbins Pond Drive, Bedford, NH 03110    (603) 626-0816
[was: 184 Nashua Road, Bedford, NH 03110    (603) 471-7128]
