On Wed, Aug 13, 2014 at 7:41 AM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:
> On 13/08/14 07:57, Ron Michael wrote:
>
>> Hi,
>>
>> I would need to get a clarification on a quite fundamental statistics
>> property, hope expeRts here would not mind if I post that here.
>>
>> I learnt that variance-covariance matrix of the standardized data is equal
>> to the correlation matrix for the unstandardized data. So I used following
>> data.
>
> <SNIP>
>
>> (t(Data_Normalized) %*% Data_Normalized)/dim(Data_Normalized)[1]
>>
>> Point is that I am not getting exact CORR matrix. Can somebody point me
>> what I am missing here?
>
> You are using a denominator of "n" in calculating your "covariance" matrix
> for your normalized data. But these data were normalized using the sd()
> function which (correctly) uses a denominator of n-1 so as to obtain an
> unbiased estimator of the population standard deviation.

As a small point, sd() with the n - 1 denominator is not _quite_ an
unbiased estimator of the population SD; see Cureton (1968), Unbiased
Estimation of the Standard Deviation, The American Statistician, 22(1).
To see this in action:

    library(parallel)
    ## assumes a local cluster; any cluster object cl will do
    cl <- makeCluster(detectCores())
    res <- unlist(parLapply(cl, 1:1e7, function(i) sd(rnorm(10, mean = 0, sd = 1))))
    stopCluster(cl)

    ## multiplicative correction for normally distributed data, sample size n
    correction <- function(n) {
      gamma((n-1)/2) * sqrt((n-1)/2) / gamma(n/2)
    }

    mean(res)                   # 0.972583
    mean(res * correction(10))  # 0.9999216

The sample variance (with the n - 1 denominator) is an unbiased estimate
of the population variance, but the square root is a nonlinear function,
and the square root of an unbiased estimator is not itself necessarily
unbiased.

> If you calculated
>
> (t(Data_Normalized) %*% Data_Normalized)/(dim(Data_Normalized)[1]-1)
>
> then you would get the same result as you get from cor(Data) (to within
> about 1e-15).
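To illustrate the denominator point quoted above, here is a minimal sketch. The original data were snipped from the thread, so this uses simulated data (purely hypothetical), and it assumes the normalization was done with scale(), which centres each column and divides by sd(). The names cov_n and cov_nm1 are mine, not from the original post.

```r
# Hypothetical data: the poster's actual data were snipped from the thread.
set.seed(1)
Data <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
Data[, 2] <- Data[, 2] + 0.5 * Data[, 1]  # induce some correlation
n <- nrow(Data)

# Assumption: standardize with scale(), which centres each column and
# divides by sd(), i.e. uses the n - 1 denominator.
Data_Normalized <- scale(Data)

cov_n   <- crossprod(Data_Normalized) / n        # denominator n
cov_nm1 <- crossprod(Data_Normalized) / (n - 1)  # denominator n - 1

# Denominator n - 1 matches cor(Data) up to floating point error.
print(all.equal(cov_nm1, cor(Data), check.attributes = FALSE))

# Denominator n is off by the factor (n - 1) / n.
print(all.equal(cov_n, cor(Data) * (n - 1) / n, check.attributes = FALSE))
```

With the n denominator the diagonal comes out as (n - 1)/n rather than 1, which is the discrepancy described in the question.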
>
> cheers,
>
> Rolf Turner
>
> --
> Rolf Turner
> Technical Editor ANZJS
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518