{from R-help, diverted to R-devel}:

    UweL> Wang Tian Hua wrote:
    UweL> hi, when i was computing the variance of a simple
    UweL> vector, i found unexpect result. not sure whether it
    UweL> is a bug.

    UweL> Not a bug! ?var:
    UweL> "The denominator n - 1 is used which gives an unbiased
    UweL> estimator of the (co)variance for
    UweL> i.i.d. observations."

    UweL> > var(c(1,2,3))
    UweL> [1] 1    #which should be 2/3.
    UweL> > var(c(1,2,3,4,5))
    UweL> [1] 2.5  #which should be 10/5=2
    UweL>
    UweL> it seems to me that the program uses (sample size -1) instead of sample
    UweL> size at the denominator. how can i rectify this?

    UweL> Simply change it by:

    UweL> x <- c(1,2,3,4,5)
    UweL> n <- length(x)
    UweL> var(x)*(n-1)/n

    UweL> if you really want it.

It seems Insightful at some point gave in to this user
request, and S-plus nowadays has an argument
   "unbiased = TRUE"
where the user can choose {to shoot (him/her)self in the foot and}
require 'unbiased = FALSE'.
{and there's also 'SumSquares = FALSE', which allows skipping
 the division (by N or N-1) altogether}

Since in some ``schools of statistics'' people are really still
taught to use a 1/N variance, we could consider providing such
an argument to var() {and cov()} as well. Otherwise, people
define their own variance function such as

    VAR <- function(x, ...) .. (N-1)/N * var(x, ...)

Should we?

BTW: S+ even has the 'unbiased' argument for cor(), where of
course it really doesn't make any difference (!) because the
factor cancels between numerator and denominator. I actually
think it is rather misleading, since the sample correlation is
not unbiased in almost all cases AFAICS.

Martin
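For concreteness, here is a minimal sketch of what such a user-level wrapper could look like. The function name var_n() and its arguments are purely illustrative (they are not part of base R, and this is not a proposal for the actual var() interface); the 'unbiased' argument only mimics the S-plus one mentioned above.

```r
# Hypothetical helper, NOT part of base R: variance with a selectable
# denominator, built on top of stats::var(), which divides by n - 1.
var_n <- function(x, unbiased = TRUE, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]   # drop NAs first so that n matches var()
  n <- length(x)
  v <- var(x)                    # unbiased estimate, denominator n - 1
  if (unbiased) v else v * (n - 1) / n   # rescale to denominator n
}

x <- c(1, 2, 3, 4, 5)
var(x)                               # 2.5
var_n(x, unbiased = FALSE)           # 2
var_n(c(1, 2, 3), unbiased = FALSE)  # 2/3

# For the correlation the choice of denominator cancels out:
y <- c(2, 1, 4, 3, 6)
f <- (length(x) - 1) / length(x)
r_n1 <- cov(x, y) / sqrt(var(x) * var(y))
r_n  <- (f * cov(x, y)) / sqrt(f * var(x) * f * var(y))
all.equal(r_n1, r_n)                 # TRUE
```

The last few lines illustrate the remark about cor(): rescaling cov() and both var() terms by the same factor leaves the ratio unchanged.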