oh now I got, R is taking unbiased estimator, anyways thanks On Sun, Jul 27, 2008 at 2:54 PM, Arun Kumar Saha <[EMAIL PROTECTED]>wrote:
> I feel there is something wrong in definition in weighted variance. > > Say for a fair dice the variance of outcome should be : 2.92 ( > http://en.wikipedia.org/wiki/Variance) > > However if I use cov.wt() or weighted.var() by Gavin, I get : > > cov.wt(as.data.frame(1:6), rep(0.6, 6)) > $cov > 1:6 > 1:6 3.5 > > $center > 1:6 > 3.5 > > $n.obs > [1] 6 > > $wt > [1] 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 > > i.e. 3.5 > > Therefore if I want to calculate Variance for a r.v. with different prob > for different values then should not use those formulae. Is it the case? > > > On Sun, Jul 27, 2008 at 2:30 PM, Patrick Burns <[EMAIL PROTECTED]>wrote: > >> Have you seen 'cov.wt'? >> >> >> Patrick Burns >> [EMAIL PROTECTED] >> +44 (0)20 8525 0696 >> http://www.burns-stat.com >> (home of S Poetry and "A Guide for the Unwilling S User") >> >> Arun Kumar Saha wrote: >> >>> ur prog gives following result: >>> >>> weighted.var(c(1,-1), c(0.5,0.5)) >>> [1] 2 >>> >>> is it ok? >>> >>> On Thu, Jul 24, 2008 at 7:57 PM, Gavin Simpson <[EMAIL PROTECTED] >>> >wrote: >>> >>> >>> >>>> On Thu, 2008-07-24 at 02:25 +0530, Arun Kumar Saha wrote: >>>> >>>> >>>>> There is a R function to calculate weighted mean : weighted.mean() >>>>> under >>>>> stats package. Is there any direct R function for calculating weighted >>>>> variance as well? >>>>> >>>>> >>>> Here are two ways; weighted.var() is via the usual formula and >>>> weighted.var2() uses a running sums approach. The formulae for which are >>>> both on the weighted mean entry page on wikipedia for example. >>>> >>>> The removal of NA is as per weighted.mean, but I have not included any >>>> of the sanity checks that that functions contains. >>>> >>>> weighted.var <- function(x, w, na.rm = FALSE) { >>>> if (na.rm) { >>>> w <- w[i <- !is.na(x)] >>>> x <- x[i] >>>> } >>>> sum.w <- sum(w) >>>> sum.w2 <- sum(w^2) >>>> mean.w <- sum(x * w) / sum(w) >>>> (sum.w / (sum.w^2 - sum.w2)) * sum(w * (x - mean.w)^2, na.rm = >>>> na.rm) >>>> } >>>> >>>> weighted.var2 <- function(x, w, na.rm = FALSE) { >>>> if (na.rm) { >>>> w <- w[i <- !is.na(x)] >>>> x <- x[i] >>>> } >>>> sum.w <- sum(w) >>>> (sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2)) >>>> } >>>> >>>> ## from example section in ?weighted.mean >>>> ## GPA from Siegel 1994 >>>> wt <- c(5, 5, 4, 1)/15 >>>> x <- c(3.7,3.3,3.5,2.8) >>>> weighted.mean(x,wt) >>>> >>>> weighted.var(x, wt) >>>> >>>> weighted.var2(x, wt) >>>> >>>> And some timings: >>>> >>>> >>>> >>>>> system.time(replicate(100000, weighted.var(x, wt))) >>>>> >>>>> >>>> user system elapsed >>>> 2.679 0.014 2.820 >>>> >>>> >>>>> system.time(replicate(100000, weighted.var2(x, wt))) >>>>> >>>>> >>>> user system elapsed >>>> 2.224 0.010 2.315 >>>> >>>> HTH >>>> >>>> G >>>> -- >>>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% >>>> Dr. Gavin Simpson [t] +44 (0)20 7679 0522 >>>> ECRC, UCL Geography, [f] +44 (0)20 7679 0565 >>>> Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk >>>> Gower Street, London [w] >>>> http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> >>>> <http://www.ucl.ac.uk/%7Eucfagls/> >>>> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk >>>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% >>>> >>>> >>>> >>>> >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >>> >>> >> > > > -- > > -- [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.