On Thursday, 1 August 2013 at 00:10 +0800, Chaos Chen wrote:
> Hi all,
>
> I experienced some unmatched result using mean function in ffbase package
> and cannot figure out what's wrong.
>
> I have a simulated ff vector with 1000000000 numbers inside and want to
> calculate its mean. But the results are quite different.
>
> With mean( ) function in ffbase package, the mean is 152.6858.
> But with R's mean( ) or adding sum from chunks directly, I got 667.5595
>
> any idea ? Thank you in advance!

Could you provide a fully reproducible example with a shorter vector (I cannot create such a large vector on my box)? Use set.seed() so that runif() gives exactly the same values.
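For illustration, a reproducible example of the kind requested might look like the sketch below. It is not the poster's code: the seed, the vector length, the chunk size, and the helper name chunked_mean are all assumptions chosen so that it runs on a small machine. The as.ff() and mean() calls on an ff vector are the ones used in the original post, and that part only runs if ff and ffbase are installed.

```r
# Minimal reproducible sketch (seed, length and chunk size are assumptions).
set.seed(42)                       # makes runif() return the same values on every run
n <- 1e6                           # much shorter than the original 1e9 elements
x <- runif(n, min = 0, max = 1000)

# Hypothetical helper: mean computed by summing fixed-size chunks,
# mirroring the chunk loop in the original post but using base R only.
chunked_mean <- function(v, by) {
  starts <- seq(1, length(v), by = by)
  total <- 0
  for (s in starts) {
    e <- min(s + by - 1, length(v))   # last chunk may be shorter
    total <- total + sum(v[s:e])
  }
  total / length(v)
}

m_base  <- mean(x)                      # R's mean() on an ordinary vector
m_chunk <- chunked_mean(x, by = 50000)  # mean from chunk sums

cat("base mean:   ", m_base, "\n")
cat("chunked mean:", m_chunk, "\n")

# Optional comparison with ffbase, only if both packages are available.
if (requireNamespace("ff", quietly = TRUE) &&
    requireNamespace("ffbase", quietly = TRUE)) {
  suppressPackageStartupMessages(library(ffbase))  # also attaches ff
  xf <- as.ff(x)                                   # same values as an ff vector
  cat("ffbase mean: ", mean(xf), "\n")
}
```

If the three numbers agree at this size, increasing n step by step should show at which length they start to diverge.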
From quick tests here, the problem does not appear.

Regards

> Bayes Chen
>
> > # F1 is an ffdf , F1$X1 is an ff vector
> > length(F1$X1)
> [1] 1000000000
> > # Use mean() function in ffbase package
> > mean(F1$X1)
> [1] 152.6858
>
> > X2 = F1$X1[] #X2 is now an non-ff vector
> > length(X2)
> [1] 1000000000
> > mean(X2) # R's original mean function for ordinary vectors
> [1] 667.5595
> > # calculate sum and then mean by chunks
> > chunks = chunk(F1$X1, by=5000000)
> > sumx = 0
> > for (i in chunks) {
> + sumx = sumx + sum(F1$X1[i])
> + }
> > sumx/length(F1$X1)
> [1] 667.5595
>
> ----------------------------------- below are some other trials
> > X2 = F1$X1[1:1000000]
> > mean(X2)
> [1] 59.43149
> > mean(as.ff(X2))
> [1] 59.43149
>
> > X2 = F1$X1[1:100000000]
> > mean(X2)
> [1] 59.41978
> > mean(as.ff(X2))
> [1] 59.42128
>
> > X2 = F1$X1[1:500000000]
> > mean(X2)
> [1] 60.53615
> > mean(as.ff(X2))
> [1] 57.72168
>
> > X2 = F1$X1[1:750000000]
> > mean(X2)
> [1] 59.37562
> > mean(as.ff(X2))
> [1] 57.81179
>
> > X2 = F1$X1[1:900000000]
> > mean(X2)
> [1] 57.0867
> > mean(as.ff(X2))
> [1] 57.44862
>
> > X3 = F1$X1[900000000:1000000000]
> > mean(X3)
> [1] 6161.814
> > mean(as.ff(X3))
> [1] 6161.797
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.