Yeah... What Sean says. The inaccuracy surprises me a bit, but it is outside the intended usage.
Did you give the values in random order or in consecutive order? If they are consecutive, then I am not worried at all. If you got this error from random ordering, I am a bit more unhappy.

On Sun, Apr 17, 2011 at 2:21 AM, Sean Owen <[email protected]> wrote:

> The implementation is intentionally an approximation which uses
> constant memory, instead of tracking the entire data set, which is
> necessary to get an exact answer. You should find it converges to the
> expected values with more data.
>
> On Sun, Apr 17, 2011 at 7:53 AM, Lance Norskog <[email protected]> wrote:
> > If you add the Java methods at the bottom to the
> > org.apache.mahout.stats.OnlineSummarizer and run the main(), a funny
> > thing prints out:
> >
> > [(count=200.0),(sd=28.8660),(mean=49.5000),(min=0.0),(25%=34.1312),(median=60.2104),(75%=83.8722),(max=99.0),]
> >
> > I added the numbers 0-99 twice to the summarizer. I would have
> > expected the 25%=25 +/- 1, median=50 +/- 1, and 75%=75 +/- 1
> > Note that the mean is correct.
> >
> > ---------------------------------------------------------------------------
> >
> > @Override
> > public String toString() {
> >   return "[" +
> >     pair("count", getCount()) + pair("sd", getSD()) + pair("mean", getMean()) +
> >     pair("min", getMin()) + pair("25%", getQuartile(1)) +
> >     pair("median", getMedian()) +
> >     pair("75%", getQuartile(3)) + pair("max", getMax()) + "]";
> > }
> >
> > private String pair(String tag, double value) {
> >   String s = Double.toString(value);
> >   if (s.length() > 8)
> >     s = s.substring(0, 7);
> >   return "(" + tag + "=" + s + "),";
> > }
> >
> > public static void main(String[] args) {
> >   OnlineSummarizer osQ = new OnlineSummarizer();
> >   for(int i = 0; i < 200; i++) {
> >     osQ.add(i % 100);
> >   }
> >   System.out.println(osQ.toString());
> > }
> >
> > --
> > Lance Norskog
> > [email protected]
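
To tell the consecutive case from the random case, it may help to have an exact baseline and a shuffled copy of the same input. The sketch below is not Mahout code: ExactQuantiles and its methods are hypothetical names, the seed is arbitrary, and linear interpolation between neighbors is only one of several common quantile conventions. It keeps every value in memory, which is exactly the cost the online approximation avoids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Mahout: exact quantiles for comparison.
class ExactQuantiles {

  // The test data from the thread: the numbers 0-99, added twice, in order.
  static List<Double> consecutive() {
    List<Double> values = new ArrayList<>();
    for (int i = 0; i < 200; i++) {
      values.add((double) (i % 100));
    }
    return values;
  }

  // The same values in random order; the seed is fixed so runs are repeatable.
  static List<Double> shuffled(long seed) {
    List<Double> values = consecutive();
    Collections.shuffle(values, new Random(seed));
    return values;
  }

  // Exact quantile: sort a copy, then interpolate linearly between neighbors.
  // Memory use grows with the data, unlike a constant-memory estimator.
  static double quantile(List<Double> values, double q) {
    if (values.isEmpty()) {
      throw new IllegalArgumentException("no data");
    }
    double[] sorted = new double[values.size()];
    for (int i = 0; i < sorted.length; i++) {
      sorted[i] = values.get(i);
    }
    Arrays.sort(sorted);
    double position = q * (sorted.length - 1);
    int lower = (int) Math.floor(position);
    int upper = Math.min(lower + 1, sorted.length - 1);
    double fraction = position - lower;
    return sorted[lower] + fraction * (sorted[upper] - sorted[lower]);
  }

  public static void main(String[] args) {
    // The exact answer does not depend on input order, so both lines match.
    for (List<Double> data : Arrays.asList(consecutive(), shuffled(42L))) {
      System.out.println("25%=" + quantile(data, 0.25)
          + " median=" + quantile(data, 0.5)
          + " 75%=" + quantile(data, 0.75));
    }
  }
}
```

Under this convention the exact quartiles are 24.75, 49.5 and 74.25 for either ordering. Feeding the list from shuffled() into the summarizer's add() in place of the i % 100 loop would show whether the error persists with random ordering.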
