Yeah...

I agree with Sean.  The size of the error surprises me a bit, but this case
is outside the intended usage.

Did you give the values in random order or in consecutive order?  If they
are consecutive, then I am not worried at all.  If you got this error from
random ordering, I am a bit more unhappy.
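
For comparison, here is a minimal sketch of the exact computation, which keeps
every value in memory and so gives the same answer for either ordering.  It
does not use OnlineSummarizer at all; the class and method names
(ExactQuartiles, consecutive, shuffled, quantile) and the seed are made up for
illustration, and the interpolation rule is just one common convention.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Mahout: exact quartiles by sorting.
public class ExactQuartiles {

  // The values 0..99 added twice, in the same order as the quoted main().
  static List<Double> consecutive() {
    List<Double> values = new ArrayList<>();
    for (int i = 0; i < 200; i++) {
      values.add((double) (i % 100));
    }
    return values;
  }

  // The same values in a reproducible pseudo-random order.
  static List<Double> shuffled(long seed) {
    List<Double> values = consecutive();
    Collections.shuffle(values, new Random(seed));
    return values;
  }

  // Exact quantile: sort a copy, then interpolate linearly between the
  // two neighbouring order statistics. Uses O(n) memory, unlike an
  // online estimator.
  static double quantile(List<Double> values, double q) {
    List<Double> sorted = new ArrayList<>(values);
    Collections.sort(sorted);
    double pos = q * (sorted.size() - 1);
    int lo = (int) Math.floor(pos);
    int hi = (int) Math.ceil(pos);
    double frac = pos - lo;
    return sorted.get(lo) + frac * (sorted.get(hi) - sorted.get(lo));
  }

  static void report(String label, List<Double> data) {
    System.out.println(label
        + ": 25%=" + quantile(data, 0.25)
        + " median=" + quantile(data, 0.50)
        + " 75%=" + quantile(data, 0.75));
  }

  public static void main(String[] args) {
    report("consecutive", consecutive());
    report("shuffled", shuffled(42L));
  }
}
```

With this interpolation rule both orderings give 24.75, 49.5 and 74.25, which
is the baseline to compare against.  Feeding the shuffled list to the
summarizer instead of the consecutive one would show whether the input order
is what throws the estimate off.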

On Sun, Apr 17, 2011 at 2:21 AM, Sean Owen <[email protected]> wrote:

> The implementation is intentionally an approximation which uses
> constant memory, instead of tracking the entire data set, which is
> necessary to get an exact answer. You should find it converges to the
> expected values with more data.
>
> On Sun, Apr 17, 2011 at 7:53 AM, Lance Norskog <[email protected]> wrote:
> > If you add the Java methods at the bottom to the
> > org.apache.mahout.stats.OnlineSummarizer and run the main(), a funny
> > thing prints out:
> >
> >
> > [(count=200.0),(sd=28.8660),(mean=49.5000),(min=0.0),(25%=34.1312),(median=60.2104),(75%=83.8722),(max=99.0),]
> >
> > I added the numbers 0-99 twice to the summarizer. I would have
> > expected the 25%=25 +/- 1, median=50 +/- 1, and 75%=75 +/- 1
> > Note that the mean is correct.
> >
> > ---------------------------------------------------------------------------
> >
> >  @Override
> >  public String toString() {
> >    return "[" +
> >      pair("count", getCount()) + pair("sd", getSD()) + pair("mean", getMean()) +
> >      pair("min", getMin()) + pair("25%", getQuartile(1)) + pair("median", getMedian()) +
> >      pair("75%", getQuartile(3)) + pair("max", getMax()) + "]";
> >  }
> >
> >  private String pair(String tag, double value) {
> >    String s = Double.toString(value);
> >    if (s.length() > 8)
> >      s = s.substring(0, 7);
> >    return "(" + tag + "=" + s + "),";
> >  }
> >
> >  public static void main(String[] args) {
> >    OnlineSummarizer osQ = new OnlineSummarizer();
> >    for(int i = 0; i < 200; i++) {
> >      osQ.add(i % 100);
> >    }
> >    System.out.println(osQ.toString());
> >  }
> >
> > --
> > Lance Norskog
> > [email protected]
> >
>
