If you add the Java methods at the bottom of this message to
org.apache.mahout.stats.OnlineSummarizer and run main(), something
odd prints out:

[(count=200.0),(sd=28.8660),(mean=49.5000),(min=0.0),(25%=34.1312),(median=60.2104),(75%=83.8722),(max=99.0),]

I added the numbers 0-99 twice to the summarizer, so I would have
expected 25%=25 +/- 1, median=50 +/- 1, and 75%=75 +/- 1. Instead all
three quartile estimates come out roughly 9 to 10 too high.
Note that the mean is correct.
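
For comparison, here is a minimal standalone sketch that computes the exact quartiles of the same input by sorting it and interpolating linearly between neighbouring values. The class and method names (ExactQuartiles, sortedSample, quantile) are made up for this check and are not part of Mahout; linear interpolation is just one common quantile convention, assumed here.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of Mahout: an exact reference for the
// quartiles that OnlineSummarizer is only estimating.
class ExactQuartiles {

  // Same input as the main() in this message: the values 0..99, added twice,
  // then sorted so quantiles can be read off by position.
  static double[] sortedSample() {
    double[] data = new double[200];
    for (int i = 0; i < data.length; i++) {
      data[i] = i % 100;
    }
    Arrays.sort(data);
    return data;
  }

  // Exact quantile of already-sorted data, interpolating linearly between
  // the two neighbouring values (an assumed convention, one of several).
  static double quantile(double[] sorted, double q) {
    double pos = q * (sorted.length - 1);
    int lo = (int) Math.floor(pos);
    int hi = Math.min(lo + 1, sorted.length - 1);
    return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
  }

  public static void main(String[] args) {
    double[] data = sortedSample();
    System.out.println("25%=" + quantile(data, 0.25));
    System.out.println("median=" + quantile(data, 0.5));
    System.out.println("75%=" + quantile(data, 0.75));
  }
}
```

With this convention the exact values are 24.75, 49.5 and 74.25, which is the range I expected the summarizer to land in.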
---------------------------------------------------------------------------

  // Prints all summary statistics on one line.
  @Override
  public String toString() {
    return "[" +
        pair("count", getCount()) + pair("sd", getSD()) + pair("mean", getMean()) +
        pair("min", getMin()) + pair("25%", getQuartile(1)) +
        pair("median", getMedian()) +
        pair("75%", getQuartile(3)) + pair("max", getMax()) + "]";
  }

  // Formats one (tag=value), entry; values longer than 8 characters are cut to 7.
  private String pair(String tag, double value) {
    String s = Double.toString(value);
    if (s.length() > 8) {
      s = s.substring(0, 7);
    }
    return "(" + tag + "=" + s + "),";
  }

  public static void main(String[] args) {
    OnlineSummarizer osQ = new OnlineSummarizer();
    // Add the values 0..99 twice.
    for (int i = 0; i < 200; i++) {
      osQ.add(i % 100);
    }
    System.out.println(osQ);
  }

-- 
Lance Norskog
[email protected]
