Increasing the number of samples from 200 to 2000 and upwards drives the
25%/median/75% numbers towards 25/50/75.
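
For comparison, here is a minimal sketch that computes the exact quartiles of
the same input by sorting it, so the summarizer's estimates can be checked
against a baseline. The class name ExactQuartiles and its helpers are
hypothetical and not part of Mahout. The linear interpolation between
neighbouring values is one common quantile convention among several, so other
conventions may differ slightly in the last digits.

```java
import java.util.Arrays;

public class ExactQuartiles {

    // Exact quantile of an already sorted array, interpolating linearly
    // between the two neighbouring values (an assumed convention).
    public static double quantile(double[] sorted, double q) {
        double pos = q * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = Math.min(lo + 1, sorted.length - 1);
        double frac = pos - lo;
        return sorted[lo] + frac * (sorted[hi] - sorted[lo]);
    }

    // Same input as the quoted main(): the values i % 100 for i in [0, n),
    // returned in sorted order.
    public static double[] sample(int n) {
        double[] data = new double[n];
        for (int i = 0; i < n; i++) {
            data[i] = i % 100;
        }
        Arrays.sort(data);
        return data;
    }

    public static void main(String[] args) {
        for (int n : new int[] {200, 2000}) {
            double[] data = sample(n);
            System.out.printf("n=%d 25%%=%.4f median=%.4f 75%%=%.4f%n",
                n, quantile(data, 0.25), quantile(data, 0.5), quantile(data, 0.75));
        }
    }
}
```

This needs the whole data set in memory, which is what an online summarizer
avoids, so it is only useful as a check on small inputs.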


On Sat, Apr 16, 2011 at 11:53 PM, Lance Norskog <[email protected]> wrote:
> If you add the Java methods at the bottom to the
> org.apache.mahout.stats.OnlineSummarizer and run the main(), a funny
> thing prints out:
>
> [(count=200.0),(sd=28.8660),(mean=49.5000),(min=0.0),(25%=34.1312),(median=60.2104),(75%=83.8722),(max=99.0),]
>
> I added the numbers 0-99 twice to the summarizer. I would have
> expected the 25%=25 +/- 1, median=50 +/- 1, and 75%=75 +/- 1
> Note that the mean is correct.
> ---------------------------------------------------------------------------
>
>  @Override
>  public String toString() {
>    return "[" +
>        pair("count", getCount()) + pair("sd", getSD()) + pair("mean", getMean()) +
>        pair("min", getMin()) + pair("25%", getQuartile(1)) + pair("median", getMedian()) +
>        pair("75%", getQuartile(3)) + pair("max", getMax()) + "]";
>  }
>
>  private String pair(String tag, double value) {
>    String s = Double.toString(value);
>    if (s.length() > 8)
>      s = s.substring(0, 7);
>    return "(" + tag + "=" + s + "),";
>  }
>
>  public static void main(String[] args) {
>    OnlineSummarizer osQ = new OnlineSummarizer();
>    for(int i = 0; i < 200; i++) {
>      osQ.add(i % 100);
>    }
>    System.out.println(osQ.toString());
>  }
>
> --
> Lance Norskog
> [email protected]
>



-- 
Lance Norskog
[email protected]
