On Tue, Dec 15, 2009 at 9:57 AM, Andrew Straw <straw...@astraw.com> wrote:
>
> notch_max = med + 1.57*iq/np.sqrt(row)
> notch_min = med - 1.57*iq/np.sqrt(row)
>
> Is this code actually calculating a meaningful value? If so, what?
>

From the statistics ignoramus in the room, so take this with a grain of salt...

I'd write that code as

    notch_max = med + (iq/2) * (pi/np.sqrt(row))

and it makes more sense (1.57 is pi/2 to two decimals, so this is the same expression). The notch limits are an estimate of the interval of the median: half of the q3-q1 range (one half each for up and down) times a normalization factor of pi/sqrt(n), where n == row == len(d). The 1/sqrt(n) makes some sense, as it's the usual statistical error normalization factor. The multiplication by pi I'm not so sure about, and I can't find that exact formula in any quick stats reference, but I'm sure someone who actually knows stats can point out where it comes from.

Note that the code below does:

    if notch_max > q3:
        notch_max = q3
    if notch_min < q1:
        notch_min = q1

though matlab explicitly states in:

http://www.mathworks.com/access/helpdesk/help/toolbox/stats/boxplot.html

that

"""
Interval endpoints are the extremes of the notches or the centers of
the triangular markers. When the sample size is small, notches may
extend beyond the end of the box.
"""

So it seems to me that the more principled thing to do would be to leave those notch markers outside the box if they land there, because that's a warning about the robustness of the estimate. Clipping them to q1/q3 is effectively hiding a problem...

Cheers,

f

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel
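
For reference, here is a minimal sketch of the notch computation discussed above, with the clipping to q1/q3 made optional so the two behaviours can be compared. The function name `notch_limits` and the `clip` flag are made up for illustration and are not matplotlib API; the quartiles come from `np.percentile` with its default interpolation, which may differ from what the plotting code actually uses.

```python
import numpy as np

def notch_limits(d, clip=False):
    """Return (notch_min, notch_max) for a 1-D sample d.

    Hypothetical helper, not part of matplotlib. Uses the
    med +/- 1.57*iq/sqrt(n) expression quoted in this thread.
    """
    d = np.asarray(d, dtype=float)
    n = d.size                      # called `row` in the quoted code
    q1, med, q3 = np.percentile(d, [25, 50, 75])
    iq = q3 - q1                    # interquartile range
    half_width = 1.57 * iq / np.sqrt(n)
    notch_min = med - half_width
    notch_max = med + half_width
    if clip:
        # The behaviour questioned above: force the notches inside the box.
        notch_min = max(notch_min, q1)
        notch_max = min(notch_max, q3)
    return notch_min, notch_max

if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5]        # tiny sample: q1=2, med=3, q3=4
    print("unclipped:", notch_limits(sample))
    print("clipped:  ", notch_limits(sample, clip=True))
```

With only five points the unclipped notches fall outside the [q1, q3] box, which is the small-sample warning the matlab documentation describes; with clip=True that signal disappears.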