[matplotlib-devel] should mlab.prctile(x,50) == np.median(x)?
The following (uncommitted) test currently fails. The reason is that mlab.prctile(x,50) doesn't handle even-length sequences according to the numpy and Wikipedia convention for the definition of the median. Do we agree that it should pass?

Not only would I commit the test, but I also have a fix to make it pass, derived from scipy.stats.scoreatpercentile(). This would affect boxplot, if not more.

    import numpy as np
    import matplotlib.mlab as mlab

    def test_prctile():
        # test odd lengths
        x=[1,2,3]
        assert mlab.prctile(x,50)==np.median(x)

        # test even lengths
        x=[1,2,3,4]
        assert mlab.prctile(x,50)==np.median(x)

        # derived from email sent by jason-sage to MPL-user on 20090914
        ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
        p = [75]
        expected = [5.5]

        # test vectorized
        actual = mlab.prctile(ob1,p)
        assert np.allclose( expected, actual )

        # test scalar
        for pi, expectedi in zip(p,expected):
            actuali = mlab.prctile(ob1,pi)
            assert np.allclose( expectedi, actuali )

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev
___
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel
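For readers following along, here is a minimal sketch of a percentile that agrees with the median convention discussed above. It is not the proposed fix itself, which is not shown in this message. It assumes linear interpolation between the two nearest order statistics, and the name percentile_linear is made up for illustration. It uses only the standard library.

```python
import math


def percentile_linear(values, p):
    """Return the p-th percentile (0 <= p <= 100) of values.

    Hypothetical helper: interpolates linearly between the two
    nearest order statistics, so p=50 gives the usual median for
    both odd and even lengths.
    """
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")

    # Fractional position of the percentile in the sorted data.
    pos = (p / 100.0) * (len(data) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)

    # Blend the two neighbours by the fractional part of pos.
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac


ob1 = [1, 1, 2, 2, 1, 2, 4, 3, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 7, 6, 4, 5, 5]
print(percentile_linear([1, 2, 3], 50))
print(percentile_linear([1, 2, 3, 4], 50))
print(percentile_linear(ob1, 75))
```

With this convention the even-length case [1,2,3,4] averages the two middle values, which is the case the test above says currently fails.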
[matplotlib-devel] boxplot notch
Hi,

I've been reading about box plots and examining the source code for boxplot() lately. While there doesn't seem to be a convention about what the notch specifies, I can't find any justification for, or text describing, what exactly the MPL notch is. The source code is:

    # get median and quartiles
    q1, med, q3 = mlab.prctile(d,[25,50,75])
    iq = q3 - q1
    notch_max = med + 1.57*iq/np.sqrt(row)
    notch_min = med - 1.57*iq/np.sqrt(row)

Is this code actually calculating a meaningful value? If so, what?

The original commit was r1098, which doesn't offer a useful comment either (only "aaplied several sf patches" ... looking through the SF bug tracker, I couldn't find anything relevant from before the commit date of 2005-03-28).
Re: [matplotlib-devel] boxplot notch
On Tue, Dec 15, 2009 at 9:57 AM, Andrew Straw wrote:
>
> notch_max = med + 1.57*iq/np.sqrt(row)
> notch_min = med - 1.57*iq/np.sqrt(row)
>
> Is this code actually calculating a meaningful value? If so, what?

From the statistics ignoramus in the room, so take this with a grain of salt...

I'd write that code as

    notch_max = med + (iq/2) * (pi/np.sqrt(row))

and it makes more sense. The notch limits are an estimate of the interval of the median, which is (one-half, for each up/down) the q3-q1 range times a normalization factor which is pi/sqrt(n), where n == row == len(d). The 1/sqrt(n) makes some sense, as it's the usual statistical error normalization factor. The multiplication by pi, I'm not so sure, and I can't find that exact formula in any quick stats reference, but I'm sure someone who actually knows stats can point out where it comes from.

Note that the code below does:

    if notch_max > q3:
        notch_max = q3
    if notch_min < q1:
        notch_min = q1

though matlab explicitly states in:

http://www.mathworks.com/access/helpdesk/help/toolbox/stats/boxplot.html

that

"""
Interval endpoints are the extremes of the notches or the centers of the triangular markers. When the sample size is small, notches may extend beyond the end of the box.
"""

So it seems to me that the more principled thing to do would be to leave those notch markers outside the box if they land there, because that's a warning of the robustness of the estimation. Clipping them to q1/q3 is effectively hiding a problem...

cheers,

f
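A small sketch may make the two points above concrete: the constant 1.57 is numerically very close to pi/2, and clipping changes what a small sample shows. The function name notch_limits and its clip flag are invented here for illustration and are not part of boxplot(); the sample numbers are made up.

```python
import math


def notch_limits(q1, med, q3, n, clip=False):
    """Return (notch_min, notch_max) around the median.

    Hypothetical helper mirroring the quoted boxplot() arithmetic.
    With clip=True the limits are clamped to [q1, q3], as the
    existing code does; with clip=False they may leave the box.
    """
    iq = q3 - q1
    half_width = 1.57 * iq / math.sqrt(n)
    notch_min = med - half_width
    notch_max = med + half_width
    if clip:
        notch_min = max(notch_min, q1)
        notch_max = min(notch_max, q3)
    return notch_min, notch_max


# 1.57 and pi/2 differ by less than 0.001, so the two ways of
# writing the formula give nearly the same notch.
print(abs(math.pi / 2 - 1.57))

# A made-up small sample (n=4): the unclipped notch extends past
# the box, while the clipped one is pinned to q1 and q3.
print(notch_limits(2.0, 3.0, 5.0, 4, clip=False))
print(notch_limits(2.0, 3.0, 5.0, 4, clip=True))
```

In the clipped case the notch coincides with the box edges, so the plot no longer shows that the interval estimate was wider than the interquartile range.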
[matplotlib-devel] imshow without resampling in the ps backend.
A patch that enables drawing images in the ps backend without resampling is committed in r8035, so please test it if you're interested. The raw image is used only when interpolation=="nearest" and there is only one image.

While extending this to other backends such as pdf and svg should be straightforward, I'd like to hear what others think about the overall approach, e.g., API changes and such. The current approach is to minimize changes in the backends.

There are a few minor issues whose solution is not clear to me.

* If there are multiple images (and the ps backend is used), the images are resampled as they are composited into a single image.

* The current solution never resamples when interpolation=="nearest". I think this is generally good behavior. However, in the extreme case of a very high-resolution image (e.g., image dpi > 1000?), it might be better if the image is resampled (i.e., downsampled). One option would be to introduce a new "resample" keyword in the imshow command (which would become an attribute of the image).

Regards,

-JJ
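The rule described above can be sketched as a small decision function. This is only an illustration of the stated behavior, not the code committed in r8035: the function name use_raw_image is made up, and the resample argument models the keyword that is only proposed in this message.

```python
def use_raw_image(interpolation, n_images, resample=None):
    """Decide whether the backend may emit the image unresampled.

    Hypothetical sketch of the rule in this message:
    - several images are composited into one, so they are resampled;
    - an explicit resample flag (the proposed keyword) wins if given;
    - otherwise the raw image is used only for "nearest".
    """
    if n_images != 1:
        return False
    if resample is not None:
        return not resample
    return interpolation == "nearest"


print(use_raw_image("nearest", 1))
print(use_raw_image("bilinear", 1))
print(use_raw_image("nearest", 2))
print(use_raw_image("nearest", 1, resample=True))
```

An explicit flag like this would let a user force downsampling of a very high-resolution image while keeping the default unchanged.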