On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:

> In reference to my previous email.
>
> How can I find the outliers (sample points beyond the whiskers) in the data
> used for the boxplot?
>
> Here is a code snippet that shows how it was used for the timings data (a list
> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>    ...
>    ...
>    ...
>    # Box Plots
>    plt.subplot(2,1,2)
>    timings = [y1,y2,y3,y4]
>    pos = np.array(range(len(timings)))+1
>    bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>                     positions=pos, notch=1, bootstrap=5000 )
>
>    plt.xlabel('Algorithm')
>    plt.ylabel('Execution time (sec)')
>    plt.ylim(0.9*ymin,1.1*ymax)
>
>    plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>    plt.setp(bp['fliers'], markersize=3.0)
>    plt.title('Box plots (%4d trials)' %(n))
>    plt.show()
>    ...
>    ...
>    ...
>
> Again my questions:
> 1) How to get the value of the median?

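This is easily calculated from your data. Numpy will even do it for
you. Since timings is a list of four sublists, call np.median on each
sublist, e.g. np.median(y1). Note that np.median(timings) would pool
all four sets and return a single overall median.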
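
To make the per-box medians concrete, here is a minimal sketch. The real y1..y4 are not shown in your snippet, so the data below is made up with a random generator purely as a stand-in.

```python
import numpy as np

# Stand-in data (assumption): the original y1..y4 are not available here.
rng = np.random.default_rng(0)
timings = [rng.lognormal(mean=0.0, sigma=0.25, size=1000) for _ in range(4)]

# One median per sublist, i.e. one per box in the plot.
medians = [float(np.median(y)) for y in timings]

for k, m in enumerate(medians, start=1):
    print("algorithm %d: median = %.4f" % (k, m))
```

The dictionary returned by plt.boxplot also holds the drawn median lines under bp['medians'], so bp['medians'][0].get_ydata() should give the same value for the first box.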

> 2) How to find the outliers (outside the whiskers)?

From the boxplot documentation: the whiskers extend to the most  
extreme data point within distance X of the bottom or top of the box,  
where X is 1.5 times the extent of the box. Any points more extreme  
than that are the outliers. The box itself of course extends from the  
25th percentile to the 75th percentile of your data. Again, you can  
easily calculate these values from your data.
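
The rule described above can be sketched with numpy as follows. The function name whisker_outliers and the small sample list are my own inventions for illustration, and np.percentile's default interpolation may differ slightly from what your matplotlib version uses internally, so treat the exact cutoffs as approximate.

```python
import numpy as np

def whisker_outliers(y, whis=1.5):
    """Return ((whisker_low, whisker_high), outliers) for one data set.

    Hypothetical helper: applies the 1.5 * (box extent) rule quoted above.
    """
    y = np.asarray(y, dtype=float)
    q1, q3 = np.percentile(y, [25, 75])   # bottom and top of the box
    iqr = q3 - q1                         # extent of the box
    lo_limit = q1 - whis * iqr
    hi_limit = q3 + whis * iqr

    # Whiskers end at the most extreme data points inside the limits.
    inside = y[(y >= lo_limit) & (y <= hi_limit)]
    whiskers = (float(inside.min()), float(inside.max()))

    # Everything beyond the limits is drawn as a flier.
    outliers = y[(y < lo_limit) | (y > hi_limit)]
    return whiskers, outliers

# Made-up sample with one obvious outlier.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0]
whiskers, outliers = whisker_outliers(sample)
print(whiskers, outliers)
```

For your data you would call it once per sublist, e.g. [whisker_outliers(y) for y in timings].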

> 3) How to find the width of the notch?

Again, from the docs: with bootstrap=5000, it calculates the width of  
the notch by bootstrap resampling your data (the timings array) 5000  
times and finding the 95% confidence interval of the median, and uses  
that as the notch width. You can redo that yourself pretty easily.  
Here is some bootstrap code for you to adapt:
http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html
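
In case that link goes stale, here is a rough sketch of a percentile bootstrap for the median. It is my own illustration of the idea, not a copy of matplotlib's internal code; the function name bootstrap_median_ci and its parameters are assumptions.

```python
import numpy as np

def bootstrap_median_ci(y, n_boot=5000, alpha=0.05, seed=None):
    """Percentile-bootstrap confidence interval for the median of y.

    Hypothetical helper: resamples y with replacement n_boot times and
    returns the (alpha/2, 1 - alpha/2) percentiles of the resampled medians.
    """
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    medians = np.empty(n_boot)
    for i in range(n_boot):
        resample = rng.choice(y, size=y.size, replace=True)
        medians[i] = np.median(resample)
    lo, hi = np.percentile(medians, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Made-up data; with the real timings you would pass y1, y2, ... in turn.
data = np.arange(101, dtype=float)
lo, hi = bootstrap_median_ci(data, n_boot=1000, seed=1)
print("95%% CI for the median: [%.2f, %.2f]" % (lo, hi))
```

The notch would then span from lo to hi around the median. With 400,000 points per set and 5000 resamples this loop will take a while, so you may want to try a smaller n_boot first.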

I encourage you to read the documentation! This page is very useful  
for reference:
http://matplotlib.sourceforge.net/api/pyplot_api.html

-Jeff

