On Sun, Apr 12, 2015 at 9:45 AM, Jaime Fernández del Río <[email protected]> wrote:

> On Sun, Apr 12, 2015 at 12:19 AM, Varun <[email protected]> wrote:
>
>> http://nbviewer.ipython.org/github/nayyarv/matplotlib/blob/master/examples/statistics/Automating%20Binwidth%20Choice%20for%20Histogram.ipynb
>>
>> Long story short, histogram visualisations that depend on numpy (such as
>> matplotlib, or nearly all of them) have poor default behaviour as I have to
>> constantly play around with the number of bins to get a good idea of what I'm
>> looking at. The bins=10 works ok for up to 1000 points or very normal data,
>> but has poor performance for anything else, and doesn't account for
>> variability either. I don't have a method easily available to scale the number
>> of bins given the data.
>>
>> R doesn't suffer from these problems and provides methods for use with its
>> hist method. I would like to provide similar functionality for matplotlib, to
>> at least provide some kind of good starting point, as histograms are very
>> useful for initial data discovery.
>>
>> The notebook above provides an explanation of the problem as well as some
>> proposed alternatives. Use different datasets (type and size) to see the
>> performance of the suggestions. All of the methods proposed exist in R and
>> literature.
>>
>> I've put together an implementation to add this new functionality, but am
>> hesitant to make a pull request as I would like some feedback from a
>> maintainer before doing so.
>
> +1 on the PR.

+1 as well.

Unfortunately we can't change the default of 10, but a number of string
methods, with a "bins=auto" or some such name prominently recommended in the
docstring, would be very good to have.

Ralf
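
To make the idea of string-valued bins concrete, here is a rough sketch of how such a dispatch could work. It is not the implementation from the notebook or the proposed PR. The thread does not say which estimators are meant, so the two rules used here (Sturges and Freedman-Diaconis, both of which R ships as nclass.Sturges and nclass.FD) are an assumption, as are the function names, the rule names "sturges" and "fd", and the choice to let "auto" take the larger of the two bin counts. The integer default of 10 is left alone, as suggested above.

```python
import math


def sturges_bins(data):
    """Sturges' rule: ceil(log2(n)) + 1 bins. Depends only on sample size."""
    n = len(data)
    if n < 1:
        raise ValueError("need at least one data point")
    return int(math.ceil(math.log2(n))) + 1


def _quantile(sorted_data, q):
    """Linearly interpolated quantile of already-sorted data."""
    pos = q * (len(sorted_data) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])


def freedman_diaconis_bins(data):
    """Freedman-Diaconis rule: bin width h = 2 * IQR / n**(1/3).

    Unlike Sturges, this takes the spread of the data into account.
    """
    xs = sorted(data)
    n = len(xs)
    if n < 1:
        raise ValueError("need at least one data point")
    iqr = _quantile(xs, 0.75) - _quantile(xs, 0.25)
    span = xs[-1] - xs[0]
    if iqr == 0 or span == 0:
        # Degenerate spread: fall back to the size-only rule (an assumption).
        return sturges_bins(xs)
    width = 2.0 * iqr / n ** (1.0 / 3.0)
    return max(1, int(math.ceil(span / width)))


# Hypothetical rule names; "auto" as the max of both counts is an assumption.
_ESTIMATORS = {
    "sturges": sturges_bins,
    "fd": freedman_diaconis_bins,
    "auto": lambda d: max(sturges_bins(d), freedman_diaconis_bins(d)),
}


def choose_bins(data, bins=10):
    """Return a bin count: integers pass through, strings pick an estimator."""
    if isinstance(bins, str):
        try:
            estimator = _ESTIMATORS[bins]
        except KeyError:
            raise ValueError("unknown bin rule: %r" % (bins,))
        return estimator(data)
    return int(bins)
```

Callers who pass nothing still get 10 bins, so existing behaviour is unchanged, while choose_bins(data, "auto") scales the count with the size and spread of the data.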
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
