Re: [Numpy-discussion] PR to add a function to calculate histogram edges without calculating the histogram

Thomas Caswell Thu, 15 Mar 2018 19:57:10 -0700

Yes, I like the name.

The primary use case for Matplotlib is that our `hist` method can take a
list of arrays and produce N histograms in one shot. Currently, with 'auto',
we use only the first data set to work out what the bins should be and then
re-use those for the rest of the data sets. This will let us get the bins
on the merged input, but I take Josef's point that this is not actually
what we want...
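
A rough sketch of the difference, assuming the new function is exposed as
`np.histogram_bin_edges` (the data sets and variable names below are made up
for illustration, and this is not what `hist` does internally today):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data sets of different sizes, standing in for the list
# of arrays passed to `hist`.
datasets = [rng.normal(0.0, 1.0, 200), rng.normal(0.5, 2.0, 1000)]

# Edges chosen from the merged input, without computing a histogram.
merged = np.concatenate(datasets)
edges = np.histogram_bin_edges(merged, bins="auto")

# Re-use the same edges for every data set.
counts = [np.histogram(d, bins=edges)[0] for d in datasets]

# For comparison: edges picked from the first data set only, obtained by
# computing a full histogram and discarding the counts.
first_only = np.histogram(datasets[0], bins="auto")[1]
```

Since the 'auto' rule depends on the number of observations, the merged edges
will generally differ from what any single data set would give on its own,
which is Josef's concern.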


Tom

On Mon, Mar 12, 2018 at 11:35 PM <josef.p...@gmail.com> wrote:

> On Mon, Mar 12, 2018 at 11:20 PM, Eric Wieser
> <wieser.eric+nu...@gmail.com> wrote:
> >> Given that the bin selection is data driven, transferring the bins across
> >> datasets might not be so useful.
> >
> > The main application would be to compute bins across the union of all
> > datasets. This is already possible by using `np.histogram` and
> > discarding the first result, but that's super wasteful.
>
> assuming "union" means a combined dataset.
>
> If you stack datasets, then the number of observations will not be
> correct for individual datasets.
>
> In that case an additional keyword like nobs, or whatever name would
> be appropriate for numpy, would be useful, e.g. use the average number
> of observations across datasets.
> Auxiliary statistics like std could then be computed on the total
> dataset (if that makes sense, which would not be the case if the
> variance across datasets is larger than the variance within datasets).
>
> Josef
>
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion

