Good question, Brian, and an interesting discussion.

I think the key to the problem is to agree on exactly what the desired
behaviour of histogram should be.
Mathematical/statistical convention seems to be that intervals should be
closed on the left and open on the right. This makes life a bit harder for
us because I. natively works with intervals that are open on the left and
closed on the right. Luckily Brian has solved this problem for us with his
Idotr verb.
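
To illustrate the difference between the two conventions, here is a small
sketch. The name idxLeftClosed is hypothetical and is not Brian's Idotr
(whose definition is not reproduced in this message). It is a brute-force
table comparison that assumes ascending limits and is only meant to show
the intended result.

```j
limits =: 10 20 30
data   =: 5 10 15 20 25 30 35

NB. I. counts the limits strictly less than each datum,
NB. so a datum equal to a limit lands in the lower bin: (lo, hi]
limits I. data                NB. 0 0 1 1 2 2 3

NB. hypothetical stand-in: count the limits less than or equal to each
NB. datum, so a datum equal to a limit lands in the upper bin: [lo, hi)
idxLeftClosed =: [: +/ <:/
limits idxLeftClosed data     NB. 0 1 1 2 2 3 3
```

The table built by <:/ has one row per limit and one column per datum, so
this form is fine for checking results but would probably be too slow for
large inputs.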

The next issue to resolve is how to specify the intervals. That is, what
format of bins/limits should the desired verb histogram expect, and how
should it interpret them? Should the upper and/or lower limits be provided,
or should __ and _ be implied, as Gilles suggests?

I think my preference would be to imply __ and _. The usual reason for
generating a histogram is to understand what the data looks like. For
that reason I'd prefer all of the data to be included, without having to
spend much (any!) time working out sensible bin values.
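
As a rough sketch of what implied __ and _ would mean in practice: n
limits define n+1 bins, and every datum falls into one of them. The verb
histSketch below is a hypothetical illustration, not Brian's histogram2
and not a proposal for the library. It assumes ascending limits and bins
that are closed on the left and open on the right.

```j
NB. x: ascending limits, y: data
NB. result: counts for the (#x)+1 bins, with __ and _ implied at the ends
histSketch =: dyad define
  idx =. +/ x <:/ y            NB. bin index of each datum, in 0..#x
  <: #/.~ (i. >: # x) , idx    NB. seed every bin once, count, remove seed
)

10 20 30 histSketch 5 10 15 20 25 30 35    NB. 1 2 2 2
```

Prepending i. >: # x makes sure that empty bins still show up as zeros
and that the counts come out in bin order.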

Which leads to ... another nice thing to add would be a verb (calcBins ?)
to calculate the limits needed for a desired number of bins. If we
decided that the upper and lower limits should be explicitly specified,
then calcBins could obviously determine the min and max of the data to
ensure that everything is included; however, outliers may cause some
issues.

BinCounts=: Limits histogram Data
BinCounts=: 10 (calcBins histogram ]) Data
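
One possible shape for such a verb, as a sketch only: calcBinsSketch is a
hypothetical name, and it assumes equal-width bins, numeric list data,
and implied __ and _, so it returns only the interior limits (one fewer
than the number of bins).

```j
NB. x: desired number of bins, y: data
NB. result: the x-1 interior limits, equally spaced between min and max
calcBinsSketch =: dyad define
  lo =. <./ y
  hi =. >./ y
  lo + ((hi - lo) % x) * >: i. <: x
)

4 calcBinsSketch 0 1 2 3 4 5 6 7 8    NB. 2 4 6
```

Because the spacing is taken from the min and max, a single outlier would
stretch the bins and squeeze most of the data into one or two of them.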

I recently added a J solution for the "Bin given limits" task on Rosetta
Code: https://rosettacode.org/wiki/Bin_given_limits
Brian's histogram2 gives the same bin counts as my solution there, is
faster when only the counts are needed, and will work better for
non-integer data. It would currently be my preferred solution for the
stats/base library.

On Sun, Apr 11, 2021 at 9:33 AM Brian Schott <[email protected]> wrote:

> As you thought, Steve Jost kindly confirmed.
> I had never heard of the upper bin being treated differently.
>
> On Sat, Apr 10, 2021 at 1:59 PM Raul Miller <[email protected]> wrote:
>
> > I give the English sentence precedence over the label for a specific
> > row in a table.
> >
> > For the example treated by that table, it does not change the numbers.
> >
> > That said, I suppose it would be worth talking with Steve Jost about
> > this issue. He likely has references worth reading that led him to
> > write that sentence, and he might even like to hear someone having
> > noticed the conflict in his treatment of that topic.
> >
> > Thanks,
> >
> > --
> > Raul
> >
> > --
> (B=)
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>