About the histogram verb in stats/base
NB. The result is a list of counts of the number of data points in each
interval.
Intervals are specified by the left argument, and histogram uses dyadic I.
to classify the data. For a sorted x, the I. primitive defines (1+#x)
intervals; see its clear definition in NuVoc. In short, for each value of y
it reports the index of the interval that contains that value.
The primitive implies the outer limits __ and _ (negative and positive
infinity):
   _5 0 5 I. _2 _7 0 3 9
1 0 1 2 3
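A small session sketch of the boundary behaviour, for readers who do not have the NuVoc page at hand. The test values are chosen here for illustration only:

```j
   _5 0 5 I. _5 0 5      NB. a value equal to a boundary falls in the interval it closes
0 1 2
   _5 0 5 I. __ 5.1 _    NB. the two outer intervals are unbounded
0 3 3
```

So each interval is open on the left and closed on the right, and anything above {:x gets index #x.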
But histogram does not always correctly report on the last interval:
   _5 0 5 histogram _2 _7 0 3 9
1 2 1 0
Where has the last y value been classified? Five values were passed,
but only four were counted.
   _5 0 5 _ histogram _2 _7 0 3 9   NB. OK
1 2 1 1
As currently defined, histogram counts all of y only if the interval
ending at {:x includes the maximum value of y.
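The missing count can be traced step by step. This assumes the library definition is still the one from the Vocabulary example, copied here under the hypothetical name hist0; the actual script should be checked:

```j
   hist0=: <:@(#/.~)@(i.@#@[ , I.)   NB. assumed copy of the Vocabulary definition
   _5 0 5 (i.@#@[ , I.) _2 _7 0 3 9  NB. one seed per index in i.#x, then the data indices
0 1 2 1 0 1 2 3
   #/.~ 0 1 2 1 0 1 2 3              NB. counts per distinct index
2 3 2 1
   _5 0 5 hist0 _2 _7 0 3 9          NB. the decrement is meant to remove the seeds
1 2 1 0
```

Only #x seeds are prepended, so index 3 has no seed, and the decrement cancels its single data point instead.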
Apparently the current definition comes from an example of dyadic I. in
the (old) Vocabulary. The definition there was appropriate for the given
example (or vice versa).
In a library, should we not have a more reliable tool, and one with
better symmetry?
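One possible direction, as a sketch only: seed all 1+#x interval indices before counting, so that both outer intervals are treated alike. The name hist1 is hypothetical and is not part of stats/base:

```j
   NB. hypothetical variant: one seed for each of the 1+#x intervals
   hist1=: <:@(#/.~)@(i.@>:@#@[ , I.)
   _5 0 5 hist1 _2 _7 0 3 9
1 2 1 1
   _5 0 5 hist1 _2 _7 0 3      NB. last interval empty, still reported
1 2 1 0
```

The result then always has 1+#x items, so code that pairs the counts with x item by item would need adjusting.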
~ Gilles