> William B. Ware <[EMAIL PROTECTED]> wrote:
> >Anyway, more to the point... the "add one" is an old argument based on
> >the notion of "real limits." Suppose the range of scores is 50 to
> >89. It was argued that 50 really goes down to 49.5 and 89 really
> >goes up to 89.5. Thus the range was defined as 89.5 - 49.5... thus
> >the additional one unit...

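The "real limits" arithmetic quoted above can be sketched in a few lines of Python. This is only an illustration: the function name `real_limits_range` and the `unit` parameter are hypothetical, and the sketch assumes scores are rounded (not truncated) to the nearest `unit`.

```python
def real_limits_range(lowest, highest, unit=1.0):
    """Distance between the 'real limits' of scores rounded to the nearest `unit`.

    Assumption: each reported score stands for an interval of width `unit`
    centred on it, so the limits extend half a unit beyond each extreme.
    """
    lower_limit = lowest - unit / 2   # 50 "really goes down to" 49.5
    upper_limit = highest + unit / 2  # 89 "really goes up to" 89.5
    return upper_limit - lower_limit


print(real_limits_range(50, 89))  # 40.0
print(89 - 50)                    # 39, the plain max - min
```

The extra amount is one unit of measurement, whatever that unit is; it equals 1 only when scores are reported as whole numbers.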
I recall textbooks (in the late 1960s and 1970s) that defined both an "exclusive range" (= max - min) and an "inclusive range" (= max - min + 1), the latter invariably being illustrated with examples of data that came in integers. (In fact, the examples _may_ always have been of variables that were counts.)

On Sat, 6 Oct 2001, Stan Brown replied:

> Perhaps a better argument is that if you count the numbers you get
> forty of them: 50, 51, 52, ..., 59 makes ten, and similarly for the
> 60s, 70s, and 80s.

I see the argument, but I don't know that I'd call it "better". It seems to confuse apples with oranges. By "range", does one mean the _distance_ between the largest and smallest values in the data, or the _number_of_different_values_ between those two extremes? (These are NOT equivalent concepts!) And if the latter is of interest, does one want the number of different values _in_this_data_set_, or the number of _possible_ different values that might have been observed (under what hypothetical conditions?)?

The "inclusive range" rule supplies the latter (under the assumption that the possible values can only be integers, which is an interesting restriction in itself) -- but not for all imaginable variables.

[Counterexample: What's the range of possible values of a hand in cribbage? The smallest possible value is 0, the largest is 29. The "exclusive range" (in a possibly artificial data set that includes all possible hands, or at least all possible values) is 29 - 0 = 29. The "inclusive range" is 30, which is the number of integers between 0 and 29 inclusive. The number of _actual_values_ that can possibly be observed is 26 (of the integers from 0 to 29, the values 19, 25, 26, and 27 are not possible for a cribbage hand).]

Anyway: one reason people argue about how to calculate the range is that they have not decided whether "range" should mean "distance in the measured variable" or "number of [possible?] different values of the measured variable", and indeed have not perceived that there _is_ such a distinction to be made.

As William Ware reminds us, even when "range" is taken as distance, there may still be a distinction to be made based on the size of the units of measurement to which the measured variable is reported, and on whether one wishes to include the (presumed) half-units at either end of the empirical distribution (or, for variables like "age" that are customarily truncated rather than rounded, the (presumed) whole unit at the right end).

The "inclusive" argument seems essentially to require (i) that the latent variable being measured be continuous, (ii) that one knows the precision of measurement to which the measured variable is being reported, and (iii) that one wishes not so much to describe the (empirical) sample in hand as to make inferences to the population from which one conceives it to have been drawn, assuming a specific (but usually only _implicit_) model by which the observed values are thought to have been derived from the latent values.

Hmph. Didn't intend to be quite so long-winded.
-- DFB.
------------------------------------------------------------------------
Donald F. Burrill [EMAIL PROTECTED]
184 Nashua Road, Bedford, NH 03110 603-471-7128
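
To make the distinction between "range" as a distance and "range" as a count of values concrete, here is a minimal Python sketch. The function name `range_summaries`, the `unit` parameter, and the scores are all hypothetical illustrations, not data from the thread; the sketch assumes values are reported in whole multiples of `unit`.

```python
def range_summaries(values, unit=1):
    """Return three different quantities that all get called 'range'.

    Assumption: `values` are reported in whole multiples of `unit`.
    """
    lowest, highest = min(values), max(values)
    return {
        # Distance between the extremes in the measured variable.
        "exclusive_range": highest - lowest,
        # Distance with one extra reporting unit added (the "add one" rule).
        "inclusive_range": highest - lowest + unit,
        # Number of different values actually present in this data set.
        "distinct_observed": len(set(values)),
    }


scores = [50, 52, 52, 60, 71, 71, 89]  # hypothetical integer scores
summary = range_summaries(scores)
print(summary)
# {'exclusive_range': 39, 'inclusive_range': 40, 'distinct_observed': 5}
```

The three numbers answer three different questions, and none of them counts the values that are _possible_ but unobserved; that would need a separate list of attainable values supplied from outside the data.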