"Robert J. MacG. Dawson" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
>
> Mark R Marino wrote:
> >
> > Why do you think that the algorithms in these computer packages differ?
> > Why aren't they universal?
> > Any thoughts or comments?
>
> The reason is that there is not and has never been (since five minutes
> after the first person to think of using quartiles went down to the pub
> and told his buddies about it) an agreement on the *definition* of
> quartiles unless N = 4n+1.
>
> The median is not well defined unless N is odd. Usually it is taken as
> the mean of the two adjacent values, but this is artificial, goes
> against the "non-arithmetic", transformation-invariant nature of the
> median, and cannot be extended to ordinal data. It's really an interval
> for even N, but many people have a prejudice against returning an
> interval instead of a number here ("dammit, I started with 11 numbers,
> and you've given me an uncountable set, and you call that a
> _summary_?!?!?!") so something arbitrary usually gets done.
>
> Now, with quartiles, there are several approaches. If you think of Q1
> and Q3 directly as quantiles, you may choose weighted means, based on
> the idea that (N+1)/4 may be a quarter-integer so closer to one
> neighbour than to the next; or unweighted means, taking the "most
> generic" member of the interval. If you think of "medians of lower/upper
> halves" it is tempting to use the "fencepost" ambiguity in the middle to
> ensure that this always exists.
>
> I would like to make a small, possibly novel, proposal here: that for
> EDA purposes this should be done, because the depiction (in a boxplot,
> etc.) of actual data is in the spirit of the exercise; and that when N is
> even and the median not repeated, the box should be marked with both
> data adjacent to the median quantile.
>
>                   ____ _ ____
> Thus: -----------I____|_|____I---
>
> -Robert Dawson

You are complicating things. Tukey invented all this stuff so that
descriptive statistics would be very simple and easy to convey.
Why doesn't anybody talk about Tukey's hinges, and go on and on about how
they aren't really quartiles...

The really important stuff we don't argue about, only the simple stuff.

DAHeiser

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
http://jse.stat.ncsu.edu/
=================================================================
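
For readers who want to see the disagreement discussed in this thread in
numbers, here is a minimal Python sketch. It is an illustration only, not
the algorithm of any particular package. The function names
(median_interval, quartiles_weighted, tukey_hinges) are made up for this
example. It assumes that "weighted means" refers to linear interpolation at
positions (N+1)/4 and 3(N+1)/4 in the sorted data, and that the "fencepost"
approach counts the median in both halves when N is odd.

```python
import math


def median_interval(data):
    """Return the two middle order statistics (equal when N is odd)."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    return xs[(n - 1) // 2], xs[n // 2]


def _interpolate(xs, pos):
    """Value at 1-based position pos in sorted xs, by linear interpolation."""
    n = len(xs)
    pos = min(max(pos, 1.0), float(n))  # clamp so tiny samples stay in range
    lo = math.floor(pos)
    frac = pos - lo
    if frac == 0:
        return float(xs[lo - 1])
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])


def quartiles_weighted(data):
    """Assumed 'weighted mean' rule: positions (N+1)/4 and 3(N+1)/4."""
    xs = sorted(data)
    n = len(xs)
    return _interpolate(xs, (n + 1) / 4), _interpolate(xs, 3 * (n + 1) / 4)


def tukey_hinges(data):
    """Medians of the lower and upper halves; for odd N the median
    is counted in both halves (assumed 'fencepost' convention)."""
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2
    lower, upper = xs[:half], xs[n - half:]
    return sum(median_interval(lower)) / 2, sum(median_interval(upper)) / 2


if __name__ == "__main__":
    sample = [2, 4, 4, 5, 7, 8, 9, 11]
    print(median_interval(sample))     # (5, 7)
    print(quartiles_weighted(sample))  # (4.0, 8.75)
    print(tukey_hinges(sample))        # (4.0, 8.5)
```

On this eight-point sample the median is the interval from 5 to 7 rather
than a single datum, and the two quartile rules agree on the lower value
but differ on the upper one (8.75 versus 8.5).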
