Statistical packages offer a range of alternative definitions of the
quantile function. The following script implements the forms considered
in a useful survey paper from some years ago (Hyndman and Fan, 1996,
cited in the script). It needs further functions to answer Devon's
question, but it does include some options worth considering.

NB.  Quantile functions


NB.  Each of the functions below generates a plot object
NB.  which lets you plot the quantile function for a data
NB.  vector.  The plot object consists of the boxed p values
NB.  and the boxed quantiles.

NB.  With this form, values between the data points in the
NB.  plot object are found by linear interpolation.

NB.  The functions are numbered as in Rob J. Hyndman and Yanan Fan,
NB.  Sample Quantiles in Statistical Packages.  The American
NB.  Statistician, 1996, Vol 50(4), 361-365.  Their functions
NB.  QP2 and QP3 are closely related to QP1 and dominated in nearly
NB.  all respects by later alternatives so are not included.

NB.  I have checked the usage in SPlus 6 and R but comments on
NB.  other packages are from Hyndman and Fan.

NB.  In Hyndman and Fan the treatment of p outside the interval
NB.  associated with x(1),...,x(n) is not always defined.  We have
NB.  adopted the uniform practice that in any such region the
NB.  inverse of the EDF is used as the definition.

NB.  Classical definition - invert the EDF

NB.   Functions for Generating Graphs
NB.  [lowest limit, all upper limits] EDF frequency

EDF=: 3 : 0
(2#/:~y);}.}:(2#i.>:#y)%(#y)
:
(}.2#y,+`-/_1 _1 _2{y); }:2#0,((+/\x)%+/x),1
)

NB.  Available in SAS  PROC UNIVARIATE
Invert =: |."1
QP1 =: [: Invert EDF

NB.  Parzen (1979)  Interpolates step function of QP1
NB.  Available in SAS  PROC UNIVARIATE
QP4 =: ((([: i. [: >: #)%#);{.,])@/:~

NB.  Old definition proposed by Hazen (1914).  Used by hydrologists.
NB.  Used in $tab interpolate in GLIM V3.77
NB.  Also appears to be used by SPlus and R for estimating quartiles
NB.  for box plots.
QP5 =: 3 : 0
(0,(((i.n)+0.5)%n=. #y),1);({.,],{:)y =. /:~y
)

NB.  Weibull (1939) and Gumbel (1939) both proposed this measure.
NB.  Divides space into n+1 regions each with prob %(n+1) on average.
NB.  Used in BMDP for quartiles.
NB.  Used in Minitab  DESCRIBE command for quartiles.
NB.  Available in SAS  PROC UNIVARIATE
NB.  Appears to be used by SPSS
QP6 =: 3 : 0
(0,((>:i.n)%(1+n=.#y)),1);({.,],{:)y=. /:~y
)

NB.  Gumbel (1939) also proposed this definition.
NB.  Divides the range into (n-1) intervals.
NB.  Exactly 100p% of intervals lie to the left of QP7(p)
NB.  Used in SPlus 6  quantile(), but not for box plots.
QP7 =: 3 : 0
((i.#y)%<:#y);y =. /:~ y
)

NB.  Reiss (1989) and Hyndman and Fan (1996)
NB.  Sample quantile is median-unbiased to O(n^_0.5)
QP8 =: 3 : 0
(0,((_1r3+>:i.n)%(1r3+n=.#y)),1);({.,],{:)y =. /:~y
)

NB.  Blom(1958) shows this is a better approximation to
NB.  F(E(X(k))) for the normal distribution.  QP9(p(k)) is
NB.  an approximately unbiased estimator of Q(p(k)) when F
NB.  is normal.  Tends to be used for normal QQ plots.
QP9 =: 3 : 0
(0,((_0.375+>:i.n)%(0.25+n=.#y )),1);({.,],{:)y =. /:~y
)

QP =: QP8  NB.  The Hyndman and Fan recommendation
           NB.  Available in R as quantile(x, type = 8); R's default is type 7
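
One of the further functions needed is a way to read a quantile off a
plot object. The following is only a sketch: qinterp is a hypothetical
helper, not part of the original script. It assumes a plot object of the
form (boxed p values ; boxed quantiles) with strictly increasing p, as
produced by QP4 to QP9, and it is written for current J, where the
arguments of an explicit definition are named x and y.

```j
NB.  Hypothetical helper (an assumption, not from the original script).
NB.  x: plot object, boxed p values ; boxed quantiles, p strictly increasing
NB.  y: one or more probabilities in the interval 0 to 1
NB.  Result: quantiles found by linear interpolation in the plot object.
qinterp =: 4 : 0
'p q' =. x
i =. 0 >. (_2 + #p) <. <: p I. y     NB. left end of the bracketing interval
f =. (y - i{p) % (i{p) -~ (>:i){p     NB. fractional position inside it
((1-f) * i{q) + f * (>:i){q
)

NB.  Small check on a hand-built plot object:
(0 0.25 0.5 0.75 1 ; 3 4 5 6 7) qinterp 0.5 0.6     NB. gives 5 5.4
```

If QP5 and QP7 are defined in the session and scrs is Devon's 20-item
score vector, (QP5 scrs) qinterp 0.25 0.5 0.75 should reproduce the
52.5 61 70.5 quoted from the web site in the thread below, and
(QP7 scrs) qinterp 0.25 0.5 0.75 the 52.75 61 70.25 reported for Excel.
The step function from QP1 has repeated p values and is not handled by
this sketch.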


----- Original Message ----- 
From: "Sherlock, Ric" <[email protected]>
To: "Programming forum" <[email protected]>
Sent: Sunday, October 18, 2009 9:01 PM
Subject: Re: [Jprogramming] "median" considered inaccurate?


> There is a problem with the previous general versions and Don's original 
> for odd-numbered groups - ntiles2 fixes using the same mechanism and 
> similar structure to median
>
>   ntiles1=: [: -:@(+/) (,: <:)@((%~ i.&.<:)@[ >.@:* #@]) { /:~@]
>
>   midpts=: (%~ i.&.<:)@[ * <:@#@]
>   ntiles2=: -:@(+/)@(<. ,: >.)@midpts { /:~@]
>
>   midpt=: -:@<:@#
>   median=: -:@(+/)@((<. , >.)@midpt { /:~)
>
>   median 3 4 5 6
> 4.5
>   2 ntiles1 3 4 5 6
> 4.5
>   2 ntiles2 3 4 5 6
> 4.5
>   median 3 4 5 6 7
> 5
>   2 ntiles1 3 4 5 6 7
> 5.5
>   2 ntiles2 3 4 5 6 7
> 5
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:programming-
>> [email protected]] On Behalf Of Devon McCormick
>> Sent: Sunday, 18 October 2009 15:24
>> To: Programming forum
>> Subject: Re: [Jprogramming] "median" considered inaccurate?
>>
>> Cool!  Consider it swiped!
>>
>> Quantiles are a very useful way to compare stochastic models, e.g. what's
>> the performance of bottom-decile PE stocks versus top-decile ones?  And if
>> there is a consistent relation between the top and bottom deciles, does it
>> also hold if we use 11-tiles or 9-tiles?
>>
>> On Sat, Oct 17, 2009 at 8:24 PM, Sherlock, Ric <[email protected]> wrote:
>>
>> >   ntiles=: -:@(+/)@(] {~ (,: <:)@([ ((%~ i.&.<:)@[ >.@:* #@]) /:~@]))
>> >
>> >   2 tiles scrs
>> > 61
>> >   3 tiles scrs
>> > 57 69
>> >   4 tiles scrs
>> > 52.5 61 70.5
>> >    5 tiles scrs
>> > 51 58.5 65.5 72.5
>> >
>> >
>> > > From: Devon McCormick
>> > >
>> > > If you can't stop, you should look to generalize this: quartiles are
>> > > only a special case of N-tiles.
>> > >
>> > > On Sat, Oct 17, 2009 at 2:11 AM, Sherlock, Ric <[email protected]> wrote:
>> > >
>> > > > Sorry, couldn't stop....
>> > > >
>> > > > A few more versions of quartiles:
>> > > >
>> > > > Tidied up version of Don's
>> > > > quartiles0=: -:@(+/)@({~ (,: <:)@(0.25 0.5 0.75 >.@:* #))@/:~
>> > > >
>> > > > Simplified version of Keith's
>> > > > quartiles1=: median (([: median ] #~ >) , [ , [: median ] #~ <) ]
>> > > >
>> > > > A slightly different approach:
>> > > > quartiles2=: /:~@(median ([ , > median/. ]) ])
>> > > >
>> > > >
>> > > > > From: Sherlock, Ric
>> > > > >
>> > > > > The following is based on Keith Smillie's stats companion.
>> > > > >
>> > > > > NB. Median and quartiles
>> > > > > midpt=: -:@<:@#
>> > > > > median=: -:@(+/)@((<.,>.)@midpt { /:~)
>> > > > > Q1=: [: median ] #~ median > ]
>> > > > > Q3=: [: median ] #~ median < ]
>> > > > > quartiles=: Q1 , median , Q3
>> > > > >
>> > > > > Another definition of median where the domain is integers.
>> > > > >
>> > > > > median=: ~.@((<.,>.)@midpt { /:~)
>> > > > >
>> > > > > > From: Devon McCormick
>> > > > > >
>> > > > > > Don - I like yours better than the one I have now, though I'll
>> > > > > > probably generalize it into an "Ntiler".
>> > > > > >
>> > > > > > Part of the problem is that there are multiple correct answers if we
>> > > > > > define quartile numbers as those which divide the set as evenly as
>> > > > > > possible into four groups, e.g.
>> > > > > >
>> > > > > >    quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'  NB. Count elements/quartile
>> > > > > > NB. All these different answers work correctly:
>> > > > > >    (52.75 61 70.25) quartileCt scrs  NB. Excel
>> > > > > > 5 5 5 5
>> > > > > >    (52.5 61 70.5) quartileCt scrs    NB. web site
>> > > > > > 5 5 5 5
>> > > > > >    (52.1 61.1 70.1) quartileCt scrs  NB. another answer...
>> > > > > > 5 5 5 5
>> > > > > >
>> > > > > > One way to test, as you suggest is to look at the behavior when we
>> > > > > > have an odd number of elements, i.e. "odd" with respect to four:
>> > > > > >
>> > > > > > NB. Two different ways of counting number of elements/quartile:
>> > > > > >    quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'
>> > > > > >    quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'
>> > > > > > NB. Two different quartilers:
>> > > > > >    test0=: 1 : '(3{.4 ntilebps y) u y'  NB. Mine
>> > > > > >    test1=: 1 : '(qr y) u y'             NB. Don's
>> > > > > >
>> > > > > > NB. Both work OK for even and odd cases counted one way...
>> > > > > >    quartileCt test0&>0 1 2 3 4}.&.><scrs
>> > > > > > 5 5 5 5
>> > > > > > 4 5 5 5
>> > > > > > 4 5 4 5
>> > > > > > 4 4 4 5
>> > > > > > 4 4 4 4
>> > > > > >    quartileCt test1&>0 1 2 3 4}.&.><scrs
>> > > > > > 5 5 5 5
>> > > > > > 5 5 5 4
>> > > > > > 5 4 5 4
>> > > > > > 5 4 4 4
>> > > > > > 4 4 4 4
>> > > > > >
>> > > > > > NB. Mine falls down for a couple of cases counted the other way:
>> > > > > >    quartileCt2 test0&>0 1 2 3 4}.&.><scrs
>> > > > > > 4 5 5 6
>> > > > > > 4 5 5 5
>> > > > > > 4 4 5 5
>> > > > > > 4 4 4 5
>> > > > > > 3 4 4 5
>> > > > > > NB. but Don's works OK under different counting method as well:
>> > > > > >    quartileCt2 test1&>0 1 2 3 4}.&.><scrs
>> > > > > > 5 5 5 5
>> > > > > > 5 5 5 4
>> > > > > > 5 4 5 4
>> > > > > > 5 4 4 4
>> > > > > > 4 4 4 4
>> > > > > >
>> > > > > > Thanks for your suggestions.
>> > > > > >
>> > > > > > Regards,
>> > > > > >
>> > > > > > Devon
>> > > > > >
>> > > > > > On Fri, Oct 16, 2009 at 3:47 PM, Don Guinn <[email protected]> wrote:
>> > > > > >
>> > > > > > > Looked up the definition of "median" and it appears that there are
>> > > > > > > several definitions of "median". And, according to
>> > > > > > > http://en.wikipedia.org/wiki/Median median and quartiles can be
>> > > > > > > messy with badly skewed data. Best I can tell this is a measurement
>> > > > > > > that should be used with care.
>> > > > > > > I wrote a quick verb which gives the same answers as the site you
>> > > > > > > referenced and it does strange things, depending on the data. If the
>> > > > > > > count of the set is odd, which group should have the extra number?
>> > > > > > > What if the data is really skewed?
>> > > > > > >
>> > > > > > >   qr=.([:([:(+/%#)]{~[:(<:,:])[:>.0.25 0.5 0.75"_*#)]/:])  NB. Needs cleaning up.
>> > > > > > >   qr scrs
>> > > > > > > 52.5 61 70.5
>> > > > > > >    qr i.4
>> > > > > > > 0.5 1.5 2.5
>> > > > > > >   qr i.5
>> > > > > > > 1.5 2.5 3.5
>> > > > > > >   qr i.12
>> > > > > > > 2.5 5.5 8.5
>> > > > > > >   qr i.11
>> > > > > > > 2.5 5.5 8.5
>> > > > > > >   qr i.13
>> > > > > > > 3.5 6.5 9.5
>> > > > > > >   -~/0 2{qr scrs
>> > > > > > > 18
>> > > > > > >   qr 1 1 1 1 1 2 3 4
>> > > > > > > 1 1 2.5
>> > > > > > >
>> > > > > > >
>> > > > > > > On Fri, Oct 16, 2009 at 1:21 PM, Devon McCormick <[email protected]> wrote:
>> > > > > > >
>> > > > > > > > Members of the forum -
>> > > > > > > >
>> > > > > > > > while looking up some statistical definitions, I came across this example
>> > > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability
>> > > > > > > > in which the calculation of the median disagrees with the result of the
>> > > > > > > > one listed as "m0=: median=: <.@-:@# { /:~" in "MathStats" on the J wiki.
>> > > > > > > >
>> > > > > > > > I was actually looking at the definition of quartiles when I noticed this.
>> > > > > > > > For the series
>> > > > > > > >
>> > > > > > > >   #scrs=. 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
>> > > > > > > > 20
>> > > > > > > >   m0=: <.@-:@# { /:~
>> > > > > > > >   m0 scrs
>> > > > > > > > 62
>> > > > > > > >   median scrs  NB. my own definition
>> > > > > > > > 61
>> > > > > > > >   median
>> > > > > > > > -:@(+/)@((<. , >.)@midpt { /:~)
>> > > > > > > >   midpt
>> > > > > > > > -:@<:@#
>> > > > > > > >
>> > > > > > > > Also, this site's answers disagree with Excel and with my own quartile
>> > > > > > > > function, applied to "scrs" above, but I think the site is correct:
>> > > > > > > >   NB. Quartiles 1-3 according to Excel:
>> > > > > > > >   52.75 61 70.25
>> > > > > > > >
>> > > > > > > >   NB. According to
>> > > > > > > >   http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability:
>> > > > > > > >   52.5 61 70.5
>> > > > > > > >
>> > > > > > > >   0 1 2 quartile&><scrs
>> > > > > > > > 52 60 70
>> > > > > > > >
>> > > > > > > > NB. My "quartile" disagrees with my "median": the middle quartile
>> > > > > > > > NB. should be the same as the median.
>> > > > > > > >   quartile
>> > > > > > > > 4 : 'x{4 ntilebps y'
>> > > > > > > >   ntilebps
>> > > > > > > > 4 : 0
>> > > > > > > > NB.* ntilebps: return breakpoint values of x-tiles of y; e.g. 4 ntilebps y
>> > > > > > > > NB.  -> quartiles; 0-based so "1st" quartile is 0{4 ntilebps y.
>> > > > > > > >   quant=. x
>> > > > > > > >   y=. /:~y
>> > > > > > > >   wh=. 0 1#:(i.quant)*quant%~#y  NB. Where partition points are exactly
>> > > > > > > >   'n f'=. |:wh                    NB. whole and fractional part of partitions
>> > > > > > > >   1|.+/"1 ((1-f),.f)*(n+/_1 0){y NB. "1|." moves top quantile to end.
>> > > > > > > > )
>> > > > > > > >
>> > > > > > > > Anyone care to weigh in on this?
>> > > > > > > >
>> > > > > > > > Regards,
>> > > > > > > >
>> > > > > > > > Devon
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Devon McCormick, CFA
>> > > > > > > > ^me^ at acm.
>> > > > > > > > org is my
>> > > > > > > > preferred e-mail
>> > > > > > > > ----------------------------------------------------------------------
>> > > > > > > > For information about J forums see http://www.jsoftware.com/forums.htm

