Tukey showed that the median and quantiles are often much more stable than the average (e.g. median polish and median smoothing in general).
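A small illustration of that stability, as a sketch only. The median and midpt verbs and the scrs data are copied from the messages quoted further down; mean is the usual J idiom, and the outlier value 8000 is a made-up assumption for the example, not data from the thread.

```j
NB. mean is the standard J idiom; midpt and median are copied from the quoted thread.
mean   =: +/ % #
midpt  =: -:@<:@#
median =: -:@(+/)@((<. , >.)@midpt { /:~)

NB. The 20 scores used throughout the thread.
scrs =: 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80

(mean , median) scrs          NB. 61.95 61
(mean , median) scrs , 8000   NB. one gross outlier appended (hypothetical value)
```

With the single outlier appended, the mean moves from 61.95 to roughly 440, while the median only moves from 61 to 62.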
-------------------
> We are solving a problem which should not be solved. The advantage of the median and quantiles, as opposed to the arithmetic mean and standard deviation, is that they can be evaluated without computing. With a computer at hand, this argument vanishes.
>
> --- On Sun 18/10/09, Fraser Jackson <[email protected]> wrote:
>
> > From: Fraser Jackson <[email protected]>
> > Subject: Re: [Jprogramming] "median" considered inaccurate?
> > To: "Programming forum" <[email protected]>
> > Date: Sunday, 18 October 2009 11:51
> >
> > Statistical programs have a range of alternatives for the quantile function.
> > The following script embodies forms considered in a useful survey paper some
> > years ago. It requires further functions to answer Devon's question but
> > does include some options worth considering.
> >
> > NB. Quantile functions
> >
> >
> > NB. Each of the functions below will generate a plot
> > NB. object which enables you to plot the quantile function
> > NB. for a data vector. The plot object consists of the
> > NB. boxed p values and boxed quantiles.
> >
> > NB. Using this form linear interpolation
> > NB. is used to find values between the data points in the
> > NB. plot object.
> >
> > NB. The functions are numbered as in Rob J. Hyndman and Yanan Fan,
> > NB. Sample Quantiles in Statistical Packages. The American
> > NB. Statistician, 1996, Vol 50(4), 361-365. Their functions
> > NB. QP2 and QP3 are closely related to QP1 and dominated in nearly
> > NB. all respects by later alternatives so are not included.
> >
> > NB. I have checked the usage in SPlus 6 and R but comments on
> > NB. other packages are from Hyndman and Fan.
> >
> > NB. In Hyndman and Fan the treatment of p outside the interval
> > NB. associated with x(1),...,x(n) is not always defined. We have
> > NB. adopted the uniform practise that in any such region the
> > NB. inverse of the EDF is used as the definition.
> >
> > NB. Classical definition - invert the EDF
> >
> > NB. Functions for Generating Graphs
> > NB. [lowest limit, all upper limits] EDF frequency
> >
> > EDF=: 3 : 0
> > (2#/:~y.);}.}:(2#i.>:#y.)%(#y.)
> > :
> > (}.2#y.,+`-/_1 _1 _2{y.); }:2#0,((+/\x.)%+/x.),1
> > )
> >
> > NB. Available in SAS PROC UNIVARIATE
> > Invert =: |."1
> > QP1 =: [: Invert EDF
> >
> > NB. Parzen (1979) Interpolates step function of QP1
> > NB. Available in SAS PROC UNIVARIATE
> > QP4 =: ((([: i. [: >: #)%#);{.,])@/:~
> >
> > NB. Old definition proposed by Hazen(1914). Used by hydrologists.
> > NB. Used in $tab interpolate in GLIM V3.77
> > NB. Appears to be also used by SPlus and R for estimating quartiles
> > NB. for box plots.
> > QP5 =: 3 : 0
> > (0,(((i.n)+0.5)%n=. #y),1);({.,],{:)y =. /:~y.
> > )
> >
> > NB. Weibull(1939) and Gumbel(1939) both proposed this measure.
> > NB. Divides space into n+1 regions each with prob %(n+1) on average.
> > NB. Used in BMDP for quartiles.
> > NB. Used in Minitab DESCRIBE command for quartiles.
> > NB. Available in SAS PROC UNIVARIATE
> > NB. Appears to be used by SPSS
> > QP6 =: 3 : 0
> > (0,((>:i.n)%(1+n=.#y)),1);({.,],{:)y=. /:~y.
> > )
> >
> > NB. Gumbel(1939) also proposed this definition.
> > NB. Divides the range in (n-1) intervals.
> > NB. Exactly 100p% of intervals lie to the left of QP7(p)
> > NB. Used in SPlus 6 quantile(), but not for box plots.
> > QP7 =: 3 : 0
> > ((i.#y)%<:#y);y =. /:~ y.
> > )
> >
> > NB. Reiss(1989) and Hyndman and Fan(1996)
> > NB. Sample quantile is median unbiased of O(n^_0.5)
> > QP8 =: 3 : 0
> > (0,((_1r3+>:i.n)%(1r3+n=.#y)),1);({.,],{:)y =. /:~y.
> > )
> >
> > NB. Blom(1958) shows this is a better approximation to
> > NB. F(E(X(k))) for the normal distribution. QP9(p(k)) is
> > NB. an approximately unbiased estimator of Q(p(k)) when F
> > NB. is normal. Tends to be used for normal QQ plots.
> > QP9 =: 3 : 0
> > (0,((_0.375+>:i.n)%(0.25+n=.#y )),1);({.,],{:)y =. /:~y.
> > )
> >
> > QP =: QP8 NB. The Hyndman and Fan recommendation
> > NB. Also used in R
> >
> >
> > ----- Original Message -----
> > From: "Sherlock, Ric" <[email protected]>
> > To: "Programming forum" <[email protected]>
> > Sent: Sunday, October 18, 2009 9:01 PM
> > Subject: Re: [Jprogramming] "median" considered inaccurate?
> >
> >
> > > There is a problem with the previous general versions and Don's original
> > > for odd-numbered groups - ntiles2 fixes using the same mechanism and
> > > similar structure to median
> > >
> > > ntiles1=: [: -:@(+/) (,: <:)@((%~ i.&.<:)@[ >.@:* #...@]) { /:~...@]
> > >
> > > midpts=: (%~ i.&.<:)@[ * <:@#...@]
> > > ntiles2=: -:@(+/)@(<. ,: >.)@midpts { /:~...@]
> > >
> > > midpt=: -:@<:@#
> > > median=: -:@(+/)@((<. , >.)@midpt { /:~)
> > >
> > >    median 3 4 5 6
> > > 4.5
> > >    2 ntiles1 3 4 5 6
> > > 4.5
> > >    2 ntiles2 3 4 5 6
> > > 4.5
> > >    median 3 4 5 6 7
> > > 5
> > >    2 ntiles1 3 4 5 6 7
> > > 5.5
> > >    2 ntiles2 3 4 5 6 7
> > > 5
> > >
> > >
> > >> -----Original Message-----
> > >> From: [email protected] [mailto:programming-
> > >> [email protected]] On Behalf Of Devon McCormick
> > >> Sent: Sunday, 18 October 2009 15:24
> > >> To: Programming forum
> > >> Subject: Re: [Jprogramming] "median" considered inaccurate?
> > >>
> > >> Cool! Consider it swiped!
> > >>
> > >> Quantiles are a very useful way to compare stochastic models, e.g. what's
> > >> the performance of bottom-decile PE stocks versus top-decile ones? And if
> > >> there is a consistent relation between the top and bottom deciles, does it
> > >> also hold if we use 11-tiles or 9-tiles?
> > >>
> > >> On Sat, Oct 17, 2009 at 8:24 PM, Sherlock, Ric
> > >> <[email protected]> wrote:
> > >>
> > >> > ntiles=: -:@(+/)@(] {~ (,: <:)@([ ((%~ i.&.<:)@[ >.@:* #...@]) /:~...@]))
> > >> >
> > >> >    2 tiles scrs
> > >> > 61
> > >> >    3 tiles scrs
> > >> > 57 69
> > >> >    4 tiles scrs
> > >> > 52.5 61 70.5
> > >> >    5 tiles scrs
> > >> > 51 58.5 65.5 72.5
> > >> >
> > >> >
> > >> > > From: Devon McCormick
> > >> > >
> > >> > > If you can't stop, you should look to generalize this: quartiles are
> > >> > > only a special case of N-tiles.
> > >> > >
> > >> > > On Sat, Oct 17, 2009 at 2:11 AM, Sherlock, Ric
> > >> > > <[email protected]> wrote:
> > >> > >
> > >> > > > Sorry, couldn't stop....
> > >> > > >
> > >> > > > A few more versions of quartiles:
> > >> > > >
> > >> > > > Tidied up version of Don's
> > >> > > > quartiles0=: -:@(+/)@({~ (,: <:)@(0.25 0.5 0.75 >.@:* #))@/:~
> > >> > > >
> > >> > > > Simplified version of Keith's
> > >> > > > quartiles1=: median (([: median ] #~ >) , [ , [: median ] #~ <) ]
> > >> > > >
> > >> > > > A slightly different approach:
> > >> > > > quartiles2=: /:~@(median ([ , > median/. ]) ])
> > >> > > >
> > >> > > >
> > >> > > > > From: Sherlock, Ric
> > >> > > > >
> > >> > > > > The following is based on Keith Smillie's stats companion.
> > >> > > > >
> > >> > > > > NB. Median and quartiles
> > >> > > > > midpt=: -:@<:@#
> > >> > > > > median=: -:@(+/)@((<.,>.)@midpt { /:~)
> > >> > > > > Q1=: [: median ] #~ median > ]
> > >> > > > > Q3=: [: median ] #~ median < ]
> > >> > > > > quartiles=: Q1 , median , Q3
> > >> > > > >
> > >> > > > > Another definition of median where the domain is integers.
> > >> > > > >
> > >> > > > > median=: ~.@((<.,>.)@midpt { /:~)
> > >> > > > >
> > >> > > > > > From: Devon McCormick
> > >> > > > > >
> > >> > > > > > Don - I like yours better than the one I have now, though I'll probably
> > >> > > > > > generalize it into an "Ntiler".
> > >> > > > > >
> > >> > > > > > Part of the problem is that there are multiple correct answers if we
> > >> > > > > > define quartile numbers as those which divide the set as evenly as possible
> > >> > > > > > into four groups, e.g.
> > >> > > > > >
> > >> > > > > >    quartileCt=: 4 : '+/"1 (y>:/~x,_) *. y< /~__,x' NB. Count elements/quartile
> > >> > > > > > NB. All these different answers work correctly:
> > >> > > > > >    (52.75 61 70.25) quartileCt scrs NB. Excel
> > >> > > > > > 5 5 5 5
> > >> > > > > >    (52.5 61 70.5) quartileCt scrs NB. web site
> > >> > > > > > 5 5 5 5
> > >> > > > > >    (52.1 61.1 70.1) quartileCt scrs NB. another answer...
> > >> > > > > > 5 5 5 5
> > >> > > > > >
> > >> > > > > > One way to test, as you suggest is to look at the behavior when we have an
> > >> > > > > > odd number of elements, i.e. "odd" with respect to four:
> > >> > > > > >
> > >> > > > > > NB. Two different ways of counting number of elements/quartile:
> > >> > > > > >    quartileCt=: 4 : '+/"1 (y>:/~x,_) *. y< /~__,x'
> > >> > > > > >    quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'
> > >> > > > > > NB. Two different quartilers:
> > >> > > > > >    test0=: 1 : '(3{.4 ntilebps y) u y' NB. Mine
> > >> > > > > >    test1=: 1 : '(qr y) u y' NB. Don's
> > >> > > > > >
> > >> > > > > > NB. Both work OK for even and odd cases counted one way...
> > >> > > > > >    quartileCt test0&>0 1 2 3 4}.&.><scrs
> > >> > > > > > 5 5 5 5
> > >> > > > > > 4 5 5 5
> > >> > > > > > 4 5 4 5
> > >> > > > > > 4 4 4 5
> > >> > > > > > 4 4 4 4
> > >> > > > > >    quartileCt test1&>0 1 2 3 4}.&.><scrs
> > >> > > > > > 5 5 5 5
> > >> > > > > > 5 5 5 4
> > >> > > > > > 5 4 5 4
> > >> > > > > > 5 4 4 4
> > >> > > > > > 4 4 4 4
> > >> > > > > >
> > >> > > > > > NB. Mine falls down for a couple of cases counted the other way:
> > >> > > > > >    quartileCt2 test0&>0 1 2 3 4}.&.><scrs
> > >> > > > > > 4 5 5 6
> > >> > > > > > 4 5 5 5
> > >> > > > > > 4 4 5 5
> > >> > > > > > 4 4 4 5
> > >> > > > > > 3 4 4 5
> > >> > > > > > NB. but Don's works OK under different counting method as well:
> > >> > > > > >    quartileCt2 test1&>0 1 2 3 4}.&.><scrs
> > >> > > > > > 5 5 5 5
> > >> > > > > > 5 5 5 4
> > >> > > > > > 5 4 5 4
> > >> > > > > > 5 4 4 4
> > >> > > > > > 4 4 4 4
> > >> > > > > >
> > >> > > > > > Thanks for your suggestions.
> > >> > > > > >
> > >> > > > > > Regards,
> > >> > > > > >
> > >> > > > > > Devon
> > >> > > > > >
> > >> > > > > > On Fri, Oct 16, 2009 at 3:47 PM, Don Guinn <[email protected]> wrote:
> > >> > > > > >
> > >> > > > > > > Looked up the definition of "median" and it appears that there are several
> > >> > > > > > > definitions of "median". And, according to
> > >> > > > > > > http://en.wikipedia.org/wiki/Median median and quartiles can be messy with
> > >> > > > > > > badly skewed data. Best I can tell this is a measurement that should be used
> > >> > > > > > > with care.
> > >> > > > > > > I wrote a quick verb which gives the same answers as the site you referenced
> > >> > > > > > > and it does strange things, depending on the data.
> > >> > > > > > > If the count of the set is odd, which group should have the extra number?
> > >> > > > > > > What if the data is really skewed?
> > >> > > > > > >
> > >> > > > > > >    qr=.([:([:(+/%#)]{~[:(<:,:])[:>.0.25 0.5 0.75"_*#)]/:]) NB. Needs cleaning up.
> > >> > > > > > >    qr scrs
> > >> > > > > > > 52.5 61 70.5
> > >> > > > > > >    qr i.4
> > >> > > > > > > 0.5 1.5 2.5
> > >> > > > > > >    qr i.5
> > >> > > > > > > 1.5 2.5 3.5
> > >> > > > > > >    qr i.12
> > >> > > > > > > 2.5 5.5 8.5
> > >> > > > > > >    qr i.11
> > >> > > > > > > 2.5 5.5 8.5
> > >> > > > > > >    qr i.13
> > >> > > > > > > 3.5 6.5 9.5
> > >> > > > > > >    -~/0 2{qr scrs
> > >> > > > > > > 18
> > >> > > > > > >    qr 1 1 1 1 1 2 3 4
> > >> > > > > > > 1 1 2.5
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On Fri, Oct 16, 2009 at 1:21 PM, Devon McCormick <[email protected]>
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Members of the forum -
> > >> > > > > > > >
> > >> > > > > > > > while looking up some statistical definitions, I came across this example
> > >> > > > > > > >
> > >> > > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability
> > >> > > > > > > > in which the calculation of the median disagrees with the result of the one
> > >> > > > > > > > listed as "m0=: median=: <....@-:@# { /:~" in "MathStats" on the J wiki.
> > >> > > > > > > >
> > >> > > > > > > > I was actually looking at the definition of quartiles when I noticed this.
> > >> > > > > > > >
> > >> > > > > > > > For the series
> > >> > > > > > > >
> > >> > > > > > > >    #scrs=. 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
> > >> > > > > > > > 20
> > >> > > > > > > >    m0=: <....@-:@# { /:~
> > >> > > > > > > >    m0 scrs
> > >> > > > > > > > 62
> > >> > > > > > > >    median scrs NB. my own definition
> > >> > > > > > > > 61
> > >> > > > > > > >    median
> > >> > > > > > > > -:@(+/)@((<. , >.)@midpt { /:~)
> > >> > > > > > > >    midpt
> > >> > > > > > > > -:@<:@#
> > >> > > > > > > >
> > >> > > > > > > > Also, this site's answers disagree with Excel and with my own quartile
> > >> > > > > > > > function, applied to "scrs" above, but I think the site is correct:
> > >> > > > > > > >    NB. Quartiles 1-3 according to Excel:
> > >> > > > > > > >    52.75 61 70.25
> > >> > > > > > > >
> > >> > > > > > > >    NB. According to
> > >> > > > > > > >
> > >> > > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability :
> > >> > > > > > > >    52.5 61 70.5
> > >> > > > > > > >
> > >> > > > > > > >    0 1 2 quartile&><scrs
> > >> > > > > > > > 52 60 70
> > >> > > > > > > >
> > >> > > > > > > > NB. My "quartile" disagrees with my "median": the middle quartile should be
> > >> > > > > > > > the same as the median.
> > >> > > > > > > >    quartile
> > >> > > > > > > > 4 : 'x{4 ntilebps y'
> > >> > > > > > > >    ntilebps
> > >> > > > > > > > 4 : 0
> > >> > > > > > > > NB.* ntilebps: return breakpoint values of x-tiles of y; e.g. 4 ntilebps y
> > >> > > > > > > > NB. -> quartiles; 0-based so "1st" quartile is 0{4 ntilebps y.
> > >> > > > > > > >    quant=. x
> > >> > > > > > > >    y=. /:~y
> > >> > > > > > > >    wh=. 0 1#:(i.quant)*quant%~#y NB. Where partition points are exactly
> > >> > > > > > > >    'n f'=. |:wh NB. whole and fractional part of partitions
> > >> > > > > > > >    1|.+/"1 ((1-f),.f)*(n+/_1 0){y NB. "1|." moves top quantile to end.
> > >> > > > > > > > )
> > >> > > > > > > >
> > >> > > > > > > > Anyone care to weigh in on this?
> > >> > > > > > > >
> > >> > > > > > > > Regards,
> > >> > > > > > > >
> > >> > > > > > > > Devon
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > --
> > >> > > > > > > > Devon McCormick, CFA
> > >> > > > > > > > ^me^ at acm.
> > >> > > > > > > > org is my
> > >> > > > > > > > preferred e-mail
> > >> > > > > > > > ----------------------------------------------------------------------
> > >> > > > > > > > For information about J forums see http://www.jsoftware.com/forums.htm
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Devon McCormick, CFA
> > >> > > > > > ^me^ at acm.
> > >> > > > > > org is my
> > >> > > > > > preferred e-mail
