We are solving a problem that should not be solved. The advantage of the median and quantiles over the arithmetic mean and standard deviation is that they can be evaluated without computation. With a computer at hand, this argument vanishes.
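To illustrate with the scrs data from the thread quoted below, here is a minimal sketch, assuming the usual textbook definitions (sample variance with divisor n-1). The verb names are chosen here for illustration only and are not taken from the thread.

```j
   NB. Assumed textbook definitions; names are illustrative.
   mean=: +/ % #                  NB. sum divided by count
   dev=: - mean                   NB. hook: y - mean y
   var=: (+/@:*:@dev) % <:@#      NB. sum of squared deviations over n-1
   stddev=: %:@var                NB. square root of the sample variance

   scrs=: 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
   mean scrs
61.95
   stddev scrs                    NB. result not shown here
```

No sorting and no choice between competing interpolation rules is involved, so every package should agree on these two numbers.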
--- On Sun 18/10/09, Fraser Jackson <[email protected]> wrote:

> From: Fraser Jackson <[email protected]>
> Subject: Re: [Jprogramming] "median" considered inaccurate?
> To: "Programming forum" <[email protected]>
> Date: Sunday, 18 October 2009 11:51
>
> Statistical programs have a range of alternatives for the quantile function.
> The following script embodies forms considered in a useful survey paper
> some years ago. It requires further functions to answer Devon's question
> but does include some options worth considering.
>
> NB. Quantile functions
>
> NB. Each of the functions below will generate a plot
> NB. object which enables you to plot the quantile function
> NB. for a data vector.  The plot object consists of the
> NB. boxed p values and boxed quantiles.
>
> NB. Using this form linear interpolation
> NB. is used to find values between the data points in the
> NB. plot object.
>
> NB. The functions are numbered as in Rob J. Hyndman and Yanan Fan,
> NB. Sample Quantiles in Statistical Packages.  The American
> NB. Statistician, 1996, Vol 50(4), 361-365.  Their functions
> NB. QP2 and QP3 are closely related to QP1 and dominated in nearly
> NB. all respects by later alternatives so are not included.
>
> NB. I have checked the usage in SPlus 6 and R but comments on
> NB. other packages are from Hyndman and Fan.
>
> NB. In Hyndman and Fan the treatment of p outside the interval
> NB. associated with x(1),...,x(n) is not always defined.  We have
> NB. adopted the uniform practise that in any such region the
> NB. inverse of the EDF is used as the definition.
>
> NB. Classical definition - invert the EDF
>
> NB. Functions for Generating Graphs
> NB. [lowest limit, all upper limits] EDF frequency
>
> EDF=: 3 : 0
> (2#/:~y.);}.}:(2#i.>:#y.)%(#y.)
> :
> (}.2#y.,+`-/_1 _1 _2{y.); }:2#0,((+/\x.)%+/x.),1
> )
>
> NB. Available in SAS PROC UNIVARIATE
> Invert =: |."1
> QP1 =: [: Invert EDF
>
> NB. Parzen (1979) Interpolates step function of QP1
> NB. Available in SAS PROC UNIVARIATE
> QP4 =: ((([: i. [: >: #)%#);{.,])@/:~
>
> NB. Old definition proposed by Hazen(1914).  Used by hydrologists.
> NB. Used in $tab interpolate in GLIM V3.77
> NB. Appears to be also used by SPlus and R for estimating quartiles
> NB. for box plots.
> QP5 =: 3 : 0
> (0,(((i.n)+0.5)%n=. #y),1);({.,],{:)y =. /:~y.
> )
>
> NB. Weibull(1939) and Gumbel(1939) both proposed this measure.
> NB. Divides space into n+1 regions each with prob %(n+1) on average.
> NB. Used in BMDP for quartiles.
> NB. Used in Minitab DESCRIBE command for quartiles.
> NB. Available in SAS PROC UNIVARIATE
> NB. Appears to be used by SPSS
> QP6 =: 3 : 0
> (0,((>:i.n)%(1+n=.#y)),1);({.,],{:)y=. /:~y.
> )
>
> NB. Gumbel(1939) also proposed this definition.
> NB. Divides the range in (n-1) intervals.
> NB. Exactly 100p% of intervals lie to the left of QP7(p)
> NB. Used in SPlus 6 quantile(), but not for box plots.
> QP7 =: 3 : 0
> ((i.#y)%<:#y);y =. /:~ y.
> )
>
> NB. Reiss(1989) and Hyndman and Fan(1996)
> NB. Sample quantile is median unbiased of O(n^_0.5)
> QP8 =: 3 : 0
> (0,((_1r3+>:i.n)%(1r3+n=.#y)),1);({.,],{:)y =. /:~y.
> )
>
> NB. Blom(1958) shows this is a better approximation to
> NB. F(E(X(k))) for the normal distribution.  QP9(p(k)) is
> NB. an approximately unbiased estimator of Q(p(k)) when F
> NB. is normal.  Tends to be used for normal QQ plots.
> QP9 =: 3 : 0
> (0,((_0.375+>:i.n)%(0.25+n=.#y )),1);({.,],{:)y =. /:~y.
> )
>
> QP =: QP8    NB. The Hyndman and Fan recommendation
>              NB. Also used in R
>
>
> ----- Original Message -----
> From: "Sherlock, Ric" <[email protected]>
> To: "Programming forum" <[email protected]>
> Sent: Sunday, October 18, 2009 9:01 PM
> Subject: Re: [Jprogramming] "median" considered inaccurate?
>
> >
> > There is a problem with the previous general versions and Don's original
> > for odd-numbered groups - ntiles2 fixes using the same mechanism and
> > similar structure to median
> >
> > ntiles1=: [: -:@(+/) (,: <:)@((%~ i.&.<:)@[ >.@:* #@]) { /:~@]
> >
> > midpts=: (%~ i.&.<:)@[ * <:@#@]
> > ntiles2=: -:@(+/)@(<. ,: >.)@midpts { /:~@]
> >
> > midpt=: -:@<:@#
> > median=: -:@(+/)@((<. , >.)@midpt { /:~)
> >
> >    median 3 4 5 6
> > 4.5
> >    2 ntiles1 3 4 5 6
> > 4.5
> >    2 ntiles2 3 4 5 6
> > 4.5
> >    median 3 4 5 6 7
> > 5
> >    2 ntiles1 3 4 5 6 7
> > 5.5
> >    2 ntiles2 3 4 5 6 7
> > 5
> >
> >
> >> -----Original Message-----
> >> From: [email protected] [mailto:programming-
> >> [email protected]] On Behalf Of Devon McCormick
> >> Sent: Sunday, 18 October 2009 15:24
> >> To: Programming forum
> >> Subject: Re: [Jprogramming] "median" considered inaccurate?
> >>
> >> Cool!  Consider it swiped!
> >>
> >> Quantiles are a very useful way to compare stochastic models, e.g. what's
> >> the performance of bottom-decile PE stocks versus top-decile ones?  And if
> >> there is a consistent relation between the top and bottom deciles, does it
> >> also hold if we use 11-tiles or 9-tiles?
> >>
> >> On Sat, Oct 17, 2009 at 8:24 PM, Sherlock, Ric
> >> <[email protected]> wrote:
> >>
> >> > ntiles=: -:@(+/)@(] {~ (,: <:)@([ ((%~ i.&.<:)@[ >.@:* #@]) /:~@]))
> >> >
> >> >    2 tiles scrs
> >> > 61
> >> >    3 tiles scrs
> >> > 57 69
> >> >    4 tiles scrs
> >> > 52.5 61 70.5
> >> >    5 tiles scrs
> >> > 51 58.5 65.5 72.5
> >> >
> >> >
> >> > > From: Devon McCormick
> >> > >
> >> > > If you can't stop, you should look to generalize this: quartiles are
> >> > > only a special case of N-tiles.
> >> > >
> >> > > On Sat, Oct 17, 2009 at 2:11 AM, Sherlock, Ric
> >> > > <[email protected]> wrote:
> >> > >
> >> > > > Sorry, couldn't stop....
> >> > > >
> >> > > > A few more versions of quartiles:
> >> > > >
> >> > > > Tidied up version of Don's
> >> > > > quartiles0=: -:@(+/)@({~ (,: <:)@(0.25 0.5 0.75 >.@:* #))@/:~
> >> > > >
> >> > > > Simplified version of Keith's
> >> > > > quartiles1=: median (([: median ] #~ >) , [ , [: median ] #~ <) ]
> >> > > >
> >> > > > A slightly different approach:
> >> > > > quartiles2=: /:~@(median ([ , > median/. ]) ])
> >> > > >
> >> > > >
> >> > > > > From: Sherlock, Ric
> >> > > > >
> >> > > > > The following is based on Keith Smillie's stats companion.
> >> > > > >
> >> > > > > NB. Median and quartiles
> >> > > > > midpt=: -:@<:@#
> >> > > > > median=: -:@(+/)@((<.,>.)@midpt { /:~)
> >> > > > > Q1=: [: median ] #~ median > ]
> >> > > > > Q3=: [: median ] #~ median < ]
> >> > > > > quartiles=: Q1 , median , Q3
> >> > > > >
> >> > > > > Another definition of median where the domain is integers.
> >> > > > >
> >> > > > > median=: ~.@((<.,>.)@midpt { /:~)
> >> > > > >
> >> > > > > > From: Devon McCormick
> >> > > > > >
> >> > > > > > Don - I like yours better than the one I have now, though I'll
> >> > > > > > probably generalize it into an "Ntiler".
> >> > > > > >
> >> > > > > > Part of the problem is that there are multiple correct answers if we
> >> > > > > > define quartile numbers as those which divide the set as evenly as
> >> > > > > > possible into four groups, e.g.
> >> > > > > >
> >> > > > > >    quartileCt=: 4 : '+/"1 (y>:/~x,_) *. y< /~__,x'  NB. Count elements/quartile
> >> > > > > >    NB. All these different answers work correctly:
> >> > > > > >    (52.75 61 70.25) quartileCt scrs  NB. Excel
> >> > > > > > 5 5 5 5
> >> > > > > >    (52.5 61 70.5) quartileCt scrs    NB. web site
> >> > > > > > 5 5 5 5
> >> > > > > >    (52.1 61.1 70.1) quartileCt scrs  NB. another answer...
> >> > > > > > 5 5 5 5
> >> > > > > >
> >> > > > > > One way to test, as you suggest is to look at the behavior when we have
> >> > > > > > an odd number of elements, i.e. "odd" with respect to four:
> >> > > > > >
> >> > > > > >    NB. Two different ways of counting number of elements/quartile:
> >> > > > > >    quartileCt=: 4 : '+/"1 (y>:/~x,_) *. y< /~__,x'
> >> > > > > >    quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'
> >> > > > > >    NB. Two different quartilers:
> >> > > > > >    test0=: 1 : '(3{.4 ntilebps y) u y'  NB. Mine
> >> > > > > >    test1=: 1 : '(qr y) u y'             NB. Don's
> >> > > > > >
> >> > > > > >    NB. Both work OK for even and odd cases counted one way...
> >> > > > > >    quartileCt test0&>0 1 2 3 4}.&.><scrs
> >> > > > > > 5 5 5 5
> >> > > > > > 4 5 5 5
> >> > > > > > 4 5 4 5
> >> > > > > > 4 4 4 5
> >> > > > > > 4 4 4 4
> >> > > > > >    quartileCt test1&>0 1 2 3 4}.&.><scrs
> >> > > > > > 5 5 5 5
> >> > > > > > 5 5 5 4
> >> > > > > > 5 4 5 4
> >> > > > > > 5 4 4 4
> >> > > > > > 4 4 4 4
> >> > > > > >
> >> > > > > >    NB. Mine falls down for a couple of cases counted the other way:
> >> > > > > >    quartileCt2 test0&>0 1 2 3 4}.&.><scrs
> >> > > > > > 4 5 5 6
> >> > > > > > 4 5 5 5
> >> > > > > > 4 4 5 5
> >> > > > > > 4 4 4 5
> >> > > > > > 3 4 4 5
> >> > > > > >    NB. but Don's works OK under different counting method as well:
> >> > > > > >    quartileCt2 test1&>0 1 2 3 4}.&.><scrs
> >> > > > > > 5 5 5 5
> >> > > > > > 5 5 5 4
> >> > > > > > 5 4 5 4
> >> > > > > > 5 4 4 4
> >> > > > > > 4 4 4 4
> >> > > > > >
> >> > > > > > Thanks for your suggestions.
> >> > > > > >
> >> > > > > > Regards,
> >> > > > > >
> >> > > > > > Devon
> >> > > > > >
> >> > > > > > On Fri, Oct 16, 2009 at 3:47 PM, Don Guinn <[email protected]> wrote:
> >> > > > > >
> >> > > > > > > Looked up the definition of "median" and it appears that there are
> >> > > > > > > several definitions of "median". And, according to
> >> > > > > > > http://en.wikipedia.org/wiki/Median median and quartiles can be messy
> >> > > > > > > with badly skewed data. Best I can tell this is a measurement that
> >> > > > > > > should be used with care.
> >> > > > > > > I wrote a quick verb which gives the same answers as the site you
> >> > > > > > > referenced and it does strange things, depending on the data. If the
> >> > > > > > > count of the set is odd, which group should have the extra number?
> >> > > > > > > What if the data is really skewed?
> >> > > > > > >
> >> > > > > > >    qr=.([:([:(+/%#)]{~[:(<:,:])[:>.0.25 0.5 0.75"_*#)]/:])  NB. Needs cleaning up.
> >> > > > > > >    qr scrs
> >> > > > > > > 52.5 61 70.5
> >> > > > > > >    qr i.4
> >> > > > > > > 0.5 1.5 2.5
> >> > > > > > >    qr i.5
> >> > > > > > > 1.5 2.5 3.5
> >> > > > > > >    qr i.12
> >> > > > > > > 2.5 5.5 8.5
> >> > > > > > >    qr i.11
> >> > > > > > > 2.5 5.5 8.5
> >> > > > > > >    qr i.13
> >> > > > > > > 3.5 6.5 9.5
> >> > > > > > >    -~/0 2{qr scrs
> >> > > > > > > 18
> >> > > > > > >    qr 1 1 1 1 1 2 3 4
> >> > > > > > > 1 1 2.5
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Fri, Oct 16, 2009 at 1:21 PM, Devon McCormick <[email protected]>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Members of the forum -
> >> > > > > > > >
> >> > > > > > > > while looking up some statistical definitions, I came across this example
> >> > > > > > > >
> >> > > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability
> >> > > > > > > > in which the calculation of the median disagrees with the result of the
> >> > > > > > > > one listed as "m0=: median=: <.@-:@# { /:~" in "MathStats" on the J wiki.
> >> > > > > > > >
> >> > > > > > > > I was actually looking at the definition of quartiles when I noticed this.
> >> > > > > > > >
> >> > > > > > > > For the series
> >> > > > > > > >
> >> > > > > > > >    #scrs=. 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
> >> > > > > > > > 20
> >> > > > > > > >    m0=: <.@-:@# { /:~
> >> > > > > > > >    m0 scrs
> >> > > > > > > > 62
> >> > > > > > > >    median scrs  NB. my own definition
> >> > > > > > > > 61
> >> > > > > > > >    median
> >> > > > > > > > -:@(+/)@((<. , >.)@midpt { /:~)
> >> > > > > > > >    midpt
> >> > > > > > > > -:@<:@#
> >> > > > > > > >
> >> > > > > > > > Also, this site's answers disagree with Excel and with my own quartile
> >> > > > > > > > function, applied to "scrs" above, but I think the site is correct:
> >> > > > > > > >    NB. Quartiles 1-3 according to Excel:
> >> > > > > > > >    52.75 61 70.25
> >> > > > > > > >
> >> > > > > > > >    NB. According to
> >> > > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability:
> >> > > > > > > >    52.5 61 70.5
> >> > > > > > > >
> >> > > > > > > >    0 1 2 quartile&><scrs
> >> > > > > > > > 52 60 70
> >> > > > > > > >
> >> > > > > > > > NB. My "quartile" disagrees with my "median": the middle quartile should be
> >> > > > > > > > the same as the median.
> >> > > > > > > >    quartile
> >> > > > > > > > 4 : 'x{4 ntilebps y'
> >> > > > > > > >    ntilebps
> >> > > > > > > > 4 : 0
> >> > > > > > > > NB.* ntilebps: return breakpoint values of x-tiles of y; e.g. 4 ntilebps y
> >> > > > > > > > NB. -> quartiles; 0-based so "1st" quartile is 0{4 ntilebps y.
> >> > > > > > > >    quant=. x
> >> > > > > > > >    y=. /:~y
> >> > > > > > > >    wh=. 0 1#:(i.quant)*quant%~#y   NB. Where partition points are exactly
> >> > > > > > > >    'n f'=. |:wh                    NB. whole and fractional part of partitions
> >> > > > > > > >    1|.+/"1 ((1-f),.f)*(n+/_1 0){y  NB. "1|." moves top quantile to end.
> >> > > > > > > > )
> >> > > > > > > >
> >> > > > > > > > Anyone care to weigh in on this?
> >> > > > > > > >
> >> > > > > > > > Regards,
> >> > > > > > > >
> >> > > > > > > > Devon
> >> > > > > > > >
> >> > > > > > > > --
> >> > > > > > > > Devon McCormick, CFA
> >> > > > > > > > ^me^ at acm.
> >> > > > > > > > org is my
> >> > > > > > > > preferred e-mail
> >> > > > > > > > ----------------------------------------------------------------------
> >> > > > > > > > For information about J forums see http://www.jsoftware.com/forums.htm
