The previous general versions, and Don's original, go wrong for groups with an
odd number of elements. ntiles2 fixes this, using the same mechanism as median
and a similar structure:
ntiles1=: [: -:@(+/) (,: <:)@((%~ i.&.<:)@[ >.@:* #@]) { /:~@]
midpts=: (%~ i.&.<:)@[ * <:@#@]
ntiles2=: -:@(+/)@((<. ,: >.)@midpts { /:~@])
midpt=: -:@<:@#
median=: -:@(+/)@((<. , >.)@midpt { /:~)
   median 3 4 5 6
4.5
   2 ntiles1 3 4 5 6
4.5
   2 ntiles2 3 4 5 6
4.5
   median 3 4 5 6 7
5
   2 ntiles1 3 4 5 6 7
5.5
   2 ntiles2 3 4 5 6 7
5
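
To see where the two versions part company for an odd count, it may help to
look at the indices each one picks. The lines below are only a sketch of that
arithmetic for the 2-tile (median) case; the names srt, frac, i1, i2, r1 and r2
are made up for illustration and are not part of the definitions above.

```j
srt=: /:~ 3 4 5 6 7             NB. sorted data, 5 elements
frac=: 0.5                      NB. the one interior fraction for 2-tiles
i1=: (, <:) >. frac * # srt     NB. ntiles1 style: ceiling of 0.5*5, and one less -> 3 2
i2=: (<. , >.) frac * <: # srt  NB. ntiles2 style: floor and ceiling of 0.5*4 -> 2 2
r1=: -: +/ i1 { srt             NB. averages 6 and 5 -> 5.5
r2=: -: +/ i2 { srt             NB. averages 5 and 5 -> 5
```

With 3 4 5 6 both approaches select indices 1 and 2, which is why they agree
on the even-length example.
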
> -----Original Message-----
> From: [email protected] [mailto:programming-
> [email protected]] On Behalf Of Devon McCormick
> Sent: Sunday, 18 October 2009 15:24
> To: Programming forum
> Subject: Re: [Jprogramming] "median" considered inaccurate?
>
> Cool! Consider it swiped!
>
> Quantiles are a very useful way to compare stochastic models, e.g. what's
> the performance of bottom-decile PE stocks versus top-decile ones? And if
> there is a consistent relation between the top and bottom deciles, does it
> also hold if we use 11-tiles or 9-tiles?
>
> On Sat, Oct 17, 2009 at 8:24 PM, Sherlock, Ric <[email protected]> wrote:
>
> > ntiles=: -:@(+/)@(] {~ (,: <:)@([ ((%~ i.&.<:)@[ >.@:* #@]) /:~@]))
> >
> > 2 tiles scrs
> > 61
> > 3 tiles scrs
> > 57 69
> > 4 tiles scrs
> > 52.5 61 70.5
> > 5 tiles scrs
> > 51 58.5 65.5 72.5
> >
> >
> > > From: Devon McCormick
> > >
> > > If you can't stop, you should look to generalize this: quartiles are
> > > only a special case of N-tiles.
> > >
> > > On Sat, Oct 17, 2009 at 2:11 AM, Sherlock, Ric <[email protected]> wrote:
> > >
> > > > Sorry, couldn't stop....
> > > >
> > > > A few more versions of quartiles:
> > > >
> > > > Tidied up version of Don's
> > > > quartiles0=: -:@(+/)@({~ (,: <:)@(0.25 0.5 0.75 >.@:* #))@/:~
> > > >
> > > > Simplified version of Keith's
> > > > quartiles1=: median (([: median ] #~ >) , [ , [: median ] #~ <) ]
> > > >
> > > > A slightly different approach:
> > > > quartiles2=: /:~@(median ([ , > median/. ]) ])
> > > >
> > > >
> > > > > From: Sherlock, Ric
> > > > >
> > > > > > The following is based on Keith Smillie's stats companion.
> > > > >
> > > > > NB. Median and quartiles
> > > > > midpt=: -:@<:@#
> > > > > median=: -:@(+/)@((<.,>.)@midpt { /:~)
> > > > > Q1=: [: median ] #~ median > ]
> > > > > Q3=: [: median ] #~ median < ]
> > > > > quartiles=: Q1 , median , Q3
> > > > >
> > > > > Another definition of median where the domain is integers.
> > > > >
> > > > > median=: ~.@((<.,>.)@midpt { /:~)
> > > > >
> > > > > > From: Devon McCormick
> > > > > >
> > > > > > > Don - I like yours better than the one I have now, though I'll
> > > > > > > probably generalize it into an "Ntiler".
> > > > > > >
> > > > > > > Part of the problem is that there are multiple correct answers if
> > > > > > > we define quartile numbers as those which divide the set as evenly
> > > > > > > as possible into four groups, e.g.
> > > > > >
> > > > > > > quartileCt=: 4 : '+/"1 (y>:/~x,_) *. y< /~__,x' NB. Count elements/quartile
> > > > > > NB. All these different answers work correctly:
> > > > > > (52.75 61 70.25) quartileCt scrs NB. Excel
> > > > > > 5 5 5 5
> > > > > > (52.5 61 70.5) quartileCt scrs NB. web site
> > > > > > 5 5 5 5
> > > > > > (52.1 61.1 70.1) quartileCt scrs NB. another answer...
> > > > > > 5 5 5 5
> > > > > >
> > > > > > > One way to test, as you suggest, is to look at the behavior when
> > > > > > > we have an odd number of elements, i.e. "odd" with respect to four:
> > > > > >
> > > > > > > NB. Two different ways of counting number of elements/quartile:
> > > > > > quartileCt=: 4 : '+/"1 (y>:/~x,_) *. y< /~__,x'
> > > > > > quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'
> > > > > > NB. Two different quartilers:
> > > > > > test0=: 1 : '(3{.4 ntilebps y) u y' NB. Mine
> > > > > > test1=: 1 : '(qr y) u y' NB. Don's
> > > > > >
> > > > > > NB. Both work OK for even and odd cases counted one way...
> > > > > > quartileCt test0&>0 1 2 3 4}.&.><scrs
> > > > > > 5 5 5 5
> > > > > > 4 5 5 5
> > > > > > 4 5 4 5
> > > > > > 4 4 4 5
> > > > > > 4 4 4 4
> > > > > > quartileCt test1&>0 1 2 3 4}.&.><scrs
> > > > > > 5 5 5 5
> > > > > > 5 5 5 4
> > > > > > 5 4 5 4
> > > > > > 5 4 4 4
> > > > > > 4 4 4 4
> > > > > >
> > > > > > > NB. Mine falls down for a couple of cases counted the other way:
> > > > > > quartileCt2 test0&>0 1 2 3 4}.&.><scrs
> > > > > > 4 5 5 6
> > > > > > 4 5 5 5
> > > > > > 4 4 5 5
> > > > > > 4 4 4 5
> > > > > > 3 4 4 5
> > > > > > > NB. but Don's works OK under different counting method as well:
> > > > > > quartileCt2 test1&>0 1 2 3 4}.&.><scrs
> > > > > > 5 5 5 5
> > > > > > 5 5 5 4
> > > > > > 5 4 5 4
> > > > > > 5 4 4 4
> > > > > > 4 4 4 4
> > > > > >
> > > > > > Thanks for your suggestions.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Devon
> > > > > >
> > > > > > > On Fri, Oct 16, 2009 at 3:47 PM, Don Guinn <[email protected]> wrote:
> > > > > >
> > > > > > > > Looked up the definition of "median" and it appears that there are
> > > > > > > > several definitions of "median". And, according to
> > > > > > > > http://en.wikipedia.org/wiki/Median median and quartiles can be
> > > > > > > > messy with badly skewed data. Best I can tell this is a measurement
> > > > > > > > that should be used with care.
> > > > > > > > I wrote a quick verb which gives the same answers as the site you
> > > > > > > > referenced and it does strange things, depending on the data. If
> > > > > > > > the count of the set is odd, which group should have the extra
> > > > > > > > number? What if the data is really skewed?
> > > > > > >
> > > > > > > > qr=.([:([:(+/%#)]{~[:(<:,:])[:>.0.25 0.5 0.75"_*#)]/:]) NB. Needs cleaning up.
> > > > > > > qr scrs
> > > > > > > 52.5 61 70.5
> > > > > > > qr i.4
> > > > > > > 0.5 1.5 2.5
> > > > > > > qr i.5
> > > > > > > 1.5 2.5 3.5
> > > > > > > qr i.12
> > > > > > > 2.5 5.5 8.5
> > > > > > > qr i.11
> > > > > > > 2.5 5.5 8.5
> > > > > > > qr i.13
> > > > > > > 3.5 6.5 9.5
> > > > > > > -~/0 2{qr scrs
> > > > > > > 18
> > > > > > > qr 1 1 1 1 1 2 3 4
> > > > > > > 1 1 2.5
> > > > > > >
> > > > > > >
> > > > > > > > On Fri, Oct 16, 2009 at 1:21 PM, Devon McCormick <[email protected]> wrote:
> > > > > > >
> > > > > > > > Members of the forum -
> > > > > > > >
> > > > > > > > > while looking up some statistical definitions, I came across this
> > > > > > > > > example
> > > > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability
> > > > > > > > > in which the calculation of the median disagrees with the result
> > > > > > > > > of the one listed as "m0=: median=: <.@-:@# { /:~" in "MathStats"
> > > > > > > > > on the J wiki.
> > > > > > > > >
> > > > > > > > > I was actually looking at the definition of quartiles when I
> > > > > > > > > noticed this.
> > > > > > > >
> > > > > > > > For the series
> > > > > > > >
> > > > > > > > > #scrs=. 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
> > > > > > > > > 20
> > > > > > > > > m0=: <.@-:@# { /:~
> > > > > > > > m0 scrs
> > > > > > > > 62
> > > > > > > > median scrs NB. my own definition
> > > > > > > > 61
> > > > > > > > median
> > > > > > > > -:@(+/)@((<. , >.)@midpt { /:~)
> > > > > > > > midpt
> > > > > > > > -:@<:@#
> > > > > > > >
> > > > > > > > > Also, this site's answers disagree with Excel and with my own
> > > > > > > > > quartile function, applied to "scrs" above, but I think the site
> > > > > > > > > is correct:
> > > > > > > > NB. Quartiles 1-3 according to Excel:
> > > > > > > > 52.75 61 70.25
> > > > > > > >
> > > > > > > > > NB. According to http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability:
> > > > > > > > 52.5 61 70.5
> > > > > > > >
> > > > > > > > 0 1 2 quartile&><scrs
> > > > > > > > 52 60 70
> > > > > > > >
> > > > > > > > > NB. My "quartile" disagrees with my "median": the middle quartile
> > > > > > > > > NB. should be the same as the median.
> > > > > > > > quartile
> > > > > > > > 4 : 'x{4 ntilebps y'
> > > > > > > > ntilebps
> > > > > > > > 4 : 0
> > > > > > > > > NB.* ntilebps: return breakpoint values of x-tiles of y; e.g. 4 ntilebps y
> > > > > > > > > NB. -> quartiles; 0-based so "1st" quartile is 0{4 ntilebps y.
> > > > > > > > > quant=. x
> > > > > > > > > y=. /:~y
> > > > > > > > > wh=. 0 1#:(i.quant)*quant%~#y NB. Where partition points are exactly
> > > > > > > > > 'n f'=. |:wh NB. whole and fractional part of partitions
> > > > > > > > > 1|.+/"1 ((1-f),.f)*(n+/_1 0){y NB. "1|." moves top quantile to end.
> > > > > > > > )
> > > > > > > >
> > > > > > > > Anyone care to weigh in on this?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > >
> > > > > > > > Devon
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Devon McCormick, CFA
> > > > > > > > ^me^ at acm.
> > > > > > > > org is my
> > > > > > > > preferred e-mail
> > > > > > > > ---------------------------------------------------------
> ----
> > > ----
> > > > > --
> > > > > > ---
> > > > > > > > For information about J forums see
> > > > > > http://www.jsoftware.com/forums.htm
> > > > > > > >
> > > > > > > -----------------------------------------------------------
> ----
> > > ----
> > > > > --
> > > > > > -
> > > > > > > For information about J forums see
> > > > > > http://www.jsoftware.com/forums.htm
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Devon McCormick, CFA
> > > > > > ^me^ at acm.
> > > > > > org is my
> > > > > > preferred e-mail
> > > > > > -------------------------------------------------------------
> ----
> > > ----
> > > > > -
> > > > > > For information about J forums see
> > > > > http://www.jsoftware.com/forums.htm
> > > > > ---------------------------------------------------------------
> ----
> > > ---
> > > > > For information about J forums see
> > > http://www.jsoftware.com/forums.htm
> > > > -----------------------------------------------------------------
> ----
> > > -
> > > > For information about J forums see
> > > http://www.jsoftware.com/forums.htm
> > > >
> > >
> > >
> > >
> > > --
> > > Devon McCormick, CFA
> > > ^me^ at acm.
> > > org is my
> > > preferred e-mail
> > > -------------------------------------------------------------------
> ---
> > > For information about J forums see
> http://www.jsoftware.com/forums.htm
> > ---------------------------------------------------------------------
> -
> > For information about J forums see
> http://www.jsoftware.com/forums.htm
> >
>
>
>
> --
> Devon McCormick, CFA
> ^me^ at acm.
> org is my
> preferred e-mail
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm