The tests in my last post call the verb by the wrong name: they use "tiles", but the verb is defined as "ntiles".

Here is another version - simpler and nicer to parse:

   ntiles1=: [: -:@(+/) (,: <:)@((%~ i.&.<:)@[ >.@:* #@]) { /:~@]
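
A quick check of the definition above, as a sketch rather than a full test. It assumes scrs is the 20-score series from earlier in this thread, and it reads the archive-mangled "#...@]" and "/:~...@]" as #@] and /:~@]. The first two lines show the intermediate steps for quartiles: the fractions 1/4, 2/4, 3/4, then the pairs of indices whose values are averaged.

```j
NB. Assumption: scrs is the series quoted further down in this thread.
scrs=: 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80

NB. Assumption: mangled "#...@]" and "/:~...@]" restored as #@] and /:~@]
ntiles1=: [: -:@(+/) (,: <:)@((%~ i.&.<:)@[ >.@:* #@]) { /:~@]

   (%~ i.&.<:) 4                      NB. break-point fractions for 4-tiles
0.25 0.5 0.75
   (,: <:) >. 0.25 0.5 0.75 * # scrs  NB. index pairs: ceiling and the one before
5 10 15
4  9 14
   2 ntiles1 scrs
61
   4 ntiles1 scrs
52.5 61 70.5
   4 ntiles1 |. scrs                  NB. the right argument is sorted first
52.5 61 70.5
```

Each result is the mean of the two values picked by a column of the index table, e.g. the first quartile is the mean of 5{scrs and 4{scrs, which is 52.5.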

> -----Original Message-----
> From: [email protected] [mailto:programming-
> [email protected]] On Behalf Of Sherlock, Ric
> Sent: Sunday, 18 October 2009 13:24
> To: Programming forum
> Subject: Re: [Jprogramming] "median" considered inaccurate?
> 
>    ntiles=: -:@(+/)@(] {~ (,: <:)@([ ((%~ i.&.<:)@[ >.@:* #@]) /:~@]))
> 
>    2 tiles scrs
> 61
>    3 tiles scrs
> 57 69
>    4 tiles scrs
> 52.5 61 70.5
>    5 tiles scrs
> 51 58.5 65.5 72.5
> 
> 
> > From: Devon McCormick
> >
> > If you can't stop, you should look to generalize this: quartiles are
> > only a special case of N-tiles.
> >
> > On Sat, Oct 17, 2009 at 2:11 AM, Sherlock, Ric
> > <[email protected]> wrote:
> >
> > > Sorry, couldn't stop....
> > >
> > > A few more versions of quartiles:
> > >
> > > Tidied up version of Don's
> > > quartiles0=: -:@(+/)@({~ (,: <:)@(0.25 0.5 0.75 >.@:* #))@/:~
> > >
> > > Simplified version of Keith's
> > > quartiles1=: median (([: median ] #~ >) , [ , [: median ] #~ <) ]
> > >
> > > A slightly different approach:
> > > quartiles2=: /:~@(median ([ , > median/. ]) ])
> > >
> > >
> > > > From: Sherlock, Ric
> > > >
> > > > The following is based on Keith Similie's stats companion.
> > > >
> > > > NB. Median and quartiles
> > > > midpt=: -:@<:@#
> > > > median=: -:@(+/)@((<.,>.)@midpt { /:~)
> > > > Q1=: [: median ] #~ median > ]
> > > > Q3=: [: median ] #~ median < ]
> > > > quartiles=: Q1 , median , Q3
> > > >
> > > > Another definition of median where the domain is integers.
> > > >
> > > > median=: ~.@((<.,>.)@midpt { /:~)
> > > >
> > > > > From: Devon McCormick
> > > > >
> > > > > Don - I like yours better than the one I have now, though I'll
> > > > > probably generalize it into an "Ntiler".
> > > > >
> > > > > Part of the problem is that there are multiple correct answers if we
> > > > > define quartile numbers as those which divide the set as evenly as
> > > > > possible into four groups, e.g.
> > > > >
> > > > >    quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'  NB. Count elements/quartile
> > > > > NB. All these different answers work correctly:
> > > > >    (52.75 61 70.25) quartileCt scrs  NB. Excel
> > > > > 5 5 5 5
> > > > >    (52.5 61 70.5) quartileCt scrs    NB. web site
> > > > > 5 5 5 5
> > > > >    (52.1 61.1 70.1) quartileCt scrs  NB. another answer...
> > > > > 5 5 5 5
> > > > >
> > > > > One way to test, as you suggest, is to look at the behavior when we
> > > > > have an odd number of elements, i.e. "odd" with respect to four:
> > > > >
> > > > > NB. Two different ways of counting number of elements/quartile:
> > > > >    quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'
> > > > >    quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'
> > > > > NB. Two different quartilers:
> > > > >    test0=: 1 : '(3{.4 ntilebps y) u y'  NB. Mine
> > > > >    test1=: 1 : '(qr y) u y'             NB. Don's
> > > > >
> > > > > NB. Both work OK for even and odd cases counted one way...
> > > > >    quartileCt test0&>0 1 2 3 4}.&.><scrs
> > > > > 5 5 5 5
> > > > > 4 5 5 5
> > > > > 4 5 4 5
> > > > > 4 4 4 5
> > > > > 4 4 4 4
> > > > >    quartileCt test1&>0 1 2 3 4}.&.><scrs
> > > > > 5 5 5 5
> > > > > 5 5 5 4
> > > > > 5 4 5 4
> > > > > 5 4 4 4
> > > > > 4 4 4 4
> > > > >
> > > > > NB. Mine falls down for a couple of cases counted the other way:
> > > > >    quartileCt2 test0&>0 1 2 3 4}.&.><scrs
> > > > > 4 5 5 6
> > > > > 4 5 5 5
> > > > > 4 4 5 5
> > > > > 4 4 4 5
> > > > > 3 4 4 5
> > > > > NB. but Don's works OK under different counting method as well:
> > > > >    quartileCt2 test1&>0 1 2 3 4}.&.><scrs
> > > > > 5 5 5 5
> > > > > 5 5 5 4
> > > > > 5 4 5 4
> > > > > 5 4 4 4
> > > > > 4 4 4 4
> > > > >
> > > > > Thanks for your suggestions.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Devon
> > > > >
> > > > > On Fri, Oct 16, 2009 at 3:47 PM, Don Guinn <[email protected]> wrote:
> > > > >
> > > > > > Looked up the definition of "median" and it appears that there are
> > > > > > several definitions of "median". And, according to
> > > > > > http://en.wikipedia.org/wiki/Median median and quartiles can be
> > > > > > messy with badly skewed data. Best I can tell this is a measurement
> > > > > > that should be used with care.
> > > > > > I wrote a quick verb which gives the same answers as the site you
> > > > > > referenced and it does strange things, depending on the data. If the
> > > > > > count of the set is odd, which group should have the extra number?
> > > > > > What if the data is really skewed?
> > > > > >
> > > > > >   qr=.([:([:(+/%#)]{~[:(<:,:])[:>.0.25 0.5 0.75"_*#)]/:]) NB. Needs cleaning up.
> > > > > >   qr scrs
> > > > > > 52.5 61 70.5
> > > > > >    qr i.4
> > > > > > 0.5 1.5 2.5
> > > > > >   qr i.5
> > > > > > 1.5 2.5 3.5
> > > > > >   qr i.12
> > > > > > 2.5 5.5 8.5
> > > > > >   qr i.11
> > > > > > 2.5 5.5 8.5
> > > > > >   qr i.13
> > > > > > 3.5 6.5 9.5
> > > > > >   -~/0 2{qr scrs
> > > > > > 18
> > > > > >   qr 1 1 1 1 1 2 3 4
> > > > > > 1 1 2.5
> > > > > >
> > > > > >
> > > > > > On Fri, Oct 16, 2009 at 1:21 PM, Devon McCormick <[email protected]> wrote:
> > > > > >
> > > > > > > Members of the forum -
> > > > > > >
> > > > > > > while looking up some statistical definitions, I came across this
> > > > > > > example
> > > > > > > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability
> > > > > > > in which the calculation of the median disagrees with the result of
> > > > > > > the one listed as "m0=: median=: <.@-:@# { /:~" in "MathStats" on
> > > > > > > the J wiki.
> > > > > > >
> > > > > > > I was actually looking at the definition of quartiles when I noticed
> > > > > > > this.
> > > > > > > For the series
> > > > > > >
> > > > > > >   #scrs=. 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
> > > > > > > 20
> > > > > > >   m0=: <.@-:@# { /:~
> > > > > > >   m0 scrs
> > > > > > > 62
> > > > > > >   median scrs  NB. my own definition
> > > > > > > 61
> > > > > > >   median
> > > > > > > -:@(+/)@((<. , >.)@midpt { /:~)
> > > > > > >   midpt
> > > > > > > -:@<:@#
> > > > > > >
> > > > > > > Also, this site's answers disagree with Excel and with my own
> > > > > > > quartile function, applied to "scrs" above, but I think the site is
> > > > > > > correct:
> > > > > > >   NB. Quartiles 1-3 according to Excel:
> > > > > > >   52.75 61 70.25
> > > > > > >
> > > > > > >   NB. According to http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability:
> > > > > > >   52.5 61 70.5
> > > > > > >
> > > > > > >   0 1 2 quartile&><scrs
> > > > > > > 52 60 70
> > > > > > >
> > > > > > > NB. My "quartile" disagrees with my "median": the middle quartile
> > > > > > > NB. should be the same as the median.
> > > > > > >   quartile
> > > > > > > 4 : 'x{4 ntilebps y'
> > > > > > >   ntilebps
> > > > > > > 4 : 0
> > > > > > > NB.* ntilebps: return breakpoint values of x-tiles of y; e.g. 4 ntilebps y
> > > > > > > NB.  -> quartiles; 0-based so "1st" quartile is 0{4 ntilebps y.
> > > > > > >   quant=. x
> > > > > > >   y=. /:~y
> > > > > > >   wh=. 0 1#:(i.quant)*quant%~#y  NB. Where partition points are exactly
> > > > > > >   'n f'=. |:wh                    NB. whole and fractional part of partitions
> > > > > > >   1|.+/"1 ((1-f),.f)*(n+/_1 0){y NB. "1|." moves top quantile to end.
> > > > > > > )
> > > > > > >
> > > > > > > Anyone care to weigh in on this?
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Devon
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Devon McCormick, CFA
> > > > > > > ^me^ at acm.
> > > > > > > org is my
> > > > > > > preferred e-mail
> > > > > > > -----------------------------------------------------------
> --
> > ----
> > > > --
> > > > > ---
> > > > > > > For information about J forums see
> > > > > http://www.jsoftware.com/forums.htm
> > > > > > >
> > > > > > -------------------------------------------------------------
> --
> > ----
> > > > --
> > > > > -
> > > > > > For information about J forums see
> > > > > http://www.jsoftware.com/forums.htm
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Devon McCormick, CFA
> > > > > ^me^ at acm.
> > > > > org is my
> > > > > preferred e-mail
> > > > > ---------------------------------------------------------------
> --
> > ----
> > > > -
> > > > > For information about J forums see
> > > > http://www.jsoftware.com/forums.htm
> > > > -----------------------------------------------------------------
> --
> > ---
> > > > For information about J forums see
> > http://www.jsoftware.com/forums.htm
> > > -------------------------------------------------------------------
> --
> > -
> > > For information about J forums see
> > http://www.jsoftware.com/forums.htm
> > >
> >
> >
> >
> > --
> > Devon McCormick, CFA
> > ^me^ at acm.
> > org is my
> > preferred e-mail
> > ---------------------------------------------------------------------
> -
> > For information about J forums see
> http://www.jsoftware.com/forums.htm
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to