Don - I like yours better than the one I have now, though I'll probably
generalize it into an "Ntiler".

Part of the problem is that there are multiple correct answers if we define
quartile numbers as those which divide the set as evenly as possible into
four groups, e.g.

   quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'  NB. Count elements/quartile
NB. All these different answers work correctly:
   (52.75 61 70.25) quartileCt scrs  NB. Excel
5 5 5 5
   (52.5 61 70.5) quartileCt scrs    NB. web site
5 5 5 5
   (52.1 61.1 70.1) quartileCt scrs  NB. another answer...
5 5 5 5
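
The reason all three sets of breakpoints tie is, I think, that each cut only has to land somewhere between two adjacent sorted scores. A quick sketch (with scrs redefined here exactly as in my first message) lists the neighbours on either side of each ideal cut:

```j
   scrs=: 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
   NB. 5th/6th, 10th/11th and 15th/16th sorted values, one pair per row:
   (4 9 14 ,. 5 10 15) { /:~ scrs
52 53
60 62
70 71
```

Any triple with its first value strictly between 52 and 53, its second between 60 and 62 and its third between 70 and 71 splits the set 5 5 5 5, which covers all three answers above.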

One way to test, as you suggest, is to look at the behavior when the number
of elements is "odd" with respect to four, i.e. not a multiple of four:

NB. Two different ways of counting number of elements/quartile:
   quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'  NB. lower <  y <: upper
   quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'  NB. lower <: y <  upper
NB. Two different quartilers:
   test0=: 1 : '(3{.4 ntilebps y) u y'  NB. Mine
   test1=: 1 : '(qr y) u y'             NB. Don's

NB. Both work OK for even and odd cases counted one way...
   quartileCt test0&>0 1 2 3 4}.&.><scrs
5 5 5 5
4 5 5 5
4 5 4 5
4 4 4 5
4 4 4 4
   quartileCt test1&>0 1 2 3 4}.&.><scrs
5 5 5 5
5 5 5 4
5 4 5 4
5 4 4 4
4 4 4 4

NB. Mine falls down for a couple of cases counted the other way:
   quartileCt2 test0&>0 1 2 3 4}.&.><scrs
4 5 5 6
4 5 5 5
4 4 5 5
4 4 4 5
3 4 4 5
NB. but Don's works OK under different counting method as well:
   quartileCt2 test1&>0 1 2 3 4}.&.><scrs
5 5 5 5
5 5 5 4
5 4 5 4
5 4 4 4
4 4 4 4
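
My guess at why the counting method matters for mine but not for Don's: quartileCt puts a score equal to a breakpoint into the lower group, quartileCt2 puts it into the upper group, and my breakpoints for scrs (52 60 70) are themselves scores, whereas Don's fall between scores. A small sketch, using only the definitions and values already given in this thread:

```j
   scrs=: 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
   quartileCt=:  4 : '+/"1 (y>:/~x,_) *. y< /~__,x'  NB. lower <  y <: upper
   quartileCt2=: 4 : '+/"1 (y> /~x,_) *. y<:/~__,x'  NB. lower <: y <  upper
   52 60 70 quartileCt scrs       NB. breakpoints that are data values
5 5 5 5
   52 60 70 quartileCt2 scrs      NB. same breakpoints, ties go the other way
4 5 5 6
   52.5 61 70.5 quartileCt scrs   NB. breakpoints between data values
5 5 5 5
   52.5 61 70.5 quartileCt2 scrs
5 5 5 5
```

Breakpoints that never coincide with a score give the same counts under either rule, so only the choice of which group gets the extra element is left to argue about.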

Thanks for your suggestions.

Regards,

Devon

On Fri, Oct 16, 2009 at 3:47 PM, Don Guinn <[email protected]> wrote:

> Looked up the definition of "median" and it appears that there are several
> definitions of "median". And, according to
> http://en.wikipedia.org/wiki/Median median and quartiles can be messy with
> badly skewed data. Best I can tell this is a measurement that should be
> used with care.
> I wrote a quick verb which gives the same answers as the site you
> referenced and it does strange things, depending on the data. If the count
> of the set is odd, which group should have the extra number? What if the
> data is really skewed?
>
>   qr=.([:([:(+/%#)]{~[:(<:,:])[:>.0.25 0.5 0.75"_*#)]/:]) NB. Needs cleaning up.
>   qr scrs
> 52.5 61 70.5
>    qr i.4
> 0.5 1.5 2.5
>   qr i.5
> 1.5 2.5 3.5
>   qr i.12
> 2.5 5.5 8.5
>   qr i.11
> 2.5 5.5 8.5
>   qr i.13
> 3.5 6.5 9.5
>   -~/0 2{qr scrs
> 18
>   qr 1 1 1 1 1 2 3 4
> 1 1 2.5
>
>
> On Fri, Oct 16, 2009 at 1:21 PM, Devon McCormick <[email protected]>
> wrote:
>
> > Members of the forum -
> >
> > while looking up some statistical definitions, I came across this example
> > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability
> > in which the calculation of the median disagrees with the result of the
> > one listed as "m0=: median=: <.@-:@# { /:~" in "MathStats" on the J wiki.
> >
> > I was actually looking at the definition of quartiles when I noticed this.
> >
> > For the series
> >
> >   #scrs=. 43 48 50 50 52 53 56 58 59 60 62 65 66 68 70 71 74 76 78 80
> > 20
> >   m0=: <.@-:@# { /:~
> >   m0 scrs
> > 62
> >   median scrs  NB. my own definition
> > 61
> >   median
> > -:@(+/)@((<. , >.)@midpt { /:~)
> >   midpt
> > -:@<:@#
> >
> > Also, this site's answers disagree with Excel and with my own quartile
> > function, applied to "scrs" above, but I think the site is correct:
> >   NB. Quartiles 1-3 according to Excel:
> >   52.75 61 70.25
> >
> >   NB. According to
> > http://www2.le.ac.uk/offices/ssds/sd/ld/resources/numeracy/variability:
> >   52.5 61 70.5
> >
> >   0 1 2 quartile&><scrs
> > 52 60 70
> >
> > NB. My "quartile" disagrees with my "median": the middle quartile should
> > be the same as the median.
> >   quartile
> > 4 : 'x{4 ntilebps y'
> >   ntilebps
> > 4 : 0
> > NB.* ntilebps: return breakpoint values of x-tiles of y; e.g.
> > NB.  4 ntilebps y -> quartiles; 0-based so "1st" quartile is 0{4 ntilebps y.
> >   quant=. x
> >   y=. /:~y
> >   wh=. 0 1#:(i.quant)*quant%~#y  NB. Where partition points are exactly
> >   'n f'=. |:wh  NB. whole and fractional part of partitions
> >   1|.+/"1 ((1-f),.f)*(n+/_1 0){y NB. "1|." moves top quantile to end.
> > )
> >
> > Anyone care to weigh in on this?
> >
> > Regards,
> >
> > Devon
> >
> >
> > --
> > Devon McCormick, CFA
> > ^me^ at acm.
> > org is my
> > preferred e-mail
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> >
>



-- 
Devon McCormick, CFA
^me^ at acm.
org is my
preferred e-mail