Re: [Jprogramming] Classification problem

Roger Hui Fri, 24 Sep 2010 09:03:04 -0700

http://www.jsoftware.com/jwiki/Essays/Order%20Statistics


Incidentally, the solutions to this problem are an instance 
of the "elegant vs. practical" discussion more finely balanced
than the dyadic index-of example.  There is a short and 
elegant solution which is not extravagantly more costly than 
the longer and faster solution.  The argument tilts further 
towards the "short and elegant" solution when you have 
to compute more than one, such as (min,q1,med,q3,max).
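
A minimal sketch of the short form, following the Moore and McCabe
rule Kip quotes below (find the median, then the median of each half,
leaving the middle item out of the halves when the count is odd).
The names median and fivenum are my own for illustration, not taken
from the essay, and this is the sort-everything approach rather than
a selection algorithm. It assumes a numeric list of at least two items.

```j
NB. median: mean of the two middle items of the sorted list
NB. (the two indices coincide when the count is odd)
median  =: -:@(+/)@((<. , >.)@-:@<:@# { /:~)

NB. fivenum: min, Q1, median, Q3, max of a numeric list y
NB. (hypothetical helper; assumes y has at least two items)
fivenum =: 3 : 0
s =. /:~ y                 NB. sort once
h =. <. -: # s             NB. size of each half, middle item excluded
(<./ s) , (median h {. s) , (median s) , (median (-h) {. s) , >./ s
)

   fivenum 6 7 15 36 39 40 41 42 43 47 49
6 15 40 43 49
```

Once the list is sorted, each extra statistic costs only an index
or two, which is why the sort-based version looks better the more
of them you need.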



----- Original Message -----
From: Kip Murray <k...@math.uh.edu>
Date: Friday, September 24, 2010 8:09
Subject: Re: [Jprogramming] Classification problem
To: programming@jsoftware.com

> About "Don't do that!": modern statistics texts recommend the 
> five-number summary
> 
> Minimum Q1 M Q3 Maximum
> 
> as a reasonably complete description of center and spread. M, Q1, and
> Q3 are the median and quartiles; see David S. Moore and George P.
> McCabe, Introduction to the Practice of Statistics, p. 42, for a
> description of how to calculate them. (Find the median, then find the
> median of the observations below the median and the median of the
> observations above the median. J exercise: find the five-number
> summary of a list.)
> 
> Kip Murray
> 
> 
> On 9/24/2010 1:37 AM, Bo Jacoby wrote:
> > First comes relevance, then correctness, performance, and elegance
> > in some order. Not every problem is well posed. For example:
> > Question: from a long list of numbers, how to compute the 15th,
> > 50th and 85th percentile? Answer: Don't do that! Compute the mean
> > value and the standard deviation instead.
> > Kind regards, Bo
> >
> >
> > --- On Thu 23/9/10, Roger Hui<rhui...@shaw.ca> wrote:
> >
> > From: Roger Hui<rhui...@shaw.ca>
> > Subject: Re: [Jprogramming] Classification problem
> > To: "Programming forum"<programming@jsoftware.com>
> > Date: Thursday, 23 September 2010 16:48
> >
> > At first I was going to respond to R.E. Boss with a
> > cute and annoying reply, something like "first comes
> > elegance, then elegance, then elegance".  But I think
> > now I agree with him, "first comes correctness, then
> > performance, then elegance",  rather than "first comes
> > elegance and correctness".  A counterexample to the
> > latter is a model for dyadic index-of for vectors.
> > An elegant and correct model is:
> >
> > ix=: #@[ - (+/)@(+./\)@(=/)
> >
> > But due to abysmal performance this is pretty useless
> > in practice.
> >
> > I guess it depends on what you mean by "first comes".
> > If you were implementing dyadic index-of you _would_
> > first write something like the above, but almost
> > immediately you would write something else.
> >
> >
> >
> > ----- Original Message -----
> > From: Robert Raschke<rtrli...@googlemail.com>
> > Date: Thursday, September 23, 2010 6:57
> > Subject: Re: [Jprogramming] Classification problem
> > To: Programming forum<programming@jsoftware.com>
> >
> >> I disagree: first comes elegance and correctness, and these tend
> >> to go hand in hand if you are concentrating on elegance. I quite
> >> strongly believe a human reader is way more important than any
> >> machine.
> >>
> >> Robby
> >>
> >> On Thu, Sep 23, 2010 at 2:24 PM, R.E. Boss
> >> <r.e.b...@planet.nl>  wrote:
> >>
> >>> I agree. First comes correctness, then performance, then elegance.
> >>>
> >>>
> >>> R.E. Boss
> >>>
> >>>
> >>>> -----Original message-----
> >>>> From: programming-boun...@jsoftware.com [mailto:programming-
> >>>> boun...@jsoftware.com] On behalf of Brian Schott
> >>>> Sent: Thursday, 23 September 2010 14:41
> >>>> To: Programming forum
> >>>> Subject: Re: [Jprogramming] Classification problem
> >>>>
> >>>> I agree that this solution is elegant, but for a large data set I
> >>>> assume that Raul's idea of prepending and then dropping 3 elements
> >>>> would be more efficient. Don't you, too?
> >>>>
> >>>>     (<@}./.~ *) _1 0 1,data
> >>>>
> >>>> On Thu, Sep 23, 2010 at 8:26 AM, R.E. Boss
> >> <r.e.b...@planet.nl>  wrote:
> >>>>> One of the more elegant solutions is
> >>>>>
> >>>>>     ((/:&(*</. ])) *) data
> >>>>> +---------------+---+-----+
> >>>>> |_3 _1 _10 _2 _4|0 0|1 1 6|
> >>>>> +---------------+---+-----+
> >>>>>
> >>>>>
> >>>>> R.E. Boss
> >>>>>
> >>>>>
> >>>>>> -----Original message-----
> >>>>>> From: programming-boun...@jsoftware.com [mailto:programming-
> >>>>>> boun...@jsoftware.com] On behalf of Marshall Lochbaum
> >>>>>> Sent: Wednesday, 22 September 2010 23:57
> >>>>>> To: Programming forum
> >>>>>> Subject: Re: [Jprogramming] Classification problem
> >>>>>>
> >>>>>> It looks to me like the most terse way is
> >>>>>> ((</. /: ~.@[)~ f) data
> >>>>>>
> >>>>>> Although this computes the nub twice, unlike some earlier
> >>>>>> solutions.
> >>>>>>
> >>>>>> Marshall
> >>>>>>
> >>>>>> ________________________________________
> >>>>>> From: programming-boun...@jsoftware.com [programming-
> >>>>>> boun...@jsoftware.com] On Behalf Of Raul Miller
> >>>>>> [rauldmil...@gmail.com]
> >>>>>> Sent: Wednesday, September 22, 2010 3:29 PM
> >>>>>> To: Programming forum
> >>>>>> Subject: Re: [Jprogramming] Classification problem
> >>>>>>
> >>>>>> Note, if * is the universe of interesting
> >>>>>> functions (if it does not need to be generic) then
> >>>>>> I would be tempted to use a variation on Dan's
> >>>>>> second suggestion:
> >>>>>>
> >>>>>>      (<@}./.~ *) _1 0 1,data
> >>>>>>
> >>>>>> Note that this also preserves the relative
> >>>>>> ordering of the data items.
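
For reference, a short sketch of how the sentinel version quoted
above behaves. The sample values in data are hypothetical, chosen
only so that the groups match the boxed result shown earlier in the
thread; the actual data from the original question is not reproduced
here.

```j
data =: 1 _3 0 _1 1 _10 0 6 _2 _4    NB. hypothetical sample values

NB. Prepend one sentinel per class so the keys _1 0 1 always occur,
NB. in that order; group by sign with key (/.), then behead (}.)
NB. each group to drop its sentinel before boxing.
   (<@}./.~ *) _1 0 1 , data
+---------------+---+-----+
|_3 _1 _10 _2 _4|0 0|1 1 6|
+---------------+---+-----+
```

Because every class is guaranteed one sentinel, a class with no
data items still yields a box (an empty one), and no sort of the
keys is needed to put the boxes in a fixed order.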
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
