The treatment of an argument with duplicate entries can be seen as follows:

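f =: 3 : 0
sorted=. \:~ y              NB. argument in descending order
vals =. (+/\ % +/) sorted   NB. running sum as a fraction of the total
(sorted i. y) { vals        NB. look up each entry by its first match in sorted
)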

g =: 3 : 0
i /:~ (+/\ % +/) i{y [ i=. \:y
)

When the argument does not have duplicate entries, f and g are identical:

   (f -: g) 40?60
1
   (f -: g) 40?60
1
   (f -: g) 40?60
1

But when there are duplicate entries:

   (f -: g) y=: 3 4 3 3 3
0
   f y
0.4375 0.25 0.4375 0.4375 0.4375
   g y
0.4375 0.25 0.625 0.8125 1
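
The two results can be traced step by step in a session.  This is only a
sketch: the global names i and r are introduced here for illustration and
are not part of f or g.

```j
   y=: 3 4 3 3 3
   ] i=: \: y            NB. grade down; equal items keep their original order
1 0 2 3 4
   ] r=: (+/\ % +/) i{y  NB. cumulated ratios of the sorted argument
0.25 0.4375 0.625 0.8125 1
   (\:~ y) i. y          NB. what f indexes with: the first match only
1 0 1 1 1
   r /: i                NB. what g returns: r restored to the original order
0.4375 0.25 0.625 0.8125 1
```

Every 3 in y finds the same first match in the sorted list, which is why f
repeats 0.4375 while g gives each 3 a ratio of its own.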

That g is "correct" on the original y can be seen by applying f and g to a
slightly perturbed y, where each entry gets its own cumulated ratio:

   y1=: y+1e_10*i.#y
   f y1
1 0.25 0.8125 0.625 0.4375
   g y1
1 0.25 0.8125 0.625 0.4375

The difference between f and g is that g re-orders the cumulated ratios
using the inverse of the permutation i that sorted the argument.  When y has
no duplicates, the inverse permutation of i is the same as (sorted i. y); in
general, the inverse permutation is /:i.  Note: i/:~blah ←→ blah/:i ←→
(/:i){blah.
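
A small session makes the point concrete.  It is a sketch on made-up data:
the names w and blah and their values are chosen here only for illustration.

```j
   w=: 3 1 4 1 5
   ] i=: \: w            NB. permutation that sorts w in descending order
4 2 0 1 3
   /: i                  NB. inverse permutation of i
2 3 1 4 0
   (\:~ w) i. w          NB. first-match lookup; differs at the second 1
2 3 1 3 0
   blah=: 10 20 30 40 50
   (i /:~ blah) -: blah /: i
1
   (blah /: i) -: (/: i) { blah
1
```

The lookup with i. and the inverse permutation disagree only at the repeated
entry, which is the one case where f and g part ways.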




On Tue, Jun 3, 2014 at 6:49 AM, Joe Bogner <[email protected]> wrote:

> Thanks Pascal - good solution to my incorrect approach.
>
> Roger, I am using this with a pareto chart to identify what bin each record
> would fall under.
>
> http://en.wikipedia.org/wiki/Pareto_chart
>
> I don't need to draw the chart, I just need to know what each record would
> be classified as:
>
> causes =: > ' ' cut each LF cut(0 : 0)
> Public 47
> Weather 28
> Oversight 18
> Emergency 12
> Traffic 5
> ChildCare 57
> )
>
> vals=. ". > 1}"1 causes
>
> runsumpct =: 3 : 0
> sorted=. \:~ y
> vals =. (+/\ % +/) sorted
> (sorted i. y) { vals
> )
>
> pct=:runsumpct vals
> causes,.>each pct
>
> ┌─────────┬──┬────────┐
> │Public   │47│0.622754│
> ├─────────┼──┼────────┤
> │Weather  │28│0.790419│
> ├─────────┼──┼────────┤
> │Oversight│18│0.898204│
> ├─────────┼──┼────────┤
> │Emergency│12│0.97006 │
> ├─────────┼──┼────────┤
> │Traffic  │5 │1       │
> ├─────────┼──┼────────┤
> │ChildCare│57│0.341317│
> └─────────┴──┴────────┘
>
> (/: pct) { (causes,.>each pct)
>
> ┌─────────┬──┬────────┐
> │ChildCare│57│0.341317│
> ├─────────┼──┼────────┤
> │Public   │47│0.622754│
> ├─────────┼──┼────────┤
> │Weather  │28│0.790419│
> ├─────────┼──┼────────┤
> │Oversight│18│0.898204│
> ├─────────┼──┼────────┤
> │Emergency│12│0.97006 │
> ├─────────┼──┼────────┤
> │Traffic  │5 │1       │
> └─────────┴──┴────────┘
>
> I suppose I could sort the data before providing it to the function if that
> helps.
>
> You are right that dupes cause problems with using i. to locate the record.
> Thank you for pointing that out. I don't know how to fix it yet and would
> welcome any suggestions.
>
> causes =: > ' ' cut each LF cut(0 : 0)
> Public 47
> Weather 28
> Oversight 18
> Emergency 12
> Traffic 5
> ChildCare 57
> XYZ 5
> )
>
>
> (/: pct) { (causes,.>each pct)
> ┌─────────┬──┬────────┐
> │ChildCare│57│0.331395│
> ├─────────┼──┼────────┤
> │Public   │47│0.604651│
> ├─────────┼──┼────────┤
> │Weather  │28│0.767442│
> ├─────────┼──┼────────┤
> │Oversight│18│0.872093│
> ├─────────┼──┼────────┤
> │Emergency│12│0.94186 │
> ├─────────┼──┼────────┤
> │Traffic  │5 │0.97093 │
> ├─────────┼──┼────────┤
> │XYZ      │5 │0.97093 │
> └─────────┴──┴────────┘
>
>
>
>
> On Tue, Jun 3, 2014 at 9:10 AM, Roger Hui <[email protected]>
> wrote:
>
> >    y
> > 1 100 5 10
> >    y,.runsumpct y
> >   1        1
> > 100 0.862069
> >   5 0.991379
> >  10 0.948276
> >
> > Please provide an English description of the problem being solved.  In
> > particular, I don't understand how the result is "in the original order".
> >  In addition, won't you have a problem if the argument has duplicate
> > entries?
> >
> >    t,.runsumpct t=: y,1 1 1
> >   1  0.97479
> > 100 0.840336
> >   5 0.966387
> >  10  0.92437
> >   1  0.97479
> >   1  0.97479
> >   1  0.97479
> >
> >
> >
> >
> >
> > On Tue, Jun 3, 2014 at 4:32 AM, Joe Bogner <[email protected]> wrote:
> >
> > > Is there a cleaner way to write this or is this a reasonable
> > > implementation?
> > >
> > > runsumpct =: 3 : 0
> > > sorted=: \:~ y
> > > vals =: (+/\ % +/) sorted
> > > (sorted i. y) { vals
> > > )
> > >
> > > runsumpct 1 100 5 10
> > > 1 0.862069 0.991379 0.948276
> > >
> > >
> > >
> > > I'm interested if there's a cleaner approach to sorting, operating, and
> > > then returning the result in the original order.
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm