In such cases it may be worthwhile to build a keying list whose items correspond one-to-one with those of the list whose nub you want. If you compute a simple unique value for each item, you can take the nub (or nub sieve) of the keys and rely on the correspondence to select from the original list as needed.

On Jun 25, 2013 11:29 PM, "Christopher Rosin" <[email protected]> wrote:
> I was having a performance problem that I traced to nub applied to boxed
> arrays.
>
> Nub sieve ~: gives the same results here whether the items are unboxed,
> boxed, or doubly boxed:
>    +/ ~: (?. 10000 4 4 $ 2)
> 9255
>    +/ ~: <"2 (?. 10000 4 4 $ 2)
> 9255
>    +/ ~: <"0 <"2 (?. 10000 4 4 $ 2)
> 9255
>
> But the runtime is very different in the doubly boxed case:
>    6!:2 '+/ ~: (?. 10000 4 4 $ 2)'
> 0.00105408
>    6!:2 '+/ ~: <"2 (?. 10000 4 4 $ 2)'
> 0.00585098
>    6!:2 '+/ ~: <"0 <"2 (?. 10000 4 4 $ 2)'
> 14.9312
>
> Boxing the items only once, performance appears close to linear:
>    6!:2 '+/ ~: <"2 (?. 1000 4 4 $ 2)'
> 0.000527954
>    6!:2 '+/ ~: <"2 (?. 10000 4 4 $ 2)'
> 0.00488113
>    6!:2 '+/ ~: <"2 (?. 100000 4 4 $ 2)'
> 0.075351
>
> But doubly-boxed, performance seems to become nearly quadratic:
>    6!:2 '+/ ~: <"0 <"2 (?. 1000 4 4 $ 2)'
> 0.162159
>    6!:2 '+/ ~: <"0 <"2 (?. 10000 4 4 $ 2)'
> 14.9312
>    6!:2 '+/ ~: <"0 <"2 (?. 100000 4 4 $ 2)'
> 1106.85
>
> Timing is similar with nub instead of nub sieve.
>
> Is there any J documentation that explains the performance of nub in
> various scenarios? I haven't been able to find any.
>
> Thanks.
> -Chris
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
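
To make the keying-list suggestion concrete, here is a minimal sketch using the test data from the quoted message. The names arr, data, keys, sieve and uniq are only illustrative, and the key used here (reading each 4 by 4 boolean matrix as a 16-bit integer) is an assumption that fits this particular data; it is unique only because every item is boolean and has the same shape. Other data would need a different key.

```j
NB. Sketch only: all names below are illustrative, not from the original post.
arr   =: ?. 10000 4 4 $ 2     NB. same test data as in the quoted timings
data  =: <"0 <"2 arr          NB. the doubly boxed list that was slow
keys  =: #.@,@>@> data        NB. per item: open twice, ravel, read 16 bits as an integer
sieve =: ~: keys              NB. nub sieve computed on a plain integer list
uniq  =: sieve # data         NB. items of data selected through the correspondence
sieve -: ~: <"2 arr           NB. check: does it match the singly boxed nub sieve?
```

Since the keys are plain integers, the nub sieve never has to compare boxed items against each other, which is where the slow case seemed to spend its time. I have not timed this variant, so treat the speedup as something to measure rather than a given.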
