If you want to generalize to more columns, consider that giving each column
a separate name gets in the way from the start. Keeping the key columns
together as one table avoids that.
mat=: ?(1e6,3)$999
sum=: +/
key=. 2{."1 mat NB. Why not use a meaningful name?
z=. 2{"1 mat
6!:2 '(~.key) ; (i.~key) sum/. z'
0.85759438
6!:2 '(~.2{."1 mat);,.(2{."1 mat) +//. 2{"1 mat'
0.39987687
NB. So about half the time is spent interpreting "sum".
NB. Pre-calculating "key" and "z" makes essentially no difference.
6!:2 '(~.key) ; ,.key +//. z'
0.37439193
NB. Breakdown of the two major components:
6!:2 '~.key'
0.16903943
6!:2 'key+//.z'
0.20682803
NB. But partitioning is really the object:
6!:2 '(~.key);key</.z'
0.74056007
6!:2 '~.key'
0.1676011
6!:2 'key</.z'
0.50268364
NB. Timing depends heavily on the number of distinct partitions:
$~.key=. ?(1e6,2)$100
10000 2
6!:2 'key </. z'
0.10464443
$~.key=. ?(1e6,2)$1000
632047 2
6!:2 'key </. z'
0.49147517
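
To make the generalization to an arbitrary number of key columns concrete,
here is a minimal sketch. The verb name groupsum and the sample table t are
hypothetical, not part of the session above, and the sketch assumes the key
columns come first and the value to total is the last column.

```j
NB. Hypothetical helper: group table y by its first x columns
NB. and total the last column within each group.
groupsum=: 4 : 0
key=. x {."1 y     NB. first x columns, kept together as one key table
val=. {:"1 y       NB. last column holds the values to total
(~. key) ,. key +//. val
)

t=: 4 4 $ 1 2 3 10  1 2 3 5  1 2 4 7  9 9 9 1

3 groupsum t       NB. group on three columns
NB. 1 2 3 15
NB. 1 2 4  7
NB. 9 9 9  1

2 groupsum t       NB. group on two columns, same verb
NB. 1 2 22
NB. 9 9  1
```

Because the key stays a single table, changing the number of grouping
columns only changes the left argument.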
On 2/8/08, Jack Andrews <[EMAIL PROTECTED]> wrote:
>
> > Finally, I'm keen to have a generalised form of this groupby available to
> > me. Ie. Group by an arbitrary number of columns, not just two.
>
> /. does generalize
>
> see the thread starting:
> http://www.jsoftware.com/pipermail/general/2000-June/003599.html
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
--
Devon McCormick, CFA
^me^ at acm.
org is my
preferred e-mail