This thread prompted me to look at the following message from 2009:
http://www.jsoftware.com/pipermail/programming/2009-November/016859.html
From: Raul Miller
Date: Sat Nov 14 03:33:39 HKT 2009
Subject: rank by key
On Fri, Nov 13, 2009 at 1:10 PM, Tirrell, Jordan (Consultant) wrote:
> key=: 1 1,1 0,1 1,1 0,1 0,0 1,:1 1
> data=: 0 1,1 1,0 2,0 2,2 0,0 0,:1 1
> key rankbykey1 data
> 2 1 1 2 0 0 0
> key rankbykey2 data
> 2 1 1 2 0 0 0
>
> I cannot figure out how to use ~: to express this function as Raul
> Miller suggested, does anyone see a specific solution using it?
Perhaps something like this would work for you?
rankbykey3=: (/:@/:@(i.~ ~.)@[ { [:; <@(i.~ \:~)/.)
Thanks,
--
Raul
I have discovered an improvement to rankbykey3; how much it helps depends
on the data. To make it easier to talk about, I'll split the function into
pieces:
   rx =: /:@/:@(i.~ ~.)@[
   ry =: [: ; <@(i.~ \:~)/.
   rk3=: rx { ry
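
To see what each piece contributes, here is a small illustrative session using the key and data from the quoted message. It is a sketch rather than part of the original post; input lines are indented three spaces and results are flush left.

```j
   rx =: /:@/:@(i.~ ~.)@[
   ry =: [: ; <@(i.~ \:~)/.
   rk3=: rx { ry
   key=: 1 1,1 0,1 1,1 0,1 0,0 1,:1 1
   data=: 0 1,1 1,0 2,0 2,2 0,0 0,:1 1
   (i.~ ~.) key        NB. index of each key row in the nub of the keys
0 1 0 1 1 2 0
   key ry data         NB. descending rank within each group, groups in nub order
2 1 0 1 2 0 0
   key rx data         NB. position of each original row in that grouped order
0 3 1 4 5 6 2
   key rk3 data        NB. select with rx to restore the original row order
2 1 1 2 0 0 0
```

So ry does the ranking group by group, and rx is the permutation that puts those per-group results back in the order of the original rows.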
rx is an instance of the problem described in
http://www.jsoftware.com/jwiki/Essays/Index_in_Nub , and you can speed it
up by using a *longer* function:
   rx@(i.~)@[
The essay explains why it is faster. Moreover, having already applied i.~
to the keys, you can also use the resulting integer vector as the left
argument (keys) to f/., because integers are handled more efficiently than
other kinds of keys. Putting it all together:
   rk4=: i.~@[ (rx { ry) ]
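
As a quick check on small arguments, the following sketch (using the key and data from the quoted message, not a session from the original post) shows the integer vector that i.~ produces from the keys and confirms that rk4 agrees with rk3:

```j
   rx =: /:@/:@(i.~ ~.)@[
   ry =: [: ; <@(i.~ \:~)/.
   rk3=: rx { ry
   rk4=: i.~@[ (rx { ry) ]
   key=: 1 1,1 0,1 1,1 0,1 0,0 1,:1 1
   data=: 0 1,1 1,0 2,0 2,2 0,0 0,:1 1
   i.~ key             NB. each key row replaced by the index of its first occurrence
0 1 0 1 1 5 0
   key rk4 data
2 1 1 2 0 0 0
   key (rk3 -: rk4) data
1
```

The integers 0 1 0 1 1 5 0 partition the rows exactly as the original two-column keys do, so both rx and ry see the same groups.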
For example:
   k=: 1e6 2 ?@$ 1e4
   d=: 1e6 2 ?@$ 1e6
   k (rankbykey3 -: rk4) d
1
   timer 'k rankbykey3 d'
4.02691
   timer 'k rk4 d'
3.1878
As I said, the amount of improvement is data-dependent.
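
The verb timer is not defined in this message. A minimal sketch for repeating the comparison, assuming timer is simply the timing foreign 6!:2 (which returns the seconds taken to execute a sentence); the actual definition used above may differ, and the smaller arguments here are only there to keep the run short:

```j
   rx =: /:@/:@(i.~ ~.)@[
   ry =: [: ; <@(i.~ \:~)/.
   rk3=: rx { ry
   rk4=: i.~@[ (rx { ry) ]
   timer=: 6!:2                          NB. assumption: seconds to execute a sentence
   k=: 1e4 2 ?@$ 1e2                     NB. far fewer rows than the 1e6-row test above
   d=: 1e4 2 ?@$ 1e4
   k (rk3 -: rk4) d                      NB. the two verbs should match
   (timer 'k rk3 d') , timer 'k rk4 d'   NB. two timings in seconds; machine- and data-dependent
```

Changing the number of distinct keys (the 1e2 here) is one way to see how the gap between the two timings moves with the data.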