Joe

>What I meant by "just not box them separately in the first place" was, e.g.,

   ax=:{. ; }.
   fsp=:' ' i.~ ]
   tc =: (ax~fsp) ;._2 (0 : 0)
c1 p1 0.2
c1 p2 0.35
c2 p1 0.2
c2 p2 0.35
c3 p1 0.2
c3 p2 0.35
c3 p3 0.45
)

Then you would run tc through REB's fine finder.
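
A minimal sketch of that last step, assuming the definitions of ax, fsp and tc above have been entered in the same session. The name finder is made up here for illustration; its body is REB's expression from elsewhere in this thread with t replaced by y.

```j
NB. hypothetical name; body is REB's expression with y in place of t
finder =: 3 : '(~. {."1 y) </.~ (i.~ ~.) ({."1 </. }."1) y'

NB. each row of tc is (first word) ; (rest of the line, still unboxed)
$ tc
finder tc
```

With the seven sample lines, c1 and c2 carry the same two remainders, so they should land in one group, with c3 in a group of its own.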

greg
~krsnadas.org

--

from: 'Pascal Jasmin' via Programming <[email protected]>
to: "[email protected]" <[email protected]>
date: 27 July 2014 07:22
subject: Re: [Jprogramming] finding matching sets

Thanks for letting us know, Joe.

>You might see a big improvement with unboxed symbols taken just before 
>comparisons as in:

(~.{."1 t)</.~  ( i.~ ~.)  s:@:<@:, &> ({."1 </.([:;}.)"1) t

or

lr =: 3 : ' 5!:5 < ''y'''

(~.{."1 t)</.~  ( i.~ ~.) s: lr each ({."1 </.([:;}.)"1) t
(~.{."1 t)</.~  ( i.~ ~.) s: lr each ({."1 </.}."1) (i.~ ~.)"1 &.|: t
(I am guessing the previous one is slightly faster.)
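
For readers who have not used symbols, here is a small sketch of what s: contributes, using made-up strings (not data from the thread). Monadic s: turns a boxed list of strings into a symbol array of the same shape, and (i.~ ~.) then classifies the symbols just as it would the boxes.

```j
NB. made-up sample strings, for illustration only
names =. 'p1 0.2' ; 'p2 0.35' ; 'p1 0.2'
syms  =. s: names        NB. one symbol per boxed string
(i.~ ~.) syms            NB. index of each item among the unique items
(i.~ ~.) names           NB. same classification on the boxed strings
```

Both classifications should agree; the open question in the thread is whether comparing symbols is cheaper than comparing boxes on large inputs.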

--

from: R.E. Boss <[email protected]>
to: [email protected]
date: 27 July 2014 06:56
subject: Re: [Jprogramming] finding matching sets

   pd 'reset'
   pd 'type marker'
   pd 'color red'
   pd j./"1  ,:7.87302 2.63376e8
   pd 'color green'
   pd j./"1  ,:2.74571 3.24026e8
   pd 'color blue'
   pd j./"1  [6.53307 2.23953e8 ,: 2.46257 2.33259e8
   pd 'color black'
   pd j./"1  ,:11.5519 5.73956e8
   pd 'show'

2 <-> red
3 <-> blue
4 <-> green
5 <-> black

NB. I had to use  "1 ,:  otherwise

        error in: plot_gs_paint
        length error: qtmark_diamond
           'x y'    =.|:y

and I had to force J to shut down.
Engine: j701/2011-01-10/11:25
Library: 8.02.10
Qt IDE: 1.1.3/5.3.0
Platform: Win 64
Installer: J802 install

--

from: Joe Bogner <[email protected]>
to: [email protected]
date: 27 July 2014 06:18
subject: Re: [Jprogramming] finding matching sets

>Thank you all for the suggestions and alternate implementations. I ran 
>benchmarks on all of them and wanted to share the results:

$ t2
1763140 3

NB. Thomas's suggestion
groups1 =: 3 : 0
]key=. (}."1</.{."1) y
(<"1 t) ,.~ (;key) ,. (<"1|:e.key)
)
timespacex 'groups1 t2'
NB. out of memory

NB. Joe's implementation
groups2 =: 3 : 0
groups=. [: (}."1 each </. ])  ({."1 </. ])
ids=. ; L:2 @: ([: (~.L:1) 0{"1 L:1 ])
ids groups y
)
timespacex 'groups2 t2'
NB. 7.87302 2.63376e8

NB. R.E. Boss's 1st implementation
groups3a=: 3 : 0
(~. {."1 y) </.~ (i.~ ~.) ({."1 </. }."1) y
)
timespacex 'groups3a t2'
NB. 6.53307 2.23953e8

NB. R.E. Boss's 2nd implementation
groups3b =: 3 : 0
T=.(i.~ ~.)"1 &.|:y
(~.{."1 y) </.~ (i.~ ~.)({."1 </.}."1) T
)
timespacex 'groups3b t2'
NB. 2.46257 2.33259e8
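
A small sketch of what the first line of groups3b does, on a made-up 3-row table (not t2): each column is replaced by the index of each entry among that column's unique values, so the later grouping compares integers instead of boxes.

```j
NB. made-up 3 by 3 boxed table, for illustration only
t =. 3 3 $ 'c1';'p1';0.2;'c1';'p2';0.35;'c2';'p1';0.2
NB. transpose, classify each former column, transpose back
(i.~ ~.)"1 &.|: t
```

The result is an integer table of the same shape as t.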

NB. Greg Heil's implementation
groups4 =: 3 : 0
(~.{."1 y) </. ~ (i.~ ~.) ({."1 </.([:;}.)"1) y
)
timespacex 'groups4 t2'
NB. 2.74571 3.24026e8

NB. Pascal Jasmin's symbol test
groups5 =: 3 : 0
symboled=. ( ({. (, <)  [: s: <@:(1&{::  , ' ' , [: ": 2&{::))) "1 y
(([: ~. {."1) </.~ [: (i.~ ~.)({."1 </.}."1)) symboled
)
timespacex 'groups5 t2'
NB. 11.5519 5.73956e8

timespacex '(([: ~. {."1) </.~ [: (i.~ ~.)({."1 </.}."1)) symboled'
NB. 5.42007 1.38506e8

If I apply Greg's version to the symboled data:
timespacex '(~.{."1 symboled) </. ~ (i.~ ~.) ({."1 </.([:;}.)"1) symboled'
NB. 1.43217 1.09604e8

>So it seems that the symbol comparisons are faster, but in this case probably 
>not worth the six-second penalty of creating the symbols.
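
The arithmetic behind that estimate can be checked directly from the timings above. This is a quick sketch, where create is a name introduced here for illustration:

```j
NB. groups5 total minus the grouping step alone: approximate cost of building symboled
create =. 11.5519 - 5.42007
NB. symbol build plus Greg's grouping on symboled, to compare with 2.74571 for groups4
create + 1.43217
```

That puts symbol creation at about 6.1 seconds, and the combined symbol route at about 7.6 seconds against roughly 2.7 for groups4 on the boxed data.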


Of the 1.7M rows:

There are 2104 unique values in the first column
 $ ~. 0{"1 t2
2104

And 7,277 unique rows in the last two columns
$ ~. }."1 t2
7277 2

--

from: greg heil <[email protected]>
to: Programming forum <[email protected]>
date: 26 July 2014 21:58
subject: Re: [Jprogramming] finding matching sets

Joe

>If the only thing that matters is the car and cdr of each entry, why not just 
>raze the cdr, e.g.

   (~.{."1 t)</.~ (i.~ ~.) ({."1 </.([:;}.)"1) t

or just not box them separately in the first place;-)

>One could independently take symbols, and do that timing experiment.

greg
~krsnadas.org
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
