I enjoyed your paper and particularly enjoyed playing with the native J implementation:
xiy =: 13 : '+/*./\x~:/y' (i. 1e6) xiy 1e5 100000 I was surprised that the tacit J version performed reasonably close to the special code C version: big=: 1e6 100 timespacex 'big xiy 1e5' 1.01128e_5 2816 100 timespacex 'big i. 1e5' 1.23157e_6 1664 The timing starts to diverge on significantly boxed arrays it seems at first glance: big=: (1e6 # <'a') 100 timespacex 'big i. <''a''' 3.50044e_6 1920 100 timespacex 'big xiy <''a''' 0.054772 2.09933e6 That seems to be hitting an optimization where it stops on first find. Compare to: 100 timespacex 'big i. <''z''' 0.0486771 1920 And it runs similarly to the native J version, which was 0.054772 At first I was also puzzled by why / was required. xiy1 =: 13 : '+/*./\x~:y' (i. 1e5) (xiy -: xiy1) 10 1 Then I realized this: (i. 1e5) xiy1 (10,2) |length error: xiy1 | (i.100000) xiy1(10,2) However: (i. 1e5) xiy (10,2) 10 2 It was neat to tinker with xiy to better understand how it works. I now need to spend some time better understanding the hashing. I understand at a surface level yet want to play with the examples too. I may try to create a J implementation of the algorithm using amend and loops. I realize it will be slow, but it will be easier to play with than C. If someone else wants to do it and share I'd be glad to use that instead. Otherwise, if I get around to it I will post it. On Wed, Jul 30, 2014 at 2:44 AM, Roger Hui <[email protected]> wrote: > My J Conference 2014 presentation can be found at: > > Slides only: http://www.jsoftware.com/papers/indexof/ > Slides and script: http://www.jsoftware.com/papers/indexof/indexofscript.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
