TL;DR: sharing some timings on symbols and trains. It looks like special
code is at play.
I was working on a large table for which I wanted to calculate
min, max, avg, and count
per group.
I had come up with a train that I was pretty happy with:
prod (<./,>./,#,(+/%#))/. price
I was running the same calculation in R with data.table. I was
somewhat surprised that my J calculation was about 1 second slower
(7%). No big deal, but I wanted to investigate.
In my production code, I was able to get the calculation down from 13
seconds to about 5 seconds by using symbols and breaking apart the
train. I suspect that breaking the train out lets each piece trigger
special code.
I am considering replacing some R code that is invoked from a .NET web
page with a JHS app.
Here are some reproducible timings:
price=:?. 1e7#10
prod=: 1e7 8 $ (?. 1e7#5) { 'abcde'
NB. Case 1 - train
timespacex 'prod (<./,>./,#,(+/%#))/. price'
7.4167 2.86488e8
NB. Case 2 - calculate symbol at call time
timespacex '(s:prod) ((<./,>./,#,(+/%#)))/. price'
8.92996 3.53597e8
NB. Case 3 - precalculate symbols
sprod =: s: prod
timespacex 'sprod ((<./,>./,#,(+/%#)))/. price'
3.9147 2.86488e8
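NB. bt: x is the grouping key, y the values; one keyed (/.) pass per statistic
bt =: 4 : 0
min=. x (<./)/. y
max=. x (>./)/. y
ct=. x #/. y
mean=. x (+/%#)/. y
NB. stitch the per-group results into columns: min, max, count, mean
min,.max,.ct,.mean
)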
NB. Case 4 - no train, with symbols
timespacex 'sprod bt price'
1.86927 6.50141e7
NB. Case 5 - no train, no symbols
timespacex 'prod bt price'
17.6812 2.72633e8
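For anyone who wants to convince themselves that the train and the broken-out version compute the same thing before comparing timings, here is a minimal sketch on toy data. The names p, g, stats, viaTrain and viaParts and the six sample values are made up for illustration and are not part of the timings above. It also assumes the standard library's assert verb is loaded, as it is in a normal J session.

```j
NB. Hypothetical toy data: six prices spread over three product codes.
p =: 4 1 3 9 2 8
g =: s: ' a b a c b c'        NB. symbols built once, up front (as in Case 3)

NB. Case 1 style: a single train applied per group.
stats =: <./ , >./ , # , (+/ % #)
viaTrain =: g stats/. p

NB. Case 4 style: one keyed pass per statistic, stitched into columns.
viaParts =: (g <.//. p) ,. (g >.//. p) ,. (g #/. p) ,. (g (+/ % #)/. p)

NB. Both should yield one row per group: min, max, count, mean.
assert viaTrain -: viaParts
```

If the assert passes silently, the two forms agree. This only checks the results, so any difference in the timings would come from how the interpreter executes each form rather than from what is computed.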