I'm looking for any ideas to speed up this function.
I patched together this ecdf function from a few different ideas:
NB. > v<- c(2,2,2,4,4,6,6,8)
NB. > ecdf(v)(v)
NB. [1] 0.375 0.375 0.375 0.625 0.625 0.875 0.875 1.000
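ecdf=: 3 : 0
NB. gives the ECDF only when y is sorted in ascending order
valsct=. # y
NB. column 0: values; column 1: running fraction (prefix length % count)
tbl=. y,.(valsct %~ #) \ y
NB. for each distinct value, take the largest running fraction in its group
max=. (0{"1 tbl) (>./)/. tbl
NB. look up each item of y and return its fraction
, 1{"1 (({."1 max) i. y) { max
)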
(0.375 0.375 0.375 0.625 0.625 0.875 0.875 1.000) -: ecdf (2,2,2,4,4,6,6,8)
1
timespacex 'ecdf i. 1e5'
8.84599 1.15392e7
The R function is nearly instantaneous.
I need to run this on an array of 1 million+ elements.
Thank you for any suggestions.
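
One possible direction, offered as a sketch rather than a measured result: the prefix scan (valsct %~ #) \ y probably dominates the time, because a general verb under \ is applied to every prefix separately, while (#\ y) % # y should produce the same column in a single pass. The version below avoids the table altogether by counting, for each item, how many items are strictly greater, using I. against a descending sort. The name ecdfI is hypothetical and not part of the original code. Unlike the original, it does not assume that y is already sorted.

```j
NB. ecdfI is a hypothetical name used only for this sketch
ecdfI=: 3 : 0
n=. # y
NB. with a descending left argument, I. returns for each item of y
NB. the number of items that are strictly greater than it
gt=. (\:~ y) I. y
NB. fraction of items less than or equal to each item of y
(n - gt) % n
)

NB. same check as for the original ecdf
0.375 0.375 0.375 0.625 0.625 0.875 0.875 1 -: ecdfI 2 2 2 4 4 6 6 8
```

The cost here should be one sort plus one binary search per item, so timespacex 'ecdfI i. 1e6' would be the comparison to run against the figures above.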
http://en.wikipedia.org/wiki/Empirical_distribution_function
https://github.com/jstac/edtc-code/blob/master/python_code/ecdf.py
https://github.com/dmbates/ecdfExample