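ecdf=: 3 : 0
  valsct=. # y                      NB. number of observations
  tbl=:y,.valsct %~ #\ y            NB. each value paired with (prefix length) % valsct
  max=:(0{"1 tbl) (>./)/. tbl       NB. per distinct value, its largest cumulative proportion
  , 1{"1 (({."1 max) i. y) { max    NB. look up each y and return its proportion
)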

performs a bit better for the 1e5 case.

I've not tried it on a larger array, though.
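
As a rough illustration of the only change (a sketch on the sample vector from the quoted message; n is just a stand-in for valsct), the two forms give the same proportions. The original applies the fork (valsct %~ #) to every prefix, while the version above counts the prefixes with #\ and divides once:

   y =: 2 2 2 4 4 6 6 8
   n =: # y
   (n %~ #)\ y        NB. original form: fork applied to each prefix
0.125 0.25 0.375 0.5 0.625 0.75 0.875 1
   n %~ #\ y          NB. revised form: prefix lengths, then one division
0.125 0.25 0.375 0.5 0.625 0.75 0.875 1
   ((n %~ #)\ y) -: n %~ #\ y
1

Both forms rely on y already being in ascending order, since the prefix length stands in for the count of values seen so far.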

Thanks,

-- 
Raul

On Wed, Aug 13, 2014 at 3:06 PM, Joe Bogner <[email protected]> wrote:
> I'm looking for any ideas to speed up this function.
>
> I patched together this ecdf function from a few different ideas:
>
> NB. > v<- c(2,2,2,4,4,6,6,8)
> NB. > ecdf(v)(v)
> NB. [1] 0.375 0.375 0.375 0.625 0.625 0.875 0.875 1.000
>
> ecdf=: 3 : 0
>   valsct=. # y
>   tbl=:y,.(valsct %~ #) \ y
>   max=:(0{"1 tbl) (>./)/. tbl
>   , 1{"1 (({."1 max) i. y) { max
> )
>
> (0.375 0.375 0.375 0.625 0.625 0.875 0.875 1.000) -: ecdf (2,2,2,4,4,6,6,8)
> 1
>
>
> timespacex 'ecdf i. 1e5'
> 8.84599 1.15392e7
>
> The R function is nearly instantaneous.
>
> I need to run this on a 1m+ array
>
> Thank you for any suggestions
>
> http://en.wikipedia.org/wiki/Empirical_distribution_function
> https://github.com/jstac/edtc-code/blob/master/python_code/ecdf.py
> https://github.com/dmbates/ecdfExample
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm