This looks really interesting.

My first look at this was the mp_bench.ijs example, which produces (on my iMac):

┌────────┬────────┐
│ step   │ millis │
├────────┼────────┤
│mp      │3260    │
├────────┼────────┤
│acreate │708     │
├────────┼────────┤
│bcreate │474     │
├────────┼────────┤
│matmul  │0       │
├────────┼────────┤
│sync    │3252    │
├────────┼────────┤
│get     │432     │
├────────┼────────┤
│aftot   │4867    │
├────────┼────────┤
│mp%aftot│0.669918│
└────────┴────────┘

That’s certainly not flattering for ArrayFire.

My take on this goes like this: acreate and bcreate spend significant time 
transposing their arguments (J data is row major, while ArrayFire works 
column major). But once that is done, the matrix multiplication itself just 
zips by.

Then comes the synchronization step: a full 3 seconds. Is this elapsed time? 
Surely not actual resource time used? (Maybe my Mac was running a backup when 
I set this in motion.)
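One plausible explanation for the 0-millisecond matmul sitting next to the 3-second sync is that ArrayFire queues device work asynchronously: af_matmul returns once the operation is enqueued, and the wall-clock cost surfaces when af_sync blocks until the device drains its queue. A minimal Python analogy of that timing shape (a worker thread standing in for the device; this is not ArrayFire code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_matmul():
    time.sleep(0.5)      # stands in for the actual device computation
    return "result"

pool = ThreadPoolExecutor(max_workers=1)

t0 = time.perf_counter()
fut = pool.submit(slow_matmul)           # "af_matmul": enqueue only
enqueue_ms = (time.perf_counter() - t0) * 1000

t0 = time.perf_counter()
fut.result()                             # "af_sync": block until done
sync_ms = (time.perf_counter() - t0) * 1000

# The enqueue is near-instant; nearly all the wall time lands in the
# wait, which matches the shape of the benchmark table above.
print(f"matmul {enqueue_ms:.0f} ms, sync {sync_ms:.0f} ms")
pool.shutdown()
```

On that reading, the 0 ms matmul row is just the cost of queuing the work, and the multiply itself is hiding inside the sync row.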

And then get has to undo the transpose on the result. Some significant time 
there too.

What if we sent regular J data (row major) to ArrayFire, used the 
af_transpose function within ArrayFire to change columns to rows, did the 
af_matmul, and then finished up with another af_transpose on the result? 
That might be a lot quicker.
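For what it’s worth, the layout mismatch can also be side-stepped with no transposes at all: a row-major buffer read as column major is already the transpose, and (A·B)ᵀ = Bᵀ·Aᵀ, so multiplying the operands in swapped order and reading the result back row major yields A·B directly. A small NumPy sketch of the identity (NumPy standing in for ArrayFire here):

```python
import numpy as np

# The row-major (C order) buffer of A, reinterpreted in column-major
# (Fortran) order, is exactly A transposed -- no data movement needed.
a = np.arange(6, dtype=np.float64).reshape(2, 3)             # row major
a_as_colmajor = a.ravel(order='C').reshape(3, 2, order='F')
assert np.array_equal(a_as_colmajor, a.T)

# So a column-major library handed raw row-major buffers sees A.T and
# B.T.  Multiplying in swapped order gives (A @ B).T, whose column-major
# buffer is the row-major buffer of A @ B: the round trip is free.
b = np.arange(12, dtype=np.float64).reshape(3, 4)
assert np.array_equal((b.T @ a.T).T, a @ b)
```

Whether that beats a pair of on-device af_transpose calls is something the benchmark could answer.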

Can someone expand on the timing of af_sync?

If I were planning a machine learning (“ML”) example from J, I think it would 
end up as an initial pass of data in, followed by a good deal of af_ 
processing, finishing with an af_sync and a return of results to J: simply 
trying to maximize the time spent with the fastest tool available.
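That “transfer once, compute a lot, sync, read back once” shape might be sketched as follows (plain NumPy standing in for the af_ calls; the to_device/from_device helpers are illustrative stand-ins, not ArrayFire API):

```python
import numpy as np

# Stand-ins for the device round trip.  In ArrayFire these would be the
# host->device copy, a chain of af_ operations, af_sync, and a single
# device->host copy at the end.
def to_device(x):   return np.asarray(x, dtype=np.float64)  # one upload
def from_device(x): return np.asarray(x)                    # one download

rng = np.random.default_rng(0)
X  = rng.standard_normal((64, 8))       # a batch of inputs
W1 = rng.standard_normal((8, 16))       # layer-1 weights
W2 = rng.standard_normal((16, 1))       # layer-2 weights

# The pattern: one transfer in, a run of array ops, one transfer out.
dX, dW1, dW2 = map(to_device, (X, W1, W2))
h = np.maximum(dX @ dW1, 0.0)     # e.g. a ReLU layer, all on "device"
y = h @ dW2                       # second layer, still on "device"
result = from_device(y)           # single read-back at the very end
assert result.shape == (64, 1)
```

Everything between the upload and the read-back is expressed as whole-array operations, which is exactly the style J already encourages.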

And that means expressing your ML logic in array terms. J should be good at 
that.

Mike Powell


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
