So, it took me all of 20 minutes to pull dgemm into J for a matrix multiplication speedup. I put it in the repository linked below, along with an Emacs org-mode TODO list for making this actually happen. It's all "busy work" as far as I can tell, though it would be my first time writing code that links to CUDA.

Either way, the dgemm wrapper should eventually make its way into the API stuff, as it's a pretty good speedup over +/ .* on larger arrays.
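
For anyone who has not used the BLAS routine: dgemm computes alpha*A*B + beta*C in double precision, so with alpha = 1 and beta = 0 it returns the same matrix product as +/ .* does. A minimal pure-Python sketch of that contract follows. The name dgemm_ref is hypothetical and used only for illustration; it is not the wrapper in the repository, it ignores the transpose flags and leading-dimension arguments that the real routine takes, and it is far slower than any BLAS.

```python
def dgemm_ref(alpha, a, b, beta, c):
    """Reference for what dgemm computes: alpha * (a x b) + beta * c.

    a is m-by-k, b is k-by-n, c is m-by-n, each given as a list of rows.
    Hypothetical helper for illustration only; not a real BLAS binding.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m-by-n")
    # Each output element is a scaled dot product of a row of a with a
    # column of b, plus the scaled existing element of c.
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# alpha = 1, beta = 0 reduces to the plain matrix product (a +/ .* b in J).
product = dgemm_ref(1.0, a, b, 0.0, zeros)
print(product)  # [[19.0, 22.0], [43.0, 50.0]]
```

The speedup from an optimized dgemm comes from cache blocking and vectorization inside the library, not from a different result, so comparing its output against +/ .* on a few small matrices is a cheap sanity check for any wrapper.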

https://github.com/locklin/jCUDA

Feel free to pitch in on the busy work if anyone has problems that would benefit from this.

-Scott
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
