Given that much of said work reduces to cuBLAS + cuDNN, it seems like a
GPU-backed J, although more concise, would end up calling the same
functions.
I expect you're right. I don't know what the interfaces to GPUs look
like, but the goal would be to have the rank operator (gpufunc"2 for
example), which loops over the cells of its input, operate on those
cells in parallel.
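
To make the idea concrete, here is a minimal sketch in Python rather than J. It is not a GPU interface and not how J implements rank. It only illustrates that the rank-2 cells of an array are independent, so a verb applied at rank 2 can be dispatched once per cell. The names depth, rank_cells, apply_rank and matrix_sum are hypothetical, matrix_sum stands in for gpufunc, and a thread pool stands in for whatever parallel hardware is available. Arrays are assumed to be regular, non-empty nested lists, and the frame is flattened in the result.

```python
from concurrent.futures import ThreadPoolExecutor

def depth(x):
    """Number of axes of a regular, non-empty nested list (0 for a scalar)."""
    d = 0
    while isinstance(x, list):
        d += 1
        x = x[0]
    return d

def rank_cells(x, rank):
    """Return the rank-`rank` cells of x in order, flattening the frame."""
    if depth(x) <= rank:
        return [x]
    out = []
    for item in x:
        out.extend(rank_cells(item, rank))
    return out

def apply_rank(f, x, rank, pool):
    """Apply f to every rank-`rank` cell of x, submitting one task per cell."""
    return list(pool.map(f, rank_cells(x, rank)))

def matrix_sum(m):
    """Hypothetical stand-in for gpufunc: sum all atoms of one rank-2 cell."""
    return sum(sum(row) for row in m)

# A 2x2x3 array: two rank-2 cells, each a 2x3 table.
x = [[[0, 1, 2], [3, 4, 5]],
     [[6, 7, 8], [9, 10, 11]]]

with ThreadPoolExecutor() as pool:
    result = apply_rank(matrix_sum, x, 2, pool)

print(result)  # [15, 51]
```

Because no cell depends on another, the order in which the tasks run does not affect the result; pool.map still returns the results in cell order.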
As it happens I have some work to do in that area, aimed at reducing the
amount of data-copying for large arguments. What should I Google to
learn about interfacing to GPUs?
Henry Rich
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm