Alex,
I think you should recheck your numbers. Both BIDMat and nvblas are
wrappers for cublas. The speeds are identical, except on machines that
have multiple GPUs, which nvblas exploits and cublas doesn't.
It would be a good idea to add a column with Gflop throughput. Your
numbers for BIDMat
in optimizations)
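
For reference, here is a minimal sketch of how a Gflop throughput column
could be computed from the timings. It assumes each benchmark is a dense
multiply of an m x k matrix by a k x n matrix, which costs about 2*m*n*k
floating-point operations. The function name and arguments are
hypothetical and not taken from the benchmark itself.

```python
def gemm_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Approximate throughput in Gflop/s for one dense (m x k) * (k x n) multiply.

    Assumes the conventional count of 2*m*n*k floating-point operations
    (one multiply and one add per inner-product term).
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    flops = 2 * m * n * k
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Illustrative values only, not measurements from this thread.
    print(f"{gemm_gflops(1000, 1000, 1000, 1.0):.1f} Gflop/s")
```

Reporting this number next to each wall-clock time makes runs with
different matrix sizes directly comparable.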
On Thu, Mar 12, 2015 at 8:50 PM, jfcanny [hidden email] wrote:
If you're contemplating GPU acceleration in Spark, it's important to
look beyond BLAS. Dense BLAS probably accounts for only 10% of the
cycles in the datasets we've