Device emulation mode isn't that fast anyway; it spawns a bunch of pthreads,
which is quite slow.

For relevant speed comparisons, you might look at MCUDA. I'm not sure whether
there's any source available, though.

Cython with the NumPy buffer interface is a pretty fast way to implement
efficient host-side C algorithms.

regards,
Nicholas
_______________________________________________
PyCuda mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
