Well put. The thing is, I'm attending a summer school on CUDA right now,
and it seems that micromanaging the threads, blocks, warps, registers,
etc. is not for the faint of heart. I am not a programmer and I doubt I
will ever have the time to do all this fine-tuning to achieve optimal
performance. That said, it also depends on the code, so it may not be
that hard for tasks that lend themselves well to parallel processing.
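
To give a concrete idea of what I mean by micromanaging (this is just my
own illustrative sketch, not code from the course, and the block size
below is a guess rather than a tuned value): even a trivial element-wise
kernel in PyCUDA makes you pick the block and grid dimensions by hand.

    # Illustrative sketch: a trivial scaling kernel in PyCUDA where the
    # launch configuration (threads per block, number of blocks) is
    # chosen manually.
    import numpy as np
    import pycuda.autoinit                 # picks a device, makes a context
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    mod = SourceModule("""
    __global__ void scale(float *x, float a, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            x[i] *= a;
    }
    """)
    scale = mod.get_function("scale")

    n = 1 << 20
    x = gpuarray.to_gpu(np.random.rand(n).astype(np.float32))

    block = (256, 1, 1)                         # hand-picked threads per block
    grid = ((n + block[0] - 1) // block[0], 1)  # enough blocks to cover n
    scale(x, np.float32(2.0), np.int32(n), block=block, grid=grid)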

An interesting remark today was that there is a lot of experimentation
going on right now to automate this fine-tuning, and that a certain
algorithm managed to squeeze 15-20% more performance out of a kernel
than the human-optimized code; the optimizations it found would have
taken a person weeks to implement. These auto-tuning features are
supposed to make it into CUDA eventually, but it seems not any time
soon.
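
I don't know what that algorithm actually does, but the basic brute-force
version of the idea looks something like this (my own sketch, reusing the
toy kernel from above; the candidate block sizes are just guesses):

    # Rough sketch of brute-force tuning: time the same kernel at several
    # block sizes and keep the fastest. Reuses scale, x, n, np from the
    # previous sketch; the real auto-tuners are surely more sophisticated.
    import pycuda.driver as drv

    def time_kernel(block_size, n_runs=10):
        block = (block_size, 1, 1)
        grid = ((n + block_size - 1) // block_size, 1)
        start, end = drv.Event(), drv.Event()
        start.record()
        for _ in range(n_runs):
            scale(x, np.float32(2.0), np.int32(n), block=block, grid=grid)
        end.record()
        end.synchronize()
        return start.time_till(end) / n_runs    # milliseconds per launch

    candidates = [64, 128, 192, 256, 384, 512]
    best = min(candidates, key=time_kernel)
    print("fastest block size:", best)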

My guess is that once CUDA gets smart enough, it may then be easier for
the non-professional programmer to use any tool whatsoever without
worrying too much about performance. I, for one, am leaning towards
open-source tools as well.

More input from other users is welcome.

-M.

On Wed, Aug 12, 2009 at 11:56 AM, Ahmed Fasih<[email protected]> wrote:
> For the list...
>
> On Wed, Aug 12, 2009 at 11:39 AM, M. Badawy<[email protected]> wrote:
>> From what I have read so far, it seems that using Matlab w/
>> AccelerEyes' Jacket seems to be the fastest and easiest method to use
>> CUDA w/ Matlab. So, assuming that I'll be using Jacket, is there a
>> significant performance
>> difference between this option and using Pycuda?
>
> I haven't used Jacket, maybe someone else can comment.
>
> There will not be any difference in the GPU computation whether you
> use mex or PyCUDA because the underlying kernels are in C anyway; but
> I think Jacket generates its own kernels so it might be faster or
> slower depending on how smart their compiler is, but at this stage, I
> suspect a compiler wouldn't be that much better at tuning a CUDA
> kernel than a human (might be a lot worse).
>
> I used PyCUDA with great success from Sage (then just Python with
> Numpy, etc.); then I wrote a mex interface to set up and call my
> kernels from Matlab. The latter took a lot more effort because you
> have to set up your device and convert your data structures (e.g.,
> structure-of-arrays to array-of-structures for complex numbers) and
> handle transfers and free memory manually in C. And SciPy can save
> data to Matlab-compatible .mat files anyway for good
> inter-compatibility.
>
> I also want my implementation to be open-source one day, and don't
> want to force my colleagues to have copies of Matlab and Jacket.
>
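
P.S. To make Ahmed's point about transfers concrete for beginners like
me, here is roughly what the Python side looks like (my sketch, not his
code): PyCUDA's gpuarray does the host/device copies and memory
management for you, and scipy.io writes a .mat file that Matlab can read.

    # My rough sketch of the Python-side convenience Ahmed describes:
    # gpuarray handles host<->device copies and freeing, and scipy.io
    # saves results in a Matlab-readable .mat file.
    import numpy as np
    import pycuda.autoinit
    import pycuda.gpuarray as gpuarray
    import scipy.io

    a = np.random.rand(1024).astype(np.float32)
    a_gpu = gpuarray.to_gpu(a)    # copy to the device
    b_gpu = a_gpu * 2.0           # element-wise work on the GPU
    b = b_gpu.get()               # copy the result back to the host

    scipy.io.savemat("result.mat", {"b": b})   # readable from Matlab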

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
