On Wed, Aug 12, 2009 at 11:31 AM, M. Badawy<[email protected]> wrote:
> Well put. Thing is I'm attending a summer school on CUDA right now and
> it seems that micro managing the threads, blocks, warps,
> registers...etc. is not for the faint of heart.

Think of it as a puzzle.  I'll take memory bank conflict avoidance
over sudoku any day :)
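To make the puzzle concrete: here is a toy Python model of the shared-memory bank arithmetic, assuming the 16-bank layout of the compute-capability-1.x GPUs current at the time (bank = word index mod 16). The function name and the 16x16 tile are illustrative, not from any real API; it just shows why padding a tile's row width from 16 to 17 floats is the classic conflict-avoidance trick.

```python
# Model shared-memory bank assignment on a 16-bank GPU (compute 1.x era).
# Each 32-bit word lands in bank (word_index % 16); a half-warp of 16
# threads conflicts when several of them touch the same bank.
NUM_BANKS = 16

def banks_hit(row_width, column):
    """Banks touched when the 16 threads of a half-warp each read one
    row of a `row_width`-wide float tile at the given column."""
    return [(i * row_width + column) % NUM_BANKS for i in range(NUM_BANKS)]

# Unpadded 16x16 tile: a column read puts all 16 threads in one bank.
print(len(set(banks_hit(16, 0))))  # 1 distinct bank -> 16-way conflict
# Padding each row to 17 floats spreads the accesses over all banks.
print(len(set(banks_hit(17, 0))))  # 16 distinct banks -> conflict-free
```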

> I am not a programmer
> and I doubt that I will ever have the time to do all this fine-tuning
> to achieve optimal performance. This also depends on the code, so it
> may not be that hard for a lot of tasks that lend themselves well to
> parallel processing.

What problem are you trying to solve?  Maybe you're blessed with a
large problem with ridiculously fine-grained parallelism.
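"Ridiculously fine-grained" here means one thread per data element with no inter-thread communication, as in a vector add. A toy Python sketch of the indexing such a kernel uses (the block size of 256 and the function name are illustrative, not from PyCUDA):

```python
# Model CUDA's thread indexing for an embarrassingly parallel vector add:
# one "thread" per element, i = blockIdx.x * blockDim.x + threadIdx.x.
import math

def vector_add(a, b, block_dim=256):
    n = len(a)
    out = [0.0] * n
    grid_dim = math.ceil(n / block_dim)   # enough blocks to cover n elements
    for block_idx in range(grid_dim):     # on a GPU these loops run in parallel
        for thread_idx in range(block_dim):
            i = block_idx * block_dim + thread_idx
            if i < n:                     # guard against the ragged last block
                out[i] = a[i] + b[i]
    return out

print(vector_add([1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
```

Problems with this shape need essentially none of the warp/bank tuning discussed above to get a decent speedup.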

> An interesting remark mentioned today was that there is a lot of
> testing going on right now to automate the fine-tuning process; it
> was mentioned that a certain algorithm managed to squeeze 15~20% more
> performance out of the code than the human-optimized version. The
> optimizations done by the algorithm would have taken a person weeks
> to implement. These fine-tuning features will be implemented later in
> CUDA, but it seems not any time soon.

link?

> My guess is that once CUDA gets smart enough, it may then be easier
> for the non-professional programmer to use any tool whatsoever
> without worrying too much about performance.

I would not hold my breath :)  I would be surprised if CUDA
programming changes qualitatively before some completely different
architecture with a different programming model comes along.

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
