I've read news, specs, and presentations about it and have a clear overall understanding of it. I downloaded the API recently and went through the examples.
Well, GPGPU only brings big performance gains if you hand-tune the code (both the kernel and the API interaction). Normally you even have to optimize it for specific GPUs, e.g. to make good use of local memory; otherwise, in most cases, you will end up not much faster than a proper multi-threaded CPU implementation.
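To make the comparison concrete, here is a rough sketch of what a "proper multi-threaded CPU implementation" could look like as a baseline for a simple reduction. It is plain C++ with `std::thread`, not GPU code, and `parallel_sum` is a made-up name for illustration only. An untuned kernel would have to beat something like this, including the time spent copying data to and from the device.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical CPU baseline: sum `data` with up to `num_threads` workers.
// Each worker sums one contiguous chunk into its own slot, so no locking
// is needed; the partial sums are combined after all workers have joined.
double parallel_sum(const std::vector<double>& data, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    const std::size_t n = data.size();
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceil(n / threads)

    std::vector<double> partial(num_threads, 0.0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        // Clamp to n so that surplus threads get an empty range.
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0.0);
        });
    }
    for (auto& w : workers) w.join();
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

Calling it with `std::thread::hardware_concurrency()` as the thread count is a reasonable starting point. Timing it against your kernel on the same input gives a fairer picture than comparing against a single-threaded loop.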
I'm going to try making some hello world examples.
If you want to use it with D: https://bitbucket.org/trass3r/cl4d
