On Monday, 27 February 2017 at 13:55:23 UTC, Guillaume Piolat wrote:
On Sunday, 26 February 2017 at 08:37:29 UTC, Nicholas Wilson wrote:
This will enable writing kernels in D utilising all of D's meta programming goodness across the device divide and will allow launching those kernels with a level of ease on par with CUDA's <<<...>>> syntax.

Interesting to write kernels in D, since a limitation of CUDA is that you have to multiply the entry points to instantiate a template with different arguments, and a limitation of OpenCL C is that you would need templates and includes in the first place.

Wait, you mean you have to explicitly instantiate every instance of a templated kernel? Ouch. In D, all you need to do is have a reference to the instantiation somewhere; taking its .mangleof suffices, and that is (part of) how the example below will achieve its elegance.
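
To illustrate the idea in plain host-side D (no GPU involved): referencing a template instantiation anywhere is enough, and .mangleof then gives you the exact symbol name to look up in whatever binary the compiler emitted. The saxpy template here is purely illustrative, not DCompute API:

import std.stdio;

void saxpy(T)(T a, const(T)[] x, T[] y)
{
    foreach (i, ref yi; y)
        yi += a * x[i];
}

void main()
{
    // Referencing the instantiation is all it takes; no hand-maintained
    // list of instantiations anywhere.
    writeln((saxpy!float).mangleof); // exact name depends on the module name
}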

I should first emphasise the future tense of the second half of the sentence you quoted.

How does this work?

DCompute (the compiler infrastructure) is currently capable of building .ptx and .spv files as part of the compilation process. These can be used directly in any processing pipeline you may already have.
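
To give a rough picture, a device-side module might look something like the sketch below. The @compute/@kernel attributes, the GlobalPointer type and the -mdcompute-targets flag are assumptions about the in-progress design, shown only to convey the shape of things, not a stable API:

// kernels.d - hypothetical device-side module
@compute(CompilesFor.all) module kernels;
import ldc.dcompute;

// Marked as a kernel entry point; compiling with something like
//   ldc2 -mdcompute-targets=cuda-350,ocl-220 kernels.d
// is intended to emit both a .ptx and a .spv alongside the host code.
@kernel void saxpy(GlobalPointer!float res, GlobalPointer!float x, float a)
{
    // body elided; indexing intrinsics are out of scope here
}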

Does the host code need something like DerelictCL/CUDA to work?

If you want to call the kernel, yes. The eventual goal of DCompute (the D infrastructure) is to fully wrap, unify and abstract the OpenCL/CUDA runtime libraries (most likely provided by Derelict), and have something like:

Queue q = ...;
Buffer b = ...;
q.enqueue!(myTemplatedKernel!(Foo, bar, baz => myTransform(baz)))(b, other, args);
Although there is no need to wait until DCompute reaches that point to use it; you would just have to do the (rather painful) API bashing yourself.
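
For concreteness, the kind of API bashing that wrapper would hide looks roughly like this when targeting CUDA through DerelictCUDA's driver-API bindings. Error handling is elided, mangledName would typically be the .mangleof of the kernel instantiation, and DerelictCUDA is assumed here purely as one plausible binding:

import std.string : toStringz;
import derelict.cuda;

void launchSaxpy(string mangledName, float[] res, float[] x, float a)
{
    DerelictCUDADriver.load();

    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);

    // Load the PTX emitted during compilation and look the kernel up by
    // its mangled name - this is where .mangleof earns its keep.
    CUmodule mod;
    cuModuleLoad(&mod, "kernels.ptx");
    CUfunction fn;
    cuModuleGetFunction(&fn, mod, mangledName.toStringz);

    // Device buffers and copies.
    CUdeviceptr dRes, dX;
    cuMemAlloc(&dRes, res.length * float.sizeof);
    cuMemAlloc(&dX,   x.length   * float.sizeof);
    cuMemcpyHtoD(dX, x.ptr, x.length * float.sizeof);

    void*[3] args = [cast(void*)&dRes, cast(void*)&dX, cast(void*)&a];
    cuLaunchKernel(fn,
                   cast(uint)(res.length + 255) / 256, 1, 1, // grid
                   256, 1, 1,                                // block
                   0, null, args.ptr, null);

    cuMemcpyDtoH(res.ptr, dRes, res.length * float.sizeof);
}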
