On Monday, 27 February 2017 at 23:02:43 UTC, Nicholas Wilson wrote:
Interesting to write kernels in D, since a limitation of CUDA
is that you need to multiply the entry points to instantiate a
template differently, and a limitation of OpenCL C is that you
need templates and includes in the first place.
Wait, you mean you have to explicitly instantiate every instance
of a templated kernel? Ouch.
IIRC, that entry point explosion happens in CUDA when you
strictly separate host and device code. Not sure about mixed
mode, as I've never used that.
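
For readers who have not run into it, here is a minimal sketch of what that entry point explosion tends to look like, as I understand the case described above. It assumes the device code is built on its own and the host looks kernels up by name (for example with cuModuleGetFunction from the CUDA driver API). The names scale_impl, scale_f32, scale_f64, scale_i32 and the THREAD_INDEX macro are hypothetical and exist only for illustration. The #ifndef fallback is also my addition, so the same file builds with a plain C++ compiler.

```cpp
#include <cstddef>

#ifndef __CUDACC__
// Assumption for illustration: when not compiled by nvcc, drop the CUDA
// qualifiers and stand in for the thread index with a plain variable.
#define __global__
#define __device__
static std::size_t fake_thread_index = 0;
#define THREAD_INDEX fake_thread_index
#else
#define THREAD_INDEX (blockIdx.x * blockDim.x + threadIdx.x)
#endif

// The generic implementation is written once as a template.
template <typename T>
__device__ void scale_impl(T* data, T factor, std::size_t n) {
    std::size_t i = THREAD_INDEX;
    if (i < n) data[i] *= factor;  // each thread scales one element
}

// A template has no symbol until it is instantiated, so every type the
// host wants to launch needs its own named, unmangled entry point.
extern "C" __global__ void scale_f32(float* data, float factor, std::size_t n) {
    scale_impl<float>(data, factor, n);
}

extern "C" __global__ void scale_f64(double* data, double factor, std::size_t n) {
    scale_impl<double>(data, factor, n);
}

extern "C" __global__ void scale_i32(int* data, int factor, std::size_t n) {
    scale_impl<int>(data, factor, n);
}
```

Each new element type means another hand-written wrapper, which is the multiplication of entry points mentioned in the first post.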
I should first emphasise the future tense of the second half of
the sentence you quoted.
How does this work?
DCompute (the compiler infrastructure) is currently capable of
building .ptx and .spv as part of the compilation process. They
can be used directly in any process pipeline you may have.
.ptx, got it.
Does the host code need something like DerelictCL/CUDA to work?
If you want to call the kernel, yes. The eventual goal of
DCompute (the D infrastructure) is to fully wrap, unify, and
abstract the OpenCL/CUDA runtime libraries (most likely provided
by Derelict), and have something like:
Let me know if you need more things in OpenCL bindings.