Am I the only one left completely cold by the new wave of
C++-to-GPU libraries (Bolt/ArrayFire/OpenACC), which take
back the control that compute APIs give? For example, this one
removes double precision and multiple devices, something that
is built into OpenCL.
These libraries build on the myth that GPU power can be
harnessed without pain, but at some point you have to expose the
multiple levels of parallelism that GPUs have, use spatial cache
locality, etc. This is like a 60% solution.
I am one of the developers of ArrayFire. As we went open source,
we removed all restrictions that were put in place for our older
commercial version. That is, double precision and multiple devices
are part of the open source project.
We also support CPU and OpenCL backends along with CUDA. This
way, you can use the same ArrayFire code to run across any of
those technologies without changes. All you need to do is link
the correct library.
We used a BSD 3-Clause license to make it easy for everyone to
use in their own projects.
Here is a blog post I wrote about implementing Conway's Game of
Life that demonstrates how easy it is to use ArrayFire.
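For anyone who hasn't seen the problem before, here is a rough sketch of one
Game of Life step in plain C++. This is not the code from the blog post and it
does not use ArrayFire; the names `Grid` and `step` are made up for this
example, and treating cells outside the grid as dead is an assumption. It only
shows the neighbour-count formulation that an array library would typically
express as a single array-wide operation instead of explicit loops.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: a row-major grid of 0/1 cells.
struct Grid {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> cells;

    // Cells outside the grid count as dead (assumption: no wrap-around).
    int at(long r, long c) const {
        if (r < 0 || c < 0 ||
            r >= static_cast<long>(rows) || c >= static_cast<long>(cols)) {
            return 0;
        }
        return cells[static_cast<std::size_t>(r) * cols +
                     static_cast<std::size_t>(c)];
    }
};

// Compute the next generation: count the 8 neighbours of every cell,
// then apply the standard rules (birth on 3, survival on 2 or 3).
Grid step(const Grid& g) {
    Grid next{g.rows, g.cols, std::vector<int>(g.rows * g.cols, 0)};
    for (long r = 0; r < static_cast<long>(g.rows); ++r) {
        for (long c = 0; c < static_cast<long>(g.cols); ++c) {
            int neighbours = 0;
            for (long dr = -1; dr <= 1; ++dr) {
                for (long dc = -1; dc <= 1; ++dc) {
                    if (dr != 0 || dc != 0) {
                        neighbours += g.at(r + dr, c + dc);
                    }
                }
            }
            const bool alive = g.at(r, c) == 1;
            const bool lives = neighbours == 3 || (alive && neighbours == 2);
            next.cells[static_cast<std::size_t>(r) * g.cols +
                       static_cast<std::size_t>(c)] = lives ? 1 : 0;
        }
    }
    return next;
}
```

Every cell's update depends only on the previous generation, which is why
this kind of problem maps well onto data-parallel hardware.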
Our goal is to make it easy for people to get started with GPU
programming and break down the barrier for non-programmers to use
the hardware efficiently. I agree that complex algorithms require
more custom solutions, but once you get started, things become