> If you would have a good math library with vectors like those from e.g. Matlab or Eigen (c++) then you could use vector arithmetic and wouldn't even need to write a single loop.
I would recommend the [linalg](https://github.com/unicredit/linear-algebra) library for that. It supports both statically and dynamically sized vectors and matrices. I used it to build a prototype for a project at work. It's missing some things (like dot product) that I had to add myself, but it has CUDA support and covers the basic operations (vector addition, subtraction, multiplication by a scalar, etc.).
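To give a rough idea of what "no loops in user code" looks like, and of how small a helper like dot product is to add on top of the basic operations, here is a minimal sketch in plain Python. To be clear, this is not the linalg API: the `Vec` type and the `dot` function are hypothetical names made up for illustration, and the real library's types and operators may look different.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Vec:
    """Hypothetical vector type for illustration; not the linalg API."""
    data: Tuple[float, ...]

    def _check(self, other: "Vec") -> None:
        # Element-wise operations only make sense for equal sizes.
        if len(self.data) != len(other.data):
            raise ValueError("vector sizes differ")

    def __add__(self, other: "Vec") -> "Vec":
        self._check(other)
        return Vec(tuple(a + b for a, b in zip(self.data, other.data)))

    def __sub__(self, other: "Vec") -> "Vec":
        self._check(other)
        return Vec(tuple(a - b for a, b in zip(self.data, other.data)))

    def __mul__(self, k: float) -> "Vec":
        # Multiplication by a scalar.
        return Vec(tuple(a * k for a in self.data))

    # Allow the scalar on the left as well: 2.0 * v
    __rmul__ = __mul__


def dot(u: Vec, v: Vec) -> float:
    """The kind of helper you might add yourself if the library lacks it."""
    u._check(v)
    return sum(a * b for a, b in zip(u.data, v.data))


u = Vec((1.0, 2.0, 3.0))
v = Vec((4.0, 5.0, 6.0))

w = 2.0 * u + v   # vector arithmetic, no explicit loop at the call site
d = dot(u, v)     # 1*4 + 2*5 + 3*6 = 32.0
```

The loops still exist, of course; they just live inside the operators once instead of being repeated everywhere the vectors are used.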
