Hi Dmitriy,

It is nice to have the bindings, but it is very easy to degrade performance with them. The BLAS/LAPACK APIs have been around for more than 20 years, and they are still the top choice for high-performance linear algebra. I'm wondering whether it is possible to make evaluation lazy in the bindings. For example,
y += a * x can be translated to an AXPY call instead of creating a temporary vector for a * x. There has been some work on this in C++, but none of it achieved good performance. I'm not sure whether this is a good direction to explore.

Best,
Xiangrui
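
P.S. To make the lazy-evaluation idea concrete, here is a minimal sketch in Python. It is not taken from the bindings or from any existing library. The names Vec, Scaled and axpy are hypothetical, and axpy is a pure-Python stand-in for a real BLAS AXPY routine. The point is only that a * x can return a small deferred node, so that y += a * x updates y in place without allocating a temporary vector.

```python
def axpy(a, x, y):
    """Stand-in for BLAS AXPY (assumption: a real binding would call the
    native routine here). Computes y <- a * x + y in place."""
    for i in range(len(x)):
        y[i] += a * x[i]


class Scaled:
    """Deferred a * x: keeps references to the operands, allocates no vector."""

    def __init__(self, a, x):
        self.a = a
        self.x = x


class Vec:
    """Hypothetical dense vector wrapper used only for this illustration."""

    def __init__(self, data):
        self.data = list(data)

    def __rmul__(self, a):
        # scalar * Vec builds a lazy node instead of a scaled copy.
        return Scaled(a, self)

    def __iadd__(self, other):
        if isinstance(other, Scaled):
            # y += a * x is recognized and dispatched to a single AXPY call.
            axpy(other.a, other.x.data, self.data)
        else:
            # Plain y += x: element-wise in-place addition.
            for i, v in enumerate(other.data):
                self.data[i] += v
        return self


x = Vec([1.0, 2.0, 3.0])
y = Vec([10.0, 20.0, 30.0])
y += 2.0 * x  # no temporary vector for 2.0 * x is created
```

This handles only the single pattern y += a * x. A general solution would need to recognize many more expression shapes, which is probably where the difficulty lies.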