@perturbation2 This would actually help a lot in some places. Users should not need to call BLAS manually, although that is always an option. It helps that in Neo I make all fields public: this is not very safe, but at least if a user needs to access a raw pointer in situations like the one above, it is there.
That said, I have not yet found the time to expose two versions of each operation. I may eventually get to it, but if you want to contribute, a PR is always welcome! Even better, there should be rewrite rules that match your slow call and transform it into the BLAS version, with no intermediate results. I will add a feature that lets me test whether rewrite rules are being applied, and then start adding some of the most common ones.
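
To make the rewrite-rule idea concrete, here is a rough, language-neutral sketch in Python. It is not Neo code and does not call BLAS: `Scale`, `Add`, `axpy`, `eval_naive` and `eval_rewritten` are all hypothetical names made up for illustration. The point is only the shape of the rule: recognise the pattern `a * x + y` and dispatch to a fused, in-place update instead of first building the temporary `a * x`.

```python
from dataclasses import dataclass
from typing import List

Vec = List[float]


@dataclass
class Scale:
    """Symbolic a * x (hypothetical node, for illustration only)."""
    a: float
    x: Vec


@dataclass
class Add:
    """Symbolic (a * x) + y (hypothetical node, for illustration only)."""
    lhs: Scale
    rhs: Vec


def axpy(a: float, x: Vec, y: Vec) -> Vec:
    """In-place y <- a * x + y, following the contract of BLAS axpy."""
    for i in range(len(y)):
        y[i] += a * x[i]
    return y


def eval_naive(expr: Add) -> Vec:
    """The 'slow call': materialises a * x before adding y."""
    tmp = [expr.lhs.a * xi for xi in expr.lhs.x]  # intermediate result
    return [t + yi for t, yi in zip(tmp, expr.rhs)]


def eval_rewritten(expr: Add) -> Vec:
    """The rewritten call: match Add(Scale(a, x), y) and use axpy.

    Only the output buffer is allocated; no temporary holds a * x.
    """
    out = list(expr.rhs)  # copy, so the caller's y is left untouched
    return axpy(expr.lhs.a, expr.lhs.x, out)


expr = Add(Scale(2.0, [1.0, 2.0]), [10.0, 20.0])
assert eval_naive(expr) == eval_rewritten(expr)
```

Checking that both paths agree, as in the last line, is also roughly what a test for "is this rewrite rule being applied, and is it correct" would need to do, in addition to confirming that the fused path was actually taken.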
