As @Clonk mentions, check out Weave for multithreading: <https://github.com/mratsim/weave>
Regarding better Python / Numpy interop: once we can finally (hopefully soon) merge the following PR, <https://github.com/mratsim/Arraymancer/pull/420>, we'd be able to achieve that. For the time being we unfortunately still have to copy the data around.

Zero-overhead interop is a little ugly right now, but can still be done using Nimpy's `RawPyBuffer`. See either:

* <https://gist.github.com/apahl/d673b0033794cc5f9514de639285592b> by @apahl
* <https://github.com/yglukhov/nimpy/issues/114#issuecomment-531504502>

Once the above PR is merged, we can simply assign those raw buffers to Arraymancer tensors directly and use all of its functionality.
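To make the copy vs. zero-copy distinction concrete, here is a small sketch from the Python side only. It assumes that `RawPyBuffer` is essentially a handle onto Python's buffer protocol (that is my understanding, not something shown in the links above), and it uses the standard library's `array.array` as a stand-in for a Numpy array, since both expose their memory through that protocol. The variable names are purely illustrative.

```python
from array import array

# Stand-in for a Numpy array: array.array also implements the buffer protocol.
data = array("d", [1.0, 2.0, 3.0, 4.0])

# Zero-copy access: a memoryview shares the memory owned by `data`.
view = memoryview(data)
view[0] = 42.0            # writes straight through to `data`

# Copying access: a new array owns its own memory.
copied = array("d", data)
copied[1] = -1.0          # does not affect `data`

# Metadata a consumer of the raw buffer would typically need.
print(view.format, view.itemsize, view.shape)
print(data[0], data[1])
```

The write through `view` shows up in `data`, while the write to `copied` does not. A consumer on the Nim side that only has the raw pointer plus this shape and item-size metadata is in the same position as `view` here, which is why no copy is needed.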