Note: the API docgen is still not ideal; some internal routines are listed alongside the public API.
> nd-arrays (vectors and matrices) of integer, floating-point, or complex values.

Yes.

> Slicing, concatenation, transposing... these sorts of array operations.

* Tutorial: [https://mratsim.github.io/Arraymancer/tuto.shapeshifting.html](https://mratsim.github.io/Arraymancer/tuto.shapeshifting.html)
* CPU: [https://mratsim.github.io/Arraymancer/shapeshifting.html](https://mratsim.github.io/Arraymancer/shapeshifting.html)
* CUDA: [https://mratsim.github.io/Arraymancer/shapeshifting_cuda.html](https://mratsim.github.io/Arraymancer/shapeshifting_cuda.html)

> Linear algebra (e.g. matrix multiplication, solving linear equations).

Matrix multiplication:

* CPU: [https://mratsim.github.io/Arraymancer/operators_blas_l2l3.html](https://mratsim.github.io/Arraymancer/operators_blas_l2l3.html)
* CUDA: [https://mratsim.github.io/Arraymancer/operators_blas_l2l3_cuda.html](https://mratsim.github.io/Arraymancer/operators_blas_l2l3_cuda.html)
* OpenCL: [https://mratsim.github.io/Arraymancer/operators_blas_l2l3_opencl.html](https://mratsim.github.io/Arraymancer/operators_blas_l2l3_opencl.html)

Solvers, matrix decomposition, PCA, etc. (CPU only at the moment):

* [https://mratsim.github.io/Arraymancer/least_squares.html](https://mratsim.github.io/Arraymancer/least_squares.html)
* [https://mratsim.github.io/Arraymancer/linear_systems.html](https://mratsim.github.io/Arraymancer/linear_systems.html)
* [https://mratsim.github.io/Arraymancer/decomposition.html](https://mratsim.github.io/Arraymancer/decomposition.html)
* [https://mratsim.github.io/Arraymancer/pca.html](https://mratsim.github.io/Arraymancer/pca.html)
* [https://mratsim.github.io/Arraymancer/decomposition_rand.html](https://mratsim.github.io/Arraymancer/decomposition_rand.html)

> 1D FFT, IFFT

Not implemented; wrapping MKL FFT could be a weekend project with c2nim or nimterop:
[https://software.intel.com/content/www/us/en/develop/documentation/mkl-developer-reference-c/top/appendix-e-code-examples/fourier-transform-functions-code-examples/fft-code-examples.html](https://software.intel.com/content/www/us/en/develop/documentation/mkl-developer-reference-c/top/appendix-e-code-examples/fourier-transform-functions-code-examples/fft-code-examples.html)

Implementing a pure Nim FFT is something I want to do at some point, but I lack the time.

> All of the above, running on CPU only (with MKL and/or automated multi-threading, e.g. for large FFT/IFFT)

You can use OpenBLAS or MKL with both Neo and Arraymancer. That said, you can write pure Nim code with performance similar to both OpenBLAS and MKL. I track benchmarks of pure Nim implementations, threaded via Laser (using Nim's OpenMP operators) and via Weave, here:

[https://github.com/mratsim/weave/tree/master/benchmarks/matmul_gemm_blas](https://github.com/mratsim/weave/tree/master/benchmarks/matmul_gemm_blas)

```nim
iterator `||`[S, T](a: S; b: T; annotation: static string = "parallel for"): T
  ## See https://nim-lang.org/docs/system.html#%7C%7C.i%2CS%2CT%2Cstring
iterator `||`[S, T](a: S; b: T; step: Positive; annotation: static string = "parallel for"): T
```

Last time I optimized this, I could reach 2.8 TFlops with Weave, 2.8 TFlops with Laser + OpenMP, and 2.7 TFlops with plain OpenMP, versus 3 TFlops for MKL and 3.1 TFlops for Intel oneDNN ([https://github.com/mratsim/weave/pull/94#issuecomment-571751545](https://github.com/mratsim/weave/pull/94#issuecomment-571751545)), but I started from a single-threaded performance of 160 GFlops vs 200 GFlops for Intel MKL and OpenBLAS, on an 18-core machine.
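As a small sketch of how the `||` OpenMP iterator above is used (plain Nim, no external packages): each loop index is mapped to an OpenMP `parallel for` worker. Compile with `--passC:-fopenmp --passL:-fopenmp` to actually parallelize; without those flags, the pragma is ignored and the loop runs serially, producing the same result.

```nim
# Minimal sketch of Nim's OpenMP `||` iterator.
# Build (parallel): nim c -d:release --passC:-fopenmp --passL:-fopenmp parfor.nim
# Without the OpenMP flags the loop still compiles and runs serially.

proc scaled(n: int): seq[float] =
  ## Fill a seq where each slot only depends on its own index,
  ## so iterations are independent and safe to run in parallel.
  result = newSeq[float](n)
  for i in 0 || (n - 1):
    result[i] = i.float * 2.0

echo scaled(1000)[999]  # 1998.0
```

Note that each iteration writes to a distinct slot; the compiler is not aware of the parallelism, so avoiding shared mutable state across iterations is the programmer's responsibility.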
> GPU

Yes, but minimal; CUDA and OpenCL at the moment.

> Statistical functions

PCA and SVD are well developed, and actually 2x to 10x faster than in any other language (including scikit-learn's latest optimizations and Facebook's PCA):

* [https://github.com/mratsim/Arraymancer/pull/384#issuecomment-536682906](https://github.com/mratsim/Arraymancer/pull/384#issuecomment-536682906)

> Spline, numerical integration and ODE

* [https://github.com/HugoGranstrom/numericalnim](https://github.com/HugoGranstrom/numericalnim)
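To give a flavor of the Arraymancer API discussed above (tensor construction, shapeshifting, and BLAS-backed matrix multiplication), here is a minimal sketch. It assumes the `arraymancer` Nimble package is installed; the names used (`toTensor`, `transpose`, and `*` as 2-D matrix multiplication) follow the documentation linked earlier.

```nim
import arraymancer

let a = [[1.0, 2.0],
         [3.0, 4.0]].toTensor   # 2x2 Tensor[float]
let b = a.transpose             # shapeshifting: transposition
let c = a * b                   # `*` is matrix multiplication for 2-D tensors

echo c  # 2x2 result of A * Aᵀ
```

On CPU this dispatches to the configured BLAS (OpenBLAS or MKL); the CUDA and OpenCL backends expose the same operators on their tensor types.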