Hi there,

I work in a research group in Mathematics at Michigan University, and we are
looking to speed up some of the algorithms we use and have developed here by
running them on distributed systems.

We were thinking about using Spark, but I recently came across Mahout and read
about it. We use KNN and minimum spanning trees a lot here, and our main
concern is the inversion of matrices (really, really big matrices).

I found this paper, "Spark-based Large-scale Matrix Inversion for Big Data
Processing": https://web.njit.edu/~ansari/papers/16IEEEAccess.pdf
It provides a really good method for dealing with the inversion problem.

My questions are:
- For what we want to do, is it better to use Mahout or Spark?
- I saw that you already have a distributed PCA. Does Mahout also have a really
efficient matrix inversion algorithm?
- How good is the linear algebra library compared to MATLAB, for example?

Finally, our main concern about using Spark is the linear algebra library that
is used with it, so we were wondering how good the Mahout one is.

Thank you in advance,

Best regards,
Thibaut
