I agree that more distributed matrix ops would be good to have, but I think there are a few things which need to happen first:

* Now that the spark.ml package has local linear algebra separate from the spark.mllib package, we should migrate the distributed linear algebra implementations over to spark.ml.
* This migration will require a bit of thinking about what the API should look like. Should it use Datasets? If so, are there missing requirements to fix within Datasets or local linear algebra?

I just created a JIRA; let's discuss more there:
https://issues.apache.org/jira/browse/SPARK-15882

Thanks for bringing this up!
Joseph

On Fri, Jun 3, 2016 at 4:02 AM, José Manuel Abuín Mosquera <abui...@gmail.com> wrote:

> Hello,
>
> I would like to add some linear algebra operations to all the
> DistributedMatrix classes that Spark currently handles (CoordinateMatrix,
> BlockMatrix, IndexedRowMatrix and RowMatrix), but first I would like to ask
> if you consider this useful. (For me, it is)
>
> Of course, these operations will be distributed, but they will rely on the
> local implementation of mllib linalg. For example, when multiplying an
> IndexedRowMatrix by a DenseVector, the multiplication of one of the matrix
> rows by the vector will be performed by using the local implementation.
>
> What is your opinion about it?
>
> Thank you
>
> --
> José Manuel Abuín Mosquera
> Pre-doctoral researcher
> Centro de Investigación en Tecnoloxías da Información (CiTIUS)
> University of Santiago de Compostela
> 15782 Santiago de Compostela, Spain
>
> http://citius.usc.es/equipo/investigadores-en-formacion/josemanuel.abuin
> http://jmabuin.github.io
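
For reference, the row-by-vector example in the quoted message could look roughly like the sketch below. It is plain Python with no Spark dependency, so every name in it (`local_dot`, `multiply_partition`, `multiply`, and the list-of-partitions layout) is a hypothetical stand-in and not part of the mllib or spark.ml API. It only illustrates the idea: each indexed row is multiplied by the vector with a local kernel, and the distributed part is just a map over partitions.

```python
from typing import List, Tuple

# Hypothetical stand-in for an indexed row: (row index, row values).
IndexedRow = Tuple[int, List[float]]


def local_dot(row: List[float], vec: List[float]) -> float:
    """Local kernel: dot product of a single matrix row with the vector."""
    if len(row) != len(vec):
        raise ValueError("row length %d != vector length %d" % (len(row), len(vec)))
    return sum(a * b for a, b in zip(row, vec))


def multiply_partition(rows: List[IndexedRow], vec: List[float]) -> List[Tuple[int, float]]:
    """Work done on one partition: apply the local kernel to every row it holds."""
    return [(index, local_dot(values, vec)) for index, values in rows]


def multiply(partitions: List[List[IndexedRow]], vec: List[float]) -> List[Tuple[int, float]]:
    """Matrix-vector product over all partitions.

    In a real cluster this loop would be a distributed map over partitions,
    with the vector shipped to every worker (an assumption of this sketch).
    """
    out: List[Tuple[int, float]] = []
    for part in partitions:
        out.extend(multiply_partition(part, vec))
    # Rows may live in any partition, so order the result by row index.
    return sorted(out)


# A 3x2 matrix whose rows are spread over two partitions, out of order.
partitions = [
    [(0, [1.0, 2.0]), (2, [5.0, 6.0])],
    [(1, [3.0, 4.0])],
]
vector = [10.0, 1.0]

result = multiply(partitions, vector)
print(result)
```

Because the rows carry their own indices, the result can be reassembled in order no matter how the rows were partitioned. Whether the real API should return a distributed vector or a local one is one of the open questions for the JIRA.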