Hey Yuxi,

We have also implemented a distributed matrix multiplication library in PasaLab. The repo is hosted here: https://github.com/PasaLab/marlin . We implemented three distributed matrix multiplication algorithms on Spark. In our experience, the communication-optimal algorithm is not always optimal overall. Thus, besides the CARMA matrix multiplication you mentioned, we also implemented block-splitting matrix multiplication and broadcast matrix multiplication. They are more efficient than CARMA matrix multiplication in some situations, for example when a large matrix is multiplied by a small matrix.
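To make the broadcast idea concrete, here is a minimal single-machine sketch (illustrative only; the function name and partitioning are made up and this is not Marlin's actual API). When B is small, a full copy of B is shipped to every row-block of A, so each partition multiplies locally and no shuffle of A is needed, mimicking a Spark map over A's partitions with B as a broadcast variable:

```python
import numpy as np

def broadcast_multiply(a_blocks, b):
    # a_blocks: list of row-blocks of A (stand-ins for RDD partitions).
    # b: the small right-hand matrix, "broadcast" whole to every block.
    # Each block-row of the product is computed independently, then stacked.
    return np.vstack([block @ b for block in a_blocks])

# Example: a tall 6x4 matrix A split into three 2x4 row-blocks, times a small 4x2 B.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 4))
b = rng.standard_normal((4, 2))
a_blocks = [a[0:2], a[2:4], a[4:6]]
assert np.allclose(broadcast_multiply(a_blocks, b), a @ b)
```

The point of the sketch is the communication pattern: replicating the small operand avoids repartitioning the large one, which is why this can beat a communication-optimal algorithm like CARMA when the matrix shapes are very skewed.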
Actually, we shared this work at the Spark Meetup@Beijing on October 26th ( http://www.meetup.com/spark-user-beijing-Meetup/events/210422112/ ). The slides can be downloaded from the archive here: http://pan.baidu.com/s/1dDoyHX3#path=%252Fmeetup-3rd

Best,
Rong

2014-11-18 13:11 GMT+08:00 顾荣 <gurongwal...@gmail.com>:
> Hey Yuxi,
>
> We have also implemented a distributed matrix multiplication library in
> PasaLab. The repo is hosted here: https://github.com/PasaLab/marlin . We
> implemented three distributed matrix multiplication algorithms on Spark.
> In our experience, the communication-optimal algorithm is not always
> optimal overall. Thus, besides the CARMA matrix multiplication you
> mentioned, we also implemented block-splitting matrix multiplication and
> broadcast matrix multiplication. They are more efficient than CARMA
> matrix multiplication in some situations, for example when a large matrix
> is multiplied by a small matrix.
>
> Actually, we shared this work at the Spark Meetup@Beijing on October 26th
> ( http://www.meetup.com/spark-user-beijing-Meetup/events/210422112/ ).
> The slides are also attached to this mail.
>
> Best,
> Rong
>
> 2014-11-18 11:36 GMT+08:00 Zongheng Yang <zonghen...@gmail.com>:
>
>> There's been some work at the AMPLab on a distributed matrix library on
>> top of Spark; see here [1]. In particular, the repo contains a couple of
>> factorization algorithms.
>>
>> [1] https://github.com/amplab/ml-matrix
>>
>> Zongheng
>>
>> On Mon Nov 17 2014 at 7:34:17 PM liaoyuxi <liaoy...@huawei.com> wrote:
>>
>> > Hi,
>> > Matrix computation is critical for the efficiency of algorithms such
>> > as least squares, the Kalman filter, and so on.
>> > For now, the mllib module offers limited linear algebra on matrices,
>> > especially for distributed matrices.
>> >
>> > We have been working on establishing distributed matrix computation
>> > APIs based on data structures in MLlib.
>> > The main idea is to partition the matrix into sub-blocks, based on
>> > the strategy in the following paper:
>> > http://www.cs.berkeley.edu/~odedsc/papers/bfsdfs-mm-ipdps13.pdf
>> > In our experiments, it is communication-optimal.
>> > However, operations like factorization may not be appropriate to
>> > carry out in blocks.
>> >
>> > Any suggestions and guidance are welcome.
>> >
>> > Thanks,
>> > Yuxi
>> >
>
> --
> ------------------
> Rong Gu
> Department of Computer Science and Technology
> State Key Laboratory for Novel Software Technology
> Nanjing University
> Phone: +86 15850682791
> Email: gurongwal...@gmail.com
> Homepage: http://pasa-bigdata.nju.edu.cn/people/ronggu/
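The sub-block partitioning idea described in Yuxi's message can be sketched on a single machine as follows (illustrative only; the helper name and block layout are made up, not MLlib's API). A is split into an m-by-k grid of blocks and B into a k-by-n grid; each output block C[i][j] is the sum over s of A[i][s] @ B[s][j], which in a distributed setting becomes a join and reduce keyed on block indices:

```python
import numpy as np

def block_multiply(A_blocks, B_blocks):
    # A_blocks: m x k nested list of sub-matrices; B_blocks: k x n nested list.
    m, k = len(A_blocks), len(A_blocks[0])
    n = len(B_blocks[0])
    # Each C[i][j] only needs block-row i of A and block-column j of B.
    C = [[sum(A_blocks[i][s] @ B_blocks[s][j] for s in range(k))
          for j in range(n)] for i in range(m)]
    return np.block(C)  # stitch the block grid back into one matrix

# Check against a plain product on 4x4 matrices split into 2x2 grids of 2x2 blocks.
a = np.arange(16.0).reshape(4, 4)
b = np.arange(16.0).reshape(4, 4)[::-1]
A_blocks = [[a[:2, :2], a[:2, 2:]], [a[2:, :2], a[2:, 2:]]]
B_blocks = [[b[:2, :2], b[:2, 2:]], [b[2:, :2], b[2:, 2:]]]
assert np.allclose(block_multiply(A_blocks, B_blocks), a @ b)
```

This also illustrates the caveat in the message: each output block is an independent sum of local products, which suits multiplication well, but factorizations introduce dependencies across blocks that do not decompose this cleanly.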