Moreover, there is still room for improvement in the BSP-based KMeans algorithm. I'll describe it in a new JIRA ticket.
On Mon, Aug 12, 2013 at 7:57 AM, Yexi Jiang <[email protected]> wrote:
> That's cool!
>
>
> 2013/8/11 Edward J. Yoon <[email protected]>
>
>> Here's some interesting benchmarks (by Leonidas Fegaras) showing off
>> the performance of Hama compared to the Spark. Pagerank and KMeans
>> were run via MRQL query, which is not as fast as the native BSP code.
>> Moreover, 0.5 is very slow. I've started to think that latest Hama may
>> be faster than Spark. :-)
>>
>> ----
>> On laptop with 8 cores:
>>                     Hama 0.5   Spark
>> Pagerank 500K/2M:   211        341
>> KMeans 1M:          31         22
>> KMeans 2M:          41         40
>> KMeans 4M:          165        77
>>
>> On cluster with 64 cores:
>>                     Hama 0.5   Spark
>> Pagerank 1M/10M:    3590       428
>> KMeans 10M:         87         82
>> KMeans 20M:         129        134
>>
>> On cluster with 32 cores:
>>                     Hama 0.5   Spark
>> Pagerank 1M/10M:    4419       434
>> KMeans 10M:         98         74
>> KMeans 20M:         273        74
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>
>
>
>
> --
> ------
> Yexi Jiang,
> ECS 251, [email protected]
> School of Computer and Information Science,
> Florida International University
> Homepage: http://users.cis.fiu.edu/~yjian004/

--
Best Regards, Edward J. Yoon
@eddieyoon
