Great!
P.S.: I just noticed that Apache Mahout 0.8 has been released with streaming K-Means (https://issues.apache.org/jira/browse/MAHOUT-1154), which should be much faster than the previous implementation, so it would be worth considering for the benchmarks.

2013/7/26 Edward J. Yoon <[email protected]>

> Thanks all,
>
> I'll add more information on how the algorithms work and use BSP
> semantics, and Hama BSP benchmarks.
>
> On Tue, Jul 16, 2013 at 10:04 PM, Edward J. Yoon <[email protected]> wrote:
> >>>> BTW, the kmeans is 1000x faster than Mahout?
> >
> > P.S., yes, the kmeans is (almost) 1000x faster than Mahout MR version.
> >
> > On Tue, Jul 16, 2013 at 9:56 PM, Edward J. Yoon <[email protected]> wrote:
> >> There are few small mountains for us to climb. In fact, I think it's
> >> inevitable, since there are some overlapped functionalities :-)
> >>
> >> IMO, Spark has some problems, too.
> >>
> >> On Tue, Jul 16, 2013 at 9:41 PM, Tommaso Teofili
> >> <[email protected]> wrote:
> >>> there was this benchmark run by ThomasJ some time ago but I don't know if
> >>> still applies:
> >>> http://wiki.apache.org/hama/Benchmarks#K-Means_Clustering
> >>>
> >>> Tommaso
> >>>
> >>> 2013/7/16 Yexi Jiang <[email protected]>
> >>>
> >>>> Hi, Edward,
> >>>>
> >>>> Is there any need to compare Hama with the state-of-art frameworks such as
> >>>> spark, pregel etc? They draw a lot of attentions in recent years. As far as
> >>>> I know, spark is super fast.
> >>>>
> >>>> BTW, the kmeans is 1000x faster than Mahout?
> >>>>
> >>>> Regards,
> >>>> Yexi
> >>>>
> >>>>
> >>>> 2013/7/16 Tommaso Teofili <[email protected]>
> >>>>
> >>>> > Hi Edward,
> >>>> >
> >>>> > thanks, that's nice!
> >>>> > One quick comment, I would make efficiency comparisons only if backed by
> >>>> > benchmarks run on latest versions (e.g. K-Means clustering comparison with
> >>>> > Mahout) so that you can also provide updated graphs of benchmarks so that
> >>>> > people can "see better".
> >>>> >
> >>>> > Regards,
> >>>> > Tommaso
> >>>> >
> >>>> > 2013/7/16 Edward J. Yoon <[email protected]>
> >>>> >
> >>>> > > Hi all,
> >>>> > >
> >>>> > > I'll talk at Hadoop In Seoul 2013 about Apache Hama. See speakers at
> >>>> > > http://hadoop.co.kr
> >>>> > >
> >>>> > > I'm working on my slides[1]. If you have any suggestion, Pls let me know.
> >>>> > >
> >>>> > > 1. https://docs.google.com/presentation/d/1263QjLu8pgqcnrG2xNDf-SyVG-aR5k7-2naYB9gmzvg/edit?usp=sharing
> >>>> > >
> >>>> > > Thanks.
> >>>> > >
> >>>> > > --
> >>>> > > Best Regards, Edward J. Yoon
> >>>> > > @eddieyoon
> >>>> > >
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> ------
> >>>> Yexi Jiang,
> >>>> ECS 251, [email protected]
> >>>> School of Computer and Information Science,
> >>>> Florida International University
> >>>> Homepage: http://users.cis.fiu.edu/~yjian004/
> >>>>
> >>
> >>
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> @eddieyoon
> >
> >
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>
