Hi Thomas,

Interesting discussion. Which examples do you have in mind that might be easier to express in general BSP than in Giraph/Pregel?
To add my two cents: I think the real question is whether BSP itself is the best model for distributed machine learning, or whether an asynchronous model, as implemented in GraphLab, should be preferred. But that's more of a scientific/esoteric question :)

--sebastian

On 25.05.2012 19:24, Thomas Jungblut wrote:
> Hi Ted,
>
> Giraph offers a graph layer that internally uses BSP on top of MapReduce.
> You don't have access to the BSP primitives, therefore you need to treat
> every machine learning problem as a graph problem, which may be very
> inconvenient in many cases.
>
> 2012/5/25 Ted Dunning <[email protected]>
>
>> Apache Giraph probably offers a more mature BSP model of computation. My
>> guess is that it would make a stronger implementation substrate. It
>> certainly has a very strong community.
>>
>> On Fri, May 25, 2012 at 10:44 AM, Thomas Jungblut <[email protected]> wrote:
>>
>>> Hi Manuel,
>>>
>>> 300k is small, I have one with 6 million clicks.
>>> However, it is more a question of interest and of which algorithms could be
>>> suitable for BSP.
>>> In case you wonder what BSP is, it stands for bulk synchronous parallel
>>> [1].
>>> We think that realtime and strongly iterative algorithms that are slow in
>>> MapReduce could be solved more efficiently with BSP.
>>> If you're interested, let us know.
>>>
>>> Regards,
>>> Thomas
>>>
>>> [1] http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
>>>
>>> 2012/5/25 Manuel Blechschmidt <[email protected]>
>>>
>>>> Hi Edward,
>>>> do you already have a test dataset?
>>>>
>>>> I might get one with about 300,000 clicks for you.
>>>>
>>>> It is from www.nelou.com and we are already running a recommender in
>>>> preview mode:
>>>> http://www.nelou.com/artikel-803746/Overall-von-mysuro#__apaxoPreviewMode
>>>>
>>>> It could be the case that you would have to sign an NDA. Would this be
>>>> possible for you?
>>>>
>>>> /Manuel
>>>>
>>>> On 25.05.2012, at 10:34, Edward J. Yoon wrote:
>>>>
>>>>> Okay, I'm forwarding this to mahout dev.
>>>>>
>>>>> I'm planning to create a project related to online machine learning
>>>>> as an Apache Hama sub-module, since the graph of message queues and
>>>>> workers could be implemented using BSP (see also [1]). The first idea
>>>>> is an online recommendation system based on click-stream data.
>>>>>
>>>>> If you are interested in this plan, let's talk about it here.
>>>>>
>>>>> [1] http://codingwiththomas.blogspot.com/2011/10/apache-hama-realtime-processing.html
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Thomas Jungblut <[email protected]>
>>>>> Date: Fri, May 25, 2012 at 4:55 PM
>>>>> Subject: Re: Online machine learning on top of Hama BSP
>>>>> To: [email protected]
>>>>>
>>>>> Should we cooperate with the Mahout guys on this? I'm pretty sure they
>>>>> would have fun with it.
>>>>> Edward, do you want to ask them?
>>>>>
>>>>> 2012/5/25 Tommaso Teofili <[email protected]>
>>>>>
>>>>>> Do you have a plan for that, Edward?
>>>>>> A separate package in examples or a separate (online) machine learning
>>>>>> module? Or something else?
>>>>>> Regards
>>>>>> Tommaso
>>>>>>
>>>>>> 2012/5/25 Edward J. Yoon <[email protected]>
>>>>>>
>>>>>>> Okay, then let's get started.
>>>>>>>
>>>>>>> My first idea is a simple online recommendation system based on
>>>>>>> click-stream data.
>>>>>>>
>>>>>>> On Thu, May 24, 2012 at 6:26 PM, Praveen Sripati
>>>>>>> <[email protected]> wrote:
>>>>>>>> +1
>>>>>>>>
>>>>>>>> For those who are interested in ML, please check this. GNU Octave is
>>>>>>>> used.
>>>>>>>>
>>>>>>>> https://www.coursera.org/course/ml
>>>>>>>>
>>>>>>>> Another session is yet to be announced.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Praveen
>>>>>>>>
>>>>>>>> On Thu, May 24, 2012 at 12:54 PM, Thomas Jungblut <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> 2012/5/24 Tommaso Teofili <[email protected]>
>>>>>>>>>
>>>>>>>>>> and same here :)
>>>>>>>>>>
>>>>>>>>>> 2012/5/24 Vaijanath Rao <[email protected]>
>>>>>>>>>>
>>>>>>>>>>> +1 me too
>>>>>>>>>>> On May 23, 2012 10:26 PM, "Aditya Sarawgi" <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>> I would be happy to help :)
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 23, 2012 at 6:23 PM, Edward J. Yoon <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is anyone interested in online machine learning?
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>>>>>>> @eddieyoon
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Aditya Sarawgi
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thomas Jungblut
>>>>>>>>> Berlin <[email protected]>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards, Edward J. Yoon
>>>>>>> @eddieyoon
>>>>>
>>>>> --
>>>>> Thomas Jungblut
>>>>> Berlin <[email protected]>
>>>>>
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>>>
>>>> --
>>>> Manuel Blechschmidt
>>>> Dortustr. 57
>>>> 14467 Potsdam
>>>> Mobile: 0173/6322621
>>>> Twitter: http://twitter.com/Manuel_B
>>>
>>> --
>>> Thomas Jungblut
>>> Berlin <[email protected]>
