Hi Thomas,

Interesting discussion. Which examples do you have in mind that might be
easier to represent in general BSP than in Giraph/Pregel?

To add my 2 cents: I think the real question is whether BSP itself is the
best model for distributed machine learning, or whether an asynchronous
model as implemented in GraphLab should be preferred. But that's more of a
scientific/esoteric question :)

--sebastian

On 25.05.2012 19:24, Thomas Jungblut wrote:
> Hi Ted,
> 
> Giraph offers a graph layer that internally uses BSP on top of MapReduce.
> You don't have access to the BSP primitives, so you need to treat
> every machine learning problem as a graph problem, which may be very
> inconvenient in many cases.
> 
> 2012/5/25 Ted Dunning <[email protected]>
> 
>> Apache Giraph probably offers a more mature BSP model of computation.  My
>> guess is that it would make a stronger implementation substrate.  It
>> certainly has a very strong community.
>>
>> On Fri, May 25, 2012 at 10:44 AM, Thomas Jungblut <
>> [email protected]> wrote:
>>
>>> Hi Manuel,
>>>
>>> 300k is small; I have one with 6 million clicks.
>>> However it is more a question of interest and what algorithms could be
>>> suitable for BSP.
>>> In case you wonder what BSP is, it stands for bulk synchronous parallel
>>> [1].
>>> We think that realtime and strongly iterative algorithms that are slow in
>>> MapReduce could be solved more efficiently with BSP.
>>> If you're interested, let us know.
>>>
>>> Regards,
>>> Thomas
>>>
>>> [1] http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
>>>
>>> 2012/5/25 Manuel Blechschmidt <[email protected]>
>>>
>>>> Hi Edward,
>>>> do you already have a test dataset?
>>>>
>>>> I might get one with about 300.000 clicks for you.
>>>>
>>>> It is from www.nelou.com and we are already running a recommender in
>>>> preview mode:
>>>>
>>>> http://www.nelou.com/artikel-803746/Overall-von-mysuro#__apaxoPreviewMode
>>>>
>>>> It could be the case that you would have to sign an NDA. Would this be
>>>> possible for you?
>>>>
>>>> /Manuel
>>>>
>>>> On 25.05.2012, at 10:34, Edward J. Yoon wrote:
>>>>
>>>>> Okay, I'm forwarding this to mahout dev.
>>>>>
>>>>> I'm planning to create a project related to online machine learning
>>>>> as an Apache Hama sub-module, since the graph of message queues and
>>>>> workers could be implemented using BSP (see also [1]). The first idea
>>>>> is an online recommendation system based on click-stream data.
>>>>>
>>>>> If you are interested in this plan, let's talk together here.
>>>>>
>>>>> 1. http://codingwiththomas.blogspot.com/2011/10/apache-hama-realtime-processing.html
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Thomas Jungblut <[email protected]>
>>>>> Date: Fri, May 25, 2012 at 4:55 PM
>>>>> Subject: Re: Online machine learning on top of Hama BSP
>>>>> To: [email protected]
>>>>>
>>>>>
>>>>> Should we cooperate with the Mahout guys on this? I'm pretty sure they
>>>>> would have fun with it.
>>>>> Edward, do you want to ask them?
>>>>>
>>>>> 2012/5/25 Tommaso Teofili <[email protected]>
>>>>>
>>>>>> Do you have a plan for that Edward?
>>>>>> A separate package in examples or a separate (online) machine learning
>>>>>> module? Or something else?
>>>>>> Regards
>>>>>> Tommaso
>>>>>>
>>>>>> 2012/5/25 Edward J. Yoon <[email protected]>
>>>>>>
>>>>>>> Okay, then let's get started.
>>>>>>>
>>>>>>> My first idea is a simple online recommendation system based on
>>>>>>> click-stream data.
>>>>>>>
>>>>>>> On Thu, May 24, 2012 at 6:26 PM, Praveen Sripati
>>>>>>> <[email protected]> wrote:
>>>>>>>> +1
>>>>>>>>
>>>>>>>> For those who are interested in ML, please check this. GNU Octave is
>>>>>>>> used.
>>>>>>>>
>>>>>>>> https://www.coursera.org/course/ml
>>>>>>>>
>>>>>>>> Another session is yet to be announced.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Praveen
>>>>>>>>
>>>>>>>> On Thu, May 24, 2012 at 12:54 PM, Thomas Jungblut <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> 2012/5/24 Tommaso Teofili <[email protected]>
>>>>>>>>>
>>>>>>>>>> and same here :)
>>>>>>>>>>
>>>>>>>>>> 2012/5/24 Vaijanath Rao <[email protected]>
>>>>>>>>>>
>>>>>>>>>>> +1 me too
>>>>>>>>>>> On May 23, 2012 10:26 PM, "Aditya Sarawgi" <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>> I would be happy to help :)
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 23, 2012 at 6:23 PM, Edward J. Yoon <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is anyone interested in online machine learning?
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>>>>>>> @eddieyoon
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Aditya Sarawgi
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thomas Jungblut
>>>>>>>>> Berlin <[email protected]>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards, Edward J. Yoon
>>>>>>> @eddieyoon
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thomas Jungblut
>>>>> Berlin <[email protected]>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>>>
>>>> --
>>>> Manuel Blechschmidt
>>>> Dortustr. 57
>>>> 14467 Potsdam
>>>> Mobil: 0173/6322621
>>>> Twitter: http://twitter.com/Manuel_B
>>>>
>>>>
>>>
>>>
>>> --
>>> Thomas Jungblut
>>> Berlin <[email protected]>
>>>
>>
> 
> 
> 
