Hi Manuel,

300k is small; I have one with 6 million clicks.
However, it is more a question of interest and of which algorithms could be
suitable for BSP.
In case you are wondering what BSP is, it stands for bulk synchronous parallel [1].
We think that real-time and strongly iterative algorithms that are slow in
MapReduce could be solved more efficiently with BSP.
If you're interested, let us know.

Regards,
Thomas

[1] http://en.wikipedia.org/wiki/Bulk_synchronous_parallel

2012/5/25 Manuel Blechschmidt <[email protected]>

> Hi Edward,
> do you already have a test dataset?
>
> I might get one with about 300,000 clicks for you.
>
> It is from www.nelou.com and we are already running a recommender in
> preview mode:
> http://www.nelou.com/artikel-803746/Overall-von-mysuro#__apaxoPreviewMode
>
> It could be the case that you would have to sign an NDA. Would this be
> possible for you?
>
> /Manuel
>
> On 25.05.2012, at 10:34, Edward J. Yoon wrote:
>
> > Okay, I'm forwarding this to Mahout dev.
> >
> > I'm planning to create a project related to online machine learning
> > as an Apache Hama sub-module, since the graph of message queues and
> > workers could be implemented using BSP (see also [1]). The first idea
> > is an online recommendation system based on click-stream data.
> >
> > If you are interested in this plan, let's talk about it here.
> >
> > [1] http://codingwiththomas.blogspot.com/2011/10/apache-hama-realtime-processing.html
> >
> > ---------- Forwarded message ----------
> > From: Thomas Jungblut <[email protected]>
> > Date: Fri, May 25, 2012 at 4:55 PM
> > Subject: Re: Online machine learning on top of Hama BSP
> > To: [email protected]
> >
> >
> > Should we cooperate with the Mahout guys on this? I'm pretty sure they
> > would have fun with it.
> > Edward, do you want to ask them?
> >
> > 2012/5/25 Tommaso Teofili <[email protected]>
> >
> >> Do you have a plan for that Edward?
> >> A separate package in examples or a separate (online) machine learning
> >> module? Or something else?
> >> Regards
> >> Tommaso
> >>
> >> 2012/5/25 Edward J. Yoon <[email protected]>
> >>
> >>> Okay, then let's get started.
> >>>
> >>> My first idea is a simple online recommendation system based on
> >>> click-stream data.
> >>>
> >>> On Thu, May 24, 2012 at 6:26 PM, Praveen Sripati
> >>> <[email protected]> wrote:
> >>>> +1
> >>>>
> >>>> For those who are interested in ML, please check this. GNU Octave is
> >>>> used.
> >>>>
> >>>> https://www.coursera.org/course/ml
> >>>>
> >>>> Another session is yet to be announced.
> >>>>
> >>>> Thanks,
> >>>> Praveen
> >>>>
> >>>> On Thu, May 24, 2012 at 12:54 PM, Thomas Jungblut <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> +1
> >>>>>
> >>>>> 2012/5/24 Tommaso Teofili <[email protected]>
> >>>>>
> >>>>>> and same here :)
> >>>>>>
> >>>>>> 2012/5/24 Vaijanath Rao <[email protected]>
> >>>>>>
> >>>>>>> +1 me too
> >>>>>>> On May 23, 2012 10:26 PM, "Aditya Sarawgi" <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> +1
> >>>>>>>> I would be happy to help :)
> >>>>>>>>
> >>>>>>>> On Wed, May 23, 2012 at 6:23 PM, Edward J. Yoon <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> Is anyone interested in online machine learning?
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Best Regards, Edward J. Yoon
> >>>>>>>>> @eddieyoon
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Cheers,
> >>>>>>>> Aditya Sarawgi
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Thomas Jungblut
> >>>>> Berlin <[email protected]>
> >>>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards, Edward J. Yoon
> >>> @eddieyoon
> >>>
> >>
> >
> >
> >
> > --
> > Thomas Jungblut
> > Berlin <[email protected]>
> >
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
>
> --
> Manuel Blechschmidt
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B
>
>


-- 
Thomas Jungblut
Berlin <[email protected]>
