Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Dmitriy Lyubimov
On Tue, Jan 31, 2017 at 3:01 AM, Isabel Drost-Fromm wrote: > > Hi, > > > To give some advice to downstream users in the field - what would be your > advice > for people tasked with concrete use cases (stuff like fraud detection, > anomaly > detection, learning search ranking

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Florent Empis
From my point of view, Mahout as a whole has shifted from what it was in 2009-2012: at the time, Mahout (and Mahout in Action is a great testimony of that era) was a sum of bricks, full of relatively high-level mathematics concepts but usable by what I'd call (myself included) wanna-be

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Keith Aumiller
I was just watching it. ;) https://trevorgrant.org/ Thanks Trevor! On Tue, Jan 31, 2017 at 3:41 PM, scott cote wrote: > Trevor gave a great presentation at our user group. It was live streamed > on Periscope. Trevor - maybe you could share the url? I don’t have it >

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread scott cote
Trevor gave a great presentation at our user group. It was live streamed on Periscope. Trevor - maybe you could share the url? I don’t have it handy at the moment. Scott > On Jan 31, 2017, at 8:50 AM, Trevor Grant wrote: > > Hello Isabel and Florent, > > I'm

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Ted Dunning
From my perspective, the state of the art of machine learning is with systems like TensorFlow and dl4j. If you can deal with the limits of a non-clustered GPU system, then Theano and Caffe are very useful. Keras papers over the difference between different back-ends nicely. TensorFlow and Theano

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Pat Ferrel
My perspective comes from the data side. I work in recommenders and that means log analysis for huge amounts of data. Even a small shop doing this will immediately run out of capacity in Python or R on a single node. MLlib is a set of prepackaged algorithms that will work (mostly) with big

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Trevor Grant
Hello Isabel and Florent, I'm currently working on a side-by-side demo of R / Python / SparkML (MLlib) / Mahout, but in very broad strokes here is how I would compare them: R - Most statistical functionality. Most flexibility. Implement your own algorithms - mathematically expressive language.

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Florent Empis
Hi, I am in the same spot as Isabel. Used to use/understand most of the «old» standalone Mahout, now doing some data transformation with Spark, but I am not sure where Samsara fits in the ecosystem. We also do quite a bit of computation in R. Basically we are willing to learn and support the

Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation

2017-01-31 Thread Isabel Drost-Fromm
Hi, On Fri, Sep 16, 2016 at 11:36:03PM -0700, Andrew Musselman wrote: > and we're thinking about just how many pre-built algorithms we > should include in the library versus working on performance behind the > scenes. To pick this question up: I've been watching Mahout from a distance for quite