On 04/14/2014 08:00 AM, Dmitriy Lyubimov wrote:
Unfortunately, not all things map gracefully into algebra. But hopefully
some of the whole still can.

Yes, that's why I was asking Andy if there are enough constructs. If not, we might have to add more.


I am even a little bit worried that we may develop almost too much (is
there such a thing) of ML before we have a chance to crystallize data frames
and perhaps dictionary discussions. These are more tools to keep abstracted.

I think it's a very good thing to have early ML implementations on the DSL, because it allows us to validate whether we are on the right path. We should start with providing the things that are most popular in Mahout, like the item-based recommender from MAHOUT-1464. Having a few implementations on the DSL also helps with designing new abstractions, because for every proposed feature we can look at the existing code and see how helpful the new feature would be.


I just don't want Mahout to be yet-another mllib. I shudder every time
somebody says "we want to create a Spark version of (an|the) algorithm".  I
know it will be creating wrong talking points for somebody anxious to draw
parallels.

Totally agree here. Looks like history repeats itself, from "I want to create a Hadoop implementation" to "I want to create a Spark implementation" :)



On Sun, Apr 13, 2014 at 10:51 PM, Sebastian Schelter <[email protected]> wrote:

Andy, that would be awesome. Have you had a look at our new Scala DSL [1]?
Does it offer enough constructs for you to rewrite your implementation with
it?

--sebastian


[1] https://mahout.apache.org/users/sparkbindings/home.html


On 04/14/2014 07:47 AM, Andy Twigg wrote:

+1 to removing present Random Forests. Andy Twigg had provided a
Spark-based Streaming Random Forests impl sometime last year. It's time to
restart that conversation and integrate that into the codebase if the
contributor is still willing i.e.


I'm happy to contribute this, but as it stands it's written against
spark, even forgetting the 'streaming' aspect. Do you have any advice
on how to proceed?




