I am ready to order a t-shirt with "Go, Andy! +100" across it if it makes any pragmatic sense.
On Apr 13, 2014 11:11 PM, "Sebastian Schelter" <[email protected]> wrote:
> On 04/14/2014 08:00 AM, Dmitriy Lyubimov wrote:
>
>> not all things unfortunately map gracefully into algebra. But hopefully
>> some of the whole can still be.
>
> Yes, that's why I was asking Andy if there are enough constructs. If not,
> we might have to add more.
>
>> I am even a little bit worried that we may develop almost too much (is
>> there such a thing) of ML before we have a chance to crystallize data
>> frames and perhaps dictionary discussions. These are more tools to keep
>> abstracted.
>
> I think it's a very good thing to have early ML implementations on the
> DSL, because it allows us to validate whether we are on the right path. We
> should start with providing the things that are most popular in Mahout,
> like the item-based recommender from MAHOUT-1464. Having a few
> implementations on the DSL also helps with designing new abstractions,
> because for every proposed feature we can look at the existing code and see
> how helpful the new feature would be.
>
>> I just don't want Mahout to be yet another mllib. I shudder every time
>> somebody says "we want to create a Spark version of (an|the) algorithm".
>> I know it will be creating the wrong talking points for somebody anxious
>> to draw parallels.
>
> Totally agree here. Looks like history repeats itself, from "I want to
> create a Hadoop implementation" to "I want to create a Spark
> implementation" :)
>
>> On Sun, Apr 13, 2014 at 10:51 PM, Sebastian Schelter <[email protected]>
>> wrote:
>>
>>> Andy, that would be awesome. Have you had a look at our new scala DSL
>>> [1]? Does it offer enough constructs for you to rewrite your
>>> implementation with it?
>>>
>>> --sebastian
>>>
>>> [1] https://mahout.apache.org/users/sparkbindings/home.html
>>>
>>> On 04/14/2014 07:47 AM, Andy Twigg wrote:
>>>
>>>>> +1 to removing present Random Forests. Andy Twigg had provided a
>>>>> Spark based Streaming Random Forests impl sometime last year. It's
>>>>> time to restart that conversation and integrate that into the
>>>>> codebase if the contributor is still willing, i.e.
>>>>
>>>> I'm happy to contribute this, but as it stands it's written against
>>>> Spark, even forgetting the 'streaming' aspect. Do you have any advice
>>>> on how to proceed?
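[Editor's note: for readers unfamiliar with the scala DSL referenced as [1] above, here is a minimal illustrative sketch of the kind of construct it offers, based on the sparkbindings documentation of that period. This is not runnable standalone code; it assumes the Mahout math-scala and spark bindings artifacts on the classpath and a local Spark master, and the matrix values are made up for illustration.]

```scala
// Sketch only: assumes org.apache.mahout:mahout-math-scala and
// mahout-spark dependencies, plus a reachable Spark master.
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.scalabindings.RLikeOps._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._

// Distributed context backed by a local Spark master (illustrative).
implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "dsl-sketch")

// Build a small in-core matrix and promote it to a distributed row matrix (DRM).
val inCoreA = dense((1, 2, 3), (3, 4, 5))
val drmA = drmParallelize(inCoreA)

// Express A'A algebraically; the DSL's optimizer chooses the physical plan,
// so the same expression runs unchanged on different backends.
val drmAtA = drmA.t %*% drmA

// Bring the (small) result back in-core.
val inCoreAtA = drmAtA.collect
```

The point of the thread's question to Andy is whether algebraic constructs like these (`%*%`, `.t`, `drmParallelize`, `mapBlock`) are expressive enough to port an algorithm such as streaming random forests without writing directly against the Spark API.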
