Re: Helping out on spark efforts

Dmitriy Lyubimov Wed, 30 Apr 2014 11:41:02 -0700

On Wed, Apr 30, 2014 at 10:53 AM, Dmitriy Lyubimov <[email protected]> wrote:


> +1.
>
> And the greatest benefit of the data frames work is standardization of feature
> extraction in Mahout, not necessarily any particular algorithm. This has
> historically been the thorniest issue, and nobody does it well today as it
> stands.
>

Correction: nobody does it well in open source and in a distributed way, that
is.


> If we tackle feature prep techniques in an engine-agnostic way, this would be
> a truly unique differentiating factor for Mahout.
>
>
>
> On Wed, Apr 30, 2014 at 7:52 AM, Sebastian Schelter <[email protected]> wrote:
>
>> I think you should concentrate on MAHOUT-1490, that is a highly important
>> task that will be the foundation for a lot of stuff to be built on top.
>> Let's focus on getting this thing right and then move on to other things.
>>
>> --sebastian
>>
>>
>> On 04/30/2014 04:44 PM, Saikat Kanjilal wrote:
>>
>>> Sebastian/Dmitriy, in looking through the current list of issues I didn't
>>> see other algorithms in Mahout that are being discussed for porting to Spark.
>>> I was wondering if there's any interest/need in porting or writing things
>>> like LR/KMeans/SVM to use Spark; I'd like to help out in this area while
>>> working on 1490.  Also, are we planning to port the distributed versions of
>>> Taste to use Spark as well at some point?
>>> Thanks in advance.
>>>
>>>
>>
>
