On Wed, Apr 30, 2014 at 10:53 AM, Dmitriy Lyubimov <[email protected]> wrote:
> +1.
>
> And the greatest benefit of data frames work is standardization of feature
> extraction in Mahout, not necessarily any particular algorithms. This has
> been the thorniest issue in the history and nobody does it well today as it
> stands.
>

Correction: nobody does it well in open source and in distributed way, that is.

> If we tackle feature prep techniques in engine-agnostic way, this would be
> truly unique differentiation factor for Mahout.
>
>
> On Wed, Apr 30, 2014 at 7:52 AM, Sebastian Schelter <[email protected]> wrote:
>
>> I think you should concentrate on MAHOUT-1490, that is a highly important
>> task that will be the foundation for a lot of stuff to be built on top.
>> Let's focus on getting this thing right and then move on to other things.
>>
>> --sebastian
>>
>>
>> On 04/30/2014 04:44 PM, Saikat Kanjilal wrote:
>>
>>> Sebastien/Dmitry,In looking through the current list of issues I didnt
>>> see other algorithms in mahout that are talked about being ported to spark,
>>> I was wondering if there's any interest/need in porting or writing things
>>> like LR/KMeans/SVM to use spark, I'd like to help out in this area while
>>> working on 1490. Also are we planning to port the distributed versions of
>>> taste to use spark as well at some point.
>>> Thanks in advance.
>>>
>>
>
