+1. The greatest benefit of the data frames work is standardization of feature extraction in Mahout, not necessarily any particular algorithm. This has historically been the thorniest issue, and nobody does it well as things stand today. If we tackle feature prep techniques in an engine-agnostic way, it would be a truly unique differentiating factor for Mahout.
On Wed, Apr 30, 2014 at 7:52 AM, Sebastian Schelter <[email protected]> wrote:

> I think you should concentrate on MAHOUT-1490, that is a highly important
> task that will be the foundation for a lot of stuff to be built on top.
> Let's focus on getting this thing right and then move on to other things.
>
> --sebastian
>
>
> On 04/30/2014 04:44 PM, Saikat Kanjilal wrote:
>
>> Sebastien/Dmitry,In looking through the current list of issues I didnt
>> see other algorithms in mahout that are talked about being ported to spark,
>> I was wondering if there's any interest/need in porting or writing things
>> like LR/KMeans/SVM to use spark, I'd like to help out in this area while
>> working on 1490. Also are we planning to port the distributed versions of
>> taste to use spark as well at some point.
>> Thanks in advance.
