Re: [GSOC 2014] Uniform API for Mahout Clustering

Dmitriy Lyubimov Mon, 17 Mar 2014 10:52:22 -0700

Yes, there's interest.
Note that we are trying to unify linear algebra primitives and optimization
on Spark as well. All new linear algebra and interaction with the Spark context
should probably go through this layer. This is an ongoing effort, but some of
it is already working [1].


[1] MAHOUT-1346 https://issues.apache.org/jira/browse/MAHOUT-1346


On Mon, Mar 17, 2014 at 10:37 AM, chalitha udara Perera <
[email protected]> wrote:

> Hi All,
>
> Going through the mail thread on Mahout 1.0 goals, I found that the main focus
> of Mahout is now on code refactoring and integration with Spark
> rather than on implementing new algorithms. Recently I used Mahout to
> implement a document clustering module for a Content Management System.
>
> To be honest, we had some problems with the lack of uniformity among the
> different clustering algorithms. For example, simple K-Means takes a
> sequence file of document TF-IDF vectors as input, while Spectral K-Means
> takes a CSV file that defines the similarity matrix.
>
> I think that if we can provide a uniform clustering API, as mentioned in the
> 1.0 goals, it would be very useful for end-user developers.
>
> I would like to proceed with this idea as my GSOC 2014 project. Please let
> me know if you are interested in this project.
> --
> J.M Chalitha Udara Perera
>
> *Department of Computer Science and Engineering,*
> *University of Moratuwa,*
> *Sri Lanka*
>
