[jira] [Commented] (MAHOUT-1344) Self-Organizing Map algorithm (batch version)

Ted Dunning (JIRA) Tue, 01 Oct 2013 22:30:24 -0700

    [ https://issues.apache.org/jira/browse/MAHOUT-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783669#comment-13783669 ]


Ted Dunning commented on MAHOUT-1344:
-------------------------------------

This is really unfortunate.

You just provided a very neatly formatted patch with about 4000 lines of code 
that is based on the old clustering framework instead of the new one.

It would have been much better if you had engaged the community 
during your development.  The benefits would have been:

- your results would have been better (faster, more scalable)

- your code would be much more likely to survive changes (by being based on new 
stuff instead of stuff that is likely to be removed)

- the parts of your changes that are not directly related to your final project 
could have been integrated along the way

- we could have discussed how this fits into Mahout in general.

As I said, unfortunate.

But, moving on, can you say more about what you intend for this code?  Will you 
be around to help maintain it?

Do you have a writeup with references that describes the implementation and 
algorithms?  Do you have a tutorial that describes how your code works?

Who do you think would need this code?


> Self-Organizing Map algorithm (batch version)
> ---------------------------------------------
>
>                 Key: MAHOUT-1344
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1344
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Clustering
>    Affects Versions: 0.8
>            Reporter: Álvaro Pérez Alarcón
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: MAHOUT-1344.patch
>
>
> Good morning.
> As part of my final year project, I have implemented a new module for Apache 
> Mahout, implementing Kohonen's self-organizing map algorithm, in its batch 
> version.
> The work is already done, and I will proceed to submit a patch ASAP. It was 
> developed over Mahout 0.8.
> The patch includes unit tests and the algorithm was successfully used in a 
> Hadoop cluster to cluster two big datasets. Results can be seen in [this 
> image gallery|http://imgur.com/a/DlgRT].
> The implementation uses the generic clustering algorithms implemented in the 
> ClusterIterator class. Minor changes were made to this and other related 
> classes to support some of the features, without affecting the execution of 
> other algorithms.
> The algorithm supports convergence and the ability to resume work at a 
> given iteration (mainly, in order to initialize KohonenBatchClusteringPolicy 
> with a given iteration number, although it also affects the names of the 
> output directories).



--
This message was sent by Atlassian JIRA
(v6.1#6144)
