[ 
https://issues.apache.org/jira/browse/MAHOUT-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576867#action_12576867
 ] 

Ankur commented on MAHOUT-4:
----------------------------

Hi Isabel,

The algorithm can certainly be ported to a Map-Reduce setting on Hadoop. In
fact, the algorithm has already been map-reduced, as described in the Google
News personalization paper (please see the Javadoc for details).

I wrote the non-distributed version of the algorithm to help myself understand,
visualize and see the EM algorithm in action, starting with a very small
dataset. The iterative logic and the small dataset make it particularly easy to
see how the probabilities of users and items belonging to a cluster converge
for users who share a large number of common items.
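
To make the iteration described above concrete, here is a minimal sketch of
EM on a toy click dataset. It is not the code in Mahout_EM.patch: the class
name TinyPlsiEm, the method fit and the sample data are hypothetical, and it
assumes a PLSI-style model in which p(item | user) is the sum over clusters z
of p(z | user) * p(item | z). On data like this, users with the same clicks
typically end up with most of their probability mass on the same cluster.

```java
import java.util.Arrays;
import java.util.Random;

/** Toy EM for soft-clustering (user, item) click pairs. Illustration only. */
public class TinyPlsiEm {

    /**
     * Runs EM and returns p(z|u): one row per user, one column per cluster.
     * Each entry of clicks is {userIndex, itemIndex}.
     */
    public static double[][] fit(int[][] clicks, int numUsers, int numItems,
                                 int numClusters, int iterations, long seed) {
        Random rnd = new Random(seed);
        double[][] pZgivenU = randomRows(numUsers, numClusters, rnd); // p(z|u)
        double[][] pSgivenZ = randomRows(numClusters, numItems, rnd); // p(s|z)

        for (int it = 0; it < iterations; it++) {
            double[][] userMass = new double[numUsers][numClusters];
            double[][] itemMass = new double[numClusters][numItems];

            // E-step: for each click, compute the responsibility of every
            // cluster, q(z|u,s) proportional to p(z|u) * p(s|z), and
            // accumulate it per user and per item.
            for (int[] click : clicks) {
                int u = click[0];
                int s = click[1];
                double[] q = new double[numClusters];
                double norm = 0.0;
                for (int z = 0; z < numClusters; z++) {
                    q[z] = pZgivenU[u][z] * pSgivenZ[z][s];
                    norm += q[z];
                }
                if (norm == 0.0) {
                    continue; // click has zero probability under the model
                }
                for (int z = 0; z < numClusters; z++) {
                    userMass[u][z] += q[z] / norm;
                    itemMass[z][s] += q[z] / norm;
                }
            }

            // M-step: renormalise the accumulated responsibilities so that
            // each row is again a probability distribution.
            for (double[] row : userMass) normalize(row);
            for (double[] row : itemMass) normalize(row);
            pZgivenU = userMass;
            pSgivenZ = itemMass;
        }
        return pZgivenU;
    }

    /** Random strictly positive rows, each normalised to sum to 1. */
    private static double[][] randomRows(int rows, int cols, Random rnd) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            for (int c = 0; c < cols; c++) {
                row[c] = 0.1 + rnd.nextDouble();
            }
            normalize(row);
        }
        return m;
    }

    /** Scales a row to sum to 1; falls back to uniform if the row is all zero. */
    private static void normalize(double[] row) {
        double sum = 0.0;
        for (double v : row) sum += v;
        for (int c = 0; c < row.length; c++) {
            row[c] = (sum > 0.0) ? row[c] / sum : 1.0 / row.length;
        }
    }

    public static void main(String[] args) {
        // Made-up data: users 0 and 1 click items 0-2, users 2 and 3 click items 3-5.
        int[][] clicks = {
            {0, 0}, {0, 1}, {0, 2}, {1, 0}, {1, 1}, {1, 2},
            {2, 3}, {2, 4}, {2, 5}, {3, 3}, {3, 4}, {3, 5}
        };
        double[][] pZgivenU = fit(clicks, 4, 6, 2, 50, 42L);
        for (int u = 0; u < pZgivenU.length; u++) {
            System.out.println("user " + u + " -> " + Arrays.toString(pZgivenU[u]));
        }
    }
}
```

Printing p(z|u) after every iteration instead of only at the end is an easy
way to watch the values converge on such a small dataset.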

I also have a fair idea of how to map-reduce it. Once the prototype is
accepted, along with suggestions for features/changes that would be desirable
in the map-reduce implementation, it shouldn't take me long to contribute the
distributed version.


> Simple prototype for Expectation Maximization (EM)
> --------------------------------------------------
>
>                 Key: MAHOUT-4
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-4
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Ankur
>         Attachments: Mahout_EM.patch
>
>
> Create a simple prototype implementing Expectation Maximization (EM) that
> demonstrates the algorithm's functionality given a set of (user, click-url)
> data.
> The prototype should be functionally complete and should serve as a basis for 
> the Map-Reduce version of the EM algorithm.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
