[
https://issues.apache.org/jira/browse/MAHOUT-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576867#action_12576867
]
Ankur commented on MAHOUT-4:
----------------------------
Hi Isabel,
The algorithm can certainly be ported to a MapReduce setting on
Hadoop. In fact, it has already been map-reduced, as described in the
Google News personalization paper (please see the Javadoc for details).
I wrote the non-distributed version of the algorithm to help myself understand,
visualize and see the EM algorithm in action, starting with a very small
dataset. The iterative logic and small dataset make it particularly easy to see
how the probabilities of users and items belonging to a cluster converge for
users sharing a large number of common items.
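To make the iteration concrete, here is a minimal sketch of one way such an EM
loop over (user, item) clicks could look. It is not the code in
Mahout_EM.patch. It assumes a PLSI-style model with p(z|u) for user-to-cluster
membership and p(s|z) for item-given-cluster, and the class name
EmClickSketch, its fields and the toy data are all hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: EM for a PLSI-style clustering of (user, item) clicks. */
class EmClickSketch {
    final int users, items, clusters;
    final double[][] pClusterGivenUser; // p(z|u), each row sums to 1
    final double[][] pItemGivenCluster; // p(s|z), each row sums to 1

    EmClickSketch(int users, int items, int clusters, long seed) {
        this.users = users; this.items = items; this.clusters = clusters;
        Random rnd = new Random(seed);
        pClusterGivenUser = randomRows(users, clusters, rnd);
        pItemGivenCluster = randomRows(clusters, items, rnd);
    }

    /** Random strictly positive rows, each normalized to sum to 1. */
    static double[][] randomRows(int rows, int cols, Random rnd) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            double sum = 0;
            for (int j = 0; j < cols; j++) { row[j] = 0.1 + rnd.nextDouble(); sum += row[j]; }
            for (int j = 0; j < cols; j++) row[j] /= sum;
        }
        return m;
    }

    /** One EM iteration over clicks[i] = {user, item}. */
    void iterate(int[][] clicks) {
        double[][] userMass = new double[users][clusters];
        double[][] itemMass = new double[clusters][items];
        for (int[] c : clicks) {
            int u = c[0], s = c[1];
            // E-step: q(z|u,s) is proportional to p(z|u) * p(s|z)
            double[] q = new double[clusters];
            double norm = 0;
            for (int z = 0; z < clusters; z++) {
                q[z] = pClusterGivenUser[u][z] * pItemGivenCluster[z][s];
                norm += q[z];
            }
            for (int z = 0; z < clusters; z++) {
                double r = norm > 0 ? q[z] / norm : 1.0 / clusters;
                userMass[u][z] += r;
                itemMass[z][s] += r;
            }
        }
        // M-step: renormalize the accumulated responsibilities
        for (int u = 0; u < users; u++) normalizeInto(userMass[u], pClusterGivenUser[u]);
        for (int z = 0; z < clusters; z++) normalizeInto(itemMass[z], pItemGivenCluster[z]);
    }

    static void normalizeInto(double[] mass, double[] target) {
        double sum = 0;
        for (double m : mass) sum += m;
        if (sum == 0) return; // no observations: keep the previous estimate
        for (int j = 0; j < mass.length; j++) target[j] = mass[j] / sum;
    }

    /** Log-likelihood of the clicks under p(s|u) = sum over z of p(z|u) * p(s|z). */
    double logLikelihood(int[][] clicks) {
        double ll = 0;
        for (int[] c : clicks) {
            double p = 0;
            for (int z = 0; z < clusters; z++) {
                p += pClusterGivenUser[c[0]][z] * pItemGivenCluster[z][c[1]];
            }
            ll += Math.log(p);
        }
        return ll;
    }

    public static void main(String[] args) {
        // Toy data (made up): users 0 and 1 click items 0 and 1; users 2 and 3 click items 2 and 3.
        int[][] clicks = {{0, 0}, {0, 1}, {1, 0}, {1, 1}, {2, 2}, {2, 3}, {3, 2}, {3, 3}};
        EmClickSketch em = new EmClickSketch(4, 4, 2, 42L);
        for (int i = 0; i < 30; i++) em.iterate(clicks);
        for (int u = 0; u < em.users; u++) {
            System.out.println("user " + u + " p(z|u) = " + Arrays.toString(em.pClusterGivenUser[u]));
        }
    }
}
```

Printing p(z|u) after every iteration instead of only at the end shows how the
membership values move as the loop runs. The per-click E-step and the two
renormalizing sums in the M-step are also the pieces one would expect to split
across map and reduce phases, though that mapping is only a guess here.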
I also have a fair idea of how to map-reduce it. Once the prototype is accepted,
along with suggestions for features/changes that would be desirable in the
MapReduce implementation, it shouldn't take me long to contribute the
distributed version.
> Simple prototype for Expectation Maximization (EM)
> --------------------------------------------------
>
> Key: MAHOUT-4
> URL: https://issues.apache.org/jira/browse/MAHOUT-4
> Project: Mahout
> Issue Type: New Feature
> Reporter: Ankur
> Attachments: Mahout_EM.patch
>
>
> Create a simple prototype implementing Expectation Maximization (EM) that
> demonstrates the algorithm's functionality given a set of (user, click-url)
> data.
> The prototype should be functionally complete and should serve as a basis for
> the Map-Reduce version of the EM algorithm.