[ 
https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165049#comment-13165049
 ] 

issei yoshida commented on MAHOUT-918:
--------------------------------------

I posted the code to Review Board and attached a design document.
https://reviews.apache.org/r/3072/

>  Is map-reduce an appropriate approach here for model averaging?
MPI or other frameworks may produce better results, but the important thing is 
that a MapReduce implementation is easy for Hadoop users to use.
Some iterative algorithms already implemented in Mahout (K-means and other 
clustering algorithms) may not be ideally suited to MapReduce either, but that 
is not the point.

The papers listed in the issue description show that Iterative Parameter 
Mixture is the best way to distribute SGD in MapReduce.
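
To make the idea concrete, here is a minimal single-process sketch of iterative parameter mixing for logistic regression. It is not the code in the patch: the function names, the toy data generator, the uniform averaging of the shard models, and the fixed learning rate are all assumptions made for illustration. Each pass over the shards stands in for one MapReduce job, where every mapper trains on its own shard starting from the current mixed weights and the reducer averages the results.

```python
import math
import random


def sgd_epoch(weights, shard, learning_rate):
    """Run one pass of logistic-regression SGD over one shard.

    Starts from a copy of `weights`, so the caller's vector is not modified.
    """
    w = list(weights)
    for features, label in shard:
        z = sum(wi * xi for wi, xi in zip(w, features))
        z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
        prediction = 1.0 / (1.0 + math.exp(-z))
        error = label - prediction
        for i, xi in enumerate(features):
            w[i] += learning_rate * error * xi
    return w


def iterative_parameter_mixture(shards, dim, epochs=20, learning_rate=0.5):
    """Alternate independent per-shard training with weight averaging."""
    weights = [0.0] * dim
    for _ in range(epochs):
        # "Map" step: every shard trains from the same mixed weights.
        local_models = [sgd_epoch(weights, shard, learning_rate) for shard in shards]
        # "Reduce" step: uniform average of the shard models (an assumption;
        # other mixture weights are possible).
        weights = [sum(column) / len(local_models) for column in zip(*local_models)]
    return weights


def predict(weights, features):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, features)) > 0 else 0


def make_toy_shards(num_shards=4, per_shard=50, seed=0):
    """Hypothetical toy data: label is 1 when x0 + x1 > 0; feature 0 is a bias term."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_shards):
        shard = []
        for _ in range(per_shard):
            x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
            shard.append(([1.0, x0, x1], 1 if x0 + x1 > 0 else 0))
        shards.append(shard)
    return shards


if __name__ == "__main__":
    shards = make_toy_shards()
    model = iterative_parameter_mixture(shards, dim=3)
    examples = [example for shard in shards for example in shard]
    correct = sum(predict(model, f) == y for f, y in examples)
    print("training accuracy:", correct / len(examples))
```

The point of feeding the averaged weights back into the next pass is that the shard models stay close to each other; averaging only once at the end is the non-iterative variant.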

> How do you plan to deal with randomization of data order?
It may be possible to randomize the data order by customizing the InputFormat.
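
As a rough illustration of that idea, the sketch below shows the logic such a reader could apply to one input split. It is plain Python and not the Hadoop API: `shuffled_records` and its parameters are hypothetical names, and it assumes the records of a split fit in memory. Deriving the random seed from the epoch and the split id keeps reruns of the same task reproducible while still changing the order between epochs.

```python
import random


def shuffled_records(split_records, seed, epoch, split_id):
    """Yield the records of one split in a reproducible pseudo-random order."""
    # A string seed is accepted by random.Random and is deterministic.
    rng = random.Random(f"{seed}-{epoch}-{split_id}")
    order = list(range(len(split_records)))
    rng.shuffle(order)
    for index in order:
        yield split_records[index]


if __name__ == "__main__":
    records = ["r0", "r1", "r2", "r3", "r4"]
    for epoch in range(2):
        print(epoch, list(shuffled_records(records, seed=42, epoch=epoch, split_id=0)))
```

Note that this only shuffles within a split; mixing records across splits would need a separate shuffling job.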
                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD-based classifiers (Logistic Regression, Adaptive Logistic 
> Regression and Passive-Aggressive) using MapReduce.
> They are implemented using the Iterative Parameter Mixture algorithm, which 
> is described in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf

