[jira] [Commented] (MAHOUT-1464) RowSimilarityJob on Spark

Pat Ferrel (JIRA) Mon, 17 Mar 2014 09:54:24 -0700

    [ https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938020#comment-13938020 ]


Pat Ferrel commented on MAHOUT-1464:
------------------------------------

So am I, so no problem.

My plan is to update the Solr-recommender contrib to use Spark for calculating 
the indicator/similarity matrix. For the one-action recommender this only needs 
RSJ; for the two-action recommender it means matrix transpose and multiply, or 
XRSJ. The PreparePreferenceMatrixJob and its analog, PrepareActionMatricesJob, 
will stay in plain old Hadoop for now. I'm not sure there is much benefit in 
moving a dataflow process to Spark.
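
To make the "transpose and multiply" option concrete, here is a minimal sketch 
in plain Python on toy data. It is not Mahout or Spark code. The matrices A and 
B, the helper names, and the orientation (users as rows, items as columns) are 
all assumptions for illustration. It only computes raw cooccurrence counts; any 
similarity weighting RSJ applies on top of that is left out.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def multiply(x, y):
    """Return the matrix product x * y for lists of rows."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]


# Hypothetical toy data: rows are users, columns are items.
# A holds the primary action, B a secondary action on 2 other items.
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
B = [[1, 0],
     [1, 1],
     [0, 1]]

# One action: item-item cooccurrence counts, A' * A.
cooccurrence = multiply(transpose(A), A)

# Two actions: cross-cooccurrence counts, A' * B.
cross_cooccurrence = multiply(transpose(A), B)

print(cooccurrence)        # [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
print(cross_cooccurrence)  # [[1, 1], [2, 1], [1, 2]]
```

Row i of the cross result relates item i of A to each item of B, which is the 
shape of output a two-input RSJ would also have to produce.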

> RowSimilarityJob on Spark
> -------------------------
>
>                 Key: MAHOUT-1464
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.9
>         Environment: hadoop, spark
>            Reporter: Pat Ferrel
>              Labels: performance
>             Fix For: 0.9
>
>
> Create a version of RowSimilarityJob that runs on Spark. Ssc has a prototype 
> here: https://gist.github.com/sscdotopen/8314254. This should be compatible 
> with the Mahout Spark DRM DSL so that a DRM can be used as input.
> Ideally this would extend to cover MAHOUT-1422, which is a feature request for 
> RSJ on two inputs to calculate the similarity of the rows of one DRM with 
> those of another. This cross-similarity has several applications, including 
> cross-action recommendations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
