[ 
https://issues.apache.org/jira/browse/MAHOUT-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977534#comment-13977534
 ] 

Pat Ferrel commented on MAHOUT-1518:
------------------------------------

I think we need to remember that we are talking about the _current_ jobs done 
with Mahout and the data they _currently_ operate on. The example for 
MAHOUT-1518 shows a nice way to make it easy for users to put data into Mahout, 
run the Spark equivalent of RowSimilarityJob on the data, and get it back out 
with IDs they recognize. None of the above has any relevance to that, nor does 
it apply to most of the uses of this type of thing. Those steps are all 
existing Mahout functionality. The only new thing is handling external IDs for 
the user as a convenience, and a big one.
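
To make the external ID handling concrete, here is a minimal sketch of the 
idea in plain Scala. It is not taken from the attached patch and uses no 
Mahout or Spark API. The names ExternalIds, IdDictionary, Interaction and 
parseLine are hypothetical, and it assumes the comma-separated 
_timestamp, userId, itemId, action_ layout from the issue description, with 
optional double quotes around the action.

```scala
import scala.collection.mutable

// Hypothetical illustration only; none of these names exist in Mahout.
object ExternalIds {

  // Assigns dense Int indices to external string IDs in order of first
  // appearance, and maps an index back to the original string.
  final class IdDictionary {
    private val toIndex = mutable.HashMap.empty[String, Int]
    private val toExternal = mutable.ArrayBuffer.empty[String]

    def indexOf(externalId: String): Int =
      toIndex.getOrElseUpdate(externalId, {
        toExternal += externalId
        toExternal.size - 1
      })

    def externalId(index: Int): String = toExternal(index)

    def size: Int = toExternal.size
  }

  // One parsed input line, with user and item already translated to indices.
  final case class Interaction(user: Int, item: Int, action: String)

  // Parses "timestamp, userId, itemId, action"; the timestamp is ignored here.
  def parseLine(line: String, users: IdDictionary, items: IdDictionary): Interaction = {
    val fields = line.split(",").map(_.trim)
    require(fields.length == 4, s"expected 4 fields, got ${fields.length}")
    val action = fields(3).replace("\"", "")
    Interaction(users.indexOf(fields(1)), items.indexOf(fields(2)), action)
  }
}
```

A job would work on the Int indices internally, and the same dictionaries 
would translate the resulting row and column indices back to the user's own 
IDs on output via externalId.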

It would help me if you could narrow your objections to nomenclature or class 
inheritance. What specifically is wrong with this example?

Maybe we'll flesh it out a little more and you can review.

> Preprocessing for collaborative filtering with the Scala DSL
> ------------------------------------------------------------
>
>                 Key: MAHOUT-1518
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1518
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>            Assignee: Sebastian Schelter
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1518.patch
>
>
> The aim here is to provide some easy-to-use machinery to enable the usage of 
> the new Cooccurrence Analysis code from MAHOUT-1464 with datasets represented 
> as follows in a CSV file with the schema _timestamp, userId, itemId, action_, 
> e.g.
> {code}
> timestamp1, userIdString1, itemIdString1, "view"
> timestamp2, userIdString2, itemIdString1, "like"
> {code}



