[
https://issues.apache.org/jira/browse/MAHOUT-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977534#comment-13977534
]
Pat Ferrel commented on MAHOUT-1518:
------------------------------------
I think we need to remember that we are talking about the _current_ jobs in
Mahout and the data they _currently_ operate on. The example for
MAHOUT-1518 shows a nice way to make it easy for users to put data into Mahout,
run the Spark equivalent of RowSimilarityJob on it, and get the results back
out with IDs they recognize. None of the above is relevant to that, nor does
it apply to most uses of this kind of job. The steps in the example are all
existing Mahout functionality. The only new thing is handling external IDs for
the user as a convenience, and a big one.
It would help me if you could narrow your objections to nomenclature or class
inheritance. What specifically is wrong with this example?
Maybe we'll flesh it out a little more and you can review.
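For reference, the external-ID handling under discussion amounts to roughly the following. This is only a language-neutral sketch in plain Python, not the code in MAHOUT-1518.patch and not the Mahout or Spark API. The function names, the sample rows, and the simple co-occurrence count standing in for RowSimilarityJob are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows in the schema: timestamp, userId, itemId, action
SAMPLE = '''\
2014-04-22T10:00, u1, iphone, "view"
2014-04-22T10:01, u1, ipad, "view"
2014-04-22T10:02, u2, iphone, "view"
2014-04-22T10:03, u2, ipad, "view"
2014-04-22T10:04, u2, nexus, "like"
'''


def read_interactions(text, action):
    """Return (userId, itemId) pairs for the rows whose action matches."""
    pairs = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        _timestamp, user, item, act = row
        if act == action:
            pairs.append((user, item))
    return pairs


def build_index(ids):
    """Map each distinct external ID to a dense integer, in first-seen order."""
    index = {}
    for ext_id in ids:
        index.setdefault(ext_id, len(index))
    return index


def cooccurrence(pairs, user_index, item_index):
    """Count, per pair of internal item indices, the users who touched both.

    A deliberately simple stand-in for the real similarity job.
    """
    items_by_user = defaultdict(set)
    for user, item in pairs:
        items_by_user[user_index[user]].add(item_index[item])
    counts = defaultdict(int)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return dict(counts)


def similar_items(text, action):
    """External IDs in, external IDs out; integers are used only internally."""
    pairs = read_interactions(text, action)
    user_index = build_index(user for user, _ in pairs)
    item_index = build_index(item for _, item in pairs)
    counts = cooccurrence(pairs, user_index, item_index)
    # Translate internal indices back to the IDs the user recognizes.
    external = {idx: ext for ext, idx in item_index.items()}
    return {(external[a], external[b]): n for (a, b), n in counts.items()}


if __name__ == "__main__":
    print(similar_items(SAMPLE, "view"))  # {('iphone', 'ipad'): 2}
```

The point of the sketch is that the dictionaries are the only new piece: the computation in the middle works on integer indices, and the user never sees them.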
> Preprocessing for collaborative filtering with the Scala DSL
> ------------------------------------------------------------
>
> Key: MAHOUT-1518
> URL: https://issues.apache.org/jira/browse/MAHOUT-1518
> Project: Mahout
> Issue Type: New Feature
> Components: Collaborative Filtering
> Reporter: Sebastian Schelter
> Assignee: Sebastian Schelter
> Fix For: 1.0
>
> Attachments: MAHOUT-1518.patch
>
>
> The aim here is to provide some easy-to-use machinery to enable the usage of
> the new Cooccurrence Analysis code from MAHOUT-1464 with datasets represented
> as follows in a CSV file with the schema _timestamp, userId, itemId, action_,
> e.g.
> {code}
> timestamp1, userIdString1, itemIdString1, "view"
> timestamp2, userIdString2, itemIdString1, "like"
> {code}