[
https://issues.apache.org/jira/browse/MAHOUT-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977314#comment-13977314
]
Dmitriy Lyubimov edited comment on MAHOUT-1518 at 4/22/14 8:10 PM:
-------------------------------------------------------------------
bq. Since there is drm inside the object it's a matrix in every sense that a
drm is, right?
No. That's actually what I am trying to avoid, i.e. introducing derived types
and saying they somehow fit as an algebraic operand throughout. It is not entirely
impossible, but it would complicate things a lot and make it fairly hard to
express matrix transformations in a simple set of concepts. As we know, R
didn't even attempt to do that, so we don't want to go down that rat hole either.
More likely we may create specific constructs such as dictionaries to help
individual pipelines at the "in" and "out" ends.
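To make the dictionary idea concrete, here is a minimal sketch in plain Python. It is not Mahout code: `IdDictionary`, `index_of` and `id_of` are hypothetical names invented for illustration. The assumption is that a pipeline translates external string IDs to dense integer indices on the way in, and back to strings on the way out, so the matrix in between stays a plain algebraic operand.

```python
class IdDictionary:
    """Hypothetical helper (not a Mahout class): maps external string IDs
    to dense integer indices and back."""

    def __init__(self):
        self._to_index = {}   # external ID -> integer index
        self._to_id = []      # integer index -> external ID

    def index_of(self, external_id):
        # Assign the next free index the first time an ID is seen.
        if external_id not in self._to_index:
            self._to_index[external_id] = len(self._to_id)
            self._to_id.append(external_id)
        return self._to_index[external_id]

    def id_of(self, index):
        return self._to_id[index]

    def __len__(self):
        return len(self._to_id)


# "In" end: translate (user, item) string pairs into matrix coordinates.
users = IdDictionary()
items = IdDictionary()
raw = [("userIdString1", "itemIdString1"),
       ("userIdString2", "itemIdString1")]
cells = [(users.index_of(u), items.index_of(i)) for u, i in raw]

# "Out" end: translate a row index back into the original user ID.
first_user = users.id_of(cells[0][0])
```

With this split the dictionaries live only at the edges of the pipeline, and nothing in the algebra needs to know that rows once had string names.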
Don't forget, the algebraic optimizer goes by a clear set of math-only identities
when doing rewrites, such as (A'B) = (B'A)'. Throwing some business-level
rules in here is likely to require further addenda to the clean mathematical rules and
identities. And of course we don't want to write our own book on (Linear
Algebra + Mahout) laws.
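As a quick sanity check of the identity (A'B) = (B'A)', here is a small sketch using plain Python lists instead of the Mahout DSL. The helpers `transpose` and `matmul` and the sample matrices are assumptions made for illustration only.

```python
def transpose(m):
    # Rows become columns.
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # Naive dense product; zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Illustrative operands: A is 3x2, B is 3x3, so A'B is 2x3.
a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8, 9], [10, 11, 12], [13, 14, 15]]

lhs = matmul(transpose(a), b)             # A'B
rhs = transpose(matmul(transpose(b), a))  # (B'A)'

identity_holds = lhs == rhs
```

Both sides are the same matrix, so an optimizer is free to choose whichever form is cheaper to compute. A rule of this kind depends only on the operands being matrices, which is why derived types carrying extra business-level semantics would complicate it.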
> Preprocessing for collaborative filtering with the Scala DSL
> ------------------------------------------------------------
>
> Key: MAHOUT-1518
> URL: https://issues.apache.org/jira/browse/MAHOUT-1518
> Project: Mahout
> Issue Type: New Feature
> Components: Collaborative Filtering
> Reporter: Sebastian Schelter
> Assignee: Sebastian Schelter
> Fix For: 1.0
>
> Attachments: MAHOUT-1518.patch
>
>
> The aim here is to provide some easy-to-use machinery to enable the usage of
> the new Cooccurrence Analysis code from MAHOUT-1464 with datasets represented
> as follows in a CSV file with the schema _timestamp, userId, itemId, action_,
> e.g.
> {code}
> timestamp1, userIdString1, itemIdString1, "view"
> timestamp2, userIdString2, itemIdString1, "like"
> {code}