Yes, right. This is built in the image of an R data frame, and we will
need something similar. We will need somewhat richer operations than
labeling, though (e.g. a notion of missing values, vectorization with
standardization, etc.).
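
To make the idea concrete, here is a minimal sketch in plain Python rather
than the Scala DSL. Everything in it is hypothetical and is not part of
Mahout or the attached patch: the LabeledFrame class, its add / vectorize /
relabel methods, and the choice of mean imputation for missing values are
assumptions made purely for illustration.

```python
import math
from statistics import mean, pstdev


class LabeledFrame:
    """Hypothetical frame: rows keyed by arbitrary labels, named columns."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.row_index = {}  # original key -> consecutive integer row id
        self.row_keys = []   # consecutive integer row id -> original key
        self.rows = []       # row id -> list of floats (NaN marks missing)

    def add(self, key, values):
        """Store one record; a value of None marks a missing cell."""
        if key not in self.row_index:
            self.row_index[key] = len(self.row_keys)
            self.row_keys.append(key)
            self.rows.append([math.nan] * len(self.columns))
        row = self.rows[self.row_index[key]]
        for col, value in values.items():
            cell = math.nan if value is None else float(value)
            row[self.columns.index(col)] = cell

    def vectorize(self):
        """Return a dense row-major matrix, standardized per column.

        Mean and standard deviation are computed from observed cells only.
        Missing cells are replaced by the column mean, so they become 0.0
        after centering. Constant columns are mapped to 0.0.
        """
        matrix = [list(row) for row in self.rows]
        for j in range(len(self.columns)):
            observed = [r[j] for r in matrix if not math.isnan(r[j])]
            mu = mean(observed) if observed else 0.0
            sigma = pstdev(observed) if len(observed) > 1 else 0.0
            for r in matrix:
                x = mu if math.isnan(r[j]) else r[j]
                r[j] = (x - mu) / sigma if sigma > 0 else 0.0
        return matrix

    def relabel(self, row_values):
        """Map one result per row back to the original row keys."""
        return dict(zip(self.row_keys, row_values))


# Usage: string keys in arbitrary order, one missing value.
frame = LabeledFrame(["views", "likes"])
frame.add("user-b", {"views": 4, "likes": 1})
frame.add("user-a", {"views": 2, "likes": None})
frame.add("user-c", {"views": 6, "likes": 3})

matrix = frame.vectorize()
# Stand-in for "run an algorithm": the Euclidean norm of each row.
row_norms = [math.sqrt(sum(x * x for x in row)) for row in matrix]
by_user = frame.relabel(row_norms)
```

The point of the sketch is only the shape of the API: keys are assigned
consecutive row ids on load, the numeric matrix is extracted separately,
and results are mapped back through the same dictionary.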


On Tue, Apr 22, 2014 at 12:10 PM, Sebastian Schelter (JIRA) <[email protected]> wrote:

>
>     [https://issues.apache.org/jira/browse/MAHOUT-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977222#comment-13977222]
>
> Sebastian Schelter commented on MAHOUT-1518:
> --------------------------------------------
>
> This is more or less a quick shot to give Pat what he wanted to test the
> cooccurrence code in the new DSL. This is not and was not intended as a
> general solution.
>
> But I hope that it shows what we need in terms of usability: We need a
> data structure that easily allows users to load their data, even if its
> keys are strings or non-consecutive numeric ids. From that, users need
> to be able to extract a DRM, run an algorithm, and map the result back
> to their original keys.
>
> This concept is also found in the MLTable and MLNumericTable proposed for
> MLI in http://arxiv-web3.library.cornell.edu/pdf/1310.5426v2.pdf
>
> > Preprocessing for collaborative filtering with the Scala DSL
> > ------------------------------------------------------------
> >
> >                 Key: MAHOUT-1518
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1518
> >             Project: Mahout
> >          Issue Type: New Feature
> >          Components: Collaborative Filtering
> >            Reporter: Sebastian Schelter
> >            Assignee: Sebastian Schelter
> >             Fix For: 1.0
> >
> >         Attachments: MAHOUT-1518.patch
> >
> >
> > The aim here is to provide some easy-to-use machinery to enable the
> > usage of the new Cooccurrence Analysis code from MAHOUT-1464 with datasets
> > represented as follows in a CSV file with the schema _timestamp, userId,
> > itemId, action_, e.g.
> > {code}
> > timestamp1, userIdString1, itemIdString1, "view"
> > timestamp2, userIdString2, itemIdString1, "like"
> > {code}
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
>
