[ https://issues.apache.org/jira/browse/SPARK-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964649#comment-13964649 ]
Sean Owen commented on SPARK-1357:
----------------------------------
Yeah, I think it's reasonable to say that the core ALS API is only in terms of
numeric IDs, and to leave any higher-level translation to the caller. Longs give
that much more space to hash into.
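
To make the "translation left to the caller" idea concrete, here is a minimal
sketch of hashing arbitrary string IDs into a signed 64-bit range, the same
range as a JVM Long. It is not part of MLlib: the helper names
string_id_to_long and build_id_mapping are hypothetical, and SHA-256 is just
one assumed choice of hash function.

```python
import hashlib
import struct


def string_id_to_long(raw_id: str) -> int:
    """Hypothetical helper: map a string ID to a signed 64-bit integer."""
    digest = hashlib.sha256(raw_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a big-endian signed 64-bit value,
    # which is the range a JVM Long can hold.
    return struct.unpack(">q", digest[:8])[0]


def build_id_mapping(raw_ids):
    """Hypothetical helper: build forward and reverse ID lookups.

    Raises ValueError if two distinct strings hash to the same value,
    so a collision is never silently merged into one ID.
    """
    forward, reverse = {}, {}
    for raw_id in raw_ids:
        key = string_id_to_long(raw_id)
        existing = reverse.get(key)
        if existing is not None and existing != raw_id:
            raise ValueError(f"hash collision: {existing!r} vs {raw_id!r}")
        forward[raw_id] = key
        reverse[key] = raw_id
    return forward, reverse
```

The caller would keep the reverse mapping around to translate numeric IDs in
the model's output back to the original strings. The wider the numeric type,
the less likely two distinct strings are to land on the same ID.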
The memory "cost" of something like a String is just a reference, so roughly
the same as a Double anyway. I think the more important question is whether
Double is too hacky, API-wise, as a representation of fundamentally non-numeric
data. That's up for debate, but the question here is more about reserving the
right to change.
I'll submit a PR that marks the items I mention as experimental, for
consideration. See if it seems reasonable.
> [MLLIB] Annotate developer and experimental API's
> -------------------------------------------------
>
> Key: SPARK-1357
> URL: https://issues.apache.org/jira/browse/SPARK-1357
> Project: Spark
> Issue Type: Sub-task
> Components: MLlib
> Affects Versions: 1.0.0
> Reporter: Patrick Wendell
> Assignee: Xiangrui Meng
> Fix For: 1.0.0
>
>