[
https://issues.apache.org/jira/browse/FLINK-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589604#comment-14589604
]
Till Rohrmann commented on FLINK-2211:
--------------------------------------
If you have more than {{2^31 - 1}} items, then you clearly need {{Long}} IDs
for them. However, a single item block cannot contain more than {{2^31 - 1}}
item vectors, because they are stored in an array. By increasing the number of
item blocks, one can decrease the number of items per block so that no block
contains more than {{2^31 - 1}} items. I think this is a fair assumption,
since you usually are not able to keep an array of {{#itemsPerBlock *
#latentFactors * sizeOfDouble}} bytes with {{#itemsPerBlock >> 2^31 - 1}} in
memory anyway. Furthermore, it's safe to assume that {{#latentFactors <
2^31 - 1}} IMO.
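To make the arithmetic concrete, here is a rough sketch in plain Scala. It is not Flink code: {{BlockSizing}}, {{minBlocks}}, {{blockBytes}} and the item count of 5 * 10^9 are hypothetical names and numbers chosen only for illustration, and {{sizeOfDouble}} is assumed to be 8 bytes.

```scala
object BlockSizing {
  // Upper bound on the length of a JVM array: 2^31 - 1 elements.
  val MaxArrayLength: Long = Int.MaxValue.toLong

  /** Smallest number of blocks such that no block holds more than
    * maxPerBlock items (ceiling division, written to avoid Long overflow). */
  def minBlocks(numItems: Long, maxPerBlock: Long = MaxArrayLength): Long = {
    require(numItems >= 0 && maxPerBlock > 0)
    if (numItems == 0) 0L else (numItems - 1) / maxPerBlock + 1
  }

  /** Bytes needed for the factor values of one block:
    * itemsPerBlock * latentFactors * sizeOfDouble (assumed 8 bytes). */
  def blockBytes(itemsPerBlock: Long, latentFactors: Int): BigInt =
    BigInt(itemsPerBlock) * latentFactors * 8

  def main(args: Array[String]): Unit = {
    // Hypothetical data set: 5 * 10^9 items, so the IDs must be Long.
    val numItems = 5000000000L
    println(minBlocks(numItems))            // 3
    // A completely full block with only 10 latent factors:
    println(blockBytes(MaxArrayLength, 10)) // 171798691760 (about 160 GiB)
  }
}
```

So even with only 10 latent factors, a block that hits the array limit would need about 160 GiB for its factor values alone, which is why splitting the items over more blocks is the practical way out.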
> Generalize ALS API
> ------------------
>
> Key: FLINK-2211
> URL: https://issues.apache.org/jira/browse/FLINK-2211
> Project: Flink
> Issue Type: Improvement
> Components: Machine Learning Library
> Affects Versions: 0.9
> Reporter: Ronny Bräunlich
> Priority: Minor
>
> predict() and fit() currently require DataSet[(Int, Int)] and
> DataSet[(Int, Int, Double)], respectively.
> This should be changed to Long to accept more values or to something more
> general.
> See also
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Apache-Flink-0-9-ALS-API-td6424.html
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)