[jira] [Commented] (FLINK-2211) Generalize ALS API

Till Rohrmann (JIRA) Wed, 17 Jun 2015 03:36:20 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589604#comment-14589604
 ]


Till Rohrmann commented on FLINK-2211:
--------------------------------------

If you have more than {{2^31 - 1}} items, then you clearly need {{Long}} IDs for 
them. However, a single item block cannot contain more than {{2^31 - 1}} item 
vectors, because they are stored in an array. By increasing the number of item 
blocks, one can decrease the number of items per block so that no block 
exceeds this limit. I think this is a fair restriction, since you are usually 
not able to keep an array of {{#itemsPerBlock * #latentFactors * 
sizeOfDouble}} bytes with more than {{2^31 - 1}} items per block in memory 
anyway. Furthermore, IMO it's safe to assume that {{#latentFactors < 2^31 - 1}}.
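
To make the arithmetic concrete, here is a rough sketch. It is not Flink's ALS code: the object {{AlsBlockSizing}} and all of its members are hypothetical names, and the modulo assignment of IDs to blocks is an assumption made only for illustration. It computes the smallest number of item blocks that keeps every block within the array limit, and the memory a block's factor vectors would need.

```scala
object AlsBlockSizing {
  // JVM arrays are indexed by Int, so one block holds at most 2^31 - 1 vectors.
  val MaxItemsPerBlock: Long = Int.MaxValue.toLong
  val SizeOfDouble: Long = 8L

  /** Smallest block count such that no block exceeds maxPerBlock items. */
  def minBlocks(numItems: Long, maxPerBlock: Long = MaxItemsPerBlock): Long = {
    require(numItems >= 0 && maxPerBlock > 0, "sizes must be non-negative")
    // Ceiling division written so that it cannot overflow for large numItems.
    if (numItems == 0) 0L else (numItems - 1) / maxPerBlock + 1
  }

  /** Bytes for the factor vectors of one block:
    * #itemsPerBlock * #latentFactors * sizeOfDouble. */
  def blockBytes(itemsPerBlock: Long, latentFactors: Int): BigInt =
    BigInt(itemsPerBlock) * latentFactors * SizeOfDouble

  /** Assumed partitioning: a Long item ID is mapped to a block by modulo. */
  def blockOf(itemId: Long, numBlocks: Int): Int =
    java.lang.Math.floorMod(itemId, numBlocks.toLong).toInt
}
```

For example, {{AlsBlockSizing.blockBytes(Int.MaxValue.toLong, 10)}} is 171798691760 bytes, i.e. roughly 160 GiB for a single full block with only 10 latent factors, which is why the per-block limit is rarely the binding constraint in practice.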

> Generalize ALS API
> ------------------
>
>                 Key: FLINK-2211
>                 URL: https://issues.apache.org/jira/browse/FLINK-2211
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>    Affects Versions: 0.9
>            Reporter: Ronny Bräunlich
>            Priority: Minor
>
> At the moment, predict() and fit() require DataSet[(Int, Int)] and 
> DataSet[(Int, Int, Double)], respectively.
> This should be changed to Long to accept more values, or to something more 
> general.
> See also 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Apache-Flink-0-9-ALS-API-td6424.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
