[ https://issues.apache.org/jira/browse/SPARK-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-1281:
-----------------------------------

    Assignee: Tor Myklebust

> Partitioning in ALS
> -------------------
>
>                 Key: SPARK-1281
>                 URL: https://issues.apache.org/jira/browse/SPARK-1281
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Tor Myklebust
>            Priority: Minor
>             Fix For: 1.0.0
>
>
> The current implementation of ALS has some minor issues with
> partitioning:
> 1. A mod-based partitioner is used for mapping users/products to blocks. This
> might produce unbalanced blocks if the IDs encode information. For example,
> the last digit may indicate the user/product type. This can be fixed by
> hashing the IDs.
> 2. HashPartitioner is used for the initial partitioning. This is the same as
> the mod-based partitioner when the key is a positive integer, but relying on
> that equivalence is error-prone.
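
To illustrate the two points above, here is a minimal Python sketch (not the
ALS code itself). The function names, the block count of 10, and the synthetic
IDs are all assumptions made for illustration. The mixing function is the
32-bit MurmurHash3 finalizer, chosen here only as one example of a hash that
scrambles low-order digits; the issue does not say which hash should be used.

```python
from collections import Counter


def mod_block(item_id: int, num_blocks: int) -> int:
    """Mod-based mapping of an ID to a block (hypothetical helper)."""
    return item_id % num_blocks


def mix32(x: int) -> int:
    """32-bit MurmurHash3 finalizer, used here as an example mixing function."""
    x &= 0xFFFFFFFF
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & 0xFFFFFFFF
    x ^= x >> 13
    x = (x * 0xC2B2AE35) & 0xFFFFFFFF
    x ^= x >> 16
    return x


def hashed_block(item_id: int, num_blocks: int) -> int:
    """Map an ID to a block after mixing its bits (hypothetical helper)."""
    return mix32(item_id) % num_blocks


def identity_hash_block(item_id: int, num_blocks: int) -> int:
    """Models a hash partitioner whose hash of an integer key is the integer
    itself, as with Java's Integer.hashCode. For positive keys this is the
    same mapping as mod_block."""
    return item_id % num_blocks


# Synthetic IDs whose last digit encodes a "type" (always 3 here).
ids = [10 * i + 3 for i in range(1000)]
num_blocks = 10

mod_counts = Counter(mod_block(i, num_blocks) for i in ids)
hash_counts = Counter(hashed_block(i, num_blocks) for i in ids)

print("mod-based blocks used:", sorted(mod_counts))   # every ID lands in block 3
print("hashed blocks used:   ", sorted(hash_counts))
```

With these synthetic IDs the mod-based mapping sends every ID to a single
block, while mixing the bits first spreads them over several blocks. The
identity-hash variant shows why point 2 is easy to overlook: for positive
integer keys it is indistinguishable from the mod-based mapping.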



--
This message was sent by Atlassian JIRA
(v6.2#6252)
