[jira] [Closed] (SPARK-1672) Support separate partitioners (and numbers of partitions) for users and products

Xiangrui Meng (JIRA) Wed, 11 Jun 2014 18:18:19 -0700

     [ https://issues.apache.org/jira/browse/SPARK-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Xiangrui Meng closed SPARK-1672.
--------------------------------

    Resolution: Implemented
      Assignee: Tor Myklebust

PR: https://github.com/apache/spark/pull/1014

> Support separate partitioners (and numbers of partitions) for users and 
> products
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-1672
>                 URL: https://issues.apache.org/jira/browse/SPARK-1672
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Tor Myklebust
>            Assignee: Tor Myklebust
>            Priority: Minor
>             Fix For: 1.1.0
>
>
> Users ought to be able to specify a partitioning of their data if they know 
> a good one.  It's convenient to have separate partitioners for users and 
> products so that no awkward mapping step needs to happen.
> It may also be reasonable to partition the users and products into different 
> numbers of partitions (for instance, to balance memory requirements) if the 
> dataset is tall, thin, and very sparse.
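
The idea in the description can be illustrated with a minimal, Spark-free sketch. It is not the code from the PR: the names hash_partition and partition_ratings, the modulo partitioner, and the sample ratings are all hypothetical, chosen only to show users and products being partitioned independently and into different numbers of partitions.

```python
from collections import defaultdict


def hash_partition(key, num_partitions):
    """Hypothetical default partitioner: map an integer ID to [0, num_partitions)."""
    return key % num_partitions


def partition_ratings(ratings, num_user_parts, num_product_parts,
                      user_partitioner=hash_partition,
                      product_partitioner=hash_partition):
    """Group (user, product, rating) triples twice: once by the user's
    partition and once by the product's partition.

    The two partitioners and the two partition counts are independent,
    so callers can supply a partitioning they know to be good for either side.
    """
    by_user = defaultdict(list)
    by_product = defaultdict(list)
    for user, product, rating in ratings:
        triple = (user, product, rating)
        by_user[user_partitioner(user, num_user_parts)].append(triple)
        by_product[product_partitioner(product, num_product_parts)].append(triple)
    return dict(by_user), dict(by_product)


# Illustrative data: four users, three products.
ratings = [(0, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0), (3, 12, 1.0)]

# More partitions on the user side than on the product side.
user_blocks, product_blocks = partition_ratings(ratings, 4, 2)
```

A caller who knows a better layout could pass a custom function for either partitioner argument without touching the other side.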



--
This message was sent by Atlassian JIRA
(v6.2#6252)
