Shixiong Zhu created SPARK-4644:
-----------------------------------

             Summary: Implement skewed join
                 Key: SPARK-4644
                 URL: https://issues.apache.org/jira/browse/SPARK-4644
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
            Reporter: Shixiong Zhu


Skewed data is not rare. For example, a book recommendation site may have
several books that are liked by most of its users. Running ALS on such skewed
data will raise an OutOfMemory error if some book has more users than can
fit into memory. To solve this, we propose a skewed join implementation.
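
The issue does not say how the skewed join would be implemented. As a rough illustration only, the sketch below uses key salting, a common technique for skewed joins that may differ from what is actually proposed here. It is plain Python with no Spark dependency, and the names salted_join, hot_keys, num_salts and num_partitions are hypothetical. Rows for a hot key on the large side are scattered over several salted sub-keys, and the matching rows on the other side are replicated once per salt, so no single partition has to hold all users of a popular book.

```python
import random
from collections import defaultdict


def salted_join(left, right, hot_keys, num_salts, num_partitions):
    """Inner-join two lists of (key, value) pairs, spreading hot keys
    over several partitions by salting (illustrative sketch only)."""
    # Each partition holds the left rows and right rows routed to it.
    partitions = defaultdict(lambda: ([], []))

    for key, value in left:
        # Hot keys on the large side get a random salt so their rows scatter.
        salt = random.randrange(num_salts) if key in hot_keys else 0
        bucket = hash((key, salt)) % num_partitions
        partitions[bucket][0].append(((key, salt), value))

    for key, value in right:
        # The other side is replicated once per salt so every scattered
        # left row still finds its match.
        salts = range(num_salts) if key in hot_keys else (0,)
        for salt in salts:
            bucket = hash((key, salt)) % num_partitions
            partitions[bucket][1].append(((key, salt), value))

    joined = []
    for left_rows, right_rows in partitions.values():
        # Ordinary hash join inside each partition, on the salted key.
        index = defaultdict(list)
        for salted_key, value in right_rows:
            index[salted_key].append(value)
        for salted_key, left_value in left_rows:
            for right_value in index.get(salted_key, []):
                # Drop the salt again when emitting the result.
                joined.append((salted_key[0], (left_value, right_value)))
    return joined
```

The result is the same set of pairs as a plain inner join. The trade-off is that the rows matching a hot key on the smaller side are copied num_salts times.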



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
