[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

jtengyp Thu, 27 Apr 2017 18:18:12 -0700

Github user jtengyp commented on the issue:

    https://github.com/apache/spark/pull/17742
  
    I ran some tests with this PR.
    Cluster configuration:
        3 workers, each with 10 cores and 30 GB of memory.
    With the Netflix dataset (480,189 users and 17,770 movies), the 
recommendProductsForUsers time drops from 488.36s to 60.93s, about 8x faster 
than the original method.
    
    With a larger dataset (3.29 million users and 0.21 million products), the 
recommendProductsForUsers time drops from 48h to 39min, more than 73x faster 
than the original method.
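
For readers who want to check the speedup figures, here is a minimal sketch of the arithmetic, using only the timings quoted in this comment. The helper name `speedup` is hypothetical and is not part of Spark or the PR.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before_s / after_s

# Timings as reported in the comment, converted to seconds.
netflix = speedup(488.36, 60.93)          # Netflix dataset
larger = speedup(48 * 3600, 39 * 60)      # 48 h vs 39 min on the larger dataset

print(f"Netflix: {netflix:.1f}x")         # prints "Netflix: 8.0x"
print(f"Larger dataset: {larger:.1f}x")   # prints "Larger dataset: 73.8x"
```

The reported 8x and 73x figures match these ratios; the second is rounded down from about 73.8.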


