[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153911#comment-14153911 ]
Xiangrui Meng commented on SPARK-3541:
--------------------------------------

I put the implementation at https://github.com/mengxr/spark-als/blob/master/src/main/scala/org/apache/spark/ml/SimpleALS.scala . It still needs some cleanup before it can become a PR, but early feedback is welcome.

> Improve ALS internal storage
> ----------------------------
>
>                 Key: SPARK-3541
>                 URL: https://issues.apache.org/jira/browse/SPARK-3541
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The internal storage of ALS uses many small objects, which increases GC
> pressure and makes ALS difficult to scale to very large datasets, e.g., 50
> billion ratings. In such cases, a full GC may take more than 10 minutes to
> finish. That is longer than the default heartbeat timeout, and hence executors
> will be removed under default settings.
> We can use primitive arrays to reduce the number of objects significantly.
> This requires a big change to the ALS implementation.
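
To illustrate the idea in the issue description (replacing many small objects with a few primitive arrays), here is a minimal sketch in Java. It is not the code from SimpleALS.scala; the class and method names (RatingStorageSketch, Rating, RatingBlock, fromObjects, sumForUser) are hypothetical and chosen only for this example. The point it tries to show is that the object-per-rating layout allocates one heap object per rating, while the array layout allocates three arrays no matter how many ratings are stored, so the garbage collector has far fewer objects to trace.

```java
import java.util.ArrayList;
import java.util.List;

final class RatingStorageSketch {

    /** Object-per-rating layout: every rating is its own heap object. */
    static final class Rating {
        final int user;
        final int item;
        final float rating;

        Rating(int user, int item, float rating) {
            this.user = user;
            this.item = item;
            this.rating = rating;
        }
    }

    /**
     * Primitive-array layout (hypothetical): entry i of each array
     * describes rating i. Only three array objects exist, regardless
     * of how many ratings are stored.
     */
    static final class RatingBlock {
        final int[] users;
        final int[] items;
        final float[] ratings;

        RatingBlock(int[] users, int[] items, float[] ratings) {
            if (users.length != items.length || items.length != ratings.length) {
                throw new IllegalArgumentException("arrays must have equal length");
            }
            this.users = users;
            this.items = items;
            this.ratings = ratings;
        }

        /** Copies a list of Rating objects into the three parallel arrays. */
        static RatingBlock fromObjects(List<Rating> source) {
            int n = source.size();
            int[] users = new int[n];
            int[] items = new int[n];
            float[] ratings = new float[n];
            for (int i = 0; i < n; i++) {
                Rating r = source.get(i);
                users[i] = r.user;
                items[i] = r.item;
                ratings[i] = r.rating;
            }
            return new RatingBlock(users, items, ratings);
        }

        int size() {
            return ratings.length;
        }

        /** Sums the ratings of one user with a linear scan over the arrays. */
        double sumForUser(int user) {
            double sum = 0.0;
            for (int i = 0; i < users.length; i++) {
                if (users[i] == user) {
                    sum += ratings[i];
                }
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        List<Rating> objects = new ArrayList<>();
        objects.add(new Rating(0, 10, 4.0f));
        objects.add(new Rating(0, 11, 3.5f));
        objects.add(new Rating(1, 10, 5.0f));

        RatingBlock block = RatingBlock.fromObjects(objects);
        System.out.println(block.size() + " ratings stored in 3 arrays");
        System.out.println("sum for user 0: " + block.sumForUser(0));
    }
}
```

The trade-off is that code working on the array layout has to index three arrays in step instead of reading fields off one object, which is presumably part of why the description expects a big change to the implementation.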