viirya commented on issue #25856: [SPARK-29182][Core] Cache preferred locations of checkpointed RDD
URL: https://github.com/apache/spark/pull/25856#issuecomment-533928501

> It sounds like huge. Besides `ALS`, could you give us some example which gets this benefits?
>
> > It reduces the time on huge union from few hours to dozens of minutes.

This issue is not limited to ALS, so this change is not specific to ALS. Checkpointing data is common practice in Spark, used to increase reliability and cut RDD lineage. Spark operations on checkpointed data will benefit from this change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
