viirya commented on issue #25576: [SPARK-28866][ML] Persist item factors RDD 
when checkpointing in ALS
URL: https://github.com/apache/spark/pull/25576#issuecomment-526378257
 
 
   In the implicit case, we don't call .count() after .checkpoint(), because the 
later computeFactors call materializes the checkpointed RDD. That is why the 
code carries the comment `itemFactors gets materialized in computeFactors`:
   
   ```
   if (shouldCheckpoint(iter)) {
     itemFactors.checkpoint() // itemFactors gets materialized in computeFactors
   }
   ```
   
   In the non-implicit case, computeFactors doesn't materialize it, so .count() is 
needed.
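
To illustrate why an action is needed at all, here is a minimal sketch (not the ALS code itself) showing that `RDD.checkpoint()` only marks an RDD and that the checkpoint is written once an action such as `.count()` runs a job on it. The object name `CheckpointSketch`, the local master, the temporary checkpoint directory, and the toy `itemFactors` RDD are all assumptions made for the example.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointSketch {
  // Returns (isCheckpointed before the action, isCheckpointed after the action).
  def run(): (Boolean, Boolean) = {
    // Local master and app name are illustrative choices, not taken from the PR.
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // A checkpoint directory must be set before checkpoint() is called.
      sc.setCheckpointDir(Files.createTempDirectory("als-ckpt").toString)

      // Hypothetical stand-in for item factors: (id, factor vector) pairs.
      val itemFactors = sc.parallelize(0 until 100)
        .map(id => (id, Array.fill(4)(id.toFloat)))

      // Marks the RDD for checkpointing; nothing is computed or written yet.
      itemFactors.checkpoint()
      val before = itemFactors.isCheckpointed

      // An action runs a job on the RDD, after which the checkpoint is written.
      itemFactors.count()
      val after = itemFactors.isCheckpointed

      (before, after)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = run()
    println(s"checkpointed before action: $before, after action: $after")
  }
}
```

In the implicit path described above, the action that plays the role of `.count()` here would be the materialization inside computeFactors.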
   
   In the non-implicit case, we don't need to persist user factors, because in 
this case the factor RDDs are referenced only once in each iteration and no 
materialization happens (except for checkpoint + .count() on item factors).
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
