phaniarnab opened a new pull request, #1866: URL: https://github.com/apache/systemds/pull/1866
This patch adds methods to clean up the child RDDs of lineage-cached RDDs. On the first hit, we only mark the RDD and let the rmVar logic clean up the RDD and its child RDDs. On the second hit, we call persist while putting that RDD in the cache. On a later hit, if the RDD is already persisted, we clean up its child RDDs, including the checkpointed and broadcast variables. If the RDD is still not persisted, we asynchronously move it to Spark by triggering a job after a few local reuses. A future reuse then cleans up the child RDDs.
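The per-hit policy described above can be sketched as a small state machine. This is only an illustration of the decision order, not the code in this patch: the class `LineageRddReuseSketch`, the `Entry` fields, the `Action` names, and the threshold of 3 local reuses are all hypothetical, and the sketch assumes the asynchronous job finishes before the next hit. No Spark calls are made; `materialized` stands in for "the RDD is already persisted in Spark".

```java
// Hypothetical sketch of the reuse policy; none of these names come from SystemDS.
final class LineageRddReuseSketch {
    enum Action {
        MARK_ONLY,          // 1st hit: mark the RDD, rmVar cleans it and its children
        PERSIST_AND_CACHE,  // 2nd hit: call persist and put the RDD in the cache
        REUSE_LOCALLY,      // later hit: not yet persisted, reuse without a Spark job
        TRIGGER_ASYNC_JOB,  // later hit: enough local reuses, trigger a job asynchronously
        CLEANUP_CHILDREN,   // later hit: persisted, drop child RDDs, checkpoints, broadcasts
        REUSE_PERSISTED     // later hit: persisted and children already cleaned
    }

    // Assumed value for "a few local reuses"; the real threshold is not stated.
    static final int LOCAL_REUSE_THRESHOLD = 3;

    static final class Entry {
        int hits = 0;
        int localReuses = 0;
        boolean materialized = false;    // stand-in for "persisted in Spark"
        boolean childrenCleaned = false;
    }

    // Decide what to do for one cache hit and update the entry's bookkeeping.
    static Action onHit(Entry e) {
        e.hits++;
        if (e.hits == 1) {
            return Action.MARK_ONLY;
        }
        if (e.hits == 2) {
            return Action.PERSIST_AND_CACHE;
        }
        if (e.materialized) {
            if (!e.childrenCleaned) {
                e.childrenCleaned = true;
                return Action.CLEANUP_CHILDREN;
            }
            return Action.REUSE_PERSISTED;
        }
        e.localReuses++;
        if (e.localReuses >= LOCAL_REUSE_THRESHOLD) {
            // Simplification: treat the asynchronous job as done by the next hit.
            e.materialized = true;
            return Action.TRIGGER_ASYNC_JOB;
        }
        return Action.REUSE_LOCALLY;
    }

    public static void main(String[] args) {
        Entry e = new Entry();
        for (int i = 1; i <= 7; i++) {
            System.out.println("hit " + i + ": " + onHit(e));
        }
    }
}
```

The point of deferring cleanup to a hit after persistence is that the child RDDs, checkpoints, and broadcasts are only safe to drop once the cached RDD no longer needs them for recomputation.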