Github user zhengruifeng commented on a diff in the pull request:
https://github.com/apache/spark/pull/20956#discussion_r180064831
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/NodeIdCache.scala ---
@@ -166,9 +166,13 @@ private[spark] class NodeIdCache(
}
}
}
+ if (nodeIdsForInstances != null) {
+ // Unpersist current one if one exists.
+ nodeIdsForInstances.unpersist(false)
+ }
if (prevNodeIdsForInstances != null) {
// Unpersist the previous one if one exists.
- prevNodeIdsForInstances.unpersist()
+ prevNodeIdsForInstances.unpersist(false)
--- End diff --
For now, `deleteAllCheckpoints` is called only once in the whole of MLlib, and
the current `unpersist` of `prevNodeIdsForInstances` is in it. So I think we do not
need to implement another method to unpersist datasets (like `PeriodicCheckpointer`).
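
To illustrate the point, here is a minimal sketch of the cleanup done in one place. It does not depend on Spark: `FakeCached` and `MiniNodeIdCache` are hypothetical stand-ins invented for this example, and only the field names and the `unpersist(false)` calls are taken from the diff above.

```scala
// Hypothetical stand-in for a cached dataset; it only tracks whether it is
// still persisted. This is NOT Spark's RDD API.
final class FakeCached(val name: String) {
  var persisted: Boolean = true

  // Mirrors the shape of unpersist(blocking) used in the diff; the flag is
  // ignored here because nothing is actually asynchronous in this sketch.
  def unpersist(blocking: Boolean): this.type = {
    persisted = false
    this
  }
}

// Hypothetical, simplified holder with the two fields touched in the diff.
final class MiniNodeIdCache(
    var nodeIdsForInstances: FakeCached,
    var prevNodeIdsForInstances: FakeCached) {

  // Single cleanup entry point: release both the current and the previous
  // cached dataset, skipping any that were never set.
  def deleteAllCheckpoints(): Unit = {
    if (nodeIdsForInstances != null) {
      nodeIdsForInstances.unpersist(false)
    }
    if (prevNodeIdsForInstances != null) {
      prevNodeIdsForInstances.unpersist(false)
    }
  }
}
```

Since both datasets are released inside the one cleanup method, a caller that invokes it once at the end of training would not need a separate unpersist helper.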