Github user zhengruifeng commented on the issue:
https://github.com/apache/spark/pull/19288
@srowen I checked `LDA`: although `unpersistDataSet` is not called in it,
no intermediate cached RDDs are left after `fit()`.
Then I checked `Pregel` and found that each call to `connectedComponents`
adds two intermediate cached RDDs, so I now call `unpersistDataSet` in it,
which fixes the issue.
No other places use the checkpointer.
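
For reference, here is a minimal sketch of how leftover cached RDDs can be
observed around a `connectedComponents` call. It assumes a local Spark
session with GraphX on the classpath. The object name `CachedRddCheck`, the
toy graph, and the before/after comparison are illustrative assumptions,
not the test used in this PR. As far as I know the returned graph is itself
cached, so some of the reported ids may belong to the result rather than to
leaked intermediates.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, for illustration only.
object CachedRddCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("cached-rdd-check")
      .getOrCreate()
    val sc = spark.sparkContext

    // Toy graph with two components: {1, 2, 3} and {4, 5}.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(4L, 5L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // Ids of RDDs marked as persistent before the call.
    val before = sc.getPersistentRDDs.keySet.toSet

    val cc = graph.connectedComponents()
    cc.vertices.count() // force evaluation

    // Ids of RDDs marked as persistent after the call.
    val after = sc.getPersistentRDDs.keySet.toSet

    val added = after.diff(before).toSeq.sorted
    println(s"RDD ids newly marked persistent: ${added.mkString(", ")}")

    spark.stop()
  }
}
```

Running this before and after a change and comparing the number of newly
persisted ids is one way to see whether intermediates are still being kept.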