If an RDD object has non-empty .dependencies, does that mean it has
lineage? How can I remove it?

I'm doing an iterative computation in which each iteration depends on the
result computed in the previous iteration. After several iterations, it
throws a StackOverflowError.

At first I tried to use cache. I read the code in Pregel.scala, which is
part of GraphX, where they call count to materialize the object after
caching it. But when I attached a debugger, that approach did not seem to
empty .dependencies, and it did not work in my code either.
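A minimal sketch of the cache-then-count pattern described above, written
against a plain RDD rather than a Graph. The object name CacheLineageSketch,
the method iterate and the map(_ + 1) step are placeholders for illustration,
not code from Pregel.scala or from my job. As far as I understand, cache only
stores the computed partitions and does not cut the lineage, so .dependencies
stays non-empty after each step.

```scala
import org.apache.spark.rdd.RDD

object CacheLineageSketch {
  // Runs `iterations` steps, each derived from the previous result.
  // Every step is cached and materialized with count(), mirroring the
  // cache-then-count pattern; the per-step map is a stand-in workload.
  def iterate(initial: RDD[Int], iterations: Int): RDD[Int] = {
    var current = initial
    for (_ <- 1 to iterations) {
      val next = current.map(_ + 1).cache()
      next.count()                          // forces evaluation so the cache is filled
      current.unpersist(blocking = false)   // release the previous step's cached blocks
      current = next
    }
    current
  }
}
```

Calling .dependencies or .toDebugString on the returned RDD shows whether the
chain of parent RDDs is still attached.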

Another approach is checkpoint. I tried checkpointing the vertices and
edges of my Graph object and then materializing them by counting the
vertices and edges. I then used .isCheckpointed to check whether they were
correctly checkpointed, but it always returns false.
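A rough sketch of the checkpoint attempt described above, under some
assumptions: the object name CheckpointSketch, the method checkpointAndCount
and the Graph[Int, Int] type are placeholders, and a checkpoint directory is
assumed to have been set with sc.setCheckpointDir before the call. It only
reproduces the sequence of calls; what .isCheckpointed reports afterwards may
depend on the Spark version.

```scala
import org.apache.spark.graphx.Graph

object CheckpointSketch {
  // Marks vertices and edges for checkpointing, then materializes both
  // with count(). checkpoint() has to be called before the first action
  // on the RDD, and a checkpoint directory must already be configured
  // on the SparkContext.
  def checkpointAndCount(graph: Graph[Int, Int]): (Long, Long) = {
    val vertices = graph.vertices
    val edges = graph.edges
    vertices.checkpoint()
    edges.checkpoint()
    val numVertices = vertices.count()
    val numEdges = edges.count()
    // Report the flags I am inspecting; no particular value is assumed here.
    println(s"vertices.isCheckpointed = ${vertices.isCheckpointed}")
    println(s"edges.isCheckpointed = ${edges.isCheckpointed}")
    (numVertices, numEdges)
  }
}
```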



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Best-practices-for-removing-lineage-of-a-RDD-or-Graph-object-tp7779.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
